Change Detection in Multilook Polarimetric SAR Imagery With Determinant Ratio Test Statistic

In this article, we propose a determinant ratio test (DRT) statistic to measure the similarity of two covariance matrices for unsupervised change detection in polarimetric radar images. The multilook complex covariance matrix is assumed to follow a scaled complex Wishart distribution. In doing so, we provide the distribution of the DRT statistic that is exactly Wilks’s lambda of the second kind distribution, with density expressed in terms of Meijer G-functions. Due to this distribution, the constant false alarm rate (CFAR) algorithm is derived in order to achieve the required performance. More specifically, a threshold is provided by the CFAR to apply to the DRT statistic producing a binary change map. Finally, simulated and real multilook polarimetric SAR (PolSAR) data are employed to assess the performance of the method and is compared with the Hotelling–Lawley trace (HLT) statistic and the likelihood ratio test (LRT) statistic.


I. INTRODUCTION
T HE synthetic aperture radar (SAR) image change detection has become very important in remote sensing for monitoring dynamic processes on the Earth, for instance, damage assessment in urban areas, deforestation and clearcut detection, flooding, and monitoring of glaciers. Change detection is a process that analyzes multitemporal remote sensing images acquired on the same geographical area for identifying changes that occurred at distinct observation dates. The result is a generation of a change detection map in which changed areas are explicitly identified.
Several unsupervised change detection methods have been proposed in the literature. Two families of methods handle the change detection process: the statistical information theory and the hypothesis test theory. The first family is based on Manuscript  information-theoretic measures in order to design a distance between images. In doing so, surrounding pixels are modeled by a given distribution and distance measures are applied to obtain a comparative statistic. With this family, the change detection is less sensitive to acquisition conditions since the used methodology considers information from the neighborhood. This information is relevant when dealing with data corrupted by speckle or/and the number of looks is low: to name a few of statistical information distances: mutual information [1], [2], variational and mixed information [1], [3], and stochastic distances such as the Kullback-Leibler, Rényi, Bhattacharyya, and Hellinger distances [4]. The second family aims to compute a covariance equality test and to test the hypotheses of change versus no-change where an asymptotic distribution is usually employed by the test statistic. The simplest one is to compute the ratio of SAR amplitudes or intensities observed at different times. This kind of ratios is a well-known test statistic in singlechannel SAR-based change detection [5]. A large number of test statistics have been developed and described in the literature for automatic and unsupervised change detection, such as mean ratio/log-ratio measures [6], [7], Gauss log ratios [8], multitemporal coherence analysis [6], and maximumlikelihood ratios [9].
Polarimetric SAR (PolSAR) gives more scattering information than single-polarization channel SAR data, which can be used to detect the change and increase the quality of the change detection map. The first work on test statistics for change detection in multilook PolSAR data was proposed by Conradsen et al. [10]. They proposed a likelihood ratio test (LRT) for the equality of two complex covariance matrices 1 and 2 and gave the approximated distribution of the LRT statistic. The LRT algorithm worked by comparing two hypotheses: the null hypothesis (H 0 ) corresponding to "no-change" and an alternative hypothesis (H 1 ) corresponding to "change." The LRT approach was extended to the multitemporal case [11], [12] and multifrequency data [13]. Kersten et al. [14] compared three test statistics: the LRT statistic also called the Bartlett test, the contrast ratio test, and the ellipticity test. The second one was based on the largest and the smallest eigenvalues of 1 scaled complex Wishart distribution for the covariance matrix data. It was based on the complex-kind Hotelling-Lawley trace (HLT) that was used to measure the similarity of two covariance matrices. They applied a decision threshold to the test statistic to detect changes. The threshold was determined using the constant false alarm rate (CFAR) algorithm [15]. Akbari et al. proposed the Fisher-Snedecor (FS) distribution as the approximation of the sampling distribution of the HLT test statistic. The parameters of the FS distribution depended only on the dimension of the polarimetric data and the equivalent number of looks (ENL) estimated for images. They used the method of the matrix log cumulants [16] to estimate the ENL.
In this article, the ratio of the determinant of two matrices is proposed as a new test statistic to measure the similarity of two covariance matrices that are assumed to follow scaled complex Wishart distributions. The new test is called the determinant ratio test (DRT) statistic and it is able to produce a scalar value, to which a threshold is applied. The distribution of the test statistic under the null hypothesis is exactly Wilks's lambda of the second kind distribution with density expressed in terms of Meijer G-functions [17]. Wilks's lambda distribution corresponds to the product of d independent Beta distributed of the second kind random variables, where d is the number of polarimetric channels. The latter distribution depends on the dimension of the polarimetric data and the ENL.
For the production of a binary change detection map, a threshold is applied to the test statistic. Several thresholding methods have been suggested in the literature to determine the threshold in a completely unsupervised manner: to name a few of them, CFAR algorithm [15], Otsu's method [18], and Kittler and Illingworth (K&I) algorithm [19], [20]. In this article, we limit ourselves to the CFAR algorithm since we are more interested in the test statistic rather than the thresholding method. Consequently, the final binary change map is made at a predefined false alarm rate (FAR). To further illustrate the potential of our method in change detection for multilook PolSAR data, the well-known HLT and LRT test statistics for measuring the equality of two multilook covariance matrices are compared with our approach. Simulated and real multilook PolSAR data are used for this comparison.
This article is organized as follows. Section II introduces the statistical model for the covariance matrix data. Section III describes some properties related to the scaled complex Wishart distribution, the DRT for equality of two complex scaled Wishart distributions, and its sampling distribution. In Section IV, the proposed polarimetric change detection algorithm based on the DRT is presented, followed by the CFAR principle to determine the threshold. The HLT and LRT methods are briefly shown for a later comparison with our method. Section V demonstrates the performance of the method with a simulated and real PolSAR data set and discusses the results. Section VI is dedicated to a summary and a conclusion.

II. MULTILOOK POLSAR IMAGE MODEL
The polarimetric scattering vector is defined as with s xy representing the complex scattering coefficients, where x is the transmit and y is the receive polarization. Moreover, h denotes horizontal, v denotes vertical [21], [.] T means transposition, and d = dim(s) is the vector dimension. The vector s is a single-look polarimetric complex format representation of PolSAR data. It is assumed that s is a d-dimensional speckle vector, which follows a circular complex Gaussian distribution (s ∼ N C d (0, )), with a zero-mean vector and a covariance matrix . The multilooking of PolSAR data reduces the speckle effect characteristic of coherent imaging systems. The polarimetric multilooking operation is given by where L is the number of looks, (.) H denotes the Hermitian operator, and X ∈ + ⊂ C d×d is the multilook polarimetric covariance matrix considered as a random matrix defined on the cone, denoted + , of the positive definite complex Hermitian matrices. When L ≥ d, the unnormalized sample covariance matrix defined as Z = LX follows the nonsingular complex Wishart distribution [22] denoted as Z ∼ W C d (L, ) and X follows a scaled complex Wishart distribution, denoted X ∼ sW C d (L, ), with a probability density function (pdf) given by f X (X) = f Z (LX)|J Z→X |, where |J Z→X | = L d 2 is the Jacobian determinant of the transformation Z = LX [23]. The pdf of X is where etr(.) = exp(tr(.)) is the exponential trace operator, |.| is the determinant operator, and d (L) is the multivariate gamma function of the complex kind defined as where (L) is the standard Euler gamma function.

III. THEORY
This section describes some properties related to the scaled complex Wishart distribution and the DRT for equality of two complex scaled Wishart distributions.
Corollary 2: Let X and Y be two complex Hermitian positive definite random d × d matrices that follow scaled complex Wishart distributions defined as: The random variable defined by the determinant ratio of L x X and L y Y is distributed as the product of d independent Beta distributed of the second kind (also called Beta prime) random variables with parameters (L x − i ) and (L y − i ) where the pdf for Beta II (L x − i, L y − i ) is given by where G 1,1 1,1 · · · is the Meijer G-function defined in Section II-A. Proof: Corollary 2 is the consequence of (5) and the relationship between the ratio of two independent χ 2 random variables and the Fisher (F) distribution with two parameters given as follows.

A. Meijer G-Function
The Meijer G-function is a generalization of the generalized hypergeometric function that is defined using the contour integral representation [25] G m,n p,q are, in general, complex-valued. Contour L is a suitable integration contour that separates the poles of function (b j −s) from the poles of (1−a j +s). The Meijer G-function has been implemented in some commercial software packages, such as MATLAB and MAPLE.

B. Wilks' Lambda Distribution of Determinant Ratio Statistic
Definition 3: For noninteger values of n and q and integer values of p with n ≥ q ≥ p, the pdf of Wilks's lambda distribution of the second kind, denoted as (n, p, q), is given as [17] f where ). The cumulative distribution function is given by The kth moment E{Z k } is given by Proof: See Appendix B. With reference to Definition 3, the distribution of ((|L x X|)/(|L y Y|)) is given as follows: It is worthy to note that the determinant ratio depends on the parameters L x and L y . In the case L x = L y = L, Wilks's lambda distribution of the second kind is given by (2L, d, L). Fig. 1(a) and (b) shows the pdf of (2L, d, L), with multiple values of L, and different cases: d = 2 and d = 4, respectively.

IV. POLARIMETRIC CHANGE DETECTOR
We consider X and Y two statistically independent Hermitian positive definite random d × d matrices that follow scaled complex Wishart distributions with different distribution parameters defined as: Multilook PolSAR images acquired over the same geographical area before event (at time t x ) and after event (at time t y ) are used to detect any change by comparing at each position (i, j ) the full polarimetric information before change and after change given, respectively, by X(i, j ) and Y(i, j ). Here, we suppose that at each (i, j ), we have two matrices X(i, j ) and Y(i, j ). As a consequence, we resort to compute the determinant ratio of L x X and L y Y described in Section IV-A.

A. Determinant Ratio Statistic
The determinant ratio statistic is defined by DRT is used to measure the similarity between the two polarimetric covariance matrices X and Y and perform change detection by choosing between hypotheses [5] H 0 : Null hypothesis (H 0 ) corresponds to no-change, and hypothesis (H 1 ) corresponds to change. To quantify the difference between H 0 and H 1 , a threshold selection procedure is applied to the test statistic τ DRT . It is worth mentioning that the hypothesis tests are developed with distinct ENLs, i.e., L x = L y . Nevertheless, in the first experiment with simulations, we assume that the ENL is the same for both images. The exact distribution of the DRT statistic under null hypothesis (H 0 ) is Wilks's lambda distribution of the second kind given by For particular case where L x = L y = L, the DRT statistic becomes τ DRT ∼ (2L, d, L).

B. CFAR Thresholding Method
In our study, change detection is realized by applying a decision threshold to the test statistics [5]. We choose the CFAR algorithm [15] as a thresholding method in order to perform a fair comparison between our proposed approach and the HLT method since we are interested in the test statistic rather than the thresholding method. Let f τ DRT (τ ) be the distribution of the ratio τ DRT under the hypothesis H 0 . The significance level of the test α c , expressed in percent, is given as a function of the desired false alarm probability P fa and then given by α c = 100P fa . The threshold is determined from the distribution of the determinant ratio statistic.
We adopt the same approach presented in the paper of Akbari et al. [5] where changes from X to Y and reversely from Y to X were considered in the CFAR change detector. This results in using the following two ratios: The combined test is given by The combined threshold T is derived from When P fa is specified, the threshold is obtained by solving (22), and then, the CFAR change detector is obtained. The proposed unsupervised change detection based on DRT between two multilook PolSAR data acquired before and after change is summarized in the following steps. 1) Find a global estimation ofL x andL y .
3) Compute the CFAR threshold for a specific P fa . 4) Apply the threshold and obtain the binary change detection map. It is important to mention that the quality of the change detection map depends on the estimation accuracy of L x and L y . For an efficient estimation of the ENL, many methods have been put forward in the literature for automatic estimate.
To name a few of them, we mention the following. First, the method of Anfinsen et al. [16] was based on the maximum likelihood estimator for the ENL under the assumption that data follow the complex Wishart distribution. Second, the method proposed by Tao et al. [26] was based on the development of trace moments (DTMs). This ENL estimator cancels the textural variation using trace moments. Finally, the method developed by Bouhlel [27] was based on the fractional moments of the determinant of the multilook polarimetric covariance (FMDC) matrix. The FMDC estimator of the ENL had the particularity of being independent of the distribution of the texture model. All these methods consisted in performing a local estimate of the ENL by using a sliding window covering the whole image. Then, the distribution of estimates was drawn and the mode value of the density corresponds to the desired global estimate of the ENL. It is worthy to note that the presence of texture and correlation between samples can affect the estimation of the ENL.
In Section IV-C, we briefly describe the HLT and the LRT statistics in order to compare them with our proposed statistic.

C. HLT Statistic
The complex-kind HLT statistic is defined as The exact distribution of the HLT statistic is difficult to derive and an approximation was put forward by Akbari et al. [5].
It is the FS 1 distribution with three parameters used as an approximation to τ HLT and is given as follows: The expression of the FS distribution is given by where μ = E{τ HLT } > 0 is a scale parameter and ξ, ζ > 0 are two shape parameters. The parameters of the FS distribution are computed in terms of distribution parameters of scaled Wishart matrices X and Y. The solutions for μ, ξ , and ζ are defined by the following equation system [5]: where m (FS) ν and m (HLT) ν are the νth order moments of the FS distribution and the HLT statistic, respectively. For more details about the expressions of the moments, reader can refer to [5] and [28]. Technically, to estimate the shape parameters ξ and ζ , a minimum distance optimization is used to solve the following: Finally, a good fitting of the FS distribution depends on the estimation accuracy of (μ, ξ, ζ ), which in turn depends on the estimates of L x and L y [5]. It is worth noticing that the FS distribution of the HLT detector for a low number of looks does not provide a good approximation and fitting as it will be seen in the experiments. The combined test is derived as follows: The threshold is determined from the resolution of the following equation:

D. LRT Statistic
The Wishart LRT statistic was derived as [10] The test statistic for change detection based on the LRT is given by where The distribution of τ LRT under the null hypothesis (H 0 ) is approximated by using the associated asymptotic distribution of the test statistic [10] where χ 2 (d 2 ) denotes a central χ 2 distribution with d 2 degrees of freedom and The test with a desired P fa is given by where the threshold T is determined through the equation It is worth mentioning that the LRT is a one-sided test; however, the DRT and HLT are two-sided test.

V. EXPERIMENTAL RESULTS
The performance of the proposed DRT statistic is evaluated on both simulated and real PolSAR images. The HLT and the LRT tests are implemented for comparison with our test.

A. Simulated Data
We simulate two quad-pol data containing two L-look Pol-SAR images of 250 × 250 pixels and having four polarimetric channels (d = 4). The generated data follow a scaled Wishart distribution with a covariance matrix of the speckle j defined in Table I where j = {1, . . . , 7}. Then, the polarimetric data contain seven different classes (areas). Area 7 corresponds to the polarimetric properties of a heterogeneous urban area. Area 3 is simulated with the polarimetric properties of a homogeneous water region. The rest of the areas correspond to the properties of agricultural crops and vegetation regions. Different values of the number of looks L are used in this study, L ∈ {5, 6, 7, 8}. Fig. 2 shows the Pauli decomposition of the two simulated five-look quad-pol PolSAR data corresponding to images before and after change and the binary  By computing the DRT, HLT, and LRT statistics, Fig. 3 shows the results of change detection relative to these methods and for different cases of L. As it can be seen, each row corresponds to a particular value of L starting from 5 for the first row to 8 for the last row. Fig. 3(ai), (ai'), and (ai") where i∈ {1, 2, 3, 4} show, respectively, the logarithm of DRT, the logarithm of HLT, and the LRT statistics for different values of L. In addition, Fig. 3(bi) shows the comparison between the normalized histograms of τ DRT and the estimated Wilks's lambda of the second kind pdfs computed over the nonchange area. It is clear that the estimated pdf curves fit well with the normalized histograms. The same comparison is made in Fig. 3(bi') between τ HLT and the estimated FS pdfs used as approximations. It is shown that for a low number of looks smaller than 7, the FS distribution of the HLT detector is not providing a good fitting. However, the DRT detector works better for the small number of looks. Another comparison is made in Fig. 3(bi") between τ LRT and the estimated pdf given by (33). Furthermore, Fig. 3(ci), (ci'), and (ci") shows the corresponding binary change maps obtained by the thresholding CFAR algorithm. Indeed, binary change detection map is obtained by the rejection of the hypothesis test at a 1% significance level for the DRT, HLT, and LRT detectors. It is noted that changes can be seen in these statistics for each case of L-values. Moreover, the proposed method detects more efficiently the changed area even for highly speckled cases with a low number of looks. It is also worth noticing that at the central part of the binary images (region 2), where heterogeneous change detection is present, the DRT is better than the others, but all three detectors seem not to work properly. This leads us to extend this work under non-Wishart case [29], [30] or relaxed Wishart distribution where the ENL is a local parameter [31]. The fit ability between the estimated Wilks's of the second kind pdfs and the normalized histograms of τ DRT is evaluated qualitatively by using the Kolmogorov-Smirnov (KS) hypothesis test. The smaller value of KS indicates better the hypothesized model fits with the empirical distribution. A small value of p-value of the test indicates strong incompatibilities of the data with the employed distribution hypothesis. Table II lists the values of KS and the p-values of the test (in percentage) obtained for the four cases of L under null hypothesis (H 0 ). It is evident from the p-value that the fitted Wilk's of the second kind distribution can perfectly model the no-change area.
A quantitative evaluation of the change detection performance is also provided at four different significance levels or specified FARs. Table III shows the measured FAR [false positive rate (FPR)] and the detection rate [true positive rate (TPR)] for DRT, HLT, and LRT statistics for various levels of multilooking and for different specified FARs. As shown in Table III, the DRT statistic realizes higher detection rates and lower overall error rates than the HLT and LRT statistics at specified FARs, especially when the speckle is strong and the number of looks L is low. As L increases to reach 8 and at specified FARs, the detection rates for these statistical tests increase and the overall error rates decrease. Likewise, when L passes to 8, the detection rate of both the DRT and HLT becomes close. It is also worth noticing that the measured FAR is close to the specified FAR regardless of the detector used.
Another quantitative evaluation of the performance of the method is provided by the receiver operating characteristic (ROC) curves that are plotted for these statistics using the ground truth. The ROC curve is the evolution of the TPR as a function of the FPR [32]. Fig. 4 shows the ROC   For real data, we use a set of two RADARSAT-2 fully PolSAR images over an urban area in Suzhou city, East China, acquired on April 9, 2009, and June 15, 2010. The two images corresponding to before and after changes are multilooked with 24-looks (six looks in range and four looks in azimuth directions). Fig. 5(a) and (b) shows the Pauli decomposition images of these two PolSAR images. The ground-truth map is represented in Fig. 5(c) and contains nochange test pixels with white and change test pixels with gray.
In contrast, the black designates urban areas that do not satisfy the Wishart assumption. The ENL for both images before and after changes is estimated to be 7.2 and 6.9, respectively. Fig. 6(a), (a'), and (a"), respectively, shows the logarithm of the DRT, the logarithm of the HLT, and the LRT statistic. The plots of the normalized histograms and the estimated distributions for DRT, HLT, and LRT detectors are given, respectively, in Fig. 6(b), (b'), and (b"). Fig. 6(c), (c'), and (c") shows their corresponding binary change map obtained by the thresholding CFAR algorithm. The binary change detection map is obtained by the rejection of the hypothesis test at a 1% significance level for the DRT, HLT, and LRT detectors.
Regarding the quantitative assessment, the ROC curves for the DRT, HLT, and LRT statistics used in the experiments are given in Fig. 7. In this example, the obtained results show that the ROC curves of the HLT and LRT are above the ROC curve of the DRT. The measured AUC for these methods based on the DTR, HLT, and LRT statistics is 99.68%, 99.45%, and 99.13%, respectively. Different significance levels of the test are chosen where α c ∈ {0.5%, 1%, 5%, 10%} for evaluation. The detection rate, the measured FAR, and the overall error rate for the three methods are shown in Table V. At each specified FAR level, the best values are highlighted in red. We can clearly see that the DRT always gives the best detection rate for any specified FAR level. Also, for the overall error rate, the DRT gives good results in comparison with the other statistics for the case of a low value of significance levels     The images are 2 × 3 multilooked. Fig. 8 shows the Pauli decomposition of these two scenes with two images each are obtained by the JPL's UAVSAR sensor at two different times. Fig. 8(c) and (c') shows the ground truths used to compute the ROC curves. We recall that the white corresponds to change and the black to no-change. As we can see, the interesting area of this data set is an urban area where the changes occurred due to the effects of urbanization. Fig. 9 compares the change detection results relative to DRT, HLT, and LRT statistics. As it can be noted, the logarithms of max{τ DRT τ DRT }, max{τ HLT , τ HLT }, and τ LRT are first computed and shown, respectively, in Fig 9(ai), (ai'), and (ai"), where i ∈ {1, 2} representing scenes 1 and 2. The areas with change are marked in green (small change) and red (strong change). The areas without change are marked in blue. Fig. 9(b1) and (b2) show a good fitting between the normalized histograms of τ DRT and the estimated Wilks's lambda of the second kind pdf computed over the nonchange area, respectively. As shown in Section II, Wilks's lambda of the second kind distribution depends on the ENL and dimension (d) of the covariance matrix. The comparison between the normalized histograms of τ HLT and the estimated FS pdf computed over the nonchange area is shown in Fig. 9(b1') and (b'2). We can see here a good fitting of the FS distribution with the normalized histograms. The same comparison is realized in Fig. 9(b"1) and (b"2) with the LRT statistic and the approximated pdf given by (33) and computed over the nonchange area. As we can see, the estimated pdf curves based on χ 2 distributions do not fit well the normalized histograms. As consequence, the CFAR algorithm fails to provide the best threshold for the LRT statistic. Fig. 9(ci), (ci'), and (ci") shows the binary change   For a final comparison between these change detectors, we draw Table VI. It reports the detection rate, the measured FAR, and the overall error rate computed for the three detectors for scenes 1 and 2, at four different significance levels of the test α c . The DRT statistic obtains the highest detection rate in this example, followed by the HLT statistic. The LRT statistic gives the worst performance. The DRT also achieves a lower overall error rate compared to the HLT and LRT statistics except for scene 1 when α c ≥ 5%. Regarding the measured FAR, the DRT gives values close to the specified FAR in most cases (cases of scene 2), but in other cases, the values are far apart (cases of scene 1). The performance of the DRT statistic is evaluated as well through the ROC curve and compared with that of the HLT and LRT statistic. Fig. 10 shows the corresponding ROC curves of the three detectors, indicating a better performance for the DRT statistic for low FPR values followed by the HLT and then by the LRT. The AUC is also provided for this example. For scene 1, it is 81.60% for the DRT, 80.42% for the HLT, and 78.96% for the LRT. For scene 2, it is 78.31% for the DRT, 77.77% for the HLT, and 74.41% for the LRT. Indeed, the AUC given by the DRT statistic is larger than the others.

VI. CONCLUSION
The DRT statistic has been proposed for change detection in multilook PolSAR images. Under the null hypothesis corresponding to no change, the statistic follows exactly Wilks's lambda distribution of the second kind depending on the ENL and the covariance matrix dimension. The fit ability is evaluated quantitatively using the KS goodness-of-fit test. With this statistic test distribution, the decision threshold can be efficiently determined at a specified probability of false alarm by using the CFAR-threshold method. The performance of the method has been evaluated on simulated and real multilook PolSAR data where ground-truth data were available and has been compared to the performance of the known HLT and LRT detectors. The results in terms of measured FAR, detection rate, ROC curve, and AUC have shown that the proposed DRT statistic outperforms the HLT and LRT performances especially for low ENL values and can perform a binary change detection map very close to the ground-truth data. The method can be extended to a heterogeneous model by considering the texture in the change detection process. The presence of texture can be useful to improve the detection, but additional texture parameters need to be estimated.

APPENDIX B
The Mellin integral transform of the density function g(z) of the product d−1 i=0 X i of d independent random variables X i with density functions f i (x i ) is given by [33]

M{g(z)}(s)
Under suitable restrictions [34], [35] satisfied by all density functions, the density function g(z) is given by the following inverse formula: Since the Mellin integral transform of 1/(1 + x) L x −i+L y −i is given by and for any function f (x) it follows that the density function of the Beta distribution of the second kind has the following Mellin transform: Then, the Mellin integral transform of the pdf g(z) of the product of d independent random Beta type II variables is Accordingly, the g(z) pdf is written otherwise as a function of the inverse Mellin transform and is given as follows: Again, using the definition of the Meijer G-function (10), this results in where Z (s) = M{g(z)}(s) is the Mellin transform of the g(z) pdf. The kth moment E{Z k } is given by