Friday, September 12, 2014

Image Quality Assessment and Statistical Analysis


-Introduction-

Satellite imagery often contains redundant data that needs to be identified and removed before accurate analysis can be performed. This image preprocessing can be accomplished through the use of univariate statistics, multivariate statistics, and feature space plots. In this lab, feature space plots and correlation matrixes are created to analyze bands in Landsat ETM+ and Quickbird satellite images and to learn more about the practical uses of statistical analysis.






            • Screenshot of Landsat ETM+ imagery of Eau Claire, WI and the surrounding area




            • Screenshot of the Quickbird Imagery of the Florida Keys










            • Screenshot of the Quickbird Imagery of parts of Bangladesh







-Methods-

Programs: ERDAS IMAGINE 2013

Creation of Feature Space Plots

Navigate to Raster > Supervised > Feature Space Image. The "Create Feature Space Images" window will appear.


Image 1: Six bands are analyzed in this Landsat ETM+ image. Fifteen feature space plots are created through the unique pairing of bands 1, 2, 3, 4, 5, and 7. This image displays these band combinations. With "Output To Viewer" checked, the feature space plots were displayed on the screen; they are shown in the Discussion section below.
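The count of fifteen follows from choosing two of the six bands without regard to order. A minimal Python sketch of that enumeration is shown here; the variable names are illustrative and are not part of ERDAS IMAGINE.

```python
from itertools import combinations

# Bands analyzed in this lab.
bands = [1, 2, 3, 4, 5, 7]

# Every unordered pairing of two different bands.
band_pairs = list(combinations(bands, 2))

print(len(band_pairs))
for a, b in band_pairs:
    print(f"band {a} vs band {b}")
```

Each pair corresponds to one feature space plot, so adding a seventh band would raise the number of plots to twenty-one.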
 
 
Creation of Correlation Matrixes
To create a correlation matrix, the Model Maker tool was used. To open Model Maker, navigate to Toolbox > Model Maker. A "New_Model" window will appear, along with a tool palette.

 
Image 2: Here is the first model created using the same Landsat ETM+ imagery as above. The model is rather simple with a raster connected to a function which in turn is connected to a matrix. Each of the three models followed this design. 
 
The matrixes created from the models needed to be opened in Notepad, copied, and pasted into Excel (using the "Use Text Import Wizard" and checking the "Delimited" option) for proper formatting. Further information on the statistical analysis is detailed in the Discussion section below.


-Discussion-

To determine data redundancy, both visual interpretation of feature space plots and statistical analysis of correlation matrixes were used.

A feature space plot graphically illustrates the degree of correlation between two bands in satellite imagery by extracting the brightness values of each pixel in both bands and plotting how often each pair of values occurs. The brighter and "wider" a feature space plot, the higher the frequency of differing values, indicating low correlation. The darker and "narrower" a feature space plot, the lower the frequency of differing values, indicating high correlation.
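The idea behind a feature space plot can be sketched as a two-dimensional histogram of brightness value pairs. The example below is only an illustration using NumPy and synthetic 8-bit data; the function name `feature_space` and the random bands are assumptions, and this is not how ERDAS IMAGINE builds its plots internally.

```python
import numpy as np

def feature_space(band_x, band_y, levels=256):
    """Count how often each (band_x, band_y) brightness pair occurs."""
    counts, _, _ = np.histogram2d(
        band_x.ravel(), band_y.ravel(),
        bins=levels, range=[[0, levels], [0, levels]],
    )
    return counts

# Synthetic 8-bit bands for illustration only, not the lab imagery.
rng = np.random.default_rng(0)
band_a = rng.integers(0, 256, size=(100, 100))
noise = rng.integers(-5, 6, size=band_a.shape)
band_b = np.clip(band_a + noise, 0, 255)        # nearly a copy of band_a
band_c = rng.integers(0, 256, size=(100, 100))  # unrelated to band_a

narrow = feature_space(band_a, band_b)  # values hug the diagonal
wide = feature_space(band_a, band_c)    # values spread across the plot

# Number of occupied cells: small for the narrow plot, large for the wide one.
print(np.count_nonzero(narrow), np.count_nonzero(wide))
```

In the correlated pair the occupied cells stay in a thin band along the diagonal, while the unrelated pair fills cells across the whole plot.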


Image 3: The fifteen feature space plots created from the Landsat ETM+ imagery. The yellow numbers were added with Photoshop for easy identification of band combinations.




Image 4: Narrow Feature Space Plot

From the feature space plots created, band combination 2 - 3, seen to the left, is an example of a narrow feature space plot. By visual examination, it can be determined that one of these bands could be eliminated from further analysis because the two bands exhibit data redundancy. Data redundancy is another way of saying the bands portray similar information, which could cause over- or underestimation of values in later analysis. Therefore, narrow feature space plots are undesirable.



Image 5: Wide Feature Space Plot


Band combination 4 - 5, seen to the right, is an example of a wide feature space plot. The amount of bright color in this plot indicates that the brightness values in the two bands have little correlation. This tells us there is little data redundancy and analyzing these bands together will give unique information. Wide feature space plots allow for accurate information to be gathered in later analysis and are therefore desirable.



After examining feature space plots, the analysis was taken a step further with the creation of correlation matrixes. The correlation between two bands is calculated by dividing the covariance between the bands by the product of the bands' standard deviations. The resulting coefficient varies between -1 and 1, with values close to 1 indicating data redundancy. In this lab, two bands are considered highly correlated when their correlation coefficient is equal to or greater than 0.95. When this occurs, one of the bands should be excluded to reduce error and the computation costs of further analysis. In contrast, a low correlation coefficient indicates low correlation and unique information.
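The calculation described above can be written out directly. The following is a minimal sketch using NumPy on synthetic data; the function name `band_correlation` and the random bands are assumptions for illustration and do not reproduce the Model Maker function used in the lab.

```python
import numpy as np

def band_correlation(band_x, band_y):
    """Covariance of two bands divided by the product of their standard deviations."""
    x = band_x.ravel().astype(float)
    y = band_y.ravel().astype(float)
    covariance = np.mean((x - x.mean()) * (y - y.mean()))
    return covariance / (x.std() * y.std())

# Synthetic bands for illustration only, not the lab imagery.
rng = np.random.default_rng(1)
band_a = rng.integers(0, 256, size=(50, 50))
band_b = 2 * band_a + 10                      # exact linear function of band_a
band_c = rng.integers(0, 256, size=(50, 50))  # unrelated to band_a

r_ab = band_correlation(band_a, band_b)
r_ac = band_correlation(band_a, band_c)
print(r_ab, r_ac)

# np.corrcoef builds the full correlation matrix, one row per band.
stack = np.stack([band_a.ravel(), band_b.ravel(), band_c.ravel()])
matrix = np.corrcoef(stack)
print(matrix.shape)
```

The hand-written function and `np.corrcoef` agree because both normalize the covariance and the standard deviations in the same way.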

Three correlation matrixes were created in this lab: the first using Landsat ETM+ imagery and the other two using Quickbird imagery.



Table 1: Correlation Matrix of Landsat ETM+ Imagery of Eau Claire, WI
In Table 1, it can be seen that band 2 has high correlation with band 1, with a coefficient near 0.926, and with band 3, with a coefficient near 0.943. Neither of these coefficients exceeds the threshold of 0.95, however, so a judgment call or, preferably, further analysis will be needed to determine whether excluding band 2 would significantly reduce data redundancy.






Table 2: Correlation Matrix of Quickbird Imagery of the Florida Keys
In Table 2, it can be seen that band 1 and band 2 have very high correlation, exceeding the threshold of 0.95 with a coefficient near 0.987. This indicates that one of these bands needs to be excluded to reduce data redundancy.





Table 3: Correlation Matrix of Quickbird Imagery of Bangladesh
In Table 3, it can be seen that, similarly to the last Quickbird image, bands 1 and 2 have high correlation exceeding the 0.95 threshold with a coefficient near 0.963. Again, one of these bands should be excluded from further analysis.
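The threshold check applied to the three tables can be sketched in a few lines of Python. Only the coefficients quoted in the text are included; the remaining matrix cells are not reproduced here, and the dictionary layout and the function name `redundant_pairs` are illustrative assumptions.

```python
THRESHOLD = 0.95  # cutoff for "highly correlated" used in this lab

# Coefficients quoted in the text for Tables 1-3, keyed by (image, band, band).
reported = {
    ("Landsat ETM+, Eau Claire", 1, 2): 0.926,
    ("Landsat ETM+, Eau Claire", 2, 3): 0.943,
    ("Quickbird, Florida Keys", 1, 2): 0.987,
    ("Quickbird, Bangladesh", 1, 2): 0.963,
}

def redundant_pairs(coefficients, threshold=THRESHOLD):
    """Return the band pairs whose coefficient meets or exceeds the threshold."""
    return [key for key, r in coefficients.items() if r >= threshold]

flagged = redundant_pairs(reported)
for image, a, b in flagged:
    print(f"{image}: bands {a} and {b} are candidates for exclusion")
```

The two Landsat pairs fall just under the cutoff, which is why they are left to a judgment call rather than flagged automatically.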



In determining which bands to use and which to exclude, it can be helpful to examine the rest of the matrix and determine if one of the bands in question has high correlation with another band. Generally, however, the type of analysis being performed will determine which bands stay and which bands go when it comes to correlation. For example, if Landsat ETM+ imagery is used and bands 4 (NIR) and 1 (Visible Blue) show high correlation, it makes little sense to exclude band 4 if the goal of the analysis is vegetation identification (due to fundamental properties of plants). Using the same example, if the analysis is focused on water, it would make little sense to exclude band 1 (due to the high absorption of NIR radiation in water).


-Conclusion-

The results, in the form of feature space plots and correlation matrixes, can be seen in the Discussion section above. Creating, using, and analyzing these graphical and statistical methods built on knowledge gained from the introductory version of this class to establish techniques for determining data redundancy in satellite imagery. Along with the removal of noise, the removal of data redundancy is an important part of preprocessing to ensure the data collected will be as accurate as possible.


-Sources-

Earth Resources Observation and Science Center, United States Geological Survey. (2011). Landsat ETM+

Global Land Cover Facility. (2012). Quickbird. www.landcover.org


