The MODIS Global Land Cover Product (MOD12Q1)
The MODIS global land cover product is designed to provide information on the state and the seasonal-to-decadal dynamics of global land cover. The product consists of two suites of science data sets (SDSs). MODIS land cover type (MOD12Q1) includes five main layers in which land cover is mapped using different classification systems. MODIS land cover dynamics (MOD12Q2) includes seven layers and was developed to support studies of seasonal and interannual variation (phenology) in land surface and ecosystem properties. Both products are global. In collections 1, 3 and 4, MOD12 was produced at a spatial resolution of 1 km. In collection 5, the spatial resolution was increased to 500 m.
The MOD12Q1 global land cover product includes a set of internally consistent layers depicting different land cover classifications. These layers include the International Geosphere-Biosphere Programme (IGBP; Loveland and Belward, 1997) classification; a 14-class system developed at the University of Maryland (UMD; Hansen et al., 2000); a 6-biome system used by the MODIS LAI/FPAR algorithm (Myneni et al., 1997; Lotsch et al., 2001); the biome classification proposed by Running et al. (1995); and the plant functional type classification described by Bonan et al. (2002). Secondary labels (the most likely alternative IGBP class) and classification confidences (McIver and Friedl, 2001) are also provided for each pixel. In addition, a lower spatial-resolution climate modeling grid (CMG) product is produced for users who do not require the spatial detail afforded by the main land cover product. The CMG provides both the dominant land cover type in each cell and the sub-grid scale frequency distribution of land cover classes within each cell.
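The two quantities carried by the CMG (dominant class and sub-grid class frequencies) can be illustrated with a small aggregation sketch. This is not the production code: the function name aggregate_to_cmg, the block parameter, the tie-breaking rule and the toy 4 x 4 grid are all assumptions made for illustration.

```python
from collections import Counter

def aggregate_to_cmg(fine_grid, block):
    """Aggregate a fine-resolution class grid into coarse cells.

    Returns (dominant, fractions): for each coarse cell, the most frequent
    class label and a dict mapping each class label to its area fraction.
    """
    n_rows, n_cols = len(fine_grid), len(fine_grid[0])
    dominant, fractions = [], []
    for r0 in range(0, n_rows, block):
        dom_row, frac_row = [], []
        for c0 in range(0, n_cols, block):
            # Collect every fine pixel that falls inside this coarse cell.
            labels = [
                fine_grid[r][c]
                for r in range(r0, min(r0 + block, n_rows))
                for c in range(c0, min(c0 + block, n_cols))
            ]
            counts = Counter(labels)
            total = len(labels)
            # Ties go to the smallest class label (an arbitrary, assumed rule).
            top = min(counts, key=lambda k: (-counts[k], k))
            dom_row.append(top)
            frac_row.append({k: v / total for k, v in counts.items()})
        dominant.append(dom_row)
        fractions.append(frac_row)
    return dominant, fractions

# Hypothetical 4 x 4 grid of class labels, aggregated into 2 x 2 coarse cells.
fine = [
    [1, 1, 2, 2],
    [1, 3, 2, 2],
    [4, 4, 5, 5],
    [4, 4, 5, 1],
]
dominant, fractions = aggregate_to_cmg(fine, block=2)
```

Keeping the per-class fractions alongside the dominant label preserves information about mixed cells that a single label would discard.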
Algorithm Description
The classification strategy used by the MODIS land cover product employs a supervised decision tree classification algorithm called C4.5 (Quinlan, 1993). This approach is supported by a variety of recent work demonstrating the utility of decision trees for land cover classification problems in remote sensing (DeFries et al., 1998; Friedl and Brodley, 1997; Friedl et al., 1999; 2000; 2002; Hansen et al., 1996; 2000; McIver and Friedl, 2001; 2002). C4.5 builds univariate decision trees, in which each internal node tests a single input feature, and makes no assumptions regarding the frequency distribution of the data being classified. This attribute is particularly important at global scales, because virtually all classes of interest exhibit multimodal frequency distributions and therefore violate the assumptions required by parametric supervised approaches such as the maximum likelihood classifier (Schowengerdt, 1997).
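To make the idea of a univariate split concrete, the sketch below scores candidate thresholds on a single feature using the gain ratio criterion associated with C4.5. It is a simplified illustration, not the MODIS implementation: the function names, the use of midpoints between adjacent values as thresholds, and the toy feature values and class labels are all assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, gain_ratio) for the best split `value <= threshold`."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, 0.0)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        p_left = i / n
        # Information gain: reduction in class entropy produced by the split.
        gain = base - p_left * entropy(left) - (1 - p_left) * entropy(right)
        # Split information: entropy of the partition sizes themselves.
        split_info = entropy(["L"] * i + ["R"] * (n - i))
        ratio = gain / split_info
        if ratio > best[1]:
            best = ((lo + hi) / 2, ratio)
    return best

# Hypothetical single-feature training sample with two classes.
feature = [0.125, 0.25, 0.375, 0.625, 0.75, 0.875]
labels = ["barren", "barren", "barren", "forest", "forest", "forest"]
threshold, ratio = best_threshold(feature, labels)
```

Because the criterion depends only on the ordering of values and the class counts on each side of the threshold, no distributional form (Gaussian or otherwise) is assumed for the feature.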
In addition to being nonparametric, C4.5 possesses several other traits that make it particularly useful for classification of land cover from MODIS data at global scales. First, C4.5 includes elegant and robust solutions for dealing with missing data. This attribute is especially crucial at high latitudes, where a substantial proportion of the input MODIS data are missing because of low sun angles (i.e., high solar zenith angles), and in the tropics, where missing data are frequent because of cloud cover. Second, C4.5 includes mature methods for “pruning” the estimated decision trees, thereby avoiding classifications that are overfit to the training data.
A key feature of the MODIS land cover classification algorithm is a technique known as “boosting” (Freund, 1995). Boosting is one of numerous ensemble classification methods developed in the mid- to late 1990s that have been widely shown to enhance classification accuracy (Bauer and Kohavi, 1999; Dietterich, 2000). Boosting also serves to minimize the sensitivity of the classification algorithm to both noise in feature data and labeling errors in training data.
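The general mechanism of boosting can be sketched as follows: a sequence of weak classifiers is trained, each on a reweighted sample that emphasizes the cases earlier classifiers got wrong, and their weighted votes are combined. The example below is a minimal two-class AdaBoost with one-feature decision stumps. It is an assumption-laden illustration only: the text above does not specify which boosting variant is used operationally, the production classifier boosts C4.5 trees over many classes rather than stumps over two, and all names and data here are hypothetical.

```python
import math

def train_stump(xs, ys, weights):
    """Return (error, threshold, sign) of the stump with lowest weighted error.

    A stump predicts `sign` when x <= threshold and `-sign` otherwise.
    """
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if (sign if x <= t else -sign) != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best

def adaboost(xs, ys, rounds):
    """Train `rounds` stumps; return a list of (alpha, threshold, sign)."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, t, sign = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, t, sign))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * y * (sign if x <= t else -sign))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps; labels are +1 / -1."""
    score = sum(alpha * (sign if x <= t else -sign)
                for alpha, t, sign in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical 1-D sample that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
single = adaboost(xs, ys, rounds=1)
ensemble = adaboost(xs, ys, rounds=3)
```

On this toy sample the single stump misclassifies part of the data, whereas the three-stump ensemble reproduces every training label, which illustrates how combining weak classifiers can outperform any one of them.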
Training Data
The MOD12Q1 algorithm relies heavily on a database of land cover exemplars to train the classifier. Because global land cover is highly diverse, a key requirement of this database is that it be geographically and ecologically comprehensive, thereby capturing the global variability of land cover. To meet these needs, the System for Terrestrial Ecosystem Parameterization (STEP) was developed (Muchoney et al., 1999). STEP is designed to provide a classification-free and versatile database for site-based characterization of global land cover.
The current STEP database (i.e., for collection 5) consists of roughly 2000 sites distributed globally. However, the database is dynamic and requires ongoing maintenance and augmentation to meet the needs of the MODIS global land cover mapping effort. Sites included in the database are derived from manual interpretation of Landsat Thematic Mapper (TM) data, augmented by ancillary map data, as available.