
December 17, 2015 17:7 International Journal of Geographical Information Science IJGIS2015˙V3

International Journal of Geographical Information Science, Vol. 00, No. 00, Month 200x, 1–20

RESEARCH ARTICLE

Unsupervised regionalization of the United States into landscape pattern types

J. Niesterowicz^a, T.F. Stepinski^a∗ and J. Jasiewicz^a,b

^a Space Informatics Lab, Department of Geography, University of Cincinnati, OH, USA; ^b Institute of Geoecology and Geoinformation, Adam Mickiewicz University, Poznan, Poland

(Received 00 Month 200x; final version received 00 Month 200x)

We present a pattern-based regionalization of the conterminous US – a partitioning of the country into a number of mutually exclusive and exhaustive regions that maximizes the intra-region stationarity of land cover patterns and inter-region disparity between those patterns. The result is a discretization of the land surface into a number of Landscape Pattern Types (LPTs) – spatial units each containing a unique quasi-stationary pattern of land cover classes. To achieve this goal, we use a recently developed method which utilizes machine vision techniques. First, the entire National Land Cover Dataset (NLCD) is partitioned into a grid of square-size blocks of cells, called motifels. The size of a motifel defines the spatial scale of a local landscape. The land cover classes of cells within a motifel form a local landscape pattern which is mathematically represented by a histogram of co-occurrence features. Using the Jensen-Shannon divergence as a dissimilarity function between patterns we group the motifels into several LPTs. The grouping procedure consists of two phases. First, the grid of motifels is partitioned spatially using a region-growing segmentation algorithm. Then the resulting segments of this grid, each represented by its medoid, are clustered using a hierarchical algorithm with Ward’s linkage. The broad-extent maps of progressively more generalized LPTs resulting from this procedure are shown and discussed. Our delineated LPTs agree well with the perceptual patterns seen in the NLCD map.

Keywords: automatic regionalization; landscape; pattern analysis; NLCD; large geodata

∗Corresponding author. Email: [email protected]

ISSN: 1365-8816 print/ISSN 1362-3087 online
© 200x Taylor & Francis
DOI: 10.1080/1365881YYxxxxxxxx
http://www.informaworld.com



1. Introduction

Regionalization is the spatial delimitation of natural, social, economic, cultural, or political spheres of reality (Werlen 2009). In other words, it is the process of aggregating a large number of individual geographical objects into a much smaller number of preferably spatially contiguous and internally uniform units, which are more meaningful and easier to analyze (Chorley and Haggett 1967, Johnston 1970). It has been applied to many diverse domains including political science (George et al. 1997), geomorphology (Chai et al. 2009, Jasiewicz et al. 2014), hydrology (Peterson et al. 2011), climatology (Fovell and Fovell 1993, Stooksbury and Michaels 1991), biogeography (Kreft and Jetz 2010, Patten and Smith-Patten 2008), agricultural science (Hollander 2012, Lark 1998), and ecology (Hargrove and Hoffman 2004, Omernik and Griffith 2014).

One of the best known examples of regionalization on a large scale is the division of the United States into physiographic regions by Fenneman (1915). Fenneman used a holistic approach to regionalization; his division is based on years of analyzing multiple aspects of the environment. Current regionalizations of the US into ecoregions are likewise based on a combination of surficial factors (Hargrove and Hoffman 2004, Omernik and Griffith 2014). In this paper we focus on a single aspect of the environment – Land Use/Land Cover (LULC) – but, instead of identifying areas occupied by any single LULC class, we regionalize the US with respect to patterns of LULC classes. Algorithmic, automatic (unsupervised) subdivision of the US into Landscape Pattern Types (LPTs) – spatial units containing unique quasi-stationary patterns of LULC classes (Wickham and Norton 1994) – is the major novelty of this work. We refer to a pattern as quasi-stationary to indicate that it does not substantially change across a delineated zone.

Identification and delineation of LPTs in a large area leads to the creation of broad-extent maps of LPTs. Such maps are of significant interest for conservation planning (Long et al. 2010, Cain et al. 1997, Powers et al. 2012) and large-scale land management (Partington and Cardille 2013), as well as in academic research whose goal is to find associations between various environmental variables.

Previous pattern-based regionalizations, such as those by Fenneman (1915), Wickham and Norton (1994), and Omernik and Griffith (2014), are based on manual methodologies. This is a tedious process, especially when conducted on a large spatial scale using relatively high-resolution data. The process is also intrinsically subjective and skewed toward the analyst’s perspective – different analysts will divide a study area differently even if they use the same data. The recent availability of high resolution LULC datasets on continental (NLCD) and even global (Chen et al. 2015) scales makes adopting a manual approach increasingly difficult and instead calls for an automated means of performing regionalizations.

From a computational perspective a regionalization is an unsupervised classification which results in the generalization of the original LULC map into a useful overview across a large area (broad-extent map of LPTs). Most of the existing work on unsupervised quantification and regionalization of LULC patterns focuses on forestry and conservation. In this context, LPTs of binary (forest/non-forest) land cover patterns have been identified and delineated (Gustafson and Parker 1992, Long et al. 2010, Andrew et al. 2012). To the best of our knowledge only two previous studies aimed at unsupervised delineation of LPTs from multi-categorical LULC datasets have been undertaken. Cardille and Lambois (2010) divided the NLCD 1992 into a square grid having cells of size 6.5 km from which they identified 17 different LPTs. However, their study was restricted to the identification of exemplars of these LPTs and no actual regionalization map was presented. Partington and Cardille (2013) applied a similar methodology to the 25 m resolution Earth Observation for Sustainable Development of Forests (EOSD) land-cover dataset (Wulder et al. 2008) within the Canadian province of Quebec. They identified and mapped 5 LPTs at a spatial scale of 30 km.

All the aforementioned unsupervised regionalizations used the same approach to assessing the value of dissimilarity between landscapes. This approach consists of representing landscapes as feature vectors of landscape metrics (LMs) (Haines-Young and Chopping 1996, Herzog and Lausch 2001, McGarigal et al. 2002) and calculating the Euclidean distance between these feature vectors to quantify the similarity between two landscapes. Hereafter we refer to this method as the standard approach. Our approach, inspired by the methods developed in the context of machine vision (Gevers and Smeulders 2004, Datta et al. 2008), departs from this standard. Based on our earlier work (Jasiewicz and Stepinski 2013a, Niesterowicz and Stepinski 2013, Jasiewicz et al. 2015) we represent landscapes as normalized histograms of co-occurrence features and we calculate the dissimilarity between two landscapes by calculating the Jensen-Shannon divergence between two histograms. A machine vision-inspired approach is robust as there is no need to combine metrics of different meaning, ranges of values, and varying relevance into a single value of similarity between two landscapes. It is also computationally efficient and thus suitable for application to large datasets such as the entire NLCD.

The aim of this paper is to describe a methodology for the pattern-based regionalization of a very large multi-categorical raster and to apply it to the LULC dataset in order to reveal broad-scale geographical information that cannot be objectively obtained by other means. We demonstrate the feasibility of our method by regionalizing the entire NLCD 2011, thus revealing the forms and spatial extents of the most prevalent LPTs within the conterminous US. In addition, by performing the regionalization in a hierarchical fashion we also test the degree to which an algorithm can be relied upon to “understand” perceptual similarities between different LPTs.

The remainder of this paper is structured as follows. Section 2 describes the data used, our computational methodology, and the software we have developed and implemented. Section 3 presents the results of applying our methodology to the regionalization of the entire NLCD 2011. Section 4 contains a discussion on the usage, abilities, limitations and potential applications of the method.

2. Data, methodology and software

2.1. Data

For input into our regionalization algorithm we use the National Land Cover Database 2011 (NLCD 2011) provided by the Multi-Resolution Land Characteristics (MRLC) Consortium. It is a LULC multi-categorical raster map covering the entire conterminous US with a spatial resolution of 30 m resulting in 161,190 × 104,424 cells. The NLCD is the result of a decision-tree classification of Landsat-7 satellite images (Jin et al. 2013) into 16 land cover categories within the conterminous US (see legend to Fig. 6). All the analyses and presented maps are in the Albers Equal Area projection, the original projection of the NLCD.



2.2. Methodology

2.2.1. General framework

Our methodology employs several techniques from the domains of data science and machine vision. The core concept is Complex Object-Based Image Analysis (COBIA) (Vatsavai 2013, Stepinski et al. 2015). Two different implementations of the COBIA framework have been independently developed: Vatsavai’s (2013) implementation concentrates on images whereas our implementation (Jasiewicz and Stepinski 2013a, Stepinski et al. 2014, Jasiewicz et al. 2015) concentrates on categorical rasters including land cover datasets. Within the COBIA framework an image or, in our case, the NLCD raster, is divided arbitrarily into a regular grid of local blocks of cells at minimal computational cost. These blocks of cells are the basic units of analysis. While such blocks are geometrically simple (they are square arrays of cells) they hold complex content (patterns of different LULC classes), hence the name – complex objects. Hereafter we will refer to such a block of cells as a motifel – the smallest processing element of the grid containing local motifs (patterns) of LULC classes.

COBIA fundamentally differs from a better-known image analysis framework called Object-Based Image Analysis or OBIA (Hay and Castilla 2008, Blaschke 2010). OBIA uses clumps of cells (objects) which are geometrically complex but carry simple content (homogeneous values of cells) as its basic units of analysis. The key difference between the two approaches is in the meaning of their basic processing units. While OBIA units represent particular visible features of the earth’s surface – such as a house, a road, or a particular type of cultivated crop – COBIA units represent the local spatial arrangement of various visible features; thus, the COBIA framework is particularly well-suited to regionalization on the basis of landscape patterns. Furthermore, by its very design, a grid of motifels to be analyzed by COBIA is several orders of magnitude smaller than the original raster, making the technique uniquely suited for analysis of large datasets and for generation of broad-scale maps. To regionalize a raster using COBIA, a numerical description of motifels and a function that calculates the level of dissimilarity between them are needed.

2.2.2. Numerical representation of a motifel

For the numerical description of a motifel we use a histogram of color co-occurrence pattern features (Barnsley and Barr 1996, Chang and Krumm 1999). A co-occurrence feature is a pair of classes assigned to two neighboring cells (Fig. 1E). Features are extracted from a motifel by combining co-occurrence matrices calculated for eight different displacement vectors along principal directions. For a raster with $k$ possible classes the result is a symmetric matrix which we reduce to a histogram with $d = (k^2 + k)/2$ bins. Figs. 1A to 1D show examples of co-occurrence histograms stemming from four different motifels. In this hypothetical case $k = 4$, resulting in a co-occurrence histogram with 10 bins. In the case of NLCD 2011, $k = 16$ and the co-occurrence histogram has 136 bins. A bin in a histogram gives a (normalized; divided by the sum of all bins) number of co-occurrences (either horizontal, vertical or diagonal as shown in Fig. 1E) between two LULC classes. For example, before the normalization the first bin in Fig. 1A had 20 co-occurrences between green pixels in motifel A.

Figure 1. Panels A to D: Examples of co-occurrence histograms describing patterns of LULC classes within motifels. Four hypothetical classes are indicated by four different colors. Panel E: Four pixel configurations constituting a co-occurrence feature. Panel F: Table showing values of dissimilarities between motifels A, B, C, and D calculated using co-occurrence histograms and the JSD dissimilarity measure. Panel F values: A–B 0.11, A–C 0.75, A–D 0.80, B–C 0.76, B–D 0.82, C–D 0.29 (diagonal entries are 0).

The $k$ bins correspond to co-occurrences of same-class pairs and their values reflect both the abundance of the class and its spatial arrangement. For example, the blue class has the same abundance in the motifels shown in Figs. 1A and B, but the blue-blue bin in the histogram for motifel A is higher than the corresponding bin for motifel B because the area occupied by the blue class in A is more aggregated. The $(k^2 - k)/2$ bins correspond to co-occurrences between different-class pairs and their values reflect the texture of the class pattern in a motifel. For example, motifels C and D have the same abundance of green and orange classes but different textures. This difference is reflected by their corresponding histograms, with the green-orange bin in the histogram for motifel D being much higher than the corresponding bin for motifel C, reflecting its finer texture. In general, a large number of same-class co-occurrences indicates the domination of large, class-uniform patches in the motifel, while a large number of inter-class co-occurrences indicates high textural complexity of the motifel. Overall, a co-occurrence histogram describes both content and texture of the motifel in one compact form. Application of color co-occurrence histograms for assessing the level of similarity between blocks of land cover pixels was first proposed by Barnsley and Barr (1996).
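The construction of a co-occurrence histogram can be illustrated with a minimal Python sketch. This is not the GeoPAT implementation: the function name `cooccurrence_histogram`, the list-of-lists input, and the choice to visit each neighboring pair from both of its cells (so every unordered pair is counted twice before normalization) are assumptions made for illustration only.

```python
from itertools import combinations_with_replacement


def cooccurrence_histogram(motifel, k):
    """Normalized co-occurrence histogram of a block of class labels 0..k-1.

    Hypothetical helper; assumes the block has at least two cells.
    """
    # One bin per unordered class pair, so d = (k^2 + k) / 2 bins in total.
    pairs = list(combinations_with_replacement(range(k), 2))
    index = {pair: i for i, pair in enumerate(pairs)}
    counts = [0] * len(pairs)
    # Eight displacement vectors along the principal directions.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    rows, cols = len(motifel), len(motifel[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    a, b = motifel[r][c], motifel[rr][cc]
                    counts[index[(min(a, b), max(a, b))]] += 1
    total = sum(counts)
    # Normalize so that the bins sum to 1.
    return [n / total for n in counts]


# A 2 x 2 block with two classes; bins are ordered (0,0), (0,1), (1,1).
print(cooccurrence_histogram([[0, 0], [0, 1]], 2))  # [0.5, 0.5, 0.0]
```

With `k = 16` the returned list has 136 entries, matching the number of bins quoted for the NLCD 2011.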

2.2.3. Dissimilarity measure

Dissimilarity measures assess the alikeness of patterns in two motifels by calculating the dissimilarity between the histograms which represent them. There exist a large number of histogram dissimilarity functions (Cha 2007), from which we have selected the Jensen-Shannon Divergence (JSD) (Lin 1991) as it was shown to agree well with the human visual perception of similarity between natural images (Rubner et al. 2001). The JSD expresses the informational distance between two normalized histograms A and B as a deviation between the Shannon’s entropy of the mixture of the two histograms (A+B)/2 and the mean entropy of individual histograms A and B. The value of JSD(A,B) is given by the following formula:

$$\mathrm{JSD}(A,B) = \sqrt{H\left(\frac{A+B}{2}\right) - \frac{1}{2}\left[H(A) + H(B)\right]}, \tag{1}$$

where $H(A)$ is the Shannon entropy of the histogram $A$:

$$H(A) = -\sum_{i=1}^{d} A_i \log_2 A_i. \tag{2}$$

$A_i$ is the value of the $i$th bin in the histogram $A$, and $d$ is the number of bins (the same for both histograms). The JSD has been proved (Endres and Schindelin 2003) to be a metric and thus it is well-suited to measure “distance” between landscapes. For normalized histograms the JSD measure always takes values from 0 to 1, with a value of 0 indicating that two motifels are identical, and a value of 1 indicating maximum dissimilarity (none of the classes existing in one motifel can be found in the other).
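Equations (1) and (2) can be sketched in a few lines of Python. The function names `shannon_entropy` and `jsd` are illustrative and are not taken from GeoPAT. The entropy uses the standard definition in which each bin is weighted by its own value, and empty bins are assumed to contribute zero (the usual convention that 0 log 0 = 0).

```python
import math


def shannon_entropy(hist):
    """Shannon entropy (base 2) of a normalized histogram; empty bins add 0."""
    return -sum(p * math.log2(p) for p in hist if p > 0)


def jsd(a, b):
    """Square root of the Jensen-Shannon divergence of two normalized histograms."""
    mixture = [(x + y) / 2 for x, y in zip(a, b)]
    divergence = shannon_entropy(mixture) - 0.5 * (
        shannon_entropy(a) + shannon_entropy(b)
    )
    # Clamp tiny negative values that can arise from floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


# Two histograms with no classes in common are maximally dissimilar.
print(jsd([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Because only the mixture and the two entropies are needed, the cost of one comparison is linear in the number of bins.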

It takes some getting used to the concept of an overall, single-value dissimilarity between patterns and, in addition, to the way dissimilarity is assessed by the JSD. Fig. 1F is a table giving the values of JSD between the four patterns shown in Figs. 1A to D. Visually, the patterns A and B are similar to each other and dissimilar to patterns C and D, which, in turn, are dissimilar to each other. This is reflected by the values in the table. However, note that according to JSD patterns C and D are overall more similar than patterns C and A (or C and B, or D and A, or D and B). The overall measure of dissimilarity is a result of a compromise (encapsulated by the choice of the pattern representation and the dissimilarity measure) between many factors contributing to the final assessment. For example, the JSD assessment regarding dissimilarity between C and D versus dissimilarity between C and A may be questioned by those who emphasize texture over composition.

Fig. 2 shows how different motifels can have the same degree of dissimilarity to a given motifel despite having different patterns. The focus motifel is located in the center (bull’s eye) of the figure. To match a pattern to the pattern of the focus motifel all features of both patterns must match. Lack of matching in one or more features results in a JSD > 0. Fig. 2 shows four motifels for which the level of dissimilarity from the focus pattern is JSD = 0.1, eight motifels for which JSD = 0.2, and eight motifels for which JSD = 0.3. The smaller the values of JSD the more similar the patterns are to the focus pattern. As motifels fail to match an increasing number of features to the focus pattern they diverge from it visually in many different ways. This is illustrated in Fig. 2 where motifels at each level of dissimilarity are grouped by visually similar departures from the focus motifel. The focus motifel is a landscape dominated by pasture (yellow) with some forest (green) and a dash of an urban area (red). Four different ways in which a motifel can become dissimilar from the central motifel are shown: (1) tending toward more urban areas, (2) tending toward more cropland (brown), (3) tending toward more water (blue), and (4) tending toward more forest. Thus, although all motifels in Fig. 2 are dissimilar to the focus motifel at a level of at most 0.3, their mutual dissimilarities may be greater.



Figure 2. Illustration of the concept of dissimilarity between motifels. The focus motifel is located in the center (bull’s eye); the other 20 motifels are arranged radially around the center according to their dissimilarities from the focus motifel (rings at JSD = 0.10, 0.20, and 0.30) and azimuthally according to similarities between themselves. See text for more details.

2.2.4. Segmentation and clustering

With motifel representation and distance function now defined, the most straightforward way to regionalize the NLCD is to perform a clustering of all motifels in the grid and treat the spatial footprints of the resulting clusters as the final LPTs. This approach was used in previous works on automated regionalization (Cardille and Lambois 2010, Partington and Cardille 2013). However, clustering of individual motifels may not preserve the spatial cohesion of LPTs (Niesterowicz and Stepinski 2013), which is a desirable characteristic of LPT regions. Additionally, when working with very large datasets and electing to work with patterns at a small spatial scale (small motifels), the number of motifels to be clustered could be large, making some otherwise appealing clustering techniques (for example, hierarchical clustering) unfeasible. Therefore, we first perform segmentation of the grid of motifels, and then cluster the segments instead of the motifels themselves. Segmentation is the process of partitioning the grid into local areas (segments) of relatively uniform pattern. In principle, segmentation could be the only operation necessary for regionalization; however, as the segmentation is a local aggregation, it is possible, and indeed likely, that several disjointed segments could correspond to the same LPT. Therefore, clustering of segments is necessary to obtain an unambiguous set of LPTs.

As we anticipate that future regionalizations may be performed on datasets different from the NLCD, and that they may use different choices for motifel representation and dissimilarity function, we work with segmentation and clustering algorithms that require only the computation of dissimilarity between motifels. Thus, we do not use algorithms such as the K-means algorithm, which requires the averaging of histograms. As it happens, the co-occurrence histogram and the JSD are compatible with an averaging operation, but some other motifel representations (for example, representations used in Stepinski et al. (2014)) and dissimilarity functions may not support averaging.

For segmentation we use a variant of a non-hierarchical greedy region-growing segmentation algorithm (Levine and Shaheen 1981) that has been customized to work only with dissimilarities. The algorithm has only one free parameter JSDm – a merging criterion. The JSDm value is a threshold indicating the maximum value of JSD between a motifel and its neighboring segment that allows for merging of this motifel with the segment. Dissimilarity between a single motifel and a segment is defined as the value of JSD between this motifel and the medoid of the segment. A medoid is a motifel which belongs to the segment, and whose average value of JSD to all the other motifels in the segment is the lowest; it is an equivalent of a centroid when averaging is not supported. For details of the segmentation algorithm see Stepinski et al. (2015).
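The medoid definition and the merging criterion can be written down directly, as in the sketch below. It is not the region-growing algorithm itself (see Stepinski et al. 2015 for that); the names `medoid` and `can_merge` are hypothetical, and `dissimilarity` stands for any function of two motifels, such as the JSD. The usage example substitutes plain numbers and an absolute difference for histograms and JSD purely to keep it short.

```python
def medoid(members, dissimilarity):
    """Return the member whose mean dissimilarity to all others is lowest."""
    def mean_to_others(i):
        others = [m for j, m in enumerate(members) if j != i]
        if not others:
            return 0.0  # a single-member segment is its own medoid
        return sum(dissimilarity(members[i], m) for m in others) / len(others)

    best = min(range(len(members)), key=mean_to_others)
    return members[best]


def can_merge(motifel, segment, dissimilarity, threshold=0.3):
    """Merging criterion: the dissimilarity between the motifel and the
    segment medoid must not exceed the threshold (JSDm in the text)."""
    return dissimilarity(motifel, medoid(segment, dissimilarity)) <= threshold


# Stand-in example: scalars instead of histograms, |x - y| instead of JSD.
distance = lambda x, y: abs(x - y)
segment = [0.0, 0.1, 0.5]
print(medoid(segment, distance))            # 0.1
print(can_merge(0.35, segment, distance))   # True
print(can_merge(0.5, segment, distance))    # False
```

Only pairwise dissimilarities are evaluated, so the same code works for representations that cannot be averaged.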

For the clustering of segments (represented by their medoids) we use a hierarchical clustering algorithm with Ward’s linkage (Ward 1963). This algorithm uses only dissimilarities between segments as input and is based on the principle of minimization of the total within-segment variance – exactly what we want the segmentation to achieve. In addition, by using a hierarchical algorithm we can trace how an algorithm agglomerates more homogeneous LPTs into less homogeneous, more broadly defined LPTs. In other words, we get some insight into the relationship between the algorithmic and perceptual similarities of landscape patterns.

2.2.5. Determining the number of LPTs

The results of hierarchical clustering can be represented in the form of a dendrogram (see Fig. 3). By cutting the dendrogram at a certain height we obtain a clustering solution. The number of clusters (alternatively, the height at which a dendrogram is cut) is not determined by the hierarchical clustering algorithm itself; instead it can be selected by an analyst or estimated by an ancillary algorithm (Dunn 1973, Davies and Bouldin 1979, Rousseeuw 1987, Salvador and Chan 2004). Here we use the L-method (Salvador and Chan 2004) to estimate a suggested number of clusters. We stress that the result of such an estimation needs to be taken as a suggestion rather than a final determination.

Figure 3. Heat map of the dissimilarity matrix between 6,317 segments. A red-to-white color gradient indicates dissimilarities between segments from low to high. The dendrogram used to construct the heat map is also shown, up to fifteen levels. The regions of the heat map corresponding to nine and fifteen clusters are labeled by numbers and by unique colors. See text for more details.

Figure 4. L-method evaluation graph for the number of clusters from 2 to 26. The two regression lines are the best fits to the portions of the graph corresponding to the low and high number of clusters regimes, respectively. The point of their crossing indicates the suggested number of clusters (eight).

The L-method first constructs an evaluation graph which is a two-dimensional plot with a number of clusters x plotted on the horizontal axis and a measure of quality of clustering consisting of x clusters plotted on the vertical axis. For hierarchical clustering, the quality of clustering is measured by the minimum merging dissimilarity resulting in x clusters. An evaluation graph can be calculated directly from a dendrogram and has a characteristic L shape (see Fig. 4). One portion of the L-shaped graph corresponds to a regime of many clusters where the minimum merging dissimilarity is small, whereas another portion of the L-shaped graph corresponds to a regime of a few clusters where the minimum merging dissimilarity is large. The L-method calculates the location of a “knee” in the L-shaped graph where the two regimes meet, and outputs it as the recommended number of clusters.

2.3. Computer software

We use the GeoPAT toolbox (Jasiewicz et al. 2015) to divide the NLCD into a grid of motifels, to calculate their co-occurrence histograms, to calculate the JSD, and to perform segmentation. GeoPAT is a set of extension modules to GRASS GIS v.7 (Neteler et al. 2012) and is written in ANSI C utilizing the GRASS API. GeoPAT can be downloaded from http://sil.uc.edu. Ward’s hierarchical clustering and the L-method were performed using the R software programming language and environment (R Core Team 2015).

3. Results

For regionalization of the NLCD we have chosen the size of the motifel to be 500 × 500 cells, setting the scale of land cover pattern at 15 km. This choice is arbitrary but similar to the 9.65 km scale of the US landscape selected by Hammond (1964). Dividing the entire NLCD 2011 into 15 km motifels results in a grid having the size of 323 × 209 = 67,507.
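The quoted sizes follow from simple arithmetic, reproduced in the short check below. The constant names are illustrative, and rounding the grid dimensions up (so that partial blocks at the raster edge still form motifels) is an assumption that happens to reproduce the reported 323 × 209 grid; how GeoPAT actually treats edge blocks is not stated here.

```python
import math

CELL_SIZE_M = 30                              # NLCD spatial resolution
MOTIFEL_CELLS = 500                           # motifel side length in cells
RASTER_COLS, RASTER_ROWS = 161_190, 104_424   # NLCD 2011 dimensions
NUM_CLASSES = 16                              # NLCD 2011 land cover categories

motifel_scale_km = MOTIFEL_CELLS * CELL_SIZE_M / 1000
grid_cols = math.ceil(RASTER_COLS / MOTIFEL_CELLS)   # assumed: round up
grid_rows = math.ceil(RASTER_ROWS / MOTIFEL_CELLS)   # assumed: round up
num_motifels = grid_cols * grid_rows
num_bins = (NUM_CLASSES ** 2 + NUM_CLASSES) // 2     # d = (k^2 + k) / 2

print(motifel_scale_km, grid_cols, grid_rows, num_motifels, num_bins)
# 15.0 323 209 67507 136
```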

Application of the segmentation algorithm with the JSDm value set to 0.3 agglomerates 67,507 motifels into 6,317 irregular segments. This particular value of JSDm has been selected empirically; smaller values result in a large number of small segments whereas larger values lead to segments containing visibly inhomogeneous patterns. Although the subsequent clustering inevitably increases inhomogeneity of LPTs, it is desirable to start from initially homogeneous segments. In preparation for clustering these segments into LPTs, we calculate a 6,317 × 6,317 distance (dissimilarity) matrix which contains dissimilarities between each pair of segments. The hierarchical clustering is calculated using only values of the dissimilarity matrix.

Fig. 3 shows the heat map (Wilkinson and Friendly 2009) of dissimilarities between the segments – a graphical representation of the dissimilarity matrix with rows and columns rearranged in accordance with a dendrogram (so the segments similar to each other are close to each other in the matrix) and the color reflecting a degree of dissimilarity from 0 (red) to 1 (white). Clusters appear on the heat map as square red blocks located on the diagonal. The existence of red block structures on the diagonal of the heat map indicates, even without further mapping, that hierarchical clustering of segments will result in a meaningful regionalization of the US.

Fig. 3 also shows a dendrogram resulting from the hierarchical clustering. Although the dendrogram extends to 6,317 nodes, only fifteen nodes, corresponding to the top fifteen clusters, are shown. These fifteen clusters are labeled by unique numbers and colors. Subdivisions of the fifteen clusters into smaller sub-clusters and their agglomeration into larger clusters can be seen in the heat map. The nine larger clusters are also indicated by unique numbers and colors. One immediate observation is that at the very top level of the hierarchy, all segments are divided into just two clearly different LPTs, one consisting of clusters 1 to 9 (for partition into 15 clusters), and another consisting of clusters 10 to 15. Interestingly, this broadest landscape pattern-based division corresponds to the spatial division of the US into its western and eastern regions (see below).

Using the dendrogram we can regionalize the US into any number of LPTs from 1 to 6,317. The L-method (Fig. 4) suggests regionalization into eight clusters. After examining the dendrogram, the heat map, and the maps of several regionalizations with different numbers of LPTs, we decided to select the division of the US into nine LPTs as our primary regionalization. Using nine instead of eight LPTs (as suggested by the L-method) distinguishes between LPTs #3 and #4, which we judged to be distinct enough (see Fig. 5) to be worth mapping as separate LPTs. We also retained regionalizations into four, six, and fifteen LPTs (see Fig. 5).

Fig. 5 shows the geographical division of the US into four, six, nine, and fifteen LPTs. The colors and labels indicating LPTs in regionalizations stemming from nine-partitioning and fifteen-partitioning are the same as those used to indicate clusters in the dendrogram shown in Fig. 3. For each regionalization the set of exemplars characterizing each LPT is given. An exemplar gives an example of a pattern of LULC that is characteristic of a given LPT; we use the medoids of the clusters used for a given regionalization as exemplars. As the division of the US into LPTs is hierarchical, some of the LPTs in the nine-partitioning are agglomerates of LPTs in the fifteen-partitioning (see Fig. 3 to identify agglomerates). Similarly, some of the LPTs in the six-partitioning are agglomerates of LPTs in the nine-partitioning, and the LPTs in the four-partitioning are agglomerates of LPTs in the six-partitioning.

By cutting a dendrogram at progressively smaller values of minimum merging dissimilarity we obtain maps indicating progressively more homogeneous landscape patterns. With only two LPTs (not shown) the US divides roughly into its western and eastern constituents (there are some patches of “eastern” LPT in the western part; they correspond to areas covered by cropland and mixed forest). If four LPTs are selected the eastern part breaks into water (Great Lakes) and non-water, and the western part breaks into the shrub matrix and the grassland matrix. With six LPTs, the cropland matrix separates from a non-water LPT in the “eastern” branch and the evergreen forest mosaic separates from the shrub LPT in the “western” branch.

Figure 5. Regionalization maps with different numbers of LPTs (4, 6, 9, and 15) as indicated. For each map the set of exemplars is shown to illustrate characteristic land cover patterns for each LPT. The exemplars are color-coded to correspond to the map.

Figure 6. (A) Boundaries of nine LPTs superimposed on the NLCD map. (B) The NLCD legend. (C) Map of nine LPTs. (D) Legend for landscape pattern types; numbers in brackets indicate LPT identification numbers as used in Fig. 5.

NLCD legend (panel B): open water; ice/snow; developed, open space; developed, low intensity; developed, medium intensity; developed, high intensity; barren land; deciduous forest; evergreen forest; mixed forest; shrub/scrub; grassland; pasture/hay; cultivated crops; woody wetlands; emergent wetlands.

LPT legend (panel D): water (1); cultivated crops matrix (2); deciduous forest/pasture/crops mosaic (3); deciduous/pasture/evergreen mosaic (4); urban mosaic (5); woody wetland/crops/forest mosaic (6); grassland matrix (7); evergreen forest mosaic (8); shrub matrix (9).

A regionalization into nine LPTs is shown in detail in Fig. 6. The purpose of this figure is to illustrate the degree of homogeneity of land cover patterns within each LPT (panel A). Panel B is the NLCD 2011 legend. Panel C shows the regionalization of the US into nine LPTs. This is the same map as shown in Fig. 5 (nine-partitioning), which is included here to provide reference to the meaning of boundaries in panel A. Panel D is the legend to landscape pattern types. Giving concise names to LPTs is a difficult task because they are complex patterns of land cover classes. Here, we use a simplified nomenclature proposed by Wickham and Norton (1994). The term matrix describes a pattern dominated by a single LULC class, and the term mosaic describes a pattern to which several LULC classes contribute significantly. We name LPTs using the names of dominant land cover classes and adding either matrix or mosaic to them.

A regionalization into fifteen LPTs (see Fig. 5) leaves four LPTs (#1, #3, #5, and #8) unchanged from the nine-partitioning. Another four LPTs from the nine-partitioning (#2, #4, #6, and #7) are split (see Fig. 3) into pairs of more homogeneous LPTs. The final LPT from the nine-partitioning (#9) is split into three more homogeneous LPTs.

4. Discussion and Conclusions

The novelty of the presented approach is that it regionalizes patterns of a variable (LULC) rather than the variable itself, and that it does this in an automatic, unsupervised fashion. To the best of our knowledge this is the first time patterns in the entire NLCD have been automatically regionalized. Earlier work by Cardille and Lambois (2010) identified exemplars of 17 characteristic LULC patterns in the NLCD 1992 but did not present a map of their regionalization. Without a regionalization map, their result cannot be compared to ours, because the NLCD 1992 and NLCD 2011 use different, incompatible sets of LULC classes, which makes the two sets of exemplars incomparable. The only other study describing a technique similar to the one described here was by Partington and Cardille (2013), who regionalized LULC in the Canadian province of Quebec. Their method differs from ours in how motifels are represented, in the choice of similarity function, and in the clustering algorithm used. Their regionalization was performed by non-hierarchical clustering without a prior segmentation step. This was feasible because they clustered only 1,398 motifels, the result of using a relatively small region (Quebec) and a large (30 km) motifel. Such a method would not be feasible for hierarchical regionalization of the NLCD, especially if one is interested in the regionalization of smaller-scale patterns.

It needs to be pointed out that the result of our regionalization cannot be quantitatively assessed beyond calculating the quality of clustering, which has only computational relevance (for example, for comparing two regionalizations calculated by two different clustering algorithms) and does not comment on the usefulness of the resultant map. This is a general property of unsupervised machine learning, where the goal is to explore a dataset and to reveal a priori unknown groupings in the data. In our case, the LPTs emerge from the data, so to ask whether they are “correct” or “incorrect” is not a well-posed question. However, we can ask whether our regionalization is useful. This can be assessed qualitatively by visually examining Fig. 6 for the stationarity of LPT patterns and by a subjective assessment of the usefulness of the map for the problem at hand. We suspect that the regionalization at the level of nine LPTs, although highly illustrative, may not offer insight beyond what is already known. However, maps at finer levels of the hierarchy, such as the map with 15 LPTs or one with an even more detailed delineation, may indeed offer new insight.

Visual inspection of Fig. 6A reveals that different LPTs appear to have somewhat different levels of stationarity. This is a feature of hierarchical clustering, which may produce “non-spherical” (in dissimilarity space) clusters. Non-spherical clusters tend to have a larger range of dissimilarities between their elements than spherical clusters of



[Figure 7, panels A–D; scale bar: 100 km. LPTs legend: cultivated crops matrix (2); deciduous/pasture/evergreen mosaic (4); grassland matrix (7); evergreen forest mosaic (8); shrub matrix (9). Ecoregions legend: Colorado Plateaus (20); Southern Rockies (21); Arizona/New Mexico Plateaus (22); Arizona/New Mexico Mountains (23); Chihuahuan Desert (24); High Plains (25); Southwestern Tablelands (26); Madrean Archipelago (79).]

Figure 7. Comparison between regionalizations of the state of New Mexico into LPTs (A and B) and ecoregions (C and D). Panels A/C show boundaries of LPTs/ecoregions superimposed on the map of NLCD 2011; see Fig. 6 for the NLCD legend. Panels B/D show spatial extents of LPTs/ecoregions. Black lines indicate state boundaries.

comparable size. Using a non-hierarchical algorithm, such as a K-medoids algorithm, could result in LPTs with a more similar degree of non-stationarity, as such algorithms tend to produce more spherical clusters. However, in this paper we preferred to use a hierarchical algorithm for its ability to explicitly reveal different levels of regionalization (see Fig. 5). In addition, some LULC patterns may be characterized by larger variations than others, so it is not clear whether the same level of variation in all LPTs is a desirable feature.

The methodology presented in this paper focuses on regionalization with respect to the pattern of a single aspect of the environment: land cover. However, there is much interest in regionalization on the basis of patterns of multiple environmental factors (topography, land cover, soil properties, climate, etc.), leading to the broad-scale map of physiographic provinces (Fenneman 1915) or, more recently, to the map of ecoregions (Omernik and Griffith 2014). The most recent edition of the ecoregions map is still delineated manually, partly because its creators believed (Omernik and Griffith 2014) that regionalization must be based on the patterns of environmental variables rather than the values of the variables themselves, and partly because no pattern-based regionalization algorithms were available to them. Attempts at algorithmic regionalization of ecoregions (Hargrove and Hoffman 1999, 2004) relied on clustering the values of environmental variables and did not utilize a pattern-based approach.

Our method, when extended to handle multiple environmental factors, can offer a route to pattern-based, algorithmic delineation of ecoregions. It has been shown to work well with land cover (this paper) and with topography (Jasiewicz and Stepinski 2013b, Jasiewicz et al. 2014). It can be applied in an analogous manner to other single environmental factors. Future research will address the issue of how to segment and cluster motifels on the basis of a set of dissimilarity values corresponding to various factors. To show the potential of our method for future delineation of ecoregions, Fig. 7 shows a comparison between our map of LPTs (nine-partitioning) and the Level III map of ecoregions (Omernik and Griffith 2014) within the state of New Mexico. There are five LPTs and eight ecoregions present. The two top panels show (A) the boundaries of the LPTs superimposed on the NLCD2011 and (B) the actual LPTs. The two bottom panels show the same for ecoregions.

To assess quantitatively the degree of spatial association between LPTs and ecoregions we use an information-theoretic index called the V-measure (Rosenberg and Hirschberg 2007). The V-measure uses two criteria: homogeneity (V_H) and completeness (V_C). An association fully satisfies the homogeneity criterion if each LPT contains only locations belonging to a single ecoregion. An association fully satisfies the completeness criterion if each ecoregion contains only locations belonging to a single LPT. Both V_H and V_C range between 0 and 1. For the maps shown in Fig. 7, V_H = 0.33 and V_C = 0.53. The interpretation of these values is as follows. On average, the homogeneity of ecoregions in a single LPT is increased over their homogeneity in the entire site (New Mexico) by 33%. On average, the homogeneity of LPTs in a single ecoregion is increased over their homogeneity in the entire site by 53%. The relatively low value of V_H arises because each LPT is found in two or more ecoregions. For example, LPT #8 (evergreen forest mosaic) is found in the Southern Rockies and Arizona/New Mexico Mountains ecoregions; they may differ in some other environmental factors but they have the same pattern of land cover. The relatively high value of V_C arises because each ecoregion consists predominantly of a single LPT. For example, the Chihuahuan Desert ecoregion consists of LPT #9 (shrub matrix). This is exactly as we would expect: several ecoregions may be characterized by the same pattern of land cover but differ in other factors. The final V-measure is the harmonic mean of V_H and V_C and is equal to 0.41. Thus, at least within the area shown in Fig. 7, there is a relatively high spatial association between LPTs and ecoregions. Inclusion of additional environmental factors in our regionalization algorithm will further increase the association between delineated units and ecoregions.
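For readers who want to reproduce this kind of comparison, the following is a minimal sketch of the V-measure following the conditional-entropy definition of Rosenberg and Hirschberg (2007), with LPTs treated as the clusters and ecoregions as the classes. The function name and the input layout (two equal-length lists of labels, one pair per map cell) are assumptions made for illustration; this is not the code used to obtain the values reported above.

```python
import math
from collections import Counter


def v_measure(lpt_labels, eco_labels):
    """Return (homogeneity, completeness, V) for two labelings of the same cells."""
    if len(lpt_labels) != len(eco_labels) or not lpt_labels:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(lpt_labels)
    joint = Counter(zip(lpt_labels, eco_labels))  # cells per (LPT, ecoregion) pair
    lpt_counts = Counter(lpt_labels)
    eco_counts = Counter(eco_labels)

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_eco = entropy(eco_counts)
    h_lpt = entropy(lpt_counts)

    # Conditional entropies H(ecoregion | LPT) and H(LPT | ecoregion).
    h_eco_given_lpt = -sum(
        (c / n) * math.log(c / lpt_counts[lpt]) for (lpt, _), c in joint.items()
    )
    h_lpt_given_eco = -sum(
        (c / n) * math.log(c / eco_counts[eco]) for (_, eco), c in joint.items()
    )

    # Homogeneity: each LPT should contain cells of a single ecoregion.
    homogeneity = 1.0 if h_eco == 0 else 1.0 - h_eco_given_lpt / h_eco
    # Completeness: each ecoregion should contain cells of a single LPT.
    completeness = 1.0 if h_lpt == 0 else 1.0 - h_lpt_given_eco / h_lpt

    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    v = 2.0 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v


# Toy example: one LPT spans two ecoregions, so completeness is perfect
# but homogeneity is not.
lpts = ["shrub", "shrub", "shrub", "shrub", "forest", "forest"]
ecos = ["desert", "desert", "plateau", "plateau", "rockies", "rockies"]
print(v_measure(lpts, ecos))
```

As a check on the arithmetic in the text, the harmonic mean of 0.33 and 0.53 is 2(0.33)(0.53)/(0.33 + 0.53) ≈ 0.41.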

Finally, note that although in this paper we concentrated on unsupervised analysis, the COBIA framework also supports supervised analysis (Jasiewicz et al. 2014, 2015). In this way our method can also find application in conservation planning. For instance, it could be used for identifying unprotected areas which are suitable for protection (Andrew et al. 2012) by identifying unprotected areas with patterns of land cover very similar to those already protected. In the same way it can be used to identify areas suitable for long-term monitoring to establish potential benefits of protection (Cardille et al. 2012). In ecology, it could be helpful in searching for control sites for areas that will experience treatment (for example, road construction or forest harvesting) (Dilts et al. 2010).
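The search for unprotected areas that resemble protected ones can be sketched as a nearest-pattern query using the Jensen-Shannon divergence, the dissimilarity function used in this paper. The sketch below is only an illustration: the function names, the toy three-bin histograms, and the max_divergence threshold are assumptions, and it does not reproduce the COBIA or GeoPAT implementations.

```python
import math


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base-2 logs, range 0 to 1) between two histograms."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("histograms must have positive total mass")

    # Normalize both histograms to probability distributions.
    p = [x / sum_p for x in p]
    q = [x / sum_q for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; empty bins of a contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def similar_to_protected(candidates, protected, max_divergence=0.1):
    """Return (id, divergence) for candidates close to any protected pattern.

    candidates maps a motifel id to its pattern histogram; protected is a list
    of histograms for already protected motifels. max_divergence is a
    hypothetical cut-off chosen for illustration.
    """
    hits = []
    for motifel_id, hist in candidates.items():
        best = min(jensen_shannon_divergence(hist, ref) for ref in protected)
        if best <= max_divergence:
            hits.append((motifel_id, best))
    return sorted(hits, key=lambda hit: hit[1])


# Toy histograms standing in for co-occurrence histograms of land cover classes.
protected = [[7, 2, 1]]
candidates = {"u1": [8, 1, 1], "u2": [1, 1, 8]}
print(similar_to_protected(candidates, protected))
```

In this toy case only "u1" passes the assumed threshold; in practice the cut-off would have to be chosen for the application at hand.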



Acknowledgments

This work was supported by the University of Cincinnati Space Exploration Institute, the National Aeronautics and Space Administration under grant NNX15AJ47G, and by the National Science Center (NCN) under grant DEC-2012/07/B/ST6/01206.

References

Andrew, M.E., Wulder, M.A., and Coops, N.C., 2012. Identification of de facto protected areas in boreal Canada. Biological Conservation, 146 (1), 97–107.

Barnsley, M.J. and Barr, S.L., 1996. Inferring urban land use from satellite sensor images using kernel-based spatial reclassification. Photogrammetric Engineering and Remote Sensing, 62 (8), 949–958.

Blaschke, T., 2010. Object based image analysis for remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing, 65 (1), 2–16.

Cain, D.H., Riitters, K., and Orvis, K., 1997. A multi-scale analysis of landscape statistics. Landscape Ecology, 12 (630), 199–212.

Cardille, J.A. and Lambois, M., 2010. From the redwood forest to the Gulf Stream waters: Human signature nearly ubiquitous in representative US landscapes. Frontiers in Ecology and the Environment, 8 (3), 130–134.

Cardille, J.A., et al., 2012. Representative landscapes in the forested area of Canada. Environmental Management, 49 (1), 163–173.

Cha, S.H., 2007. Comprehensive Survey on Distance / Similarity Measures between Probability Density Functions. International Journal of Mathematical Models and Methods in Applied Sciences, 1 (4), 300–307.

Chai, H., et al., 2009. Digital regionalization of geomorphology in Xinjiang. Journal of Geographical Sciences, 19 (5), 600–614.

Chang, P. and Krumm, J., 1999. Object recognition with color cooccurrence histograms. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2. Fort Collins, CO: IEEE, 498–504.

Chen, J., et al., 2015. Global land cover mapping at 30m resolution: A POK-based operational approach. ISPRS Journal of Photogrammetry and Remote Sensing, 103, 7–27.

Chorley, R.J. and Haggett, P., 1967. Models in geography. Vol. 2. London: Methuen; distributed in the U.S.A. by Barnes & Noble.

Datta, R., et al., 2008. Image Retrieval: Ideas, Influences, and Trends of the New Age. ACM Computing Surveys, 40, 1–60.

Davies, D.L. and Bouldin, D.W., 1979. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1 (2), 224–227.

Dilts, T.E., Yang, J., and Weisberg, P.J., 2010. The Landscape Similarity Toolbox: New tools for optimizing the location of control sites in experimental studies. Ecography, 33 (6), 1097–1101.

Dunn, J.C., 1973. A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters. Journal of Cybernetics, 3 (3), 32–57.

Endres, D. and Schindelin, J., 2003. A new metric for probability distributions. IEEE Transactions on Information Theory, 49 (7), 1858–1860.

Fenneman, N.M., 1915. Physiographic boundaries within the United States. Annals of the Association of American Geographers, 4, 84–134.



Fovell, R.G. and Fovell, M.Y.C., 1993. Climate zones of the conterminous United States defined using cluster analysis. Journal of Climate, 6 (11), 2103–2135.

George, J.A., Lamar, B.W., and Wallace, C.A., 1997. Political district determination using large-scale network optimization. Socio-Economic Planning Sciences, 31 (1), 11–28.

Gevers, T. and Smeulders, A.W., 2004. Content-based image retrieval: An overview. In: G.M.S.B. Kang, ed. Emerging Topics in Computer Vision. Upper Saddle River, NJ: Prentice-Hall, chap. 8, 333–384.

Gustafson, E.J. and Parker, G.R., 1992. Relationships between landcover proportion and indices of landscape spatial pattern. Landscape Ecology, 7 (2), 101–110.

Haines-Young, R. and Chopping, M., 1996. Quantifying landscape structure: a review of landscape indices and their application to forested landscapes. Progress in Physical Geography, 20 (4), 418–445.

Hammond, E.H., 1964. Analysis of properties in land form geography: an application to broad-scale land form mapping. Annals of the Association of American Geographers, 54 (1), 11–19.

Hargrove, W.W. and Hoffman, F.M., 1999. Using multivariate clustering to characterize ecoregion borders. Computing in Science & Engineering, 1 (4), 18–25.

Hargrove, W.W. and Hoffman, F.M., 2004. Potential of Multivariate Quantitative Methods for Delineation and Visualization of Ecoregions. Environmental Management, 34 (S1), S39–S60.

Hay, G.J. and Castilla, G., 2008. Geographic Object-Based Image Analysis (GEOBIA): A new name for a new discipline. Lecture Notes in Geoinformation and Cartography, In: T. Blaschke, S. Lang and G. Hay, eds. Object-Based Image Analysis SE - 4. Springer Berlin Heidelberg, 75–89.

Herzog, F. and Lausch, A., 2001. Supplementing land-use statistics with landscape metrics: some methodological considerations. Environmental Monitoring and Assessment, 72, 37–50.

Hollander, A.D., 2012. Using GRASS and R for Landscape Regionalization through PAM Cluster Analysis. OSGeo Journal, 10 (6).

Jasiewicz, J., Netzel, P., and Stepinski, T.F., 2014. Landscape similarity, retrieval, and machine mapping of physiographic units. Geomorphology, 221, 104–112.

Jasiewicz, J. and Stepinski, T.F., 2013a. Example-based retrieval of alike land-cover scenes from NLCD2006 database. IEEE Geoscience and Remote Sensing Letters, 10 (1), 155–159.

Jasiewicz, J. and Stepinski, T.F., 2013b. Geomorphons - a pattern recognition approach to classification and mapping of landforms. Geomorphology, 182 (15), 147–156.

Jasiewicz, J., Netzel, P., and Stepinski, T., 2015. GeoPAT: A toolbox for pattern-based information retrieval from large geospatial databases. Computers & Geosciences, 80, 62–73.

Jin, S., et al., 2013. A comprehensive change detection method for updating the National Land Cover Database to circa 2011. Remote Sensing of Environment, 132, 159–175.

Johnston, R.J., 1970. Grouping and Regionalizing: Some Methodological and Technical Observations. Economic Geography, 46, 293–305.

Kreft, H. and Jetz, W., 2010. A framework for delineating biogeographical regions based on species distributions. Journal of Biogeography, 37 (11), 2029–2053.

Lark, R., 1998. Forming spatially coherent regions by classification of multi-variate data: an example from the analysis of maps of crop yield. International Journal of Geographical Information Science, 12 (1), 83–98.



Levine, M.D. and Shaheen, S.I., 1981. A modular computer vision system for picture segmentation and interpretation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 3 (5), 540–556.

Lin, J., 1991. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37 (1), 145–151.

Long, J., Nelson, T., and Wulder, M., 2010. Regionalization of landscape pattern indices using multivariate cluster analysis. Environmental Management, 46 (1), 134–142.

McGarigal, K., et al., 2002. FRAGSTATS: Spatial Pattern Analysis Program for Categorical Maps. Technical report, The University of Massachusetts, Amherst, MA, USA.

Neteler, M., et al., 2012. GRASS GIS: A multi-purpose open source GIS. Environmental Modelling and Software, 31, 124–130.

Niesterowicz, J. and Stepinski, T.F., 2013. Regionalization of multi-categorical landscapes using machine vision methods. Applied Geography, 45, 250–258.

Omernik, J.M. and Griffith, G.E., 2014. Ecoregions of the conterminous United States: evolution of a hierarchical spatial framework. Environmental Management, 54 (6), 1249–1266.

Partington, K. and Cardille, J., 2013. Uncovering Dominant Land-Cover Patterns of Quebec: Representative Landscapes, Spatial Clusters, and Fences. Land, 2, 756–773.

Patten, M.A. and Smith-Patten, B.D., 2008. Biogeographical boundaries and Monmonier’s algorithm: A case study in the northern Neotropics. Journal of Biogeography, 35 (3), 407–416.

Peterson, H.M., Nieber, J.L., and Kanivetsky, R., 2011. Hydrologic regionalization to assess anthropogenic changes. Journal of Hydrology, 408 (3), 212–225.

Powers, R.P., et al., 2012. A remote sensing approach to biodiversity assessment and regionalization of the Canadian boreal forest. Progress in Physical Geography, 37 (1), 36–62.

R Core Team, 2015. R: A language and environment for statistical computing. Vienna, Austria. Technical report, http://www.r-project.org.

Rosenberg, A. and Hirschberg, J., 2007. V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure. In: Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 410–420.

Rousseeuw, P.J., 1987. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.

Rubner, Y., et al., 2001. Empirical Evaluation of Dissimilarity Measures for Color and Texture. Computer Vision and Image Understanding, 84 (1), 25–43.

Salvador, S. and Chan, P., 2004. Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms. In: Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence, 576–584.

Stepinski, T.F., Netzel, P., and Jasiewicz, J., 2014. LandEx - A geoweb tool for query and retrieval of spatial patterns in land cover datasets. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7 (1), 257–266.

Stepinski, T.F., Niesterowicz, J., and Jasiewicz, J., 2015. Pattern-Based Regionalization of Large Geospatial Datasets Using Complex Object-Based Image Analysis. In: Procedia Computer Science 51, 2168–2177.

Stooksbury, D.E. and Michaels, P.J., 1991. Cluster analysis of Southeastern U.S. climate stations. Theoretical and Applied Climatology, 44 (3-4), 143–150.

Vatsavai, R.R., 2013. Gaussian multiple instance learning approach for mapping the slums of the world using very high resolution imagery. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 1419–1426.

Ward, J.H., 1963. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58 (301), 236–244.

Werlen, B., 2009. Regionalisations, Everyday. In: N.J. Thrift and R. Kitchin, eds. International Encyclopedia of Human Geography. Amsterdam: Elsevier, 286–293.

Wickham, J.D. and Norton, D.J., 1994. Mapping and analyzing landscape patterns. Landscape Ecology, 9 (1), 7–23.

Wilkinson, L. and Friendly, M., 2009. The history of the cluster heat map. The American Statistician, 63 (2), 179–184.

Wulder, M.A., et al., 2008. Monitoring Canada's forests. Part 1: Completion of the EOSD land cover project. Canadian Journal of Remote Sensing, 34 (6), 549–562.
