chapter 1 raster databases - pdfs.semanticscholar.org · the techniques to store raster images...


Chapter 1

Raster Databases

A Geographic Information System (GIS) is a system designed to store, edit, share and display geographic information for statistical analysis or decision making. It can be viewed as a merger of cartography, statistical analysis, and database technology. The data stored in a GIS use spatio-temporal (space-time) location as the key index variable. A GIS can relate otherwise unrelated information by using location and/or extent in space-time as the key index variable, in the same way that a relational database containing text or numbers can relate many different tables using common key index variables. Traditionally, there are two broad methods used to store data in a GIS: raster images and vector data. In this chapter we discuss the techniques used to store raster images (i.e. raster databases) and their applications in the field of GIS.

1.1 Raster Data

A raster image consists of rows and columns of cells, called pixels, organized in a rectangular grid. It can be viewed on a computer monitor or other display medium, or printed on paper (Fig. 1.1). Each cell/pixel of the grid stores a single color value and is the basic building block of any raster image. The resolution of a raster image depends on the number of pixels it contains and is generally written as the number of pixels per row × the number of pixels per column of the grid. For example, a resolution of 800 × 600 denotes that the raster image contains 600 rows of 800 pixels each.
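The grid layout described above can be sketched in a few lines of Python. This is only an illustration: the names `WIDTH`, `HEIGHT`, `raster` and `flat_index` are hypothetical, and the row-major flat index is one common convention rather than something prescribed by the chapter.

```python
# A raster held as a list of rows; each cell stores one value.
WIDTH, HEIGHT = 800, 600  # "800 x 600": 800 pixels per row, 600 rows

raster = [[0] * WIDTH for _ in range(HEIGHT)]


def flat_index(row, col, width=WIDTH):
    """Position of cell (row, col) when rows are laid out one after another."""
    return row * width + col


# Cells are addressed as raster[row][col].
raster[10][20] = 255

rows, cols = len(raster), len(raster[0])
print(rows, cols, rows * cols)
```

Addressing cells as `raster[row][col]` keeps the row count (600) as the outer dimension, which matches the reading of 800 × 600 given above.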

Figure 1.1: A raster image. Each cell is a pixel.

There are various approaches used to store raster data. These include standard file-based structures such as TIF and JPEG, as well as binary large object (BLOB) data stored directly in a relational database management system (RDBMS). Each approach has its advantages and disadvantages. On the one hand, properly indexed database storage typically allows quicker retrieval of the raster data, but it can require storing millions of sizeable records. On the other hand, standard file-based structures provide good compression and require less storage space, but the data are difficult to index, so retrieval is slower.
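As a minimal sketch of the BLOB approach, the following stores a small raster in a database table and reads it back. It uses SQLite from the Python standard library as a stand-in for a full RDBMS; the table name `raster_tile`, its columns, and the byte layout (two 32-bit integers for rows and columns followed by one byte per cell) are assumptions made for illustration only.

```python
import sqlite3
import struct


def encode_raster(grid):
    """Pack a grid of 0..255 values as: rows, cols, then one byte per cell."""
    rows, cols = len(grid), len(grid[0])
    flat = [value for row in grid for value in row]
    return struct.pack(f"<II{rows * cols}B", rows, cols, *flat)


def decode_raster(blob):
    """Inverse of encode_raster."""
    rows, cols = struct.unpack_from("<II", blob, 0)
    flat = struct.unpack_from(f"<{rows * cols}B", blob, 8)
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raster_tile ("
    "tile_id INTEGER PRIMARY KEY, name TEXT, data BLOB)"
)
# An index on a descriptive column is what makes lookups fast.
conn.execute("CREATE INDEX idx_tile_name ON raster_tile(name)")

tile = [[2, 8, 5, 1], [3, 9, 1, 2]]
conn.execute(
    "INSERT INTO raster_tile (name, data) VALUES (?, ?)",
    ("demo", encode_raster(tile)),
)

(blob,) = conn.execute(
    "SELECT data FROM raster_tile WHERE name = ?", ("demo",)
).fetchone()
restored = decode_raster(blob)
print(restored)
```

The uncompressed encoding also illustrates the trade-off discussed above: lookup by an indexed column is simple, but every cell occupies storage unless a compression step is added.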

1.2 Raster data in GIS

Raster databases are used extensively in the field of GIS. Raster maps, such as single-attribute maps, are inherently well suited to mathematical modeling and quantitative analysis. The data storage techniques used for raster data are also usually easy to program and give good retrieval performance. Traditional GIS data models mainly deal with spatial data that emphasize static representations of reality. One commonly used form of raster data in GIS is the aerial photograph of an area. The primary purpose of such raster data is to display a detailed image of a map area or to render its identifiable objects by digitization. Depending on the application, a GIS uses additional raster data sets that contain information on elevation (a digital elevation model), on the reflectance of a particular wavelength of light (e.g. Landsat imagery), or on other indicators from the electromagnetic spectrum, for analyses such as vegetation indices and surface temperature monitoring.

1.2.1 Spatio-temporal Data

Spatio-temporal data have become crucial for understanding cause-and-effect scenarios and for developing dynamic models to analyze them. A temporal GIS aims to process, manage, and analyze spatio-temporal data. Considerable research effort has been devoted in the last decade to temporal databases and temporal query languages [1]. One approach to incorporating temporal information into GIS spatial data models is to time-stamp layers (the snapshot model) [2]. This model shows the states of a geographic distribution at different time stamps without any explicit temporal relations among layers (Fig. 1.2). Every layer in the snapshot model shows the state of the geographic distribution at one time stamp. The time interval between any two layers may vary, and the model carries no explicit information about changes that occur within the time lag between two layers.

Figure 1.2: An example of snapshot model.
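The snapshot model can be sketched as a time-ordered collection of independent layers. The class and method names below are hypothetical, and answering a query for an arbitrary time with the most recent earlier snapshot is an assumption of this sketch: the model itself says nothing about what happens between two layers.

```python
from bisect import bisect_right


class SnapshotModel:
    """Time-stamped raster layers with no explicit relations among them."""

    def __init__(self):
        self._times = []   # sorted time stamps
        self._layers = []  # layer i belongs to time stamp i

    def add_layer(self, timestamp, grid):
        """Insert a layer, keeping the layers ordered by time stamp."""
        i = bisect_right(self._times, timestamp)
        self._times.insert(i, timestamp)
        self._layers.insert(i, grid)

    def state_at(self, timestamp):
        """Latest snapshot taken at or before timestamp, or None."""
        i = bisect_right(self._times, timestamp)
        return self._layers[i - 1] if i else None


model = SnapshotModel()
model.add_layer(2000, [[1, 1], [0, 1]])
model.add_layer(1990, [[0, 1], [0, 0]])  # intervals between layers may vary

print(model.state_at(1995))
print(model.state_at(1980))
```

Because each layer is stored whole, any change between 1990 and 2000 can only be inferred by comparing the two grids cell by cell.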

Many commercially available DBMS tools have extensions for handling spatio-temporal data. For example, Oracle Spatial has an extension called GeoRaster [4] that can store, index, query, analyze, and deliver raster image and gridded data together with their associated metadata. GeoRaster uses a generic raster data model that is component-based, logically layered, and multidimensional. The core data in a raster is a multidimensional matrix of raster cells. Each cell is one element of the matrix, and its value is called the cell value.

Spatio-temporal raster data exhibit non-stationarity because each layer represents static information at one time stamp and is not updated with time. As a result, many statistical interpolation techniques used for prediction, which assume properties such as isotropic spatial autocorrelation and a stationary trend, cannot be applied directly to spatio-temporal raster data. Some work has been done on building mathematical models for spatio-temporal raster data using fuzzy logic [3].

1.2.2 Field Operations

Various sensors, such as satellite-based sensors and weather monitoring sensors, periodically provide abundant field data. These field data give the most up-to-date information about current events such as fires, floods and rain. They also contain information that can support various services. For example, a set of aerial photographs contains roads, water bodies, buildings, trees etc., which, when converted into a vector representation, can be used for planning and monitoring transportation systems, land cover analysis, vegetation analysis etc. Field data are an essential part of a GIS for creating and updating digital archives of fragile historical paper maps, for mapping new locations, and for validating the available data sets (raster or vector).

Field data can be manipulated using either map algebra or image algebra to retrieve information. In general terms, an algebra is a mathematical structure consisting of operands and operations. Although both map algebra and image algebra take raster data as operands, they differ in their operations. Image algebra mainly deals with image properties such as color information, pixel size, number of pixels and height/width of the image, while map algebra deals with attribute maps such as temperature maps and vegetation maps.

Map algebra

In map algebra, raster data (attribute maps) are the operands, and the operations can be classified into four groups: local, focal, zonal and global.

• Local operation: A local operation converts one raster into another such that the value of a cell in the new raster is computed using only the value of that cell in the original raster. Examples of these operations are thresholding and point-wise addition (Fig. 1.3).

Figure 1.3: An example of thresholding.

• Focal operation: A focal operation maps one raster into a new raster such that the value of a cell in the new raster is computed using the values of that cell and its neighboring cells in the original raster. Examples of these operations are focal sum and gradient (Fig. 1.4).

Figure 1.4: An example of focal operation. (a) Rook neighborhood. (b) Bishop neighborhood. (c) Queen neighborhood. (d) Focal sum using queen neighborhood.


• Zonal operation: A zonal operation converts one raster into a new one such that the value of a cell in the new raster is a function of the value of that cell in the original raster and the values of the other cells that appear in the same zone, where the zones are specified in another raster. Examples of these operations are zonal sum and zonal average.

• Global operation: A global operation computes the value of a cell in the new raster as a function of the locations or values of all cells in the original raster. An example of this kind of operation is the distance from the nearest facility.
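The four groups of map algebra operations can be sketched on small grids as follows. All function names and sample grids are hypothetical and are not the values shown in the figures. The sketch also makes several assumptions: thresholding maps values above the threshold to 1, the focal sum uses a queen neighborhood that includes the cell itself and is clipped at the raster edge, the zonal example is a zonal sum, and the global example measures Manhattan distance in cells to the nearest nonzero ("facility") cell.

```python
def local_threshold(grid, threshold):
    """Local: each output cell depends only on the same input cell."""
    return [[1 if v > threshold else 0 for v in row] for row in grid]


def focal_sum(grid):
    """Focal: sum of a cell and its queen neighbors, clipped at the edges."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            sum(
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def zonal_sum(grid, zones):
    """Zonal: each cell receives the sum of all cells in its zone."""
    totals = {}
    for row, zone_row in zip(grid, zones):
        for value, zone in zip(row, zone_row):
            totals[zone] = totals.get(zone, 0) + value
    return [[totals[zone] for zone in zone_row] for zone_row in zones]


def global_distance(facilities):
    """Global: distance from each cell to the nearest nonzero cell."""
    sites = [
        (r, c)
        for r, row in enumerate(facilities)
        for c, v in enumerate(row)
        if v
    ]
    return [
        [min(abs(r - fr) + abs(c - fc) for fr, fc in sites)
         for c in range(len(facilities[0]))]
        for r in range(len(facilities))
    ]


grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
zones = [[1, 1, 2], [1, 2, 2], [3, 3, 3]]
facilities = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]

print(local_threshold(grid, 4))
print(focal_sum(grid))
print(zonal_sum(grid, zones))
print(global_distance(facilities))
```

The amount of input each output cell needs grows from one cell (local) to a window (focal), a zone (zonal) and the whole raster (global), which is why the four groups differ in cost.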

Image algebra

In image algebra, images are the operands and the operations relate to images. Image operations ignore the absolute location of pixels. For example, a trim/crop operation creates a new raster image by extracting a user-defined, axis-aligned subset from the original raster image (Fig. 1.5). Most of these operations come from the image processing literature and are used to display or render the image for manual analysis or demonstration purposes. Some important operations in this category are zoom, rotate, smoothing, low-pass filter and high-pass filter.

Figure 1.5: An example of trim/crop operation.
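A trim/crop operation can be sketched as plain list slicing. The function name `crop`, its parameters and the sample grid are hypothetical and do not reproduce the values in the figure.

```python
def crop(grid, top, left, height, width):
    """Return the axis-aligned window of the given size starting at (top, left)."""
    if (top < 0 or left < 0
            or top + height > len(grid)
            or left + width > len(grid[0])):
        raise ValueError("crop window lies outside the raster")
    return [row[left:left + width] for row in grid[top:top + height]]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

window = crop(image, top=1, left=1, height=2, width=2)
print(window)
```

The result is addressed from (0, 0) again, which reflects the point made above: the operation works on pixel positions within the image and ignores absolute geographic location.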

1.2.3 Storage

There are various approaches to storing raster data, and the choice depends on the application. The traditional approach is to store raster data in the file system of the operating system and to use custom software to retrieve the data items of interest. For example, a user can store map images on a personal computer (running MS Windows or Linux) and look for routes manually using image processing software. Another example is a user's personal photographs stored on a personal computer. This approach has several limitations, such as a limited ability to add or manage additional attributes and support for only very limited queries, which make it unsuitable for more advanced GIS applications. The other approach is the database approach, which stores the attributes of raster data items, such as geo-location, time stamp and various other properties, in database tables. A database query language such as SQL can then be used to retrieve the data items of interest. The table schema definition also allows the user to add attributes, which improves the ability to pose ad hoc queries.
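The database approach can be sketched with SQLite from the Python standard library. The table `raster_item`, its columns, the file paths and the coordinates are all hypothetical; the raster files themselves are assumed to stay in the file system, with only their descriptive attributes stored in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE raster_item (
        item_id     INTEGER PRIMARY KEY,
        file_path   TEXT NOT NULL,
        min_lon     REAL, min_lat REAL,
        max_lon     REAL, max_lat REAL,
        captured_on TEXT,   -- ISO date, so text comparison orders by time
        subject     TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO raster_item "
    "(file_path, min_lon, min_lat, max_lon, max_lat, captured_on, subject) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("tiles/a.tif", -94.0, 44.0, -93.0, 45.0, "2004-06-01", "aerial"),
        ("tiles/b.tif", -94.0, 44.0, -93.0, 45.0, "2006-07-15", "aerial"),
        ("tiles/c.tif", -80.0, 40.0, -79.0, 41.0, "2007-01-10", "elevation"),
    ],
)

# The schema can be extended later with a new descriptive attribute.
conn.execute("ALTER TABLE raster_item ADD COLUMN cloud_cover REAL")

# Ad hoc query: items covering a point and captured on or after a date.
lon, lat = -93.5, 44.5
hits = [
    row[0]
    for row in conn.execute(
        "SELECT file_path FROM raster_item "
        "WHERE min_lon <= ? AND max_lon >= ? "
        "AND min_lat <= ? AND max_lat >= ? "
        "AND captured_on >= ? "
        "ORDER BY captured_on",
        (lon, lon, lat, lat, "2005-01-01"),
    )
]
print(hits)
```

Only attributes that were recorded in the table can appear in a query, so the usefulness of this approach depends on choosing the descriptive columns well.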

1.2.4 Retrieval Technique

Raster data sets are very rich in content because they record a value for every point in the area covered. This rich content makes retrieving the items of interest a challenge. Data items can be retrieved from raster data using either the meta-data approach (database approach) or a content-based retrieval technique (image processing technique). The meta-data approach uses simple SQL data types such as numeric, string and date in table schemas and queries to select on a set of descriptive attributes such as location, time stamp and subject. It stores the values of the descriptive attributes for each raster data item and allows SQL queries only on these stored attributes, which is one limitation of the approach. Another limitation is that it does not support “similarity” based queries, for example a query to find all raster data items similar to a given raster data item.

These “similarity” based queries can be answered using content-based retrieval, known in the image processing field as content-based image retrieval (CBIR). CBIR has become one of the most active research areas in the past few years. In this approach, the content of an image is represented by extracted primitive visual features such as color, shape and texture, and similar-image queries are answered based on some combination of these primitive features. CBIR is a two-step approach to searching a database of raster images for the data items of interest. The first step is to compute a feature vector or attribute relation graph (ARG) for each image in the database. The second step, given a query image, is to compute its ARG and compare it with the ARGs in the database to find the image most similar to the query image. For this approach to succeed, the features and the similarity measure used to compare two ARGs should be efficient. An additional approach to similarity-based queries, based on the cone tree, is proposed in [5, 6, 7].
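The two CBIR steps can be sketched with a deliberately simple feature. The sketch assumes a normalized gray-level histogram as the feature vector and Euclidean distance as the similarity measure; it does not build attribute relation graphs, and all names and sample images are hypothetical.

```python
import math


def histogram(grid, bins=4, max_value=256):
    """Feature vector: fraction of cells falling into each value bin."""
    counts = [0] * bins
    total = 0
    for row in grid:
        for value in row:
            counts[value * bins // max_value] += 1
            total += 1
    return [count / total for count in counts]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(images):
    """Step 1: compute and store a feature vector for every image."""
    return {name: histogram(grid) for name, grid in images.items()}


def most_similar(index, query_grid):
    """Step 2: compute the query's features and find the closest image."""
    query = histogram(query_grid)
    return min(index, key=lambda name: distance(index[name], query))


images = {
    "dark": [[10, 20], [30, 40]],
    "bright": [[200, 220], [240, 250]],
    "mixed": [[10, 250], [20, 240]],
}
index = build_index(images)

best = most_similar(index, [[0, 15], [35, 60]])
print(best)
```

The linear scan in `most_similar` compares the query against every stored vector; for large collections an index structure over the feature vectors would be needed to keep the second step efficient.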

1.3 Concluding Remarks

Despite their computational advantages and simplicity of representation, raster databases have certain disadvantages. The resolution of the data is determined by the cell size, and, depending on the cell resolution, it can be especially difficult to represent linear features properly. Raster maps also contain limited information, because each map inherently reflects only one attribute or characteristic of an area. In addition, processing the associated attribute data may be cumbersome in a large database because of the structure of the data storage. Since most input data are in vector form, they must undergo vector-to-raster conversion, which increases processing requirements and may introduce data integrity concerns due to generalization and the choice of an inappropriate cell size.
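The effect of cell size on linear features can be illustrated with a very small vector-to-raster sketch. It assumes each linear feature is given as sampled points and simply marks the cell containing each point; the function name, the two parallel "roads" and the cell sizes are hypothetical.

```python
import math


def rasterize_points(points, cell_size):
    """Set of (row, col) cells that contain at least one of the points."""
    return {
        (math.floor(y / cell_size), math.floor(x / cell_size))
        for x, y in points
    }


# Two parallel linear features, 5 units apart.
road_a = [(x, 2.0) for x in range(0, 40, 2)]
road_b = [(x, 7.0) for x in range(0, 40, 2)]

fine_a = rasterize_points(road_a, cell_size=5)
fine_b = rasterize_points(road_b, cell_size=5)
coarse_a = rasterize_points(road_a, cell_size=10)
coarse_b = rasterize_points(road_b, cell_size=10)

print(fine_a.isdisjoint(fine_b))  # the roads occupy different cells
print(coarse_a == coarse_b)       # the roads collapse into the same cells
```

With the coarser cell size the two features become indistinguishable, which is the kind of generalization error that an inappropriate cell size can introduce.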


Bibliography

[1] Tansel, A. U., Clifford, J., Gadia, S., Jajodia, S., Segev, A., and Snodgrass, R., “Temporal Databases: Theory, Design, and Implementation,” Reading, MA: The Benjamin/Cummings Publishing Company, Inc., ed., 1993.

[2] Armstrong, M. P., “Temporality in Spatial Databases,” in Proceedings of GIS/LIS, 1988, ed. 2, pp. 880-889.

[3] Shafran-Natan, R. and Svoray, T., “Solving Spatio-temporal Non-stationarity in Raster Database with Fuzzy Logic,” in ISPA Workshops, 2006, pp. 603-609.

[4] Oracle Spatial User’s Guide and Reference.

[5] Zhang, P., Shekhar, S., Huang, Y., and Kumar, V., “Spatial Cone Tree: An Index Structure for Correlation-based Similarity Queries on Spatial Time Series,” in Proceedings of the International Workshop on Next Generation Geospatial Information, Boston, MA, Oct. 2003.

[6] Zhang, P., Shekhar, S., Huang, Y., and Kumar, V., “Exploiting Spatial Autocorrelation to Efficiently Process Correlation-Based Similarity Queries,” in Proceedings of the 8th Symposium on Spatial and Temporal Databases, Santorini island, Greece, July 2003.

[7] Zhang, P., Shekhar, S., Huang, Y., and Kumar, V., “Correlation Analysis of Spatial Time Series Datasets: A Filter-and-Refine Approach,” in Proceedings of the 7th Pacific-Asia Knowledge Discovery and Data Mining, Seoul, Korea, 2003.
