
The Challenges of Visualizing And Modeling Environmental Data

Yingcai Xiao Computer Science Division, Department of Mathematics

University of Akron, Akron, OH 44325-4002

John P. Ziebarth National Center for Supercomputing Applications

CAB 261, 605 E. Springfield, Champaign, IL 61820

Chuck Woodbury, Eric Bayer Intergraph Corporation

Huntsville, AL 35894-0001

Bruce Rundell U.S. EPA

3HW41, 841 Chestnut Bldg, Philadelphia, PA 19107

Jeroen van der Zijp CFD Research Corporation

3325-D Triana Blvd SW, Huntsville, AL 35805

ABSTRACT

Existing volume visualization techniques are typically applied to a three-dimensional grid. This presents some challenging problems in the visualization of environmental data. These data often consist of unevenly distributed samples. Typically a two-step approach is used to visualize environmental data. First, the unevenly distributed sample data are modeled onto a uniform 3-D grid. This grid model is subsequently rendered using conventional grid-based visualization techniques. This paper discusses some of the limitations of this approach and highlights areas where further research is needed to improve the accuracy of visualization for environmental applications.

INTRODUCTION

Visualization techniques have great practical applications in environmental studies. For example, remedy selections at hazardous waste sites, particularly where soils pose a risk, rely heavily on an understanding of the distribution and volume of contaminants that need to be addressed. Visualization techniques are very valuable for making such assessments and for designing cost-effective remedies.

0-7803-3707-7/96 $4.00 © 1996 IEEE

In order to characterize a site, samples are taken across the site. Due to the difficulties and costs of conducting site investigations, particularly in the subsurface, environmental data are usually collected in a scattered manner. Subsurface sample points are typically biased towards known or suspected contaminated areas. It is rare that the sample points form a 3-D grid or are statistically random. Because of their scattered nature, environmental data usually cannot be visualized directly using existing grid-based volume visualization techniques [4].

Typically a two-step approach is used to resolve the problem. First, the actual sample data are modeled to generate a uniform 3-D grid, with interpreted data values computed at every grid node. The grid is then rendered with conventional grid-based visualization techniques. The grid modeling is done by mathematical interpolation [6,9]. Reasonable results can be obtained when sample data are densely and uniformly collected throughout the volume of interest. Since this is not the case with most environmental data, challenging problems are faced when visualizing this type of data. A set of real-world environmental data is used in this study to demonstrate some of these challenges and to highlight some areas where further research is needed.

CASE DESCRIPTION

The data used in this study were collected from a tank-farm where diesel fuel is stored. Leakage from some surface tanks has contaminated the underlying soil. Subsurface data were collected to analyze the extent of the contamination. Sample data measure diesel fuel concentration in parts per billion (ppb). To facilitate the visualization of the sample data, a set of color codes is designed to map concentration values to colors, as shown in Figure 1. The layout of the tank-farm and sample points is depicted in Figure 2. The fuel tanks and related facilities can be seen above the surface. Subsurface sample points are represented as small cubes below the surface. Each small cube is color coded according to the scale in Figure 1 using the measured concentration value at its location. There are 145 sample points in the data set. The largest value in the sample data is 42400 ppb (shown in dark red), the minimum value is 0 ppb (shown in dark green), and the mean value is 2500 ppb (shown in light green). As can be observed from Figure 2, contamination is heaviest beneath the three yellow tanks. The highest value is present beneath the tank in the center. No contamination is detected at depth beneath the site.

The two-step approach is used to further visualize and analyze the contamination. A 50x50x50 grid model was generated with each of three different interpolation methods. The 3-D grid covers the entire minimum bounding volume of the sample points. The interpolation methods used are Shepard [7], Thin-plate Spline [1], and Volume Spline [6]. The resulting interpretive grid data are rendered using Intergraph Corporation's Voxel Analyst on a Gateway P5-133 personal computer. Figure 1 is again used for color coding. The rendered images are shown in Figures 3, 4, and 5. The curved surfaces in the images are iso-surfaces [4] of constant concentration value at 6000 ppb. The regions enclosed by the surfaces are regions of concentration values greater than 6000 ppb. Cut-away displays are generated using multiple slicing planes [8].
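The modeling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the function names, the nearest-sample placeholder interpolant, and the four sample points are all assumptions made for the example, and a small 5x5x5 grid stands in for the 50x50x50 grid.

```python
from itertools import product


def bounding_box(points):
    """Return (mins, maxs) of the axis-aligned bounding volume of the samples."""
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return mins, maxs


def grid_axis(lo, hi, n):
    """Return n evenly spaced node coordinates from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def model_on_grid(points, values, interpolate, n=50):
    """Step 1: evaluate an interpolant at every node of an n x n x n grid."""
    mins, maxs = bounding_box(points)
    axes = [grid_axis(mins[i], maxs[i], n) for i in range(3)]
    return {
        (i, j, k): interpolate((axes[0][i], axes[1][j], axes[2][k]), points, values)
        for i, j, k in product(range(n), repeat=3)
    }


def nearest_value(node, points, values):
    """Hypothetical placeholder interpolant: value of the nearest sample."""
    d2 = [sum((a - b) ** 2 for a, b in zip(node, p)) for p in points]
    return values[d2.index(min(d2))]


# Made-up samples: (x, y, z) locations with concentrations in ppb.
points = [(0.0, 0.0, 0.0), (10.0, 0.0, -2.0), (5.0, 8.0, -6.0), (10.0, 8.0, -6.0)]
values = [0.0, 42400.0, 2500.0, 0.0]

grid = model_on_grid(points, values, nearest_value, n=5)

# Step 2 would render the grid; here we only count nodes above the 6000 ppb iso-value.
above = sum(1 for v in grid.values() if v > 6000.0)
print(len(grid), "nodes,", above, "above 6000 ppb")
```

Any of the three interpolation methods named above could be passed in place of `nearest_value`; the grid construction itself does not depend on the choice.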

LIMITATIONS AND CHALLENGES

Examination of the figures reveals a great deal about the limitations of using the two-step visualization approach for environmental data. The limitations of using mathematical interpolation methods to model non-uniformly spaced sample data are particularly evident. Analyzing the limitations of the currently available technologies brings out the challenges that remain.

(1) Missing Data Values. In the two-step approach, the data being visualized are not the original data but secondary data derived through mathematical interpolation. Information carried by the original sample data can be lost in the interpolation process. Figure 3 shows the grid generated by the Shepard method. The high (in dark red) and the low (in dark green) of the original sample data are missing from the grid data. The Shepard method is basically an inverse distance weighted method. It is a special case of metric interpolation [2]. Gordon and Wixom have shown that data values generated by metric interpolation are bounded by the maximum and minimum values of the original sample data [2]. The interpolated values approach the mean of the original sample data when the interpolated points are far away from the sample points. Unless one of the grid nodes lies exactly on a sample point, the data value carried by that sample point could be missing from the grid generated by the Shepard method. To avoid missing original data values in the modeling process, one has to construct a structured or unstructured grid so that each sample point is exactly on a grid node. Even if this is done, the problems discussed below will still exist.
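The behavior described in (1) can be reproduced with a small inverse distance weighted interpolant. The sketch below is an illustration under stated assumptions: the function name `shepard`, the exponent of 2, and the four sample points are chosen for the example and are not taken from the study.

```python
import math


def shepard(node, points, values, power=2.0):
    """Inverse distance weighted estimate of the data value at node."""
    num = den = 0.0
    for p, v in zip(points, values):
        d = math.dist(node, p)
        if d == 0.0:
            return v  # node coincides with a sample: reproduce it exactly
        w = d ** -power
        num += w * v
        den += w
    return num / den


# Made-up samples with concentrations in ppb.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
values = [0.0, 42400.0, 2500.0, 100.0]
mean = sum(values) / len(values)

# A node close to, but not on, the sample carrying the maximum value.
near = shepard((0.9, 0.0, 0.0), points, values)

# A node far away from every sample.
far = shepard((1000.0, 1000.0, 1000.0), points, values)

print("near the peak:", near, "(sample maximum is", max(values), ")")
print("far away:", far, "(sample mean is", mean, ")")
```

The estimate near the peak stays strictly below the sample maximum, and the estimate far from the samples is close to the sample mean, which matches the two properties cited from [2].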

(2) Ambiguous Results and Lack of Error Estimation. As can be observed from Figures 3, 4, and 5, the same set of input sample data has generated three quite different outputs. Which one provides the best representation of the data? It is not easy to determine. A similar problem arises in geometric modeling, but there one can use the visual appearance (for example, smoothness) of the modeled geometry as a criterion for choosing one modeling method over another. Such a criterion is not applicable here. What is important here is the accuracy of the outputs. For example, if one needs to compute the volume of soil (in cubic feet) contaminated above the EPA limit, one would need an accurate result, or an error range should be provided to back up the result. Unfortunately, most interpolation methods do not yield error estimations [9]. The only ones that provide error estimations are the statistical interpolation methods (for example, Kriging [5]). Statistical methods, however, require a large number of input sample points to be valid. This is again not the case with most environmental data.
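To make the volume computation in (2) concrete, the following sketch estimates the volume above a threshold by summing the volumes of grid cells whose modeled center value exceeds it. Everything here is an assumption for illustration: the two made-up samples, the 10x10x10 domain, the use of two inverse distance exponents as stand-ins for "different interpolation methods", and the 6000 ppb threshold, which is the iso-value used in the figures rather than an actual regulatory limit.

```python
import math
from itertools import product


def shepard(node, points, values, power):
    """Inverse distance weighted estimate of the data value at node."""
    num = den = 0.0
    for p, v in zip(points, values):
        d = math.dist(node, p)
        if d == 0.0:
            return v
        w = d ** -power
        num += w * v
        den += w
    return num / den


def volume_above(points, values, limit, power, n=20, extent=10.0):
    """Sum the volumes of cells whose center value exceeds limit."""
    h = extent / n  # cell edge length
    count = 0
    for i, j, k in product(range(n), repeat=3):
        center = ((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
        if shepard(center, points, values, power) > limit:
            count += 1
    return count * h ** 3


# Made-up data: one contaminated sample and one clean sample (ppb).
points = [(2.0, 5.0, 5.0), (8.0, 5.0, 5.0)]
values = [42400.0, 0.0]

v2 = volume_above(points, values, limit=6000.0, power=2.0)
v4 = volume_above(points, values, limit=6000.0, power=4.0)
print("volume above 6000 ppb, exponent 2:", v2)
print("volume above 6000 ppb, exponent 4:", v4)
```

The same two samples give a smaller contaminated volume with exponent 4 than with exponent 2, and neither run says anything about how far the estimate may be from the true volume.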

(3) Bad Extrapolation. Since interpolation methods are meant for interpolating, inaccurate results can occur if they are used for extrapolating. It is hard to tell in a 3-D volume where interpolation ends and where extrapolation begins. In this case study, one can claim that data values on the grid nodes are extrapolated at the bottom of the volume of interest, because there are only a few sample points there. Examining the colors at the bottom of the volume in Figures 3, 4, and 5, one can see that for the given sample data the Shepard method extrapolated to the mean value (represented by the light green color), Thin-plate Spline extrapolated into negative values (blue color), and Volume Spline extrapolated into large positive values (dark red). None of the results is correct, since the values at the bottom of the model should be zero (dark green), as seen from the sampled data values shown in Figure 2. To solve the problem, one needs to add asymptotic constraints to the interpolation methods so that the data values can be extrapolated gradually towards zero. Currently, no interpolation method allows additional constraints.
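As an illustration only of what an asymptotic constraint might look like, the sketch below adds a hypothetical background term to inverse distance weighting so that values decay towards a chosen background far from all samples. This is not a method from the paper or from the cited references; the function name and the `reach`, `power`, and `background` parameters are assumptions, and the background term also pulls values slightly towards zero between samples.

```python
import math


def shepard_with_background(node, points, values, reach=None, power=2.0,
                            background=0.0):
    """Inverse distance weighting with an optional constant background weight.

    When reach is given, a virtual contribution with value `background` and
    weight reach**-power is added, so the estimate tends to `background` once
    the node is much farther than `reach` from every sample.
    """
    num = den = 0.0
    for p, v in zip(points, values):
        d = math.dist(node, p)
        if d == 0.0:
            return v  # still reproduces a sample exactly at its location
        w = d ** -power
        num += w * v
        den += w
    if reach is not None:
        w0 = reach ** -power
        num += w0 * background
        den += w0
    return num / den


# Made-up samples with concentrations in ppb.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
values = [0.0, 42400.0, 2500.0, 100.0]
far_node = (1000.0, 1000.0, 1000.0)

plain = shepard_with_background(far_node, points, values)
decayed = shepard_with_background(far_node, points, values, reach=5.0)
print("plain extrapolation:", plain)
print("with background term:", decayed)
```

Without the background term the far-field value sits near the sample mean; with it, the value falls to nearly zero. Whether such a taper is physically justified for a given site is a separate question that the sketch does not address.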

(4) Misinterpretation. Another problem associated with the interpolation methods is that they may generate physically impossible results. For example, the physical value of diesel fuel concentration cannot be negative, but both the Thin-plate Spline method and the Volume Spline method generate large negative values, shown in blue in Figures 4 and 5. The reason for the misinterpretation is that interpolation methods do not know the physical meaning of the data. The physical problem in this case is a diffusion problem and is governed by Poisson's equation [3]. Presently, there is no way to integrate the equation into the existing interpolation methods. Some of the interpolation methods are derived with certain physical constraints. Thin-plate Spline, for example, was derived using a constraint to minimize the strain energy in a clamped elastic plate. This physical constraint could be meaningful in solid modeling for stress analysis, but is meaningless in modeling environmental data. Applying it here is similar to fitting a parabola through three points taken from a circle: the result will certainly not be the original circle. Finding closed-form interpolation functions for each type of physical data to be visualized is a difficult, if not impossible, task. Therefore, it is desirable to find a way to incorporate physical constraints numerically into the modeling process so that the resulting data values obey the physical laws that govern the physical processes involved.
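The parabola analogy in (4) is easy to check numerically. The sketch below fits a parabola through three points on the upper half of the unit circle using Lagrange interpolation and compares it with the circle at another location. The choice of the unit circle, the three sample abscissas, and the function names are assumptions made for the example.

```python
import math


def lagrange_parabola(pts):
    """Return the parabola through three (x, y) points as a function of x."""
    (x0, y0), (x1, y1), (x2, y2) = pts

    def f(x):
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

    return f


def circle(x):
    """Upper half of the unit circle."""
    return math.sqrt(1.0 - x * x)


# Three samples taken from the circle.
samples = [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
parabola = lagrange_parabola(samples)

x = 0.5
print("circle at x = 0.5:  ", circle(x))
print("parabola at x = 0.5:", parabola(x))
```

The parabola reproduces the three samples but gives 0.75 at x = 0.5, where the circle is about 0.866: the fit honors the data while following a different law from the one that produced them.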

CONCLUSIONS

Visualizing environmental data, or any unevenly sampled data, is a challenging task. The two-step approach has many limitations. Those limitations need to be removed so that one can reliably use current visualization techniques to picture and analyze such unevenly sampled data. In order to remove the limitations, we are challenged to find modeling techniques that preserve all input data values, provide error estimations, and accept both asymptotic and physical constraints. Otherwise, we are challenged to find approaches to visualizing unevenly sampled data in which the challenges of data modeling can be avoided.

REFERENCES

[1] Duchon, J., "Splines Minimizing Rotation-invariant Semi-Norms in Sobolev Spaces," Conference Proceedings of Constructive Theory of Functions of Several Variables (Oberwolfach, April 25 - May 1, 1976), W. Schempp and K. Zeller, eds., 1977, pp. 85-100.

[2] Gordon, W. J., and Wixom, J. A., "Shepard's Method of 'Metric Interpolation' to Bivariate and Multivariate Interpolation," Mathematics of Computation, Vol. 32, No. 141, 1978, pp. 253-264.

[3] Haberman, Richard, "Elementary Applied Partial Differential Equations with Fourier Series and Boundary Value Problems," Prentice-Hall, 1983.

[4] Kaufman, A. E., ed., "Volume Visualization," IEEE Comp. Soc. Press, 1990.

[5] Krige, D. G., "A Study of Gold and Uranium Distribution Patterns in the Klerksdorp Goldfield," Geoexploration, No. 4, 1966, pp. 43-53.

[6] Nielson, G. M., "Scattered Data Modeling," IEEE Computer Graphics & Applications, January 1993, pp. 60-70.

[7] Shepard, D., "A Two-dimensional Interpolation Function for Irregularly-Spaced Data," Proceedings of ACM National Conference, 1968, pp. 517-524.

[8] Speray, D., and Kennon, S., "Interactive Data Exploration on Arbitrary Grids," Computer Graphics, Vol. 24, No. 5, November 1990, pp. 5-12.

[9] Xiao, Yingcai, "Sparse Data Volume Visualization," Ph.D. Dissertation, University of Alabama in Huntsville, 1994.


Figure 1. The color codes.

Figure 2. The tank-farm and data sample locations.

Figure 3. Rendering of the grid generated by Shepard method.

Figure 4. Rendering of the grid generated by Thin-plate Spline method.

Figure 5. Rendering of the grid generated by Volume Spline method.
