Ecosystem Analysis Using Probabilistic Relational Modeling Bruce D’Ambrosio, Eric Altendorf, Jane Jorgensen Presented by Iulia Oroian and Leonard Rodrigo Tuesday Dec 2nd CSCE 582 Fall 2003 Instructor: Dr. Marco Valtorta

Page 1: Definitions

Ecosystem Analysis Using Probabilistic Relational Modeling

Bruce D’Ambrosio, Eric Altendorf, Jane Jorgensen

Presented by Iulia Oroian and Leonard Rodrigo

Tuesday Dec 2nd

CSCE 582 Fall 2003

Instructor: Dr. Marco Valtorta

Page 2: Definitions

Definitions

• Ecosystems
– Systems composed of interacting populations of organisms and their environment

• Community-level ecosystem model
– An integrated model of the ecosystem as a whole

• Synthetic variables
– Variables derived from observational data

• Aggregator
– A “count” or value of a specific variable, included in the synthetic variable space
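As a rough illustration of what an aggregator might compute, the sketch below derives a count and a mean for each parent record from its related child rows. The table layout, the column names, and the `aggregate` helper are hypothetical and do not come from the presentation or the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical child rows: (net_set_id, species, weight_g).
fish_specimens = [
    (1, "kokanee", 120.0),
    (1, "rainbow", 300.0),
    (1, "kokanee", 100.0),
    (2, "rainbow", 280.0),
]

def aggregate(rows, key_index, value_index):
    """Group child rows by parent key; return a count and a mean per parent."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[value_index])
    return {key: {"count": len(vals), "mean": mean(vals)}
            for key, vals in groups.items()}

# One synthetic record per net set, summarizing its specimens.
synthetic = aggregate(fish_specimens, key_index=0, value_index=2)
```

Each entry of `synthetic` could then be treated as extra variables attached to the parent record when a model is learned.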

Page 3: Definitions

Goal

• To aid domain scientists in gaining insight into data.

• Controlled experimentation in an ecosystem is undesirable, so it is desirable to create comprehensive models from the vast amount of observational data available.

• Generally, individual, domain-specific teams apply traditional statistical methods to investigate correlations among variables in their separate datasets.

• Few methods exist for investigating the complex, noisy cross-disciplinary interactions that are crucial to understanding the ecosystem as a whole.

Page 4: Definitions

Abstract

• Application of relational model discovery methods to building comprehensive ecosystem models from data.

• Two projects are considered in particular:
– Crater Lake Ecosystem
– West Nile Virus Disease Transmission

• In both cases, relational probabilistic model discovery is applied to build “community level” models of the ecosystems.

Page 5: Definitions

Project 1: Crater Lake

Problem

• The National Park Service (NPS) is concerned about long-term changes in the clarity of Crater Lake, which lies in a national park and is the clearest deep-water lake in the world.

• So far, the various domain-specific surveys have not been linked into one overall assessment of lake health.

• Using relational model discovery methods, the authors try to derive parameters that account for variations in explicit variables, such as the clarity of the lake water.

Page 6: Definitions

Project 1: Crater Lake

Data

• Data are obtained from long-term studies of the lake (some readings go back to 1880).

• The data have been collected in tables using various temporal and spatial scales.

• For example: surface weather condition information, phytoplankton densities, weather data at altitude.

• Notice that the temporal and spatial granularity of the data varies: surface weather condition information is available on a daily basis, whereas phytoplankton densities are measured only once or twice a month, and weather data at altitude is rarely available.

Page 7: Definitions

Project 1: Crater Lake

Method

• A set of temporal units was chosen to frame the analysis, using expert knowledge.

• These units were time periods that correspond to observed patterns in lake clarity and for which data were available.

• In the project: Jun-Jul, Aug, Sep-Oct
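One way to picture how measurements taken at different frequencies could be brought onto these temporal units is sketched below. Only the three unit names come from the slide. The month-to-unit mapping, the helper names, and all readings are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Temporal units named on the slide; other months are left unassigned.
SEASON_BY_MONTH = {6: "Jun-Jul", 7: "Jun-Jul", 8: "Aug",
                   9: "Sep-Oct", 10: "Sep-Oct"}

def season_of(day):
    """Return the temporal unit for a date, or None if it falls outside them."""
    return SEASON_BY_MONTH.get(day.month)

def mean_by_season(readings):
    """Average (date, value) readings within each (year, season) unit."""
    groups = defaultdict(list)
    for day, value in readings:
        season = season_of(day)
        if season is not None:
            groups[(day.year, season)].append(value)
    return {key: mean(vals) for key, vals in groups.items()}

# Hypothetical readings: a frequently sampled series and a sparse one.
daily_weather = [(date(1995, 6, 1), 4.0), (date(1995, 7, 15), 6.0),
                 (date(1995, 8, 2), 3.0), (date(1995, 12, 1), 9.0)]
monthly_phyto = [(date(1995, 6, 20), 10.0), (date(1995, 8, 10), 14.0)]

weather_units = mean_by_season(daily_weather)
phyto_units = mean_by_season(monthly_phyto)
```

After this step both series share the same `(year, season)` keys, so they can be compared even though they were sampled at different rates.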

Page 8: Definitions

Project 1: Crater Lake

Challenges

• Problem: time was not explicitly reified in the data, so paths like “secchi.DesDepth.yrSegment.Phyto.density” were difficult to construct.
Solution: manually add a “Season” table.

• Problem: how to gain scientific insight into the data.
Solution: learn models not just over the variables in the provided tables, but over their parents as well.
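The effect of adding a “Season” table can be sketched with an in-memory SQLite database: once time is a table of its own, a Secchi reading can reach phytoplankton densities through a shared season key. The schema, column names, and values below are hypothetical and are not the authors' actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE season (id INTEGER PRIMARY KEY, year INTEGER, segment TEXT);
CREATE TABLE secchi (id INTEGER PRIMARY KEY,
                     season_id INTEGER REFERENCES season(id),
                     des_depth REAL);
CREATE TABLE phyto  (id INTEGER PRIMARY KEY,
                     season_id INTEGER REFERENCES season(id),
                     density REAL);
-- Hypothetical rows for illustration only.
INSERT INTO season VALUES (1, 1995, 'Jun-Jul'), (2, 1995, 'Aug');
INSERT INTO secchi VALUES (1, 1, 30.5), (2, 2, 27.0);
INSERT INTO phyto  VALUES (1, 1, 10.0), (2, 1, 12.0), (3, 2, 20.0);
""")

# Follow the path secchi -> season -> phyto and aggregate the densities.
rows = conn.execute("""
SELECT s.year, s.segment, sec.des_depth, AVG(p.density)
FROM secchi AS sec
JOIN season AS s ON s.id = sec.season_id
JOIN phyto  AS p ON p.season_id = s.id
GROUP BY sec.id
ORDER BY sec.id
""").fetchall()
```

Here the join through `season` plays the role of the path in the slide, and `AVG` stands in for whatever aggregator the discovery method would apply.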

Page 9: Definitions

Project 1: Crater Lake

A complete schema for the data tables related to the temporal tables is shown in Figure 1.

Page 10: Definitions

Project 1: Crater Lake

• After performing the analysis (that is, applying the relational model discovery method), the following essential elements appeared in the discovered model.

Page 11: Definitions

Project 1: Crater Lake

Results

• One relationship that was discovered is that the dominant fish species in gill net catches was probabilistically dependent upon:
– Secchi descending depth (water clarity) in the current year
– mean fish weight in the current year
– Secchi descending depth in the previous year
– dominant fish species two years previous
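The four dependencies listed above mix current-year and lagged values. A minimal sketch of how such lagged variables might be assembled from yearly summaries is given below. The field names and all numbers are invented for illustration, and the code only builds the candidate parent variables. It does not reproduce the model discovery itself.

```python
# Hypothetical yearly summaries; values are illustrative, not real data.
yearly = {
    1990: {"secchi": 28.0, "mean_weight": 150.0, "dominant": "kokanee"},
    1991: {"secchi": 30.0, "mean_weight": 160.0, "dominant": "kokanee"},
    1992: {"secchi": 26.0, "mean_weight": 140.0, "dominant": "rainbow"},
    1993: {"secchi": 29.0, "mean_weight": 155.0, "dominant": "kokanee"},
}

def lagged_rows(summaries):
    """Pair each year with its one-year and two-year lagged values."""
    rows = []
    for year in sorted(summaries):
        # Skip years whose lagged values are not available.
        if year - 1 not in summaries or year - 2 not in summaries:
            continue
        current = summaries[year]
        rows.append({
            "year": year,
            "dominant": current["dominant"],          # target variable
            "secchi": current["secchi"],              # current year
            "mean_weight": current["mean_weight"],    # current year
            "secchi_prev": summaries[year - 1]["secchi"],
            "dominant_2y_ago": summaries[year - 2]["dominant"],
        })
    return rows

rows = lagged_rows(yearly)
```

Each resulting row holds the target (`dominant`) next to the four candidate parents named in the slide.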

Page 12: Definitions

Project 1: Crater Lake

Results

Other findings:

• Schools of Kokanee smolts swimming at the edges of the lake were preyed upon by Rainbow trout, a phenomenon that does not occur every year. A time lag of two years, discovered by the model, is consistent with experts’ observations. The relation between this interaction and water quality was previously unknown.

• The centrality of water clarity (measured by the Secchi “DesDepth” parameter).

• The lack of a direct relationship between Zooplankton count and water clarity.

These findings suggest that fish attributes may serve as a predictor of water clarity.

Page 13: Definitions

Project 1: Crater Lake

Results

Another important result: learning models over not just the variables in the provided tables but over their parents as well provides additional insight.

An example for the FishSpecimen table is shown in Fig. 3.

Page 14: Definitions

Project 2: West Nile Virus

• Data available
– Reports of dead birds testing positive
– Reports of breeding populations of mosquitoes testing positive
– Human case reports
– Landscape type

Page 15: Definitions

Project 2: West Nile Virus

Database Types

• Static Type
– Presence of permanent mosquito breeding sites (tire disposal facilities, etc.)
– Landscape type

• Event Type
– Located in place and time
– Birds testing positive for West Nile
– Mosquitoes testing positive for West Nile

Page 16: Definitions

Project 2: West Nile Virus

Modeling Method

• Attempt to create a model of the spread of the West Nile Virus in Maryland, 2001

• “Selectors” are used to relate the correct subset of values to other nodes.

Page 17: Definitions

Project 2: West Nile Virus

Relating Different Databases

• Location and Time are continuous variables
– This is handled by creating a scale. The scale is determined by examining previous studies of factors such as the life cycle of disease-carrying mosquitoes and the flight distance of competent bird hosts.
– In this particular study, the spatial/temporal scale was 5 miles and 1 month.

• Selectors
– Implemented as boolean types: true for elements in the same range, and false for elements outside it.
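A boolean selector of this kind might look like the sketch below, which treats two events as related when they fall within 5 miles and one month of each other. Only the 5-mile and 1-month scale comes from the slide. The pairwise threshold test, the 30-day approximation of a month, the haversine distance, and the sample coordinates are assumptions, and the authors' actual selector may bin events differently.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8
MAX_MILES = 5.0
MAX_DAYS = 30  # assumption: "1 month" approximated as 30 days

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def selector(event_a, event_b):
    """True if two (lat, lon, date) events share the same space/time range."""
    (lat1, lon1, day1), (lat2, lon2, day2) = event_a, event_b
    near = distance_miles(lat1, lon1, lat2, lon2) <= MAX_MILES
    close_in_time = abs((day1 - day2).days) <= MAX_DAYS
    return near and close_in_time

# Hypothetical events: (latitude, longitude, date).
bird = (39.00, -76.70, date(2001, 8, 1))
mosquito_nearby = (39.03, -76.72, date(2001, 8, 20))
mosquito_far = (39.50, -76.70, date(2001, 8, 5))
mosquito_late = (39.00, -76.70, date(2001, 11, 1))
```

With these sample events, `selector(bird, mosquito_nearby)` is true, while the distant and the late mosquito reports are both excluded.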

Page 18: Definitions

Project 2: West Nile Virus

Model Fragment

Page 19: Definitions

Project 2: West Nile Virus

Results

• The researchers found that there were too few human and horse cases to use them effectively in modeling the spread of the virus.

• The model was, however, reasonably accurate, which possibly implies that it is not necessary to gather data on insignificant hosts such as horses.

Page 20: Definitions

Conclusions and Future Work

• Relational probabilistic modeling provides a natural framework for investigating ecological data.

• Working from the system’s relational database, relational learning methods provide the opportunity to learn comprehensive models directly from the data sources.

• There still are limitations in the current synthetic variable construction methods.