Ecosystem Analysis Using Probabilistic Relational Modeling
Bruce D’Ambrosio, Eric Altendorf, Jane Jorgensen
Presented by Iulia Oroian and Leonard Rodrigo
Tuesday Dec 2nd
CSCE 582 Fall 2003
Instructor: Dr. Marco Valtorta
Definitions
• Ecosystems
– Systems composed of interacting populations of organisms and their environment
• Community-level ecosystem model
– An integrated model of the ecosystem as a whole
• Synthetic variables
– Variables derived from observational data
• Aggregator
– A “count” or value of a specific variable, included in the synthetic variable space
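As an illustration of the aggregator idea, the sketch below collapses several observations of one variable into a single synthetic value. The table contents, the function name `aggregate`, and the choice of "count" and "mean" as aggregators are assumptions made for this example, not details taken from the paper.

```python
from statistics import mean

# Hypothetical rows: (year, phytoplankton density). Values are made up.
phyto_rows = [(1990, 2.0), (1990, 4.0), (1991, 3.0)]


def aggregate(rows, year, how):
    """Collapse all rows for one year into a single synthetic value."""
    values = [value for (row_year, value) in rows if row_year == year]
    if how == "count":
        return len(values)
    if how == "mean":
        # No observations for that year: there is nothing to average.
        return mean(values) if values else None
    raise ValueError(f"unknown aggregator: {how}")


count_1990 = aggregate(phyto_rows, 1990, "count")
mean_1990 = aggregate(phyto_rows, 1990, "mean")
```

Each aggregated value becomes one candidate variable in the synthetic variable space that the model search can consider.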
Goal
• To aid domain scientists in gaining insight into data.
• Controlled experimentation in an ecosystem is undesirable, so comprehensive models should instead be built from the vast amount of observational data available.
• Generally, individual, domain-specific teams apply traditional statistical methods to investigate correlations among variables in their separate datasets.
• Few methods exist for investigating the complex, noisy cross-disciplinary interactions that are crucial to understanding the ecosystem as a whole.
Abstract
• Application of relational model discovery methods to building comprehensive ecosystem models from data.
• In particular, two projects are considered:
– Crater Lake Ecosystem
– West Nile Virus Disease Transmission
• In both cases, relational probabilistic model discovery is applied to build “community level” models of the ecosystems.
Project 1: Crater Lake
Problem
• The National Park Service (NPS) is concerned about long-term changes in the clarity of Crater Lake, a national park and the clearest deep-water lake in the world.
• So far, no overall assessment of lake health has linked the various domain-specific surveys.
• Using relational model discovery methods, the authors try to derive parameters that account for variations in explicit variables, such as the clarity of the lake water.
Project 1: Crater Lake
Data
• Data are obtained from long-term studies of the lake (some readings go back to 1880).
• These data have been collected in tables using various time and spatial scales.
• For example: surface weather condition information, phytoplankton densities, weather data at altitude.
• Notice that the temporal and spatial granularity of the data varies: surface weather condition information is available on a daily basis, whereas phytoplankton densities are measured only once or twice a month, and weather data at altitude is rarely available.
Project 1: Crater Lake
Method
• A set of temporal units was chosen, using expert knowledge, to frame the analysis.
• These units were time periods that corresponded to observed patterns of lake clarity and for which data were available.
• In the project: Jun-Jul, Aug, Sep-Oct
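A minimal sketch of how readings taken at different granularities might be mapped onto these three temporal units. The segment names come from the slide; the function names, the sample readings, and the decision to drop months outside the three segments are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date

# Segment names are from the slide; the month numbers follow from the names.
SEGMENTS = {"Jun-Jul": (6, 7), "Aug": (8,), "Sep-Oct": (9, 10)}


def year_segment(month):
    """Return the segment name for a month, or None if it is outside all segments."""
    for name, months in SEGMENTS.items():
        if month in months:
            return name
    return None


def bucket_by_segment(readings):
    """Group (date, value) readings by (year, segment), whatever their sampling rate."""
    buckets = defaultdict(list)
    for day, value in readings:
        segment = year_segment(day.month)
        if segment is not None:  # assumption: out-of-season readings are ignored
            buckets[(day.year, segment)].append(value)
    return dict(buckets)


# Hypothetical readings, made up for the example.
readings = [
    (date(1990, 6, 15), 30.0),
    (date(1990, 7, 1), 32.0),
    (date(1990, 8, 10), 28.0),
    (date(1990, 1, 5), 25.0),
]
buckets = bucket_by_segment(readings)
```

Daily weather data and monthly phytoplankton counts would both pass through the same bucketing step, so they end up keyed by the same temporal unit.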
Project 1: Crater Lake
Challenges
• Problem: time was not explicitly reified, so constructing paths like “secchi.DesDepth.yrSegment.Phyto.density” was a problem.
Solution: manually add a “Season” table.
• Problem: how to gain scientific insight into the data.
Solution: learn models not just over the variables in the provided tables, but over their parents as well.
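The role of the added “Season” table can be sketched as follows: once both the Secchi and phytoplankton tables refer to a shared season row, a path from a Secchi reading to phytoplankton density can be followed and aggregated. The key name `season_id`, the row values, and the use of a mean as the aggregator are hypothetical; the slides do not give the actual schema.

```python
from statistics import mean

# Hypothetical Season table: season_id -> (year, segment).
season = {1: (1990, "Aug"), 2: (1990, "Sep-Oct")}

# Hypothetical data tables that both reference the Season table.
secchi = [
    {"season_id": 1, "DesDepth": 30.0},
    {"season_id": 2, "DesDepth": 28.0},
]
phyto = [
    {"season_id": 1, "density": 2.0},
    {"season_id": 1, "density": 4.0},
    {"season_id": 2, "density": 5.0},
]


def mean_phyto_density_for(secchi_row, season_table, phyto_rows):
    """Follow secchi -> Season -> Phyto and aggregate density with a mean."""
    season_id = secchi_row["season_id"]
    if season_id not in season_table:
        return None  # dangling reference: no season to join through
    densities = [p["density"] for p in phyto_rows if p["season_id"] == season_id]
    return mean(densities) if densities else None


# One (DesDepth, aggregated density) pair per Secchi reading.
pairs = [(s["DesDepth"], mean_phyto_density_for(s, season, phyto)) for s in secchi]
```

Without the shared season key there is no column on which the two tables can be joined, which is the difficulty the slide describes.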
Project 1: Crater Lake
A complete schema for the data tables related to the temporal tables is shown in Figure 1.
Project 1: Crater Lake
• After performing the analysis (that is, applying the relational model discovery method), the following essential elements appeared in the discovered model.
Project 1: Crater Lake
Results
• One relationship that was discovered is that the dominant fish species in gill net catches was probabilistically dependent upon:
– Secchi descending depth (water clarity) in the current year
– mean fish weight in the current year
– descending Secchi depth the previous year
– dominant fish species two years previous
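The four parents listed above mix current-year and lagged values. A sketch of how such lagged variables could be assembled from a per-year summary is shown below; the field names, the sample values, and the function `lagged_rows` are all hypothetical and only illustrate the time-lag structure, not the authors' procedure.

```python
def lagged_rows(yearly):
    """Build one row per year that has both a 1-year and a 2-year predecessor.

    yearly maps year -> {"secchi": float, "mean_weight": float, "dominant": str}.
    """
    rows = []
    for year in sorted(yearly):
        prev1 = yearly.get(year - 1)
        prev2 = yearly.get(year - 2)
        if prev1 is None or prev2 is None:
            continue  # not enough history to fill the lagged columns
        current = yearly[year]
        rows.append({
            "year": year,
            "secchi_t": current["secchi"],
            "mean_weight_t": current["mean_weight"],
            "secchi_t_minus_1": prev1["secchi"],
            "dominant_t_minus_2": prev2["dominant"],
            "dominant_t": current["dominant"],  # the dependent variable
        })
    return rows


# Made-up yearly summaries, for illustration only.
yearly = {
    1990: {"secchi": 30.0, "mean_weight": 0.40, "dominant": "Kokanee"},
    1991: {"secchi": 28.0, "mean_weight": 0.50, "dominant": "Kokanee"},
    1992: {"secchi": 31.0, "mean_weight": 0.45, "dominant": "Rainbow trout"},
    1993: {"secchi": 29.0, "mean_weight": 0.42, "dominant": "Kokanee"},
}
rows = lagged_rows(yearly)
```

The first two years produce no rows because their lagged columns cannot be filled.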
Project 1: Crater Lake
Results
Other findings:
• Schools of Kokanee smolts swimming at the edges of the lake were preyed upon by Rainbow trout, and this phenomenon does not occur every year. A time lag of two years, discovered by the model, is consistent with experts’ observations. The relation between this interaction and water quality was previously unknown.
• The centrality of water clarity (measured by the Secchi “DesDepth” parameter)
• The lack of a direct relationship between Zooplankton count and water clarity.
These findings suggest that fish attributes may serve as a predictor of water clarity.
Project 1: Crater Lake
Results
Another important result: learning models over not just the variables in the provided tables but over their parents as well provides additional insight.
An example for the FishSpecimen table is shown in Fig. 3.
Project 2: West Nile Virus
• Data available
– Reports of dead birds testing positive
– Reports of breeding populations of mosquitoes testing positive
– Human case reports
– Landscape type
Project 2: West Nile Virus
Database Types
• Static Type
– Presence of permanent mosquito breeding sites (tire disposal facilities, etc.)
– Landscape type
• Event Type
– Located in place and time
– Birds located testing positive for West Nile
– Mosquitoes testing positive for West Nile
Project 2: West Nile Virus
Modeling Method
• Attempt to create a model of the spread of the West Nile Virus in Maryland, 2001
• “Selectors” are used to relate the correct subset of values to other nodes.
Project 2: West Nile Virus
Relating Different Databases
• Location and Time are continuous variables
– This is handled by creating a scale. The scale is determined by examining previous case studies, such as those on the life cycle of disease-carrying mosquitoes and the flight distance of competent bird hosts.
– In this particular study, the space/temporal scale consisted of 5 miles and 1 month.
• Selectors
– Implemented as boolean types: true for elements in the same range, and false for elements outside.
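One way to read the selector idea is as a boolean test on whether two events fall within the chosen space/time scale. The sketch below assumes planar coordinates already expressed in miles, approximates “1 month” as 30 days, and treats “same range” as a distance threshold; all three are assumptions for illustration, as are the event fields and sample values.

```python
from datetime import date
from math import hypot

MAX_MILES = 5.0  # spatial scale from the slide
MAX_DAYS = 30    # assumption: "1 month" approximated as 30 days


def selector(a, b):
    """Return True when two events lie within the same space/time range."""
    close_in_space = hypot(a["x"] - b["x"], a["y"] - b["y"]) <= MAX_MILES
    close_in_time = abs((a["when"] - b["when"]).days) <= MAX_DAYS
    return close_in_space and close_in_time


# Hypothetical events; x and y are positions in miles on a flat grid.
bird = {"x": 0.0, "y": 0.0, "when": date(2001, 7, 1)}
near_mosquito = {"x": 3.0, "y": 4.0, "when": date(2001, 7, 20)}
far_mosquito = {"x": 10.0, "y": 0.0, "when": date(2001, 7, 20)}
late_mosquito = {"x": 1.0, "y": 1.0, "when": date(2001, 9, 15)}

# Only the events the selector accepts would be related to the bird report.
related = [m for m in (near_mosquito, far_mosquito, late_mosquito) if selector(bird, m)]
```

A grid-cell formulation (same 5-mile cell and same calendar month) would be an equally plausible reading of the slide.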
Project 2: West Nile Virus
Model Fragment
Project 2: West Nile Model
Results
• The researchers found that there were too few human and horse cases to use them effectively in modeling the spread of the virus.
• The model was nevertheless reasonably accurate, which may imply that it is not necessary to gather data on insignificant hosts such as horses.
Conclusions and Future Work
• Relational probabilistic modeling provides a natural framework for investigating ecological data.
• Because they work directly on the system’s relational database, relational learning methods provide the opportunity to learn comprehensive models directly from the data sources.
• There are still limitations in the current synthetic variable construction methods.