
Page 1

Pat Langley, Dileep George, Stephen Bay

Computational Learning Laboratory
Center for the Study of Language and Information
Stanford University, Stanford, CA

Kazumi Saito
NTT Communication Science Laboratories
Soraku, Kyoto, JAPAN

Robust Induction of Process Models from Time-Series Data

This research was funded in part by NTT Communication Science Laboratories and in part by Grant NCC 2-1220 from NASA Ames Research Center.

Page 2

A Process Model for an Aquatic Ecosystem

model AquaticEcosystem
  variables: phyto, zoo, nitro, residue
  observables: phyto, nitro

  process phyto_exponential_decay
    equations: d[phyto,t,1] = -0.307 * phyto
               d[residue,t,1] = 0.307 * phyto

  process zoo_exponential_decay
    equations: d[zoo,t,1] = -0.251 * zoo
               d[residue,t,1] = 0.251 * zoo

  process zoo_phyto_predation
    equations: d[zoo,t,1] = 0.615 * 0.495 * zoo
               d[residue,t,1] = 0.385 * 0.495 * zoo
               d[phyto,t,1] = -0.495 * zoo

  process nitro_uptake
    conditions: nitro > 0
    equations: d[phyto,t,1] = 0.411 * phyto
               d[nitro,t,1] = -0.098 * 0.411 * phyto

  process nitro_remineralization
    equations: d[nitro,t,1] = 0.005 * residue
               d[residue,t,1] = -0.005 * residue
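Read as a set of differential equations, this model can be simulated directly to produce the predictions shown on the next slide. Below is a minimal Python sketch using a forward-Euler integrator; it assumes the signs reconstructed above, and the step size and initial state are illustrative, not taken from the original experiments.

```python
# Minimal forward-Euler simulation of the AquaticEcosystem model above.
# Signs follow the reconstruction in the model; dt and the initial state
# are illustrative assumptions.

def derivatives(phyto, zoo, nitro, residue):
    # Each active process contributes additively to the derivatives.
    d_phyto = -0.307 * phyto - 0.495 * zoo          # decay, predation
    d_zoo = -0.251 * zoo + 0.615 * 0.495 * zoo      # decay, predation
    d_nitro = 0.005 * residue                       # remineralization
    d_residue = (0.307 * phyto + 0.251 * zoo
                 + 0.385 * 0.495 * zoo - 0.005 * residue)
    if nitro > 0:                                   # nitro_uptake condition
        d_phyto += 0.411 * phyto
        d_nitro -= 0.098 * 0.411 * phyto
    return d_phyto, d_zoo, d_nitro, d_residue

def simulate(state, steps=100, dt=0.1):
    trajectory = [state]
    for _ in range(steps):
        deltas = derivatives(*state)
        state = tuple(x + dt * dx for x, dx in zip(state, deltas))
        trajectory.append(state)
    return trajectory

# Example run from made-up initial concentrations (phyto, zoo, nitro, residue).
history = simulate((1.0, 0.1, 2.0, 0.0))
```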

Page 3

Predictions from the Ecosystem Model

Page 4

Advantages of Quantitative Process Models

Process models are a good target for discovery systems because:

they refer to notations and mechanisms familiar to scientists;

they embed quantitative relations within qualitative structure;

they provide dynamical predictions of changes over time;

they offer causal and explanatory accounts of phenomena;

while retaining the modularity needed to support induction.

Quantitative process models provide an important alternative to formalisms used currently in machine learning and discovery.

Page 5

Inductive Process Modeling

[Diagram] Induction maps training data (observed values for a set of continuous variables as they vary over time or situations) plus background knowledge (generic processes that characterize causal relationships among variables in terms of conditional equations) to a learned model: a specific process model that explains the observed values and predicts future data accurately.

Page 6

Generic Processes as Background Knowledge

Our framework casts background knowledge as generic processes that specify:

the variables involved in a process and their types;

the parameters appearing in a process and their ranges;

the forms of conditions on the process; and

the forms of associated equations and their parameters.

Generic processes are building blocks from which one can compose a specific quantitative process model.
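One possible encoding of these four ingredients is sketched below in Python. The field names and string-based equation templates are illustrative assumptions, not the representation actually used in IPM.

```python
# Illustrative encoding of a generic process; field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class GenericProcess:
    name: str
    variables: dict                                 # slot -> type, e.g. {"S": "species"}
    parameters: dict                                # name -> (low, high) range
    conditions: list = field(default_factory=list)  # e.g. ["N > tau"]
    equations: list = field(default_factory=list)   # derivative templates

exponential_decay = GenericProcess(
    name="exponential_decay",
    variables={"S": "species", "D": "detritus"},
    parameters={"alpha": (0.0, 1.0)},
    equations=["d[S,t,1] = -1 * alpha * S",
               "d[D,t,1] = alpha * S"],
)
```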

Page 7

Generic Processes for Aquatic Ecosystems

generic process exponential_decay
  variables: S{species}, D{detritus}
  parameters: α [0, 1]
  equations: d[S,t,1] = -1 * α * S
             d[D,t,1] = α * S

generic process remineralization
  variables: N{nutrient}, D{detritus}
  parameters: φ [0, 1]
  equations: d[N,t,1] = φ * D
             d[D,t,1] = -1 * φ * D

generic process predation
  variables: S1{species}, S2{species}, D{detritus}
  parameters: ρ [0, 1], γ [0, 1]
  equations: d[S1,t,1] = γ * ρ * S1
             d[D,t,1] = (1 - γ) * ρ * S1
             d[S2,t,1] = -1 * ρ * S1

generic process constant_inflow
  variables: N{nutrient}
  parameters: ν [0, 1]
  equations: d[N,t,1] = ν

generic process nutrient_uptake
  variables: S{species}, N{nutrient}
  parameters: τ [0, ∞], β [0, 1], μ [0, 1]
  conditions: N > τ
  equations: d[S,t,1] = μ * S
             d[N,t,1] = -1 * β * μ * S
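Binding the typed slots of these generic processes to concrete model variables is the first stage of the IPM algorithm described on the next slide. A minimal sketch, building on the GenericProcess encoding above; the helper name is hypothetical.

```python
# Hypothetical helper: enumerate all type-consistent variable bindings
# for one generic process. Builds on the GenericProcess sketch above.
from itertools import permutations

def instantiations(proc, typed_vars):
    """typed_vars maps model variables to types, e.g. {"phyto": "species"}."""
    slots = list(proc.variables.items())   # [(slot, required_type), ...]
    bindings = []
    for combo in permutations(typed_vars, len(slots)):
        binding = {slot: var for (slot, _), var in zip(slots, combo)}
        if all(typed_vars[binding[slot]] == t for slot, t in slots):
            bindings.append(binding)
    return bindings

# For the aquatic domain this yields {"S": "phyto", "D": "residue"} and
# {"S": "zoo", "D": "residue"} for exponential_decay.
aquatic_types = {"phyto": "species", "zoo": "species",
                 "nitro": "nutrient", "residue": "detritus"}
print(instantiations(exponential_decay, aquatic_types))
```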

Page 8

Previous Results: The IPM Algorithm

Langley et al. (2002) reported IPM, an algorithm that constructs process models from generic components in four stages:

1. Find all ways to instantiate known generic processes with specific variables, subject to type constraints;

2. Combine instantiated processes into candidate generic models, with limits on the total number of processes;

3. For each generic model, carry out gradient descent search through parameter space to find good parameter values;

4. Select the parameterized model that produces the lowest mean squared error on the training data.

We showed that IPM could induce accurate process models from noisy time series, but it tended to include extra processes.
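The four stages can be summarized with the schematic Python sketch below. It is not IPM's actual implementation: the fit routine is passed in as a stand-in for the gradient-descent parameter search, and instantiations is the hypothetical helper sketched earlier.

```python
# Schematic of IPM's four stages; `fit` stands in for the gradient
# descent parameter search and returns (model, mean_squared_error).
from itertools import combinations

def ipm(generic_processes, typed_vars, data, max_processes, fit):
    # Stage 1: all type-consistent instantiations of the generic processes.
    pool = [(proc, binding)
            for proc in generic_processes
            for binding in instantiations(proc, typed_vars)]
    best_model, best_error = None, float("inf")
    # Stage 2: combine instantiated processes into candidate structures,
    # with a limit on the total number of processes.
    for k in range(1, max_processes + 1):
        for structure in combinations(pool, k):
            # Stage 3: fit the parameters of this structure to the data.
            model, error = fit(structure, data)
            # Stage 4: keep the model with the lowest squared error.
            if error < best_error:
                best_model, best_error = model, error
    return best_model
```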

Page 9

The Revised IPM Algorithm

We have revised and extended the IPM algorithm so that it now:

Accepts as input those variables that can appear in the induced model, both observable and unobservable;

Utilizes the parameter-fitting routine to estimate initial values for unobservable variables;

Invokes the parameter-fitting method to induce the thresholds on process conditions; and

Selects the parameterized model with the lowest description length: Md = (Mv + Mc) * log(n) + n * log(Me).

We have evaluated the new system on synthetic and natural data.
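As a sketch, the selection criterion can be computed as below. The slide does not define the M terms, so their interpretation in the comments (complexity counts for variables and conditions, plus an error term) is an assumption for illustration only.

```python
import math

def description_length(m_v, m_c, m_e, n):
    """Md = (Mv + Mc) * log(n) + n * log(Me); lower is better.
    Interprets m_v and m_c as model-complexity counts, m_e as an error
    measure, and n as the number of observations (all assumed)."""
    return (m_v + m_c) * math.log(n) + n * math.log(m_e)
```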

Page 10

Evaluation of the IPM Algorithm

To demonstrate IPM's ability to induce process models, we ran it on synthetic data for a known system:

1. We used the aquatic ecosystem model to generate data sets over 100 time steps for the variables nitro and phyto;

2. We replaced each 'true' value x with x * (1 + r * n), where r followed a Gaussian distribution (μ = 0, σ = 1) and n > 0;

3. We ran IPM on these noisy data, giving it type constraints and generic processes as background knowledge.

In two experiments, we let IPM determine the initial values and thresholds given the correct structure; in a third study, we let it search through a space of 256 generic model structures.
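The corruption scheme in step 2 is straightforward to reproduce; a minimal sketch:

```python
import random

def add_noise(values, n):
    """Replace each 'true' value x with x * (1 + r * n), where r is a
    standard Gaussian draw (mean 0, s.d. 1) and n > 0 sets the noise
    level (e.g., n = 0.05 for 5% noise)."""
    return [x * (1.0 + random.gauss(0.0, 1.0) * n) for x in values]
```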

Page 11

Experimental Results with IPM

The main results of our studies with IPM on synthetic data were:

1. The system infers accurate estimates for the initial values of unobservable variables like zoo and residue;

2. The system induces estimates of condition thresholds on nitro that are close to the target values; and

3. The MDL criterion selects the correct model structure in all runs with 5% noise, but only 40% of runs with 10% noise.

These results suggest that the basic approach is sound, but that we should consider other MDL schemes and other responses to overfitting.

Page 12

Results with Unobserved Initial Values

Page 13

Electric Power on the International Space Station

Page 14

Telemetry Data from Space Station Batteries

Predictor variables included the battery's current and temperature.

Page 15

Induced Process Model for Battery Behavior

model Battery
  variables: Rs, Vcb, soc, Vt, i, temperature
  observables: soc, Vt, i, temperature

  process voltage_charge
    conditions: i >= 0
    equations: Vt = Vcb + 6.105 * Rs * i

  process voltage_discharge
    conditions: i < 0
    equations: Vt = Vcb * 1.0 / (Rs + 1.0)

  process charge_transfer
    equations: d[soc,t,1] = i * Vcb / 179.38

  process quadratic_influence_Vcb_soc
    equations: Vcb = 41.32 * soc * soc

  process linear_influence_Vcb_temp
    equations: Vcb = 0.2592 * temperature

  process linear_influence_Rs_soc
    equations: Rs = 0.03894 * soc
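A minimal sketch of stepping this model forward in Python. It assumes that the two algebraic influences on Vcb combine additively, in keeping with how process models compose influences on a shared variable; that assumption, and the step size, are illustrative rather than taken from the original system.

```python
# Forward simulation of the induced Battery model. The additive
# combination of the two Vcb influences and the step size dt are
# assumptions for illustration.

def battery_step(soc, i, temperature, dt=1.0):
    Vcb = 41.32 * soc * soc + 0.2592 * temperature  # both Vcb influences
    Rs = 0.03894 * soc                              # linear_influence_Rs_soc
    if i >= 0:                                      # voltage_charge
        Vt = Vcb + 6.105 * Rs * i
    else:                                           # voltage_discharge
        Vt = Vcb * 1.0 / (Rs + 1.0)
    soc = soc + dt * (i * Vcb / 179.38)             # charge_transfer
    return soc, Vt
```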

Page 16

Results on Battery Test Data

Page 17

Best Fit to Data on Protozoan Predation

Page 18

Intellectual Influences

Our work on inductive process modeling incorporates ideas from many traditions:

computational scientific discovery (e.g., Langley et al., 1983)

knowledge-based learning methods (e.g., ILP, theory revision)

qualitative physics and simulation (e.g., Forbus, 1984)

scientific simulation environments (e.g., STELLA, MATLAB)

However, the most similar research comes from Todorovski and Dzeroski (1997) and from Bradley, Easley, and Stolle (2001).

Their approaches also use knowledge to guide the induction of differential equation models, though without a process formalism.

Page 19

Directions for Future Research

Despite our progress to date, we need further work in order to:

produce additional results on other scientific data sets

develop more robust methods for fitting model parameters

explore alternative techniques that mitigate overfitting

extend the approach to handle data sets with missing values

implement heuristic methods for searching the model space

utilize knowledge of subsystems to further constrain search

Our goal is a robust approach to inductive process modeling that can aid scientists and engineers in model construction.

Page 20

End of Presentation