CSC 599: Computational Scientific Discovery Lecture 7: Scientific Processes and IDS

Upload: kory-burns

Post on 31-Dec-2015


CSC 599: Computational Scientific Discovery

Lecture 7: Scientific Processes and IDS

Overview

Processes

Describing Processes

Integrated Discovery System

Processes

Up until now we have had a mostly static view of science.

If the Universe did not change much we could:
  Assemble a database of its attributes
  Look for patterns among attributes

But the Universe does change! We need to:
  Assemble a hierarchy (or some other structure) of changes
  Look for patterns of changes
    Before and after patterns
    During patterns
    Sequence of state patterns

Attributes of a process

Instances of a process are described with:
  Time
    Could be a time range (start, finish)
    Rates of change
    Previous history
  Objects
    Special cases:
      The “doer” (“subject”)
      The “done-upon” (“direct object”, “indirect object”)
      Example: “John gave flowers to his mother.”
    Peripheral actors
    Environment
  Location
  Magnitude
    How big (either of the process or of one of its objects)

Events: Changes between staticness

One way to view processes is that they are about the changes themselves:

Example Events

The Size of the United States

Example Events (2)

Motion along a fault
  By earthquake, as opposed to creep

Attributes of Events

For events it is natural to ask:
  For each event:
    When?
    What/who?
    Where?
    Properties of event (How big?, etc.)
  Among events, given the first:
    When is the next?
    Who/which will be next?
    Where will be next?
    How big will be next?
    What is an overall sequence like? (Patterns in time, objects, location and size)
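The per-event questions (when, who, where, how big) and the among-event question "when is the next?" can be sketched as a small record type plus a gap calculation. This is a minimal illustration only: the `Event` class, its field names, and `inter_event_gaps` are hypothetical names, not something defined in the lecture.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One event instance: when, what/who, where, and how big (all fields assumed)."""
    when: float       # time of the event, in any consistent unit
    who: str          # the object or actor involved
    where: str        # location label
    magnitude: float  # size of the event


def inter_event_gaps(events):
    """Return the time gaps between consecutive events, ordered by time."""
    ordered = sorted(events, key=lambda e: e.when)
    return [b.when - a.when for a, b in zip(ordered, ordered[1:])]
```

Patterns in these gaps (constant, growing, clustered) are one way to start answering "what is an overall sequence like?".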

Processes as always active principles

Another way to view processes is as always active
  Often opposing forces cause no net change
  Macroscopic change (“events”) results when opposing forces are out of balance

Example: Gravity and Normal Force are “always on”

Processes as always active principles (2)

Let's revisit earthquakes:

Attributes of Processes

All attributes of events, plus:
  Quantity of forces as a function of time
  Maximum limit of “homeostatic” forces
    Friction
    Normal force

Additional attributes like opposing forces let us ask “what is happening during the process?”

Example before-during-after

Transition State Theory
  Energies associated with molecules
  Energy difference between transition state and original molecules dictates reaction rate constant
  Have related a “during” attribute (energy of transition state) with an “observable” attribute (process rate)
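For reference, a standard quantitative form of this link (not taken from the slides) is the Eyring equation of transition state theory. Here $k$ is the rate constant, $\Delta G^{\ddagger}$ the free energy of activation, $k_B$ Boltzmann's constant, $h$ Planck's constant, $R$ the gas constant, $T$ the absolute temperature, and $\kappa$ a transmission coefficient that is often taken to be 1:

```latex
k = \kappa \, \frac{k_B T}{h} \, \exp\!\left( -\frac{\Delta G^{\ddagger}}{R T} \right)
```

The higher the transition state sits above the original molecules, the smaller the observable rate constant.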

State or event sequences

Pick up cup
Raise cup
Drop cup
Cup falls
Cup impacts
Cup breaks

Cycles of States and “Lifecycles”

Birth, adolescence, adulthood, old age, death
  In living things
  In stars

Cycles and Periodicity

Periodic motion
  Cycle that completely resets itself
    Pendulum motion
    Planetary motion
    Chemical cycles

Complications
  Entropy wears things down
    Friction eventually stops pendulums
    Chemical cycles eventually run out of reagents
  Apparent period might be symptomatic of a deeper relationship
    Moon orbits Earth about every 27.3 days (sidereal period)
    Moon is slowly receding from Earth due to tidal forces

Cycles within Cycles

The Carbon Cycle(s)

Pools (Black) in Gigatons

Fluxes (Purple) in Gigatons/year

Illustration courtesy NASA Earth Science Enterprise

Multigenerational Lifecycles

Generation 1 makes generation 2
Generation 1 dies
Generation 2 makes generation 3
Generation 2 dies
Generation 3 makes generation 4
Generation 3 dies
Generation 4 makes generation 5
. . .

Timescales and Magnitude

Things look static because they are so slow
  Growth of plants
  Motion of plates, recession of moon
  Lives of stars
  Growth of rings on Saturn

Use:
  Time lapse photography (plant growth)
  Very precise measurement (plate motion, moon recession)
  Look at whole populations of different ages (lives of stars)
  Inferred ages of parts (Saturn's rings)

Timescales and Magnitude (2)

Things look static because they are so fast
  Motion of air molecules in a breeze-less room

Things blur because they are so fast
  Engines

Use:
  High speed photography (hummingbird wings)
  Oscilloscopes, strobe lighting, laser pulses (engines, chemical reactions)
  Confirmatory theory (kinetic theory of gases)

Timescales and Magnitude (3a)

Seemingly unique processes might also be viewed as one end of a continuum of magnitudes

“4500 to 4000 MYA a Mars-sized object hit Earth”

Has not happened since (fortunately!)

Timescales and Magnitude (3b)

Is an object hitting Earth unique?
  Pea-size meteoroids - 10 per hour
  Walnut-size - 1 per hour
  Grapefruit-size - 1 every 10 hours
  Basketball-size - 1 per month
  50-m rock that would destroy an area the size of New Jersey - 1 per 100 years
  1-km asteroid - 1 per 100,000 years
  2-km asteroid - 1 per 500,000 years
  Mars-sized object - 1 per 4000 million years?
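To see the continuum more directly, the rates above can be put on one scale (events per year). The sketch below only converts the slide's figures; the conversion factors (365.25 days per year, 12 months per year) and all names are assumptions made for illustration, and the speculative Mars-sized entry is left out.

```python
import math

HOURS_PER_YEAR = 24 * 365.25  # assumed conversion factor

# (label, events per year), in the same order as the slide
IMPACT_RATES = [
    ("pea-size", 10 * HOURS_PER_YEAR),
    ("walnut-size", 1 * HOURS_PER_YEAR),
    ("grapefruit-size", HOURS_PER_YEAR / 10),
    ("basketball-size", 12.0),
    ("50-m rock", 1 / 100),
    ("1-km asteroid", 1 / 100_000),
    ("2-km asteroid", 1 / 500_000),
]


def orders_of_magnitude_spanned(rates):
    """Return log10 of the ratio between the largest and smallest rate."""
    values = [rate for _, rate in rates]
    return math.log10(max(values) / min(values))
```

On this scale the frequency falls steadily as size grows, so a giant impact reads as the rare tail of an ordinary process rather than a separate kind of event.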

Representing Processes

Each is unique
  Not much generalization

Sets
  Generalization within set

Single inheritance
  Limited generalization among sets

Multiple inheritance
  Fuller generalization among sets

Anything else?

Describing State Sequences

Finite State Machine
  Perhaps most of science

Push down automaton
  Natural and computer languages

Turing Machine
  Besides special cases of natural and computer languages, can you think of any examples?
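As a small illustration of the finite state machine option, the cup sequence from the earlier slide (pick up, raise, drop, fall, impact, break) can be written as a transition table. The state and event names below are made up for this sketch.

```python
# (current state, event) -> next state; names are illustrative only
TRANSITIONS = {
    ("on_table", "pick_up"): "held",
    ("held", "raise"): "raised",
    ("raised", "drop"): "falling",
    ("falling", "impact"): "impacted",
    ("impacted", "break"): "broken",
}


def run(events, state="on_table"):
    """Feed a sequence of events to the machine and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state!r} on {event!r}")
        state = TRANSITIONS[key]
    return state
```

A sequence such as `["pick_up", "drop"]` is rejected because the table has no entry for dropping a cup that was never raised, which is exactly the kind of ordering constraint a state description captures.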

Describing Changed Attributes

Qualitative Physics
  Change state when the sign of an attribute's derivative changes

Difference Equations
  attr(t+1) = attr(t) + changeFunction(x,y,z)

Ordinary Differential Equations
  One independent variable (often time):
  Newton's 2nd Law: $F(x) = m \, \frac{d^2 x(t)}{dt^2}$

Partial Differential Equations
  More than one independent variable:
  $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$
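The difference-equation form can be used to approximate the differential one. The sketch below steps a freely falling object (Newton's 2nd law with a constant force, so acceleration is -g) using attr(t+1) = attr(t) + change. The function name, step size, and the choice of explicit Euler stepping are assumptions for illustration.

```python
def simulate_fall(height, dt=0.001, steps=1000, g=9.81):
    """Explicit Euler steps for free fall; returns (position, velocity)."""
    x, v = height, 0.0
    for _ in range(steps):
        # Both updates use the values from the previous step.
        x, v = x + v * dt, v - g * dt
    return x, v
```

After one simulated second from 10 m the result is close to the analytic value 10 - g/2, and the gap shrinks as `dt` is made smaller.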

Integrated Discovery System (IDS)

Pat Langley, Bernd Nordhausen, 1990

Knowledge base
  Hierarchy of states
  Continually refined with more data

Input
  History of descriptions of qualitatively different states

Output
  Refined hierarchy of states

IDS Example

IDS is given the following history:
  State 1: liquid acid A and liquid base B exist, then are combined
  State 2: quantity of acid and base decreases, quantity of salt increases
  State 3: resulting state has some salt and some acid

IDS Example (2)

IDS Example (3)

IDS is next given the following history:
  State 1: liquid acid A and liquid base B exist, then are combined
  State 2: quantity of acid and base decreases, quantity of salt increases
  State 3: resulting state has some salt and some base

IDS Example (4)

IDS Substance KB

Knowledge Base Domain knowledge:

Histories

IDS input is histories
  Sequence of qualitative states
    Each of which has “constant” behavior
  A qualitative state ends (and a new one begins) when:
    An increase or decrease of an attribute starts or stops
      That is, the sign of the attribute's derivative changes
    A structural change occurs
      For example, a substance appears or disappears
        mass(SUBSTANCE) decreases to 0
        mass(SUBSTANCE) increases from 0
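The first boundary rule (a new qualitative state starts when the sign of an attribute's derivative changes) can be sketched for one sampled attribute as follows. This is not IDS's own code: the function names, the sampled-list input, and the tolerance `eps` are assumptions, and structural changes are not handled.

```python
def sign(x, eps=1e-9):
    """Return -1, 0, or +1, treating values within eps of zero as 0."""
    return (x > eps) - (x < -eps)


def qualitative_segments(samples, eps=1e-9):
    """Split samples into runs with a constant direction of change.

    Returns (start_index, end_index, direction) tuples, where direction is
    -1 (decreasing), 0 (steady), or +1 (increasing).
    """
    segments = []
    start, current = 0, None
    for i in range(1, len(samples)):
        direction = sign(samples[i] - samples[i - 1], eps)
        if current is None:
            current = direction
        elif direction != current:
            segments.append((start, i - 1, current))
            start, current = i - 1, direction
    if current is not None:
        segments.append((start, len(samples) - 1, current))
    return segments
```

For a mass that falls to zero and then stays there, this yields one decreasing segment followed by one steady segment, matching the reaction state and final state of the acid-base example.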

Histories (2)

Histories are described by:
  Object description
    liquid(C), HCl(C)
  Structural description
    touches(C,D)
  Successor link
    (Which state comes next)
  Transition condition
    Attribute of successor link
    Tells conditions under which:
      Current state ends
      New state begins

Histories (3)

Examples

State 1:
  Objects: liquid(A), HCl(A), liquid(B), NaOH(B)
  Structural:
  Successor: state 2
  Transition: combine(A,B)

State 2:
  Objects: liquid(C), HCl(C), liquid(D), NaOH(D), liquid(E), NaCl(E)
  Structural: d/dt mass(C) < 0, d/dt mass(D) < 0, d/dt mass(E) > 0
  Successor: state 3
  Transition: mass(C)=0

State 3:
  Objects: liquid(F), NaOH(F), liquid(G), NaCl(G)
  Structural: n/a
  Successor: n/a
  Transition: n/a
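One way to hold such a history in a program is a record per state with the four parts listed earlier (objects, structural description, successor link, transition condition). The class and field names below are hypothetical, and the predicates are kept as plain strings rather than parsed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QualState:
    objects: list               # object descriptions, e.g. "HCl(A)"
    structural: list            # structural descriptions (may be empty)
    successor: Optional[int]    # id of the next state, or None for the last one
    transition: Optional[str]   # condition that ends this state


# The three-state example history, keyed by state number.
HISTORY = {
    1: QualState(["liquid(A)", "HCl(A)", "liquid(B)", "NaOH(B)"], [], 2, "combine(A,B)"),
    2: QualState(["liquid(C)", "HCl(C)", "liquid(D)", "NaOH(D)", "liquid(E)", "NaCl(E)"],
                 ["mass(C) decreasing", "mass(D) decreasing", "mass(E) increasing"],
                 3, "mass(C)=0"),
    3: QualState(["liquid(F)", "NaOH(F)", "liquid(G)", "NaCl(G)"], [], None, None),
}


def state_sequence(history, start=1):
    """Follow successor links from start and return the state ids in order."""
    order, current = [], start
    while current is not None:
        order.append(current)
        current = history[current].successor
    return order
```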

IDS State knowledge

Is-a hierarchy
  No distinction made between abstract and instance states!

State transition constraints:
  Transition conditions
    “When mass HCl reaches 0 reaction state ends and final state begins”
  Final conditions
    “When water reaches 100 C it starts to boil”

Within state knowledge
  E.g. Ideal Gas Law

Beginning state/Final state knowledge
  “For HCl + NaOH -> NaCl, mass(NaCl) = 1.64*mass(HCl)”

IDS Discovery

Hill climbing without backtracking
  (Where have we seen this before?)

“Clustering”
  Put new state in hierarchy
  Compare states lexicographically
  Also considers merging nodes

Cluster(SubRoot, NewState) {
  for each child C of SubRoot
    compute similarity between C and NewState

  Let C_hi be child with highest match score

  if (matchScore(C_hi, NewState) > threshold)
    if not(C_hi covers NewState)
      generalize C_hi to cover NewState
    Cluster(C_hi, NewState)
  else
    add NewState as child of SubRoot
    merge children of SubRoot
}
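The clustering step can be sketched in Python under some loud assumptions: states are sets of predicate strings, "covers" means the node's description is a subset of the state, "generalize" means intersecting descriptions, and the match score is Jaccard overlap. The slides say IDS compares states lexicographically, so the score here is a stand-in, and the final "merge children" step is left as a comment.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    description: frozenset                        # predicates shared by everything below
    children: list = field(default_factory=list)


def similarity(a, b):
    """Jaccard overlap of two predicate sets (assumed stand-in for the match score)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(sub_root, new_state, threshold=0.5):
    """Place new_state in the hierarchy below sub_root (hill climbing, no backtracking)."""
    best = max(sub_root.children,
               key=lambda c: similarity(c.description, new_state),
               default=None)
    if best is not None and similarity(best.description, new_state) > threshold:
        if not best.description <= new_state:      # best does not cover new_state
            best.description = best.description & new_state
        cluster(best, new_state, threshold)
    else:
        sub_root.children.append(Node(new_state))
        # The "merge children of SubRoot" step would run here.
```

Because each call commits to the single best child, the resulting hierarchy depends on the order in which histories arrive. In this simplified version a leaf that gets generalized loses its original description, so it is best read alongside the merge step, which is what creates abstract parents.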

Clustering example:

Before clustering

After clustering

IDS Merging

Merging does:
  Forms general knowledge
  Cuts down on number of states
  System not just “database of histories”

merge_children(SubRoot, NewChild) {
  for each child C of SubRoot except NewChild
    compute similarity between C and NewChild

  Let C1 = child with highest score
  Let C2 = child with second highest score
  Let C1_NewChild_s = match(NewChild, C1)
  Let C1_C2_s = match(C1, C2)

  if (C1_NewChild_s > C1_C2_s)
    C1_NewChild = merge(C1, NewChild)
    if (C1_NewChild != SubRoot)
      make C1_NewChild child of SubRoot
      remove SubRoot children C1, NewChild
      make C1, NewChild children of C1_NewChild
  else
    C1_C2 = merge(C1, C2)
    if (C1_C2 != SubRoot)
      make C1_C2 child of SubRoot
      remove SubRoot children C1, C2
      make C1, C2 children of C1_C2
}
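A Python sketch of the merge step follows, with the same assumptions as before made explicit: descriptions are sets of predicate strings, `match` is Jaccard overlap, and `merge` is set intersection. None of these choices is confirmed by the slides. The case with fewer than two other children, which the pseudocode does not address, is handled here by treating the missing C1/C2 score as the lowest possible.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    description: frozenset
    children: list = field(default_factory=list)


def match(a, b):
    """Jaccard overlap of two predicate sets (assumed stand-in for the match score)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def merge_children(sub_root, new_child):
    """Merge the best-matching pair under sub_root; return the new parent or None."""
    others = sorted((c for c in sub_root.children if c is not new_child),
                    key=lambda c: match(c.description, new_child.description),
                    reverse=True)
    if not others:
        return None
    c1 = others[0]
    c2 = others[1] if len(others) > 1 else None
    s_new = match(new_child.description, c1.description)
    s_c2 = match(c1.description, c2.description) if c2 is not None else float("-inf")
    a, b = (c1, new_child) if s_new > s_c2 else (c1, c2)
    merged = a.description & b.description
    if merged == sub_root.description:      # nothing more specific than SubRoot itself
        return None
    parent = Node(merged, [a, b])
    sub_root.children = [c for c in sub_root.children if c is not a and c is not b]
    sub_root.children.append(parent)
    return parent
```

With the two acid-base histories from the IDS example, the "salt and acid left over" and "salt and base left over" final states merge under a new parent that keeps only what they share.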

Merging Example

Before Merge After Merge

Discovering Laws

Qualitative Laws
  Successor links:
    When making a new non-leaf node, follow successor links of children
    Generalize up to the most specific node that covers all

Quantitative Laws
  Use BACON-like search for regularities:
    Among attributes of a given state
    When going from one state to its successor
    Between states (e.g. initial and final)
  Use numbers at leaf nodes as raw data
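One of the simplest BACON-style regularities is a constant ratio between two attributes, such as a final mass proportional to an initial mass. The sketch below checks only that one pattern; the function name, the tolerance, and the requirement that all `xs` be nonzero are assumptions, and the real search considers more forms than this.

```python
from statistics import mean


def constant_ratio(xs, ys, rel_tol=0.02):
    """Return the mean of y/x if all ratios agree within rel_tol, else None.

    Assumes every x is nonzero.
    """
    ratios = [y / x for x, y in zip(xs, ys)]
    centre = mean(ratios)
    if all(abs(r - centre) <= rel_tol * abs(centre) for r in ratios):
        return centre
    return None
```

Fed with numbers stored at leaf states (for example initial acid mass and final salt mass across several histories), a non-`None` result suggests a law of the form y = c * x.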

Example Discovering Successor Links

Before successor link After successor link

Example Learning Quantitative Law

IDS Discussion

Among the first systems to be explicitly aware of time
Qualitative states -> Limits the representation's search space

Room for improvement
  Needs to be given object hierarchy
  Reliance on qualitative states is a severe limitation!
  Ad hoc clustering (sensitive to the order in which histories are presented)
  Cannot explicitly parameterize time
  Assumes single inheritance

How would you fix some of these?