This file is part of the following reference:
Jaenisch, Holger (2009) Data modeling applications in space science and astronomy. PhD thesis, James Cook University.
Access to this file is available from:
http://eprints.jcu.edu.au/12151
Data Modeling Applications in Space Science and Astronomy
Holger Marcel Jaenisch
A portfolio mini-thesis submitted in partial fulfillment for the degree of
Doctor of Astronomy
James Cook University
STATEMENT OF ACCESS
I, the undersigned, author of this work, understand that James Cook University will
make this thesis available for use within the University Library and, via the Australian
Digital Theses network, for use elsewhere.
I understand that, as an unpublished work, a thesis has significant protection under the
Copyright Act; and
I wish the following restrictions to be placed on this work:
1. Copyright (All Rights Reserved)
2. Permission to copy portions of this work for academic or research purposes is
granted, provided appropriate reference and citations to the author and this work are
made.
3. This work may not be reproduced and sold for profit without express permission
from the author.
Holger Marcel Jaenisch, PhD January 2009
Signature Date
STATEMENT OF SOURCES
DECLARATION
I declare that this thesis is my own work and has not been submitted in any form for
another degree or diploma at any university or other institution of tertiary education.
Information derived from the published or unpublished work of others has been
acknowledged in the text and a list of references is given.
Holger Marcel Jaenisch, PhD January 2009
Signature Date
STATEMENT ON THE CONTRIBUTION OF OTHERS
Supervision: Dr. Andrew Walsh
Dr. Graeme White
Dr. David Blank
Editorial Assistance: Mr. James Handley
Sponsorship: The following individuals and companies sponsored the presentation
and publication of several papers derived from this work (which are
noted in the References).
Dr. Nathaniel Albritton Amtec Corporation (USA)
Mr. Michael Hicklen dtech Systems, Inc. (USA)
Mr. Richard Esslinger Axiom Corporation (USA)
Mr. John Deacon Sparta, Inc. (USA)
Mr. Jeffrey Faucheux Sparta, Inc. (USA)
Dr. Carl Case Sparta, Inc. (USA)
Dr. Marvin Carroll Tec-Masters, Inc. (USA)
ACKNOWLEDGEMENTS
I would like to express my deep appreciation to the following individuals for technical
discussions and the development of creative test cases and examples demonstrating the
validity, utility, and appropriate application of Data Modeling as formulated in this thesis:
Marvin Barnett, James Handley, John Pooley, Dr. Marcel Thuerk, Ronald Juergens, Dr.
Miroslav Filipovic, Alexander Hons, William McDonald, Claude Songy, Richard Snow,
William Dionne, Michael McGary, Lee Ray, Dr. Kajesh Patek, Robert Morgan, James Myers,
Tim Aden, Scott McPheeters, Louis Bonham, Burt Vierge, Stoney Massey, Michael Conn,
Richard Gregg, Dr. David White, Marty Krizian, Stephanie Berry, Thomas Kotsch, Dr.
William Pickle, Dr. Mehmet Erengil, Dr. Stephen Kornguth, Dr. Matthew Edwards, James
Hunt, Todd Alexander, Emilio DiGiorgio, Thomas Martin, Joseph King, Dr. William
Douglas, Chester Rowe, Dr. Robert Carruth, Sieglinde Jaenisch, Kristi Jaenisch, David
Vineyard, Dr. Barry Johnson, Dr. John Caufield, Dr. Michael Curley, Dr. Nickolai Kukhtarev,
and Mr. Wintley Phipps.
Data Modeling Applications in Space Science and Astronomy
Abstract
This Portfolio Thesis introduces seventeen scientific and technical papers that present
unique numerical and analytical algorithms that can be applied to any type of data and
imagery, independent of scene content, but are developed and illustrated here in the
context of astronomy and space sciences. Specifically, application to autonomous
remote vehicles for exploration is developed as a unique enabling technology. For
each paper, I present its significance to a hypothetical autonomous rover mission as an
example of how the new algorithms might find utility.
Table of Contents
CHAPTER 1
PORTFOLIO THESIS OVERVIEW
1.1 Introduction. Page 1
1.2 Background. 3
1.3 Portfolio Introduction. 7
1.4 Robotic Rover Background –
An Example of the Need for Autonomous Software. 8
1.4.1 Autonomous Vehicles. 11
1.4.2 Path-Planning. 11
1.4.3 Reactive Navigation. 13
1.4.4 Data Modeling – The Key to Reactive Navigation. 14
1.5 Survey of Current Work. 15
1.6 Portfolio Thesis Papers – An Overview. 18
Paper 1-
Data Modeling For Radar Applications.
(Jaenisch & Handley, 2003). 18
Paper 2 -
Data Modeling of 1/f Noise Sets.
(Jaenisch & Handley, 2003). 19
Paper 3 -
Automatic Differential Equation Derivation From Sampled
Data For Simulation Parameter Sensitivity Analysis.
(Jaenisch, 2006). 19
Paper 4 -
Automatic Differential Equation Data Modeling for
UAV Situational Awareness.
(Jaenisch, 2003). 20
Paper 5 -
Data Model Predictive Control as a New
Mathematical Framework for Simulation and VV&A.
(Jaenisch, Handley & Hicklen, 2006). 21
Paper 6 -
Analytical Data Modeling for Equivalence
Proofing and Anchoring Simulations with Measured Data.
(Jaenisch, Handley & Hicklen, 2006). 21
Paper 7 -
Data Modeling Change Detection of Inventory
Flow and Instrument Calibration.
(Jaenisch, Handley & Bonham, 2003). 22
Paper 8 –
Data Modeling for Fault Detection.
(Jaenisch, Handley, Pooley & Murray, 2003). 22
Paper 9 -
Data Modeling for Virtual Observatory Data Mining.
(Jaenisch et al., 2004). 22
Paper 10 -
Virtual Instrument Prototyping with Data Modeling.
(Jaenisch et al., 2003). 23
Paper 11 -
Data Modeling of Deep Sky Images.
(Handley, Jaenisch et al., 2004). 24
Paper 12 -
Image Resolution and Performance Analysis of
Webcams for Ground Based Astronomy.
(Lim, Jaenisch, et al., 2004). 24
Paper 13 -
Classification of Jacoby Stellar Spectra Data
using Data Modeling.
(Jaenisch and Filipovic, 2002). 25
Paper 14 -
Enabling Unattended Data Logging and Publication
by Data Model Change Detection and
Environmental Awareness.
(Jaenisch, 2006). 26
Paper 15 -
Data Driven Differential Equation Modeling of fBm Processes.
(Jaenisch et al., 2003). 26
Paper 16 -
Data Modeling Augmentation of JPEG for Real-time
Streaming Video.
(Jaenisch and Handley, 2005). 27
Paper 17 -
Shai-Hulud: The Quest for Worm Sign.
(Jaenisch et al., 2005). 27
1.7 Conclusion. 28
CHAPTER 2
ADDITIONAL APPLICATIONS 30
2.1 Introduction. 30
2.2 Application to Data Ensemble Synthesis. 30
2.3 Application to Critical, Sensitive, and Key Parameter
Estimation for Analytical Modeling. 31
2.4 Application to State Estimation From Trajectory Estimates
for Tracking and Automatic Controller Design. 31
2.5 Application to Formal Proofing and Validation and Verification
of Software Simulations. 32
2.6 Application to Health Usage Management Systems (HUMS)
and System Health Diagnostics and Mean Time to Failure Prognostics. 32
2.7 Application to Data Mining. 33
2.8 Application to Virtual Instrument Prototyping. 33
2.9 Application to Image Processing, Analysis and Modeling. 34
2.10 Application to Image Compression and Synthesis. 34
2.11 Application to Spectral Analysis and Modeling. 35
2.12 Application to Unmanned Vehicles. 35
2.13 Application to Information Assurance and Network Security. 36
2.15 Conclusion. 37
References. 38
Portfolio Thesis Papers. 44
CHAPTER 1
PORTFOLIO THESIS OVERVIEW
1.1 Introduction.
This thesis addresses how requirements for planned deep space exploration missions
with autonomous sensor platforms can be met using currently available mathematical
techniques. Current exploration missions rely on sending remote viewing cameras
and instruments to neighboring planets. As mission distances extend further
into space, message delays to and from Earth prohibit interactive decision-making and
mission execution.
Enabling remote exploration beyond the interactive delay boundary requires remote
sensor platform cognition, mentation (Blackmore 2004) and self-awareness. Onboard
algorithms must quickly process and understand their local changing environment to
enable in-situ and real-time course-of-action modification (the simplest form of which
is reactive navigation). Mission planners cannot be certain what will be encountered
on distant worlds. Without prior examples, knowing when something out of the
ordinary or a novelty is encountered requires cognitive awareness (because you
cannot know in advance what you do not know, and therefore cannot prepare a
classifier to deal with it in advance). The intelligent sensor must discern novelties and
off-nominal events as though an investigator were standing planet-side on the distant
world. Infrequent data transmission and limited energy sources restrict onboard
computing power for image collection, processing, transmission, and in-situ science
analysis. Although digital computing and processing has become the main tool for
remote-control vehicles, truly efficient real-time computing will require hardware-based
analog computing that is tolerant of harsh environments and resistant to noise.
The principal shortcoming of analog computers (Korn and Korn 1964) has been cited as
being limited to processing simple differential equations. Complex modern algorithms are
rarely in simple differential form. Hence, the real-time power of the analog computer
has not been brought forward, because of the mathematical inability to recast complex
decision-making architectures and algorithms into a series of differential equations
that can be adequately solved on analog computers, which would enable miniaturization
and embedding on small-frame autonomous vehicles. If this mathematical magic
bridge could be built (converting complex digital algorithms into differential equation
form), then the analog computational approach would once again become a very reasonable
and very powerful tool for advancing cognitive methods onto fully autonomous
platforms. The inability of an intelligent agent such as a human analyst to reside
station-side for remote missions limits the complexity of tasks that can be accomplished
with remote-control based systems.
This limitation is partly compensated for by building knowledge models of the
environment and transmitting those models and any deviations from them as
“interesting” to mission analysts (Jacoby 1991). An intelligent and self-aware sensor
must be able to decide between photographing, analyzing and sampling rocks
encountered on a strewn field. The mathematical framework and bridge for modeling
changing information encountered during autonomous exploration with in-situ
derived differential equations and using them to detect novelties is called
Entropyology (en’tro-pe’-ah-lo-gee) (from entropy-ology, term coined in 1996 along
with Data Modeling by the author - the science of changing-information). This
portfolio thesis introduces the methods of Data Modeling and a myriad of successful
applications of these fundamental mathematical tools to demonstrate the feasibility of
developing autonomous probe missions.
Data Modeling is a fledgling theory. Full proofs of the general utility and limitations
of its methods are still being developed. The fact that a consistent set of methods can
be successfully applied to a diverse set of applications is itself not a proof of
universality. However, lacking such proofs does not preclude its utility. The power
of this new Data Modeling theory is the simplicity of the techniques involved once a
subtle but profound shift in the perspective of its use and interpretation of results has
occurred. The methods comprising Data Modeling are not themselves new. Many
will appear simple and previously tried but abandoned. This is indeed the case.
Many simple but powerful techniques have fallen by the wayside because of innate
limitations of the techniques themselves. But what if these limitations could be
addressed and solved, or removed altogether? What if limitations that could not be
removed proved exploitable and useful when viewed from a different
perspective and used in a novel way? Then the powerful techniques are resurrected and
become attractive again, because the poison of a technique's limitations has been
transformed into a powerful cure. It is a keen eye for recognizing the familiarity
of methods, while discerning the subtlety of the way in which they are
used and applied, that makes the difference between sulfur, saltpeter, and charcoal
on the one hand and gunpowder on the other. Simply recognizing the fundamental
ingredients, knowing they must be mixed, or being aware of their general properties
does not lead to the formula for gunpowder or the more elusive technique for
producing it. In the same vein, the power of Data Modeling is not in the methods
themselves but in how they are applied and how the results are used.
1.2 Background.
Cognition and mentation are aspects of self-awareness and do not fall under the
machine intelligence and artificial intelligence (AI) paradigm (Winkler 1972). AI is
the ability to program machines by example rather than by heuristics (text-based rules
of thumb) or explicit program instructions. Programming by example is commonly
referred to as pattern matching or pattern recognition and emphasizes learning
differences between at least two distinct classes or sets. It is not possible to apply
pattern recognition methods to single-class cases.
Machine learning is the ability to implement adaptive algorithms on machines to
enable decisions to be modified in response to changing stimuli. Neither paradigm
encompasses the transition to biomechanics or mechanistic-bioinformatics, which
embraces cognition (mental awareness and mentation): the human thought process of
extracting relationships and salient features, formulating hypotheses, and making
predictions while updating views.
In order to achieve autonomy, the first principle of cognition that must be achieved
is self-awareness. Self-awareness is the ability to recognize that which is me from that
which is not me without anyone or anything explicitly showing the difference. This is
the process of forming expertise in a subject, just as an art specialist learns the details
of the technique of an individual master painter so that the specialist can then easily
discern forgeries when encountered in the future. This ability was attained without
having been trained on learning differences between the original and forgeries, since
forgeries are infinite in number but the originals are finite. Instead, sufficient detail is
discerned and learned that makes the specialist hypersensitive to deviations from such
details when encountered. The specialist is thus able to distinguish things that are the
same from a myriad of things that are similar but distinct without having to rely on
classification.
The expert may then attempt to teach such expertise to others by presenting examples
of forgeries and originals to give an example of what is being keyed off of. This
process is teaching classification by pattern recognition and is markedly inferior to
gaining expertise by experience alone. Students of the pattern recognition method
intrinsically learn to emphasize the differences between the presented examples more
than gaining expertise in details of the originals. This leads to an inability to recognize
good forgeries that the original expert can still easily distinguish. This original form
of gaining expertise is not pattern matching or pattern recognition; it is a product of
cognition and mentation on aspects of similarities from a single class alone. This is
the ability that must be embodied as a computational process in a vehicle if it is to
achieve self-awareness or cognition.
The process of mentation is understanding and projecting variations of the learned
style or painting style that makes it possible for the expert who can recognize
forgeries to also recognize a newly discovered painting by the artist of interest even
when that image has never been seen before. This is a result of mentation which
followed cognition. This projecting of expectations based on perceived experience
enables recognition of the sameness even when the object is vastly different from the
gross appearance of the original set. How can such elusive magical qualities be
embodied in a cybernetic process stored in an autonomous vehicle? This is the
question that is being addressed with the proper application of Data Models and the
Change Detectors they empower. This ability to discern from a single class can be
achieved with mathematical structures if used in profoundly subtle but powerful ways.
The Data Model is a formal differential equation representation of numerical data
(Christ et al. 1999) (Wang and Zheng 2005) that can be obtained from sensors that
observe the world around a rover. To make the processing power of the rover small
and able to respond to real-world environmental changes as they occur requires true
real-time computational speeds. Traditionally, differential equation models are derived
by humans after years of data collection and simplification, with abstraction of results
into trends that can be represented with relatively simple smooth and continuous
differential equations. Such equations are few in number, and their solution is taught
and readily available to the scientist or engineer; but deriving the equations themselves
is a holy grail in the symbolic computational arena and represents a quest for true
machine cognition (Cohen 2002) (Cohen 2003).
The techniques of Data Modeling allow new differential equations that describe the
relationships between observed data sets to be derived autonomously and independent
of human intervention. Once derived these equations can be solved using classical
mathematical techniques as required for analysis. The equations form a corpus of
knowledge or a descriptive hypothesis of the information relationships encountered.
This synthesis and abstraction of information relationships is the goal of data mining
and is vastly different from simple relational searching, keyword matching, or pattern
recognition applied to data base structures.
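As a purely illustrative sketch (not the author's Data Modeling algorithm, whose details appear in the attached papers), one simple way a governing equation can be recovered from sampled data without human intervention is to estimate derivatives numerically and fit their dependence on the state by least squares; the polynomial form and the test signal below are assumptions chosen for clarity:

```python
import numpy as np

def fit_ode_from_samples(t, x, degree=2):
    """Fit dx/dt = p(x), a polynomial in x, to sampled data.

    Derivatives are estimated by finite differences (np.gradient) and
    the polynomial coefficients by linear least squares. Returns the
    coefficients of p in descending powers (numpy convention).
    """
    dxdt = np.gradient(x, t)              # numerical derivative estimate
    coeffs = np.polyfit(x, dxdt, degree)  # least-squares polynomial fit
    return coeffs

# Example: noiseless samples of x(t) = exp(-0.5 t), whose true
# governing equation is dx/dt = -0.5 x.
t = np.linspace(0.0, 4.0, 200)
x = np.exp(-0.5 * t)
coeffs = fit_ode_from_samples(t, x, degree=1)
# The linear coefficient recovers the decay rate, about -0.5.
```

Once such an equation is in hand it can be integrated, interpolated, or extrapolated by classical means, which is the sense in which a model of the data is more useful than the data itself.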
Database information structures (Duncan 1975) (Das 1992) have been labeled as
“data models”, though this is very misleading since these data models are simply flow
charts on the database contents showing relationships between information flow in an
organization. Nowhere is a model created or presented of the “data” itself.
Conversely, a Data Model is an explicit mathematical model of the data corpus itself
that can not only reconstruct the data from which it was derived, but also interpolate
as well as predict via extrapolation both forward and backward indefinitely. Hence, a
derived model of a data set is infinitely more powerful and useful than storing a data
set itself. Further, the information flow structures embodied by the database
community term of “data model” can in fact also be converted into a true Data Model
which would yield differential equation models of the information within the structure
itself, yielding formal proofing (Jaenisch 2006) and predictive modeling capability to
such models! This allows data models for databases to be converted into Data Models
and enables databases to be converted into true knowledge bases. Information
processing is different from data processing; to be useful for cognition, the former is
required.
In insects, such processing is referred to as hardwiring or “instinct” (Amos 2004). We
can make the analogy that instinct programming is to insects what mental thoughts are
to humans, in the same way analog computers based on operational amplifiers work in
true real-time but are specialized, while digital computers are more general but far less
efficient and much slower in obtaining results. The ability to embody advanced
information processing algorithms into hardware in the form of analog computers has
heretofore been impossible. Since analog computers typically process only differential
equations, it would require all forms of information to be mapped into the form of a
forcing function feeding dynamic models or coupled differential equations.
Where would these system level model differential equations come from? These
differential equations are the autonomously derived Data Models that the author
created that extend the early Group Method of Data Handling (GMDH) (Madala and
Ivakhnenko 1994) and polynomial network developments far beyond simple pattern
matching. By generalizing and extending the GMDH work into a form of deriving
multi-variable analytical equations rather than simple nested second order
polynomials, true partial differential equation models can be derived and implemented
in analog computer hardware. This means the world can be sampled and understood
mathematically, enabling hardware to be created that processes these perceptions in
real-time, precisely because they are in differential equation form.
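To fix ideas, a minimal sketch of one classical GMDH-style layer follows: every pair of inputs gets a quadratic (Ivakhnenko polynomial) model, and the pairs are ranked by error on held-out data. This is the simple nested second-order starting point that Data Modeling generalizes, not the generalization itself, and the function names and selection criterion are illustrative assumptions:

```python
import itertools
import numpy as np

def quad_features(a, b):
    """Ivakhnenko polynomial terms for one input pair: 1, a, b, a*b, a^2, b^2."""
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])

def gmdh_layer(X_train, y_train, X_val, y_val, keep=3):
    """One GMDH layer: fit a quadratic model for every input pair,
    rank by validation error, and return the best `keep` models along
    with their outputs on the training set (inputs to the next layer)."""
    results = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        F = quad_features(X_train[:, i], X_train[:, j])
        w, *_ = np.linalg.lstsq(F, y_train, rcond=None)
        Fv = quad_features(X_val[:, i], X_val[:, j])
        err = np.mean((Fv @ w - y_val) ** 2)   # held-out selection criterion
        results.append((err, i, j, w))
    results.sort(key=lambda r: r[0])
    best = results[:keep]
    out_train = np.column_stack([quad_features(X_train[:, i], X_train[:, j]) @ w
                                 for _, i, j, w in best])
    return best, out_train

# Synthetic example: the target depends only on inputs 0 and 1,
# and the layer selects exactly that pair.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 3))
y = 1 + 2 * X[:, 0] * X[:, 1]
best, out_train = gmdh_layer(X[:100], y[:100], X[100:], y[100:])
```

Stacking such layers yields the nested second-order polynomials of the original GMDH; the extension described above replaces them with multi-variable analytical equations.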
Bayesian methods (Gustafson 1994) (Andrews and Stafford 2000) (Grzymala-Busse
1991) (Bolstad 2004) try to circumvent the inability to obtain the actual differential
equations of motion or dynamics by characterizing the average statistical behavior
and variance of observed dynamics. This covariance based description throws away
all the detailed information inherent in the relationships between the observed data by
representing only average occurrence as probabilities. The average observed behavior
never provides a mathematical model of the actual dynamics.
Without the actual equations, true forecasting forward and backward in time is not
possible. The crude GMDH-type methods, although applied to many areas, are
notorious for failing spectacularly to generalize or extrapolate and are very unstable
even for interpolating. By combining fractal and chaos algorithm methods with Data
Modeling differential equations, it is possible to convert streams of data directly into
differential equation form for ease of storing, comparing, inverse parameter
estimation, and future comparison as well as forecasting and backcasting. The tools of
Data Modeling make it possible to achieve a complete robust world-view paradigm
for a robotic sensor in a common framework that is independent of the sensor source
or type. This ability seems almost magical, but it is the goal of my thesis to show that
the fundamental mathematical underpinnings already exist and are well
established. It simply requires a subtle but critical paradigm shift regarding how the
methods are viewed and applied. This shift enables a tremendous leap forward in
autonomous cognition. In fact, the change detector created from Data Models is the
first true example of a “Self-Aware” algorithm that is ready to be explored and
exploited for its potential to enable cognitive automatons (Collins and Horn 1991).
1.3 Portfolio Introduction.
In this Portfolio Thesis, I introduce seventeen scientific and technical papers that
present unique algorithms for information processing. These papers are the basis of
my submission for the degree of Doctor of Astronomy (DoA) at James Cook
University.
These algorithms can be applied to any type of data and imagery, independent of
scene content, but are developed and illustrated here in the context of astronomy and
space sciences. The computational processes could be as readily applied to
underwater reef studies, mammograms, portraits, microscopic images, radar images,
synthetic computer generated imagery, or simply pseudo-colored data arrays
comprised of code vectors, text, or any other embodiment of data.
The attached Portfolio Papers have been published over the period 2002 to 2006.
1.4 Robotic Rover Background - An Example of the Need for Autonomous
Software.
This section of the Portfolio Thesis is specifically to illustrate the purpose and need
for the algorithms that are presented below and in the attached papers. As stated
above, these algorithms have use in very many (possibly all) information processing
environments, but they are presented here in the context of astronomy as they were
developed for that use.
The use of robotic rovers (Bond and Allman 1996) is both an attractive and necessary
option for the exploration of planetary and other remote objects. Although state-of-
the-art technologies for remote control are presently used, fundamental limitations in
this approach exist that prevent the realization of fully autonomous missions.
The principal issue with space exploration is the distance between the rover and the
Earth data acquisition and command station. The separation between the rover and its
base results in delayed communications and slower data streams.
For example, radio signals between Mars and Earth take between about 4 and 21
minutes transit time. Mars' distance from Earth ranges from a minimum of 0.5 AU to
a maximum of 2.5 AU (when the Earth and Mars are on opposite sides of the Sun). 1
AU is about 150 million km and light travels at approximately 300,000 km/s, so light
travels 1 AU in approximately 500 seconds. Therefore, transmitting a radio signal to
Mars takes a minimum of 250 s (approximately 4 minutes) and a maximum of
1250 s (approximately 21 minutes). A round trip doubles these numbers, giving 8
minutes to 42 minutes for a response to be received after transmitting the original
message. This does not take into account message drop-out, handshaking for message
confirmation and message processing prior to response. This long distance itself also
imposes low communication bandwidth limits for information assurance and signal
quality.
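The delay figures above follow from a few lines of arithmetic, using the same rounded constants as the text:

```python
# Light-time arithmetic for the Earth-Mars delays quoted above.
AU_KM = 150e6            # 1 AU, about 150 million km
C_KM_S = 300_000.0       # speed of light, approximately 300,000 km/s

def round_trip_delay_s(distance_au):
    """Round-trip radio delay for a given Earth-Mars distance in AU."""
    return 2 * distance_au * AU_KM / C_KM_S

near_min = round_trip_delay_s(0.5) / 60   # minutes, at closest approach (~8.3)
far_min = round_trip_delay_s(2.5) / 60    # minutes, at greatest distance (~41.7)
```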
This tyranny of distance precludes the use of teleoperation (every individual
movement controlled by human intervention) for controlling the vehicle and unveils
the requirement for vehicle autonomy.
Major improvements have been made in deep space telecommunications, including
the implementation of high-bandwidth communication systems operating at Ka-band
("K-above", a radio frequency of 32 GHz which is off-limits to commercial users).
Ka-band is a factor of four higher in frequency than the current X-band technology
used in deep space communications, and will provide a data rate from Mars of more
than 2 megabits per second (today the maximum data rate transmitted to Earth by a
spacecraft at Mars is about 128 kilobits per second). Current technological
developments are also directed toward improving the performance of radio
communications through the use of large deployable spacecraft antennas and ground-
based antenna arrays.
In addition, optical transmission, which relies on laser light instead of radio waves, is
being explored to enable video-rate communications from Mars, and large gains in
data rate for outer solar system exploration. Laser communications from deep space
will be received by optical telescopes operating either on the ground or on platforms
above the Earth's atmosphere. In the case of orbiting receivers and relay transducers,
real-time access can be anticipated in the future using a "trunk line" from the Earth to
a relay spacecraft in orbit around the distant planet, and proximity links between that
spacecraft and landers, rovers, and other vehicles. These exciting activities provide
intellectual and career opportunities for many systems engineers and experts in optical
communication.
However, the present communication delay makes Earth-bound remote control of
exploration vehicles at distances beyond the Earth's moon impractical – and
autonomous technology is a key to meeting this challenge. The Mars technology
program, for example, is developing technologies that will enable a rover to travel to
and sample a rock 10 meters away with a single command (instead of the five to 10
commands required by the present rovers) (Boden and Larson 1996).
Future planetary exploration missions may include the study of Mars by a series of
robotic missions, and the surface exploration of Venus and Titan. Plans currently call
for vehicles to carry samples back from distant bodies as far away as Venus or a
comet's surface, and for an investigation of the Moon as a source of minerals for
future space manufacturing, and even for earthly use. Deciding which rock to bring
back from a strewn field remains an elusive problem that will have to be addressed.
The subsurface exploration of solar system bodies, such as Mars and Jupiter’s moon
Europa, will require yet another dimension of robotic mobility, and more powerful
systems still will be needed for aerial vehicles operating in the atmosphere of Titan.
For the rover to function properly as an autonomous agent (Kott n.d.), it is essential
that it can sense, model, and discern its environment, enabling it to understand what it
perceives and allowing it to respond and react to these perceptions. For example, the
rover must know if it is going uphill or downhill, if there is an obstacle in front of it
and so on. This is achieved by the perception system of the rover. The perception
system senses the environment by using physical and virtual sensors.
Physical sensors could include wheel encoders, inclinometers, cameras, and laser
rangefinders which detect the immediate terrain environment of the robot. Virtual
sensors are mathematical functions defined by the values of some physical sensors.
For example, virtual sensors can give the absolute spatial location of the rover in
Cartesian coordinates.
Crude forms of virtual sensors are called smart sensors (Hall and McMullen 2004)
(Klein 2004). Smart sensors preprocess their own data prior to dissemination while a
virtual sensor may combine information from several sources including its own to
yield a data stream of information that is not directly observable by any sensor alone.
An example might be detecting networks of electrically connected objects such as
Improvised Explosive Devices (IED) or roadside bombs, where the entire IED is the
object of interest but no actual sensor can find such IEDs. Instead, a collection of
sensors is used to sense orthogonal and independent components of the IED with
varying levels of confidence, and the aggregate information from the various sensors,
each of which alone sees only pieces of the IED, infers the existence of the IED itself. This fused
suite of sensors forms a “virtual sensor” of IEDs and requires sensor fusion and
information fusion at all levels to be realized using a form of spatial voting and
reasoning to combine disparate sensor information.
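As a toy illustration of the spatial-voting idea (not any specific fielded system), several weak, co-located sensor reports can be accumulated on a spatial grid until their aggregate evidence crosses a detection threshold that no single sensor reaches alone; all the sensor cues and numeric values below are invented for the example:

```python
import numpy as np

def spatial_vote(detections, grid_shape, threshold):
    """Accumulate per-sensor confidence 'votes' on a spatial grid and
    flag cells whose aggregate evidence reaches a detection threshold.

    `detections` holds, per sensor, a list of (row, col, confidence)
    tuples; no single sensor need exceed the threshold on its own.
    """
    grid = np.zeros(grid_shape)
    for sensor_hits in detections:
        for r, c, conf in sensor_hits:
            grid[r, c] += conf
    return grid, np.argwhere(grid >= threshold)

# Three sensors each weakly report a component at cell (2, 3).
sensors = [
    [(2, 3, 0.4)],                 # e.g. a wire-like radar return
    [(2, 3, 0.35), (0, 0, 0.3)],   # a thermal anomaly, plus a stray hit
    [(2, 3, 0.3)],                 # a disturbed-earth cue
]
grid, hits = spatial_vote(sensors, grid_shape=(5, 5), threshold=1.0)
# Only cell (2, 3), with aggregate evidence 1.05, is flagged.
```

The flagged cells are the output of the "virtual sensor": a detection that exists only in the fused evidence, not in any single sensor's data stream.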
1.4.1 Autonomous Vehicles
To meet the increasing requirements, a totally autonomous vehicle that can travel for
extended periods carrying out its assigned tasks is necessary. This is beyond the
present state-of-the-art of artificial intelligence, and methods currently under
investigation within the space community will provide only limited vehicle autonomy
at best.
This thesis sets out to demonstrate how Data Modeling overcomes many of these
limitations.
There are various degrees of autonomy and various approaches in the way autonomy
is granted to a rover. These factors determine the specific navigation technique used,
but all fall under two broad categories:
§ Path-Planning navigation
§ Reactive navigation (real-time obstacle avoidance navigation)
1.4.2 Path-Planning
In Path-Planning (Taguchi and Jugulum 2002), some form of terrain analysis is
performed and a safe route is decided before the vehicle is commanded to start
moving. In all cases, it is assumed that there is no previous knowledge of the terrain.
The route therefore requires reactive navigation as the rover moves towards a goal
and avoids obstacles or untraversable territory as it encounters them (Kim and Lewis
1998).
The core process invoked here is path planning, which involves the superposition of a
global low-resolution height/elevation (topographic) map from the orbiter onto a high-
resolution map from the rover. The process of matching local terrain to global
overviews is called terrain matching and is achieved by an algorithm in which the
translation that best matches the local and the global height maps is sought.
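A brute-force version of this translation search can be sketched in a few lines; a real system would also handle rotation, differing resolutions, and sensor noise, so this is only a minimal illustration on synthetic elevation data:

```python
import numpy as np

def match_terrain(global_map, local_map):
    """Find the translation of `local_map` within `global_map` that
    minimises the sum of squared height differences (exhaustive search)."""
    gh, gw = global_map.shape
    lh, lw = local_map.shape
    best_ssd, best_offset = np.inf, (0, 0)
    for r in range(gh - lh + 1):
        for c in range(gw - lw + 1):
            ssd = np.sum((global_map[r:r+lh, c:c+lw] - local_map) ** 2)
            if ssd < best_ssd:
                best_ssd, best_offset = ssd, (r, c)
    return best_offset

# A synthetic 20x20 elevation map; the rover's 5x5 local patch is cut
# from offset (7, 11), and the search recovers that translation.
rng = np.random.default_rng(1)
terrain = rng.normal(size=(20, 20))
patch = terrain[7:12, 11:16]
offset = match_terrain(terrain, patch)
```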
An example of Path-Planning is Computer Aided Remote Driving (CARD). CARD
relies for its operation on stereo images sent from the rover to the ground-station.
Stereo imaging produces 3D pictures with depth perception, which enables the
operator to estimate a safe path for the vehicle, and designate this path using a 3D
cursor. The ground station computer calculates the control sequences for the vehicle
according to the designated path and then sends these to the vehicle. The vehicle
moves to the new location, sends a new picture, and the process is repeated. Depending
on the terrain and visibility conditions, the rover can move about 20 meters on each of
these iterations. Taking into consideration the delays involved (round-trip signal
delay, operator delay and computation delay), this approach results in an average
speed of somewhat less than a centimeter per second.
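The quoted average speed can be sanity-checked with rough numbers. All delay values below are assumed purely for illustration (the Mars round-trip light time alone varies from roughly 6 to 44 minutes depending on planetary geometry):

```python
# Rough cycle time for one CARD iteration (all values assumed for illustration).
round_trip_delay_s = 20 * 60   # signal round trip, ~20 minutes
operator_delay_s = 10 * 60     # operator inspects stereo pair, designates 3D path
computation_delay_s = 5 * 60   # ground station generates the command sequence
cycle_s = round_trip_delay_s + operator_delay_s + computation_delay_s

distance_m = 20                # metres travelled per iteration (from the text)
speed_cm_per_s = 100 * distance_m / cycle_s
print(round(speed_cm_per_s, 2))  # → 0.95, i.e. under a centimetre per second
```

Even with optimistic operator and computation delays, the light-time term alone keeps the average speed below about a centimetre per second, which is the bound stated above.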
The major advantage of CARD compared to teleoperation is that relatively little
information is transmitted. In addition, since the major computation is done on Earth,
the computers used can be powerful, saving the rover from carrying and powering
significant computers. However, no matter how fast the path planning and
computations are performed, the round-trip signal delay cannot be significantly
reduced and the average speed is unlikely to improve dramatically. Therefore, CARD
is better suited to short-distance travel, such as traversing a difficult area or
performing a number of experiments in one location.
In semi-autonomous Path Planning navigation, the rover is given approximate routes
from Earth, but plans its local routes autonomously. Some operations are performed
on Earth while others are performed onboard the vehicle. In this scenario, a satellite
orbiting Mars sends stereo images of the areas of interest to the ground-station on
Earth. These images have a resolution of about a meter and enable operators to plan a
safe route for the vehicle for distances of, say, a few kilometers. In addition to path
planning, an elevation map is produced by computers. Both the elevation map and the
planned path are sent to the rover.
Onboard the rover, laser rangefinders and stereo cameras are used to obtain images of
the immediate environment. These images are used to compute a local topographic
map. This map is then matched to the local portion of the global map sent from Earth,
so that the rover can position itself on the global map and follow the designated route.
By comparison with the global map sent from Earth and the local map obtained from
the rover's sensors, a new detailed high-resolution map is produced by computers
onboard the rover. This map is eventually analyzed on the rover to determine the safe
areas over which to drive, while at the same time adhering to the route sent from
Earth.
1.4.3 Reactive Navigation.
Reactive Navigation (Maciejowski 1989) differs from path planning in that, while a
goal location is known, the rover does not plan its path but rather moves towards the
location by reacting to its immediate environment. There are various approaches to
Reactive Navigation, but all stem from the design belief that robust autonomous
performance can be achieved using minimal computational capabilities, as opposed to
the enormous computational requirements of path planning techniques.
Another modern approach to Reactive Navigation uses the adaptive capability of
artificial neural networks (Diamantaras and Kung 1996) (Jensen 1996), i.e., using
cumbersome back-propagation methods to tune reactive controllers or other more
specific methods such as support vector machines or radial basis function algorithms.
This approach maps perceptual situations into locomotion commands. These
mappings occur in distinct stages: (i) input data preprocessing, (ii) perceptual situation
classification, (iii) action association, and (iv) action validation. This series of
mappings leads to control architectures that are complicated, layered, brittle and
hierarchical. Existing applications of neural network techniques are limited by slow
training times, and recent attempts to develop on-line reactive navigation capable of
reflexive locomotion using trial-and-error feedback have met with limited
success.
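The four-stage mapping above can be caricatured in a few lines of code. The thresholds, the three perceptual classes, and the rule-based stand-in for the trained classifier below are all invented for illustration and do not come from the cited works.

```python
def preprocess(ranges):
    """(i) Input preprocessing: normalize raw [left, front, right] ranges to [0, 1]."""
    m = max(ranges)
    return [r / m for r in ranges]

def classify(features):
    """(ii) Perceptual situation classification (a stand-in for the trained net)."""
    left, front, right = features
    if front < 0.3:
        return "blocked"
    return "veer_left" if right < left else "veer_right"

# (iii) Action association: each perceptual class maps to a locomotion command.
ACTIONS = {"blocked": "stop", "veer_left": "turn_left", "veer_right": "turn_right"}

def validate(action, features):
    """(iv) Action validation: refuse to turn into a nearby obstacle."""
    left, front, right = features
    if action == "turn_left" and left < 0.2:
        return "stop"
    if action == "turn_right" and right < 0.2:
        return "stop"
    return action

def navigate(ranges):
    f = preprocess(ranges)
    return validate(ACTIONS[classify(f)], f)

print(navigate([4.0, 6.0, 8.0]))  # right side clearest → "turn_right"
```

Even this toy version shows why such pipelines become layered and brittle: each stage hard-codes assumptions that the next stage depends on, which is exactly the criticism made above.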
1.4.4 Data Modeling – The Key to Reactive Navigation.
Data Modeling has demonstrated autonomous performance – i.e., Data Modeling has
obtained Reactive Navigation (by deriving goals from mathematical predictions based
on local observed conditions) using minimal computational firmware (analog
computer). Analog computer implementation (Miola 1993) allows true real-time
operation. Analog computers are radiation hardened, temperature stable,
efficient in package size and power usage, and provide a method for closed loop servo
control to be achieved for the rover. Data Modeling gives the rover and its associated
sensor suite a higher reasoning capability (or a crude form of intuition) to supply
global and local navigation cues while enabling course-of-action autonomy (Widrow
and Walach 1996).
Entropyology combines, in novel and unique ways, mathematical theory and
techniques from data theory, similitude theory (Gelb and Vander Velde 1968),
analogy theory, cybernetics (Goldberg 1989), fractal geometry (Barnsley 2006), chaos
theory, catastrophe theory, complexity theory (self-organization), consciousness
theory, formal analysis (Jones 2002), parameter estimation (Raol, et al. 2004) (Doyle,
et al. 2002) (Mendel 1973), inverse modeling (Chalmond 2003), dimensional analysis
(Lipka 2003) (Palmer 2008), GMDH (ModelQuest 1996), Model Predictive Control
(MPC) theory (Lemke 2006), virtual sensor theory, and Data Modeling theory (Ladde
and Sambandham 2004) into a series of analogical as opposed to simple numerical
and analytical methods for converting data into equations and exploiting them using
the rules of mathematics.
Even though the tools of Data Modeling were developed under the influence of the
various cited fields, there is no direct mathematical derivation or lineage from any of
them. The tools developed by the author are unique and were created after he became
familiar with work in the various fields cited. The author created a commercial
software environment called MAJQANDA that is a collection of the classical and
Data Modeling based algorithms for decision-making and information processing.
1.5 Survey of Current Work.
A summary of the literature in this field is presented here in chronological order.
Only key works are presented, and the intent is to bring the reader up to speed in this
rapidly changing and exciting field. A more comprehensive survey of AI and machine
learning methods is outside the scope of this thesis and is readily available by simple
Google searching (http://www.google.com) of Internet resources. Repeating the
wealth of existing surveys here (none of which compare Data Modeling) would
distract from the focus of this thesis. Further, a survey of machine learning and AI
itself has also been exhaustively documented on-line and will not be repeated here.
The algorithms in this thesis demonstrate the methods the author has personally
applied to many real-world applications in the past and where Data Modeling has
proven superior. Though this does not prove universal optimality, it does demonstrate
diverse applicability. The cited references for Data Modeling are by necessity sparse,
since the author is the originator of the theory and it is too new to have a wide
audience. Several examples of third parties using Data Modeling on their own
projects are included in this thesis to demonstrate impartial critical review and support
of these methods. Further, the application for a Data Modeling patent as recently as
April 2008 indicates that these methods are indeed very new, and therefore the review
literature pertinent to these methods cannot be expected to be extensive at this time.
This review starts with a paper describing a fully integrated, modular, and object-
oriented mobile vehicle simulator, named Expectations, that was published by Yung
and Ye (1997). In that work, all software was written in C/C++. Because this
software was modular, it allowed different algorithms for collision avoidance,
navigation, path planning, and behavior learning to be easily swapped in and out
through a "plug and play" environment for algorithmic study and development. This
was a test-bed evaluation platform rather than a set of novel algorithms; however, it
did provide a variety of interesting benchmarks for later work, including that
presented in this thesis.
In the same year, Yung and Ye, this time joined by Fong (Yung, Ye and Fong,
1997), presented an integrated virtual-world simulator for road vehicles that was
modular, object-oriented, and hierarchical in its construction. This work now
provided a test-bed for automated driving, collision avoidance, navigation strategies,
tactics, and driving assistance. Again this work was a test-bed for evaluating the
proposed algorithms, and, again, it only presents a heuristic algorithm (explicitly
defined, text-based rules of thumb that therefore cannot be learned from the
environment without human intervention).
Ye and Wang (2001), in a paper entitled "A Novel Navigation Method for
Autonomous Mobile Vehicles", proposed software for a "navigator" that consisted of
software for obstacle avoidance, a goal seeker, a navigation supervisor, and an
environment evaluator. This methodology was specifically intended for use in
unknown and partially known environments, but relied on vehicles using traditional
fuzzy-logic operations (fuzzy-logic operations are an a priori assigned binning of
numerical values into fixed-increment meanings). Fuzzy rules cannot be discovered
autonomously but must be specified by a human. Such human-specified rules can be avoided.
The problems involved in navigation of autonomous vehicles using images from a
mounted camera are illustrated by Kumar, Jawahar and Narayanan (2004). In their
work, items such as signboards were imaged by the camera and the vehicle position
was estimated with respect to the fixed signs. The methods employed were based on
contour correspondence between the current view and a reference view. This was an
elegant vision-system approach, and if sufficient onboard processing power were
available, it could have formed an interesting autonomous vision system. The paper
does not lead to analytical models, nor does it try to derive any such models.
In a paper published in 2005, Curtin et al. discuss the current state of autonomous
underwater vehicles (AUVs) and the steps involved in transforming individual AUVs
into adaptive networked systems. Vehicle types are suggested based on their function
and operating environment (surface, interior, or bottom layer). In this work, various
platform navigation and control systems (software) were explored, including methods
of creating autonomy for the vehicle. The laws that govern intelligent maritime
navigation are reviewed and an autonomous controller with conventional collision
avoidance behavior is described. Performance and behavior of various systems are
compared. The algorithms presented by Curtin et al. (2005) are, however, heuristic
rather than analytical.
Srini (2006), in a web paper hosted by the IEEE Computer Society, presents a method
for using autonomous navigation principles in urban environments. In this
contribution, it was shown that current mobile sensor networks for autonomous
systems will also support autonomous modern vehicles equipped with actuators for
steering, throttling, and braking. Older vehicles are also supported by the system if
they are equipped with basic sensors and a smart phone. This paper documented
system interplay and mechanical fusion rather than autonomous environmental
awareness.
Fuzzy-logic control, combined with behavior-based control, was introduced into the
field by Lei and Li (2007). Behavior-based control changes the control
threshold based on the operational mode the device is in. This mimics, but is not, true
behavior-based adaptation, since the various understood operational modes are pre-
programmed and the executed course of action simply depends on the mode the
device finds itself in.
Lei and Li presented a new method using behavior-based control of mobile robots,
specifically for path planning in unknown environments. A heading for the robot was
formed by combining the target-steering behavior and the obstacle-avoidance
behavior, which took the robot towards the target while avoiding obstacles. These
algorithms were statistical in their nature and not analytical.
Because these algorithms were statistical in nature, all the detailed information
inherent in the relationships between the observed data is represented only as average
occurrences using probabilities. The average observed behavior never provides a
mathematical model of the actual dynamics, which is crucial for use by a real-time
analog computer to eliminate stimulus A/D conversion and system latency.
A US patent (Brandon 2007) presents a method for gathering data with an Unmanned
Vehicle (UMV), storing that information in one or more Radio Frequency
Identification (RFID) storage devices (or some other data storage device), and
then releasing these devices for later collection and analysis of the stored data. This
technique is interesting because of the concept of allowing a small rover, with
limited on-board storage space, to offload data for future pick-up. Limitations on the
number of off-loaded canisters are not addressed, nor are the best methods for storing
data for later retrieval.
Again in 2007, a web-published conference report by Kofod-Petersen et al. (2007)
gave several philosophical approaches to applying reasoning algorithms for
autonomous navigation. Most of the methods are heuristic or statistical in nature, and
there is no mention of analytical methods.
In 2008, several on-line articles appeared touting military interest in swarm
technology and insect-sized autonomous vehicles, not only for intelligence-gathering
missions but also as offensive/defensive weapons with attack capability and
sophisticated ad hoc network-centric information processing capabilities. It is these
latter devices that most closely resemble the autonomous vehicles for remote sensing
that Data Modeling seeks to empower and enable.
1.6 Portfolio Thesis Papers – An Overview.
Here I list my papers presented for the degree of Doctor of Astronomy. Publication
details are given in the reference list. The papers are attached.
For each paper, I present the significance of the paper to a hypothetical autonomous
rover mission, which is, as explained above, here used as an example of where the
new algorithms are used.
Paper 1- Jaenisch, H., and Handley, J., “Data Modeling for Radar
Applications”. Radar Conference, 2003. Proceedings of 2003 IEEE,
pp. 379-386 (2003).
This is a general Data Modeling paper that shows the applicability and utilization of
the common core numerical algorithms across a large array of remote sensing
problems. This paper shows how, by converting numerical data into equation form,
universal data sharing and decision-making algorithms become computationally
efficient and autonomous. This work sets the stage for addressing in detail the various
requirements of a rover mission that can be addressed using the Data Model methods,
thus alleviating the need for exotic Artificial Intelligence (AI) algorithms that are
unstable and computationally inefficient.
Paper 2 - Jaenisch, H., and Handley, J. “Data Modeling of 1/f noise sets”.
Noise in Complex Systems and Stochastic Dynamics. Edited by
Schimansky-Geier, Lutz; Abbot, Derek; Neiman, Alexander; Van
den Broeck, Christian. Proceedings of SPIE, Volume 5114, pp. 519-
532 (2003).
This paper presents a numerical method (that has the potential for use in an
autonomous planetary rover-type mission) for converting sensor data into a numerical
form suitable for modeling with analytical equations. This paper illustrates a general
purpose method for converting traditional measurements (such as temperature or
pressures as observed by a rover), imagery and internal status measurements, into a
single consistent algorithm for automatically deriving differential equations. These
derived differential equations are then used by the rover to make course-of-action
decisions.
Paper 3 - Jaenisch, H., “Automatic Differential Equation Derivation From
Sampled Data For Simulation Parameter Sensitivity Analysis”.
Intelligent Computing: Theory and Applications IV. Edited by
Priddy, Kevin L.; Ertin, Emre. Proceedings of SPIE, Volume 6229,
pp. 62290N (2006).
This paper presents a method by which the vehicle can process differential equations
obtained from sensor measurements (or status measurements) using the algorithm
described in the previous paper (Paper 2). The processing enables the identification
of the important parameters of the sensor measurement. This means, for example,
that the rover may be observing interesting changes in temperature or pressure as a
function of albedo or irradiance, and from these changes, for m a differential equation
model of the relationships. Once formed, parameter sensitivity analysis (as described
20
in this paper) enables the model to be simplified, and the crucial parameters to be
identified from among the available ensemble. Once identified, further detailed
monitoring by the vehicle can be undertaken to determine if unique or anomalous
events are occurring.
The algorithm presented in this paper enables derived differential equation
models to be saved and checked against stored equations for similarity or uniqueness.
Instead of databasing new raw data and trying to compare distributions, yielding
ambiguous answers, this approach derives differential equation models and compares
the equations for similarity. Since these equations capture complex information
and/or relationships, the ability to compare such equations achieves far more than
pattern or template matching techniques allow, and is very computationally efficient
and minimalist in storage size. Once the differential equations are built and
compared, they can be analytically inverted to obtain estimates and predictions of
input conditions expected for a variety of responses that may be encountered in the
future, or for simple function optimization. Novel in this approach is the ability of the
vehicle to vary the eigenmodes of a Fourier series model in real time during the
mission.
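One minimal way to realize such an equation-level comparison is to reduce each stored model to its coefficient vector and score similarity directly. This is a sketch only: the coefficient representation, the cosine score, and the threshold are assumptions for illustration, not the thesis method.

```python
import math

def similarity(coeffs_a, coeffs_b):
    """Cosine similarity between two models' coefficient vectors.
    1.0 means the coefficient patterns are proportional; near 0 means unrelated."""
    dot = sum(a * b for a, b in zip(coeffs_a, coeffs_b))
    na = math.sqrt(sum(a * a for a in coeffs_a))
    nb = math.sqrt(sum(b * b for b in coeffs_b))
    return dot / (na * nb)

# Two hypothetical derived models, e.g. coefficients of y'' + 2y' + 5y
# and a near-duplicate derived from a second observation window.
stored = [1.0, 2.0, 5.0]
new = [1.02, 1.97, 5.10]
print(similarity(stored, new) > 0.999)  # → True: treat as "seen before"
```

Comparing a handful of coefficients this way is far cheaper than comparing the raw data distributions the models were derived from, which is the efficiency argument made above.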
Paper 4 - Jaenisch, H., and Handley, J., “Automatic Differential Equation
Data Modeling for UAV Situational Awareness”. Unmanned
Systems. Edited by McDaniel, Mark. Proceedings of the Huntsville
Simulation Conference (The Society for Modeling and Simulation
International), (2003).
This paper presents an algorithm for use in the machine reasoning center to achieve
inverse modeling and parameter estimation. This algorithm enables physics-based
differential equations to be derived from noisy sampled measurement data alone.
Once these physics-based dynamic equations are derived, this algorithm enables
automatic and non-human-guided inversion of the equation model to obtain estimates
of physical parameters which cannot be directly measured by the sensors. An
example would be to infer wind speed aloft, or atmospheric turbulence, from the
model using measurements of star scintillation.
Paper 5 - Jaenisch, H., Handley, J., Hicklen, M., “Data Model Predictive
Control as a New Mathematical Framework for Simulation and
VV&A”. Intelligent Computing: Theory and Applications IV. Edited
by Priddy, Kevin L.; Ertin, Emre. Proceedings of SPIE, Volume
6229, pp. 62290U (2006).
This paper gives an algorithm that enables the autonomous reasoning capability of the
vehicle to compare two different (and independently derived) differential equations
that come from a single set of sensor observations. The purpose of this comparison is
to identify important underlying changes that may have occurred over time (an
example might be the comparison of two independently derived models of
atmospheric turbulence, made over two different time intervals, and at two different
locations). Being able to analytically compare the models eliminates the need for
cumbersome “heuristic expert rules” that otherwise would have to be pre-programmed
into the rover. Rather, this algorithm compares the structure and form of the
equations and the scale of the coefficients, and is able to see if the differences are
important and if they require further sampling or noting by the science package.
Paper 6 - Jaenisch, H., Handley, J., Hicklen, M., “Analytical Data Modeling
for Equivalence Proofing and Anchoring Simulations with Measured
Data”. Modeling, Simulation, and Verification of Space-based
Systems III. Edited by Motaghedi, Pejmun. Proceedings of SPIE,
Volume 6221, pp. 62210H (2006).
This paper presents an algorithm which gives the ability to make two independent
predictions of a measurement. It is thus possible to form a set of "expected"
observations. This ability forms the basis of a "goal" for decision-making and route
planning by enabling the rover to navigate to areas where the sensor measurements
most resemble the "expected" data based on the derived models. This ability is
achieved by giving the vehicle the power to convert differential equation models into
a Transfer Function form that makes mathematical "prognostication" possible.
Paper 7 - Jaenisch, H., Handley, J., Bonham, L., “Data Modeling Change
Detection of Inventory Flow and Instrument Calibration”,
Proceedings of SOLE Conference, Huntsville, AL, Aug 12-14, 2003.
This paper presents an algorithm that enables an autonomous vehicle to monitor its
own internal system status for mission readiness and health. This is accomplished by
a model of "nominal" performance derived throughout the mission from
internal sensors monitoring operations such as system power, temperature, vibrations,
energy usage, fuel availability, etc. This ability enables the calibration of the sensors
and systems to be checked automatically, and internally, by comparison with previous
equation models of nominal performance and system health (using the modeling
procedures presented in the earlier papers). This paper introduces situational
awareness that allows decisions to be made during the mission.
Paper 8 – Jaenisch, H., Handley, J., Pooley J., Murray S., “Data Modeling for
Fault Detection”, Society for Machinery Failure Prevention
Technology (MFPT), Volume 57, (2003).
This paper presents an algorithm that models internal system status to predict when
an engine or power-system fault will occur. This then enables a corrective and
preemptive course of action to be taken to mitigate system failure. It is critical for a
vehicle to manage its own critical resources efficiently whilst working autonomously
in an alien environment.
Paper 9 - Jaenisch, H., Handley, J., Lim, A., Filipovic, M., White, G., Hons,
A., Deragopian, G., Schneider, M., Edwards, M., “Data Modeling for
Virtual Observatory Data Mining”. Optimizing Scientific Return for
Astronomy through Information Technologies. Edited by Quinn,
Peter J.; Bridger, Alan. Proceedings of SPIE, Volume 5493, pp. 198-
220, (2004).
This paper presents a core Data Modeling algorithm that enables a large variety of
different types of equation models derived during a mission to be sorted and stored for
very fast retrieval and processing as needed. Efficient database management of
derived models enables the rover's computational resources to be optimized, and
conserved, by using the minimum number of models required for the current
mission goals. Fast swapping, and storing, of new models is important to minimize
the delay in response time to changes in mission status, and an efficient method of
coding and storing derived models is necessary and presented here.
Data Modeling enables prediction of similar but distinct imagery from previously
seen images and allows mathematical combining of information content into
analytical form, enabling fast mathematical analysis of the image structure. Unlike
bootstrapping or statistical methods that either resample the original image or use the
average statistics of the image to generate pixel values, the Data Modeling approach
provides an analytical expression whose input parameters are varied to create new
images that are similar to the original yet distinct.
These analytical equation Data Models are stored as a knowledge base and are
retrieved using index tags such as the Heptor or similarity (Reference Paper 6). Once
retrieved, the coefficients of the Data Model are perturbed to generate predictions of
the rover’s surrounding environment and establish a model of familiarity.
Paper 10 – Jaenisch, H., Handley, J., Pooley, J., Murray, S., “Virtual
Instrument Prototyping With Data Modeling”, JANNAF 39th
Combustion, 27th Airbreathing Propulsion, 21st Propulsion Systems
Hazards, and 3rd Modeling and Simulation Subcommittees Joint
Meeting, Colorado Springs, CO, (December 1 -5, 2003).
This paper presents a method to create custom “virtual sensors” as it encounters new
surroundings or observations. Using the previously defined algorithms, the algorithm
presented here enables the determination that some observed or predicted parameters
are critical. The important parameters can now be grouped automatically into a met-
model or combined-important-parameter model that enables observation of a new
inferred parameter. An example might be that a change in atmospheric turbulence
was found to be linked to changes in latitude and strongly sensitive to altitude. The
result would be a new equation model that couples altitude and latitude as a predictive
model for variation of weather patterns. This would have been unexpected, but would
24
have been crafted as the important variation encountered during its mission. The
algorithms presented here enable such a fusion of model information.
Paper 11 - Handley, J., Jaenisch, H., Lim, A., White, G., Hons, A., Filipovic,
M., Edwards, M., “Data Modeling of Deep Sky Images”. Modeling
and Systems Engineering for Astronomy. Edited by Craig, Simon C.;
Cullum, Martin J. Proceedings of SPIE, Volume 5497, pp. 449-460
(2004).
This paper presents an algorithm for converting image data into a mathematical
equation. The equation takes up much less room than the raw data and preserves
sufficient fidelity for decision-making algorithms to work. Unlike current approaches
that require humans to decipher imagery, low-resolution image models are sufficient
for rovers because their decision criteria are simplified and based on scene roll-up
statistics. Rovers do not impose the artificial pattern perception on imagery that
humans do, and are able to work satisfactorily with perception models rather than
large, memory-expensive images. Moving beyond storing images is critical, and
conversion into scene models for analytical comparison is necessary. These image
models may then be analytically compared and proofed with techniques described in
other papers (such as image compression). I demonstrated this algorithm on the M51
galaxy, as shown in Data Modeling Enabled Real Time Image Processing for Target
Discrimination, and assigned it to Handley so that he could demonstrate, as a school
project, the use of this algorithm on the entire Messier list, which is published in this
work.
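The idea of replacing a stored image with a compact analytical model can be illustrated by fitting a low-order polynomial surface to pixel intensities. NumPy least-squares here stands in for the thesis's Data Modeling algorithm; the polynomial order and the synthetic scene are assumptions for demonstration.

```python
import numpy as np

def fit_image_model(img, order=2):
    """Fit I(x, y) ~ sum of c_ij * x^i * y^j with i + j <= order.
    Returns the handful of coefficients that replace the full pixel array."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs, ys = xs.ravel() / w, ys.ravel() / h   # normalized coordinates
    terms = [xs**i * ys**j for i in range(order + 1)
             for j in range(order + 1 - i)]
    A = np.stack(terms, axis=1)
    coeffs, *_ = np.linalg.lstsq(A, img.ravel(), rcond=None)
    return coeffs, A

# A smooth synthetic "scene": 6 coefficients instead of 64*64 pixel values.
h = w = 64
ys, xs = np.mgrid[0:h, 0:w]
scene = 1.0 + 0.5 * (xs / w) + 0.25 * (ys / h) ** 2
coeffs, A = fit_image_model(scene)
recon = (A @ coeffs).reshape(h, w)
print(float(np.max(np.abs(recon - scene))) < 1e-8)  # → True for a polynomial scene
```

For real imagery the fit is lossy, but that is the point made above: a low-resolution analytical model that preserves scene roll-up statistics is sufficient for the rover's decision criteria, at a fraction of the storage cost.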
Paper 12 - Lim, A., Jaenisch, H., Handley, J., Filipovic, M., White, G., Hons,
A., Berrevoets, C., Deragopian, G., Payne, J., Schneider, M.,
Edwards, M., “Image resolution and performance analysis of
webcams for ground based astronomy". Ground-based Telescopes.
Edited by Oschmann, Jacobus M., Jr. Proceedings of SPIE, Volume
5489, pp. 1152-1163 (2004).
This paper presents an algorithm that enables small-footprint, low-profile optics to be
used to acquire high-quality images. Space in rover vehicles is critical, and large,
high-density CCD array cameras are slow to process. However, they are not
necessary if sufficient image- and resolution-enhancing software is available to enable
smaller cameras with higher frame readout speeds to be used. The algorithms I
developed and published in this paper demonstrate that a great deal of imaging
efficiency can be gained by image tracking and enhancement using low-resolution
cameras. Such systems can rival slower high-resolution systems.
Paper 13 - Jaenisch, H., and Filipovic, M., “Classification of Jacoby Stellar
Spectra Using Data Modeling”. Imaging Spectrometry VIII. Edited
by Shen, Sylvia S. Proceedings of SPIE, Volume 4816, pp. 296-307
(2002).
This paper presents an algorithm that creates a complicated classification scheme. As
large amounts of data are observed, the ability to sort and organize the data, and to
make decisions about its significance, is critical for data management. The
algorithm provided here enables autonomous information ontology.
This paper solved what was considered in the literature to be an unsolvable problem:
using a neural network or other method to build a classifier to distinguish the Jacoby
stellar classes based on stellar spectra. The literature reported achieving only 70-80%
classification success on the baseline data, with no reasonable explanation of why the
baseline data itself could not be learned 100% correctly. In the published literature it
was not even possible to overfit the data to learn it 100% correctly. My contribution
was to show that with Data Model Change Detectors I was able to achieve 100%
correct classification, and I was the first to do so; I also found the explanation for why
the other methods failed. I found which stellar classes were populated with
contradictory examples of spectra, indicating that some of the classes specified by
human observers were contradictory, overlapping, and not numerically distinct. This
is a very fundamental and important result for the astronomy community, because it
points to which star classes called different by astronomers are actually not distinct
from an information-theory point of view, and suggests the areas where reevaluation
by astronomers should be done. The finding that 100% correct classification on the
training data was achieved was thus unique and a breakthrough result.
Paper 14 - Jaenisch, H., “Enabling Unattended Data Logging and Publication
by Data Model Change Detection and Environmental Awareness”.
Unattended Ground, Sea, and Air Sensor Technologies and
Applications VIII. Edited by Carapezza, Edward M. Proceedings of
SPIE, Volume 6231, pp. 62310R (2006).
This paper describes the fully autonomous rover. The paper examines the use of the
core algorithms to enable an orbiter to select a suitable planetary landing site and to
deploy surface rovers.
The paper outlines the processes where surface rovers are deployed and allowed to
acclimatize themselves without prior knowledge. Objects are sensed and
autonomously recognized as being interesting and worthy of closer scrutiny. If closer
inspection with change detectors to guide decisions yields evidence of prior examples
then no further interest is taken. If the detection is novel then science is performed
and the novelty studied. The core knowledge gained by rovers can be embodied as a
coupled and sorted set of transfer function models capable of being interrogated to
reconstruct visited environments, or interrogated to predict unseen environments.
This ability enables full transfer of experience to a new pick-up mission in a
completely transparent mathematical form. Legacy information is preserved and the
sensors are able to exhibit self-organizing behavior as they learn their environment.
Paper 15 - Jaenisch, H., Handley, J., Faucheux, J., “Data Driven Differential
Equation Modeling of fBm processes”. Signal and Data Processing
of Small Targets 2003. Edited by Drummond, Oliver E. Proceedings
of SPIE, Volume 5204, pp. 502-514 (2003).
This paper presents a refined algorithm for automatic differential equation derivation
by the rover, and enables the equation to be solved both numerically and, more
importantly, analytically. The analytical solution requires very little memory for
storage and enables very fast execution of the derived models. The numerical form
allows the equations to be converted into look-up-table (LUT) format if fast
processing is a mission requirement. Some planned system architectures use vector
processors that require LUTs and integer math operations in place of the real-valued
math packages mandated by standard differential equation modeling; this algorithm
facilitates conversion into integer processing format.
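The analytic-to-LUT conversion path can be illustrated with a small sketch. The tabulated function, table size, and fixed-point scale factor below are illustrative choices, not the thesis algorithm:

```python
import math

def build_lut(f, lo, hi, n=256, scale=1000):
    """Tabulate f over [lo, hi] as scaled integers.

    Mimics converting an analytic model into integer LUT form for
    processors without floating-point support: each entry stores
    round(f(x) * scale), and lookups use nearest-index addressing.
    """
    step = (hi - lo) / (n - 1)
    table = [round(f(lo + i * step) * scale) for i in range(n)]

    def lookup(x):
        i = round((x - lo) / step)        # nearest sample index
        i = max(0, min(n - 1, i))         # clamp to table bounds
        return table[i]                   # integer result

    return table, lookup

table, lookup = build_lut(math.sin, 0.0, math.pi)
```

At run time, only integer storage and an index computation are needed, which is the property that matters for vector processors without real-math packages.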
Paper 16 - Jaenisch, H., and Handley, J., “Data Modeling Augmentation of
JPEG for Real-Time Streaming Video”. Visual Information
Processing XIV. Edited by Rahman, Zia-ur; Schowengerdt, Robert
A.; Reichenbach, Stephen E. Proceedings of SPIE, Volume 5817, pp.
258-269 (2005).
This paper presents a novel algorithm for determining the quality of modeled images;
it is used in this thesis to score modeled scenery and to rank predicted images against
actual observations.
Paper 17 - Jaenisch, H., Handley, J., Faucheux, J., Lamkin, K., “Shai-Hulud:
The Quest for Worm Sign”. Data Mining, Intrusion Detection,
Information Assurance, and Data Networks Security 2005. Edited by
Dasarathy, Belur V. Proceedings of SPIE, Volume 5812, pp. 321-329
(2005).
This paper introduces an algorithm for converting all collected binary data into a
special “fractal” format that can be concisely described with a small number of
descriptive statistics. These statistics are then used to derive change detectors and
useful equation models for decision-making during the mission. The descriptive
features required to adequately describe complicated data are limited to seven simple-
to-calculate features, forming a seven-element vector called a “heptor”. This heptor is
used as the discriminating information to reduce the dimensionality of large number
sets when describing equation models in the vehicle's memory.
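A sketch of such a seven-element descriptive vector follows. The summary above does not enumerate the paper's actual seven features, so the particular statistics chosen here (four moments, the extremes, and the count) are plausible stand-ins only:

```python
import math

def heptor(xs):
    """Reduce a data set to a seven-element descriptive vector.

    The thesis states only that seven simple-to-calculate features
    suffice; the specific statistics below are illustrative stand-ins
    for the paper's actual feature set.
    """
    n = len(xs)
    mean = sum(xs) / n
    dev = [v - mean for v in xs]
    var = sum(d * d for d in dev) / n
    std = math.sqrt(var)
    # standardized third and fourth central moments
    skew = sum(d ** 3 for d in dev) / (n * std ** 3) if std else 0.0
    kurt = sum(d ** 4 for d in dev) / (n * std ** 4) if std else 0.0
    return (mean, std, skew, kurt, min(xs), max(xs), float(n))
```

Whatever the seven features are, the point stands: an arbitrarily long record collapses to a fixed-length vector that downstream equation models can consume.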
1.7 Conclusion.
This is a mini Portfolio Thesis designed to introduce seventeen papers that constitute
this Doctor of Astronomy Portfolio.
In this chapter, I introduce a series of algorithms that have been developed and which
are the core of the Portfolio Papers. In Chapter 1 Section 4, I illustrate their use in the
context of an autonomous planetary rover; however, this work has many uses outside
this area. These other areas will be looked at in Chapter 2. In Chapter 1 Section 5, I
look briefly at work in the literature that is convergent with the work in my seventeen
papers, and in Chapter 1 Section 6 I introduce each of the papers of the Portfolio with
a small summary of their content. Here I lean heavily on the context of an
autonomous planetary roving vehicle.
CHAPTER 2
ADDITIONAL APPLICATIONS
2.1 Introduction.
Here in Chapter 2 I examine a series of diverse applications that incorporate the core
algorithms presented in the seventeen Portfolio Papers, which were illustrated in
Chapter 1, Sections 2 and 4, in the context of an intelligent and autonomous
planetary rover. The various core algorithms can also be used across a wide range of
astronomy and other disciplines; for example, for data mining, signal processing,
image processing, decision-making aids, and as a general replacement in
applications that have traditionally used classical neural networks.
To illustrate these uses, I now examine a brief cross-section of applications that have
benefited, or could benefit, from the rover-inspired algorithms. These are
demonstrated in the attached Portfolio Papers, and details are given illustrating how
the core techniques have been used. Data Modeling has been successfully applied to
missing-data reconstruction, signature synthesis, interpolation, data compression,
classifier performance modeling, and adaptive feature selection.
2.2 Application to Data Ensemble Synthesis.
A novel application of these methods is for solving the inverse fractal problem for 1/f
noise sets. This simply means the ability to predict the missing values of a
complicated data set, while conserving the dynamics and structure of the original data.
The performance of these methods has been compared with classical data modeling
methods (Portfolio Paper 2, Data Modeling of 1/f Noise Sets, (Jaenisch and Handley,
2003)) and the applicability to different distributions of noise has been presented,
along with an overview of important applications including data and image
compression.
2.3 Application to Critical, Sensitive, and Key Parameter Estimation for
Analytical Modeling.
The portfolio algorithms enable an analytical differential-equation model to be
derived from a single simulation input and output data vector. The derived model was
analytically varied (real versus imaginary) to determine critical, sensitive, and key
parameters without the use of Design of Experiments (DOE). This ability is very
useful for formal analysis, data modeling, Validation Verification and Accreditation
(VV&A), and automatic software correctness proofing.
This work is described in Portfolio Papers 3 and 5, Automatic Differential Equation
Derivation From Sampled Data For Simulation Parameter Sensitivity Analysis
(Jaenisch, 2006) and Data Model Predictive Control as a New Mathematical
Framework for Simulation and VV&A, (Jaenisch, Handley and Hicklen, 2006).
2.4 Application to State Estimation From Trajectory Estimates for Tracking
and Automatic Controller Design.
Another important and demonstrated application of the portfolio algorithms is the
ability to autonomously assess the physical characteristics of a monitored object from
its motion in the x-y plane. Traditionally, state estimation from a trajectory has been
the realm of the Kalman filter. Unfortunately, the Kalman filter and Extended
Kalman filter require a priori models of the dynamics of the object being observed:
predictions can only be made if there exists a physics-based partial differential
equation that describes the equations of motion. The published techniques for
deriving the differential equation from measured data enable the Kalman filter to be
used for tracking very nonlinear dynamics and trajectories, and are applicable across
many disciplines (see Portfolio Paper Data Modeling Enabled Guidance, Navigation,
and Control to Enhance the Lethality of Interceptors Against Maneuvering Targets
(Jaenisch, Handley, Faucheux, and Lamkin, 2005)).
2.5 Application to Formal Proofing and Validation and Verification of
Software Simulations.
These analysis algorithms provide an analytical method for comparing simulated data
with measured data; the goal is to prove Equivalence and Consistency between the
model and the real data. The model output for a sensor and “real-world” measured
data are collected for comparison, and this method derives analytical Data Models
that are analyzed in frequency space, yielding a quantitative assessment of the model
relative to the measured data.
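The frequency-space comparison can be sketched as follows; the normalized spectral correlation used here as the agreement score is an illustrative stand-in for the thesis's equivalence-proofing analysis, not its actual metric:

```python
import numpy as np

def spectral_agreement(simulated, measured):
    """Score how well simulated data matches measured data in frequency space.

    Both signals are transformed to power spectra and compared with a
    normalized correlation: 1.0 means identical spectral shape, while
    values near 0 mean the energy sits at different frequencies.
    """
    ps_sim = np.abs(np.fft.rfft(simulated)) ** 2
    ps_meas = np.abs(np.fft.rfft(measured)) ** 2
    # cosine similarity of the two power spectra
    num = float(np.dot(ps_sim, ps_meas))
    den = float(np.linalg.norm(ps_sim) * np.linalg.norm(ps_meas))
    return num / den if den else 0.0
```

A score near 1 supports Equivalence between model output and measurement; a low score quantifies the spectral disagreement.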
For logistics research, these methods enable reference points to be established for
complex dynamic processes such as inventory control and instrument calibration.
These mathematical models are a basis of comparison (and for the detection of
significant change) in inventory flow or instrument calibration.
The work on the formal proofing and Validation and Verification is covered in
Portfolio Papers 6 and 7; Analytical Data Modeling for Equivalence Proofing and
Anchoring Simulations with Measured Data, (Jaenisch, Handley and Hicklen, 2006)
and Data Modeling Change Detection of Inventory Flow and Instrument Calibration,
(Jaenisch, Handley and Bonham, 2003).
2.6 Application to Health Usage Management Systems (HUMS) and System
Health Diagnostics and Mean Time to Failure Prognostics.
The Data Modeling methods used by the rover of Chapter 1 can also be generalized
for use with other vehicles and systems (both manned and unmanned). Such
algorithms can be used to build a real-time diagnosis of operational conditions and a
prognostic, i.e. a database of fault classifications and time-to-failure estimates. Data
Modeling can achieve anomaly detection while requiring only nominal (no-fault)
conditions for training. This makes Data Modeling an attractive tool for Novelty
Detection and for ambiguity resolution between nominal and anomalous conditions;
i.e. Data Modeling can resolve ambiguities in diagnostic calls and manage risk
uncertainty in prognostic estimates of time-to-failure.
Data Modeling was successfully applied to Novelty Detection, and it can be applied to
Diagnostic Ambiguity Resolution and Prognostic Risk Uncertainty Management (see
Portfolio Paper 8, Data Modeling for Fault Detection (Jaenisch, Handley, Pooley and
Murray, 2003)). Classical methods, such as adaptive multi-dimensional distance-
measure neural networks and Divergence Classifiers, were unsuccessful when
applied to a set of vibration diagnostic vectors which contain only slightly anomalous
conditions. When Data Modeling was applied to the same feature vector sets, 100%
correct classification was achieved (Jaenisch, Handley, Pooley and Murray, 2003).
2.7 Application to Data Mining.
The information-storage and characterization capability intrinsic to Data Modeling
(as illustrated in Chapter 1) makes it possible to index-tag Virtual Observatory data
files with descriptive statistics. This is achieved by calculating six standard moments
at the time of data collection and attaching them to the data file as descriptive file
tags. Data Change Detection Models are derived from these tags and used to filter
databases for similar or dissimilar information. This information could be in the
form of stellar spectra, photometric data, images, or text.
This application is covered in Data Modeling for Virtual Observatory Data Mining,
(Jaenisch et al., 2004).
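The tagging-and-filtering idea can be sketched as below. Which six moments the paper uses is not stated in this summary, so mean, variance, and the standardized third through sixth central moments serve as stand-ins, and the Euclidean tag distance is likewise an illustrative filter rather than a derived Data Change Detection Model:

```python
import math

def tag(data):
    """Compute a six-moment descriptive tag for a data set.

    Mean, variance, and the standardized 3rd-6th central moments stand
    in for the paper's "6 standard moments", which this summary does
    not enumerate.
    """
    n = len(data)
    mean = sum(data) / n
    dev = [v - mean for v in data]
    var = sum(d * d for d in dev) / n
    std = math.sqrt(var) if var > 0 else 1.0
    return [mean, var] + [sum(d ** k for d in dev) / (n * std ** k)
                          for k in (3, 4, 5, 6)]

def filter_similar(query_tag, tagged_files, radius):
    """Return the names of files whose tag lies within `radius` of the query."""
    def dist(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    return [name for name, t in tagged_files if dist(query_tag, t) <= radius]
```

The tags are computed once at collection time and stored as metadata, so a database query never has to reopen the underlying spectra or images.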
Currently, no consistent or reliable method exists for searching, collating, and
comparing two-dimensional images. Traditionally, methods used to address these
data problems are disparate and unrelated to text data mining and extraction. Data
Modeling provides a unifying tool to enable data mining across all data classes.
2.8 Application to Virtual Instrument Prototyping.
Virtual instruments and sensors derived from mathematical models onboard an
autonomous rover (as outlined in Chapter 1) have utility in far more general settings
(Portfolio Paper 10, Virtual Instrument Prototyping with Data Modeling, (Jaenisch et
al., 2003)). They are mathematical embodiments of physically unrealizable devices.
The mathematical theory for prototyping virtual sensors using Data Modeling can
now be defined for many intelligent agent and robotic applications. The theory of
Change Detection using Data Modeling is presented. Examples of prototype virtual
sensors include: physical parameter sensor using trajectory, THz anthrax detector,
gear tooth health sensor, and fault detector, (Jaenisch et al., 2003).
2.9 Application to Image Processing, Analysis and Modeling.
These “information modeling”, and specifically imagery modeling techniques, can
also be used as a method for simulating CCD Focal Plane Array (FPA) images of
extended deep sky objects. These tools are used to model FPA fixed pattern noise,
shot noise, non-uniformity, and the extended objects themselves. The mathematical
model of the extended object is useful for correlation analysis and other image
understanding algorithms used in Virtual Observatory Data Mining. Application to
the objects in the Messier list resulted in a classifier that achieves 100% correct
classification (More details are given in Portfolio Paper 11, Data Modeling of Deep
Sky Images, (Handley, Jaenisch et al., 2004)).
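A minimal sketch of such an FPA simulation follows. The Gaussian gain/offset maps standing in for fixed-pattern non-uniformity, the Poisson shot-noise model, and all parameter values are illustrative assumptions, not calibrated detector characteristics:

```python
import numpy as np

def simulate_fpa(scene, gain_sigma=0.02, offset_sigma=5.0, seed=0):
    """Simulate a CCD focal-plane-array frame of an extended object.

    Adds the noise terms named in the text: a static per-pixel
    gain/offset map (fixed-pattern noise and non-uniformity) and
    Poisson shot noise in the photon counts.
    """
    rng = np.random.default_rng(seed)
    gain = 1.0 + gain_sigma * rng.standard_normal(scene.shape)
    offset = offset_sigma * rng.standard_normal(scene.shape)
    photons = rng.poisson(np.clip(scene, 0, None)).astype(float)
    return gain * photons + offset

# toy "extended object": a 2-D Gaussian blob on a flat sky background
y, x = np.mgrid[0:64, 0:64]
scene = 50.0 + 500.0 * np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / 50.0)
frame = simulate_fpa(scene)
```

Simulated frames of this kind give the correlation analysis a ground truth against which the derived extended-object model can be checked.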
Another important image processing application was demonstrated and published in
Portfolio Paper 12, Image Resolution and Performance Analysis of Webcams for
Ground Based Astronomy (Lim, Jaenisch, et al., 2004). This paper shows that it is
possible to achieve real-time super-resolution from ground-based images using
small-aperture telescopes.
2.10 Application to Image Compression and Synthesis.
Another application to image processing and storage is the augmentation of JPEG
image compression with fractal modeling techniques. The method enables sub-
sampling in conjunction with JPEG compression algorithms. Rather than directly
compressing large high-resolution images, we propose decimation to thumbnails
followed by compression (Portfolio Paper 16, Data Modeling Augmentation of JPEG
for Real-time Streaming Video (Jaenisch and Handley, 2005)). This enables
Redundant Array of Independent Disks (RAID) compression and facilitates real-time
streaming video with small bandwidth requirements. Image reconstruction occurs on
demand at the receiver, to any resolution required, using Data Modeling based
fractal interpolation.
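The decimate-then-reconstruct pipeline can be sketched as below; plain separable linear interpolation stands in for the Data Modeling based fractal interpolation, which is not reproduced in this summary:

```python
import numpy as np

def decimate(image, factor):
    """Block-average an image down to a thumbnail (the pre-compression step)."""
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def reconstruct(thumb, factor):
    """Rebuild a full-resolution image from the thumbnail at the receiver.

    Separable linear interpolation is used here as a simple stand-in
    for the fractal interpolation scheme described in the text.
    """
    h, w = thumb.shape
    ys = np.linspace(0, h - 1, h * factor)
    xs = np.linspace(0, w - 1, w * factor)
    rows = np.empty((h, w * factor))
    for i in range(h):                     # interpolate along each row
        rows[i] = np.interp(xs, np.arange(w), thumb[i])
    out = np.empty((h * factor, w * factor))
    for j in range(w * factor):            # then along each column
        out[:, j] = np.interp(ys, np.arange(h), rows[:, j])
    return out
```

Because reconstruction happens only at the receiver, each node can pick its own `factor` and rebuild to whatever resolution its hardware supports.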
This device-independent resolution capability is useful for real-time sharing of images
across virtual networks where each node has a different resolution capability. The
same image is constructed to whatever limitations exist at each individual node,
keeping the image “device independent” and the image resolution scalable up or
down as hardware/bandwidth limitations and options evolve. For this work, a novel
figure of merit was created and demonstrated; it compares image data with a
numerical score that mimics human perception better than any previous metric
identified in the literature. It is used in this thesis to evaluate predicted imagery
against existing true image data.
2.11 Application to Spectral Analysis and Modeling.
An unusual application of the decision-making algorithms is to expert systems. By
deriving Data Model polynomials in place of neural-network and expert-system
classifier rules, it was possible to achieve 100% correct classification of the Jacoby
stellar spectra (Portfolio Paper 13, Classification of Jacoby Stellar Spectra Data
using Data Modeling (Jaenisch and Filipovic, 2002)).
The Jacoby set is a challenging group of 161 spectra spanning the full range of
temperature, sub-temperature and luminosity groupings of standard star types. To
achieve full learning, the development of a cascaded decision architecture linking an
extensive network of polynomial decision equations was required. The two dominant
features were extracted, and complex decision maps generated. Also, the sensitivity
of the equation architecture to misclassification due to measurement noise was
analyzed.
2.12 Application to Unmanned Vehicles.
The principles applied to the fully autonomous planetary rover of Chapter 1 are also
applicable to any form of unmanned vehicle or robotic system. Unmanned vehicles
can be deployed and allowed to form environmental models of familiarity. The
vehicle is then capable of deciding whether further inspection is warranted, based on
change detectors. The core knowledge gained can be embodied as a coupled and
sorted set of transfer-function models capable of being interrogated to reconstruct
visited environments. Legacy information is preserved, and the vehicle is able to
exhibit learned self-organizing behavior. These issues are covered in Portfolio
Paper 14, Enabling Unattended Data Logging and Publication by Data Model
Change Detection and Environmental Awareness (Jaenisch, 2006).
2.13 Application to Information Assurance and Network Security.
A final, and extremely important, application of the algorithms is in cyber security
and detection of attacks or anomalies in network information flow. Successful worm
detection at real-time OC-48 and OC-192 speed requires hardware to extract web-
based binary sequences at faster than these speeds, and software to process the
incoming sequences to identify worms. Computer hardware advancement in the form
of Field Programmable Gate Arrays (FPGAs) makes real-time extraction of these
sequences possible. Lacking are mathematical algorithms for worm detection in the
real-time data sequence, and the ability to convert these algorithms into LUTs that can
be compiled into FPGAs.
Data Modeling provides the theory and algorithms for an effective mathematical
framework for real-time worm detection and conversion of algorithms into LUTs.
Detection methods currently available, such as pattern recognition algorithms, are
limited both by the amount of time needed to compare the current data sequence with
a historical database of potential candidates, and by the inability to accurately classify
information that was unseen in the training process. Data Modeling eliminates these
limitations by training only on examples of nominal behavior. This results in a highly
tuned and fast-running equation model that is compiled into an FPGA as a LUT and
used at real-time OC-48 and OC-192 speeds to detect worms and other anomalies. A proof
of concept has been demonstrated using binary data from a WEBDAV, SLAMMER
packet, and RED PROBE attack (Portfolio Paper 17, Shai-Hulud: The Quest for
Worm Sign (Jaenisch et al., 2005)).
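A toy version of the train-on-nominal, compile-to-LUT idea is sketched below. Byte entropy as the sole feature, the bin count, and the margin are illustrative simplifications of mine; the paper's actual Data Models are not reproduced here:

```python
import math

def byte_entropy(packet):
    """Shannon entropy of a byte sequence, in bits per byte (0 to 8)."""
    counts = [0] * 256
    for b in packet:
        counts[b] += 1
    n = len(packet)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def train_lut(nominal_packets, bins=64, margin=1):
    """Compile a nominal-only model into a boolean LUT.

    Training uses only nominal traffic, as the text describes: each
    entropy value seen marks its LUT bin (plus a small margin) as
    normal; at run time an unmarked bin flags the packet as anomalous.
    """
    lut = [False] * bins
    for pkt in nominal_packets:
        i = min(bins - 1, int(byte_entropy(pkt) / 8.0 * bins))
        for j in range(max(0, i - margin), min(bins, i + margin + 1)):
            lut[j] = True
    return lut

def is_anomalous(packet, lut):
    bins = len(lut)
    i = min(bins - 1, int(byte_entropy(packet) / 8.0 * bins))
    return not lut[i]
```

Once trained, detection is a single table lookup per packet, which is the property that lets the model be burned into an FPGA and run at line rate.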
2.14 Conclusion.
This Doctor of Astronomy Portfolio Thesis is in two chapters. In both chapters I
introduce the Portfolio Papers that have developed a series of core algorithms. In
Chapter 1 Sections 4 and 6, the algorithms are illustrated in the context of an
autonomous planetary rover vehicle. In Chapter 1 Section 5, I look briefly at work in
the literature that is relevant to the work in my seventeen papers, and in Chapter 1
Section 6, I introduce each of the papers of the Portfolio.
In this second chapter, I show how the mathematical basis of this work is applicable
to many areas of data collection and imagery.
The mini-thesis supports the portfolio of papers bound into this volume. The author
recommends this portfolio to the reader.
References.
Amos, M. (ed.), (2004), Cellular Computing, New York: Oxford University Press.
Andrews, D.F., and Stafford, J.E., (2000), Symbolic Computation for Statistical Inference, New York:
Oxford University Press.
Barnsley, M.F., (2006), SuperFractals, Cambridge, UK: Cambridge University Press.
Blackmore, S., (2004), Consciousness: An Introduction, New York: Oxford University Press.
Boden, D.G., and Larson, W.J., (1996), Cost-Effective Space Mission Operations, New York:
McGraw-Hill.
Bolstad, W.M., (2004), Introduction to Bayesian Statistics, New York: John Wiley and Sons.
Bond, V.R., and Allman, M.C., (1996), Modern Astrodynamics: Fundamentals and Perturbation
Methods, Princeton, NJ: Princeton University Press.
Brandon, F.L., “Method and System of Collecting Data Using Unmanned Vehicles Having Releasable
Data Storage Devices”, US Patent 20070131754A1, June 14, 2007.
Chalmond, B., (2003), Modeling and Inverse Problems in Image Analysis , New York: Springer.
Christ, M., Kenig, C.E., and Sadosky, C. (ed.), (1999), Harmonic Analysis and Partial Differential
Equations: Essays in Honor of Alberto P. Calderon, Chicago, IL: University of Chicago Press.
Cohen, J.S., (2002), Computer Algebra and Symbolic Computation: Elementary Algorithms, Natick,
MA: A.K. Peters.
Cohen, J.S., (2003), Computer Algebra and Symbolic Computation: Mathematical Models , Natick,
MA: A.K. Peters.
Collins, L.M., and Horn, J.L., (1991), Best Methods for the Analysis of Change: Recent Advances,
Unanswered Questions, Future Directions, Washington, DC: American Psychological
Association.
Curtin, T., Crimmins, D., Curcio, J., Benjamin, M., Roper, C., “Autonomous Underwater Vehicles:
Trends and Transformations”, The Marine Technology Society Journal, Volume 39, Number 3,
Fall 2005.
Das, S.K., (1992), Deductive Databases and Logic Programming, Wokingham, GB: Addison-Wesley.
Diamantaras, K.I., and Kung, S.Y., (1996), Principal Component Neural Networks: Theory and
Applications, New York: John Wiley and Sons.
Doyle, F.J., Pearson, R.K., Ogunnaike, B.A., (2002), Identification and Control Using Volterra Models,
London: Springer.
Duncan, O.D., (1975), Introduction to Structural Equation Models, New York: Academic Press.
Gelb, A., and Vander Velde, W.E., (1968), Multiple-Input Describing Functions and Nonlinear System
Design, New York: McGraw-Hill.
Goldberg, D.E., (1989), Genetic Algorithms in Search, Optimization, & Machine Learning, India:
Pearson Education.
Grzymala-Busse, J.W., (1991), Managing Uncertainty in Expert Systems, Boston, MA: Kluwer.
Hall, D.L., and McMullen, S.A.H., (2004), Mathematical Techniques in Multisensor Data Fusion,
Boston, Artech House.
Handley, J., Jaenisch, H., Lim, A., White, G., Hons, A., Filipovic, M., Edwards, M., “Data Modeling of
Deep Sky Images”. Modeling and Systems Engineering for Astronomy. Edited by Craig, Simon
C.; Cullum, Martin J. Proceedings of SPIE, Volume 5497, pp. 449-460 (2004).
Jacoby, W.G., (1991), Data Theory and Dimensional Analysis, Newbury Park, CA: Sage Publications.
Jaenisch, H., “Fractal Dimension Analyzer and Forecaster”, US Patent 5,732,158 (March 24, 1998).
Jaenisch, H., “Automated Target Shape Detection For Vehicle Muon Tomography”, US Patent Pending
(Attorney Docket Number 23309-003001, Submitted April 7, 2008).
Jaenisch, H., “Automatic Differential Equation Derivation From Sampled Data For Simulation
Parameter Sensitivity Analysis”. Intelligent Computing: Theory and Applications IV. Edited by
Priddy, Kevin L.; Ertin, Emre. Proceedings of SPIE, Volume 6229, pp. 62290N (2006).
Jaenisch, H., “Enabling Unattended Data Logging and Publication by Data Model Change Detection
and Environmental Awareness”. Unattended Ground, Sea, and Air Sensor Technologies and
Applications VIII. Edited by Carapezza, Edward M. Proceedings of SPIE, Volume 6231, pp.
62310R (2006).
Jaenisch, H., and Filipovic, M., “Classification of Jacoby Stellar Spectra Using Data Modeling”.
Imaging Spectrometry VIII. Edited by Shen, Sylvia S. Proceedings of SPIE, Volume 4816, pp.
296-307 (2002).
Jaenisch, H., and Handley, J., “Automatic Differential Equation Data Modeling for UAV Situational
Awareness”. Unmanned Systems. Edited by McDaniel, Mark. Proceedings of the Huntsville
Simulation Conference (The Society for Modeling and Simulation International), (2003).
Jaenisch, H., and Handley, J., “Data Modeling Augmentation of JPEG for Real-Time Streaming
Video”. Visual Information Processing XIV. Edited by Rahman, Zia-ur; Schowengerdt, Robert A.;
Reichenbach, Stephen E. Proceedings of SPIE, Volume 5817, pp. 258-269 (2005).
Jaenisch, H., and Handley, J., “Data Modeling for Radar Applications”. Radar Conference, 2003.
Proceedings of 2003 IEEE, pp. 379-386 (2003).
Jaenisch, H., and Handley, J., “Data Modeling of 1/f noise sets”. Noise in Complex Systems and
Stochastic Dynamics. Edited by Schimansky-Geier, Lutz; Abbott, Derek; Neiman, Alexander; Van
den Broeck, Christian. Proceedings of SPIE, Volume 5114, pp. 519-532 (2003).
Jaenisch, H., Handley, J., Barnett, M., Esslinger, R., “Data Modeling for Predictive Behavior
Hypothesis Formation and Testing”. Data Mining, Intrusion Detection, Information Assurance,
and Data Networks Security 2006. Edited by Dasarathy, Belur V. Proceedings of SPIE, Volume
6241, pp. 62410P (2006).
Jaenisch, H., Handley, J., Bonham, L., “Data Modeling Change Detection of Inventory Flow and
Instrument Calibration”, Proceedings of SOLE Conference, Huntsville, AL, Aug 12-14, 2003.
Jaenisch, H., Handley, J., Carroll, M., Faucheux, J., Thuerk, M., Goetz, R., Egorov, M., Wiesenfeldt,
M., “Data Modeling Enabled Real Time Image Processing for Target Discrimination”. Infrared
Imaging Systems: Design, Analysis, Modeling, and Testing XVI. Edited by Holst, Gerald C.
Proceedings of SPIE, Volume 5784, pp. 178-189 (2005).
Jaenisch, H., Handley, J., Faucheux, J., “Data Driven Differential Equation Modeling of fBm
processes”. Signal and Data Processing of Small Targets 2003. Edited by Drummond, Oliver E.
Proceedings of SPIE, Volume 5204, pp. 502-514 (2003).
Jaenisch, H., Handley, J., Faucheux, J., Harris, B., “Data Modeling of Network Dynamics”.
Applications and Science of Neural Networks, Fuzzy Systems, and Evolutionary Computation VI.
Edited by Bosacchi, Bruno; Fogel, David B.; Bezdek, James C. Proceedings of the SPIE, Volume
5200, pp. 113-124 (2004).
Jaenisch, H., Handley, J., Faucheux, J., Lamkin, K., “Data Modeling Enabled Guidance, Navigation,
and Control to Enhance the Lethality of Interceptors Against Maneuvering Targets”. Multisensor,
Multisource Information Fusion: Architectures, Algorithms, and Applications. Edited by
Dasarathy, Belur V. Proceedings of SPIE, Volume 5813, pp. 126-137 (2005).
Jaenisch, H., Handley, J., Faucheux, J., Lamkin, K., “Shai-Hulud: The Quest for Worm Sign”. Data
Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2005. Edited by
Dasarathy, Belur V. Proceedings of SPIE, Volume 5812, pp. 321-329 (2005).
Jaenisch, H., Handley, J., Hicklen, M., “Data Model Predictive Control as a New Mathematical
Framework for Simulation and VV&A”. Intelligent Computing: Theory and Applications IV.
Edited by Priddy, Kevin L.; Ertin, Emre. Proceedings of SPIE, Volume 6229, pp. 62290U (2006).
Jaenisch, H., Handley, J., Hicklen, M., “Analytical Data Modeling for Equivalence Proofing and
Anchoring Simulations with Measured Data”. Modeling, Simulation, and Verification of Space-
based Systems III. Edited by Motaghedi, Pejmun. Proceedings of SPIE, Volume 6221, pp. 62210H
(2006).
Jaenisch, H., Handley, J., Lim, A., Filipovic, M., White, G., Hons, A., Deragopian, G., Schneider, M.,
Edwards, M., “Data Modeling for Virtual Observatory Data Mining”. Optimizing Scientific Return
for Astronomy through Information Technologies. Edited by Quinn, Peter J.; Bridger, Alan.
Proceedings of SPIE, Volume 5493, pp. 198-220, (2004).
Jaenisch, H., Handley, J., Pooley J., Murray S., “Data Modeling for Fault Detection”, Society for
Machinery Failure Prevention Technology (MFPT), Volume 57, (2003).
Jaenisch, H., Handley, J., Pooley, J., Murray, S., “Virtual Instrument Prototyping With Data
Modeling”, JANNAF 39th Combustion, 27th Airbreathing Propulsion, 21st Propulsion Systems
Hazards, and 3rd Modeling and Simulation Subcommittees Joint Meeting, Colorado Springs, CO,
(December 1-5, 2003).
Jensen, F.V., (1996), An Introduction to Bayesian Networks, New York: Springer.
Jones, R.B., (2002), Symbolic Simulation Method for Industrial Formal Verification, Boston, MA:
Kluwer.
Kim, Y.H., and Lewis, F.L., (1998), High-Level Feedback Control With Neural Networks, Singapore:
World Publishing.
Klein, L., (2004), Sensor and Data Fusion, Bellingham, WA: SPIE Press.
Kofod-Petersen, A., Cassens, J., Leake, D.B., Schulz, S., editors. Proceedings of the 4th International
Workshop on Modeling and Reasoning in Context (MRC 2007) with Special Session on the Role of
Contextualization in Human Tasks (CHUT), (http://www.ruc.dk/dat/).
Korn, G.A., and Korn, T.M., (1964), Electronic Analog and Hybrid Computers, New York: McGraw-
Hill.
Kott, A. (ed.), (n.d.), Advanced Technology Concepts for Command and Control, Xlibris.
Kumar, M.P., Jawahar, C.V., Narayanan, P.J., “Building blocks for autonomous navigation using
contour correspondences”, ICIP '04. 2004 International Conference on Image Processing, Volume
2, 24-27 Oct. 2004 pp. 1381 – 1384.
Ladde, G.S., and Sambandham, M. (2004), Stochastic Versus Deterministic Systems of Differential
Equations, New York: Marcel Dekker.
Lei, B., Li, W., “A Fuzzy Behaviours Fusion Algorithm for Mobile Robot Real-time Path Planning in
Unknown Environment”, IEEE International Conference on Integration Technology, 2007, 20-24
March 2007, pp. 173-178.
Lemke, W., (2006), Term Structure Modeling and Estimation in a State Space Framework, Berlin:
Springer.
Lim, A., Jaenisch, H., Handley, J., Filipovic, M., White, G., Hons, A., Berrevoets, C., Deragopian, G.,
Payne, J., Schneider, M., Edwards, M., “Image resolution and performance analysis of webcams
for ground based astronomy”. Ground-based Telescopes. Edited by Oschmann, Jacobus M., Jr.
Proceedings of SPIE, Volume 5489, pp. 1152-1163 (2004).
Lipka, J., (2003), Graphical and Mechanical Computation, Wexford: Wexford College Press.
Maciejowski, J.M., (1989), Multivariable Feedback Design, Wokingham, GB: Addison-Wesley
Publishing.
Madala, H., and Ivakhnenko, A., (1994), Inductive Learning Algorithms for Complex Systems
Modeling, Boca Raton, FL: CRC Press.
Mendel, J.M., (1973), Discrete Techniques of Parameter Estimation, New York: Marcel-Dekker.
Miola, A. (ed.), (1993), Design and Implementation of Symbolic Computation Systems, Berlin:
Springer-Verlag.
ModelQuest: The Combined Power of Statistics and Neural Nets, (1996), Charlottesville, VA: AbTech
Corporation.
Palmer, A.C., (2008), Dimensional Analysis and Intelligent Experimentation, Hackensack, NJ: World
Scientific.
Raol, J.R., Girija,G., and Singh, J., (2004), Modelling and Parameter Estimation of Dynamic Systems,
Institute of Electrical Engineers.
Srini, V.P., “A Vision for Supporting Autonomous Navigation in Urban Environments”, Computer,
Vol. 39, No. 12, pp. 68-77, Dec. 2006
(http://doi.ieeecomputersociety.org/10.1109/MC.2006.407).
Taguchi, G. and Jugulum, R. (2002), The Mahalanobis-Taguchi Strategy: A Pattern Recognition
System, New York: John Wiley and Sons.
Wang, D., and Zheng, Z., (2005), Differential Equations with Symbolic Computation, Basel:
Birkhauser.
Widrow, B., and Walach, E., (1996), Adaptive Inverse Control, Upper Saddle River, NJ: Prentice-Hall.
Winkler, R.L., (1972), Introduction to Bayesian Inference and Decision, New York: Holt, Rinehart,
and Winston.
Ye, C. and Wang, D., “A Novel Navigation Method for Autonomous Mobile Vehicles”, Journal of
Intelligent and Robotic Systems, Volume 32, Number 4, December 2001, pp. 361-388.
Yung, N. H. C. and Ye, C., “EXPECTATIONS - An Autonomous Mobile Vehicle Simulator”. 1997
IEEE International Conference on Systems, Man and Cybernetics, Proceedings of IEEE, Volume
3, pp. 2290-2295 (1997).
Yung, N. H. C., Ye, C., and Fong, F.P, “Road Vehicle Navigation through Virtual World Simulation”.
1997 IEEE International Conference on Intelligent Transportation Systems, Boston, MA,
Proceedings of IEEE, pp. 508-513 (1996).
Portfolio Thesis Papers
Data Modeling For Radar Applications
Holger M. Jaenisch and James W. Handley
Sparta, Inc.
4901 Corporate Drive, Suite 102
Huntsville, Alabama 35805 USA
Abstract - This paper presents the process of Data Modeling
for generating functional {[f(x)]^n} models of algorithms or data
set relationships in simple polynomial form. Data Modeling has
been applied to missing data reconstruction, signature synthesis,
interpolation, data compression, classifier performance
modeling, and adaptive feature selection. This paper highlights
the mathematical process and successful applications.
I. DATA MODELING
A. Overview
The Data Modeling process can be summarized as a fusion
of numerical methods. In general terms, the process isolates
support vectors which are then cast into Turlington
polynomial form [1] for single variables and reconstructed at
index locations using Deslauriers-Dubuc dyadic interpolation
[2]. For the multivariate case, the vector of support vectors
is transformed into Peel-Willis-Tham basis functions [3]
across the support vectors, which are used as input into
Kolmogorov-Gabor polynomials [4]. This forms the forward
model solution shown in (1).
$f(z) = f(x_1, x_2, \ldots, x_n)$    (1)
The inverse solution shown in (2) is obtained using the
method of partial derivatives as defined by Darwiche [5] and
is readily approximated by using Turlington polynomial
approximation to which we apply Deslauriers-Dubuc dyadic
interpolation, but this time comprised of the Data Model
output as the predictive functional input.
$[f(x_1)\, f(x_2) \cdots f(x_n)]^T = f(z)$    (2)
Jaenisch fused these numerical methods into the process of
Data Modeling, an enabling technology consisting of 1)
Deslauriers-Dubuc dyadic interpolation generalized to
Barnsley’s fractal based affine geometry transforms to
synthesize many different classes of data sets; and 2) the
ability to generate functional equation based models of order
$3^n$ of any algorithm or data set relationship. Equation based
models use both parametric and non-parametric statistics as
descriptive features for inputs and outputs, and yield discrete
equations even where no closed form previously existed. An
additional benefit of the equation abstraction process is the
robust determination of the contribution of each input feature
and their subsequent ranking [6] in terms of a) power to
which each variable is raised; and b) frequency of occurrence
of each variable in the final Data Model equation.
B. Fractional Brownian Motion fBm (1/f)
The most common type of noise found in nature is fBm as
shown in (3).
$B_H(t) = \frac{1}{\Gamma\!\left(H + \frac{1}{2}\right)} \int_{-\infty}^{t} (t - t')^{H - \frac{1}{2}}\, dB(t')$ (3)
The term 1/f noise is applied to any fluctuating quantity
V(t) whose spectral density S_V(f) in log-log space varies as
1/f^β over many decades with 0.5 < β < 1.5, such as fBm. β
characterizes the steepness of the slope of the spectral
density, with larger β values representing steeper slopes.
Gaussian or white noise is characterized by β = 0,
Brownian motion by β = 2, and 1/f noise by β values in
between [7].
The baseline methods of Data Modeling work best on data
sets that are characterized as being 1/f; however, Data
Modeling incorporates numerical integration, as in (3), of
data sets that are uncorrelated noise to convert them into
1/f data sets before generating the model of the data. This
works because numerical integration of Gaussian noise
yields a data set that is a 1/f process. Recovering the original
from the integrated model simply requires numerical
differentiation of the reconstructed data. Other methods for
achieving 1/f type data include converting binary sequences
to the range -1/2 to 1/2 prior to integration and performing an
inverse cosine transform on the support vector, thereby
generating a waveform.
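The integrate-then-model strategy above can be sketched in a few lines (a hypothetical illustration: a cumulative sum stands in for the numerical integrator, and first differences for the differentiator that recovers the original):

```python
import random

def integrate(seq):
    """Cumulative sum: turns uncorrelated noise into a correlated walk."""
    total, out = 0.0, []
    for v in seq:
        total += v
        out.append(total)
    return out

def differentiate(seq):
    """First differences: recovers the original increments."""
    return [b - a for a, b in zip([0.0] + seq[:-1], seq)]

random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(1024)]   # uncorrelated
walk = integrate(noise)          # 1/f-family series suitable for modeling
recovered = differentiate(walk)  # differentiation restores the noise
```

The model would be built on `walk`, and its reconstruction differentiated to yield the original-domain data.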
C. Dyadic Fractal Reconstruction
Data Modeling combines Deslauriers-Dubuc dyadic
interpolation as an operator on support vectors [8] (4) cast as
Turlington polynomials (5)
$\alpha_i\left[y_i\left(\langle w, x_i \rangle + b\right) - 1\right] = 0$ (4)
$y(t) = A + Bt + \sum_i C_i \log_{10}\!\left[1 + 10^{\,c_i \left(\frac{d}{dt}\right)_i}\right] + \sum_k D_k \log_{10}\!\left[1 + 10^{\,d_k \left(\frac{d}{dt}\right)_k}\right]$ (5)
to regenerate the original data. According to Deslauriers-
Dubuc, the dyadic extension of the fundamental interpolant is
given as
$F(t/2) = F(t) + aF(t-3) + bF(t-1) + cF(t+1) + dF(t+3)$ (6)
where a, b, c, and d are scaling coefficients, and satisfies the
identity given in (7). In (7), [t] denotes the integer portion of
the number t.
$y(t) = \sum_{n=[t]-2}^{[t]+3} y(n)\, F(t-n)$ (7)
Since the solution of the coefficients in (6) cannot be
obtained directly, regression methods may be used.
However, recasting Deslauriers-Dubuc dyadic interpolation
in a geometric form enables Barnsley’s Iterated Function
Systems (IFS) defined as
$w_n\!\begin{pmatrix} t \\ y \end{pmatrix} = \begin{pmatrix} a_n & 0 \\ c_n & d_n \end{pmatrix}\!\begin{pmatrix} t \\ y \end{pmatrix} + \begin{pmatrix} e_n \\ f_n \end{pmatrix}$ (8)
to be used instead. Elton [9] proved that this process
converges for all continuous functions as in (9), which we
have achieved using Turlington polynomials.
$\lim_{n \to \infty} \frac{1}{n+1} \sum_{k=0}^{n} f(x_k) = \int_X f(x)\, d\mu(x)$ (9)
Unlike Deslauriers-Dubuc dyadic interpolation and
Barnsley’s IFS interpolation where only 4 or 5 control points
and the scaling coefficients are used and supplied artificially
and separate from the data process, Data Modeling uses
straight-line segment transforms of all available data points
(if sub-sampled data, these support vectors are the available
data) and derives all the coefficients from the support vector
data itself, including an estimate of dn which is obtained from
Jaenisch [10] and is related by
$d_n = \begin{cases} \dfrac{\log(J/N)}{J-1}, & J > N \\[6pt] \dfrac{J}{N}, & J \le N \end{cases}$ (10)
The sign of the scaling coefficient dn is assigned to be the
same as the sign of the derivative between support vector n
and support vector n - 1.
This innovation solves the inverse fractal problem by
enabling fractal interpolation and image regeneration to be a
robust and autonomous process readily implemented in
real-time.
Support vector identification is simplified to equi-spaced
partitioning of the original data vector. The number of points
N to sub-sample [11,12] in building the model is bounded by
(11), where R is the range of the data and σ is the standard
deviation of the data.
$\mathrm{Re}_{\min} = \lim_{J_i + \Delta \to J_i}\left[\frac{\log(R/\sigma)}{\log(J/N)}\right] = 1, \qquad \mathrm{Re} = \lim_{J_i + \Delta \to J_i}\left[\frac{\log(R/\sigma)}{\log(J/N)}\right] = 0.5$ (11)
The number of sub-sample points N can be solved for
between these bounds minimizing the variance of the residual
between the reconstructed and original data. The authors have
found that using either 10% of the original data set or simply
the N from the second equation in (11) as the sub-sample
rate yields excellent results.
This process can be used to increase the resolution of any
data set to any user specified resolution by treating the
current in-hand data set as a set of support vectors. Data
Modeling preserves the fine structure content of the data, is
computationally efficient, and enables optimal interpolation
for ensemble generation.
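As a concrete sketch of the dyadic refinement in (6) and (7), the classical 4-point Deslauriers-Dubuc rule (weights -1/16, 9/16, 9/16, -1/16, assumed here for illustration, with plain averaging at the boundaries) doubles the resolution of a support vector in one pass:

```python
def dd_refine(y):
    """One dyadic refinement pass: insert a midpoint between each pair of
    support vectors using the 4-point Deslauriers-Dubuc weights
    (-1/16, 9/16, 9/16, -1/16); plain averaging at the two boundaries."""
    out, n = [], len(y)
    for i in range(n - 1):
        out.append(y[i])
        if 0 < i < n - 2:
            mid = (-y[i - 1] + 9 * y[i] + 9 * y[i + 1] - y[i + 2]) / 16.0
        else:
            mid = (y[i] + y[i + 1]) / 2.0
        out.append(mid)
    out.append(y[-1])
    return out

# The 4-point rule reproduces cubics exactly, so refining samples of t**2
# recovers the half-integer values of t**2 at interior midpoints.
support = [0.0, 1.0, 4.0, 9.0, 16.0]   # t**2 at t = 0, 1, 2, 3, 4
refined = dd_refine(support)            # 9 points at spacing 1/2
```

Repeated passes raise the resolution by powers of two, treating the in-hand data as support vectors.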
D. Equation Modeling
The equation modeling technology of the Data Modeling
process generalizes neural networks by truncated Taylor
series expansion and uses multivariate regression to derive
high order polynomials using simple low order (< 3) building
blocks. This technique is amenable to the generation of
multi-class decision classifiers. Advantages of using Data
Modeling include: 1) Feature selection process is fully
autonomous; 2) Generates a polynomial decision equation; 3)
Input variables may call external simulations and executables
for their value(s); 4) Output variables may call external
simulations and executables with value(s); 5) Output may
become input to new equation models; 6) Expert systems
may be linked with polynomials as either input or output; 7)
Totally open-ended and scalable; 8) Executes in polynomial
time; 9) No nested FOR-NEXT loops; and 10) Can be
derived without transcendental functions.
Any decision transfer function that can be cast as a neural
network, Dempster-Shafer probability table, or Bayesian
Belief Network (BBN) architecture can be recast in
polynomial form and used to generate an objective transfer
function directly. This new simple algebraic model can be
optimized using other methods developed by Jaenisch in Data
Modeling to provide fast and efficient inverse models as well.
Equation modeling builds high order ($3^n$) polynomials
using polynomial building blocks that never exceed third
order, thereby preventing precision problems associated with
numerical implementation. These third order building block
forms are derived from generating a prototype neural network
comprised of 3 input nodes, 2 hidden nodes, and 1 output
node. These basis functions process data from input to output
layer through a hidden layer. The processing of input data
with a bias into the hidden layer is modeled according to
Peel-Willis-Tham as in (12).
$f(x_j) = \frac{1}{1 + e^{-(b_0 + b_1 x_j)}} = a_0 + a_1 x_j + a_2 x_j^2 + a_3 x_j^3 + \ldots$ (12)
The solution is obtained in a single pass using linear
regression, thereby avoiding back propagation.
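The claim that the sigmoid in (12) is well captured by low-order terms can be checked directly; a truncated Taylor expansion about zero (an illustration, not the regression fit itself) already tracks it closely on a normalized input range:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cubic_approx(x):
    """Truncated Taylor series of the logistic sigmoid about x = 0:
    1/(1 + e**-x) ~ 1/2 + x/4 - x**3/48, a third-order building block."""
    return 0.5 + x / 4.0 - x**3 / 48.0

# Worst-case error over the interval [-1, 1], where normalized hidden-node
# inputs are assumed to lie for this sketch.
err = max(abs(sigmoid(x / 10.0) - cubic_approx(x / 10.0))
          for x in range(-10, 11))
```

In practice the coefficients would come from the single-pass regression rather than the Taylor series, but the cubic form is the same.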
The transformation of the hidden layer data into the output
layer is done using a sigmoidal processing function. With
this approximation, the transformation of the input data
through the hidden layer to the output layer is given in (13):
$y(t) = \sum_{h=1}^{N} \sum_{j=0}^{k} s_{j,h}\, u_h^{\,j}(t)$ (13)
where N is the number of neurons in the hidden layer; k is the
neuron polynomial order; sj,h is the combined coefficients
representing the jth coefficient of the hth polynomial neuron
in the hidden layer multiplied by the weight from the hth
neuron to the output layer; uh(t) is the total input to the hth
neuron in the hidden layer at time t; and y(t) is the output
data. The equation modeling technology in Data Modeling
uses a maximum polynomial order of k = 3 and solves for the
resulting coefficients (sj,h) using multivariable linear
regression to map the input data points (uh(t)) into the output
(y(t)). This equation can be written in matrix form (14)-(17)
to allow for the solution of the system of equations using
multivariable linear regression:
$y = X\Theta$ (14)

$y = [y_j(t), y_j(t+1), \ldots, y_j(t+p)]^T$ (15)

$\Theta = [s_{0,1}, s_{1,1}, \ldots, s_{k,1}, \ldots, s_{0,N}, s_{1,N}, \ldots, s_{k,N}]^T$ (16)

$X = \begin{bmatrix} 1 & u_1(t) & \cdots & u_1^k(t) & \cdots & 1 & u_N(t) & \cdots & u_N^k(t) \\ \vdots & \vdots & \ddots & \vdots & \cdots & \vdots & \vdots & \ddots & \vdots \\ 1 & u_1(t+p) & \cdots & u_1^k(t+p) & \cdots & 1 & u_N(t+p) & \cdots & u_N^k(t+p) \end{bmatrix}$ (17)
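A minimal sketch of the single-pass regression in (14)-(17), specialized to one hidden node (N = 1, k = 3) with hypothetical data; the normal equations are solved by Gaussian elimination rather than any library routine:

```python
def design_matrix(u, k=3):
    """Rows of X as in (17): [1, u(t), u(t)**2, ..., u(t)**k] per sample."""
    return [[ui**j for j in range(k + 1)] for ui in u]

def solve_normal_equations(X, y):
    """Single-pass fit Theta = (X^T X)^-1 X^T y via Gaussian elimination
    with partial pivoting (no back propagation needed)."""
    n = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(len(X))) for j in range(n)]
         for i in range(n)]
    b = [sum(X[r][i] * y[r] for r in range(len(X))) for i in range(n)]
    for col in range(n):                      # forward elimination
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            A[r] = [arv - f * acv for arv, acv in zip(A[r], A[col])]
            b[r] -= f * b[col]
    theta = [0.0] * n
    for r in range(n - 1, -1, -1):            # back substitution
        theta[r] = (b[r] - sum(A[r][c] * theta[c]
                               for c in range(r + 1, n))) / A[r][r]
    return theta

u = [i / 10.0 for i in range(-20, 21)]
y = [1.0 - 2.0 * ui + 0.5 * ui**3 for ui in u]   # a known cubic target
theta = solve_normal_equations(design_matrix(u), y)
```

Because the target is itself a cubic, the recovered coefficients match it exactly, illustrating why no iterative training is required.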
The final Data Model is derived from appropriate solutions
of the Kolmogorov-Gabor polynomial shown in (18).
$\phi = a_0 + \sum_i a_i x_i + \sum_i \sum_j a_{ij} x_i x_j + \sum_i \sum_j \sum_k a_{ijk} x_i x_j x_k + \cdots$ (18)
The final form of the Data Modeling equations and the
polynomial order shown in (19) are nested following
Ivakhnenko with n representing the number of layers in the
final Data Model and x(bi(t)) the inputs mapped from the
previous layer.
$x(t) = f\!\left(t, x(b_1(x(b_2(\cdots x(b_n(t)) \cdots))))\right), \qquad O[x(t)] = 3^n$ (19)
If a Data Model is needed for data sets whose
dimensionality is greater than one such as images (2-D) or
image sequences (3-D), the Hilbert sequence shown in (20)
and (21) is used to transform the two dimensional data into a
one dimensional data sequence for application of the Data
Modeling techniques [13].
$H_{n+1} = w_1(P_n) \cup w_2(P_n) \cup w_3(P_n) \cup w_4(P_n)$ (20)

$P_{n+1} = w_1(H_n) \cup w_2(P_n) \cup w_3(P_n) \cup w_4(P_n)$ (21)
Equations (20) and (21) are subject to Lindenmayer’s L-
system grammars represented in (22) and defined by (23).
$L \to +RF-LFL-FR+, \quad R \to -LF+RFR+FL-, \quad + \to +, \quad - \to -, \quad \delta = 90°$ (22)

$F \Rightarrow (x, y) \to (x + l\cos\alpha,\ y + l\sin\alpha), \quad + \Rightarrow (x, y, \alpha) \to (x, y, \alpha + \delta), \quad - \Rightarrow (x, y, \alpha) \to (x, y, \alpha - \delta)$ (23)
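The Hilbert scan of (20)-(23) can be realized with the standard bitwise coordinate-to-distance mapping (a common textbook formulation, assumed here as an equivalent sketch), which unrolls an n-by-n image into a locality-preserving 1-D sequence:

```python
def xy2d(n, x, y):
    """Map 2-D coordinates (x, y) on an n-by-n grid (n a power of two) to
    the 1-D position along the Hilbert scan, so image pixels become a
    sequence with spatial locality preserved."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Scan order for a 4x4 image: consecutive positions are always neighbors.
order = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda p: xy2d(4, *p))
```

The 1-D sequence read in this order is then modeled with the same machinery as any other data vector.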
Using Data Modeling, polynomials as large as order 65,000
($\approx [O(3)]^{10}$ from (19)) have been generated in real-time.
For completeness, since Data Modeling as described thus
far is contractive and therefore assumes interpolation,
expansive projection of support vectors and associated
scaling coefficients is possible. This extrapolation is
accomplished by invoking the Ruelle-Takens Theorem [14]
and treating the Data Model as the attractor for the underlying
process. Then using support tuples as defined by (24) and
linear prediction (25), reasonable forecasts have been
achieved within the Cramer-Rao bound (26).
$\mathbf{X}_M(t) = \begin{pmatrix} x(t) \\ x(t+\tau) \\ \vdots \\ x(t + (N-1)\tau) \end{pmatrix}$ (24)
$y_n = \sum_{j=1}^{N} d_j\, y_{n-j} + x_n$ (25)
$\sigma_{error} \ge \frac{\sigma_{original}}{\sqrt{N}}$ (26)
Discussions regarding the non-violation of the Nyquist-
Shannon sampling theorems due to application driven
sampling criteria can be found in [15].
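A toy illustration of forecasting with the linear predictor (25) run forward over a delay embedding as in (24): for a sampled sinusoid the two-tap recurrence is exact, so the forecast reproduces the held-out samples (the signal and coefficients are hypothetical choices for this sketch):

```python
import math

def forecast_ar2(history, steps, omega):
    """Linear prediction as in (25) with two taps: for a sampled sinusoid
    the recurrence y[n] = 2*cos(omega)*y[n-1] - y[n-2] holds exactly, so
    the model of the attractor can be iterated forward as a forecaster."""
    d1, d2 = 2.0 * math.cos(omega), -1.0
    y = list(history)
    for _ in range(steps):
        y.append(d1 * y[-1] + d2 * y[-2])
    return y[len(history):]

omega = 0.1
series = [math.sin(omega * n) for n in range(100)]
pred = forecast_ar2(series[:90], 10, omega)   # extrapolate 10 samples
truth = series[90:]
```

For noisy data the tap weights would instead be fitted by regression, and the Cramer-Rao bound (26) limits the attainable accuracy.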
The focus of this work thus far has been on the generation
of Data Models in a forward direction by mapping multiple
functional inputs into the output objective function. The
inverse Data Modeling problem solution uses the output
objective function as the only input into a set of Data Models
estimating each functional input independently. This can be
achieved using Bayesian Belief Networks as a low-resolution
approach, or the compilation of a resultant BBN into a
canonical polynomial form given by
$F(\lambda, \theta) = \sum_{\mathbf{x}} \prod_i \lambda_{x_i} \prod_i \theta_{x_i \mid u_i}$ (27)
The inverse solution is obtained by applying the method of
partial derivatives [5] to (27). The partial derivative equation
can be represented as
$\frac{\partial \Pr(y \mid e)}{\partial \theta_{x \mid u}} = \frac{1}{F(e)^2}\left(F(e)\, \frac{\partial^2 F(e)}{\partial \theta_{x \mid u}\, \partial \lambda_y} - \frac{\partial F(e)}{\partial \theta_{x \mid u}}\, \frac{\partial F(e)}{\partial \lambda_y}\right)$ (28)
An estimate to these partial derivatives can be obtained
from Turlington polynomials (5), and in fact the entire Data
Modeling process (1)-(26). The authors have examined each
of these methods and prefer the full Data Modeling approach
for solving the inverse problem.
E. Decision Architectures
Although the process of passing information through a
completed decision architecture is top down (Fig. 2), the Data
Modeling approach for constructing a decision architecture is
unlike other approaches by working from the bottom up.
Data Modeling can characterize new data sets (multiple
partitions if necessary) and simply build an architecture
consisting of classifier and controller transfer functions. Data
Modeling begins at the lowest decision level and breaks the
process up into a series of bimodal decisions. This is possible
because if two non-coincident points exist in N-space, then an
N-dimensional analog to y = mx + b passing between the two
points can always be found, limiting the maximum number of
classifiers needed to converge to N - 1.
This process is learned in a cascade fashion guaranteeing
100% learning (Fig. 1) and results in a classifier (shown as A,
B, and C in Fig. 2). If more than one classifier is created, a
decision has to be made at a higher level which classifier to
execute and under what conditions. The sole function of this
controller equation model would be to analyze the conditions
and determine which classifier under its control to execute.
These are shown as 1 and 2 in Fig. 2.
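The cascade construction can be sketched with toy threshold classifiers standing in for the equation models (the data, feature, and stage logic here are all hypothetical simplifications): each stage learns what it can, and its misclassified examples are siphoned to the next stage until everything is learned:

```python
def train_cascade(examples, max_stages=10):
    """Cascade training sketch: each stage is a mean-midpoint threshold on
    one feature; examples the stage misclassifies are 'siphoned' to the
    next stage until every training example is learned, as in Fig. 1."""
    stages, remaining = [], list(examples)
    for _ in range(max_stages):
        if not remaining:
            break
        xs0 = [x for x, lab in remaining if lab == 0]
        xs1 = [x for x, lab in remaining if lab == 1]
        if not xs0 or not xs1:          # one class left: constant decision
            stages.append(("const", remaining[0][1]))
            remaining = []
            break
        thr = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0
        flip = sum(xs1) / len(xs1) < sum(xs0) / len(xs0)
        stages.append(("thr", thr, flip))
        def pred(x, t=thr, f=flip):
            p = 1 if x > t else 0
            return 1 - p if f else p
        remaining = [(x, lab) for x, lab in remaining if pred(x) != lab]
    return stages, remaining

data = [(0.1, 0), (0.2, 0), (0.9, 1), (1.1, 1), (0.85, 0), (0.15, 1)]
stages, unlearned = train_cascade(data)   # two stages, nothing unlearned
```

A controller deciding which stage to execute at run time would sit above these classifiers, as in Fig. 2.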
[Figure content: cascade training from 100 examples — classifier A learns 83% of the total (features F21, F32, F47; 168 unlearned, 251 siphoned), B learns 99% (F33, F55, F77, F69; 11 unlearned, 35 siphoned), C learns 99.99% (F21, F65, F87; 1 unlearned, 4 siphoned), and D learns 100% (F20, F43, F74; 0 unlearned), finishing training. Available features F21-F90; outcomes for A, B, C, D are Fault/No Fault.]
Fig. 1 Data Model construction using cascade approach.
[Figure content: hierarchy in which an ARCHITECTURE (common feature subset; multiple NETWORKS with different TASKS) is composed of NETWORKS (common feature subset; multiple CLASSIFIERS with different DECISIONS), which are in turn composed of CLASSIFIERS (common feature subset; multiple DECISION TRANSFER FUNCTIONS). Construction proceeds in Step 1 (Classifiers), Step 2 (Apex), and Step 3 (Branch Addition), with controllers 1 and 2 routing to classifiers A-D over feature subsets such as F1, F2, F5.]
Fig. 2. Completed decision architecture representation.
This allows the Data Modeling approach to be totally open-
ended and scalable, and allows for entire branches to be
added to the tree if necessary without retraining the entire tree
[16]. The tree structure can continue to expand if the
complexity of the decision architecture increases. If
necessary, each individual equation in the architecture can be
represented by an entire network of polynomial based
equation models, thereby allowing for the nesting of decision
architectures within other decision architectures. This
method is fully adaptive, and only chooses and uses those
features that are necessary to accurately make the correct
decision at each decision transfer node.
This is akin to discovering the optimal Bayesian Belief
Network structure from the data autonomously, where the
grouping of terms in the polynomial directly defines the
connectivity and strengths between decision nodes in a
Bayesian Belief Network. The best features do not need to be
identified a priori because they will be identified by the
decision architecture when anomalous conditions are
encountered.
Traditionally, if new data becomes available, decision
architectures must be entirely retrained by evaluating where
the new data should be placed and new classifiers constructed
using both the old exemplar data and the new exemplars. By
converting Bayesian Belief networks into a polynomial form,
new branches and nodes can be added without the need to
retrain the entire network.
Data Modeling is used to characterize the data feature
vector to determine which original feature vector the new
data most resembles (the worst case to distinguish between).
Transfer functions are derived to classify the new feature
vector. This constructs a new layer in the architecture
consisting of the newly generated transfer functions and the
transfer functions related to the most alike feature vector
selected above. In the place of the original transfer functions
in the architecture, we place new controller transfer functions
that simply distinguish when to use each of the two transfer
functions below it [16].
F. Generation of Lookup Tables (LUT)
Once a Data Model is built, it can then be used to generate
a series of simple analog nomographs. The nomograph is a
tool that provides quick graphical answers for applications
where numerical data are substituted into formulas [17]. The
nomograph shown on the left in Fig. 3 is a graphical
“program” that solves the simple equation x + y = w. This
concept is generalized to more complicated functional forms
and to many more variables with the Data Modeling equation
generation approach. Yet the use of the final decision look
up table will remain just as simple and easy to use in real
time.
Through the creation of a transfer function equation and the
use of the method of partial derivatives (28), a decision map
is generated that captures the equation output for various
values of the input features. These decision maps become a
lookup table.
An example of a nomograph (lookup table) is shown on the
right in Fig. 3, where each axis represents a different physics
based feature input into the process and the output decision is
read directly based on the x and y location and associated
lookup table values. By varying two features at a time across
the valid range while holding the other variables constant, the
polynomial output is plotted at each increment location
creating the decision map.
Fig. 3. Lookup table generation with Data Modeling.
Features 1 and 2 are used directly to determine the same
decision as the original Data Model equation without running
the equation. The use of lookup tables that are generated by
these equations as nomographs is amenable in applications
that have only a limited amount of processing power and
storage capability, as well as in applications where real time
implementation is required.
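The decision-map construction described above can be sketched directly (the decision polynomial here is a hypothetical stand-in for a Data Model classifier):

```python
def build_decision_map(decision_fn, f1_range, f2_range, steps=50):
    """Sweep two features across their valid ranges (others held fixed)
    and tabulate the transfer-function output, turning the equation into
    a nomograph-style lookup table."""
    (a1, b1), (a2, b2) = f1_range, f2_range
    grid = []
    for i in range(steps):
        f1 = a1 + (b1 - a1) * i / (steps - 1)
        row = []
        for j in range(steps):
            f2 = a2 + (b2 - a2) * j / (steps - 1)
            row.append(decision_fn(f1, f2))
        grid.append(row)
    return grid

def lookup(grid, f1_range, f2_range, f1, f2):
    """Read the stored decision at the nearest grid intercept: no equation
    evaluation at run time."""
    steps = len(grid)
    i = round((f1 - f1_range[0]) / (f1_range[1] - f1_range[0]) * (steps - 1))
    j = round((f2 - f2_range[0]) / (f2_range[1] - f2_range[0]) * (steps - 1))
    return grid[i][j]

# Hypothetical decision polynomial standing in for a Data Model classifier.
decide = lambda f1, f2: 1 if f1 + 0.5 * f2 - 0.25 * f1 * f2 > 1.0 else 0
grid = build_decision_map(decide, (0.0, 2.0), (0.0, 2.0))
```

Field updates would simply overwrite individual grid cells at the intercept of two feature values, with no rebuild of the map.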
Field updates of new information in the lookup table also
occur directly. As new information comes in during a
mission or as the mission evolves, feature values in the LUT
are changed at locations defined by the intercept of two
feature values. This eliminates the need to generate an entire
new decision map whenever new information becomes
available. This method provides updates to the decision
landscape lookup table in real-time and in situ. Post-mission
or whenever necessary, these changes can be used to derive a
new decision map that encompasses the new information.
These decision lookup tables form a basis for graphically
encoding complex decision-making algorithms for real-time
use. Non-computational forms for in-flight updates (IFTU) or
situational awareness network sharing are possible by
minimizing the data and optimizing the information form that
can be shared across a limited bandwidth channel.
G. Calibration on Demand (COD)
Calibration On Demand (COD) can be used in real-time to
indicate when incoming data sets are of a form not previously
accounted for or anticipated during training on either actual
or simulated data sets. COD allows sensing of off-nominal
conditions while being trained exclusively on nominal
conditions. This is a key theoretical enabling concept for
Situational Awareness via algorithms.
A use for COD is in monitoring individual sensor
measurements and overall system snapshots for change
detection. Individual sensor measurements provide a
characterization of system level conditions. This yields an
estimate of nominal or off-nominal condition.
Off-nominal detection can be used to initiate man in the
loop intervention or the automatic generation of new or
additional features, new or additional sensor looks to generate
more detailed data, spawn the generation of a new classifier
to deal with the new off-nominal information, or simply yield
a tip-off condition.
Rather than use the actual input feature values, we use
parametric and non-parametric descriptions of the entire
feature value set, which can include converting features to
standardized Z-scores. This treats the feature value set as a
time series and models the distribution by overfitting to a
minimized regression function. The target value of the
regression is set to be minimized and the polynomial
fluctuation around that minimum is characterized; in practice,
the output feature value used for training is 1/2.
An overfitted Data Model can be generated that uses
Shewhart’s Control Chart theory [18] to monitor if the
modeled process is under Statistical Process Control. In
Shewhart’s theory, trial limits are generated for X using
RAXLCL
RAXUCL
X
X
2
2
−=
+= (29)
where A2 is a table lookup value and R is the average range.
In Data Modeling, these trial limits are set using the
polynomial output from the training data. The upper and
lower boundaries are assigned to be one-half of a standard
deviation above and below the mean value of the polynomial
output from the training data.
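A minimal sketch of the COD control band and tip-off test, using the half-standard-deviation limits described above (the training outputs are hypothetical):

```python
def cod_limits(train_outputs):
    """Trial limits per the Data Modeling variant of (29): mean of the
    polynomial output on training data, plus and minus half a (population)
    standard deviation."""
    n = len(train_outputs)
    mean = sum(train_outputs) / n
    var = sum((v - mean) ** 2 for v in train_outputs) / n
    sigma = var ** 0.5
    return mean - 0.5 * sigma, mean + 0.5 * sigma

def off_nominal(value, lcl, ucl):
    """Flag a tip-off condition when the model output leaves the band."""
    return value < lcl or value > ucl

# Hypothetical overfit-model outputs on nominal training data (target 1/2).
train = [0.48, 0.5, 0.52, 0.49, 0.51, 0.5, 0.47, 0.53]
lcl, ucl = cod_limits(train)
```

An incoming feature combination never seen in training drives the polynomial output outside [lcl, ucl] and triggers the tip-off.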
It is important to note that even though the ranges of
incoming feature values may match those seen during training
(which is easily tested without resorting to Data Modeling),
we can also instantly detect combinations of incoming feature
values that were not anticipated or seen before. This off-
nominal detection is a tip-off condition that can be used to
predict additional regions that need to be populated by
decision making classifiers or decision transfer nodes.
COD can evaluate the operational ranges and feature
regimes of proposed classifiers. This method shows regions
of nominal behavior that indicate adequate coverage by the
proposed classifier.
II. APPLICATIONS
A. Adaptive Wavelet Basis Function Selection
The first problem with using wavelet transforms is
determining when to stop sampling and apply the transform.
The second problem is a priori selection of the proper basis
function.
The non-constant location of minimum and maximum
scale-space values when using orthogonal basis functions
necessitates a computer intensive sort algorithm to ensure
their location at the beginning of the data array. No method
for picking the basis function a priori and in an autonomous
fashion that ensures constant location of the minimum and
maximum scale-space values has been reported in the
literature.
The wavelet transform cannot be applied to a single point;
it requires that a group of measurements be taken and
characterized together. To overcome this, a fixed number of
samples, called a data window, is measured and the wavelet
transform applied to it. This technique requires the a priori
selection of either a dyadic or non-dyadic window size, and
the transform is applied only when enough measurements
have been taken to fill the window.
Window selection can be performed autonomously and
robustly using Data Modeling. The entire Data Modeling
process of defining the optimal data window size, along with
the ideal information based sub-sampling to derive the basis
function is called the Self Defining Basis Function Adaptive
Window (SDBFAW) [19]. The total number of points N to
sample insures stationarity across the window and satisfies
the second equation in (11), thereby providing a solution to
the first wavelet transform application problem listed above.
The actual SDBFAW is determined by sub-sampling with the
window found in the first equation in (11). Sub-sampling
works because of ergodicity within the stationary window.
This provides a solution to the second wavelet transform
problem listed above.
In (11), R is the range and σ the standard deviation. Note
that in order to calculate (11) in real time, Jaenisch developed
an approximation to the standard deviation that can be
calculated recursively on the data in real time as it comes in.
The mean (30) and standard deviation approximation (31) are
$\mu_{i+1} = \left(\frac{N_i}{N_{i+1}}\right)\mu_i + \frac{y_{i+1}}{N_{i+1}}$ (30)

$\sigma_{i+1}^2 = \left(\frac{N_i}{N_{i+1}}\right)\sigma_i^2 + \frac{\left(y_{i+1} - \mu_{i+1}\right)^2}{N_{i+1}}$ (31)
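The recursive statistics can be sketched as a streaming update; the mean update is exact, while the variance update is the stated approximation (it slightly underestimates the batch value, as the hypothetical data below illustrates):

```python
def recursive_stats(ys):
    """Streaming mean and approximate standard deviation in the style of
    (30) and (31): each new sample updates the running statistics without
    revisiting earlier data, so they can be computed in real time."""
    mu, var = ys[0], 0.0
    for i, y in enumerate(ys[1:], start=1):
        n_prev, n_now = float(i), float(i + 1)
        mu = (n_prev / n_now) * mu + y / n_now           # exact mean
        var = (n_prev / n_now) * var + (y - mu) ** 2 / n_now  # approximation
    return mu, var ** 0.5

ys = [1.0, 2.0, 3.0, 4.0, 5.0]
mu, sigma = recursive_stats(ys)   # mean is exact; sigma is a running estimate
```

For these five samples the batch population standard deviation is about 1.414, while the running estimate comes out near 1.22 — close enough to size the window bound in (11) on the fly.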
Once the optimal basis function is derived, the wavelet
transform can be generated using standard wavelet transform
routines available in the literature. It should be noted that this
process does not require that the basis function be orthogonal
or compactly supported. Both orthogonal and non-orthogonal
basis functions can be used to characterize sensor
measurements.
B. Data Synthesis
Data Modeling has been successfully compared to classical
methods including polynomial curve fitting, Markov chains,
wavelet transforms, mean and covariance methods, and
Kalman filters (Fig. 4). Column 1 of Fig. 4 shows the
original data sequence of 1024 points and a histogram
showing the distribution of the original. In Column 2, results
from using each of the classical techniques mentioned above
are shown along with a histogram of the distribution of each
resultant. Note that only the Markov chain process captures
the underlying distribution of the original, but fails to capture
the temporal and spatial characteristics of the original.
Column 3 of Fig. 4 shows the fractal Data Modeling
approach using five different sub-sampling rates varying
from 1% to 25% of the original. For each of these sub-
sampling rates, the temporal, spatial, and underlying
distribution characteristics of the original are captured.
Columns 4 and 5 show the spatial error and power spectra
differences between the original, classical method, and the
fractal reconstruction. Using the 5% Data Model (50 points)
as a baseline, only the wavelet transform method provided a
slightly better RSS error, but with a substantial difference in
both the power spectra and in temporal and spatial
characteristic retention. Two peaks occurred in the log power
spectra that were captured and amplified by the fractal
method albeit with a shift in their locations.
Fig. 4. Comparison of fractal Markov modeling and classical methods.
The wavelet power spectra showed no peaks at all in the
lower trace, indicating a significant distortion in the original
signal and loss of frequency content. It should also be noted
that all of the fractal Data Models generated power spectra
more like the original than any of the classical methods, and
that using 50 points of the original data outperformed using
50 wavelet transform coefficients.
C. Adaptive Feature Selection
Data Modeling provides the ability to do adaptive feature
sub-selection in a robust fashion where the contribution of
each feature to the final solution is calculated. The process of
generating the Data Models generates a ranked list of each of
the inputs in terms of the number of times each input appears
in the final solution Data Model. Input features that provided
only redundant information are removed from consideration
during the equation generation process.
This process was successfully demonstrated on the
THAAD missile program, where IV&V was performed on
both a Bayesian quadratic classifier and neural network.
Using this adaptive feature ranking and selection process, the
authors were able to suggest the use of different features from
the current set that increased the performance of both the
original Bayesian quadratic classifier and the original neural
network. This boosted the performance of the Bayesian
quadratic classifier equal to the original neural network
performance, and enabled the neural network to equal the
performance of the Data Model in achieving perfect
classification. The Data Model form still surpasses the neural
network form in terms of computational efficiency and in
providing the actual functional form represented by the neural
network. This robust feature selection algorithm can be
applied generically to other feature sets in an attempt to
increase performance.
D. Classifier Modeling
Fig. 5 shows an example of a Data Modeling output that
yielded 100% classification on a difficult simulated THAAD-
like data set. Shown in the upper right corner is the actual
equation used to make the decision. To the lower left of this
group is a ranking of the input features (bar chart) derived
from Data Modeling and a graph of the overlap of each of the
two best input features (above bar chart).
Fig. 5. Application of Data Modeling to radar feature data.
It should be noted that a good deal of overlap exists
between the features selected by Data Modeling, and yet Data
Modeling was able to take these features and generate perfect
classification. Such is not the case with the quadratic
classifier (example shown in the lower right quadrant using
the best features available in terms of separability). Also
shown in the lower right of Fig. 5 is an example of a decision
map that was generated using an actual Data Modeling based
equation classifier, which shows the complex nature of the
decision space as compared to the quadratic classifier
decision space shown in the upper left. Training support
vectors are denoted as white dots.
Successful construction of a Phased Derived Range (PDR)
physics based feature classifier polynomial using Data
Modeling was demonstrated on PDR type data sets.
Beginning with data that existed in hardcopy format only,
these were digitized and scaled to the correct range. Once
this was done, Data Modeling was used to synthesize
ensembles of PDR type data sets in order to create
characterizing statistics. These statistics were then used to
derive an equation-based classifier like the one discussed
previously for the THAAD work.
Data Modeling has also been used to successfully capture a
Dempster-Shafer classifier process. Using only hardcopy
data, the data representing Dempster-Shafer classifier output
versus range and quality of the intelligence gathered was
scanned into the computer and digitized and scaled to the
correct range in the same manner as the PDR data mentioned
earlier. This equation-based classifier yields the probability
of an object being a Class 1 type given the range and the
quality of the intelligence gathered, and can be embedded
into testbeds and simulations for real time use. This negates
the need for Dempster-Shafer source code or executables.
The process listed above was done blindly without any
knowledge of the Dempster-Shafer algorithm other than the
hardcopy data that was used.
E. Discovery of Bayesian Belief Network Architecture
Data Modeling has also been used to successfully capture
the decision architecture process of a Bayesian Belief
network. Using a 12 node BBN generated in Netica™, 100
examples of decisions from the BBN (conflicted versus
deconflicted scenarios) and the values of the input variables
for each of these 100 cases were learned by equation based
Data Modeling technology. Data Modeling determined how
significant each feature was to the final result. Data
Modeling found that only 3 of the variables (upper tier
intercept time after launch, lower tier commit time after
launch, and kill assessment time) were required for the
equation-based model to adequately capture the decision
process. In order to demonstrate that Data Modeling can in
fact use all variables if necessary, the training process was
allowed to continue past this much simpler solution until all
variables were used.
Data Models were also generated that solved for each of
the inputs given the output conflicted/deconflicted value. This
provides a mechanism for solving the inverse problem of
predicting the input values for a given output.
III. DATA MODELING PERFORMANCE
In analyzing different algorithms for conversion to Data
Models, it was found that many of these complex algorithms
cannot run in real-time due to the length of code (~ 5000
SLOC), memory usage (> 1 MB), and execution time (~
seconds). Using the Data Modeling approach, equation-based
models were built that used 75 Kbytes of memory and took
approximately 2 milliseconds to execute, and yielded
performance comparable to the original algorithms. In
addition, when the results were encoded into a lookup table
format, the execution time was reduced to 0.7 milliseconds
with an increase in memory required to approximately 300
Kbytes of memory to execute. A final icon encoding scheme
was employed that stored the same amount of information as
the lookup table in a 6 Kbyte JPEG compressed image.
REFERENCES
[1] T. Turlington, Behavioral Modeling of Nonlinear RF and Microwave
Devices. Boston, MA: Artech House, 2000.
[2] G. Cherbit, Fractals: Non-integral Dimensions and Applications. New
York: Wiley, 1991.
[3] G. Page, J. Gomm, and D. Wilson, Application of Neural Networks to
Modeling and Control. London: Chapman and Hall, 1993.
[4] A. Ivakhnenko and V. Lapa, Cybernetics and Forecasting Techniques.
New York: Elsevier, 1967.
[5] A. Darwiche, “A differential approach to inference in Bayesian
Networks,” University of California (Los Angeles), 2000.
[6] H. Jaenisch and M. Filipovic, “Classification of Jacoby spectra using
Data Modeling,” Proceedings of SPIE: Imaging Spectrometry VIII, vol.
4816, pp. 296-307, July 2002.
[7] J. Feder, Fractals. New York: Plenum Press, 1988.
[8] B. Scholkopf and A. Smola, Learning with Kernels. Cambridge, MA:
MIT Press, 2002.
[9] M. Barnsley, Fractals Everywhere, Second Edition. Boston, MA:
Academic Press, 1993.
[10] H. Jaenisch, S. Taylor, J. Handley, and M. Carroll, “Optimal fractal
image and data compression,” Proceedings of the Southeastern
Simulation Conference, pp. 67-77, October 1995.
[11] J. Handley, H. Jaenisch, and M. Carroll, “Identification of the behavior
transition point in data using a robust fractal method,” Proceedings of
the Southeastern Simulation Conference, October 1995.
[12] J. Handley, On The Existence of a Transition Point in Sampled Data
using Fractal Methods. Ann Arbor, MI: UMI, 1995.
[13] H. Peitgen, H. Jurgens, and D. Saupe, Chaos and Fractals, New York:
Springer-Verlag, 1992.
[14] D. Ruelle, Chaotic evolution and strange attractors. Cambridge:
Cambridge University Press, 1989.
[15] T. Lago, “Digital sampling according to Nyquist and Shannon,” Sound
and Vibration, vol. 36:2, pp. 20-22, February 2002.
[16] H. Jaenisch, J. Handley, S. Massey, C. Case, and C. Songy, “Network
centric decision architecture using financial or 1/f data models,”
Proceedings of SPIE: App. and Science of Neural Networks, Fuzzy
Systems, and Evolutionary Comp, V, vol. 4787, pp. 86-97, July 2002.
[17] A. Levens, Nomography. New York: Wiley, 1948.
[18] E. Grant, Statistical Quality Control. New York: McGraw-Hill, 1952.
[19] H. Jaenisch, J. Handley, C. Songy, and C. Case, “Adaptive self-
defining basis functions for wavelet transforms specified with Data
Modeling,” Proceedings of SPIE: Algorithms and Systems for Optical
Information Processing VI, vol. 4789, pp. 123-133, July 2002.
Data Modeling of 1/f noise sets
Holger M. Jaenisch*a
and James W. Handley a
aSparta Inc., 4901 Corporate Drive, Suite 102, Huntsville, AL 35805
ABSTRACT
A novel method is presented for solving the inverse fractal problem for 1/f noise sets. The performance of this method is
compared with classical data modeling methods. Applicability to different distributions of noise is presented, along with
an overview of important applications including data and image compression.
Keywords: Data Modeling, fractal, noise, 1/f, inverse fractal modeling
1. BACKGROUND
Beginning with a random data set (N points) from an unknown stochastic process, we wish to identify the system
governing the process (system identification) given only discrete sampled estimates. The process of estimating the
stochastic process model parameters from estimates of the measured data is termed inverse modeling. If the final inverse
model can be cast into a functional form subject to certain constraints, it is termed a Data Model.
1.1 Data Model definition
A Data Model is defined as a function [f(x)]n subject to the following constraints:
1. Must be a continuous function or an equivalent discrete representation.
2. Must be an interpolation function.
3. Number of coefficients should not exceed (N-1)/2 and must not exceed N-1.
Further, a more desirable Data Model is achieved by adhering to the following guidelines:
1. Should contain no Boolean or programming statements, only algebraic expressions.
2. Should not contain looping structure or recursion.
3. Should contain no random number generators.
4. Should contain no probability based variables, conditionals, or output.
Because of the first constraint, discontinuous or piecewise linear solutions such as high order spline functions are not
valid Data Models. The second constraint eliminates regression based fits to data such as low order polynomials unless
the fit of the polynomial is sufficient to use as an interpolation function. For the third constraint, if an interpolation
function has greater than (N-1)/2 coefficients, such as a Lagrangian polynomial that requires approximately N-1
coefficients, it is termed a trivial Data Model.
1.2 Lagrange polynomials
An example of a trivial Data Model is an inverse model represented as a Lagrange polynomial. Lagrange polynomials
are of degree N-1 and are derived from a measurement set of N points. This polynomial is obtained using Lagrangian
interpolation and is defined as in (1). For data sets greater than 10 points, the Lagrange polynomial exhibits unstable
solutions, especially near end points. For this reason, it is of limited utility as a basis for Data Modeling1.
P(x) = \sum_{j=0}^{N-1} y_j \prod_{i=0, i \neq j}^{N-1} \frac{x - x_i}{x_j - x_i}   (1)
*[email protected]; phone 1 256 337-3768; fax 1 256 830-0287; sparta.com
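As an illustrative sketch (our own Python, not part of the paper), the Lagrange interpolant in (1) can be evaluated directly; the function and variable names are ours:

```python
# Direct evaluation of the Lagrange interpolant in (1).
def lagrange(x, xs, ys):
    """Evaluate the degree N-1 Lagrange interpolant through (xs, ys) at x."""
    total = 0.0
    n = len(xs)
    for j in range(n):
        term = ys[j]
        for i in range(n):
            if i != j:
                term *= (x - xs[i]) / (xs[j] - xs[i])
        total += term
    return total

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]        # samples of y = x**2
print(lagrange(1.5, xs, ys))     # reproduces 1.5**2 = 2.25 exactly
```

For this 4-point polynomial sample the interpolant is exact; run with N > 10 noisy points, the same routine exhibits the endpoint oscillation noted above.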
1.3 Chebyshev polynomials
Another inverse model form is based on the use of orthogonal basis functions such as Chebyshev polynomials of the
explicit form T_n(x) = \cos(n \cos^{-1}(x)) that can also be combined with trigonometric identities to yield the recurrence
form

T_0(x) = 1, \quad T_1(x) = x, \quad T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x), \quad n \geq 1.   (2)
These polynomials are then used to approximate a function using the form
f(x) \approx \sum_{k=1}^{N} c_k T_{k-1}(x) - \frac{1}{2} c_1, \quad -1 \leq x \leq 1.   (3)
Functions derived from the Chebyshev approximation technique or from other polynomial methods such as Legendre
polynomials, Laguerre polynomials, Hermite polynomials, or Radial Basis Functions do not capture the desired fine
resolution detail (fractal characteristics) of the original stochastic process2.
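A minimal sketch of the approximation in (2)-(3), using NumPy's Chebyshev module; the smooth target function exp(x) is an arbitrary example of ours:

```python
# Chebyshev approximation sketch for (2)-(3) using NumPy.
import numpy as np
from numpy.polynomial import chebyshev as C

f = lambda x: np.exp(x)              # smooth target on [-1, 1]
coeffs = C.chebinterpolate(f, 10)    # degree-10 interpolant at Chebyshev points
x = np.linspace(-1.0, 1.0, 5)
approx = C.chebval(x, coeffs)
print(np.max(np.abs(approx - f(x)))) # error is tiny for a smooth target
```

As the text notes, such smooth bases converge rapidly for smooth targets but do not capture fractal fine structure.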
1.4 Multi-variable Data Modeling
Data Modeling can be generalized to the multi-variable case by invoking the constraints listed in Section 1.1. The multi-
variable form must be continuous, be an interpolation function, and result in fewer coefficients than the factorial
combinations of variables or the number of realizations (all possible combinations representation is a trivial case). In this
work, we only look at the single variable case. The multi-variable case is treated by the authors elsewhere3.
2. FRACTIONAL BROWNIAN MOTION
1/f or fBm noise is the most commonly found type of noise in nature. Formally, fBm is the integral of white noise.
However, the term 1/f noise is applied to any fluctuating quantity V(t) whose spectral density in log-log space varies as
1/f^β, where β characterizes the steepness of the slope of the spectral density. Gaussian or white noise would be
characterized with β = 0, Brownian motion with β = 2, and 1/f noise for β values in between4. Mathematically, the fBm
process is represented by
B_H(t) = \frac{1}{\Gamma(H + \frac{1}{2})} \int_{-\infty}^{t} (t - t')^{H - \frac{1}{2}} \, dB(t').   (4)
2.1 fBm Analogies
The direct integration method (5) uses a random number generator in order to capture the spatial characteristics of (4),
while FFT filtering (6) uses a random number generator to capture the frequency characteristics of (4). The Weierstrass-
Mandelbrot method5 (7) is a functional representation that eliminates the need for a random number generator, but does
not represent the spatial or frequency characteristics of (4) as well as (5) or (6). On the other hand, the Sxe method (8)
combines the benefits of (5) and (7) to represent both characteristics in a functional form. The discrete Sxe method (9) is
a discrete implementation of (8).
B_n = B_{n-1} + (2 \, rnd - 1)   Direct Integration (5)

B(t) = \sum_{m=0}^{N/2 - 1} f_m^{-\beta/2} \left( rnd_1 + i \, rnd_2 \right) e^{2 \pi i f_m t}   FFT-filtering (6)

B(n) = \sum_{i=0}^{\infty} b^{-i} \left( 1 - \cos(a^i \pi n) \right)   Weierstrass-Mandelbrot (7)

B(n) = \sum_{x=0}^{n/step} \sin(x^e)   Sxe Function (8)

B_n = B_{n-1} + \sin(n^e)   Discrete Sxe Function (9)
2.2 fBm Modeling
For the Chebyshev polynomial approximation method in (2) and (3), sampling of the continuous interval [-1, 1] is
achieved for an arbitrary f(x) by transforming the dependent variable x that occurs on the interval [a, b] by the
relationship
y = \frac{2x - (a + b)}{b - a}, \quad -1 \leq y \leq 1.   (10)
We propose that a model of a specific fBm realization data set can be created from sub-sampled points (called the
stencil) of that realization as
B(n) = w_n(F_0), \quad F(k) = B_H\left[ t_0 + (k - 1)\Delta t \right].   (11)
We refer to this archetypal kernel as the Data Model of the fBm realization. This is possible because the fBm realization
defines the attractor for an unknown stochastic process. According to the Ruelle-Takens theorem6, addresses on this
attractor harbor information about the entire attractor. This should enable attractor reconstruction via interpolation from
a sparse sampling of address points. These address points define support vectors (12) or fiducial points for this process
because the support vectors are the points that define the critical decision boundary as a hyperplane in n-dimensional
space7.
\alpha_i \left[ y_i \left( \langle w, x_i \rangle + b \right) - 1 \right] = 0   (12)
If we think of a complex-spaced line in n-dimensional space as defining this boundary, we see its projection into 1-D
takes on the form of a 1/f type curve. Therefore, the fiducial points necessary to delineate and reconstruct this boundary
function are the support vectors defined as the sub-sampled points in (11) and can be represented in functional form
using Turlington polynomials8 of the form
F(k) = A + Bk + \sum_i C_i \log_{10}\!\left( 1 + 10^{(k - k_i)/d_c} \right) + \sum_m D_m \log_{10}\!\left( 1 + 10^{(k - k_m)/d_m} \right)   (13)
We use the generic term 1/f when representing the entire class of 1/fβ type data sets. Because these processes are short-
time stationary and ergodic over finite non-zero length intervals, it should be possible to adequately capture the
statistical self-similarity within discrete intervals by sparse sampling of the interval to capture the local dynamics of the
attractor.
3. DATA MODELING
3.1 Dyadic fractal reconstruction
One theoretical approach for achieving functional Data Modeling from the sub-sampled 1/f realization is to use the
method of Deslauriers-Dubuc dyadic interpolation9. According to Deslauriers-Dubuc, the dyadic extension of the
fundamental interpolant is given as
F(t/2) = F(t) + aF(t-3) + bF(t-1) + cF(t+1) + dF(t+3)   (14)
where a, b, c, and d are scaling coefficients, and satisfies the identity given in (15). In (15), [t] denotes the integer
portion of the number t.
y(t) = \sum_{n=[t]-2}^{[t]+3} y(n) F(t - n)   (15)
Since the solution of the coefficients in (14) cannot be obtained directly, regression methods may be used. However,
Deslauriers-Dubuc dyadic interpolation can be recast using Barnsley’s Hidden Variable Iterated Function Systems (IFS)
into geometric form using affine transformations defined as
w_n \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a_n & 0 & 0 \\ c_n & d_n & h_n \\ k_n & l_n & m_n \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} e_n \\ f_n \\ g_n \end{pmatrix}   (16)
to be used instead. This affine transformation can be further simplified for our application by
w_n \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a_n & 0 \\ c_n & d_n \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e_n \\ f_n \end{pmatrix}.   (17)
Elton10 proved that this process will converge for all continuous functions in (17), which we have achieved using
Turlington polynomials.
\lim_{n \to \infty} \frac{1}{n+1} \sum_{k=0}^{n} f(x_k) = \int_X f(x) \, d\mu(x)   (18)
3.2 Iterated Function Systems (IFS) reconstruction
IFS reconstruction is defined by Barnsley11 in (17) as an affine transform subject to the constraints
w_n \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} \quad \text{and} \quad w_n \begin{pmatrix} x_N \\ y_N \end{pmatrix} = \begin{pmatrix} x_n \\ y_n \end{pmatrix} \quad \text{for } n = 1, 2, \ldots, N.   (19)
Therefore, the five real numbers a, c, d, e, and f that specify the transformation must obey the equation relationships
a_n x_0 + e_n = x_{n-1},
a_n x_N + e_n = x_n,
c_n x_0 + d_n y_0 + f_n = y_{n-1},
c_n x_N + d_n y_N + f_n = y_n.   (20)
This is a system of four equations and five unknowns, resulting in one free parameter in each transformation. This free
parameter is chosen to be dn and is called the vertical scaling factor. This choice allows the other four parameters to be
written in terms of the data and the parameter dn as
a_n = \frac{x_n - x_{n-1}}{x_N - x_0}, \qquad e_n = \frac{x_N x_{n-1} - x_0 x_n}{x_N - x_0},

c_n = \frac{y_n - y_{n-1}}{x_N - x_0} - d_n \frac{y_N - y_0}{x_N - x_0}, \qquad f_n = \frac{x_N y_{n-1} - x_0 y_n}{x_N - x_0} - d_n \frac{x_N y_0 - x_0 y_N}{x_N - x_0}.   (21)
Barnsley gives no method for choosing the value for dn, and instead leaves this as a free parameter that is only
constrained to be on the interval [0, 1]. The development of this method is based on Deslauriers-Dubuc dyadic
interpolation using only 4 or 5 control points. The vertical scaling parameter along with the 4 or 5 control points are left
as user input. Barnsley stores only the IFS coefficients derived from the 4 or 5 control points and the dn scaling
parameters in order to reconstruct a crude statistical representation of the data.
IFS reconstruction is performed by starting with an initial point (usually one of the points given in (19)) and performing
one of the transforms given in (17) and defined by the coefficients in (21) to the point to generate a new point that lies on
the attractor for the function. This new point is then used as a starting point and another transform applied to it to find
another new point on the attractor. This is repeated iteratively until the attractor for the function fills in to within a
specified tolerance given as
\left| x - \hat{x} \right| \leq \varepsilon.   (22)
The IFS coefficients in (21) determine the distance the current x location is from the starting point of the data set (as a
fractional percentage of the length of the entire data set) and then maps the new y value to an x location the same
fractional percentage into the transform interval. The y value is determined by the coefficients in (21) by determining
how far the vertical scaling coefficient raises the previous y value above or below the sloped line connecting the
beginning y and ending y of the transform interval.
Barnsley presents two different methods for performing this iterative calculation to reconstruct the attractor: 1) Random
Iteration Algorithm and 2) Deterministic Algorithm. The Random Iteration Algorithm is a Markov process. It uses a
single starting point and the transform to be applied is chosen at random. The resulting point is then remapped with
another transform chosen at random, and the process repeated iteratively until all desired points have met the condition
in (22).
The Deterministic Algorithm assumes the data set to be on a grid and performs all transforms against all available points
(control points). The result is the synthesis of the same number of interpolated points as the number of control points. As
before, each of these points is then mapped iteratively until the condition in (22) is met. Neither the Random Iteration
Algorithm nor the Deterministic Algorithm yields a true functional f(x) form that can be directly evaluated by x.
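The coefficient formulas in (21) and the Random Iteration Algorithm can be sketched as follows (illustrative Python of ours; the support points and d_n values are arbitrary):

```python
# Barnsley-style fractal interpolation: map coefficients per (21),
# reconstruction by the Random Iteration Algorithm.
import random

def ifs_coefficients(xs, ys, d):
    """Affine map coefficients (a, c, d, e, f) for each interval, from (21)."""
    x0, xN, y0, yN = xs[0], xs[-1], ys[0], ys[-1]
    span = xN - x0
    maps = []
    for n in range(1, len(xs)):
        a = (xs[n] - xs[n - 1]) / span
        e = (xN * xs[n - 1] - x0 * xs[n]) / span
        c = (ys[n] - ys[n - 1]) / span - d[n - 1] * (yN - y0) / span
        f = (xN * ys[n - 1] - x0 * ys[n]) / span - d[n - 1] * (xN * y0 - x0 * yN) / span
        maps.append((a, c, d[n - 1], e, f))
    return maps

def random_iteration(maps, x0, y0, iters=5000, seed=1):
    """Iterate randomly chosen maps; the points fall onto the attractor."""
    random.seed(seed)
    x, y, pts = x0, y0, []
    for _ in range(iters):
        a, c, d, e, f = random.choice(maps)
        x, y = a * x + e, c * x + d * y + f
        pts.append((x, y))
    return pts

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.8, 0.3, 1.0]
maps = ifs_coefficients(xs, ys, d=[0.3, -0.2, 0.25])
pts = random_iteration(maps, xs[0], ys[0])
print(min(p[0] for p in pts), max(p[0] for p in pts))  # stays within [x0, xN]
```

Each map sends [x_0, x_N] affinely onto one interval [x_{n-1}, x_n], so iterated points remain on the attractor over the support-vector domain; this is the interval mapping described for the coefficients above.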
Innovations by the authors to solve problems with this method include: 1) Modification of the algorithm to interpolate on
all available data; 2) Selection of support vectors (control points) directly from the data set; 3) Estimation of dn; 4)
Reduction of the number of iterations required to evaluate y for a given x; and 5) Replacement of the random number
generator for the Random Iteration Algorithm by an equivalent function. These are covered in the next section.
3.3 Data Modeling innovations for IFS
3.3.1 Interpolation on all available data
IFS interpolation routines as published in the literature use only 4 or 5 control points from the original data in order to
capture the dynamics of the process and derive part of the scaling coefficients. The authors have found that instead of
arbitrarily selecting 4 or 5 points from the data set, all available data points can instead be used to reconstruct any 1/f
type noise data set. This required a paradigm shift from the approach of Barnsley, where the goal was to select the
appropriate 4 or 5 points that yielded the best reconstruction results. By using more data points, better reconstruction
results were achieved without the necessity to search for combinations of control points.
3.3.2 Support vector selection
Support vector identification is simplified to equi-spaced partitioning of the original data vector. The number of points N
to sub-sample12,13 in building the model is bounded by (23) where R is the range of the data and σ is the standard
deviation of the data.
\min: \lim_{\Delta \to 0^+} \frac{\log(R_J/\sigma_J)}{\log(N_J)} = \frac{1}{N}, \qquad \max: \lim_{\Delta \to 0^+} \frac{\log(R_J/\sigma_J)}{\log(N_J)} = 0.5   (23)
The number of sub-sample points N can be solved for between these bounds in order to minimize the variance of the
residual between reconstructed and original. We have found excellent results by solving the second equation in (23) for
N (the number of points to sub-sample), and have determined that a good rule of thumb to go by in the absence of the
information contained in (23) is to sub-sample 10% of the original data set. The process of sub-sampling itself occurs as
shown in (11), and is related to the [-1, 1] interval sampling shown previously in (10).
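Under our reading of the second relation in (23), log(R/σ)/log(N) = 0.5 gives N = (R/σ)²; a sketch (illustrative Python, function name ours):

```python
# Solve the second relation in (23) for the sub-sample count:
# log(R/sigma)/log(N) = 0.5  =>  N = (R/sigma)**2.
def subsample_count(data):
    r = max(data) - min(data)                                  # range R
    mu = sum(data) / len(data)
    sigma = (sum((v - mu) ** 2 for v in data) / len(data)) ** 0.5
    return max(2, round((r / sigma) ** 2))

print(subsample_count([0.0, 0.5, 1.0, 0.2, 0.8, 0.4]))
```

When the statistics in (23) are unavailable, the 10% rule of thumb stated above applies instead.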
3.3.3 Estimation of dn
3.3.3.1 Method 1
The morphological fractal dimension Df of an IFS interpolation function can be related to dn, by the relationship
D_f = 1 + \frac{\log\left( \sum_{n=1}^{N} d_n \right)}{\log(N)}   (24)
where Df is bounded between 1 and 2 and the interpolation function is not a straight line. The authors have noted that if
the fractal dimension Df of the desired interpolation function in (24) is known, its vertical scaling coefficients dn, can be
represented as a function of the fractal statistics of the data, and hence can be estimated from a characterization of the
data itself.
Using the morphological fractal dimension relationship in (24), it naturally follows that a first estimate of the dn
coefficients from the data can be obtained by inverting (24) and solving for dn. By isolating dn in (24), we obtain
\sum_{n=1}^{N} d_n = N^{D_f - 1}.   (25)
If we assume that the magnitude of each of the scaling coefficients is equal and uniformly distributed over all intervals of
the data, the summation (Σ) can be removed from (25) by dividing both sides of the equation by N. This yields the
relationship
d_n = N^{D_f - 2}.   (26)
The sign of the scaling coefficient dn is assigned to be the same as the sign of the derivative between support vector n
and support vector n – 1 given as
d_n = \begin{cases} \;\;\, d_n, & y_{n+1} \geq y_n \\ -d_n, & y_{n+1} < y_n \end{cases}   (27)
The authors have found that in practice, the use of the relationship in (26) yields scaling coefficients that are too large
and cause the interpolation function to not reconstruct the original data well. Care must also be taken when choosing
which fractal dimension characterization method to use when mapping Df into dn. The relationship in (26) is developed
based on morphological fractal statistic methods. In order to make the relationship in (26) useful for entropy based
fractal characterization methods, the parameter β was added yielding
d_n = N^{D_f - 2 - \beta}   (28)
with the value of β selected based on the underlying distribution of the original data and the signs of the scaling
coefficients set using (27).
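A sketch of Method 1 combining the magnitude rule (28) with the sign rule (27) (illustrative Python of ours; the D_f and β values are arbitrary):

```python
# Method 1 estimate of the vertical scaling coefficients:
# magnitude from (26)/(28), sign from (27).
def dn_estimates(ys, Df, beta=0.0):
    N = len(ys) - 1                     # number of intervals between support vectors
    mag = N ** (Df - 2.0 - beta)        # d_n magnitude, eq. (28)
    d = []
    for n in range(N):
        sign = 1.0 if ys[n + 1] >= ys[n] else -1.0   # sign rule, eq. (27)
        d.append(sign * mag)
    return d

print(dn_estimates([0.0, 0.5, 0.2, 0.9], Df=1.3))
```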
3.3.3.2 Method 2
Because fractal statistics imply multi-resolution scaling, we propose that the estimate of the fractal dimension Df derived
from the sub-sampled points (11) is sufficient to enable reconstruction. Data Modeling uses these sub-sampled points
(support vectors) and derives all the coefficients from the support vector data itself. This includes an estimate of the
vertical scaling coefficient dn obtained from Jaenisch14 as
J_\beta = J + \frac{3}{2}\beta(J - 1)   (29)

d_n = \begin{cases} N^{-(J-1)/\log N}, & J > 1 \\ J/N, & J \leq 1 \end{cases}   (30)
where β=0.25 for the Gamma distribution, 0.5 for the Weibull and Poisson distributions, -0.25 for the Exponential
distribution, -0.5 for the Triangular distribution, and zero for Gaussian and Uniform distributions. The sign of the
scaling coefficient dn is assigned using the relationship in (27). This technique, as in the case of (26) and (28) also
assumes that the fractal dimension Df is distributed uniformly among all intervals of the data. If this is not the case,
multi-fractal analysis may be applied and the data partitioned into subintervals so that a spectrum of fractal dimensions
can be obtained15 or a functional representation of Df provided to allow the fractal dimension to vary across the data set.
Techniques based on multi-fractal analysis work very well, but are not covered in the scope of this work due to space
constraints.
3.3.3.3 Method 3
As a further simplification in the determination of dn, we propose that these scaling coefficients can be estimated directly
from the data itself without any preprocessing of the data set for fractal characterization. This estimate uses the
differences of means (∆µ) of neighboring segments connecting support vector points and is given by
d_n = \frac{y_{n+1} + y_{n+2}}{2} - \frac{y_n + y_{n+1}}{2} = \frac{y_{n+2} - y_n}{2}.   (31)
This scaling coefficient estimate method can be done in real-time as the data is being measured.
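Method 3 reduces to a running difference of neighboring segment means; a sketch (illustrative Python, per our reading of (31)):

```python
# Method 3: difference of means of neighboring segments,
# d_n = (y_{n+1}+y_{n+2})/2 - (y_n+y_{n+1})/2 = (y_{n+2}-y_n)/2.
def dn_stream(ys):
    return [(ys[n + 2] - ys[n]) / 2.0 for n in range(len(ys) - 2)]

print(dn_stream([0.0, 0.4, 0.1, 0.7]))  # two interval estimates, ~[0.05, 0.15]
```

Because each estimate needs only the two samples bracketing a segment pair, the list comprehension can be replaced by an incremental update as new measurements arrive.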
3.3.4 Iteration reduction
In applications where the IFS algorithms used for Data Modeling yield an acceptable solution in sufficient time, no
further enhancement is needed. Real time versions must reduce the number of iterations necessary for convergence. We
propose the use of zooming as shown in Fig. 1. Using the Ruelle-Takens theorem, we can use the interpolated values that
define the new bounded segments near the desired x interval. These new points are used to restart the interpolation
sequence on the newly constrained domain.
Fig. 1. Iteration reduction using zooming.
For the Random Iteration algorithm, once the number of points generated on the desired x interval equals the original
number of sub-sampled points, the IFS transform coefficients in (21) are calculated from these newly generated points
and the iteration process restarted on the new domain. This process is repeated until a solution is found within the
tolerance given in (22).
Using the Deterministic algorithm, only the transform from the segment that the desired data point lies on is used along
with the sub-sampled points. All of the sub-sampled points are mapped using this single transform, and the results stored
in a temporary buffer. The sub-sampled points are replaced with the newly generated points, and the process is repeated
until a solution is found that meets the criteria given in (22). The number of calculations required to find a solution
meeting the criteria in (22) for the Deterministic algorithm is bounded by N·x, where N is the number of points in the
Data Model, and the number of zooms x is directly proportional to the order of the tolerance (x ~ |log(ε)|).
3.3.5 Replacement of random number generator
In Monte Carlo applications where a uniform random number generator is used (such as the Random Iteration
Algorithm), we propose instead to use a stochastic function to derive approximate random digits. We have found that a
good functional representation for approximating random number digits on the interval [-1,1] is the derivative of the Sxe
function (8) and is given as
R = \sin(x^e)   (32)

where e is the base of the natural logarithm, exp(1) (~ 2.7183), and x is generally used as a non-repeating increasing index
variable. If the desired deviates are to be on an interval other than [-1,1], the results can be renormalized to any
minimum (a) and maximum (b) value using
\hat{R} = a + \tfrac{1}{2}(1 + R)(b - a).   (33)
Use of this stochastic function for the generation of random deviates in the Data Modeling process eliminates one major
impediment towards a functional representation f(x).
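A sketch of the deviate generator in (32) and the renormalization in (33) (illustrative Python of ours):

```python
# Stochastic-function replacement for a uniform RNG:
# deviates from sin(x**e), renormalized to an arbitrary [a, b].
import math

def deviate(x):
    return math.sin(x ** math.e)           # R on [-1, 1], eq. (32)

def renorm(r, a, b):
    return a + 0.5 * (1.0 + r) * (b - a)   # R-hat on [a, b], eq. (33)

vals = [renorm(deviate(x), 0.0, 1.0) for x in range(1, 1001)]
print(min(vals), max(vals))                # all deviates lie in [0, 1]
```

Because the index x never repeats, the sequence is reproducible without any generator state, which is what makes a purely functional f(x) form possible.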
4. APPLICATIONS OF DATA MODELING
4.1 Data hole patching
One application of Data Modeling is in hole patching for data sets16. Although many classical data and signal processing
methods assume that a data set is evenly sampled and contains no holes, this is not always the case in measured data
because of sampling constraints and data dropout. We have found that we can achieve excellent results in patching one
or more holes in data sets using the Data Modeling approach. This is done using the same sampling constraints shown
above in (23) or by naively sampling 10% of the available data evenly. If an evenly spaced point falls in a data hole
region, we simply omit it from the set of support vectors. The IFS transforms listed above are performed on the available
support vectors to reconstruct the missing data in the dropout area. This new data is simply placed into the original data
sequence where the missing data would have occurred. Results are shown in Fig. 6 in Section 5 of this work.
4.2 Data compression
The authors generally use Data Modeling to perform compression through decimation of the original data and saving
only the support vectors. This facilitates reconstruction on demand to any desired resolution using the Data Modeling
IFS transforms discussed previously. Unlike Barnsley’s method where all of the coefficients in (21) and the vertical
scaling coefficient dn are stored, we store only the sub-sampled points and generate all of the necessary coefficients upon
demand. This method yields an approximately 80% savings in storage from Barnsley’s techniques, and since the number
of points that are to be saved is known ahead of time, the compression ratio is also known in advance. This yields
excellent reconstruction results as demonstrated in Fig. 5 in Section 5.
4.3 Image compression
Data Modeling compresses images by first converting them to a one dimensional data sequence and decimating as
shown previously in (23). This generates a thumbnail representation of the original image. These thumbnails are low-
resolution images that capture the salient features of the original, can be stored in much less space than the original, and
can be reconstructed to any desired resolution upon demand. Two dimensional images are transformed into a one
dimensional data sequence using the Hilbert sequence15 given in (34) subject to Lindenmayer’s L-system grammars
defined by (35).
H_{n+1} = w_1(P_n) \cup w_2(P_n) \cup w_3(P_n) \cup w_4(P_n)
P_{n+1} = w_1(H_n) \cup w_2(P_n) \cup w_3(P_n) \cup w_4(P_n)   (34)

L \to +RF-LFL-FR+
R \to -LF+RFR+FL-
F \to F, \quad \delta = 90°

F: (x, y, \alpha) \Rightarrow (x + l\cos\alpha, \; y + l\sin\alpha, \; \alpha)
+: (x, y, \alpha) \Rightarrow (x, y, \alpha + \delta)
-: (x, y, \alpha) \Rightarrow (x, y, \alpha - \delta)   (35)
The decimation methods contained earlier in this work can then be used to sub-sample the sequence, and Data Modeling
used to reconstruct the sequence to any resolution desired. The inverse Hilbert sequence (reverse of (34)) is then used to
reconstruct the two dimensional image derived from the Data Modeling process. It has been found that this process
yields excellent results as demonstrated in Fig. 2.
Fig. 2. Data Modeling technique applied to an image.
Once the thumbnail has been generated, it can then be further compressed using techniques such as JPEG-2000 and
transmitted in an efficient manner. On the receive end, the thumbnail would be uncompressed with JPEG-2000 and Data
Modeling used to reconstruct the image to the original size or larger on demand, thereby enabling streaming real time
video application.
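For concreteness, the 2-D to 1-D walk behind (34)-(35) can be sketched with the classic iterative Hilbert index-to-coordinate conversion (our illustrative Python; the L-system in (35) generates the same curve geometrically):

```python
# Convert a distance d along a 2**order x 2**order Hilbert curve to (x, y),
# using the standard rotate-and-flip iteration.
def hilbert_d2xy(order, d):
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate/reflect the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Scan a 4x4 image into a 1-D sequence along the Hilbert curve.
path = [hilbert_d2xy(2, d) for d in range(16)]
print(path[:4])  # -> [(0, 0), (1, 0), (1, 1), (0, 1)]
```

The walk visits every pixel exactly once and consecutive samples are always adjacent pixels, which is why locality survives the 2-D to 1-D conversion.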
4.4 Data extrapolation
In order to extrapolate data points, it is necessary to first introduce at least one new support vector that is outside of the
original transform intervals. This is accomplished by appending the first point of the data set to the end of the data set
according to
x_{N+1} = x_N + (x_{i+1} - x_i), \quad y_{N+1} = y_i, \quad i = 1, 2, \ldots, N   (36)
to form an extrapolation interval. New transforms using (21) are calculated, and the data sequence is reconstructed. Only
the points generated in the first ½ of the extrapolation interval are kept as extrapolated points. This process is repeated
by increasing i in (36) and appending the next point in the original data set to the end of the saved extrapolated points to
create a new extrapolation interval. The reconstruction process is continued until all of the original data points have been
used, and can be repeated as many times as necessary to generate the desired length forecast.
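The extrapolation-interval construction in (36) can be sketched as (illustrative Python of ours; the 1-based index i follows the equation):

```python
# Append the i-th original point beyond x_N per (36) to open
# a new interval for the IFS transforms to reconstruct into.
def extend(xs, ys, i):
    """i is 1-based as in (36): x_{N+1} = x_N + (x_{i+1} - x_i), y_{N+1} = y_i."""
    step = xs[i] - xs[i - 1]
    return xs + [xs[-1] + step], ys + [ys[i - 1]]

xs, ys = [0.0, 1.0, 2.0], [5.0, 7.0, 6.0]
nx, ny = extend(xs, ys, 1)
print(nx[-1], ny[-1])  # -> 3.0 5.0
```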
Fig. 3. Screenshot of Data Modeling compared to Linear Prediction for forecasting.
4.5 Quantum computing
Using the Data Modeling approach, mathematical operations can be performed on the sub-sampled portion of a larger
data set, reducing the number of calculations required by as much as one third (1/3). The resultant is nearly equal to
performing the same mathematical operation on the full data set. To prove this, we do the following:
• Given a set of 256 numbers A, and an equal length filter B of 256 numbers
• Sub-sample every 10th number from A to yield 26 numbers in C
• Sub-sample every 10th number from B to yield 26 numbers in D
• Convolve A and B yielding 256 numbers in E
• Convolve C and D yielding 26 numbers in F
• Use fractal reconstruction to inflate the 26 numbers in F to 256 numbers in G
• Compute the difference between E and G yielding 256 numbers in H
This is shown in Fig. 4 below. The maximum difference value of H is 11%, near the approximate 10% reconstruction
tolerance. This approach can also be applied to an image using Data Modeling, because we treat the 2-D image as a 1-D
data set using the Hilbert sequence.
Fig. 4. Complex Process Applied to Sub-Sampled Data Yields Equivalent Result.
5. RESULTS
5.1 Data synthesis
Data Modeling has been successfully compared to classical methods including polynomial curve fitting, Markov chains,
wavelet transforms, mean and covariance methods, and Kalman filters (Fig. 5). Column 1 of Fig. 5 shows the original
data sequence of 1024 points and a histogram showing the distribution of the original. In Column 2, results from using
each of the classical techniques mentioned above are shown along with a histogram of the distribution of each resultant.
Note that only the Markov chain process captures the underlying distribution of the original, but fails to capture the
temporal and spatial characteristics of the original.
Column 3 of Fig. 5 shows the fractal Data Modeling approach using five different sub-sampling rates varying from 1%
to 25% of the original. For each of these sub-sampling rates, the temporal, spatial, and underlying distribution
characteristics of the original are captured.
Columns 4 and 5 show the spatial error and power spectra differences between the original, classical method, and the
fractal reconstruction. Using the 5% Data Model (50 points) as a baseline, only the wavelet transform method provided a
slightly better RSS error, but with a substantial difference in both the power spectra and in temporal and spatial
characteristic retention. Two peaks occurred in the log power spectra that were captured and amplified by the fractal
method albeit with a shift in their locations.
The wavelet power spectra showed no peaks at all in the lower trace, indicating a significant distortion in the original
signal and loss of frequency content. It should also be noted that all of the fractal Data Models generated power spectra
more like the original than any of the classical methods, and that using 50 points of the original data outperformed using
50 wavelet transform coefficients.
[Fig. 5 panels: the original 1/f random signal (1024 pts.) and its histogram; Column 2 reconstructions by Polynomial,
Markov chain, 50 pt. wavelet, mean & covariance, and Kalman filter; Column 3 fractal Data Models at 256, 100, 50, 20,
and 10 pts.; Columns 4 and 5 show the spatial error distributions and power spectra (dB), each scaled 0 to max. RSS error
pairs, classical vs. fractal (FMM): Poly 2.19 / FMM 0.94; Mark 10.16 / FMM 1.48; Wave 1.76 / FMM 1.99; M&C 8.61 /
FMM 3.23; Kalm 2.31 / FMM 4.76; and, for the second panel set: Poly 33.42 / FMM 3.82; Mark 70.23 / FMM 5.07;
Wave 26.55 / FMM 6.66; M&C 53.63 / FMM 11.48; Kalm 31.95 / FMM 8.91.]
Fig. 5. Comparison of fractal Data Modeling and classical methods.
5.2 Noise model comparison
Fig. 6 shows examples of Data Modeling for reconstruction and hole patching for three different 1/f type data sets. The
first two data sets (Rows 1 and 2) were generated using a Gamma distribution as shown, and the third data set was
generated using the Sxe function proposed earlier. We naively sampled the originals in Column 1 by 10% and
reconstructed as discussed earlier in order to obtain Column 2, and we introduced the dropouts shown in Column 3 and
patched them as discussed in Section 4.1. The Cramer-Rao bound of the process is the a priori theoretical best that we
can reconstruct each of the data sets and is given as
CR = 100(σ/N)   (37)

where σ is the standard deviation of the N sub-sampled points. The error for the reconstructed curve is given in (38),
and the hole patching error is calculated in the same manner as (38) using only the dropped-out points.

Err = (100/npts) Σ_{i=1}^{npts} (y_i − ŷ_i)   (38)
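The two error measures can be sketched in a few lines of Python (NumPy only; this follows (37) and (38) as printed,
and `yhat` is assumed to be a reconstruction produced elsewhere by the Data Model):

```python
import numpy as np

def cramer_rao_pct(sub):
    """Eq. (37): a priori best-case percent error, 100*(sigma/N) over the N sub-sampled points."""
    sub = np.asarray(sub, dtype=float)
    return 100.0 * sub.std() / sub.size

def recon_err_pct(y, yhat):
    """Eq. (38) as printed: signed mean difference, scaled to percent.
    (An absolute or RMS difference may be intended; the source layout is ambiguous.)"""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return 100.0 / y.size * float(np.sum(y - yhat))
```

For hole patching, the same `recon_err_pct` would be evaluated over the dropped-out indices only.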
[Fig. 6 columns: Original; Decimated and Reconstructed; Original with Dropouts; Dropouts Patched.
Row 1, Gamma (A=2), 1816 pts. sub-sampled to 175 pts., 66 dropouts: Cramer-Rao = 1.72%, Recon Error = 2.03%, Hole Error = 2.29%.
Row 2, Gamma (A=1), 1253 pts. sub-sampled to 119 pts., 53 dropouts: C-R = 2.07%, Error = 3.01%, Error = 2.95%.
Row 3, Sxe, 1146 pts. sub-sampled to 61 pts., 47 dropouts: C-R = 2.36%, Error = 3.98%, Error = 3.67%.]
Fig. 6. Examples of Data Modeling for reconstruction and hole patching.
6. CONCLUSIONS
Data Modeling has solved the inverse fractal problem for 1/f noise sets. Innovations developed by the authors during the
course of this work include modifying the IFS algorithm to use all available data instead of 4 or 5 control points,
autonomous selection of the support vectors or control points directly from the data set, the ability to estimate dn from
the data itself, reducing the number of iterations required to find solutions, and the replacement of the random number
generator in the Random Iteration algorithm with a stochastic function. Applicability of these techniques has also been
demonstrated for data and image compression and reconstruction, hole patching, data forecasting, and applicability to
the field of quantum computing for various noise sets. It is possible to process your own data sets through the Data
Modeling algorithms we have provided in the Appendix in the form of a MathCAD function and work document or
QBASIC source code. The authors' permission is granted to use this algorithm provided the authors are acknowledged
and referenced in any and all derived implementations. If the investigator prefers to use a uniform random number
generator, the Sxe function can be replaced at the appropriate locations.
ACKNOWLEDGMENTS
The authors would like to thank Marvin Carroll, Tec-Masters, Level-13, Licht Strahl Engineering INC, and Kristi
Jaenisch and Technical Dive College for the use of the MAJQANDA algorithm suite during the course of this work.
REFERENCES
1. Moin, P. Fundamentals of Engineering Numerical Analysis, Cambridge: Cambridge University Press, c2001.
2. Press, W.H., Flannery, B.P., Teukolsky, S.A., and W.T. Vetterling, Numerical Recipes in FORTRAN, Cambridge:
Cambridge University Press, c1989.
3. Jaenisch, H.M. and J.W. Handley, “Data Modeling for Radar Applications”, To be published in Proceedings of the
IEEE Radar Conference 2003, Huntsville, AL.
4. Feder, J. Fractals. New York: Plenum Press, 1988.
5. Edgar, G.A. Classics on Fractals, Reading, MA: Addison-Wesley, c1993.
6. Ruelle, D. Chaotic evolution and strange attractors. Cambridge: Cambridge University Press, 1989.
7. Scholkopf, B. and A. Smola, Learning with Kernels. Cambridge, MA: MIT Press, 2002.
8. Turlington, T. Behavioral Modeling of Nonlinear RF and Microwave Devices. Boston, MA: Artech House, 2000.
9. Cherbit, G. Fractals: Non-integral Dimensions and Applications. New York: Wiley, 1991.
10. Elton, J. “An Ergodic Theorem for Iterated Maps,” Journal of Ergodic Theory and Dynamical Systems, 7:481-488
(1987).
11. Barnsley, M. Fractals Everywhere, Second Edition. Boston, MA: Academic Press, 1993.
12. Handley, J., Jaenisch, H. and M. Carroll, “Identification of the behavior transition point in data using a robust fractal
method,” Proceedings of the Southeastern Simulation Conference, October 1995.
13. Handley, J. On The Existence of a Transition Point in Sampled Data using Fractal Methods. Ann Arbor, MI: UMI,
1995.
14. Jaenisch, H.M., Taylor, S.C., Handley, J.W., Carroll, M.P., “Optimal Fractal Image and Data Compression”,
Proceedings Of The Southeastern Simulation Conference '95, pp. 67-77, Orlando, FL 1995.
15. Jaenisch, H., Barton, P., and R. Carruth, “Determining the Fractal Dimension of Scenes and Digital Signals using
ROSETA and Other Novel Approaches,” Proceedings of SPIE Conference, Session: Signal Processing, Sensor
Fusion, and Target Recognition II, Vol. 1955, pp.298-315, Orlando, FL 1993.
16. Jaenisch, H.M. “Fractal Interpolation for Patching Holes in Data Sets”, Proceedings of the Southeastern Simulation
Conference ’92, Pensacola, FL 1992.
APPENDIX
MathCAD worksheet (fractal Data Modeling):

Test data generation:
  npts := 1000;  original_0 := 0;  original_i := original_{i-1} + sin(i·e),  i = 1..npts
  original_i := (original_i − min(original)) / (max(original) − min(original))
  newn := ceil(npts/10);  nsegs := newn − 1  (= 99)
  model_ii := original_{floor(ii·(npts−1)/(newn−1))},  ii = 0, 1, .. newn−1

IFS coefficients (xn_n := n;  b := xn_nsegs − xn_0;  n = 1..nsegs):
  a_n := (xn_n − xn_{n−1})/b
  e_n := (xn_nsegs·xn_{n−1} − xn_0·xn_n)/b
  d_n := (model_{n+1} − model_{n−1})/2 if n < nsegs, otherwise (model_0 − model_n)/2
  c_n := (model_n − model_{n−1} − d_n·(model_nsegs − model_0))/b
  ff_n := (xn_nsegs·model_{n−1} − xn_0·model_n − d_n·(xn_nsegs·model_0 − xn_0·model_nsegs))/b

Reconstruction (Random Iteration with the stochastic index in place of a random number generator):
  xx_0 := 0;  yy_0 := model_0
  K_i := nsegs·(sin(i·exp(1)) + 1)/2 + 1;  k_i := floor(K_i)
  xx_i := a_{k_i}·xx_{i−1} + e_{k_i}
  yy_i := c_{k_i}·xx_{i−1} + d_{k_i}·yy_{i−1} + ff_{k_i}
  (worksheet plots of yy vs. xx and of original vs. model omitted)

DM function (returns the model ordinate nearest abscissa fx to within tolerance itol, from control points x, y):
  DM(fx, itol, x, y) :=
    newn ← length(y);  nsegs ← newn − 1;  b ← x_nsegs − x_0
    for n ∈ 1..nsegs−1:  d_n ← (y_{n+1} − y_{n−1})/2
    d_nsegs ← (y_0 − y_{nsegs−1})/2
    for n ∈ 1..nsegs:
      a_n ← (x_n − x_{n−1})/b
      e_n ← (x_nsegs·x_{n−1} − x_0·x_n)/b
      c_n ← (y_n − y_{n−1} − d_n·(y_nsegs − y_0))/b
      f_n ← (x_nsegs·y_{n−1} − x_0·y_n − d_n·(x_nsegs·y_0 − x_0·y_nsegs))/b
    xprev ← x_0;  yprev ← y_0;  bestx ← x_0;  besty ← y_0
    while |bestx − fx| > itol:
      k ← floor(rnd(nsegs) + 1)
      xcurrent ← a_k·xprev + e_k
      ycurrent ← c_k·xprev + (d_k·yprev + f_k)
      if |xcurrent − fx| < |bestx − fx|:  bestx ← xcurrent;  besty ← ycurrent
      xprev ← xcurrent;  yprev ← ycurrent
    return besty
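For readers without MathCAD or QBASIC, a minimal Python sketch of the same pipeline follows: coefficient fitting
from the sub-sampled control points, then reconstruction driven by the deterministic sin(i·e) index in place of a random
number generator. Names mirror the worksheet; the clamp on the map index is an assumption to keep it in range.

```python
import math

def fit_ifs(x, y):
    """Fit one affine IFS map per segment from the control points.

    Mirrors the worksheet coefficients: a, e map the abscissa; c, d, f the
    ordinate; d is the vertical scaling from a central difference (wrapping
    at the last segment)."""
    nsegs = len(y) - 1
    b = x[nsegs] - x[0]
    a, c, d, e, f = ([0.0] * (nsegs + 1) for _ in range(5))
    for n in range(1, nsegs + 1):
        d[n] = (y[n + 1] - y[n - 1]) / 2 if n < nsegs else (y[0] - y[n - 1]) / 2
        a[n] = (x[n] - x[n - 1]) / b
        e[n] = (x[nsegs] * x[n - 1] - x[0] * x[n]) / b
        c[n] = (y[n] - y[n - 1] - d[n] * (y[nsegs] - y[0])) / b
        f[n] = (x[nsegs] * y[n - 1] - x[0] * y[n]
                - d[n] * (x[nsegs] * y[0] - x[0] * y[nsegs])) / b
    return a, c, d, e, f

def reconstruct(x0, y0, coeffs, nsegs, niter):
    """Random Iteration algorithm with the stochastic sin(i*e) map index."""
    a, c, d, e, f = coeffs
    xx, yy = x0, y0
    pts = [(xx, yy)]
    for i in range(1, niter):
        # stochastic index k_i = floor(nsegs*(sin(i*e)+1)/2 + 1), clamped to 1..nsegs
        k = min(int(nsegs * (math.sin(i * math.e) + 1) / 2 + 1), nsegs)
        xx, yy = a[k] * xx + e[k], c[k] * xx + d[k] * yy + f[k]
        pts.append((xx, yy))
    return pts
```

To use a uniform random number generator instead, as the conclusions note, replace the sin(i·e) index with
`random.randint(1, nsegs)`.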
'Fractal Data Modeling - QBASIC
'Holger Jaenisch, Ph.D. and Jamie Handley, c1992-2003.
nameagain: INPUT "Data file name => ", zz$
IF zz$ = "" THEN GOTO nameagain
percagain: INPUT "Subsample percentage (1-100) => ", perc
IF perc <= 0 THEN GOTO percagain
IF perc > 100 THEN GOTO percagain
perc = perc / 100!
tolagain: INPUT "Reconstruction tolerance (.1) => ", tol
IF tol <= 0 THEN GOTO tolagain
OPEN zz$ FOR INPUT AS #1
ntotal = 0
DO UNTIL EOF(1)
REDIM PRESERVE yorig(ntotal)
INPUT #1, yorig(ntotal)
ntotal = ntotal + 1
LOOP
CLOSE #1
ymin = yorig(0)
ymax = yorig(0)
FOR i = 1 TO ntotal - 1
IF yorig(i) > ymax THEN ymax = yorig(i)
IF yorig(i) < ymin THEN ymin = yorig(i)
NEXT i
FOR i = 0 TO ntotal - 1
yorig(i) = (yorig(i) - ymin) / (ymax - ymin)
NEXT i
n = CINT(perc * ntotal)
REDIM xx(n - 1), yy(n)
FOR i = 0 TO n - 1
xx(i) = CINT(((ntotal - 1) * i) / (n - 1))
yy(i) = yorig(CINT(((ntotal - 1) * i) / (n - 1)))
NEXT i
yy(n) = yy(0)
ERASE yorig
ntrans = n - 1
bb = xx(n - 1) - xx(0)
REDIM a(ntrans), b(ntrans), c(ntrans), d(ntrans), e(ntrans), f(ntrans)
FOR i = 1 TO ntrans
d(i) = ((yy(i + 1) + yy(i)) / 2) - ((yy(i) + yy(i - 1)) / 2)
a(i) = (xx(i) - xx(i - 1)) / bb
b(i) = 0!
c(i) = (yy(i) - yy(i - 1) - d(i) * (yy(n - 1) - yy(0))) / bb
e(i) = (xx(n - 1) * xx(i - 1) - xx(0) * xx(i)) / bb
f(i) = (xx(n - 1) * yy(i - 1) - xx(0) * yy(i) - d(i) * (xx(n - 1) * yy(0) - xx(0) * yy(n - 1))) / bb
NEXT i
OPEN "input.dat" FOR OUTPUT AS #1
FOR i = 0 TO n - 1
PRINT #1, xx(i), yy(i)
NEXT
CLOSE #1
REDIM yyout(ntotal - 1)
FOR i = 0 TO ntotal - 1
yyout(i) = -999
NEXT i
icnt = 0
again1:
OPEN "input.dat" FOR INPUT AS #1
OPEN "output.dat" FOR OUTPUT AS #2
DO UNTIL EOF(1)
INPUT #1, x, y
FOR k = 1 TO ntrans
IF INKEY$ <> "" THEN END
xnew = a(k) * x + b(k) * y + e(k)
ynew = c(k) * x + d(k) * y + f(k)
PRINT #2, xnew, ynew
IF ABS(xnew - CINT(xnew)) <= tol THEN yyout(CINT(xnew)) = ynew
icnt = icnt + 1
NEXT k
ileft = 0
FOR i = 0 TO ntotal - 1
IF yyout(i) = -999 THEN ileft = ileft + 1
NEXT i
IF ileft = 0 THEN
GOTO done1
END IF
LOOP
CLOSE #1
CLOSE #2
KILL "input.dat"
NAME "output.dat" AS "input.dat"
GOTO again1
done1:
CLOSE #1
CLOSE #2
KILL "input.dat"
KILL "output.dat"
CLS
SCREEN 1
WINDOW (-1, 1.1)-(ntotal, -.1)
PRINT "Total calcs = "; icnt
OPEN "final.out" FOR OUTPUT AS #1
FOR i = 0 TO ntotal - 1
PSET (i, yyout(i))
yyout(i) = yyout(i) * (ymax - ymin) + ymin
PRINT #1, yyout(i)
NEXT i
CLOSE #1
END
Automatic Differential Equation Derivation From Sampled Data For
Simulation Parameter Sensitivity Analysis
Holger M. Jaenisch*
James Cook University, Townsville QLD 4811, Australia
ABSTRACT
An analytical differential equation model from a single simulation input and output data vector is derived. The derived
model is analytically varied (real versus imaginary) to determine Critical, Sensitive, and Key parameters without the use
of Design of Experiments (DOE).
Keywords: Formal Analysis, Data Modeling, VV&A, Equivalence, Consistency, Transfer Function Modeling, Automatic Proofing
1. INTRODUCTION
When a large number of parameters exist for a simulation (greater than 40 input and/or output parameters), the
application of Design of Experiments1 (DOE) becomes unwieldy. Coupling of variables is lost when two-way or three-way
variation of parameters is performed, since more than two or three parameters may be directly coupled. Instead, the
simulation is modeled analytically by a transfer function derivation using only input vector and associated output
vectors, and without access to source code or the compiled program. If more than one example is available, they may be
used as well. By not adjusting all coupled parameters simultaneously, a true estimate of the parameter sensitivity on the
output is not obtained. Instead, frequency space is used as an environment that reduces the parameters to dominant
eigenmode terms and varies them one at a time, by varying the real and imaginary terms of the analytical
model of the input vector and passing it through the analytical transfer function model of the simulation.
2. DATA MODELING
Data Modeling2 is a unique term coined to describe the process of deriving mathematical equations as models from
simulated or measured data. What follows is a specific embodiment that achieves Data Modeling using classical control
and Fourier theory. The final differential equations generated by the process are generically termed Data Models.
Classic transfer functions3-5 are defined using the raw input and output data from a plant (in this case, the simulation of
interest). The Z transform of the input and output data is defined as
X(z) = 1 + a_1 z^-1 + a_2 z^-2 + ... + a_n z^-n   (single simulation input variable, multiple realizations)
Y(z) = b_0 + b_1 z^-1 + b_2 z^-2 + ... + b_n z^-n   (single associated simulation output variable, multiple realizations)   (1)

In this form, the classic transfer function of the simulation is formed by

H(z) = Z(Output)/Z(Input) = Y(z)/X(z) = (b_0 + b_1 z^-1 + b_2 z^-2 + ... + b_n z^-n) / (1 + a_1 z^-1 + a_2 z^-2 + ... + a_n z^-n)   (2)
where the input vector X(z) defines the characteristic polynomial of the simulation, and whose roots (poles) and their
placement on a root-locus plot define the sensitivity of the simulation. This represents the TF for a single input variable
with multiple realizations resulting in a single output variable of multiple associated realizations. For multiple simulation
input/output variable cases, the transfer function is
[H(z)] = | H_11(z)  H_12(z)  ...  H_1K(z) |
         | H_21(z)  H_22(z)  ...  H_2K(z) |
         |   ...      ...    ...    ...   |
         | H_L1(z)  H_L2(z)  ...  H_LK(z) |   (3)
*[email protected]; phone 1 256 337-3768
where each resultant output variable is a linear combination of all input variables. The transfer function from input
variable j to output variable i is Hij(z). This matrix captures the influence of each individual input variable element to all
output variables.
This process can be greatly simplified by adopting a unique and novel interpretation or perspective of definitions. By
taking a vector of multiple simultaneous input variables and an equal length vector of multiple simultaneous output
variables (MIMO) and transposing them (transposing column vectors to rows), the result is simplified to an equivalent
single input/output system (SISO) where the multiple realizations of the single input variable are actually the single
simultaneous input variables. This MISO variable transfiguration (transpose followed by new interpretation of resulting
vector) is comprised of 8 simulation describing variable inputs and 2 computed output variables (from a kinetic impact
debris simulation6). The vector of simulation variable values is formed as follows:
a_i = [Area(m^2), Time(sec), RelSigVal, FOV(deg), Wavelength(µm), Aperture(m), Mass(kg), IntToTgtMassScale(s)]
    = [0.38, 7.36, 0.41, 155.27, 9.51, 0.11, 337.14, 71.69]          (N inputs, single realization)

b_i = [NFrags, MinSize(mm), Padding, Padding, Padding, Padding, Padding, Padding]
    = [345, 7.11, 1, 1, 1, 1, 1, 1]                                  (N outputs, single realization)

i = [1 2 3 4 5 6 7 8]

ℑa = [Area Time RelSigVal FOV Wavelength Aperture Mass IntToTgtMassScale]
ℑa = [0.38 7.36 0.41 155.27 9.51 0.11 337.14 71.69]                  (single virtual input, N realizations)

ℑb = [NFrags MinSize Padding Padding Padding Padding Padding Padding]
ℑb = [345 7.11 1 1 1 1 1 1]                                          (single virtual output, N realizations)   (4)
The justification for using a single realization or Monte Carlo run of data for developing a Simulation model from a TF
is obtained when measuring information content or entropy of the variables. The TF in (4) can be used to model the
simulation. Entropy is calculated by converting all digits into gray codes, and processing the codes as though they were
binary digits according to Shannon. The single realization in (4) has an entropy measure of 184; increasing entropy by
1% requires another 40 cases. The equivalent mean and covariance matrix model (2x2 covariance plus 2 means for the
output) has entropy 40, and requires 532 Monte Carlos to equal the entropy in a single case-derived TF. Figure 1 graphs
entropy as a function of variable number for Kolmogorov-Gabor polynomial, covariance, and TF type simulation
models.
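The entropy bookkeeping can be sketched as follows. This is an assumed interpretation of the text: each value is scaled
to an integer, gray-coded, and the resulting bit stream scored with Shannon's formula; `to_gray`, the scale factor, and
the digit handling are illustrative choices, not the thesis code.

```python
import math

def to_gray(n):
    """Binary-reflected gray code of a non-negative integer."""
    return n ^ (n >> 1)

def bit_entropy(values, scale=100):
    """Total Shannon information (bits) of the gray-coded bit stream of the scaled values."""
    bits = []
    for v in values:
        g = to_gray(int(round(abs(v) * scale)))    # scale to an integer, then gray-code
        bits.extend(int(b) for b in bin(g)[2:])
    n = len(bits)
    p1 = sum(bits) / n                             # fraction of 1-bits in the stream
    h = 0.0
    for p in (p1, 1 - p1):
        if p > 0:
            h -= p * math.log2(p)                  # per-bit Shannon entropy
    return h * n                                   # total bits of information
```

Under this reading, a model with more stored quantities (e.g. a full TF versus a 2x2 covariance plus 2 means) carries a
proportionally larger bit budget, which is the comparison Fig. 1 makes.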
From (2), the differential coefficients a_0, a_1, ..., a_n and b_0, b_1, ..., b_n are obtained by cross-multiplying:

Y(z)(1 + a_1 z^-1 + a_2 z^-2 + ... + a_n z^-n) = (b_0 + b_1 z^-1 + b_2 z^-2 + ... + b_n z^-n) X(z)   (5)
where the a are the original input variable values (the order of the z coefficient becomes the order of the derivative and
is equal to the number of input points) and the b are the original output variable values (again, the order of the z
coefficient equals the order of the derivative and the number of input points).
a_n d^n y(t)/dt^n + a_{n-1} d^{n-1} y(t)/dt^{n-1} + ... + a_1 dy(t)/dt + a_0 y(t)
    = b_n d^n x(t)/dt^n + b_{n-1} d^{n-1} x(t)/dt^{n-1} + ... + b_1 dx(t)/dt + b_0 x(t)   (6)
[Fig. 1 panel: log-log plot of entropy versus number of variables for the Kolmogorov-Gabor polynomial, transfer
function, and covariance models; entropy spans roughly 10 to 10^5 over 1 to 100 variables.]
Fig. 1. Log of Entropy versus Log of number of variables.
By substitution, the detailed debris differential Data Model is obtained. Note the special numerical meaning of the
coefficients in front of the derivatives as being the various initial variable values.
71.69 d^7y/dt^7 + 337.14 d^6y/dt^6 + 0.11 d^5y/dt^5 + 9.51 d^4y/dt^4 + 155.27 d^3y/dt^3 + 0.41 d^2y/dt^2 + 7.36 dy/dt + 0.38 y(t)
    = d^7x/dt^7 + d^6x/dt^6 + d^5x/dt^5 + d^4x/dt^4 + d^3x/dt^3 + d^2x/dt^2 + 7.11 dx/dt + 345 x(t)   (7)
Note that derivative terms, now by substitution, display the variable names they represent from the a and b arrays in (4).
71.69 d^7P6/dt^7 + 337.14 d^6P5/dt^6 + 0.11 d^5P4/dt^5 + 9.51 d^4P3/dt^4 + 155.27 d^3P2/dt^3 + 0.41 d^2P1/dt^2 + 7.36 dMinSize/dt + 0.38 NFrags
    = d^7ScaleCoeff/dt^7 + d^6Mass/dt^6 + d^5Aperture/dt^5 + d^4Wavelength/dt^4 + d^3FOV/dt^3 + d^2RelSigVal/dt^2 + 7.11 dTime/dt + 345 Area   (8)
Next, variable names for the derivative terms are substituted, emphasizing the substantial cross coupling captured.
ScaleCoeff_0 d^7P6/dt^7 + Mass_0 d^6P5/dt^6 + Aperture_0 d^5P4/dt^5 + Wavelength_0 d^4P3/dt^4 + FOV_0 d^3P2/dt^3 + RelSigVal_0 d^2P1/dt^2 + Time_0 dMinSize/dt + Area_0 NFrags
    = P6_0 d^7ScaleCoeff/dt^7 + P5_0 d^6Mass/dt^6 + P4_0 d^5Aperture/dt^5 + P3_0 d^4Wavelength/dt^4 + P2_0 d^3FOV/dt^3 + P1_0 d^2RelSigVal/dt^2 + MinSize_0 dTime/dt + NFrags_0 Area   (9)
To demonstrate the powerful cross coupling in the model, expand only the second order derivative of the inputs as an
example. Equation (10) shows a subset of the derivatives in (9) in order to demonstrate complexity and small variable
influence.
P1_0 d^2RelSigVal/dt^2 + MinSize_0 dTime/dt + NFrags_0 Area
    = P1_0((Time − Area) − (RelSigVal − Time)) + MinSize_0(RelSigVal − Time) + NFrags_0 Area   (10)
As the derivative order of (10) grows (meaning more simulation variables), it becomes unwieldy. This is mitigated by
using a cascade of bi-quadratic term transfer functions1 and using the resulting coupled differential equations for each
transfer function in the cascade. Because the time index represents different variables, differentiation across different
variables produces high-order variable coupling. This functional (a function whose variables are themselves functions)
differential equation is analytically solvable using the well-known Volterra series. Further, the well-known
Kolmogorov-Gabor (KG) discrete form of the Volterra series is the solution to our functional differential equation. This
becomes a nested functional multi-variable polynomial shown as
y_L(x_1, x_2, ..., x_n) = a_0 + Σ_i a_i x_i + Σ_i Σ_j a_ij x_i x_j + Σ_i Σ_j Σ_k a_ijk x_i x_j x_k + ...   (11)

y_L(x_1, x_2, ...) = f(y_1(b_1(y_2(b_2(... y_n(x_i, x_j, x_k) ...)))))        O[y_L(x_1, x_2, ...)] = 3^n   (12)
whose direct solution gives the integral to (9) while mitigating the use of numerical integration solvers. The KG
polynomial uses the Group Method of Data Handling (GMDH) and self organizing methods to parsimoniously solve for
its coefficients using 3rd order building blocks and regression in a nested fashion. This yields a multi-input variable
polynomial for a single simulation output variable that provides a direct simulation output without solving by integration.
It should be noted that the number of examples required for generating a KG polynomial must be greater than the
number of terms. Alternatively, a single-variable GMDH version of the KG polynomial is obtained from all inputs going
into a single variable tuned to the associated padded output.
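As an illustration of the KG form in (11), a truncated second-order KG polynomial can be fit by ordinary least squares.
This is a sketch only: the thesis uses GMDH self-organization with nested third-order building blocks, not a one-shot
regression, and the helper names are mine.

```python
import itertools
import numpy as np

def kg_features(X):
    """Constant, linear, and pairwise-product terms of a 2nd-order KG polynomial."""
    n, m = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(m)]
    cols += [X[:, i] * X[:, j]
             for i, j in itertools.combinations_with_replacement(range(m), 2)]
    return np.column_stack(cols)

def kg_fit(X, y):
    """Least-squares coefficients a_0, a_i, a_ij of the truncated KG polynomial."""
    return np.linalg.lstsq(kg_features(X), y, rcond=None)[0]

def kg_eval(X, coef):
    """Direct polynomial evaluation: a simulation output without numerical integration."""
    return kg_features(X) @ coef
```

The example-count caveat above shows up directly here: `kg_fit` needs at least as many rows in `X` as `kg_features`
produces columns for the fit to be determined.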
This KG Data Model formulation is still somewhat cumbersome and does not lend itself well to simultaneous multiple
output variables or to periodic behavior modeling. Therefore, trigonometric basis functions in the form of Fourier series
are used as shown below, making Data Modeling in frequency space possible using the Fast Fourier Transform (FFT).
This is easily accomplished using the classic z transform defined for z = e^(iω), which is the same as the discrete Fourier
transform (DFT) defined by
C_j = (1/n) Σ_k x_k exp(i 2π j k / n)   (13)
Therefore, the TF is formed in frequency space using the fast Fourier transform (FFT) to determine the Fourier series of
the input and output sequences. Classically, these two Fourier series would be combined into a TF
y(x) = Σ_{j=1}^{n} (A_j cos ω_j x + B_j sin ω_j x)        TF = FFT(Output) / FFT(Input)   (14)
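The frequency-space TF of (14) reduces to an element-wise ratio of FFTs. A NumPy sketch follows; the small floor on
the denominator is my assumption to avoid division by zero at empty bins.

```python
import numpy as np

def transfer_function(inp, out, eps=1e-12):
    """TF = FFT(Output)/FFT(Input), element-wise over the frequency bins."""
    X = np.fft.fft(np.asarray(inp, float))
    Y = np.fft.fft(np.asarray(out, float))
    return Y / np.where(np.abs(X) > eps, X, eps)

def apply_tf(inp, tf):
    """Convolve a (possibly varied) input with the simulation model in frequency space."""
    return np.real(np.fft.ifft(np.fft.fft(np.asarray(inp, float)) * tf))
```

`apply_tf` is the operation used later for parameter mapping: a varied input model is multiplied by the TF and inverse
transformed to obtain the simulation-equivalent output.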
The Fourier series is a lengthy representation that includes many alias terms due to frequency folding. A simplified form
identifies the dominant frequency components or eigenmodes, labeling them as lambda terms after zeroing all other
frequency locations.
y(x) ≅ f(λ_1, λ_2, ..., λ_k) = Σ_{j=i}^{N} λ_j (A_j cos(2πj/N) + B_j sin(2πj/N)),   λ_j = 1 if dominant, λ_j = 0 otherwise

λ_j = A_j cos ω_j x + B_j sin ω_j x   (15)
Eigenfunction extraction uses a variation of Wiener filtering to identify significant peaks in the dB power spectrum.
These peaks correspond to dominant sine and cosine terms occurring in the power spectrum. Once the FFT is generated,
the power spectrum of the data is calculated, which is the sum of the squares of the real and imaginary terms in (1), and
is then transformed into decibels (dB).
P_i = Re(A_i)^2 + Im(B_i)^2        P_i(dB) = 10 log_10(P_i / max(P))   (16)
Least squares regression is used to fit a straight line to the entire dB power spectrum of the form
y = mx + b   (17)
where m is the slope of the line and b is the y-intercept. It should be noted that the zero-order term is not used in the
fitting and detrending process; however, it is maintained in the final result. The power spectrum is detrended and squared
(or raised to higher powers to reduce the number of modes passing the detrend, or lowered to 1 to pass more modes). A
second linear regression fit is performed, where the y-axis of the data is the detrended power spectrum and the x-axis
is simply an index variable. This y-intercept is the noise threshold. Locations where the detrended power spectrum
peaks are greater than the threshold pass the filter unchanged in value from the original spectrum. Adjacent peaks are
combined, leaving only the maximum value (peak) of the group. The truncated components are normalized to the
minimum and maximum of the full spectrum, thereby conserving entropy lost during peak pruning. This process is
repeated until convergence and avoids the matrix conditioning requirements associated with Principal Component
Analysis (PCA). The full process is described graphically in Figure 2.
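The eigenmode-extraction steps (fit a line to the dB spectrum, detrend and raise to a power, re-fit, threshold at the
second y-intercept, keep super-threshold peaks) can be sketched as below. Assumptions: a single pass, with
adjacent-peak merging and the renormalization step omitted.

```python
import numpy as np

def dominant_modes(power_db, order=2):
    """Return indices of spectrum bins that survive the modified Wiener filter."""
    p = np.asarray(power_db, float)[1:]           # zero-order term excluded from fitting
    idx = np.arange(1, len(power_db), dtype=float)
    m, b = np.polyfit(idx, p, 1)                  # first linear fit (17): y = m*x + b
    detrended = (p - (m * idx + b)) ** order      # detrend, raise to `order`
    m2, b2 = np.polyfit(idx, detrended, 1)        # second fit on the detrended spectrum
    return [int(i) for i in idx[detrended > b2]]  # y-intercept b2 = noise threshold
```

Raising `order` above 2 suppresses more modes, and `order=1` passes more, matching the text's tuning knob.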
Now that an eigenmode model exists, an important excursion can be made. If such simulation models are derived from
I/O data sets representing 2 different simulations to be validated or a simulation and measured data to be calibrated, both
can be accomplished by calculating Equivalence and Consistency between the structure of the 2 eigenmodels and their
coefficient values.
Fig. 2. Modified Wiener filtering process for identifying values above noise floor in power spectrum. Left graph illustrates log-log
power spectrum with linear regression fit used for detrending. Center graph shows the detrended power spectrum (order=2) and its fit.
Right graph shows final peaks retained (zero order maintained at original value throughout process and removed only for graphing).
TF models are now generated for both simulation and test flight or measured data, and their eigenfunction
representations (if chosen) determined. The eigenfunction TF models are compared for Equivalence and Consistency to
characterize the similarities and differences between the simulation and test flight data, with eigenfunction mode
locations and amplitudes compared to determine how alike or different the two models are. Equivalence is determined by
calculating the correlation between the magnitude of each of the TF models, and given by
( )( )( ) ( )∑
= −−
−−=
N
jjj
jj
Model2Model2Model1Model1
Model2Model2Model1Model1eEquivalenc
122
100 (18)
where Model1 and Model2 are the magnitudes of the TF models for the simulation and measured data given by
Model1_j = sqrt(A_j^2 + B_j^2)   (19)
and Model1 and Model2 in (18) are the means of the result of (19).
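Equation (18) is a percent-scaled Pearson correlation between the two magnitude spectra; in NumPy (a sketch):

```python
import numpy as np

def magnitudes(A, B):
    """Eq. (19): magnitude of each complex TF coefficient."""
    return np.sqrt(np.asarray(A, float) ** 2 + np.asarray(B, float) ** 2)

def equivalence(model1, model2):
    """Eq. (18): 100 * correlation of the two TF magnitude spectra."""
    m1 = np.asarray(model1, float) - np.mean(model1)
    m2 = np.asarray(model2, float) - np.mean(model2)
    return 100.0 * np.sum(m1 * m2) / np.sqrt(np.sum(m1 ** 2) * np.sum(m2 ** 2))
```

Two identical (or proportional) spectra score 100, matching the first scenario in Fig. 3.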
Consistency between Model1 and Model2 is calculated by first subtracting the mean from each frequency vector exactly
as shown in (18). Next, each vector is multiplied by a Mask of zeros and ones, with ones in the mask vector at the
location of an eigenfunction mode, and a zero at all other points. This mask converts the full model into the
eigenfunction model, and is given by
Mask1_j = 0 if Model1_j = 0;  Mask1_j = 1 if Model1_j ≠ 0
Mask2_j = 0 if Model2_j = 0;  Mask2_j = 1 if Model2_j ≠ 0   (20)

Mean subtraction and mask multiplication are given in equation form as

Model1_j = (Model1_j − mean(Model1)) Mask1_j Mask2_j
Model2_j = (Model2_j − mean(Model2)) Mask1_j Mask2_j   (21)
Next, each frequency term is converted into a Z-score using the standard deviation of the two (Model1 and Model2)
frequency terms at the jth location from (21), appended with a zero to make a total of three points, resulting in

Model1_j = Model1_j / σ_j        Model2_j = Model2_j / σ_j   (22)

Consistency = Σ_{j=1}^{N} Mask1_j Mask2_j 2(1 − Norm(d_j)) / (Mask1 ∘ Mask2),    d_j = Model2_j − Model1_j   (23)
where Norm in (23) returns a normal probability on a distribution of mean zero and standard deviation of one, and the
argument dj for the probability calculation is given as shown. Figure 3 shows Equivalency and Consistency analyses. The
number of inputs for the first data set is 10 and for the second set 4. Three different scenarios are given. In the 1st,
input/output variable examples from two different Monte Carlo runs of the same simulation are shown; in the 2nd two
different fidelity versions of the same simulation; and in the 3rd the same simulation is running unrelated scenarios.
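The Consistency pipeline of (20)-(23) can be sketched the same way. Two assumptions are flagged here: "Norm" is
taken to be the standard normal CDF (so 2(1 − Norm(|d|)) is a two-sided tail probability), and a factor of 100 is applied
so identical models score 100%, as in Fig. 3.

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(d):
    """Standard normal CDF (assumed meaning of 'Norm' in (23))."""
    return 0.5 * (1.0 + erf(d / sqrt(2.0)))

def consistency(model1, model2):
    """Eqs. (20)-(23): masked, z-scored spectra compared via normal tail probabilities."""
    m1, m2 = np.asarray(model1, float), np.asarray(model2, float)
    mask1, mask2 = (m1 != 0).astype(float), (m2 != 0).astype(float)   # (20)
    z1 = (m1 - m1.mean()) * mask1 * mask2                             # (21)
    z2 = (m2 - m2.mean()) * mask1 * mask2
    total = 0.0
    for w, a, b in zip(mask1 * mask2, z1, z2):
        sigma = float(np.std([a, b, 0.0]))        # (22): std of the pair plus an appended zero
        d = (b - a) / sigma if sigma > 0 else 0.0
        total += w * 2.0 * (1.0 - norm_cdf(abs(d)))
    overlap = float(mask1 @ mask2)                # denominator of (23): mask dot product
    return 100.0 * total / overlap if overlap else 0.0
```

When the two eigenmode sets coincide bin for bin, every surviving bin contributes 2(1 − Norm(0)) = 1 and the score
is 100%; disjoint mode sets drive it toward 0%, consistent with the three scenarios in Fig. 3.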
The Eigenmode Transfer Function (TF) is formed from lambda terms
TF(λ) = λ(FFT(Output)) / λ(FFT(Input))   (24)

f(y_1(b_1(y_2(b_2(... y_n(x_i, x_j, x_k) ...))))) = f(λ_1, λ_2, ..., λ_k)   (25)
where the KG polynomial model in (12) is replaced with an equivalent eigenfunction Data Model. If derivative variable
data is modeled with (15), it is simple to integrate analytically forming an integral equation. The integral of (15) is
∫ Σ_{j=1}^{n} (A_j cos ω_j x + B_j sin ω_j x) dx = Σ_{j=1}^{n} [ (A_j/ω_j) sin ω_j x − (B_j/ω_j) cos ω_j x ]   (26)
while the derivative of (15) is simply
dy/dx − Σ_{j=1}^{n} ( −A_j ω_j sin ω_j x + B_j ω_j cos ω_j x ) = 0   (27)
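Because each mode differentiates and integrates in closed form per (26) and (27), a model stored as (A_j, B_j, ω_j)
triples never needs a numerical solver. A sketch of the term-wise operations:

```python
import math

def fourier_eval(modes, x):
    """y(x) = sum_j A_j cos(w_j x) + B_j sin(w_j x), per (15)."""
    return sum(A * math.cos(w * x) + B * math.sin(w * x) for A, B, w in modes)

def fourier_derivative(modes):
    """Term-wise derivative (27): A cos -> -A*w sin, B sin -> B*w cos."""
    return [(B * w, -A * w, w) for A, B, w in modes]

def fourier_integral(modes):
    """Term-wise antiderivative (26): (A/w) sin - (B/w) cos (constant of integration dropped)."""
    return [(-B / w, A / w, w) for A, B, w in modes]
```

Differentiating the integral of a mode list returns the original list, which is the analytic invertibility the text relies on.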
[Fig. 3 panels, three scenarios, each showing Input 1 (10 inputs), Input 2 (4 inputs), Outputs 1&2, and their common
eigenfunction modes: Equivalency = 100%, Consistency = 100%; Equivalency = 50%, Consistency = 42%;
Equivalency = 0%, Consistency = 0%.]
Fig. 3. Comparison of Equivalency and Consistency.
Because the Fourier series is orthogonal, extraction of only the dominant terms to form the differential equation model
does not destroy the underlying shape of the variable value curve. However, this is not the case in the time domain, where
the removal of a single power term of a polynomial gives a new polynomial, thereby completely changing the shape of
the function. Hence, simulation Data Modeling is performed in the frequency domain.
A transfer function model was derived using the input variable vector elements in (4), resulting in E coefficients of the
λTF.
TF(λ) = FFT(b) / FFT(a)   (28)
E = |  0.666    0       0 |
    | -0.692    1.654   1 |
    | -0.726    0.647   2 |
    |  0.196   -0.946   3 |
    | 38.213    0       4 |
    |  0.196    0.946   5 |
    | -0.726   -0.647   6 |
    | -0.692   -1.654   7 |
E is the matrix given in (28). The first column contains the real part of each frequency term, the second column the
imaginary part (including its conjugate), and the third column the frequency index. This is used to create the derivative of
the TF given by the form in (29). Substituting (28) into (29) and removing the conjugate terms results in (30) (for
simplicity of display; they must be replaced to use the expression with an inverse FFT).
dTF/dx − (4π/8) Σ_{j=1}^{nterms−1} [ −E_{j,0} sin(2π E_{j,2} i / 8) + E_{j,1} cos(2π E_{j,2} i / 8) ] E_{j,2} = 0   (29)
which is simply
dTF/dx − (0.77 sin(0.7854 x) + 1.61 sin(1.5708 x) − 0.65 sin(2.3562 x) − 152.85 sin(3.1416 x)
    + 1.84 cos(0.7854 x) + 1.44 cos(1.5708 x) − 3.15 cos(2.3562 x)) = 0   (30)
Next, the simulation equivalent Data Model is used for Critical and Sensitive Parameter analysis.
3. CRITICAL/SENSITIVE/KEY PARAMETER IDENTIFICATION
• Critical parameters are simulation input variable vector elements for which the smallest change in value
causes the largest change in range of one or more simulation output variable vector elements.
• Sensitive parameters are simulation input variable vector elements for which the smallest change in value
causes the largest change in output simulation variable variance across all output variable vector
elements, regardless of their range.
• Key parameters are the intersection of the truncated critical and sensitive parameter lists.
This unique method for critical, sensitive, and key parameter estimation can mitigate the need for Design of Experiments
(DOE) with large numbers of (*.exe) inputs. This is only possible with analytical models. Variations of variables are
applied to the input model and changes in output variables measured when convolved with the simulation model,
represented in frequency space by the transfer function. This variation is always 2 way, regardless of number of
variables. The benefit of variable variation in frequency space is that the nonlinear coupling between variables in the
simulation is maintained without the explicit need to account for it using statistical methods, because modifying one
frequency space mode of the input model affects all input and output values simultaneously. Variable variation is not
applied to the transfer function, because any change in the modes of this function is a change in the system model, rather
than variations of the system model. This is consistent with control theory, which indicates that the input z transformed
denominator of the system transfer function defines the simulation characteristic polynomial, whose roots on the root-
locus plot indicate system stability, and hence parameter sensitivity.
One can use eigenmodels or the full system model TF. For this debris model, the TF consists of a total of 8 modes as
shown in (4) and (28), and is short enough to not require eigenmodeling. The TF for the baseline (example 67) is
TF = 0.66 − 0.69 cos(0.25πi) + 1.65 sin(0.25πi) − 0.73 cos(0.5πi) + 0.65 sin(0.5πi) + 0.20 cos(0.75πi) − 0.95 sin(0.75πi)
     + 38.2 cos(πi) + 0.20 cos(1.25πi) + 0.95 sin(1.25πi) − 0.73 cos(1.5πi) − 0.65 sin(1.5πi) − 0.69 cos(1.75πi) − 1.65 sin(1.75πi)   (31)
The input model is equal in length and terms to the TF, but is changed into a notch or single band-pass filter applied to
the TF. This is done by zeroing all terms in the input model except the zero order, a single mode using 2 way variation,
and the complex conjugate of the term (3 non-zero terms total). For even number of points, the final mode before the
complex conjugate is real only and is also included and held constant along with the zero order. For odd number of
points, this mode does not exist. The real and imaginary values for the single mode are allowed to vary from –2 times to
+2 times their original mode value when only one example is available. When multiple examples are available, the initial
boundaries are set using the global minimum and maximum for each mode calculated from all the cases, and are modified
so that the maximum distance from the current case to the boundary is maintained on both sides of the current case. This ensures
that the current case is centered for all modes in the map. This variation establishes minimum and maximum margins for
constructing a parameter map for further analysis, and ensures that the original mode values are recoverable from the
center of each map. These margins are only a starting point for the analysis. The parameter maps must be observed to
determine if all of the structure in the parameter space is being observed. If not, the automatic scale is increased by the
analyst until no new structure is being introduced into the map at the boundaries, or until identified boundary conditions
yield invalid equivalent inputs.
Also, when multiple cases are available, the magnitude of the dot product between the real and imaginary components of
the modes is calculated. The dot products are then sorted and the example corresponding to the highest dot product
extracted as the baseline example from which the TF models will be built. For the debris model, this was example 67.
At each grid location, the change in input and output variable values from baseline (due to the single varied input mode causing changes in the model output values in response) is obtained by summing the differences between the varied and original input and output variable values, given by
diffIn = Σ(i=0..N−1) (NewIn_i − BaseIn_i),   diffOut = Σ(i=0..N−1) (NewOut_i − BaseOut_i)   (32)
These values are rollups across all 8 input and output variable vector indexes for the current complex point. Once these
changes are calculated, the ratio of the maximum output variable change to minimum input variable change is calculated
OutputChange / InputChange = diffOut / diffIn   (33)
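As a sketch of the rollup in (32) and the ratio in (33), assuming the varied and baseline input/output variable vectors are held as NumPy arrays (all names here are illustrative, not from the original MathCAD document):

```python
import numpy as np

def change_ratio(new_in, base_in, new_out, base_out):
    """Roll up the input and output changes off baseline (32) and form
    the output-change over input-change ratio (33)."""
    diff_in = np.sum(new_in - base_in)    # summed input change
    diff_out = np.sum(new_out - base_out) # summed output change
    return diff_out / diff_in
```

This ratio is evaluated once per (real, imaginary) grid location of the varied mode to fill in the parameter map.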
Once the parameter map (µ map) is generated for the first mode, the process is repeated by allowing the next single mode of the input model and its complex conjugate to change values. This is done for all modes (3 total in our example), excluding the final real-valued-only mode and the zero order. As before, the values of the real and imaginary components
of the current mode are varied from either –2 times to +2 times their original mode value or between the boundaries set
from the global minimum and maximum of the available cases or those increased by the analyst. These maps are
generated in equal size increments, enabling them to be combined independent of scale in their x and y axes. Figure 4
(left) shows the result of multiplying the magnitude of the logarithm of each of the 3 single mode based independent
maps created processing the baseline example from the debris model.
FinalMap_j = Π(i=1..3) |log(Map_{i,j})|   (34)
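The map combination in (34) might be sketched as follows; taking the absolute value before the logarithm is an assumption here, since the change ratios can be negative:

```python
import numpy as np

def combine_maps(maps):
    """Combine the single-mode maps by multiplying the magnitude of the
    logarithm of each (34); `maps` is a list of equally sized 2-D arrays."""
    final = np.ones_like(maps[0])
    for m in maps:
        final *= np.abs(np.log(np.abs(m)))  # inner abs: ratios may be negative
    return final
```

Because the maps are generated in equal-size increments, they can be multiplied element-wise regardless of the scale of their x and y axes.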
Fig. 4. Critical parameter location identification using Maximum Output Change over Minimum Input Change (34) (left), and
sensitive parameter location identification using variance instead of difference (right). White dot in center is baseline, lines mark the
maximum valued red dot, with 5 maximum cases numbered. Coordinates are marked for each mode with mode 1 closest to the graph.
For sensitive parameter identification, the approach is the same as for critical parameters, except that the difference in
standard deviation of the output to baseline and the standard deviation of the input are used instead of difference in input
and output value directly. This is shown in (35) as
diffIn = StdDev(NewIn_i − BaseIn_i),   diffOut = StdDev(NewOut) − StdDev(BaseOut)   (35)
and results in the map on the right in Figure 4 by again multiplying the log magnitude of each of the 3 single mode based
maps. For each of these maps, the maximum value and its corresponding location (x,y) or (real, imaginary) in each is
determined. This maximum point is the location of the maximum output change that corresponds to the minimum input
change (34), and defines the critical parameter test case (left) and sensitive parameter test case (right).
Once the maximum grid location is found, a new full mode input model is constructed using the amplitudes
corresponding to each mode of fixed grid location of maximum value. All terms are recovered using the minimum and
maximum range value for each map stacked to yield the ones in Figure 4 and knowledge of the maximum location. This
allows for direct calculation of the values. The zero order and the final real-only mode are copied over directly from the original input model. The identified varied input model is then inverse Fourier transformed to return it from frequency space to its original space.
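The recovery step can be sketched with NumPy's real-input FFT routines; the coefficient values below are illustrative placeholders, not the identified model:

```python
import numpy as np

# Hypothetical identified input model in rfft layout (DC ... Nyquist);
# conjugate modes are implied by the layout rather than stored.
N = 8
coeffs = np.zeros(N // 2 + 1, dtype=complex)
coeffs[0] = 0.66           # zero-order term, copied from the original model
coeffs[1] = -0.69 + 1.65j  # the single varied mode at the map maximum
coeffs[4] = 38.2           # final real-only mode (even N), held constant

varied_input = np.fft.irfft(coeffs, n=N)  # back to the original space
```

For an even number of points the last rfft bin is the real-only Nyquist mode, matching the convention described above.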
Critical, sensitive, and key parameter analysis is applied to both the single and multiple example cases from the debris model, varying each mode in the input model one at a time and generating ranked critical, sensitive, and key parameter lists of
the debris model inputs. Table 1 shows the results obtained using example 67 (found to have the maximum dot product
as described previously in this work) as the baseline.
Table 1. Critical, Sensitive, and Key parameter ranking of inputs.

Rank | Critical (Single Max Output Value) | Effect | Sensitive (Max Output Variance Change) | Effect | Key (Max Magnitude Output Change) | Effect
1 | Interceptor Target Mass Ratio | 25.9 x | Target Area | 295.8 x | Time to Impact | 18.6 x
2 | Relative Signature Value | 6.6 x | Wavelength | 206.5 x | Sensor Aperture | 13.8 x
3 | Time to Impact | 6.3 x | Target Mass | 173.0 x | Relative Signature Value | 12.2 x
4 | Sensor Aperture | 5.3 x | Field of View | 135.3 x | Wavelength | 11.8 x
5 | Wavelength | 1.5 x | Time to Impact | 76.5 x | Target Area | 11.7 x
6 | Field of View | 1.3 x | Sensor Aperture | 67.1 x | Field of View | 6.8 x
7 | Target Area | 1.0 x | Relative Signature Value | 47.5 x | Target Mass | 6.7 x
8 | Target Mass | 1.0 x | Interceptor Target Mass Ratio | 1.0 x | Interceptor Target Mass Ratio | 1.0 x
For identifying the ranked critical parameter list, the ranking is done relative to the maximum output variable change.
Therefore, only knowledge of the input change is required off of the baseline vector (which is assumed to be zero). These
inputs are sorted into ascending order (from most critical to least critical). The effect is calculated by taking the
reciprocal of the ranking divided by the reciprocal of the least critical ranking. For identifying the ranked sensitive
parameter list, each input vector element is tested one at a time by zeroing the other input variables. The variance of the
resulting output is calculated, and the inputs ranked by sorting the variances into descending order. The effect is
calculated by dividing the ranking of each by the ranking of the least sensitive parameter. For identifying the ranked key
parameter list, the critical effect is multiplied by the sensitive effect for each variable, and then sorted into descending
order. A MathCAD document of this technique is provided in the Appendix, along with a short description.
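One reading of the effect arithmetic above, sketched in Python (the Appendix supplies the MathCAD implementation; here the normalization against the least critical and least sensitive entries is inferred from Table 1, where both bottom effects are 1.0 x):

```python
import numpy as np

def effects(input_changes, output_variances):
    """Critical effect: reciprocal of each input's change value divided by
    the reciprocal of the least critical (largest) change value.
    Sensitive effect: each output variance divided by the least sensitive
    (smallest) variance.  Key effect: their product."""
    critical = (1.0 / input_changes) / (1.0 / input_changes.max())
    sensitive = output_variances / output_variances.min()
    key = critical * sensitive
    return critical, sensitive, key
```

Sorting `key` into descending order then yields the ranked key parameter list.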
When designing an experiment, first choose the number of variables that can be tested. Next, specify the total number of
tests that can be done. As an example, specify 2 variables to test per case and only 5 test cases. Therefore, the top 2 key
parameters from the analysis are the experiment test case parameters. To identify calibration or anchor test cases, map
values and corresponding locations are raster scanned out and sorted by value into descending order given by
SortOut_{i+jN} = SortRaster(Map_{i,j}),   N = number of columns   (36)
These are the ranked validation cases (traditionally representing Latin hypercube axes for DOE). These locations are
used to reconstruct the full eigenmodes for the test cases, which are then inverse Fourier transformed as previously
described. The list is processed until 5 valid anchor test cases of unique key parameters are identified. This is done by
performing the full critical/sensitive/key parameter analysis above for each case in descending order. Figure 4 identifies
by number 1-5 on each map the 5 anchor test cases.
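The raster scan and descending sort of (36) might look like this, returning ranked (value, row, column) candidates for anchor test cases:

```python
import numpy as np

def ranked_anchor_locations(final_map):
    """Raster scan the combined map and sort descending by value (36),
    returning (value, row, column) tuples as ranked anchor candidates."""
    _, n_cols = final_map.shape
    flat = final_map.ravel()        # raster scan across columns
    order = np.argsort(flat)[::-1]  # descending map value
    return [(flat[k], k // n_cols, k % n_cols) for k in order]
```

The returned list would be processed in order, performing the full critical/sensitive/key analysis on each location until the required number of valid anchor cases is found.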
REFERENCES
1. Jones, R. B., Symbolic Simulation Methods for Industrial Formal Verification, Boston, MA: Kluwer Academic
Publications, 2002.
2. Jaenisch, H.M., Algorithms for Autonomous Unmanned Deep Space Probes, D.Sc. Thesis, James Cook Univ., 2006.
3. Rosko, Joseph S., Digital Simulation of Physical Systems, Reading, MA: Addison-Wesley, 1972.
4. Barnett, S. & Cameron, R.G., Introduction to Mathematical Control Theory, 2nd ed., Oxford: Clarendon, 1993.
5. Doyle, F.J., Pearson, R.K., Ogunnaike, B. A., Identification and Control Using Volterra Models, London: Springer-
Verlag, 2002.
6. Jaenisch, H.M., Handley, J.W., Hicklen, M.L., “Data Model Predictive Control as a New Mathematical Framework
for Simulation and VV&A”, SPIE Defense and Security Symposium, Orlando, FL, April 2006.
APPENDIX (Critical/Sensitive/Key Parameter Estimation (MathCAD 12))
Description
The MathCAD 12 document contained in the Appendix takes an n input and 2 output data file. This file is read into the
variable z at the top of the first page of the document. Variables are in columns in the file, with the 2 output columns
appended as the final 2 columns. Monte Carlos (multiple examples) are accommodated in the input file, with each on an
individual line. This program can also accommodate simulation runs where the inputs do not vary, but the outputs vary
due to Monte Carlos. This program analyzes the dot product from the examples and chooses one as the baseline (ic),
processes that example using the single mode technique described in this work, and returns rankings both in the ratio
given in (33) and in numerical score (1, 2, 3, …). In order to increase the size of the parameter map space, multiply both
occurrences of jval on the left hand side of the equation by a number greater than 1 (for our examples, this was 6 and 24).
This document is currently set up to extract the maximum point and perform the full critical/sensitive/key parameter
identification described in this paper. In order to extract multiple key parameters per test case (cases 1 to 5 as shown in
Figure 4), an outer loop can be supplied around this document. Otherwise, it must be run manually once for each of the
5 numbered locations in Figure 4.
Automatic Differential Equation Data Modeling for UAV Situational Awareness
Holger Jaenisch, Ph.D. and James Handley
Sparta, Inc.
4901 Corporate Drive, Suite 102
Huntsville, AL 35805
email: [email protected]
KEYWORDS
Data Modeling, UAV, differential equations, situational
awareness, parameter estimation
ABSTRACT
This paper demonstrates the ability to
autonomously make assessments of the physical
characteristics of a monitored object only from
knowledge of its motion in the x-y plane.
Traditionally, the process of state estimation from a
trajectory estimate has been in the realm of the Kalman
filter. Unfortunately, the Kalman filter and Extended
Kalman filter require a priori defined models of the
dynamics of the object being observed. This entails the
existence of a physics based partial differential equation
that describes the equations of motion in the filter to
enable predictions of state for comparison to be made.
INTRODUCTION
With the advent of Data Modeling theory (Jaenisch
et al. 2003d), it becomes possible to introduce a
significant amount of intelligence into very small
footprint UAVs. This results in the ability for a UAV to
report salient information across an ad hoc network
rather than overwhelming available bandwidth with
unprocessed data.
As an alternative to reporting full up data streams, it
is possible to create real-time Data Models
(mathematical equations) of sensor measurements by
transmitting only sparsely sub-sampled sensor
values, and reconstructing via Data Modeling fractal
interpolation the full length high fidelity time series on
demand at the receiver. This algorithm and results have
been published in previous work given in the
References (Jaenisch and Handley 2003).
It is also possible to monitor the status of onboard
sensors to determine calibration and operational
characteristics using the Data Modeling theory of
change detection for situational awareness (Jaenisch
et al. 2003c). This algorithm and approach has also
been published in previous work given in the
References (Jaenisch et al. 2003b).
Data Modeling has also been used to compress
images via sub-sampling to a thumbnail size, and the
thumbnail then readily transmitted and reconstructed on
the receive end. Applications include IDM enabled
modems transmitting streaming live video at slow 56-
kilobyte rates.
The use of Data Driven Differential Equation Data
Modeling enables a differential equation to be
constructed from measured data alone enabling physics
based aerodynamics modeling with limited
computational resources. This enables estimates of the
object’s mass, equivalent area, and probable control
feedback loop parameters to be determined in a least
mean squared error (LMSE) fashion.
DATA MODELING THEORY
The Data Model is the smallest amount of
information required to functionally model empirical
dynamics. This model can take the form of a group of
real numbers, a single equation, or a network of
equations. The general philosophy of Data Modeling is
Data “Blank” Modeling, where the “Blank” takes on
the descriptive term Sequence, Relations, or Change.
The “Blank” depends on the type of construct modeled.
Data Sequence Modeling uses fractal methods to
capture the self-similarity and scaling dynamics of
sequences of data within a data series. Data Relations
Modeling are equations that capture the nonlinear
relationships between different data series. Data
Relations Models are tuned to be sensitive to changes
of input conditions enabling system change detection.
This is called Data Change Modeling. Other constructs
may be Data Modeled as well; this list is illustrative
rather than exhaustive.
DATA SEQUENCE MODELING
Data sets (time series, image, or other vectors of
information) can be reconstructed from a sparse sub-
sampling (Jaenisch and Handley 2003). Sparse sub-
sampling is achieved by naive decimation, and the
sparsely sub-sampled points are the Data Model of the
original data set. Because they are demonstrated to be
the support vector
α_i [ y_i (w·x_i + b) − 1 ] = 0   (1)
of the set, the sample size need never exceed 10% of
the original number of points, akin to reconstructing a
musical score by saving only every 10th
note. Data
Sequence Modeling is based on a unique fractal
operator defined as
D_f{y_i} ≡ [(y_n − y_{n−1}) − (1/N)(y_N − y_0)] x_{N−i} + y_{n−1} + (y_{n+1} − y_n)(y_i − y_0)/(y_1 − y_0)   (2)
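A minimal sketch of the decimation side of Data Sequence Modeling; plain linear interpolation stands in for the fractal operator (2), which would restore self-similar detail between the retained samples rather than straight lines:

```python
import numpy as np

def decimate(series, keep_every=10):
    """Naive decimation: the retained sub-samples are the Data Model."""
    return series[::keep_every]

def reconstruct(samples, n_points, keep_every=10):
    """Inflate the sub-samples back to full length; linear interpolation
    is a stand-in for the fractal operator (2)."""
    xs = np.arange(0, n_points, keep_every)[: len(samples)]
    return np.interp(np.arange(n_points), xs, samples)
```

At a 10:1 keep rate the Data Model never exceeds 10% of the original point count, as claimed above.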
DATA RELATIONS MODELING
Data Relations Modeling is the process of finding a multivariate expression that maps given finite sample values of the independent variables into associated values of the dependent variables of the process. This
process involves approximating the Kolmogorov-
Gabor polynomial
φ = a_0 + Σ_i a_i x_i + Σ_i Σ_j a_ij x_i x_j + Σ_i Σ_j Σ_k a_ijk x_i x_j x_k + ⋯   (3)
using nested functions of the form
x(t) = f(t, x(b_1(x(b_2(⋯x(b_n(t))⋯)))))   (4)
Data Modeling differs from conventional methods such
as linear, quadratic, and polynomial regression
techniques that only solve for the numerical coefficients
for a user specified function. Data Models discover the
form of the best function adaptively and autonomously.
The Data Modeling process of discovering automatic
functions is an evolutionary/genetic directed
approach where the set elements consist entirely of
simple polynomial building blocks. Data Modeling
also differs from genetic algorithms and symbolic
regression because it incorporates fractal
interpolation functions for achieving mathematical
modeling of complex multivariable processes (Jaenisch
et al. 2003a).
IMAGE COMPRESSION
In Fig. 1, the image on the left is that collected and
telemetered by the UAV. If the whole image was only
sampled at every 10th point (10:1 compression), the
thumbnail superimposed at the bottom center between
the two images is all that would be telemetered down.
This thumbnail was inflated on the receive side back up
to its original size using a fractal operator as shown by
the image on the right of Fig. 1. It has sufficient
resolution to be considered “lossless” by image feature
extraction and decision algorithms.
CALIBRATION ON DEMAND
Change detection type Data Models have been
created and reported in the literature for many
applications including network monitoring for
information based attacks. Unique to these change
detection Data Models is the ability to discern nominal
and abnormal behavior without ever having been
exposed to the latter, thereby achieving external
situational awareness without a priori assumptions.
A particularly exciting example was reported in
recent literature showing where only 400 examples of
nominal information flow were needed out of a
possible 4 million training records to accurately
discriminate between nominal and 14 different types of
attack behavior. Yet not a single example of attack
was used for training. This unique ability to detect that
which is different makes change detection models
suitable for many roles in the UAV mission (Jaenisch et
al. 2003e).
DYNAMIC DIFFERENTIAL EQUATION
GENERATION ALGORITHM
One future requirement for UAV systems will be to
autonomously observe a moving target and form an
opinion as to its type, size, shape, likely trajectory, and
intent. To enable such an assessment, a critical
intelligence gathering and processing capability is the
system identification and parameter estimation
process. To achieve such estimates autonomously
requires the formulation of the dynamic equations of
motion governing the observed target.
Distinguishing an aircraft from a landcraft is done
using a 100 mph threshold, and landcraft can be further
divided into land and water based craft using a
threshold of 40 mph. Aircraft can be divided into
propeller (100-150 mph), turboprop (150-250 mph), jet
(200-500 mph), and missile (500+ mph).
An object’s maneuvering is characterized by
calculating the change in slope of the trajectory
anchored at the initial point. This change is
characterized in terms of higher order moments
M_i = (1/N) Σ(j=1..N) ((x_j − μ)/σ)^i   (5)
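A sketch of the higher-order moment calculation in (5), assuming a 1/N normalization and the sample mean and standard deviation of the slope-change series:

```python
import numpy as np

def trajectory_moment(x, i):
    """i-th standardized moment of the slope-change series (5),
    using the sample mean and standard deviation of x."""
    mu, sigma = x.mean(), x.std()
    return np.sum(((x - mu) / sigma) ** i) / len(x)
```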
Once the object type has been determined, various
subsets of a priori physics based derivative forcing
functions are available for fitting. If this type of bulk
filter determination is not available, then all classes of
forcing functions are used. Because each class of
forcing function contains many of the same types of
forcing functions, it is not necessary to determine the
class to a great degree of accuracy.
a_Gravity = GM/r²,   a_Drag = 0.5·C_D·ρ·A·v²/m   (7)
Fig. 2 shows the measured example of a
maneuvering object trajectory. The true trajectory is
biased and distorted by the sensor measurement process
and therefore is not known. A Kalman filter cannot be
applied because this object is maneuvering and the
equations of motion cannot be known in advance. No a
priori model exists for the sensor measurement bias
and noise. This leads to a double confounded problem
of blind system identification and parameter
estimation in the form of modeling and predicting the
true trajectory and modeling and predicting the sensor
noise and bias.
Using the filtering criteria above, it was determined
that the object whose trajectory is given in Fig. 2 is
most likely a maneuvering missile, which activated the
following group of forcing functions as candidates
A = −(1/2)ρd²v²K1   (axial drag)
B = (1/2)ρd²vK2   (cross force due to cross velocity)
C = −(1/2)ρd³ωK3   (Magnus cross force due to cross velocity)
D = (1/2)ρd⁴ωK4   (Magnus cross force due to cross spin)
E = −(1/2)ρd³vK5   (cross force due to cross spin)
The idea is to estimate and model the second
derivative of the moving object. This is accomplished
by autonomously selecting related physics based
derivative forcing functions. The control parameters
from these forcing functions are estimated using a
genetic algorithm to gain close initial guesses for final
optimization using Levenberg-Marquardt nonlinear
least squares fitting.
P = −(π/8)ρd⁴vωK6   (spin-decelerating moment)
Q = (1/2)ρd⁴ωK7   (Magnus cross torque due to cross velocity)
R = −(1/2)ρd³vK8   (cross torque due to cross velocity)
S = (1/2)ρd⁴vK9   (cross torque due to cross spin)
The genetic algorithm is a directed random search
parameter estimator defined as follows: 1) Select a gene
pool size of nn, where n is the number of coefficients in
the forcing function; 2) Populate pool with random
values (apply prior probabilities if known); 3) Rank
according to selected cost function (determine points
falling outside ¼ standard deviation boundary of
objective function, and sum their Euclidean distances);
4) Select frequency of occurrence of mutation and its
form (negation or random perturbation); 5) Select
number of offspring to propagate forward as nn/2; 6)
Populate offspring pool with top ½ of current pool and
spliced offspring (splicing selects a random location in
the coefficient array and splices the left side of one
parent and the right side of another into a new
offspring); 7) Pool is reranked as in 3); 8) Stopping
criteria is all points are within ¼ standard deviation
boundary or user specified maximum number of
iterations achieved. The best candidate solution is then
passed through Levenberg-Marquardt for a final
optimization.
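The eight steps above can be sketched as follows (the n^n pool size grows quickly, so this is practical only for small coefficient counts; the cost function, mutation, and splicing follow the description, while all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(coeffs, model, t, target):
    """Step 3 cost: sum the distances of points falling outside a
    1/4 standard deviation boundary of the objective function."""
    resid = np.abs(model(t, coeffs) - target)
    return resid[resid > 0.25 * target.std()].sum()

def genetic_fit(model, t, target, n_coeffs, bounds=(-10.0, 10.0), max_iter=200):
    pool_size = n_coeffs ** n_coeffs                         # step 1: pool of n^n
    pool = rng.uniform(*bounds, size=(pool_size, n_coeffs))  # step 2: random fill
    for _ in range(max_iter):                                # step 8: iteration cap
        scores = np.array([cost(c, model, t, target) for c in pool])
        pool = pool[np.argsort(scores)]                      # steps 3/7: rank by cost
        if scores.min() == 0.0:                              # step 8: all inside boundary
            break
        top = pool[: pool_size // 2]                         # steps 5/6: top half survives
        mates = top[rng.permutation(len(top))]
        cut = rng.integers(1, n_coeffs, size=len(top)) if n_coeffs > 1 \
            else np.ones(len(top), dtype=int)                # step 6: splice location
        kids = np.array([np.concatenate((a[:s], b[s:]))
                         for a, b, s in zip(top, mates, cut)])
        mutate = rng.random(kids.shape) < 0.05               # step 4: mutation frequency
        kids[mutate] += rng.normal(0.0, 0.1, size=mutate.sum())
        pool = np.vstack((top, kids))
    scores = np.array([cost(c, model, t, target) for c in pool])
    return pool[np.argmin(scores)]  # best candidate, handed to Levenberg-Marquardt
```

The best candidate returned here is only the close initial guess; the final optimization is left to a nonlinear least squares routine.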
T = −(1/2)ρd⁵ωK10   (Magnus cross torque due to cross spin)
Fg = GMm/r²   (gravity force)   (6)
As many as 72 other smaller perturbing forces
describing body shape, fin orientation, fin shape, fin
placement, as well as other stability derivatives could
also be included in the list; however, the above 11 are
the major contributors.

The desired form of the final second order differential equation is

d²y/dt² = f(y, dy/dt, c1, …, cn, u1, …, um)   (8)

where f is a continuous function, dy/dt and d²y/dt² are the first and second derivatives of y with respect to time, c1 to cn are constants, and u1 to um are forcing functions.

In order to ensure that smooth data curves and derivatives were used in the differential equation discovery process, data sets were sub-sampled and Turlington polynomials and derivatives of the form

T(x) = A0 + A1·x + Σ(m) Bm·log10(1 + 10^((x − dm)/Dm))   (9)

dT/dx = A1 + Σ(m) (Bm/Dm)·10^((x − dm)/Dm) / (1 + 10^((x − dm)/Dm))   (10)

were used. Turlington polynomials are continuous orthogonal functions that are built in a piecewise fashion but are everywhere differentiable. Orthogonality enables construction of the polynomial and its derivatives in real time as data is being received by a sensor such as a UAV. Data Modeling theory has shown previously that sub-sampling of data processes has a negligible impact on the results, and sub-sampling was therefore used to decrease the number of calculations and thereby speed up the process.

The modeling steps in the process used to determine the differential equation are labeled as:
1. Target physics model
a. Position and velocity derivative
b. Physics based forcing function derivatives
c. Thrust derivative
d. Lumped parameter stability derivatives
2. Sensor bias model
a. Position bias
b. Orientation error
c. Measurement error
d. Boresight error

DERIVING THE TARGET PHYSICS MODEL

Step A – Position and velocity derivative

The 1200-point data set in Fig. 6 was 10% sub-sampled to 120 points, and Turlington polynomials were constructed of the sub-sampled trajectory and its 1st and 2nd derivatives. A Data Model was then generated of the form

d²y1/dt² = f1(t, y, dy/dt)   (11)

Step B – Physics based forcing function derivatives

Because the local body coordinate system is unknown and changes orientation with respect to a fixed earth centered inertial coordinate system, only the magnitudes of the forcing functions were used. Performing correlation on the 11 forces in (6) ranked the axial drag force and gravity as the most significant, and these were the two forcing functions modeled. A genetic algorithm is used as a part of Data Modeling to determine the optimal mass, drag coefficient, and effective area of the object. For the trajectory in Fig. 2, the genetic algorithm predicted:

A = 0.037 m²   (equivalent body area)
D = 0.217 m = 8.5"   (equivalent body diameter)
CD = 0.69
M = 33.18 kg = 73 lbs   (12)

As discussed previously, two physics based forcing functions were used and resulted in a Data Model of the form

d²y2/dt² = f2(t, GM/r², 0.5·CD·ρ·A·v²/m)   (13)

Step C – Thrust derivative

A Data Model for thrust of the form

d²y3/dt² = f3(t, FT/mT) = f3(t, d²yT/dt²)   (14)

was built using a Turlington polynomial function derived from a nominal thrust acceleration profile based on the STAR, LOKI, and Viper IIIA class targets. Fig. 2 shows the acceleration profile used and a comparison of this Data Model to the sub-sampled Turlington 2nd derivative curve. If the thrust and mass profiles are known, they can be used to calculate the acceleration due to thrust directly, and this in turn can be used in the Data Model.

Step D – Lumped parameter derivatives

Lumped parameters represent fin number, placement, aspect, and other perturbing forces lumped into a single group and modeled using eigenfunctions. Generation of these eigenfunctions is done with a novel method published previously using a modified Wiener filtering technique on the Turlington polynomial of the 2nd derivative of the data in Fig. 6. This results in a Data Model of the form

d²y4/dt² = f4(t, λ1, λ2, λ3)   (15)

Target physics model

The derivatives from (11), (13), (14), and (15) are treated as metavariables and fused into a Data Model of the form

d²y5/dt² = f5(t, d²y1/dt², d²y2/dt², d²y3/dt², d²y4/dt²)   (16)
DERIVING THE SENSOR BIAS MODEL
Step A – Position bias
First, (16) is double integrated (using either a 4th
order Runge Kutta method or direct numerical
integration) at the proper step size to return the same
number of points (1200) as the original. The result is
compared with the original trajectory in Fig. 6 in order
to determine the sensor position bias, which is based on
the range ratio of the original and the target physics
model. For this data set, the bias (B) was 1.4 and when
applied to (16) results in
d²y6/dt² = B·(d²y5/dt²)   (17)
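The double integration and range-ratio bias described above can be sketched as follows (taking the range ratio as a peak-to-peak extent ratio is an assumption here, and initial conditions are taken as zero):

```python
import numpy as np

def double_integrate(accel, dt, y0=0.0, v0=0.0):
    """Directly integrate an acceleration series twice (the alternative to
    a 4th order Runge-Kutta pass) at step size dt, returning the trajectory."""
    vel = v0 + np.cumsum(accel) * dt   # first integration: velocity
    return y0 + np.cumsum(vel) * dt    # second integration: position

def position_bias(original, modeled):
    """Range ratio of the original trajectory to the target physics model,
    sketched here as the ratio of peak-to-peak extents."""
    return np.ptp(original) / np.ptp(modeled)
```

Choosing the step size so that the integration returns the same number of points as the original (1200 here) allows a direct point-by-point comparison.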
Step B – Orientation error
Sensor orientation error is modeled as a first
order differential equation of the form
d(Δy)/dt = f(t, Δy(λ1(Δy(λ2(⋯Δy(λn(t))⋯)))))   (18)
where Δy is the difference between (17) and the original trajectory, and λi are the lumped parameter dominant eigenfunctions of d(Δy)/dt. The results of (18) can be fused directly into the Data Model in (17) as a velocity change after the first integration, or (18) can be integrated and added to the resultant trajectory after double integration of (17).
Step C – Measurement error
Sensor measurement error is modeled as a first
order differential equation of the form
d(Δy)/dt = f(t, Δy(λ1(Δy(λ2(⋯Δy(λn(t))⋯)))))   (19)
where Δy is the difference between (17) corrected for orientation error and the original trajectory, and λi are the lumped parameter dominant eigenfunctions of d(Δy)/dt. The results of (19) can be fused directly into the Data Model in (17) as a velocity change after the first integration, or (19) can be integrated and added to the resultant trajectory after double integration of (17).
Step D – Boresight error
Boresight error is modeled as a first order
differential equation of the form
d(Δy)/dt = f(t, Δy(λ1(Δy(λ2(⋯Δy(λn(t))⋯)))))   (20)
where Δy is the difference between (17) corrected for orientation and measurement error and the original trajectory, and λi are the lumped parameter dominant eigenfunctions of d(Δy)/dt. The results of (20) can be fused directly into the Data Model in (17) as a velocity change after the first integration, or (20) can be integrated and added to the resultant trajectory after double integration of (17).
Results
The resultant Data Modeling derived differential
equation output is shown in Fig. 2 along with the
residual error and its histogram. The proof of concept
mode lies in the 8th bin of the histogram, which corresponds to a most frequently occurring error bound of negative one (−1.0) to one half (0.5) meters, matching the original Cramer-Rao bound ambiguity.
REFERENCES
Jaenisch, H. and J. Handley, “Data Modeling of 1/f
noise sets”, Proceedings of SPIE Fluctuations and
Noise 2003, June 1-4, 2003. Santa Fe, NM.
Jaenisch, H., J. Handley, and L. Bonham, “Data
Modeling for Unmanned Vehicle Situational
Awareness,” Proceedings of the AIAA, 2003 Unmanned
Vehicle Conference, September 14-18, 2003. San
Diego, CA.
Jaenisch, H., J. Handley, and L. Bonham, “Data
Modeling Change Detection of Inventory Flow and
Instrument Calibration”, Proceedings of SOLE 2003,
August 10-14, 2003. Huntsville, AL.
Jaenisch, H., J. Handley, L. Bonham, “Enabling
Calibration on Demand for Situational Awareness”,
Army Aviation Association of America (AAAA)
Tennessee Valley Chapter Huntsville Alabama, Feb 12,
2003.
Jaenisch, H., J. Handley, and J. Faucheux, “Data Driven
Differential Equation Modeling of fBm processes”,
Proceedings of SPIE, August 3-8, 2003. San Diego,
CA.
Jaenisch, H., J. Handley, and J. Faucheux, “Data
Modeling of network dynamics”, Proceedings of SPIE,
August 3-8, 2003. San Diego, CA.
FIGURES
Fig 1. Complex Process Applied to Sub-Sampled Data Yields Equivalent Result.
Fig. 2. Measured trajectory data compared with model overlay (top left), physics based and lumped stability
derivative forcing functions (top right), user input thrust acceleration profile (bottom left), and orientation,
measurement, and boresight error derivatives (bottom right).
Data Model Predictive Control as a New Mathematical Framework for
Simulation and VV&A
Holger M. Jaenisch*a,b
, James W. Handleya, and Michael L. Hicklen
b
aJames Cook University, Townsville QLD 4811, Australia
bdtech Systems Inc., P.O. Box 18924, Huntsville, AL 35804
ABSTRACT
This paper presents the mathematical theory and procedure for comparing two simulations analytically. The result is the
derivation of two equation models; one for each respective simulation. The derived models are analytically compared to
determine: equivalence, consistency, linearity, similarity, and degree of overlap. This yields a unique analytical tool for
comparing simulation versions or scenarios for VV&A. Methods as simple as regression can then be used to determine if
accreditation is maintained on new simulations or models. The derived analytical functions can themselves be
appropriately combined into an adaptive intelligent lookup table (LUT) equivalent model for real-time simulation
purposes.
Keywords: Formal Analysis, Data Modeling, VV&A, Equivalence, Consistency, Transfer Function Modeling, Automatic Proofing
1. FORMAL VV&A
Traditional validation, verification, and accreditation (VV&A) methods use exhaustive variable instantiation and testing. This includes closing shall statements and matching requirements to expected performance values. True validation and verification follows a sequence of steps leading from requirements to source code inspection and analysis, and finally to collecting numerical data from the various subroutines and functions of operational code and comparing these numbers to expected values. This is a cumbersome, time-consuming process that can never be shown to be fully complete. It remains a process of refining the V&V effort and always leaves doubt as to the validity of the code for new scenarios. This limits accreditation to intended uses only, rather than global accreditation enabling excursions on a validated simulation across all possible input scenarios.
Formal VV&A attempts to apply the tools of mathematics, in the form of lambda calculus and formal logic, to treat programs as provable entities. The promise of being able to automatically solve for and prove the validity and correctness of programs is an elusive but worthy goal. If we can demonstrate that programs are in fact equivalent to mathematical equations, then transformation of the equations into other forms can be accomplished with simple algebraic manipulation. Further, once these equations are derived from programs in the form of differential equations, the tools of formal mathematical proofing can in theory be applied to formally prove the correctness of the programs.
Towards this goal we present a novel numerical method that bypasses the lambda calculus and formal logic in favor of dimensional and structural analysis of the derived equation models themselves. Equivalence and consistency become quantifiable values that have meaning and can formally be used to prove two different versions of a program to be the same or different with a simple-to-understand number. Finally, the program source code itself no longer needs to be evaluated. Rather, equations are derived that map the input into the resulting output. By comparing the behavior of two or more independently derived equations, their similarity can again be formally evaluated and used for VV&A. First, we will develop an informal proof of the equivalence of programs and equations.
EQUAL means equal over an infinite domain; Equivalency means equal over a bounded domain. Consistency means correlated linear changes in output between two models when driven by proportionally similar perturbations. Consistency quantifies implied causality. When both Equivalence and Consistency are 100%, the models are congruent.
Given: functions f1(x,y,z) and f2(x,y,z), and assuming Consistency and Equivalency between f1 and f2 are provable using the methods of FORMAL ANALYSIS.
*[email protected]; phone 1 256 337-3768
Let
y1 = sin(x)    y2 = x − x³/3! + x⁵/5! − x⁷/7!    (truncated Taylor series)
Consistency and Equivalency are provable over a bounded range. Therefore:
1) y1 = sin(x) is Equivalent to y2 = x − x³/3! + x⁵/5! − x⁷/7!
2) y = f(x) is a black-box description of the equations
3) y1 and y2 are Equivalent to f1(x) and f2(x)
4) Changes of variables in f1(x) and f2(x) yield consistent changes in output
5) f(sin(x)) = (sin.exe) and f(Taylor(x)) = (Taylor.exe)
6) Since an arbitrary f(x,y,z) is transformed into a Fourier series, the FFT is an invertible mapping
A program (*.EXE) is a black box f(x) that takes numerical values for variables as input and returns numerical values as output. Since a causal mapping between these inputs and outputs exists, the program is a function, or lookup table (LUT) generator. Since a formal FUNCTION is also a causal mapping between I/O and a LUT generator, (*.EXE) and f(x,…) are equivalent. Since any function can be encoded as a program (*.EXE), the function and program are equivalent:
Given f1(x) ~ f(*.EXE) and f2(x) ~ f(*.EXE),
then f1(*.EXE) and f2(*.EXE) are generalized as fA(*.EXE) and fB(*.EXE).
Therefore fA(*.EXE) = fA(x,y,z) and fB(*.EXE) = fB(x,y,z).
Assuming system inputs FAin = fAin(x,y,z) and outputs FAout = fAout(x,y,z), then
x1 = FFT(FAin) and y1 = FFT(FAout) yield the FORMAL TRANSFER FUNCTION MODEL
TF1 = y1/x1 and TF2 = y2/x2
from which new output is generated using new inputs x1′ and x2′. Since Fourier series are convergent and can be converted into differential equation form, they are provable. QED
Analytical comparison of simulations is accomplished using the method of classical control theory transfer functions
(TFs) derived from input/output data from the two simulations. The methods analyze the form of the two independently
derived TF models by comparing their mathematical structure and the value of their coefficients. Equivalence and
Consistency between models is measured, and therefore between the simulations from which they are derived.
Specifically, for comparison of a simulation and its equivalent Data Model, we are interested in the structure and form of
the TF model constructed from the simulation compared with the TF model constructed from the Data Model. Since the
TF models are comprised of unique modes mathematically expressed as cosine and sine functions, they are readily
transformed into analytical derivative terms, and the number of unique modes can be reduced by performing
eigenfunction modeling using modified Wiener filtering.
2. DATA MODELING
2.1 Transfer function theory
Data Modeling1 with transfer functions can be used to model discrete input and output (I/O) systems. The transfer
function2-3 (TF) provides an algebraic representation of a linear, time-invariant filter in the frequency domain. Because
Data Modeling transfer function models occur in frequency space using complex variables, they are called analytical
representations. The frequency domain transfer function is determined using the fast Fourier transform (FFT). The
frequency space representation is separated into the real and imaginary components

A_i = \mathrm{Re}(\mathrm{FFT}(y_i)) = \frac{1}{2} \sum_j y_j \cos\!\left(\frac{2\pi i j}{N}\right) \qquad B_i = \mathrm{Im}(\mathrm{FFT}(y_i)) = \frac{1}{2} \sum_j y_j \sin\!\left(\frac{2\pi i j}{N}\right)    (1)
where A are the real terms, B the imaginary terms, N the number of points, and both i and j are counters ranging from
zero to N-1. Each frequency term in (1) results in a cosine-sine pair in the analytical convergent Fourier series
y(x) = \sum_{j=1}^{n} \left[ A_j \cos(\omega_j x) + B_j \sin(\omega_j x) \right]    (2)
The transfer function is defined as the frequency domain representation of the output divided by the frequency domain
representation of the input, given in equation form by
H(z) = \frac{Y(z)}{X(z)} = \frac{\mathrm{FFT(Output)}}{\mathrm{FFT(Input)}} = \mathrm{Deconvolution}    (3)
where H(z) is the transfer function, Y(z) the frequency domain representation of the output, and X(z) the frequency
domain representation of the input. The division in (3) is a deconvolution, which removes the effects of the input data set
from the output data set. The resulting ratio of FFT(output) to FFT(input) is the TF model of the process. Once
generated, the TF model is used to transform (convolve) any input data set into its related output data set, given by
\mathrm{NewOutput} = \mathrm{InvFFT}\!\left( \frac{\mathrm{FFT(Output)}}{\mathrm{FFT(Input)}} \, \mathrm{FFT(NewInput)} \right)    (4)
If an inverse Fast Fourier Transform (InvFFT) is performed on the TF model in (3), the result is the system response
function (SRF)
\mathrm{SRF}_i = \mathrm{InvFFT}(TF) = \sum_j \left[ A_j \cos\!\left(\frac{2\pi i j}{N}\right) + B_j \sin\!\left(\frac{2\pi i j}{N}\right) \right]    (5)
which is in the time domain and used with spatial convolution of a new input. Conversely, in the frequency domain convolution is a simple complex multiply of the TF with a new input, as shown in (4). Using eigenfunction Data Modeling or modified Wiener filtering, the number of frequency terms can be reduced, yielding a compact function of dominant eigenmodes (λ) remaining after pruning. This is shown in
y(x) \cong f(\lambda_1, \lambda_2, \ldots, \lambda_k) = \sum_j \lambda_j \left[ A_j \cos\!\left(\frac{2\pi i j}{N}\right) + B_j \sin\!\left(\frac{2\pi i j}{N}\right) \right], \qquad \lambda_j = \begin{cases} 1 & \text{if } \lambda_j \text{ dominant} \\ 0 & \text{otherwise} \end{cases}    (6)
2.2 Eigenfunction modeling
Eigenfunction extraction uses a variation of Wiener filtering to identify significant peaks in the dB power spectrum.
These peaks correspond to dominant sine and cosine terms occurring in the power spectrum. Once the FFT is generated,
the power spectrum of the data is calculated, which is the sum of the squares of the real and imaginary terms in (1), and
is then transformed into decibels (dB).
P_i = \mathrm{Re}(A_i)^2 + \mathrm{Im}(B_i)^2 \qquad P(\mathrm{dB})_i = 10 \log_{10}\!\left(\frac{P_i}{\max(P_i)}\right)    (7)
Least squares regression is used to fit a straight line to the entire dB power spectrum of the form
y = mx + b    (8)
where m is the slope of the line and b is the y-intercept. It should be noted that the zero-order term is not used in the fitting and detrending process; however, it is maintained in the final result. The power spectrum is detrended and squared (or raised to higher powers to reduce the number of modes passing the detrend, or to the first power to pass more modes). A second linear regression fit is performed, where the y-axis of the data is the detrended power spectrum and the x-axis is simply an index variable. This y-intercept is the noise threshold. Locations where the detrended power spectrum peaks are greater than the threshold pass the filter unchanged in value from the original frequency space representation. Adjacent peaks are combined, leaving only the maximum value (peak) of the group. The truncated components are multiplied by a scale factor to maintain equal power in the truncated and full models, thereby conserving entropy lost during peak pruning. This process is repeated until convergence, and it avoids the matrix conditioning requirements associated with Principal Component Analysis (PCA). The full process is shown graphically in Figure 1.
Fig. 1. Modified Wiener filtering process for identifying values above noise floor in power spectrum. Left graph illustrates log-log
power spectrum with linear regression fit used for detrending. Center graph shows the detrended power spectrum (order=2) and its fit.
Right graph shows final peaks retained (zero order maintained at original value throughout process and removed only for graphing).
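The two-pass fit-and-threshold procedure above can be sketched as follows. This is a minimal numpy reading of the text; adjacent-peak grouping, the power-conserving rescale of truncated components, and the repeat-until-convergence loop are omitted:

```python
import numpy as np

def eigenmode_mask(y, power=2):
    """Modified Wiener filter sketch: flag power-spectrum peaks above a noise
    floor.  Returns a 0/1 mask over the positive-frequency bins (the zero-order
    term is excluded from the fitting, as the text specifies)."""
    spec = np.fft.fft(y)
    half = len(y) // 2
    p = np.abs(spec[1:half]) ** 2                 # power spectrum, DC dropped
    p_db = 10 * np.log10(p / p.max())             # eq. (7), in decibels
    idx = np.arange(len(p_db), dtype=float)
    m, b = np.polyfit(idx, p_db, 1)               # eq. (8): detrend line
    detrended = (p_db - (m * idx + b)) ** power   # detrend and sharpen peaks
    _, threshold = np.polyfit(idx, detrended, 1)  # second fit: intercept = noise threshold
    return (detrended > threshold).astype(int)

# Two strong tones in weak noise should survive the filter.
t = np.arange(256)
sig = (np.sin(2 * np.pi * 8 * t / 256) + 0.5 * np.sin(2 * np.pi * 30 * t / 256)
       + 0.01 * np.random.default_rng(1).standard_normal(256))
mask = eigenmode_mask(sig)
```

With the seed above, the mask flags the bins of both injected tones (`mask[7]` and `mask[29]`, since the DC bin is dropped before fitting).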
2.3 Comparison metrics
TF models are now generated for both simulations and their eigenfunction representations determined (λTF). The two models are compared using Equivalency, Consistency, Probability of Being Random, Probability of Being a Linear System Model, Sensitivity, Relative Model Sensitivity, Correlation, Percentage of Variation in One Model Captured by the Second, Root Sum Squared Difference Between Models, Variance Similarity Between Models, Number of Unique Eigenmodes, Entropy, Model Sensitivity Ratio, Percent Overlap in Model Eigenmodes, and Analysis Confidence4,5.
The λTF models are compared for Equivalence and Consistency to characterize the similarity or degree of congruence
between simulation models. This is done by comparing the structure of the two equations. In frequency space, this is
simply comparing model amplitudes. Equivalence is determined by calculating the correlation between the magnitude of
each TF (PSD) given by
\mathrm{Equivalence} = 100\, \frac{\sum_{j=1}^{N} \left(\mathrm{Model1}_j - \overline{\mathrm{Model1}}\right)\left(\mathrm{Model2}_j - \overline{\mathrm{Model2}}\right)}{\sqrt{\sum_{j=1}^{N} \left(\mathrm{Model1}_j - \overline{\mathrm{Model1}}\right)^2 \sum_{j=1}^{N} \left(\mathrm{Model2}_j - \overline{\mathrm{Model2}}\right)^2}}    (9)
where Model1 and Model2 are the magnitudes of the TF (PSD) models for the simulation and measured data given by
\mathrm{Model1}_j = \sqrt{A_j^2 + B_j^2}    (10)
and \overline{\mathrm{Model1}} and \overline{\mathrm{Model2}} in (9) are the respective means of the result of (10).
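Equations (9) and (10) together amount to a Pearson correlation of the two magnitude spectra, scaled to a percentage. A minimal sketch (not the authors' code):

```python
import numpy as np

def tf_magnitude(spec):
    """Eq. (10): magnitude of each complex TF frequency term."""
    return np.sqrt(spec.real**2 + spec.imag**2)

def equivalence(model1, model2):
    """Eq. (9): percent correlation between two TF magnitude spectra."""
    d1, d2 = model1 - model1.mean(), model2 - model2.mean()
    return 100 * np.sum(d1 * d2) / np.sqrt(np.sum(d1**2) * np.sum(d2**2))
```

Identical spectra score 100; perfectly anti-correlated spectra score −100.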
Consistency between Model1 and Model2 is calculated by first subtracting the mean from each PSD as shown in (9).
Next, each respective result is multiplied by a Mask of zeros and ones, with ones in the mask corresponding to the
locations of an eigenfunction modes, and zeros at all other points. This mask converts the full model into the
eigenfunction model, and is given by
\mathrm{Mask1}_j = \begin{cases} 0 & \mathrm{Model1}_j = 0 \\ 1 & \mathrm{Model1}_j \neq 0 \end{cases} \qquad \mathrm{Mask2}_j = \begin{cases} 0 & \mathrm{Model2}_j = 0 \\ 1 & \mathrm{Model2}_j \neq 0 \end{cases}    (11)
Mean subtraction and mask multiplication are given in equation form as
\mathrm{Model1}_j = \left(\mathrm{Model1}_j - \overline{\mathrm{Model1}}\right) \mathrm{Mask1}_j\, \mathrm{Mask2}_j \qquad \mathrm{Model2}_j = \left(\mathrm{Model2}_j - \overline{\mathrm{Model2}}\right) \mathrm{Mask1}_j\, \mathrm{Mask2}_j    (12)
Next, each frequency term is converted into a Z-score using the standard deviation of the two frequency terms (Model1 and Model2) at the jth location from (12), appended with a zero to make a total of three points, resulting in

M1_j = \frac{\mathrm{Model1}_j}{\sigma_j} \qquad M2_j = \frac{\mathrm{Model2}_j}{\sigma_j}    (13)
\mathrm{Consistency} = 100\, \frac{\sum_{j=1}^{N} \mathrm{Mask1}_j\, \mathrm{Mask2}_j \left(1 - \mathrm{Norm}(d_j)\right)}{\sum_{j=1}^{N} \mathrm{Mask1}_j\, \mathrm{Mask2}_j} \qquad d_j = M2_j - M1_j    (14)
where Norm in (14) returns a normal probability on a distribution of mean zero and standard deviation of one, and the
argument dj for the probability calculation is given as shown. Equivalence and Consistency examples are displayed
below in Fig. 2.
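Equations (11) through (14) can be read as the following sketch. One assumption is flagged: the text says only that Norm returns a "normal probability"; here it is taken to be the two-sided probability mass within ±d of a standard normal, which makes identical models score 100%:

```python
import numpy as np
from math import erf, sqrt

def consistency(model1, model2):
    """Eqs. (11)-(14) sketch: z-score agreement of the shared eigenmodes."""
    mask = (model1 != 0) & (model2 != 0)           # eq. (11): shared mode locations
    m1 = (model1 - model1.mean()) * mask           # eq. (12): de-mean and mask
    m2 = (model2 - model2.mean()) * mask
    total = 0.0
    for a, b in zip(m1[mask], m2[mask]):
        sigma = np.std([a, b, 0.0])                # eq. (13): the pair appended with a zero
        d = (b - a) / sigma if sigma > 0 else 0.0  # eq. (14): z-score difference
        total += 1.0 - erf(abs(d) / sqrt(2))       # Norm(d): mass within +/-d (assumed)
    return 100 * total / mask.sum()
```

Identical inputs give d_j = 0 everywhere and hence a Consistency of 100%.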
[Fig. 2 panels: Input 1 (10 inputs), Input 2 (4 inputs), Outputs 1 & 2, and Common Eigenfunction Modes for EXE 1 vs. EXE 2, shown at Equivalency/Consistency of 100%/100%, 50%/42%, and 0%/0%.]
Fig. 2. Comparison of Equivalency and Consistency between models with differing input and output sizes.
If known inputs are available, the input vector is created for each model. If the input vector is only available for one
simulation, then it is assumed to be the same for both. If one simulation assumes input in different dimensions (number
of parameters and units of those parameters) than the other simulation, then the input vector can be transformed into a
common space by using a transfer function. The TFs in (15) yield the transformation of one set of units into another,
many inputs into a few outputs, and a few inputs into many outputs respectively.
\mathrm{TF(units)} = \frac{\mathrm{FFT(English)}}{\mathrm{FFT(Metric)}} \qquad \mathrm{TF(ManyToFew)} = \frac{\mathrm{FFT}(out1, out2)}{\mathrm{FFT}(in1, in2, \ldots, in100)} \qquad \mathrm{TF(FewToMany)} = \frac{\mathrm{FFT}(out1, out2, \ldots, out100)}{\mathrm{FFT}(in1, in2)}    (15)
Another method for handling unequal numbers of input vector elements is to pad the shorter of the two with zeros to be
equal length with the longer one. An example of this is given in Figure 2, which shows a comparison using a notional
data set where the number of inputs for the first data set is 10 and for the second set the number of inputs is 4. The inputs
and outputs are varied to show that the totally overlapping case (outputs closely match and inputs in the same locations
in different sized input vectors) flags at 100% Equivalence and Consistency, and degrades gracefully to a non-matching
case of 0% Equivalence and Consistency.
For each model, the probability of being random is given by
\mathrm{ProbRnd} = 100\, \frac{2\, \mathrm{NEigenModes}}{N}    (16)
where N is the number of points (and equal to 2X the maximum possible number of eigenmodes), NEigenModes the
number of eigenmodes in the TF, and multiplied by 100 to convert the ratio into a percentage.
The probability of the resultant model being a linear system model is given by
\mathrm{Prob}_{Linear} = 100\, \frac{\min\!\left(\sum_{i=0}^{n-1} \mathrm{OUT1}_i,\; \sum_{i=0}^{n-1} \mathrm{OUT2}_i\right)}{\max\!\left(\sum_{i=0}^{n-1} \mathrm{OUT1}_i,\; \sum_{i=0}^{n-1} \mathrm{OUT2}_i\right)} \qquad \mathrm{OUT1} = \mathrm{IFFT}\!\left(TF \left(\mathrm{FFT(Input1)} + 1\right)^2\right) \qquad \mathrm{OUT2} = \mathrm{Output1}^2    (17)
and answers the question: is squaring the input the same as squaring the output? In (17), OUT1 is generated by taking the FFT of the input, squaring it, multiplying it by the TF, and then taking the inverse FFT. This is then compared directly in (17) with the square of the original output, given as OUT2. In this work, all input vectors have 1 added to them in frequency space to ensure that no division-by-zero errors are encountered; however, in practice it has been found that this is not always necessary.
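A sketch of this linearity check follows; the garbled layout of (17) is read here as a min/max ratio of the two summed outputs, which is an assumption of this sketch:

```python
import numpy as np

def prob_linear(inp, out, tf):
    """Eq. (17) sketch: is squaring the input the same as squaring the output?
    The +1 added in frequency space guards against zero bins, per the text."""
    out1 = np.real(np.fft.ifft(tf * (np.fft.fft(inp) + 1) ** 2))
    out2 = out ** 2
    s1, s2 = np.abs(out1).sum(), np.abs(out2).sum()
    return 100 * min(s1, s2) / max(s1, s2)
```

The min/max construction bounds the result between 0 and 100 by design.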
For multiple input cases, each model can be examined independently to determine the input element accounting for the
most model variance. This is done by setting one input vector element to its original value while setting all remaining
elements to zero and is given by
\mathrm{ModVar} = \max_i \left[ 100\, \frac{2\sum\left(A^2 + B^2\right)_{\mathrm{Input\,MaxElem}}}{2\sum\left(A^2 + B^2\right)_{\mathrm{TF\,Model}}} \right]    (18)
where 2(ΣA² + B²) defines the variance from the frequency space representation, with A and B defined as in (1). Once all input vector elements have been turned on one at a time and their variance ratio with the variance of the TF calculated, the input vector element corresponding to the maximum value in (18) is the element that accounts for the most model variance.
Relative Model Sensitivity (higher values indicate more sensitivity) is calculated by turning all input elements on
(defined as the baseline), and the variance compared to the variance of the TF by
\mathrm{RelModSens} = 100\, \frac{2\sum\left(A^2 + B^2\right)_{\mathrm{Base\,Input}}}{2\sum\left(A^2 + B^2\right)_{\mathrm{TF\,Model}}}    (19)
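Equations (18) and (19) can be sketched by toggling input-vector elements and comparing spectral variances. Two choices in this sketch are assumptions: each input occupies one slot of a single concatenated vector, and the DC term is dropped from the variance sum:

```python
import numpy as np

def spectral_variance(x):
    """2*sum(A^2 + B^2) over the non-DC frequency terms of eq. (1)."""
    spec = np.fft.fft(x)[1:len(x) // 2]
    return 2 * np.sum(spec.real**2 + spec.imag**2)

def most_sensitive_element(inputs, tf_var):
    """Eq. (18): turn one input element on at a time; return the index of the
    element whose lone variance is largest relative to the TF model's."""
    scores = []
    for i in range(len(inputs)):
        solo = np.zeros_like(inputs)
        solo[i] = inputs[i]
        scores.append(100 * spectral_variance(solo) / tf_var)
    return int(np.argmax(scores))

def rel_mod_sens(inputs, tf_var):
    """Eq. (19): all elements on (the baseline) versus the TF model variance."""
    return 100 * spectral_variance(inputs) / tf_var
```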
By first multiplying the TF by the FFT of a new input and then performing an inverse FFT, a new output is created,
where classical time space comparisons can be applied. Three metrics used by Data Modeling to compare the model with
the original are correlation, the percentage of the variance of one model captured by the second, and the root sum
squared difference. In equation form, these three metrics are given by
\mathrm{Correl} = 100\, \frac{\sum_i \left(\mathrm{Model1}_i - \overline{\mathrm{Model1}}\right)\left(\mathrm{Model2}_i - \overline{\mathrm{Model2}}\right)}{\sqrt{\sum_i \left(\mathrm{Model1}_i - \overline{\mathrm{Model1}}\right)^2 \sum_i \left(\mathrm{Model2}_i - \overline{\mathrm{Model2}}\right)^2}}

\mathrm{VarRatio} = 100\, \frac{\sum_i \left(\mathrm{Model2}_i - \overline{\mathrm{Model2}}\right)^2}{\sum_i \left(\mathrm{Model1}_i - \overline{\mathrm{Model1}}\right)^2}

\mathrm{RSS}_{Sim} = 100 \left[ 1 - \frac{\sqrt{\sum_{i=0}^{n-1} \left(\mathrm{Model1}_i - \mathrm{Model2}_i\right)^2}}{\max\!\left(\sum_{i=0}^{n-1} \mathrm{Model1}_i,\; \sum_{i=0}^{n-1} \mathrm{Model2}_i\right)} \right]    (20)
where Model1 and Model2 here are in time space and result from the TF multiplication and inverse FFT process, and \overline{\mathrm{Model1}} and \overline{\mathrm{Model2}} are the means of each resultant. These three metrics can be calculated between the model and the original data as a direct comparison of how well they match, or between two different models to score the degree of match.
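The three time-domain metrics of (20) can be sketched directly. The RSS denominator is read from the garbled layout as the larger of the two summed models, which is an assumption of this sketch:

```python
import numpy as np

def compare_time_domain(m1, m2):
    """Eq. (20): correlation, variance ratio, and RSS similarity, in percent."""
    d1, d2 = m1 - m1.mean(), m2 - m2.mean()
    correl = 100 * np.sum(d1 * d2) / np.sqrt(np.sum(d1**2) * np.sum(d2**2))
    var_ratio = 100 * np.sum(d2**2) / np.sum(d1**2)
    rss_sim = 100 * (1 - np.sqrt(np.sum((m1 - m2) ** 2))
                     / max(np.sum(m1), np.sum(m2)))
    return correl, var_ratio, rss_sim
```

Identical models score 100% on all three metrics.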
The variance similarity between two TF models is given as a percentage by

\mathrm{Var}_{Sim} = 100\, \frac{\min\!\left(2\sum(A^2 + B^2)_{\mathrm{Model1}},\; 2\sum(A^2 + B^2)_{\mathrm{Model2}}\right)}{\max\!\left(2\sum(A^2 + B^2)_{\mathrm{Model1}},\; 2\sum(A^2 + B^2)_{\mathrm{Model2}}\right)}    (21)
The unique eigenmode information captured by the model is also given as a percentage by
\mathrm{UniqueEigenModes} = 100 \left[ 1 - \frac{\mathrm{erf}\!\left(z/\sqrt{2}\right)}{\mathrm{erf}\!\left(1/\sqrt{2}\right)} \right] \qquad z = \frac{\mathrm{NumberEigenModes}_{Common}}{\mathrm{NumberEigenModes}_{Model}}    (22)
The entropy preserved in the model from the original data is given as a percentage by
\mathrm{Entropy} = 100\, \frac{2\sum(A^2 + B^2)_{TF\,Model1} + 2\sum(A^2 + B^2)_{TF\,Model2}}{2 \max\!\left(2\sum(A^2 + B^2)_{TF\,Model1},\; 2\sum(A^2 + B^2)_{TF\,Model2}\right)}    (23)
The model sensitivity ratio is the ratio of the relative model sensitivities given in (19), and is given as a percentage by

\mathrm{ModSensRatio} = 100\, \frac{\mathrm{RelModSens(Model2)}}{\mathrm{RelModSens(Model1)}}    (24)
The percentage of terms overlapping between models is the ratio of the number of eigenmodes common to Model 1 and
Model 2 to the maximum between the number of eigenmodes in Model 1 or Model 2. This percentage is given by
\mathrm{PercentOverlapTerms} = 100\, \frac{N_{CommonEigenModes}}{\max\!\left(\mathrm{NEigenModes}_{Model1},\; \mathrm{NEigenModes}_{Model2}\right)}    (25)
Finally, the analysis confidence is defined as the average of the consistency, variance similarity, correlation, and the
RSS similarity. In equation form, this is given as
\mathrm{AnalysisConfidence} = \frac{\mathrm{Consistency} + \mathrm{Var}_{Sim} + \mathrm{Correl} + \mathrm{RSS}_{Sim}}{4}    (26)
Calculation of these metrics for two equation models independently derived from two independent I/O data sets
quantifies their degree of equality for VV&A. High equivalency validates the models, consistency verifies their similar
behavior, and accreditation is possible by checking that threshold conditions for these metrics are maintained.
Congruence is proven when all required thresholds are met or exceeded for a given application or intended use. Also,
theoretical predictability is a function both of being non-random and being linear, but is not pursued further at this time.
2.4 Metrics example
The authors chose to demonstrate the metrics in Section 2.3 using an interceptor debris model and its derived Data
Model5. Eight (8) input variables (wavelength, relative signature value, target area, field of view, scale coefficient, sensor
aperture, time to impact, and target mass) control the input to both the original model and the Data Model, with both
returning the number of fragments and the minimum fragment size. Transfer function models were constructed for each
by running the same 64 sets of 8 input vector elements through both the original model and the original Data Model
(extracting 64 examples). The 64x8=512 input vector elements were placed into a single input vector, and the 2 output
elements (number of fragments and minimum fragment size) were first padded up from 2 elements to 8 elements, and
then each set of 8 concatenated together in the same manner as the input vector elements. This process is the same used
for Monte Carlo type simulations, where multiple inputs sets are concatenated together into a single input vector, and
when convolved with the SRF or TF yields the outputs for the Monte Carlo cases en masse. For cases where 1 or a
subset of Monte Carlo runs is required, the input vector for the TF is filled to the maximum number of cases (here, 64
cases) by padding with repetitions of the Monte Carlo input cases.
Single TF models for original debris model and original Data Model were generated and analyzed. Results are given in
Table 1. TF output and the original output are close matches both in terms of correlation (88-90%) and RSS difference
(96%). In terms of the frequency space TF models constructed for the original debris model and its original polynomial
Data Model representation, the Equivalence between the model scored 77%, Consistency 45%, and percentage overlap in
model frequency terms 49%. Inclusion of more terms raised the Equivalence, Consistency, and Overlap scores, but not
the final Correlation or RSS scores. The authors chose to use shorter models with lower Equivalency and Consistency.
                                                                        Model 1        Model 2
                                                                        (Original)     (Data Model)
TF Model Correlation With Original Data (%)                             88.7           89.7
TF Model Variance % of Original Data Variance (%)                       291.8          214.6
RSS Similarity Between TF Model and Original Data (%)                   96.2           96.7
Unique Modes Between Models (%)                                         45.1           37.4
Entropy Preserved From Original (%)                                     95.5           95.0
Probability of Being Random (%)                                         18.0           15.6
Probability of Being a Linear System Model (%)                          12.3           86.6
Most Sensitive (Min Input Element Change Causing Max Output Variance)   Tgt. Mass (6)  Tgt. Mass (6)
Relative Model Sensitivity (Higher is More Sensitive)                   787.1          862.3
Most Critical (Min Input Element Change Causing Max Output Range)       Wavelength (4) Wavelength (4)
Model 1st Derivative                                                    1.7            1.5
Model 2nd Derivative                                                    1.9            1.6

Model 1 and Model 2 Compared
Correlation Between Models (%)                                          99.0
Model Variance Ratio (Model 2 / Model 1) (%)                            73.5
RSS Similarity Between Models (%)                                       98.7
Variance Similarity Between Models (%)                                  91.3
Model Sensitivity Ratio (Model 2 / Model 1) (%)                         109.5
Terms Overlapping Between Models (%)                                    48.9
Equivalence Between Models (%)                                          77.0
Consistency Between Models (%)                                          45.3
Analysis Confidence (%)                                                 83.6

Table 1. Summary of Analysis between Full Debris Model and Data Model Representation.
2.5 Lookup Table (LUT) generation
Finally, for applications where even simple equation models cannot be executed in real-time, the equation models are
converted into lookup table (LUT) format. For the lookup tables in this work, the TF was used for initial LUT
population. This population was performed by creating new input vectors consisting of combinations of uniform random
input variables that are automatically normalized between zero and one. These new input vectors are then fast Fourier
transformed, multiplied by the TF, and inverse fast Fourier transformed. The LUT derived from successive repetitions of this process still contained areas for which no values were obtained. For this reason, we construct a multi-variable polynomial as an equation Data Model that both adequately reproduces the data output from the TF and also gives the ability to predictably populate the remaining regions of the LUT not reached by processing through the TF. The
Kolmogorov-Gabor polynomial of the form
y(x_1, x_2, \ldots, x_L) = a_0 + \sum_i a_i x_i + \sum_i \sum_j a_{ij} x_i x_j + \sum_i \sum_j \sum_k a_{ijk} x_i x_j x_k + \cdots    (27)
represents a multiple input, single output transfer function that maps the inputs x1, x2,…, xL into the output y. Data
Modeling approximates this Kolmogorov-Gabor polynomial using a hierarchy of functional models as
y(x_1, x_2, \ldots, x_L) = f\!\left(y_1\!\left(b_1\!\left(y_2\!\left(b_2\!\left(\cdots y_n(x_i, x_j, x_k) \cdots\right)\right)\right)\right)\right) \qquad O\!\left[y(x_1, x_2, \ldots, x_L)\right] = 3^n    (28)
Polynomials of the form in (28) are Data Relations Models, which are multi-variable polynomials built using building
blocks no larger than 3rd order and comprised of combinations of up to 3 input variables. These polynomial building
block solutions are nested into higher order polynomial fits as shown in (27), which makes the Data Model a functional
(an equation or function that can be embedded into another equation or function as an input variable). This process is
illustrated in Figure A.1 of the Appendix.
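The Triple building block of Fig. A.1 (three inputs, terms up to 3rd order) can be fit with ordinary multi-variable linear regression, as the text describes. A minimal least-squares sketch, not the authors' code:

```python
import numpy as np

def triple_terms(x1, x2, x3):
    """The 20 terms of the 'Triple' building block in Fig. A.1 (order <= 3)."""
    one = np.ones_like(x1)
    return np.column_stack([
        one, x1, x2, x3, x1**2, x2**2, x3**2, x1**3, x2**3, x3**3,
        x1*x2, x1*x3, x2*x3, x1*x2*x3, x1*x2**2, x1*x3**2,
        x1**2*x2, x1**2*x3, x2*x3**2, x2**2*x3])

def fit_triple(x1, x2, x3, y):
    """Least-squares fit of the Triple block; returns coefficients a0..a19."""
    coeffs, *_ = np.linalg.lstsq(triple_terms(x1, x2, x3), y, rcond=None)
    return coeffs
```

Nesting then feeds a fitted block's output back in as a new input variable, which is what makes the Data Model a functional.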
LUTs for number of fragments and minimum fragment size (mm) were created using a multi-variable polynomial
derived from the TF for each output vector element (Figures A.2 and A.3 in the Appendix). This polynomial creation
process was provided the input list and allowed to perform its own ranking and choice of 3 inputs to use. Significance of
these 3 inputs was based on frequency of usage of each input in the final resultant polynomial. This resulted in the
selection of wavelength (defines sensor type and resolution), target mass, and target area for number of fragments; and
relative signature value, wavelength, and target mass for minimum particle size.
LUT construction is illustrated in Figure A.1 in the Appendix. Input/output pairs are passed through the multi-variable
linear regression routine using only at most 3 variables at a time, with multiple 3rd order polynomials nested together and
used as described above as functionals. Figure A.1 illustrates the code structure that includes the looping across nested
layers, building block types, and input variable combinations Once the equation Data Model is constructed, new inputs
corresponding to the grid locations on the LUT are passed through the final equation. Also illustrated in Figure A.1 in
the Appendix is the mapping of the output into the 16 individual 16x16 smaller LUTs contained in the two LUTs that
follow in the Appendix.
For convenience, 16-step increments were chosen for interrogating the equation and constructing the LUTs. In Figure A.2 (number of fragments), wavelength was used the greatest number of times in the polynomial, and given no further information it is used to determine which of the 16x16 blocks contains the resultant. On the right side of the LUT is the
mean and standard deviation of each block for use as the answer. If wavelength and target mass are known, the
individual row in the block on the LUT is found, and the row averaged or its median or mode calculated. If all 3 inputs
are known, then the value is read off the grid directly by first going to the 16x16 block for wavelength, the row inside the
16x16 for target mass, and finally the intersecting column value in the row for target area. This process is described
directly on the LUT itself, and is repeated for the LUT in Figure A.3 in the Appendix for minimum fragment size.
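The hierarchical read procedure described above can be sketched as follows; the index names and the 16x16x16 array layout are illustrative assumptions, not the paper's data:

```python
import numpy as np

def lut_lookup(lut, w_idx=None, m_idx=None, a_idx=None):
    """Hierarchical LUT read (Fig. A.2 usage): block by wavelength bin, row by
    target-mass bin, column by target-area bin; fall back to summary
    statistics (mean, std) whenever an input is unknown."""
    if w_idx is None:
        return lut.mean(), lut.std()
    block = lut[w_idx]                  # 16x16 block for this wavelength bin
    if m_idx is None:
        return block.mean(), block.std()
    row = block[m_idx]                  # row for this target-mass bin
    if a_idx is None:
        return row.mean(), row.std()
    return row[a_idx]                   # direct read when all 3 inputs are known
```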
REFERENCES
1. Jaenisch, H.M. and Handley, J.W., “Data Modeling for Radar Applications”, Proceedings of the IEEE Radar
Conference 2003, (May 2003).
2. Rosko, Joseph S., Digital Simulation of Physical Systems, Reading, MA: Addison-Wesley, 1972.
3. Barnett, S. & Cameron, R.G., Introduction to Mathematical Control Theory, 2nd ed., Oxford: Clarendon Press,
1993.
4. Jaenisch, H.M., Algorithms for Autonomous Unmanned Deep Space Probes, D.Sc. Thesis, James Cook University.
(Submitted to University for Approval), 2005.
5. Handley, J.W., Jaenisch, H.M., Barnett, M.H., Esslinger, R.A., Grover, D.A., Hunt, J.R., "Data Modeling Predictive Control Theory for Deriving Hyper-Real-Time War Game Models from Real-Time Models", SPIE Defense and Security Symposium, Orlando, FL, (April 2006).
APPENDIX
[Fig. A.1 elements: DeriveDM (nesting loop); Nest (building block loop); Variable loop; Combinations of inputs (3 at a time); RSS error; Generate inputs for LUT; Generate output in LUT format.]

Triple = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_1^2 + a_5 x_2^2 + a_6 x_3^2 + a_7 x_1^3 + a_8 x_2^3 + a_9 x_3^3 + a_{10} x_1 x_2 + a_{11} x_1 x_3 + a_{12} x_2 x_3 + a_{13} x_1 x_2 x_3 + a_{14} x_1 x_2^2 + a_{15} x_1 x_3^2 + a_{16} x_1^2 x_2 + a_{17} x_1^2 x_3 + a_{18} x_2 x_3^2 + a_{19} x_2^2 x_3
Fig. A.1. Lookup Table (LUT) creation routines. Shown is the outer nesting loop (layers), the individual building block calls inside of
nesting loop (AllVars, Single, Double, Triple, Quad, Quint), Combinations of Variables loop inside the Triple building block, LUT
input generation routine, LUT output formatting routine, and the equation form of the Triple building block.
Fig. A.2. Number of Fragments (x 10) LUT using wavelength (0.3 to 11.5 microns), target mass (255 to 345 kg), and target area (0.1
to 0.5 sq. meters).
Fig. A.3. Minimum Fragment Size LUT using Rel. Sig. Val. (0.04 to 0.98), wavelength (0.3 to 11.5 microns), and target mass (255
to 344 kg).
Analytical Data Modeling for Equivalence Proofing and Anchoring
Simulations with Measured Data
Holger M. Jaenisch*a,b, James W. Handleya, and Michael L. Hicklenb
aJames Cook University, Townsville QLD 4811, Australia
bdtech Systems Inc., P.O. Box 18924, Huntsville, AL 35804
ABSTRACT
This paper presents a novel analytical method for comparing simulation data with measured data, with the goal of
proving Equivalence and Consistency between the model and the real data. Our method overcomes the problems of
disparity in the inputs to the simulation and varying number of parameters between the simulation and the measured
flight data. Our method derives analytical Data Models that are analyzed in frequency space, yielding quantitative
assessment values for the model performance relative to the measured data. The model output for a sensor and its “real-
world” measured data are collected for comparison.
Keywords: Formal Analysis, Data Modeling, V&V, Equivalence, Consistency, Transfer Function Modeling, Automatic Proofing
1. INTRODUCTION
Traditional software development focuses upon code verification at a relatively low level adequate for small to medium
scale sub-system and system level development. Technology that evaluates “higher level models” has been developed
and is effective for large software intensive systems; however, anchoring and calibration of these models must still be
done to measured data. We propose the use of Data Modeling of simulation input/output (I/O) data and measured data as
the basis for this comparison.
In this work, we show that simulation to measured data anchoring can be done without access to source code or even the
simulation itself. All that is required is the output of the simulation and the measured data it represents. The input data is
not required, because it is assumed to be common to the simulation and the measured data process. By setting the sum of
the inputs equal to one (1) for both the measured process and the simulated process, a common denominator for Transfer
Function (TF) modeling is established. This leaves the independent numerators to be mathematically compared.
2. TRANSFER FUNCTION DATA MODELING
2.1 Transfer function theory
Data Modeling1-4 using transfer functions can be used to model both input and output systems. The transfer function5,6 (TF) provides an algebraic representation of a linear, time-invariant filter in the frequency domain. The frequency domain for the transfer function is determined using the Fourier transform (FFT). The frequency space representation is separated into the real and the complex (imaginary) components
A_i = \mathrm{Re}(\mathrm{FFT}(y_i)) = \sum_{j=0}^{N-1} y_j \cos\left(\frac{2\pi i j}{N}\right) \qquad B_i = \mathrm{Im}(\mathrm{FFT}(y_i)) = \sum_{j=0}^{N-1} y_j \sin\left(\frac{2\pi i j}{N}\right)   (1)
where A are the real terms, B the complex (imaginary) terms, N the number of points, and both i and j are counters ranging from 0 to N-1. Each frequency term in (1) results in a cosine-sine pair in the analytical equation
y(x) = \sum_{j=1}^{n} \left( A_j \cos(\omega_j x) + B_j \sin(\omega_j x) \right)   (2)
which is the solution to the differential equation given by the derivative of (2), with the kth derivative given in (3)

*[email protected]; phone 1 256 337-3768

y^{(k)}(x) = \begin{cases}
\sum_{j=1}^{n} \omega_j^{k} \left( -A_j \sin(\omega_j x) + B_j \cos(\omega_j x) \right) & k \bmod 4 = 1 \\
\sum_{j=1}^{n} \omega_j^{k} \left( -A_j \cos(\omega_j x) - B_j \sin(\omega_j x) \right) & k \bmod 4 = 2 \\
\sum_{j=1}^{n} \omega_j^{k} \left( A_j \sin(\omega_j x) - B_j \cos(\omega_j x) \right) & k \bmod 4 = 3 \\
\sum_{j=1}^{n} \omega_j^{k} \left( A_j \cos(\omega_j x) + B_j \sin(\omega_j x) \right) & k \bmod 4 = 0
\end{cases}   (3)
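The A and B terms in (1) can be pulled directly from a standard FFT. The sketch below (illustrative Python, not the author's code) uses NumPy, whose sign convention places a minus on the sine sum, and cross-checks one coefficient against the direct sums in (1):

```python
import numpy as np

# Extract the real (A) and imaginary (B) frequency terms of eq. (1).
def fourier_terms(y):
    F = np.fft.fft(y)   # F_i = sum_j y_j * exp(-2*pi*1j*i*j/N)
    A = F.real          # A_i = sum_j y_j * cos(2*pi*i*j/N)
    B = -F.imag         # B_i = sum_j y_j * sin(2*pi*i*j/N)  (NumPy sign convention)
    return A, B

# Cross-check coefficient i = 1 against the direct sums in eq. (1).
N = 16
y = np.sin(2 * np.pi * np.arange(N) / N) + 0.5
A, B = fourier_terms(y)
j = np.arange(N)
assert np.allclose(A[1], np.sum(y * np.cos(2 * np.pi * 1 * j / N)))
assert np.allclose(B[1], np.sum(y * np.sin(2 * np.pi * 1 * j / N)))
```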
If derivative data is modeled with (2), it is necessary to integrate the result in order to obtain the analytical solution to the
differential equation. The kth integral of (2) is given by
\int^{(k)} y(x)\,(dx)^k = \begin{cases}
\sum_{j=1}^{n} \omega_j^{-k} \left( A_j \sin(\omega_j x) - B_j \cos(\omega_j x) \right) & k \bmod 4 = 1 \\
\sum_{j=1}^{n} \omega_j^{-k} \left( -A_j \cos(\omega_j x) - B_j \sin(\omega_j x) \right) & k \bmod 4 = 2 \\
\sum_{j=1}^{n} \omega_j^{-k} \left( -A_j \sin(\omega_j x) + B_j \cos(\omega_j x) \right) & k \bmod 4 = 3 \\
\sum_{j=1}^{n} \omega_j^{-k} \left( A_j \cos(\omega_j x) + B_j \sin(\omega_j x) \right) & k \bmod 4 = 0
\end{cases}   (4)
The transfer function is defined as the frequency domain representation of the output divided by the frequency domain
representation of the input. In equation form, this is given by
H(z) = \frac{Y(z)}{X(z)}   (5)
where H(z) is the transfer function, Y(z) the frequency domain representation of the output, X(z) the frequency domain
representation of the input, and both Y(z) and X(z) can be represented in the polynomial form
Y(z) = a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \cdots   (6)
which is written in terms of the single variable z and is analogous to the Kolmogorov-Gabor polynomial used extensively
in Data Modeling. The input is transformed directly along with the output and the ratio calculated. If required, Data
Modeling techniques described in previous work and included in the References can be used to interpolate or pad the
input and output number of points to be equal.
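A minimal sketch of (5) in Python (an illustration, with plain zero-padding standing in for the Data Modeling interpolation described above):

```python
import numpy as np

# Build H(z) = Y(z)/X(z) from an input/output pair, zero-padding the
# shorter series so both FFTs have the same length.
def transfer_function(inp, out):
    n = max(len(inp), len(out))
    X = np.fft.fft(inp, n)      # fft(..., n) zero-pads to length n
    Y = np.fft.fft(out, n)
    return Y / X                # assumes no zero bins in X

# Recover a known 3-tap response from a circular input/output pair.
rng = np.random.default_rng(0)
x = rng.normal(size=32)
h = np.array([0.5, 0.3, 0.2])
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, 32)))
H = transfer_function(x, y)
assert np.allclose(H, np.fft.fft(h, 32))
```

For a deterministic input, a bin of X can be exactly zero; a practical implementation would guard or regularize that division.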
Once the ratio in (5) is calculated, it can be inverse transformed back into the original space as a system response
function (SRF) given by
SRF_i = \mathrm{InvFFT}(TF) = \sum_{j} \left( A_j \cos\left(\frac{2\pi i j}{N}\right) + B_j \sin\left(\frac{2\pi i j}{N}\right) \right)   (7)
or it can be maintained in the frequency domain as a TF using A, B, and the frequencies (2πj/N) shown in (7). Using
eigenfunction Data Modeling and modified Wiener filtering, the number of frequency terms kept can be reduced,
yielding a more compact function form, and the transfer function is written in terms of the dominant eigenfunctions.
2.2 Eigenfunction modeling
Eigenfunction extraction uses a variation of Wiener filtering to identify the peaks in the dB power spectrum. These peaks
correspond to sine and cosine terms when using a discrete Fourier transform for spectral decomposition. To facilitate
peak identification and extraction, a model of the data set’s dB power spectrum is built. This is done using the Fourier
transform defined by (1). Once the Fourier transform is generated, the power spectrum of the data is calculated, which is
the sum of the squares of the real and imaginary terms in (1), and is then transformed into decibels (dB).
P_i = A_i^2 + B_i^2 \qquad P_i(\mathrm{dB}) = 10 \log_{10}\left(\frac{P_i}{\max(P)}\right)   (8)
Least squares regression is used to fit a straight line to the entire dB power spectrum of the form
y = mx + b   (9)
where m is the slope of the line and b is the y-intercept. Once the power spectrum is detrended and squared, a second fit using linear regression is performed, where the y-axis of the data to fit is the detrended power spectrum and the x-axis is simply an index variable. This y-intercept is used as a threshold: locations where the detrended power spectrum is greater than the y-intercept pass the filter unchanged in value from the original spectrum, while locations that are less than the y-intercept are set to zero. Once a threshold is applied to the frequency terms and the
non-peak values zeroed, each remaining dominant frequency term represents a single term in the final eigenfunction of
the data set, which is given in (2). Adjacent peaks are combined by leaving only the central (middle) peak of the group. Finally, the truncated components are normalized to the minimum and maximum of the full spectrum, thereby conserving entropy lost during peak removal.
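The thresholding procedure above can be sketched as follows (illustrative Python; the 1e-30 power floor and the synthetic two-tone test signal are assumptions, not from the paper):

```python
import numpy as np

# Wiener-filter-style eigenmode extraction: build the dB power spectrum,
# detrend it with a least-squares line, square the residual, and keep
# only bins above a regression-derived threshold.
def eigenmodes(y):
    F = np.fft.fft(y)
    P = F.real**2 + F.imag**2                    # power spectrum, eq. (8)
    P_db = 10 * np.log10(np.maximum(P, 1e-30) / P.max())
    x = np.arange(len(P_db))
    m, b = np.polyfit(x, P_db, 1)                # straight-line fit, eq. (9)
    detrended = (P_db - (m * x + b))**2          # detrend and square
    _, thresh = np.polyfit(x, detrended, 1)      # second fit; y-intercept = threshold
    keep = detrended > thresh
    return np.where(keep, F, 0.0), keep          # zero the non-peak bins

# A two-tone signal should keep only the dominant bins.
t = np.arange(64)
y = np.sin(2 * np.pi * 3 * t / 64) + 0.3 * np.sin(2 * np.pi * 9 * t / 64)
F_kept, keep = eigenmodes(y)
assert keep[3] and keep[61]          # fundamental and its conjugate survive
assert not keep[20]                  # noise-floor bin is zeroed
```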
2.3 Construction and use of transfer functions
When only an executable or examples of simulation runs are available to derive a Data Model, a slightly different
approach is used. If the number of example scenarios (Input/Output) is still fairly substantial, then the regression based
approach is still appropriate; however, in the case where very few cases are possible or even limited to just one test case
example, a completely different approach is necessary. For our example below (IRSim), we have only one example; therefore we apply predictive control theory and use the transfer function as the tool of choice.
Analytical comparison of measurement or flight data with test data is accomplished using the method of classical control
theory transfer functions (TFs) described in the previous section. Because transfer function models occur in frequency
space using complex variables, they are formally termed analytical methods. Simple TF models are derived from
input/output pairs of data. Independent TF models are built for both the simulation and for the test data using one data set
from the simulation and one set from the test data. The resultant TFs are Data Models that are compared using formal
mathematical methods of analysis.
A TF model is built by a simple ratio of Fast Fourier Transforms (FFTs) as shown in (5). The transfer function is
constructed by dividing the FFT of the output data set by the FFT of the input data set
TF = \frac{\mathrm{FFT}(Output)}{\mathrm{FFT}(Input)} = \mathrm{Deconvolution}   (10)
This division is called a deconvolution. Deconvolution removes the input data set from the output data set. The resulting
ratio of output FFT to input FFT is the TF model of the process. The TF model is used to transform (convolve) any input
data set into its related output data set, given by
NewOutput = \mathrm{InvFFT}\left(\frac{\mathrm{FFT}(Output)}{\mathrm{FFT}(Input)} \cdot \mathrm{FFT}(NewInput)\right)   (11)
If an inverse Fast Fourier Transform (InvFFT) is performed on the TF model in (5) or (10), the result is the creation of
the system response function (SRF) in (7). The system response function is a time-domain model that can then be spatially convolved with a new input to yield a prediction of the response output, or an FFT can be performed to transform it back into frequency space for convolution.
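Equations (10)-(11) and the SRF round trip can be sketched in a few lines (illustrative Python; the hidden 4-tap response and the use of circular convolution are assumptions made for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(1)
h = np.array([1.0, 0.5, 0.25, 0.125])            # a hidden 4-tap response

def circ_out(x):
    # circular convolution of x with h, done in the frequency domain
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, len(x))))

x_old = rng.normal(size=32)
y_old = circ_out(x_old)
TF = np.fft.fft(y_old) / np.fft.fft(x_old)       # eq. (10): deconvolution
srf = np.real(np.fft.ifft(TF))                   # eq. (7): system response function

x_new = rng.normal(size=32)
y_pred = np.real(np.fft.ifft(TF * np.fft.fft(x_new)))   # eq. (11)
assert np.allclose(y_pred, circ_out(x_new))      # TF predicts the new output
```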
The methods here analyze the form of the two independently derived differential equation models by comparing their
mathematical structure and the value of their coefficients to measure the Equivalence and Consistency between models,
and therefore between simulation and flight test data. Specifically, for comparison of simulation with flight test data, we
are interested in the structure and form of the TF model constructed from the simulation compared with the TF model
constructed from the flight test data. Since the TF models are comprised of unique modes mathematically expressed as
cosine and sine functions as shown in (2), they are readily transformed into analytical derivative terms as shown in (3).
The number of unique modes can be reduced by performing eigenfunction modeling and modified Wiener filtering, but
with or without eigenfunction modeling, the method yields an analytical differential equation form model of the transfer
function.
2.4 Application to measured flight test data
TF models are now generated for both simulation and test flight data, and their eigenfunction representations (if chosen)
determined. The eigenfunction TF models are compared for Equivalence and Consistency to characterize the similarities
and differences between the simulation and test flight data, with eigenfunction mode locations and amplitudes compared
to determine how alike or different the two models are. Equivalence is determined by calculating the correlation between
the magnitude of each of the TF models, and given by
\mathrm{Equivalence} = 100 \cdot \frac{\sum_{j=1}^{N}\left(Model1_j - \overline{Model1}\right)\left(Model2_j - \overline{Model2}\right)}{\sqrt{\sum_{j=1}^{N}\left(Model1_j - \overline{Model1}\right)^2 \sum_{j=1}^{N}\left(Model2_j - \overline{Model2}\right)^2}}   (12)

where Model1 and Model2 are the magnitudes of the TF models for the simulation and measured data given by

Model1_j = \sqrt{A_j^2 + B_j^2}   (13)

and \overline{Model1} and \overline{Model2} in (12) are the means of the result of (13).
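Read this way, Equivalence is 100 times the Pearson correlation of the two magnitude spectra, which can be sketched as (illustrative Python, not the author's code):

```python
import numpy as np

# Eq. (13): magnitude of each frequency term; eq. (12): 100x the
# Pearson correlation between the two magnitude spectra.
def equivalence(tf1, tf2):
    m1 = np.abs(tf1) - np.abs(tf1).mean()
    m2 = np.abs(tf2) - np.abs(tf2).mean()
    return 100 * np.sum(m1 * m2) / np.sqrt(np.sum(m1**2) * np.sum(m2**2))

a = np.fft.fft(np.sin(2 * np.pi * np.arange(16) / 16) + 0.2)
assert np.isclose(equivalence(a, a), 100.0)        # identical models
assert np.isclose(equivalence(a, 2 * a), 100.0)    # correlation is scale-invariant
```

Scale invariance is why a separate Consistency measure is needed: correlation only checks that the peaks rise and fall together.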
Consistency between Model1 and Model2 is calculated by first subtracting the mean from each frequency vector exactly
as shown in (12). Next, each vector is multiplied by a Mask of zeros and ones, with ones in the mask vector at the
location of an eigenfunction mode, and a zero at all other points. This mask converts the full model into the
eigenfunction model, and is given by
Mask1_j = \begin{cases} 0 & Model1_j = 0 \\ 1 & Model1_j \neq 0 \end{cases} \qquad Mask2_j = \begin{cases} 0 & Model2_j = 0 \\ 1 & Model2_j \neq 0 \end{cases}   (14)
Mean subtraction and mask multiplication are given in equation form as
Model1_j = \left(Model1_j - \overline{Model1}\right) Mask1_j\, Mask2_j \qquad Model2_j = \left(Model2_j - \overline{Model2}\right) Mask1_j\, Mask2_j   (15)
Next, each frequency term is converted into a Z-score using the standard deviation σ_j of the two frequency terms (Model1_j and Model2_j) at the jth location from (15), appended with a zero to make a total of three points, resulting in

Model1_j = \frac{Model1_j}{\sigma_j} \qquad Model2_j = \frac{Model2_j}{\sigma_j}   (16)
\mathrm{Consistency} = 100 \cdot \frac{\sum_{j=1}^{N} Mask1_j\, Mask2_j \left(1 - \mathrm{Norm}(d_j)\right)}{\sum_{j=1}^{N} Mask1_j\, Mask2_j} \qquad d_j = Model2_j - Model1_j   (17)
where Norm in (17) returns a normal probability on a distribution of mean zero and standard deviation of one, and the
argument dj for the probability calculation is given as shown.
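One plausible reading of (14)-(17) is sketched below (illustrative Python; taking Norm(d) as the probability that |Z| ≤ |d| on N(0, 1) is an assumption made so that identical models score 100%):

```python
import numpy as np
from math import erf, sqrt

# Mask to common eigenmode locations, z-score each surviving pair
# against the std of {Model1_j, Model2_j, 0}, and score 1 - Norm(d_j).
def consistency(m1, m2):
    mask = (m1 != 0) & (m2 != 0)          # eq. (14): common eigenmode locations
    d1 = m1 - m1.mean()                   # eq. (15): mean subtraction
    d2 = m2 - m2.mean()
    num, den = 0.0, 0.0
    for a, b in zip(d1[mask], d2[mask]):
        sigma = np.std([a, b, 0.0])       # eq. (16): pair appended with a zero
        d = (b - a) / sigma if sigma > 0 else 0.0
        num += 1.0 - erf(abs(d) / sqrt(2.0))   # 1 - P(|Z| <= |d|), an assumption
        den += 1.0
    return 100.0 * num / den if den else 0.0

m = np.array([0.0, 5.0, 0.0, 3.0, 1.0])
assert np.isclose(consistency(m, m), 100.0)   # identical models are fully consistent
m2 = np.array([1.0, 0.0, 2.0, 0.0, 0.0])
assert consistency(m, m2) == 0.0              # no shared modes: nothing to score
```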
If known inputs are available, the input vector is created for each model. If the input vector is only available from the
simulation, then it is assumed to be the same for both the simulation and the measured data. If the measured data
assumes input in different dimensions (number of parameters and units of those parameters), then the input vector can be
transformed into a common space by using a transfer function to do so. Also, this method works well even in cases
where the number of inputs in Model 1 and Model 2 are different. An example of this is given in Figure 1, which shows
a comparison using a notional data set where the number of inputs for the first data set is 10 and for the second set the
number of inputs is 4. The inputs and outputs are varied to show that the totally overlapping case (outputs closely match
and inputs in the same locations in different sized input vectors) flags at 100% Equivalence and Consistency, and
degrades gracefully to a non-matching case of 0% Equivalence and Consistency.
[Figure 1 panels: Input 1 (10 inputs), Input 2 (4 inputs), Outputs 1&2, and Common Eigenfunction Modes for three cases, with Equivalency/Consistency of 100%/100%, 50%/42%, and 0%/0%.]

Fig. 1. Comparison of Equivalency and Consistency.
3. EXAMPLES
3.1 Simulation data compared to measurements
In our case, the inputs to both IRSim and the real-world data process are assumed to be the same, but unknown to us. We used an index array for the denominator ranging from zero to the number of measurements minus 1. We now proceed to derive
transfer function models as Fourier series representations for IRSim and for the measured data. Once created, the two TF
models are then analyzed to see how alike the two Fourier series are. As shown in Section 2, Equivalence is the
correlation between the two Fourier transforms, which measures if the peaks go up and down at the same time.
Consistency measures actual ratios of the value of the peaks to measure equivalency of the Fourier series coefficients.
Once these models are derived, analytical methods such as the Method of Derivatives can be applied to isolate critical parameters or settings and perform sensitivity analysis. Likewise, the difference between the simulation and the real data
can be resolved by deconvolution of the two models in frequency space. This isolates the high frequency component not
captured by IRSim and also provides an analytical model for the difference that could now be included in IRSim as a
correctional factor if desired.
This simple method for achieving critical parameter estimation without resorting to Design of Experiments7-10 (DOE) is
only possible because we have an analytical model for the system and can thereby exploit the Method of Derivatives and
Jacobian analysis. Jaenisch has proposed an abbreviated form of this method for estimating Critical, Sensitive, and Key
parameters. This method uses an abbreviated TF consisting of the first order (mean) term plus the largest magnitude
eigenmode and its complex conjugate. When multiplying the FFT of the input by this abbreviated TF, the TF acts as a
notch or single band-pass filter on the input, which reduces the input variation space to the real and complex frequency
terms of the corresponding single eigenmode of the input. This reduction provides a method for then constructing a 2
parameter plot using the real frequency term on the x-axis and the complex part of the frequency term on the y-axis.
Minimum and maximum for the axes are determined by applying frequency space perturbations to the single eigenmode
in the original input.
Parameter plots are then constructed by varying the real and complex part of the input eigenmode between minimum and
maximum, constructing the corresponding controlled single eigenmode input along with its mean and corresponding
complex conjugate. The input is then convolved with the TF and a new output obtained. Baseline input and output
vectors are calculated by applying an inverse FFT to the abbreviated form of the original input vector and its
corresponding output. For one parameter plot, change in input and output from baseline are captured by summing the
difference between the derived and original input or output.
The authors were provided with two measured infrared (IR) signature histories and their one corresponding IRSim
generated signature. Figure 2 shows the two measurement signature histories: one for the notional solid fuel rocket
booster and the other for the notional liquid fuel rocket booster.
[Figure 2: IR signature intensity versus time for the solid and liquid fuel rockets.]

Fig. 2. Graph of measured data for solid fuel rocket and liquid fuel rocket.
3.2 Solid Rocket Model versus Measured Signature
Using the methods described in Section 2, the TF model between IRSim and the measured data for the solid rocket motor (17 outputs) was created and compared for Equivalence and Consistency. The left graph in Figure 3 shows graphically the output vector for both the measured and simulated data, and the right graph shows the differences between them, since by "chi-by-eye" (χ) inspection of Figure 2 they appear to be equal.
[Figure 3 panels: left, IR signature intensity for the measured (sig1meas) and simulated (sig1sim) data; right, their difference.]

Fig. 3. Graph of IRSim data and measured data for the solid rocket motor (left), and their differences (right).
Once TFs are built for both the measured data and IRSim, they are then graphically displayed using the magnitude of the
frequency terms in each model as bars on a bar chart as shown in Figure 4.
[Figure 4: bar charts of the TF model frequency-term magnitudes for IRSim and the measured data.]

Fig. 4. Case 1 TF model comparison of IRSim data and measured data.
These TF models are then compared using Equivalence and Consistency as described in Section 2 to yield an
Equivalence of 99.98% and a Consistency of 85.47%, which is consistent with the data shown in Figure 3. This results in
the differential equations for IRSim TF model given in (18) and the measurement TF model given in (19).
\frac{d^4 Model1(x)}{dx^4} = \left(\frac{\pi}{17}\right)^4 \big( 2.005\sin(2\pi x) + .203\cos(2\pi x) + 1.038\sin(4\pi x) - 4.724\cos(4\pi x) - 4.107\sin(6\pi x)
- 1.02\cos(6\pi x) + .348\sin(8\pi x) + .06\cos(8\pi x) - 4.595\sin(10\pi x) - .14\cos(10\pi x) + 2.964\sin(12\pi x)
+ 6.684\cos(12\pi x) + 5.334\sin(14\pi x) - 5.551\cos(14\pi x) - 4.48\sin(16\pi x) - 3.2\cos(16\pi x) \big)   (18)

\frac{d^4 Model2(x)}{dx^4} = \left(\frac{\pi}{17}\right)^4 \big( 2.013\sin(2\pi x) + .215\cos(2\pi x) + 1.014\sin(4\pi x) - 4.686\cos(4\pi x) - 3.963\sin(6\pi x)
- 0.984\cos(6\pi x) + .328\sin(8\pi x) - .14\cos(8\pi x) - 4.715\sin(10\pi x) + .055\cos(10\pi x) + 3.294\sin(12\pi x)
+ 6.714\cos(12\pi x) + 5.362\sin(14\pi x) - 6.062\cos(14\pi x) - 4.72\sin(16\pi x) - 2.48\cos(16\pi x) \big)   (19)
To verify the differential equation results, each one is passed through a Fourth Order Runge-Kutta (RK4) integrator to solve the differential equation. The results for these two cases are shown in Figure 5 (left and center), where the curves lie on top of each other and are indistinguishable as before. Figure 5 (right) is provided to show the small differences between the two differential equation results.
Fig. 5. Graph of integrated results for the differential equation model in (18) overlaying simulated data (left); the differential equation
model in (19) overlaying measured data (center); and difference graph (right).
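A classical RK4 step of the kind used here can be sketched as (illustrative Python; the test ODE y' = y is an arbitrary stand-in, not eq. (18)):

```python
import numpy as np

# Classical fourth-order Runge-Kutta integration of y' = f(x, y).
def rk4(f, x0, y0, x_end, n_steps):
    xs = np.linspace(x0, x_end, n_steps + 1)
    h = (x_end - x0) / n_steps
    y, ys = y0, [y0]
    for x in xs[:-1]:
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y = y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        ys.append(y)
    return xs, np.array(ys)

# Sanity check: y' = y, y(0) = 1  gives  y(1) = e.
_, ys = rk4(lambda x, y: y, 0.0, 1.0, 1.0, 100)
assert abs(ys[-1] - np.e) < 1e-6
```

A fourth-order system such as (18) would be integrated the same way after rewriting it as a first-order vector ODE.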
3.3 Determination of Extra Significant Frequency Components in Measurement Data
While comparing the IRSim TF model and the measured data TF model for the solid rocket, it was found by taking the
eigenfunction of the difference that two significant frequency components existed in the measured data that were not
present in the simulation data as shown in Figure 6. Two methods can be used to correct for this difference. One is to
build a TF of the solid rocket simulated data into the solid rocket measurement data that can then be used with new
simulated data to yield the equivalent measurement data. The second method is to construct an additive model of the
extra terms that can then be added to new simulated data runs to convert them into equivalent measurement data cases.
[Figure 6: magnitude of the eigenfunction of the difference versus cycles (0 to 8); the eigenmodes form the correction factor.]

Fig. 6. High frequency term (right most bar) shows frequency difference that existed in the measured data that was not found in the IRSim data. Plot is of the magnitude of the differences.
To correct for the difference between the models, we first build a TF model that is the ratio of the measured data TF
model to the IRSim TF. This is given in equation form in (20) and in differential equation form in (21).
Controller = \frac{TF_{Measured}}{TF_{IRSim}}   (20)
\frac{d^4 Controller(x)}{dx^4} = \left(\frac{\pi}{17}\right)^4 \big( 1.005\sin(2\pi x) - .006\cos(2\pi x) + 1.982\sin(4\pi x) + 0.006\cos(4\pi x)
+ 2.895\sin(6\pi x) + .0002\cos(6\pi x) + 3.392\sin(8\pi x) + 2.192\cos(8\pi x) + 5.125\sin(10\pi x) + .215\cos(10\pi x)
+ 6.132\sin(12\pi x) + .24\cos(12\pi x) + 7.35\sin(14\pi x) + .3\cos(14\pi x) + 7.672\sin(16\pi x) + 1.056\cos(16\pi x) \big)   (21)
By inputting the current IRSim output into this TF model, the result is the transformation of the output simulation data
into representative measurement data. This model can be used either to post-process IRSim output, or can be made
available for integration directly into IRSim.
As a second correction method, we build an additive model that can be used instead of the TF method given in (20) and
(21), and whose result is added to the simulated data. This model captures the two frequency terms given in Figure 6 and
is given in differential equation form as
\frac{d^4 Corrector(x)}{dx^4} = \left(\frac{\pi}{17}\right)^4 \big( .016\sin(8\pi x) - .148\cos(8\pi x) + .176\sin(16\pi x) + .536\cos(16\pi x) \big)   (22)
3.4 Solid Rocket Measured Data versus Liquid Rocket Measured Data
To demonstrate how the Equivalence and Consistency metrics behave for data sets that are not alike, we compare the
measured data from the solid rocket with the measured data from liquid rocket (10 outputs), both of which are shown in
Figure 2 graphically, and whose TF models are given in Figure 7. In order to perform the comparison, the shorter 10
output liquid rocket case was zero padded to be equal length with the 17 output solid rocket case.
[Figure 7: bar charts of the TF model frequency-term magnitudes for measured data sets 1 and 2.]
Fig. 7. TF models for measured data sets 1 and 2. Note differences occur after the first two bars.
The Equivalence and Consistency measures for this comparison were 12.46% and 4.58%, respectively, which shows a
low but non-zero Equivalence and Consistency between the two rocket measurement sets as expected. Table 1 is a rollup
summary of Equivalence and Consistency for the like-data comparison (solid rocket model versus measurement) and the
different data comparison (solid rocket measurement versus liquid rocket measurement).
               Comparison of Solid Rocket    Comparison of Solid and Liquid
               Model and Measurement         Rocket Measurement Sets
Equivalence    99.98%                        12.46%
Consistency    85.47%                        4.58%

Table 1. Equivalence and Consistency Summary for Case 1 and Case 3.
3.5 IRSim Data Model
Using the signature data provided in the two examples, a full Data Model of IRSim was generated. This TF DM uses the two IR data signatures shown previously to generate a composite or prototype (average of the two in frequency space). Once the prototype is created, equivalent inputs are solved for using the mathematical relationship
EquivalentInput = \mathrm{InvFFT}\left(\frac{1}{\mathrm{FFT}(prototype)} \cdot \mathrm{FFT}(Output)\right)   (23)
Equivalent inputs are derived for both original signatures (solid and liquid rocket), and the FFT of the prototype is now
used as the TF. Since the FFT of the prototype was used to solve for the equivalent inputs, multiplying the new TF by the
FFT of the inputs will yield the original boundary cases. Additionally, by passing input vectors where each element is
constrained to be within the minimum and maximum of the element values from the solid rocket and liquid rocket
(boundary cases), new valid cases are created. The differential equation for this TF model is given by
\frac{d^4 TFModel(x)}{dx^4} = \left(\frac{\pi}{17}\right)^4 \big( -1.046\sin(2\pi x) + 1.4\cos(2\pi x) + 1.274\sin(4\pi x) + 2.16\cos(4\pi x)
+ 1.269\sin(6\pi x) + .423\cos(6\pi x) + 1.564\sin(8\pi x) + .452\cos(8\pi x) + 1.52\sin(10\pi x) - .77\cos(10\pi x)
+ .048\sin(12\pi x) - 1.626\cos(12\pi x) - .595\sin(14\pi x) + .336\cos(14\pi x) + .12\sin(16\pi x) - .348\cos(16\pi x) \big)   (24)
and an example of output is shown graphically in Figure 8. This was constructed using 17 notional inputs, whose minimum and maximum values are given in Table 2. Also, if actual input variables are known, regardless of their number they can be transformed into this equivalent set of 17 notional inputs, again using a TF of the form
New17Inputs = \mathrm{InvFFT}\left(\frac{\mathrm{FFT}(17DerivedInputs)}{\mathrm{FFT}(ActualInputs)} \cdot \mathrm{FFT}(NewActualInputs)\right)   (25)
[Figure 8: IR signature intensity for Case 1, Case 2, the prototype, and a new output.]

Fig. 8. Sample output from the IRSim Data Model.
          Minimum   Maximum              Minimum   Maximum
Input 1   3.435     4.812    Input 10   -2.945    2.945
Input 2   -1.532    1.532    Input 11   -1.86     1.86
Input 3   -0.46     0.46     Input 12   -3.028    3.028
Input 4   -1.225    1.225    Input 13   -2.035    2.035
Input 5   -1.307    1.307    Input 14   -1.876    1.876
Input 6   -1.166    1.166    Input 15   -0.087    0.087
Input 7   -0.486    0.486    Input 16   -0.175    0.175
Input 8   -0.532    0.532    Input 17   -1.444    1.444
Input 9   -1.177    1.177

Table 2. Ranges for 17 notional inputs for the IRSim Data Model.
REFERENCES
1. Jaenisch, H.M., Algorithms Enabling Unmanned Deep Space Probes, D.Sc. Thesis, James Cook University (2005-2006).
2. Jaenisch, H.M., and Handley, J.W., "Data Modeling for Radar Applications", Proceedings of the IEEE Radar Conference 2003 (May 2003).
3. Jaenisch, H.M., and Handley, J.W., "Automatic Differential Equation Data Modeling for UAV Situational Awareness", Society for Computer Simulation, Huntsville Simulation Conference 2003 (October 2003).
4. Jaenisch, H.M., Handley, J.W., and Faucheux, J.P., "Data Driven Differential Equation Modeling of fBm Processes", Proceedings of SPIE, San Diego, CA (August 2003).
5. Rosko, J.S., Digital Simulation of Physical Systems, Reading, MA: Addison-Wesley, 1972.
6. Barnett, S., and Cameron, R.G., Introduction to Mathematical Control Theory, 2nd ed., Oxford: Clarendon Press, 1993.
7. Doyle, F.J., Pearson, R.K., and Ogunnaike, B.A., Identification and Control Using Volterra Models, London: Springer-Verlag, 2002.
8. Jones, R.B., Symbolic Simulation Methods for Industrial Formal Verification, Boston, MA: Kluwer Academic Publishers, 2002.
9. Taguchi, G., and Jugulum, R., The Mahalanobis-Taguchi Strategy: A Pattern Technology System, New York: John Wiley and Sons, 2002.
10. Murphy, G., Similitude in Engineering, New York: Ronald Press Company, 1950.
Data Modeling Change Detection of Inventory Flow and Instrument
Calibration
Holger M. Jaenisch, James W. Handley, and Louis A. Bonham
Sparta Inc., 4901 Corporate Drive, Suite 102, Huntsville, AL 35805
ABSTRACT

Data Modeling is a new mathematical tool for logistics research. It enables reference points to be established for complex dynamic processes such as inventory control and instrument calibration. This paper presents an overview of Data Modeling and how it can be used to achieve mathematical models that are a basis of comparison for detecting significant changes either in inventory flow or instrument calibration.
1. INTRODUCTION
1.1 Inventory Flow

Data Modeling for inventory flow and control involves the creation of a mathematical equation such as

y' = f(V, \Delta V, PO, PD)   (1)
that describes the relationship between inventory levels and their usage as measured by volume (V), change in volume (∆V), point of origin (PO), or point of departure (PD). Simple metrics that are part of traditional logistics analysis of inventory levels can be monitored autonomously to give a new level of self-awareness to business as usual for complex operations. For example, keeping munitions and supplies at the proper levels between storage points and required field or transition points requires careful orchestration of delivery schedules, volumes, and queuing time arrival events.

If the intricacies contained in the complex inventory flow system or process can be adequately modeled, the model can be used as a diagnostic and predictor for assessing when inventory flow has changed in an off-nominal fashion. Such change detection (novelty detection) would enable monitoring and detection of inappropriate allocations of resources and would potentially generate derivative indicators of impending critical events and disasters.
1.2 Inventory Abnormality

The time directly preceding possible escalations of hostilities frequently results in an increase in the requisition of certain materials or the unusual storage of quantities of certain items. This change in the flow of business as usual is very difficult to detect while it is happening, because this form of anomalous behavior is usually spread out over time and across many nodes in the inventory flow structure. If an accurate mathematical model of the nominal flow characteristics of the inventory site were available for comparison, then changes in the underlying dynamics would be easily seen and would draw the logistic analyst's attention.
1.3 Instrument Response Modeling

Aircraft use sophisticated instruments in the form of Improved Data Modems (IDM), flight recorders, transponders, radar, communication links, leveling instrumentation, and flight and attitude controllers. Whenever one of these instruments is suspected of being out of calibration, it is removed from the aircraft and returned to the manufacturer for recalibration and testing. This process is both costly and lengthy, and is done many times unnecessarily. Data Modeling can be used to eliminate the unnecessary removal and recalibration of an instrument. Using Data Modeling, the response of a newly calibrated instrument to a known stimulus for different instrument settings can be monitored and recorded during the early history of its useful life or whenever the instrument is known to be functioning adequately. These stimuli need not be the same as those prescribed by the manufacturer for use during test bench calibration. Instead, the stimulus can be something that is readily available for use in the field.
These settings and responses are used to generate a tuned equation Data Model that associates
these instrument readings and nominal variations to a nominal output. This Data Model captures the performance curve of the instrument, and is sensitive to any changes or variations. Once generated, the Data Model can then be used to determine at any point and time in the future if the
current instrument is within nominal bounds, or if the instrument is now out of calibration and must be removed from the device and returned to the manufacturer.
2. DATA MODELING
2.1 Introduction

Data Modeling is the process of finding a mathematical expression that provides a good fit between given finite sample values of the independent variables and the associated values of the dependent variables of the process. The mathematical expression that fits the given sample of data is called a Data Model. This process involves finding both the functional form of the Data Model and the numeric coefficients for the Data Model. Data Modeling differs from conventional methods such as linear, quadratic, and polynomial regression techniques that only solve for the numerical coefficients for a function whose form is specified ahead of time. Data Modeling also differs from genetic algorithms and symbolic regression because it incorporates fractal interpolation functions for achieving mathematical modeling of complex multivariable processes. The Data Modeling process of discovering automatic functions is an evolutionary and genetic directed approach where the set elements consist entirely of simple polynomial building blocks. The final form of the Data Modeling equations and the polynomial order shown in (2) are solutions to the Kolmogorov-Gabor polynomial with n representing the number of layers in the final Data Model and x(bi(t)) the inputs mapped from the previous layer.
x(t) = f\left(t, x(b_1(x(b_2(\cdots x(b_n(t))\cdots))))\right), \qquad O\left[x(t)\right] = 3^n   (2)

Data Model polynomials on the order of ~O[3^10] have been generated within minutes which adequately capture high degrees of nonlinearity but still execute in real-time.
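One layer of such a polynomial build-up can be sketched as (illustrative Python; the full-quadratic building block and least-squares fit are a generic GMDH-style stand-in, not the author's Data Modeling code):

```python
import numpy as np

# One Kolmogorov-Gabor style layer: a full quadratic polynomial in two
# inputs, fit by least squares; its output would feed the next layer.
def quad_terms(a, b):
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])

def fit_layer(a, b, target):
    coeffs, *_ = np.linalg.lstsq(quad_terms(a, b), target, rcond=None)
    return quad_terms(a, b) @ coeffs, coeffs

rng = np.random.default_rng(2)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
y = 1.0 + 2.0 * x1 * x2 + 0.5 * x2**2      # a process one layer can represent
y_hat, _ = fit_layer(x1, x2, y)
assert np.allclose(y_hat, y)
```

Stacking n such layers gives the nested form in (2), with polynomial order growing as 3^n.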
2.2 Change Detection

The power of statistical quality control charts is the ability to flag when a dynamic process such as inventory flow or instrument response is not in statistical control. One commonly used type of control chart is the X-Bar Chart defined by the upper control limit (UCL) and the lower control limit (LCL)

UCL = \bar{X} + A_2 \bar{R} \qquad LCL = \bar{X} - A_2 \bar{R}   (3)
The X-Bar chart plots how far away from the average value of a process the current measurement falls. According to Shewhart, values that repeatedly fall outside of three standard
deviations of the average have assignable causes and are labeled as being out of statistical control (off-nominal).
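The X-Bar limits in (3) can be sketched as (illustrative Python; A2 = 0.577 is the standard table constant for subgroups of size 5):

```python
import numpy as np

# Eq. (3): X-bar control limits from subgroup means and ranges.
def xbar_limits(subgroups, A2=0.577):
    means = subgroups.mean(axis=1)
    ranges = subgroups.max(axis=1) - subgroups.min(axis=1)
    xbar, rbar = means.mean(), ranges.mean()
    return xbar - A2 * rbar, xbar + A2 * rbar   # (LCL, UCL)

rng = np.random.default_rng(3)
data = rng.normal(10.0, 1.0, size=(20, 5))      # 20 subgroups of 5 measurements
lcl, ucl = xbar_limits(data)
assert lcl < 10.0 < ucl                          # in-control process mean stays inside
```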
2.3 Calibration on Demand (COD)

A Data Model of a complex process in statistical control can be used to generate a lookup table (LUT), which is analogous to the X-Bar chart. Such a Data Model is built using only nominal cases. Tuning makes the model sensitive to feature values that fall outside the training boundaries. Whenever these new features fall outside of nominal boundaries, tip-off is flagged. Combinations of incoming feature values that are not anticipated or that are within valid ranges but in combinations not seen before can be instantly detected with this approach. This enables calibration to be specified and monitored dynamically or on demand (COD). The lookup table trial limits are set using the polynomial output from the training data. The upper and lower boundaries are assigned to be one-half (½) of a standard deviation (ε) above and below the mean value of the polynomial output from the training data.
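The lookup-table test can be sketched as (illustrative Python; the nominal training values are made up for the demonstration):

```python
import numpy as np

# COD limits: one-half standard deviation around the mean of the Data
# Model's output on nominal training data.
def cod_limits(train_outputs):
    mu, eps = train_outputs.mean(), train_outputs.std()
    return mu - eps / 2, mu + eps / 2

def tip_off(value, limits):
    lo, hi = limits
    return value < lo or value > hi        # outside the band flags tip-off

nominal = np.array([1.0, 1.1, 0.9, 1.05, 0.95])
limits = cod_limits(nominal)
assert not tip_off(1.0, limits)            # the nominal mean stays inside
assert tip_off(2.0, limits)                # a far-off value flags tip-off
```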
Fig. 1 illustrates how either inventory flow sensor measurements or instrument responses are converted into features and analyzed to detect if a change has occurred in system status from
nominal to yield tip-off conditions from the model.
[Figure: measurement data (inventory flow sensors or instrument responses) is converted into a data series and its PDF, rolled up into statistics (standard deviation, skewness, kurtosis), standardized to Z-scores, and fed through the Data Model polynomial T = w0 + w1 x1 + w2 x2 + w3 x3 + w4 x1^2 + w5 x2^2 + w6 x3^2 + w7 x1^3 + w8 x2^3 + w9 x3^3 + w10 x1 x2 + w11 x1 x3 + w12 x2 x3; the equation output is compared against sigma bounds to label each case Nominal or Tip-Off.]
Fig. 1. Calibration on Demand using Data Modeling.
COD can monitor individual sensor measurements, feature vector elements, and overall system snapshots for change detection. Individual sensor measurements provide a characterization of system level calibration states.
3. APPLICATIONS
3.1 Inventory Flow and Monitoring
An inventory flow scenario was created using the Dynamic Inventory Model (DIM) and associated inputs given in the Appendix that simulates an inventory problem on a period-by-period basis. This inventory model was run using two different products stored at a single
location (e.g., warehouse). Two different customers that were the end users of these two products were also modeled ordering inventory from this storage location. These orders were placed and
filled along with inventory reordering and backordering by the storage location for a total of 256 time periods. During each of the 256 time periods, the number of orders received, inventory, quantity of sales lost, and quantity of units sold was recorded. These quantities were recorded on
a customer by product basis, and summed in order to determine the overall inventory flow, number of orders, number of sales lost, and number of units sold for the storage location. The
goal was to construct a LUT picture of nominal inventory flow conditions and to predict what off-nominal conditions might be.
Our approach does not directly use the inventory feature values as inputs into the Data Model. Instead, each of the 256 nominal feature examples was rolled up, thereby characterizing the
feature distribution independently. This was done for each nominal example using the following parametric and non-parametric statistics where N=4, x1 is inventory flow, x2 is the number of orders, x3 is the sales lost, and x4 the quantity of units sold, and the Range is the difference
between the highest and lowest xj.
$$\bar{x} = \frac{1}{N}\sum_{j=1}^{N} x_j \quad (4) \qquad \sigma^2 = \frac{1}{N-1}\sum_{j=1}^{N}\left(x_j - \bar{x}\right)^2 \quad (8)$$

$$\mathrm{Skew} = \frac{1}{N}\sum_{j=1}^{N}\left(\frac{x_j - \bar{x}}{\sigma}\right)^3 \quad (5) \qquad \mathrm{Kurt} = \frac{1}{N}\sum_{j=1}^{N}\left(\frac{x_j - \bar{x}}{\sigma}\right)^4 - 3 \quad (9)$$

$$M_6 = \frac{1}{N}\sum_{j=1}^{N}\left(\frac{x_j - \bar{x}}{\sigma}\right)^6 - 15 \quad (6) \qquad M_8 = \frac{1}{N}\sum_{j=1}^{N}\left(\frac{x_j - \bar{x}}{\sigma}\right)^8 - 105 \quad (10)$$
$$J = \lim_{\mathrm{Res}\to 0}\ \min_i \frac{\log N_{J_i}}{\log(1/\mathrm{Range})} \quad (7) \qquad D_f = 1 + \log\!\left(1 + \frac{1}{N}\sum_{j=1}^{N-1}\left(\frac{x_{j+1}-x_j}{\mathrm{Range}}\right)^2\right) \quad (11)$$
Once the rollup features in Eq. (4) through Eq. (11) were calculated, a Data Model was tuned to
$$f(\bar{x}, \sigma, \mathrm{Skew}, \mathrm{Kurt}, M_6, M_8, J, D_f) = \tfrac{1}{2} \pm \varepsilon \quad (12)$$
where ½ is the target regression value and ε is the variation of the regression fit about ½. Once
the equation is trained and tuned, boundaries are placed on either side of the target regression
value that characterizes the fluctuation ε of the polynomial about ½. The Data Model generated
is shown in Fig. 2, and the boundaries that characterize the polynomial fluctuation were set at 0.4992 and 0.5010.
Fig. 2. Data Model of the Dynamic Inventory Model (DIM).
Fig. 2 also contains a Decision Map (LUT) generated using the Data Model. The white box represents the training range of the nominal cases. From the Decision Map, only the features shown on the x-axis need to be monitored in order to determine if the inputs will produce
nominal or off-nominal inventory flow, since the nominal region is bounded to a narrow band along the x-axis but extends along the full range of values for the y-axis. This boundary is
marked on the Decision Map and represents the values 0.33 < J < 0.37, 1.43 < Df < 1.45, 12.3 <
x < 16.3, and 16.1 < σ < 22.7. This model now predicts what inventory statistics would fall outside of the boundary, indicating unusual inventory dynamics that demand scrutiny by the analyst.
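The Decision Map boundary check above reduces to a box test on the four monitored features. A sketch using the quoted bounds (the function name and dictionary layout are ours):

```python
# Nominal training-range bounds read off the Fig. 2 Decision Map.
NOMINAL_BOX = {
    "J":     (0.33, 0.37),
    "Df":    (1.43, 1.45),
    "xbar":  (12.3, 16.3),
    "sigma": (16.1, 22.7),
}

def inventory_nominal(features):
    # Off-nominal (tip-off) if any rollup feature leaves its trained range.
    return all(lo <= features[k] <= hi for k, (lo, hi) in NOMINAL_BOX.items())
```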
3.2 Instrument Calibration

To demonstrate instrument calibration, the MiniOx oxygen analyzer with an accuracy of plus or
minus 1% was used. Since this type of oxygen sensor is based on measuring oxygen partial pressures (PPO2), the manufacturer specifies that concentrations of 100% oxygen (EAN100) and
20.9% oxygen (air or EAN21) be used for calibrating these instruments at a constant flow rate. The instrument is first connected to the EAN100 source, and once the sensor response has come to equilibrium, the instrument is then adjusted until the reading is 100%. If the instrument cannot
be adjusted so that the reading is 100%, then the sensor should be replaced. Once the reading is adjusted, the instrument is disconnected from the EAN100 source, exposed to EAN21, and the
reading noted. If the sensor reading does not return to indicating between 20-22%, then the sensor should be replaced. As the sensor nears the end of its useful life, it is possible to calibrate the sensor to 20.9% in air, but it may read only as much as 85% in 100% oxygen. This makes it
necessary to check the sensor periodically at both EAN100 and EAN21. However, in field applications EAN100 is not always available for calibration of oxygen sensors and analyzers. For
this reason, it is not always possible to explicitly follow the manufacturer’s procedures to check
instrument calibration in the field. However, EAN21 is generally readily available at varying pressures or flow rates in the form of SCUBA cylinders or tire fill stations.
3.2.1 Performance Curve Modeling
Using Data Modeling, the performance curve of a new or in-calibration oxygen analyzer instrument can be modeled using varying flow rates of air as a stimulus. Using a flow meter, known flow rates of air can be applied to the calibrated sensor and the oxygen percentage or
voltage output from the sensor recorded. From this, a tuned equation based Data Model is constructed that can be used in the field using the same flow rates. All that is then required in the
field to determine if the instrument or sensor is still calibrated is a flowing air source to yield the constant flow rates. Alternatively, a flow meter to regulate the flow of air across the sensor in a quantified manner may be used.
The sensor exhibits a small hysteresis effect that slightly elevates the reading of oxygen and then
rebounds to a low value. We propose to characterize this hysteresis curve for a calibrated sensor and then compare to this model in the field to determine instrument calibration based solely on airflow induced perturbations to shift the hysteresis curve.
To do this, an oxygen analyzer sensor that was in calibration was characterized. Ten different
measurement realizations were recorded as shown in Fig. 3, consisting of 1) Flow rate in liters per minute (L/min), 2) the power on equilibrium oxygen sensor percentage readings, 3) instrument oxygen percentage reading with air flowing across the sensor, and 4) final oxygen
sensor percentage reading (including any hysteresis effects) once the airflow was removed from the sensor. Using the power on instrument readings as a baseline, the amount of change in the
instrument readings while air was flowing across the sensor from the initial power on value and the change in the final readings from the power on values (thereby capturing the sensor hysteresis) were calculated.
Calibrated sensor:

Flow Rate (L/min) | %O2 Power ON (P) | %O2 AirFlow (A) | %O2 Final (R) | A-P   | R-P (Dr)
6.0               | 20.4             | 20.2            | 19.9          | -0.2  | -0.5
6.0               | 20.9             | 21.0            | 20.6          |  0.1  | -0.3
5.9               | 20.5             | 19.3            | 19.6          | -1.2  | -0.9
5.75              | 20.0             | 20.2            | 20.3          |  0.2  |  0.3
5.75              | 20.1             | 20.2            | 20.0          |  0.1  | -0.1
4.6               | 20.5             | 20.55           | 20.3          |  0.05 | -0.2
4.5               | 20.4             | 20.5            | 20.3          |  0.1  | -0.1
4.5               | 20.5             | 20.6            | 20.4          |  0.1  | -0.1
4.5               | 20.5             | 20.6            | 20.5          |  0.1  |  0.0
4.5               | 20.5             | 20.6            | 20.4          |  0.1  | -0.1

Rollup of all calibrated %O2 measurements: Mean = 20.3, Standard Deviation = 0.34

+6% out of calibration sensor:

Flow Rate (L/min) | %O2 Power ON (P) | %O2 AirFlow (A) | %O2 Final (R) | A-P   | R-P (Dr)
6.0               | 22.2             | 22.3            | 22.15         |  0.1  | -0.05
6.0               | 22.3             | 22.5            | 22.1          |  0.2  | -0.2
5.75              | 22.0             | 21.65           | 21.5          | -0.35 | -0.5
5.75              | 21.6             | 21.4            | 21.2          | -0.2  | -0.4
5.75              | 22.3             | 22.4            | 22.3          |  0.1  |  0.0
4.8               | 22.0             | 21.9            | 21.2          | -0.1  | -0.8
4.8               | 21.4             | 21.6            | 21.3          |  0.2  | -0.1
4.75              | 21.75            | 21.95           | 21.9          |  0.2  |  0.15
4.7               | 22.3             | 22.4            | 22.2          |  0.1  | -0.1
4.5               | 22.4             | 21.85           | 21.6          | -0.55 | -0.8

Rollup of all out of calibration %O2 measurements: Mean = 21.9, Standard Deviation = 0.40
Fig. 3. Tabulated sensor measurements.
An additional ten measurements (also shown in Fig. 3) from a +6% calibration biased sensor
were also measured using the same conditions and flow rates as the nominal cases. Both the calibrated and biased measurements were within the manufacturer’s accuracy specifications for
acceptability for EAN21 (between ~ 20-22% as stated above), making it impossible to conclude that the sensor was out of calibration based on the measurements alone.
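The rollup statistics quoted in Fig. 3 can be reproduced from the tabulated readings. A sketch (we assume the P, A, and R columns are pooled per sensor and the population form of the standard deviation is used, since that reproduces the quoted values):

```python
# Pooled P, A, R readings for the calibrated sensor from Fig. 3.
calibrated = [
    20.4, 20.2, 19.9,   20.9, 21.0, 20.6,   20.5, 19.3, 19.6,
    20.0, 20.2, 20.3,   20.1, 20.2, 20.0,   20.5, 20.55, 20.3,
    20.4, 20.5, 20.3,   20.5, 20.6, 20.4,   20.5, 20.6, 20.5,
    20.5, 20.6, 20.4,
]

def mean_sd(x):
    m = sum(x) / len(x)
    sd = (sum((v - m) ** 2 for v in x) / len(x)) ** 0.5  # population form
    return m, sd

m, sd = mean_sd(calibrated)  # close to the quoted Mean = 20.3, SD = 0.34
```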
The Data Modeling process ranks these inputs and builds a tuned equation, which characterizes
the performance curve using only knowledge derived from the nominal sensor measurements. The tuned Data Model only required the final instrument reading (R) and the sensor hysteresis
(Dr) to model the performance curve. This yielded 100% correct classification. The equation for this Data Model is shown at the top of Fig. 4, labeled as nom1. If nom1 gives a value between 0.499 and 0.501, the sensor is calibrated. Any other values indicate the sensor is out of
calibration. The lookup table graphic can be used directly in the field by simply taking the power on and final oxygen analyzer percentage readings to determine the values for R and Dr and the
calibration status. Alternatively, the equation for nom1 itself can be evaluated if a calculator is at hand. Additionally, the known off-nominal cases were run against the tuned Data Model and all ten (100%) were flagged as off-nominal, indicating that the sensor needed to be either
recalibrated or replaced.
$$\mathrm{nom1} := 111.632\,Dr + 383.752\,R + 10.818\,Dr^2 - 18.9819\,R^2 - 10.86\,Dr\,R + 7.736\times 10^{-2}\,Dr^3 + 0.312949\,R^3 - 0.5324\,R\,Dr^2 + 0.2641\,Dr\,R^2 - 2585.369$$
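The nom1 polynomial transcribes directly into code, with the calibration test being the 0.499-0.501 band stated above (the function names are ours):

```python
# nom1 polynomial in the sensor hysteresis Dr and final reading R.
def nom1(dr, r):
    return (111.632 * dr + 383.752 * r
            + 10.818 * dr**2 - 18.9819 * r**2 - 10.86 * dr * r
            + 7.736e-2 * dr**3 + 0.312949 * r**3
            - 0.5324 * r * dr**2 + 0.2641 * dr * r**2
            - 2585.369)

def calibrated(dr, r):
    # Sensor is in calibration when nom1 falls inside the tuned band.
    return 0.499 <= nom1(dr, r) <= 0.501
```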
[Figure: the left panel shows the training region, with Final - Initial Readings (Dr) from -1.2 to 0.6 and Final Reading (R) from 19.3 to 20.9, with the training boundary and 3-sigma training data boundary marked; the right panel shows the larger interrogation region, with Dr from -3.1 to 2.5 and R from 17.5 to 22.7, with the training boundary and 3-sigma test point boundary marked.]
Fig. 4. Lookup table generated using Data Model for oxygen sensor monitoring and calibration.
The LUT shown in Fig. 4 can be used to see if the oxygen sensor is out of calibration. The
current values for R and Dr are determined and the intersection for the two values can be read directly from the chart. If the value falls in a white region, the sensor is calibrated. If the value
falls into black, the instrument is out of calibration. The tuned equation Data Model is able to predict that regions of calibration and out of calibration exist outside of the training box region.
On the left side of Fig. 4, the training region and boundary for the tuned equation based Data Model is enlarged and shown in detail. The box on the interior represents the training boundary and encompasses the region where the training examples (shown as points on the graph) reside.
Shown on the right side of Fig. 4 is a larger interrogation region derived from the tuned equation based Data Model, the same training boundary box, and the off-nominal points used for testing
the model.
4. CONCLUSIONS
Data Modeling has been successfully applied to inventory flow and instrument calibration
monitoring. A Data Model was created that defined the boundary for nominal inventory
dynamics. Additionally, Data Modeling was used to demonstrate in-field MiniOx calibration monitoring. This was accomplished by modeling the response hysteresis of the sensor to various flow conditions. This yielded an equation and lookup table that can be used in the field to determine if the MiniOx is still in calibration without the use of a pure O2 source.
ACKNOWLEDGMENTS
The authors would like to thank Marvin Carroll, Tec-Masters, Level-13, Licht Strahl Engineering INC, and Kristi Jaenisch and Technical Dive College for the use of the
MAJQANDA algorithm suite during the course of this work, and Greg Ogle of Southeastern Divers Inc. for his assistance in the oxygen sensor data collection process.
BIBLIOGRAPHY
1. Blanchard, B. Logistics Engineering and Management. Englewood Cliffs, NJ: Prentice-Hall, 1981.
2. Buzacott, J.A. and J.G. Shanthikumar. Stochastic Models of Manufacturing Systems. Englewood Cliffs, NJ: Prentice-Hall, 1993.
3. Fraden, J. Handbook of Modern Sensors. New York: AIP Press, 1996.
4. Grant, E. Statistical Quality Control. New York: McGraw-Hill, 1952.
5. Jaenisch, H.M. and J.W. Handley, "Data Modeling for Radar Applications," Proceedings of the IEEE Radar Conference 2003, Huntsville, AL, May 5-8, 2003.
6. Jaenisch, H.M., J.W. Handley, and L. Bonham, "Enabling Calibration on Demand for Situational Awareness," Army Aviation Association of America (AAAA), Tennessee Valley Chapter, Huntsville, AL, February 12, 2003.
7. Jaenisch, H.M., J.W. Handley, J.C. Pooley III, and S.R. Murray, "Data Modeling for Fault Detection," 58th MFPT, Virginia Beach, VA, 2003.
8. Jaenisch, H.M., J.W. Handley, K.H. White, J.W. Watson Jr., C.T. Case, and C.G. Songy, "Virtual Prototyping with Data Modeling," Proc. SPIE, Seattle, WA, July 7-11, 2002.
9. Lamb, J.S. The Practice of Oxygen Measurement for Divers. Flagstaff, AZ: Best, 1999.
10. Lotfi, V. and C. Pegels. Decision Support Systems for Operations Management & Science. Chicago, IL: Irwin, 1996.
11. McNitt, L.L. BASIC Computer Simulation. Blue Ridge Summit, PA: TAB Books, 1983.
12. MiniOx I Oxygen Analyzer Instruction Manual. Pittsburgh, PA: MSA Company, 1995.
13. Webster, J.G. Medical Instrumentation: Application and Design. Boston, MA: Houghton Mifflin, 1978.
BIOGRAPHY
Dr. Holger Jaenisch has over 20 years of experience working in the field of applied mathematics and algorithm creation. Dr. Jaenisch's experience includes the application of Data Modeling to advanced discrimination algorithms and concepts; data, message, and image compression techniques; and simulation modeling. His expertise covers the areas of signal and data processing, image processing, and the modeling of processes, simulations, and systems with neural and equation-based architectures. His concepts have been successfully applied in the development of two commercially sold software packages for nonlinear data analysis.
APPENDIX
Dynamic Inventory Model Inputs
Number of customers: 2
Number of products: 2

Customer 1, Product 1
# Demand levels: 2
Level 1: # Items 1, Prob of Demand Level .8
Level 2: # Items 10, Prob of Demand Level .2
# Lead times: 2
Lead Time 1: # Periods 5, Prob of lead time level .8
Lead Time 2: # Periods 15, Prob of lead time level .2
Cost per order placed ($): 100
Annual per unit-holding cost ($): 10
Stockout cost per unit ($): 100
Number of periods per year: 52
Order quantity: 2
Reorder point: 2
Beginning inventory: 10
Number of periods to simulate: 256

Customer 1, Product 2
# Demand levels: 1
Level 1: # Items 3, Prob of Demand Level .5
# Lead times: 2
Lead Time 1: # Periods 2, Prob of lead time level .5
Lead Time 2: # Periods 4, Prob of lead time level .5
Cost per order placed ($): 10
Annual per unit-holding cost ($): 1
Stockout cost per unit ($): 5
Number of periods per year: 52
Order quantity: 20
Reorder point: 10
Beginning inventory: 20
Number of periods to simulate: 256

Customer 2, Product 1
# Demand levels: 2
Level 1: # Items 3, Prob of Demand Level .7
Level 2: # Items 4, Prob of Demand Level .3
# Lead times: 1
Lead Time 1: # Periods 3, Prob of lead time level 1
Cost per order placed ($): 100
Annual per unit-holding cost ($): 20
Stockout cost per unit ($): 30
Number of periods per year: 52
Order quantity: 10
Reorder point: 25
Beginning inventory: 5
Number of periods to simulate: 256

Customer 2, Product 2
# Demand levels: 3
Level 1: # Items 1, Prob of Demand Level .5
Level 2: # Items 5, Prob of Demand Level .25
Level 3: # Items 10, Prob of Demand Level .25
# Lead times: 3
Lead Time 1: # Periods 1, Prob of lead time level .25
Lead Time 2: # Periods 5, Prob of lead time level .25
Lead Time 3: # Periods 10, Prob of lead time level .5
Cost per order placed ($): 125
Annual per unit-holding cost ($): 25
Stockout cost per unit ($): 10
Number of periods per year: 52
Order quantity: 15
Reorder point: 20
Beginning inventory: 20
Number of periods to simulate: 256
Dynamic Inventory Model Source Listing

'Dynamic Inventory Model (DIM)
'Holger M. Jaenisch, Ph.D. and Jamie Handley
CLS
DIM l(50, 2), d(50, 2), y(50)
OPEN "input.log" FOR OUTPUT AS #1
INPUT "Number of customers"; ncust
PRINT #1, "Number of customers"; ncust
INPUT "Number of products"; nprod
PRINT #1, "Number of products"; nprod
PRINT #1, "#######################################"
REDIM max1(ncust)
mx1 = 0
FOR nc = 1 TO ncust
  REDIM nrow(nprod)
  maxval = 0
  FOR np = 1 TO nprod
    PRINT
    PRINT "Currently on Customer"; nc; "and Product"; np
    PRINT #1, "Customer"; nc; "Product"; np
916 GOSUB 1106
920 GOSUB 1206
924 GOSUB 1306
928 GOSUB 1406
932 GOSUB 1506
    nrow(np) = n2
    IF n2 > maxval THEN maxval = n2
    IF U$ = "1" THEN 924
    IF U$ = "2" THEN 920
    IF U$ = "3" THEN 916
  NEXT np
  'rollup by customer
  REDIM w(maxval, 8)
  FOR np = 1 TO nprod
    zz$ = "c" + FORMAT$(nc, "0") + "p" + FORMAT$(np, "0") + ".dat"
    OPEN zz$ FOR INPUT AS #2
    FOR i2 = 1 TO nrow(np)
      FOR kz = 1 TO 8
        INPUT #2, h
        w(i2, kz) = w(i2, kz) + h
      NEXT kz
    NEXT i2
    CLOSE #2
  NEXT np
  zz1$ = "c" + FORMAT$(nc, "0") + ".dat"
  OPEN zz1$ FOR OUTPUT AS #2
  FOR i2 = 1 TO maxval
    FOR kz = 1 TO 8
      PRINT #2, w(i2, kz);
    NEXT kz
    PRINT #2,
  NEXT i2
  max1(nc) = maxval
  IF maxval > mx1 THEN mx1 = maxval
  CLOSE #2
NEXT nc
'final rollup
REDIM w(mx1, 8)
FOR nc = 1 TO ncust
  zz1$ = "c" + FORMAT$(nc, "0") + ".dat"
  OPEN zz1$ FOR INPUT AS #2
  FOR i2 = 1 TO max1(nc)
    FOR kz = 1 TO 8
      INPUT #2, h
      w(i2, kz) = w(i2, kz) + h
    NEXT kz
  NEXT i2
  CLOSE #2
NEXT nc
zz1$ = "rollup.dat"
OPEN zz1$ FOR OUTPUT AS #2
FOR i2 = 1 TO mx1
  FOR kz = 1 TO 8
    PRINT #2, w(i2, kz);
  NEXT kz
  PRINT #2,
NEXT i2
CLOSE #2
PRINT
CLOSE
END

1106 PRINT
PRINT "Probability distribution for demand period"
PRINT
PRINT "3 Demand levels = Low, Med, High"
PRINT "Input Demand levels and probabilities in ascending order"
PRINT "to build histogram"
PRINT
INPUT "Number of levels of demand"; k1
PRINT #1, "# Demand levels"; k1
PRINT "demand, probability"
p = 0
e1 = 0
FOR i = 1 TO k1
  PRINT "Level"; i
  INPUT "# Items"; d(i, 1)
  INPUT "Prob of Demand Level"; d(i, 2)
  PRINT #1, "Level"; i; "# Items"; d(i, 1);
  PRINT #1, "Prob of Demand Level"; d(i, 2)
  e1 = e1 + d(i, 1) * d(i, 2)
  p = p + d(i, 2)
  d(i, 2) = p
NEXT i
PRINT
PRINT "Expected demand for period"; e1
PRINT
PRINT "Probability distribution for lead time"
PRINT
PRINT "Number of different lead times";
INPUT k2
PRINT #1, "# Lead times"; k2
PRINT
PRINT "3 Lead time levels = Low, Med, High"
PRINT "Input Lead time levels and probabilities in"
PRINT "ascending order to build histogram"
PRINT
PRINT "If period is defined as month"
PRINT "(# Periods/yr=52 below), 3 weeks"
PRINT "would be 0.75"
PRINT
PRINT "lead time (# time periods), probability"
p = 0
e2 = 0
FOR i = 1 TO k2
  PRINT "Lead Time"; i;
  INPUT "# Periods"; l(i, 1)
  INPUT "Prob of lead time level"; l(i, 2)
  PRINT #1, "Lead Time"; i; "# Periods"; l(i, 1);
  PRINT #1, "Prob of lead time level"; l(i, 2)
  e2 = e2 + l(i, 1) * l(i, 2)
  p = p + l(i, 2)
  l(i, 2) = p
NEXT i
PRINT
PRINT "Expected lead time", e2
RETURN

1206 PRINT
PRINT "Cost per order placed ($)";
INPUT c1
PRINT #1, "Cost per order placed ($)"; c1
PRINT "Annual per unit-holding cost ($)";
INPUT c2
PRINT #1, "Annual per unit-holding cost ($)"; c2
PRINT "Stockout cost per unit ($)";
INPUT c3
PRINT #1, "Stockout cost per unit ($)"; c3
PRINT
PRINT "52 periods/year = week, 12 periods/year = month"
PRINT "Number of periods per year";
INPUT n1
PRINT #1, "Number of periods per year"; n1
1299 RETURN

1306 PRINT
PRINT "Order quantity";
INPUT q
PRINT #1, "Order quantity"; q
PRINT "Reorder point (Quantity threshold for reorder)";
INPUT r
PRINT #1, "Reorder point"; r
PRINT "Beginning inventory";
INPUT b
PRINT #1, "Beginning inventory"; b
PRINT "Number of periods to simulate";
INPUT n2
PRINT #1, "Number of periods to simulate"; n2
PRINT #1, "#######################################"
FOR i = 1 TO k3
  y(i) = 0
NEXT i
RETURN

1406 t1 = 0
t2 = 0
t3 = 0
t4 = 0
k3 = l(k2, 1)
zz$ = "c" + FORMAT$(nc, "0") + "p" + FORMAT$(np, "0") + ".dat"
OPEN zz$ FOR OUTPUT AS #2
FOR i2 = 1 TO n2
  'UPDATE PIPELINE
  GOSUB 2006
  'GENERATE DEMAND
  GOSUB 2106
  'PLACE ORDER
  GOSUB 2206
  PRINT #2, t1; t2; t3; t4; jwh; b; s2; s1
  t2 = t2 + b
  t3 = t3 + s2
  t4 = t4 + s1
NEXT i2
CLOSE #2
'DISPLAY RESULT
GOSUB 2306
RETURN

1506 PRINT
PRINT "MENU OF OPTIONS"
PRINT "1 TRY ANOTHER SET OF DECISIONS"
PRINT "2 TRY ANOTHER SET OF COSTS"
PRINT "3 TRY ANOTHER SET OF PROBABILITIES"
PRINT "4 NEXT/TERMINATE"
PRINT
PRINT "ENTER OPTION";
INPUT U$
IF U$ = "1" THEN 1599
IF U$ = "2" THEN 1599
IF U$ = "3" THEN 1599
IF U$ = "4" THEN 1599
PRINT "INVALID RESPONSE"
GOTO 1506
1599 RETURN

'UPDATE PIPELINE
2006 b = b + y(1)
FOR i = 1 TO k3 - 1
  y(i) = y(i + 1)
NEXT i
y(k3) = 0
RETURN

'GENERATE DEMAND
2106 p = RND
FOR i = 1 TO k1
  IF p > d(i, 2) THEN 2116
  z = d(i, 1)
  i = k1
2116 NEXT i
IF z > b THEN 2126
s1 = z
s2 = 0
GOTO 2130
2126 s1 = b
s2 = z - b
2130 b = b - s1
RETURN

'PLACE ORDER
2206 v = b
FOR i = 1 TO k3
  v = v + y(i)
NEXT i
jwh = 0
IF v > r THEN 2299
p = RND
FOR i = 1 TO k2
  IF p > l(i, 2) THEN 2226
  l1 = l(i, 1)
  i = k2
2226 NEXT i
t1 = t1 + 1
jwh = jwh + 1
IF l1 > 0 THEN 2236
b = b + q
GOTO 2299
2236 y(l1) = y(l1) + q
2299 RETURN

'DISPLAY RESULT
2306 PRINT
PRINT "RESULT OF SIMULATION"
PRINT "DEMAND", t3 + t4
PRINT "UNITS SOLD", t4
PRINT "SALES LOST", t3
PRINT "ORDERS", t1
PRINT "AVERAGE INVENTORY LEVEL", t2 / n2
t5 = (n1 / n2) * c1 * t1
t6 = c2 * t2 / n2
t7 = (n1 / n2) * c3 * t3
t = t5 + t6 + t7
PRINT
PRINT "COSTS ADJUSTED TO ANNUAL BASIS"
PRINT
PRINT "ORDERING", t5
PRINT "HOLDING", t6
PRINT "STOCKOUT", t7
PRINT "TOTAL", t
PRINT
PRINT "TYPE THE VALUE 1 TO CONTINUE";
INPUT U$
RETURN
DATA MODELING FOR FAULT DETECTION
H. M. Jaenisch(a), J. W. Handley(a), J. C. Pooley III(b), and S. R. Murray(b)

(a) Sparta, Inc., 4901 Corporate Drive, Suite 102, Huntsville, AL 35805
(b) Amtec, Inc., 500 Wynn Drive, Suite 314, Huntsville, AL 35816
Abstract: Data Models are high-order [O(3^n)] multi-variable characterizations derived from simple third-order polynomial building blocks. Such functions can be used to build real-time operational condition diagnostics and prognostics, i.e., fault classification and time-to-failure estimates. Data Modeling can achieve anomaly detection while requiring only nominal (no-fault) conditions for training. This makes Data Modeling an attractive tool for novelty detection and for resolving ambiguity between nominal and anomalous conditions; Data Modeling can also resolve ambiguities in diagnostic calls and manage risk uncertainty in prognostic time-to-fail estimates. This paper presents the theory of how Data Modeling was successfully applied to novelty detection, and how it can be applied to diagnostic ambiguity resolution and prognostic risk uncertainty management. Classifier methods such as adaptive multi-dimensional distance measure neural networks and Divergence classifiers were unsuccessful when applied to a set of diagnostic vibration feature vectors with only slightly anomalous conditions. When Data Modeling was applied to the same feature vector sets, 100% correct classification was achieved.
Key Words: Change detection; Data Modeling; diagnostics; health status; off-nominal;
prognostics; situational awareness; transmission vibration
Introduction: Fault diagnostics of rotordynamic devices such as the gearboxes of
helicopters (BlackHawk, AH64, Chinook, Kiowa, Huey, etc.) require the extraction of
diagnostic features from the spectral signals measured within the gearbox. These features
characterize the operational condition of the vital parts of the gearbox and are derived
from spectral signals comprised of a non-Gaussian signal plus white noise. Because of
this noisy environment, fault diagnosis can be a difficult challenge for most standard fault
diagnosis methods, leading to misdiagnosis of fault conditions. Techniques such as Phase
Coherent Filtering (PCF) can be used to boost the signal-to-noise (SNR) ratio and better
estimate the signal before feature extraction. Once this is done, features are extracted and
a Data Modeling equation based classifier algorithm built to perform fault detection [1].
Feature Vectorization: Hybrid time and spectral analysis techniques are used for feature
selection. The presence of severe noise makes regular diagnostic evaluation difficult. In
this environment, the underlying cause of an anomaly can be difficult to determine.
Anomalies are analyzed using hybrid techniques in order to determine the relevant
causality relationship to the shaft synchronous frequency or its harmonics that are
produced by the gearing and drive train reducers and multipliers. A causality relationship
determines when anomalies may indicate the presence of distress due to a defect. The
presence of these defect distresses can be seen as sidebands originating from modulation
of the component defect with the shaft synchronous frequency. In order to achieve
effective fault diagnosis, it is necessary to look at the combined effect of Amplitude
Modulation (AM) and Frequency Modulation (FM) because AM and FM rarely exist in
isolation. The combined effect of AM and FM increases the number and/or amplitude of
the sidebands and makes their pattern asymmetrical due to reinforcement and
cancellation. The result of AM and FM modulations has different phase relationships,
making it necessary to use higher order spectral analysis techniques in the form of the
Hilbert Transform to filter and isolate contributing effects. Once filtering is done, a
statistical acceptance criterion that is derived from the analysis process is used as a
fault/false alarm discriminator. The distance traversed outside of this region indicates the severity of the
fault, and the rate of the traversal is the fault degradation rate. This type of hybrid analysis
fault determination criterion also allows for the resolution of ambiguous cases.
Data Modeling: Data Modeling is the process of finding a mathematical expression that
provides a good fit between given finite sample values of the independent variables and
the associated values of the dependent variables of the process. Data Modeling finds a Data
Model that fits the given sample of data. This process involves finding both the
functional form of the Data Model and the numeric coefficients for the Data Model. Data
Modeling differs from conventional methods such as linear, quadratic, and polynomial
regression techniques that only solve for the numerical coefficients for a function whose
form is specified ahead of time. Data Modeling also differs from genetic algorithms and
symbolic regression because it incorporates fractal interpolation functions for achieving
mathematical modeling of complex multivariable processes. The Data Modeling process
of discovering automatic functions is an evolutionary/genetic directed approach where
the set elements consist entirely of simple polynomial building blocks [2].
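The paper does not spell out the composition scheme, but the idea of stacking simple third-order building blocks can be sketched as a toy (the pairing scheme and coefficient layout here are our own placeholders; a real Data Model would discover both the structure and the coefficients by directed evolutionary search):

```python
# Illustrative composition of third-order building-block polynomials.
# Coefficients w are placeholders; each layer would normally be fit
# by regression during the evolutionary search.
def block(x, y, w):
    # Third-order polynomial building block in two variables.
    return (w[0] + w[1] * x + w[2] * y + w[3] * x * y
            + w[4] * x**2 + w[5] * y**2 + w[6] * x**3 + w[7] * y**3)

def nested_model(x1, x2, layers):
    # Each layer feeds the previous output back in with one raw input,
    # so k layers of cubic blocks yield a polynomial of order 3**k.
    out = block(x1, x2, layers[0])
    for w in layers[1:]:
        out = block(out, x1, w)
    return out
```

With the first layer passing x1 through and the second cubing its input, two layers already produce a ninth-order capacity model.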
Change Detection: The power of statistical quality control and specifically control charts
is the ability to recognize when a process is not in statistical control. One commonly used
type of control chart is the X-Bar Chart defined by the upper control limit (UCL) and the
lower control limit (LCL)
$$UCL = \bar{X} + A_2\bar{R}, \qquad LCL = \bar{X} - A_2\bar{R}. \quad (1)$$
The X-Bar chart plots how far away from the average value of a process the current
measurement falls. According to Shewhart [3], values that fall outside of three standard
deviations of the average have assignable causes and are labeled as being out of statistical
control (off-nominal).
Analogous to the X-Bar chart, a model of the complex process that is in statistical control
is called a Data Model [4]. A Data Model can be built by training all known nominal
cases against a constant output value. Overfitting makes the model more sensitive to
feature values or groups of feature values that fall outside the training boundaries.
Whenever these new features or groups of features begin to depart from within the valid
training boundaries, off-nominal or tip-off conditions are flagged.
Calibration On Demand (COD) is the ability to modify the boundaries between nominal
and off-nominal conditions as the evolving situation requires. This boundary is defined
by a set of control parameters. COD allows restricted pockets or segments within
previously declared nominal or off-nominal regions to be changed to the other condition
through the use of a lookup table (LUT). COD can be used in real-time to indicate when
incoming data sets are of a form not anticipated during simulations and not accounted for
previously. Off-nominal detection can then be used to initiate the generation of additional
feature sources to deal with the new off-nominal information in the form of a new
classifier. Fig. 1 illustrates how new gearbox accelerometer measurements are converted
into features and analyzed to detect if a change has occurred in system status from
nominal to yield tip-off conditions from the model.
[Figure: gearbox accelerometer measurement data is converted into a data series and its PDF, rolled up into statistics (standard deviation, skewness, kurtosis), standardized to Z-scores, and fed through the Data Model polynomial T = w0 + w1 x1 + w2 x2 + w3 x3 + w4 x1^2 + w5 x2^2 + w6 x3^2 + w7 x1^3 + w8 x2^3 + w9 x3^3 + w10 x1 x2 + w11 x1 x3 + w12 x2 x3; the equation output is compared against sigma bounds to label each case Nominal or Tip-Off.]
Fig. 1. Calibration on Demand using Data Modeling.
COD can be used in real-time to indicate when incoming data sets are of a form not
previously accounted for or anticipated during training on either actual or simulated data
sets. COD allows sensing of off-nominal conditions while being trained exclusively on
nominal conditions. This is a key theoretical enabling concept for Situational Awareness
via algorithms.
A use for COD is in monitoring individual sensor measurements, feature vector elements,
and overall system snapshots for change detection. Individual sensor measurements
provide a characterization of system level conditions. This yields an estimate of nominal
or off-nominal condition.
Rather than use the actual input feature values, we use parametric and non-parametric
descriptions of the entire feature value set. This can include the conversion of features to
standardized Z-scores. This treats the feature value set as a time series and models the
distribution by overfitting to a constant-target regression function. The target value of
the regression is fixed and the polynomial fluctuation around that target is then
characterized; in practice the output feature value used for training is ½.
An overfitted Data Model can be generated that uses Shewhart’s Control Chart theory to
monitor if the modeled process is under Statistical Process Control. In Data Modeling,
the trial limits are set using the polynomial output from the training data. The upper and
lower boundaries are assigned to be one-half of a standard deviation above and below the
mean value of the polynomial output from the training data.
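Setting the trial limits described above reduces to a two-line computation on the training outputs (the function name is ours):

```python
import statistics

def trial_limits(train_outputs):
    # Trial limits are the mean of the polynomial outputs over the
    # training data, plus and minus one-half standard deviation.
    mu = statistics.fmean(train_outputs)
    sigma = statistics.stdev(train_outputs)
    return mu - sigma / 2, mu + sigma / 2
```

Training outputs clustered around the ½ target give a narrow band; any input whose polynomial output leaves the band is a tip-off.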
COD can evaluate the operational ranges and feature regimes of proposed classifiers.
This method shows regions of nominal behavior that indicate adequate coverage by the
proposed classifier. It is also very important to note that even though the ranges of input
features may be the same as seen during training, which is easily tested without resorting
to Data Modeling, we can also instantly detect combinations of incoming feature values
that are not anticipated or that are within valid ranges but not seen before. This off-
nominal detection is a tip-off condition that can be used to predict additional regions that
need to be populated by decision-making classifiers or decision transfer nodes.
Feature Vector Description: Feature vector data sets were synthetically generated for
both nominal and varying degrees of anomalous conditions as recorded by a single
gearbox accelerometer [5]. Generally, the accelerometer data is used to generate a feature
vector set or matrix denoted by

$$FV_{m,n} = \begin{bmatrix} FV_1 \\ FV_2 \\ \vdots \\ FV_m \end{bmatrix} = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,100} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,100} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,100} \end{bmatrix} \qquad (2)$$
The vibration data recorded by this accelerometer goes through a process called
segmenting before the feature vector is generated. Segmenting is used for reducing noise
and is also known as synchronous averaging. The segmentation process generates a total
of 100 pieces of data that are then in turn used to generate a total of 100 feature vectors.
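As an illustrative sketch of synchronous averaging (Python; the segment length and helper name are assumptions, and the actual segmentation used in [5] may differ):

```python
def synchronous_average(signal, seg_len):
    """Average consecutive equal-length segments of a vibration record
    pointwise; noise that is not synchronous with the segment length
    tends to cancel, while repetitive structure is reinforced."""
    n_seg = len(signal) // seg_len
    return [sum(signal[k * seg_len + i] for k in range(n_seg)) / n_seg
            for i in range(seg_len)]
```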
Each feature vector FVm (row vector) contains 100 elements that are generated using
various nominal and anomalous diagnostic discrimination techniques. The results of these
diagnostic discrimination techniques are converted into Z-scores [6] using

$$Z_i = \frac{\alpha_i - \mu_\alpha}{\sigma_\alpha}, \qquad i = 1, \ldots, N, \qquad \alpha_i = \frac{M_i}{\left(M_2\right)^{i/2}} \;\text{ for even } i. \qquad (3)$$
These Z-scores ranged in value from –4 to 4. The Z-scores from all 100 pieces of
segmented data and all 100 different diagnostic discrimination techniques make up a
feature vector matrix with a total size of 100×100.
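Converting a set of feature values to standardized Z-scores can be sketched as (Python; a generic standardization, not the exact moment-based normalization of Eq. (3)):

```python
def z_scores(values):
    """Standardize a list of feature values to Z-scores (zero mean,
    unit sample standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / std for v in values]
```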
The synthetically generated data used is representative of vibration data recorded using
0.1 second intervals for durations of up to 12 seconds. A total of 4 different anomaly
percentages were used (1%, 5%, 10% and 15%) corresponding to 1, 4, 7, and 10
anomalous feature vector elements (matrix columns). The single anomalous feature
vector element case (1%) represents a bearing fault, and the increasing percentages of
anomalous feature vector elements up to 15% represent the propagation of this bearing
fault and its effects on the other critical components of the gearbox (such as deterioration
of the bearing, races, gears, and cage). These 4 different anomaly percentages were
varied across 4 different severities of anomaly value (plus or minus 1.5, 2, 3, and 4) for a
total of 16 anomalous feature vector matrices. In addition, a feature vector matrix of
nominal behavior was also generated (values between –1 and 1), for a total of 17 feature
vector matrices.
Model Construction: This nominal feature vector matrix was used to build an equation
based Data Model for nominal/off-nominal detection. The feature data consisted of 100
different Z-score values described above that were derived from different diagnostic
discrimination techniques. Only Features 21-90 were used as input features for the Data
Model. Features 1-20 were ignored because their values did not change across all of the
examples provided, and Features 91-100 were ignored because they were rollup Z-scores
(91-94) or zeros (95-100). For the 70 Z-scores used, there were a total of 100 examples of
nominal behavior generated as described above that were used in constructing the Data
Model. It should also be noted that although the Z-score values for the data used were in
the range –4 to 4, the values used in actual Data Model training were left in raw form,
ranging from –4096 to 4096. This was done to save the processing step of converting all
of the synthetically generated data from raw Z-scores to final Z-scores.
The conventional approach for performing novelty detection would be to take examples
of all 70 Z-scores for both nominal and anomalous cases and construct a high
dimensional (70 dimensional) classifier that would be used to predict if unknown sets of
Z-scores represent a nominal or anomalous case, and if anomalous, the type of anomaly.
The approach taken by the authors differs from the conventional approach. First, our
approach uses only nominal behavior examples to construct the Data Model. This Data
Model is then used to determine if a new unseen set of Z-scores is nominal/anomalous
using the principles of Shewhart’s Control Chart theory to determine when the process is
out of statistical control.
Second, our approach does not directly use any set or subset of the 70 Z-scores as input
into the Data Model. Instead, each of the 100 nominal examples of 70 Z-scores provided
was rolled up by characterizing the Z-score distribution independently. This was done for
each nominal example using the following parametric and non-parametric statistics:
$$\bar{x} = \frac{1}{N}\sum_{j=1}^{N} x_j \quad (4) \hspace{3em} Min = \min(x) \quad (5)$$

$$\sigma^2 = \frac{1}{N-1}\sum_{j=1}^{N} \left(x_j - \bar{x}\right)^2 \quad (6) \hspace{3em} Max = \max(x) \quad (7)$$

$$Skew = \frac{1}{N}\sum_{j=1}^{N} \left(\frac{x_j - \bar{x}}{\sigma}\right)^3 \quad (8) \hspace{3em} Range = Max - Min \quad (9)$$

$$Kurt = \frac{1}{N}\sum_{j=1}^{N} \left(\frac{x_j - \bar{x}}{\sigma}\right)^4 - 3 \quad (10) \hspace{3em} \chi^2 = (N-1)\,\sigma^2 \quad (11)$$

$$J = \min\,\mathrm{Re}\!\left[\lim_{\Delta J_i \to 0} \frac{\sigma}{Range}\,\frac{\log J_i}{\log \frac{1}{N}}\right] \quad (12)$$

$$f_D = 1 + \frac{1}{N}\log\left(1 + \sum_{j=1}^{N-1} \frac{\left(x_{j+1} - x_j\right)^2}{Range}\right) \quad (13)$$

$$FOM = \sum_{i=1}^{10} \left\{ PDF\!\left(f(\alpha)\right)_i - \max\!\left[ PDF\!\left(f(\alpha)\right) \right] \right\} \quad (14)$$
The f(α) term in Eq. (14) is a multi-fractal spectrum [7]. Using this approach, each of the
70 Z-scores that represented a single example were reduced to 11 descriptive statistical
features that both adequately characterized the Z-score distribution for each of the 100
examples and reduced the quantity of inputs to be processed.
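A sketch of the parametric portion of this rollup in Python (Eqs. (4)–(11) only; the limit-based and multi-fractal features of Eqs. (12)–(14) are omitted because their exact definitions are specific to the original work):

```python
def rollup_stats(x):
    """Parametric rollup statistics per Eqs. (4)-(11): mean, min,
    variance, max, skew, range, excess kurtosis, chi-squared."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    std = var ** 0.5
    skew = sum(((v - mean) / std) ** 3 for v in x) / n
    kurt = sum(((v - mean) / std) ** 4 for v in x) / n - 3
    lo, hi = min(x), max(x)
    return {"mean": mean, "min": lo, "var": var, "max": hi,
            "skew": skew, "range": hi - lo, "kurt": kurt,
            "chisq": (n - 1) * var}
```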
Once the rollup features in Eq. (4) through Eq. (14) were calculated, a Data Model was
generated using a target regression value of ½ as the nominal output. Once the equation is
trained, boundaries are placed on either side of the target regression value that
characterizes the fluctuation of the polynomial about ½. This is shown in Fig. 1, with
values outside of the boundary representing anomalous conditions. The Data Model
generated is given in the Appendix, and the boundaries that characterize the polynomial
fluctuation were set at 0.499 and 0.501.
The contribution of each rollup feature is given in Table I, and is the number of
occurrences of each feature in the final Data Model. The correlation of each rollup
feature with the target regression value is also given along with eigenranking values. The
eigenranking values are calculated by first determining the most highly correlated input
with the target regression value and removing it from the list. Next, the most highly
correlated feature remaining in the list is compared with the features that have been
removed from the list to determine cross-correlation. Eigenranking is then calculated by
subtracting the highest cross-correlation from the correlation with the target regression
value, thereby determining its new information contribution. This feature is then removed
from the list and the process repeated until all features have been processed.
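The eigenranking procedure can be sketched as follows (Python; the correlation tables `corr_with_target` and `cross_corr` are assumed precomputed and are illustrative names, not part of the original work):

```python
def eigenrank(corr_with_target, cross_corr):
    """Greedy eigenranking: repeatedly take the most target-correlated
    remaining feature and subtract its highest cross-correlation with
    already-removed features, giving its new-information contribution.

    corr_with_target: dict feature -> |correlation with target|
    cross_corr: dict (feature_a, feature_b) -> |cross-correlation|
    """
    remaining = dict(corr_with_target)
    removed, ranks = [], {}
    while remaining:
        best = max(remaining, key=remaining.get)
        cross = max((cross_corr.get((best, r), cross_corr.get((r, best), 0.0))
                     for r in removed), default=0.0)
        ranks[best] = remaining.pop(best) - cross
        removed.append(best)
    return ranks
```

Note that the first feature removed keeps its full correlation as its eigenranking, since nothing has yet been removed to cross-correlate against.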
Table I. Occurrence in the final Data Model, correlation with the target regression
value, and eigenranking (magnitudes) for each rollup feature Eq. (4)–(14).

Equation       (11)   (9)    (5)    (10)   (12)   (6)    (7)    (13)   (14)   (4)    (8)
Occurrence       2      2     17     17     17     17     17     32     32     47     47
Correlation    .118   .100   .035   .016   .005   .110   .103   .022   .054   .064   .002
EigenRanking   .118   .685   .598   .643   .780   .885   .527   .511   .035   .398   .668
Model Performance: Using these threshold values, the original nominal feature vector
matrix rollup stats were then passed through the Data Model and all 100 (100%) were
correctly classified as nominal. We chose from the anomalous cases the two cases that
were felt to be the most stressing. These cases were the single anomalous feature vector
element with the closest value to nominal (1% with values near plus or minus 1.5) and
the 10 anomalous element case with the closest value to nominal (15% with values near
plus or minus 1.5). These same cases had been looked at in previous work using both a
neural network approach and a divergence classifier to perform novelty detection. Neither
of these methods was able to achieve 100% correct classification as in the Data Modeling
approach.
These two cases provide the support vectors: if the Data Model can correctly identify
them, all other nominal or anomalous cases can also be correctly typed. Support vectors
are the points that lie on the boundaries of the solution space and
define the complex decision boundary between nominal and anomalous behavior. The
support vectors are all of the non-redundant and non-coupled examples that are necessary
to define a model of this boundary. Examples that recast the same information contained
by other examples, but in different terms such as derivative terms or highly cross-
correlated features, would be redundant and would not be support vectors.
Each of these feature vector matrices was rolled up with the same characterizing statistics
given in Eq. (4) through Eq. (14), and these rollup statistics were passed through the Data
Model. For both cases, 100% of the cases were correctly classified as anomalous. These
results are shown graphically in Fig. 2.
The classical chi-squared test was applied as a discriminant for the most stringent
anomalous cases and required a significance level of 0.25 ($\chi^2_{0.75}$) to discriminate
nominal from anomalous. For the same cases, Novelty Detection using Data Modeling could
discriminate nominal from anomalous at a significance level of 0.44 ($\chi^2_{0.56}$).
Therefore, the Data Modeling approach is a superior method for preventing False Safes,
and significantly reduces misdiagnosis and the associated consequent maintenance cost.
Fig. 2. Results of Data Model used for nominal and anomalous test cases. (Left: Data
Model output for the Nominal, 1 Anomaly, and 10 Anomalies cases plotted against the
.499 and .501 control limits; right: Decision Map with Nominal and Off-Nominal regions.)
On the right side of Fig. 2 is a Decision Map generated using the Data Model. Along the
X-axis of the Decision Map is the feature given in Eq. (12) and on the Y-axis the feature
given in Eq. (14). These two features were allowed to range within their original training
boundaries plus or minus one standard deviation (σ). All other features were held at their
median value. The white box represents the valid training ranges of the original nominal
cases. The Decision Map displays the complex nature of the decision space that is
captured by the equation model, and that is not captured by other statistical
characterizations such as quadratic classifiers or Bayesian networks. The Decision Map
can be used as a direct lookup table (LUT) for real time implementations. This allows
restricted pockets or segments within previously declared nominal or off-nominal regions
to be changed to the other condition as the need arises by modifying the values in the
LUT directly and bypassing the original equation model. The Decision Map shown in
Fig. 2 can be reproduced using the code and the associated values listed in the Appendix.
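The lookup-table idea can be sketched as follows (Python; the grid resolution, thresholds, and function names are illustrative assumptions, not the original implementation):

```python
def build_lut(model, x_range, y_range, n=64):
    """Precompute a decision lookup table over two features (all others
    held fixed inside `model`); True = nominal, False = off-nominal."""
    (x0, x1), (y0, y1) = x_range, y_range
    return [[0.499 <= model(x0 + (x1 - x0) * i / (n - 1),
                            y0 + (y1 - y0) * j / (n - 1)) <= 0.501
             for j in range(n)] for i in range(n)]

def lut_lookup(lut, x_range, y_range, x, y):
    """Real-time classification by direct nearest-cell table lookup."""
    n = len(lut)
    (x0, x1), (y0, y1) = x_range, y_range
    i = min(n - 1, max(0, round((x - x0) / (x1 - x0) * (n - 1))))
    j = min(n - 1, max(0, round((y - y0) / (y1 - y0) * (n - 1))))
    return lut[i][j]
```

A pocket within a previously declared region can then be re-declared by toggling cells in place (`lut[i][j] = not lut[i][j]`) without touching the equation model.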
Application to Diagnostic Ambiguity Resolution and Prognostic Risk Uncertainty
Management: Performing Diagnostic Ambiguity Resolution results in the use of two
separate networks: 1) Novelty Detection Network and 2) Fault Classification Network.
The Novelty Detection Network discussed thus far serves as a general monitor of all
feature vectors simultaneously by examining their Z-score distributions. This indicates if
feature vector Z-scores across all elements are occurring as expected. Once the novelty
detection network determines the current status to be either nominal or anomalous, this
information is then passed to the fault classification network. Data Modeling was used in
the construction of the Novelty Detection Network shown in this work; however, it can
also be used in the construction of both a Fault Classification Network and a Risk
Uncertainty Management (RUM) Network.
The high-resolution Fault Classification Network (Fig. 3) is comprised of four separate
embedded networks: fault detection, fault identification, fault severity ranking, and fault
degradation rate networks. These networks monitor individual feature vector elements to
verify the findings of the Novelty Detection Network, thereby mitigating false positives
and negatives.
Fault classification requires individual feature vector element and feature vector element
grouping information to be extracted and used. In this fashion, individual Data Models
are created for each individual sensor or feature vector element. Each one of the networks
can rely on its own independent features to achieve modeling and tip-off monitoring, and
no requirement is made that all available features be used. The feature driver sub-
selection is evolved by the decision architecture discovery process algorithm and is
entirely adaptive to the nature of the individual feature vector elements.
Fig. 3. Classification Ambiguity Resolver (CAR) Network. (Diagnostic Z-Scores feed
Rollup Stats and the Novelty Detection Network; the Fault Classification Network
comprises Detection, Identification, Severity Ranking, and Degradation Rate networks;
the CAR/COA Network selects the Optimal Classifier, which outputs Fault ID, Severity,
and Degradation Rate.)
The Risk Uncertainty Manager (RUM) (Fig. 4) uses standardized Z-scores as input to
determine the theoretical failure time and the probability of failure at that time, and
directly gives the probability of failure so that Resource Managers can anticipate
logistics and parts inventory management.
Fig. 4. Risk Uncertainty Manager (RUM) Network. (Prognosis Uncertainty Z-Scores feed
the Risk Prediction Network and the RUM/COA Network, which drive the Action Triage
Model.)
Conclusions: Data Modeling has been demonstrated as a viable algorithm for novelty
detection. Data Modeling has enabled anomalous behavior to be detected without a priori
examples of anomalous behavior. In addition, the anomalies detected were of a very
small magnitude and not perceptible to other advanced adaptive algorithms. This initial
success sets the stage for Data Modeling to be used in Diagnostic Ambiguity Resolution
and Prognostic Risk Uncertainty Management.
References
[1] Pooley, J.C. and S.R. Murray, “Rotordynamic Fault Diagnostics Using Phase
Coherent Filtering,” To be published in the Proceedings of the Society for
Machinery Failure Prevention Technology (MFPT), Virginia Beach, VA, April
2003.
[2] Jaenisch, H.M. and J.W. Handley, “Data Modeling for Radar Applications,” To be
published in Proceedings of the IEEE Radar Conference 2003, Huntsville, AL.
[3] Grant, E. Statistical Quality Control. New York: McGraw-Hill, 1952.
[4] Jaenisch, H.M., J.W. Handley, L. Bonham, “Enabling Calibration on Demand for
Situational Awareness”, Army Aviation Association of America (AAAA),
Tennessee Valley Chapter, Huntsville Alabama, Feb 12, 2003.
[5] Lee, C. and J.C. Pooley, “Feature Vector Generation and Fusion Techniques for SH-
60 Helicopter Gearbox Diagnostics,” Final Report to the JAHUMS Project Office,
February 2003.
[6] Nikias, C. L. and A. P. Petropulu. Higher-Order Spectra Analysis: A Nonlinear
Signal Processing Framework. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[7] Jaenisch, H.M., C.J. Scoggins, M.P. Carroll, and J.W. Handley, “Entropy Fractal
Analysis of Medical Images Using ROSETA,” Proceedings of SPIE, Los Angeles,
CA January 24, 1994. Session: Medical Applications of Modern Imaging
Technology.
Appendix
DEFCUR A-Z
OPEN "input" FOR INPUT AS #1
INPUT #1, maxval, minval, range, mean, stdev
INPUT #1, skew, kurt, chisq, Jaen, Hand, rosfom
CLOSE #1
maxval = (maxval - 3027.78@) / 268.4416@
minval = (minval - -3107.3@) / 235.948@
range = (range - 6134.3@) / 346.41@
mean = (mean - -1.6663@) / 100.0386@
stdev = (stdev - 1079.6204@) / 64.3872@
skew = (skew - -.0622@) / .3681@
kurt = (kurt - 1.9463@) / .4347@
chisq = (chisq - -80957000@) / 9532140@
Jaen = (Jaen - .32@) / .0527@
Hand = (Hand - 1.3741@) / .0071@
rosfom = (rosfom - 4.5092@) / .8543@
GOSUB COEFF1
iflag1 = 0
IF l6o3 < .499 THEN iflag1 = 1
IF l6o3 > .501 THEN iflag1 = 1
l6o3 = LOG(ABS(l6o3))/LOG(10@)
OPEN "output" FOR OUTPUT AS #1
PRINT #1, l6o3, iflag1
PRINT l6o3, iflag1
CLOSE #1
END
Use the following values to
reconstruct the Decision Map
in Fig. 2:
maxval = 2029
minval = -2545
range = 4710
mean = 16.3072
stdev = 964.8604
skew = 0.0874
kurt = 2.1116
chisq = -70450000
Hand = 1.5248
Jaen = Vary From .1057 to .5859
rosfom = Vary From .0059 to 7.2724
COEFF1:
l1o2 = .1779 + .0294*rosfom + .4022*mean -.2041*maxval -.2074*rosfom*rosfom
l1o2 = l1o2 + .0407*mean*mean + .0834*maxval*maxval -.033*rosfom*mean
l1o2 = l1o2 + .0162*rosfom*maxval -.0081*mean*maxval -.2349*rosfom*mean*maxval
l1o2 = l1o2 - .0479*rosfom*rosfom*rosfom -.0132*mean*mean*mean
l1o2 = l1o2 -.2092*mean*rosfom*rosfom -.0854*rosfom*mean*mean
l1o2 = l1o2 + .117*rosfom*maxval*maxval -.0203*maxval*rosfom*rosfom
l1o2 = l1o2 + .0071*maxval*mean*mean + .0362*mean*maxval*maxval
l1o3 = -.0549 + .1685*SIN(6.2832*chisq) + .5903*COS(6.2832*chisq)
l1o3 = l1o3 -.0716*SIN(6.2832*rosfom) -.2275*COS(6.2832*rosfom)
l1o3 = l1o3 + .134*SIN(6.2832*mean) -.1696*COS(6.2832*mean)
l1o3 = l1o3 -.0673*SIN(6.2832*Hand) -.4242*COS(6.2832*Hand)
l1o3 = l1o3 + .1103*SIN(6.2832*maxval) -.0127*COS(6.2832*maxval)
l1o3 = l1o3 -.2275*SIN(6.2832*minval) -.3622*COS(6.2832*minval)
l1o3 = l1o3 + .1424*SIN(6.2832*kurt) + .0648*COS(6.2832*kurt)
l1o3 = l1o3 + .1099*SIN(6.2832*skew) + .1226*COS(6.2832*skew)
l1o3 = l1o3 -.1476*SIN(6.2832*range) -.0191*COS(6.2832*range)
l1o3 = l1o3 -.0014*SIN(6.2832*Jaen) -.1282*COS(6.2832*Jaen)
l1o3 = l1o3 + .3289*SIN(6.2832*stdev) -.7289*COS(6.2832*stdev)
l2o1 = .0244 + .6126*l1o2 -.1719*Hand + 1.4384*l1o3 -.2076*l1o2*l1o2
l2o1 = l2o1 + .0004*Hand*Hand -.0239*l1o3*l1o3 + .3297*l1o2*Hand
l2o1 = l2o1 + .373*l1o2*l1o3 -.1078*Hand*l1o3 -.7229*l1o2*Hand*l1o3
l2o1 = l2o1 + .3051*l1o2*l1o2*l1o2 + .0107*Hand*Hand*Hand
l2o1 = l2o1 -.5415*l1o3*l1o3*l1o3 -.0947*Hand*l1o2*l1o2
l2o1 = l2o1 -.0662*l1o2*Hand*Hand + .6466*l1o2*l1o3*l1o3
l2o1 = l2o1 -.8223*l1o3*l1o2*l1o2 -.022*l1o3*Hand*Hand + .3764*Hand*l1o3*l1o3
l3o2 = .1214 + 1.3191*l2o1 -.1414*skew + .1093*stdev -.1803*l2o1*l2o1
l3o2 = l3o2 -.017*skew*skew -.0793*stdev*stdev + .2141*l2o1*skew
l3o2 = l3o2 -.1368*l2o1*stdev + .1585*skew*stdev + .0057*l2o1*skew*stdev
l3o2 = l3o2 -.1218*l2o1*l2o1*l2o1 -.0164*skew*skew*skew -.0342*stdev*stdev*stdev
l3o2 = l3o2 + .0971*skew*l2o1*l2o1 -.0396*l2o1*skew*skew -.0441*l2o1*stdev*stdev
l3o2 = l3o2 -.0611*stdev*l2o1*l2o1 -.0352*stdev*skew*skew + .0889*skew*stdev*stdev
l2o2 = .0056 +.5233*l1o2 -.1429*skew + 1.6838*l1o3 + .0186*l1o2*l1o2
l2o2 = l2o2 + .0026*skew*skew -.1106*l1o3*l1o3 -.1814*l1o2*skew
l2o2 = l2o2 -.2555*l1o2*l1o3 + .5827*skew*l1o3 -.2691*l1o2*skew*l1o3
l2o2 = l2o2 -.0629*l1o2*l1o2*l1o2 + .0083*skew*skew*skew -.5017*l1o3*l1o3*l1o3
l2o2 = l2o2 + .1548*skew*l1o2*l1o2 + .1036*l1o2*skew*skew + .1486*l1o2*l1o3*l1o3
l2o2 = l2o2 -.2098*l1o3*l1o2*l1o2 -.2228*l1o3*skew*skew + .0629*skew*l1o3*l1o3
l3o1 = .0288 + 1.1318*l2o2 + .1871*mean -.2148*Hand + .0552*l2o2*l2o2
l3o1 = l3o1 -.0372*mean*mean -.0443*Hand*Hand -.1808*l2o2*mean
l3o1 = l3o1 + .1393*l2o2*Hand + .1863*mean*Hand + .0386*l2o2*mean*Hand
l3o1 = l3o1 + .0026*l2o2*l2o2*l2o2 -.038*mean*mean*mean + .0014*Hand*Hand*Hand
l3o1 = l3o1 -.0529*mean*l2o2*l2o2 -.0314*l2o2*mean*mean -.0274*l2o2*Hand*Hand
l3o1 = l3o1 -.0019*Hand*l2o2*l2o2 + .0524*Hand*mean*mean -.0337*mean*Hand*Hand
l4o3 = .1059 + .8827*l3o2 -.2142*Jaen + .0731*l3o1 + .0672*l3o2*l3o2
l4o3 = l4o3 -.0812*Jaen*Jaen -.0185*l3o1*l3o1 + .3007*l3o2*Jaen
l4o3 = l4o3 -.091*l3o2*l3o1 -.3331*Jaen*l3o1 -.3113*l3o2*Jaen*l3o1
l4o3 = l4o3 -.1367*l3o2*l3o2*l3o2 + .037*Jaen*Jaen*Jaen + .5027*l3o1*l3o1*l3o1
l4o3 = l4o3 + .3029*Jaen*l3o2*l3o2 -.2689*l3o2*Jaen*Jaen -1.2086*l3o2*l3o1*l3o1
l4o3 = l4o3 + .8481*l3o1*l3o2*l3o2 + .3087*l3o1*Jaen*Jaen + .1116*Jaen*l3o1*l3o1
l5o1 = -.0615 + .9876*l4o3 + .0708*rosfom -.0493*skew -.0079*l4o3*l4o3
l5o1 = l5o1 + .1561*rosfom*rosfom -.0341*skew*skew + .1969*l4o3*rosfom
l5o1 = l5o1 -.0968*l4o3*skew + .1803*rosfom*skew -.0531*l4o3*rosfom*skew
l5o1 = l5o1 -.0085*l4o3*l4o3*l4o3 + .0471*rosfom*rosfom*rosfom -.0448*rosfom*skew*skew
l5o1 = l5o1 + .0122*skew*skew*skew -.0228*rosfom*l4o3*l4o3 +.0763*l4o3*rosfom*rosfom
l5o1 = l5o1 + .02*l4o3*skew*skew + .0022*skew*l4o3*l4o3 + .0329*skew*rosfom*rosfom
l5o3 = .0213 + 1.0512*l4o3 + .3126*minval + .1807*kurt + .0269*l4o3*l4o3
l5o3 = l5o3 + .0188*minval*minval + .0161*kurt*kurt -.0311*l4o3*minval
l5o3 = l5o3 -.1121*l4o3*kurt + .1604*minval*kurt -.0744*l4o3*minval*kurt
l5o3 = l5o3 + .0141*l4o3*l4o3*l4o3 -.0463*minval*minval*minval -.1443*minval*kurt*kurt
l5o3 = l5o3 -.0237*kurt*kurt*kurt -.0743*minval*l4o3*l4o3 -.0773*l4o3*minval*minval
l5o3 = l5o3 + .0144*l4o3*kurt*kurt -.0097*kurt*l4o3*l4o3 -.1706*kurt*minval*minval
l6o3 = .0065 + .6065*l5o1 -.0362*mean + .3729*l5o3 -.3338*l5o1*l5o1 + .0161*mean*mean
l6o3 = l6o3 -.1255*l5o3*l5o3 -.2353*l5o1*mean + .4507*l5o1*l5o3 + .1453*mean*l5o3
l6o3 = l6o3 -1.1258*l5o1*mean*l5o3 + .0312*l5o1*l5o1*l5o1 + .0023*mean*mean*mean
l6o3 = l6o3 + .3571*l5o3*l5o3*l5o3 + .528*mean*l5o1*l5o1 -.1531*l5o1*mean*mean
l6o3 = l6o3 -.702*l5o1*l5o3*l5o3 + .3228*l5o3*l5o1*l5o1 + .1499*l5o3*mean*mean
l6o3 = (l6o3 + .6534*mean*l5o3*l5o3)*.0005 + .5001
RETURN
STOP
Data Modeling for Virtual Observatory data mining
Holger Jaenisch*a,b, James Handleya, Albert Lima, Miroslav Filipovicc,a, Graeme Whitea, Alex Honsa, Gary Deragopiana, Mark Schneidera, Matthew Edwardsb
aJames Cook University, Centre for Astronomy, Townsville QLD 4811, Australia bAlabama Agricultural and Mechanical University, Department of Physics, Huntsville, AL 35811
cUniversity of Western Sydney, Locked Bag 1797 PENRITH SOUTH DC NSW 1797, Australia
ABSTRACT
< 869.47 -3.27 41.37 602.25 10053.48 620.0042>
We propose a novel approach for index-tagging Virtual Observatory data files with descriptive statistics enabling rapid data mining and mathematical modeling. This is achieved by calculating at data collection time 6 standard moments as descriptive file tags. Data Change Detection Models are derived from these tags and used to filter databases for similar or dissimilar information such as stellar spectra, photometric data, images, and text. Currently, no consistent or reliable method for searching, collating, and comparing 2-D imagery exists. Traditionally, methods used to address these data problems are disparate and unrelated to text data mining and extraction. We explore the use of mathematical Data Models as a unifying tool set for enabling data mining across all data class domains.
Keywords: Data Modeling, data mining, Virtual Observatory, exo-planet, photometric, stellar spectra modeling,
eclipsing binary, video-based adaptive optics, fractal
1. INTRODUCTION
Search and processing time of legacy data1 can be improved using waveform characterization and modeling techniques. In order to decrease the time required to identify similar information in a database, it is necessary to bypass time-consuming sequential processing of complete files.
With file sizes of photometric and spectrographic databases in excess of terabytes of information collected per month, a method other than sequential access search and correlation must be developed and integrated in the early planning stages to ensure legacy archive data sets already have hooks and tagging information embedded. The Sloan Digital Sky Survey2 recently made its second release of stellar object data to the public. This second release consisted of 88 million objects taking up more than 6 terabytes of space. This equates to between 50 kilobytes and 100 kilobytes of information for each object in the Sloan data release. The Sloan survey is able to acquire data on 640 objects simultaneously, thereby pushing data acquisition rates into the terabytes-per-month class. For example, mining the Sloan database for exo-planets requires sequential processing of over 100,000 candidate files. Once candidates are identified, discriminating and resolving binary stars and exo-planets requires another sequential processing pass of 10,000–20,000 files selected in the first processing step. The flexible file structure of FITS has sufficient overhead room to accommodate several lines in the header of
descriptive numbers. These numbers accurately and uniquely identify the data sets. Because these features are located in the header, processing of the entire file is avoided. Using Data Modeling theory, dynamic descriptive variables given in Equations (8) – (12) and (23) characterize each file in the Virtual Observatory database. Once generated, these six numbers are stored directly in the FITS header of the file, and are extracted for further processing or characterization of the files.

*[email protected]; phone 1 256 337 3768; James Cook University; Centre for Astronomy; Townsville, QLD 4811; AUSTRALIA.
We propose to add the following 6 numbers into the FITS header as tags:
1. Square root of 2nd order moment (Standard Deviation) 2. 3rd order moment (Skewness) 3. 4th order moment (Kurtosis) 4. 6th order moment (M6) 5. 8th order moment (M8) 6. Unique physics feature specific to data type such as the wavelength where maximum occurs, classification
of object, Dewey Decimal subject area describing object, file type, or any other physics based descriptor. These 6 features are described in Figure 1 and in Equations (8) – (12) and (23). For the 161 stellar spectra in the Jacoby database, the value of these 6 tags is given in Table 1. Throughout this paper, various applications of these 6 tag feature values are shown. Actual examples of these tag numbers were incorporated into the paper and are found in the abstract for the entire body of the paper, as well as in Figure 18 for the accompanying images in the paper. For the entire paper tag in the abstract, the Dewey Decimal system classification for this subject area is used for tag 6. Figure 2 shows this SPIE paper transformed into a fractal curve. Creation of this curve and the tags is discussed in Section 2.11.
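Computing the five statistical tags can be sketched as (Python; whether the higher moments are standardized by the standard deviation, as here, is an assumption, and writing into an actual FITS header is omitted):

```python
def moment_tags(x):
    """Compute the 5 statistical header tags: standard deviation,
    skewness, kurtosis, and standardized 6th and 8th moments."""
    n = len(x)
    mean = sum(x) / n
    def m(k):  # k-th central moment
        return sum((v - mean) ** k for v in x) / n
    std = m(2) ** 0.5
    return {"stdev": std,
            "skew": m(3) / std ** 3,
            "kurt": m(4) / std ** 4,
            "M6": m(6) / std ** 6,
            "M8": m(8) / std ** 8}
```

The sixth, physics-based tag (peak wavelength, object class, Dewey Decimal subject, file type) is data-type specific and is simply appended to this dictionary.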
2. EXAMPLES
2.1 Correlation
One classic method for comparing an unclassified spectrum is correlation. Correlation determines which object the current set of measurements is most like. Correlation between two equal length data sets x and y is given by
$$\mathrm{Correlation}(x, y) = \frac{\sum_i \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_i \left(x_i - \bar{x}\right)^2 \sum_i \left(y_i - \bar{y}\right)^2}} \qquad (1)$$
The output value of the correlation ranges between 0 and 1, where a value of 1 signifies that the two data sets are identical, and a value of 0 that no similarities exist between them. Using correlation, an unclassified measured stellar spectrum must be compared point by point with each file in the database. Even once this processing is completed, several limitations restrict its usefulness. These limitations include:
1. Exhaustive searching – requires presence of actual legacy data files rather than header information 2. Time consuming 3. Lack of a well-defined decision boundary 4. Sensitivity of correlation
Each point in each file in the database must be processed to compute correlations, and each correlation must then be compared against all others to find the maximum; this requires exhaustive searching, and the determination of the maximum correlation lacks well-defined criteria. In contrast, models of extensive databases can be published and distributed easily to investigators, who process file names and tag data rather than full data files to perform local initial searches; hot links to legacy data files enable data pull on request. Figure 3 shows the result of using correlation as a classifier for the Jacoby data set. Although the requested example compared favorably with the Jacoby database (correlating higher than 0.9 with the spectrum from which it was derived), it also scored equally high with 140 of the 161 other members of the same database. This ambiguity makes correct identification difficult without a priori knowledge of truth, as shown in Figure 3, where most of the files exhibit correlations between 0.9 and 1.0.
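The point-by-point correlation of Eq. (1) can be sketched as (Python):

```python
def correlation(x, y):
    """Pearson-style correlation between two equal-length data sets."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den
```

Every archived spectrum must be streamed through this computation in full, which is precisely the exhaustive cost that header tags avoid.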
2.2 Graphics based search methods
Virtual Observatory databases can be studied using graphics based searching3. This discovers and displays spatial relationships that exist across data files without a priori knowledge or assumptions. Automated clustering of groupings within the database allows visual presentation of files as point clouds demarcated by ellipses encompassing clusters and labeled according to average, most frequently occurring, and range of content. From this initial search, groups are easily deleted, merged, post-processed, or Data Modeled. Once an ellipse or group is selected for Data Modeling, a rapid and real-time equation is derived that recognizes the tagged features associated with the selected groups being related. This highly tuned and sensitive Data Model filter now enables rapid search and similarity match or reject with higher precision than correlation or nearest neighbor methods. Figure 4 shows this type of graphics based search engine applied to the 6 tags for the data. The presence of 7 different groups in the clustering space denotes different characteristics exist in the data.

2.3 Stellar object change detection model
If a star is un-catalogued or if there is doubt about its accepted classification, then a new classification must be performed. This new classification is done by comparing its spectrum with an atlas of well-studied spectra of bright stars. Figure 5 shows 13 standard digital spectra from the principal MK spectral types measured with a spectrograph along with a histogram describing the distribution of values in the representative spectra4,5. The number of bins in the histogram is selected to be equal to the number of points in the spectral curve.
A single Data Change Model is generated that uses only the descriptive features of the single unclassified object given in Equations (8) – (12) and (23). The feature values for each of the objects in the database (previously stored in the header) are then passed through the Data Change Model. Stellar objects that are unlike the new set of measurements will tip-off and can be ignored, while objects whose statistics are like the new set of measurements will not tip-off. This simple Data Change Model provides the ability to condense the universe of terabytes of stellar spectra into a short list of candidate files for further processing. An example Data Change Model for this process is shown in Figure 6. This model contains no nested loops and is read directly down the page. Reading this equation down the page is the same as reading the terms in the equation across the page from left to right on a single line. The Data Change Model is orthogonal, meaning that each successive layer in the equation builds on the previous ones and refines the solution. Any intermediate solution layer can be used as the current best estimate of the output. It should be noted that to save space, lines in the Data Model were combined together.

2.4 Modeling of stellar database features
For cases where the stellar class is known but the measured spectrum for the object is unavailable, estimates of the tags are generated from a Data Model of the features themselves. Using the calculated tags for each Jacoby stellar spectra in Table 1, separate independent equation models are generated for each element comprising the tag. These equations are based on the Turlington polynomial model of the form
$$T(x) = y_1 + m_1 \left(x - x_1\right) + \sum_{j=2}^{N-1} \left(m_j - m_{j-1}\right) \left[ \left(x - x_j\right) + .001 \log_{10}\!\left( 1 + 10^{-\left(x - x_j\right)/.001} \right) \right] \qquad (2)$$

$$m_j = \frac{y_{j+1} - y_j}{x_{j+1} - x_j}$$
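Evaluating a model of this form can be sketched as (Python; this assumes the smoothed-ramp reading of Eq. (2), and it is not the generator code of Figure 7):

```python
import math

def turlington(xs, ys, x):
    """Evaluate a Turlington-style model with smoothed knots at the
    data points (xs, ys), following the form of Eq. (2)."""
    m = [(ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j])
         for j in range(len(xs) - 1)]
    eps = .001
    total = ys[0] + m[0] * (x - xs[0])
    for j in range(1, len(m)):
        u = x - xs[j]
        # eps*log10(1 + 10**(u/eps)) is a smooth ramp ~ max(u, 0);
        # the two branches are the same quantity written to avoid overflow.
        if u < 0:
            ramp = eps * math.log10(1.0 + 10.0 ** (u / eps))
        else:
            ramp = u + eps * math.log10(1.0 + 10.0 ** (-u / eps))
        total += (m[j] - m[j - 1]) * ramp
    return total
```

Far from the knots this reduces to the piecewise-linear interpolant through the data, with the log term smoothing each slope change over a width of about .001.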
where xj is the stellar classification and yj the individual tag element values in Table 1. Source code for generating these equation models is listed in Figure 7. This code uses the x and y data from Table 1 for each element to generate separate database feature programs. It should be noted that (2) consists of N-2 terms from the portion inside the summation, or one term less than the number of coefficients generated using the source code in Figure 7. Stellar classification (temperature, sub-temperature, and luminosity) is encoded into a single number that
uniquely rolls up the three different classifications into one. These classifications are combined using
XY.ZZ (3)
where X is the luminosity class, Y is the temperature class, and ZZ is the sub-temperature class. The luminosity class is defined as I = 1, II = 2, III = 3, IV = 4, and V = 5. The temperature class is defined as O = 1, B = 2, A = 3, F = 4, G = 5, K = 6, and M = 7. The sub-temperature class maps directly into ZZ as 05 to 95. Six (6) equation models (1 for each tag element) are generated by transforming the stellar classification into a
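This encoding can be sketched as (Python; the dictionary and function names are illustrative):

```python
LUMINOSITY = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}
TEMPERATURE = {"O": 1, "B": 2, "A": 3, "F": 4, "G": 5, "K": 6, "M": 7}

def encode_class(lum, temp, sub):
    """Roll up luminosity (X), temperature (Y), and sub-temperature
    (ZZ, mapped 05-95) into the single number XY.ZZ."""
    return LUMINOSITY[lum] * 10 + TEMPERATURE[temp] + sub / 10.0
```

For example, a G2 V star encodes as 55.20 (X = 5 for class V, Y = 5 for class G, ZZ = 20 for sub-temperature 2).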
single number using (3), and substituting the values from Table 1 for the tag element directly into (2). The tag values in Table 1 and this transformation are all that is required to generate the model. The starting condition (x1, y1) is read from Table 1 directly for each tag element. 2.5 Stellar database modeling using change detection
Each stellar spectrum in the database is characterized using the descriptive statistics described in the next section. Once these descriptive statistics are calculated, Data Change Models are constructed that allow alike database objects to pass through, while rejecting via tip-off those that are different from the training set. These Data Change Models can be run either independently or in cascade fashion, with the output being the correct stellar classification. In previous work, the authors demonstrated that Data Modeling can be used to generate a classifier that correctly classifies 100% of the 161 stellar spectra in the Jacoby database. This Data Model based classifier was built in a cascade fashion using the values of the spectra at individual spectral lines as input. The resulting Data Model was an O(3^4) = 81st order polynomial model contained within approximately 200 software lines of code (SLOC). Using change detection, the resultant Data Model is O(3^3) = 27th order and is captured in 20 SLOC. Fewer Data Models were required, and each change detector was much shorter and less complicated than the original Data Models generated in previous work. Figure 8 gives an example of a Data Change Model for determining if a stellar spectrum is class O or one of the other temperature classes in the Jacoby database.

2.6 Change detection for brightness curve modeling
Eclipsing binary star light curves can also be header tagged using the same descriptive statistics except for wavelength. Once tagged, these curves can be readily data mined using the same concept demonstrated previously with the Jacoby stellar spectra database. Determining if a star is an eclipsing binary depends on the viewing geometry and in turn on the orbital inclination of the binary star system. Because of this sensitivity, measurements of the light curve of the star look remarkably different for different orbital inclinations, and therefore viewing geometries. In order to show the applicability of Data Modeling for brightness curve modeling, we first must demonstrate that a change detection model can be built that will recognize light curve data from the same object, but from different viewing aspects and orbital inclinations, as being from the same object. This change detection model must also reject light curves from other objects as being different. To illustrate this, 63 synthetic light curves were generated and characterized, and a Data Change Model constructed. Table 2 lists the parameters used to generate the 63 synthetic light curves, examples of which are shown in Figure 9. This model correctly classified all 33 of the original light curves of the object, for orbital inclinations varying between 58 and 90 degrees, as being the same object. The statistics from the remaining 30 light curves were then passed through the Data Model, and all 30 (100%) were correctly flagged as not being the binary star system of interest.

2.7 Change detection for exo-planet detection
Automatic exo-planet processing6 includes sorting collected photometric data of systems containing either eclipsing binary stars or exo-planets into two classes. Currently, the process of determining if exo-planets are present in a data set consists of taking measurements, correlating the measurements, and stacking them over fixed time periods to reduce the overall noise in the data. Once enough samples have been stacked and the noise reduced, classical pattern matching and search methods are employed in order to detect exo-planets. A simple change detection model such as the one shown in Figure 10 is constructed from the exo-planet data of interest. Once constructed, data from light curves including exo-planets and binary stars can be passed through the Data Model, and only the exo-planet data will be accepted by the Data Model. All other data types will be rejected as not matching the object.
Table 3 lists the parameters used to generate 100 synthetic light curves to demonstrate this concept (50 exo-planet light curves plus 50 binary star cases), as shown in Figure 9. As before, 100% of the training data containing an exo-planet was flagged correctly. Next, 50 additional synthetic binary star light curves were generated, characterized, and passed through the change detector. Of these 50, 43 were correctly labeled as binary stars and the remaining 7 were labeled as exo-planets. This corresponds to a correct classification of 86% of the binary stars and 100% of the exo-planets, and demonstrates a filter that pre-qualifies light curves as exo-planetary. Tables 2 and 3 are provided in this paper to enable other investigators to use the same data set for comparison.

2.8 Image Data Modeling
Data Modeling can be used to construct short and concise Turlington and eigenfunction models for extended image morphology. These models can be generated in real-time as the image is acquired and, once generated, can be stored in a much smaller space than the pixel values of the original image. Only a limited number of eigenfunction terms compared to the number of points in the original image need to be saved in order to capture the spatial morphology of the object class. This lends itself to autonomous characterization, analysis, and data mining of images for classification as stellar field, nebula, or galaxy. To demonstrate this modeling concept, the authors used the image of M51 given in Figure 11. This image was transformed from 2-D to 1-D using the Hilbert sequence, thereby maintaining 2-D correlation. An eigenfunction based Data Equation Model was then constructed. Results of this model are given in Figure 11, along with the eigenfunction model. Pseudo-code for generating eigenfunction based Data Equation Models is given in Figure 12. A detailed treatment of this method is contained in Reference 7.
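The Hilbert-sequence transformation mentioned above can be sketched with the classic distance-to-coordinate conversion. This is our illustrative Python (the paper's own implementation is not shown), assuming a square image whose side is a power of two:

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Convert distance d along a Hilbert curve covering a
    2^order x 2^order grid into (x, y) cell coordinates
    (classic bit-manipulation formulation)."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:          # rotate the quadrant when needed
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def image_to_hilbert_sequence(img):
    """Unravel a 2^k x 2^k image into a 1-D sequence that preserves
    2-D locality: neighbouring sequence samples are neighbouring pixels."""
    n = img.shape[0]
    order = n.bit_length() - 1
    return np.array([img[y, x] for x, y in
                     (hilbert_d2xy(order, d) for d in range(n * n))])
```

Because consecutive Hilbert distances map to adjacent pixels, local 2-D structure survives in the 1-D sequence, which is what keeps the subsequent eigenfunction model compact.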
2.9 Image selection for stacking and software speckle imaging
We assert as a fundamental premise that nominal occurs more frequently than off-nominal. If the reverse is true, then the definitions simply swap, and the premise still holds. Based on this assumption, we can proceed with an unsupervised algorithm that will determine normal from abnormal without a priori knowledge or representation of either. For the application of image stacking and enhancement, we apply this principle to see if a Change Detection Model can be autonomously generated that can identify sharp images from blurry ones without a priori training on these image distinctions. In the same manner that stellar spectra and light curves can be coherently averaged and stacked to remove noise, images of stellar objects can also be stacked once they are registered (correlated with one another). To date, methods for selecting the best images as candidates for stacking have required a man in the loop to view each image and determine whether the candidate image is considered sharp. It is desired to rank images in terms of sharpness and pick only the very best images for stacking, because doing so enables further blind deconvolution and image enhancement methods to bring out more fine detail8,9,10,11,12,13. Generation of the Data Model can be done in two different ways: 1) supervised, and 2) unsupervised. In supervised mode, legacy database images are analyzed with a man in the loop to determine if individual frames are sharp or non-sharp. Once a few candidate sharp frames are identified, the statistics are extracted from the images and a Data Change Model constructed. The sharp images are considered to be nominal with a value of 0.5; if the equation Data Change Model yields tip-off, the image is different from the training sharp images and is discarded. Additionally, Data Models may be constructed using sub-block regions of the image, calculating their individual statistics, and averaging either the statistics or the Data Model output in order to decide whether the image is sharp or non-sharp.
Pseudo code for constructing a Data Change Model is given in Figure 13. In unsupervised mode, no a priori knowledge of images being sharp or non-sharp is required. It is only assumed
that the majority of the images are background (non-sharp). A Data Change Model that flags the images as being either sharp or non-sharp is constructed as shown in Figure 14. Pseudo-code for constructing a Data Change Model unsupervised is also given in Figure 13. For the images analyzed for this paper, the final Data Change Model was determined by placing a tight boundary of 0.0012 on either side of the final sharp image class designation. All images that flagged inside of the two boundaries are considered candidate sharp images, while all others are considered non-sharp images and are discarded from further consideration.
2.10 Image enhancement
Once these images are stacked, they can be processed further for enhancement using Data Modeling. The authors have determined that the Van Cittert deconvolution nonlinear image processing method can be approximated through the application of a series of high pass and low pass kernel filters to an image. Additionally, using knowledge of the number of times the filters will be applied, the individual high pass and low pass filter can be combined together into an equivalent kernel. An equivalent kernel is one that when applied to the image gives the same answer as if a series of individual kernels had been applied to the image one at a time. To illustrate this, consider an image processing sequence consisting of applying a high pass filter, followed by a low pass filter, followed finally by a Gaussian filter to an image. Examples of these filters are
   -1 -1 -1          1  1  1          1  2  1
   -1 15 -1          1  3  1          2  4  2
   -1 -1 -1          1  1  1          1  2  1
     High              Low           Gaussian                        (4)

Convolving each of these three kernels together yields a larger kernel that is called the equivalent kernel. This is shown below:

   -1 -1 -1      1  1  1      1  2  1       -1  -4   -8  -10   -8  -4  -1
   -1 15 -1  ⊗   1  3  1  ⊗   2  4  2   =   -4  -2   10   16   10  -2  -4
   -1 -1 -1      1  1  1      1  2  1       -8  10   94  152   94  10  -8
                                           -10  16  152  252  152  16 -10
                                            -8  10   94  152   94  10  -8
                                            -4  -2   10   16   10  -2  -4
                                            -1  -4   -8  -10   -8  -4  -1      (5)
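The equivalent kernel in (5) can be checked numerically by convolving the three kernels of (4) with one another; a minimal sketch, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.signal import convolve2d

# The three 3x3 kernels of Equation (4)
high = np.array([[-1, -1, -1], [-1, 15, -1], [-1, -1, -1]])
low = np.array([[1, 1, 1], [1, 3, 1], [1, 1, 1]])
gauss = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])

# Convolving the kernels with one another ("full" mode) yields the
# single 7x7 equivalent kernel of Equation (5)
equivalent = convolve2d(convolve2d(high, low), gauss)

# One pass with the equivalent kernel matches three sequential passes,
# since full 2-D convolution is associative
img = np.random.default_rng(0).random((32, 32))
sequential = convolve2d(convolve2d(convolve2d(img, high), low), gauss)
one_pass = convolve2d(img, equivalent)
```

The center value of the computed `equivalent` array is 252 and its border entries are negative, matching (5) term by term.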
When the equivalent kernel is applied to the image, the result is the same as convolving the image with each of the
three smaller kernels separately. Mathematically, the order that the kernels are applied in does not matter. The result of the convolution process that creates the resultant kernel is the same as long as all the kernels are applied the correct number of times. If a high pass filter of size 3x3 is applied to an image alternatively with a low pass filter that is also of size 3x3, the resultant for one application of these 2 filters is the same as having used a single equivalent kernel of 5x5. Taking this one step further, a series of 5 sets of 3x3 high pass and 3x3 low pass filters could be applied to the image so that
(LP HP, LP HP, LP HP, LP HP, LP HP) = 5(LP HP) = 21 x 21 Equivalent Kernel   (6)
For this example, an equivalent kernel of size 21x21 would yield the same answer in a single pass. To keep from having to spatially convolve such a large kernel or to invoke a Fast Fourier Transform (FFT), the authors have instead derived a Data Model of the kernel. We have found that the equivalent kernel process can be approximated through the use of a Data Modeling
equation that uses either the equivalent of tag features describing the individual pixel neighborhoods operated on
by the equivalent kernel, or a subset of the pixel neighborhood values directly as input into the Data Modeling
equation. The subset of the pixel neighborhood chosen corresponds to the center row and center column of the kernel as shown in Figure 15. Figures 16 and 17 show examples of a series of images that were selected, processed, and stacked interactively with Registax, a commercial freeware program for registering, processing, and stacking images. Also presented in Figures 16 and 17 is the stacked image that results from using the Data Modeling concept, and the scores determined for each. A total pool of 66 images was available for Mars and 225 images for the moon. For Mars, a total of 18 of the 66 images were flagged as sharp by the Data Model equation, registered, and stacked. For the moon, Data Modeling determined that a total of 38 out of 225 images were sharp and good candidates for stacking. As noted under the figures, the Data Modeling flagged images provided an image whose resultant quality factor was better (larger) than that obtained using the Registax algorithm. Figure 18 contains the tags for each of the images in Figures 16 and 17. These tags were calculated by converting the bitmaps into ASCII files consisting of 4 columns of numbers (index, red, green, and blue color planes). The files were then read in and processed as text files as described in Section 2.11. The tag values were generated using the real-time versions of the standard moments given by
μ_{i+1} = (N_i μ_i + y_{i+1}) / N_{i+1}   (7)

σ_{i+1} = [(N_i σ_i² + (y_{i+1} − μ_{i+1})²) / N_{i+1}]^{1/2}   (8)

Skew_{i+1} = [N_i Skew_i + ((y_{i+1} − μ_{i+1}) / σ_{i+1})³] / N_{i+1}   (9)

Kurt_{i+1} = [N_i (Kurt_i + 3) + ((y_{i+1} − μ_{i+1}) / σ_{i+1})⁴] / N_{i+1} − 3   (10)

M6_{i+1} = [N_i (M6_i + 15) + ((y_{i+1} − μ_{i+1}) / σ_{i+1})⁶] / N_{i+1} − 15   (11)

M8_{i+1} = [N_i (M8_i + 105) + ((y_{i+1} − μ_{i+1}) / σ_{i+1})⁸] / N_{i+1} − 105   (12)

where N_{i+1} = N_i + 1.
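The recursive updates of (7)-(12) can be sketched in Python as follows (variable names are ours; because μ and σ are themselves running estimates, the higher standardized moments are real-time approximations of their batch counterparts):

```python
import numpy as np

def realtime_tags(y):
    """Update mean, standard deviation, skewness, excess kurtosis, and
    the 6th/8th standardized moments (in excess of their Gaussian
    values 15 and 105) one sample at a time, per Equations (7)-(12)."""
    mu, sig = float(y[0]), 0.0
    skew = kurt = m6 = m8 = 0.0
    n = 1
    for yi in y[1:]:
        mu_new = (n * mu + yi) / (n + 1)                       # Eq. (7)
        sig = np.sqrt((n * sig**2 + (yi - mu_new)**2) / (n + 1))  # Eq. (8)
        z = (yi - mu_new) / sig if sig > 0 else 0.0
        skew = (n * skew + z**3) / (n + 1)                     # Eq. (9)
        kurt = (n * (kurt + 3) + z**4) / (n + 1) - 3           # Eq. (10)
        m6 = (n * (m6 + 15) + z**6) / (n + 1) - 15             # Eq. (11)
        m8 = (n * (m8 + 105) + z**8) / (n + 1) - 105           # Eq. (12)
        mu = mu_new
        n += 1
    return mu, sig, skew, kurt, m6, m8
```

The mean recursion is exact; the remaining tags track their batch values increasingly closely as the sample count grows.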
Equations (8)-(12) generate the first five elements of the tag, and the final tag value describes the image class. In this example, 3.40 represents an image (3) of the fourth planet Mars (0.4) and of the planet itself (0.00), while 3.31 represents an image (3) of the third planet Earth (0.3) and of its 1st moon (0.01). Figure 19 contains the source code for generating the tag values.

2.11 Text mining
The final application for data mining is the application of the Change Detection principle to the querying of text-only articles. If the text documents are transformed in an appropriate manner to facilitate characterization with the same tag features already in use throughout the foregoing examples, then the same principles and techniques already outlined can be brought to bear. This would enable examples of text to be searched for in documents by deriving the numerical values for the aggregate features and then using change detection equations to pass or tip-off text documents that are either related or different from the search vector. If it is desired to flag only the unique word content of the document, it can be processed as in Reference 2, and then only the unique word content further processed as described in this work. The ability to avoid scanning and processing full text documents is identical to avoiding image file processing or light curve processing by simply reading the descriptive header tags. If a similar set of tags can be generated for standard text files, then only the headers need to be read to uniquely identify information content. The problem we now address is illustrating how such descriptive statistics as used in all previous examples (standard moments) can be meaningfully derived from text. We initiate the process by converting ASCII text into its equivalent binary representation. First, we determine the base 2 equivalent representations for each character in the ASCII character set (numbers 0-255). Once these unique 8 digit long binary sequences are determined, this new table of binary sequences is used to convert each character in the text document under consideration into binary. When in binary form, it is a simple matter to convert the binary sequence w_i into an equivalent fractional Brownian motion increment z_i with unitary value by the transformation
z_i = 2 w_i − 1   (13)
Integration of the sequence of the form
y_i = Σ_{j=0}^{i} z_j   (14)
yields a Wiener curve or classic 1/f type data curve as a transformation of the entire text document into a fractal curve. This fractal curve is now readily and uniquely characterized by the tag features14. The tags are stored in the header of the document and change detection models built using these headers for unique identification. Figure 20 illustrates how text is converted into a fractal curve for characterization, culminating in the tags for the abstract of this text document.
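The transformation of (13) and (14) can be sketched as follows (function name ours):

```python
import numpy as np

def text_to_fractal_curve(text):
    """Convert ASCII text to bits, map each bit to a +/-1 increment
    (Eq. 13: z = 2w - 1), and integrate (Eq. 14) to obtain a
    Wiener-like 1/f curve suitable for tagging."""
    # 8-bit binary expansion of each character code
    bits = np.array([int(b) for ch in text
                     for b in format(ord(ch) & 0xFF, "08b")])
    z = 2 * bits - 1          # map {0, 1} -> {-1, +1} increments
    return np.cumsum(z)       # integrated Wiener-style curve

curve = text_to_fractal_curve("A single Data Change Model is generated")
```

The resulting `curve` can then be passed to the same descriptive-statistics tagging used for spectra, light curves, and images.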
3. DATA MODELING

3.1 Theory

Data Modeling15 is the process of deriving a mathematical expression from measured data. This inverse modeling and parameter estimation process finds both the functional form of the equation and its numeric coefficients. In the general sense, a Volterra series model is used to describe the linear and non-linear portions of a dynamic system as a continuous process
y(t) = Σ_{q=0}^{∞} ∫_0^∞ ⋯ ∫_0^∞ h_q(τ_1, …, τ_q) u(t − τ_1) ⋯ u(t − τ_q) dτ_1 ⋯ dτ_q   (15)
which can be more efficiently represented in the discrete case by replacing the integrals with summations to yield16
y(t) = Σ_{q=0}^{∞} Σ_{τ_1=0}^{∞} ⋯ Σ_{τ_q=0}^{∞} h_q(τ_1, …, τ_q) u(t − τ_1) ⋯ u(t − τ_q) (Δt)^q   (16)
For multiple inputs, the Volterra series in (15) and (16) requires the use of kernels and convolutions rather than simple multiplications. For a simple second order two input case, the multiple input Volterra series will include a zero-order kernel, two first-order kernels, two second-order self-kernels (kernels acting on a single input), and one second-order cross-kernel (a kernel that acts on different inputs). As the system order and number of inputs increase, the number of kernels and convolutions becomes combinatorial. For this reason, the multiple-input Volterra series approach is only practical for low-order systems with two or three inputs. The Kolmogorov-Gabor polynomial is a discrete polynomial form of the Volterra series that is more amenable to computer implementation, especially when more than three inputs are necessary. Data Modeling approximates the Kolmogorov-Gabor polynomial of the form
y(x_1, …, x_L) = a_0 + Σ_i a_i x_i + Σ_i Σ_j a_ij x_i x_j + Σ_i Σ_j Σ_k a_ijk x_i x_j x_k + ⋯   (17)
Ivakhnenko demonstrated that the Group Method of Data Handling (GMDH) using nested functions of the form17
y(x_1, …, x_L) = f(y_1(b(y_2(b(⋯ y_n(x_i, x_j, x_k) ⋯))))),   O[y(x_1, …, x_L)] = 3^n   (18)
may be used to converge to the Kolmogorov-Gabor solution by approximation. However, the GMDH approach fails to yield a parsimonious form. The authors have combined GMDH with genetic evolutionary directed algorithms to obtain a single directed solution and final expression18. This information flow is provided in the pseudo code in Figure 13 for constructing Change Detection Models. This structured approach forms intermediate meta-variables by combining three inputs into a single new fused output. This fused meta-variable becomes a new input variable available at the next layer. Each meta-variable is a node and can be thought of as the decision node or common point for information flow that is highly correlated with and related to the output goal. This allows the information flow tree to be visualized as a directed graph and the information relationships to be discovered between the various inputs and the ultimate goal output. Since the algorithm only uses the inputs necessary to achieve convergence, pruning of inputs occurs automatically and requires no external intervention. To show how these polynomials can be used to derive differential equations, we note that polynomials can be used to model forcing functions and derivative terms directly. These polynomials can be thought of in equation form as
dy/dx = f(x)   (19)
which represents an x or index only solution to the differential equation that is of the same form as a trajectory solution (a traditional polynomial uses an x index and associates with it a y value). If y happens to be a derivative, the derivative as a function of time is gained directly. To demonstrate how a dynamic system can be Data Modeled with the same polynomial, we use as input the previous or lagged derivative value and associate with it the current derivative value. Now the x index is no longer an input, and we have specified y completely dependent on previous values, yielding the typical form
dy/dx = f(y)   (20)
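The idea behind (20) can be illustrated on synthetic data: estimate the derivative from samples of a known system, fit a polynomial f(y) by least squares, and integrate the recovered model forward. Logistic growth is our example system; this sketch is illustrative, not the authors' code:

```python
import numpy as np

# Sample a dynamic system: logistic growth, dy/dt = y(1 - y)
t = np.linspace(0.0, 6.0, 601)
y = 1.0 / (1.0 + 9.0 * np.exp(-t))        # analytic solution, y(0) = 0.1

# Estimate the derivative from the data and fit dy/dt = f(y) as a
# polynomial in y (Eq. 20), with no explicit dependence on t
dydt = np.gradient(y, t)
coeffs = np.polyfit(y, dydt, deg=2)       # expect roughly -y^2 + y

# Integrate the recovered model forward with Euler steps
dt = t[1] - t[0]
y_model = [y[0]]
for _ in range(len(t) - 1):
    y_model.append(y_model[-1] + dt * np.polyval(coeffs, y_model[-1]))
y_model = np.array(y_model)
```

The fitted coefficients recover the governing polynomial f(y) = y(1 - y), and the reintegrated trajectory tracks the original data, which is the sense in which a polynomial Data Model "is" the differential equation.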
If the analytic derivative is not directly available from the measured data, it can be derived from the Turlington polynomial given in Equation (2) following Jaenisch et al.19, who showed its derivative is obtained directly from
dT/dx = m_1 + Σ_{j=2}^{N−1} (m_j − m_{j−1}) · 10^{(x − x_j)/0.001} / (1 + 10^{(x − x_j)/0.001})   (21)
The data can also be directly modeled using eigenfunctions, and although this process is slower, analytical derivatives and integrals are then available as well.

3.2 Change Detection
Figure 13 provides pseudo-code of the algorithm for creating change detection models. By training to the category value of ½, an orthogonal term model is produced whose sensitivity can be controlled using higher and lower tap points. Also, since the feature vector is rolled up using parametric and non-parametric statistics, the queriant is free, when constructing the Change Detection model, to lump groups of descriptive statistics into a large feature vector set. This feature vector set is labeled nominal and used to find matches or deviations, with matches scored as a zero flag and deviations scored as a one flag. Tuning20,21,22 makes the model more sensitive to feature values or groups of feature values that fall outside the training boundaries. Whenever new features or groups of features begin to depart from the valid training boundaries, off-nominal or tip-off conditions are flagged23,24.
Rather than use the actual input feature values, we use parametric and non-parametric descriptions of the entire feature
value set. This treats the feature value set as a time series and models the distribution by tuning to a minimized
regression function. Tuning is achieved by setting the target value of the regression to minimize and characterizing the ensuing polynomial fluctuation around the converged minimum for the placement of decision thresholds. In practice the output feature value used for training is the categorical value of ½.
3.3 Change Model construction
Each Data Model is generated using only a portion (50% or less) of the available nominal data, with the
remainder held back for testing the Data Model after construction. This supports the Data Modeling re-sampling theory that only a portion of the available data is required to produce adequate change models. A feature vector data set or matrix is generated and denoted by
      | FV_1 |   | a_1,1  a_1,2  ⋯  a_1,n |
FV =  | FV_2 | = | a_2,1  a_2,2  ⋯  a_2,n |      (22)
      |  ⋮   |   |   ⋮      ⋮    ⋱    ⋮   |
      | FV_m |   | a_m,1  a_m,2  ⋯  a_m,n |
Each row of feature vector FV contains elements that are generated using different descriptive statistics. For analyzing the Jacoby stellar spectra, six (6) descriptive statistics that constitute the FV are given by
λ = Max(x_j)   (23)
Once the features in Equations (8) - (12) and (23) are calculated for each nominal data set, a Data Change Model is generated using a target regression value of ½ as the nominal output. Once the equation model is trained, boundaries that characterize the fluctuation of the polynomial about ½ are placed on either side of the target regression value. Typically, these boundaries are set at 0.499 and 0.501 with a standard deviation of 0.0012.
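A minimal sketch of one feature-vector row and the tip-off decision (function names ours; the 0.499/0.501 band is from the text, and SciPy's batch skew/kurtosis stand in for the real-time forms of (8)-(12)):

```python
import numpy as np
from scipy.stats import skew, kurtosis

def feature_vector(x):
    """One row of the FV matrix (Eq. 22): descriptive statistics plus
    the maximum-value feature lambda = Max(x_j) of Eq. (23)."""
    z = np.asarray(x, dtype=float)
    zs = (z - z.mean()) / z.std()
    return np.array([z.std(),                      # Eq. (8)
                     skew(z),                      # Eq. (9)
                     kurtosis(z),                  # Eq. (10), excess
                     np.mean(zs**6) - 15,          # Eq. (11)
                     np.mean(zs**8) - 105,         # Eq. (12)
                     z.max()])                     # Eq. (23)

def tip_off(model_output, lo=0.499, hi=0.501):
    """Tip-off decision: 0 (nominal) when the trained Change Model's
    regression output stays inside the band about 1/2, else 1."""
    return 0 if lo < model_output < hi else 1
```

In use, each candidate object's feature vector is passed through the trained polynomial Change Model and its scalar output is handed to `tip_off`; objects flagging 1 are discarded from the candidate list.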
4. FUTURE WORK
Model based archive searching enables a full complement of mathematical tools to be used, including the method of derivatives, finite element analysis, and automatic differential equation modeling. The automatic generation of analytical function models of stellar and photometric data allows interpolation and modeling in the form of analytical derivatives for use in constructing differential equation models across related groups. A Data Modeling algorithm is needed to decompose an equivalent kernel into the convolution of a single column and a single row vector. Finally, we propose an alternative to adaptive optics and speckle imaging using real-time Data Models to enable unsupervised sharp image detection for stacking and image enhancement. Using the concepts of Data Modeling, a novel real-time video and software based artificial adaptive optical system is possible.
5. SUMMARY
In this work, the authors have successfully demonstrated the use of Data Modeling and the tag descriptors to construct Change Detection models of individual data files, models of stellar database features, models of the stellar database itself, brightness curve modeling, exo-planet detection, stellar object image modeling, image selection for stacking and speckle imaging, stellar image enhancement, and text mining. Index tagging of Virtual Observatory data files with descriptive statistics enables both rapid data mining and mathematical modeling. The inclusion of these tags directly into the FITS structure facilitates a substantial improvement in search and processing time of legacy databases. Data Change Detector Modeling uses the descriptive tags for each object to determine if matches to the file of interest exist in the database. The recording of all future observational measurements should include the calculation and storage of these tags. For currently existing databases, computer programs should be written and executed that post-process and add these features to the information stored in current databases. This would facilitate rapid dissemination and sharing of model based information.
ACKNOWLEDGMENTS
The authors would like to thank Scott McPheeters, Tim Aden, and John Deacon for their continued support during the course of this work.
REFERENCES
1. National Virtual Observatory, http://www.us-vo.org/summer-school/index.html, May 24, 2004.
2. Sloan Digital Sky Survey, http://www.sdss.org/, May 20, 2004.
3. Jaenisch, H.M., Handley, J.W., Case, C.T., Songy, C.G., "Graphics based intelligent search and abstracting using Data Modeling", Proceedings of SPIE, Seattle, WA, July 7-11, 2002.
4. Jaenisch, H.M., Filipovic, M.D., "Classification of Jacoby Stellar Spectra Using Data Modeling", Proceedings of SPIE Vol. 4816 (2002).
5. Jaenisch, H.M., Collins, W.J., Handley, J.W., Hons, A., Filipovic, M.D., Case, C.T., Songy, C.G., "Real-time visual astronomy using image intensifiers and Data Modeling", Proceedings of SPIE Vol. 4796 (2002).
6. Hudgins, D., "Catch an Extrasolar Planet", Astronomy, pp. 76-79, May 2002.
7. Handley, J., Jaenisch, H., Lim, A., White, G., Pennypacker, C., Edwards, M., "Data Modeling of Deep Sky Images", Proceedings of SPIE Vol. 5497 (2004).
8. Cidadao, A.J., "Thoughts on Super-Resolution Planetary Imaging", Sky & Telescope, pp. 127-134, December 2001.
9. Jaenisch, H.M., Handley, J.W., Scoggins, J., Carroll, M.P., "Atmospheric Optical Turbulence Model (ATOM) Based On Fractal Theory", Proceedings of SPIE, Los Angeles, CA, January 24, 1994.
10. Roddier, F., Adaptive Optics in Astronomy, Cambridge, UK: Cambridge University Press, 1999.
11. Lim, A., Jaenisch, H., Handley, J., Berrevoets, C., White, G., Deragopian, G., Payne, J., Schneider, M., "Image Resolution and Performance Analysis of Webcams for Ground Based Astronomy", Proceedings of SPIE Vol. 5489 (2004).
12. Lambert, A., Fraser, D., Jahromi, M.R.S., Hunt, B.R., "Super-resolution in image restoration of wide area images viewed through atmospheric turbulence", Proceedings of SPIE Vol. 4792 (2002).
13. Fraser, D., Lambert, A., Jahromi, M.R.S., Clyde, D., Donaldson, N., "Can broad-band image restoration rival speckle restoration?", Proceedings of SPIE Vol. 4792 (2002).
14. Jaenisch, H., Handley, J., "Data Modeling of 1/f noise sets", Proceedings of SPIE Vol. 5114 (2003).
15. Jaenisch, H.M., Handley, J.W., "Data Modeling for Radar Applications", Proceedings of the IEEE Radar Conference 2003 (May 2003).
16. Westwick, D.T., Kearney, R.E., Identification of Nonlinear Physiological Systems, Piscataway, NJ: IEEE, 2003.
17. Madala, H.R., Ivakhnenko, A.G., Inductive Learning Algorithms for Complex Systems Modeling, Boca Raton, FL: CRC Press, 1994.
18. Jaenisch, H.M., Handley, J.W., "Automatic Differential Equation Data Modeling for UAV Situational Awareness", Society for Computer Simulation, Huntsville Simulation Conference 2003 (October 2003).
19. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., "Data Driven Differential Equation Modeling of fBm processes", Proceedings of SPIE Vol. 5204 (2003).
20. Jaenisch, H.M., Handley, J.W., Pooley, J.C., Murray, S.R., "Data Modeling for Fault Detection", Society for Machinery Failure Prevention Technology (MFPT) (April 2003), published in Proceedings of MFPT (2004).
21. Pooley, J.C., Murray, S.R., "Rotordynamic Fault Diagnostics Using Phase Coherent Filtering", Proceedings of the Society for Machinery Failure Prevention Technology (MFPT) (April 2003).
22. Pooley, J.C., Murray, S.R., Jaenisch, H.M., Handley, J.W., "Fault Detection via Complex Hybrid Signature Analysis", JANNAF 39th Combustion, 27th Airbreathing Propulsion, 21st Propulsion Systems Hazards, and 3rd Modeling and Simulation Subcommittees Joint Meeting, Colorado Springs, CO (December 2003).
23. Jaenisch, H.M., Handley, J.W., Bonham, L.A., "Enabling Calibration on Demand for Situational Awareness", Army Aviation Association of America (AAAA), Tennessee Valley Chapter (February 2003).
24. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., Harris, B., "Data Modeling of Network Dynamics", Proceedings of SPIE Vol. 5200 (2003).
FIGURES
[Figure: a data series is reduced to its PDF, rolled up into descriptive statistics (standard deviation, skewness, kurtosis), and fed to a polynomial Data Model T = w0 + w1x1 + w2x2 + w3x3 + w4x1^2 + w5x2^2 + w6x3^2 + w7x1^3 + w8x2^3 + w9x3^3 + w10x1x2 + w11x1x3 + w12x2x3 whose equation output falls in Nominal or Tip-Off regions.]
Fig. 1: Data Change Modeling uses descriptive statistics as shown above that characterize the PDF or histogram of the input data to build a model that is sensitive to small changes in input feature distribution.
Fig. 2: This SPIE paper transformed into a fractal curve (left), and the histogram showing its distribution (right).
[Figure: histogram of correlation scores (axis 0 to 1) versus number of spectra (0 to 37); search-engine groupings for classes O/B, B/A, B/A/F, A/F, F/G/K, G/K/M, and M.]
Fig. 3 (Left) and Fig. 4 (Right): (Left) Correlation is not an acceptable classifier; 140 out of 161 objects score correlations above 0.90. (Right) Graphics based search engine applied to Jacoby features.
[Figure: normalized Jacoby spectra over 3510-7430 angstroms and their histograms for classes O5V, B0V, B6V, A1V, A5V, F0V, F5V, G0V, G6V, K0V, K5V, M0V, and M5V.]
Fig 5: Jacoby spectra examples. The full Jacoby spectra range from 3510 Å to 7427.2 Å. The mathematical description of these histograms using 6 descriptive statistics is proposed as index tags.
DEFCUR A-Z:pi=4@*atn(1@):open "input" for input as #1:input #1, stdev:input #1, skew:input #1, kurt:input #1, m6
input #1, m8:close #1:stdev=(stdev-0.1187)/0.0001:skew=(skew-0.3675)/0.002:kurt=(kurt+0.767)/0.0037
m6=(m6+7.844)/0.0285:m8=(m8+76.5682)/0.2019:l1o1=-0.5787-0.5977*skew+3.7225*kurt-3.0537*m8+0.1586*skew*skew
l1o1=l1o1+0.3774*kurt*kurt+1.2794*m8*m8+0.2712*skew*kurt+0.5787*skew*m8-1.9491*kurt*m8-0.503*skew*kurt*m8
l1o1=l1o1+0.0438*skew*skew*skew-1.1006*kurt*kurt*kurt+1.6117*m8*m8*m8-0.5895*kurt*skew*skew+0.1889*skew*kurt*kurt
l1o1=l1o1+0.5122*skew*m8*m8+0.738*m8*skew*skew+2.9299*m8*kurt*kurt-3.6302*kurt*m8*m8
l1o2=-0.821+0.3054*m6+1.1897*stdev+0.6089*kurt+5.2783*m6*m6+0.1364*stdev*stdev+3.2261*kurt*kurt+0.0459*m6*stdev
l1o2=l1o2-8.5618*m6*kurt+0.0052*stdev*kurt+0.7006*m6*stdev*kurt+1.5339*m6*m6*m6-0.112*stdev*stdev*stdev
l1o2=l1o2-1.6305*kurt*kurt*kurt-0.3927*stdev*m6*m6-0.7456*m6*stdev*stdev+4.9791*m6*kurt*kurt-5.0348*kurt*m6*m6
l1o2=l1o2+0.6639*kurt*stdev*stdev-0.4826*stdev*kurt*kurt
l2o1=0.1901+0.491*l1o1-0.8597*skew+0.9397*l1o2-0.059*l1o1*l1o1-0.2106*skew*skew-0.0241*l1o2*l1o2+0.1474*l1o1*skew
l2o1=l2o1+0.0242*l1o1*l1o2+0.1736*skew*l1o2-0.6347*l1o1*skew*l1o2+0.0232*l1o1*l1o1*l1o1+0.021*skew*skew*skew
l2o1=l2o1-0.1672*l1o2*l1o2*l1o2+0.1348*skew*l1o1*l1o1+0.4039*l1o1*skew*skew+0.3466*l1o1*l1o2*l1o2
l2o1=l2o1-0.254*l1o2*l1o1*l1o1-0.4502*l1o2*skew*skew+0.7386*skew*l1o2*l1o2
l2o3=0.2277+0.1695*l1o2+0.1386*m6+0.6821*l1o1-0.3724*l1o2*l1o2+0.1918*m6*m6-0.1865*l1o1*l1o1-0.0393*l1o2*m6
l2o3=l2o3+0.3757*l1o2*l1o1+0.121*m6*l1o1-0.717*l1o2*m6*l1o1+0.1848*l1o2*l1o2*l1o2-0.0936*m6*m6*m6
l2o3=l2o3-0.0106*l1o1*l1o1*l1o1+0.3308*m6*l1o2*l1o2-0.2212*l1o2*m6*m6+0.0678*l1o2*l1o1*l1o1
l2o3=l2o3-0.1594*l1o1*l1o2*l1o2+0.5126*l1o1*m6*m6+0.0819*m6*l1o1*l1o1
l3o3=-0.2074+0.8909*l2o1+0.3944*m8+0.2778*l2o3+0.2988*l2o1*l2o1+0.0589*m8*m8+0.2505*l2o3*l2o3-0.2058*l2o1*m8
l3o3=l3o3-0.4905*l2o1*l2o3+0.102*m8*l2o3+0.2339*l2o1*m8*l2o3+0.3629*l2o1*l2o1*l2o1-0.0622*m8*m8*m8
l3o3=l3o3-0.7296*l2o3*l2o3*l2o3+0.2209*m8*l2o1*l2o1-0.2667*l2o1*m8*m8+2.1015*l2o1*l2o3*l2o3
l3o3=l3o3-1.7526*l2o3*l2o1*l2o1+0.3038*l2o3*m8*m8-0.5106*m8*l2o3*l2o3
l4o2=0.0496+0.715*l3o3+0.353*m6+0.025*skew+0.0059*l3o3*l3o3-0.0501*m6*m6-0.0724*skew*skew+0.0167*l3o3*m6
l4o2=l4o2-0.16*l3o3*skew+0.1811*m6*skew+0.1065*l3o3*m6*skew+0.0429*l3o3*l3o3*l3o3-0.0413*m6*m6*m6
l4o2=l4o2+0.0148*skew*skew*skew-0.0067*m6*l3o3*l3o3-0.0393*l3o3*m6*m6+0.1431*l3o3*skew*skew
l4o2=l4o2-0.0834*skew*l3o3*l3o3+0.0357*skew*m6*m6-0.1426*m6*skew*skew:l4o2 = l4o2* (0.0004) + (0.5)
If l4o2>.4 and l4o2<.6 then iflag=0 else iflag=1
open "out1" for output as #1:print #1, l4o2,iflag:close #1:end
Fig. 6: Change Detection Data Model for example spectra from the Jacoby database. QBASIC source code is autonomously generated.
DECLARE SUB writefunc (a!(), b!(), cc!, m!, ycept!, iseg!, n!)
DEFSNG A-Z: ON ERROR RESUME NEXT: SCREEN 9: CLEAR : CLS
INPUT "File => ", zz$: INPUT "Number of points => ", n
INPUT "Filter Coeff (0.001) => ", cc: IF cc = 0! THEN cc = .001
REDIM xdata(n), ydata(n): OPEN zz$ FOR INPUT AS #1
FOR i = 1 TO n: INPUT #1, ydata(i): xdata(i) = (i - 1) / (n - 1): NEXT i
CLOSE #1: ymin = ydata(1): ymax = ydata(1): xmin = xdata(1): xmax = xdata(1)
FOR i = 2 TO n
IF ydata(i) < ymin THEN ymin = ydata(i)
IF ydata(i) > ymax THEN ymax = ydata(i)
IF xdata(i) < xmin THEN xmin = xdata(i)
IF xdata(i) > xmax THEN xmax = xdata(i)
NEXT i
FOR i = 1 TO n: ydata(i) = (ydata(i) - ymin) / (ymax - ymin)
xdata(i) = (xdata(i) - xmin) / (xmax - xmin): NEXT i
iseg = 1: REDIM mslope(n - 1), dslope(n - 1), a(n - 1), b(n - 1)
FOR i = 1 TO n - 1: xden = xdata(i + 1) - xdata(i)
IF xden <> 0 THEN mslope(i) = (ydata(i + 1) - ydata(i)) / (xdata(i + 1) - xdata(i))
NEXT i
FOR i = iseg + 1 TO n - 1: dslope(i) = mslope(i) - mslope(i - 1): a(i) = dslope(i) * cc: b(i) = xdata(i): NEXT i
m = mslope(iseg): ycept = ydata(iseg) - m * xdata(iseg)
CLS : WINDOW (0, 1)-(1, 0): xstep = 1 / n
FOR i = 1 TO n - 1: LINE (xdata(i), ydata(i))-(xdata(i + 1), ydata(i + 1)), 4: NEXT i
REDIM ytmp(n): ijk = 0: FOR k = 1 TO n: i = xdata(k): y = m * i + ycept
FOR j = iseg + 1 TO n - 1: y = y + a(j) * LOG(1 + 10 ^ ((i - b(j)) / cc)) / LOG(10): NEXT j: ytmp(k) = y: NEXT k
OPEN "datamdl.out" FOR OUTPUT AS #1: FOR i = 1 TO n: PRINT #1, ymin + ytmp(i) * (ymax - ymin): NEXT i: CLOSE #1
k = 0: FOR i = 1 TO n - 1: LINE (xdata(i), ytmp(i))-(xdata(i + 1), ytmp(i + 1)), 15: IF a(i) <> 0! THEN k = k + 1
NEXT i: REDIM aa(k), bb(k): k = 0: FOR i = 1 TO n - 1
IF a(i) <> 0! THEN k = k + 1: aa(k) = a(i): bb(k) = b(i)
NEXT i
OPEN "datamdl.bas" FOR OUTPUT AS #10
PRINT #10, "defsng a-z : on error resume next :cls: c="; cc
PRINT #10, "n="; k
PRINT #10, "redim a(n-1),b(n-1)"
PRINT #10, "for i=1 to n-1 : read a(i) : read b(i) : next i"
PRINT #10, "open " + CHR$(34) + "datamdl.out" + CHR$(34) + " for output as #1"
PRINT #10, "for x1=0 to"; n - 1
PRINT #10, "x=x1/"; (n - 1)
PRINT #10, "y = "; m; " * x + "; ycept
PRINT #10, "for j=1 to n-1 : y = y + a(j) * LOG(1 + 10 ^ ((x - b(j)) / c)) / LOG(10) : NEXT j"
PRINT #10, "y="; ymin; "+y*("; ymax; "-"; ymin; "):print #1, y : next x1 : close:end"
k1 = 1: FOR j = 1 TO k STEP 4
PRINT #10, "Data "; : PRINT #10, aa(j); ","; bb(j); ","; : k1 = k1 + 1
IF k1 > k THEN EXIT FOR
PRINT #10, aa(j + 1); ","; bb(j + 1); ","; : k1 = k1 + 1
IF k1 > k THEN EXIT FOR
PRINT #10, aa(j + 2); ","; bb(j + 2); ","; : k1 = k1 + 1
IF k1 > k THEN EXIT FOR
PRINT #10, aa(j + 3); ","; bb(j + 3): k1 = k1 + 1
IF k1 > k THEN EXIT FOR
NEXT j: CLOSE #10: ERASE aa, bb: LOCATE 23, 1: INPUT "", za$: END
Fig. 7: Source code for generating Turlington polynomial models.
DEFCUR A-Z:pi=4@*atn(1@):open "input" for input as #1:input #1, peakloc
input #1, oskew:input #1, okurt:close #1
peakloc=(peakloc-8.8947)/4.3213:oskew=(oskew-1.1492)/0.0438:okurt=(okurt-0.2802)/0.1467
l1o2=-0.3336+0.0971*sin(6.2832*oskew)+1.7721*cos(6.2832*oskew)-1.1296*sin(6.2832*peakloc)
l1o2=l1o2+0.8456*cos(6.2832*peakloc)-0.3148*sin(6.2832*okurt)+0.4756*cos(6.2832*okurt)
l2o2=0.3536+1.1824*l1o2-0.3348*peakloc+0.6246*okurt-0.0529*l1o2*peakloc+0.0322*l1o2*okurt
l2o2=l2o2-0.1666*peakloc*okurt+0.4881*l1o2*peakloc*okurt
l1o1=0.143-1.7749*oskew-0.2264*peakloc+1.6572*okurt+3.3617*oskew*peakloc-0.0453*oskew*okurt
l1o1=l1o1-3.806*peakloc*okurt+0.0729*oskew*peakloc*okurt
l2o3=-0.1688+0.879*l1o2-0.3262*l1o1+0.2256*l1o2*l1o2+0.6561*l1o1*l1o1-0.9226*l1o2*l1o1
l2o3=l2o3-0.0518*l1o2*l1o2*l1o2+0.4181*l1o1*l1o1*l1o1+0.8435*l1o1*l1o2*l1o2-0.9513*l1o2*l1o1*l1o1
l3o2=-0.1369+0.17*l2o2+0.0712*l2o3-0.0834*l2o2*l2o2-0.1485*l2o3*l2o3+0.2223*l2o2*l2o3
l3o2=l3o2-0.7185*l2o2*l2o2*l2o2+2.5702*l2o3*l2o3*l2o3+4.437*l2o3*l2o2*l2o2-6.1732*l2o2*l2o3*l2o3
l3o2 = l3o2*0.0004+0.5001
IF l3o2 > .4988 AND l3o2 < .5012 THEN iflag = 0 ELSE iflag = 1
open "out" for output as #1:print #1, l3o2,iflag:close #1:end
Fig. 8: Change Detection Data Model for separating “O” class from other temperature class spectra. When output for iflag is zero, temperature class is “O”. QBASIC source code is autonomously generated.
Fig. 9: Example binary star and exo-planet light curves (left); zoom-in to 3 specific curves (right) showing overlap.
DEFCUR A-Z:pi=4@*atn(1@):open "input" for input as #1:input #1, stdev:input #1, m8:close #1
stdev = (stdev-0.0119) / 0.0071 : m8 = (m8-397208.9192) / 2.5846
l1o2=-0.1275+1.3376*stdev+0.2687*m8
l1o1=0.2485-1.0431*stdev-0.0192*m8-0.7598*stdev*stdev+0.2475*m8*m8+0.2762*stdev*m8
l1o1=l1o1+0.9751*stdev*stdev*stdev-0.0004*m8*m8*m8-0.0943*m8*stdev*stdev+0.2795*stdev*m8*m8
l1o3=0.4628-0.8694*stdev-0.9955*stdev*stdev+1.0526*stdev*stdev*stdev
l2o3=0.1471-0.2874*l1o2+0.4846*l1o1+0.3234*l1o3-0.1091*l1o2*l1o2+0.0417*l1o1*l1o1
l2o3=l2o3+0.2806*l1o3*l1o3+0.5575*l1o2*l1o1+0.3354*l1o2*l1o3-0.9087*l1o1*l1o3
l2o3=l2o3-0.0048*l1o2*l1o1*l1o3+0.1274*l1o2*l1o2*l1o2-0.0065*l1o1*l1o1*l1o1
l2o3=l2o3+0.0768*l1o3*l1o3*l1o3-0.6829*l1o1*l1o2*l1o2+0.6652*l1o2*l1o1*l1o1
l2o3=l2o3+0.0161*l1o2*l1o3*l1o3+0.095*l1o3*l1o2*l1o2-0.374*l1o3*l1o1*l1o1+0.1275*l1o1*l1o3*l1o3
l2o2=0.3835+0.0005*l1o3-0.926*stdev-0.422*l1o1+0.0013*l1o3*l1o3-0.5794*stdev*stdev+0.912*stdev*l1o1*l1o1
l2o2=l2o2-0.2772*l1o3*stdev+0.2485*l1o3*l1o1+0.6057*stdev*l1o1+0.0109*l1o3*stdev*l1o1 -0.53*l1o1*l1o1
l2o2=l2o2-0.1332*l1o3*l1o3*l1o3+0.7475*stdev*stdev*stdev+0.2173*l1o1*l1o1*l1o1-0.1737*stdev*l1o3*l1o3
l2o2=l2o2-0.687*l1o3*stdev*stdev-1.2963*l1o3*l1o1*l1o1+1.0067*l1o1*l1o3*l1o3+0.4285*l1o1*stdev*stdev
l3o3=-0.0042-0.2612*l2o3-0.1423*stdev+1.1172*l2o2+0.0017*l2o3*l2o3+0.0039*stdev*stdev-0.1494*l2o2*l2o2
l3o3=l3o3-0.9722*l2o3*stdev-0.0942*l2o3*l2o2+1.4563*stdev*l2o2-0.0026*l2o3*stdev*l2o2+0.077*l2o3*l2o3*l2o3
l3o3=l3o3+0.1056*stdev*stdev*stdev-0.2506*l2o2*l2o2*l2o2-0.2379*stdev*l2o3*l2o3+0.4022*l2o3*stdev*stdev
l3o3=l3o3+0.349*l2o3*l2o2*l2o2-0.2614*l2o2*l2o3*l2o3-0.6417*l2o2*stdev*stdev+0.556*stdev*l2o2*l2o2
l2o1=-0.8407-2.3665*l1o2+4.7385*stdev+0.5401*l1o1+0.0312*l1o2*l1o2+4.9348*stdev*stdev-0.0819*l1o1*l1o1
l2o1=l2o1-2.0565*l1o2*stdev-0.3378*l1o2*l1o1-0.4258*stdev*l1o1+0.7526*l1o2*stdev*l1o1+2.0705*l1o2*l1o2*l1o2
l2o1=l2o1-15.1765*stdev*stdev*stdev-0.0116*l1o1*l1o1*l1o1-10.5109*stdev*l1o2*l1o2+19.9498*l1o2*stdev*stdev
l2o1=l2o1+0.483*l1o2*l1o1*l1o1-1.0323*l1o1*l1o2*l1o2+2.6964*l1o1*stdev*stdev-0.9501*stdev*l1o1*l1o1
l3o2=0.0355-0.1555*l2o3-0.203*stdev+0.2399*l2o1+0.0036*l2o3*l2o3+0.1149*stdev*stdev-0.4096*l2o1*l2o1
l3o2=l3o2+0.0332*l2o3*stdev+0.1212*l2o3*l2o1+0.6159*stdev*l2o1-0.0709*l2o3*stdev*l2o1+0.9553*l2o3*l2o3*l2o3
l3o2=l3o2+0.0599*stdev*stdev*stdev-0.0111*l2o1*l2o1*l2o1-0.5923*stdev*l2o3*l2o3-0.1128*l2o3*stdev*stdev
l3o2=l3o2+0.4931*l2o3*l2o1*l2o1-1.475*l2o1*l2o3*l2o3-0.1108*l2o1*stdev*stdev+0.92*stdev*l2o1*l2o1
l4o2=0.0026+0.3854*l3o3-0.0285*stdev+0.407*l3o2+0.0002*l3o3*l3o3+0.0057*stdev*stdev+0.0013*l3o2*l3o2
l4o2=l4o2+0.1058*l3o3*stdev-0.0742*l3o3*l3o2+0.1359*stdev*l3o2-0.1609*l3o3*stdev*l3o2+0.004*l3o3*l3o3*l3o3
l4o2=l4o2-0.0007*stdev*stdev*stdev-0.0513*l3o2*l3o2*l3o2+0.3105*stdev*l3o3*l3o3-0.3124*l3o3*stdev*stdev
l4o2=l4o2+0.1142*l3o3*l3o2*l3o2-0.1181*l3o2*l3o3*l3o3-0.2476*l3o2*stdev*stdev+0.217*stdev*l3o2*l3o2
l4o2=l4o2*0.0002+0.5
IF l4o2 > .4988 AND l4o2 < .5012 THEN iflag = 0 ELSE iflag = 1
open "out" for output as #1 : print #1, l4o2, iflag : close #1 : end
Fig. 10: Change Detection Data Model for distinguishing binary stars from exo-planets. An output of iflag=0 denotes an exo-planet. QBASIC source code is autonomously generated.
Y(x)=255(6.4651- 4.2064COS(6πx)+.9284SIN(6πx)-3.4306COS(18πx)-.0182SIN(18πx)+2.9895COS(12πx)+.9458SIN(12πx)+2.2042COS(36πx)-.1SIN(36πx)
-1.6436COS(10πx)-1.3934SIN(10πx)-2.1054COS(6πx)+.4545SIN(6πx)- 1.7485COS(2πx)-.851SIN(2πx)+1.7537COS(8πx)-.6431SIN(8πx)+1.5867COS(24πx)-.9171SIN(24πx)-1.7154COS(18πx)-.0328SIN(18πx)+1.4905COS(12πx)+.4867SIN(12πx)-1.3774COS(30πx)-.3051SIN(30πx)+1.1039COS(36πx)-.0196SIN(36πx)-1.0041COS(42πx)+.3935SIN(42πx)-.8165COS(10πx)-.703SIN(10πx)-1.0538COS(6πx)+.2224SIN(6πx)+.9674COS(22πx)-.4429SIN(22πx)+1.0181COS(72πx)-.1377SIN(72πx)-.3388COS(20πx)+.9632SIN(20πx)+.4204COS(4πx)+.9096SIN(4πx))/64
Fig. 11: Examples of image Data Modeling on M51. (Top Left) original; (Top Center) image converted to a Hilbert sequenced time series; (Top Right) resultant eigenfunction Data Model of M51; and (Bottom) automatically generated eigenfunction Data Model
equation using 20 sine and cosine terms.
rem initialize
redim xdata(n),ydata(n),extreme(n),totaldat(n),r(n),c(n),newdata(n),newdata1(n),r1(n),c1(n)
rem open x and y data array files
call open_data_files(xdata,n)
call open_data_files(ydata,n)
rem sort x into ascending order and sort y with x to retain x associations
call sort2(n,xdata,ydata)
rem put ydata into newdata
for i=1 to n
newdata(i)=ydata(i)
next i
rem specify objective correlation value
cobj=0.99
rem calculate Data Modeling eigenfunction model of y(x)
rem final result held in totaldat; initialize k1 to hold number of terms
k1=0
10 continue
k1=k1+1
rem generate Fourier transform of data and corresponding power spectra
rem use linear regression to fit a line through the dB power spectra
rem identify maxima in power spectra that occur above the linear fit
rem pull out maxima locations in array extreme
rem find maximum value in array extreme, and use location to
rem extract real and imaginary Fourier terms held in r (real) and c(imaginary)
rem for use in eigenfunction model
call calc_extremes(newdata,n,extreme,r,c)
rem find maximum value location ; only look at first half of data since symmetric
rem ignore zeroth order location
n1=int(n/2)
dmax = extreme(1)
dloc = 1
FOR i = 2 TO n1 - 2
IF extreme(i) > dmax THEN
dmax = extreme(i)
dloc = i
END IF
NEXT i
rem construct sine and cosine representation of dloc frequency at sampling equal to original
rem r1 and c1 hold eigenfunction coefficients, d the frequency term
r1(k1)=r(dloc)
c1(k1)=c(dloc)
d(k1)=dloc
FOR i = 1 TO n
newdata1(i) = CCUR(r(dloc)) * COS(dloc * 2# * pi * (i / n))
newdata1(i) = newdata1(i) + CCUR(c(dloc)) * SIN(dloc * 2# * pi * (i / n))
newdata1(i) = newdata1(i) / SQR(n)
NEXT i
rem calculate residual between original and summation of all terms generated so far
FOR i = 1 TO n
newdata(i) = newdata(i) - newdata1(i)
NEXT i
rem add current term to previous terms
FOR i = 1 TO n
totaldat(i) = totaldat(i) + newdata1(i)
NEXT i
rem calculate correlation between original and current model (totaldat)
CALL correl(totaldat,ydata,N,ycorrel)
rem test to see if correlation criteria met
rem test to see if maxterms criteria met
IF ycorrel < cobj AND k1 < .125 * N THEN
GOTO 10
END IF
rem output final model
CALL output_eigen(r1,c1,d,k1,N)
END
Fig. 12: Pseudo-code for Data Modeling eigenfunction model construction.
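The pseudo-code above can be sketched in Python with NumPy's FFT. This is a minimal illustration of the greedy term-selection loop only (the 0.99 correlation objective and 12.5% term budget come from the pseudo-code); it is not the thesis implementation, and the single-bin reconstruction uses NumPy's rfft convention rather than the QBASIC normalization.

```python
import numpy as np

def eigenfunction_model(ydata, cobj=0.99):
    """Greedily add dominant Fourier terms until the model correlates
    with the data at cobj or the term budget (12.5% of N) is reached."""
    n = len(ydata)
    total = np.zeros(n)
    residual = ydata.astype(float).copy()
    terms = []
    while len(terms) < 0.125 * n:
        spec = np.fft.rfft(residual)
        mag = np.abs(spec)
        mag[0] = 0.0                      # ignore the zeroth order (DC) bin
        dloc = int(np.argmax(mag))        # dominant positive frequency
        r, c = spec[dloc].real, spec[dloc].imag
        t = np.arange(n) / n
        # reconstruct this single frequency component (rfft convention)
        term = (2.0 / n) * (r * np.cos(2 * np.pi * dloc * t)
                            - c * np.sin(2 * np.pi * dloc * t))
        residual -= term                  # residual between data and model
        total += term                     # accumulate the model
        terms.append((dloc, r, c))
        if np.corrcoef(total, ydata)[0, 1] >= cobj:
            break
    return total, terms

# usage: model a two-tone signal; the stronger tone is extracted first
t = np.arange(256) / 256
y = np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 17 * t)
model, terms = eigenfunction_model(y)
```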
N=total number of cases
Call Generate_Waveform_Stats(ydata,n,ystats)
Call Generate_Rollup_Stats(ystats,n,rstats)
For i=1 to n
rem 10% Sub-sample nominal cases
Next i
M=number of sub-sampled points
For i=1 to m
rem Construct target output value of ½
rem Allow first 10% of points to be 0.499
rem Allow last 10% of points to be 0.501
Next i
10 continue
rem ______________________________________________________________
rem Data Relations Modeling
rem Construct Data Model using M Stats as input and the target output
rem Use multivariable linear regression to fit polynomial building blocks
rem using at most 3 input variables to map into the target output
rem Use correlation to rank inputs and reduce number if desired
Call correlate_inputs
Call prune_inputs
for i=1 to ninputs
for j=1 to ninputs
for k=1 to ninputs
rem select input #1 from stack
rem select input #2 from stack and different than input #1
rem select input #3 from stack and different than input #1 or #2
for L=1 to number_object_types
call mult_var_regression
call calc_error
next L
next k
next j
next i
rem select D=3 best solutions from above and put on the input stack as functionals
rem repeat until error criteria or max number of layers is achieved
rem score best current case to error criteria
if current_error<=minimum_error then
rem finished
else
goto 10
endif
call output_data_model
rem ______________________________________________________________
For i=1 to n
rem Interrogate Data Relations Model
Call Interrogate_data_model(features,Dmval)
If Dmval>.4988 and Dmval<.5012 then
rem nominal
iflag=0
else
rem tip-off
iflag=1
endif
Next i
end
n=total number of cases
Call Generate_Waveform_Stats(ydata,n,ystats)
Call Generate_Rollup_Stats(ystats,n,rstats)
For i=1 to n
rem Sub-sample of all cases called Group 1
Next i
m=number of subsampled cases
rem use previous pseudo-code for generating a Data Change Model
call generate_data_change_model
for i=1 to n
call Interrogate_data_model(features,Dmval)
call flag_tipoffs
rem Save tip-off data as new Group 2
next i
for i=1 to n
rem subsample another m points from new pool
rem place with 1st 10% subsampled from original Group 1
Next i
m=2*m
call generate_data_change_model
for i=1 to n
call Interrogate_data_model(features,Dmval)
call flag_tipoffs
rem Save as Group 3
next i
rem extract all k tipoff cases from Group 3 and
rem randomly assign to 0/1 class
for i=1 to k
if rnd>.5 then
out=1
else
out=0
endif
next i
10 continue
call generate_data_relations_model
rem score cases
for i=1 to k
call Interrogate_data_model(features,Dmval)
if Dmval<.25 then
trainval=0
elseif Dmval>.75 then
trainval=1
else
if trainval=0 then
trainval=1
else
trainval=0
endif
endif
next i
rem check for limit cycle or 4 layer Data Model learning all cases
if limit_cycle or all_learned then
rem sharp images are the smaller class
rem if equal number in each class, select class with largest RSS
else
goto 10
endif
rem last model generated can be used in lieu of training a final model
call generate_data_change_model
for i=1 to n
call Interrogate_data_model(features,Dmval)
call flag_tipoffs
next i
call write_output
end
Fig. 13: Pseudo-code for constructing supervised Data Change Models (left) and the unsupervised Data Modeling approach for selecting images for stacking (right).
[Block diagram labels: Camera, Time, Sharp Data Model [O(3n)], Background Reject, Stack Data, Modeling Enhance, Image]
Fig. 14: Block diagram of unsupervised Data Modeling approach for selecting images for stacking.
[Figure: a 5x5 image pixel neighborhood z11 ... z55. The Data Model of the equivalent kernel is f(z13, z23, z31, z32, z33, z34, z35, z43, z53) = R, an O(3^n) process; the image pixel locations correspond to the center row and column of the equivalent kernel, and map to the center image pixel locations of the final image after applying the equivalent kernel.]
Fig. 15: Image pixel locations corresponding to the center row and column of the equivalent kernel are transformed into the center image pixel locations after applying the equivalent kernel. This yields the Data Model of the equivalent kernel.
[Figure panel labels: Raw Frame #1 (JPEG artifacts present); Registax sharp selected and wavelet processed, score = 0.1886; Data Modeling, score = 0.1917 (JPEG artifacts removed, noise suppressed)]
Fig. 16: Data Modeling for image stacking on Mars (top), and zoom-in to Mars detail (bottom). The left image is the raw image, the center image is sharp selected and wavelet processed using Registax, and the right image uses the Data Modeling approach. A total of 18 images were stacked for both the Registax and Data Modeling final images.
[Figure panel labels: Raw Frame #1; Registax sharp selected and wavelet processed, score = 0.2026; Data Modeling, score = 0.2053 (noise suppressed)]
Fig. 17: Data Modeling for image stacking on a lunar surface feature (top), and zoom-in to the lunar surface feature (bottom). The left image is the raw image, the center image is sharp selected and wavelet processed using Registax, and the right image uses the Data Modeling approach. Note the absence of noise and the presence of truly resolved detail in the Data Modeling image.
Fig. 18: Tags for images in Figures 16 and 17.
CLS
INPUT "ascii file => ", zz$
INPUT "tag output file => ", tag$
OPEN zz$ FOR INPUT AS #1
OPEN tag$ FOR OUTPUT AS #3
fsum = 0: mean = 0: mean1 = 0: stdev = 0: skew = 0
kurt = 0: m6 = 0: m8 = 0: m = 0: min = 1: max = 0
DO UNTIL EOF(1)
LINE INPUT #1, b$
n = LEN(b$)
FOR i = 1 TO n
z = ASC(UCASE$(MID$(b$, i, 1)))
IF z = 32 THEN
'skip processing of spaces
GOTO 999
ELSEIF z >= 65 AND z <= 90 THEN
'A-Z
z = 10 + (z - 65)
ELSE
'numbers
z = VAL(MID$(b$, i, 1))
END IF
FOR j = 5 TO 0 STEP -1
IF z >= 2 ^ j THEN
z = z - 2 ^ j
f = 1
ELSE
f = 0
END IF
f = 2 * f - 1
fsum = fsum + (f - mean1)
m = m + 1
mean1 = (mean1 * (m - 1) + f) / m
mean = (mean * (m - 1) + fsum) / m
stdev = SQR((((stdev ^ 2) * (m - 1) / m)) + (((mean - fsum) ^ 2) / m))
IF stdev <> 0 THEN
skewnew = ((fsum - mean) / stdev) ^ 3
skew = (skew * (m - 1) + skewnew) / m
kurtnew = ((fsum - mean) / stdev) ^ 4
kurt = (((kurt + 3) * (m - 1) + kurtnew) / m) - 3
m6new = ((fsum - mean) / stdev) ^ 6
m6 = (((m6 + 15) * (m - 1) + m6new) / m) - 15
m8new = ((fsum - mean) / stdev) ^ 8
m8 = (((m8 + 105) * (m - 1) + m8new) / m) - 105
END IF
NEXT j
999 :
NEXT i
LOOP
PRINT #3, mean1, mean, stdev, skew, kurt, m6, m8
CLOSE : END
Fig. 19: Source code for generating the tag estimates in Figure 18 and the tag for this paper included in the abstract.
[Figure: a text bit stream w_i (e.g. 1110010111001101101011010011001) is mapped to z_i = 2*w_i - 1, giving values of -1 or +1, and cumulatively summed as y_i = sum of z_j for j = 0 to i]
Fig. 20: Data Modeling for text mining. Resulting 1/f data set (right) is now amenable for processing as described in Figure 1.
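The text-to-walk mapping of Fig. 20 can be sketched in Python. The 6-bit character encoding mirrors the QBASIC tag generator of Fig. 19 (digits map to 0-9, A-Z to 10-35, spaces are skipped); the walk values y_i are the cumulative sums of the +/-1 bits.

```python
def text_to_walk(text):
    """Map text to 6-bit symbols, expand to a +/-1 bit stream (z = 2w - 1),
    and cumulatively sum into a 1/f-like random walk y."""
    z = []
    for ch in text.upper():
        if ch == ' ':
            continue                      # spaces skipped, as in Fig. 19
        if ch.isdigit():
            v = int(ch)                   # 0-9 map to 0-9
        elif 'A' <= ch <= 'Z':
            v = ord(ch) - 55              # A-Z map to 10-35
        else:
            continue                      # ignore other characters
        for j in range(5, -1, -1):        # 6 bits, most significant first
            w = (v >> j) & 1
            z.append(2 * w - 1)           # z_i = 2*w_i - 1
    y, walk = 0, []
    for zi in z:                          # y_i = sum of z_j for j <= i
        y += zi
        walk.append(y)
    return walk

# usage: 3 characters yield an 18-step walk
walk = text_to_walk("M51")
```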
TABLES
Table 1: Complete Table of Jacoby spectra tag values.
Table 2: Input parameters for generating binary star light curves, along with derived features and Data Modeling results. Luminosity
ratio and radii ratio are constant near the top of the chart for training data on a single object type and varying orbital inclinations. Important data is found in columns 4-8, which describe the histogram of each light curve. Columns 5-7 are the inputs derived for the
Data Model. Rollups showing overlap are at the bottom of each column.
Table 3: Input parameters for exo-planets, along with derived features and Data Modeling results. Luminosity ratio and orbital inclination are constant for the first 50 cases while planet radius varies between 0.1 and 0.5 of the star radius. The second 50 cases contain equal variation in planet radius and orbital inclination variation between 80 and 90 degrees. Columns 4-8 describe the histogram of each light curve. Rollups of the columns are provided, which show the existence of total overlap. Seven cases in the last 50 were classified as being like the training data (Accept), while the remaining 43 were labeled as different (Reject).
VIRTUAL INSTRUMENT PROTOTYPING WITH DATA MODELING
H. M. Jaenisch and J. W. Handley
Sparta, Inc., Huntsville, AL
J. C. Pooley III and S. R. Murray
Amtec, Inc., Huntsville, AL
ABSTRACT
Virtual instruments and sensors are mathematical embodiments of physically unrealizable devices. The mathematical theory for prototyping virtual sensors using Data Modeling is defined, and the theory of change detection using Data Modeling is presented. Several examples of prototype virtual sensors are given, including a physical parameter sensor using trajectory data, a THz anthrax detector, a gear tooth health sensor, and a fault detector.
INTRODUCTION
A virtual instrument comprises a multitude of real and/or virtual sensors. A virtual sensor is a physically unrealizable sensor that exists only as a mathematical construct. It takes as input measurements from other physical devices and fuses these measurements into a derived measurement. The fusion process includes data reduction, normalization, filtering, transformation, and processing, all of which are encompassed by a mathematical function termed a Data Fusion Model, abbreviated as Data Model. In this work, we explore the use of Data Models as virtual sensor prototypes. Such prototypes become physical virtual sensors when they are embedded in firmware in conjunction with physical sensors as a hardware device.
THEORY
For example, consider the measurement of a novel physical parameter termed the Dynamic
Coefficient of Thermal Expansion. The Dynamic Coefficient of Thermal Expansion differs from the classic coefficient of thermal expansion by including the elongation effects due to the coupling of vibration and thermal effects in the material. Clearly, this Dynamic Coefficient of Thermal Expansion does not
currently exist and cannot be measured with existing sensors. It can, however, be inferred from existing sensors and suitable mathematical processing, or the construction of a suitable virtual sensor. This would be achieved as follows:
In the laboratory, strain gauges governed by

ε = ∆R / GF    (1)

placed in the axial and longitudinal directions measure size change in each axis. For strain gauges, ε is strain, ∆R change in resistance, while GF is strain sensitivity or gauge factor. Accelerometers are governed by

a = kx / m    (2)

derived from Hooke's Law and Newton's 2nd law of motion. These measure vibration frequency. In (2), k is the spring constant, x displacement, and m mass. Thermocouples governed by
Approved for public release; distribution is unlimited.
∆T = Va / Sa    (3)

measure temperature change. In (3), the Seebeck coefficient is Sa, the voltage across the thermocouple Va, and the temperature change between the sides of the thermocouple ∆T. From these measurements, the change of material size due to a change in temperature (∆T) is obtained from

∆L = αL∆T    (4)
where L is length and α is the coefficient of thermal expansion. The total change including vibration is defined as

∆L = βL∆ω    (5)

where ∆ω is change in frequency derived from accelerometer measurements and β is proportional to the amplitude of the forcing function. The total expansion of the material would be the sum of effects in (4) and (5), which yields the Dynamic Coefficient of Thermal Expansion (γ) given by

∂L/∂T + ∂L/∂ω = γL    (6)
Because (6) is a partial differential equation based on physical properties in the form of different material properties (T and ω), the Dynamic Coefficient of Thermal Expansion cannot be directly measured. Therefore, strain gauges, thermocouples, and accelerometers are input variables to a Data Model. This Data Model fuses these inputs into meta-variables of the form

u_h = f(ε, a, ∆T)    (7)

These meta-variables form the nested polynomial for the Dynamic Coefficient of Thermal Expansion (γ)

γ = ∑(h=1 to N) ∑(j=0 to k) s_hj · u_h^j    (8)
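As an illustration of (7) and (8), a meta-variable fusing the three sensor inputs can be sketched as a low-order polynomial whose coefficients s_hj would in practice be fit by regression. The readings and coefficients below are placeholders for illustration, not values from this work.

```python
def meta_variable(strain, accel, dT, coeffs):
    """Fuse three sensor readings into one meta-variable u_h = f(eps, a, dT),
    here a truncated polynomial with cross terms (placeholder coefficients)."""
    c0, c1, c2, c3, c12, c13, c23 = coeffs
    return (c0 + c1 * strain + c2 * accel + c3 * dT
            + c12 * strain * accel + c13 * strain * dT + c23 * accel * dT)

def gamma_estimate(meta_vars, s):
    """Nested polynomial of (8): gamma = sum_h sum_j s[h][j] * u_h**j."""
    return sum(s[h][j] * u ** j
               for h, u in enumerate(meta_vars)
               for j in range(len(s[h])))

# hypothetical readings (strain, acceleration, temperature change)
u1 = meta_variable(0.002, 9.7, 1.5, (0.1, 2.0, 0.05, 0.3, 1.0, 0.2, 0.01))
gamma = gamma_estimate([u1], [[0.0, 1.0, 0.1]])
```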
STOCHASTIC (fBm) DIFFERENTIAL EQUATION MODELING
Invariably, sensor measurements involve sample and process noise. This leads to the modeling of sensor measurements using the one-dimensional stochastic differential equation1, 2

dx/dt = f(x) + g(x)n(t)    (9)

which assumes the self-similar or fractal fine structure of a fBm process is modeled by an additive noise term g(x)n(t). A solution to (9) would be

x(t) = ∫ [f(x) + g(x)n(t)] dt    (10)
Lacking a repeatable numerical solution, the classic approach is modeling the fBm ensemble average behavior of (10) using the generalized Fokker-Planck equation of the form

∂P/∂t = −∂/∂x [f(x)P] + (1/2) ∂²/∂x² [g²(x)P]    (11)

for P(t,x) with P(0,x) = δ(x − z). The Fokker-Planck equation is a model of dynamic continuous probability density functions (PDF) for averages of solutions to stochastic differential equations in (9) arising from Brownian motion increments. The Fokker-Planck equation only characterizes an ensemble of realizations; it does not model specific fBm trajectories.
Jaenisch derived a specific formulation for a particular fBm trajectory realization x(t)

x(t) ≅ Df{f(t)} ≅ ∫ [f(x) + g(x)n(t)] dt    (12)

This uses self-similarity inherent in the quasi-periodic function f(x) to define fractal fine structure (micro-trend). This takes the form of an interpolating operator Df applied to a continuous function f(t) (macro-trend)3, 4. This approach shows that the process is modeled by a first order differential equation of the form

dx/dt = f(t)    (13)

where t is the independent variable and f(t) is the fBm process. Differential equation based processes are comprised of continuous differentiable functions. f(t) is cast as a Turlington polynomial T(x) of the form
T(x) = A0 + A1·x + ∑(m=1 to M) Bm · log10(1 + 10^((x − Dm)/d))    (14)
Turlington polynomials are constructed in a piecewise orthogonal term fashion, yielding a continuous function. Handley previously obtained the general form of the derivative2 from (14). From this, the first derivative is

dT/dx = A1 + ∑(m=1 to M) (Bm/d) · 10^((x − Dm)/d) / (1 + 10^((x − Dm)/d))    (15)
Data Relations Modeling discovers the functional relationship between the first derivative and its forcing functions5.
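Equations (14) and (15) can be evaluated directly. The sketch below uses the same transition form as the autonomously generated code of Fig. 7; the coefficients Bm, locations Dm, and filter coefficient d are illustrative values, not fitted ones.

```python
import math

def turlington(x, a0, a1, B, D, d):
    """Evaluate T(x) of (14): linear macro-trend plus smoothed slope breaks."""
    y = a0 + a1 * x
    for Bm, Dm in zip(B, D):
        y += Bm * math.log10(1 + 10 ** ((x - Dm) / d))
    return y

def turlington_deriv(x, a1, B, D, d):
    """Evaluate dT/dx of (15); each term ramps from 0 to Bm/d around Dm."""
    s = a1
    for Bm, Dm in zip(B, D):
        u = 10 ** ((x - Dm) / d)
        s += (Bm / d) * u / (1 + u)
    return s

# illustrative model: one slope break of +0.002/0.001 = +2 at x = 0.5
B, D, d = [0.002], [0.5], 0.001
y0 = turlington(0.4, 0.0, 1.0, B, D, d)   # well before the break: ~x
y1 = turlington_deriv(0.6, 1.0, B, D, d)  # well after the break: ~1 + 2
```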
DATA RELATIONS MODELING
Data Relations Modeling6 derives a mathematical expression from measured data. This inverse modeling and parameter estimation yields functional equations and numeric coefficients. These equations are the functional form of the prior distribution required for Bayesian methods.
Discovering the form of the prior model requires approximating the Kolmogorov-Gabor polynomial

φ = a0 + ∑i ai·xi + ∑i ∑j aij·xi·xj + ∑i ∑j ∑k aijk·xi·xj·xk + …    (16)

using Ivakhnenko Group Method of Data Handling (GMDH) discrete nested functions of the form

x(t) = f(t, x(b1(x(b2(… x(bn(t)) …)))))    (17)

which is a form of orthogonal nested polynomial network. This nested function is comprised of low order basis functions (meta-variables) defined by truncated 3rd order Taylor series approximations to the Logistic Function. In (17), n represents the number of layers in the final Data Model and x(bi(t)) the inputs mapped from the previous layer. Each layer forms a polynomial model that is a fusion of up to 3 variables into a meta-variable.
In order to perform Data Relations Modeling for a stochastic differential equation, one selects the
form of the forcing functions driving T’(x). These forcing functions are inputs to the modeling process, and
T’(x) is the output. If specific prior forms of the forcing functions are known, they can be specified. If the priors are unknown, eigenfunctions derived from dominant positive frequency components are used as the model of the fBm trajectory forcing functions.
A useful model forcing function can be obtained from the correlation function and the
characteristic spectral function for the sampled process.
Our eigenfunction extraction method uses a variation of Wiener filtering to identify individual peaks in a dB power spectrum. First, least squares regression fits a straight line to the entire dB power spectrum, resulting in a slope and a y intercept. The y intercept is the noise floor across the spectrum. Peaks are identified by slope changes using the 1st derivative of the dB power spectrum. The associated Fourier coefficients (cosine (frequency) and sine (phase)) for each peak are used as input into the Data Modeling process equation. Data Models scale as a nested O[3^n] process, resulting in nonlinear differential equations that may have hundreds of terms associated with them. This results in the autonomously generated Data Driven Differential Equation2, 5

dx/dt = f(t, x(b1(x(b2(… x(bn(t)) …)))))    (18)
Data Modeling generates the Data Driven Differential Equation shown in (18) that can be solved using numerical integration. These differential equations of motion are used to obtain state transition matrices for Kalman filtering of nonlinear processes and motions. When such differential equations are coupled with the fractal operator Df, fBm dynamic processes are modeled as

x(t) ≅ Df{f(t)}    (19)

where f(t) is the Runge-Kutta solution to Equation (18) above2.
DATA SEQUENCE MODELING
Data sets (time series, image, or other vectors of information) can be reconstructed from a sparse sub-sampling1, 6. Sparse sub-sampling is achieved by simple naive decimation. The sparsely sub-sampled points are a specimen or Data Model of the original data set. Because they are demonstrated to be the support vector

αi [yi(w·xi + b) − 1] = 0    (20)

of the set, the sample size need never exceed 10% of the original number of points, akin to reconstructing a musical score by saving only every 10th note. Data Sequence Modeling is based on a unique fractal pseudo-derivative operator previously defined2 as
Df{yi} ≡ y(n−1) + ((y(n+1) − y(n−1))/N)·xi + (yi − y(n−1) − ((y(n+1) − y(n−1))/N)·xi)·(y(n+1) − y(n−1))    (21)
This fractal operator evolved from extensive work with the Iterated Function System (IFS) contractive mapping

wf([x; y]) = [an 0; cn dn][x; y] + [en; fn]    (22)

Unlike IFS, Df generates fractal fine structure in a single pass without resorting to random number generators or Markov processes, relying strictly on the Collage and Ruelle-Takens Theorems.
CHANGE DETECTION
Change detection is achieved by creating a Data Model hypersensitive to small changes in
sample distribution. Changes in distribution are often too small to resolve with classical methods. To exploit small changes, the multiplicative high-order terms and cross terms yielded by the Data Model
are required. This is akin to producing a multivariate statistical process control snapshot for very non-Gaussian and nonlinear processes.
Analogous to Shewhart’s X-Bar chart7, a model of the complex multi-variate process that is in
statistical control is derived by observing the time history of the governing distributions. The highly volatile sensitivity of the Data Model equation signals a tip-off when the process dynamics drift or
change8, 9. A Data Model is built by training on nominal cases against a categorical variable value. Overfitting makes the model sensitive to feature values outside the training boundaries. Change detection models detect combinations of incoming feature values that are within valid ranges but simply not in combinations encountered in the nominal training set.
RESULTS AND DISCUSSION
PHYSICAL PARAMETER VIRTUAL SENSOR
Can the mass of an object be directly measured from position? Using classical sensors this
seems untenable. However, a virtual sensor can yield the mass from x and y positions alone. Virtual sensors for cross sectional area and drag coefficient are also developed from the x and y FPA position data
2, 5.
To illustrate, Fig. 1 depicts a measured maneuvering object trajectory. The physics of the target is unknown. The trajectory is biased and distorted by the sensor measurement process and therefore
unknown. A Kalman filter cannot be applied because this object is maneuvering and its equations of motion are unknown. No a priori model exists for the sensor measurement bias and noise. This yields a problem of blind system identification and parameter estimation.
To estimate the equations of motion for this object, we seek a second order differential equation
$$ \frac{d^2 y}{dt^2} = f\!\left(y, \frac{dy}{dt}, c_1, \ldots, c_n, u_1, \ldots, u_m\right) \qquad (23) $$
where f is a continuous function, dy/dt and d²y/dt² are the first and second derivatives of y with respect to time, c1 to cn are constants, and u1 to um are forcing functions. The u1 through um are autonomously selected from a priori domain physics related derivative forcing functions:
$$ a_{Gravity} = \frac{GM}{r^2}, \qquad a_{Drag} = \frac{0.5\, C_D\, \rho\, A\, v^2}{m}, \qquad u_i = \sum_{n=0}^{\infty} a_n \cos(nx) + \sum_{n=0}^{\infty} b_n \sin(nx) \qquad (24) $$
The control parameters are estimated using a genetic algorithm following Koza10. Optimization of parameters uses Levenberg-Marquardt nonlinear least squares fitting. The genetic algorithm is a directed random search parameter estimator defined as follows:
1. Select a gene pool size of n
n, where n is the number of coefficients in the forcing function.
2. Populate pool with random values (apply prior probabilities if known).
3. Rank according to selected cost function (determine points falling outside ¼ standard deviation boundary of objective function, and sum their Euclidean distances).
4. Select frequency of occurrence of mutation and its form (negation or random perturbation).
5. Select number of offspring to propagate forward as ½ nn.
6. Populate offspring pool with top ½ of current pool and spliced offspring (splicing selects a random location in the coefficient array and splices the left side of one parent and the right side of another into a new offspring).
7. Pool is re-ranked as in step 3.
8. Stopping criteria: points within ¼ standard deviation or a user-specified maximum number of iterations. Best candidate solutions pass through Levenberg-Marquardt optimization.
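The steps above can be sketched as a short Python routine. This is a simplified illustration only: it uses a fixed pool size rather than n^n, a plain sum-of-squares cost in place of the ¼-standard-deviation boundary cost, and omits the Levenberg-Marquardt polish:

```python
import random

def genetic_fit(n_coef, cost, iters=50, pool_size=16, seed=1):
    """Directed random search over coefficient vectors (steps 1-8 above,
    simplified)."""
    rng = random.Random(seed)
    # Steps 1-2: populate the pool with random candidates.
    pool = [[rng.uniform(-1.0, 1.0) for _ in range(n_coef)]
            for _ in range(pool_size)]

    def splice(p1, p2):
        # Step 6: left side of one parent, right side of another.
        cut = rng.randrange(1, n_coef) if n_coef > 1 else 0
        return p1[:cut] + p2[cut:]

    def mutate(genes):
        # Step 4: mutation as negation or random perturbation.
        g = list(genes)
        i = rng.randrange(n_coef)
        g[i] = -g[i] if rng.random() < 0.5 else g[i] + rng.gauss(0.0, 0.1)
        return g

    for _ in range(iters):
        pool.sort(key=cost)                    # steps 3 and 7: rank by cost
        top = pool[:pool_size // 2]            # step 5: keep the top half
        children = [splice(rng.choice(top), rng.choice(top))
                    for _ in range(pool_size - len(top))]
        pool = top + [mutate(c) if rng.random() < 0.2 else c
                      for c in children]
    return min(pool, key=cost)

# Hypothetical use: recover the slope of y = 0.7 x from samples.
xs = [i / 10.0 for i in range(11)]
ys = [0.7 * xv for xv in xs]
cost = lambda g: sum((g[0] * xv - yv) ** 2 for xv, yv in zip(xs, ys))
best = genetic_fit(1, cost)
```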
For the trajectory in Fig. 1 (upper left), the genetic algorithm estimated:
A = 0.037 m² (equivalent body area)
D = 0.217 m = 8.5″ (equivalent body diameter)
C_D = 0.69
M = 33.18 kg = 73 lbs (25)
The differential equation modeling steps are:
1. Identifying target physics model
   a. Position and velocity derivatives
   b. Physics based forcing function derivatives
   c. Thrust derivative
   d. Lumped parameter stability derivatives
2. Identifying sensor bias model
   a. Position bias
   b. Orientation error
   c. Measurement error
   d. Boresight error
Details are found in previous work5. The final results are in Fig. 1. The mode of the error lies in the 8th bin of the histogram, corresponding to a most frequently occurring error bound of -1.0 to 0.5 meters. This matches the original Cramer-Rao bound ambiguity. This results in the final second order differential equation form
$$ \frac{d^2 y}{dt^2} = T(f) + g(f) + C_D(f) + f(\cos 2, \cos 5, \cos 8, \cos 14, \cos 18, \cos 20) $$
$$ \quad + f(\cos 3, \cos 5, \cos 8, \cos 17, \cos 23, \cos 25, \cos 31, \cos 36) $$
$$ \quad + f(\cos 5, \cos 8, \cos 17, \cos 23, \cos 25, \cos 31, \cos 36) + f(\cos 2, \cos 5, \cos 17, \cos 20) \qquad (26) $$
where T is thrust, g is gravity, CD is drag, and each cosine term is a dominant positive frequency
component. The number for each cosine is the number of cycles.
Fig. 1. Measured trajectory data compared with model overlay (top left), physics based and lumped stability derivative (fin number, aspect, and orientation) forcing functions (top right), user input thrust acceleration profile (bottom left), and orientation, measurement, and boresight error derivatives (bottom right).
Bt SIMULANT VIRTUAL THz SENSOR
Fig. 2 shows 5 different anthrax Bt simulant concentrations measured with a THz sensor (1.6, 1.8, 2.7, 3.8, and 4.4%). For this study, no field data, simulation, or code existed, only a single viewgraph with crude representation of measurements. This viewgraph had to be scanned in from a paper copy, and
each of the data curves extracted from the viewgraph into a data file. This was done by plotting the bitmap image from the scan to the screen, isolating each of the data curves separately, and reading pixel values relative to the screen coordinates to determine the values of the curves. Once this
was done, the curves were normalized to the values given on the absorption axis of the original chart, and interpolated up to the correct number of points using Data Sequence Modeling.
Because the presence and locations of peaks in each Bt spectrum varied as a function of concentration (particularly above 9 THz), neither simple linear nor higher order Lagrangian interpolation was applicable. Both Data Relations Modeling and Data Sequence Modeling were required to construct
this virtual sensor prototype.
Creating this virtual prototype required the following process.
1. Generate prototype spectrum by averaging the 5 known spectra.
2. Sub-sample prototype 20% (52 points).
3. Convert specimen points into an equation by fitting a Turlington polynomial.
4. Sub-sample resulting polynomial and each of the original spectra 8:1 (12.5%).
5. Calculate difference between the polynomial and each spectrum.
6. Concentration, frequency, and prototype are inputs. Differences are the training output.
7. Generate equation (Fig. 3) that generates the THz absorption spectrum.
8. Apply Df to (7) providing fine structure.
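The Turlington polynomial fit in step 3 has the form visible in the Appendix `proto` listing: a line plus one smoothed-ramp term per knot. A minimal Python rendering of that form (the slope, intercept, and knot values below are taken from the first terms of the Appendix listing; the large-argument guard is an addition for numerical safety):

```python
import math

def turlington(x, slope, intercept, knots):
    """Evaluate a Turlington polynomial: a line plus one smoothed ramp
    per knot (x_i, c_i). The 0.01 transition width follows the QBASIC
    Appendix source."""
    y = slope * x + intercept
    for xi, ci in knots:
        t = (x - xi) / 0.01
        # log10(1 + 10**t) is ~0 below the knot and ~t above it, so each
        # term adds a smooth kink of slope ci/0.01 past x_i.  For large t
        # use the asymptote directly to avoid float overflow.
        y += ci * (t if t > 30 else math.log10(1 + 10 ** t))
    return y

# First line and first knot of the Appendix proto() function:
absorption = turlington(5.0, -0.0783, 2.6417, [(5.1724, -0.0009)])
```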
Fig. 2. Truth Bt anthrax simulant THz spectra (1.6, 1.8, 2.7, 3.8, and 4.4% concentrations) and Data Model Virtual Sensor Prototype spectra predictions (1.7, 2.25, 3.25, and 4.1% concentrations).
The predicted spectra are shown in Fig. 2, and the Data Model virtual sensor prototype is shown in the Appendix. This virtual sensor prototype requires as input the desired concentration, and writes out into a file named Output the absorption spectrum between 5 and 9.8 THz for the input concentration. The
source code is written in QBASIC, and can easily be converted into any other language of choice. A virtual sensor prototype that estimates the Bt concentration from measured absorption
spectra is also given in the Appendix. The Bt spectrum virtual sensor prototype is run in a loop between
minimum and maximum frequency, and its absorption output (γ) and associated frequency (ν) are used as
input into the Bt concentration virtual sensor prototype. The average concentration (ζ) is then determined
by averaging all of the concentration outputs (n) using the following functional form
$$ \zeta = \frac{1}{n} \sum_{i=1}^{n} f(\gamma, \nu) \qquad (27) $$
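A sketch of the Eq. (27) loop, with hypothetical stand-in lambdas in place of the two Appendix prototypes (the real prototypes are the QBASIC listings; these closed-form stand-ins exist only to show the plumbing):

```python
def average_concentration(spectrum_model, concentration_model, freqs):
    """Eq. (27): run the spectrum prototype across frequencies, feed each
    (absorption, frequency) pair to the concentration prototype, and
    average the n concentration outputs."""
    estimates = [concentration_model(spectrum_model(nu), nu) for nu in freqs]
    return sum(estimates) / len(estimates)

# Hypothetical stand-ins for the two Appendix prototypes:
spectrum = lambda nu: 2.0 + 0.1 * nu                    # gamma(nu)
concentration = lambda gamma, nu: (gamma - 2.0) * 10.0  # zeta(gamma, nu)
freqs = [5.0 + 0.1 * i for i in range(49)]              # 5.0 to 9.8 THz
zeta = average_concentration(spectrum, concentration, freqs)
```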
The Bt concentration model and the Bt spectrum model together form an end-to-end high
fidelity simulation. Once each spectrum has been generated, Data Modeling can be applied to the data to provide a
transfer function that transforms the Bt simulant data into the live anthrax absorption spectrum expected in field tests. This defines a virtual sensor that responds to a stimulus available in controlled environment settings, and transforms them into the response expected in-situ.
For cloud dispersion mapping, only a limited number of sensors may be available for spatial deployment. For example, in the case of measuring anthrax Bt simulant concentrations with a terahertz (THz) sensor, a very coarse grid of sensors is deployed as shown in Fig. 3. This limited number of
sensors sub-samples the measurement space. Using Data Modeling, these sensors yield spatial interpolation and extrapolation. A proof of concept for such virtual 3-D spatial sensors was previously demonstrated
12.
[Fig. 3 graphic: bio agent gas cloud propagating with the prevailing wind from a detonation at time t*, agent concentration versus time at t* + ∆t from a meteorological cloud propagation model, observed by a standoff detector and a point detector grid (hollow = virtual, filled = real).]
Fig. 3. Solid point detectors represent those in the field for the test, while hollow squares represent virtual sensors whose measurements are generated with Data Modeling.
GEAR MESH FREQUENCY VIRTUAL SENSOR
The Gear Mesh Frequency Virtual Sensor prototype uses as input the measured accelerometer data, which is transformed into the instantaneous frequency corrected acceleration from which the gear mesh frequency is extracted as output.
This sensor prototype corrects shaft velocity variation $\dot{\theta}(t)$ during multiple shaft speed periods and acquisition snapshot time intervals using Rotordynamic Kinetics Variation Correction (RKVC)13, 14, 15, 16. RKVC does not require a tach pulse, but does require expected nominal shaft speeds and allowable velocity variations from nominal speed.
The input acceleration signal x(t) may contain several spectral components at different carrier
frequencies fc. This x(t) signal is band pass filtered about a carrier fc(k)
$$ X(f) \rightarrow H(f) \rightarrow X_k(f) \qquad (28) $$
where $H(f) = 1$ if $f_c - \Delta f \le f \le f_c + \Delta f$ and $H(f) = 0$ otherwise. This modified $X_k(f)$ is then Hilbert transformed (H.T.) into

$$ X_k(f) \xrightarrow{\text{H.T.}} \tilde{X}_k(f) \qquad (29) $$
which yields the analytic signal after inverse Fourier transforming

$$ \hat{x}_k(t) = x_k(t) + j\,\tilde{x}_k(t) \qquad (30) $$

from which the instantaneous phase $\phi_k(t)$ for the $k$th synchronous waveform of carrier $f_c(k)$ is

$$ \phi_k(t) = \arctan\!\left(\frac{\tilde{x}_k(t)}{x_k(t)}\right) \qquad (31) $$
Then the primary signal x(t) can be instantaneously phase corrected for velocity variations within periods.
$$ \Delta t = \frac{\phi_k(t)}{\omega(t)} \qquad (32) $$
This is the instantaneous temporal correction for each time sample. This regenerated time sampling history is interpolated to recreate a new time signal with constant angular velocity. From this
new time signal with constant angular velocity, the gear mesh frequency is determined, and the new time signal used to indirectly measure gear tooth health status.
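The chain of Eqs. (28)-(32) can be sketched numerically. This illustration builds the analytic signal with an FFT-based Hilbert transform and resamples against phase; it is a stand-in under stated assumptions (a synthetic periodic tone, no explicit band-pass filtering stage) rather than the RKVC implementation itself:

```python
import numpy as np

def instantaneous_phase(x):
    """Analytic signal via the FFT (zero negative frequencies, double the
    positive ones), then instantaneous phase from the quadrature and
    in-phase parts (Eqs. 28-31)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:n // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    analytic = np.fft.ifft(spectrum * h)      # x(t) + j x~(t)
    return np.unwrap(np.angle(analytic))

# A gear-mesh-like tone whose shaft speed wobbles within the record:
n = 1024
t = np.arange(n) / n
true_phase = 2 * np.pi * 50 * t + 0.3 * np.sin(2 * np.pi * 3 * t)
x = np.cos(true_phase)

phi = instantaneous_phase(x)
omega = 2 * np.pi * 50                        # nominal angular rate
dt = (phi - omega * t) / omega                # Eq. (32): temporal correction

# Resample against phase to recover a constant-angular-velocity signal:
x_corrected = np.interp(omega * t, phi, x)
```

After resampling, `x_corrected` closely matches a constant-speed tone, which is the precondition for reading the gear mesh frequency cleanly from its spectrum.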
VIRTUAL FAULT DETECTOR
Since determination of current helicopter gearbox condition as nominal or anomalous is physically unrealizable directly from accelerometer measurements alone, a virtual sensor is required. The Virtual Fault Detector uses dominant eigenfunctions of the accelerometer measurements. These eigenfunctions are calculated in the same manner described for the Physical Parameter Virtual Sensor. The vibration data passes through a change detector. If tip-offs occur, the data is flagged for verification. Prototyping occurs in three parts:
1. Construction of a Data Model for the nominal data.
2. Construction of a Data Model for the anomalous data.
3. Construction of a combined Data Model for the difference between the second and first models.
This third model is the actual model used for anomalous behavior. A 100 kHz data stream was sampled for .05 seconds, yielding 4096 points. This data was sub-
sampled by 10% to 410 points. Eigenfunction extraction yielded the inputs for constructing the nominal and anomalous Data Models. Df reconstructed the equation output at the same sampling rate as the
original (4096 points). Comparison is made between the Data Model and interpolation spectrum. These power spectra
are obtained from linear interpolation of the 410 sub-sampled points and the Data Model. Fig. 4 shows the integrated error between these spectra. Both linear interpolation (top) and Data Modeling (bottom) captured low frequency content.
The Nyquist location in the power spectrum corresponds to ½ the number of sub-sampled points,
or in this case 205 terms into the power spectrum (½ of the 410 sub-sampled points). However, the Data
Model was able to more accurately capture the higher frequency information (shown by the higher value for the linear interpolation curve over the Data Model on the right of the graph in Fig. 4) than the classical linear interpolation approach.
Fig. 4. Integrated error between power spectrum for nominal Data Model and original (bottom) and linear interpolation and original (top). Error is less for the Data Model than for linear interpolation at the higher frequencies, and nearly equal up to the point predicted by Nyquist as the maximum frequency captured with the sub-sample rate.
Once the nominal and anomalous equation Data Models are built, a third and final combined
equation Data Model is built using the difference between the nominal and anomalous equation Data Models. The difference data is generated by subtracting the nominal equation output from the anomalous equation output. A combined equation Data Model is then generated in the same manner as for the
nominal and anomalous data using Data Relations Modeling with eigenanalysis.
The power spectrum for the nominal and anomalous Data Modeling outputs are shown in Fig. 5,
depicting the first 512 points (first ¼) in the power spectrum for each. These power spectrum curves actually fall on top of one another, but have been shifted in mean so that the structure of each can be seen. The final model is a signal amplifier that minimizes entropy from insignificant sidebands and
folding frequencies and redistributes the energy into the dominant peaks.
Fig. 5. Power spectra for nominal (left), anomalous (middle), and histogram of anomalous time series (right). Main gear mesh frequency (980 Hz) indicated by arrow.
The gear mesh frequency (980 Hz) is the first peak from the left (indicated by an arrow in Fig. 5). The side lobes around 980 Hz are used to detect anomalies.
The combined Data Model equation used to generate the spectra values is given in the Appendix.
This Data Model replaces the one-million point measurement set with the short source code found in the Appendix. This Data Model uses as input a flag of fault or no-fault conditions (specified by a 1 or 0), resulting in
$$ \lambda(i) = g(f(\cos 40, \cos 66, \cos 79, \cos 95, \cos 110, \cos 131, \cos 199)) $$
$$ \quad + w \cdot f(\cos 40, \cos 66, \cos 79, \cos 95, \cos 110, \cos 112, \cos 131, \cos 133, \cos 199, \cos 201) \qquad (33) $$
A change detector is given in the Appendix. 4096 points from the nominal Data Model were divided into 64 point windows. These windows were characterized using mean, standard deviation, skewness, kurtosis, 6th and 8th moments, and entropy fractal dimensions14. This change detector performed as shown in Table 1.
            Correct Classification    False Alarm     BDR
Nominal     64 / 64 (100%)            0 / 64 (0%)     95.5%
Anomalous   61 / 64 (95%)             3 / 64 (5%)     95.5%
Table 1. Virtual fault detector results.
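The 64-point window characterization above can be sketched as follows (mean, standard deviation, and standardized 3rd, 4th, 6th, and 8th moments; the entropy fractal dimensions used in the paper are omitted from this sketch):

```python
import numpy as np

def window_features(signal, window=64):
    """Characterize each non-overlapping window by its mean, standard
    deviation, and standardized 3rd, 4th, 6th, and 8th moments."""
    feats = []
    for i in range(0, len(signal) - window + 1, window):
        w = signal[i:i + window]
        mu, sigma = w.mean(), w.std()
        z = (w - mu) / sigma
        feats.append([mu, sigma] + [np.mean(z ** k) for k in (3, 4, 6, 8)])
    return np.array(feats)

# 4096 points -> 64 windows of 64 points, six features per window.
rng = np.random.default_rng(0)
feats = window_features(rng.normal(0.0, 1.0, 4096))
```

A change is flagged when an incoming window's feature vector drifts outside the envelope of the nominal windows.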
Concatenating nominal and anomalous windows resulted in the Bayesian Detection Rate (BDR)
$$ BDR = \frac{r\,p}{r\,p + f\,(1 - p)} \qquad (34) $$
where r is the true positive rate (64 out of 128 = ½), p is the probability of fault (number of faults / total number of measurements) (64 out of 128 = ½), and f is the false positive rate (3 out of 128).
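Eq. (34) with the values quoted above can be checked directly:

```python
def bayesian_detection_rate(r, p, f):
    """Eq. (34): posterior probability that a flagged window is a true
    fault, given true positive rate r, fault probability p, and false
    positive rate f."""
    return (r * p) / (r * p + f * (1 - p))

# Values from the text: r = 1/2, p = 1/2, f = 3/128 -> BDR of Table 1.
bdr = bayesian_detection_rate(0.5, 0.5, 3 / 128)
```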
SUMMARY AND CONCLUSIONS

The creation of a virtual instrument, and therefore virtual sensors, is made possible through the
use of Data Modeling. The mathematical theory behind Data Modeling has been explained, and several practical examples of prototype virtual sensors were implemented. These examples include virtual sensors for physical parameter measurement from trajectory position, absorption spectra from view
graphs, indirect measurement of gear tooth health through measurement of gear mesh frequency, and fault detection without a priori knowledge of fault conditions during training.
ACKNOWLEDGEMENTS
The authors would like to thank Dr. Bill Pickel and Dr. Mehmet Erengil of the University of Texas at Austin Institute for Advanced Technology (IAT) for providing the measured trajectory used in constructing the physical parameter virtual sensor. The authors would also like to thank John Deacon of
Sparta, Inc; Tim Aden and Scott McPheeters of US Army PEO-SHORAD for their support of the authors’ work; Dr. Andrew Greene, Nils Höltge, Dr. Marcel Thuerk, and Sion Balass of Matrix-ADS Limited for supporting the genetic algorithm method of selecting parameters; and Licht Strahl Engineering INC and
Dr. Marvin Carroll of Tec-Masters, Inc. for use of the Majqanda tool suite in the analysis in this paper.
REFERENCES
1. Jaenisch, H. and Handley, J., Data Modeling of 1/f noise sets, Proceedings of SPIE Fluctuations
and Noise 2003, Santa Fe, NM, (June 2003).
2. Jaenisch, H.M., Handley, J.W., and Faucheux, J.P., Data Driven Differential Equation Modeling of fBm processes, Proceedings of SPIE, (August 2003).
3. Handley, J.W., Jaenisch, H.M., Bjork, C.A., Richardson, L.T., and Carruth, R.T., Chaos and Fractal
Algorithms Applied to Signal Processing and Analysis, Simulation, 60:4, 261-279 (1993). 4. Handley, J.W. On The Existence Of A Transition Point In Sampled Data Using Fractal Methods.
Master's Thesis, Ann Arbor, MI: UMI, 1995.
5. Jaenisch, H.M., and Handley, J.W., Automatic Differential Equation Data Modeling for UAV Situational Awareness, Society for Computer Simulation, Huntsville Simulation Conference 2003, (October 2003).
6. Jaenisch, H.M. and Handley, J.W., Data Modeling for Radar Applications, Proceedings of the IEEE Radar Conference 2003, (May 2003).
7. Grant, E. Statistical Quality Control. New York: McGraw-Hill, 1952.
8. Jaenisch, H.M., Handley, J.W., and Bonham, L.A., Enabling Calibration on Demand for Situational Awareness, Army Aviation Association of America (AAAA), Tennessee Valley Chapter, (February 2003).
9. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., and Harris, B., Data Modeling of Network Dynamics, Proceedings of SPIE, (August 2003).
10. Koza, J.R., Genetic Programming: On the Programming of Computers by Means of Natural
Selection, Cambridge, MA: MIT Press, 1992. 11. Jaenisch, H.M., Handley, J.W., White, K.H., Watson Jr., J.W., Case, C.T, and Songy, C.G., Virtual
prototyping with Data Modeling, Proceedings of SPIE, Seattle, WA, (July 2002).
12. Jaenisch, H.M., Handley, J.W., Roberts, S.K., Case, C.T., and Songy, C.G. Mapping of underwater topology using indirect sensing and Data Modeling, Proceedings of SPIE, Seattle, WA, (July 2002).
13. Pooley, J.C., Murray, S.R., Jaenisch, H.M., and Handley, J.W., Fault Detection via Complex Hybrid Signature Analysis, JANNAF 39th Combustion, 27th Airbreathing Propulsion, 21st Propulsion Systems Hazards, and 3rd Modeling and Simulation Subcommittees Joint Meeting, Colorado Springs,
CO, (December 2003). 14. Jaenisch, H.M., Handley, J.W., Pooley, J.C., and Murray, S.R. Data Modeling For Complex
Epicyclical Gearbox Fault Detection, JANNAF 39th Combustion, 27th Airbreathing Propulsion, 21st
Propulsion Systems Hazards, and 3rd Modeling and Simulation Subcommittees Joint Meeting, Colorado Springs, CO, (December 2003).
15. Dempsey, P.J., Handschuh, R.F., and Afjeh, A.A., Spiral Bevel Gear Damage Detection Using Decision Fusion Analysis, 5th International Conference on Information Fusion, Annapolis, MD, (July 2002).
16. McFadden, P.D., A Technique For Calculating The Time Domain Averages of the Vibration of
the Individual Planet Gears and the Sun Gear in an Epicyclic Gearbox, Journal of Sound and Vibration, 1991, 144(1), 163-172.
APPENDIX

Bt spectrum virtual sensor prototype (QBASIC source code is in two columns. Place the second column listing immediately after the first column to run.)
DEFCUR A-Z: DECLARE FUNCTION yt@ (x@, conc@): DECLARE FUNCTION proto@ (x@)
INPUT "Concentration (1.6% - 4.4%) => ", conc
OPEN "output" FOR OUTPUT AS #1
ntotal = 256: nsub = 32: ymin = 1.3@: ymax = 5.73@
10 : n = INT(x * (nsub - 1) / (ntotal - 1)) + 1
istart = CINT(((ntotal - 1) * (n - 1)) / (nsub - 1))
ifinish = CINT(((ntotal - 1) * n) / (nsub - 1))
iend = CINT(((ntotal - 1) * (n + 1)) / (nsub - 1))
IF iend > 255 THEN iend = iend - 255
xpt = (x - istart) / (ifinish - istart)
ix = (ntotal - 1) * xpt
s = ((yt(iend, conc) - ymin) / (ymax - ymin))
s = s - ((yt(istart, conc) - ymin) / (ymax - ymin))
REM Fractal operator (Data Sequence Model)
REM applied to yt (Data Relations Model)
dfde = (((yt(ifinish, conc) - yt(istart, conc)) / (ntotal - 1))) * ix
dfde = dfde - ((s * (yt(ntotal - 1, conc) - yt(0, conc)) / (ntotal - 1))) * ix
dfde = dfde + s * yt(ix, conc) + yt(istart, conc)
dfde = dfde - yt(0, conc) * s
PRINT #1, dfde: IF x < 255 THEN : x = x + 1: GOTO 10
CLOSE : END

FUNCTION proto (x)
REM Turlington polynomial
y = -.0783 * x + 2.6417
y = y + -.0009 * LOG(1 + 10 ^ ((x - 5.1724) / .01)) / LOG(10)
y = y + .0022 * LOG(1 + 10 ^ ((x - 5.3448) / .01)) / LOG(10)
y = y + .0066 * LOG(1 + 10 ^ ((x - 5.5172) / .01)) / LOG(10)
y = y + -.0006 * LOG(1 + 10 ^ ((x - 5.6897) / .01)) / LOG(10)
y = y + -.0042 * LOG(1 + 10 ^ ((x - 5.8621) / .01)) / LOG(10)
y = y + .0072 * LOG(1 + 10 ^ ((x - 6.0345) / .01)) / LOG(10)
y = y + -.0075 * LOG(1 + 10 ^ ((x - 6.2069) / .01)) / LOG(10)
y = y + .0127 * LOG(1 + 10 ^ ((x - 6.3793) / .01)) / LOG(10)
y = y + -.0114 * LOG(1 + 10 ^ ((x - 6.5517) / .01)) / LOG(10)
y = y + .0104 * LOG(1 + 10 ^ ((x - 6.7241) / .01)) / LOG(10)
y = y + -.0007 * LOG(1 + 10 ^ ((x - 6.8966) / .01)) / LOG(10)
y = y + -.0431 * LOG(1 + 10 ^ ((x - 7.069) / .01)) / LOG(10)
y = y + .009 * LOG(1 + 10 ^ ((x - 7.2414) / .01)) / LOG(10)
y = y + .0215 * LOG(1 + 10 ^ ((x - 7.4138) / .01)) / LOG(10)
y = y + .013 * LOG(1 + 10 ^ ((x - 7.5862) / .01)) / LOG(10)
y = y + .0028 * LOG(1 + 10 ^ ((x - 7.7586) / .01)) / LOG(10)
y = y + .0024 * LOG(1 + 10 ^ ((x - 7.931) / .01)) / LOG(10)
y = y + -.0232 * LOG(1 + 10 ^ ((x - 8.1034) / .01)) / LOG(10)
y = y + -.0114 * LOG(1 + 10 ^ ((x - 8.2759) / .01)) / LOG(10)
y = y + .0006 * LOG(1 + 10 ^ ((x - 8.4483) / .01)) / LOG(10)
y = y + .0114 * LOG(1 + 10 ^ ((x - 8.6207) / .01)) / LOG(10)
y = y + .0102 * LOG(1 + 10 ^ ((x - 8.7931) / .01)) / LOG(10)
y = y + -.0037 * LOG(1 + 10 ^ ((x - 8.9655) / .01)) / LOG(10)
y = y + .0166 * LOG(1 + 10 ^ ((x - 9.1379) / .01)) / LOG(10)
y = y + -.0297 * LOG(1 + 10 ^ ((x - 9.3103) / .01)) / LOG(10)
y = y + .0266 * LOG(1 + 10 ^ ((x - 9.4828) / .01)) / LOG(10)
y = y + -.0237 * LOG(1 + 10 ^ ((x - 9.6552) / .01)) / LOG(10)
y = y + .0349 * LOG(1 + 10 ^ ((x - 9.8276) / .01)) / LOG(10)
proto = y
END FUNCTION

FUNCTION yt (x, conc)
ON LOCAL ERROR RESUME NEXT
REM Kolmogorov-Gabor polynomial (Data Relations Model)
x1 = x
f = 5@ + 4.8@ * x1 / 255
p1 = proto(f)
psub = (p1 - 2.8587) / .3133
fsub = (f - 7.5) / 1.2623
csub = (conc - 2.86) / .992
l1o1 = .1757
l1o1 = l1o1 + (1.0155) * ( csub )l1o1 = l1o1 + (.1212) * (fsub )l1o1 = l1o1 + (.0355) * (psub )l1o1 = l1o1 + ( -.1403) * ( csub * csub )
l1o1 = l1o1 + ( -.002) * (fsub * fsub)l1o1 = l1o1 + (.0058) * (psub * psub)l1o1 = l1o1 + (.0652) * (csub * fsub)l1o1 = l1o1 + (.0926) * (csub * psub)
l1o1 = l1o1 + ( -.0108) * ( fsub * psub)l1o1 = l1o1 + ( -.0223) * ( csub * fsub * psub )l1o1 = l1o1 + ( -.0086) * ( csub * csub * csub )
l1o1 = l1o1 + ( -.002) * (fsub * fsub * fsub )l1o1 = l1o1 + ( -.0149) * ( psub * psub * psub )l1o1 = l1o1 + ( -.096) * (fsub * csub * csub )l1o1 = l1o1 + (.0266) * (csub * fsub * fsub )
l1o1 = l1o1 + (.004) * ( csub * psub * psub)l1o1 = l1o1 + ( -.0108) * ( psub * csub * csub )l1o1 = l1o1 + ( -.0109) * ( psub * fsub * fsub )l1o1 = l1o1 + (.0177) * (fsub * psub * psub)
l2o2 = .0615l2o2 = l2o2 + (.0341) * (l1o1)l2o2 = l2o2 + (.6378) * (csub )l2o2 = l2o2 + ( -.1514) * ( psub )
l2o2 = l2o2 + ( -.3858) * (l1o1 * l1o1)l2o2 = l2o2 + ( -1.2049) * ( csub * csub )l2o2 = l2o2 + ( -.0637) * ( psub * psub)
l2o2 = l2o2 + (1.5586) * (l1o1 * csub )l2o2 = l2o2 + (.9776) * (l1o1 * psub)l2o2 = l2o2 + ( -1.0147) * ( csub * psub )l2o2 = l2o2 + (.7615) * (l1o1 * csub * psub)
l2o2 = l2o2 + ( -.7653) * (l1o1 * l1o1 * l1o1)l2o2 = l2o2 + ( -.012) * (csub * csub * csub )l2o2 = l2o2 + (.0165) * (psub * psub * psub)l2o2 = l2o2 + (2.2658) * ( csub * l1o1 * l1o1)
l2o2 = l2o2 + ( -1.2412) * (l1o1 * csub * csub)l2o2 = l2o2 + (.118) * (l1o1 * psub * psub)l2o2 = l2o2 + ( -.0762) * ( psub * l1o1 * l1o1)l2o2 = l2o2 + ( -.6542) * ( psub * csub * csub )
l2o2 = l2o2 + ( -.2157) * ( csub * psub * psub )l1o2 = .1727l1o2 = l1o2 + (1.0447) * ( csub )
l1o2 = l1o2 + (.101) * ( psub)l1o2 = l1o2 + ( -.1425) * ( csub * csub )l1o2 = l1o2 + ( -.0026) * ( psub * psub)l1o2 = l1o2 + (.1389) * (csub * psub)
l1o2 = l1o2 + ( -.0068) * ( csub * csub * csub)l1o2 = l1o2 + ( -.0028) * ( psub * psub * psub) l1o2 = l1o2 + ( -.0792) * ( psub * csub * csub)l1o2 = l1o2 + ( -.0094) * ( csub * psub * psub)
l2o3 = .0072l2o3 = l2o3 + (.0073) * (l1o1)l2o3 = l2o3 + (.0215) * ( fsub )l2o3 = l2o3 + (.9155) * (l1o2)
l2o3 = l2o3 + ( -.021) * (l1o1 * l1o1)l2o3 = l2o3 + ( -.032) * ( fsub * fsub)l2o3 = l2o3 + ( -.3618) * (l1o2 * l1o2)
l2o3 = l2o3 + ( -.1584) * (l1o1 * fsub )l2o3 = l2o3 + (.3667) * (l1o1 * l1o2)l2o3 = l2o3 + (.122) * (fsub * l1o2)l2o3 = l2o3 + ( -.4431) * (l1o1 * fsub * l1o2)
l2o3 = l2o3 + ( -.0294) * (l1o1 * l1o1 * l1o1)l2o3 = l2o3 + ( -.0279) * ( fsub * fsub * fsub)l2o3 = l2o3 + (3.7686) * (l1o2 * l1o2 * l1o2)l2o3 = l2o3 + (.9564) * ( fsub * l1o1 * l1o1)
l2o3 = l2o3 + (.1896) * (l1o1 * fsub * fsub )l2o3 = l2o3 + ( -8.7844) * (l1o1 * l1o2 * l1o2)l2o3 = l2o3 + (5.1023) * (l1o2 * l1o1 * l1o1)l2o3 = l2o3 + ( -.1916) * (l1o2 * fsub * fsub)
l2o3 = l2o3 + ( -.4921) * ( fsub * l1o2 * l1o2)l3o2 = - .0065l3o2 = l3o2 + (.0354) * (l2o2)
l3o2 = l3o2 + (.1506) * ( csub )l3o2 = l3o2 + (.7698) * (l2o3)l3o2 = l3o2 + (.6768) * (l2o2 * l2o2)l3o2 = l3o2 + (.5409) * ( csub * csub)
l3o2 = l3o2 + (.3753) * (l2o3 * l2o3)l3o2 = l3o2 + ( -1.145) * (l2o2 * csub )l3o2 = l3o2 + ( -.5801) * (l2o2 * l2o3)l3o2 = l3o2 + (.1062) * ( csub * l2o3)
l3o2 = l3o2 + ( -.5078) * (l2o2 * csub * l2o3)l3o2 = l3o2 + (.9953) * (l2o2 * l2o2 * l2o2)l3o2 = l3o2 + ( -1.6416) * (csub * csub * csub )l3o2 = l3o2 + ( -1.3315) * (l2o3 * l2o3 * l2o3)
l3o2 = l3o2 + ( -8.1034) * (csub * l2o2 * l2o2)l3o2 = l3o2 + (9.13) * (l2o2 * csub * csub )l3o2 = l3o2 + ( -1.0112) * (l2o2 * l2o3 * l2o3)
l3o2 = l3o2 + (2.7962) * (l2o3 * l2o2 * l2o2)l3o2 = l3o2 + ( -4.3226) * (l2o3 * csub * csub )l3o2 = l3o2 + (4.0238) * ( csub * l2o3 * l2o3)l2o1 = .1936
l2o1 = l2o1 + (.0136) * (l1o1)l2o1 = l2o1 + (.0761) * ( fsub )l2o1 = l2o1 + (.9395) * ( csub )l2o1 = l2o1 + ( -3.3321) * (l1o1 * l1o1)
l2o1 = l2o1 + ( -.0653) * ( fsub * fsub )l2o1 = l2o1 + ( -3.5339) * (csub * csub )l2o1 = l2o1 + (.4446) * (l1o1 * fsub)l2o1 = l2o1 + (6.7663) * (l1o1 * csub )
l2o1 = l2o1 + ( -.3496) * ( fsub * csub )l2o1 = l2o1 + ( -1.1369) * (l1o1 * fsub * csub )l2o1 = l2o1 + ( -1.0585) * (l1o1 * l1o1 * l1o1)
l2o1 = l2o1 + ( -.0324) * ( fsub * fsub * fsub)l2o1 = l2o1 + (1.0497) * ( csub * csub * csub)l2o1 = l2o1 + (1.009) * ( fsub * l1o1 * l1o1)l2o1 = l2o1 + (.2401) * (l1o1 * fsub * fsub )
l2o1 = l2o1 + ( -3.1781) * (l1o1 * csub * csub )l2o1 = l2o1 + (3.2531) * ( csub * l1o1 * l1o1)l2o1 = l2o1 + ( -.2992) * ( csub * fsub * fsub)l2o1 = l2o1 + (.1068) * ( fsub * csub * csub )
l3o3 = - .0189l3o3 = l3o3 + (.1171) * (l2o2)l3o3 = l3o3 + ( -.18) * (l2o1)l3o3 = l3o3 + (1.0049) * (l2o3)
l3o3 = l3o3 + (.0969) * (l2o2 * l2o2)l3o3 = l3o3 + (.4771) * (l2o1 * l2o1)l3o3 = l3o3 + ( -.8041) * (l2o3 * l2o3)l3o3 = l3o3 + ( -1.3671) * (l2o2 * l2o1)
l3o3 = l3o3 + (1.0996) * (l2o2 * l2o3)l3o3 = l3o3 + (.5197) * (l2o1 * l2o3)l3o3 = l3o3 + (.2083) * (l2o2 * l2o1 * l2o3)
l3o3 = l3o3 + ( -.8395) * (l2o2 * l2o2 * l2o2)l3o3 = l3o3 + ( -1.9646) * (l2o1 * l2o1 * l2o1)l3o3 = l3o3 + (1.7981) * (l2o3 * l2o3 * l2o3)l3o3 = l3o3 + (2.2974) * (l2o1 * l2o2 * l2o2)
l3o3 = l3o3 + ( -.4148) * (l2o2 * l2o1 * l2o1)l3o3 = l3o3 + (1.0008) * (l2o2 * l2o3 * l2o3)l3o3 = l3o3 + ( -1.339) * (l2o3 * l2o2 * l2o2)l3o3 = l3o3 + (5.2977) * (l2o3 * l2o1 * l2o1)
l3o3 = l3o3 + ( -6) * (l2o1 * l2o3 * l2o3)l4o2 = .0312l4o2 = l4o2 + ( -.0024) * (l3o2)l4o2 = l4o2 + (.175) * (csub )
l4o2 = l4o2 + (.8012) * (l3o3)l4o2 = l4o2 + ( -.0577) * (l3o2 * l3o2)l4o2 = l4o2 + ( -.1658) * ( csub * csub )
l4o2 = l4o2 + ( -.1476) * (l3o3 * l3o3)l4o2 = l4o2 + (.0647) * (l3o2 * csub)l4o2 = l4o2 + (.0622) * (l3o2 * l3o3)l4o2 = l4o2 + (.2193) * ( csub * l3o3)
l4o2 = l4o2 + ( -.2148) * (l3o2 * csub * l3o3)l4o2 = l4o2 + ( -.4841) * (l3o2 * l3o2 * l3o2)l4o2 = l4o2 + ( -.4275) * ( csub * csub * csub)l4o2 = l4o2 + (.4404) * (l3o3 * l3o3 * l3o3)
l4o2 = l4o2 + ( -1.3707) * (csub * l3o2 * l3o2)l4o2 = l4o2 + (1.3713) * (l3o2 * csub * csub)l4o2 = l4o2 + ( -2.5034) * (l3o2 * l3o3 * l3o3)l4o2 = l4o2 + (2.8303) * (l3o3 * l3o2 * l3o2)
l4o2 = l4o2 + ( -.2438) * (l3o3 * csub * csub)l4o2 = l4o2 + (.627) * (csub * l3o3 * l3o3)l4o2 = l4o2 * (.9814) + (.0007)
IF l4o2 < ( -1.8572) - (1.1209) THEN l4o2 = (- 1.8572) - (1.1209)IF l4o2 > (2.2382) + (1.1209) THEN l4o2 = (2.2382) + (1.1209)yt = l4o2 + p1END FUNCTION
Bt concentration virtual sensor prototype. (QBASIC source code is in two columns. Place second column listing immediately after first column to run.)
Rem Kolmogorov- Gabor polynomial (Data Relations Model)input "Absorption = ", absorpinput "Frequency = ", freqabsorp = ( absorp - 2.8609) / .917freq = (freq- 7.5164) / 1.2582l1o3 = .0874l1o3 = l1o3 + (1.2226) * ( absorp)l1o3 = l1o3 + ( -.0199) * ( absorp * absorp )l1o3 = l1o3 + ( -.0923) * ( absorp * absorp * absorp )l1o1 = - .1685l1o1 = l1o1 + (1.2886) * ( absorp)l1o1 = l1o1 + ( -.117) * (freq)l1o1 = l1o1 + (.0405) * ( absorp * absorp )l1o1 = l1o1 + (.162) * (freq* freq)l1o1 = l1o1 + ( -.1277) * ( absorp * freq)l1o1 = l1o1 + ( -.1125) * ( absorp * absorp * absorp )l1o1 = l1o1 + ( -.0905) * (freq* freq* freq)l1o1 = l1o1 + (.0417) * (freq* absorp * absorp )l1o1 = l1o1 + (.0456) * ( absorp * freq* freq)l1o2 = 0l1o2 = l1o2 + (.9782) * ( absorp )l1o2 = l1o2 + ( -.257) * (freq)l2o1 = - .0678l2o1 = l2o1 + ( -.4069) * (l1o3)l2o1 = l2o1 + (1.216) * (l1o1)l2o1 = l2o1 + ( -.1502) * (l1o2)l2o1 = l2o1 + (.7836) * (l1o3 * l1o3)l2o1 = l2o1 + (.8533) * (l1o1 * l1o1)l2o1 = l2o1 + (.0785) * (l1o2 * l1o2)l2o1 = l2o1 + ( -1.3953) * (l1o3 * l1o1)l2o1 = l2o1 + (.0431) * (l1o3 * l1o2)l2o1 = l2o1 + ( -.3564) * (l1o1 * l1o2)l2o1 = l2o1 + ( -1.2731) * (l1o3 * l1o1 * l1o2)l2o1 = l2o1 + (1.0529) * (l1o3 * l1o3 * l1o3)l2o1 = l2o1 + ( -.9622) * (l1o1 * l1o1 * l1o1)l2o1 = l2o1 + (.4182) * (l1o2 * l1o2 * l1o2)l2o1 = l2o1 + ( -3.9566) * (l1o1 * l1o3 * l1o3)l2o1 = l2o1 + (3.8149) * (l1o3 * l1o1 * l1o1)l2o1 = l2o1 + ( -1.2622) * (l1o3 * l1o2 * l1o2)l2o1 = l2o1 + (2.0645) * (l1o2 * l1o3 * l1o3)l2o1 = l2o1 + (.0742) * (l1o2 * l1o1 * l1o1)l2o1 = l2o1 + (.1696) * (l1o1 * l1o2 * l1o2)l2o3 = - .0818l2o3 = l2o3 + (.0759) * (l1o3)l2o3 = l2o3 + ( -1.0568) * ( absorp )l2o3 = l2o3 + (1.2708) * (l1o1)l2o3 = l2o3 + (.2173) * (l1o3 * l1o3)l2o3 = l2o3 + ( -.7217) * ( absorp * absorp )l2o3 = l2o3 + (.4713) * (l1o1 * l1o1)l2o3 = l2o3 + (.9198) * (l1o3 * absorp )l2o3 = l2o3 + ( -2.5729) * (l1o3 * l1o1)l2o3 = l2o3 + (1.6621) * ( absorp * l1o1)l2o3 = l2o3 + ( -.1178) * (l1o3 * absorp * l1o1)l2o3 = l2o3 + (4.7812) * 
(l1o3 * l1o3 * l1o3)l2o3 = l2o3 + ( -1.0852) * ( absorp * absorp * absorp )l2o3 = l2o3 + ( -.8142) * (l1o1 * l1o1 * l1o1)l2o3 = l2o3 + ( -5.9031) * ( absorp * l1o3 * l1o3)l2o3 = l2o3 + (3.913) * (l1o3 * absorp * absorp )l2o3 = l2o3 + (5.4668) * (l1o3 * l1o1 * l1o1)l2o3 = l2o3 + ( -4.7801) * (l1o1 * l1o3 * l1o3)l2o3 = l2o3 + (1.6853) * (l1o1 * absorp * absorp )l2o3 = l2o3 + ( -2.8258) * ( absorp * l1o1 * l1o1)l2o2 = - .166l2o2 = l2o2 + (.1439) * (l1o3)l2o2 = l2o2 + (.1699) * ( absorp )l2o2 = l2o2 + (.5341) * (l1o2)l2o2 = l2o2 + ( -3.3055) * (l1o3 * l1o3)l2o2 = l2o2 + (.2443) * ( absorp * absorp )l2o2 = l2o2 + (2.6863) * (l1o2 * l1o2)l2o2 = l2o2 + (5.2213) * (l1o3 * absorp )l2o2 = l2o2 + (.6905) * (l1o3 * l1o2)l2o2 = l2o2 + ( -5.5128) * ( absorp * l1o2)l2o2 = l2o2 + ( -.324) * (l1o3 * absorp * l1o2)
l2o2 = l2o2 + (2.9655) * (l1o3 * l1o3 * l1o3)
l2o2 = l2o2 + (-2.9985) * (absorp * absorp * absorp)
l2o2 = l2o2 + (4.7594) * (l1o2 * l1o2 * l1o2)
l2o2 = l2o2 + (-6.6428) * (absorp * l1o3 * l1o3)
l2o2 = l2o2 + (3.9217) * (l1o3 * absorp * absorp)
l2o2 = l2o2 + (-2.3861) * (l1o3 * l1o2 * l1o2)
l2o2 = l2o2 + (2.2219) * (l1o2 * l1o3 * l1o3)
l2o2 = l2o2 + (8.6865) * (l1o2 * absorp * absorp)
l2o2 = l2o2 + (-10.1277) * (absorp * l1o2 * l1o2)
l3o1 = -.026
l3o1 = l3o1 + (-.5655) * (l2o1)
l3o1 = l3o1 + (1.8251) * (l2o3)
l3o1 = l3o1 + (-.1683) * (l2o2)
l3o1 = l3o1 + (-.2062) * (l2o1 * l2o1)
l3o1 = l3o1 + (2.3963) * (l2o3 * l2o3)
l3o1 = l3o1 + (.6281) * (l2o2 * l2o2)
l3o1 = l3o1 + (-1.9053) * (l2o1 * l2o3)
l3o1 = l3o1 + (1.9318) * (l2o1 * l2o2)
l3o1 = l3o1 + (-2.833) * (l2o3 * l2o2)
l3o1 = l3o1 + (1.1231) * (l2o1 * l2o3 * l2o2)
l3o1 = l3o1 + (5.0889) * (l2o1 * l2o1 * l2o1)
l3o1 = l3o1 + (-6.9325) * (l2o3 * l2o3 * l2o3)
l3o1 = l3o1 + (3.2972) * (l2o2 * l2o2 * l2o2)
l3o1 = l3o1 + (-9.7348) * (l2o3 * l2o1 * l2o1)
l3o1 = l3o1 + (9.4629) * (l2o1 * l2o3 * l2o3)
l3o1 = l3o1 + (5.598) * (l2o1 * l2o2 * l2o2)
l3o1 = l3o1 + (-5.8589) * (l2o2 * l2o1 * l2o1)
l3o1 = l3o1 + (12.2058) * (l2o2 * l2o3 * l2o3)
l3o1 = l3o1 + (-14.3) * (l2o3 * l2o2 * l2o2)
l3o3 = -.0486
l3o3 = l3o3 + (-.1775) * (l2o2)
l3o3 = l3o3 + (2.4852) * (l2o3)
l3o3 = l3o3 + (-1.1885) * (l2o1)
l3o3 = l3o3 + (1.6957) * (l2o2 * l2o2)
l3o3 = l3o3 + (2.0175) * (l2o3 * l2o3)
l3o3 = l3o3 + (2.0096) * (l2o1 * l2o1)
l3o3 = l3o3 + (-1.2979) * (l2o2 * l2o3)
l3o3 = l3o3 + (-1.7907) * (l2o2 * l2o1)
l3o3 = l3o3 + (-2.615) * (l2o3 * l2o1)
l3o3 = l3o3 + (.0304) * (l2o2 * l2o3 * l2o1)
l3o3 = l3o3 + (.8335) * (l2o2 * l2o2 * l2o2)
l3o3 = l3o3 + (-3.6036) * (l2o3 * l2o3 * l2o3)
l3o3 = l3o3 + (3.1642) * (l2o1 * l2o1 * l2o1)
l3o3 = l3o3 + (-6.2871) * (l2o3 * l2o2 * l2o2)
l3o3 = l3o3 + (4.7083) * (l2o2 * l2o3 * l2o3)
l3o3 = l3o3 + (-3.1113) * (l2o2 * l2o1 * l2o1)
l3o3 = l3o3 + (4.2263) * (l2o1 * l2o2 * l2o2)
l3o3 = l3o3 + (6.8997) * (l2o1 * l2o3 * l2o3)
l3o3 = l3o3 + (-6.9167) * (l2o3 * l2o1 * l2o1)
l4o3 = .0318
l4o3 = l4o3 + (-.0515) * (l3o1)
l4o3 = l4o3 + (.0789) * (absorp)
l4o3 = l4o3 + (.9848) * (l3o3)
l4o3 = l4o3 + (.1984) * (l3o1 * l3o1)
l4o3 = l4o3 + (.0332) * (absorp * absorp)
l4o3 = l4o3 + (-4.6211) * (l3o3 * l3o3)
l4o3 = l4o3 + (-4.345) * (l3o1 * absorp)
l4o3 = l4o3 + (4.515) * (l3o1 * l3o3)
l4o3 = l4o3 + (4.2166) * (absorp * l3o3)
l4o3 = l4o3 + (-.0451) * (l3o1 * absorp * l3o3)
l4o3 = l4o3 + (.1034) * (l3o1 * l3o1 * l3o1)
l4o3 = l4o3 + (-.0114) * (absorp * absorp * absorp)
l4o3 = l4o3 + (4.8489) * (l3o3 * l3o3 * l3o3)
l4o3 = l4o3 + (1.441) * (absorp * l3o1 * l3o1)
l4o3 = l4o3 + (-.0934) * (l3o1 * absorp * absorp)
l4o3 = l4o3 + (-7.0619) * (l3o1 * l3o3 * l3o3)
l4o3 = l4o3 + (2.0594) * (l3o3 * l3o1 * l3o1)
l4o3 = l4o3 + (.0613) * (l3o3 * absorp * absorp)
l4o3 = l4o3 + (-1.3036) * (spec * l3o3 * l3o3)
conc = l4o3 * (.8247) + (2.8608)
IF conc < (1.6) - (1.0025) THEN conc = (1.6) - (1.0025)
IF conc > (4.4) + (1.0025) THEN conc = (4.4) + (1.0025)
print "Concentration = ", conc : end
DEFCUR A-Z: DECLARE FUNCTION yt@ (x@, f@)
20 : input "No Fault (0) or Fault (1) => ", f
OPEN "output" FOR OUTPUT AS #1
ntotal = 4096: nsub = 410: ymin = -17@: ymax = 15@
10 : n = 1 + INT(CINT(x - 1) / 10)
istart = CINT(((ntotal - 1) * (n - 1)) / (nsub - 1))
ifinish = CINT(((ntotal - 1) * n) / (nsub - 1))
xpt = (x - istart) / (ifinish - istart)
ix = (ntotal - 1) * xpt
s = (yt(n + 1, f) - ymin) / (ymax - ymin)
s = s - (yt(n - 1, f) - ymin) / (ymax - ymin)
Rem Fractal operator (Data Sequence Model)
Rem applied to yt (Data Relations Model)
dfde = (((yt(n, f) - yt(n - 1, f)) / (ntotal - 1))) * ix
dfde = dfde - ((s * (yt(nsub - 1, f) - yt(0, f)) / (ntotal - 1))) * ix
dfde = dfde + s * yt(xpt * (nsub - 1), f) + (yt(n - 1, f) * (nsub - 1) / (ntotal - 1))
dfde = dfde - yt(0, f) * s * (nsub - 1) / (ntotal - 1)
PRINT #1, dfde : IF x < 4095 THEN x = x + 1: GOTO 10
CLOSE : END
FUNCTION yt (x, f)
pi = 4@ * ATN(1@)
npts = 410
Rem Eigenfunctions for nominal term
cos40 = COS((2 * pi * 40 * x / (npts - 1)) + .1895)
cos66 = COS((2 * pi * 66 * x / (npts - 1)) - .5521)
cos79 = COS((2 * pi * 79 * x / (npts - 1)) - 1.5047)
cos95 = COS((2 * pi * 95 * x / (npts - 1)) + 1.5572)
cos110 = COS((2 * pi * 110 * x / (npts - 1)) - .2232)
cos131 = COS((2 * pi * 131 * x / (npts - 1)) - .6192)
cos199 = COS((2 * pi * 199 * x / (npts - 1)) + .4987)
cos40 = (cos40 - .0024) / .6375
cos66 = (cos66 - .0021) / .6371
cos79 = (cos79 - .0002) / .6352
cos95 = (cos95 - 0) / .6351
cos110 = (cos110 - .0024) / .6374
cos131 = (cos131 - .002) / .6371
cos199 = (cos199 - .0021) / .6372
Rem Kolmogorov-Gabor polynomial (Data Relations Model)
Rem for nominal term
l1o1 = 0
l1o1 = l1o1 + (-.3234) * (cos66)
l1o1 = l1o1 + (-.2608) * (cos95)
l1o1 = l1o1 + (-.1904) * (cos131)
l1o1 = l1o1 + (-.1473) * (cos199)
l1o1 = l1o1 + (.1321) * (cos110)
l1o1 = l1o1 + (-.1214) * (cos79)
l1o1 = l1o1 + (.0828) * (cos40)
l1o1 = l1o1 * (8.4211) + (-.4932)
l1o2 = 0
if f = 1 then
Rem Eigenfunctions for anomalous term
cos40 = cos((2 * pi * 40 * x / (npts - 1)) - 0.8528)
cos66 = cos((2 * pi * 66 * x / (npts - 1)) + 0.0104)
cos79 = cos((2 * pi * 79 * x / (npts - 1)) + 0.2700)
cos95 = cos((2 * pi * 95 * x / (npts - 1)) - 0.1601)
cos110 = cos((2 * pi * 110 * x / (npts - 1)) - 0.4579)
cos112 = cos((2 * pi * 112 * x / (npts - 1)) + 0.6647)
cos131 = cos((2 * pi * 131 * x / (npts - 1)) - 0.4524)
cos133 = cos((2 * pi * 133 * x / (npts - 1)) + 0.0544)
cos199 = cos((2 * pi * 199 * x / (npts - 1)) + 1.0276)
cos201 = cos((2 * pi * 201 * x / (npts - 1)) + 0.6440)
cos40 = (cos40 - 0.0016) / 0.6367
cos66 = (cos66 - 0.0024) / 0.6375
cos79 = (cos79 - 0.0024) / 0.6374
cos95 = (cos95 - 0.0024) / 0.6375
cos110 = (cos110 - 0.0022) / 0.6372
cos112 = (cos112 - 0.0019) / 0.637
cos131 = (cos131 - 0.0022) / 0.6373
cos133 = (cos133 - 0.0024) / 0.6375
cos199 = (cos199 - 0.0013) / 0.6363
cos201 = (cos201 - 0.002) / 0.637
Rem Kolmogorov-Gabor polynomial (Data Relations Model)
Rem for anomalous term
l1o2 = 0
l1o2 = l1o2 + (-0.7048) * (cos95)
l1o2 = l1o2 + (0.3304) * (cos66)
l1o2 = l1o2 + (0.2267) * (cos131)
l1o2 = l1o2 + (0.1533) * (cos199)
l1o2 = l1o2 + (-0.1535) * (cos110)
l1o2 = l1o2 + (0.1167) * (cos112)
l1o2 = l1o2 + (0.1135) * (cos79)
l1o2 = l1o2 + (-0.0753) * (cos133)
l1o2 = l1o2 + (-0.031) * (cos40)
l1o2 = l1o2 + (0.0009) * (cos201)
l1o2 = l1o2 * (7.0817) + (-3.412)
endif
yt = l1o1 + l1o2
END FUNCTION
Combined Data Model for nominal and anomalous helicopter gearbox data including frequency components used. (QBASIC source code is in two columns. To run, place second column listing immediately after first column.)
Change detector. (QBASIC source code is in three columns. To run, place second column listing immediately after first column, and the third after the second.)
DEFCUR A-Z
pi = 4@ * atn(1@)
open "input" for input as #1
input #1, mean
input #1, stdev
input #1, skew
input #1, kurt
input #1, m6
input #1, Jaen
input #1, Hand
close #1
mean = (mean - -0.4454) / 0.5152
stdev = (stdev - 4.0706) / 0.8783
skew = (skew - 0.0162) / 0.3504
kurt = (kurt - -0.8099) / 0.3423
m6 = (m6 - -8.1288) / 2.5084
Jaen = (Jaen - 0.5198) / 0.0396
Hand = (Hand - 1.2527) / 0.0192
l1o2 = -0.5913
l1o2 = l1o2 + (-0.1476) * (skew)
l1o2 = l1o2 + (-0.9041) * (mean)
l1o2 = l1o2 + (-0.9286) * (kurt)
l1o2 = l1o2 + (0.4752) * (skew * skew)
l1o2 = l1o2 + (0.2562) * (mean * mean)
l1o2 = l1o2 + (-0.2235) * (kurt * kurt)
l1o2 = l1o2 + (0.5887) * (skew * mean)
l1o2 = l1o2 + (-0.4035) * (skew * kurt)
l1o2 = l1o2 + (-1.0192) * (mean * kurt)
l1o2 = l1o2 + (0.1964) * (skew * mean * kurt)
l1o2 = l1o2 + (-0.1016) * (skew * skew * skew)
l1o2 = l1o2 + (0.0464) * (mean * mean * mean)
l1o2 = l1o2 + (0.1473) * (kurt * kurt * kurt)
l1o2 = l1o2 + (-0.2244) * (mean * skew * skew)
l1o2 = l1o2 + (-0.2274) * (skew * mean * mean)
l1o2 = l1o2 + (0.3348) * (skew * kurt * kurt)
l1o2 = l1o2 + (-0.0717) * (kurt * skew * skew)
l1o2 = l1o2 + (0.2728) * (kurt * mean * mean)
l1o2 = l1o2 + (0.516) * (mean * kurt * kurt)
l1o3 = 0.0066
l1o3 = l1o3 + (1.5425) * (Jaen)
l1o3 = l1o3 + (-0.4563) * (mean)
l1o3 = l1o3 + (0.44) * (kurt)
l1o3 = l1o3 + (0.3005) * (Jaen * Jaen)
l1o3 = l1o3 + (-0.05) * (mean * mean)
l1o3 = l1o3 + (-0.7195) * (kurt * kurt)
l1o3 = l1o3 + (-0.2532) * (Jaen * mean)
l1o3 = l1o3 + (-0.1315) * (Jaen * kurt)
l1o3 = l1o3 + (-1.4195) * (mean * kurt)
l1o3 = l1o3 + (0.4939) * (Jaen * mean * kurt)
l1o3 = l1o3 + (-1.2495) * (Jaen * Jaen * Jaen)
l1o3 = l1o3 + (-0.1227) * (mean * mean * mean)
l1o3 = l1o3 + (0.3834) * (kurt * kurt * kurt)
l1o3 = l1o3 + (-0.0849) * (mean * Jaen * Jaen)
l1o3 = l1o3 + (-0.5829) * (Jaen * mean * mean)
l1o3 = l1o3 + (-0.7079) * (Jaen * kurt * kurt)
l1o3 = l1o3 + (-2.3192) * (kurt * Jaen * Jaen)
l1o3 = l1o3 + (-0.5023) * (kurt * mean * mean)
l1o3 = l1o3 + (0.857) * (mean * kurt * kurt)
l2o3 = 0.0816
l2o3 = l2o3 + (1.0612) * (l1o2)
l2o3 = l2o3 + (0.1459) * (m6)
l2o3 = l2o3 + (-0.1725) * (l1o3)
l2o3 = l2o3 + (0.1167) * (l1o2 * l1o2)
l2o3 = l2o3 + (0.2472) * (m6 * m6)
l2o3 = l2o3 + (-0.0337) * (l1o3 * l1o3)
l2o3 = l2o3 + (0.1863) * (l1o2 * m6)
l2o3 = l2o3 + (-0.1812) * (l1o2 * l1o3)
l2o3 = l2o3 + (-0.1442) * (m6 * l1o3)
l2o3 = l2o3 + (0.0003) * (l1o2 * m6 * l1o3)
l2o3 = l2o3 + (-0.0953) * (l1o2 * l1o2 * l1o2)
l2o3 = l2o3 + (-0.0601) * (m6 * m6 * m6)
l2o3 = l2o3 + (0.1436) * (l1o3 * l1o3 * l1o3)
l2o3 = l2o3 + (0.0718) * (m6 * l1o2 * l1o2)
l2o3 = l2o3 + (-0.086) * (l1o2 * m6 * m6)
l2o3 = l2o3 + (-0.3954) * (l1o2 * l1o3 * l1o3)
l2o3 = l2o3 + (0.4013) * (l1o3 * l1o2 * l1o2)
l2o3 = l2o3 + (0.0081) * (l1o3 * m6 * m6)
l2o3 = l2o3 + (-0.0409) * (m6 * l1o3 * l1o3)
l2o2 = 0.1892
l2o2 = l2o2 + (0.8725) * (l1o2)
l2o2 = l2o2 + (-0.4387) * (skew)
l2o2 = l2o2 + (0.5315) * (l1o3)
l2o2 = l2o2 + (0.1053) * (l1o2 * l1o2)
l2o2 = l2o2 + (-0.0652) * (skew * skew)
l2o2 = l2o2 + (-0.0461) * (l1o3 * l1o3)
l2o2 = l2o2 + (-0.4263) * (l1o2 * skew)
l2o2 = l2o2 + (-0.1575) * (l1o2 * l1o3)
l2o2 = l2o2 + (0.1623) * (skew * l1o3)
l2o2 = l2o2 + (-0.0269) * (l1o2 * skew * l1o3)
l2o2 = l2o2 + (-0.1766) * (l1o2 * l1o2 * l1o2)
l2o2 = l2o2 + (0.1399) * (skew * skew * skew)
l2o2 = l2o2 + (-0.032) * (l1o3 * l1o3 * l1o3)
l2o2 = l2o2 + (-0.0525) * (skew * l1o2 * l1o2)
l2o2 = l2o2 + (0.0611) * (l1o2 * skew * skew)
l2o2 = l2o2 + (-0.0615) * (l1o2 * l1o3 * l1o3)
l2o2 = l2o2 + (0.2965) * (l1o3 * l1o2 * l1o2)
l2o2 = l2o2 + (-0.1964) * (l1o3 * skew * skew)
l2o2 = l2o2 + (-0.0044) * (skew * l1o3 * l1o3)
l3o2 = 0.0876
l3o2 = l3o2 + (0.3014) * (l2o3)
l3o2 = l3o2 + (0.0996) * (Hand)
l3o2 = l3o2 + (0.9825) * (l2o2)
l3o2 = l3o2 + (0.0043) * (l2o3 * l2o3)
l3o2 = l3o2 + (-0.0983) * (Hand * Hand)
l3o2 = l3o2 + (0.2049) * (l2o2 * l2o2)
l3o2 = l3o2 + (0.019) * (l2o3 * Hand)
l3o2 = l3o2 + (-0.2056) * (l2o3 * l2o2)
l3o2 = l3o2 + (0.1945) * (Hand * l2o2)
l3o2 = l3o2 + (-0.0195) * (l2o3 * Hand * l2o2)
l3o2 = l3o2 + (-0.0011) * (l2o3 * l2o3 * l2o3)
l3o2 = l3o2 + (0.0194) * (Hand * Hand * Hand)
l3o2 = l3o2 + (0.001) * (l2o2 * l2o2 * l2o2)
l3o2 = l3o2 + (0.1624) * (Hand * l2o3 * l2o3)
l3o2 = l3o2 + (-0.1872) * (l2o3 * Hand * Hand)
l3o2 = l3o2 + (-0.0355) * (l2o3 * l2o2 * l2o2)
l3o2 = l3o2 + (0.024) * (l2o2 * l2o3 * l2o3)
l3o2 = l3o2 + (-0.0368) * (l2o2 * Hand * Hand)
l3o2 = l3o2 + (-0.1612) * (Hand * l2o2 * l2o2)
l3o1 = -0.0593
l3o1 = l3o1 + (0.2493) * (l2o3)
l3o1 = l3o1 + (-0.2785) * (stdev)
l3o1 = l3o1 + (0.8792) * (l2o2)
l3o1 = l3o1 + (0.0068) * (l2o3 * l2o3)
l3o1 = l3o1 + (0.0318) * (stdev * stdev)
l3o1 = l3o1 + (0.0869) * (l2o2 * l2o2)
l3o1 = l3o1 + (0.1524) * (l2o3 * stdev)
l3o1 = l3o1 + (-0.0899) * (l2o3 * l2o2)
l3o1 = l3o1 + (-0.1232) * (stdev * l2o2)
l3o1 = l3o1 + (0.0041) * (l2o3 * stdev * l2o2)
l3o1 = l3o1 + (0.0004) * (l2o3 * l2o3 * l2o3)
l3o1 = l3o1 + (0.0455) * (stdev * stdev * stdev)
l3o1 = l3o1 + (0.0042) * (l2o2 * l2o2 * l2o2)
l3o1 = l3o1 + (-0.2995) * (stdev * l2o3 * l2o3)
l3o1 = l3o1 + (-0.0799) * (l2o3 * stdev * stdev)
l3o1 = l3o1 + (0.0093) * (l2o3 * l2o2 * l2o2)
l3o1 = l3o1 + (-0.0269) * (l2o2 * l2o3 * l2o3)
l3o1 = l3o1 + (-0.0337) * (l2o2 * stdev * stdev)
l3o1 = l3o1 + (0.2907) * (stdev * l2o2 * l2o2)
l4o3 = -0.077
l4o3 = l4o3 + (1.1419) * (l3o2)
l4o3 = l4o3 + (-0.0212) * (Jaen)
l4o3 = l4o3 + (-0.5045) * (l3o1)
l4o3 = l4o3 + (-0.0003) * (l3o2 * l3o2)
l4o3 = l4o3 + (0.0168) * (Jaen * Jaen)
l4o3 = l4o3 + (-0.205) * (l3o1 * l3o1)
l4o3 = l4o3 + (-0.0134) * (l3o2 * Jaen)
l4o3 = l4o3 + (0.1819) * (l3o2 * l3o1)
l4o3 = l4o3 + (-0.1237) * (Jaen * l3o1)
l4o3 = l4o3 + (-0.0025) * (l3o2 * Jaen * l3o1)
l4o3 = l4o3 + (-0.0018) * (l3o2 * l3o2 * l3o2)
l4o3 = l4o3 + (0.0072) * (Jaen * Jaen * Jaen)
l4o3 = l4o3 + (0.0166) * (l3o1 * l3o1 * l3o1)
l4o3 = l4o3 + (-0.7481) * (Jaen * l3o2 * l3o2)
l4o3 = l4o3 + (-0.1203) * (l3o2 * Jaen * Jaen)
l4o3 = l4o3 + (0.2037) * (l3o2 * l3o1 * l3o1)
l4o3 = l4o3 + (-0.2098) * (l3o1 * l3o2 * l3o2)
l4o3 = l4o3 + (0.0858) * (l3o1 * Jaen * Jaen)
l4o3 = l4o3 + (0.7326) * (Jaen * l3o1 * l3o1)
l5o2 = 0.0092
l5o2 = l5o2 + (1.0851) * (l4o3)
l5o2 = l5o2 + (-0.013) * (mean)
l5o2 = l5o2 + (0.106) * (Hand)
l5o2 = l5o2 + (0.0326) * (l4o3 * l4o3)
l5o2 = l5o2 + (-0.078) * (mean * mean)
l5o2 = l5o2 + (-0.0273) * (Hand * Hand)
l5o2 = l5o2 + (0.109) * (l4o3 * mean)
l5o2 = l5o2 + (-0.2199) * (l4o3 * Hand)
l5o2 = l5o2 + (-0.4196) * (mean * Hand)
l5o2 = l5o2 + (0.2675) * (l4o3 * mean * Hand)
l5o2 = l5o2 + (0) * (l4o3 * l4o3 * l4o3)
l5o2 = l5o2 + (-0.1212) * (mean * mean * mean)
l5o2 = l5o2 + (-0.0074) * (Hand * Hand * Hand)
l5o2 = l5o2 + (-0.012) * (mean * l4o3 * l4o3)
l5o2 = l5o2 + (-0.035) * (l4o3 * mean * mean)
l5o2 = l5o2 + (-0.1349) * (l4o3 * Hand * Hand)
l5o2 = l5o2 + (0.0378) * (Hand * l4o3 * l4o3)
l5o2 = l5o2 + (-0.0361) * (Hand * mean * mean)
l5o2 = l5o2 + (0.2311) * (mean * Hand * Hand)
l5o2 = l5o2 * (0.0002) + (0.5)
open "out1" for output as #1
print #1, l5o2
close #1
end
Data Modeling of deep sky images
James Handley*a, Holger Jaenischa,b, Albert Lima, Graeme Whitea, Alex Honsa, Miroslav Filipovicc,a, Matthew Edwardsb
aJames Cook University, Centre for Astronomy, Townsville QLD 4811, Australia
bAlabama Agricultural and Mechanical University, Department of Physics, Huntsville, AL 35811
cUniversity of Western Sydney, Locked Bag 1797 PENRITH SOUTH DC NSW 1797, Australia
ABSTRACT
We present a method for simulating CCD focal plane array (FPA) images of extended deep sky objects using Data
Modeling. Data Modeling is a process of deriving functional equations from measured data. These tools are used to
model FPA fixed pattern noise, shot noise, non-uniformity, and the extended objects themselves. The mathematical
model of the extended object is useful for correlation analysis and other image understanding algorithms used in Virtual
Observatory Data Mining. We apply these tools to the objects in the Messier list and build a classifier that achieves
100% correct classification.
Keywords: Data Modeling, Virtual Observatory, astronomy, deep-sky, image modeling, noise modeling, Messier
classification, functional modeling, component modeling
1. INTRODUCTION
Data Modeling1,2
is the process of deriving a mathematical expression from measured data.
Two approaches for modeling focal plane array (FPA) images are component based modeling and functional modeling.
In component based modeling, the deep sky object and each noise effect are modeled independently. Arrays for
each effect are stacked into a final image. Component based modeling uses simple equations to model deep sky objects
and noise effects, or it can use statistical models. In contrast, functional modeling of deep sky objects is done strictly
using univariate or multivariate equations. These equations are continuous and their nth order derivatives are available
for analysis. Functional modeling also provides a common basis for robust image analysis.
2. COMPONENT BASED DATA MODELING
In component based Data Modeling, models of deep sky objects and noise effects (thermal, shot, fixed pattern, FPA nonuniformity, and hot/dead pixels) are generated independently. Each independent model is stacked to build a Component Based Image Data Model.
The Moffat function3

f(x, y) = ρ[1 + ((x - x0)^2 + (γ(y - y0))^2)/α]^(-β) (1)

is used for modeling point-spread functions. The control parameters α, β, γ, and ρ modify the size, shape, and extent of
the 2-D distribution. Deep sky objects are modeled using a series of Moffat functions and properly selecting coefficients.
*[email protected]; phone 1 256 337 3769; James Cook University; Centre for Astronomy; Townsville, QLD 4811; AUSTRALIA.
This Data Model of M51 used 2 Moffat functions with coefficients obtained using a genetic algorithm (GA)4, yielding
f(x, y) = 3.0[1 + ((x - 35)^2 + ((y - 20)(0.707))^2)/0.6]^(-1.4) + 11.5[1 + ((x - 28)^2 + ((y - 45)(0.342))^2)/0.6]^(-1.55) (2)
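For readers who want to experiment with this kind of component model, a sum of Moffat components in the spirit of Equation (2) can be evaluated over a pixel grid as sketched below. The sketch is in Python rather than the QBASIC used for the listings in this thesis, and the two core parameter sets are placeholders for illustration, not the GA-derived coefficients of Equation (2).

```python
import math

def moffat(x, y, x0, y0, amplitude, alpha, beta):
    # Radially symmetric Moffat profile:
    # f(x, y) = amplitude * [1 + r^2 / alpha^2]^(-beta)
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return amplitude * (1.0 + r2 / alpha ** 2) ** (-beta)

def render(width, height, components):
    # Sum several Moffat components into one image (list of rows).
    img = [[0.0] * width for _ in range(height)]
    for j in range(height):
        for i in range(width):
            img[j][i] = sum(moffat(i, j, *c) for c in components)
    return img

# Two components loosely echoing the two bright cores of M51;
# tuples are (x0, y0, amplitude, alpha, beta) -- illustrative values only.
cores = [(35, 20, 3.0, 4.0, 1.4), (28, 45, 1.5, 3.0, 1.55)]
image = render(64, 64, cores)
```

Each component contributes one radially symmetric peak, so a two-core object such as M51 naturally drives the genetic algorithm toward a two-term model.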
Next, noise effects on the FPA and the surrounding field stars are modeled using different types of random number distributions. Use of random numbers yields statistical rather than functional models. To eliminate random number generators, we propose using sin(e^x) (the Sxe function) as a pseudo-random number approximating function5
f(i, j) = D ΣΣ (sin^2(e^x) - Min)/(Max - Min) (Nonuniformity) (3)

f(i, j) = C Σ (sin^2(e^x) - Min)/(Max - Min) (Thermal) (4)

f(i, j) = A (sin(e^x) + 1)/2 (Fixed Pattern) (5)

N_F = size (sin(e^x) + 1)/2, f(i, j) = 0 (Dead pixels) (6)

N_E = size (sin(e^x) + 1)/2, f(i, j) = 255 (Hot pixels) (7)

f(i, j) = B sqrt(-2 log((sin(e^x) + 1)/2)) sin(2π (sin(e^x) + 1)/2) σ, N = size (sin(e^x) + 1)/2 (Shot Noise and Field Stars) (8)
These functions use only an x-index value, but map into 2-D (i, j) using Hilbert sequencing. The number of Moffat functions needed in the final model and their coefficients are determined using a genetic algorithm (GA). Component models yield short numerical equations, but of limited fidelity and not in real-time.
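The Sxe substitution behind Equations (3)-(8) is easy to reproduce. The sketch below (Python; the function names and the 0.5 phase offset used to obtain a second deviate are assumptions for illustration) generates a normalized sin(e^x) stream and a Box-Muller style deviate of the kind Equation (8) uses for shot noise.

```python
import math

def sxe(x):
    # Sxe function: sin(e^x), a deterministic pseudo-random source.
    return math.sin(math.exp(x))

def sxe_uniform(x):
    # Map sin(e^x) from [-1, 1] into [0, 1]: (sin(e^x) + 1) / 2.
    return (sxe(x) + 1.0) / 2.0

def sxe_gaussian(x, sigma=1.0):
    # Box-Muller style deviate built from two Sxe uniforms
    # (the form of Equation 8); the 0.5 offset is an assumption.
    u1 = max(sxe_uniform(x), 1e-12)   # guard against log(0)
    u2 = sxe_uniform(x + 0.5)
    return sigma * math.sqrt(-2.0 * math.log(u1)) * math.sin(2.0 * math.pi * u2)

seq = [sxe_uniform(0.001 * i) for i in range(10000)]
```

Because the stream is a fixed function of the index x, the same noise field is regenerated exactly on every run, which is the point of replacing a random number generator.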
Figure 1 depicts a Component Data Model of M51. Comparison shows good representation of the salient image features.
Because M51 exhibits two bright cores, the genetic algorithm chose two Moffat functions to represent it. Comparison of
the original image on the bottom left with the final model shows similarity in placement of the main cores, similar
magnitude and extent of the cores, a similar number of field stars, and even hints of spiral structure.
3. FUNCTIONAL DATA MODELING
3.1 Univariate model – Turlington polynomial
The Turlington polynomial yields one equation with continuous derivative per data set of the form
T(x) = y1 + m1(x - x1) + Σ_{j=2}^{n-1} (m_j - m_{j-1})(.001) log10(1 + 10^((x - x_j)/.001)), m_j = (y_{j+1} - y_j)/(x_{j+1} - x_j) (9)
where x and y are the original (x, y) data points and n the number of points from the original used for building the
Turlington polynomial. The variable n can be either all of the points or a sub-sampled set of the original.
The Turlington polynomial is built in a piecewise linear fashion one point at a time as data arrives. This lends itself to
real-time construction. The logarithm term makes Turlington polynomials both orthogonal and differentiable. One
drawback to the Turlington polynomial is in its use of n terms, equal to the number of points in the data set being
modeled. This yields fast and streaming real-time derivatives of terms, but an exceptionally large final model6,7.
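A Turlington model per Equation (9) can be sketched in a few lines (Python; the helper name turlington and the overflow-safe evaluation of the log term are implementation choices, not part of the original formulation).

```python
import math

def turlington(xs, ys, c=0.001):
    # Build T(x) from points (xs ascending); c is the fitting parameter
    # (0.001 in Equation 9).
    m = [(ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j]) for j in range(len(xs) - 1)]

    def T(x):
        total = ys[0] + m[0] * (x - xs[0])
        for j in range(1, len(m)):
            # Each log term smoothly switches in the slope change at xs[j];
            # split the cases so 10**e cannot overflow.
            e = (x - xs[j]) / c
            if e > 0:
                lg = e + math.log10(1.0 + 10.0 ** (-e))
            else:
                lg = math.log10(1.0 + 10.0 ** e)
            total += (m[j] - m[j - 1]) * c * lg
        return total

    return T

T = turlington([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 2.0])
```

Far from each knot the log term is linear, so T(x) reproduces the piecewise linear interpolant; near each knot it rounds the corner over a width set by c, which is why raising that parameter suppresses noise.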
3.2 Univariate model - Eigenfunction
Eigenfunctions1 approximate T(x) in compact form. T(x) models require coefficients be stored describing each data
segment. The size is dependent on the data-sampling rate. Using eigenfunctions and the method of residuals, T(x) is
approximated by
T(x) ≅ Σ_{j=1}^{n} [A_j cos(ω_j x) + B_j sin(ω_j x)] (10)
In Equation (10) Aj is the amplitude of the jth eigenfunction term, Bj is the phase of the jth eigenfunction term, and ωj is
2π times the frequency f defined by the jth derived dominant Fourier frequency term. Dominant eigenfunction terms are
added one at a time until correlation between T(x) and the integral function in (9) converges. Pseudo code used for
creating eigenfunction models is shown in Figure 2.
These smaller models are orthogonal, differentiable, and Lebesgue integrable. However, they require multiple
Fourier Transforms, making them memory and computer intensive.
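The method of residuals can be sketched as follows (Python; a plain DFT stands in for the FFT, and the dominant term is taken as the largest spectral peak rather than the linear-fit threshold used in Figure 2).

```python
import cmath
import math

def dft(data):
    # Plain O(n^2) DFT; adequate for a sketch (use an FFT in practice).
    n = len(data)
    return [sum(data[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def eigenfunction_model(data, max_terms=5):
    # Method of residuals: repeatedly pull the dominant Fourier term
    # out of the residual and add it to the model (Equation 10 form).
    n = len(data)
    residual = list(data)
    model = [0.0] * n
    terms = []
    for _ in range(max_terms):
        spec = dft(residual)
        # dominant frequency in the first half, ignoring the DC term
        k = max(range(1, n // 2), key=lambda i: abs(spec[i]))
        a = 2.0 * spec[k].real / n
        b = -2.0 * spec[k].imag / n
        comp = [a * math.cos(2 * math.pi * k * t / n) +
                b * math.sin(2 * math.pi * k * t / n) for t in range(n)]
        residual = [r - c for r, c in zip(residual, comp)]
        model = [m + c for m, c in zip(model, comp)]
        terms.append((k, a, b))
    return model, terms
```

Each extracted (k, a, b) triple is one eigenfunction term of Equation (10); iterating on the residual keeps later terms from re-fitting energy already captured.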
3.3 Multivariate functional Data Modeling
A functional is a function whose variables are themselves functions. We approximate the Kolmogorov-Gabor polynomial with Ivakhnenko's Group Method of Data Handling8 (GMDH), using nested functionals of the form
y(x1, x2, ..., xn) = f(y_b1(y_b2(...(y_bL(xi, xj, xk))...))), O[y(x1, x2, ..., xn)] = 3^L (11)
This structured approach forms intermediate meta-variables from combinations of three inputs combined into a single
new fused output. This fused meta-variable becomes a new input variable available at the next layer. Since the algorithm
only uses inputs necessary to achieve convergence, pruning of inputs is automatic and requires no external intervention,
enabling unsupervised learning under proper circumstances.
This Functional Data Model is derived in near real-time like T(x) and yields a final model substantially shorter like the
eigenfunction model. If derived from derivative data, it will yield a differential equation model. The algorithm for this
process is listed in Reference 1.
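A single GMDH-style layer can be sketched as below (Python). For brevity each node fits a bilinear form on an input pair rather than the full third-order polynomial on input triples, so this illustrates the layer mechanics rather than the [O(3)L] polynomial itself; all names are illustrative.

```python
import itertools

def solve(A, b):
    # Gaussian elimination with partial pivoting (small systems only).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_pair(u, v, y):
    # One node: least-squares fit of y ~ a + b*u + c*v + d*u*v.
    feats = [[1.0, ui, vi, ui * vi] for ui, vi in zip(u, v)]
    k = 4
    A = [[sum(f[i] * f[j] for f in feats) for j in range(k)] for i in range(k)]
    rhs = [sum(f[i] * yi for f, yi in zip(feats, y)) for i in range(k)]
    coef = solve(A, rhs)
    pred = [sum(c * fi for c, fi in zip(coef, f)) for f in feats]
    err = sum((p - yi) ** 2 for p, yi in zip(pred, y))
    return coef, pred, err

def gmdh_layer(inputs, y, keep=2):
    # Fit every input pair; keep the best nodes as next-layer inputs.
    nodes = [fit_pair(inputs[i], inputs[j], y)
             for i, j in itertools.combinations(range(len(inputs)), 2)]
    nodes.sort(key=lambda node: node[2])
    return [node[1] for node in nodes[:keep]]
```

The surviving node outputs are the fused meta-variables; feeding them back in as inputs and repeating builds the nested functional of Equation (11), and inputs that never survive a layer are pruned automatically.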
3.4 2-D image to 1-D conversion
Enabling functional modeling of images requires transformation from 2-D into 1-D without 2-D decorrelation. Several
methods exist, including raster scanning and fixed pattern readout such as zigzag sequencing. However, only Hilbert
sequencing preserves 2-D correlations at dyadic sample sizes. Hilbert sequencing is illustrated graphically in Figure 3
and is given by
H_{n+1} = w1(P_n) ∪ w2(P_n) ∪ w3(P_n) ∪ w4(P_n), P_{n+1} = w1(H_n) ∪ w2(P_n) ∪ w3(P_n) ∪ w4(P_n) (12)

subject to Lindenmayer's L-system grammars represented and defined by

L → +RF-LFL-FR+, R → -LF+RFR+FL-, F → F, δ = 90°
F ⇒ (x, y, α) → (x + l cos α, y + l sin α, α)
+ ⇒ (x, y, α) → (x, y, α + δ)
- ⇒ (x, y, α) → (x, y, α - δ) (13)
Pseudo code for generating the Hilbert sequence is readily available3.
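For completeness, the standard iterative distance-to-coordinate form of the Hilbert curve (the familiar rotate/flip construction, sketched here in Python; this is not the paper's L-system code) is:

```python
def hilbert_d_to_xy(order, d):
    # Map distance d along a Hilbert curve over a 2^order x 2^order grid
    # to (x, y) using the standard rotate/flip construction.
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:          # flip the quadrant
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x          # rotate the quadrant
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_sequence(image):
    # Read a 2^n x 2^n image into 1-D along the Hilbert curve.
    n = len(image)
    order = n.bit_length() - 1
    return [image[y][x] for x, y in
            (hilbert_d_to_xy(order, d) for d in range(n * n))]
```

Consecutive sequence positions are always neighboring pixels, which is the dyadic correlation-preserving property relied on here.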
4. APPLICATIONS
4.1 Training cases
We chose a bitmap database of 110 Messier objects for use in classifier construction. Classifier flowcharts in Figures 4
and 5 will be described in Sections 4.4 and 4.5. In our database, M102 (a repeat of M101) was removed and replaced with NGC5866 (Reference 9). Each bitmap is a 24-bit color image of varying size. Bitmaps were resized to 64x64, generating a
thumbnail of each image. Thumbnails were then converted to gray scale and Hilbert sequenced into 1-D.
4.2 Turlington Data Models
Turlington polynomials given in Equation (9) were used to construct Data Models of the entire image. Because
Turlington polynomials are continuous functions, they can be interpolated to any resolution. Figure 6 compares the
original and Turlington Data Models for M13 (cluster), M20 (nebula), and M101 (galaxy). T(x) models suppress noise
by increasing the fitting parameter currently set to 0.001 in Equation (9).
4.3 Eigenfunction Data Models
Eigenfunction models of Messier objects consisting of 5, 10, and 20 terms were built. Using more than 5 terms gave marginal improvement in correlation, yet doubled or quadrupled the model size. Therefore, we limited our models to 5
terms.
Figures 7-9 contain a library of eigenfunction Data Models, one for each of the 110 Messier objects. Each object is
plotted next to 3 data graphs. The middle graph is the Hilbert sequenced original waveform, the bottom graph the
eigenfunction model, and the top graph is a special histogram derived from the eigenfunction model that will be
discussed in more detail in Section 4.4.
Data Models of images with the object centrally located and somewhat symmetrical (except for open clusters)
contained three dominant peaks. The width of these peaks was generally wider for extended objects. Open
clusters did not show these peaks. Rather, the waveforms displayed 1/f structure5 (Figures 7-9).
4.4 Change detection Data Model
The top graphs in Figures 7-9 are created by generating a double histogram with mode subtraction. First, the
histogram of the data is calculated with the number of bins equal to the number of points. This histogram is normalized 0-255 and a new histogram (same number of bins) calculated. The mode of this histogram is removed, leaving the modified 2nd order histograms shown. Characterization of these histograms using descriptive statistics provides features for a
hybrid change detector. Reference 1 describes change detection theory and descriptive features in detail. Our statistical
feature classifier correctly identified 107 out of 110 objects. Figure 4 is a flowchart of the change detector, and the
details are as follows:
Our first change detector (8 layer, [O(3)8] polynomial) was constructed to identify clusters as nominal and other objects (nebulae and galaxies) as off-nominal. When this classifier was tested against cluster data, it achieved 100% correct classification. However, when galaxies and nebulae were presented, 19 were mislabeled as clusters. A resolver was therefore constructed to separate clusters from the mislabeled objects (galaxies and nebulae).
The resolver is a 10 layer ([O(3)10] polynomial) Data Model that only post-processes the data sets labeled "clusters". Our resolver correctly identified all nebulae, all clusters except M35, and all galaxies except M108.
With clusters removed, the remaining objects are passed through a second change detector ([O(3)8]polynomial) that
correctly identified all presented galaxies and all nebulae except M78 (which was mislabeled as a galaxy).
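The double histogram with mode subtraction described in Section 4.4 can be sketched as follows (Python; the binning and edge-handling conventions are assumptions, since the text does not specify them).

```python
def histogram(values, bins, lo, hi):
    # Simple fixed-width histogram over [lo, hi].
    counts = [0] * bins
    span = (hi - lo) or 1.0
    for v in values:
        i = min(int((v - lo) / span * bins), bins - 1)
        counts[i] += 1
    return counts

def double_histogram(data):
    # Second-order histogram with mode subtraction (Section 4.4 sketch).
    bins = len(data)                       # bins = number of points
    h1 = histogram(data, bins, min(data), max(data))
    peak = max(h1) or 1
    h1n = [255.0 * c / peak for c in h1]   # normalize to 0-255
    h2 = histogram(h1n, bins, 0.0, 255.0)  # histogram of the histogram
    h2[h2.index(max(h2))] = 0              # remove the mode
    return h2
```

Descriptive statistics of the returned array (mean, standard deviation, skew, kurtosis, and so on) then serve as the change-detector features.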
Additional unusual and potentially difficult test cases were selected to explore performance envelopes of the classifier.
We chose NGC869 (double cluster), the Large Magellanic Cloud (LMC), Small Magellanic Cloud (SMC), Comet
Hale-Bopp, and Comet Neat, shown in Figure 10. These cases were presented to the change detector for assignment:
Object            Cluster CD   Resolver     Galaxy CD   Interpretation
Double Cluster    0.5002       -5.2 x10^8   N/A         Unlike any Messier list object
LMC               0.5          0.9558       N/A         Cluster
SMC               0.5517       N/A          3.1 x10^7   Nebula
LMC+SMC           3.1 x10^9    N/A          3.4 x10^4   Nebula
Hale-Bopp         4.7 x10^10   N/A          0.5         Galaxy
Neat              9.9 x10^9    N/A          4.0 x10^7   Nebula
4.5 Stellar object classification
5-term eigenfunction models are constructed for each Messier object10,11, yielding 16 coefficients shown as a cluster plot
in Figure 11 and given in tabular form in Figure 12. These features are used to build a Data Model classifier. First, we
reduced the dimension of the output to two classes forming a cascade. The first classifier identifies clusters from the
initial pool of objects (clusters, nebula, and galaxies). Once identified, a second classifier distinguishes between nebula
and galaxies. Figure 5 is a flowchart of this classifier.
Using this approach, a 10 layer Data Model ([O(3)10]polynomial) was constructed that correctly distinguished all 57
clusters from galaxies and nebula. Next, a 2 layer Data Model ([O(3)2]polynomial) was constructed that correctly
classified the remaining 40 galaxies and 13 nebulae into their proper classes. The total classifier uses both Data Models;
the first one identifies clusters, while the second determines if the object is a galaxy or nebula.
We obtained 100% correct classification for the Messier list of objects. Also, the 2 combined Data Models needed
only 13 of the 16 available features from Figure 12. The three features not used were B(1), w(4)/2π, and B(5). Our
classifier also detects novelties or changes. If the equation yields a value outside the bounds -0.0862 < x < 1.1313, the object is flagged as unique, exhibiting characteristics not observed in the Messier training set.
Figure 11 shows a graph using the best two discriminating features (A0 and A1) from Figure 12. These features were
automatically selected by calculating the separability between the 3 classes of objects in all 16 dimensions of the feature
space, and plotting the features that maximized the minimum cluster separability using a K-factor defined as
K = (μ2 - μ1) / (σ1^2/N1 + σ2^2/N2)^(1/2) (14)
where µi is the mean of each group, Ni is the number of points in each group, and σi is the standard deviation of each
group. Unusual cases were presented to the classifier to score. Our classifier assigned them as follows:
Object                          DM Value     Interpretation
NGC 869 (Double Cluster)        1.5 x10^14   Unlike any Messier list object
Large Magellanic Cloud (LMC)    0.2843       Nebula
Small Magellanic Cloud (SMC)    1.0173       Galaxy
Combined LMC and SMC            3.5 x10^9    Unlike any Messier list object
Comet Hale-Bopp                 0.9991       Cluster
Comet Neat                      1.0269       Cluster
These cases are unusual because they do not resemble any Messier object. The Double Cluster and combined LMC
and SMC were flagged as novelties. We found our Data Model classifier can determine when new class definitions are
required without supervision.
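The K-factor of Equation (14) is straightforward to compute for any pair of feature groups; a sketch (Python, illustrative names, sample variance with N - 1):

```python
import math

def k_factor(group1, group2):
    # Separability between two groups of feature values:
    # K = (mu2 - mu1) / sqrt(sigma1^2/N1 + sigma2^2/N2)
    def stats(g):
        n = len(g)
        mu = sum(g) / n
        var = sum((v - mu) ** 2 for v in g) / (n - 1)
        return mu, var, n
    mu1, v1, n1 = stats(group1)
    mu2, v2, n2 = stats(group2)
    return (mu2 - mu1) / math.sqrt(v1 / n1 + v2 / n2)
```

Ranking feature pairs by the minimum |K| over all class pairs, and keeping the pair that maximizes it, reproduces the selection that put A(0) and A(1) on the axes of Figure 11.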
5. SUMMARY
In conclusion, two approaches for modeling deep sky images were successfully demonstrated. Component based modeling allows very simple equations to be built. Functional modeling was demonstrated using two different
techniques. Turlington polynomials were demonstrated for real-time applications, and eigenfunctions for short models.
Very good exception handling of novel examples is exhibited using Change Detectors. Functional Data Modeling
resulted in a classifier that correctly identified all 110 of the Messier objects and performed reasonably well classifying
unusual objects.
ACKNOWLEDGMENTS
The authors would like to thank Scott McPheeters, Tim Aden, and John Deacon for their continued support during the
course of this work.
REFERENCES
1. Jaenisch, H., Handley, J., Lim, A., Filipovic, M.D., White, G., Hons, A., Crothers, S., Deragopian, G., Schneider, M., Edwards, M., "Data Modeling for Virtual Observatory data mining", Proceedings of SPIE Vol. 5493 (2004).
2. Lim, A., Jaenisch, H., Handley, J., Berrevoets, C., White, G., Deragopian, G., Payne, J., Schneider, M., “Image
Resolution and Performance Analysis of Webcams for Ground Based Astronomy”, Proceedings of SPIE Vol. 5489
(2004).
3. Jaenisch, H.M., Handley, J.W., Scoggins, J., Carroll, M.P., “ISIS: An IR Seeker Model Incorporating Fractal
Concepts”, Proceedings of SPIE, Vol 2225 (1994).
4. Jaenisch, H.M., and Handley, J.W., “Automatic Differential Equation Data Modeling for UAV Situational
Awareness”, Society for Computer Simulation, Huntsville Simulation Conference 2003, (October 2003).
5. Jaenisch, H. and Handley, J., “Data Modeling of 1/f noise sets”, Proceedings of SPIE Vol. 5114 (2003).
6. Jaenisch, H.M. and Handley, J.W., “Data Modeling for Radar Applications”, Proceedings of the IEEE Radar
Conference 2003, (May 2003).
7. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., "Data Driven Differential Equation Modeling of fBm processes", Proceedings of SPIE Vol. 5204 (2003).
8. Madala, H.R., Ivakhnenko, A.G., Inductive Learning Algorithms for Complex Systems Modeling, Boca Raton, FL:
CRC Press, 1994.
9. “Messier Objects”, http://www.3towers.com/messier.htm, June 1, 2004.
10. Jaenisch, H.M., Filipovic, M.D., "Classification of Jacoby Stellar Spectra Using Data Modeling", Proceedings of SPIE Vol. 4816 (2002).
11. Jaenisch, H.M., Collins, W.J., Handley, J.W., Hons, A., Filipovic, M.D., Case, C.T., Songy, C.G., “Real-time visual
astronomy using image intensifiers and Data Modeling”, Proceedings of SPIE Vol. 4796 (2002).
FIGURES
Fig. 1: Component based Image Data Model for M51, including surrounding star field and FPA noise effects. (Panels: M51, M51 Data Model, and M51 with noise effects; noise components: thermal noise, shot noise and hot/dead pixels, fixed pattern noise, and nonuniformity.)
rem initialize
redim xdata(n),ydata(n),extreme(n),totaldat(n),r(n),c(n),newdata(n),newdata1(n),r1(n),c1(n)
rem open x and y data array files
call open_data_files(xdata,n)
call open_data_files(ydata,n)
rem sort x into ascending order and sort y with x to retain x associations
call sort2(n,xdata,ydata)
rem put ydata into newdata
for i=1 to n
newdata(i)=ydata(i)
next i
rem specify objective correlation value
cobj=0.99
rem calculate Data Modeling eigenfunction model of y(x)
rem final result held in totaldat; initialize k1 to hold number of terms
k1=0
10 continue
k1=k1+1
rem generate Fourier transform of data and corresponding power spectra
rem use linear regression to fit a line through the dB power spectra
rem identify maxima in power spectra that occur above the linear fit
rem pull out maxima locations in array extreme
rem find maximum value in array extreme, and use location to
rem extract real and imaginary Fourier terms held in r (real) and c(imaginary)
rem for use in eigenfunction model
call calc_extremes(newdata,n,extreme,r,c)
rem find maximum value location ; only look at first half of data since symmetric
rem ignore zeroth order location
n1=int(n/2)
dmax = extreme(1)
dloc = 1
FOR i = 2 TO n1 - 2
IF extreme(i) > dmax THEN
dmax = extreme(i)
dloc = i
END IF
NEXT i
rem construct sine and cosine representation of dloc frequency at sampling equal to original
rem r1 and c1 hold eigenfunction coefficients, d the frequency term
r1(k1)=r(dloc)
c1(k1)=c(dloc)
d(k1)=dloc
FOR i = 1 TO n
newdata1(i) = CCUR(r(dloc)) * COS(dloc * 2# * pi * (i / N))
newdata1(i) = newdata1(i) + CCUR(c(dloc)) * SIN(dloc * 2# * pi * (i / N))
newdata1(i) = newdata1(i) / SQR(N)
NEXT i
rem calculate residual between original and summation of all terms generated so far
FOR i = 1 TO n
newdata(i) = newdata(i) - newdata1(i)
NEXT i
rem add current term to previous terms
FOR i = 1 TO n
totaldat(i) = totaldat(i) + newdata1(i)
NEXT i
rem calculate correlation between original and current model (totaldat)
CALL correl(totaldat,ydata,N,ycorrel)
rem test to see if correlation criteria met
rem test to see if maxterms criteria met
IF ycorrel < cobj AND k1 < .125 * N THEN
GOTO 10
END IF
rem output final model
CALL output_eigen(r1,c1,d,k1,N)
END
Fig. 2: Pseudo-code for Data Modeling eigenfunction model construction.
Fig. 3: Hilbert sequence scanning maintains dyadic neighbor correlation.
Fig. 4: Flowchart for Data Modeling Change Detector.
Fig. 5: Flowchart for Data Modeling Classifier.
Fig. 6: Comparison of Turlington Data Model output and original for M13 (top), M20 (middle), and M101 (bottom).
Fig. 7: Data Models for Messier objects M1-M48.
[Figure: 48 Data Model image panels, M49-M96]
Fig. 8: Data Models for Messier objects M49-M96.
[Figure: Data Model image panels, M97-M110]
Fig. 9: Data Models for Messier objects M97-M110.
Fig. 10: Comparison of test objects: M51 (left), NGC869 (left center), LMC and SMC (right center), Comet Hale-Bopp (right top), and
Comet Neat (right bottom).
Fig. 11: Cluster plot showing distribution of Data Modeling features. Features plotted are A(0) on the x-axis and A(1) on the y-axis.
Bottom plot shows Messier number designation, and top plot coded by object classification.
Fig. 12: Table of coefficients for the 5-term eigenfunction based Data Model for each Messier object.
Image resolution and performance analysis of webcams
for ground based astronomy
Albert Lim*a, Holger Jaenischa,b, James Handleya, Miroslav Filipovicc,a, Graeme Whitea, Alex Honsa,
Cor Berrevoets, Gary Deragopiana, Jeffrey Paynea, Mark Schneidera, Matthew Edwardsb
aJames Cook University, Centre for Astronomy, Townsville QLD 4811, Australia
bAlabama Agricultural and Mechanical University, Department of Physics, Huntsville, AL 35811, USA
cUniversity of Western Sydney, Locked Bag 1797, Penrith South DC NSW 1797, Australia
ABSTRACT
We present here a novel concept for achieving real-time super-resolution ground-based imagery for small aperture
telescopes. We explore the combination of existing stacking and registration software in conjunction with real-time
equation based Data Models. Our research indicates that for anisoplanatic imagery, a real-time video/software
enhanced analog to conventional speckle imaging is possible. This paper highlights the technique and theory for
creating such a system.
Keywords: Video/Software adaptive optics, image stacking, speckle imaging, webcam, Data Model, Registax, image
enhancement
1. INTRODUCTION
The use of broadband imaging as an alternative to speckle imaging for small telescope applications has been previously
demonstrated. Images were dewarped and sharp images selected for registration and stacking. Averaging of the corrected
sequence removes motion blur, improves the signal to noise (SNR), and causes the remaining short-exposure blur to
become position stable. For narrow field of view (FOV) observations, the point-spread function (PSF) is considered
constant across the image (isoplanatic); for wider FOVs the atmosphere introduces a spatially dependent PSF, termed
anisoplanatic distortion. Figure 1 shows anisoplanatic distortion on Mars and lunar crater Gassendi1,2,3.
Lambert demonstrated theoretical and practical examples of achieving super-resolved images by exploiting the
anisoplanatic distortion to enable observation of spatial frequency content which would normally be lost beyond the
telescope aperture4. Super-resolution is only possible under anisoplanatic conditions. Most proposals to exploit
super-resolution are based on active control and alteration of the optical system to capture the high frequency content5.
We propose not moving or altering the optical system, and simply waiting until an extra sharp cell with its extra high
spatial frequency content is imaged into the telescope. Data Modeling Change Detectors are used to indicate when
these short exposure sharp and distortion free broadband images occur. Once detected, these extra sharp cells are
filtered, registered, stacked, and enhanced to maximize the effects of the high spatial frequency content of the final
image. By displaying the current enhanced image on a high-resolution monitor and leaving it intact (persistent)
throughout the next improved resolution update, continuous video enhanced imagery occurs even if updates are
intermittent4,6,7,8.
We demonstrate how Data Modeling can enable real-time adaptive identification of sharp frames from live video
streams, which are then processed using algorithms currently available in the form of freeware such as Registax9. To
achieve this, we use video collected at real-time rates (1-15Hz) using webcams or video sampled imagery. The real-time
Data Modeling algorithm enables the user to either select a few examples of sharp images to calibrate the system and get
the process up and going quickly, or it may be left entirely unsupervised allowing the Data Model to learn how to discern
sharp, distortion-free images from blurry ones10.
*[email protected]; phone +65 65674163; James Cook University, Centre for Astronomy, Townsville, QLD 4811,
AUSTRALIA
Once constructed, the Data Modeling Change Detector is then used to indicate images that should be filtered, registered,
stacked, and enhanced in real-time for display at the telescope on a monitor or at the eyepiece in the form of a miniature
high-resolution television display akin to the Collins Electro-Optics I3 Image Intensifier eyepiece11,12,13.
2. EQUIPMENT
2.1 Telescope
In the course of this work, Mars images were collected on the 25th August 2003 using the Singapore Science Centre
Observatory telescope, a 40 cm (16”), F13, Cassegrain telescope with a focal length of 5200 mm (205”). A Celestron
Ultima 2x Barlow lens was used with the above telescope for negative projection to the below webcam achieving a
combined F number of F26.5. Local observers rated the seeing on this night as 7 out of 10 on the improvised scale
described on page 2 of "Photographic Atlas of the Moon"14.
2.2 Webcam
We used the Philips ToUcam Pro with a ¼” CCD with 1 lux sensitivity capable of 60 frames per second video capture at
640 x 480 resolution. It is equipped with an F/2.0 lens of 6 mm focal length, an integrated microphone, and a USB interface.
For collecting the Mars AVI video, the lens on this webcam was removed and replaced with a 1.25” standard eyepiece
sleeve and used with the above telescopic setup. Sound was disabled and the download frame rate was set at 15 frames
per second with exposure times of 1/33 second per frame.
2.3 Image registration/stacking software
Software was required to register and stack the sharp images from the webcam. Programs available included several
freeware programs: AstroStack, AstroVideo, IRIS, and Registax. For this application, Registax was chosen for
registration and stacking.
2.4 Image processing software
Once the sharp images were stacked, enhancement was done to the stacked image using Registax (wavelet based) and
MAJQANDA (Data Modeling). These methods are described later in this work.
3. SHARP IMAGE DETECTION AND SELECTION
3.1 Absolute chi-by-eye
Traditionally, sharp images are detected and selected visually, an approach termed "absolute chi-by-eye". This approach
provides the highest quality and sensitivity in image selection, but is also the least efficient and slowest because it
depends on human interaction and comparison between multiple images in a stream of frames that easily exceed
hundreds to thousands of image frames with varying blurriness.
3.2 Registax quality factor
Alternatively, a step towards automation is provided by software such as Registax. It provides a figure of merit termed
“Quality factor” to assist in identification of sharp frames.
The Quality factor is calculated from information in the spatial frequency graphs (Figure 1) and the filter thresholds are
set interactively by the user. A sum of the values below the red line is calculated and the fraction that is located right of
the green Quality-filter line is the value. This is based on the fact that sharper/clearer images will show relatively higher
amounts of fine (high spatial frequency) details than more blurred images. The position of the filter is crucial to
proper image quality ranking. If the filter is set near zero, all the images appear to have the same quality
(approximately 1).
The different filter settings have marked effects on the quality estimate. Figure 1 (top row) shows power spectra for
quality filter settings of 1, 5, and 15 pixels on the same image. When the filter is set at 1, most of the values are right of
the filter. In the bottom row of Figure 1, graphs are given that show quality difference (top red line) for all the images
(ordered by quality).
Also shown in the bottom row graphs of Figure 1 are curves using pixel neighborhoods of various sizes for comparing
the differences between each image and the reference image (blue line, bottom of the two graphs). This method, though
powerful, must be used with care because improved quality factors can be dominated by spatial noise. Both spatial
noise and sharp detail will yield greater RSS differences with the reference image. This means relying on this quality
factor for sharp image selection remains interactive and non-robust.
We also used the JPEG compressed file size as a sharpness indicator. The lunar image in focus has regions of
essentially constant value, giving higher JPEG compression, so for the moon a smaller file size indicates a sharp image.
Mars at best focus has regions of graded shading; when these graduated regions are blurred or noisy, the image has more
constant regions, yielding higher compression (smaller files) for blurry Mars images.
3.3 Data Modeling sharp image detection
We assert as a fundamental premise that nominal occurs more frequently than off-nominal. If the reverse is true,
the definitions swap and the premise still holds.
We propose a method for identifying sharp frames by creating an equation based Data Model that is tuned to examples of
sharp frames and then used as a real-time bulk filter. We later explore the generalization of this approach to the
unsupervised case. In order to understand how Data Modeling can achieve this, we will examine Data Modeling theory.
Data Modeling15 is the process of deriving a mathematical expression from measured data. We approximate the
Kolmogorov-Gabor polynomial of the form16

y(x_1, x_2, \ldots, x_L) = a_0 + \sum_i a_i x_i + \sum_i \sum_j a_{ij} x_i x_j + \sum_i \sum_j \sum_k a_{ijk} x_i x_j x_k + \cdots   (1)

by using Ivakhnenko's Group Method of Data Handling (GMDH) with nested functionals of the form16

y(x_1, x_2, \ldots, x_L) = f(y_n(b_1(y(b_2(y(\cdots y(x_i, x_j, x_k) \cdots)))))), \qquad O[y(x_1, x_2, \ldots, x_L)] = 3^n   (2)
This structured approach forms intermediate meta-variables from combinations of three inputs combined into a single
new fused output. This fused meta-variable becomes a new input variable available at the next layer. Since the algorithm
only uses the inputs necessary to achieve convergence, pruning of inputs occurs automatically and requires no external
intervention.
The O[3^n] functional (a function whose variables are themselves functions) is now used as a snapshot model of a group
of sharp images, or conversely a group of blurry images, depending on the application. This enables the functional
to be used as a fast real-time bulk filter for identifying sharp images based on change detection18,19.
3.3.1 Change detector construction
Each Data Model is generated using only a portion (50% or less) of the available nominal data, with the
remainder held back for testing the Data Model after construction. This supports the ergodic Data Modeling re-
sampling theory that only a portion of the available data is required to produce adequate change models20.
For analyzing images, samples of pixel neighborhoods (image sub-sections) are extracted and characterized by five (5)
descriptive statistics given by
\sigma = \sqrt{\frac{1}{N-1} \sum_{j=1}^{N} (x_j - \bar{x})^2}   (3)

\mathrm{Skew} = \frac{1}{N} \sum_{j=1}^{N} \left( \frac{x_j - \bar{x}}{\sigma} \right)^3   (4)

\mathrm{Kurt} = \frac{1}{N} \sum_{j=1}^{N} \left( \frac{x_j - \bar{x}}{\sigma} \right)^4 - 3   (5)

M_6 = \frac{1}{N} \sum_{j=1}^{N} \left( \frac{x_j - \bar{x}}{\sigma} \right)^6 - 15   (6)

M_8 = \frac{1}{N} \sum_{j=1}^{N} \left( \frac{x_j - \bar{x}}{\sigma} \right)^8 - 105   (7)
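As an illustration, the five features of Equations (3)-(7) can be computed for a pixel neighborhood with a few lines of NumPy. This is a sketch under our own naming (`change_features`), not the authors' code:

```python
import numpy as np

def change_features(x):
    """The five descriptive statistics of Eqs. (3)-(7) for a pixel neighborhood."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    sigma = np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))   # Eq. (3)
    z = (x - x.mean()) / sigma                               # standardized deviations
    skew = np.sum(z ** 3) / n                                # Eq. (4)
    kurt = np.sum(z ** 4) / n - 3                            # Eq. (5)
    m6 = np.sum(z ** 6) / n - 15                             # Eq. (6)
    m8 = np.sum(z ** 8) / n - 105                            # Eq. (7)
    return sigma, skew, kurt, m6, m8
```

The constants 3, 15, and 105 are the standard-normal moments, so each higher-order feature is zero for Gaussian data and deviates from zero as the neighborhood's distribution changes.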
Once the features in Equations (3) - (7) are calculated for each nominal data set, a Data Change Model is generated using
a target regression value of ½ as the nominal output. Once the equation model is derived, boundaries are placed on either
side of the target regression value that characterizes the fluctuation of the functional about ½. Typically, these boundaries
are set at 0.499 and 0.501 with a standard deviation of 0.0012. Figure 2 illustrates construction of a change detector21,22.
In unsupervised mode, no a priori knowledge of images being sharp or non-sharp is required. It is only assumed
that the majority of the images are background (non-sharp). Pseudo-code for constructing a Data Change Model
unsupervised is found in Figure 3. For the images analyzed for this paper, the final Data Change Model was determined
by placing a tight boundary of 0.0012 on either side of the final sharp image class designation. All images that flagged
inside of the two boundaries are considered candidate sharp images, while all others are considered non-sharp images
and are discarded from further consideration23,24,25.
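The boundary test described above amounts to a simple interval check on the Data Model output. A hypothetical helper (names and signature are ours, for illustration only) might look like:

```python
def flag_candidates(model_outputs, target=0.5, band=0.0012):
    """Return indices of frames whose Data Model output falls inside the tight
    boundary around the target class value; all others are treated as non-sharp
    and discarded. (Illustrative sketch, not the authors' implementation.)"""
    lo, hi = target - band, target + band
    return [i for i, v in enumerate(model_outputs) if lo <= v <= hi]

# e.g. flag_candidates([0.5, 0.7, 0.4995, 0.2]) keeps frames 0 and 2
```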
4. IMAGE REGISTRATION AND STACKING
4.1 Registax for image registration
A good reference image is selected and an alignment area, depicted by a cursor box in Figure 4 (left), is framed over a
feature of interest such as a planetary disk or a lunar crater. The alignment area boxed will now serve as a basis for the
alignment procedure. During initial alignment, the program will align the images with the reference area using a Fourier-
based cross-correlation with normalization. To achieve this, both the reference-area and the image-area are transformed
into real and imaginary parts using Fast Fourier Transform (FFT). The cross-correlation products (real and imaginary)
are calculated from26
Shift_{REAL}(x,y) = REF_{REAL}(x,y)\,SEQ_{REAL}(x,y) + REF_{IMAG}(x,y)\,SEQ_{IMAG}(x,y)   (8)

Shift_{IMAG}(x,y) = REF_{IMAG}(x,y)\,SEQ_{REAL}(x,y) - REF_{REAL}(x,y)\,SEQ_{IMAG}(x,y)   (9)
This is then processed using an inverse FFT to create a new “image”. The brightest spot in this image is located at the
expected shift position. The initial FFT also estimates image quality, which is directly derived from the FFT-transform of
the image being processed. The FFT-transform is first converted into a “real” image by calculating
IMAGE(x,y) = SEQ_{REAL}(x,y)\,SEQ_{REAL}(x,y) + SEQ_{IMAG}(x,y)\,SEQ_{IMAG}(x,y)   (10)
A radial average value called a power spectrum is then calculated for all radii from the center of this “image” outwards.
Figure 4 (left) shows the power spectrum and a graph displaying registration properties. Basically, after running the FFT
on an image, the program sums for every possible radius (from the center) the intensities in the FFT-ed image per radius.
This sum per radius in integer steps is then divided by the number of pixels that contributed to the intensity value (the
smaller the radii, the fewer the contributing pixels). An array with radii starting from 1 onwards is thus generated. A
bandwidth filter is then used to calculate the sum for the radii in the band and presented as a proportion based on the sum
in the band divided by the sum for all radii. In this way, the sum of the values in the section of the power-spectrum
defined by the user is thus compared with the total sum of the power-spectrum. This process is depicted in Figure 1.
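The shift-finding step of Equations (8)-(9) can be sketched with NumPy FFTs. The function name `fft_shift_estimate` is ours, and the returned offsets are the correction to apply to the sequence frame to realign it with the reference:

```python
import numpy as np

def fft_shift_estimate(ref, seq):
    """Estimate the integer (dy, dx) correction that aligns seq with ref, via
    the FFT cross-correlation of Eqs. (8)-(9) and an inverse FFT; the brightest
    pixel of the correlation "image" marks the expected shift."""
    R = np.fft.fft2(ref)
    S = np.fft.fft2(seq)
    corr = np.real(np.fft.ifft2(R * np.conj(S)))  # Eqs. (8)-(9) as R * conj(S)
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    if dy > h // 2:  # wrap offsets beyond half the frame to negative shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

Applying `np.roll(seq, fft_shift_estimate(ref, seq), axis=(0, 1))` then (circularly) restores the reference alignment.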
Next, the images are sorted according to quality level. Only those images which exceed a certain user-defined
quality threshold, relative to the image with the highest measured quality, are processed. The next stage is
optimizing the initial alignment. During optimization, every image is first block-matched with the reference image to
find a pixel-shift that results in the smallest sum of squared differences between the images.
Every image is normalized before these squared differences are computed to prevent changes in overall brightness
from affecting the outcome. The block-match search is done in a user-specified search window, and the program keeps track
of the improvement in alignment for every image (lower sum of squared differences, SSD). Using a second user-defined
percentage threshold, optimization terminates when subsequent optimizations do not yield SSD improvement. Once
completed, the registered images are ready to be stacked to minimize noise and eliminate motion by spatial correlation.
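The optimization stage can be illustrated as an exhaustive SSD search over a small window. This is a sketch with our own function name and default window size, not Registax's actual implementation:

```python
import numpy as np

def block_match(ref, img, search=3):
    """Find the integer (dy, dx) shift within +/-search pixels that minimizes
    the sum of squared differences between the brightness-normalized images."""
    ref = (ref - ref.mean()) / ref.std()  # normalize overall brightness
    img = (img - img.mean()) / img.std()
    best_ssd, best_shift = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            ssd = np.sum((ref - shifted) ** 2)
            if ssd < best_ssd:
                best_ssd, best_shift = ssd, (dy, dx)
    return best_shift
```

In practice the search would run over a cropped alignment area rather than the whole (circularly shifted) frame, but the SSD criterion is the same.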
4.2 Image stacking increases SNR
Stacking enough correctly selected sharp (low noise) images results in significant increase in the signal-to-noise ratio.
Technically, the signal-to-noise ratio (SNR) is what makes the difference between a good and a poor image. Adding 2
images will combine their signals S1 and S2; their noise, however, adds in quadrature as27

S = S_1 + S_2   (11)

N^2 = N_1^2 + N_2^2   (12)

Their signal to noise ratio is defined as

SNR = \frac{S_1 + S_2}{\sqrt{N_1^2 + N_2^2}}   (13)

From (13), we derive

SNR = \frac{SNR_1 + SNR_2 \left( \frac{N_2}{N_1} \right)}{\sqrt{1 + \left( \frac{N_2}{N_1} \right)^2}}   (14)

Assuming both images have similar SNR and noise, adding 2 images increases the signal-to-noise by a factor of the square
root of 2, resulting in 1.414 times the initial SNR of each image.

During stacking, we can assume multiple exposures taken under similar conditions will result in adding images with
similar SNR and noise (i.e. S_1 = S_2 = S_3 = \cdots and N_1 = N_2 = N_3 = \cdots).

Extending the above, Equation (14) may be expressed as

SNR = \frac{SNR_1 + SNR_2 \left( \frac{N_2}{N_1} \right) + SNR_3 \left( \frac{N_3}{N_1} \right) + \cdots}{\sqrt{1 + \left( \frac{N_2}{N_1} \right)^2 + \left( \frac{N_3}{N_1} \right)^2 + \cdots}}   (15)

SNR = \frac{SNR_1 + SNR_2 + SNR_3 + \cdots}{(1 + 1 + 1 + \cdots)^{1/2}}   (16)

For n images, therefore, the final signal-to-noise ratio is a function of the square root of n, i.e.

SNR = n^{1/2} \, SNR_1   (17)

Thus, signal to noise varies as the square root of n: stacking 2 images increases SNR by 1.414 times, stacking 9 images
increases it by 3 times, and stacking 625 images increases it by a factor of 25.
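The square-root-of-n scaling of Equation (17) is easy to confirm numerically by stacking simulated noisy frames of a constant signal:

```python
import numpy as np

rng = np.random.default_rng(0)
signal, noise_sigma, n = 100.0, 10.0, 625   # per-frame SNR1 = 10

# n frames of the same constant signal with independent Gaussian noise
frames = signal + noise_sigma * rng.standard_normal((n, 4000))
stack = frames.mean(axis=0)                  # stack by averaging the frames

snr_single = signal / noise_sigma            # 10
snr_stack = signal / stack.std()             # residual noise shrinks ~sqrt(n)-fold
print(snr_stack / snr_single)                # close to sqrt(625) = 25
```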
During stacking, all the pixel-intensities of the images considered of good quality are stacked as seen in Figure 4
(center). Stacking and all other subsequent processing is done in floating point matrices to maintain information for post
image processing. RGB layers may be stacked separately or used to create luminance out of RGB-layers which are
maintained for post processing in either color option.
5. IMAGE ENHANCEMENT
5.1 Registax wavelet based image enhancement
After stacking, post-processing is done using wavelets. The wavelet-processing incorporated by Registax uses a simple
filter to separate the stacked image into 6 spatial layers. The process is similar to successive unsharp masking runs.
Initially, the stack-image gets convoluted with the wavelet-filter. The difference between the stack and this convolved
image is calculated and moved into the first wavelet-layer. The second layer is constructed using the blurred image
which is convolved once more with the wavelet-filter, and the difference between these is stored in layer 2.
This process continues until the initial intensities of the stack are divided across 6 layers and a final, blurred, base layer.
The advantages of wavelet techniques are orthogonality and invertibility. This ensures no information is lost during
processing. Adding the layers together without alteration retrieves the original stacked image. The 6 layers represent 6
spatial resolution layers, the highest (layer 1) will have the finest details and the lowest (layer 6) the coarsest details.
Layer 1 will often have most noise present, and removing this layer reduces noise. Individual layers may be post
processed to increase the contrast for details present in that layer. This powerful image enhancement method requires
some skill and interactivity to use, and is difficult to make robust for automatic processing.
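The successive blur-and-difference scheme is straightforward to sketch. Here a 3x3 box blur stands in for Registax's actual wavelet filter (our simplification), and the lossless property — summing the layers reproduces the stacked image — still holds:

```python
import numpy as np

def blur3(img):
    """3x3 box blur standing in for the wavelet filter (our simplification)."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0

def layer_decompose(stack, n_layers=6):
    """Split an image into n_layers difference (detail) layers plus a final
    blurred base layer; adding all layers back together is lossless."""
    layers, current = [], stack.astype(float)
    for _ in range(n_layers):
        blurred = blur3(current)
        layers.append(current - blurred)  # detail at this spatial scale
        current = blurred
    layers.append(current)                # final, blurred base layer
    return layers
```

Layer 1 carries the finest detail (and most of the noise); scaling individual layers before re-summing gives the per-scale contrast control described above.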
5.2 Data Modeling image enhancement
The authors have determined that the Van Cittert deconvolution nonlinear image processing method can be
approximated through the application of a series of high pass and low pass kernel filters to an image. Additionally, using
knowledge of the number of times the filters are applied enables the individual high pass and low pass filters to be
combined into an equivalent kernel. An equivalent kernel is one that when applied to the image gives the same value as
a series of individual kernels applied to the image sequentially.
To illustrate, consider an image processing sequence consisting of applying a high pass filter, followed by a low pass
filter, followed finally by a Gaussian filter to an image. Examples of these filters are
High = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 15 & -1 \\ -1 & -1 & -1 \end{bmatrix} \qquad
Low = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 1 \end{bmatrix} \qquad
Gaussian = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}   (18)
Convolving each of these three kernels together yields a larger kernel that is called the equivalent kernel.
\begin{bmatrix} -1 & -1 & -1 \\ -1 & 15 & -1 \\ -1 & -1 & -1 \end{bmatrix} \otimes
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 1 \end{bmatrix} \otimes
\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} =
\begin{bmatrix}
-1 & -4 & -8 & -10 & -8 & -4 & -1 \\
-4 & -2 & 10 & 16 & 10 & -2 & -4 \\
-8 & 10 & 94 & 152 & 94 & 10 & -8 \\
-10 & 16 & 152 & 252 & 152 & 16 & -10 \\
-8 & 10 & 94 & 152 & 94 & 10 & -8 \\
-4 & -2 & 10 & 16 & 10 & -2 & -4 \\
-1 & -4 & -8 & -10 & -8 & -4 & -1
\end{bmatrix}   (19)
When the equivalent kernel is applied to the image, the result is the same as convolving the image with each of the
three smaller kernels separately. Mathematically, the order the kernels are applied in does not matter. If a high pass
filter of size 3x3 is applied to an image alternatively with a low pass filter that is also of size 3x3, the resultant for one
application of these 2 filters is the same as having used a single equivalent kernel of 5x5. Taking this one step further, a
series of 5 sets of 3x3 high pass and 3x3 low pass filters could be applied to the image so that
(HP\,LP)(HP\,LP)(HP\,LP)(HP\,LP)(HP\,LP) = 5(HP\,LP) = 21 \times 21\ \mathrm{Equivalent\ Kernel}   (20)
For this example, an equivalent kernel of size 21x21 would yield the same answer in a single pass. To keep from having
to spatially convolve such a large kernel or to avoid using a Fast Fourier Transform (FFT) for convolution, the authors
have derived a Data Model of the equivalent kernel.
We found the equivalent kernel process is approximated with Data Models. The Data Model uses either
equivalent features from local pixel neighborhoods operated on by the equivalent kernel, or a subset of the pixel
neighborhood values directly as input. The subset of the pixel neighborhood chosen corresponds to the center row and
center column of the kernel as shown in Figure 5.
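Equation (19) is easy to verify numerically: convolving the three small kernels of Equation (18) yields the 7x7 equivalent kernel, and the associativity of convolution guarantees that a single pass with it matches the sequential application. This is a sketch with our own helper, not the authors' MAJQANDA code:

```python
import numpy as np

def conv2d_full(a, b):
    """Full 2-D convolution: superpose copies of a, scaled and offset by b."""
    out = np.zeros((a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1))
    for i in range(b.shape[0]):
        for j in range(b.shape[1]):
            out[i:i + a.shape[0], j:j + a.shape[1]] += a * b[i, j]
    return out

high = np.array([[-1, -1, -1], [-1, 15, -1], [-1, -1, -1]], float)
low = np.array([[1, 1, 1], [1, 3, 1], [1, 1, 1]], float)
gauss = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float)

# 7x7 equivalent kernel of Eq. (19)
equiv = conv2d_full(conv2d_full(high, low), gauss)
print(equiv[3])  # center row: -10 16 152 252 152 16 -10
```

Applying `equiv` to an image in one pass gives the same result as convolving with the three 3x3 kernels in sequence, in any order.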
6. RESULTS
Figures 6 and 7 display results of image selection, registration, stacking and processing using Data Modeling and
Registax for Mars and the moon. These results can be summarized as:
                                        Mars    Moon
Total pool                                66     225
Data Model flagged to stack               18      38
Registax images used                      18      38
Final Registax Image Quality Factor   0.1886  0.2026
Final Data Model Image Quality Factor 0.1917  0.2053
The Mars results showed striking improvement by Registax over the original, with an additional marginal improvement
by the Data Model. For the moon, although interactive Registax clearly improved on the raw stacked image, Data
Modeling showed marked improvement over it. In this case, the Data Modeling was performed unsupervised.
Finally, legacy images constructed with stacking are shown in Figure 8.
7. FUTURE WORK
Our present setup uses webcams at the telescope to capture AVI sequences (mostly blurred frames with sporadic sharper
ones) for storage in a computer for post processing. Ideally, the Data Modeling approach for selecting only the sharpest
images and enhancing them even further can be performed inline and in real-time as images are acquired at the telescope
thereby facilitating the use of the Data Model itself as a real-time video/software enhanced analog to speckle
imaging. This is illustrated in Figure 9.
Achieving real time sharp image selection also saves storage space (each 2 minute AVI sequence on 640 x 480
resolution is over 1GB in size) and valuable time. We need to also apply Data Modeling to the automatic image
registration problem. The advent of powerful computers, high speed USBs and related technology will enable webcams
using software to perform real-time registration28.
8. SUMMARY
We demonstrated webcam based image deblurring and enhancement using existing software and algorithms for small
aperture ground based telescopes. We also demonstrated the concept of auto selection of sharp frames, constructed a
Data Change Detection Model using local pixel statistics as features to find other sharp images in real-time, and
modeled a Van Cittert deconvolution using an equivalent kernel from a series of high and low pass filters29.
ACKNOWLEDGEMENTS
The authors wish to thank committee members of The Astronomical Society of Singapore (TASOS) and Mr. Tan Wei
Leong for their support and the images; and Scott McPheeters, Tim Aden, and John Deacon for their continued support
during the course of this work.
REFERENCES
1. Charnotskii, M.I., “Imaging in turbulence beyond diffraction limits”, Proceedings of SPIE Vol. 2534 (1995)
2. Charnotskii, M.I., et. al., “Observation of superresolution in nonisoplanatic imaging through turbulence”, J. Opt.
Soc. Am. A., Vol. 7, No. 8, August 1990, pp. 1345-50.
3. Lukosz, W., "Optical systems with resolving powers exceeding the classical limit. II", J. Opt. Soc. Am., Vol. 57,
1967, pp. 932-41.
4. Roddier, F. Adaptive Optics in Astronomy. Cambridge, UK: Cambridge University Press, 1999.
5. Hubin, N. & L. Noethe 1993, “What is Adaptive Optics?”, Science, 262 : 1345 – 1484.
6. Jaenisch, H.M., Handley, J.W., Scoggins, J., Carroll, M.P., "Atmospheric Optical Turbulence Model (ATOM) Based
On Fractal Theory, " Proceedings of SPIE, Los Angeles, CA January 24, 1994.
7. Ellerbrook, B.L. “First Order Performance Evaluation of Adaptive-Optics Systems for Atmospheric-Turbulence
Compensation in Extended-Field-of-View Telescopes.” Journal of the Optical Society of America A 11:783-805
(1994)
8. Fraser, D., Lambert, A., Jahromi, M.R.S., Clyde, D., Donaldson, N., “Can broad-band image restoration rival
speckle restoration?”, Proceedings of SPIE Vol. 4792 (2002).
9. Berrevoets C., “Processing Webcam Images with Registax ”, Sky and Telescope, Apr 2004, p 130 – 135.
10. Lambert, A., Fraser, D., Jahromi, M.R.S., Hunt, B.R., “Super-resolution in image restoration of wide area images
viewed through atmospheric turbulence”, Proceedings of SPIE Vol. 4792 (2002).
11. Shemer, A., et al., “Superresolving optical system with time multiplexing and computer decoding”, Applied Optics,
Vol. 38 No. 35, 10 December 1999, pp. 7245-51.
12. Jaenisch, H.M., Collins, W.J., Handley, J.W., Case, C.T., Songy, C.G., “Real-time visual astronomy using image
intensifiers and Data Modeling”, Proceedings of SPIE, Vol. 4796 (2002).
13. Collins, W., “Collins Electro-Optics”, http://www.ceoptics.com/, May 24, 2004.
14. Lim, A., Chong, S.M., Ang, P.S., “Photographic Atlas of the Moon”, Cambridge University Press, 2002.
15. Jaenisch, H.M. and Handley, J.W., “Data Modeling for Radar Applications”, Proceedings of the IEEE Radar
Conference 2003, (May 2003).
16. Westwick, D.T. Kearney, R.E., Identification of Nonlinear Physiological Systems, Piscataway, NJ: IEEE Press,
2003.
17. Madala, H.R., Ivakhnenko, A.G., Inductive Learning Algorithms for Complex Systems Modeling, Boca Raton, FL:
CRC Press, 1994.
18. Jaenisch, H.M., and Handley, J.W., “Automatic Differential Equation Data Modeling for UAV Situational
Awareness”, Society for Computer Simulation, Huntsville Simulation Conference 2003, (October 2003).
19. Handley, J., Jaenisch, H., Lim, A., White, G., Pennypacker, C., Edwards, M., “Data Modeling of Deep Sky Images”,
Proceedings of SPIE Vol. 5497 (2004).
20. Jaenisch, H.M., Handley, J.W., Pooley J.C., Murray S.R., “Data Modeling for Fault Detection”, Society for
Machinery Failure Prevention Technology (MFPT), (April 2003), published in Proceedings of MFPT (2004).
21. Pooley, J.C. and Murray, S.R., “Rotordynamic Fault Diagnostics Using Phase Coherent Filtering”, Proceedings of
the Society for Machinery Failure Prevention Technology (MFPT), (April 2003).
22. Pooley, J.C., Murray, S.R., Jaenisch, H.M., and Handley, J.W., “Fault Detection via Complex Hybrid Signature
Analysis”, JANNAF 39th Combustion, 27th Airbreathing Propulsion, 21st Propulsion Systems Hazards, and 3rd
Modeling and Simulation Subcommittees Joint Meeting, Colorado Springs, CO, (December 2003).
23. Jaenisch, H., Handley J., Lim A., M.D. Filipovic, White G., Hons A. , Crothers S., Deragopian G., Schneider M.,
Edwards M., “Data Modeling for virtual observatory data mining”, Proceedings of SPIE Vol. 5493 (2004).
24. Jaenisch, H.M., Handley, J.W., Bonham, L.A., “Enabling Calibration on Demand for Situational Awareness”, Army
Aviation Association of America (AAAA), Tennessee Valley Chapter, (February 2003).
25. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., Harris, B., “Data Modeling of Network Dynamics”, Proceedings of
SPIE, Vol. 5200 (2003).
26. Berrevoets, C. "Registax Website", http://aberrator.astronomy.net/registax/, May 24, 2004.
27. Newberry M.V. “The Signal to Noise Connection – Part II”, CCD Astronomy, Fall 1994, p 13 – 14 (1994)
28. Lim, A., “Astro Scientific Centre Pte. Ltd.”, http://www.astro.com.sg/, May 24, 2004.
29. James Cook University, Centre for Astronomy, http://www.jcu.edu.au/school/mathphys/astronomy/, May 24, 2004.
FIGURES
Fig. 1: Examples of individual sharp (left), intermediate (center) and blur frames (right) as scored by quality factor. Graphs on bottom
show ranking of approximately 50 images by quality (top curve), and difference between reference image and individual frames
(bottom curve).
[Figure: schematic of Data Change Model construction. Descriptive statistics (standard deviation, skewness, kurtosis, rollup stats) characterizing the PDF of the data series feed a polynomial Data Model
T = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_1^2 + w_5 x_2^2 + w_6 x_3^2 + w_7 x_1^3 + w_8 x_2^3 + w_9 x_3^3 + w_{10} x_1 x_2 + w_{11} x_1 x_3 + w_{12} x_2 x_3,
whose equation output fluctuates about the nominal band (width sigma) between tip-off boundaries.]
Fig. 2: Data Change Modeling uses descriptive statistics as shown above that characterize the PDF or histogram of the input data to
build a model that is sensitive to small changes in input feature distribution.
n = total number of cases
CALL Generate_Waveform_Stats(ydata, n, ystats)
CALL Generate_Rollup_Stats(ystats, n, rstats)
FOR i = 1 TO n
  rem sub-sample of all cases called Group 1
NEXT i
m = number of subsampled cases
rem use previous pseudo-code for generating a Data Change Model
CALL generate_data_change_model
FOR i = 1 TO n
  CALL Interrogate_data_model(features, Dmval)
  CALL flag_tipoffs
  rem save tip-off data as new Group 2
NEXT i
FOR i = 1 TO n
  rem subsample another m points from new pool
  rem place with 1st 10% subsampled from original Group 1
NEXT i
m = 2 * m
CALL generate_data_change_model
FOR i = 1 TO n
  CALL Interrogate_data_model(features, Dmval)
  CALL flag_tipoffs
  rem save as Group 3
NEXT i
rem extract all k tipoff cases from Group 3 and
rem randomly assign to 0/1 class
FOR i = 1 TO k
  IF RND > .5 THEN
    out = 1
  ELSE
    out = 0
  END IF
NEXT i
10 rem continue
CALL generate_data_relations_model
rem score cases
FOR i = 1 TO k
  CALL Interrogate_data_model(features, Dmval)
  IF Dmval < .25 THEN
    trainval = 0
  ELSEIF Dmval > .75 THEN
    trainval = 1
  ELSE
    IF trainval = 0 THEN
      trainval = 1
    ELSE
      trainval = 0
    END IF
  END IF
NEXT i
rem check for limit cycle or 4 layer Data Model learning all cases
IF limit_cycle OR all_learned THEN
  rem sharp images are the smaller class
  rem if equal number in each class, select class with largest RSS
ELSE
  GOTO 10
END IF
rem last model generated can be used in lieu of training a final model
CALL generate_data_change_model
FOR i = 1 TO n
  CALL Interrogate_data_model(features, Dmval)
  CALL flag_tipoffs
NEXT i
CALL write_output
END
Fig. 3: Pseudo-code for unsupervised Data Change Detector generation.
Fig. 4: (Left) Alignment in progress at 12%. The power spectrum and Registration Properties are shown; (Center) Stacking. Users can
manually deselect blur frames that were ranked incorrectly by Registax; (Right) Registax’s 6 layer wavelet processing works well with
most planetary astronomical images.
[Figure: a 5 x 5 image pixel neighborhood z_{11} ... z_{55}. The Data Model of the equivalent kernel, O[3^n], maps the pixel locations corresponding to the center row and center column of the equivalent kernel to the center pixel value R of the final image (after applying the equivalent kernel): f(z_{13}, z_{23}, z_{31}, z_{32}, z_{33}, z_{34}, z_{35}, z_{43}, z_{53}) = R]
Fig. 5: Data Modeling captures center row and center column of equivalent kernel and builds an equation model that captures the
image enhancement algorithm.
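As a sketch of how such an equivalent kernel is applied, the snippet below filters an image using only the nine center-row and center-column pixels of each 5x5 neighborhood. Here f is taken to be linear with weights w for illustration; in the thesis f is a polynomial Data Model.

```python
# Offsets (dr, dc) of the center row and center column of a 5x5 kernel:
# z13, z23, z31, z32, z33, z34, z35, z43, z53 in the figure's notation.
CROSS = [(-2, 0), (-1, 0), (0, -2), (0, -1), (0, 0), (0, 1), (0, 2), (1, 0), (2, 0)]

def apply_cross_kernel(img, w):
    """Filter img with a 9-weight 'cross' kernel; border pixels stay 0."""
    h, wd = len(img), len(img[0])
    out = [[0.0] * wd for _ in range(h)]
    for r in range(2, h - 2):
        for c in range(2, wd - 2):
            out[r][c] = sum(wk * img[r + dr][c + dc]
                            for wk, (dr, dc) in zip(w, CROSS))
    return out
```

Using only the cross costs 9 multiplies per pixel instead of 25 for the full 5x5 neighborhood, which is the practical point of the O(3n) equivalent-kernel model.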
[Figure panels: Raw Frame #1 (note: JPEG artifacts present); Registax sharp-selected and wavelet-processed, score = 0.1886; Data Modeling, score = 0.1917 (note: JPEG artifacts removed, noise suppressed).]
Fig. 6: Data Modeling for image stacking on Mars (top) and zoom into Mars (bottom).
[Figure panels: Raw Frame #1; Registax sharp-selected and wavelet-processed, score = 0.2026; Data Modeling, score = 0.2053 (note: noise suppressed).]
Fig. 7: Data Modeling for image stacking on lunar surface feature (top), and zoom in to lunar surface feature (bottom).
[Figure panels: Jupiter; Saturn; Alpine Valley.]
Fig. 8: Examples of high-resolution images produced by processing stacked images from AVIs using Registax.
[Figure: block diagram — frames from the Camera flow over Time through Background Reject, the Sharp Data Model [O(3n)], Stack (Data Modeling), and Enhance Image.]
Fig. 9: Block diagram of unsupervised Data Modeling approach for selecting images for stacking.
Classification of Jacoby stellar spectra data using Data Modeling
Holger M. Jaenisch*a and Miroslav D. Filipovic**a
aUniversity of Western Sydney
Keywords: Data Modeling, Neural network, Stellar Spectra Classification, Astronomy, Decision Architecture
ABSTRACT
This paper presents the results of creating a Data Model by using neural equation networks of high order polynomials to achieve 100% correct classification of the Jacoby stellar spectra. The Jacoby set is a challenging group of 161 spectra spanning the full range of temperature, sub-temperature and luminosity groupings of standard star types. To achieve full learning, the development of a cascaded decision architecture linking an extensive network of polynomial decision equations was required. The two dominant features were extracted, and complex decision maps generated. Also, the sensitivity of the equation architecture to misclassification due to measurement noise was analyzed.
1. BACKGROUND
1.1. History of Stellar Spectra
In the early 1800s, the German physicist Joseph von Fraunhofer first observed the patterns of absorption lines in the solar spectrum. In the late part of the 19th century, astronomers were able for the first time to routinely examine the spectra of stars in large numbers. Astronomers like Angelo Secchi and E.C. Pickering noted that stars could be divided into groups by the general appearance of their spectra. They proposed classification schemes where stars were grouped together by the prominence of certain spectral lines. In Secchi's scheme, stars with strong hydrogen lines were called type I, stars with strong lines from metallic ions like iron and calcium were called type II, stars with wide bands of absorption that got darker toward the blue were called type III, and so on. Building upon this early work, astronomers at the Harvard Observatory refined the spectral types and renamed them with letters, A, B, C, and so forth. The observatory also embarked on a massive project to classify spectra, carried out by astronomers Williamina Fleming, Annie Jump Cannon, and Antonia Maury. The results of their work were published between 1918 and 1924 as the Henry Draper Catalog (Draper was the benefactor of this work), which provided classifications for 225,300 stars, still only a tiny fraction of the stars in the sky. In the course of this Harvard study, some of the old spectral types were consolidated, and the types were rearranged to reflect a steady change in the strengths of representative spectral lines. This rearrangement caused the order of the spectral classes to become O, B, A, F, G, K, and M. These letter designations have no physical meaning but have remained to this day. Each spectral class is divided into tenths (sub-temperature class), so that a B0 star follows an O9, and an A0, a B9. In this scheme the sun is designated a type G2.
These early studies generally lumped stars together into classes based upon the appearance of their spectra: those that looked the same got lumped together. However, in the 1930s and 1940s, a better understanding of the physics behind the spectra was developed. It was then understood that the predominant factor driving a star's spectrum is its surface temperature. O class stars are the hottest, with surface temperatures on the order of 40,000 K, while M class stars are the coolest, at about 3000 K. The decimal divisions of sub-temperature class follow the same pattern, with a B0 star hotter than a B9.
*[email protected]; phone 1 256 337 3768; fax 1 256 830 0287; Sparta, Inc.; 4901 Corporate Dr. Ste 102; Huntsville, Alabama, USA 35805; **[email protected]; Intl. phone 61 2 47360135; local phone (02) 47360135; Intl. fax 61 2 47360129; http://www.uws.edu.au/astronomy/index.html; Astronomy, School of Engineering and Industrial Design, University of Western Sydney, Locked Bag 1797, PENRITH SOUTH DC, NSW 1797, AUSTRALIA.
In the 1940s and 1950s, W. W. Morgan and P.C. Keenan at Yerkes Observatory introduced the MK system. The MK system is a refinement of the earlier Harvard work that takes into account that stars at the same temperature can have different sizes. For example, a star one hundred times larger than our sun but with the same surface temperature will show subtle spectrum differences and will have a much higher luminosity. The MK system adds a Roman numeral to the end of the spectral type to indicate the luminosity class: I a supergiant, III a giant star, and V a main sequence star. Our sun, as a typical main-sequence star, would be designated a G2V1.
1.2. Explanation of Jacoby Spectra
The spectral type of a star is fundamental to astronomy, so much so that an astronomer beginning the study of any star will first try to find out its spectral type. The spectral type of a star allows the astronomer to know not only the temperature of the star, but also its luminosity (often expressed as the absolute magnitude of the star) and its color. These properties help in determining the distance, mass, and many other physical quantities associated with the star, its surrounding environment, and its past history. If the star has not already been catalogued, or if there is doubt about the listed classification, then the classification must be done by taking a spectrum of the star with a spectrograph and comparing it with an atlas of well-studied spectra of bright stars. Fig. 1 shows 13 standard digital spectra from the principal MK spectral types measured with a spectrograph. The range of wavelength (the x axis) is 3900 Å to 4500 Å, and the intensity (the y axis) of each spectrum is normalized, which means that it has been multiplied by a constant so that the spectrum fits into the picture, with a value of 1.0 for the maximum intensity and 0 for no light at all.
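The normalization described above amounts to a one-liner: divide every intensity sample by the spectrum's peak value so that the maximum becomes 1.0 and zero light stays at 0.

```python
def normalize(intensity):
    """Scale a spectrum so its maximum intensity is 1.0 (0 = no light)."""
    peak = max(intensity)
    return [v / peak for v in intensity]
```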
[Figure: thirteen normalized spectra (intensity 0 to 1) plotted over wavelength 3900 Å to 4500 Å, one panel each for spectral types O5V, B0V, B6V, A1V, A5V, F0V, F5V, G0V, G6V, K0V, K5V, M0V, and M5V.]
Fig 1: Examples of Jacoby spectra. Shown here are 13 examples of Jacoby spectra extracted from the original between 3900 Å and 4500 Å and measured with a spectrograph. Normalized intensity image graphs taken as photographs of the same portion of the Jacoby spectra are shown on the lower right of each of the object data graphs. This portion of the Jacoby spectra is free of measurement error and is the same portion used by CLEA. The full Jacoby spectra range from 3510 Å to 7427.2 Å.
The classification of stellar spectra enables the reduction of a large sample of diverse stars to a manageable number of natural groups with similar characteristics. Since the group members presumably have similar physical characteristics, they can be studied as groups, not as isolated stars. By the same token, unusual stars may readily be identified because of their obvious differences from the natural groups. These peculiar stars can then be subjected to intensive study in an attempt to understand the reason for their unusual nature. These exceptions to the rule often help in understanding broad features of the natural groups and provide evolutionary links between the groups.
The Jacoby stellar catalog is the cleanest and most comprehensive data collected in the visible wavelength band to date, and provides a comprehensive basis for modeling and predicting deep sky object spectra. The data for this catalog was taken over 26 different nights between December 1980 and December 1981 with the Intensified Reticon Scanner (IRS) on the No. 1 90 cm telescope located at the Kitt Peak National Observatory in Tucson, Arizona. The IRS is a dual beam spectrophotometer that measures both the sky and the star simultaneously. Data for this catalog was taken in the beam switch mode, where the object is alternately observed through one aperture and then the other. These two apertures were 13 inches in diameter and were separated physically by 61 inches. Jacoby et al.2 documented in their original work that this instrument was under development until February 1981, and that great care was taken to ensure that no fixed pattern noise effects from the instrument were introduced into the measurements.
The spectra for each of the 161 objects published in the Jacoby catalog were measured in three (3) overlapping segments. These segments consisted of: 1) blue, grating No. 35 (600 lines mm-1) in second order with copper sulfate and Schott WG3 blocking filters to cover 3430 or 3500 Å to 4870 or 4950 Å (dependent on the exact instrument setup for the case being observed); 2) green, grating No. 56 (600 lines mm-1) in second order with either a Schott GG455 or Corning 3-75 blocking filter to cover from 4760 to 6220 Å; and 3) red, grating No. 36 (1200 lines mm-1) in first order with either a Schott GG15 or WG2 blocking filter to cover from 6000 to 7450 Å. The profile of the instrument can be represented by a Gaussian function with a full width at half maximum of approximately 4.5 Å (which varied across the spectrum and with seeing variations, but not by more than 10%), and the data was corrected for any noted measurement artifacts.
1.3. Wavelength Features
In order to perform classification, a set of numbers that describe or characterize each of the spectra above must be chosen and developed. These descriptive numbers are called features3. In this work, actual values sampled from the spectra themselves were used as the characterizing features. These features were not chosen in a random or ad hoc manner from the data; they were chosen to be the different line strengths located between 3900 Å and 4500 Å that scientists and astronomers have, over the past 100 years, determined to be descriptors of specific star types. These features are listed in Fig 2.
3900 Start of CLEA file
3934 Ca II (K line), visible A0-A9, strongest G8-K2
3965 He I, visible O6-B9, strongest B0-B2
3968 Ca II (H line), H line (blend) = K line at ~F0, strongest from G8-K2
3970 H I (H Epsilon), H Balmer series, max @ A0
4026 He I, strongest B2
4031 Mn I, compare to 4130 Si II line for discrimination in F stars
4046 Fe I, visible F0 and cooler, strongest K2-K5, discerns G stars
4068 C III, strongest O6-O9
4073 O II, visible O9-B2
4089 Si IV, strongest O6-O9
4097 N III, visible B0-B2
4100 He II, strongest O
4102 H I (H Delta), H Balmer series, max @ A0
4121 He I, strongest B, maximum around B2
4131 Si II, visible B2-A5
4144 Fe I, appears F0 and cooler, strongest K2-K5
4227 Ca I, first visible F0, strengthens to cooler K and M stars
4267 C II, visible B0-B8, strongest at B4
4300 CH & metals (G band), visible F5 in main sequence, G0 in supergiants
4317 O II, visible in O9-B2
4340 H I (H Gamma), H Balmer series, max @ A0
4384 Fe I, strongest K2-K5
4388 He I, maximum around B2
4471 He I, maximum around B2
4472 He I, maximum around B2
4481 Mg II, evident in B2-A5, strongest at A1
4500 End of CLEA file
Fig 2: The 28 wavelength features (in Å) from the Jacoby spectra used in this work, and their physical significance in stellar spectra.
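The paper does not spell out how the 28 intensity values are read off each spectrum; a minimal sketch, assuming simple linear interpolation at each feature wavelength, could look like this:

```python
# The 28 feature wavelengths (in Angstroms) listed in Fig 2.
FEATURES_A = [3900, 3934, 3965, 3968, 3970, 4026, 4031, 4046, 4068, 4073,
              4089, 4097, 4100, 4102, 4121, 4131, 4144, 4227, 4267, 4300,
              4317, 4340, 4384, 4388, 4471, 4472, 4481, 4500]

def sample_features(wavelengths, intensity, features=FEATURES_A):
    """Linearly interpolate the spectrum at each feature wavelength.
    wavelengths must be sorted ascending and bracket every feature."""
    out = []
    for f in features:
        for i in range(len(wavelengths) - 1):
            lo, hi = wavelengths[i], wavelengths[i + 1]
            if lo <= f <= hi:
                t = 0.0 if hi == lo else (f - lo) / (hi - lo)
                out.append(intensity[i] + t * (intensity[i + 1] - intensity[i]))
                break
    return out
```

The interpolation scheme is an assumption made for illustration; any consistent way of reading the normalized intensity at the 28 wavelengths would serve as the feature vector.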
2. CLASSIFICATION METHODS
2.1. Classical Methods
Historically, expert systems, Bayesian classifiers, clustering methods, and neural networks are methods that have been available and used unsuccessfully to classify stellar spectra. In the case of expert systems, the if-then-else rules that must be developed are complex and difficult to test. For the type of decision architecture encompassed by this work, an expert system of if-then-else rules would need on the order of 28 factorial feature combinations times 7 temperature classes times 19 sub-temperature classes times 6 luminosity classes, or on the order of 2x10^32 rules. Even if this expert system provided perfect class separability and classification, the if-then-else rule set would be totally unmanageable and could not adequately be tested for failure modes.
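The order of magnitude of the rule set described above can be checked directly; the product 28! x 7 x 19 x 6 works out to roughly 2.4x10^32:

```python
import math

# 28! feature orderings x 7 temperature x 19 sub-temperature x 6 luminosity classes
rules = math.factorial(28) * 7 * 19 * 6
print(f"{rules:.1e}")  # about 2.4e+32
```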
The benefit of the Bayesian classifier is that it does not have such a large and unwieldy number of rules or features to keep track of and test. However, the Bayesian classifier only uses a statistical model to represent the classes, using the mean and covariance of the features. If a Bayesian classifier were used to model the decision architecture required for this stellar classification problem, a great deal of overlap in the 28 dimensional feature space in mean and covariance would exist. This would result in a large amount of misclassification.
Clustering methods such as K-means are similar to the Bayesian classifier approach in that the 28 dimensional feature space is large, but they provide a better representation of the spectra to be modeled. Instead of using pure mean and covariance, the K-means algorithm uses geometrical distance from the mean of the cluster to determine class membership. However, clustering methods require trying out combinatorial subsets of groupings (basically all combinations) to determine the optimal grouping and class membership, which for this type of data set is computationally infeasible.
Neural networks are universal function approximators that find coefficients for the fit, but the form of the equation that they use is unknown to the trainer. The Rumelhart back propagation neural network, for example, uses the sigmoidal function to represent the transition from layer to layer. In Fig 3, the sigmoidal function would transform at the hidden layer the values from the input layer into those propagated to the output layer. Once the output layer is determined, it is compared with the truth (differences are generated) and the differences are propagated back to the input layer, where the coefficients (called weights) are adjusted to minimize this difference. Neural networks do not allow the user or trainer to extract the equation that the neural network has determined, and the contribution of any one feature to the final outcome is unknown. In addition, neural networks require an a priori unknown amount of training time and have several input parameters (such as convergence and learning rate) that must be tweaked by the user, thereby making them non-robust and ad hoc. These methods have been tried on stellar spectra classification in the past, have demonstrated only marginal improvement over the best Bayes classifiers, and fail on subclass decisions. These networks can also quickly become complex.
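The forward-and-backward pass described above can be sketched for the smallest possible case. The snippet below is a toy one-input, one-hidden-unit, one-output network (no bias terms), not the 28-input architecture of Fig 3: the hidden and output layers apply the sigmoid, the output is compared with the truth, and the difference is propagated back to adjust the weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, truth, w_in, w_out, rate=0.5):
    """One backpropagation step for a 1-input, 1-hidden, 1-output net."""
    h = sigmoid(w_in * x)              # hidden layer activation
    y = sigmoid(w_out * h)             # output layer activation
    err = y - truth                    # compare output with truth
    d_out = err * y * (1 - y)          # propagate difference back ...
    w_out_new = w_out - rate * d_out * h
    d_in = d_out * w_out * h * (1 - h)
    w_in_new = w_in - rate * d_in * x  # ... and adjust the weights
    return w_in_new, w_out_new, err
```

Repeating train_step drives the error toward zero, but the fitted function remains hidden inside the weights, which is exactly the opacity the text objects to.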
[Figure: a three-layer feed-forward network — input layer x1, x2, …, x28; hidden layer N1, N2, …, N28; output layer with one node per temperature class "O", "B", "A", "F", "G", "K", "M".]
Fig 3: Traditional neural network architecture for determining temperature class of stellar objects from the 28 wavelength input features of interest.
2.2. Data Modeling Using the Neural Equation Stellar Classification Architecture
The Neural Equation Stellar Classification architecture is an example of a neural equation based method that uses Data Modeling to generalize multivariate regression and neural networks by deriving high order polynomial equations of database relations using simple low order building blocks. This method is robust and efficient, and is based in part on the Group Method of Data Handling (GMDH) polynomial method developed by Ivakhnenko4,5. Jaenisch6 has taken this process a step further by developing an information processing architecture for generating equations representing the knowledge contained in hybrid neural networks and expert systems. This architecture is defined by the following salient features: 1) the feature selection process is fully autonomous; 2) it generates neural polynomial decision equations; 3) input variables may call external simulations and executables for their value(s); 4) output variables may call external simulations and executables with value(s); 5) output may become input to a new neural equation; 6) expert systems may be linked with neural equations as either input or output; and 7) neural equation modeling is totally open-ended and scaleable. This capability allows multi-variable problems to be represented in a different form from standard linear and non-linear programming formats and provides a new format that is amenable to neural architecture solution methods. Any solution that can be found by a neural architecture can be recast in Ivakhnenko polynomial form and can then be used to generate an objective function directly. This new simple algebraic model can then be optimized using other methods in Data Modeling developed by Jaenisch to provide a fast and efficient model. Fig. 4 shows an example using third order building blocks (labeled as triples) to build a higher order (in this case 27th order) polynomial model.
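The order arithmetic is simple: composing polynomial building blocks multiplies degrees, so three layers of third order triples reach order 3^3 = 27. A single-variable sketch (the actual triples are multivariate) can verify this with plain coefficient lists:

```python
def poly_mul(p, q):
    """Multiply coefficient lists (p[i] = coefficient of x**i)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_compose(p, q):
    """Compute p(q(x)) via Horner's rule on coefficient lists."""
    out = [p[-1]]
    for a in reversed(p[:-1]):
        out = poly_mul(out, q)
        out[0] += a
    return out

cubic = [1.0, 1.0, 0.0, 1.0]          # 1 + x + x^3: a third order building block
layer2 = poly_compose(cubic, cubic)   # degree 3 * 3 = 9
layer3 = poly_compose(cubic, layer2)  # degree 3 * 9 = 27
```

This is why the decision architectures later in the paper report orders of 81, 243, 729, and 6561: each is a power of 3, one factor per layer of triples.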
[Figure: three 3-input Triples — Equation 1 with inputs x1, x3, x5; Equation 2 with inputs x3, x5, x7; Equation 3 with inputs x1, x5, x7 — each input passing through a Normalizer (N). Their outputs feed a fourth Triple (Equation 4), whose Unitized (U) output decides "K" or "M". N = Normalizer, U = Unitizer.]
Fig 4: Neural equation stellar classification architecture example. Here, four different equations (represented as Triples for their type) are used in a decision architecture to distinguish temperature class K from temperature class M. This architecture can be embedded in a higher order architecture that determines temperature classes K and M from other classes such as G.
In this type of equation based neural architecture, the tree structure can continue to expand as the decision architecture becomes more complex. This allows entire branches to be added to the tree if necessary without retraining the entire tree. Also, each individual equation in the architecture can be represented by an entire network of polynomial based equation models, thereby allowing the nesting of decision architectures within other decision architectures. This method is fully adaptive: it is allowed to choose which features are needed, and only uses those features that are necessary to accurately make the correct decision.
3. RESULTS
3.1. Decision Architectures
Using the equation based neural architecture approach, independent decision architectures were constructed for each of the classes to be determined: temperature, sub-temperature, and luminosity. Once these architectures were generated, they were fused together into one overall network that calls each architecture and yields the final full stellar classification. This is shown graphically in Fig. 5, where the example case shown yields G for the temperature class, 2 for the sub-temperature class, and V for the luminosity class. It had been hoped that a fractional subset of the features provided to the algorithm would be chosen as necessary to adequately model the classes, and this was the case for the temperature class, where only 23 of the 28 features were needed. This, however, was not the case with the harder sub-temperature and luminosity classes, where the algorithm independently used all 28 features for its decision architecture.
[Figure: the overall decision architecture — wavelength inputs (in Å) feed three decision architectures whose outputs are fused into the final stellar class (e.g. G2V):
Temperature Class, outcomes (O, B, A, F, G, K, M), Order = 81, inputs 3965, 3968, 4500, 4026, 4121, 4144, 4340, 3900, 3934, 3970, 4046, 4089, 4097, 4100, 4102, 4131, 4227, 4300, 4317, 4384, 4388, 4472, 4481;
Sub-Temperature Class, outcomes (0, 0.5, 1.0, …, 9.5), Order = 729, inputs 4500, 3970, 3934, 3968, 4144, 4046, 4068, 4073, 4100, 4267, 4388, 4481, 4031, 4300, 4472, 3900, 3965, 4089, 4097, 4131, 4227, 4340, 4384, 4026, 4102, 4121, 4317, 4471;
Luminosity Class, outcomes (I, II, III, IV, V, Ve), Order = 6561, inputs 4089, 3968, 3970, 4026, 4097, 4317, 4384, 3934, 4388, 4481, 4068, 4102, 4267, 4500, 4073, 4340, 4472, 3900, 3965, 4046, 4131, 4144, 4227, 4300, 4031, 4121, 4471, 4100.]
Fig 5: Overall decision architecture. Wavelength features used for each class are shown on the left, listed in order of overall contribution, along with the maximum polynomial order for the decision architecture in the center and the possible outcomes on the right. The decision of each of these three architectures is fused together to give the final stellar class, which in this example case is G2V.
If we zoom into each of the ovals shown in Fig. 5, we find the decision architecture for each class decision. The architecture for determining the temperature class is shown below in Fig. 6. In Fig. 6, each oval is represented by an equation architecture, with the order of the equation required to make each decision shown. The decision architecture is read from left to right. The process of using this architecture to determine the temperature class consists of extracting the necessary features for each equation (shown to the left of each ellipse), feeding them into the equation model as inputs, and extracting the output. The output is then compared to the decision boundary shown on the right of the oval. For example, for Equation 1, wavelength features 3968, 4144, 4500, 3965, 4102, 4131, 4472, 4121, and 4100 (all in Å) would be input, and the output compared to the decision boundary (in this case, 0.5). If the output value is less than 0.5, then the temperature class is O; otherwise another equation must be run for further determination (labeled Remaining in Fig. 6). This cascade approach guarantees convergence and learning of all examples used in the learning process, and also provides an excellent decision architecture roadmap for determining the class of an unknown stellar object.
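The left-to-right walk just described can be sketched directly. In the snippet below, each stage is an (equation, threshold, label) triple; the equation callables and feature values are toy stand-ins for the polynomial Data Models, but the control flow matches the figures: the first equation whose output falls below its threshold decides, and the final 'else' label catches everything remaining.

```python
def run_cascade(features, stages, fallback):
    """stages: list of (equation, threshold, label), evaluated in order.
    The first equation whose output is below its threshold decides the
    class; if none fires, the final 'else' label is returned."""
    for equation, threshold, label in stages:
        if equation(features) < threshold:
            return label
    return fallback

# Toy two-stage cascade standing in for Equations 1 and 2 of Fig. 6.
stages = [
    (lambda f: f["3968"], 0.5, "O"),   # stand-in for Equation 1
    (lambda f: f["3934"], 0.5, "B"),   # stand-in for Equation 2
]
```

Because every stage either decides or passes the case along, the cascade always terminates, which is the convergence guarantee the text refers to.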
This same process is repeated for both the sub-temperature and luminosity classes. In the case of the sub-temperature class determination, a larger (in terms of the number of ovals) architecture was required because there were more possible outcomes. The results of this architecture construction are shown in Fig. 7, Fig. 8, and Fig. 9. These figures are connected together so that when the architecture flows off the bottom of Fig. 7, it begins again at the top of Fig. 8, and Fig. 8 flows into Fig. 9 in the same way. The luminosity class decision architecture is shown in Fig. 10 and is read the same as Fig. 6. It should be noted that cases in which the order was high (81 or greater) were considered difficult to learn. These were cases where the equations had to learn very subtle differences; ones that the other techniques described earlier in this work would have failed on.
It should also be noted that in the luminosity class, one oval required an equation of order 6561. This case indicates that more classes possibly need to be created in the stellar catalog to accurately distinguish between stellar luminosities. In addition, 4 cases in the sub-temperature class determination and 2 cases in the luminosity class determination had to be siphoned, meaning that learned examples were removed and a separate equation generated for determining the rest of the class from the remaining classes. These also represent cases that were difficult to learn, and cases where the number of truth classes should be increased and the spectra reassigned.
[Figure: temperature class cascade, read left to right —
Equation 1 (Order = 81; inputs 3968, 4144, 4500, 3965, 4102, 4131, 4472, 4121, 4100): if output < .5 then "O", else Remaining;
Equation 2 (Order = 3; inputs 3934, 3968, 4317): if < .5 then "B", else Remaining;
Equation 3 (Order = 9; inputs 4089, 4500, 3970, 4026, 4340): if < .5 then "A", else Remaining;
Equation 4 (Order = 9; inputs 4300, 4121, 4388, 4384, 4340): if < .4 then "F", else Remaining;
Equation 5 (Order = 9; inputs 4046, 3965, 4481, 4097, 3900): if < .5 then "G", else Remaining;
Equation 6 (Order = 3; inputs 4026, 3965, 4227): if < .5 then "K", else "M".
Annotation: it is difficult to distinguish Class "O" from the other classes because an 81st order polynomial was needed.]
Fig 6: Temperature class decision architecture. The decision architecture is read from left to right. It should be noted that the determination of Class O from the other classes modeled with Equation 1 was a difficult case because an 81st order polynomial was needed.
[Figure: sub-temperature class cascade A, read left to right —
Equation 1A (Order = 243): if output < .50 then "0", else Remaining;
Equation 1B (Order = 729): if < .55 then "0", else Remaining;
Equation 2 (Order = 9): if < .5 then "0.5", else Remaining;
Equation 3A (Order = 9): if < .5 then "1.0", else Remaining;
Equation 3B (Order = 9): if < .5 then "1.0", else Remaining;
Equation 4 (Order = 3): if < .5 then "1.5", else Remaining;
Equation 5A (Order = 81): if < .5 then "2.0", else continue on the next page.
Each equation's wavelength inputs (in Å) are listed beside it in the original figure. Annotations: it is difficult to distinguish class "0" from the other classes because of the high order of both polynomials (1A and 1B); one class had to be separated (siphoned) to learn, indicating a need for another sub-classification; difficult cases are indicated by high order.]
Fig 7: Sub-temperature class decision architecture A. The decision architecture is read from left to right. When leaving Equation 5A without a sub-temperature class decision, proceed to architecture B in Fig 8. It should be noted that Equations 1A, 1B, and 5A, which determine sub-temperature classes 0 and 2, were difficult cases.
[Figure: sub-temperature class cascade B, read left to right (entered from the previous page) —
Equation 5B (Order = 9): if output < .5 then "2.0", else Remaining;
Equation 6 (Order = 9): if < .5 then "2.5", else Remaining;
Equation 7A (Order = 27): if < .4 then "3.0", else Remaining;
Equation 7B (Order = 9): if < .5 then "3.0", else Remaining;
Equation 9 (Order = 243): if < .5 then "4.0", else Remaining;
Equation 10 (Order = 3): if < .5 then "4.5", else Remaining;
Equation 11 (Order = 9): if < .5 then "5.0", else continue on the next page.
Each equation's wavelength inputs (in Å) are listed beside it in the original figure. Annotation: a difficult case is indicated by high order.]
Fig 8: Sub-temperature class decision architecture B. The decision architecture is read from left to right. When leaving Equation 11 without a sub-temperature class decision, proceed to architecture C in Fig 9. It should be noted that Equation 9, determining sub-temperature class 4, was a difficult case.
[Figure: sub-temperature class cascade C, read left to right (entered from the previous page) —
Equation 12 (Order = 3): if output < .5 then "5.5", else Remaining;
Equation 13 (Order = 9): if < .5 then "6.0", else Remaining;
Equation 14 (Order = 3): if < .5 then "6.5", else Remaining;
Equation 15 (Order = 9): if < .5 then "7.0", else Remaining;
Equation 16 (Order = 3): if < .5 then "7.5", else Remaining;
Equation 17 (Order = 3): if < .5 then "8.0", else Remaining;
Equation 18 (Order = 3): if < .5 then "8.5", else Remaining;
Equation 19 (Order = 3): if < .5 then "9.0", else "9.5".
Each equation's wavelength inputs (in Å) are listed beside it in the original figure. Annotation: no difficult cases.]
Fig 9: Sub-temperature class decision architecture C. The decision architecture is read from left to right. It should be noted that there were no difficult cases here.
[Fig 10 diagram: a left-to-right cascade of threshold tests on polynomial outputs (Equations 1A, 1B, 2, 3A, 3B, 4, and 5, orders 3 through 6561, each listing its input wavelength features), assigning luminosity classes I through Ve; the class I and class III branches are annotated "Very difficult class decision!" and "Possibly new Sub-classes needed!".]
Fig 10: Luminosity class decision architecture. Decision architecture is read from left to right. It should be noted that Equations 1A and 1B for class I, and Equations 3A and 3B for class III, were very difficult cases and denote the possibility that new sub-classes are needed.
3.2. Decision Map and Source Code

In addition to constructing the decision architectures above, Data Modeling developed by Jaenisch also provides a method for characterizing the interaction in the decision space between the dominant features. For example, using wavelength 3968 Å and the wavelength feature that is the highest available contributor to the overall class determination (highest bar other than 3968 Å in the temperature, sub-temperature, and luminosity class bar charts in Fig. 11), decision maps such as the ones shown in Fig. 12 below can be constructed. These maps hold all other input features constant while varying the two dominant features. Fig. 12 shows three different slices through the decision space. To construct the maps of Fig. 12, all input features except 3968 Å and 3965 Å were set to their minimum values (left map), middle range values (middle map), and maximum values (right map). Features 3968 Å and 3965 Å were allowed to vary across their entire valid training ranges for each of the three maps. These maps graphically display the complicated boundaries that exist between temperature class decisions, with each class displayed as a different color and labeled in white. Each map represents a single slice through an n-dimensional decision space that is fractal in nature. This is in stark contrast with the classical methods described earlier in this work, which use simple mean and covariance models of circles and ellipses to represent the decision boundaries and overlap regions. Because of the fractal nature of the decision space, it is impossible to generate if-then rules that adequately characterize the complicated decision process and provide 100% classification. Equivalent maps can also be constructed for the sub-temperature and luminosity classes, as well as for the overall decision.
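The slice construction described above can be sketched as follows. Here `classify` is a hypothetical stand-in for the trained Data Model equation network (the real polynomial network is not reproduced), and the fixed feature values are illustrative only.

```python
# Sketch of building one decision-map slice: hold all features fixed,
# vary the two dominant features (3968 A and 3965 A) over a 100x100 grid.
def classify(features):
    # Placeholder decision rule standing in for the trained equation network.
    return "O" if features["3968"] + features["3965"] < 1.0 else "B"

def decision_map_slice(fixed, x_range, y_range, steps=100):
    """Vary 3968 A (x) and 3965 A (y) over their training ranges,
    holding every other feature at the values given in `fixed`."""
    grid = []
    for iy in range(steps):
        y = y_range[0] + (y_range[1] - y_range[0]) * iy / (steps - 1)
        row = []
        for ix in range(steps):
            x = x_range[0] + (x_range[1] - x_range[0]) * ix / (steps - 1)
            f = dict(fixed)
            f["3968"], f["3965"] = x, y
            row.append(classify(f))
        grid.append(row)
    return grid

# One slice with all other features pinned (here a single illustrative one).
slice_min = decision_map_slice({"4026": 0.1}, (0.0502, 0.8651), (0.0567, 0.9086))
```

Plotting each grid cell's class label as a color reproduces the kind of map shown in Fig. 12.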
In the Jacoby data sets, only 161 stellar spectra were measured, and repetition in overall stellar classes exists. In all, there are more than 1000 possible classifications of stellar objects available as outcomes to the equation model, and the equation model was generated using all available data, representing only about 10% of the possible classification outcomes. This equation model network was built with only a subset of the possible outcomes and is allowed to generalize to determine if a new object is of a class that has not yet been measured.
[Fig 11 charts: four bar charts (Temperature Class, Sub-Temperature Class, Luminosity Class, and Total Usage) showing the usage counts of the wavelength features from 3900 Å to 4500 Å in each class decision.]
Fig 11: Summary of feature usage by class decision and total. Features that are used more often are more significant contributors to each decision. Wavelength 3968 Å was the most significant contributor.
Fig 12: Decision maps for the temperature class. The graph on the left was constructed by holding the values of all wavelength features except 3968 Å (x-axis) and 3965 Å (y-axis) at their minimum values and allowing the two features of interest to vary. The center graph uses the mid range values of each of these features, and the right graph the maximum values of these features. In each graph, 3968 Å (ranging from 0.0502 to 0.8651) and 3965 Å (ranging from 0.0567 to 0.9086) are varied from the min to the max of their ranges in 100 equal steps. It should be noted that each map represents a single slice through an n-dimensional decision space that is fractal in nature.
To determine the sensitivity of the equation architecture to sensor measurement errors, noised signatures were generated. These noised signatures were created by adding up to one standard deviation (one sigma) of random error to the original Jacoby spectra. Wavelength features were then extracted, and these features were constrained to be within the training boundaries of the Data Modeling equations. The Data Modeling classifications were then sorted into ascending order independently for temperature, sub-temperature, and luminosity classes. These results are shown in Fig. 13. Fig 13 shows each outcome as a separate row, with the first row across the top of each map representing the true Jacoby spectra and 100% correct classification. Individual stars can be traced down the map to determine how classification proceeded as more noise was added to the measured spectra. It should be noted that for temperature class, as more noise was added, class O looked more like B, and B more like O, and K and M also both tended to look like O. For sub-temperature classes, all classes tended to look like 0 and 0.5 as more noise was added. And for luminosity, almost all stars looked like either I or III with increased noise. Also included below in Fig. 14 is an example of an equation generated with the method developed by Jaenisch, in this case the equation that determines the O temperature class from all other temperature classes.
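The noising procedure described above can be sketched as follows. The `classify` function is a hypothetical stand-in for the Data Model equation network, and the spectrum values, sigma, and clamp range are illustrative assumptions.

```python
# Hedged sketch of the noise-sensitivity test: add up to one sigma of random
# error to a spectrum, clamp the features to the training range, reclassify.
import random

def add_noise(spectrum, sigma, rng):
    # Perturb each sample by up to +/- sigma of random error.
    return [v + rng.uniform(-sigma, sigma) for v in spectrum]

def clamp(features, lo, hi):
    # Constrain features to the training boundaries of the equations.
    return [min(max(v, lo), hi) for v in features]

def classify(features):
    # Placeholder classifier standing in for the polynomial network.
    return "O" if sum(features) / len(features) < 0.5 else "B"

rng = random.Random(42)
spectrum = [0.2, 0.4, 0.3, 0.25]           # illustrative feature values
noised = clamp(add_noise(spectrum, sigma=0.1, rng=rng), 0.0, 1.0)
label = classify(noised)
```

Repeating this over increasing sigma levels and sorting the resulting labels row by row yields maps like those in Fig. 13.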
Fig 13: Misclassification sensitivity to increasing noise. In the temperature class, class O tended to look like B, B like O, and both K and M like O with increased noise. For sub-temperature, almost all stars looked like 0 or 0.5, and for luminosity, most stars converged to either I or III with increasing noise.
l1o3 = 2.1688 - 0.3521*t4500 + 2.038*t3965 - 1.743*t4144
l1o3 = l1o3 - 0.8056*t4500*t4500 + 0.1852*t3965*t3965
l1o3 = l1o3 + 0.3188*t4144*t4144 + 0.323*t4500*t3965
l1o3 = l1o3 + 0.6292*t4500*t4144 - 1.316*t3965*t4144
l1o3 = l1o3 + 0.4574*t4500*t3965*t4144 - 0.1163*t4500*t4500*t4500
l1o3 = l1o3 + 0.0236*t3965*t3965*t3965 + 0.0526*t4144*t4144*t4144
l1o3 = l1o3 - 1.2608*t3965*t4500*t4500 + 0.5746*t4500*t3965*t3965
l1o3 = l1o3 - 0.1599*t4500*t4144*t4144 + 0.4705*t4144*t4500*t4500
l1o3 = l1o3 - 0.0637*t4144*t3965*t3965 - 0.0888*t3965*t4144*t4144
l1o2 = 2.1356 + 1.7757*t3968 - 0.5989*t4472 - 1.3982*t4144
l1o2 = l1o2 - 0.0689*t3968*t3968 - 0.8957*t4472*t4472
l1o2 = l1o2 + 0.0356*t4144*t4144 - 0.1635*t3968*t4472
l1o2 = l1o2 - 1.4428*t3968*t4144 + 1.172*t4472*t4144
l1o2 = l1o2 + 1.1471*t3968*t4472*t4144 - 0.1052*t3968*t3968*t3968
l1o2 = l1o2 + 0.1781*t4472*t4472*t4472 + 0.0473*t4144*t4144*t4144
l1o2 = l1o2 + 0.2106*t4472*t3968*t3968 - 1.016*t3968*t4472*t4472
l1o2 = l1o2 - 0.1649*t3968*t4144*t4144 - 0.2573*t4144*t3968*t3968
l1o2 = l1o2 + 0.1521*t4144*t4472*t4472 - 0.1863*t4472*t4144*t4144
l2o3 = 0.6373 + 0.2525*l1o3 - 0.1045*t4100 + 0.0243*l1o2
l2o3 = l2o3 - 0.7416*l1o3*l1o3 + 0.0463*t4100*t4100
l2o3 = l2o3 - 0.2711*l1o2*l1o2 - 0.0633*l1o3*t4100
l2o3 = l2o3 + 0.5585*l1o3*l1o2 + 0.1682*t4100*l1o2
l2o3 = l2o3 - 2.006*l1o3*t4100*l1o2 + 0.085*l1o3*l1o3*l1o3
l2o3 = l2o3 + 0.0023*t4100*t4100*t4100 + 0.3119*l1o2*l1o2*l1o2
l2o3 = l2o3 + 1.0911*t4100*l1o3*l1o3 + 0.0022*l1o3*t4100*t4100
l2o3 = l2o3 - 0.4909*l1o3*l1o2*l1o2 + 0.0063*l1o2*l1o3*l1o3
l2o3 = l2o3 - 0.145*l1o2*t4100*t4100 + 0.7645*t4100*l1o2*l1o2
l3o2 = 0.1866 + 1.1442*l2o3 + 0.1711*t3968 - 0.2543*t4121
l3o2 = l3o2 - 0.6571*l2o3*l2o3 - 0.0781*t3968*t3968
l3o2 = l3o2 + 0.3587*t4121*t4121 - 0.571*l2o3*t3968
l3o2 = l3o2 - 0.0396*l2o3*t4121 - 0.896*t3968*t4121
l3o2 = l3o2 + 1.7283*l2o3*t3968*t4121 - 0.0585*l2o3*l2o3*l2o3
l3o2 = l3o2 + 0.0728*t3968*t3968*t3968 - 0.0041*t4121*t4121*t4121
l3o2 = l3o2 + 0.0751*t3968*l2o3*l2o3 + 0.0006*l2o3*t3968*t3968
l3o2 = l3o2 - 0.6821*l2o3*t4121*t4121 + 0.7264*t4121*l2o3*l2o3
l3o2 = l3o2 + 0.1007*t4121*t3968*t3968 - 0.0401*t3968*t4121*t4121
l2o2 = 0.6221 + 0.7539*l1o3 - 0.0836*t4131 - 0.4128*l1o2
l2o2 = l2o2 - 0.5396*l1o3*l1o3 - 0.1235*t4131*t4131
l2o2 = l2o2 + 0.0406*l1o2*l1o2 - 0.7962*l1o3*t4131
l2o2 = l2o2 + 0.0073*l1o3*l1o2 + 0.986*t4131*l1o2
l2o2 = l2o2 - 1.9057*l1o3*t4131*l1o2 - 0.2229*l1o3*l1o3*l1o3
l2o2 = l2o2 - 0.0107*t4131*t4131*t4131 + 0.3671*l1o2*l1o2*l1o2
l2o2 = l2o2 + 1.4539*t4131*l1o3*l1o3 + 0.0094*l1o3*t4131*t4131
l2o2 = l2o2 - 0.9617*l1o3*l1o2*l1o2 + 0.7312*l1o2*l1o3*l1o3
l2o2 = l2o2 + 0.1344*l1o2*t4131*t4131 + 0.3263*t4131*l1o2*l1o2
l2o1 = 0.4775 + 0.3633*l1o3 + 0.0247*t4102 + 0.1438*l1o2
l2o1 = l2o1 - 0.456*l1o3*l1o3 + 0.0627*t4102*t4102
l2o1 = l2o1 - 0.4008*l1o2*l1o2 - 0.1214*l1o3*t4102
l2o1 = l2o1 + 0.4992*l1o3*l1o2 + 0.0815*t4102*l1o2
l2o1 = l2o1 - 0.9426*l1o3*t4102*l1o2 + 0.085*l1o3*l1o3*l1o3
l2o1 = l2o1 + 0.0094*t4102*t4102*t4102 + 0.0038*l1o2*l1o2*l1o2
l2o1 = l2o1 + 0.4241*t4102*l1o3*l1o3 - 0.0709*l1o3*t4102*t4102
l2o1 = l2o1 + 0.0891*l1o3*l1o2*l1o2 - 0.2419*l1o2*l1o3*l1o3
l2o1 = l2o1 - 0.05*l1o2*t4102*t4102 + 0.3658*t4102*l1o2*l1o2
l3o3 = 0.6049 + 0.8283*l2o3 + 0.5852*l2o2 - 1.0118*l2o1
l3o3 = l3o3 - 0.0688*l2o3*l2o3 - 0.239*l2o2*l2o2
l3o3 = l3o3 - 0.3041*l2o1*l2o1 - 1.9498*l2o3*l2o2
l3o3 = l3o3 + 0.7738*l2o3*l2o1 + 1.0837*l2o2*l2o1
l3o3 = l3o3 - 0.0248*l2o3*l2o2*l2o1 - 0.4811*l2o3*l2o3*l2o3
l3o3 = l3o3 - 1.0575*l2o2*l2o2*l2o2 + 1.4796*l2o1*l2o1*l2o1
l3o3 = l3o3 + 1.8239*l2o2*l2o3*l2o3 - 2.1622*l2o3*l2o2*l2o2
l3o3 = l3o3 - 0.5003*l2o3*l2o1*l2o1 + 0.0719*l2o1*l2o3*l2o3
l3o3 = l3o3 + 4.5777*l2o1*l2o2*l2o2 - 3.8468*l2o2*l2o1*l2o1
l4o3 = 0.4485 + 0.0845*l3o2 - 0.196*t4500 + 0.562*l3o3
l4o3 = l4o3 + 0.0271*l3o2*l3o2 - 0.1464*t4500*t4500
l4o3 = l4o3 - 1.0001*l3o3*l3o3 - 0.9481*l3o2*t4500
l4o3 = l4o3 + 0.2703*l3o2*l3o3 + 1.3307*t4500*l3o3
l4o3 = l4o3 + 0.0357*l3o2*t4500*l3o3 + 0.3315*l3o2*l3o2*l3o2
l4o3 = l4o3 + 0.0062*t4500*t4500*t4500 + 0.2246*l3o3*l3o3*l3o3
l4o3 = l4o3 + 0.9*t4500*l3o2*l3o2 - 0.1734*l3o2*t4500*t4500
l4o3 = l4o3 + 0.3387*l3o2*l3o3*l3o3 - 0.9927*l3o3*l3o2*l3o2
l4o3 = l4o3 + 0.4172*l3o3*t4500*t4500 - 1.0498*t4500*l3o3*l3o3
Class = l4o3*0.1665 + 0.8056
if Class < 0.1 then Class = 0.1
if Class > 0.9 then Class = 0.9
Fig 14: Example source code from Data Modeling to determine O temperature class from other temperature classes.
4. FUTURE WORK
Although only the full clean portion of the Jacoby spectra was analyzed in this work, this technique should also be amenable to processing across the full spectra and provide 100% classification using only a subset of the wavelength features available to the process. The authors looked at using the traditional UBV regions for any performance improvement, but there was no discernible improvement using these. There may be wavelength regions in these spectra that lie outside of the traditional UBV regions that would provide performance enhancement, and these should be looked at further. The complexity of the classifier could be reduced if the number of wavelengths required for classification were reduced. This may be possible by using wavelength feature ratios that combine multiple wavelengths together instead of using the raw spectra numbers directly.
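The feature-ratio idea above can be sketched as follows; the wavelength pairs and values are illustrative assumptions, not ratios proposed in this work.

```python
# Sketch of wavelength feature ratios: combine pairs of wavelength
# measurements into ratio features instead of raw intensities.
def ratio_features(spectrum, pairs, eps=1e-9):
    """Build numerator/denominator ratios from named wavelength features.
    `eps` guards against division by zero for faint features."""
    return {f"{a}/{b}": spectrum[a] / (spectrum[b] + eps) for a, b in pairs}

# Illustrative feature values at three wavelengths (Angstroms).
spec = {"3968": 0.40, "3965": 0.20, "4500": 0.10}
ratios = ratio_features(spec, [("3968", "3965"), ("3968", "4500")])
```

Two ratios here replace three raw features, which is the kind of reduction in classifier input count the text suggests exploring.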
5. CONCLUSIONS
Until now, no classification method applied to stellar spectra has given 100% classification in a closed form solution. Data Modeling has given both a closed form solution for stellar classification (albeit a high order solution in some cases) and 100% correct classification on the temperature class, sub-temperature class, and luminosity for the 161 Jacoby spectra. This method independently found that all 28 available features were needed for classification, and that wavelength 3968 Å (Ca II H line, H line (blend)) was the most significant wavelength contributor to classification. In the temperature class, class O was the hardest to distinguish from the other classes and required an 81st order polynomial to do so. In the sub-temperature class, 0 was the hardest to classify and required a 729th order polynomial, and in the luminosity class, III was the hardest and required a 6561st order polynomial. In addition, siphoning was required on 4 classes in the sub-temperature cases and on 2 classes in the luminosity, which indicates that additional stellar sub-classes are needed in stellar catalogs. This is readily apparent in the luminosity class, where both a 6561st order polynomial and siphoning were required to distinguish class III from IV, V, and Ve. Classes II, IV, and Ve in the luminosity class had fewer samples than the other luminosity classes in the Jacoby spectra, which unfairly constrains the feature dynamic ranges for those classes. More stellar spectra are needed for these classes. The classical UBV wavebands were tried by the authors and did not yield a less complicated solution than the one presented here. Use of this Data Model equation to determine star classification from discrete wavelength measurements, along with the observation of its visual magnitude, makes available to the astronomer a direct estimate of the distance to the star.
ACKNOWLEDGMENTS
This stellar spectra classification project was originally begun to partially fulfill the requirements for the completion of a Masters of Science in Astronomy at the University of Western Sydney. The authors would like to thank the following for their contributions to this effort: Professor A. Hons (email: [email protected]) and Professor P. Jones (email: [email protected]), University of Western Sydney (UWS); G. Snyder, Contemporary Laboratory Experiments in Astronomy (CLEA) (website: http://www.gettysburg.edu/academics/physics/clea/CLEAhome.html, email: [email protected]); and C. Case, C. Songy, and J. Handley, Sparta, Inc., for their support and encouragement.
REFERENCES
1) Project CLEA Home Page, Contemporary Laboratory Experiences in Astronomy, University of Gettysburg, Gettysburg, PA, http://www.gettysburg.edu/academics/physics/clea/CLEAhome.html, 25 October 2001.
2) Jacoby, G.H., Hunter, D.A. and C. A. Christian. "A Library of Stellar Spectra", Astrophysical Journal Supplement Series, 56:257-281, October 1984.
3) Fukunaga, K. "Introduction to Statistical Pattern Recognition", 2nd ed., Boston, MA: Academic Press, c1990.
4) Ivakhnenko, G., GMDH: Group Method of Data Handling for complex systems modeling and forecasting, Cybernetic Center, Ukraina, http://www.inf.kiev.ua/GMDH-home, December 1998.
5) Madala, H.R. and A.G. Ivakhnenko. Inductive Learning Algorithms for Complex Systems Modeling, Boca Raton, FL: CRC Press, c1994.
6) Jaenisch, H.M. "Neural Equation Stellar Classification", University of Western Sydney, Master's Thesis, Sydney, Australia, October 2001.
Enabling Unattended Data Logging and Publication by Data Model
Change Detection and Environmental Awareness
Holger M. Jaenisch*
James Cook University, Townsville QLD 4811, Australia
ABSTRACT
This paper presents a novel self-initializing algorithm using Change Detection to achieve self-awareness of unusual
conditions without a priori modeling assumptions. Deviations from baseline nominal conditions yield a tip-off and the
variation off baseline indicates a novelty to be logged for publication. Incremental processing of the data log enables
common transients to be ignored and viewed as nominal. In this framework, only second pass novelties invoke enough
interest for publication. The mathematical methods for enabling this exploit both classical control theory transfer
functions to model the environment and O(3n) Volterra series type polynomials as an innovative change detection
method without explicit modeling.
Keywords: Formal V&V Analysis, Data Modeling, Agent, UAV, UMV, Deep Space Probes, Transfer Function Modeling,
Automatic Proofing, Automatic Differential Equation Derivation, Ezekiel
1. INTRODUCTION
This is a new and novel self-initializing algorithm1 based on classical methods (both control and Fourier theory) applied
in a completely unique way. It can be deployed to watch any type of scene or environment and will determine what is
normal or out of the ordinary simply from monitoring small regions of the entire scene. Whenever something new or
unusual occurs, it is able to record the scene and concurrently update its internal model of what it has seen and
experienced. The algorithm continues to collect samples of novel scenes representing unusual changes as they are
encountered. Once filled to capacity, it automatically offloads what is learned and maintains a memory trace for
continuity. This paper presents the details of this algorithm, including its application to novel imagery from Mars,
Mercury, Asteroid Ida, the moon Titan, and Earth.
2. EZEKIEL CONCEPT
Environmental scene awareness is enabled with a novel Artificially Intelligent Algorithm called Ezekiel1. Notionally, various environmental math models are stored in slide carousel fashion and are retrievable by indexing on similarity of present conditions to ones encountered in the past. The carousel of environment models coupled with carousels of encountered variations is notionally illustrated in Figure 1.
Fig. 1. Illustrations of the Ezekiel concept.
Ezekiel captures linear models of sensor input and corresponding interpreted output as a classical control-theory transfer
function (TF) model. Using the notional concept of a carousel slide projector, each new linear model created by Ezekiel
is stored as a “slide” in the carousel. As the agent scans its surroundings, it compares mathematically the new sensor
input to the linear model TF slides it has in memory (slide carousel indexed on numerical similarity).
*[email protected]; phone 1 256 337-3768
If the new sensor input does not numerically resemble any of the existing TF slides, the two closest matches are used to
derive a virtual slide for comparison. If this fails to yield a match, the new data is added as a new slide and the collection
re-indexed based on mathematical similarity. However, if similarity was discovered immediately upon inspection, then
the second carousel of known variations is activated and equivalence checked. If a match occurs the sample is ignored. If
no match occurs, the variation carousels associated with the two closest TF matches are equivalence checked; if still no match, the new data is added as a new known-variant and placed in the carousel. The variation carousel is then updated
and re-indexed.
In addition to the baseline transfer function model, each TF slide has another entire carousel of variations associated with
it (wheel-within-wheel). In this fashion, Ezekiel continues to scan its environment building a representational model and
belief system of its surroundings. When its slide capacity is reached, it offloads the data for telemetry and saves a small percentage of the slide data to provide continuity with previous experience. Ezekiel is now able to store new novelties as they occur. This empowers Ezekiel to construct linear system TF models in a fully adaptive piecewise fashion, enabling
highly nonlinear regions to be automatically captured with higher density linear models and linear regions to be
represented adaptively with sparse models.
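The carousel lookup described above can be sketched as follows. The similarity measure, thresholds, and two-slide virtual averaging are illustrative assumptions standing in for the numerical indexing Ezekiel performs.

```python
# Notional sketch of carousel matching: try a direct match, then a virtual
# slide averaged from the two closest matches, else add a new slide.
def similarity(a, b):
    # Simple inverse-distance similarity between feature vectors.
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))

def match_or_add(carousel, new, threshold=0.9):
    scores = sorted(((similarity(s, new), s) for s in carousel), reverse=True)
    if scores and scores[0][0] >= threshold:
        return "match", scores[0][1]
    if len(scores) >= 2:
        # Virtual slide derived from the two closest matches.
        virtual = [(u + v) / 2 for u, v in zip(scores[0][1], scores[1][1])]
        if similarity(virtual, new) >= threshold:
            return "virtual", virtual
    carousel.append(new)   # re-indexing on similarity omitted in this sketch
    return "added", new

carousel = [[0.0, 0.0], [1.0, 1.0]]
kind, _ = match_or_add(carousel, [0.5, 0.5])   # midway between two slides
```

Here the new input matches neither stored slide directly but is covered by the virtual slide, so nothing new is added.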
3. THEORY
Data Modeling1-2 is a unique term coined to describe the process of deriving mathematical equations as models from
simulated or measured data. What follows is a specific embodiment that achieves Data Modeling using classical control
and Fourier theory. The final differential equations generated by the process are generically termed Data Models.
Classic transfer functions3-4 are defined using the raw input and output data from a plant (in this case scene of interest).
The z transform of the input and output data is defined as
X(z) = 1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_n z^{-n}
Y(z) = b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_n z^{-n}    (1)
In this form, the classic transfer function is formed by
H(z) = \frac{Y(z)}{X(z)} = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_n z^{-n}}{1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_n z^{-n}}    (2)
where X(z) defines the characteristic polynomial of the system, and whose roots (poles) define the sensitivity of the
system. This represents the TF for a single input variable with multiple realizations resulting in a single output variable
of multiple associated variations. For multiple input/output variable cases, the transfer function is
H(z) = \begin{bmatrix} H_{11}(z) & H_{12}(z) & \cdots & H_{1n}(z) \\ H_{21}(z) & H_{22}(z) & \cdots & H_{2n}(z) \\ \vdots & \vdots & \ddots & \vdots \\ H_{n1}(z) & H_{n2}(z) & \cdots & H_{nn}(z) \end{bmatrix}    (3)
where each resultant output is a linear combination of all inputs. The transfer function from input j to output i is Hij(z).
This matrix captures the influence of each individual input element to all outputs.
This process can be greatly simplified by adopting a unique and novel interpretation or perspective of definitions. By
taking a vector of multiple simultaneous inputs and an equal length vector of multiple simultaneous outputs (MIMO) and
transposing them (transposing column vectors to rows), the result is simplified to an equivalent single input/output
system (SISO) where the multiple realizations of the single input value are actually the single simultaneous input values.
This variable transfiguration (transpose followed by new interpretation of resulting vector) is represented by
X(z) = \begin{bmatrix} x_{1,i} \\ x_{2,i} \\ \vdots \\ x_{N,i} \end{bmatrix} = \begin{bmatrix} 42.2 \\ 19.63 \\ -42.0 \end{bmatrix} \quad\xrightarrow{\;\Im\;}\quad X(z) = \left[\, x_{1,i} \;\; x_{1,(i+1)} \;\; \cdots \;\; x_{1,(i+N)} \,\right] = \left[\, 42.2 \;\; 19.63 \;\; -42.0 \,\right]    (4)

(N inputs, single realization → single virtual input, N realizations)
Once the input vector is transfigured, the process in (4) is repeated for the output vector. Now a virtual single input
single output transfer function is defined following the form in (2) above.
A MIMO system has now been transfigured into a virtual SISO form that is readily converted into differential equation
form as follows
y[iT] = a_1 y[(i-1)T] + a_2 y[(i-2)T] + \cdots + a_n y[(i-n)T] + b_0 x[iT] + b_1 x[(i-1)T] + \cdots + b_n x[(i-n)T]    (5)
This form represents a simple differential equation model derived from a single input/output exemplar, where the powers
of the z transform terms are converted into lags of the historical data (both input and previous output).
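A minimal sketch of evaluating the difference-equation form of Eq. (5) follows: each new output is a weighted sum of previous outputs and current and lagged inputs. The coefficients here are illustrative, not taken from this work.

```python
# One update of Eq. (5): y[i] depends on lagged outputs and inputs.
def step(a, b, y_prev, x_curr):
    """y_prev holds past outputs (most recent last);
    x_curr holds inputs with the current input last."""
    y = sum(a[k] * y_prev[-(k + 1)] for k in range(len(a)))   # output lags
    y += sum(b[k] * x_curr[-(k + 1)] for k in range(len(b)))  # input terms
    return y

# First-order example: y[i] = 0.5*y[i-1] + 1.0*x[i], driven by an impulse.
ys = [0.0]
xs = [0.0]
for x in [1.0, 0.0, 0.0]:
    xs.append(x)
    ys.append(step([0.5], [1.0], ys, xs))
# impulse response: 1.0, 0.5, 0.25
```

The bookkeeping of n lagged inputs and n-1 previous outputs visible here is exactly what the text notes becomes unwieldy as the order grows.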
If the number of I/O terms is large, this form becomes unwieldy because of the need to keep track of n inputs and n-1
previous outputs. And if this differential equation model is stored and used as a transfer function, the required precision
to make the calculations rises rapidly as the order of the denominator increases, which is directly dependent on the
n-1 previous outputs here.
Both of these problems are mitigated by recasting the TF and ultimately the differential equation form into Fourier
frequency space. This is easily accomplished using the classic z transform evaluated at z = e^{i\omega}, which is the same as the discrete Fourier transform (DFT) defined by

C_n = \frac{1}{N} \sum_{k} c_k \exp\!\left( \frac{i 2\pi k n}{N} \right)    (6)
Therefore, the TF itself can be easily represented in frequency space using the fast Fourier transform (FFT) to determine
the Fourier series of the input and output sequences. Classically, these two Fourier series would be combined into a TF
given by
y(x) = \sum_{j=1}^{n} \left( A_j \cos \omega_j x + B_j \sin \omega_j x \right), \qquad TF = \frac{FFT(Output)}{FFT(Input)}    (7)
The Fourier series is a lengthy representation that includes many alias terms due to frequency folding. A simplified form
identifies the dominant frequency components or eigenmodes, labeling them as lambda terms after zeroing all other
frequency locations.
y(x) \cong f(\lambda_1, \lambda_2, \ldots, \lambda_N) = \sum_{j=1}^{N} \lambda_j \left( A_j \cos(2\pi f_j x) + B_j \sin(2\pi f_j x) \right), \qquad \lambda_j = \begin{cases} 1 & \text{if dominant} \\ 0 & \text{otherwise} \end{cases}    (8)

with each retained lambda term given by \lambda_j = A_j \cos \omega_j x + B_j \sin \omega_j x.
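The eigenmode selection of Eqs. (6) through (8) can be illustrated with a direct DFT and a keep-the-largest filter. This is a pure-Python stand-in for the FFT used in the text; the number of retained modes is an illustrative assumption.

```python
# Hedged illustration of Eqs. (6)-(8): take a DFT, then zero all but the
# dominant frequency components (the "lambda terms").
import cmath
import math

def dft(x):
    # Direct O(N^2) discrete Fourier transform.
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / N) for k in range(N))
            for j in range(N)]

def dominant_modes(coeffs, keep=2):
    """Keep the `keep` largest-magnitude coefficients, zero the rest."""
    ranked = sorted(range(len(coeffs)), key=lambda j: -abs(coeffs[j]))
    kept = set(ranked[:keep])
    return [c if j in kept else 0.0 for j, c in enumerate(coeffs)]

signal = [1.0, 0.0, -1.0, 0.0]        # a single cosine at bin 1
modes = dominant_modes(dft(signal), keep=2)
```

For this signal only the two conjugate bins carrying the cosine survive; everything else is zeroed, which is the sparsification Eq. (8) describes.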
The Ezekiel Transfer Function (ETF) is now formed from these lambda terms
\lambda ETF = \frac{FFT(PixelValues)_\lambda}{FFT(Heptor)_\lambda}    (9)
where an individual λETF is easily converted into differential form as
\frac{dA}{df} - \sum_{i=1}^{n_{slides}} \left[ \frac{y_{i+1} - y_{i-1}}{f_{i+1} - f_{i-1}} - \frac{1}{2} \left( \frac{y_{i+1} - y_i}{f_{i+1} - f_i} + \frac{y_i - y_{i-1}}{f_i - f_{i-1}} \right) \right] = 0    (10)
where y is the integral of the real part of the ETF given by ΣA. Likewise, the expression in (10) must also be repeated for
the imaginary part of the ETF.
Since each ETF is a linear model of a vastly nonlinear scene, a series of ETFs are saved in piecewise linear fashion to
accommodate such behavior. For this reason, the subscript i is added to (9)
\left( \lambda ETF \right)_i = \left( \frac{FFT(PixelValues)_\lambda}{FFT(Heptor)_\lambda} \right)_i    (11)
The collection of ETFs comprises a carousel, with each ETF occupying a slot. The matrices of remaining values after eigenfunction modeling result in the following (i x k) matrices, where i is the number of ETFs and k is the number of non-zero rows remaining in the U (real) or V (imaginary) matrix. A notional U matrix of real values is given by
U = \begin{bmatrix} 0 & \lambda_{1,2} & \lambda_{1,3} & 0 & 0 \\ \lambda_{2,1} & 0 & \lambda_{2,3} & 0 & 0 \\ 0 & \lambda_{3,2} & \lambda_{3,3} & 0 & \lambda_{3,5} \\ 0 & 0 & \lambda_{4,3} & \lambda_{4,4} & 0 \\ 0 & 0 & 0 & \lambda_{5,4} & 0 \end{bmatrix}    (12)
where the real portion of the eigenmodels of the ith ETF is located in the ith column of the U matrix. A second matrix V
like this one is also generated for the imaginary terms.
Next a differential equation model is derived for each λ order across the λETFs by transposing the A matrix and B
matrix and repeating the differential equation construction process in (10) above. It should be noted that the λETFs are in
an order of ranked similarity which will be explained in the next section. This results in an analogous derivation of the
Volterra type functional differential equation (a differential equation whose variables are themselves either differential or
integral equations). In this case, we call this unique form the Ezekiel Integro-Differential equation given as
$$ \frac{d}{df}\,\lambda ETF(\aleph) - \int_{\aleph_{n-1}}^{\aleph_n} \cdots \int_{\aleph_0}^{\aleph_1} \lambda_k(f)\, df \cdots df = 0 \qquad (13) $$
where ℵ is the similarity FOM defined as the unique characterization of each ETF and described in detail in the next section. The Appendix contains a numerical implementation for evaluating (13).
4. EZEKIEL IMPLEMENTATION
The Ezekiel transfer function is defined using pixel values as Output and a vector of 7 statistics (coined the Heptor) describing the pixel neighborhood as Input. These 7 statistics are the standard deviation (m2), skewness (m3), kurtosis (m4), m6, m8, DH (a novel fractal statistic comparing shortest path to actual path), and DJ (a novel fractal statistic comparing the range of the data to its standard deviation). This process is depicted in Figure 2.
Fig. 2. Heptor Statistics Calculation. A 7 x 7 neighborhood is pulled out of the overall image (top) and is shown on the bottom left. This neighborhood is enlarged in the second bottom frame, and then shown as the actual image values in the third bottom frame. Statistics are calculated on these 49 numbers, resulting in the Heptor stats shown on the bottom right.
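The Heptor calculation can be sketched as follows. The moment statistics are standard; DH and DJ are the author's novel fractal statistics and are only described verbally in the text, so the formulas below for them are assumptions based on those descriptions:

```python
import numpy as np

def heptor(neighborhood):
    """Sketch of the 7-element Heptor for a pixel neighborhood.
    m3..m8 are standardized central moments in assumed excess form;
    DH and DJ are stand-ins built only from their verbal descriptions
    (shortest vs actual path; range vs standard deviation)."""
    x = np.asarray(neighborhood, dtype=float).ravel()
    mu, sd = x.mean(), x.std()
    z = (x - mu) / sd
    m2 = sd                       # reported as the standard deviation
    m3 = np.mean(z ** 3)          # skewness
    m4 = np.mean(z ** 4) - 3.0    # excess kurtosis (assumed convention)
    m6 = np.mean(z ** 6) - 15.0   # assumed excess form for m6
    m8 = np.mean(z ** 8) - 105.0  # assumed excess form for m8
    # DH: shortest path (endpoint chord) vs actual path length (assumption).
    path = np.sum(np.abs(np.diff(x)))
    chord = abs(x[-1] - x[0])
    dh = np.log(path + 1.0) / np.log(chord + 1.0) if chord > 0 else 1.0
    # DJ: range of the data vs its standard deviation (assumption).
    dj = (x.max() - x.min()) / sd
    return np.array([m2, m3, m4, m6, m8, dj, dh])

patch = np.arange(49).reshape(7, 7)   # toy 7 x 7 neighborhood
print(heptor(patch).shape)
```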
Because each ETF is only valid over a short range of values, it is necessary to identify which ETF in the carousel most closely matches a new input. Since the current input will always fall between two (2) existing ETFs in the carousel, a new virtual ETF is formed by averaging the eigenmodes in the neighboring ETFs, weighted by proximity to the new input. This is done with (13).
ETF identification uses a novel type of Formal V&V (symbolic V&V) which treats the existing ETFs as “gold standard”
software models and new incoming data as test software. A mathematical proof of congruence between the two is
accomplished using a novel similarity FOM5. This is calculated using each input component of the ETF, enabling ETF
sorting by structural similarity.
$$ Similarity\ FOM = \aleph = Sign(A \circ B)\, \cos\!\left( 2\pi (A \circ B) \right) \qquad (14) $$
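A sketch of (14), interpreting A∘B as a normalized dot product of the two vectors being compared (an assumption; the Appendix CALCSIM routine instead forms a sum of real-imaginary spectral products):

```python
import numpy as np

def similarity_fom(vec_a, vec_b):
    """Similarity FOM per (14): aleph = Sign(A o B) * cos(2*pi*(A o B)).
    A o B is taken here as the normalized dot product of the inputs."""
    a = np.asarray(vec_a, float)
    b = np.asarray(vec_b, float)
    d = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.sign(d) * np.cos(2 * np.pi * d)

v = np.array([1.0, 0.5, -0.2])
print(similarity_fom(v, v))  # identical inputs give cos(2*pi) = 1
```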
The similarity FOM (ℵ ) characterizes the input Heptor portion of each ETF. Conversely, the variation carousel is
populated by raw output pixel values stored in the order the neighborhoods occur. A comparison is now done between
the pixel neighborhood under consideration and each variant by calculating Equivalence (correlation of the magnitude of
the eigenmodes), which is defined for k variants by
$$ Equivalence_k = 100\, \frac{\displaystyle\sum_{j=1}^{N} \left( Current_j - \overline{Current} \right)\left( Variant_{j,k} - \overline{Variant}_k \right)}{\sqrt{\displaystyle\sum_{j=1}^{N} \left( Current_j - \overline{Current} \right)^2 \sum_{j=1}^{N} \left( Variant_{j,k} - \overline{Variant}_k \right)^2}} \qquad (15) $$
Current and Variant are the magnitudes of the TF (PSD) models for the simulation and measured data given by
$$ Current_j = \sqrt{A_j^2 + B_j^2} \qquad (16) $$
and the overlined Current and Variant in (15) are the respective means of the result of (16). A threshold for determining a match to an ETF is automatically determined using
$$ SimThresh = \frac{1}{2(NETFSlides - 1)} \sum_{m=1}^{NETFSlides-1} \left( SimilarityFOM_{m+1} - SimilarityFOM_m \right) \qquad (17) $$
which is ½ the average difference between the similarity FOM for neighboring slides in the ETF. Therefore, this is a dynamic and adaptive threshold that changes according to the ETFs in memory. For the variants, the Equivalency threshold is also dynamic and adaptive, but is calculated as the running mean of all Equivalency tests performed in the current analysis, using the Equivalency relationship in (15). The Equivalency threshold is given as
$$ EquivThresh = \frac{1}{NVariations - 1} \sum_{m=1}^{NVariations-1} Equivalence_m \qquad (18) $$
Carousel update is based on these adaptive thresholds. The current exemplar is compared to all members of the carousel
to find the most similar. Next, if the similarity is within the threshold, the exemplar is compared to the variants of this
ETF for equivalence. If equivalent, it is discarded, otherwise it is added as a new variant.
However, if the exemplar falls outside the threshold for similarity to the closest matching ETF, the process is more
involved. A decision to add this exemplar as an ETF is made. First, the variants of 4 ETFs are tested. These are the most
similar ETF, the ETF one index ahead of it, the ETF one index behind it, and the weighted average ETF formed from the
two ETFs the exemplar falls between. If the exemplar proves equivalent, then discard; otherwise, add a new ETF and its
associated variant.
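The carousel update rule above can be sketched as follows. The dictionary layout, the similarity test against the FOM of a perfect match, and the omission of the fourth (weighted-average virtual ETF) test are simplifying assumptions:

```python
def update_carousel(exemplar, carousel, sim_fn, equiv_fn,
                    sim_threshold, equiv_threshold):
    """Sketch of the carousel update decision. `carousel` is a list of
    dicts with keys 'etf' and 'variants' (hypothetical layout).
    Returns a tag describing the action taken."""
    # Find the most similar stored ETF.
    sims = [sim_fn(exemplar, slot['etf']) for slot in carousel]
    best = max(range(len(carousel)), key=lambda i: sims[i])
    if abs(sims[best] - 1.0) <= sim_threshold:  # within similarity threshold
        # Compare against that ETF's variants for equivalence.
        for v in carousel[best]['variants']:
            if equiv_fn(exemplar, v) >= equiv_threshold:
                return 'discard'                # already represented
        carousel[best]['variants'].append(exemplar)
        return 'new_variant'
    # Outside the threshold: also test variants of the neighboring ETFs
    # (most similar, one ahead, one behind) before adding a new ETF.
    for idx in {best, max(best - 1, 0), min(best + 1, len(carousel) - 1)}:
        for v in carousel[idx]['variants']:
            if equiv_fn(exemplar, v) >= equiv_threshold:
                return 'discard'
    carousel.append({'etf': exemplar, 'variants': [exemplar]})
    return 'new_etf'

carousel = [{'etf': 5, 'variants': [5]}]
print(update_carousel(5, carousel,
                      lambda a, b: 1.0 if a == b else 0.0,
                      lambda a, b: 100.0 if a == b else 0.0,
                      0.1, 50.0))  # prints: discard
```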
Given the current stored corpus of experiential knowledge in Ezekiel, a prediction is made mathematically of what other
neighborhood sites should exist and what they would look like described by their input Heptor. These sites are called
relics and are significant because, if found, they are mathematically predicted to exist based on prior experience.
Prediction is accomplished using the stored ETF models to reconstruct an ensemble of Heptors for the variants
associated and stored with each. This is accomplished by inverting the ETF model and multiplying the resultant by the
pixel neighborhood values of the variant, most of which were associated and stored with the ETF when the exemplar was
similar enough to the ETF. The process of generating these predicted Heptors is given by
$$ PredictedHeptor = InvFFT\!\left( \frac{1}{ETF}\, FFT(Variant) \right) \qquad (19) $$
These Heptors are assigned a categorical value of ½ meaning relic. A change detector polynomial is derived using these
exemplars which enables distant sampled neighborhoods to be quickly compared to the mathematical predictions.
Change detection is achieved by creating a multi-variable polynomial Data Model that exhibits hypersensitivity to small changes of input values. Tuning makes the model sensitive to feature values and combinations not encountered in derivation. The threshold for detecting outliers is set using the polynomial output from training data. The upper and lower boundaries are assigned to be one-half of a standard deviation above and below the mean value of the polynomial output from the training data.
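These boundaries can be computed directly from the training-data polynomial output:

```python
import numpy as np

def detector_bounds(train_outputs):
    """Nominal band for the change detector: one half standard
    deviation above and below the mean of the polynomial output
    over the training data."""
    y = np.asarray(train_outputs, float)
    return y.mean() - y.std() / 2.0, y.mean() + y.std() / 2.0

lo, hi = detector_bounds([0.49, 0.50, 0.51, 0.50])
print(lo < 0.5 < hi)  # the nominal value 1/2 sits inside the band
```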
The simulation created for this paper uses two different novel algorithms called change detectors. The first is a change
detector that uses the Heptor for the current pixel neighborhood as input and tips-off (outside the nominal boundary
around ½) whenever a new novel pixel neighborhood is encountered (unlike anything experienced before). This change
detector serves as a first level bulk filter to determine if a new pixel neighborhood should be processed or discarded.
The second is a novel “relic” detector (a relic is a mathematically predicted environmental scene that is expected to exist
based on current experience) described above, for which the definitions of nominal and tip-off are reversed. It looks for
data similar to training data. For the relic detector, tip-off (relic detection) occurs in the tight boundary around ½.
Whenever values in this region are found, the existence of a relic is flagged. Since both change detectors are self-initializing and adaptive while providing goals with checks and balances, a novel form of machine situational awareness is achieved.
To give the Ezekiel agent a navigational aid, a sense of direction is provided. The next heading for Ezekiel is chosen by
taking an extended range sensor reading in each of the 4 cardinal directions from the current heading. In the simulation
for this paper, this was simplified to be the fixed global cardinal points and heading independent. Because of the coarse
resolution of this simulation, very little difference in method is resolved; however, in practice the difference is
appreciable because goal following is enhanced using the local orientation. These headings are processed through the
Guide TF. The output value from each of the 4 cardinal directions is a vector whose resultant is normalized to construct a
unit pointing vector. The length of the final vector defining the new location is based on the pixel neighborhood size and
the distance away from the current pixel neighborhood that the extended range sensor readings are taken.
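The pointing-vector construction can be sketched as below. The N/E/S/W ordering and the difference-of-opposites combination are assumptions, since the text states only that the four Guide TF outputs are combined into a normalized unit vector scaled by the step length:

```python
import numpy as np

def pointing_vector(readings, step):
    """Combine Guide TF outputs from the 4 cardinal directions
    (assumed N, E, S, W ordering) into a unit pointing vector,
    then scale by the step length."""
    n, e, s, w = readings
    v = np.array([e - w, n - s], float)   # net east-west, north-south pull
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros(2)
    return step * v / norm

move = pointing_vector([1.0, 0.0, 0.0, 0.0], step=7)
print(move)  # pure northward pull scaled to the step length
```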
Generally, computer processing and memory resources are constrained. This makes it necessary for the ETF and
associated variation carousel to limit the maximum number of slides. When the maximum number of slides is reached,
the carousels write out (data log and publish) while telemetering and re-initializing. If the ETF carousel is filled, the
associated variation carousel is also re-initialized. If the variation carousel is re-initialized, the ETF carousel is left intact.
During re-initialization, multiple ETF slides are averaged together into one while multiple variation slides are averaged
into one. These averaged slides are then maintained by the Ezekiel algorithm to maintain historical continuity.
The entire corpus of offloaded carousels can be exchanged between sensor platforms for initialization by capturing the
information content in differential equation form as shown in (6) through (8). This information can also be represented in
lookup table (LUT) format. The LUT allows for real-time execution of equations, even those that cannot run in real-time
by constructing a table of solutions to the carousel ODEs. Reference 5 contains a detailed description of constructing
LUTs for computer processing and memory limited applications.
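The LUT idea can be sketched as pre-tabulating the expensive solution once and replaying it by interpolation; the interface below is hypothetical:

```python
import numpy as np

def build_lut(ode_solution_fn, x_min, x_max, n_entries):
    """Pre-tabulate an expensive carousel ODE solution so it can be
    replayed in real time by interpolation (hypothetical interface)."""
    xs = np.linspace(x_min, x_max, n_entries)
    ys = np.array([ode_solution_fn(x) for x in xs])
    def lookup(x):
        return np.interp(x, xs, ys)   # linear interpolation into the LUT
    return lookup

slow = lambda x: np.sin(x) * np.exp(-0.1 * x)   # stand-in for an ODE solve
fast = build_lut(slow, 0.0, 10.0, 1001)
print(abs(fast(3.3) - slow(3.3)) < 1e-3)
```

The table is built once offline; at run time only the cheap interpolation executes, which is what makes real-time execution possible for equations that cannot themselves run in real time.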
5. ALGORITHM TESTING
5.1 Description
The ETF can now be used to model and reconstruct the data. This will demonstrate both exact reconstruction of training
points stored in the carousels, and that the ETF carousel results do not blow up when working between carousels. This is
demonstrated using different frequency sine curves for output and increasing index counter for input. By changing the
starting point on the index counter, different sine curves should be extracted that represent interpolated frequencies
between those used to construct the ETF slides. This is shown in the Appendix as a surface plot of similarity sorted sine
curves, the number of which is greater than the original number of ETF slides. This demonstrates the interpolation power
of the ETF carousel. Each sine curve is the output derived from multiplying the input by the interpolated ETF, and
performing an inverse FFT.
As a system level proof, the 128 point long 1/f data set in Figure 3 was created. Windows of data were selected from this
1/f data set for input, and corresponding data windows 1 point ahead in the x-axis were selected for output. The Ezekiel
algorithm was then applied to determine the number of slides and variations required to capture this data. To demonstrate
that the transfer functions and associated variations captured the data process, the data set was appended to itself and
processed three times. Algorithm learning can be seen by the graph in Figure 3, where the final third is all zeros,
indicating that no novelties were detected in the final copy of the data. The middle third of the data still contained
novelty indications by the change detector, indicating that it had not fully learned all of the data; however, it should be
noted that the ETF flagged each case in the middle third selected by the change detector as a false alarm (change detector
novelty found to be nominal according to ETF).
Fig. 3. Change detectors and TF carousel models converge when no new or novel information is presented.
The above shows that ETF and change detectors in the Ezekiel algorithm converged in terms of information content and
did not add any new novel information after carousels were populated (no 2 or 3 after the first 1/6 of the run) and final
change detectors constructed (no 1, 2, or 3 in final 1/3 of the run).
Now that the ETF has been shown to capture the information presented, and that the Ezekiel algorithm with offloading
converges, the overall process in the simulation of applying the algorithm to an image is described. First, initialization of
the simulation must occur. One method is to use off-loaded carousel information from a previous run. This is the method
used by the author for all Mars images after the first, in order to maintain continuity between platforms and runs. The
second method of initialization is to allow the simulation to process a fixed number of pixel neighborhoods without use
of either the guide or change detector and assuming that each is a novel neighborhood. This is done by placing the sensor
platform in the center of the image and allowing it to march outward in a spiral pattern, as seen in Figure 4 (top right).
Once initialized, each new pixel neighborhood selected from the image is processed through the following basic steps:
1. Calculate Heptor Stats of neighborhood
2. Run Heptor through Guide to flag relics
3. Run Heptor through Change Detector to flag novelty
4. Run Guide on extended sensor range pixel neighborhoods,
combine output into pointing vector.
5. If Not relic and Not novelty, Goto 7
6. If relic or If novelty, check carousels as described in
previous section to confirm novelty. If novelty confirmed,
add information to carousels.
7. Use pointing to move to next neighborhood and repeat at 1.
Fig. 4. Novelty neighborhoods from simulation run of overall Mars image, with neighborhoods connected by lines using closest
distance (right, top). Final reduced neighborhood group and path to be handed over to next level mission (right, bottom).
Flag denoting novelty/relic results for all neighborhoods processed (bottom left). Here 0 is non-novelty, 1 a novelty flagged by the
change detector later found to be a false alarm, 2 novelties requiring addition of an ETF to the carousel, 3 novelties requiring only
addition of a variant to an existing carousel, and 4 relics.
As multiple pixel neighborhoods are processed through the above steps, a novelty flag for each neighborhood (Figure 4)
is generated. Here, a value of 0 is non-novelty, 1 is a preliminary novelty flagged by the change detector that is later
confirmed to be a false alarm, 2 is a novelty resulting in the addition of an ETF to the carousel, 3 is a novelty resulting in
the addition of a variant to an existing carousel, and 4 is a relic.
As neighborhoods are visited, only those that are novelties are of potential further interest for the mission. These novelty
neighborhoods are shown in Figure 4 on the left, with lines connecting neighborhood centers together by starting at the
first point (denoted by A) and finding the closest distance to current point. This process is repeated until all novelty
neighborhoods are mapped. Both false alarm and overlapping neighborhoods are removed.
Next, the group is ranked and reduced for handover to a sensor platform at the next level. Ranking is based on standard deviation within each class (novelty flag = 2, 3, and 4), with the highest standard deviation ranked highest. Next, the top 10% of these regions are kept and placed together into a single group and the minimum
distance path that connects the centers of the regions together is calculated. This provides the final ranked route plan for
use at the next level, shown in Figure 4 on the right.
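The ranking and minimum-distance routing step can be sketched as below; for brevity this treats all kept neighborhoods as one class rather than ranking within each novelty-flag class separately:

```python
import numpy as np

def rank_and_route(centers, std_devs, keep_frac=0.10):
    """Keep the top fraction of novelty neighborhoods by standard
    deviation, then chain their centers by repeatedly hopping to the
    nearest unvisited center (greedy minimum-distance path)."""
    order = np.argsort(std_devs)[::-1]
    n_keep = max(1, int(round(keep_frac * len(centers))))
    kept = [centers[i] for i in order[:n_keep]]
    route, remaining = [kept[0]], kept[1:]
    while remaining:
        cur = np.asarray(route[-1], float)
        dists = [np.linalg.norm(cur - np.asarray(c, float)) for c in remaining]
        route.append(remaining.pop(int(np.argmin(dists))))
    return route

centers = [(0, 0), (10, 0), (1, 1), (9, 1), (5, 5)]
stds = [50.0, 40.0, 30.0, 20.0, 10.0]
print(rank_and_route(centers, stds, keep_frac=0.6))
# greedy route: (0,0) -> (1,1) -> (10,0)
```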
5.2 Global Layer to A
The algorithm processes, at the satellite level, the global Mars flat map given in Table 1, resulting in the selection of the local neighborhoods shown, including the path. Table 1 displays the characterization of the selected local neighborhoods in ranked order, including the intended course of action by the next level of UMV. Table 1 shows neighborhood A.1 ranked highest, with A.4 also worthy of further study at the UAV flyover level as a relic.
Bitmap | Center (x, y) | Simil Rank (10-4) | Std Dev Rank | Heptor Statistics (StdDev, Skew, Kurt, M6, M8, DJ, DH) | Description
A.1 (Relic) | (237,207) | 1.9 | 38.08 | 20.942, 0.053, -1.321, -11.59, -97.396, 1.883, 1.373 | Predicted scape. Two mounds center Q3. Rich structure with volcanic crater Q4. DEPLOY UAV flyover.
A.2 (Var) | (259,284) | 87.8 | 26.00 | 8.042, 0.141, -0.844, -8.941, -86.235, 1.938, 1.276 | Variant scape. Large plateau with mound and impact crater Q2. Mountain range Q1, shelf Q4.
A.3 (TF) | (306,337) | 118.3 | 16.46 | 3.317, 0.332, -0.269, -4.651, -60.362, 1.624, 1.367 | Unique scape. Very rich fractured area. Many mountains and flow structure. Mountain Q4, flow channel Q3.
A.4 (Relic) | (83,295) | 80.8 | 24.50 | 8.511, -0.364, -0.674, -7.7, -78.273, 1.793, 1.363 | Predicted scape. Interesting high altitude plateau Q1, Q4, possible large shelf and drop off Q3, or mountain range shadow. DEPLOY UAV flyover.
Table 1. Characterization of final selected neighborhoods in ranked order from overall Mars flat map.
5.3 Layer A.1 to B
The algorithm processes, at the UAV flyover level, the Mars image given in Table 2 corresponding to A.1. This results in the selection of the local neighborhoods shown in Table 2, including the path. Table 2 displays the characterization of the selected local neighborhoods in ranked order, including the intended course of action by the next level of UMV. Table 2 shows neighborhood B.1 ranked highest, with B.3 and B.5 also worthy of further study at the next level as relics. Each is flagged for a UAV recon mission, and B.1 will be looked at in further detail in this work.
Bitmap | Center (x, y) | Simil Rank (10-4) | Std Dev Rank | Heptor Statistics (StdDev, Skew, Kurt, M6, M8, DJ, DH) | Description
B.1 (TF) | (71,518) | 41.25 | 33.82 | 20.796, -0.149, -0.973, -8.788, -79.743, 1.666, 1.365 | Unique scape. Large fractured mound amid shadowed craters. Part of large flow field. Channel on top center. Initiate UAV Recon Mission.
B.3 (Relic) | (229,379) | 31.0 | 13.60 | 4.495, 0.811, -0.4, -3.048, -43.476, 1.791, 1.338 | Predicted scape. Foothills of mountains Q4. Crater Q3. Initiate UAV Recon Mission.
B.5 (Relic) | (569,1072) | 75.08 | 28.82 | 9.438, 0.115, -0.946, -10.332, -94.062, 1.771, 1.332 | Predicted scape. Fascinating flow channel hills and structure across Q3. Initiate UAV Recon Mission.
Table 2. Characterization of final selected neighborhoods in ranked order for zoom in to A.1.
5.4 Layer B.1 to C
The algorithm processes, at the UAV recon level, the next Mars image in Table 3 (corresponding to B.1 above), resulting in the selection of the local neighborhoods shown, including the path. Table 3 displays the characterization of the selected local neighborhoods in ranked order, including the intended course of action by the next level of UMV. Table 3 shows neighborhood C.1 ranked highest, with no relic present.
Bitmap (7x7 Enlarged) | Center (x, y) | Simil Rank (10-4) | Std Dev Rank | Heptor Statistics (StdDev, Skew, Kurt, M6, M8, DJ, DH) | Description
C.1 (Var) | (173,7) | 1070 | 53.05 | 45.996, -0.128, -0.783, -8.878, -86.141, 1.238, 1.369 | Variant scape. Possible hill with craters near foothill Q2, Q3. DEPLOY ROVER. Analyze and sample terrain mission initiated.
C.2 (TF) | (140,140) | 2170 | 13.81 | 20.537, -0.233, 0.072, 0.681, 9.549, 1.32, 1.384 | Unique scape. Possible rise and slope Q1, Q2. Mountains near Q4.
C.3 (TF) | (205,186) | 13.51 | 26.21 | 26.05, -0.036, 0.149, -0.937, -31.079, 1.316, 1.374 | Unique scape. Plateau with mountain Q2. Slope Q3.
C.4 (TF) | (271,214) | 110 | 35.88 | 31.352, -0.14, -0.705, -6.19, -56.771, 1.299, 1.372 | Unique scape. Mountainous region. Large variation of structure.
Table 3. Characterization of final selected neighborhoods in ranked order for zoom in to B.1.
5.5 Terrain 1 (Rocky Zoom Into C.1)
Processing at the ROVER level is applied to the Mars image corresponding to C.1 above. Table 4 displays the selected neighborhoods in ranked order, including the intended course of action by the next level of UMV. Table 4 shows neighborhood D.1 ranked highest, with no relic present.
Bitmap (7x7 Enlarged) | Center (x, y) | Simil Rank (10-4) | Std Dev Rank | Heptor Statistics (StdDev, Skew, Kurt, M6, M8, DJ, DH) | Description
D.1 (TF) | (216,227) | 18.98 | 8.23 | 9.321, 0.073, 0.431, 2.331, -4.151, 1.319, 1.357 | Unique rocks. Analyze. COLLECT SAMPLES from different scale zones.
D.2 (Var) | (207,231) | 3630 | 16.90 | 14.54, -0.498, -0.031, -1.22, -26.075, 1.534, 1.388 | Variant rocks. Photo.
D.3 (TF) | (195,263) | 63.84 | 36.08 | 22.584, -0.071, -0.88, -8.862, -84.041, 1.674, 1.381 | Unique rocks. Analyze.
D.4 (TF) | (154,220) | 210 | 53.79 | 42.312, 0.00059, -1.458, -12.259, -99.769, 1.737, 1.369 | Unique rocks. Analyze.
D.5 (Var) | (133,193) | 320 | 27.77 | 26.576, -0.803, -0.116, -2.197, -35.877, 1.755, 1.372 | Variant rocks. Photo.
Table 4. Characterization of final selected neighborhoods in ranked order.
ACKNOWLEDGEMENTS
The author thanks the following for support and encouragement during this Doctor of Science thesis and this paper: Dr.
White, Dr. Filipovic, Dr. Blank, Dr. Fennelly, Mr. Hons, and Mr. Handley of James Cook University Department of
Astronomy; and Mr. Hicklen of dtech Systems, Inc.
REFERENCES
1. Jaenisch, H.M., Algorithms for Autonomous Unmanned Deep Space Probes, D.Sc. Thesis, James Cook University (submitted to University for approval), 2005.
2. Jaenisch, H.M. and Handley, J.W., "Data Modeling for Radar Applications", Proc. IEEE Radar Conf., 2003.
3. Rosko, J.S., Digital Simulation of Physical Systems, Addison-Wesley, Reading, MA, 1972.
4. Barnett, S. and Cameron, R.G., Introduction to Mathematical Control Theory, 2nd ed., Clarendon Press, Oxford, 1993.
5. Jaenisch, H.M., Handley, J.W., and Hicklen, M.L., "Data Model Predictive Control as a New Mathematical Framework for Simulation and VV&A", SPIE Defense and Security Symposium, Orlando, FL, April 2006.
APPENDIX Carousel Demonstration Holger M. Jaenisch November 3, 2005
nfiducial := 10   (number of original slides)
npts := 64        (length of each TF)
Eigflag := 1      (eigenfunction model on = 1)
newpts := 37      (total number of slides after interpolation)

i := 0 .. nfiducial-1    j := 0 .. npts-1    w1[i] := 10*(i+1)/nfiducial

A[j,i] := sin(w1[i]*j*2*pi/npts)    B[j,i] := j
zA<i> := cfft(A<i>)    zA<i> := FULLEIG(zA<i>, Eigflag)
zB<i> := cfft(B<i>)    zB<i> := FULLEIG(zB<i>, Eigflag)

dotp[i,0] := CALCSIM(zB<i>, npts)    dotp[i,1] := i    dotp := csort(dotp, 0)
A1<i> := A<dotp[i,1]>    B1<i> := B<dotp[i,1]>    A := A1    B := B1    C := reverse(A^T)
zA1<i> := zA<dotp[i,1]>    zA := zA1    dotp := dotp<0>    zB<i> := cfft(B1<i>)

tf[j,i] := zA[j,i]/(zB[j,i] + 1)    Etf<i> := FULLEIG(tf<i>, Eigflag)
k := 0 .. npts/2 - 1    zTF[j + i*npts] := tf[j,i]

Generate new locations in between fiducial points:
i := 0 .. newpts-1    w1[i] := 1 + i*0.25
RCAR := GENCAROUSEL(Re(zTF), dotp, npts, nfiducial)
ICAR := GENCAROUSEL(Im(zTF), dotp, npts, nfiducial)

NewB[j,i] := if(floor(w1[i]) = ceil(w1[i]), B[j, floor(w1[i])-1],
    B[j, floor(w1[i])-1]*(ceil(w1[i]) - w1[i]) + B[j, ceil(w1[i])-1]*(w1[i] - floor(w1[i])))
zdotp[i] := if(floor(w1[i]) = ceil(w1[i]), dotp[floor(w1[i])-1],
    dotp[floor(w1[i])-1]*(ceil(w1[i]) - w1[i]) + dotp[ceil(w1[i])-1]*(w1[i] - floor(w1[i])))

RA[j,i] := RUNTURL(RCAR<j>, zdotp[i])    IA[j,i] := RUNTURL(ICAR<j>, zdotp[i])
TFC[j,i] := RA[j,i] + sqrt(-1)*IA[j,i]    zNewB<i> := cfft(NewB<i>)

Selected virtual TF:
zNewA[j,i] := TFC[j,i]*(zNewB[j,i] + 1)    NewA<i> := icfft(zNewA<i>)
NewOut1 := reverse(NewA^T)    selsim := 29
w1[selsim] = 8.25    zdotp[selsim] = 0.977
TF1[j] := TFC[j, selsim]    D[j,i] := NewB[mod(j+i, npts), selsim]
zD<i> := cfft(D<i>)    zE[j,i] := TF1[j]*(zD[j,i] + 1)    E<i> := icfft(zE<i>)    E := Re(E)
FnlTF[j,i] := TFC[j,i]    FnlTF := reverse(FnlTF^T)

[Plot: D<0> compared with B[j, ceil(w1[selsim])-1] and B[j, floor(w1[selsim])-1] versus j]
TURLCOEF(x, y, n) :=
    A <- (y[1] - y[0]) / (x[1] - x[0])
    B <- -(A*x[0]) + y[0]
    for m in 1 .. n-2:  D[m] <- (y[m+1] - y[m])/(x[m+1] - x[m]) - (y[m] - y[m-1])/(x[m] - x[m-1])
    for m in 0 .. n-1:  E[m] <- x[m]
    F[0] <- A    F[1] <- B    F[2] <- n
    for m in 1 .. n-2:  F[2+m] <- D[m]
    for m in 0 .. n-1:  F[n+m+1] <- E[m]
    return F

[Plot: E<0> compared with A[j, ceil(w1[selsim])-1] and A[j, floor(w1[selsim])-1] versus j]
CALCSIM(zA, n) :=
    dotp <- sum over k = 0 .. ceil(n/2) - 1 of Re(zA[k])*Im(zA[k])
    dotp <- cos(dotp*2*pi)*sign(dotp)
    return dotp
RUNTURLDER(F, x, norder) :=
    A <- F[0]    B <- F[1]    n <- F[2]    c <- 1
    for m in 0 .. n-1:  E[m] <- F[n+m+1]
    for m in 1 .. n-2:
        D[m] <- F[2+m]
        C[m] <- if(|x - E[m]|/c > 300, |x - E[m]|/300, c)
    for j in 1 .. norder, k in 1 .. norder:  d[j,k] <- DCOEF(norder, j, k)
    yd <- if(norder = 1, A, 0)
    yderiv <- yd + sum over h = 1 .. norder, m = 1 .. n-2 of
        D[m] * d[norder,h] * (C[m])^(1-norder)
        * (10^((x - E[m])/C[m]) / (1 + 10^((x - E[m])/C[m])))^h
        * (-1)^(h+1) * ln(10)^(norder-1)
    return yderiv
FULLEIG(zX, eigflag) :=
    n <- length(zX)    maxloc <- 1    maxval <- zX[1]
    for i in 2 .. n-1:
        if zX[i] > maxval:  maxval <- zX[i]    maxloc <- i
    if eigflag = 1:
        A[n-1] <- 0    A[maxloc] <- zX[maxloc]    A[n-maxloc] <- A[maxloc]    A[0] <- zX[0]
    otherwise:  A <- zX
    return A
RUNTURL(F, x) :=
    A <- F[0]    B <- F[1]    n <- F[2]    c <- 10^(-10)
    for m in 0 .. n-1:  E[m] <- F[n+m+1]
    for m in 1 .. n-2:
        D[m] <- F[2+m]
        C[m] <- if(|x - E[m]|/c > 300, |x - E[m]|/300, c)
    yturl <- A*x + B + sum over m = 1 .. n-2 of C[m]*D[m]*log(1 + 10^((x - E[m])/C[m]))
    return yturl
DCOEF(iorder, j1, k1) :=
    for i1 in 0 .. iorder, i2 in 0 .. iorder:  d[i1,i2] <- 0
    d[1,1] <- 1
    break if iorder < 2
    for i1 in 2 .. iorder, i2 in 1 .. i1, i3 in i2-1 .. i2:  d[i1,i2] <- d[i1,i2] + d[i1-1,i3]*i3
    return d[j1,k1]
INTEGDATA(x, y, n) :=
    for m in 1 .. n-1:  A[m,0] <- (x[m] + x[m-1])/2
    A[0,0] <- (3*x[0] - x[1])/2
    A[n,0] <- A[n-1,0] + x[n-1] - x[n-2]
    A[0,1] <- 0
    for m in 0 .. n-1:  A[m+1,1] <- sum over k = 0 .. m of y[k]*(A[k+1,0] - A[k,0])
    return A
[Surface plots: NewOut1 (similarity sorted interpolated sine curves) and FnlTF (final transfer function carousel)]
GENCAROUSEL(A, dotp, noutputs, nrep) :=
    mindotp <- min(dotp)    maxdotp <- max(dotp)
    for i in 0 .. nrep-1:  x[i] <- dotp[i]
    for j in 0 .. nrep-1:  y[j] <- A[i + j*noutputs]
    x1 <- x    y1 <- y    nrep1 <- nrep
    if mindotp > -1:
        for j in 0 .. nrep-1:  x1[j+1] <- x[j]    y1[j+1] <- y[j]
        x1[0] <- -1    y1[0] <- y[nrep-1]    nrep1 <- nrep1 + 1
    if maxdotp < 1:
        x1[nrep1] <- 1    y1[nrep1] <- y[0]    nrep1 <- nrep1 + 1
    for i in 0 .. noutputs-1:  B<i> <- TURLCOEF(x1, y1, nrep1)
    return B
Data Driven Differential Equation Modeling of fBm processes
Holger M. Jaenisch*a, James W. Handley a, and Jeffery P. Faucheux a
aSparta Inc., 4901 Corporate Drive, Suite 102, Huntsville, AL 35805
ABSTRACT
This paper presents a unique method for modeling fractional Brownian motion type data sets with ordinary differential
equations (ODE) and a unique fractal operator. To achieve such modeling, a new method is introduced using
Turlington polynomials to obtain continuous and differentiable functions. These functions are then fractal interpolated to
yield fine structure. Spectral decomposition is used to obtain a differential equation model which is then fractal
interpolated to forecast a fBm trajectory. This paper presents an overview of the theory and our modeling approach along
with example results.
Keywords: Data Modeling, fractal, noise, 1/f, inverse fractal modeling, differential equation, fractal operator,
interpolation, extrapolation, pseudo-derivative
1. INTRODUCTION
Fractional Brownian motion (fBm) processes which exhibit 1/fβ power spectra characteristics are continuously being
discovered as governing dynamics in natural systems. Functional modeling and prediction of fBm sequences can
significantly advance technologies such as MEMS, quantum computing, optical bistability, and biological processes
which exhibit self-organization and complexity.
The goal of this work is to develop:
1. Interpolating process that generates repeatable fine structure using only knowledge of the local neighborhood.
2. Extrapolation process that generates repeatable fine structure using only knowledge of the local neighborhood.
3. Simple autonomous method for generating differential equation models from sampled1 fBm data.
Currently, fBm models assume the form of a one-dimensional stochastic differential equation
$$ \frac{dx}{dt} = f(x) + g(x)n(t) \qquad (1) $$
which assumes the self-similar or fractal fine structure of a fBm process is modeled by an additive noise term g(x)n(t). A
solution to (1) would be
$$ x(t) = \int \left[ f(x) + g(x)n(t) \right] dt. \qquad (2) $$
Lacking a repeatable numerical solution, the classic approach is modeling the fBm ensemble average behavior of (2)
using the generalized Fokker-Planck equation2,3,4 of the form
$$ \frac{\partial P}{\partial t} = -\frac{\partial}{\partial x}\!\left( f(x)P \right) + \frac{1}{2}\frac{\partial^2}{\partial x^2}\!\left( g^2(x)P \right) \qquad (3) $$
for P(t,x) with P(0,x)=δ(x-z). The Fokker-Planck equation is a model of the dynamic continuous probability density
function (PDF) of the solution to the stochastic differential equation in (1) arising from Brownian motion increments.
*[email protected]; phone 1 256 337-3768; fax 1 256 830-0287; sparta.com
In (2), if n(t) is modeled by a pseudo random number generator, non-repeatable fBm trajectories result for each model
run. If the random number generator is replaced by a function of the form
$$ n(t) = \sin\!\left( t^{\,e} \right) \qquad (4) $$
repeatable fBm trajectories are generated. If the fBm fine structure could be synthesized from the local temporal
dynamics or f(x) directly, the need for injecting the additive noise term n(t) externally would be eliminated.
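Reading (4) as n(t) = sin(t^e), the repeatability claim can be illustrated with a simple Euler integration of (1); the choices of f, g, and step size below are arbitrary stand-ins:

```python
import math

def fbm_like_trajectory(f, g, x0, dt, n_steps):
    """Euler integration of dx/dt = f(x) + g(x) n(t) with the
    deterministic, repeatable noise term n(t) = sin(t**e) from (4)
    in place of a pseudo-random generator."""
    x, t, traj = x0, 0.0, [x0]
    for _ in range(n_steps):
        n_t = math.sin(t ** math.e) if t > 0 else 0.0
        x += (f(x) + g(x) * n_t) * dt
        t += dt
        traj.append(x)
    return traj

run1 = fbm_like_trajectory(lambda x: -0.5 * x, lambda x: 0.2, 1.0, 0.01, 500)
run2 = fbm_like_trajectory(lambda x: -0.5 * x, lambda x: 0.2, 1.0, 0.01, 500)
print(run1 == run2)  # identical runs: the trajectory is repeatable
```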
Instead of modeling only the dynamic PDF, we derive a specific alternative equation formulation for a particular fBm
trajectory x(t). We propose
$$ x(t) \cong \mathbf{D}_f \{ f(t) \} \cong \int \left[ f(x) + g(x)n(t) \right] dt \qquad (5) $$
which uses the self-similarity inherent in the almost periodic function f(x) to define the fractal fine structure (micro-
trend) in the form of an interpolating operator Df applied to a continuous function f(t) (macro-trend). Not only can this be
done to the trajectory or motion history directly by modeling f(t) with polynomials, but it can also be applied to the
solution of differential equation models.
2. INTERPOLATION
Interpolation is used to evaluate f(x) between fiducial points or knots using knowledge of the local temporal
neighborhood. Interpolation functions are smooth curves passing through fiducial points. Fine structure is added from a
separate source g(x) in a non-repeating fashion.
2.1 Deslauriers-Dubuc general interpolant
This approach achieves functional fractal fine structure modeling from sub-sampled fBm data or continuous models. The
dyadic extension of the fundamental interpolant is
$$ F(t/2) = F(t) + aF(t-3) + bF(t-1) + cF(t+1) + dF(t+3) \qquad (6) $$
where a, b, c, and d are scaling coefficients, subject to (7)5. In (7), [t] denotes the integer portion of the number t.
$$ y(t) = \sum_{n=[t]-2}^{[t]+3} y(n)F(t-n) \qquad (7) $$
The fundamental interpolant generates a value at each interval midpoint (t/2) in the sampled data using a weighted sum
of the two points occurring before and after the midpoint. These points correspond to the 1 step and 3 step intervals
before and after the interval midpoint at the new dyadic interpolation resolution. The method's limitations are that it is open-ended: the scaling coefficients must sum to one, yet their values cannot be determined directly from the data.
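One refinement level of the midpoint scheme can be sketched with the classical 4-point Deslauriers-Dubuc weights (a = d = -1/16, b = c = 9/16, which sum to one); the constant end-padding is an added assumption:

```python
import numpy as np

def dyadic_refine(y, a=-1/16, b=9/16, c=9/16, d=-1/16):
    """One level of 4-point dyadic (Deslauriers-Dubuc style) refinement:
    each interval midpoint is a weighted sum of the two samples on
    either side of it, as in (6). Ends are padded with constant values
    (an assumption)."""
    y = np.asarray(y, float)
    ext = np.concatenate(([y[0], y[0]], y, [y[-1], y[-1]]))  # pad ends
    out = []
    for i in range(len(y) - 1):
        j = i + 2                       # index into the padded array
        mid = a*ext[j-1] + b*ext[j] + c*ext[j+1] + d*ext[j+2]
        out.extend([y[i], mid])
    out.append(y[-1])
    return np.array(out)

fine = dyadic_refine([0.0, 1.0, 2.0, 3.0])
print(fine)
```

On interior intervals the scheme reproduces linear data exactly (the midpoint between 1.0 and 2.0 comes out as 1.5); each refinement pass doubles the resolution.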
Since the coefficients in (6) cannot be obtained analytically and remain heuristic adjustment values, we chose to recast Deslauriers-Dubuc dyadic interpolation as a Hidden Variable Iterated Function System (IFS). IFS recasts the interpolation into geometric form using contractive affine transformations6,7 defined as
$$ w_n\!\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a_n & 0 \\ c_n & d_n \end{pmatrix}\!\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e_n \\ f_n \end{pmatrix}, \qquad (8) $$
and Elton8 proved in (9) that this process will converge for all continuous functions in (8). Such convergence is
guaranteed by modeling the fBm macro-trend with continuous Turlington polynomials described later in this paper.
$$ \lim_{n \to \infty} \frac{1}{n+1} \sum_{k=0}^{n} f(x_k) = \int_X f(x)\, d\mu(x) \qquad (9) $$
2.2 Iterated Function Systems (IFS)
IFS interpolation is a contractive affine transform subject to the constraints
$$ w_n\!\begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} \quad \text{and} \quad w_n\!\begin{pmatrix} x_N \\ y_N \end{pmatrix} = \begin{pmatrix} x_n \\ y_n \end{pmatrix} \quad \text{for } n = 1, 2, \ldots, N. \qquad (10) $$
Therefore, the five real numbers a, c, d, e, and f that specify the transformation are defined as
$$ a_n x_0 + e_n = x_{n-1}, \quad a_n x_N + e_n = x_n, \quad c_n x_0 + d_n y_0 + f_n = y_{n-1}, \quad c_n x_N + d_n y_N + f_n = y_n. \qquad (11) $$
This is a system of four equations and five unknowns, resulting in one free parameter in each transformation. This free
parameter is chosen to be dn and is called the vertical scaling factor. This choice allows the other four parameters to be
written in terms of the data and the parameter dn as
$$ a_n = \frac{x_n - x_{n-1}}{x_N - x_0}, \qquad e_n = \frac{x_N x_{n-1} - x_0 x_n}{x_N - x_0}, $$
$$ c_n = \frac{y_n - y_{n-1}}{x_N - x_0} - d_n \frac{y_N - y_0}{x_N - x_0}, \qquad f_n = \frac{x_N y_{n-1} - x_0 y_n}{x_N - x_0} - d_n \frac{x_N y_0 - x_0 y_N}{x_N - x_0}. \qquad (12) $$
We determine dn using a pseudo-derivative, defined as
$$ d_n = (y_{n+1} - y_n) + (y_n - y_{n-1}) = y_{n+1} - y_{n-1}. \qquad (13) $$
IFS interpolation starts with an initial point (usually one of the points given in (10)) and applies one of the transforms
given in (8) and defined by the coefficients in (12) to this point to generate a new point that lies on the attractor for the
function. This new point is then used as a starting point and the process is repeated. This process iterates until the
function or attractor fills in to within a specified tolerance given as
$$ \left\| \hat{x} - x \right\| \le \varepsilon. \quad (14) $$
This test must be performed either at each iteration, or the process can be allowed to run for a fixed large number of iterations and then checked. The IFS coefficients in (12) determine the current x distance from the starting point of the data set (as a fraction of the length of the entire data set) and map the new y value to an x location at the same fractional position within the transform interval. The coefficients in (12) also determine how far the vertical scaling coefficient raises the previous y value above or below the sloped line connecting the beginning and ending y values of the transform interval.
IFS interpolation can be implemented in two ways: a deterministic algorithm or a random algorithm. The deterministic
algorithm applies the transform in (8) and (12) in a combinatorial fashion to fill in the attractor at increasing scale
resolutions. The random algorithm uses the same transforms and coefficients, but they are applied in a Markov process
fashion so that the attractor fills in at random x locations. Neither of these approaches yields an f(x) without a substantial O[10^3] number of iterations. The number of iterations was bounded to O[10^2] by the authors in previous work using a Ruelle-Takens theorem based algorithm9.
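A minimal sketch of the random (chaos-game) variant described above, using two illustrative maps that interpolate the points (0,0), (0.5,1), (1,0) with vertical scaling d_n = 0.3 — example values of ours, not the authors' data:

```python
import random

# Two affine maps (a, c, d, e, f) built from (12) for the points
# (0,0), (0.5,1), (1,0) with d_n fixed at 0.3 -- example values only.
MAPS = [
    (0.5, 1.0, 0.3, 0.0, 0.0),    # w_1: (x0,y0)->(x0,y0), (xN,yN)->(x1,y1)
    (0.5, -1.0, 0.3, 0.5, 1.0),   # w_2: (x0,y0)->(x1,y1), (xN,yN)->(xN,yN)
]

def random_iteration(maps, iters=2000, seed=1):
    """Chaos-game IFS: apply a randomly chosen map each step, keep the orbit."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    orbit = []
    for _ in range(iters):
        a, c, d, e, f = rng.choice(maps)
        x, y = a * x + e, c * x + d * y + f   # old (x, y) feeds both outputs
        orbit.append((x, y))
    return orbit
```

The deterministic variant would instead apply every map to every point of the previous generation, filling the attractor at increasing scale resolutions.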
2.3 Data Modeling
In previous work, the authors demonstrated the ability to sub-sample data and reconstruct it by defining the Data Model
as the sub-sampled points9,12. These equally spaced sub-sampled points are the support vectors and the number of
points N is identified using
$$ 0 \le \operatorname{Re}\!\left( \min \lim_{iJ \to iJ + \Delta} \frac{\log (R/\sigma)}{\log N} \right) = \frac{1}{N}, \qquad 0 \le \operatorname{Re}\!\left( \lim_{iJ \to iJ + \Delta} \frac{\log (R/\sigma)}{\log N} \right) = 0.5 \quad (15) $$
The number of points N to sub-sample in building the model is bounded by (15), where R is the range of the data and σ is the standard deviation of the data. N can be solved for between these bounds in order to minimize the variance of the residual between the reconstructed and original data. We have found excellent results by solving the second equation in (15) for N, and in the absence of this information a good rule of thumb is to sub-sample 10% of the original data set.
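One way to read the second relation in (15) is log(R/σ)/log N = 0.5, i.e. N = (R/σ)². A small sketch under that reading, with the 10% fallback mirroring the rule of thumb above (function name and guard conditions are ours):

```python
import math

# Sketch: choose the support-vector count N by solving the second relation
# in (15), log(R/sigma)/log N = 0.5  =>  N = (R/sigma)^2. Falls back to the
# 10% rule of thumb when the ratio is degenerate (our assumption).
def support_vector_count(data, fallback_fraction=0.10):
    r = max(data) - min(data)                # range R of the data
    mu = sum(data) / len(data)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in data) / len(data))
    if sigma > 0 and r > sigma:
        return min(len(data), round((r / sigma) ** 2))
    return max(2, round(fallback_fraction * len(data)))
```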
2.4 Fractal operator
The use of fractal operators applied to images is explored by Davis10 and Russ19. Neither shows how image operators can be applied to f(x).
From (8), the information required for IFS to evaluate f(x) at a specific x location consists of the coefficients from (12)
that define the IFS transformation associated with the nth segment containing the point and the value of the previous
point (xi-1 ,yi-1). This information is used to transform the previous point into the current (xi ,yi) location. Because we
have defined dn by a pseudo-derivative, each of the coefficients in (12) is based on the sub-sampled data values
themselves along with the beginning and ending points. Using equally spaced sampled data and assuming that the
sampled data process starts at x0 = 0 and ends at xN = N reduces (12) and (13) to
$$ a_n = \frac{x_n - x_{n-1}}{N}, \qquad e_n = x_{n-1}, \qquad c_n = \frac{y_n - y_{n-1}}{N} - \frac{(y_{n+1} - y_{n-1})(y_N - y_0)}{N}, \quad (16) $$
$$ f_n = y_{n-1} - y_0 \,(y_{n+1} - y_{n-1}). $$
Values for xn and xn-1 can be obtained from the sub-sampling rate, length of the original data set, and the current xi
value. If estimates of yn-1, yn, and yn+1 can be generated from an f(x), then the coefficients in (16) can be solved for directly
whenever the transform is used. Calculating these y values requires knowledge of xn-1, xn, and xn+1. Equation (16) is
further reduced assuming the Data Model exists on the [0,1] interval so that N = 1 and x1 = xN = 1.
Since the IFS process is based on affine transformations, the xi-1 value for the previous point that was transformed into
the current point xi can be determined geometrically. Using the same f(x) above, the value yi-1 at previous xi-1 is
determined and transformed into the needed (xi ,yi) pair. Recasting (8) using the coefficients from (16) results in our
fractal operator on yi, defined as
$$ D_f\{ y_i \} \equiv \left[ \frac{y_n - y_{n-1}}{N} - \frac{(y_{n+1} - y_{n-1})(y_N - y_0)}{N} \right] x_{i-1} + (y_{n+1} - y_{n-1})\, y_{i-1} + y_{n-1} - y_0 (y_{n+1} - y_{n-1}) \quad (17) $$
where N is the number of points in the original process Data Model and vanishes if N = 1, y0 the value of the Data Model
at x = 0, yN the value at x = N, i the index associated with the (x,y) pair that is transformed into the needed function pair,
and n the index of the straight line segment connecting the original Data Model points defining the interval containing
the required xi.
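Read this way, (17) reduces to a one-line affine map; a sketch (our naming, not the authors' code) for equally spaced support vectors with x0 = 0 and xN = N:

```python
# Sketch of the fractal operator D_f in (17) for equally spaced support
# vectors y[0..N] with x0 = 0 and xN = N. `n` is the segment index and
# (x_prev, y_prev) the previous point being transformed. Clamping the
# pseudo-derivative index at the last segment is our assumption.
def fractal_operator(y, n, x_prev, y_prev):
    N = len(y) - 1
    d = y[min(n + 1, N)] - y[n - 1]          # pseudo-derivative (13)
    c = ((y[n] - y[n - 1]) - d * (y[N] - y[0])) / N
    f = y[n - 1] - y[0] * d
    return c * x_prev + d * y_prev + f       # (17): c*x + d*y + f
```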
2.5 Turlington polynomials
We propose obtaining fiducial yi values from a polynomial representation of the fBm macro-trend. This equation is
evaluated at any yi with associated xi. Chebyshev polynomials are an orthogonal model. However, this also means extra
terms cannot be added to the model equation as new data points are sampled. Conversely, the Turlington polynomial11
defined as
$$ T(x) = Ax + B + \sum_i C_i \log_{10}\!\left( \frac{10^{(x - x_i)/c_i}}{1 + 10^{(x - x_i)/c_i}} \right) + \sum_m D_m \log_{10}\!\left( 1 + 10^{(x - x_m)/d_m} \right) \quad (18) $$
where coefficients A and B are the slope and y intercept of the reference asymptote, ci and dm are the asymptote transition regions that capture major changes in the first derivative of the data, xi and xm are the intersection points between two neighboring asymptotes, and Ci and Dm are the slopes of the asymptotes multiplied by the transition regions. Determination of these coefficients is discussed in more detail in Reference 11.
Turlington polynomials are built in a piecewise fashion relative to an arbitrary reference asymptote. Terms can be
appended to the polynomial as new trajectory points occur. Turlington polynomials are constructed by assuming one of
the segments connecting neighboring points in the data set is the reference asymptote. The largest derivative location in
the data set is usually chosen for the reference asymptote; however, choice of the reference asymptote is arbitrary and
will be selected as the first segment in our application as shown below.
Although this function is built from the Data Model data points (support vectors) piecewise, the final equation
representation is both continuous and differentiable. Unlike spline based methods, one term is added to the equation for
each of the sub-sampled data points used in its construction, and because of the logarithmic nature of the equation, new
terms can be appended for new data points without changing the T(x) values at previously sampled x locations.
The analytical derivative of the Turlington polynomial is
$$ \frac{dT}{dx} = A + \sum_i \frac{C_i}{c_i \left( 1 + 10^{(x - x_i)/c_i} \right)} + \sum_m \frac{D_m}{d_m} \cdot \frac{10^{(x - x_m)/d_m}}{1 + 10^{(x - x_m)/d_m}} \quad (19) $$
Real time data applications assume the first data segment collected is the reference asymptote, causing the i terms in (18) to vanish. This results in the simplified form
$$ T(x) = Ax + B + \sum_m D_m \log_{10}\!\left( 1 + 10^{(x - x_m)/d_m} \right) \quad (20) $$
and
$$ \frac{dT}{dx} = A + \sum_m \frac{D_m}{d_m} \cdot \frac{10^{(x - x_m)/d_m}}{1 + 10^{(x - x_m)/d_m}}. \quad (21) $$
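To make the construction concrete, here is a minimal sketch of the reduced form (20). The fixed transition width d and the rule D_m = d × (slope change), which makes dT/dx step by the slope change as each ramp saturates, reflect our reading of the construction in Reference 11, not code from it:

```python
import math

# Sketch of the reduced Turlington form (20): a reference line plus one
# smoothed-ramp term per interior sampled point. D_m = d * (slope change)
# so that dT/dx steps by the slope change once the ramp saturates -- our
# reading of Reference 11's construction.
def turlington(xs, ys, d=0.01):
    A = (ys[1] - ys[0]) / (xs[1] - xs[0])    # reference asymptote slope
    B = ys[0] - A * xs[0]                    # reference asymptote intercept
    terms = []
    for m in range(1, len(xs) - 1):
        slope_next = (ys[m + 1] - ys[m]) / (xs[m + 1] - xs[m])
        slope_prev = (ys[m] - ys[m - 1]) / (xs[m] - xs[m - 1])
        terms.append((xs[m], d * (slope_next - slope_prev)))
    def T(x):
        acc = A * x + B
        for xm, Dm in terms:
            acc += Dm * math.log10(1 + 10 ** ((x - xm) / d))
        return acc
    return T
```

Because each term tends to zero well left of its transition point, appending a term for a new data point leaves T(x) essentially unchanged at previously sampled x locations, as described above.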
By taking successive derivatives on (20) and collecting like terms, it can be shown that the nth order derivative of the
reduced form of the Turlington polynomial given in (20) is
$$ \frac{d^n T}{dx^n} = A^{(n)} + \sum_m \frac{D_m}{d_m^{\,n}} (\ln 10)^{n-1} \sum_{i=1}^{n} \psi_{n,i}\, (-1)^{i+1}\, \frac{10^{\,i (x - x_m)/d_m}}{\left( 1 + 10^{(x - x_m)/d_m} \right)^{i}}, \qquad \psi_{n,i} = \sum_{\substack{j = i-1 \\ j \neq 0,\; j \neq n}}^{i} j\, \psi_{n-1,j} \quad (22) $$
where A(1) = A and A(n) = 0 for all n > 1, and ψn,i is defined for n > 1 with ψ1,1 = 1. This results in the identity
$$ \underbrace{\int \cdots \int}_{n} \frac{d^n T}{dx^n}\; \underbrace{dx \cdots dx}_{n} = Ax + B + \sum_m D_m \log_{10}\!\left( 1 + 10^{(x - x_m)/d_m} \right). \quad (23) $$
Source code listed in the Appendix contains a program for calculating both the Turlington polynomial for a given data
set and the nth order derivative using the formulation shown in (22). This source code takes as input a data file name, the
number of points in the file, and the order of the derivative (0 gives T(x) instead of a derivative). Output is a source
listing that can be interrogated at any given x value on the interval [0,1], with 0 representing the input data starting point
and 1 the ending point. A loop can be added to generate multiple data points on the [0,1] interval.
2.6 Fractal operator (Df) on Turlington polynomials
Once the beginning and ending x values for the nth IFS transform are determined, T(x) is used to determine y values for
each end point and yi-1 for the nth segment transformation. With this knowledge, it is only necessary to perform a single
IFS transform on (xi-1 ,yi-1) to find the desired value for (xi ,yi), which leads to the functional form of the fractal operator
on T(x) as
$$ f(x) = D_f\{ T(x) \}. \quad (24) $$
Fig. 1. Comparison of fractal operator derived function with original (curves shown: Original, IFS, T(x), Df{T(x)}, and Diff Eq over the interval 0-255).
For the fractal operator method, the pseudo-derivative dn yields a lower bound value because it is derived from T(x)
rather than the raw data. Fig. 1 shows a comparison of this method with IFS and the original. The polynomial model
does not contain the range of fine structure that is captured in sub-sampling the raw data set. If the application requires,
the dn in (13) can be scaled up by σ(f(x))/σ(T(x)), the ratio of the standard deviation of the original data to the standard
deviation of T(x). These standard deviations can be calculated recursively on the data in real time as it occurs using (25)
where µ is the mean and σ is the standard deviation12.
$$ \mu_{i+1} = \frac{N_i \mu_i + y_{i+1}}{N_{i+1}}, \qquad \sigma_{i+1}^2 = \frac{N_i \sigma_i^2 + \left( y_{i+1} - \mu_{i+1} \right)^2}{N_{i+1}}. \quad (25) $$
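A running-update sketch in the spirit of (25); we use the exact incremental (Welford-style) variance form, in which the squared deviation is replaced by the product of deviations from the old and new means so the recursion reproduces the population variance exactly:

```python
# Incremental mean/variance update in the spirit of (25). Using
# (y - mu_i)(y - mu_{i+1}) instead of a plain square (a Welford-style form,
# our substitution) makes the recursion exact for the population variance.
def update_stats(n, mu, var, y_new):
    """Absorb y_new into running (count, mean, variance)."""
    n1 = n + 1
    mu1 = (n * mu + y_new) / n1
    var1 = (n * var + (y_new - mu) * (y_new - mu1)) / n1
    return n1, mu1, var1
```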
3. EXTRAPOLATION
Extrapolation is used to determine values at locations outside the neighborhood of all prior data points. This is akin to
evaluating a function beyond its original extent. This process can use knowledge of either the local or global
neighborhood. Extrapolation functions take the form of linear or smoothed trend estimates. Seldom is fine structure
preserved.
3.1 Linear prediction
Linear prediction is an autoregressive model that is linear in the sum of its terms. It performs extrapolation on data sets
that are equally spaced and stationary. Linear prediction is especially successful in extrapolating signals that are smooth
and oscillatory, though not necessarily periodic. Linear prediction uses N consecutive values to predict the (N+1)th value. Linear prediction is defined as
$$ y_m = \sum_{j=1}^{N} d_j\, y_{m-j} + x_m, \quad (26) $$
where the dj coefficients are the weighting factors in the linear combination of the N consecutive ym values, and xm is the discrepancy between the predicted and actual values. For linear prediction, the xm term is assumed to be zero. The number of previous points N to use in the calculation is a user specified input. Measures must also be taken with linear prediction to ensure stability. The stability condition is given by the characteristic equation
$$ z^N - \sum_{j=1}^{N} d_j\, z^{N-j} = 0, \qquad |z| \le 1, \quad (27) $$
where all roots must be inside of the unit circle. When solving (26) above for the linear prediction coefficients, there is
no guarantee that the roots will fall within the unit circle13.
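A least-squares sketch of (26) with a stability check in the spirit of (27); fitting the dj by least squares over the training run is our choice of estimator here (Reference 13 covers others), and nothing below is the authors' code:

```python
# Sketch of linear prediction (26): fit the d_j by least squares via the
# normal equations, then check stability per (27) for the order-1 case.
def fit_lp(series, order):
    """Least-squares fit of the d_j coefficients over the whole series."""
    rows = [series[m - order:m][::-1] for m in range(order, len(series))]
    b = series[order:]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(order)]
           for i in range(order)]
    atb = [sum(r[i] * t for r, t in zip(rows, b)) for i in range(order)]
    return solve(ata, atb)

def solve(a, b):
    """Tiny Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= factor * a[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x

def is_stable_order1(d):
    # for N = 1 the characteristic root of (27) is simply z = d_1
    return abs(d[0]) <= 1.0

def predict_next(series, d):
    return sum(dj * series[-1 - j] for j, dj in enumerate(d))
```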
3.2 Novel fractal algorithm
Because IFS are contractive on the unit interval, they do not extrapolate. In order to use them for extrapolation while
maintaining contractivity, a new support vector outside of the original transform intervals is introduced. This is
accomplished using linear prediction or by appending the first point of the data set to the end of the data set according to
$$ x_{N+1} = x_N + (x_{i+1} - x_i), \qquad y_{N+1} = y_i, \qquad i = 1, 2, \ldots, N \quad (28) $$
to form the extrapolation interval. New IFS transform coefficients using (12) are calculated, and the data sequence is
reconstructed to include the newly added support vector. Only the points generated in the first ½ of the extrapolation
interval are kept as extrapolated points. This process is repeated by increasing i in (28) and appending the next point in
the original data set to the end of the saved extrapolated points to create a new extrapolation interval. The reconstruction
process is continued until all of the original data points have been used, and can be repeated as many times as necessary
to generate the desired length forecast9.
3.3 Differential equation approach
Polynomial models of data are valid across the range of the original data set. Differential equation models do not suffer
this constraint because they are comprised of continuous functions that can model the dynamic process across a much
longer time projection than the original interval. If the underlying differential equation governing a fBm process can be
determined, the differential equation can then be used to determine the value of the function even for cases that fall
outside of the original boundaries.
Using Data Modeling, we have developed a simple method for determining a differential equation for a given fBm
process. We begin by assuming that the process can be modeled by a first order differential equation of the form
$$ \frac{dx}{dt} = f(t) \quad (29) $$
where t is the independent variable and f(t) is the fBm process. Since differential equation based processes are comprised
of continuous differentiable functions, we first build a T(x) of the form in (20) for the function f(t). Once T(x) is determined, its first derivative T'(x) can then be obtained from (21).
Data Relations Modeling is then used to discover the functional relationship between the first derivative and its forcing
functions. Data Relations Modeling is the process of finding a mathematical expression that provides a good fit between
given finite sample values of the independent variables and the associated values of the dependent variables of the
process. The mathematical expression that fits the given sample of data is called a Data Model. This process involves
finding both the functional form of the Data Model and the numeric coefficients for the Data Model. The Data Relations
Modeling process of discovering automatic functions is an evolutionary and genetic directed approach where the set
elements consist entirely of simple polynomial building blocks and is described in detail in References 14-17.
The final form of the Data Relations Modeling equations and the polynomial order are derived from an approximation to
the Kolmogorov-Gabor polynomial
$$ \phi = a_0 + \sum_i a_i x_i + \sum_i \sum_j a_{ij} x_i x_j + \sum_i \sum_j \sum_k a_{ijk} x_i x_j x_k + \cdots \quad (30) $$
represented by
$$ x(t) = f\big( t, x(b_1(t)), x(b_2(t)), \ldots, x(b_n(t)) \big), \qquad O[x(t)] = 3^n \quad (31) $$
which is a form of an orthogonal nested polynomial network comprised of low order basis functions. In (31), n
represents the number of layers in the final Data Model and x(bi(t)) the inputs mapped from the previous layer. Data
Model polynomials on the order of ~O[3^10] have been generated within minutes using only 3rd order polynomial basis
functions. Data Models adequately capture high degrees of process nonlinearities but still execute in real-time14,15,16,17.
In order to perform Data Relations Modeling, it is necessary to select the form of the forcing functions driving T'(x). If specific forms of the forcing functions are known, they can be specified as inputs into the Data Relations Modeling process. If the form is unknown, Data Relations Modeling uses eigenfunctions, or cosine and sine terms derived from the dominant frequency components, as the model of the fBm trajectory forcing functions.
To find the dominant frequency components, eigenanalysis is used to identify the eigenvalues of T’(x). These
eigenvalues correspond to the dominant peaks in the power spectra and are determined in ranked order of contribution.
However, the process of determining eigenvalues is complicated through the use of linear algebra methods to ensure that
the matrix is Hermitian, normal, yields real roots, and in most cases is symmetric. If the matrix does not exhibit these
properties, it must be transformed. Since a great deal of care and preprocessing steps may potentially need to be taken in
applying this method, the authors chose to use the simple peak identification method listed below.
We propose a variation of Wiener filtering to identify individual peaks in the dB power spectrum. This is achieved by first performing regression to fit a straight line to the entire dB power spectrum, resulting in a slope and a y intercept. The value of the y intercept is the noise floor for the entire spectrum and is illustrated in Fig. 2. Once the noise floor is determined, dB power spectrum values that fall below the noise floor can be zeroed and an inverse Fourier transform performed. Maximum entropy methods can then be used to accurately determine the locations of peaks in the power spectrum and the corresponding number of cycles for each peak. This method was avoided in this work because it can also lead to the enhancement of spurious peaks above the noise floor.
Peaks are identified by slope changes using the 1st derivative of the dB power spectrum (dPSD). Zero values in the dPSD correspond to minima and maxima occurring one point earlier in the power spectrum. For maxima, the derivative value immediately before the zero crossing is positive and the derivative value after the crossing is negative. Once flagged as a maximum, the value in the dB power spectrum is checked to ensure that the maximum is above the noise floor. This process is demonstrated in Fig. 2, where the graph on the left shows the linear fit and noise floor superimposed onto the power spectrum, and the graph on the right shows the dPSD used in identifying maxima.
Fig. 2. Identification of maxima values above noise floor in power spectrum. (Left panel: dB power spectrum with the linear fit and noise floor superimposed; right panel: dB power spectrum derivative, where PSD peaks correspond with a negative slope at the zero crossing.)
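The noise-floor peak picker described above can be sketched in a few lines (all names and the example spectrum are ours):

```python
# Sketch of the noise-floor peak picker: fit a line to the dB spectrum by
# least squares, take its y intercept as the noise floor, then flag maxima
# where the derivative goes positive -> negative above the floor.
def find_peaks_above_floor(psd_db):
    n = len(psd_db)
    fm = (n - 1) / 2.0                        # mean of bin indices 0..n-1
    ym = sum(psd_db) / n
    slope = (sum((k - fm) * (v - ym) for k, v in enumerate(psd_db))
             / sum((k - fm) ** 2 for k in range(n)))
    floor = ym - slope * fm                   # y intercept = noise floor
    dpsd = [psd_db[k + 1] - psd_db[k] for k in range(n - 1)]
    peaks = [k for k in range(1, n - 1)
             if dpsd[k - 1] > 0 and dpsd[k] < 0 and psd_db[k] > floor]
    return floor, peaks
```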
Cosine and sine terms as shown in the fractal operator and differential equation source code in the Appendix are
generated for each maximum. Care is taken to ensure that the number of cycles generated by each cosine and sine term
across the number of points used as input to Data Relations Modeling is the same as that represented by each dB power
spectra maxima. Simply adding these terms together and using a differential equation solver does not give the
solution to the differential equation. As shown in Section 4, nonlinear coupling exists between individual cosine
and sine terms, and the proper weighting coefficients for individual coupled terms must be generated14,15.
Since Data Relations Modeling scales as a nested O[3^n] process, the resulting nonlinear differential equation may have hundreds of terms associated with it. This results in the autonomously generated Data Driven Differential Equation
$$ \frac{dx}{dt} = f\big( t, x(b_1(t)), x(b_2(t)), \ldots, x(b_n(t)) \big) \quad (32) $$
Once Data Relations Modeling has found the Data Driven Differential Equation shown in (32), it can be solved using
numerical integration. Now, a solution of the differential equation can be found for independent variable values outside
of the original range. When this differential equation is coupled with the fractal operator Df, fBm dynamic processes are
modeled as
$$ x(t) \cong D_f\{ f(t) \}. \quad (33) $$
Here f(t) is the Runge-Kutta solution to Equation (32) above. Although this method can be slow, the process of integration can be sped up by saving intermediate terms. And if the application merits this approach, Turlington polynomials for Df{T(x)} can be used directly, bypassing the need for integration.
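A minimal fourth-order Runge-Kutta sketch for integrating a Data Driven Differential Equation of the form dx/dt = f(t, x); the right-hand side in the test is a stand-in of ours, not the generated Data Model:

```python
# Classic fourth-order Runge-Kutta step and driver for dx/dt = f(t, x),
# as used to solve the Data Driven Differential Equation (32).
def rk4_step(f, t, x, h):
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h * k1 / 2)
    k3 = f(t + h / 2, x + h * k2 / 2)
    k4 = f(t + h, x + h * k3)
    return x + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, t0, x0, h, steps):
    """Propagate from (t0, x0) and return the full trajectory."""
    t, x, out = t0, x0, [x0]
    for _ in range(steps):
        x = rk4_step(f, t, x, h)
        t += h
        out.append(x)
    return out
```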
4. EXAMPLE
As an example, this Data Modeling process was used to determine the differential equation for a specific fBm data set, and the resulting model was extrapolated. Envision a continuous forcing function defined as
$$ f(x) = \int \sin\!\left( x e^{x} \right) dx, \quad (34) $$
which is the integral of the Sxe function9,15. This function generates a fBm data set as shown in Fig. 3, where the data set
is 512 points in length, and the Data Differential Equation Modeling process was performed using the first 256 points of
the data set and leaving out the remaining 256 points for comparison with the extrapolation results18.
A variance based sub-sampling method was applied to the data set to determine the support vectors necessary for Data
Modeling. This optimal sampling method first sub-samples 10% of the data points evenly and uses straight-line
segments to interpolate between the points. The root sum squared (RSS) difference between the interpolated and
original curves is generated, and is declared to be the baseline. Next, the data set is sub-sampled and reconstructed
using the methods described above in Section 2.2, beginning at 5% and iteratively increasing the sub-sampling
percentage. When the RSS difference between the reconstructed and original is less than that calculated using the
straight-line segments above, the optimal sub-sampling rate is identified.
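A sketch of the baseline step of this search, with straight-line reconstruction standing in for the full IFS rebuild of Section 2.2 (function names are ours):

```python
import math

# Baseline step of the variance-based sub-sampling search: interpolate
# straight-line segments through every `step`-th point, then measure the
# root sum squared (RSS) difference against the original.
def rss(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def linear_reconstruct(data, step):
    idx = list(range(0, len(data), step))
    if idx[-1] != len(data) - 1:
        idx.append(len(data) - 1)            # always keep the last point
    out = []
    for k in range(len(idx) - 1):
        i0, i1 = idx[k], idx[k + 1]
        for j in range(i0, i1):
            t = (j - i0) / (i1 - i0)
            out.append(data[i0] + t * (data[i1] - data[i0]))
    out.append(data[-1])
    return out
```

In the full procedure, the RSS of this baseline is compared against the IFS reconstruction at increasing sub-sampling percentages until the IFS error drops below the straight-line baseline.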
For the data set shown on the left of Fig. 3, the optimal sampling rate was found to be 5:1. Every 5th data point was sub-
sampled from the data set, and T(x) and T’(x) generated from Equations (20) and (21). This T’(x) represents the
differential equation, and is the output in the Data Relations Modeling process.
Fig. 3. fBm data set generated using the Sxe function (points 0-255 were used to build the differential equation; points 256-511 were held back to test extrapolation).
Since no a priori knowledge exists for the form of the forcing functions, cosine and sine terms were generated for the
dominant frequency components. Fig. 2 shows the process of identifying nine peaks corresponding to 2, 6, 9, 12, 15, 18,
20, 23, and 26 cycles for this data set. A total of 18 separate inputs (9 cosine and 9 sine) were generated for use in Data Relations Modeling, with each term's argument formed by multiplying the number of cycles above by 2πt. The equations for
these 18 terms are listed in the fractal operator and differential equation source code in the Appendix of this paper, and
correspond to
$$ \frac{dx}{dt} = f\big( \cos(4\pi t), \cos(12\pi t), \cos(18\pi t), \cos(24\pi t), \cos(30\pi t), \cos(36\pi t), \cos(40\pi t), \cos(46\pi t), \cos(52\pi t), $$
$$ \sin(4\pi t), \sin(12\pi t), \sin(18\pi t), \sin(24\pi t), \sin(30\pi t), \sin(36\pi t), \sin(40\pi t), \sin(46\pi t), \sin(52\pi t) \big) \quad (35) $$
Using Data Relations Modeling, the functional relationship between these 18 inputs and the output was calculated.
Although it was possible that coupled terms in the final Data Model could reach orders as high as O[3^4] = 81, this case
yielded an acceptable solution using only the coupling of 3 forcing functions in any single term. A comparison of the
Data Model based first derivative and the Turlington derivative is shown in Fig. 4.
A fourth order Runge-Kutta differential equation solver was then applied to the Data Relations Model to generate the
macro-trend based on the differential equation. The fractal operator Df was then applied, resulting in the addition of the
micro-trend.
Fig. 4. Turlington derivative and Data Model (original versus Data Model over points 0-255; RSS difference = 3.4%).
The Runge-Kutta process was allowed to continue to propagate for a full interval in the x-axis past the end of the data
used for generating the differential equation model. The results of this are shown in Fig. 5, including a comparison of
this extrapolation method and linear prediction. Source code to generate this function is given in the Appendix under the
fractal operator and differential equation heading. Removing the apostrophe in line 3 of the source generates the first ½
of the top curve (Df) in Fig. 5 along with the differential equation output, Df{T(x)} and T(x). Leaving the apostrophe in
place forces single point f(x) evaluation at the command line specified x value, and changing 255 to 511 in line 22 allows
the program to generate the curves on the full intervals shown in Fig. 5.
Fig. 5. Final results showing comparison of Data Driven Differential Equation Modeling and linear prediction (original, differential equation, and linear prediction extrapolations over points 0-511; RSS differences of 25% and 26%).
Using the differential equation approach combined with the fractal operator Df generated a good estimate of the behavior
exhibited by the fBm process. This can be seen by comparing the graphs on the right of Fig. 5. The differential equation
method yielded a good filtered estimate of the fBm process, and can enable physics based modeling by the coupling of
physics based forcing functions to generate new fBm trajectories. On the other hand, the Turlington polynomial
approach yields real time solutions of a particular dynamic system configuration. The novel fractal operator predictably
generates model behavior fine structure in a functional form. All these tools are available for the investigator’s use and
discretion.
ACKNOWLEDGMENTS
The authors would like to thank Marvin Carroll, Tec-Masters, Level-13, Licht Strahl Engineering INC, and Kristi
Jaenisch and Technical Dive College for the use of the MAJQANDA algorithm suite during the course of this work.
REFERENCES
1. Handley, J. On The Existence of a Transition Point in Sampled Data using Fractal Methods. Ann Arbor, MI: UMI,
1995.
2. Zwillinger, D. Handbook of Differential Equations, 3rd edition. San Diego, CA: Academic Press, 1998.
3. Dubkov, A. B. Spagnolo, and N. Novgorod. “From theory of infinitely divisible distributions to derivation of
generalized master equation for Markov process”, Proceedings of SPIE: Noise in Complex Systems and Stochastic
Dynamics. June 2-4, 2003. Santa Fe, NM.
4. Soize, C. The Fokker-Planck Equation for Stochastic Dynamical Systems and Its Explicit Steady State Solutions.
Singapore: World Scientific, 1994.
5. Cherbit, G. Fractals: Non-integral Dimensions and Applications. New York: Wiley, 1991.
6. Massopust, P. Space Curves Generated By Iterated Function Systems. Ann Arbor, MI: UMI, 1986.
7. Scheinerman, E. Invitation to Dynamical Systems. Upper Saddle River, NJ: Prentice-Hall, 1996.
8. Elton, J. “An Ergodic Theorem for Iterated Maps,” Journal of Ergodic Theory and Dynamical Systems, 7:481-488
(1987).
9. Jaenisch, H. and J. Handley “Data Modeling of 1/f noise sets”. Proceedings of SPIE: Noise in Complex Systems and
Stochastic Dynamics. June 2-4, 2003. Santa Fe, NM. (CD version contains source code and executable)
10. Davis, G. “Implicit Image Models in Fractal Image Compression”, Invited paper, SPIE Conference on Wavelet
Applications in Signal and Image Processing IV, Denver, August 1996.
11. Turlington, T. Behavioral Modeling of Nonlinear RF and Microwave Devices. Boston, MA: Artech House, 2000.
12. Jaenisch, H. and J. Handley, “Data Modeling for Radar Applications,” Proceedings of the IEEE Radar Conference
2003, May 5-8, 2003. Huntsville, AL.
13. Press, W., S. Teukolsky, W. Vetterling, and B. Flannery. Numerical Recipes in FORTRAN, 2nd edition. Cambridge:
Cambridge University Press, 1992.
14. Jaenisch, H. and J. Handley “Data Modeling of network dynamics”. Proceedings of SPIE: Applications and Science
of Neural Networks, Fuzzy Systems, and Evolutionary Computation VI. August 5-6, 2003. San Diego, CA.
15. Jaenisch, H., J. Handley, K. White, J. Watson, C. Case, and C. Songy. “Virtual prototyping with Data Modeling”,
Proceedings of SPIE, July 7-11 2002. Seattle, WA.
16. Jaenisch, H., J. Handley, C. Case, C. Songy, “Graphics based intelligent search and abstracting using Data
Modeling”, Proceedings of SPIE, July 7-11 2002. Seattle, WA.
17. Jaenisch, H. “Fractal Interpolation for Patching Holes in Data Sets”, Proceedings of the Southeastern Simulation
Conference ’92, Pensacola, FL 1992.
18. Handley, J., H. Jaenisch, C. Bjork, L. Richardson, R. Carruth, “Chaos and Fractal Algorithms Applied to Signal
Processing and Analysis,” Simulation, 60:4, 261-279 (1993).
19. Russ, J. The Image Processing Handbook, Third Edition. Boca Raton, FL: CRC Press, 1999.
APPENDIX
Source listing for generating the Turlington polynomial T(x) and its nth order derivative Tn(x).
DEFCUR A-Z: INPUT "File to process => ", zz$: INPUT "Number of points => ", npts: INPUT "Derivative Order (0 for none) => ", n
CLS : OPEN zz$ FOR INPUT AS #1: OPEN "turling.bas" FOR OUTPUT AS #10
PRINT #10, "defcur a-z": PRINT #10, "cls": PRINT #10, "INPUT "; CHR$(34); "X (0-1) = "; CHR$(34); ", x"
INPUT #1, y0: INPUT #1, y1: y2 = y1: cc = .01: i = 2: m = (y1 - y0) * (npts - 1): ycept = y0
IF n > 0 THEN PRINT #10, "n="; n; ":redim d(n,n):d(1,1)=1": PRINT #10, "for i=2 to n": PRINT #10, " for k=1 to i"
IF n > 0 THEN PRINT #10, " for j=k-1 to k": PRINT #10, " d(i,k)=d(i,k)+d(i-1,j)*j"
IF n > 0 THEN PRINT #10, " next j": PRINT #10, " next k": PRINT #10, "next i"
IF n = 1 THEN PRINT #10, "y="; m ELSE PRINT #10, "y=0": PRINT #10, "y = "; m; " * x + "; ycept
IF n > 0 THEN PRINT #10, "FOR i = 1 TO n"
FOR n1 = 1 TO npts - 2
IF i > 2 THEN : y0 = y1: y1 = y2
INPUT #1, y2: i = i + 1: a = (npts - 1) * (y2 - 2 * y1 + y0) * cc: b = (i - 2) / (npts - 1)
IF n = 0 THEN PRINT #10, "y = y + "; a; " * LOG(1 + 10 ^ ((x - "; b; ") / "; cc; ")) / LOG(10)"
IF n > 0 THEN PRINT #10, " y=y+d(n,i)*((-1)^(i+1))*"; a; "*(10^((x-"; b; ")/"; cc; ")^i)*(log(10)^(n-1))/((("; cc; ")^n)*((1+10^((x-"; b; ")/"; cc; "))^i))"
NEXT n1
IF n > 0 THEN PRINT #10, "NEXT i"
PRINT #10, "print "; CHR$(34); "F("; CHR$(34); "x"; CHR$(34); ") = "; CHR$(34); ",y": PRINT #10, "end": CLOSE : END
Source listing for applying fractal operator Df to T(x) and the Data Driven Differential Equation Model.
DEFCUR A-Z: DECLARE FUNCTION y1@ (x1@) : DECLARE FUNCTION yt@ (x@)
DECLARE SUB rk4 (y@, dydx@, x@, h@, yout@) : DECLARE SUB derivs (xh@, yt@, dym@)
CLS : x = VAL(COMMAND$): i = -1: OPEN "dqdmout" FOR OUTPUT AS #1: ' i = 0
ntotal = 256: nsub = 51: ymin = -5.375: ymax = 7.998
10 : IF x = ntotal THEN nsub = nsub + 1: ntotal = ntotal + ((ntotal - 1) / (nsub - 1))
n = CINT(x * (nsub - 2) / (ntotal - 1)) + 1
istart = CINT(((ntotal - 1) * (n - 1)) / (nsub - 1))
ifinish = CINT(((ntotal - 1) * n) / (nsub - 1))
iend = CINT(((ntotal - 1) * (n + 1)) / (nsub - 1))
xpt = (x - istart) / (ifinish - istart)
ix = (ntotal - 1) * xpt
dfde = ((y1(iend) - ymin) / (ymax - ymin)) - ((y1(istart) - ymin) / (ymax - ymin))
dfde = (((y1(ifinish) - y1(istart)) / (ntotal - 1)) - (dfde * (y1(ntotal - 1) - y1(0)) / (ntotal - 1))) * ix
dfde = dfde + (((y1(iend) - ymin) / (ymax - ymin)) - ((y1(istart) - ymin) / (ymax - ymin))) * y1(ix) + y1(istart)
dfde = dfde - y1(0) * (((y1(iend) - ymin) / (ymax - ymin)) - ((y1(istart) - ymin) / (ymax - ymin)))
x1 = (x * 1.1765 - CINT(6 * INT(x * 1.1765 / 6))) / (CINT(6 * (INT(x * 1.1765 / 6) + 1)) - CINT(6 * INT(x * 1.1765 / 6)))
dftx = ((yt((INT(x * 1.1765 / 6) + 2) * 2) - yt(2 * INT(x * 1.1765 / 6))) / 2) * yt(x1 * 100) * 2
dftx = dftx + yt(2 * INT(x * 1.1765 / 6)) - .4062 * (yt((INT(x * 1.1765 / 6) + 2) * 2) - yt(2 * INT(x * 1.1765 / 6)))
dftx = dftx + x1 * (yt((INT(x * 1.1765 / 6) + 1) * 2) - yt(2 * INT(x * 1.1765 / 6)))
dftx = dftx + x1 * (-(yt((INT(x * 1.1765 / 6) + 2) * 2) - yt(2 * INT(x * 1.1765 / 6))) * -.3029)
dftx = dftx * 13.373 + -5.375: tx = yt(x * 1.1765 / 3) * 13.373 + -5.375: de = y1(x)
PRINT #1, de, dfde, tx, dftx: IF i = 0 AND x < 255 THEN : x = x + 1: GOTO 10
CLOSE : END
SUB derivs (xh, yt, dym)
pi = 4@ * ATN(1@)
cos12 = (COS(24 * pi * xh) - .0033) / .6381: cos15 = (COS(30 * pi * xh) - .0033) / .6329: cos18 = (COS(36 * pi * xh) - .0033) / .6382
cos2 = (COS(4 * pi * xh) - .0033) / .6379: cos6 = (COS(12 * pi * xh) - .0033) / .6382: sin18 = SIN(36 * pi * xh) / .6337
cos20 = (COS(40 * pi * xh) - .0033) / .6392: cos23 = (COS(46 * pi * xh) - .0033) / .6378: cos26 = (COS(52 * pi * xh) - .0033) / .6379
cos9 = (COS(18 * pi * xh) - .0033) / .6377: sin12 = SIN(24 * pi * xh) / .6337: sin15 = SIN(30 * pi * xh) / .6293
sin2 = SIN(4 * pi * xh) / .6344: sin20 = SIN(40 * pi * xh) / .6322: sin23 = SIN(46 * pi * xh) / .6345: sin26 = SIN(52 * pi * xh) / .6344
sin6 = SIN(12 * pi * xh) / .6337: sin9 = SIN(18 * pi * xh) / .6343
l1o4 = -.0717 - .339 * sin2 - .5038 * cos6 - .2515 * cos9 + .1124 * sin2 * sin2 + .41 * cos6 * cos6 - .4611 * cos9 * cos9
l1o4 = l1o4 + .2167 * sin2 * cos6 - .0173 * sin2 * cos9 + .113 * cos6 * cos9 - .0825 * sin2 * cos6 * cos9
l1o4 = l1o4 + .4874 * sin2 * sin2 * sin2 + .3393 * cos6 * cos6 * cos6 + .1518 * cos9 * cos9 * cos9
l1o4 = l1o4 - .2052 * cos6 * sin2 * sin2 + .0462 * sin2 * cos6 * cos6 - .1794 * sin2 * cos9 * cos9
l1o4 = l1o4 + .1666 * cos9 * sin2 * sin2 - .0299 * cos9 * cos6 * cos6 - .1123 * cos6 * cos9 * cos9
l1o6 = .2828 - .6034 * sin9 - .7482 * sin6 + 1.2483 * cos2 + .198 * sin9 * sin9 - .3128 * sin6 * sin6
l1o6 = l1o6 - .1186 * cos2 * cos2 - .0887 * sin9 * sin6 + .3034 * sin9 * cos2 + .1787 * sin6 * cos2 - .2136 * sin9 * sin6 * cos2
l1o6 = l1o6 + .0484 * sin9 * sin9 * sin9 - .0084 * sin6 * sin6 * sin6 - .4628 * cos2 * cos2 * cos2 + .0119 * sin6 * sin9 * sin9
l1o6 = l1o6 - .0167 * sin9 * sin6 * sin6 + .1476 * sin9 * cos2 * cos2 - .1378 * cos2 * sin9 * sin9 - .087 * cos2 * sin6 * sin6
l1o6 = l1o6 + .3653 * sin6 * cos2 * cos2: l1o5 = -.0117 + .0189 * sin2 + .089 * sin9 + .023 * cos2
l1o5 = l1o5 - .0331 * sin2 * sin2 + .1918 * sin9 * sin9 - .1483 * cos2 * cos2 + .1793 * sin2 * sin9
l1o5 = l1o5 - .045 * sin2 * cos2 + .3006 * sin9 * cos2 - .221 * sin2 * sin9 * cos2 + .1851 * sin2 * sin2 * sin2
l1o5 = l1o5 + .0263 * sin9 * sin9 * sin9 - .0039 * cos2 * cos2 * cos2 - .2651 * sin9 * sin2 * sin2 + .1796 * sin2 * sin9 * sin9
l1o5 = l1o5 - .2937 * sin2 * cos2 * cos2 + .4368 * cos2 * sin2 * sin2 - .1382 * cos2 * sin9 * sin9 - .1277 * sin9 * cos2 * cos2
l1o1 = -.544 - .5585 * sin2 + .3854 * cos6 + .3821 * sin15 + .1043 * sin2 * sin2 + .3103 * cos6 * cos6
l1o1 = l1o1 + .0265 * sin15 * sin15 + .2171 * sin2 * cos6 - .0165 * sin2 * sin15 - .247 * cos6 * sin15
l1o1 = l1o1 + .1974 * sin2 * cos6 * sin15 + .4941 * sin2 * sin2 * sin2 - .2536 * cos6 * cos6 * cos6 + .024 * sin15 * sin15 * sin15
l1o1 = l1o1 - .2097 * cos6 * sin2 * sin2 + .0484 * sin2 * cos6 * cos6 - .0139 * sin2 * sin15 * sin15 - .2107 * sin15 * sin2 * sin2
l1o1 = l1o1 - .0523 * sin15 * cos6 * cos6 + .0563 * cos6 * sin15 * sin15
l1o2 = .1908 - .8303 * sin2 - .3383 * sin9 + 1.9176 * cos6 + .1123 * sin2 * sin2 - .6608 * sin9 * sin9
l1o2 = l1o2 + .3892 * cos6 * cos6 + .1806 * sin2 * sin9 + .2171 * sin2 * cos6 + .0592 * sin9 * cos6
l1o2 = l1o2 + .0139 * sin2 * sin9 * cos6 + .5024 * sin2 * sin2 * sin2 + .0399 * sin9 * sin9 * sin9 - 1.1168 * cos6 * cos6 * cos6
l1o2 = l1o2 + -.1353 * sin9 * sin2 * sin2 + .1881 * sin2 * sin9 * sin9 + .0512 * sin2 * cos6 * cos6 - .2117 * cos6 * sin2 * sin2
l1o2 = l1o2 + .1073 * cos6 * sin9 * sin9 + .0649 * sin9 * cos6 * cos6
l1o3 = .4092 * sin2 - .3528 * sin9 - .2975 * sin6 - .2772 * cos6 + .2512 * cos12 + .2035 * cos9 - .174 * sin20
l1o3 = l1o3 + -.1572 * cos18 + .1487 * cos15 - .1487 * cos23 + .1073 * sin15 + .114 * cos2 - .0981 * sin12
l1o3 = l1o3 + -.0834 * sin26 + .0472 * cos26 + .0421 * cos20 - .0275 * sin23 + .0245 * sin18
l2o3 = -.0003 + .3299 * l1o1 + .1325 * cos9 - .1543 * cos23 - .1321 * sin12 - .1084 * sin26 + .0923 * cos15
l2o3 = l2o3 + .1066 * sin15 + .0614 * cos26 - .0357 * sin23 - .1131 * sin20 + .0288 * sin18 + .1001 * cos18
l2o3 = l2o3 + -.093 * cos2 - .0263 * cos20 - .0748 * cos12 + .1838 * cos6 - .0479 * sin9 + .2072 * sin6
l2o3 = l2o3 + -.1522 * sin2 + .6088 * l1o6 + .0278 * l1o2 + .3372 * l1o5 - .2977 * l1o3 + .6479 * l1o4
dym = l2o3 * 3.7356 - .3038: END SUB
SUB rk4 (y, dydx, x, h, yout)
hh = h * .5: h6 = h / 6: xh = x + hh: yt1 = y + hh * dydx: CALL derivs(xh, yt1, dyt): yt1 = y + hh * dyt: CALL derivs(xh, yt1, dym)
yt1 = y + h * dym: dym = dyt + dym: CALL derivs(x + h, yt1, dyt): yout = y + h6 * (dydx + dyt + 2 * dym): END SUB
FUNCTION y1 (x1)
y = 0@: x = 0@: dydx = 1.1571: h = 1 / 255
FOR i = 1 TO x1: CALL rk4(y, dydx, x, h, yout): x = x + h: y = yout: CALL derivs(x, y, dydx): NEXT i: y1 = yout * 13.373: END FUNCTION
FUNCTION yt (x) : y0 = .0116 * x + .4062
y0 = y0 + -.0007 * LOG(1 + 10 ^ (x - 2)) + -.0529 * LOG(1 + 10 ^ (x - 4)) + .0778 * LOG(1 + 10 ^ (x - 6))
y0 = y0 + -.0127 * LOG(1 + 10 ^ (x - 8)) + .0057 * LOG(1 + 10 ^ (x - 10)) + -.0192 * LOG(1 + 10 ^ (x - 12))
y0 = y0 + .0243 * LOG(1 + 10 ^ (x - 14)) + -.0491 * LOG(1 + 10 ^ (x - 16)) + .0596 * LOG(1 + 10 ^ (x - 18))
y0 = y0 + -.0763 * LOG(1 + 10 ^ (x - 20)) + .0274 * LOG(1 + 10 ^ (x - 22)) + .0037 * LOG(1 + 10 ^ (x - 24))
y0 = y0 + -.0014 * LOG(1 + 10 ^ (x - 26)) + .0296 * LOG(1 + 10 ^ (x - 28)) + .0177 * LOG(1 + 10 ^ (x - 30))
y0 = y0 + -.0268 * LOG(1 + 10 ^ (x - 32)) + -.0704 * LOG(1 + 10 ^ (x - 34)) + .0091 * LOG(1 + 10 ^ (x - 36))
y0 = y0 + .0333 * LOG(1 + 10 ^ (x - 38)) + .05 * LOG(1 + 10 ^ (x - 40)) + -.0152 * LOG(1 + 10 ^ (x - 42))
y0 = y0 + -.0289 * LOG(1 + 10 ^ (x - 44)) + .013 * LOG(1 + 10 ^ (x - 46)) + .0116 * LOG(1 + 10 ^ (x - 48))
y0 = y0 + -.0414 * LOG(1 + 10 ^ (x - 50)) + .0419 * LOG(1 + 10 ^ (x - 52)) + -.002 * LOG(1 + 10 ^ (x - 54))
y0 = y0 + .0003 * LOG(1 + 10 ^ (x - 56)) + .0201 * LOG(1 + 10 ^ (x - 58)) + -.0488 * LOG(1 + 10 ^ (x - 60))
y0 = y0 + .0408 * LOG(1 + 10 ^ (x - 62)) + -.0013 * LOG(1 + 10 ^ (x - 64)) + -.016 * LOG(1 + 10 ^ (x - 66))
y0 = y0 + -.0098 * LOG(1 + 10 ^ (x - 68)) + -.0348 * LOG(1 + 10 ^ (x - 70)) + .0376 * LOG(1 + 10 ^ (x - 72))
y0 = y0 + .0334 * LOG(1 + 10 ^ (x - 74)) + -.0432 * LOG(1 + 10 ^ (x - 76)) + -.0107 * LOG(1 + 10 ^ (x - 78))
y0 = y0 + -.0038 * LOG(1 + 10 ^ (x - 80)) + -.0302 * LOG(1 + 10 ^ (x - 82)) + .0381 * LOG(1 + 10 ^ (x - 84))
y0 = y0 + -.002 * LOG(1 + 10 ^ (x - 86)) + .0344 * LOG(1 + 10 ^ (x - 88)) + -.0251 * LOG(1 + 10 ^ (x - 90))
y0 = y0 + .0087 * LOG(1 + 10 ^ (x - 92)) + .0093 * LOG(1 + 10 ^ (x - 94)) + -.0687 * LOG(1 + 10 ^ (x - 96))
y0 = y0 + .0727 * LOG(1 + 10 ^ (x - 98)) + -.0441 * LOG(1 + 10 ^ (x - 100))
yt = y0: END FUNCTION
Data Modeling augmentation of JPEG for real-time streaming video
Holger M. Jaenisch*a,b and James W. Handleyb,c
a dtech Systems Inc., P.O. Box 18924, Huntsville, AL 35804
b James Cook University, Townsville QLD 4811, Australia
c Sparta, Inc., 6000 Technology Dr., Bldg. 3, Huntsville, AL 35805
ABSTRACT
This paper explores sub-sampling in conjunction with JPEG compression algorithms. Rather than directly compressing
large high-resolution images, we propose decimation to thumbnails followed by compression. This enables Redundant
Array of Independent Disks (RAID) compression and facilitates real-time streaming video with small bandwidth
requirements. Image reconstruction occurs on demand at the receiver to any resolution required using Data Modeling
based fractal interpolation. The receive side first uncompresses JPEG and then fractal interpolates to any required
resolution. This device independent resolution capability is useful for real-time sharing of image data across virtual
networks where each node has a different innate resolution capability. The same image is constructed to whatever
limitations exist at each individual node, keeping image data device independent and image resolution scalable up or
down as hardware/bandwidth limitations and options evolve.
Keywords: Data Modeling, JPEG, JPEG2000, image compression, decimation, bandwidth, image resolution
1. CONCEPT
This paper examines sub-sampling used in conjunction with JPEG algorithms as shown in Figure 1. The approach starts
with an image and JPEG compression requiring a certain amount of time and memory. By first sub-sampling the image
down to a thumbnail, a smaller image results to which JPEG compression can be applied. This thumbnail will compress much
faster and smaller than the original. The receive side then uncompresses the compressed thumbnail and fractal
interpolates to any resolution needed, including to super-resolution if desired. This device independent resolution
capability is useful for real-time sharing of image data across the virtual network because each node will have a different
innate resolution capability and limitations. This keeps image data device independent and image resolution scalable up
or down as hardware/bandwidth limitations and options evolve.
[Figure 1 block diagram: Original Image -> Sub-Sampling -> Decimated Thumbnail -> JPEG -> Compressed Thumbnail.]
Fig. 1. Data Modeling augmentation of JPEG.
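The transmit-side sub-sampling step in Figure 1 can be sketched as follows; `decimate` is a hypothetical helper operating on an image held as a list of pixel rows, and the JPEG and reconstruction stages are omitted:

```python
def decimate(image, rate):
    """Sub-sample a 2-D image (list of pixel rows) by keeping every
    rate-th pixel in each direction. Unlike JPEG, the thumbnail size
    is fixed in advance by the rate, not by image complexity."""
    return [row[::rate] for row in image[::rate]]

# A 4x4 test image decimated 2:1 yields a 2x2 thumbnail.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
thumbnail = decimate(image, 2)
```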
The process of system identification followed by parameter estimation applied to solving the system inverse problem
yields an equation. This equation can be a differential equation, a particular solution in the form of an algebraic equation,
or a general solution in the form of a partial differential equation. In any case, the derived equation is what we refer to as
a Data Model. The Data Model is the smallest amount of information required to functionally model empirical dynamics.
This model can take the form of a group of real numbers, a single equation, or a network of equations. Data Models can
also be embedded in other functions as variables; hence the term functional. Data Modeling
approximates differential equations and their forcing functions using algebraic basis functions in the specific form of the
Kolmogorov-Gabor (K-G) polynomial. The K-G polynomial is numerically derived using a hierarchy of third or lower
order polynomials as intermediate variables. Nesting these functional models yields the Data Relations Model1,2
y_L(x_1, x_2, ..., x_n) = f( y_b( y_b( y_b( ... y_k(x_i, x_j, ...) ... ) ) ) ),    O[ y(x_1, x_2, ..., x_L) ] = 3^n    (1)
*[email protected]; phone 1 256 337-3768
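The nesting in Equation 1 can be illustrated with a small sketch (the coefficients below are hypothetical, not taken from the paper): each layer is a full third-order polynomial, and feeding one layer's output into the next raises the composite order to 3^n.

```python
def kg_layer(u, v, c):
    """One third-order Kolmogorov-Gabor layer: a full cubic polynomial
    in the two inputs u and v with 10 coefficients c."""
    terms = [1.0, u, v, u * u, v * v, u * v,
             u ** 3, v ** 3, u * u * v, u * v * v]
    return sum(ci * t for ci, t in zip(c, terms))

# Nest two layers: the first layer's output becomes an input of the second,
# so the composite polynomial order is 3^2 = 9.
c1 = [0.1, 0.5, -0.2, 0.0, 0.3, 0.1, 0.0, 0.0, 0.2, -0.1]
c2 = [0.0, 1.0, 0.2, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
y = kg_layer(kg_layer(0.4, -0.7, c1), 0.4, c2)
```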
2. FRACTAL DECIMATION AND RECONSTRUCTION
Discrete cosine transform (DCT) and wavelet based compression algorithms adhering to the JPEG and JPEG2000
standards3 yield varying compression sizes on equal size images, because these compression techniques are based on
image complexity. In contrast, sub-sampling yields a fixed image thumbnail size that is known in advance and is not
dependent on image complexity. Rather, sub-sampling is dependent only on the sub-sample rate and the original size.
Using Data Modeling, the data is first converted from 2-D into a 1-D data sequence. Several methods exist, including
raster scanning and fixed pattern readout such as zigzag sequencing. However, only the Hilbert sequence space filling
fractal curve4 preserves 2-D correlations at dyadic sample sizes. Figure 2 demonstrates the Hilbert fractal curve. The
Data Model of the image is the Turlington polynomial5,6 of the resultant 1-D data sequence, which provides an equation
model of the thumbnail with continuous derivatives. The Turlington polynomial is of the form
T(x) = y_1 + m_1 (x - x_1) + SUM_{j=2}^{n-1} (m_j - m_{j-1}) (0.001) log10[ 1 + 10^((x - x_j)/0.001) ]    (2)

where the segment slopes are m_j = (y_{j+1} - y_j) / (x_{j+1} - x_j).
Details on constructing these models are found in the References, and for brevity the Data Models are represented in this
paper by the actual sub-sampled points from the images.
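Equation 2 can be evaluated numerically as sketched below; the overflow guard in `softramp` (returning the linear ramp once the exponent is large) is our addition for floating-point safety and is not part of the published form:

```python
import math

def softramp(z, beta=0.001):
    """beta * log10(1 + 10^(z/beta)): ~0 for z << 0, ~z for z >> 0.
    Guarded so large exponents do not overflow."""
    if z / beta > 30:
        return z
    return beta * math.log10(1.0 + 10.0 ** (z / beta))

def turlington(x, xs, ys):
    """Evaluate the Turlington polynomial through knots (xs, ys): the
    piecewise-linear interpolant smoothed so that all derivatives are
    continuous (Equation 2)."""
    m = [(ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j]) for j in range(len(xs) - 1)]
    t = ys[0] + m[0] * (x - xs[0])
    for j in range(1, len(m)):
        t += (m[j] - m[j - 1]) * softramp(x - xs[j])
    return t
```

Away from the knots the log terms reduce to straight ramps, so the model reproduces the sampled points while remaining differentiable everywhere.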
Fig. 2. Example of Hilbert sequencing.
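The Hilbert readout can be generated with the standard distance-to-coordinate recursion (a sketch; the orientation convention may differ from Figure 2):

```python
def hilbert_xy(order, d):
    """Map distance d along a Hilbert curve covering a 2^order x 2^order
    grid to an (x, y) cell, so an image can be read out as a 1-D
    sequence that preserves 2-D locality."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the quadrant when needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Readout order for a 4x4 image (order 2):
seq = [hilbert_xy(2, d) for d in range(16)]
```

Every cell is visited exactly once, and consecutive samples are always adjacent pixels, which is why 2-D correlations survive the flattening.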
Once a dyadic block or image size is determined for the Hilbert sequence, the image is sampled independently for the 3
different color planes (red, green, and blue). This results in 3 separate data sequences whose values are between 0 and
255 (in the case of 24 bit images). Each data sequence can be processed separately and recombined into an image after
processing. The color gray is created when equal amounts of red, green, and blue are mixed together. To transform the
red, green, and blue data for each pixel into grayscale, the average of the red, green, and blue color planes is calculated.
y_gray = (y_red + y_green + y_blue) / 3    (3)
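With the three Hilbert-sequenced color planes in hand, Equation 3 amounts to:

```python
def gray_sequence(red, green, blue):
    """Average the red, green, and blue plane values pixel by pixel to
    produce the grayscale sequence of Equation 3."""
    return [(r + g + b) / 3 for r, g, b in zip(red, green, blue)]

gray = gray_sequence([0, 255], [0, 255], [0, 255])
```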
IFS fractal interpolation7,8,9 from the literature is recast into the functional form
F_i(x, y) = (a_i x + e_i, c_i x + d_i y + f_i)    (4)
where a new x and y value is derived on the left hand side of the equation from applying the ith transform to the previous
x and y locations in the iteration process shown on the right hand side of the equation. This process still requires iterating
through all of the previous steps to get to the proper (x, y) location pair and IFS transform. By assuming x is a
monotonically increasing index of uniform spacing, a unique fractal pseudo-derivative operator is defined as
D_f{y_i} == [ (y_n - y_{n-1})/N - (y_{n+1} - y_{n-1})(y_N - y_0)/N ] x_i + (y_{n+1} - y_{n-1}) y_i + y_{n-1} - y_0 (y_{n+1} - y_{n-1})    (5)
where N is the number of points in the original process Data Model and vanishes if N = 1, y0 the value of the Data Model
at x = 0, yN the value at x = N, i the index associated with the (x,y) pair that is transformed into the needed function pair,
and n the index of the straight line segment connecting the original Data Model points defining the interval containing
the required xi. This operator is derived from the IFS equations in References 7-9, and Equation 5 is the Data Model of
the process10-13.
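For orientation, a minimal deterministic IFS fractal interpolation in the spirit of Equation 4 is sketched below. The fixed vertical scaling `d` is an illustrative assumption; the paper's pseudo-derivative operator (Equation 5) instead ties the transform coefficients to the Data Model points directly.

```python
def fractal_interpolate(ys, d=0.3, passes=3):
    """Deterministic IFS fractal interpolation over uniformly spaced
    samples ys (taken at x = 0, 1, ..., N). Each pass applies every
    affine map F_n of Equation 4 to the current point set, refining
    detail between the original samples while always passing through
    them. The vertical scaling d is a free parameter here."""
    N = len(ys) - 1
    pts = [(float(i), float(y)) for i, y in enumerate(ys)]
    for _ in range(passes):
        new = []
        for n in range(1, N + 1):
            a, e = 1.0 / N, float(n - 1)  # x-map: [0, N] -> [n-1, n]
            c = (ys[n] - ys[n - 1]) / N - d * (ys[N] - ys[0]) / N
            f = ys[n - 1] - d * ys[0]
            new += [(a * x + e, c * x + d * y + f) for x, y in pts]
        pts = sorted(new)
    return pts

pts = fractal_interpolate([0.0, 1.0, 0.0], d=0.3, passes=2)
```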
3. RESULTS
3.1 Image data sources
Two grayscale image classes stored in 24-bit color format were used for this study. Sizes in kilobytes (KB) and in
pixels are given later in this work in Table 1. Because Hilbert sequencing requires images to be dyadic (power of two),
these images were padded up to the next highest power of two. One method of padding that would maintain the mean of
the image would be to pad the image with its mean value. However, this does not preserve the other higher order
statistics and would introduce sharp discontinuities all along the edges where the padding is added. These sharp
discontinuities would cause the fractal reconstruction algorithm to fail and were therefore avoided. Instead, the images
were padded with data from the image itself: rows and columns taken from the image were appended in reverse sequence
to the top and right, and the sub-sampling rates were increased to ensure that equivalent thumbnails were generated.
An example of this dyadic padding is shown in Figure 3.
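The mirror padding described above can be sketched as follows (here the padding is appended on the right and bottom of the row list; the paper pads the top and right, which is the same operation up to orientation):

```python
def next_pow2(n):
    """Smallest power of two >= n."""
    p = 1
    while p < n:
        p *= 2
    return p

def pad_dyadic(image):
    """Pad a 2-D image (list of pixel rows) up to the next power of two
    by appending rows and columns taken from the image itself in
    reverse sequence, so the padded edges are mirror-symmetric and
    introduce no sharp discontinuities."""
    rows, cols = len(image), len(image[0])
    wide = [row + row[::-1][:next_pow2(cols) - cols] for row in image]
    return wide + wide[::-1][:next_pow2(rows) - rows]

padded = pad_dyadic([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```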
Fig. 3. Image 1 with no padding (left), and with padding on its top and right (right). Note symmetry along top and right edge.
3.2 Figures of merit for scoring results
To score image processing and reconstruction results, one or more figures of merit (FOM) must be selected. One simple
method is through use of standard statistics and higher order moments (M1 – M8 in Equations 6 - 11) given by
M_1 = (1/N) SUM_{j=1}^{N} x_j = xbar                                mu_M1 = 0     sigma_M1 = 2      (6)

M_2 = [ (1/N) SUM_{j=1}^{N} (x_j - xbar)^2 ]^(1/2) = sigma          mu_M2 = 1     sigma_M2 = 3      (7)

M_3 = (1/N) SUM_{j=1}^{N} [ (x_j - xbar) / sigma ]^3                mu_M3 = 0     sigma_M3 = 15     (8)

M_4 = (1/N) SUM_{j=1}^{N} [ (x_j - xbar) / sigma ]^4 - 3            mu_M4 = 3     sigma_M4 = 96     (9)

M_6 = (1/N) SUM_{j=1}^{N} [ (x_j - xbar) / sigma ]^6 - 15           mu_M6 = 15    sigma_M6 = 100    (10)

M_8 = (1/N) SUM_{j=1}^{N} [ (x_j - xbar) / sigma ]^8 - 105          mu_M8 = 105   sigma_M8 = 1407   (11)
where the µ term after each is the mean expected value of the raw moment calculation for a Gaussian distribution and the
σ term is the standard deviation of the moment for a Gaussian distribution, forming a confidence interval around the
values of the moment.
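The raw moments above can be computed as sketched below; the Gaussian reference values mu and sigma of Equations 6-11 are interpretation aids and are therefore left out of the code:

```python
def moments(xs):
    """Raw moments of Equations 6-11: mean M1, standard deviation M2,
    and standardized 3rd/4th/6th/8th moments with the Gaussian raw
    expectations (3, 15, 105) subtracted from M4, M6, M8."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = (sum((x - m1) ** 2 for x in xs) / n) ** 0.5
    z = [(x - m1) / m2 for x in xs]
    m3 = sum(v ** 3 for v in z) / n
    m4 = sum(v ** 4 for v in z) / n - 3
    m6 = sum(v ** 6 for v in z) / n - 15
    m8 = sum(v ** 8 for v in z) / n - 105
    return m1, m2, m3, m4, m6, m8

m1, m2, m3, m4, m6, m8 = moments([-1.0, 1.0, -1.0, 1.0])
```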
It was found that the mean squared error (MSE) and peak signal to noise ratio (PSNR) from the literature along with a
novel histogram correlation method worked well on the images described above, and therefore were applied to the Data
Model to compare each with the original. The mean squared error (MSE) and the peak signal to noise ratio (PSNR) are
given by
MSE = SUM_i (x_i - y_i)^2        PSNR = 10 log10( 255^2 / MSE )    (12)
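Equation 12 as written (a raw sum of squared pixel errors rather than a per-pixel average) can be computed as:

```python
import math

def mse_psnr(orig, recon):
    """MSE and PSNR of Equation 12 for 8-bit pixel data. The MSE here
    is a raw sum of squared errors, with no 1/N normalization."""
    mse = sum((x - y) ** 2 for x, y in zip(orig, recon))
    psnr = 10 * math.log10(255 ** 2 / mse) if mse else float("inf")
    return mse, psnr

mse, psnr = mse_psnr([0, 255], [0, 0])
```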
Jaenisch created a novel FOM for use in numerical perceptual image scoring called Histogram correlation (Figure 4).
This method extracts a window from the same location in the two images being compared and generates a histogram for each. Each histogram
is then converted into a time series of data, and the correlation between the two series is calculated.
[Figure 4 flow, applied to the original and reconstructed images: window extracted from the same location in each image -> histogram generated -> histogram converted to time series -> correlation.]
Fig. 4. Calculation of the Histogram correlation FOM.
Histogram correlation is given in equation form as
r = SUM_i (x_i - xbar)(y_i - ybar) / sqrt[ SUM_i (x_i - xbar)^2 * SUM_i (y_i - ybar)^2 ]    (13)
For MSE, lower values represent a closer match to the original, while for PSNR, higher values represent a closer match.
For the correlation FOM, higher values represent closer matches to the original. For scoring images with Histogram
Correlation, MSE, or PSNR, windows ranging in size between 3 x 3 and 256 x 256 were used. Sixteen randomly placed
windows of each size were extracted from the images and the FOMs calculated inside each window. For each window
size, the results from the 16 windows were summed for each FOM, and those resultants were summed over all
window sizes between 3 x 3 and 256 x 256. In the case of MSE, a lower value is better, indicating less error. For PSNR
and Histogram Correlation, higher values are better, indicating more correlation and higher SNR11.
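The window-scoring FOM can be sketched as below; the bin count of 16 and the nearest-bin mapping are our assumptions, since the paper does not fix them:

```python
def hist_correlation(win_a, win_b, bins=16):
    """Histogram correlation FOM: histogram the two windows, treat each
    histogram as a series, and return the Pearson correlation of the
    two series (Equation 13)."""
    def hist(win):
        h = [0] * bins
        for v in win:  # map 0..255 pixel values onto the bins
            h[min(v * bins // 256, bins - 1)] += 1
        return h
    x, y = hist(win_a), hist(win_b)
    mx, my = sum(x) / bins, sum(y) / bins
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

win = [0, 10, 200, 255, 3, 128]
score = hist_correlation(win, win)
```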
3.3 Final images
The two image types were first decimated 10:1 and 100:1 and then reconstructed using Data Modeling based fractal
reconstruction. The original images along with 10:1 and 100:1 thumbnail sub-sampling and Data Modeling
reconstruction are shown in Figure 5. The Correlation FOM, PSNR, and MSE are given in Table 1 for images that were
decimated both at 10:1 and 100:1, JPEG compressed to the compression ratios varying between 50:1 and 704:1 as shown
in Table 1, and finally by reversing the process by uncompressing using JPEG and then applying Data Modeling fractal
reconstruction to inflate the uncompressed but decimated thumbnail back to its original size.
Fig. 5. Image type 1 and 2. 50:1 (top left), 89:1 (top right), original (middle), 459:1 (bottom left), and 704:1 (bottom right).
Image Name and Sub-Sample Rate | Original Size (KB) | Original Size (Pixels) | Correlation FOM | PSNR (10^4) | MSE (10^6) | JPEG Compressed Size (KB) | Overall Compression Ratio
Image 1 (10:1) | 9700 | 1800 x 1800 | 2960 | 8.8 | 1.8 | 181 | 50:1
Image 2 (10:1) | 9700 | 1800 x 1800 | 2208 | 12.1 | 0.3 | 109 | 89:1
Image 1 (100:1) | 9700 | 1800 x 1800 | 2560 | 6.9 | 5.6 | 21 | 459:1
Image 2 (100:1) | 9700 | 1800 x 1800 | 1440 | 10.7 | 0.3 | 13 | 704:1
Table 1. Statistics and FOM scores for images.
Table 1 summarizes FOM calculations for each reconstructed image compared with the original. As shown in Figure 5,
the Data Modeling approach provides good images at both 10:1 and 100:1 decimation and further JPEG compression.
Also shown in Table 1 is the resultant image size in kilobytes after compressing the thumbnail using JPEG set at a
quality factor of 90%. This quality factor was chosen to ensure that no major information loss or artifacts were introduced
into the image by the JPEG algorithm. For each image, the overall compression ratio after using both fractal decimation
and JPEG compression was actually 5 to 9 times larger than the original thumbnail compression of 10:1 and 100:1.
From this, it is estimated that in order to obtain overall compression of 10:1, it is only necessary to sub-sample 2:1, and
to achieve overall compression of 100:1, it is only necessary to sub-sample 20:1. Fortran 95 source code for sub-
sampling, compressing, and reconstructing the images in Figure 5 is provided in the Appendix of this work.
Zooming into the resulting image type 1 after applying 100:1 decimation, JPEG compression, JPEG uncompression, and
Data Modeling fractal reconstruction is shown in Figure 6. Note that at this block size, only 4 pixels are saved as the
model of the block. Figure 6 shows that the fractal reconstruction method at fine resolution recovers detail that is not
present in images that are inflated using methods such as linear interpolation. On the left in Figure 6 is a 16 x 16 pixel
neighborhood from the original image. In the middle of Figure 6 is the resultant of applying decimation, JPEG, and
linear interpolation, where only four values (model values) in the neighborhood were simply enlarged without adding
detail. On the right in Figure 6, the Data Modeling approach in the place of linear interpolation is shown. Our method
reconstructs detail from only the four pixel values used by linear interpolation.
Fig. 6. Zoom in to 16 x 16 pixel neighborhood on image type 1. Original (left), 100:1 decimation, JPEG compression/uncompression,
and linear interpolation to undo decimation showing lack of detail (center), and Data Modeling fractal reconstruction applied instead
of linear interpolation (right). Note that the Data Modeling result contains detail like that in the original.
3.4 Sub-sample rate and block size estimation Data Model
Final images shown in Figure 5 were sub-sampled and reconstructed to their original size using all available pixels in the
original padded image (block size of 2048x2048). For images where the dynamics of the local pixel neighborhoods vary,
better reconstruction results are obtained using image sub-blocks and performing reconstruction on each local
neighborhood independently. To explore the limiting performance of the algorithm, we varied the sub-sample rates from
1.5:1 to 16384:1 and the local pixel neighborhoods between 8x8 pixel blocks and 2048x2048. This is not mandatory in the
image decimation/compression/reconstruction process. The authors have found that a good rule of thumb, given no
further information, is to use 10:1 sub-sampling with block sizes of either 8x8 or 16x168. Figure 7 (top) shows the
Histogram Correlation FOM for image classes 1 and 2 for varying sub-sample rates. In the top 2 curves, increasing block sizes are shown
concatenated together for increasing sub-sample rates (which also increase from left to right on the top curves). Figure 7
(bottom) reverses the sequence by displaying groups of increasing sub-sample rates concatenated together for each
reconstruction block size (which also increases from left to right on the bottom curves).
[Figure 7 panels: the vertical axis of each panel is the Histogram Correlation FOM (0 to 4000). Top pair: horizontal axis is sub-sample rate from 1.5 to 16384, with block size incrementing within each sub-sample rate, for image class 1 and image class 2. Bottom pair: horizontal axis is block size from 8 to 2048, with sub-sample rate incrementing within each block size, for image class 1 and image class 2.]
Fig. 7. Varying sub-sample rates from 1.5:1 up to 16384:1 for image class 1 (top left) and image class 2 (top right). Varying block
sizes from 8x8 to 2048x2048 for image class 1 (bottom left) and image class 2 (bottom right).
The results curves shown in Figure 7 required many hours of CPU time to run the approximately 150 cases on each
image type (300 cases total), with each case requiring several hundred sub-cases to generate the windows to characterize
the figures of merit described in Section 3.2. Using the results of this analysis, a Data Model was constructed to
eliminate exhaustive characterization of new images that are of the same or similar class as those in Figure 5 for
determination of optimal block size and sub-sample rate. This Data Model was constructed using 1) desired Histogram
correlation FOM value for final result and 2) average standard deviation of 16 randomly selected 8x8 pixel block
neighborhoods in the original, and yielded as output an estimate of the optimal sub-sample rate and block size to use in
the Data Modeling fractal reconstruction process. This Data Model was created using the data curves in Figure 7 and
multi-variable linear regression using the nested Kolmogorov-Gabor polynomial form in Equation 11,2. Output results
from the Data Model compared with the training data are provided in Figure 8.
Fig. 8. Predictive Data Model results compared to original for sub-sample rate (left) and block size (right).
Source code for this predictive Data Model is provided in Figure 9, and two lookup tables generated from these equation
models are provided in Figure 10. By finding the average standard deviation from 16 8x8 blocks of the image along
the bottom and the desired Histogram Correlation FOM along the side of the top lookup table in Figure 10, the optimal
sub-sample rate is read off directly. Using the same two inputs in the bottom Figure 10 lookup table, the optimal block
size used in the Data Modeling reconstruction process is read off directly.
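Reading a Figure 10 style table programmatically can be sketched as follows; the toy table is illustrative only (each entry encodes its own row and column), not the published values:

```python
def lookup(table, fom, stdev):
    """Nearest-entry read of a Figure 10 style lookup table: 21 rows
    indexed by Histogram Correlation FOM (1.00 down to 0.00 in steps
    of 0.05) and 25 columns indexed by average standard deviation
    (5.0 to 29.0 in steps of 1.0)."""
    row = min(range(21), key=lambda i: abs((1.0 - 0.05 * i) - fom))
    col = min(range(25), key=lambda j: abs((5.0 + j) - stdev))
    return table[row][col]

# A toy 21x25 table whose entries encode their own (row, column) position:
toy = [[r * 100 + c for c in range(25)] for r in range(21)]
```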
OPEN "acorrel" FOR INPUT AS #1
INPUT #1, correl 'histogram correl fom
CLOSE #1
OPEN "astdev" FOR INPUT AS #1
INPUT #1, stdev 'standard deviation
CLOSE #1
bcorrel = (correl - .4728) / .2133
astdev = (stdev - 16.5976) / 8.0445
GOSUB subsamp
asubsamp = l1o1 'log10 of subsamp rate
ombsuba = (asubsamp - 1.5932) / .8939
mbstdev = (stdev - 16.3419) / 8.0329
GOSUB blocksiz
ablksiz = l4o1
OPEN "output" FOR OUTPUT AS #1
'subsamp rate and block size
PRINT #1, 10 ^ asubsamp, ablksiz
CLOSE #1
END

subsamp:
l1o1 = -.5407
l1o1 = l1o1 + (-.4536) * (bcorrel)
l1o1 = l1o1 + (.9381) * (astdev)
l1o1 = l1o1 + (.3418) * (bcorrel * bcorrel)
l1o1 = l1o1 + (.1327) * (astdev * astdev)
l1o1 = l1o1 + (-.0858) * (bcorrel * astdev)
l1o1 = l1o1 + (-.314) * (bcorrel * bcorrel * bcorrel)
l1o1 = l1o1 + (-.2227) * (astdev * astdev * astdev)
l1o1 = l1o1 + (-.1637) * (astdev * bcorrel * bcorrel)
l1o1 = l1o1 + (-.0133) * (bcorrel * astdev * astdev)
l1o1 = l1o1 * (.7468) + (1.3849)
IF l1o1 < .1761 THEN l1o1 = .1761
IF l1o1 > 4.2144 THEN l1o1 = 4.2144
RETURN

blocksiz:
l1o1 = -.2149
l1o1 = l1o1 + (-.3765) * (mbstdev)
l1o1 = l1o1 + (.5527) * (ombsuba)
l1o1 = l1o1 + (-.2344) * (mbstdev * mbstdev)
l1o1 = l1o1 + (.4856) * (ombsuba * ombsuba)
l1o1 = l1o1 + (-.2067) * (mbstdev * ombsuba)
l1o1 = l1o1 + (.6168) * (mbstdev * mbstdev * mbstdev)
l1o1 = l1o1 + (-.154) * (ombsuba * ombsuba * ombsuba)
l1o1 = l1o1 + (-.0497) * (ombsuba * mbstdev * mbstdev)
l1o1 = l1o1 + (.0898) * (mbstdev * ombsuba * ombsuba)
l1o2 = 0
l1o2 = l1o2 + (.4639) * (mbstdev)
l1o2 = l1o2 + (.2992) * (ombsuba)
l2o3 = .7634
l2o3 = l2o3 + (3.1154) * (l1o1)
l2o3 = l2o3 + (5.9309) * (mbstdev)
l2o3 = l2o3 + (-7.0514) * (l1o2)
l2o3 = l2o3 + (-6.4967) * (l1o1 * l1o1)
l2o3 = l2o3 + (-5.6109) * (mbstdev * mbstdev)
l2o3 = l2o3 + (-26.2132) * (l1o2 * l1o2)
l2o3 = l2o3 + (1.8352) * (l1o1 * mbstdev)
l2o3 = l2o3 + (10.0944) * (l1o1 * l1o2)
l2o3 = l2o3 + (23.2086) * (mbstdev * l1o2)
l2o3 = l2o3 + (-33.6865) * (l1o1 * mbstdev * l1o2)
l2o3 = l2o3 + (-.3964) * (l1o1 * l1o1 * l1o1)
l2o3 = l2o3 + (-6.604) * (mbstdev * mbstdev * mbstdev)
l2o3 = l2o3 + (-14.0569) * (l1o2 * l1o2 * l1o2)
l2o3 = l2o3 + (2.6917) * (mbstdev * l1o1 * l1o1)
l2o3 = l2o3 + (5.7411) * (l1o1 * mbstdev * mbstdev)
l2o3 = l2o3 + (31.1435) * (l1o1 * l1o2 * l1o2)
l2o3 = l2o3 + (-1.3842) * (l1o2 * l1o1 * l1o1)
l2o3 = l2o3 + (11.6813) * (l1o2 * mbstdev * mbstdev)
l2o3 = l2o3 + (5) * (mbstdev * l1o2 * l1o2)
l3o2 = -.2408
l3o2 = l3o2 + (.58) * (l2o3)
l3o2 = l3o2 + (.2969) * (ombsuba)
l3o2 = l3o2 + (-.0829) * (mbstdev)
l3o2 = l3o2 + (.219) * (l2o3 * l2o3)
l3o2 = l3o2 + (-.2805) * (ombsuba * ombsuba)
l3o2 = l3o2 + (.5857) * (mbstdev * mbstdev)
l3o2 = l3o2 + (.3052) * (l2o3 * ombsuba)
l3o2 = l3o2 + (-.1895) * (l2o3 * mbstdev)
l3o2 = l3o2 + (-.1353) * (ombsuba * mbstdev)
l3o2 = l3o2 + (.8291) * (l2o3 * ombsuba * mbstdev)
l3o2 = l3o2 + (-.4954) * (l2o3 * l2o3 * l2o3)
l3o2 = l3o2 + (.2144) * (ombsuba * ombsuba * ombsuba)
l3o2 = l3o2 + (-.2555) * (mbstdev * mbstdev * mbstdev)
l3o2 = l3o2 + (-.8619) * (ombsuba * l2o3 * l2o3)
l3o2 = l3o2 + (.1994) * (l2o3 * ombsuba * ombsuba)
l3o2 = l3o2 + (2.2746) * (l2o3 * mbstdev * mbstdev)
l3o2 = l3o2 + (-.4046) * (mbstdev * l2o3 * l2o3)
l3o2 = l3o2 + (-.2718) * (mbstdev * ombsuba * ombsuba)
l3o2 = l3o2 + (-.9601) * (ombsuba * mbstdev * mbstdev)
l4o1 = -.2943
l4o1 = l4o1 + (2.4065) * (l3o2)
l4o1 = l4o1 + (.3783) * (ombsuba)
l4o1 = l4o1 + (-1.625) * (mbstdev)
l4o1 = l4o1 + (1.5285) * (l3o2 * l3o2)
l4o1 = l4o1 + (.1599) * (ombsuba * ombsuba)
l4o1 = l4o1 + (.1683) * (mbstdev * mbstdev)
l4o1 = l4o1 + (.0643) * (l3o2 * ombsuba)
l4o1 = l4o1 + (-1.4337) * (l3o2 * mbstdev)
l4o1 = l4o1 + (-.2452) * (ombsuba * mbstdev)
l4o1 = l4o1 + (.7922) * (l3o2 * ombsuba * mbstdev)
l4o1 = l4o1 + (-.5002) * (l3o2 * l3o2 * l3o2)
l4o1 = l4o1 + (.1066) * (ombsuba * ombsuba * ombsuba)
l4o1 = l4o1 + (.3336) * (mbstdev * mbstdev * mbstdev)
l4o1 = l4o1 + (-.3695) * (ombsuba * l3o2 * l3o2)
l4o1 = l4o1 + (-.3886) * (l3o2 * ombsuba * ombsuba)
l4o1 = l4o1 + (.788) * (l3o2 * mbstdev * mbstdev)
l4o1 = l4o1 + (-.4752) * (mbstdev * l3o2 * l3o2)
l4o1 = l4o1 + (.2214) * (mbstdev * ombsuba * ombsuba)
l4o1 = l4o1 + (-.8907) * (ombsuba * mbstdev * mbstdev)
l4o1 = l4o1 * (761.0519) + (831.2941)
IF l4o1 < 8 THEN l4o1 = 8
IF l4o1 > 2048 THEN l4o1 = 2048
RETURN
Fig. 9. QuickBASIC 4.5 source for determining optimal sub-sample rate and block size.
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 2 2
3 3 2 2 2 2 2 2 2 2 2 3 3 3 4 4 5 5 6 6 6 6 6 5 5
4 3 3 3 2 2 2 3 3 3 3 4 4 5 6 7 8 9 10 11 11 11 11 11 10
4 3 3 3 3 3 3 3 3 4 4 5 6 7 8 10 11 13 15 16 17 18 18 18 16
4 4 3 3 3 3 3 3 4 4 5 6 7 9 10 12 14 17 19 22 24 25 26 26 24
4 4 3 3 3 3 3 4 4 5 6 7 8 10 12 15 18 21 24 28 31 33 34 34 33
4 4 3 3 3 3 4 4 5 5 6 8 9 12 14 17 21 25 29 34 38 41 43 44 43
5 4 4 4 4 4 4 5 5 6 8 9 11 14 17 21 26 31 36 42 48 52 55 56 55
7 6 5 5 5 5 6 6 7 8 10 12 15 18 22 28 34 40 48 56 63 69 73 74 73
10 9 8 8 8 8 8 9 11 13 15 18 22 27 33 40 49 59 70 81 91 99 105 107 105
20 17 15 14 14 14 15 17 19 22 26 32 38 46 57 69 83 99 117 134 150 164 172 175 170
49 41 37 34 33 34 35 38 43 49 58 68 82 99 119 143 171 202 235 267 297 321 334 336 323
163 135 117 107 103 103 106 114 126 142 163 190 225 267 318 377 444 517 593 666 730 777 800 793 753
760 613 523 469 441 432 439 461 498 551 622 713 827 964 1126 1312 1518 1737 1956 2160 2326 2434 2464 2401 2243
5154 4062 3382 2962 2717 2599 2581 2648 2797 3026 3338 3741 4237 4831 5520 6293 7125 7978 8796 9505 10024 10272 10181 9719 8896
16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383
16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383 16383
[Axes for the table above: rows, top to bottom, are Histogram Correlation FOM from 1.00 to 0.00 in steps of 0.05; columns, left to right, are Standard Deviation (averaged from 16 8x8 blocks) from 5.0 to 29.0 in steps of 1.0. Entries are optimal sub-sample rates.]
2048 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
2048 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
2048 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
8 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
8 16 1370 121 8 8 8 8 8 2048 1391 2048 118 357 308 349 646 1078 1506 1775 1529 1779 8 8 8
8 2048 533 156 649 2048 8 2048 1841 2048 1757 38 55 18 108 352 488 437 414 413 506 741 2048 980 8
8 8 87 291 437 697 2048 172 1590 1397 232 48 101 349 448 60 34 173 8 391 143 860 1123 2048 8
8 8 68 428 331 209 152 572 1182 537 139 108 245 300 8 312 1361 923 8 577 8 2048 1942 1129 2048
8 8 75 468 242 326 295 686 775 315 126 140 226 70 8 1145 1977 723 8 679 8 2048 8 8 2048
8 8 82 485 189 356 382 622 540 227 120 140 150 8 215 1524 1814 138 233 395 549 1269 8 8 2048
8 8 86 480 154 320 388 485 359 171 110 117 65 8 414 1496 1232 8 818 8 1360 8 8 8 1136
8 8 43 388 197 211 283 297 213 129 91 71 8 8 470 1051 339 8 1258 8 2048 8 8 8 8
8 8 926 202 337 179 175 170 141 104 65 17 8 8 220 177 8 418 901 8 2048 558 763 8 125
8 8 2048 91 432 319 229 196 169 131 78 19 8 8 8 8 11 856 8 554 2048 2048 1721 8 8
8 8 1391 51 617 522 379 326 296 258 200 121 18 8 8 8 8 8 8 1795 2048 336 8 8 87
8 2048 885 1010 1032 444 331 345 362 370 365 332 253 118 8 8 8 8 1021 1784 1086 8 299 8 2048
2048 2048 2048 194 1179 350 2048 2048 2048 843 436 450 326 235 203 209 326 710 1296 1554 1234 1130 971 8 2048
2048 787 2048 990 8 8 8 8 8 8 8 2048 2048 2048 2048 2048 579 941 1653 1876 1745 1583 1438 2048 2048
2048 2048 1135 2048 8 8 8 8 8 8 8 2048 2048 2048 2048 1772 849 793 1564 2048 2048 2048 2048 2048 2048
8 8 2048 8 2048 2048 2048 2048 2048 2048 2048 2048 2048 246 8 8 8 2048 1290 8 1877 2048 2048 2048 2048
8 8 2048 8 2048 2048 2048 2048 2048 2048 2048 2048 2048 246 8 8 8 2048 1290 8 1877 2048 2048 2048 2048
1.00
0.95
0.90
0.85
0.80
0.75
0.70
0.65
0.60
0.55
0.50
0.45
0.40
0.35
0.30
0.25
0.20
0.15
0.10
0.05
0.00
His
tog
ram
Co
rrela
tio
n F
OM
Block Size
5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0 20.0 21.0 22.0 23.0 24.0 25.0 26.0 27.0 28.0 29.0
Standard Deviation (averaged from 16 8x8 blocks)
2048 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
2048 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
2048 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
8 1032 428 125 8 8 8 8 8 2048 1391 2048 1693 74 356 238 324 833 1857 1974 1501 2048 8 8 8
8 16 1370 121 8 8 8 8 8 2048 1391 2048 118 357 308 349 646 1078 1506 1775 1529 1779 8 8 8
8 2048 533 156 649 2048 8 2048 1841 2048 1757 38 55 18 108 352 488 437 414 413 506 741 2048 980 8
8 8 87 291 437 697 2048 172 1590 1397 232 48 101 349 448 60 34 173 8 391 143 860 1123 2048 8
8 8 68 428 331 209 152 572 1182 537 139 108 245 300 8 312 1361 923 8 577 8 2048 1942 1129 2048
8 8 75 468 242 326 295 686 775 315 126 140 226 70 8 1145 1977 723 8 679 8 2048 8 8 2048
8 8 82 485 189 356 382 622 540 227 120 140 150 8 215 1524 1814 138 233 395 549 1269 8 8 2048
8 8 86 480 154 320 388 485 359 171 110 117 65 8 414 1496 1232 8 818 8 1360 8 8 8 1136
8 8 43 388 197 211 283 297 213 129 91 71 8 8 470 1051 339 8 1258 8 2048 8 8 8 8
8 8 926 202 337 179 175 170 141 104 65 17 8 8 220 177 8 418 901 8 2048 558 763 8 125
8 8 2048 91 432 319 229 196 169 131 78 19 8 8 8 8 11 856 8 554 2048 2048 1721 8 8
8 8 1391 51 617 522 379 326 296 258 200 121 18 8 8 8 8 8 8 1795 2048 336 8 8 87
8 2048 885 1010 1032 444 331 345 362 370 365 332 253 118 8 8 8 8 1021 1784 1086 8 299 8 2048
2048 2048 2048 194 1179 350 2048 2048 2048 843 436 450 326 235 203 209 326 710 1296 1554 1234 1130 971 8 2048
2048 787 2048 990 8 8 8 8 8 8 8 2048 2048 2048 2048 2048 579 941 1653 1876 1745 1583 1438 2048 2048
2048 2048 1135 2048 8 8 8 8 8 8 8 2048 2048 2048 2048 1772 849 793 1564 2048 2048 2048 2048 2048 2048
8 8 2048 8 2048 2048 2048 2048 2048 2048 2048 2048 2048 246 8 8 8 2048 1290 8 1877 2048 2048 2048 2048
8 8 2048 8 2048 2048 2048 2048 2048 2048 2048 2048 2048 246 8 8 8 2048 1290 8 1877 2048 2048 2048 2048
1.00
0.95
0.90
0.85
0.80
0.75
0.70
0.65
0.60
0.55
0.50
0.45
0.40
0.35
0.30
0.25
0.20
0.15
0.10
0.05
0.00
His
tog
ram
Co
rrela
tio
n F
OM
Block Size
5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0 20.0 21.0 22.0 23.0 24.0 25.0 26.0 27.0 28.0 29.0
Standard Deviation (averaged from 16 8x8 blocks) Fig. 10. Lookup table for determining optimal sub-sample rate (top) and block size (bottom).
3.5 Appendix source code description
The source code given in the Appendix is Fortran 95. Input images are dyadic and in 4 column ASCII text. If the input image is not dyadic, it must be padded as described in Section 3.1. The source code consists of: 1) SUBSAMP, a driver program that decimates and calls the other programs; 2) BMP2TXT, which creates a 4 column ASCII text file consisting of an index, red color plane, green color plane, and blue color plane from a bitmap; 3) HILB, which creates the dyadic Hilbert fractal curve; 4) RECONST, which inflates the thumbnail back to original size; and 5) TXT2BMP, which converts the 4 column ASCII text file back into a bitmap. Source code for BMP2TXT and TXT2BMP is not provided in this work, but can be obtained in EXE format via the Internet. Also included is the source code for the JPEG compression algorithm applied to the decimated thumbnails to obtain the final compression ratios.
ACKNOWLEDGEMENTS
The authors would like to thank the following individuals for their guidance, direction, and support during the course of this work: Marvin Barnett, Computer Sciences Corporation; Richard Esslinger and Chester Rowe, Axiom, Inc.; Gary Whitley, GME-C; Scott McPheeters and Tim Aden, MTHEL Project Office; and John Deacon, Sparta, Inc.
REFERENCES
1. Jaenisch, H.M., Handley, J.W., "Data Modeling for Radar Applications", Proceedings of IEEE Radar Conference 2003, Huntsville, AL, May 18-19, 2003.
2. Jaenisch, H., Handley, J., Pooley, J., Murray, S., "Virtual Instrument Prototyping With Data Modeling", JANNAF 39th Combustion, 27th Airbreathing Propulsion, 21st Propulsion Systems Hazards, and 3rd Modeling and Simulation Subcommittees Joint Meeting, Colorado Springs, CO, December 1-5, 2003.
3. Pennebaker, W.B. and Mitchell, J.L., JPEG: Still Image Data Compression Standard, New York: Van Nostrand Reinhold, c1993.
4. Jaenisch, H.M., Scoggins, C.J., Carroll, M.P., Handley, J.W., "Entropy Fractal Analysis of Medical Images Using ROSETA," Proceedings of SPIE, Los Angeles, CA, January 24, 1994.
5. Handley, J., Jaenisch, H., Lim, A., White, G., Hons, A., Filipovic, M., Edwards, M., "Data Modeling of deep sky images", SPIE Astronomical Telescopes and Instrumentation 2004, Glasgow, Scotland, UK, June 24, 2004.
6. Jaenisch, H., Handley, J., Lim, A., Filipovic, M., White, G., Hons, A., Deragopian, G., Schneider, M., Edwards, M., "Data Modeling for Virtual Observatory data mining", SPIE Astronomical Telescopes and Instrumentation 2004, Glasgow, Scotland, UK, June 24, 2004.
7. Barnsley, M.F., and Hurd, L.P., Fractal Image Compression, Wellesley, MA: A.K. Peters, c1993.
8. Fisher, Y., ed., Fractal Image Compression: Theory and Application, New York: Springer-Verlag, c1995.
9. Welstead, S., Fractal and Wavelet Image Compression Techniques, Bellingham, WA: SPIE Press, c1999.
10. Jaenisch, H.M., Taylor, S.C., Handley, J.W., Carroll, M.P., "Optimal Fractal Image and Data Compression", Proceedings of the Southeastern Simulation Conference '95, Orlando, FL, pp. 67-77.
11. Lim, A., Jaenisch, H., Handley, J., Filipovic, M., White, G., Hons, A., Berrevoets, C., Deragopian, G., Payne, J., Schneider, M., Edwards, M., "Image resolution and performance analysis of webcams for ground based astronomy", SPIE Astronomical Telescopes and Instrumentation 2004, Glasgow, Scotland, UK, June 24, 2004.
12. Jaenisch, H., Handley, J., Faucheux, J., "Data Modeling Enabled Real Time Image Processing For Target Discrimination", Proceedings of SPIE, SPIE Defense and Security Symposium, March 2005.
13. Jaenisch, H.M., Handley, J.W., Pooley, J.C., Murray, S.R., "Data Modeling for Fault Detection", Society for Machinery Failure Prevention Technology (MFPT), April 2003.
APPENDIX
Driver Program (Image sub-sampling) (SUBSAMP) and Fractal Operator Algorithm (RECONST)

      program subsamp
      implicit none
      integer, allocatable :: rpt(:),gpt(:),bpt(:)
      integer, allocatable :: rpt1(:),gpt1(:),bpt1(:)
      integer*4 n,i,j,nth,smple1,ival,ijk,ijkold
      integer r,g,b
      real tmp
      character*80 infile
      write(6,*)'Input bitmap name => '
      read(5,*)infile
      write(6,*)'Sample rate (Every nth point, enter N) => '
      read(5,*)nth
      write(6,*)'Converting bitmap to text'
c     system call to create text file (4 cols: index, red, green, blue)
      call system("bmp2txt "//infile//" tmp1.txt")
      write(6,*)'Reading in data'
      open(1,file='tmp1.txt',status='unknown')
 11   read(1,*,end=10)N,r,g,b
      goto 11
 10   rewind(1)
      n=n+1
      allocate (rpt(n),gpt(n),bpt(n),rpt1(n),gpt1(n),bpt1(n))
      do i=1,N
        read(1,*)j,rpt(i),gpt(i),bpt(i)
      enddo
      close(1)
      write(6,*)'Generate Hilbert transform'
c     system call to run Hilbert program
      call system("hilb")
      open(1,file='hilb.txt',status='unknown')
      write(6,*)'Apply Hilbert transform'
      do i=1,N
        read(1,*)j
        rpt1(i)=rpt(j)
        gpt1(i)=gpt(j)
        bpt1(i)=bpt(j)
      enddo
      close(1)
      write(6,*)'Write out subsampled data'
      open(1,file='tmp1.txt',status='unknown')
      smple1=int(N/nth)
      smple1=int(sqrt(real(smple1)))
      smple1=smple1*smple1
      do i=1,smple1
        tmp=real(N-1)/real(smple1-1)
        tmp=tmp*real(i-1)+1.0
        ival=nint(tmp)
        write(1,*)i-1,rpt1(ival),gpt1(ival),bpt1(ival)
      enddo
      close(1)
      write(6,*)'Reconstruct to original length'
c     system call to run reconstruction program
      call system("reconst")
      write(6,*)'Read in reconstructed data'
      open(1,file='recon.txt',status='unknown')
      do i=1,N
        read(1,*)j,rpt(i),gpt(i),bpt(i)
      enddo
      close(1)
      open(1,file='hilb.txt',status='unknown')
      write(6,*)'Apply inverse Hilbert transform'
      do i=1,N
        read(1,*)j
        rpt1(j)=rpt(i)
        gpt1(j)=gpt(i)
        bpt1(j)=bpt(i)
      enddo
      close(1)
      write(6,*)'Write out final result'
      open(1,file='tmp1.txt',status='unknown')
      do i=1,N
        write(1,*)i-1,rpt1(i),gpt1(i),bpt1(i)
      enddo
      close(1)
c     cleanup
      call system("txt2bmp tmp1.txt recon.bmp")
      call system("del tmp1.txt")
      call system("del hilb.txt")
      call system("del recon.txt")
      stop
      end
      program reconstruct
      implicit none
      integer rpt,gpt,bpt
      integer*4 ilen,iwid,jlen,jwid,iflg
      integer*4 jsize,isize,i,j,ijk,ijk1
      integer rmin,rmax,gmin,gmax,bmin,bmax
      character*80 infile
      integer fracoper
      integer, allocatable :: darray(:,:)
      write(6,*)'Input thumbnail file name => '
      read(5,*)infile
      write(6,*)'Color (1) or Gray (2) => '
      read(5,*)iflg
      if (iflg.ne.2) iflg=1
      write(6,*)'Thumbnail length => '
      read(5,*)ilen
      write(6,*)'Thumbnail width => '
      read(5,*)iwid
      write(6,*)'Final dyadic length => '
      read(5,*)jlen
      write(6,*)'Final dyadic width => '
      read(5,*)jwid
      open(1,file=infile,status='unknown')
      jsize=ilen*iwid
      allocate (darray(3,jsize))
      ijk=0
      do i=1,ilen
        do j=1,iwid
          ijk=ijk+1
          read(1,*)ijk1,darray(1,ijk),darray(2,ijk),darray(3,ijk)
        enddo
      enddo
      close(1)
      isize=jlen*jwid
      open(1,file='recon.txt',status='unknown')
c     find per-plane min and max for the fractal operator
      rmin=255
      rmax=0
      gmin=255
      gmax=0
      bmin=255
      bmax=0
      do i=1,ijk
        if(darray(1,i).lt.rmin) rmin=darray(1,i)
        if(darray(1,i).gt.rmax) rmax=darray(1,i)
        if(darray(2,i).lt.gmin) gmin=darray(2,i)
        if(darray(2,i).gt.gmax) gmax=darray(2,i)
        if(darray(3,i).lt.bmin) bmin=darray(3,i)
        if(darray(3,i).gt.bmax) bmax=darray(3,i)
      enddo
      do i=1,isize
        rpt=fracoper(darray,1,i,jsize,isize,rmin,rmax)
        if (iflg.eq.2) then
          write(1,*)i-1,rpt,rpt,rpt
        else
          gpt=fracoper(darray,2,i,jsize,isize,gmin,gmax)
          bpt=fracoper(darray,3,i,jsize,isize,bmin,bmax)
          write(1,*)i-1,rpt,gpt,bpt
        endif
      enddo
      close(1)
      stop
      end

      integer function fracoper(darray,k,i,nsub,ntotal,yn1,yx1)
      implicit none
      integer*4 i,k,n,istart,ifinish,iend
      real xpt,ix
      real y1,ys,yf,ye,y2,yn,yx
      integer yn1,yx1
      integer*4 nsub,ntotal
      integer darray(3,nsub)
      n = nint(real(i-1)*real(nsub-2)/real(ntotal-1)) + 1
      istart  = nint((real(ntotal-1)*real(n-1))/real(nsub-1))
      ifinish = nint((real(ntotal-1)*real(n))/real(nsub-1))
      iend    = nint((real(ntotal-1)*real(n+1))/real(nsub-1))
c     real() added: the integer division in the original listing
c     truncates the block-local position to 0 or 1
      xpt = real((i-1)-istart)/real(ifinish-istart)
      ix = (ntotal-1)*xpt
      ys = real(darray(k,n))
      yf = real(darray(k,n+1))
      if (n+2.gt.nsub) then
        ye = real(darray(k,1))
      else
        ye = real(darray(k,n+2))
      endif
      yn = real(yn1)
      yx = real(yx1)
      y1 = (ye-yn)/(yx-yn)
      y1 = y1-((ys-yn)/(yx-yn))
      y1 = -(y1*(real(darray(k,nsub))-real(darray(k,1)))/(ntotal-1))
      y1 = y1+((yf-ys)/real(ntotal-1))
      y1 = y1*ix
      ix = int(xpt*nsub)
      y2 = real(darray(k,ix))+(real(darray(k,ix+1))-real(darray(k,ix)))
     +     *((xpt*nsub)-ix)
      y1 = y1+(((ye-yn)/(yx-yn))-((ys-yn)/(yx-yn)))*y2+ys
      y1 = y1-real(darray(k,1))*(((ye-yn)/(yx-yn))-((ys-yn)/(yx-yn)))
      if (y1.lt.yn) y1=yn
      if (y1.gt.yx) y1=yx
      fracoper = nint(y1)
      return
      end
Modified IFS Random Iteration Algorithm (RECONST) and Hilbert Sequence Program (HILB)

      program reconstruct
      implicit none
      integer*4 ilen,iwid,jlen,jwid,iflg
      integer*4 jsize,isize,i,j,ijk,ijk1,num,npts
      integer rpt,gpt,bpt
      character*80 infile
      real, allocatable :: z(:)
      real, allocatable :: rint(:)
      real, allocatable :: gint(:)
      real, allocatable :: bint(:)
      write(6,*)'Input thumbnail file name => '
      read(5,*)infile
      write(6,*)'Color (1) or Gray (2) => '
      read(5,*)iflg
      if (iflg.ne.2) iflg=1
      write(6,*)'Thumbnail length => '
      read(5,*)ilen
      write(6,*)'Thumbnail width => '
      read(5,*)iwid
      write(6,*)'Final dyadic length => '
      read(5,*)jlen
      write(6,*)'Final dyadic width => '
      read(5,*)jwid
      open(1,file=infile,status='unknown')
      jsize=ilen*iwid
      isize=jlen*jwid
      num=jsize-1
      npts=isize
      allocate (z(0:num))
      allocate (rint(1:npts))
      allocate (gint(1:npts))
      allocate (bint(1:npts))
      do i=1,jsize
        read(1,*)ijk1,rpt,gpt,bpt
        rint(i)=real(rpt)
        gint(i)=real(gpt)
        bint(i)=real(bpt)
      enddo
      close(1)
      open(1,file='recon.txt',status='unknown')
      if (iflg.eq.2) then
        do i=0,num
          z(i)=rint(i+1)
        enddo
        call fractalint(npts,z,num,rint)
      else
        do i=0,num
          z(i)=rint(i+1)
        enddo
        call fractalint(npts,z,num,rint)
        do i=0,num
          z(i)=gint(i+1)
        enddo
        call fractalint(npts,z,num,gint)
        do i=0,num
          z(i)=bint(i+1)
        enddo
        call fractalint(npts,z,num,bint)
      endif
      if (iflg.eq.2) then
        do i=1,npts
          write(1,*)i-1,int(rint(i)),int(rint(i)),int(rint(i))
        enddo
      else
        do i=1,npts
          write(1,*)i-1,int(rint(i)),int(gint(i)),int(bint(i))
        enddo
      endif
      close(1)
      stop
      end

      subroutine FractalInt(points,z,num,yint)
      implicit none
      integer*4 i,k,num,points,ndone
      real x(0:num),z(0:num),d(num)
      real yy(0:num),a(0:num),e(0:num),c(0:num)
      real yint(points)
      real tol,newx,newy,xx,yyy,rndum,b
      real h,dh,dd,correl,sdev,h0,perconf,dfh
      integer nterm,nsets
c     sentinel marks points not yet visited by the random iteration
      do i=1,points
        yint(i)=-999.9999
      enddo
      do i=0,num
        x(i)=real(i)/real(num)
        x(i)=x(i)*real(points-1)+1
      enddo
      xx = x(0)
      yyy = z(0)
      do i=1,num-1
        d(i)=(z(i+1)-z(i-1))/510.0
      enddo
      d(num)=(z(0)-z(num-1))/510.0
c     build the IFS map coefficients from the thumbnail samples
      do i=1,num
        b = x(num)-x(0)
        a(i) = (x(i)-x(i-1))/b
        e(i) = (x(num)*x(i-1)-x(0)*x(i))/b
        c(i) = (z(i)-z(i-1)-d(i)*(z(num)-z(0)))/b
        yy(i) = (x(num)*z(i-1)-x(0)*z(i)-d(i)*
     +          (x(num)*z(0)-x(0)*z(num)))/b
      enddo
c     random iteration; repeat until every output point is visited
 202  do i=1,points
        call random_number(rndum)
        k=1+int(real(num)*rndum)
        if(k.gt.num) k=num
        if(k.lt.1) k=1
        newx = a(k)*xx+e(k)
        newy = c(k)*xx+d(k)*yyy+yy(k)
        yint(nint(newx)) = newy
        xx=newx
        yyy=newy
      enddo
      do i=1,points
        if (yint(i).eq.-999.9999) then
          goto 202
        endif
      enddo
      return
      end
      program hilb
      implicit none
      integer npts
      integer i,j,l1,arraysz,direc,parity,level
      integer ileft(5),iright(5)
      integer*4 incr,incr1
      common /hilby/ i,j,l1,incr,incr1,arraysz,direc,parity,level,
     +       ileft,iright
      write(6,*)'Input number of points (dyadic) => '
      read(5,*)npts
      open(1,file='hilb.txt',status='unknown')
      call hilbert(1,npts)
      close(1)
      stop
      end

      RECURSIVE SUBROUTINE DOHILBERT (L, P)
      implicit none
      integer l,p
      integer i,j,l1,arraysz,direc,parity,level
      integer ileft(5),iright(5)
      integer*4 incr,incr1
      common /hilby/ i,j,l1,incr,incr1,arraysz,direc,parity,level,
     +       ileft,iright
      IF (l.ne.0) THEN
        DIREC = ILEFT(P * DIREC + 3)
        CALL DOHILBERT(L - 1, -P)
        CALL FWD
        DIREC = IRIGHT(P * DIREC + 3)
        CALL DOHILBERT(L - 1, P)
        CALL FWD
        CALL DOHILBERT(L - 1, P)
        DIREC = IRIGHT(P * DIREC + 3)
        CALL FWD
        CALL DOHILBERT(L - 1, -P)
        DIREC = ILEFT(P * DIREC + 3)
      END IF
      RETURN
      END

      INTEGER FUNCTION LINKINDEX (ROW, COLUMN)
      implicit none
      integer row,column
      integer i,j,l1,arraysz,direc,parity,level
      integer ileft(5),iright(5)
      integer*4 incr,incr1
      common /hilby/ i,j,l1,incr,incr1,arraysz,direc,parity,level,
     +       ileft,iright
      LINKINDEX = (COLUMN - 1) * arraysz + ROW
      RETURN
      END

      SUBROUTINE HSTEP (ROW, COLUMN, DIRECTION)
      implicit none
      integer row,column,direction
      integer i,j,l1,arraysz,direc,parity,level
      integer ileft(5),iright(5)
      integer*4 incr,incr1
      common /hilby/ i,j,l1,incr,incr1,arraysz,direc,parity,level,
     +       ileft,iright
      IF (ABS(DIRECTION).eq.1) THEN
        ROW = ROW + DIRECTION
      ELSE
        COLUMN = COLUMN + DIRECTION / 2
      END IF
      RETURN
      END

      SUBROUTINE hilbert (start, finish)
      implicit none
      integer start,finish,num
      integer i1,h,k,linkindex
      integer l,isum
      integer i,j,l1,arraysz,direc,parity,level
      integer ileft(5),iright(5)
      integer*4 incr,incr1
      common /hilby/ i,j,l1,incr,incr1,arraysz,direc,parity,level,
     +       ileft,iright
      num = finish - start + 1
      ILEFT(1) = 1
      ILEFT(2) = -2
      ILEFT(3) = 0
      ILEFT(4) = 2
      ILEFT(5) = -1
      IRIGHT(1) = -1
      IRIGHT(2) = 2
      IRIGHT(3) = 0
      IRIGHT(4) = -2
      IRIGHT(5) = 1
      i = 0
c     j is read before being written in the original listing;
c     initialize it so the sizing loop starts correctly
      j = 0
      DO while (j.le.num)
        i = i + 1
        j = (2 ** i) ** 2
      enddo
      l1 = i - 1
      INCR = 0
      INCR1 = 0
      ISUM = 0
      do I1 = 1, l1 - 1
c       the original listing sums (2**L)**2-1 with L never assigned;
c       I1 appears to be the intended loop index
        ISUM = ISUM + (2 ** I1) ** 2 - 1
      enddo
      do LEVEL = 1,l1
        arraysz = 1
        do H = 1, LEVEL
          arraysz = arraysz * 2
        enddo
        i = 1
        j = 1
        DIREC = 1
        PARITY = 1
        CALL DOHILBERT(LEVEL, PARITY)
        k = LINKINDEX(i, j)
      enddo
      write(1,*)arraysz
      RETURN
      END

      SUBROUTINE FWD
      implicit none
      integer k,linkindex,isum
c     note: isum here is local and uninitialized in the original
c     listing; it shares only its name with isum in subroutine hilbert
      integer i,j,l1,arraysz,direc,parity,level
      integer ileft(5),iright(5)
      integer*4 incr,incr1
      common /hilby/ i,j,l1,incr,incr1,arraysz,direc,parity,level,
     +       ileft,iright
      k = LINKINDEX(i, j)
      CALL HSTEP(i, j, DIREC)
      IF (l1.eq.LEVEL) THEN
        INCR = INCR + 1
        IF (INCR.gt.ISUM) THEN
          INCR1 = INCR1 + 1
          write(1,*)k
        END IF
      END IF
      RETURN
      END
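The HILB program above emits the one-dimensional visit order of a dyadic Hilbert curve. For readers outside Fortran, the same ordering can be sketched with the standard index-to-coordinate conversion; this Python sketch (function name `d2xy` and all variable names are ours) is an illustration, not the thesis code:

```python
def d2xy(n, d):
    """Map a 1-D Hilbert index d to (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    s = 1
    t = d
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate the quadrant as needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx                       # move into the correct quadrant
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Walking the indices in order linearizes a 2-D image along the curve:
order = [d2xy(4, d) for d in range(16)]
```

Consecutive entries in `order` are always grid neighbours, which is why sub-sampling along this ordering keeps spatially adjacent pixels adjacent in the one-dimensional sequence.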
JPEG Compression Code (JPEG)

      block data initjpeg
      implicit none
      integer isize
      parameter (isize=8)
      integer zigzagu
      integer zigzagv
      common /zigu/zigzagu(0:isize*isize-1)
      common /zigv/zigzagv(0:isize*isize-1)
      data zigzagu/0,0,1,2,1,0,0,1,2,3,4,3,2,1,0,0,1,2,3,4,5,6,5,
     *     4,3,2,1,0,0,1,2,3,4,5,6,7,7,6,5,4,3,2,1,2,3,4,
     *     5,6,7,7,6,5,4,3,4,5,6,7,7,6,5,6,7,7/
      data zigzagv/0,1,0,0,1,2,3,2,1,0,0,1,2,3,4,5,4,3,2,1,0,0,1,
     *     2,3,4,5,6,7,6,5,4,3,2,1,0,1,2,3,4,5,6,7,7,6,5,
     *     4,3,2,3,4,5,6,7,7,6,5,4,5,6,7,7,6,7/
      end

      program jpeg
      implicit none
      integer isize
      parameter (isize=8)
      integer jsize
c     note: jsize is 2048 here but 128 in readblock/writeblock;
c     the image dimension must agree across routines
      parameter (jsize=2048)
      integer ksize
      integer pixels
      double precision dct
      integer zigzagu
      integer zigzagv
      double precision reordered
      integer quanttable
      integer rpix(jsize,jsize,3)
      integer final(jsize,jsize,3)
      character*80 pixfile
      character*80 quantfile
      character*80 outfile
      integer lstmean
      integer i,j,k,l,m,n
      integer previousdcterm
      common /zigu/zigzagu(0:isize*isize-1)
      common /zigv/zigzagv(0:isize*isize-1)
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      common /reorder/reordered(0:isize*isize-1)
      common /quant/quanttable(0:isize-1,0:isize-1)
      quantfile="qtable.dat"
      lstmean=128
      ksize=jsize/isize
      pixfile = "input"
      outfile = "output"
      call readblock(pixfile,rpix)
      call readquanttable(quantfile)
      do i=1,ksize
        do j=1,ksize
          do k=1,3
            do l=0,isize-1
              do m=0,isize-1
                pixels(l,m)=rpix(((i-1)*isize)+l+1,
     *                           ((j-1)*isize)+m+1,k)
              enddo
            enddo
            call fdct
            call quantize
            call zigzag
            previousdcterm = lstmean
            call unzigzag
            call dequantize
            call idct
            do l=0,isize-1
              do m=0,isize-1
                final(((i-1)*isize)+l+1,
     *                ((j-1)*isize)+m+1,k)=pixels(l,m)
              enddo
            enddo
          enddo
        enddo
      enddo
      call writeblock(outfile,final)
      stop
      end

      subroutine fdct
      implicit none
      integer x,y,u,v
      double precision sum
      double precision c
      double precision mpi
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      mpi=2.0*asin(1.0)
      do u=0,isize-1
        do v=0,isize-1
          sum = 0.0
          do x=0,isize-1
            do y=0,isize-1
              sum=sum+(pixels(x,y)-128)*cos((2*x+1)*
     *            u*(mpi/16.))*cos((2*y+1)*v*(mpi/16.))
            enddo
          enddo
          dct(u,v)=0.25*C(u)*C(v)*sum
        enddo
      enddo
      return
      end

      subroutine idct
      implicit none
      integer x,y,u,v
      double precision sum
      double precision mpi
      double precision c
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      mpi=2.0*asin(1.0)
      do x=0,isize-1
        do y=0,isize-1
          sum = 0.0
          do u=0,isize-1
            do v=0,isize-1
              sum=sum+C(u)*C(v)*dct(u,v)*cos((2*x+1)*
     *            u*(mpi/16.))*cos((2*y+1)*v*(mpi/16.))
            enddo
          enddo
          pixels(x,y)=128.5+0.25*sum
        enddo
      enddo
      return
      end

      double precision function c(n)
      implicit none
      integer n
      if (n.eq.0) then
        c=1.0/sqrt(2.0)
      else
        c=1.0
      endif
      return
      end

      subroutine printdct
      implicit none
      integer i,j
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      do i=0,isize-1
        write(6,*) i,(dct(i,j),j=0,isize-1)
      enddo
      return
      end

      subroutine printpixels
      implicit none
      integer i,j
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      do i=0,isize-1
        write(6,*) i,(pixels(i,j),j=0,isize-1)
      enddo
      return
      end

      subroutine quantize
      implicit none
      integer i,j
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      integer quanttable
      integer round
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      common /quant/quanttable(0:isize-1,0:isize-1)
      do i=0,isize-1
        do j=0,isize-1
          dct(i,j)=round(dct(i,j)/quanttable(i,j))
        enddo
      enddo
      return
      end

      subroutine dequantize
      implicit none
      integer i,j
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      integer quanttable
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      common /quant/quanttable(0:isize-1,0:isize-1)
      do i=0,isize-1
        do j=0,isize-1
          dct(i,j)=dct(i,j)*quanttable(i,j)
        enddo
      enddo
      return
      end

      integer function round(x)
      implicit none
      double precision x
      if (x.gt.0.0) then
        round=int(x+0.5)
      else
        round=int(x-0.5)
      endif
      return
      end

      subroutine readblock(filename,rpix)
      implicit none
      character*80 filename
      integer i,j
      integer xtemp
      integer jsize
      parameter (jsize=128)
      integer rpix(jsize,jsize,3)
      open(1,file=filename,status='unknown')
      do i=1,jsize
        read(1,*)(xtemp,rpix(i,j,1),rpix(i,j,2),
     +            rpix(i,j,3),j=1,jsize)
      enddo
      close(1)
      return
      end

      subroutine readquanttable(filename)
      implicit none
      character*80 filename
      integer i,j
      integer isize
      parameter (isize=8)
      real fac
      parameter (fac=30)
      integer quanttable
      common /quant/quanttable(0:isize-1,0:isize-1)
      open(1,file=filename,status='unknown')
      do i=0,isize-1
        read(1,*)(quanttable(i,j),j=0,isize-1)
      enddo
      do i=0,isize-1
        do j=0,isize-1
          quanttable(i,j)=int(fac*real(quanttable(i,j)))
        enddo
      enddo
      close(1)
      return
      end

      integer function integercodesize(n)
      implicit none
      integer n
      if (n.lt.0) n=-n
      if (n.eq.1) then
        integercodesize=1
      elseif(n.lt.4) then
        integercodesize=2
      elseif(n.lt.8) then
        integercodesize=3
      elseif(n.lt.16) then
        integercodesize=4
      elseif(n.lt.32) then
        integercodesize=5
      elseif(n.lt.64) then
        integercodesize=6
      elseif(n.lt.128) then
        integercodesize=7
      elseif(n.lt.256) then
        integercodesize=8
      elseif(n.lt.512) then
        integercodesize=9
      elseif(n.lt.1024) then
        integercodesize=10
      else
        write(6,*)"Illegal coefficient value"
        stop
      endif
      return
      end

      subroutine zigzag
      implicit none
      integer i
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      integer zigzagu
      integer zigzagv
      double precision reordered
      common /zigu/zigzagu(0:isize*isize-1)
      common /zigv/zigzagv(0:isize*isize-1)
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      common /reorder/reordered(0:isize*isize-1)
      do i=0,isize*isize-1
        reordered(i)=dct(zigzagu(i),zigzagv(i))
      enddo
      return
      end

      subroutine unzigzag
      implicit none
      integer i
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      integer zigzagu
      integer zigzagv
      double precision reordered
      common /zigu/zigzagu(0:isize*isize-1)
      common /zigv/zigzagv(0:isize*isize-1)
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      common /reorder/reordered(0:isize*isize-1)
      do i=0,isize*isize-1
        dct(zigzagu(i),zigzagv(i))=reordered(i)
      enddo
      return
      end

      subroutine printrunlengths(previousdcterm)
      implicit none
      integer previousdcterm
      integer isize
      parameter (isize=8)
      integer pixels
      double precision dct
      double precision reordered
      common /dctdat/dct(0:isize-1,0:isize-1),
     *       pixels(0:isize-1,0:isize-1)
      common /reorder/reordered(0:isize*isize-1)
      integer i
      integer runlength
      integer dcdiff
      integer integercodesize
      open(2,file="compress.out",status="unknown",access="append")
      runlength=0
      dcdiff=int(dct(0,0))-previousdcterm
      write(2,*)integercodesize(dcdiff),dcdiff
      do i=0,isize*isize-1
        if (reordered(i).eq.0) then
          runlength=runlength+1
        else
          write(2,*)runlength,integercodesize(int(reordered(i))),
     *              int(reordered(i))
          runlength=0
        endif
      enddo
      close(2)
      return
      end

      subroutine writeblock(filename,final)
      implicit none
      character*80 filename
      integer i,j,k
      integer jsize
      parameter (jsize=128)
      integer final(jsize,jsize,3)
      open(1,file=filename,status='unknown')
c     clamp reconstructed values to the valid 8-bit range
      do i=1,jsize
        do j=1,jsize
          do k=1,3
            if(final(i,j,k).lt.0)final(i,j,k)=0
            if(final(i,j,k).gt.255)final(i,j,k)=255
          enddo
        enddo
      enddo
      k=0
      do i=1,jsize
        do j=1,jsize
          write(1,10)k,final(i,j,1),final(i,j,2),final(i,j,3)
          k=k+1
        enddo
      enddo
 10   format(1x,i5,1x,i3,1x,i3,1x,i3)
      close(1)
      return
      end
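The fdct/idct pair above implements the standard JPEG baseline 8x8 DCT-II with a level shift of 128 and its inverse. As a cross-check of that convention, the same formulas can be transcribed directly into Python (an illustration only; the names are ours):

```python
import math

N = 8

def c(u):
    """Normalization factor from the JPEG DCT definition."""
    return 1 / math.sqrt(2) if u == 0 else 1.0

def fdct(pix):
    """Forward 8x8 DCT-II with the JPEG level shift of -128."""
    return [[0.25 * c(u) * c(v) * sum(
                 (pix[x][y] - 128)
                 * math.cos((2 * x + 1) * u * math.pi / 16)
                 * math.cos((2 * y + 1) * v * math.pi / 16)
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct(coeffs):
    """Inverse transform; +128 undoes the level shift."""
    return [[128 + 0.25 * sum(
                 c(u) * c(v) * coeffs[u][v]
                 * math.cos((2 * x + 1) * u * math.pi / 16)
                 * math.cos((2 * y + 1) * v * math.pi / 16)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]
```

Without the quantize/dequantize step the round trip is exact to floating point precision; it is precisely that step which trades accuracy for the final compression ratio.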
Shai-Hulud: The quest for worm sign
Holger M. Jaenisch*a, James W. Handleya, Jeffrey P. Faucheuxb and Ken Lamkinb
aSparta, Inc., 6000 Technology Dr., Bldg. 3, Huntsville, AL 35805
bSparta, Inc., 7075 Samuel Morse Dr., Suite 200, Columbia, MD 21046
ABSTRACT
Successful worm detection at real-time OC-48 and OC-192 speed requires hardware to extract web based binary sequences at faster than these speeds, and software to process the incoming sequences to identify worms. Computer hardware advancement in the form of field programmable gate arrays (FPGAs) makes real-time extraction of these sequences possible. Lacking are mathematical algorithms for worm detection in the real time data sequence, and the ability to convert these algorithms into lookup tables (LUTs) that can be compiled into FPGAs. Data Modeling provides the theory and algorithms for an effective mathematical framework for real-time worm detection and conversion of algorithms into LUTs. Detection methods currently available such as pattern recognition algorithms are limited both by the amount of time to compare the current data sequence with a historical database of potential candidates, and by the inability to accurately classify information that was unseen in the training process. Data Modeling eliminates these limitations by training only on examples of nominal behavior. This results in a highly tuned and fast running equation model that is compiled in a FPGA as a LUT and used at real-time OC-48 and OC-192 speeds to detect worms and other anomalies. This paper provides an overview of our approach for generating these Data Change Models for detecting worms, and their subsequent conversion into LUTs. A proof of concept is given using binary data from a WEBDAV, SLAMMER packet, and RED PROBE attack, with BASIC source code for the detector and LUT provided.
Keywords: Data Modeling, worm detection, change detection, OC-48, OC-192, fractal conversion, FPGA, WEBDAV,
SLAMMER, RED PROBE
1. INTRODUCTION
Identification of web based binary sequences containing worms and viruses in order to prevent computer network attack and intrusion is of interest to the information assurance (IA) community. By flagging these events as they are presented to the network in real-time and not allowing them to infiltrate the network, or by removing them before they can modify or damage information on the network, savings both in terms of cost and man-hours are achieved. To date, there has been no ability to perform worm detection at real-time OC-48 and OC-192 speeds. Achieving this was limited both by the computer power required to extract the sequences and by the mathematical algorithms available to identify worms. Field programmable gate array (FPGA) based computer hardware technology has been developed that overcomes the hardware limitations. FPGAs are electronic chips that are programmed with a bit stream when they are powered up, and can be reprogrammed if necessary. Programs are compiled into the FPGA in the form of a lookup table (LUT). Algorithms written as standard computer programs must therefore be converted into LUT format in order to run in a FPGA computer architecture. Data Modeling provides a method for single-pass identification of worm packets in binary data that can also be compiled into simple LUT format for use in FPGAs at OC-48 and OC-192 speeds1,2,3.
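The LUT conversion just described amounts to pre-evaluating a detector function over its entire quantized input range, so that the FPGA performs only an array fetch at line rate. The detector polynomial and 8-bit quantization below are our own illustrative assumptions, not the paper's compiled tables:

```python
# Hypothetical single-input detector: a polynomial trained on nominal data.
def detector(x):
    return 0.5 + 0.3 * x - 1.2 * x**2 + 0.9 * x**3

# Quantize the input to 8 bits and pre-evaluate every possible code word.
LEVELS = 256
lut = [detector(code / (LEVELS - 1)) for code in range(LEVELS)]

def detect(code):
    # At run time (or in the FPGA fabric) detection is one table fetch.
    return lut[code]
```

The same idea extends to multi-input detectors by concatenating the quantized inputs into a single address word, at the cost of 2^(8n) table entries for n inputs.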
2. DATA MODELING
If worm packets are detectable in real-time at OC-48 and OC-192 speeds, it is possible to intercept and quarantine them before they become resident inside or attack the computer system. In previous published work, Data Modeling was successfully applied to construct change detectors and classifiers for the 1999 KDD-Cup Challenge Data sponsored by DARPA4. This work successfully demonstrated Data Modeling's ability to perform network intrusion detection using data derived from network logon sequences. For worm detection, this process is taken further by characterizing binary sequences of zeros and ones. This binary data is converted into a fractional Brownian motion (fBm) data sequence and characterized using parametric and non-parametric statistics. These statistics are then used to drive the Data Model change detector equation and subsequent LUT.
* [email protected]; phone 1 256 337-3768; fax 1 256 890-2041; sparta.com
Data Models are equations that can themselves be embedded in other functions as variables; hence the term functional5,6. A hierarchy of functional models (equations whose variables are themselves equations) such as the Data Relations Model

y(x_1, x_2, ..., x_n) = f(y_b1(y_b2(...(y_bL(y(x_i, x_j, ..., x_k)))...))),   O[y(x_1, x_2, ..., x_n)] = 3nL   (1)
can be built up to combine any group or subset of previously derived models and monitor waveforms for changes and alerts. By flagging tip-off conditions, rollup diagnostics and assessments become possible that reduce the information made available as waveform dynamics unfold for generating tip-off conditions. Data Relations Modeling finds a mathematical expression that provides a good fit between given finite sample values of the independent variables and the associated values of the dependent variables of the process. The mathematical expression that fits the given sample of data is called a Data Model. This process involves finding both the functional form of the Data Model and the numeric coefficients for the Data Model. The Data Relations Modeling process of discovering automatic functions is an evolutionary and genetic directed approach where the set elements consist entirely of simple polynomial building blocks7. We assert as a fundamental premise that nominal occurs more frequently than off-nominal. If the reverse is true, then the definition simply swaps, making it still hold8. Based on this assumption, we can proceed with an algorithm that will determine normal from abnormal without a priori knowledge or representation of either. Generation of this Data Change Model can be done in either supervised or unsupervised mode. Both supervised and unsupervised modes are covered in detail in Reference 8. Data Change Modeling4,8 is unique in that no a priori examples of what constitutes an anomalous condition need to be characterized or used for deriving the equation. Using only historical knowledge of nominal regions, a Change Detection equation is derived directly from the example feature set that is very highly tuned. The simple equation can take the form of a very large polynomial and is efficiently derived using Data Modeling algorithms. 
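The "train only on nominal" behavior can be illustrated with a deliberately over-fit polynomial: interpolate a high-order model through feature samples drawn from the nominal interval alone, then watch its output leave the training band when an off-nominal feature arrives. The order, feature range, band, and data below are illustrative choices of ours, not values from the paper:

```python
import numpy as np

# 16 nominal samples in a narrow feature band, centred and scaled to [-1, 1]
x_nom = np.linspace(0.4, 0.6, 16)
t_nom = (x_nom - 0.5) / 0.1
y_nom = 0.5 + 0.001 * np.sin(8 * t_nom)   # nominal output hovers near 0.5

# Exact 15th-order interpolation: maximally "tuned" to the nominal band
coeffs = np.linalg.solve(np.vander(t_nom, 16), y_nom)
model = np.poly1d(coeffs)

def is_tip_off(x, band=0.1):
    """Flag when the equation output departs from the training band."""
    t = (x - 0.5) / 0.1
    return abs(model(t) - 0.5) > band
```

Inside the nominal interval the polynomial reproduces the 0.5 training target closely; evaluated at an off-nominal feature value such as 5.0 its high-order terms dominate and the output blows up by many orders of magnitude, which is exactly the flagging behavior described above.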
In practice, this O(3n) polynomial has been derived to be as high as 6,000th order and may be even higher order if the application requires it. This simply insures and builds confidence that any number of natural non-linearities or non-stationarities that exist even in the nominal region can be adequately described with a single, albeit high order, equation. Changes in complex dynamic processes such as manufacturing assembly lines can be monitored using X-Bar and R charts. This is known as Statistical Process Control, a field pioneered by Shewhart. The power of Shewhart’s statistical quality control charts is the ability to flag when a dynamic process is not in statistical control. The X-Bar chart upper control limit (UCL) and lower control limit (LCL) is defined by
UCL = X̄ + A₂R̄
LCL = X̄ − A₂R̄ (2)
and plots how far from the average value (X̄) of a process the current measurement falls. Values that repeatedly fall outside of three standard deviations of the average (the A₂R̄ band) have assignable causes and are labeled as being out of statistical control (off-nominal). By creating a high-order polynomial that learns the associations between inputs, the equation becomes sensitive to perturbations or changes in inputs caused by combinations of these same moments not occurring during the nominal interval, just as Statistical Process Control is sensitive and flags whenever values repeatedly fall outside of the sigma boundary. As such, numerical instability causes the tightly tuned equation to "blow up", resulting in values in excess of ±1×10¹⁰ rather than within the training boundaries of 0.5 ± 0.0012. These radical departures are treated as a real-time flag that the statistics across the feature vector have changed from the status quo. Figure 1 illustrates how binary data streams are converted into fractional Brownian motion (fBm), features are calculated, and the result analyzed to detect whether system status has changed from nominal.
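The control limits in Equation 2 can be computed directly from rational subgroups of measurements. The thesis's own listings are in QuickBASIC; the following is a minimal Python sketch for illustration only, with the function name, the subgroup data, and the A₂ constant (0.577, the tabulated value for subgroups of size 5) as assumptions not taken from the text.

```python
# Sketch of Shewhart X-Bar control limits (Equation 2).
# Assumption: subgroups of size 5, for which the tabulated A2 constant is 0.577.
def xbar_limits(subgroups, a2=0.577):
    """Return (center line, UCL, LCL) for an X-Bar chart from rational subgroups."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    xbarbar = sum(means) / len(means)   # grand mean (X-bar)
    rbar = sum(ranges) / len(ranges)    # mean range (R-bar)
    return xbarbar, xbarbar + a2 * rbar, xbarbar - a2 * rbar

# Illustrative data: two subgroups of five measurements each.
center, ucl, lcl = xbar_limits([[9, 10, 11, 10, 10], [10, 12, 11, 9, 10]])
```

A measurement repeatedly falling outside [lcl, ucl] would be flagged as having an assignable cause, i.e. out of statistical control.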
[Figure 1: a processing chain in which the binary data flow is converted via a fractal conversion network into an fBm data series; rollup statistics (standard deviation, skewness, kurtosis, Z-scores) are computed from its PDF and fed to the Data Model polynomial T = w0 + w1x1 + w2x2 + w3x3 + w4x1² + w5x2² + w6x3² + w7x1³ + w8x2³ + w9x3³ + w10x1x2 + w11x1x3 + w12x2x3, whose equation output separates Nominal from Tip-Off regions.]
Fig. 1. Change detection using Data Modeling.
Because the data is binary, it is first transformed using
zi = 2wi − 1 (3)
where wi is the current binary digit and zi is the transformed value. This converts the data into –1 and +1 values. In the proof of concept contained in this paper, these numbers were converted to – ½ and + ½ by dividing the results of Equation 3 by 2. Once transformed, the data is then integrated using direct summation
yi = yi−1 + (zi − z̄) (4)
resulting in a fractional Brownian motion (fBm) or 1/f type data set. In Equation 4, yi is the current integrated value, yi−1 the previous integrated value, zi the current data sequence point, and z̄ the mean of the data sequence points. This has been demonstrated by the authors in previous work. Each individual descriptive statistic characterizes the mth window of the incoming data sequence. Data Modeling has available over 40 different novel parametric and non-parametric statistics and autonomously decides which statistics are necessary. For this application, only two statistics (standard deviation and skewness) were required as input to the Data Change Model. These real-time features are given in Equations 6 and 7 by
μ_{i+1} = (N_i / N_{i+1}) μ_i + y_{i+1} / N_{i+1} (5)

σ²_{i+1} = (N_i / N_{i+1}) σ²_i + (y_{i+1} − μ_{i+1})² / N_{i+1} (6)

Skew_{i+1} = [N_i Skew_i + ((y_{i+1} − μ_{i+1}) / σ_{i+1})³] / N_{i+1} (7)
where i is the index corresponding to the previous time point, and i+1 the current time point. The real time calculation of the mean in Equation 5 is included for use in the real time estimates in Equations 6 and 7. Once constructed, the final Data Change Model is determined by placing a tight boundary of 0.0012 on either side of the final nominal class designation. All data sets inside of the two boundaries are considered candidate nominal, while all others are considered off-nominal and potential worms.
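The full feature pipeline of Equations 3 through 7 can be condensed into a few lines. This is a Python sketch of the reconstructed recursions, not the thesis's QuickBASIC implementation; the function names are mine, and the recursive variance and skewness updates are my reading of the garbled originals.

```python
import math

def bits_to_fbm(bits):
    """Equations 3-4: map bits to +/-1, scale to +/-0.5 as in the proof of
    concept, then integrate the mean-removed sequence into an fBm-like series."""
    z = [(2 * w - 1) / 2 for w in bits]   # Eq. 3, divided by 2
    zbar = sum(z) / len(z)
    y, out = 0.0, []
    for zi in z:
        y += zi - zbar                    # Eq. 4: y_i = y_{i-1} + (z_i - zbar)
        out.append(y)
    return out

def rolling_stats(y):
    """Equations 5-7: recursive real-time mean, std. dev., and skewness."""
    mu = sd = sk = 0.0
    for i, yi in enumerate(y):
        n0, n1 = i, i + 1                                  # N_i and N_{i+1}
        mu = (n0 / n1) * mu + yi / n1                      # Eq. 5
        var = (n0 / n1) * sd * sd + (yi - mu) ** 2 / n1    # Eq. 6
        sd = math.sqrt(var)
        if sd > 0:
            sk = (n0 * sk + ((yi - mu) / sd) ** 3) / n1    # Eq. 7
    return mu, sd, sk
```

In the paper's setting these statistics would be computed over each 256-point window of the incoming stream and then fed to the Data Change Model.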
The equation Data Models are then converted into lookup table (LUT) format. One method for constructing LUTs for Data Relations Models is through the use of only the DOUBLE equation type given by
y = A + Bx1 + Cx2 + Dx1² + Ex2² + Fx1x2 + Gx1³ + Hx2³ + Ix1²x2 + Jx1x2² (8)

This building block uses as input 2 channels of information (x1 and x2), which can be plotted on the x-axis and y-axis of the LUT separately. The value read from the LUT is the current estimate of the condition of the system variable (nominal or tip-off). If multiple DOUBLES are used in constructing the final Data Model, multiple LUTs can be used, each representing an individual DOUBLE. In this case, the value from one LUT is used as input into the next LUT. The proof of concept Data Model explained in the next section was constructed using a series of DOUBLE equations; however, the LUT was constructed as a final output LUT, meaning that the results of each intermediate DOUBLE equation are captured using a single LUT that maps the inputs directly into the output value. Figure 2 provides pseudo-code for generating this LUT.
ngrid = 25 : i = 0
REDIM smin(1), smax(1), sdev(1), dfeat(1), prob(ngrid,ngrid)
smin(0) = 0@ : smax(0) = 30@ : sdev(0) = 0@
smin(1) = -1.7964 : smax(1) = 1.2456 : sdev(1) = .5339
dfeat(0) = ((smax(0) + sdev(0)) - (smin(0) - sdev(0))) / (ngrid - 1)
dfeat(1) = ((smax(1) + sdev(1)) - (smin(1) - sdev(1))) / (ngrid - 1)
FOR featx = smin(0) - sdev(0) TO smax(0) + sdev(0) + .9 * dfeat(0) STEP dfeat(0)
i = i + 1 : j = 0
FOR featy = smin(1) - sdev(1) TO smax(1) + sdev(1) + .9 * dfeat(1) STEP dfeat(1)
j = j + 1
feat1 = (featx - 3.3274) / 1.1979 : feat2 = (featy - -.092) / .4131
' Data Model equation here
call datamodel(feat1,feat2,outval)
prob(i, j) = outval
NEXT featy
NEXT featx
OPEN "cld4map.out" FOR OUTPUT AS #1
FOR i = ngrid TO 1 STEP -1
FOR j = 1 TO ngrid
PRINT #1, prob(i, j);
NEXT j
PRINT #1,
NEXT i
CLOSE : END
Fig. 2. Pseudo-code for generating an LUT.
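The DOUBLE building block of Equation 8 is straightforward to evaluate. Below is a minimal Python sketch (the document's own code is QuickBASIC); the function name and the coefficient tuple are illustrative, with the ten terms ordered to match the per-layer pattern visible in the Figure 3 listing.

```python
def double_term(x1, x2, c):
    """Evaluate one DOUBLE building block (Equation 8).
    c holds the ten coefficients A..J, in the term order used in Fig. 3:
    1, x1, x2, x1^2, x2^2, x1*x2, x1^3, x2^3, x1^2*x2, x1*x2^2."""
    A, B, C, D, E, F, G, H, I, J = c
    return (A + B * x1 + C * x2 + D * x1 ** 2 + E * x2 ** 2 + F * x1 * x2
            + G * x1 ** 3 + H * x2 ** 3 + I * x1 ** 2 * x2 + J * x1 * x2 ** 2)

# DOUBLEs chain: the output of one becomes an input of the next,
# as l1o1 feeds l2o1 and l2o2 in the Figure 3 listing.
inner = double_term(0.1, 0.2, (0.0155, -0.0093, -0.0406, 0.0417,
                               -0.0472, -0.1041, 0, 0.015, 0.0208, 0.0258))
```

A final output LUT simply tabulates this chained evaluation over a grid of (x1, x2) values, as the Figure 2 pseudo-code does.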
3. PROOF OF CONCEPT

The authors were supplied with four (4) data sets consisting of nominal and attack web server data in binary format. The first data set is a web server browsing session with no attacks (labeled nominal, or attack free). The second data set is a single slammer worm packet. The third data set is all the packets associated with a web server worm attack (both nominal and anomalous). The fourth data set is a second type of worm packet (code red probe) buried among the nominal web server packets. Worm sign regions in data exist where the dynamics differ from the dynamics of nominal web server data. Statistics that capture these differences are used as input features to the Data Change Model. Worm sign regions will have one or more features that fall either outside of the nominal training boundaries, or in combinations that were not observed during training. This leads to the nominal Data Model flagging the regions as tip-off. First, the Data Change Model shown in Figure 3 was constructed using the nominal-only data set. This Data Model uses the standard deviation and skewness calculated in Equations 6 and 7 and provided in a file named input. The program writes out a file named out1 containing the actual Data Model equation value for nominal cases (ranging between 0.4988 and 0.5012), and 1 for potential worm detection cases. Only standard deviation (Equation 6) and skewness (Equation 7) were required by the Data Model to distinguish between the provided nominal and worm data. This is due in part to the worm data having larger standard deviations than the nominal data, to some worm data having skewness values smaller than the currency precision used in the Data Model (and therefore rounded to zero), and to the existence of worm packets in the fourth data set whose skewness was mathematically zero. However, as shown in the examples below, the standard deviation of these features is large, causing overlap in the distributions of the features. To resolve this overlap, it is necessary to use the Data Model and subsequent lookup tables for real-time firmware implementation. Zero skewness also exists in the nominal data (shown in the examples below), making it impossible to set a simple threshold to separate nominal from worm.
DEFCUR A-Z
open "input" for input as #1
input #1, sdev
input #1, skew
close #1
sdev = (sdev-3.3274) / 1.1979
skew = (skew--0.092) / 0.4131
l1o1=0.0155
l1o1=l1o1+ (-0.0093) * (skew)
l1o1=l1o1+ (-0.0406) * (sdev)
l1o1=l1o1+ (0.0417) * (skew * skew)
l1o1=l1o1+ (-0.0472) * (sdev * sdev)
l1o1=l1o1+ (-0.1041) * (skew * sdev)
l1o1=l1o1+ (0) * (skew * skew * skew)
l1o1=l1o1+ (0.015) * (sdev * sdev * sdev)
l1o1=l1o1+ (0.0208) * (sdev* skew * skew)
l1o1=l1o1+ (0.0258)* (skew * sdev * sdev)
l2o1=-0.0608
l2o1=l2o1+ (1.7906) * (l1o1)
l2o1=l2o1+ (0.004) * (sdev)
l2o1=l2o1+ (1.3184) * (l1o1 * l1o1)
l2o1=l2o1+ (0.013) * (sdev * sdev)
l2o1=l2o1+ (0.0845) * (l1o1 * sdev)
l2o1=l2o1+ (-3.368)* (l1o1 * l1o1 * l1o1)
l2o1=l2o1+ (-0.004)* (sdev * sdev * sdev)
l2o1=l2o1+ (0.9435)* (sdev * l1o1 * l1o1)
l2o1=l2o1+ (-0.1378)*(l1o1 * sdev * sdev)
l2o2=-0.02
l2o2=l2o2+ (1.0804) * (l1o1)
l2o2=l2o2+ (0.0072) * (skew)
l2o2=l2o2+ (-0.0925) * (l1o1 * l1o1)
l2o2=l2o2+ (-0.0074) * (skew * skew)
l2o2=l2o2+ (0.0356) * (l1o1 * skew)
l2o2=l2o2+ (-4.5598)*(l1o1 * l1o1 * l1o1)
l2o2=l2o2+ (0) * (skew * skew * skew)
l2o2=l2o2+ (-0.4186)*(skew * l1o1 * l1o1)
l2o2=l2o2+ (0.1866)*(l1o1 * skew * skew)
l3o3=-0.0009
l3o3=l3o3+ (-0.1613) * (l2o1)
l3o3=l3o3+ (1.1353) * (l2o2)
l3o3=l3o3+ (1.6504) * (l2o1 * l2o1)
l3o3=l3o3+ (1.6929) * (l2o2 * l2o2)
l3o3=l3o3+ (-3.4651) * (l2o1 * l2o2)
l3o3=l3o3+ (17.8793)*(l2o1 * l2o1 * l2o1)
l3o3=l3o3+ (-2.9467)*(l2o2 * l2o2 * l2o2)
l3o3=l3o3+ (-36.9399)*(l2o2* l2o1 * l2o1)
l3o3=l3o3+ (20.7681)*(l2o1 * l2o2 * l2o2)
l3o2=0.0124
l3o2=l3o2+ (1.0601) * (l2o2)
l3o2=l3o2+ (0.0098) * (skew)
l3o2=l3o2+ (-0.7028) * (l2o2 * l2o2)
l3o2=l3o2+ (-0.0034) * (skew * skew)
l3o2=l3o2+ (0.0971) * (l2o2 * skew)
l3o2=l3o2+ (0.4137)* (l2o2 * l2o2 * l2o2)
l3o2=l3o2+ (0) * (skew * skew * skew)
l3o2=l3o2+ (-0.235)* (skew * l2o2 * l2o2)
l3o2=l3o2+ (-0.0059)*(l2o2 * skew * skew)
l4o1=-0.0177
l4o1=l4o1+ (-0.3696) * (l3o3)
l4o1=l4o1+ (1.4537) * (l3o2)
l4o1=l4o1+ (3.2139) * (l3o3 * l3o3)
l4o1=l4o1+ (4.2476) * (l3o2 * l3o2)
l4o1=l4o1+ (-7.0018) * (l3o3 * l3o2)
l4o1=l4o1+ (-0.1307)*(l3o3 * l3o3 * l3o3)
l4o1=l4o1+ (-7.5133)*(l3o2 * l3o2 * l3o2)
l4o1=l4o1+ (-5.6853)*(l3o2 * l3o3 * l3o3)
l4o1=l4o1+ (13.6263)*(l3o3 * l3o2 * l3o2)
l3o1=-0.0092
l3o1=l3o1+ (1.2443) * (l2o2)
l3o1=l3o1+ (-0.2622) * (l2o1)
l3o1=l3o1+ (1.623) * (l2o2 * l2o2)
l3o1=l3o1+ (1.4277) * (l2o1 * l2o1)
l3o1=l3o1+ (-3.1749) * (l2o2 * l2o1)
l3o1=l3o1+ (-3.0658)*(l2o2 * l2o2 * l2o2)
l3o1=l3o1+ (18.4779)*(l2o1 * l2o1 * l2o1)
l3o1=l3o1+ (21.2166)*(l2o1 * l2o2 * l2o2)
l3o1=l3o1+ (-37.9024)*(l2o2* l2o1 * l2o1)
l4o3=-0.0167
l4o3=l4o3+ (-0.2434) * (l3o1)
l4o3=l4o3+ (1.3298) * (l3o2)
l4o3=l4o3+ (2.6275) * (l3o1 * l3o1)
l4o3=l4o3+ (3.802) * (l3o2 * l3o2)
l4o3=l4o3+ (-5.9747) * (l3o1 * l3o2)
l4o3=l4o3+ (0.0312)* (l3o1 * l3o1 * l3o1)
l4o3=l4o3+ (-6.736)* (l3o2 * l3o2 * l3o2)
l4o3=l4o3+ (-5.3589)*(l3o2 * l3o1 * l3o1)
l4o3=l4o3+ (12.334)* (l3o1 * l3o2 * l3o2)
l5o1=-0.0265
l5o1=l5o1+ (0.2206) * (l4o1)
l5o1=l5o1+ (0.7282) * (l4o3)
l5o1=l5o1+ (0.0982) * (l4o1 * l4o1)
l5o1=l5o1+ (-2.7116) * (l4o3 * l4o3)
l5o1=l5o1+ (3.2379) * (l4o1 * l4o3)
l5o1=l5o1+ (-0.0158)*(l4o1 * l4o1 * l4o1)
l5o1=l5o1+ (3.2853)* (l4o3 * l4o3 * l4o3)
l5o1=l5o1+ (-9.306)* (l4o3 * l4o1 * l4o1)
l5o1=l5o1+ (6.1) * (l4o1 * l4o3 * l4o3)
l5o1= l5o1* (0.0002) + (0.5)
rem nominal l5o1 between .4988 and .5012
rem worm l5o1 set to 1
if l5o1 < .4988 then l5o1 = 1.0
if l5o1 > .5012 then l5o1 = 1.0
open "out1" for output as #1
print #1, l5o1
close #1
end
Fig. 3. QuickBASIC source code for the Worm Detector Data Model.

For the nominal-only data, each 256-point window flagged as being like the nominal training data. Next, statistics were generated for the unseen slammer packet data using 256-point windows. These statistics were then passed through the Data Model equation. Since this data file represents a single "worm" and no nominal packet data, all windows should flag as tip-off. Passing this data through the Data Model produced constant tip-off for all data windows, as expected.
Third, statistics for an unseen worm attack (both the worm packet and the data packets returning the error codes generated by the attack) were presented to the Data Model for consideration. As in processing the nominal data and the slammer packet data, areas that are nominal should pass through the Data Model, while those flagged by the Data Model as different from the training data should flag as off-nominal. Using the Data Model, data was flagged as different from the nominal data used during training, indicating the presence of a worm. Finally, a different type of worm-based web attack (red probe) was presented to the Data Model. This data set consisted of a much larger region of nominal behavior before and after the worm, and a much shorter worm than in the previous case. The Data Model flagged the new worm packet data as different from the surrounding nominal packet data. Figure 4 shows the decision map generated from the Data Model and the location, in the tip-off region, of the worm packets for each of the data sets. This graphically captures the information contained in the LUT found in Figure 5.
[Figure 4: decision map with skewness (−2.3 to 1.8) on the horizontal axis and standard deviation (0.0 to 30.0) on the vertical axis; the nominal region lies between equation outputs 0.4988 and 0.5012, the tip-off region is set to 1.0, and the worm packets fall within the tip-off region.]
Fig. 4. Decision map generated from the equation Data Model with the Red Probe worm packets superimposed.
The lookup table that can be embedded in FPGAs and used instead of the equation Data Model is given in Figure 5. The LUT gives the final Data Model nominal or tip-off decision. The size of the LUT here is 25x25; however, there is no limitation on size, nor is there a constraint that the LUT be square. In fact, the LUT can be whatever size is required for adequate characterization of the Data Model equation while keeping the LUT as small as possible. To fit the LUT on a single page while ensuring legibility, the skewness axis was split into two pieces, with the lower values presented in the top half and the higher skewness values in the lower half. Standard deviation and skewness are read directly off the bottom and left axes, and the corresponding LUT value is read from the intersection. Tip-off regions in the LUT are represented by ones (1) in the individual elements. There is no need to record the value of the tip-off, and the use of ones helps minimize and fix the size of the LUT for presentation. As test cases, 8 raw binary sequences that can be passed through the LUT are provided in Table 1 (4 nominal cases followed by 4 worm cases). In reading numbers from the LUT, if an input value fell between two columns, the neighborhood surrounding where the value would occur on the table was read and the value farthest from ½ chosen to represent the calculation. The results of passing these 8 cases through the LUT are given in Table 2.
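Reading the LUT amounts to mapping each feature value onto the nearest grid cell of its axis, clamping values that fall outside the training range to the edge cells. Below is a minimal Python sketch of that index mapping; the function name is mine, and the axis bounds are taken from the smin/smax values in the Figure 2 pseudo-code and the Figure 5 axis labels.

```python
def lut_index(value, lo, hi, ngrid=25):
    """Map a feature value onto the nearest cell index (0..ngrid-1) of an
    ngrid-point axis spanning [lo, hi], clamping out-of-range values."""
    step = (hi - lo) / (ngrid - 1)
    idx = round((value - lo) / step)
    return max(0, min(ngrid - 1, idx))

# Illustrative bounds from Fig. 2 / Fig. 5:
# standard deviation spans [0, 30]; skewness spans about [-2.3303, 1.7795].
row = lut_index(9.2613, 0.0, 30.0)     # standard-deviation axis (Case 8 value)
col = lut_index(0.0, -2.3303, 1.7795)  # skewness axis (Case 8 value)
```

The cell at (row, col) would then hold either 1 (tip-off) or a value near 0.5 (nominal), as in Figure 5.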
4. SUMMARY

Data Modeling only requires examples of nominal behavior in order to train. Classical methods are limited both by the amount of time required to compare the current data sequence with a historical database of potential candidates and by the inability to accurately classify information that was unseen in the training process. Data Models do not suffer from these limitations; rather, they provide single-pass solutions that accurately classify information unseen during training as being different. The results are highly tuned and fast-running equation models that can be used at real-time OC-48 and OC-192 speeds to detect the presence of anomalies such as worms, as successfully demonstrated in this work.
[Figure 5: the 25×25 LUT, with standard deviation (0.0 to 30.0) on one axis and skewness (−2.3303 to 1.7795, split into two halves for legibility) on the other. Tip-off cells contain 1; nominal cells contain values between 0.4988 and 0.5012 (approximately 0.5000).]
Fig. 5. LUT derived from equations and source code in Figures 2 and 3.
Case 1 0100011101000101010101000010000000101111001000000100100001010100010101000101000000101111001100010010111000110001000011010000101001001000011011110111001101110100001110100010000001110111011101110111011100101110011000110110110001101111011101010110010001110011
Case 2 0110100001101001011001010110110001100100001011100110001101101111011011010000110100001010010101010111001101100101011100100010110101000001011001110110010101101110011101000011101000100000010011010110111101111010011010010110110001101100011000010010111100110101
Case 3 1001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001100110011001
Case 4 0010000001110100011001010111100001110100001011110111100001101101011011000010110001100001011100000111000001101100011010010110001101100001011101000110100101101111011011100010111101111000011011010110110000101100011000010111000001110000011011000110100101100011
Case 5 1011000001000010101110000000000100000001000000010000000100110001110010011011000100011000010100001110001011111101001101010000000100000001000000010000010101010000100010011110010101010001011010000010111001100100011011000110110001101000011001010110110000110011
Case 6 1101100111111111100010110100010110110100100011010000110001000000100011010001010010001000110000011110001000000100000000011100001011000001111000100000100000101001110000101000110100000100100100000000000111011000100010010100010110110100011010100001000010001101
Case 7 0100100000000100010010000000010001001000000001000100100000000100010010000000010001001000000001000100100000000100010010001001000010010000100100001001000010010000100100001001000010010000100100001001000010010000100100001001000010010000100100001001000010010000
Case 8 0101100001011000010110000101100001011000010110000101100001011000010110000101100001011000010110000101100001011000010110000101100001011000010110000101100001011000010110000101100001011000010110000101100001011000010110000101100001011000010110000101100001011000
Table 1. 8 example cases.
Case  Std. Dev.           Skewness            Result
1     5.1023 ± 1.5098     0.7897 ± 0.5339     Nominal
2     1.0863 ± 1.5098     0.6205 ± 0.5339     Nominal
3     0.3542 ± 1.5098     0.0000 ± 0.5339     Nominal
4     1.6501 ± 1.5098     0.0992 ± 0.5339     Nominal
5     10.5318 ± 7.5622    0.4855 ± 0.5270     Slammer Worm
6     12.2758 ± 7.5622    -0.0819 ± 0.5270    Slammer Worm
7     20.3279 ± 7.5622    0.1443 ± 0.5270     WebDav Worm
8     9.2613 ± 7.5622     0.0000 ± 0.5270     Red Probe Worm
Table 2. Results of passing statistics from 8 example cases through the LUT.
REFERENCES
1. Cohen, F.B., Protection and Security on the Information Superhighway, New York: John Wiley and Sons, 1995.
2. Denning, P.J., Computers Under Attack: Intruders, Worms, and Viruses, Reading, MA: Addison-Wesley, 1990.
3. Hurley, R.B., Decision Tables in Software Engineering, New York: Van Nostrand, 1983.
4. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., Harris, B., "Data Modeling of network dynamics", Proceedings of SPIE, San Diego, CA, August 4, 2003.
5. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., "Data Driven Differential Equation Modeling of fBm processes", Proceedings of SPIE, San Diego, CA, August 4, 2003.
6. Jaenisch, H., Handley, J., Pooley, J., Murray, S., "Virtual Instrument Prototyping With Data Modeling", JANNAF 39th Combustion, 27th Airbreathing Propulsion, 21st Propulsion Systems Hazards, and 3rd Modeling and Simulation Subcommittees Joint Meeting, Colorado Springs, CO, December 1-5, 2003.
7. Jaenisch, H., Handley, J., "Automatic Differential Equation Data Modeling for UAV Situational Awareness", Society for Computer Simulation (Huntsville Simulation Conference 2003), Huntsville, AL, October 30-31, 2003.
8. Jaenisch, H., Handley, J., Lim, A., Filipovic, M., White, G., Hons, A., Deragopian, G., Schneider, M., Edwards, M., "Data Modeling for Virtual Observatory data mining", SPIE Astronomical Telescopes and Instrumentation 2004, Glasgow, Scotland, UK, June 24, 2004.