![Page 1: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/1.jpg)
Engineering Data Analysis & Modeling: Practical Solutions to Practical Problems
Dr. James McNames
Biomedical Signal Processing Laboratory
Electrical & Computer Engineering
Portland State University
![Page 2: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/2.jpg)
Course Overview
• Key question: How to extract useful information from data?
• Some theory
• Mostly methods & applications
• Problem oriented, not technology focused
• Project course
![Page 3: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/3.jpg)
Talk Overview
• Problem definitions
• Applications
• Project ideas
• Course specifics
![Page 4: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/4.jpg)
Problem Definitions
• Preprocessing (briefly)
– Variable selection
– Dimension reduction
• Decision theory (hypothesis testing)
• Density estimation
• Nonlinear optimization
• Pattern recognition/Classification (very briefly)
• Nonlinear modeling (univariate & multivariate)
![Page 5: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/5.jpg)
Variable Selection
Diagram: inputs P(t) (previous price), C(t) (competitor's price), and G(t) (Greenspan's BP) feed a nonlinear model whose output is P(t+1).
• Many algorithms fail if too many inputs
• Often fewer inputs are sufficient due to
– Redundant inputs
– Irrelevant inputs
• Goal: Find a subset of inputs that maximizes model accuracy
• Is Greenspan’s BP relevant?
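One way to make the goal above concrete is greedy forward selection: start with no inputs and repeatedly add whichever input most reduces the error on held-out data. The sketch below is only an illustration under stated assumptions. It uses a linear least-squares model in place of the slide's nonlinear model, synthetic data, and hypothetical names (`holdout_mse`, `forward_select`, `min_gain`) that do not come from the course.

```python
import numpy as np

def holdout_mse(X_train, y_train, X_test, y_test, cols):
    """Fit least squares on the chosen columns; return mean squared error on held-out data."""
    idx = np.asarray(cols, dtype=int)
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, idx]])
    A_test = np.column_stack([np.ones(len(X_test)), X_test[:, idx]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_test @ coef - y_test) ** 2))

def forward_select(X_train, y_train, X_test, y_test, min_gain=0.05):
    """Greedily add inputs until the best candidate improves held-out error by less than min_gain."""
    selected = []
    remaining = list(range(X_train.shape[1]))
    best = holdout_mse(X_train, y_train, X_test, y_test, selected)
    while remaining:
        errors = {j: holdout_mse(X_train, y_train, X_test, y_test, selected + [j])
                  for j in remaining}
        j_best = min(errors, key=errors.get)
        if errors[j_best] > best * (1.0 - min_gain):
            break  # no candidate helps enough; stop adding inputs
        selected.append(j_best)
        remaining.remove(j_best)
        best = errors[j_best]
    return selected, best

# Synthetic stand-in data (an assumption, not course data):
# column 0 plays the role of P(t), column 1 of C(t), column 2 of an irrelevant input.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=400)

selected, err = forward_select(X[:200], y[:200], X[200:], y[200:])
print("selected inputs:", selected, "held-out MSE:", err)
```

Scoring on held-out data rather than training data matters here: on training data, adding an input never increases least-squares error, so irrelevant inputs would always look useful.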
![Page 6: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/6.jpg)
Dimension Reduction
• Redundant inputs can also be combined into a smaller composite set
– Improves accuracy
– Reduces computation
• If done well, minimal information is lost
• Used for signal compression
• Principal component analysis is most common
Diagram: raw inputs x pass through dimension reduction to give features u, which feed a nonlinear model with output y.
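Principal component analysis, named above as the most common method, can be sketched with a singular value decomposition of the centered data. This is a minimal illustration, assuming synthetic raw inputs that are nearly redundant; the function name `pca_reduce` and the data are hypothetical.

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X onto the k directions of largest variance."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # center each raw input
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                             # principal directions, one per row
    features = Xc @ components.T                    # reduced features u
    explained = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return features, components, mean, explained

# Three raw inputs that are nearly multiples of one hidden variable t (assumed data).
rng = np.random.default_rng(0)
t = rng.normal(size=500)
X = np.column_stack([t, 2.0 * t, -t]) + 0.05 * rng.normal(size=(500, 3))

u, components, mean, explained = pca_reduce(X, k=1)
X_hat = u @ components + mean                       # reconstruction from one feature
print("fraction of variance kept:", explained)
print("reconstruction MSE:", float(np.mean((X_hat - X) ** 2)))
```

Because the three inputs are almost redundant, a single feature keeps nearly all of the variance, which is the sense in which little information is lost.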
![Page 7: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/7.jpg)
Dimension Reduction Example 1
![Page 8: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/8.jpg)
Dimension Reduction Example 2
![Page 9: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/9.jpg)
Nonlinear Optimization
• Find the vector a such that E(a) is minimized
• Many algorithms have parameters that must be “fit” to the data
• Usually “fit” by minimizing error measure
• Sometimes subject to a constraint G(a) = 0
• Unconstrained optimization more common
• Very widely used
• Many engineering applications
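As a small illustration of fitting parameters by minimizing an error measure, the sketch below applies plain gradient descent to an unconstrained problem. Everything specific in it is an assumption made for the example: the model a[0] * exp(-a[1] * x), the mean squared error E(a), the noise-free data, the starting point, and the step size. Gradient descent is only one of many optimization algorithms and is not necessarily the one used in the course.

```python
import math

def model(a, x):
    """Hypothetical two-parameter model: a[0] * exp(-a[1] * x)."""
    return a[0] * math.exp(-a[1] * x)

def error(a, xs, ys):
    """E(a): mean squared error between model and data."""
    return sum((model(a, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(a, xs, ys):
    """Analytic gradient of E(a) with respect to a[0] and a[1]."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        decay = math.exp(-a[1] * x)
        residual = a[0] * decay - y
        g0 += 2.0 * residual * decay
        g1 += 2.0 * residual * (-a[0] * x * decay)
    n = len(xs)
    return [g0 / n, g1 / n]

def gradient_descent(a_start, xs, ys, step=0.2, n_iter=20000):
    """Repeatedly move a small step against the gradient of E(a)."""
    a = list(a_start)
    for _ in range(n_iter):
        g = gradient(a, xs, ys)
        a = [ai - step * gi for ai, gi in zip(a, g)]
    return a

# Noise-free data generated from known parameters (an assumption for the demo).
xs = [0.1 * i for i in range(21)]
true_a = [2.0, 1.5]
ys = [model(true_a, x) for x in xs]

a_fit = gradient_descent([1.0, 1.0], xs, ys)
print("fitted a:", a_fit, "E(a):", error(a_fit, xs, ys))
```

A fixed step size keeps the sketch short; practical methods usually adapt the step or use curvature information to converge faster.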
![Page 10: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/10.jpg)
Pattern Recognition
• Closely related to nonlinear modeling
• Goal is to identify most likely category given an input vector
• Equivalent to drawing decision boundaries
• Following example
– Crab data
– Four categories
– Two composite inputs
![Page 11: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/11.jpg)
Crabs Data Set
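The idea that classification amounts to drawing decision boundaries can be sketched with a nearest-centroid rule: each category is summarized by the mean of its training points, and a new input is assigned to the closest mean, so the boundaries are the perpendicular bisectors between centroids. The points below are invented toy values with four categories and two inputs; they are not the crab data, and the function names are hypothetical.

```python
from collections import defaultdict
from math import dist

def fit_centroids(points, labels):
    """Return the mean point of each category."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in groups.items()
    }

def classify(centroids, x):
    """Assign x to the category whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], x))

# Invented toy data: four categories, two inputs each.
points = [
    (0.1, 0.2), (-0.3, 0.1), (0.2, -0.2),   # category A
    (0.1, 4.2), (-0.2, 3.9), (0.3, 4.1),    # category B
    (4.1, 0.1), (3.8, -0.2), (4.2, 0.3),    # category C
    (4.0, 4.1), (3.9, 3.8), (4.2, 4.2),     # category D
]
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3

centroids = fit_centroids(points, labels)
print(classify(centroids, (0.5, 3.0)))
print(classify(centroids, (3.0, 1.0)))
```

The nearest centroid coincides with the most likely category only under strong assumptions (equal priors and identical spherical Gaussian classes); real data generally call for more flexible boundaries.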
![Page 12: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/12.jpg)
Biomedical Application
• Goal: identify brain cell types from microrecordings
• Current research project
• 5 categories of cell types
• Created metrics to characterize signals
• Following scatterplot shows 2 of these metrics
![Page 13: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/13.jpg)
Neurosurgery Example
![Page 14: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/14.jpg)
Nonlinear Modeling
• Given many examples of observed variables, create a model that can predict the output
• No other assumed knowledge
• Observed variables
– Quantitative
– Measurable
Diagram: a process with output y is driven by observed variables x_1, ..., x_{n_c} and unobserved variables z_1, ..., z_{n_z}; further observed variables x_{n_c+1}, ..., x_{n_d} are also shown.
![Page 15: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/15.jpg)
Nonlinear Modeling
• Observed variables may not be causal
• Not all causal effects are observed
• Model will not be perfect
• How do you measure how good the model is?
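Diagram (top): a process with output y is driven by observed variables x_1, ..., x_{n_c} and unobserved variables z_1, ..., z_{n_z}; further observed variables x_{n_c+1}, ..., x_{n_d} are also shown.
Diagram (bottom): a model takes all observed variables x_1, ..., x_{n_d} and produces the output y.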
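One common answer to the question of how to measure model quality is to estimate prediction error on data the model was not fit to, for example with k-fold cross-validation. The sketch below is an assumption-laden illustration, not the course's prescribed method: the polynomial models, the synthetic sine data, and the name `cross_val_mse` are all chosen for the example.

```python
import numpy as np

def cross_val_mse(x, y, degree, n_folds=5, seed=0):
    """Average held-out mean squared error of a polynomial fit over n_folds splits."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    errors = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        coef = np.polyfit(x[train_idx], y[train_idx], degree)   # fit on training folds
        pred = np.polyval(coef, x[test_idx])                    # predict held-out fold
        errors.append(np.mean((pred - y[test_idx]) ** 2))
    return float(np.mean(errors))

# Synthetic process (assumed): a smooth nonlinear function plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(-3.0, 3.0, size=200)
y = np.sin(x) + 0.1 * rng.normal(size=200)

cv = {degree: cross_val_mse(x, y, degree) for degree in (1, 3, 5)}
for degree, mse in cv.items():
    print(f"degree {degree}: cross-validated MSE = {mse:.4f}")
```

Comparing candidate models by held-out error rather than by fit to the training data penalizes models that merely memorize noise.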
![Page 16: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/16.jpg)
Smoothing
• For single-input single-output (SISO) systems, can plot the data
• Problem is to estimate a curve that most accurately predicts future points
• Could draw a smooth curve by hand
• More difficult to implement automatically
• More than one curve may be reasonable
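Drawing a smooth curve automatically can be sketched with a kernel smoother, which estimates the curve at each query point as a weighted average of nearby observations. This is one possible smoother among many, shown under assumptions: Gaussian weights, synthetic noisy data, and two arbitrary bandwidths. The function name `kernel_smooth` is hypothetical.

```python
import math
import random

def kernel_smooth(x_data, y_data, x_query, bandwidth):
    """Gaussian-weighted average of y_data around x_query.

    Assumes x_query is close enough to the data that the weights do not all underflow to zero.
    """
    weights = [math.exp(-0.5 * ((x_query - xi) / bandwidth) ** 2) for xi in x_data]
    return sum(w * yi for w, yi in zip(weights, y_data)) / sum(weights)

# Synthetic SISO data (assumed): a sine curve observed with noise.
rng = random.Random(0)
x_data = [0.1 * i for i in range(61)]
y_data = [math.sin(x) + rng.gauss(0.0, 0.2) for x in x_data]

grid = [0.5 * i for i in range(13)]
tight = [kernel_smooth(x_data, y_data, xq, 0.15) for xq in grid]  # follows the data closely
loose = [kernel_smooth(x_data, y_data, xq, 0.6) for xq in grid]   # smoother, flatter curve

for xq, a, b in zip(grid, tight, loose):
    print(f"x={xq:.1f}  bandwidth 0.15: {a:+.3f}  bandwidth 0.6: {b:+.3f}")
```

The two bandwidths give visibly different curves from the same data, which illustrates why more than one curve may be reasonable; choosing between them requires some measure of predictive accuracy.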
![Page 17: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/17.jpg)
Smoothing Example
![Page 18: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/18.jpg)
Multiple “Reasonable” Solutions
![Page 19: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/19.jpg)
Nonlinear Modeling
• Many methods do not work well
• Usually much more difficult
– Noise
– Multiple inputs
– Time-varying system
– Small data sets
• Still an active area of research
• Will discuss “tried and true” solutions
![Page 20: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/20.jpg)
Overview of Course
• Introduction & review
• Linear models
• Univariate smoothing
• Optimization algorithms
• Nonlinear modeling
• Pattern recognition & classification
![Page 21: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/21.jpg)
Application Areas
• Engineering
– Controls (system identification)
– Signal processing (estimation & prediction)
– Communications (channel equalization)
• Statistics
• Mathematics
• Computer science
• Systems science
![Page 22: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/22.jpg)
Application Examples
• Time series prediction
– Aircraft carrier landing systems
• Spatial wafer patterns
• Fault detection
• Machinery health monitoring
• Automated, objective credit rating
• Fraud detection
![Page 23: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/23.jpg)
Time Series Prediction
![Page 24: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/24.jpg)
Spatial Wafer Patterns
![Page 25: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/25.jpg)
Wafer Components
![Page 26: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/26.jpg)
Estimation (Regression) Results
![Page 27: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/27.jpg)
Fault Detection in Semiconductor Manufacturing
![Page 28: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/28.jpg)
Aircraft Carrier Landing System
• Can be very hard
– Limited visibility
– Rough seas
– Night
• Predict location at touch down
– Flight deck
– Aircraft
• Is rocking of flight deck predictable?
![Page 29: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/29.jpg)
Machinery Health Monitoring
• Cost of machinery failure can be very high
• Recent growth in real-time monitoring
– Health and Usage Monitoring Systems (HUMS)
– Condition Based Maintenance (CBM)
• Reduce costs
• Increase safety
![Page 30: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/30.jpg)
Fraud Detection
• Credit card fraud cost $864 million in 1992
• How quickly can fraud be detected?
• Credit card companies have amassed large databases
• What are the patterns of fraud?
• Active area of research
![Page 31: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/31.jpg)
Past Projects
• Many past projects
– See reports & slides on the web
• Many time series applications
– Need not be time series related
• Many have resulted in conference and journal publications
• Expect improved quality this term
![Page 32: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/32.jpg)
Project Ideas
• It is up to you to identify a project
• Preferred
– Data readily available (no new instrumentation or study design)
– Independent samples (not time series data)
– Engineering related
– High likelihood of success (no financial forecasting)
![Page 33: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/33.jpg)
Course Logistics
• Project oriented
– Project reports
– Must meet IEEE journal requirements
– May be encouraged to publish
– Brief oral slide presentation at end of term
• Most projects are applied
• May also create new methods or compare existing methods
![Page 34: Engineering Data Analysis & Modeling Practical Solutions to Practical Problems](https://reader034.vdocuments.us/reader034/viewer/2022050909/56814f60550346895dbd15c4/html5/thumbnails/34.jpg)
Prerequisites
• Helpful
– Random processes (ECE 565)
– Signal processing (ECE 566)
– Proficient at MATLAB or similar
• Required
– Calculus
– Probability & statistics (STAT 451)
– Linear algebra (MTH 343)
– Proficiency at programming