Manufacturing Intelligence for Intelligent Manufacturing
How Data Collection Shapes Manufacturing Intelligence Performance
Enterprise Manufacturing Intelligence: Working Definition
Enterprise Manufacturing Intelligence (EMI) is a term that applies to software
used to bring a corporation's manufacturing-related data together from many
sources for reporting, analysis, visual summaries, and passing data between
enterprise-level and plant-floor systems.
As data is combined from multiple sources, it can be given a new structure or
context that helps users find what they need regardless of where it came from.
The primary goal is to turn large amounts of manufacturing data into real
knowledge and to drive business results based on that knowledge.
Source: Wikipedia, others
Core Functions of EMI*
• Aggregation: Making available data from many sources, most often databases.
• Contextualization: Providing a structure, or model, for the data that will help users find what they need.
• Analysis: Enabling users to analyze data across sources and especially across production sites.
• Visualization: Providing tools to create visual summaries of the data to alert decision makers and call attention to the most important information of the moment.
• Propagation: Automating the transfer of data from the plant-floor up to enterprise-level systems or vice versa.
*AMR/Gartner
“Intelligence” is based on Analytics
• EMI is based on the (statistical) analysis of data collected from the manufacturing process.
• The most important element of successful statistical analysis is the collection of data.
• If the data collection process is flawed, simple statistical techniques will fail and sophisticated techniques can’t fix it.
• Bad Data = Bad Analytics = Bad Intelligence.
The Importance of Analytics
• Data alone, or data compared to limits that were not determined statistically, can only provide some sense of what a process is doing.
• Analytics helps provide meaning by identifying key events and relationships with a known certainty.
• The following example of applied Statistical Process Control (SPC) analysis illustrates the value of Analytics.
• SPC determines if variation in a process is unusual, detects events, and helps point to the source or cause.
This is a “Run Chart” – data is displayed in a line graph with no
analysis of the data. Are any points unusually high or low?
This is an “SPC Chart” of the same data where upper and lower limits have
been calculated to determine if any of the data shows unusual variation. This
data shows normal variation – there are no unusually high or low points.
This is another “Run Chart” – are any points on this chart
unusually high or low?
This is the same data displayed on an SPC Chart. Note that one
point has been found to be unusually high (and worth investigating).
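The slides do not say how the SPC limits were calculated. A minimal sketch, assuming an individuals (X) chart with limits set at the mean plus or minus 2.66 times the average moving range, could look like this. The function names and the readings are hypothetical and are not the data behind the charts shown.

```python
from statistics import mean


def individuals_limits(values):
    """Return (lcl, center, ucl) for an individuals (X) chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # 2.66 = 3 / d2, where d2 = 1.128 for ranges of two points.
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def unusual_points(values):
    """Return (index, value) pairs that fall outside the SPC limits."""
    lcl, _, ucl = individuals_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical readings for illustration only.
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0, 12.5, 10.1, 9.9, 10.0]
print(unusual_points(readings))  # [(8, 12.5)]
```

On a run chart the ninth reading merely looks high. The calculated limits are what say it is statistically unusual and worth investigating.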
Two key process variables – one showing normal variation and
the other indicating that something unusual is happening.
If this is a process that has been having problems, these
charts will be invaluable in determining the cause.
Combining statistical limits with specifications or process set-points
makes an “early warning” system possible – a simple predictive analytic.
[Chart: Upper Specification, Upper SPC Limit, Lower SPC Limit, Lower Specification]
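One way to read the early-warning idea is that a point outside the SPC limits but still inside the specifications signals trouble before any product is rejected. The sketch below assumes the SPC limits sit inside the specification limits. The function name and all numbers are hypothetical.

```python
def classify(value, spc_low, spc_high, spec_low, spec_high):
    """Classify one reading against SPC limits and specification limits.

    Assumes spec_low <= spc_low and spc_high <= spec_high.
    """
    if value < spec_low or value > spec_high:
        return "out of spec"
    if value < spc_low or value > spc_high:
        # Statistically unusual, but the product still meets specification.
        return "early warning"
    return "normal"


# Hypothetical limits: SPC limits 9.0 to 11.0, specifications 8.0 to 12.0.
for reading in (10.2, 11.4, 12.3):
    print(reading, classify(reading, 9.0, 11.0, 8.0, 12.0))
```

The middle reading is the interesting one: it is still sellable, but the process has already shifted.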
Consequences of Poor Data Collection Practices
• Missed Signals – Systems fail to detect problems
• False Alarms – Analytics indicate problems that aren’t there
• Unreliable KPIs
• Loss of faith in Analytics and Intelligence systems
Primary Data Sources in Manufacturing
• Manual sampling and collection
• Automated data collection systems
• Existing data
Manual Sampling and Collecting
In many industries, the majority of data is collected manually
(food, consumer products, most types of packaging, materials).
Influences:
• History - it was like this when I got here…
• Folk wisdom (not the result of study/analysis)
• Cost
• Convenience
Results:
• Overly complex methodology
• Non-random sampling
• Insufficient data
• Important data not collected
Manual Sampling and Collecting Issues
Incoming tank car containing raw material – multiple
samples taken from the same car…
If the material in the car is homogeneous (well mixed), the extra
samples are effectively identical, offer no additional information, and
will distort any statistical analysis performed. If the data is
“sub-grouped”, SPC charts will not work: the variation within each
subgroup reflects only measurement error, so the calculated limits
are far too tight.
If the material in the car is stratified but is mixed/blended
before use, the samples do not represent the material
used in the process.
The sample(s) taken must represent the material as it is
used in the process.
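The sub-grouping problem can be sketched with made-up numbers. Each car below is sampled three times. Because the replicates differ only by measurement error, X-bar limits built from the within-car range are very tight and flag ordinary car-to-car differences. The data is hypothetical. The constants A2 = 1.023 (subgroups of three) and 2.66 (individuals chart) are the standard control-chart factors.

```python
from statistics import mean

A2_N3 = 1.023  # X-bar chart factor for subgroups of size three


def xbar_flags(subgroups):
    """Flag subgroup means outside X-bar limits based on the average range."""
    means = [mean(g) for g in subgroups]
    r_bar = mean(max(g) - min(g) for g in subgroups)
    center = mean(means)
    lcl, ucl = center - A2_N3 * r_bar, center + A2_N3 * r_bar
    return [m for m in means if m < lcl or m > ucl]


def individuals_flags(values):
    """Flag values outside individuals-chart limits (mean +/- 2.66 * MR-bar)."""
    mr_bar = mean(abs(b - a) for a, b in zip(values, values[1:]))
    center = mean(values)
    lcl, ucl = center - 2.66 * mr_bar, center + 2.66 * mr_bar
    return [v for v in values if v < lcl or v > ucl]


# Hypothetical data: three replicate samples from each of five tank cars.
cars = [
    [50.1, 50.0, 50.2],
    [48.9, 49.0, 49.1],
    [51.0, 51.1, 50.9],
    [49.5, 49.6, 49.4],
    [50.6, 50.5, 50.7],
]

print(len(xbar_flags(cars)))                           # 4 of 5 cars flagged
print(len(individuals_flags([mean(c) for c in cars])))  # 0 flagged
```

Treating each car as a single value, charted as individuals, gives limits that reflect how the material actually varies from car to car.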
Manual Sampling and Collecting Issues
Sheet/roll process with samples taken of material before
roll-up. Because it is difficult to reach across the roll, it is
easier to check only the edges, which misses 30% of the product…
[Diagram: sample locations marked with x]
Manual Sampling and Collecting Issues
Product packaged in boxes with multiple compartments:
Sample 5 items from left side on every other box, sample
5 items from right side on alternating boxes every 15
minutes, sample 5 on each side every hour, sample all
items in one box each shift, unless an out-of-spec item is
found then double sampling on same side and sample 5
on other side on every box until 10 boxes have been
sampled without an out-of-spec item…uh…except on
Leap Year when we do all of this backward…
Result (among many): Data collected is too inconsistent
to be used to analyze the process – not to mention an
annoyed workforce.
Automated Data Collection
Most data in the Chemicals/Petrochemical industry is collected
by automated systems, which are common in all “Process” industries.
Sources:
• DCS
• SCADA
• Process Historians
• Can sample multiple times per second
Types of automatically collected data:
• Sensor data (process temperature, pressure, etc.)
• Analytical instrument results (chemical & physical
parameters)
• Control indicators (valve state, machine instructions, etc.)
• Process status (start up, running, shut down, fault)
• Equipment parameters (current load, temperature, speed)
Automated Data Collection
Issues:
• Enormous quantities of data
• Temptation to use all of it – hard to convince otherwise
• Overwhelms analytics systems
• Oversampling can result in invalid statistical results
• Most of the data isn’t suitable for statistical analysis
Considerations:
• Is the data used for anything?
• How is the data used (control, alarms, analysis, reports)?
• Response time required
• Process cycle
• Autocorrelation
Data sampled too frequently – the process has not had a chance to
change so the sensor is measuring the same material – the variation
is the sensor’s measurement error and SPC won’t work.
Data sampled at a frequency that allows the process to change –
the sensor is measuring different material and the variation is due to
changes in the process.
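A rough way to check whether data has been sampled too frequently is to look at its lag-1 autocorrelation. The sketch below simulates a slowly changing process as a first-order autoregressive series. That model and its parameters are assumptions for illustration only, not something taken from the slides. It then compares every reading against every 50th reading.

```python
import random
from statistics import mean


def lag1_autocorrelation(values):
    """Correlation between each value and the one that follows it."""
    m = mean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    den = sum((v - m) ** 2 for v in values)
    return num / den


def simulate_process(n, phi=0.95, seed=1):
    """Hypothetical slowly changing process: x[t] = phi * x[t-1] + noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


fast = simulate_process(20000)  # every reading the "sensor" produced
slow = fast[::50]               # one reading in fifty

print(round(lag1_autocorrelation(fast), 2))  # close to 1: heavily autocorrelated
print(round(lag1_autocorrelation(slow), 2))  # much closer to 0
```

Standard SPC limits assume successive points are roughly independent, so an autocorrelation near 1 suggests the sampling interval is shorter than the time the process needs to change.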
Hazards of Existing Data
Examples:
• Laboratory Information Management Systems (LIMS)
• Process Historians
• Quality Systems
• MES, ERP
• That database nobody is sure about
Considerations:
• Why was the data collected in the first place?
• Who benefits from the data being right (or not-so-right)?
• Was the data used for anything important - vetted?
• Were there constraints on the values?
• Can it be sampled (if there is a lot)?
• Why analyze the past anyway?
Hazards of Existing Data
Things that make historical data problematic:
• Data reduction (averaging, …)
• Data filtering (removing “outliers”)
• Improper sampling (biased)
• Changes in process not identified
• Data isn’t “real”
The problem with historical data is that you often can’t tell.
Data that has been averaged loses potentially important
information – in this case, data that exceeds a key limit:
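The effect is easy to reproduce with a few made-up numbers. In the sketch below, the limit, the readings and the block size are all hypothetical. One raw reading exceeds the limit, but neither block average does.

```python
from statistics import mean

LIMIT = 100.0  # hypothetical key limit

# Hypothetical raw readings; one of them exceeds the limit.
readings = [92.0, 94.0, 93.0, 108.0, 95.0, 91.0, 93.0, 94.0]


def block_averages(values, size):
    """Average consecutive, non-overlapping blocks of `size` readings."""
    return [mean(values[i:i + size]) for i in range(0, len(values), size)]


averaged = block_averages(readings, 4)

print([v for v in readings if v > LIMIT])  # [108.0]
print(averaged)                            # [96.75, 93.25]
print([v for v in averaged if v > LIMIT])  # []
```

Once only the averages are stored, nothing in the data shows that the limit was ever crossed.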
The Importance of Context
• Data without context has little or no meaning.
• Lack of context makes data “un-actionable”.
• The further the data gets from the process, the more important it is to preserve context.
A not unusual chart with no context – just the row
number of the data file used to create the chart:
Knowing the row number of data that shows unusual
behavior doesn’t do much good:
Adding Date/Time helps, but requires looking up other information
from multiple sources to know what is really happening:
Full context – all pertinent information brought forward to the analytics
presentation allows quick recognition of problems and fast response:
Finally, if the users can add information such as Cause and Corrective
Action and have it “stick”, the information resource becomes a
Knowledge Base:
Aggregating Data Across Systems
• Increasingly major issue for NWA’s process
customers
• Provides “total process” understanding
• Helps link product quality to process operations
• Reveals relationships between raw materials, storage,
unit operations, blending, packaging/delivery
• Most “continuous process” operations actually
combine process and batch
• Key is getting a “Batch” view of overall process
• (Some Historians have functions that can help)
Three systems together know what is going on, but
no single system has all the information:
SCADA – Precise date/time,
process unit and parameters
LIMS – Product, approximate
date/time, lab test results
MES – Product, production schedule, line, customer
Problems Aggregating Data Across Systems
• Different sampling methods – time, event, and sample-based
• Difficulty querying historized data (Historians use data compression)
• Data in different formats, databases, structures
• Lead/lag relationships
• Auto & Cross-correlation problems
• Different analysis techniques
• Data “owned” by different groups (production, engineering, lab)
[Diagram labels: Process, Event, Batch, Historian, LIMS]
Process, Event, & Batch Data
Aggregated Process, Event, & Batch Data
-- Historian query 1: cyclic retrieval over the last hour
SELECT *
FROM OpenQuery(INSQL, 'SELECT [DateTime], [Batch%Conc], [BatchNumber],
        [ReactLevel], [ReactTemp], [SetPoint]
    FROM Runtime.dbo.WideHistory
    WHERE DateTime >= DATEADD(hour, -1, GETDATE())
      AND DateTime <= GETDATE()
      AND wwRetrievalMode = "cyclic"
      AND wwResolution = 60000')

-- Historian query 2: delta retrieval joined to the event history
SELECT *
FROM OpenQuery(INSQL, 'SELECT [DateTime], [Batch%Conc], [BatchNumber],
        [ReactLevel], [ReactTemp], [SetPoint]
    FROM Runtime.dbo.WideHistory
    WHERE DateTime >= DATEADD(hour, -1, GETDATE())
      AND DateTime <= GETDATE()
      AND wwRetrievalMode = "delta"
      AND wwValueDeadband = 50 ') wide
INNER JOIN EventHistory ON wide.DateTime = EventHistory.DateTime
WHERE TagName = 'SysStatusEvent'
Database SQL Queries for Historian only – now all we
need is some SQL for the LIMS and MES and we are all
set…
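Once data has been pulled from each system, it still has to be lined up. One simple approach, sketched here as an assumption rather than as the method used in the presentation, is to attach to each lab result the most recent process reading taken at or before the approximate sample time, within a tolerance. The field names (other than a reactor temperature, which echoes the ReactTemp column in the queries) and all values are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def latest_at_or_before(times, when):
    """Index of the last entry in sorted `times` that is <= `when`, or None."""
    i = bisect_right(times, when)
    return i - 1 if i else None


def attach_process_data(lab_results, scada_rows, tolerance):
    """Add the nearest earlier reactor temperature to each lab result."""
    scada_times = [row["time"] for row in scada_rows]  # must be sorted
    combined = []
    for lab in lab_results:
        i = latest_at_or_before(scada_times, lab["time"])
        match = scada_rows[i] if i is not None else None
        if match and lab["time"] - match["time"] > tolerance:
            match = None  # too far apart to treat as the same material
        temp = match["react_temp"] if match else None
        combined.append({**lab, "react_temp": temp})
    return combined


# Hypothetical records standing in for historian and LIMS extracts.
scada = [
    {"time": datetime(2024, 1, 1, 8, 0), "react_temp": 150.0},
    {"time": datetime(2024, 1, 1, 8, 1), "react_temp": 151.0},
    {"time": datetime(2024, 1, 1, 8, 2), "react_temp": 152.5},
]
lab = [
    {"time": datetime(2024, 1, 1, 8, 1, 30), "product": "A", "conc": 41.2},
    {"time": datetime(2024, 1, 1, 9, 30), "product": "A", "conc": 40.8},
]

result = attach_process_data(lab, scada, timedelta(minutes=5))
for row in result:
    print(row["time"], row["conc"], row["react_temp"])
```

The second lab result gets no temperature because no process reading falls within the tolerance. A real implementation would also need to handle lead/lag between the process unit and the sample point, which this sketch ignores.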
Conclusions:
• Data collection techniques should focus on data that
represents the process or material.
• The ultimate use of the data should guide how it is
collected.
• Balance the cost of data collection with the value of the
collected data.
• Be aware of the pitfalls of using historical data.
• Avoid the temptation to use “all” of the data that is
available.
• Include as much context as possible as early in the
data collection process as possible.
Questions