dedicated to embedded solutions
RELIABILITY IN SUBSEA ELECTRONICS: TECHNIQUES TO OBTAIN HIGH RELIABILITY
STIG-HELGE LARSEN, KARSTEN KLEPPE, DATA RESPONS, 2012-10-16
AGENDA
Introduction
Analysis and Design Techniques
Reliability Predictions
FRACAS and Data Processing Techniques
Production and Repair
Testing
Reliability Program Planning
THIS IS DATA RESPONS
We are a full-service, independent technology company and a leading player in the embedded solutions market.
ESTABLISHED: 1986
Listed on the Oslo Stock Exchange (Ticker: DAT)
CERTIFICATIONS: ISO 9001:2008, ISO 14001:2004, OHSAS 18001:2007
EMPLOYEES: 465
CUSTOMISATION
EXTREME CONDITIONS
Humidity
Altitude
Temperature
Vibration
Salt spray
Shock
EMC
CUSTOM SPECIFICATION
Physical size
Interfaces
Functionality
Performance
Power demands
Regulations
Standards
CHOICE OF TECHNOLOGY
Operating systems
Software architecture
Hardware platform
Processor architecture
Memory and storage
Communication & I/O
Display and touch
EXAMPLE: CURRENT SENSOR BOARD
Measuring range: 0.2–1.2 A AC
Accuracy: Better than ± 1.0 %
CAN bus interface
4-20 mA outputs
Qualified according to ISO 13628-6 for Subsea Production Control Systems
Based on Hall effect current sensor
RELIABILITY IN SUBSEA ELECTRONICS
INTRODUCTION
Reliability in Data Respons
Reliability study
IEC 61508
QA system
Reliability
The ability of an item to perform a required function under stated conditions for a specified period of time
Availability
The proportion of time for which the equipment is able to perform its function
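The two definitions above can be turned into numbers with a minimal sketch. It assumes a constant failure rate (exponential model) for reliability and the common steady-state formula MTBF / (MTBF + MTTR) for availability. Neither model is stated on the slide, and the MTBF and MTTR figures are hypothetical.

```python
import math

def reliability(t_hours: float, mtbf_hours: float) -> float:
    """Probability of performing the required function for t_hours.

    Assumes a constant failure rate (exponential model).
    """
    return math.exp(-t_hours / mtbf_hours)

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state proportion of time the equipment can perform its function."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical figures, for illustration only.
mission_hours = 5 * 8760          # five years of continuous operation
print(f"R(5 years)   = {reliability(mission_hours, 500_000):.3f}")
print(f"Availability = {availability(500_000, 720):.5f}")
```

For subsea equipment the repair time (MTTR) is dominated by the intervention needed to reach the unit, which is why the reliability figure matters more than in easily accessible systems.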
SUBSEA
Characteristics
Relatively low volumes
Need for high reliability
Low accessibility
High cost in case of replacement
8
KEY POINTS
Techniques to obtain high reliability in electronics
Topic Areas:
Relevant Themes:
Key Points: Design Techniques and Analysis
Root Causes of Failures
Failure Reporting and Corrective Actions System
Automated Testing
Accelerated Stress Testing
Reliability Program Plan
ANALYSIS AND DESIGN TECHNIQUES
Start with evaluation of the relationships between different parts of the system
Evaluate different design alternatives
Follow design guidelines
ANALYSIS AND DESIGN TECHNIQUES
Use design checklists
Arrange design reviews
Perform stress analysis and derating of components
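A stress analysis with derating can be sketched as a comparison of the applied stress with the rated value, checked against a derating limit. The part names, ratings and limits below are hypothetical; real limits come from the derating guideline chosen for the project.

```python
from dataclasses import dataclass

@dataclass
class StressCheck:
    part: str
    parameter: str
    applied: float    # worst-case stress seen in the circuit
    rated: float      # manufacturer's maximum rating
    max_ratio: float  # derating limit, e.g. 0.6 = use at most 60 % of rating

    @property
    def ratio(self) -> float:
        """Stress ratio: applied stress divided by rated stress."""
        return self.applied / self.rated

    @property
    def passes(self) -> bool:
        return self.ratio <= self.max_ratio

# Hypothetical parts and limits, for illustration only.
checks = [
    StressCheck("C12", "voltage (V)", applied=12.0, rated=25.0, max_ratio=0.6),
    StressCheck("R7", "power (W)", applied=0.20, rated=0.25, max_ratio=0.5),
]

for c in checks:
    status = "OK" if c.passes else "EXCEEDS DERATING LIMIT"
    print(f"{c.part} {c.parameter}: ratio {c.ratio:.2f} (limit {c.max_ratio:.2f}) {status}")
```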
ANALYSIS AND DESIGN TECHNIQUES
Failure Mode, Effects and Criticality Analysis (FMECA)
identifies potential failure modes
lists the effects of failures
basis for eliminating mission-critical, single-point failures
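As an illustration of the criticality part of FMECA, the sketch below uses the failure mode criticality number from MIL-STD-1629A, Cm = beta * alpha * lambda_p * t. The slides do not say which convention is used, so this choice is an assumption, and the component data are hypothetical.

```python
def mode_criticality(beta: float, alpha: float, lambda_p: float, t_hours: float) -> float:
    """Failure mode criticality number Cm (MIL-STD-1629A convention).

    beta     -- conditional probability that the mode causes the stated effect
    alpha    -- fraction of the part failure rate attributed to this mode
    lambda_p -- part failure rate (failures per hour)
    t_hours  -- operating time considered
    """
    return beta * alpha * lambda_p * t_hours

# Hypothetical failure modes of one component (failure rate 2e-6 per hour).
modes = [
    {"mode": "open circuit", "alpha": 0.7, "beta": 1.0},
    {"mode": "short circuit", "alpha": 0.3, "beta": 0.5},
]

for m in modes:
    m["cm"] = mode_criticality(m["beta"], m["alpha"], 2e-6, 10_000)
    print(f'{m["mode"]}: Cm = {m["cm"]:.4f}')

# Item criticality: sum over the failure modes in the same severity class.
item_criticality = sum(m["cm"] for m in modes)
print(f"Item criticality = {item_criticality:.4f}")
```

Ranking failure modes by Cm (together with severity) points to the single-point failures that most need design changes.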
[Figure: FMECA block diagram. Labels: Hardware Design, Component Data[Base], FMECA, Failure Modes, Failure Effects, Failure Rate & Criticality Numbers]
ANALYSIS AND DESIGN TECHNIQUES
Failure Mode, Effects and Diagnostic Analysis (FMEDA)
includes diagnostic coverage
(the ability of any automatic diagnostics to detect failures)
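Diagnostic coverage can be expressed as a ratio of failure rates. The sketch below follows the definitions used in IEC 61508 (dangerous detected and dangerous undetected failure rates, and the related safe failure fraction); the failure rates are hypothetical.

```python
def diagnostic_coverage(lambda_dd: float, lambda_du: float) -> float:
    """Fraction of dangerous failures detected by automatic diagnostics."""
    return lambda_dd / (lambda_dd + lambda_du)

def safe_failure_fraction(lambda_s: float, lambda_dd: float, lambda_du: float) -> float:
    """Fraction of all failures that are either safe or dangerous-but-detected."""
    total = lambda_s + lambda_dd + lambda_du
    return (lambda_s + lambda_dd) / total

# Hypothetical failure rates in FIT (failures per 1e9 hours).
lambda_s, lambda_dd, lambda_du = 100.0, 90.0, 10.0
print(f"DC  = {diagnostic_coverage(lambda_dd, lambda_du):.0%}")
print(f"SFF = {safe_failure_fraction(lambda_s, lambda_dd, lambda_du):.0%}")
```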
[Figure: FMEDA block diagram. Labels: Hardware Design, Component Data[Base], FMEDA, Failure Modes, Failure Effects, Failure Rate & Criticality Numbers, Diagnostic Coverage]
FMECA - EXAMPLE OF DA FORM 7611
FMECA - EXAMPLE OF DA FORM 7612
ANALYSIS AND DESIGN TECHNIQUES
Redundancy
duplicating critical parts
usually in the case of a backup or fail-safe
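The benefit of duplicating a critical part can be estimated with a minimal sketch. It assumes independent failures and perfect switching to the backup, which real designs only approximate (common-cause failures reduce the gain). The unit reliability is hypothetical.

```python
def parallel_reliability(unit_reliabilities):
    """Reliability of redundant units where one working unit is enough.

    Assumes the units fail independently of each other.
    """
    p_all_fail = 1.0
    for r in unit_reliabilities:
        p_all_fail *= (1.0 - r)
    return 1.0 - p_all_fail

single = 0.90  # hypothetical reliability of one unit over the mission
print(f"Single unit:     {single:.4f}")
print(f"Duplicated unit: {parallel_reliability([single, single]):.4f}")
```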
ANALYSIS AND DESIGN TECHNIQUES
Software Development Plan
Describing software development methodology and techniques
including reviews, coding standard, and testing.
Key aspect of the software reliability program.
The software reliability depends on the number of software faults.
Testing is very important for software:
every individual unit
integration
full system
ANALYSIS AND DESIGN TECHNIQUES
Design for Test (DFT)
make it easier to implement low level manufacturing tests
Built-In Test (BIT)
to achieve high reliability for a lower cost
Automatic Reset Features
restart after critical events,
loss of communications, or
improper software operation.
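An automatic reset feature is typically built around a watchdog timer. The sketch below simulates the idea in software with injected timestamps; the class name, timeout and reset action are hypothetical, and a real design would normally use a hardware watchdog that is independent of the supervised software.

```python
class Watchdog:
    """Triggers a reset action if it is not kicked within timeout_s seconds."""

    def __init__(self, timeout_s: float, reset_action):
        self.timeout_s = timeout_s
        self.reset_action = reset_action
        self._last_kick = 0.0

    def kick(self, now_s: float) -> None:
        """Called by healthy software to show that it is still running."""
        self._last_kick = now_s

    def poll(self, now_s: float) -> bool:
        """Check the timer; perform the reset and return True if it expired."""
        if now_s - self._last_kick > self.timeout_s:
            self.reset_action()
            self._last_kick = now_s  # restart the timing after the reset
            return True
        return False

# Simulated timeline: the software stops kicking after t = 1.0 s.
events = []
wd = Watchdog(timeout_s=2.0, reset_action=lambda: events.append("reset"))
wd.kick(0.0)
wd.kick(1.0)
for t in (2.0, 3.0, 4.0):
    if wd.poll(t):
        print(f"t = {t:.1f} s: watchdog expired, system restarted")
```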
Typical Board with Boundary-Scan Components
Source: Corelis
ANALYSIS AND DESIGN TECHNIQUES
Thermal Analysis
good working temperature for every chip
to achieve the required design for reliability and performance
Electromagnetic Analysis
good electromagnetic compatibility (EMC) design
for correct operation of different equipment in the same electromagnetic environment
ANALYSIS AND DESIGN TECHNIQUES
Accelerated Testing
using high stresses to get failures quickly
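For temperature-driven failure mechanisms, the time compression gained from a higher stress is often estimated with the Arrhenius model. The slide does not name a model, so this is an assumption, and the activation energy and temperatures are hypothetical; the model applies only to thermally activated mechanisms.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration(t_use_c: float, t_stress_c: float, ea_ev: float) -> float:
    """Acceleration factor between use and stress temperature (Arrhenius model)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))

# Hypothetical example: 55 degC in use, 125 degC in test, activation energy 0.7 eV.
af = arrhenius_acceleration(55.0, 125.0, 0.7)
print(f"Acceleration factor: {af:.1f}")
print(f"1000 test hours correspond to about {1000 * af:.0f} hours in use")
```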
ANALYSIS AND DESIGN TECHNIQUES
Root Cause Analysis (RCA)
to correct or eliminate root causes
a tool of continuous improvement
Reliability Growth Analysis
collecting, modeling, analyzing and interpreting data
track the improvement achieved in the reliability of a product
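Reliability growth data can be modeled in several ways. The sketch below fits the Crow-AMSAA (power-law) model with its maximum likelihood estimates for a time-terminated test. The choice of model is an assumption, since the slide does not name one, and the failure times are hypothetical.

```python
import math

def crow_amsaa_fit(failure_times, total_time):
    """Maximum likelihood fit of the Crow-AMSAA model, time-terminated test.

    Expected cumulative failures: N(t) = lam * t ** beta.
    beta < 1 means failures arrive more slowly over time (reliability growth).
    """
    n = len(failure_times)
    beta = n / sum(math.log(total_time / t) for t in failure_times)
    lam = n / total_time ** beta
    return lam, beta

def instantaneous_mtbf(lam, beta, t):
    """MTBF at time t, the inverse of the failure intensity lam*beta*t**(beta-1)."""
    return 1.0 / (lam * beta * t ** (beta - 1.0))

# Hypothetical cumulative test hours at which failures were observed.
failures = [25.0, 90.0, 210.0, 400.0, 700.0]
lam, beta = crow_amsaa_fit(failures, total_time=1000.0)
print(f"beta = {beta:.2f}")
print(f"Cumulative MTBF at 1000 h:    {1000.0 / len(failures):.0f} h")
print(f"Instantaneous MTBF at 1000 h: {instantaneous_mtbf(lam, beta, 1000.0):.0f} h")
```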
RELIABILITY PREDICTIONS
A quick reliability analysis for the designed system is needed
MTBF is often used as a measure for reliability
Restricted to operation under stated conditions
Important to use a relevant prediction calculation procedure
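A quick prediction can be sketched as a sum of component failure rates for a series system with constant failure rates. This is a simplification: it omits the quality, environment and stress factors that handbook procedures such as MIL-HDBK-217 apply, and the parts list and FIT values are hypothetical.

```python
def system_fit(parts):
    """Total failure rate in FIT (failures per 1e9 hours) for a series system.

    parts is a list of (name, quantity, fit_per_part) tuples.
    """
    return sum(quantity * fit for _, quantity, fit in parts)

def mtbf_hours(total_fit: float) -> float:
    """MTBF in hours for a constant total failure rate given in FIT."""
    return 1e9 / total_fit

# Hypothetical bill of materials with hypothetical failure rates.
bom = [
    ("microcontroller", 1, 50.0),
    ("ceramic capacitor", 40, 0.5),
    ("resistor", 60, 0.2),
    ("connector", 2, 10.0),
]

total = system_fit(bom)
print(f"Total failure rate: {total:.0f} FIT")
print(f"Predicted MTBF:     {mtbf_hours(total):,.0f} hours")
```

The result is valid only for the conditions assumed in the component failure rates, which is why the prediction procedure and its assumptions need to be stated with the figure.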
RELIABILITY PREDICTIONS
Abstract from reliability analysis checklist in MIL-HDBK-217
RELIABILITY PREDICTIONS
Factors that affect the MTBF figures from vendors
Prediction methods
Predefined conditions
Quality level of components
The source and assumptions for the base failure rate of each component type
The vendors’ assumptions need to be understood.
MTBF – an indicator of reliability
RELIABILITY PREDICTIONS
What is the use of reliability predictions?
assessment of whether reliability goals (e.g. MTBF) can be reached
identification of potential design weaknesses
evaluation of alternative designs and life-cycle costs
the provision of data for system reliability and availability analysis
FRACAS & DATA PROCESSING TECHNIQUES
FRACAS
FRACAS: Failure Reporting And Corrective Action System
Pareto chart: To highlight the most important among a (typically large) set of factors.
The most frequent fault causes will vary from item to item.
“No fault found” and “Root cause unknown” will often account for a large part of all cases.
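The data behind a Pareto chart is a frequency count sorted in descending order with a cumulative percentage. The sketch below builds that table from failure reports; the cause labels are hypothetical and do not come from real FRACAS data.

```python
from collections import Counter

def pareto_table(causes):
    """Return (cause, count, cumulative_percent) rows, most frequent first."""
    counts = Counter(causes).most_common()
    total = sum(count for _, count in counts)
    rows, cumulative = [], 0
    for cause, count in counts:
        cumulative += count
        rows.append((cause, count, 100.0 * cumulative / total))
    return rows

# Hypothetical root causes recorded in failure reports.
reports = [
    "no fault found", "power circuit", "no fault found", "connector",
    "power circuit", "no fault found", "firmware", "power circuit",
    "root cause unknown", "no fault found",
]

for cause, count, cum_pct in pareto_table(reports):
    print(f"{cause:<20} {count:>3} {cum_pct:6.1f} %")
```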
DATA ANALYSIS: PARETO CHART
DATA ANALYSIS: NO FAULT FOUND
Some possible reasons for no fault found (NFF):
a rare failure that is hard to reproduce (e.g. failure under special conditions)
the failure comes and goes (e.g. a loose connection)
there has never been a fault on the item
DATA ANALYSIS: INTERMITTENT FAILURES
Intermittent Failures:
The system performs incorrectly only under certain conditions, but not others.
A unit with an intermittent fault can cause the same system failure again if reinstalled, and can therefore generate high costs.
DATA ANALYSIS: PARETO CHART
Example, summarized. The following categories in particular need attention:
1. Power circuit
2. PCB production / assembly
3. Input/output circuit
4. Firmware
5. Connectors or internal cables
Also often relevant for some items:
6. Secondary storage / external memory (disk)
7. Mechanical damage
8. Batteries
9. Software
10. CPU module
11. Others, for instance: short circuit, internal memory (RAM) fault, defective fan, errors in procedure, design fault
PRODUCTION AND REPAIR
Some relevant topics:
Errors during production tests and field errors will correlate
Follow-up of suppliers
Production batch volume for electronics
Saving test data so that analysis is easy
ISO 20815 standard – Production assurance and reliability management
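One way to keep test data easy to analyze is to store every result with its serial number and production batch in a queryable form. The sketch below uses an in-memory SQLite database from the Python standard library; the table layout, test name and values are hypothetical.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the result store. The schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results ("
        "serial TEXT, batch TEXT, test_name TEXT, value REAL, passed INTEGER)"
    )
    return conn

def save_result(conn, serial, batch, test_name, value, passed):
    """Store one measurement from a production test."""
    conn.execute(
        "INSERT INTO test_results VALUES (?, ?, ?, ?, ?)",
        (serial, batch, test_name, value, int(passed)),
    )
    conn.commit()

def failures_per_batch(conn):
    """Count failed tests per production batch."""
    rows = conn.execute(
        "SELECT batch, COUNT(*) FROM test_results "
        "WHERE passed = 0 GROUP BY batch ORDER BY batch"
    )
    return dict(rows.fetchall())

store = open_store()
save_result(store, "SN001", "B1", "supply_5v", 5.01, True)
save_result(store, "SN002", "B1", "supply_5v", 4.20, False)
save_result(store, "SN003", "B2", "supply_5v", 5.02, True)
print(failures_per_batch(store))
```

Because each record carries the serial number, production test results can later be compared with field failure reports for the same unit.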
PRODUCTION AND REPAIR
IPC-A-610 - Acceptability of Electronic Assemblies
IPC J-STD-001 - Requirements for Soldered Electrical and Electronic Assemblies
IPC product classes:
CLASS 1 - General Electronic Products
CLASS 2 – Dedicated Service Electronic Products
CLASS 3 – High Performance Electronics Products
PRODUCTION AND REPAIR
Rework
implies a risk to reliability, so there should be requirements on the maximum allowed rework
should be substantiated and documented for each serial number
IPC-7711/7721 is the IPC standard for rework, modification and repair
HANDLING ELECTRONIC ASSEMBLIES
Electrostatic discharge (ESD) can occur with no visible signs of damage.
HANDLING ELECTRONIC ASSEMBLIES
Two simple principles of electrostatic safe handling are:
1. Only handle sensitive components in an ESD Protected Area (EPA).
2. Protect sensitive devices outside the EPA using ESD protective packaging
TESTING
AUTOMATED TESTING
Why automated testing?
human errors can be minimized
more thorough testing
enable monitoring of variations in test results
do several tests very quickly and find potential points of failure
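Monitoring variations in test results can be sketched as two checks: the specification limits decide pass or fail, and control limits derived from earlier units flag results that drift while still passing. The three-sigma rule and all values below are assumptions for illustration.

```python
import statistics

def control_limits(history):
    """Mean +/- 3 standard deviations of earlier measurements."""
    mean = statistics.mean(history)
    sigma = statistics.stdev(history)
    return mean - 3 * sigma, mean + 3 * sigma

def evaluate(value, spec_low, spec_high, history):
    """Classify one automated measurement against spec and control limits."""
    if not spec_low <= value <= spec_high:
        return "FAIL"
    low, high = control_limits(history)
    if not low <= value <= high:
        return "PASS (drift warning)"
    return "PASS"

# Hypothetical 5 V supply measurements from previously tested units.
history = [5.00, 5.01, 4.99, 5.00, 5.02, 4.98]
for measured in (5.01, 5.10, 5.30):
    print(f"{measured:.2f} V: {evaluate(measured, 4.75, 5.25, history)}")
```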
AUTOMATED TESTING
Automatic Optical Inspection (AOI)
takes time to set up correctly
Automated X-Ray Inspection (AXI)
in many ways similar to AOI except that it can look through IC packages
Example from Axiomtek
AUTOMATED TESTING
In-Circuit Test (ICT)
often limited when test probes cannot access contact points on the board
Manufacturing Defect Analyzer (MDA)
does not check the operation of ICs
ICT example from RNS International
AUTOMATED TESTING
JTAG Boundary Scan
widely used
allows much of a board to be tested with only minimal physical access
standardized as IEEE 1149.1
boundary scan integrated circuits (ICs) connected serially on a board
Typical Board with Boundary-Scan Components
Source: Corelis
AUTOMATED TESTING
Functional Automatic Test System
use equipment for testing the function of a circuit
Example of a software-defined test system from National Instruments
AUTOMATED TESTING
Built-In Test (BIT)
good accessibility to the hardware
often less-expensive tests
Loop back test
connecting transmitter and receiver on the same board
Some form of external tests will usually be required in addition to self-diagnostics
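A loop back test can be sketched as sending a known pattern and comparing what is received. The port class below is a hypothetical in-memory stand-in for an interface whose transmitter is wired to its own receiver; on real hardware it would be replaced by the actual driver.

```python
class LoopbackPort:
    """Hypothetical stand-in for a port with TX connected to RX."""

    def __init__(self, corrupt: bool = False):
        self._buffer = bytearray()
        self._corrupt = corrupt  # simulate a faulty transmitter or receiver

    def write(self, data: bytes) -> None:
        if self._corrupt:
            data = bytes(b ^ 0x01 for b in data)  # flip one bit in every byte
        self._buffer.extend(data)

    def read(self, n: int) -> bytes:
        out = bytes(self._buffer[:n])
        del self._buffer[:n]
        return out

def loopback_test(port, pattern: bytes = b"\x55\xAA\x00\xFF") -> bool:
    """Send a test pattern and check that the same bytes come back."""
    port.write(pattern)
    return port.read(len(pattern)) == pattern

print("Healthy port:", "PASS" if loopback_test(LoopbackPort()) else "FAIL")
print("Faulty port: ", "PASS" if loopback_test(LoopbackPort(corrupt=True)) else "FAIL")
```

A loop back on the same board does not exercise the external connector or cable, which is one reason external tests are usually still needed.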
AUTOMATED TESTING
For testing of external interfaces using a standard protocol, a software tool can be purchased for testing and data logging
By analyzing data from testing, production areas that need attention and improvement can be pinpointed.
STRESS TESTING - ISO 13628 PART 6
ISO 13628 part 6 for subsea production control systems:
Qualification and EMC (electromagnetic compatibility):
Shock
Vibration
Temperature
EMC tests
ESS (Environmental Stress Screening) during production:
Random vibration
Thermal cycling
Burn-in
Final functional test
BATH TUB CURVE
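The three regions of the bathtub curve are often approximated with Weibull hazard functions: a shape parameter below 1 gives a falling failure rate (early failures), a value of 1 gives a constant rate (useful life), and a value above 1 gives a rising rate (wear-out). The Weibull choice and the parameter values below are assumptions for illustration.

```python
def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Instantaneous failure rate h(t) of a Weibull distribution, for t > 0.

    beta -- shape parameter
    eta  -- scale parameter (same time unit as t)
    """
    return (beta / eta) * (t / eta) ** (beta - 1.0)

# Hypothetical parameters for the three regions of the curve.
regions = [("early failures", 0.5), ("useful life", 1.0), ("wear-out", 3.0)]
for name, beta in regions:
    rates = [weibull_hazard(t, beta, eta=1000.0) for t in (10.0, 100.0, 1000.0)]
    print(f"{name:<15}", ", ".join(f"{r:.2e}" for r in rates))
```

Burn-in and stress screening during production aim to remove the units that would otherwise fail in the early, falling part of the curve.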
HALT - HIGHLY ACCELERATED LIFE TESTING
Source: Turin Networks
HALT
to provoke failures commonly seen after long-term use within a relatively short period of time
take corrective measures – either changes to the design or changes in the production process
HALT - HIGHLY ACCELERATED LIFE TESTING
Typical tests are:
Cold Step Test
Hot Step Test
Rapid Temperature Cycling Test (e.g. 60°C/minute ramp-rate)
Stepped Vibration (random) Test
Combined Environment Stress
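The cold and hot step tests follow the same pattern: the stress is changed in fixed steps until the unit stops working, which reveals the operating limit. The sketch below simulates that loop; the step sizes, chamber limits and the survival function are hypothetical stand-ins for a real chamber and a functional test of the unit.

```python
def step_stress(start, step, limit, unit_survives):
    """Step the stress level from start towards limit.

    Returns (last level survived, level at first failure). The second value
    is None if the unit survives all levels up to the limit.
    """
    level, last_ok = start, None
    while (step > 0 and level <= limit) or (step < 0 and level >= limit):
        if not unit_survives(level):
            return last_ok, level
        last_ok = level
        level += step
    return last_ok, None

# Hypothetical unit that works between -55 and +110 degC.
def works_at(temp_c):
    return -55 <= temp_c <= 110

hot = step_stress(start=30, step=10, limit=150, unit_survives=works_at)
cold = step_stress(start=10, step=-10, limit=-100, unit_survives=works_at)
print(f"Hot step test:  survived {hot[0]} degC, failed at {hot[1]} degC")
print(f"Cold step test: survived {cold[0]} degC, failed at {cold[1]} degC")
```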
HASS - HIGHLY ACCELERATED STRESS SCREENING
HASS
production equivalent of HALT
to find defects induced by the manufacturing/production process
Source: Turin Networks
Common screen varieties
RELIABILITY PROGRAM PLAN
Reliability Program Plan
include required activities, methods, analyses, tools, and test strategies for the system
important to reach the required reliability
WWW.DATARESPONS.COM