Page 1: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

SCL Sep-04

The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Susan C. Lee

The Johns Hopkins University Applied Physics Laboratory

Page 2: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Disclaimer

The NEAR Mission ended in February 2001 and some documentation has dissipated. Some of this presentation relies on memory, but is basically accurate.

The Lessons Learned represent my own opinions, not necessarily those of the JHUAPL Space Department, where I have not worked since January 1998.

Page 3: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Overview

• NEAR Overview

• Anomaly Description

• Investigation Findings

• Lessons Learned

Page 4: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Overview

• NEAR Overview

• Anomaly Description

• Investigation Findings

• Lessons Learned

Page 5: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Mission Description

• Three-year cruise to the near-Earth asteroid Eros

– Up to 12-day solar transit

– Round-trip light times up to 40 min

– Numerous small Trajectory Correction Maneuvers (TCMs)

– 2 large TCMs using the bi-propellant large velocity adjust thruster

- Deep Space Maneuver

- Eros Rendezvous Burn

– No critical time windows for TCMs

• One-year science mission orbiting Eros

– TCMs planned for once a week

– Frequent momentum dumps

Page 6: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Spacecraft Description

• Mechanically simple

– Fixed solar panels (after one-time deployment)

– Fixed antennas

- High Gain (1º BW)

- Fan Beam (40º x 8º BW)

- Dual hemispherical low gain

• Electrically simple

– Direct energy transfer power system

– 1553 bus/discrete line communications

• Computationally complex

– 3-axis active guidance and control using thrusters and momentum wheels

– Careful power management

Page 7: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Spacecraft Block Diagram

Page 8: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Safing Design

• Goal of safing: keep the S/C viable and make ground contact

1. Keep the solar panels pointed at the Sun and the load below the solar panel output

2. Point the fan beam antenna at the Earth and swap redundant RF systems

• Joint function of the C&T Processor and G&C system

• Coordinated via housekeeping telemetry and discrete lines

Page 9: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Spacecraft Mode Descriptions

• Operational

– Under ground command (either real time or uploaded command sequences)

– Solar panel normal near Sun line

• Earth Safe

– Solar panel normal on Sun line

– Earth in fan beam antenna

– Downlink at 10 bps

• Sun Safe - rotate

– Solar panel normal on Sun line

– Rotate about Sun line at 1 rev/3 hours

– Beacon on fan beam downlink

• Sun Safe - freeze

– Same as Earth Safe

Page 10: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Safing Implementation: Command and Telemetry Processor

• Simple, rule-based autonomy system

– Rules checked flags, relay states, housekeeping telemetry, discrete lines

– Triggered rules point to associated command sequence

– No loops or jumps

– Priority dictated by order in the list of rules

• Single Software Mode

– Check for commands at the uplink interface/Execute

– Check autonomy commands needed/Execute

– Check for commands in uploaded command sequence/Execute

– Place telemetry on the downlink according to commanded format and rate

• S/C Safing Modes implemented by executing autonomy commands to set desired power state, RF state, formats, etc.
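
The rule-list scheme on this slide can be sketched roughly as follows. This is a minimal illustration, not the NEAR flight software: the rule names, telemetry fields, thresholds and command strings are all hypothetical. The only properties taken from the slide are that rules are checked in list order (which sets priority), that a triggered rule points to a fixed command sequence, and that sequences contain no loops or jumps.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Telemetry = Dict[str, float]


@dataclass
class Rule:
    name: str                               # hypothetical rule name
    condition: Callable[[Telemetry], bool]  # check on flags / telemetry
    commands: List[str]                     # fixed, straight-line command sequence
    enabled: bool = True


def run_autonomy(rules: List[Rule], telemetry: Telemetry) -> List[str]:
    """Check rules in list order and return the commands to execute.

    List order is the priority: earlier rules have their commands
    issued first. Each sequence is emitted in full, with no branching.
    """
    issued: List[str] = []
    for rule in rules:
        if rule.enabled and rule.condition(telemetry):
            issued.extend(rule.commands)
    return issued


# Hypothetical rule list; thresholds are illustrative only.
rules = [
    Rule("low_bus_voltage", lambda t: t["bus_v"] < 26.0,
         ["SHED_LOADS", "ENTER_SUN_SAFE"]),
    Rule("aiu_heartbeat_lost", lambda t: t["sec_since_aiu_heartbeat"] > 5.0,
         ["SWITCH_AIU"]),
]

commands = run_autonomy(rules, {"bus_v": 25.0, "sec_since_aiu_heartbeat": 6.0})
print(commands)
```

Keeping each command sequence as plain data, with no control flow of its own, is what makes a list like this reviewable line by line.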

Page 11: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Safing Implementation: Guidance and Control (G&C) Processors

• Many, many possible combinations of functions performed on Flight Computer (FC) and Attitude Interface Unit (AIU) using combinations of sensors, actuators and guidance algorithms

– Implemented as in-line code triggered by IF-THEN-ELSE sequences

– Multiple flags, timers and parameters

• NOMINALLY, FC controls the S/C, using the AIU simply as an interface unit

– Guidance algorithms based on stored orbit

– Attitude based on Star Camera and IMU input

– Nominal wheel control; thruster control during TCMs only

• NOMINALLY, AIU checks for FC problems (e.g., Sun-pointing keep-in violated)

– Can ask C&T Processor to switch to backup FC

– Can take over S/C control for Safe Modes

Page 12: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Thruster Use Safing - Clementine Prevention

• Thruster hardware enable/disable

• ‘Open’ commands must be sent every 40 ms, else thruster valves close automatically

• Separation of thruster use function between C&T Processor and G&C

– C&T Processor controls enable/disable

- Faulty CTP can only enable thrusters, not open valves

– G&C (AIU) controls thruster valve open/close

- Faulty AIU can try to open valves, but won’t succeed unless CTP also enables thruster hardware
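
The two-party interlock described here can be pictured with a toy model. This is a sketch under stated assumptions: the class and method names are hypothetical, and the only figures taken from the slide are the 40 ms refresh requirement and the split of enable (CTP) from open/close (AIU).

```python
OPEN_TIMEOUT_S = 0.040  # 'open' must be refreshed every 40 ms (from the slide)


class ThrusterInterlock:
    """Toy model: a valve is open only with CTP enable AND a fresh AIU 'open'."""

    def __init__(self) -> None:
        self.enabled = False        # set only by the C&T Processor (CTP)
        self.last_open_cmd = None   # time of the last AIU 'open' command, seconds

    def ctp_set_enable(self, enabled: bool) -> None:
        self.enabled = enabled

    def aiu_open(self, now: float) -> None:
        self.last_open_cmd = now

    def valve_open(self, now: float) -> bool:
        if not self.enabled or self.last_open_cmd is None:
            return False
        # Without a refresh inside the timeout, the valve closes by itself.
        return (now - self.last_open_cmd) <= OPEN_TIMEOUT_S


lock = ThrusterInterlock()
lock.aiu_open(now=0.000)
print(lock.valve_open(now=0.010))   # CTP has not enabled the hardware
lock.ctp_set_enable(True)
print(lock.valve_open(now=0.010))   # enabled and 'open' is fresh
print(lock.valve_open(now=0.050))   # 'open' not refreshed within 40 ms
```

In this model neither processor can fire a thruster alone, which is the point of the separation on the slide.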

Page 13: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Thruster Use Safing - Trajectory Correction Maneuvers

• TCM thruster use under ground control only

• Parameters loaded on G&C in advance and verified prior to burn

• Timetagged commands on C&T Processor initiate and terminate burn

Page 14: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Burn Abort Conditions

• NEAR burn philosophy: Better safe than sorry

– No burns with critical timing

– Better to shut down, correct problem if any, and try again

• G&C Burn Shutdown Criteria (partial list)

– Attitude Keep-in violated

– Acceleration Keep-in violated

– Anomalous pressure reading on fuel or oxygen tanks

– C&T Processor signals a Safe Mode

• C&T Processor Burn Shutdown Criteria

– Loss of AIU Heartbeat for 5 seconds

– Occurrence of any condition normally causing a Safe Mode entry

– G&C signals a Safe Mode
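
As a rough sketch of how the G&C shutdown criteria listed above might be evaluated, the function below checks one telemetry sample against a set of limits. All names and numeric limits are hypothetical; the four criteria are the ones in the partial list. The usage lines show how a limit set too tight trips a shutdown on an otherwise healthy sample.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BurnLimits:
    # All numeric limits used below are hypothetical, for illustration only.
    max_attitude_error_deg: float
    max_accel_m_s2: float


def gc_abort_reasons(attitude_error_deg: float,
                     accel_m_s2: float,
                     tank_pressure_ok: bool,
                     ctp_signals_safe_mode: bool,
                     limits: BurnLimits) -> List[str]:
    """Return the shutdown criteria (from the partial list) that are tripped."""
    reasons: List[str] = []
    if attitude_error_deg > limits.max_attitude_error_deg:
        reasons.append("attitude keep-in violated")
    if accel_m_s2 > limits.max_accel_m_s2:
        reasons.append("acceleration keep-in violated")
    if not tank_pressure_ok:
        reasons.append("anomalous tank pressure")
    if ctp_signals_safe_mode:
        reasons.append("C&T Processor signals Safe Mode")
    return reasons


# One healthy burn sample checked against two different acceleration limits.
sample = dict(attitude_error_deg=1.0, accel_m_s2=0.5,
              tank_pressure_ok=True, ctp_signals_safe_mode=False)

ok = gc_abort_reasons(limits=BurnLimits(5.0, 1.0), **sample)
tripped = gc_abort_reasons(limits=BurnLimits(5.0, 0.3), **sample)
print(ok)       # nothing tripped with the looser limit
print(tripped)  # the acceleration keep-in trips with the tighter limit
```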

Page 15: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Thruster Use Safing - Autonomous Momentum Dumps

• Autonomous use of the thrusters for emergency momentum dump only

Page 16: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Overview

• NEAR Overview

• Anomaly Description

• Investigation Findings

• Lessons Learned

Page 17: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Preparation - Burn Scripts

• Work began in November 1998, with DSM1 scripts as model

• Significant deviation from nominal safing practice

• Final review December 7, 1998

– System Engineer not present

• Brassboard testing of the nominal burn only

– Brassboard configuration deviated significantly from the S/C

– Burn abort not tested at all

Recommended Procedure | NEAR Rendezvous Burn Procedure

RF Watchdog Timer set to expire after 2 missed DSN contacts. | RF Watchdog Timer set to nine days (rendezvous burn had continuous DSN coverage).

Short Command Loss Timer set for critical operations. | No Command Loss Timer.

Fast (5 sec.) AIU Heartbeat rule to detect a failed AIU. | No Fast AIU Heartbeat rule.

Autonomy macros must be run on both CTPs, so that they stay synchronized. | The burn abort macro was loaded only on CTP1. CTP1 ran Burn Abort while CTP2 ran Earth Safe simultaneously.

Backup systems must be maintained in a state of readiness for use. | EEPROM on both FCs contained out-of-date orbit information that could not be used for guidance.

Critical parameters must be verified via download and comparison with expected values. | Burn parameters uploaded and used without verification.
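
The last recommended practice in the table, verifying critical parameters by download and comparison, amounts to a simple check. The sketch below assumes parameters can be represented as name/value pairs; the parameter names and values are hypothetical and do not come from the NEAR burn script.

```python
import math
from typing import Dict, List


def verify_parameters(expected: Dict[str, float],
                      downloaded: Dict[str, float],
                      rel_tol: float = 1e-9) -> List[str]:
    """Return the names of parameters that are missing or do not match."""
    mismatches: List[str] = []
    for name, want in expected.items():
        got = downloaded.get(name)
        if got is None or not math.isclose(got, want, rel_tol=rel_tol):
            mismatches.append(name)
    return mismatches


# Hypothetical parameter names and values.
expected = {"burn_duration_s": 900.0, "accel_limit_m_s2": 1.0}
downloaded = {"burn_duration_s": 900.0, "accel_limit_m_s2": 0.3}

bad = verify_parameters(expected, downloaded)
print("GO" if not bad else f"NO-GO, mismatched: {bad}")
```

A check like this only helps if the expected values themselves come from an independent source, such as measured flight data.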

Page 18: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

The Anomaly: Just the Facts, Ma’am

• Burn Command Script

– Uploaded December 16, 1998

• Burn

– On-board initiation at specified time

– Normal execution of 200-sec settling burn

– Initiation of bi-propellant burn as expected

• Anomaly

– Burn abort within a fraction of a second of bi-propellant initiation

– S/C signal lost 37 seconds following abort

• DSN acquired Sun Safe beacon 27 hours after LoS

– Freeze command stopped rotation with Earth in the fan beam, and telemetry downlink was commanded.

Page 19: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Recovery and Outcome

• When reacquired, S/C in stable Sun Safe mode controlled by the backup AIU (AIU 2)

• Mission Operations recovered to Operational Mode

– Interrupted Command History/Autonomy History downlink

– Faulty procedure used in first attempt resulted in immediate demotion back to Safe Mode and AIU switch

• S/C state assessed and immediate cause of burn abort ascertained in two days

• New burn planned and executed on Jan. 3, 1999 allowed completion of the NEAR mission

– Used up fuel margin

– Additional 13 months of cruise prior to mission start

– Some contamination of imager optics

– Degradation of some thrusters due to cold firings

Page 20: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Data Sources for Diagnosis

Locations listed on the slide (the row-to-location assignment is not preserved in this transcript, except for SSR): SSR, CTP1 memory, CTP2 memory, AIU1 memory, AIU2 memory, FC1 memory

Data type | Availability | Comment

SSR:

Housekeeping & G&C Telemetry | Lost during LVS event |

CTP memory (first group):

Data Summary (max/min values of housekeeping telemetry) | Have, but was not reset since T-6 hrs; minimum AIU values all = 0, since AIUs were power cycled | In safe mode telemetry

Snapshot Data (all housekeeping telemetry for first 3 triggered rules) | Not reset prior to burn; contained old data from operational use of autonomy rules | In safe mode telemetry

Autonomy Rule Enable States Pre-Event | Have |

Autonomy Rule Enable States Post Event | Have |

Command History (165 commands deep) | Partially overwritten unnecessarily during recovery | In safe mode telemetry

Autonomy History (32 rules deep) | Partially overwritten during incorrect recovery sequence | In safe mode telemetry

CTP memory (second group):

Data Summary (max/min values) | Have, but was not reset since T-6 hrs; minimum AIU values all = 0, since AIUs were power cycled | In safe mode telemetry

Snapshot Data (first 3 rules) | Have | In safe mode telemetry

Autonomy Rule Enable States Pre-Event | Unconfirmed; not downloaded prior to burn | One command to downlink

Autonomy Rule Enable States Post Event | Not downloaded, then overwritten | One command to downlink

Command History (165 commands deep) | Partially overwritten unnecessarily during recovery | In safe mode telemetry

Autonomy History (32 rules deep) | Partially overwritten during incorrect recovery sequence | In safe mode telemetry

AIU and FC memory:

Command History | Lost during AIU1 switches |

Data Structure Values | Downloaded late, after some modifications |

Command History | Have |

Data Structure Values | Downloaded late, after some modifications |

Command History | Have |

Data Structure Values | Downloaded late, after some modifications |

IMU High Rate Data | FC not commanded to store |

Program Load | Never downloaded; lost during FC reboot 2 months following event |

Page 21: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Early Sequence of Events (1)

[Timeline chart. Marker: Burn Abort. Time from Burn Abort: 00:00:10, 00:06:00, 00:37:54, 00:42:37, 00:42:47, 00:42:48, 00:47:48. Rows: S/C Mode (Operational, Earth Safe, Sun Safe); Commanded Actuators (Thrusters Only, Wheels Only); Actuators In Use (Thrusters Only, Wheels Only, Momentum Dump); System Momentum (OK, Too High); Thruster Request (ON, OFF); Thruster Enable State (Enabled, Disabled); AIU in Use (AIU 1, AIU 2); FC in Use (FC1, FC2, None (AIU Only)).]

Page 22: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Why did the burn abort?

Page 23: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Early Sequence of Events (2)

[Timeline chart. Time from Burn Abort: 00:00:10, 00:06:00, 00:37:54, 00:42:37, 00:42:47, 00:42:48, 00:47:48. Rows: S/C Mode; Commanded Actuators; Actuators In Use; System Momentum; Thruster Request; Thruster Enable State; AIU in Use; FC in Use. Annotations follow.]

Transition to Earth Safe initiates burn shutdown command sequence and high-rate slew to Sun pointing using thrusters-only.

Command script error causes abrupt transition to wheels-only control.

NEAR goes to Sun Safe before the wheels can overcome the high rate.

Page 24: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

What was the script error?

• The script did not return control to the wheels at all, but did disable the thrusters.

– Without enabled thrusters, the G&C autonomously forced wheel control, but without the controlled transition.

– Because thruster-only control was commanded, the G&C used thrusters each time they were enabled.

• Need for a carefully timed sequence for returning control to the wheels was established prior to the first TCM one week after launch.

– Brassboard testing showed that the S/C was very likely to receive a “kick” from the thrusters without this controlled procedure.

• Brassboard testing of the script reproduced the early anomaly events perfectly

• The DSM1 burn script DID contain the wheels-only command (but not the right sequence), and the missing safing rules.

Page 25: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Early Sequence of Events (3)

[Timeline chart. Time from Burn Abort: 00:00:10, 00:06:00, 00:37:54, 00:42:37, 00:42:47, 00:42:48, 00:47:48. Rows: S/C Mode; Commanded Actuators; Actuators In Use; System Momentum; Thruster Request; Thruster Enable State; AIU in Use; FC in Use. Annotations follow.]

“Kick” leaves high system momentum and initiates a momentum dump.

Command script error causes new ‘kick’ to attitude and momentum.

Page 26: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Simulated Early Sequence of Events

Page 27: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Complete Timeline (00:00 - 03:00)

[Ref. 22:03:16Z 20 Dec 1998]

[Timeline chart; time axis 0:00 to 2:30 in 30-minute steps. Power System Status (D = Discharge; C = Charge): Charge, D, Charge, Intermittent C/D, Charge, with notes "Bus V < 26V; LVS Trip" and "Bus V > 28V". State: Operational, Earth-Safe, Sun-Safe Rotate. Momentum (Body + Wheels): Off-Scale, High/No warmup, High/warmup, O.K. Momentum Dumps: #1 - #7. Gyro Mode: Normal, Whole-Angle (noisy). Guidance and Control, Attitude Interface Unit: #1, Five Switches, #2. Control by FC or AIU: FC, Probably FC, FC, Recovered in AIU-only Mode; timing unknown. Key: marker for uncertain intervals.]

Page 28: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Complete Timeline (03:00 - 06:30)

[Ref. 22:03:16Z 20 Dec 1998]

[Timeline chart; time axis 3:00 to 6:00 in 30-minute steps. Rows: Power System Status (D = Discharge; C = Charge); State (Operational, Earth-Safe, Sun-Safe Rotate); Momentum (Body + Wheels) (Off-Scale, High/No warmup, High/warmup, O.K.); Momentum Dumps; Gyro Mode; Guidance and Control; Attitude Interface Unit; Control by FC or AIU. Plotted: Power System Status Intermittent C/D (Bus V = 23.4V); Momentum Dumps #8 - #14. Key: marker for uncertain intervals.]

Page 29: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Complete Timeline (06:30 - 09:00)

[Ref. 22:03:16Z 20 Dec 1998]

[Timeline chart; time axis 6:30 to 9:00 in 30-minute steps. Rows: Power System Status (D = Discharge; C = Charge); State (Operational, Earth-Safe, Sun-Safe Rotate); Momentum (Body + Wheels) (Off-Scale, High/No warmup, High/warmup, O.K.); Momentum Dumps; Gyro Mode; Guidance and Control; Attitude Interface Unit; Control by FC or AIU. Plotted: Momentum Dump #15. Key: marker for uncertain intervals.]

Page 30: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Overview

• NEAR Overview

• Anomaly Description

• Investigation Findings

• Lessons Learned

Page 31: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

NEAR Anomaly Investigation

• NEAR Anomaly Review Board established on 6 January 1999

– Assess APL efforts to understand and correct causes of anomaly, and recommend additional efforts

– Determine most probable cause of the anomaly

– Review NEAR program and recommend improvements for future missions

• Timeline reconstruction from available data

• Determination of probable cause

– Fact of and reason for burn abort recorded in snapshot data

– Script error obvious to knowledgeable engineers

- Impact of such an error known prior to the event

- Impact confirmed by brassboard simulation of the burn abort event

– Fault tree developed for anomalous momentum dumps

– Analysis and 128 brassboard simulations of potential scenarios

Page 32: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Findings

“The investigation established a good understanding of the events during approximately the first 47 min after the abort, but no explanation for the failure of onboard autonomy to quickly correct the problem. The Board found no evidence that any hardware fault or single-event upset contributed to the failure. Although software errors were found that could prolong and exacerbate the recovery, they by no means fully explain it.”

• No explanation for the long-term behavior of the S/C

• Only remaining branch in the fault tree is “two or more failures”

Page 33: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Hardware Faults

• All hardware functioned nominally before and after the anomaly and, as far as can be seen with limited data, during it

• Most hardware failure modes failed to reproduce known events when simulated

• Only gyro noise gave results close to observed behavior

– Required noise levels 10x higher than measured on the ground or in flight, before or after the anomaly

– With high gyro noise, simulations show NEAR never recovers, so noise would have to go up and down and up and down (…)

– No credible mechanism for this phenomenon was ever suggested by APL or the gyro manufacturer (Litton)

Page 34: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Software Failures

• An independent review team found 17 errors in AIU or FC software

– 9 in compiled code

– 8 in data structures or other parameters (in addition to the acceleration limit that precipitated the anomaly)

• One error caused a high momentum wheel rate to be set to zero

– Known to have occurred at least once during the anomaly

– Can cause high momentum to be calculated as low OR low momentum to be calculated as high

– In simulation, S/C always recovered in 20 minutes or less

• Software error eliminated as the total cause, because:

– Simulator running flight code did not exhibit anomaly

– No repeat of anomaly for remainder of the mission

- But there were parameter changes and software uploads
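
Why zeroing a high wheel rate can push the computed momentum in either direction is easiest to see with a single-axis toy calculation. This is an illustrative sketch only: the inertia, cutoff and threshold values are hypothetical, the units are arbitrary, and the way the rate gets zeroed is a stand-in, since the slides do not describe the actual mechanism.

```python
from typing import List

MOMENTUM_LIMIT = 2.0   # hypothetical "too high" threshold, arbitrary units
RATE_CUTOFF = 4.0      # hypothetical rate above which the error zeroes a wheel
WHEEL_INERTIA = 0.5    # hypothetical, arbitrary units


def system_momentum(body_h: float, wheel_rates: List[float]) -> float:
    """Single-axis system momentum: body plus wheels."""
    return body_h + WHEEL_INERTIA * sum(wheel_rates)


def buggy_rates(wheel_rates: List[float]) -> List[float]:
    """Stand-in for the error: a high wheel rate is reported as zero."""
    return [0.0 if abs(r) > RATE_CUTOFF else r for r in wheel_rates]


def too_high(h: float) -> bool:
    return abs(h) > MOMENTUM_LIMIT


# Case A: fast wheel, still body. True momentum is high, computed is low.
true_a = system_momentum(0.0, [8.0])
calc_a = system_momentum(0.0, buggy_rates([8.0]))

# Case B: fast wheel cancels body momentum. True is low, computed is high.
true_b = system_momentum(4.0, [-8.0])
calc_b = system_momentum(4.0, buggy_rates([-8.0]))

print(too_high(true_a), too_high(calc_a))
print(too_high(true_b), too_high(calc_b))
```

In case A a needed momentum dump would be skipped; in case B an unnecessary one would be requested.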

Page 35: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Can Software/SEUs be eliminated?

• Many G&C data structures not downloaded until after certain parameters were changed

– Data structures were not verified prior to burn

– SEU, upload error, or configuration management error possible

• FC1 program memory not downloaded and verified after anomaly

– On February 24, 1999, FC1 spontaneously re-booted (first and only time)

– Could be SEU or still unknown software error

– When the anomaly review began, it found two different versions of the FC code, both labeled 1.11. Which was on the S/C? Was either?

• 80,000 lines of highly convoluted code

• The brassboard simulates the physics; the S/C lives it

More about this later.

Page 36: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Overview

• NEAR Overview

• Anomaly Description

• Investigation Findings

• Lessons Learned

Page 37: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

First Observation

• Apparently, it took four seemingly independent errors to cause the NEAR rendezvous burn anomaly

– Burn abort caused by a threshold set too low

- Data was available to set it properly

– Serious errors in a script that should have been under configuration management, reviewed and tested

– Two or more unknown errors that caused continued control problems, even after the autonomy actions corrected the configuration error

- NARB concluded that no single error could produce the known behavior

• Was NEAR just unbelievably unlucky, or is there something to be learned here?

– Examine the patterns

Page 38: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Q: Why didn’t the G&C use data from DSM1 to set the acceleration limit?

• Consider the following:

– Less than a week prior to spacecraft/rocket mating, the System Engineer checked the alignment of the Star Camera and found it to be 90° out.

– Immediately following launch, the Star Camera was unable to find guide stars, because the on-board star map of the southern hemisphere of the celestial sphere was incorrect.

– The first TCM was poorly controlled, because the control law used an inaccurate model of the thruster valve action. (Manufacturer’s data was located in the G&C engineer’s file cabinet.)

• The burn anomaly was neither the first nor the only time the NEAR G&C team failed to measure or use data to check their models

• S/C testing failed to uncover any of these errors, including the faulty acceleration limit

Page 39: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Testing G&C Algorithms

• Lacking a zero-gravity environment, a wrap-around simulation with a ‘truth model’ is the only way to test G&C

– Meticulous attention to modeling of physical phenomena

– Independence between the flight algorithms and the truth model

• The NEAR ‘truth model’ was written by the flight team and mirrored all the incorrect physical models used to design the S/C G&C algorithms

– Although NEAR had an Independent V&V team for G&C, the flight team GAVE them all the models

• MSX (the program prior to NEAR) had an independent team build the truth model

– NEAR flight G&C team opposed an independent team for NEAR

– “50% of the errors found on MSX were in the truth model, not the S/C”

Page 40: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Lesson Number 1

• Always have an independent team build the simulation that will be used to test the G&C algorithms.

– Different approaches by different teams can uncover biases on the part of either team

– Use measurements on real flight hardware as much as possible

– Accept the time spent on errors in the truth model in order to find the errors in flight algorithms

You can’t fool Mother Nature.
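
A toy example may make the independence argument concrete. In the sketch below the "flight algorithm" sizes an open-loop burn from an assumed thrust acceleration, and two "truth models" check it: one that copies the flight team's assumption and one built from measured data. Every name and number is hypothetical; it only illustrates how a shared modeling error hides itself.

```python
def burn_duration_s(target_dv: float, assumed_accel: float) -> float:
    """Flight algorithm (toy): open-loop burn time from an assumed acceleration."""
    return target_dv / assumed_accel


def achieved_dv(duration_s: float, truth_accel: float) -> float:
    """Truth model (toy): delta-v actually delivered over the burn."""
    return duration_s * truth_accel


TARGET_DV = 10.0        # m/s, hypothetical
DESIGN_ACCEL = 0.5      # m/s^2, the flight team's (wrong) assumption, hypothetical
MEASURED_ACCEL = 0.375  # m/s^2, from a hardware measurement, hypothetical

duration = burn_duration_s(TARGET_DV, DESIGN_ACCEL)

# Truth model that copies the flight team's assumption: the test passes.
shared_error = achieved_dv(duration, DESIGN_ACCEL) - TARGET_DV

# Independently built truth model using measured data: the error shows up.
independent_error = achieved_dv(duration, MEASURED_ACCEL) - TARGET_DV

print(shared_error, independent_error)
```

The shared model reports no error because it repeats the same mistake; only the independent model, tied to measurements, exposes the shortfall.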

Page 41: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Q: How did such an obviously flawed script escape notice?

• Consider the following:

– At launch, most of the scripts in use by Mission Operations were last-minute adaptations of S/C-level test scripts

- Dangerous test commands still in place in the rendezvous burn script

– In just the first 8 months of operations, there were 7 entries into Safe Mode caused by Mission Operations errors

- Many script errors that could have been found by brassboard testing

- Lessons of the first TCM

– At the time of the rendezvous burn, Mission Operations still had no set procedure for recovery from Safe Mode

- This was an Action Item from Critical Design Review

– 2 months after the burn anomaly, two new Safe Mode entries were caused by operations errors in loading orbits

- Loading new orbits was a routine operation for three years of cruise

• Such events were almost accepted as an inevitable part of operations

Page 42: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Preparation for Operations

• Most operations on spacecraft can be planned, scripted and tested on the ground before launch

– A Concept of Operations that reflects the actual S/C design

– Contingency planning, as well as nominal operations

– Scripts that can be used in flight

• Pre-launch NEAR Mission Operations concentrated on ground system acquisition and DSN connectivity

– The NEAR CONOPS was essentially generic - no thinking about how the operations would be conducted with NEAR

– No thinking about contingencies

– No practice of operations with significant round-trip light times

• A function of better, cheaper, faster?

– Nope; function of inexperience with professional operations

Page 43: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Mission Operations Professionalism

• Professional Mission Operations requires discipline

– Configuration management of scripts, code, parameters, etc.

– Following a process

- Review process

- Test requirements

- Script sign-off

– Use of proven procedures to perform routine tasks

– Using Problem Failure Reporting as an opportunity to learn

- Change or institute process to avoid repeat of errors

• NEAR approach: Conduct operations with a team of engineers who would become experts on the spacecraft and mission

– Resulted in a ‘heroic’ mode of operations - CMM Level 1 of ops

– Configuration management, reviews, sign-off on scripts were not the interesting part of operations for the NEAR team

– Didn’t acquire the degree of knowledge required for hero status

Page 44: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Specialized Technical Knowledge

• Very few people were truly capable of reviewing scripts

– G&C engineers didn’t understand the scripting language

– Mission Operations team didn’t understand the spacecraft

• Running the brassboard simulator took knowledge and patience

– Setting up the ‘truth model’ simulation

– Maintenance to keep the brassboard synchronized with the S/C

– Ran in real-time, so simulations took time

– “Half the time, errors are in the brassboard setup, not the script”

Page 45: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Lesson Number 2

• Mission Operations requires a team of experienced, dedicated professionals with a unique set of skills.

– Planning, preparation, process control, configuration management are as important as detailed technical knowledge

– Practice pre-flight the way you plan to operate in flight and then don’t deviate unless absolutely necessary

– Accept the time spent on errors in the ground simulator to find the errors in scripts

Being a hero means never having to say “I’m sorry”.

Page 46: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Q: Why didn’t the S/C recover after autonomy corrected the G&C mis-configuration?

• Consider the following:

– Pre-flight, G&C code had more SW PRFs than the other 5 processors combined

– New versions were loaded ~10 times during S/C-level testing, in Maryland and at the Cape

- The first three versions wouldn’t boot

- Three separate problems that caused FC commands to be ignored were discovered pre-flight and a fourth after launch.

– Telemetry was a particular problem

- The G&C team used the ground simulator, not telemetry, to test their software

– Prior to the anomaly, FC code was uploaded three times and AIU code once to correct major problems in flight

– 17 additional errors were found during the rendezvous burn investigation

• The existence of undiscovered G&C code errors is not unlikely, based on the continued high rate of fault discovery

Page 47: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Q: Do we even need an undiscovered error to explain the anomaly?

• G&C software error caused a high momentum wheel rate to be set to zero

– Can cause high momentum to be calculated as low OR low momentum to be calculated as high

• In simulation, S/C always recovered in 20 minutes or less

– Limited brassboard simulation of this scenario

– Other failures of the simulation to catch errors in flight code have been caused by mismatches between the truth model and reality

– How accurate is the wheel model?

– How would the behavior change if the wheel model is changed?

• An attractive hypothesis

– Requires only a known error in the G&C code, plus an unrealistic wheel model

– Code containing the error known to be invoked at least once during the anomaly

Page 48: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Lesson Number 3

• If the software error discovery rate is still high, keep testing, even if the S/C has already been launched.

– Use the ground simulator

– Use an independent team, like the NARB did following the anomaly

– Remember Lesson 1: Take every opportunity to adjust the truth model to match S/C performance in flight

It ain’t over ‘til it’s over.

Page 49: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Q: Was NEAR just really unlucky?

• Consider the following:

– G&C code was known to be buggy pre-flight

– G&C code continued to be buggy during flight

– The fact that there had been no true independent look at the G&C truth model was known within a week of launch

– Mission Operations preparation was known to be inadequate prior to launch

– Mission Operations were fault-prone throughout cruise

– Mission Operations was never asked for an accounting of their process prior to the burn anomaly

• Each event was treated individually, rather than as a pattern that had a high probability of converging into disaster

– The NEAR burn anomaly represented a Management failure

Page 50: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Lesson Number 4

• Management must stay informed and involved before there is a serious problem

– Look for trends and patterns

– When there is serious disagreement on the cause or meaning of events, look closer

– Get an independent opinion

Ultimately, leadership is responsible.

Page 51: The Near Earth Asteroid Rendezvous (NEAR) Rendezvous Burn Anomaly

Conclusion

• NEAR survived under the conditions described because it had:

– A very forgiving mission design

- No critical timing for TCMs

– A huge fuel margin

- Despite 96 m/s lost in the anomaly and the addition of a DSM2, NEAR had sufficient fuel to conduct the mission

– A reliable and comprehensive safing system

- Took care of almost everything that was thrown at it during flight

- Still had one rule to go when the burn anomaly corrected itself

• The burn anomaly served as a wake-up call; with another reload of the flight computer, NEAR went on to perform a flawless, year-long mission around Eros, terminating in a controlled landing.

• By heeding the lessons of NEAR, other, less-blessed missions can be just as successful without the initial trauma.