EPIC SW Status

Page 1: EPIC SW Status

XMM-Newton

EPIC SW Status

Guillermo Buenadicha

TOS/OFX, SOC Operations Support Group

Palma de Mallorca, 1st February 2005

Page 2: EPIC SW Status

Introduction

ACTIVITIES SINCE LAST TTD

– SW Versions
– EMDH K upload
– EPIC PN Spare testing

5 YEARS OBSW, WRAP UP

– Why modify the SW?
– Examples
– Where are we?
– What’s next?

OBSW Contract expiration

Page 3: EPIC SW Status

Current SW releases

Experiments: RGS, OM, EPIC MOS, EPIC PN

Item               Launch S/W   SOCSIM SW
RGS IC             FM S403      FM S206
RGS DPP            16.A         ***
RGS CSG            FM S403      ***
Hot Pixel Table    Nov-99       ***
Hot Column Table   Nov-99       ***
OM ICU             FM-9         FM-10
OM DPU             FM-9         ***
EMDH               H            H
EMCR               14           ***
EMAE CSG           Oct-98       ***
EPDH               H            H
EPCE               2.3          ***
EPEA               5.14         ***

Date        Release(s)
01-Mar-00   5.17
08-Mar-00   19, Feb-00, Feb-00
17-Mar-00   FM S206
29-Jun-00   FM-10, FM-10b, I
07-Sep-00   FM-11, FM-11
19-Jan-01   I
24-Jan-01   FM-12
12-Jun-01   V-10
02-Aug-01   J
30-Nov-01   FM S207
22-Jan-02   J
16-May-02   K
07-Nov-02   J+
27-Nov-02   V-11, V-11
15-Feb-03   v83
05-May-04   K
22-Nov-04   FM S208

Page 4: EPIC SW Status

EMDH K UPLOAD

The EMDH K was uploaded on the 5th of May 2004, after testing on the Spare and a prior uplink at the end of April.

A refined version was used (SW reallocation only), to avoid alarms related to the boot of the EMDH Slave processor.

All items for extended BP handling are in place; the only pending issue is the modification of the ground SW to ingest them into the relevant FITS files. This will be implemented in the new S2K system.

Also in place are the preventive correction for the FW movement in the presence of a sensor failure, and the HK flagging mechanism.

Any ideas for the latter?

Page 5: EPIC SW Status

EPIC MOS: EMDH K

ECR #7

After cooling in rev 533, reduction by 8 in the number of Bright Pixels. However, their management is still deemed necessary to cope with a future increase.

Preliminary analysis of options in ISS-MOS-BPT-TN-01 (July 2002).

The option selected for a demonstration prototype is to increase up to 250 pixels per HBR, using a data reduction strategy in the BPT storage.

The constraints in any implementation are the reduced and limited data space in the EMDH processors. The implementation chosen also impacts the BPT uplink and report mechanisms, and the rejection procedure, although limited changes are foreseen there to maintain performance.

Page 6: EPIC SW Status

EPIC MOS: EMDH K, ECR 12

Existing logic, Safe entering

– Inputs checked: Prime sensor Closed, Redun sensor Closed, Ext Temp sensor < 60 deg
– Action: Move FW

What if FW Open, and Ext Temp broken (temp > 60)???

Logic After EMDH K

– Inputs checked: Prime sensor Closed, Redun sensor Closed, Ext Temp sensor < 60 deg, Check Override bit
– Action: Move FW

If bit set to 0, no differences w.r.t. current situation.

If bit set to 1, FW will move in any case.
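
The effect of the override bit can be sketched in a few lines of Python. This is only an illustration, not the on-board code: the function name `fw_move_decision` is hypothetical, and the way the existing logic combines the sensor readings is not stated on the slide, so its outcome is passed in as a single boolean.

```python
def fw_move_decision(existing_logic_allows_move: bool, override_bit: int) -> bool:
    """Return True if the filter wheel (FW) would be commanded to move.

    existing_logic_allows_move: outcome of the pre-EMDH K checks
        (prime/redundant position sensors, external temperature sensor).
        How those checks are combined is an assumption left to the caller.
    override_bit: 0 keeps the existing behaviour, 1 forces the move.
    """
    if override_bit not in (0, 1):
        raise ValueError("override_bit must be 0 or 1")
    if override_bit == 1:
        # Bit set to 1: the FW moves in any case.
        return True
    # Bit set to 0: no difference with respect to the existing logic.
    return existing_logic_allows_move


if __name__ == "__main__":
    for baseline in (False, True):
        for bit in (0, 1):
            print(baseline, bit, fw_move_decision(baseline, bit))
```

With the bit at 1, a broken external temperature sensor can no longer block the move, which is the failure case raised in the question above.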

Page 7: EPIC SW Status

EPIC PN SPARE TESTING

EPIC PN Testing on the Spare

Performed at the Panter facility in the last two weeks of Nov. 2004. LABEN, ESAC, PIs.

The purpose was to verify the readiness of the Spare Chain, and to upgrade it with the latest SW (EPEA 517, EPDH K, latest Ops DB procedures).

A special test was devoted to the fast dump of the offsets via HBR.

A dedicated set of tools was developed to support testing and data analysis.

The need for a warm reset in case of a CDMU crash (quite unlikely!) was identified. Procedure change.

Page 8: EPIC SW Status

EPIC PN: ECR #2

[Diagram labels]

– Units: EPEA, EPDH, CDMU, GROUND
– FIFO: 8 Kwords
– HBR Buffer: 40 Kwords
– BRAT selected: 16 Kbps for 1 quad, 34 Kbps for 2 quad
– 25999 words
– (130 words + 10 msec) * 200 >= 2 secs
– Up to 68000 reads per second???
– PST
– HK: 4
– Science Queue: 16 packets
– NP: 4
– VC-7 TM
– TC F0106, Xqt_offs, Dump_offset
– 1 packet every 250 msec.
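
The timing figure "(130 words + 10 msec) * 200 >= 2 secs" can be checked with a short calculation. The sketch below assumes the expression means 200 transfers of 130 words, each followed by a 10 msec wait; that reading, and the names `dump_budget`, `WORDS_PER_TRANSFER`, `WAIT_MS_PER_TRANSFER` and `N_TRANSFERS`, are assumptions and not taken from the slide.

```python
WORDS_PER_TRANSFER = 130   # words moved per transfer (from the slide)
WAIT_MS_PER_TRANSFER = 10  # wait per transfer in milliseconds (from the slide)
N_TRANSFERS = 200          # number of transfers (from the slide)


def dump_budget(n_transfers: int = N_TRANSFERS,
                words: int = WORDS_PER_TRANSFER,
                wait_ms: int = WAIT_MS_PER_TRANSFER) -> tuple[int, int]:
    """Return (total words moved, minimum duration in ms).

    Only the fixed waits are counted, so the duration is a lower bound:
    the time to move the words themselves comes on top of it.
    """
    total_words = n_transfers * words
    min_duration_ms = n_transfers * wait_ms
    return total_words, min_duration_ms


if __name__ == "__main__":
    total_words, min_ms = dump_budget()
    print(f"{total_words} words, at least {min_ms} ms")
```

Under this reading the waits alone add up to 2000 msec, consistent with the ">= 2 secs" on the slide, and the 26000 words moved are close to the 25999 words quoted.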

Page 9: EPIC SW Status

EPIC PN: ECR #2

NEW OFFSET DATA STRUCTURE / CURRENT OFFSET DATA STRUCTURE

Word patterns (16 bits per row):

0 0 I I I I I I I I J J J J J J
Q Q C C O O O O O O O O O O O O
Q Q C C T T B B B B B B B B B B
P P P P O O O O O O O O O O O O
1 S S S S S S S S S S S S S S S
F F F F F F F F F F F F F F F F
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 I I I I I I I I J J J J J J

Repetition labels: + 59 x, x 216 +, x 64 +, x 200

Legend, new structure:

Q = Quadrant Id
C = CCD Id
T = Table Id
B = Block Id
P = Pixel Status
O = Offset Value
S = Seconds of Offset Calculation
F = Fraction of Seconds
I = Pixel I coordinate
J = Pixel J coordinate

Legend, current structure:

Q = Quadrant Id
C = CCD Id
O = Offset Value
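
As an illustration of how a word such as `Q Q C C O O O O O O O O O O O O` can be decoded, the sketch below packs and unpacks a 2-bit Quadrant Id, a 2-bit CCD Id and a 12-bit Offset Value. It assumes the pattern is a single 16-bit word drawn with the most significant bit first; the bit order and the names `OffsetWord`, `pack_offset_word` and `unpack_offset_word` are assumptions, not taken from the slides.

```python
from typing import NamedTuple


class OffsetWord(NamedTuple):
    quadrant: int  # Q, 2 bits
    ccd: int       # C, 2 bits
    offset: int    # O, 12 bits


def pack_offset_word(quadrant: int, ccd: int, offset: int) -> int:
    """Build a 16-bit word laid out as QQ CC OOOOOOOOOOOO (MSB first, assumed)."""
    if not 0 <= quadrant <= 0x3:
        raise ValueError("quadrant must fit in 2 bits")
    if not 0 <= ccd <= 0x3:
        raise ValueError("ccd must fit in 2 bits")
    if not 0 <= offset <= 0xFFF:
        raise ValueError("offset must fit in 12 bits")
    return (quadrant << 14) | (ccd << 12) | offset


def unpack_offset_word(word: int) -> OffsetWord:
    """Split a 16-bit word into its Q, C and O fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    return OffsetWord(
        quadrant=(word >> 14) & 0x3,
        ccd=(word >> 12) & 0x3,
        offset=word & 0xFFF,
    )


if __name__ == "__main__":
    word = pack_offset_word(quadrant=2, ccd=1, offset=0x123)
    print(hex(word), unpack_offset_word(word))
```

The same shift-and-mask approach would apply to the other word patterns (for example 4 bits of Pixel Status followed by 12 bits of Offset Value), with the field widths read off the corresponding row.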

Page 10: EPIC SW Status

5 years of SW modifications

[Chart: IOBSW Versions; vertical axis 0 to 4; horizontal axis dates from 10/12/1999 to 10/09/2004 in three-month steps]

Page 11: EPIC SW Status

EPIC OBSW RELEASES

[Chart: EPIC SW Versions; vertical axis 0 to 1; horizontal axis dates from 10/12/1999 to 10/09/2004 in three-month steps; releases marked: EPEA 517, EMDH I, EPDH I, EPDH J, EMDH J, EPDH K, EMDH J+, EMDH K]

Page 12: EPIC SW Status

5 years of EPIC SW

– EMDH: 4 versions (EMDH I, J, J+, K)
– EPDH: 3 versions (EPDH I, J, K)
– EPEA: 1 version (EPEA 517)
– No other units modified

Experiment   EPIC MOS                   EPIC PN
Item         EMDH   EMCR   EMAE CSG     EPDH   EPCE   EPEA
Launch S/W   H      14     Oct-98       H      2.3    5.14
SOCSIM SW    H      ***    ***          H      ***    ***
01-Mar-00                                             5.17
29-Jun-00    I
19-Jan-01                               I
02-Aug-01                               J
22-Jan-02    J
16-May-02                               K
07-Nov-02    J+
05-May-04    K

Page 13: EPIC SW Status

Reasons for SW modifications

The OBSW is typically modified after launch in 3 scenarios:

– CORRECT: Need to correct launch SW bugs and to tailor the OBSW during the commissioning phase.
– ADAPT: Fit the unit to the nominal performance and implement improvements.
– PROTECT: Preparation of the instrument for an extended lifetime and prevention/correction of HW failures or degradation.

Page 14: EPIC SW Status

OBSW changes XMM

[Chart: IOBSW Versions by category (CORRECT, ADAPT, PROTECT); vertical axis 0 to 4; horizontal axis dates from 10/12/1999 to 10/09/2004 in three-month steps]

Page 15: EPIC SW Status

Examples EPIC: Correct

– EPEA 517: NCRs 5 and 12, Low energy noise, Image correction consistent.
– EMDH I: TC rejected, FW movement, Headers and trailers not matching (NCRs 3, 9, 14).
– EPDH I: TC rejected.

Page 16: EPIC SW Status

Examples EPIC: Adapt

– EMDH J, J+: Internal LABEN fixes, Watchdog function.
– EPDH J: Watchdog function.
– EPDH K: Fast dump of Offsets.

Page 17: EPIC SW Status

Examples EPIC: Protect

– CDMU J: NCR 83, PN Operating heater autonomous switch off.
– EMDH K: Extended BP capability. Prevent possible sensor failure impacting on FW.
– ???

Page 18: EPIC SW Status

Where are we?

After 5 years, it is time to sum up the SW status and see what is needed.

Several NCRs declared as “Unresolvable”, ground W/A in place. Are we happy?

Adaptations needed.

– Any performance improvements?
– Security issues
– Do we need to monitor performance?

HW failures

– How to deal with them
– Mission lifetime and unit specifications

Page 19: EPIC SW Status

Unresolvable items

– NCR-39: PN quadrants not listening. Performance.
– NCR-75: Fail to reset EPEA Time at 32400. Processor performance.
– NCR-99: Late start of time in EPEA. Fixed in SAS but still a problem OB. Processor performance.
– NCR-97: TC E0001 fails transmission. SW bug.
– NCR-100: Erroneous fine Time in the Time verification packet. SW bug.
– NCR-103: Corrupted events confused with time infos. Processor performance.
– NCR-106: Failure of F0119 (LBR error).
– NCR-107: 3 rows missing in MOS-2 CCD 6. Not clear.

Page 20: EPIC SW Status

Performance and security

NCRs like 75 (EPEA time reset) and 103 (FFFF words) affect the quality of data, so far taken care of by On Ground SW.

– 107: 3 rows missing in MOS2 CCD6
– MOS Offset???
– Response of PN quadrants to TCs

TREND MONITORING? ACTIONS?

Page 21: EPIC SW Status

EPIC PN: Time Problems 3

[Chart: Total_Fails/All_Events; vertical axis 0 to 0.007; horizontal axis by observing mode (ExtendedFull, FullFrame, LargeWindow, SmallWindow). Revs. 478 to 510: FAILS / EVENTS ratio]

Page 22: EPIC SW Status

HW failures, possible impact

– µP performance, future degradation. Impact on BW and quality of the data transmitted (EPEA problems).
– HW failures (PN Q2 Voltage converter current out of limit, EMAE 28 volts line down). RGS ADC error.
– LCL trip off, NCR 83 on PN.
– Processor resets, radiation hardness, non-EDAC memories. Changes in SW variables.
– CCD degradation or unavailability.

Page 23: EPIC SW Status

WHAT’S NEXT?

• Better flagging and SW debugging? Becomes more and more important. Information on scheduled processes, variable status, buffer occupation…

• Failures HW related? Do we know the impact of LCL failures, quadrants not working, status of redundant channels (readout nodes, units, etc.)? Any preventive measure like NCR 83 or ECR 12 OB?

• Degradation and performance. How to cope with CCD degradation. How to handle more failures or worse response from the event analyzers.

• New modes foreseen. Is there anything expected? Old modes to be revisited?

• Instrument operations in reduced coverage scenario? Can we operate them with longer outage periods?

Page 24: EPIC SW Status

OBSW Contract

The OBSW contract ESA/Consortium expires mid 2005.

– Is the HW going to be kept?
– Expertise maintenance?
– Any foreseen activity before expiration?