Strategy to achieve smooth upgrades during operations
Vito Baggiolini BE/CO
V. Baggiolini, CO Day, 22 June 2010
Outline
• Motivation
• Strategy for smooth upgrades
• Two concrete examples
• Conclusions
[Figure: LHC Software Architecture. Hardware view: operator consoles, fixed displays, file servers, application servers, SCADA servers and timing generation on the CERN Gigabit Ethernet technical network (TCP/IP communication services); RT Lynx/OS VME front ends, WorldFIP front ends, PLC. Software layers: GUI Layer, Business Layer, Controls SW Infrastructure, Front End Layer. Components shown: Core Control GUIs, Equipment GUIs, Fixed Displays Frame, Alarms (LASER), Data Concentrators, Core Software Interlock System, Post Mortem, Timing Management, DB Settings & Logging, DB Access, Role Based Access RBAC – Critical Settings Management, Sequencer, Diagnostics Monitoring DIAMON - TIM, CMW Controls Middleware, Front-End FESA servers, GM servers.]
A distributed, highly modular control system
• GUI and Business Layer (Java)
  – 850 binary components (jar files) in production
  – Combined into 400 different GUIs and 150 server programs
  – Up to 600 processes on 400 machines
  – Developed by 50 people from 10 different groups
• Front-End Layer (C/C++)
  – 550 different device types (FESA + GM classes)
  – 70’000 devices deployed on 800 different machines
  – Developed by 80 people from 8 different groups
• Informal organization of development and deployments
Upgrades are necessary but risky
• The control system needs to be upgraded regularly
  – New functionality, improvements, bugfixes
• Upgrades during operational periods
  – We don’t have long yearly shutdowns anymore
  – Just 4-day technical stops every six weeks
• Upgrades can break the control system
  – Bugs inside a given component
  – Incompatibility between components
  – Incomplete upgrades due to bad coordination
  – …
Outline
• Motivation
• Strategy for smooth upgrades
• Two concrete examples
• Conclusions
Strategy for Smooth Upgrades
• Investment in quality mandated by management
• Official approach now (already practiced informally)
  – Analyse the impact of a change upfront (cf. next slide)
  – Backward compatible upgrades if possible
  – Non-backward compatible upgrades only with careful coordination and follow-up
  – Big changes on central systems only during shutdown (2011/12)
• Other ingredients to smooth upgrades (not detailed here):
  – Planning before even starting development work
  – Good unit and integration testing (Testbed)
  – Deploy upgrades only in those accelerators that need them
  – Tools to quickly revert in case of problems
The developer’s perspective
• Developer receives a user request for change
  – New functionality / improvement / bugfix / request to adapt
• Developer analyzes impact
  – Is the change localized in my component?
  – Can I do it in a backward compatible way?
  – If not, how many dependent projects are affected? Can I convince them to follow me?
  – Requires careful consideration and discussions with others
• Three outcomes and answer to users:
  – Yes, I can do it in isolation or in a backward compatible way
  – Yes, but I need to coordinate with N other developers
  – No, I cannot do this change during the physics run
Two upgrading scenarios
• Either an upgrade is fully backward compatible
• Or NOT backward compatible but carefully coordinated, to ensure all necessary changes in all dependent components are done and validated
• Why not always impose backward compatibility?
• Advantages of backward compatibility
  – Does not break anything
  – Isolated upgrades possible, no big coordination needed
• Disadvantages of backward compatibility
  – Constraint for development, not always easy to achieve and to validate
  – Leads to sub-optimal solutions and technical debt
• Cannot just blindly advocate backward compatibility!
Resulting needs for development process + tools
• Need to easily find active incoming dependencies (“which active components depend on mine?”)
  – Need a list of all active (operational) client components
  – Possibility to find those that depend on my component
• Need clear public interfaces between components
  – E.g. Application Programming Interface (API), I/O points or Device/Properties, network protocol
  – Can change everything that is not part of public interface
• Need means to validate backward compatibility
  – Tools to guarantee truly backward compatible changes
How to find active incoming dependencies
• Static analysis (without execution of code)
  – Configuration in databases or files
  – Inspection of source code or binaries
• Dynamic analysis (gathering of info while code is executed)
  – Instrument operational programs
  – Analysis of logfiles
• Propose to use a combination of both
  – First dynamic analysis to identify active programs
  – Then static analysis of the dependencies of each program
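The combined approach can be sketched in a few lines of Python. This is only an illustration under assumptions: the program and component names are hypothetical, the set of active programs is taken as already known from dynamic analysis, and the dependency map as already extracted by static analysis.

```python
from typing import Dict, Iterable, Set


def active_incoming_dependencies(
    component: str,
    active_programs: Iterable[str],
    static_deps: Dict[str, Set[str]],
) -> Set[str]:
    """Return the active programs whose static dependencies include `component`."""
    return {
        program
        for program in active_programs
        if component in static_deps.get(program, set())
    }


# Step 1 (dynamic analysis): programs observed running in operations (hypothetical).
active = {"gui-orbit", "server-logging"}

# Step 2 (static analysis): components each known program depends on (hypothetical).
deps = {
    "gui-orbit": {"cmw-client", "settings-api"},
    "server-logging": {"cmw-client", "db-access"},
    "gui-retired": {"settings-api"},  # not active, so it is ignored
}

affected = active_incoming_dependencies("settings-api", active, deps)
print(sorted(affected))  # ['gui-orbit']
```

Filtering by active programs first keeps retired clients out of the answer, which is the reason for running the dynamic step before the static one.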
How to achieve Backward Compatibility (BC)
• Interfaces have a static and a dynamic aspect
  – Static: Interface definition (API, device/properties)
  – Dynamic: Sequence of interactions
• Both aspects must be preserved
• Tools needed to assess backward compatibility
  – Specific tools to validate static aspect for each kind of interface
  – Functional tests (e.g. testbed) to validate dynamic aspects
Outline
• Motivation
• Strategy for smooth upgrades
• Two concrete examples: FESA and Java
• Conclusions
FESA: Analysis of Incoming Dependencies
• Questions of a FESA developer
  – Who uses my device Magnet11/current?
  – Can I change / rename / remove it?
• Static analysis (Goal: find all operational devices)
  – Data-driven applications: database or configuration files
  – Java programs with hardcoded device/property names: source code analysis (Fisheye), byte-code analysis
• Dynamic analysis (Goal: find all operational devices)
  – Spy on client programs that connect to devices: log files of middleware directory service
• Manual dependency analysis is still needed
  – “Loosely coupled” systems, e.g. data published via JMS or DIP
• Difficult to conclude that a device is not used at all!
  – Default recommendation to stay backward compatible
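As an illustration of the dynamic analysis, the Python sketch below collects which clients connected to which device/property from log lines. The log line format, client names and device names are assumptions made for this example; the slides do not describe the actual format of the middleware directory service logs.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# Assumed line format (hypothetical): "<timestamp> client=<name> device=<name> property=<name>"
LINE = re.compile(
    r"client=(?P<client>\S+)\s+device=(?P<device>\S+)\s+property=(?P<prop>\S+)"
)


def clients_by_property(log_lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each 'device/property' seen in the log to the set of clients that used it."""
    usage = defaultdict(set)
    for line in log_lines:
        match = LINE.search(line)
        if match:  # lines that do not match the assumed format are skipped
            key = f"{match.group('device')}/{match.group('prop')}"
            usage[key].add(match.group("client"))
    return dict(usage)


log = [
    "2010-06-22T10:15:00 client=gui-orbit device=Magnet11 property=current",
    "2010-06-22T10:15:02 client=server-logging device=Magnet11 property=current",
    "2010-06-22T10:16:40 client=gui-orbit device=Magnet12 property=status",
    "malformed line without the expected fields",
]
usage = clients_by_property(log)
print(sorted(usage.get("Magnet11/current", set())))  # ['gui-orbit', 'server-logging']
```

An empty result for a device only means no client was seen in the analysed period. It does not prove the device is unused, which is why staying backward compatible remains the default recommendation.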
FESA: Achieving Backward Compatibility
• FESA interface = Device/property model
  – Public (non-expert) properties defined in a FESA class
• How to preserve backward compatibility:
  – No devices and properties are renamed or removed
  – No data types are modified
  – Sequence of interactions (protocol) is not changed
• Will integrate BC checks into FESA development tools
  – Early warning to developer
  – Ongoing work between FESA and database teams in CO
  – Exact policy to be discussed with equipment groups
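The static part of these rules can be checked mechanically. The Python sketch below compares two versions of a device/property interface; the dictionary model, device names and type names are hypothetical and are not the FESA tooling itself.

```python
from typing import Dict, List

# Interface model (assumption): device -> property name -> data type.
Interface = Dict[str, Dict[str, str]]


def bc_violations(old: Interface, new: Interface) -> List[str]:
    """List the static backward-compatibility violations when going from `old` to `new`."""
    problems = []
    for device, old_props in old.items():
        if device not in new:
            problems.append(f"device removed or renamed: {device}")
            continue
        for prop, old_type in old_props.items():
            new_type = new[device].get(prop)
            if new_type is None:
                problems.append(f"property removed or renamed: {device}/{prop}")
            elif new_type != old_type:
                problems.append(
                    f"data type changed: {device}/{prop} {old_type} -> {new_type}"
                )
    return problems


old = {"Magnet11": {"current": "float", "status": "int"}}
new = {"Magnet11": {"current": "double", "state": "int", "temperature": "float"}}

for problem in bc_violations(old, new):
    print(problem)
```

Added properties such as `temperature` are not reported, since additions do not break existing clients. The sequence of interactions is not covered by a check like this and still needs functional tests.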
Java: Analysis of Incoming Dependencies
• Questions of a Java developer
  – Who uses my MyClass.myMethod(…)?
  – Can I freely modify it?
• Active clients = operational programs
  – GUIs started by operations on CCC consoles
  – Programs running on server machines
  – We know the jar files used in these applications
• Byte-code analysis on all classes of operational jars
  – Invocations of methods of classes in other jars
  – References to fields of classes in other jars
  – Inheritance relations between classes from different jars
• Currently developing a tool for Eclipse
  – Proof-of-concept implemented
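Once cross-jar references have been extracted, answering "who uses my method?" is a reverse lookup. The Python sketch below assumes the references were already produced by some byte-code analyser (not shown); the `Reference` record and all jar and class names are hypothetical, and this is not the Eclipse tool mentioned above.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class Reference(NamedTuple):
    source_jar: str  # jar containing the referring class
    target_jar: str  # jar that defines the referenced class or member
    target: str      # e.g. "MyClass.myMethod"
    kind: str        # "invocation", "field" or "inheritance"


def incoming_references(refs: Iterable[Reference], my_jar: str) -> Dict[str, Set[str]]:
    """Map each member of `my_jar` to the other jars that reference it."""
    index = defaultdict(set)
    for ref in refs:
        if ref.target_jar == my_jar and ref.source_jar != my_jar:
            index[ref.target].add(ref.source_jar)
    return dict(index)


refs = [
    Reference("gui-orbit.jar", "my-lib.jar", "MyClass.myMethod", "invocation"),
    Reference("server-logging.jar", "my-lib.jar", "MyClass.myMethod", "invocation"),
    Reference("gui-orbit.jar", "my-lib.jar", "MyBase", "inheritance"),
    Reference("my-lib.jar", "my-lib.jar", "MyClass.helper", "invocation"),  # internal use
]
users = incoming_references(refs, "my-lib.jar")
print(sorted(users["MyClass.myMethod"]))  # ['gui-orbit.jar', 'server-logging.jar']
```

Members that appear only in references from inside the same jar, like `MyClass.helper` here, have no incoming dependencies in the analysed set.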
Java: Achieving Backward Compatibility
• Assessing BC of Application Programming Interfaces
  – Trickier than you first think ;-)
• Tools exist for Eclipse (“PDE API tools”)
  – Immediate feedback to the developer when they break BC
  – Configuration in Eclipse not straightforward, must be automated
• First tests done, but nothing yet decided
Conclusion
• Backward Compatibility as default approach for upgrades during operations
  – The only choice for widely used systems (e.g. middleware)
  – Default approach for FESA devices and core Java systems
  – “Technical Debt” needs to be cleaned up during shutdowns
• Non Backward Compatible upgrades accepted if:
  – All dependent components can be reliably identified and owners are willing to follow and re-validate changes
  – Non-BC can be best solution in some cases. Risk must be discussed with users on a case-by-case basis.
• In any case, (non-)BC is only one aspect of smooth upgrades
[Figure: Cmmnbuild release process. Boxes: SVN sources, Cmmnbuild release, binary repository, accelerator.]
• Checkout sources to clean directory; tag in SVN with release version number
• Build: compile, run unit tests, create binaries (jars/libs)
• Release: save versioned products to binary repository
• Deploy: run in operations [separate step]
V. Baggiolini, TC 12-Mar-2010
Java: API Consolidation
• Need to clearly specify the contract (API) to make sure it is not broken
  – Which packages, classes, methods are part of the API
  – Which ones are not (thus can be modified freely, without BC constraints!)
• Need tools to enforce correct use of APIs
• Start with “key” APIs between clients and servers
• Part of the Software Improvement Process
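Enforcing correct use of an API amounts to flagging client references that fall outside the declared API packages. The Python sketch below shows the idea under assumptions: the package names, client names and the list of observed uses are hypothetical, and a real tool would obtain the uses from source or byte-code analysis.

```python
from typing import Iterable, List, Set, Tuple


def non_api_uses(
    api_packages: Set[str],
    uses: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the (client, class) pairs whose class lies outside the declared API packages."""

    def package_of(class_name: str) -> str:
        # "org.example.settings.api.SettingsService" -> "org.example.settings.api"
        return class_name.rsplit(".", 1)[0]

    return [
        (client, cls) for client, cls in uses if package_of(cls) not in api_packages
    ]


# Declared contract: only this package is public API (hypothetical name).
api = {"org.example.settings.api"}

# Observed uses as (client program, fully qualified class name), all hypothetical.
uses = [
    ("gui-orbit", "org.example.settings.api.SettingsService"),
    ("gui-orbit", "org.example.settings.impl.SettingsCache"),
]

print(non_api_uses(api, uses))
```

Anything reported here is a client depending on internals, which is exactly the code the component owner would otherwise be free to change without BC constraints.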