Continuous availability: from the shift paradigm
to unmanned operation.
Pietro Tiberi
17 January 2018 – TIPS Contact Group
Agenda
1. Introduction
2. Continuous Availability
3. Results
4. Conclusions and perspective
Introduction: TIPS non-functional requirements - Reliability / Availability
o Transactions lost: RPO = 0
o Downtime: RTO = 15 minutes
o Availability: 99.9%
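As a rough illustration of what these targets imply, the arithmetic below converts the 99.9% figure into a yearly downtime budget and compares it with the 15-minute RTO. It assumes that 99.9% is an availability target measured over a 365-day year; the slide does not state the measurement window, and the function names are hypothetical.

```python
# Illustrative arithmetic only. The 365-day window and the reading of 99.9%
# as a yearly availability target are assumptions, not stated on the slide.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability: float,
                            window_minutes: int = MINUTES_PER_YEAR) -> float:
    """Maximum downtime allowed in the window for a given availability ratio."""
    return (1.0 - availability) * window_minutes


def max_full_length_outages(availability: float, rto_minutes: float) -> int:
    """Number of outages lasting the full RTO that fit in the yearly budget."""
    return int(downtime_budget_minutes(availability) // rto_minutes)


budget = downtime_budget_minutes(0.999)
print(f"Yearly downtime budget: {budget:.1f} min ({budget / 60:.2f} h)")
print(f"Outages of 15 min that fit: {max_full_length_outages(0.999, 15)}")
```

Under these assumptions the budget is about 525.6 minutes (8.76 hours) per year, which leaves room for 35 outages that each take the full 15-minute RTO.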
Introduction: Datacenter Operations
o Human based (on shifts)
o Unmanned
CONTINUOUS OPERATION
Continuous Availability: From high availability to continuous availability
o Redundancy
o Fault Tolerance
o Clustering
o Active-Active configuration
o Proactive monitoring
o Continuous delivery
o Automatic remediation
o Dynamic capacity management
Continuous Availability: Proactive Monitoring
o Infrastructure monitoring
o Application monitoring
o Detect events before failures
o Trigger automatic actions
o Analyze the event
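The detect, trigger, analyze cycle listed above can be sketched as a small threshold monitor. This is a minimal illustration, not the tooling used in the project: the class name `ProactiveMonitor`, the two threshold levels, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProactiveMonitor:
    """Hypothetical monitor that reacts before a hard limit is reached."""
    warning_threshold: float             # early-warning level (assumed value)
    failure_threshold: float             # level treated as a failure (assumed value)
    on_warning: Callable[[float], None]  # automatic action to trigger
    events: List[dict] = field(default_factory=list)  # kept for later analysis

    def observe(self, timestamp: float, value: float) -> str:
        """Classify one metric sample and trigger the action on a warning."""
        if value >= self.failure_threshold:
            state = "failure"
        elif value >= self.warning_threshold:
            state = "warning"
            self.on_warning(value)  # act while the service is still healthy
        else:
            state = "ok"
        if state != "ok":
            # Record the event so it can be analyzed afterwards.
            self.events.append({"t": timestamp, "value": value, "state": state})
        return state


# Usage with made-up vCPU load samples (percent).
actions: List[str] = []
monitor = ProactiveMonitor(
    warning_threshold=70.0,
    failure_threshold=95.0,
    on_warning=lambda v: actions.append(f"remediation requested at {v}"),
)
states = [monitor.observe(t, v) for t, v in enumerate([40.0, 65.0, 75.0, 80.0])]
print(states)
print(actions)
```

The warning level sits below the failure level so that the automatic action runs while the monitored component is still serving traffic.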
Continuous Availability: IT Automation
Continuous Availability: From Agile to DevOps
Continuous Availability: DevOps - Everything as Code
o Code
o Virtual Infrastructure
Continuous Availability: Dynamic Capacity Management
o Consumption trend analysis
o Resource utilization rate optimization
o What-if scenarios
o Predict future requirements and trends
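Trend analysis and what-if scenarios can be illustrated with a least-squares line fitted to past utilization samples. The sketch below is only one possible approach, and the slides do not say which method or tool was used. The function names, the monthly samples, and the 80% capacity limit are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (0, s0), (1, s1), ...; returns (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_exhaustion(samples: Sequence[float], capacity: float,
                             growth_factor: float = 1.0) -> Optional[float]:
    """Periods after the last sample until the trend reaches `capacity`.

    `growth_factor` scales the future growth rate for what-if scenarios.
    Returns None when the (scaled) trend is flat or decreasing.
    """
    slope, intercept = fit_trend(samples)
    fitted_now = intercept + slope * (len(samples) - 1)
    future_slope = slope * growth_factor
    if future_slope <= 0:
        return None
    remaining = capacity - fitted_now
    return max(remaining, 0.0) / future_slope


# Made-up monthly utilization samples (percent) and an assumed 80% limit.
usage = [50, 52, 54, 56, 58]
print(periods_until_exhaustion(usage, capacity=80))                     # current trend
print(periods_until_exhaustion(usage, capacity=80, growth_factor=2.0))  # what-if: growth doubles
```

With these sample values the fitted trend grows by 2 points per period, so the limit is reached in 11 periods at the current rate and in 5.5 periods if growth doubles.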
Test Plant Architecture
[Figure: test plant architecture diagram. Layers: Application Layer, Message Layer, Database Layer. Components: User A, User B, two Message Routers, Message Processor, Kafka Broker, Aerospike Database. Arrow labels: write, read, store, put, get.]
Results: Test Architecture
o Specific tests to verify the relevant domain functions,
executed on
o a common simulation layer that reproduces the real operational environment.
Results: Simulation – continuous delivery (1)
o Normal traffic condition (500 msg/s), timeout = 10,000 ms
o Kafka cluster rolling update
o 0 messages lost
o 0 timeouts expired
[Figure: SIMUL.APP.01, message latency (1 sec average)]
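The three quantities reported for this test (per-second average latency, messages lost, timeouts expired) could be derived from send and receive timestamps as sketched below. How the simulation layer actually collected them is not described in the slides. The record format and the `summarize` function are hypothetical, and only the 10,000 ms timeout comes from the slide.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

TIMEOUT_MS = 10_000  # timeout value taken from the slide

# Hypothetical record: (send time in ms, receive time in ms or None if never received).
Record = Tuple[int, Optional[int]]


def summarize(records: Iterable[Record],
              timeout_ms: int = TIMEOUT_MS) -> Tuple[Dict[int, float], int, int]:
    """Return (average latency per send-second, lost count, timeout count)."""
    buckets: Dict[int, List[int]] = defaultdict(list)
    lost = 0
    timeouts = 0
    for sent_ms, received_ms in records:
        if received_ms is None:
            lost += 1  # the message never arrived
            continue
        latency = received_ms - sent_ms
        if latency > timeout_ms:
            timeouts += 1  # arrived, but after the timeout expired
        buckets[sent_ms // 1000].append(latency)
    averages = {sec: sum(vals) / len(vals) for sec, vals in sorted(buckets.items())}
    return averages, lost, timeouts


# Made-up sample: four messages, the last one delayed beyond the timeout.
sample = [(0, 40), (500, 550), (1200, 1260), (1800, 12000)]
averages, lost, timeouts = summarize(sample)
print(averages, lost, timeouts)
```

In this sketch a late message counts as a timeout but not as lost, which matches the way the slides report the two figures separately.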
Results: Simulation – continuous delivery (2)
o Heavy traffic condition (2000 msg/s), timeout = 10,000 ms
o Kafka cluster rolling update
o 0 messages lost
o Some timeouts expired
[Figure: SIMUL.APP.02, message latency (1 sec average)]
(Slide footer: 07 November 2017 – CMG Impact 2017)
Results: Simulation – proactive monitoring
o Normal traffic condition (500 msg/s)
o Average E2E processing time = 45 ms
o High vCPU load added to Message Processor nodes
o T0-T1: below threshold
o T2-T3: threshold exceeded
Conclusions and perspective
o Phased Approach
o Bi-modal Data Center
o Tool
Pietro Tiberi ([email protected])
Thanks for your attention