LHCb Continuous Integration and Deployment System

A message based approach

S.-G. Chitic, B. Couturier, M. Clemencic, J. Closier on behalf of the LHCb collaboration

CHEP 2018, Sofia, Bulgaria

12/07/2018 LHCb Continuous Integration and Deployment System 2

Introduction

Distributed Continuous Integration System

Deployment System

Conclusion

Why the need for a complex system?

• AFS is being phased out, so users need a centralized location where the (nightly) builds are installed

• CVMFS has been chosen, BUT deployment is laborious and slow:

  • Each installation needs to be done on the Stratum0

  • Each file modification needs to be included in a transaction

  • Even with the current infrastructure, the Stratum0 server is busy all day for approx. 220 Gb of installations

  • Transactions need to be serialized: no parallel transactions can co-exist
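
The transaction constraint above can be illustrated with a minimal Python sketch. It assumes the standard `cvmfs_server transaction` / `publish` / `abort` commands; the repository name, the helper name `install_in_transaction` and the in-process lock are illustrative assumptions, not the LHCb implementation (a real setup would need serialization across processes, not only threads).

```python
import subprocess
import threading

REPO = "example.cern.ch"  # hypothetical repository name

# One lock for the whole Stratum-0: transactions cannot overlap.
_transaction_lock = threading.Lock()


def install_in_transaction(install_step, repo=REPO, run=subprocess.check_call):
    """Run install_step() inside a single CVMFS transaction.

    `run` executes a command given as a list of strings; it is injectable
    so the flow can be exercised without a CVMFS server.
    """
    with _transaction_lock:
        run(["cvmfs_server", "transaction", repo])
        try:
            install_step()
        except Exception:
            # Discard the half-written changes instead of publishing them.
            run(["cvmfs_server", "abort", "-f", repo])
            raise
        run(["cvmfs_server", "publish", repo])
```

Because every file change has to go through this open/publish cycle and only one cycle can be open at a time, the order in which installations are fed to the Stratum-0 directly determines how quickly each one becomes visible.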

Distributed Continuous Integration System

General architecture

[Figure: general architecture diagram. Components: build servers, test servers, periodic scheduler, reporting dashboard front-end, performance testing*, STRATUM-0, STRATUM-1. Labelled flows: commit code, pull commits, trigger builds, trigger tests, save build results, save test results, notify build completed, trigger build installation, results reporting.]

* Poster 271. Improvements to the LHCb software performance testing infrastructure using message queues and big data technologies - M.P. Szymanski, B. Couturier

Why RabbitMQ?

• Multi-protocol support: AMQP, MQTT, etc.

• Reliability: persistence, delivery ACK and high availability

• Flexible routing

• Management UI

• Clustering and federation: already tested for our usage

• Plugin system and community-supported libraries for different programming languages (e.g. pika for Python)

AMQP Protocol

• Network wire-level protocol

  • Defines how clients and brokers talk

  • Data serialization (framing)

  • Heartbeat

  • Hidden in client libraries

• AMQP Model

  • Defines how messages are routed and stored

  • Defines the rules for how these are wired together

  • Exported API
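
To make the wire-level part concrete, here is a small sketch of the AMQP 0-9-1 general frame layout (type, channel, payload size, payload, end marker) and of the heartbeat frame. It is only an illustration of what client libraries such as pika hide; the function names are made up for this example.

```python
import struct

# Frame types and end marker as defined by AMQP 0-9-1.
FRAME_METHOD, FRAME_HEADER, FRAME_BODY, FRAME_HEARTBEAT = 1, 2, 3, 8
FRAME_END = 0xCE


def encode_frame(frame_type, channel, payload=b""):
    """Serialize one frame: type (1 byte), channel (2), size (4), payload, end marker."""
    header = struct.pack(">BHI", frame_type, channel, len(payload))
    return header + payload + bytes([FRAME_END])


def decode_frame(data):
    """Parse one frame and return (frame_type, channel, payload)."""
    frame_type, channel, size = struct.unpack(">BHI", data[:7])
    if len(data) < 8 + size or data[7 + size] != FRAME_END:
        raise ValueError("truncated or malformed frame")
    return frame_type, channel, data[7:7 + size]


# A heartbeat is an empty frame sent on channel 0.
HEARTBEAT = encode_frame(FRAME_HEARTBEAT, 0)
```

Both peers send such heartbeat frames periodically, so a dead connection is noticed even when no messages are flowing.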

RabbitMQ usage in LHCb CI System

• Used as a message bus between the different components of the system

• Decouples message producers from consumers on different nodes at different stages in the system

• Provides persistent queues

• Allows for message prioritization

• Easily used and managed with pika in Python
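
As a rough sketch of how persistent queues and message prioritization look from Python, the helpers below declare a durable priority queue and publish a persistent, prioritized message. The queue name, the number of priority levels and the message fields are assumptions for illustration; `channel` is expected to behave like a pika channel, and `make_properties` would be `pika.BasicProperties` in real use.

```python
import json

QUEUE = "cvmfs.install"  # hypothetical queue name
MAX_PRIORITY = 10        # hypothetical number of priority levels


def declare_install_queue(channel):
    """Declare a durable queue that honours per-message priorities."""
    channel.queue_declare(
        queue=QUEUE,
        durable=True,                                # queue survives a broker restart
        arguments={"x-max-priority": MAX_PRIORITY},  # enables RabbitMQ priority queues
    )


def publish_build_ready(channel, make_properties, build, priority):
    """Publish a 'build ready' message; higher priorities are delivered first."""
    body = json.dumps(build, sort_keys=True).encode("utf-8")
    channel.basic_publish(
        exchange="",        # the default exchange routes by queue name
        routing_key=QUEUE,
        body=body,
        # delivery_mode=2 asks the broker to persist the message to disk.
        properties=make_properties(delivery_mode=2, priority=priority),
    )
    return body
```

With pika, `channel` would typically come from `pika.BlockingConnection(...).channel()`. Keeping the channel as a parameter also means the producer code does not need to know which node, or which stage of the system, eventually consumes the message.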

Deployment System

[Figure: deployment system diagram. Components: Continuous Integration agent, RabbitMQ, CVMFS (Connector, CVMFS Logger, CVMFS Executer), CERN IT Monitoring Gateway. Labelled flows: notify build ready, consume build ready, prioritized builds installation, gets jobs and returns results, sends messages to log, sends stats for IT monitoring, stats to Kibana.]
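
The consumer side of the diagram (get a job, execute it, report the result and statistics) could look roughly like the sketch below. The queue names `install.errors` and `install.stats`, the message fields and the `execute` / `publish` callables are all hypothetical; they stand in for the executer, logger and monitoring links shown in the figure.

```python
import json
import time


def handle_install(body, execute, publish):
    """Process one 'build ready' message and report the outcome.

    execute(job) performs the installation and raises on failure;
    publish(queue_name, message) sends a JSON-serializable dict.
    """
    job = json.loads(body)
    started = time.monotonic()
    try:
        execute(job)
    except Exception as exc:
        # Failed installations go to a separate queue for later inspection.
        publish("install.errors", {"job": job, "error": str(exc)})
        ok = False
    else:
        ok = True
    # Timing information that a monitoring gateway could forward.
    publish("install.stats", {"job": job, "ok": ok, "seconds": time.monotonic() - started})
    return ok
```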

Priority policy

• Needed because of the "burst" effect from the build servers

• Allows more important components to be installed first

• Ordering is transparent for the CVMFS installer

• The order can be changed during a day's installation through the Continuous Integration Agent
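
The policy can be pictured with a toy in-memory model: highest priority first, arrival order among equals, and the possibility of changing a waiting job's priority. This is a sketch of the behaviour only, not of RabbitMQ internals or of the LHCb agent; the class name and the project names in the usage note are illustrative.

```python
import heapq
import itertools


class InstallQueue:
    """Toy model: highest priority first, FIFO among equal priorities."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves arrival order
        self._live = {}                  # job name -> its current heap entry

    def push(self, job, priority):
        """Add a job, or change its priority if it is already waiting."""
        if job in self._live:
            self._live[job][-1] = None   # mark the old entry as cancelled
        entry = [-priority, next(self._order), job]
        self._live[job] = entry
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Return the next job to install; the installer never sees priorities."""
        while self._heap:
            _, _, job = heapq.heappop(self._heap)
            if job is not None:
                del self._live[job]
                return job
        raise IndexError("no pending installations")
```

For example, after `push("Gaudi", 5)`, `push("DaVinci", 1)`, `push("LHCb", 5)` and a later `push("DaVinci", 9)`, successive `pop()` calls yield DaVinci, Gaudi, LHCb: the installer simply takes the next job and the ordering stays transparent to it.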

Flexibility

• For a better distribution of installations:

  • First, install all the components for the most important platform for all the software projects

  • Afterwards, install on a per-software-project priority basis

• Smaller installations are propagated faster

• Possibility of injecting / removing installations

• Possibility of reordering the installations

• Better management of installation errors using a separate queue in RabbitMQ
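
The two-stage ordering described on this slide can be written as a sort key. The sketch below is one possible reading of it, with hypothetical names (`Install`, `installation_order`, `project_rank`) and made-up platform labels in the usage note.

```python
from typing import NamedTuple


class Install(NamedTuple):
    project: str
    platform: str


def installation_order(installs, main_platform, project_rank):
    """Order installs: the main platform for every project first, then by project rank.

    project_rank maps a project name to its rank (lower installs earlier);
    unknown projects go last.
    """
    last = len(project_rank)

    def key(item):
        return (
            item.platform != main_platform,        # False sorts before True
            project_rank.get(item.project, last),
        )

    return sorted(installs, key=key)  # sorted() is stable, so ties keep arrival order
```

With `main_platform="p1"` and `project_rank={"A": 0, "B": 1}`, the installs B/p2, A/p2, B/p1, A/p1 come out as A/p1, B/p1, A/p2, B/p2. Splitting the work into one small installation per (project, platform) pair is what makes this kind of reordering possible.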

Future work

• Remove the single point of failure by using a cluster of messaging nodes

• Take advantage of the new message bus:

  • Notify other distributed components

  • Inform users about the status of the system

• Improve the scalability of the system

• Improve the system's monitoring and error management

Conclusion

• End-to-end continuous integration and deployment system

• Decoupled components on different nodes using a messaging bus (RabbitMQ)

• Flexible installation system for an otherwise laborious and slow task

• Complex system, but easily developed and monitored with Python

• New opportunities opened up by the new messaging bus
