Building a Modern Enterprise SOA at LinkedIn
Agenda
Building Code at LinkedIn
Product Development with Multiproduct
Build Automation with Gradle
A Peek at the Future
©2013 LinkedIn Corporation. All Rights Reserved. 2
Building Code at LinkedIn
In the beginning there was Network
Single, relatively homogeneous code base
– Build from source, little dependency management
Java, Spring, Ant, …
JavaScript, HTML, JSPs, CSS, …
JARs, WARs, Jetty, Tomcat, …
Then everything went exponential
Number of developers, programming languages, build systems, frameworks, lines of code, servers, users, page views, …
… and pretty much everything else
Hypergrowth isn’t all fun and games
Acquisitions came in with their own technology and processes
Releases became more and more painful
Productivity and stability suffered
Scaling a software development organization
Requires sensible code and dependency management
– Source code APIs
– Service APIs
– Versioned dependencies
Performance is king
– Iterative improvements
Divide and conquer
– Split and isolate failures
Code Isolation
Network Trunk Development
All development in Network shifted to trunk
– No branches, no merging
Continuous releases from trunk
– Deploy multiple times per day
Work started on break up and clean up
– Migrating build logic to Gradle
Product Development with Multiproduct
Traditional Software Development
Use a well-established technology stack
– Homogeneity => Simplicity
To adopt a new technology:
– Requires “out of the box” thinking and effort
– Do a proof-of-concept implementation
– Present to decision makers to demonstrate ROI => get approval
Slow-moving by design
– New technology integration expensive
– Top-down management decisions used as barrier
LinkedIn Software Development
We don’t ever want to be in the box
– Technical experimentation and diversity encouraged
– Living on the bleeding edge, often defining it
The Need for Speed
– Pace of iteration
– Automation (the human is slow)
– Continuous delivery
Build versus buy
Chaos Theory
Multiproduct
Toolset that is architected for a heterogeneous technology world
Agnostic to version control, build system and programming stack
– Future-proof
Abstracts common software tasks
– For example: “build”, “test”, “release”
Provide a default implementation, allow users to override
– e.g. Gradle w/ LI plug-ins
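The "default implementation, allow users to override" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Product` class, the `DEFAULT_TASKS` table, and the returned strings are all hypothetical and are not Multiproduct's actual API.

```python
from typing import Callable, Dict, Optional

TaskImpl = Callable[[str], str]


def _default(task: str) -> TaskImpl:
    """Build a placeholder default implementation for a named task."""
    def impl(product: str) -> str:
        return f"default {task} for {product}"
    return impl


# Hypothetical defaults shared by every product (names are illustrative).
DEFAULT_TASKS: Dict[str, TaskImpl] = {
    task: _default(task) for task in ("build", "test", "release")
}


class Product:
    """A product that inherits default tasks and may override any of them."""

    def __init__(self, name: str, overrides: Optional[Dict[str, TaskImpl]] = None):
        self.name = name
        self.overrides = dict(overrides or {})

    def run(self, task: str) -> str:
        # A product-specific override wins; otherwise fall back to the default.
        impl = self.overrides.get(task) or DEFAULT_TASKS.get(task)
        if impl is None:
            raise KeyError(f"unknown task: {task}")
        return impl(self.name)


# Usage: one product keeps the defaults, another swaps in its own "build".
standard = Product("search-api")
custom = Product("legacy-app", overrides={"build": lambda name: f"custom build for {name}"})
print(standard.run("build"))
print(custom.run("build"))
print(custom.run("test"))
```

Callers only ever ask for "build", "test", or "release"; which build system sits behind the name is a per-product detail.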
Key Concepts
Elevate tooling from artifact to product level
Metadata ties the tooling together
– Ivy
– Version and Build specification
Pluggable implementation of subsystems
Version management
Continuous automated delivery
[Diagram: Source Control, Build System, Deployment]
Version Management
End-of-life dates
– Graceful deprecation and upgrades
Push version upgrades to consumers
Dependency Reports
– What products depend on me
– What products do I depend on
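The two dependency reports above are the forward and reverse views of one product-level graph. A minimal sketch, assuming the graph is available as a simple mapping (the product names and the `DEPENDS_ON` table are made up for illustration):

```python
from collections import deque
from typing import Dict, List

# Hypothetical data: product -> products it depends on directly.
DEPENDS_ON: Dict[str, List[str]] = {
    "profile-frontend": ["profile-service", "common-utils"],
    "profile-service": ["common-utils"],
    "common-utils": [],
}


def dependencies_of(product: str, graph: Dict[str, List[str]]) -> List[str]:
    """'What products do I depend on' (direct dependencies only)."""
    return sorted(graph.get(product, []))


def dependents_of(product: str, graph: Dict[str, List[str]]) -> List[str]:
    """'What products depend on me' (direct consumers only)."""
    return sorted(p for p, deps in graph.items() if product in deps)


def transitive_dependents(product: str, graph: Dict[str, List[str]]) -> List[str]:
    """Everything that could be affected by pushing an upgrade of `product`."""
    seen = set()
    queue = deque([product])
    while queue:
        current = queue.popleft()
        for consumer in dependents_of(current, graph):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)


print(dependencies_of("profile-frontend", DEPENDS_ON))
print(dependents_of("common-utils", DEPENDS_ON))
print(transitive_dependents("common-utils", DEPENDS_ON))
```

The transitive view is the one that matters when pushing a version upgrade to consumers, since an upgrade can ripple past the direct dependents.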
Push My Upgrade
Tracking Upgrades
Continuous Delivery in Multiproduct
Automated pipeline triggered on developer change
– No other developer action needed
Publishing 10,000+ artifacts per day for 300+ products
– Mean time for a good commit: ~10 minutes
– Mean time counting failures: ~25 minutes
Continuous Delivery
Build Automation with Gradle
Why LinkedIn uses Gradle
Dependency resolution engine
Rich plug-in system w/ real programming language
– DSL has a steep learning curve, but is powerful
Visions align
– Automation
– Continuous delivery
LinkedIn Gradle plug-ins
Customize built-in plug-ins for LinkedIn’s environment
– e.g. Java, Scala, War, FindBugs, Cobertura
Add custom artifact types
– For example database patches, static content, and Hadoop workflows
Create metadata for publishing and deployment tooling to consume
Elevate concepts from artifact to product level
Dependency Graph powered by Gradle
End-of-life enforcement
Our own gradlew: ligradle
Our own custom Gradle wrapper
Provisions Gradle and plug-ins
– Allows each product to define versions it uses
Provides lifecycle management for Gradle and plug-ins
– End-of-life
Usage data
– Used to track usage and discover problems
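End-of-life enforcement in a wrapper like this could look roughly like the following. It is a sketch under assumptions, not ligradle's actual logic: the `EOL_DATES` table, the version numbers and dates in it, the 30-day warning window, and the three result strings are all invented for illustration.

```python
from datetime import date, timedelta
from typing import Dict, Tuple

# Hypothetical end-of-life schedule: (tool, version) -> last supported day.
EOL_DATES: Dict[Tuple[str, str], date] = {
    ("gradle", "1.4"): date(2013, 6, 30),
    ("gradle", "1.6"): date(2014, 1, 31),
}


def check_version(tool: str, version: str, today: date,
                  warn_window: timedelta = timedelta(days=30)) -> str:
    """Classify a product's declared tool version as 'ok', 'warn', or 'fail'."""
    eol = EOL_DATES.get((tool, version))
    if eol is None:
        return "ok"    # no end-of-life date scheduled for this version
    if today > eol:
        return "fail"  # past end-of-life: refuse to run the build
    if eol - today <= warn_window:
        return "warn"  # close to end-of-life: build runs, developer is nudged
    return "ok"


print(check_version("gradle", "1.4", date(2013, 1, 1)))
print(check_version("gradle", "1.4", date(2013, 6, 15)))
print(check_version("gradle", "1.4", date(2013, 7, 1)))
```

A warning window before the hard failure is one way to get the graceful deprecation mentioned under version management; whether ligradle warns before failing is not stated in the slides.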
Usage Data
Source vs Binary Dependencies
Source offers incremental updates and flexibility intra-product
Binary offers stability and speed inter-product
There’s no right answer, but there are plenty of wrong answers!
Network migration to Gradle
3,600 build.xml files to convert
– Many of them with custom logic
300 developers to train
Performance targets
– 2x speed-up for clean builds
– 5x speed-up for incremental builds
Proof of Concept
Migrated 1,100 modules
Tested single large build versus isolated segments
– Single large build simpler
Scale feasible
– Requires scalability and performance work in Gradle core
Gradle features required for migration
Configuration on demand
– Only configure the task graph you need
Refactored cache logic for performance
– Task history
– Dependency descriptors
Candidate performance improvements
– Parallel configuration
– Daemon stores project model
– Daemon performs continuous up-to-date checks
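The idea behind configuration on demand can be illustrated with a small graph walk: configure only the requested project and what it transitively depends on, rather than every project in the build. This is a conceptual sketch, not Gradle's implementation; the project paths and the `PROJECT_DEPS` table are hypothetical.

```python
from typing import Dict, List

# Hypothetical multi-project build: project path -> projects it depends on.
PROJECT_DEPS: Dict[str, List[str]] = {
    ":app": [":lib-a"],
    ":lib-a": [":lib-core"],
    ":lib-b": [":lib-core"],
    ":lib-core": [],
}


def projects_to_configure(requested: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return the requested project plus everything it transitively depends on."""
    ordered: List[str] = []
    stack = [requested]
    while stack:
        project = stack.pop()
        if project in ordered:
            continue  # already scheduled for configuration
        ordered.append(project)
        stack.extend(deps[project])
    return ordered


needed = projects_to_configure(":app", PROJECT_DEPS)
skipped = sorted(set(PROJECT_DEPS) - set(needed))
print("configure:", needed)
print("skip:", skipped)
```

Here `:lib-b` is never configured when only `:app` is requested. With thousands of modules, skipping the unrelated ones is where the savings come from. In Gradle itself the feature is switched on with the `--configure-on-demand` command-line flag or `org.gradle.configureondemand=true` in `gradle.properties`.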
Daemon Heap Usage
Project Timeline – 1 year
Q1: Proof-of-concept and prep work
Q2: Implementation
Q3: Roll-out
Q4: Clean-up
A Peek at the Future
Gradle Features
Faster
– Use the daemon effectively in development and CI
– Intra-project parallel execution
More scalable
– Heap usage
Ease of use
– IDE integration
More Multiproduct Intelligence
Analytics
– Common exceptions
– Usage and error patterns
Verification suite
– CheckStyle, FindBugs and Cobertura
Automation and Integration
Ease of use
Distributed Build Automation
Distribute build and testing on a cluster
Automatic provisioning
Artifact sharing at scale