Continuous Integration and Delivery for the HERE Mobile Apps
How we deliver mainline to millions of Android and iOS users every other week.
DevOps Meetup Berlin
Stefan Verhoeff, 28 September 2016
@stefan_verhoeff
All opinions expressed in this deck are the author's own and do not necessarily represent the official view of HERE
Intro
• Hi, I’m Stefan
• I run a team called Wookiee, who build and run CI/CD for Android and iOS
• We build the HERE WeGo App, give it a try ;-)
HERE WeGo App
Agenda
• CI/CD for Mobile?
• Our approach
– Key objectives + metrics
– Pipelines
– Platform architecture
– Testing
– Dashboards
– Releasing
• Challenges & plans
• Q&A
DevOps + CI/CD
• Who here does CI/CD? On mobile?
• Wikipedia says:
– Continuous Integration: Continuous integration (CI) is the practice of merging all developer working copies to a shared mainline several times a day.
– Continuous Delivery: Continuous delivery (CD) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time. It aims at building, testing, and releasing software faster and more frequently.
DevOps + CI/CD
• CI is feedback to the developer– Did I build it right?
– Did I break something?
• CD is feedback + delivering value to the user
– Did we build the right thing?
– Did the changes cause any issues?
State of Mobile CI/CD
• We know about CI/CD for web and cloud, what is different for Mobile?
• Delivery = App Store / Play Store -> Brick wall
• You can still keep your App shippable at all times, even if you don’t ship every build
Challenges of Mobile CI/CD
• Dealing with many devices and OS versions
• App store approval cycle takes time. Apple, looking at you!
• High cost of failure. Can’t roll back easily.
• CI tools for Mobile lacking in the past, but starting to pop up lately.
CI related tools on Mobile
• Platforms
– Google Firebase, Fabric, AWS Mobile
• App store automation
– Fastlane
• Distribution
– HockeyApp, TestFlight, Play Store channels
• Cloud devices
– Amazon Device Farm, Google Test Cloud, Bitbar TestDroid
• Cloud CI
– Travis, CircleCI, Bitrise, Build Buddy
Monolith, anyone?
Objectives + key metrics
• Maximize developer feedback
– Coverage: lines + test cases > 80%
– Speed: Pre-Submit feedback time < 15 min
– Reliability: share of the day the pipeline is not red > 90%
• Maximize user delight
– App Store rating > 4.5 stars
– NPS > 30
– Bug limit: < 30 open bugs + SLA
– Critical bugs in production: < 5 per year
• Visual dashboards to track
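A check of measured values against targets like these could look roughly like the sketch below. It is only an illustration: the metric names, the `Target` class and the `evaluate` function are hypothetical, and only a subset of the targets from the slide is included.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Target:
    name: str                # hypothetical metric name, not from the deck
    threshold: float         # target value taken from the slide
    higher_is_better: bool   # True for "> threshold", False for "< threshold"

    def met(self, value: float) -> bool:
        # Strict comparison, matching the ">" and "<" on the slide.
        if self.higher_is_better:
            return value > self.threshold
        return value < self.threshold


TARGETS = [
    Target("coverage_percent", 80, True),
    Target("presubmit_feedback_minutes", 15, False),
    Target("app_store_rating", 4.5, True),
    Target("nps", 30, True),
    Target("open_bugs", 30, False),
    Target("critical_production_bugs_per_year", 5, False),
]


def evaluate(measurements: Dict[str, float]) -> Dict[str, bool]:
    """Return, per measured metric, whether its target is currently met."""
    return {
        t.name: t.met(measurements[t.name])
        for t in TARGETS
        if t.name in measurements
    }
```

A dashboard job could call `evaluate` with the latest numbers and colour each tile by the result.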
HERE implementation of CI
Pre-submit Testing
• Improves code quality
• Enables cross-team collaboration
To make the shared “Mainline” work, we introduce 2 enablers
Code Review
• Protects the mainline
• Provides direct feedback on each change
One Mainline per Product
• Reduces time to market
• Creates responsibility
• Creates focus
• Improves predictability
Code reviews through Gerrit
• Double combo:
– Pre-submit: CI feedback isolated from other developers
– Review: Peer developer feedback
• Reviews help knowledge sharing across teams
Code Review and Pre-submit Testing
[Figure: change workflow between the developer's machine and the code hosting server]
1. Download code
2. New change created locally (create/modify files)
3. Upload change for review with peers
4. Pre-submit: automated tests + code review
5.a Not accepted: rework
5.b Accepted: submit
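The decision at the end of this flow (rework in 5.a or submit in 5.b) can be sketched as a small function. This is an illustration only, not how Gerrit implements it: `Change`, `Action` and `next_action` are hypothetical names, and the extra "wait" state for unfinished tests or reviews is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    WAIT = "wait"      # assumption: tests or review not finished yet
    REWORK = "rework"  # step 5.a
    SUBMIT = "submit"  # step 5.b


@dataclass
class Change:
    presubmit_passed: Optional[bool] = None  # None while tests are running
    review_approved: Optional[bool] = None   # None while review is pending


def next_action(change: Change) -> Action:
    """A change reaches the mainline only with green pre-submit AND approval."""
    if change.presubmit_passed is False or change.review_approved is False:
        return Action.REWORK
    if change.presubmit_passed and change.review_approved:
        return Action.SUBMIT
    return Action.WAIT
```

Either signal alone is not enough: a change that passes pre-submit but has no approval yet keeps waiting.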
HERE CI stages
[Figure: CI stages on the mainline: Pre-submit Verification, Submit Verification, Longer Verification (< 1h), Verification (> 1h), Optional Manual Tests, Can be released]
Why we have stages
Platform architecture
Our device labs
Testing
[Figure: test pyramid]
• Levels: Unit tests class level, Unit tests component level, UI test with mocks, UI test end to end, Exploratory + Regression + Compatibility, Drive in the field
• Size labels on the figure: 1 day, 2 days, 100s, 500+, 100s, 1000s
Avoiding flaky tests
• Flaky tests are a huge source of waste, we fight them with fire
• Mocking dependencies and network
• Moving tests down to component level
• Re-run after failing
• Stable testing culture. Google’s Testing on the Toilet series
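"Re-run after failing" can be sketched as below. This is a minimal illustration under assumptions: the function name `run_with_rerun`, the default of two attempts and the three result labels are hypothetical, not taken from the HERE setup.

```python
from typing import Callable


def run_with_rerun(test: Callable[[], None], max_attempts: int = 2) -> str:
    """Run a test, re-running it after a failure.

    Returns "passed" if the first attempt succeeds, "flaky" if a later
    attempt succeeds, and "failed" if every attempt fails.
    """
    failures = 0
    for _ in range(max_attempts):
        try:
            test()
        except AssertionError:
            failures += 1
            continue
        return "passed" if failures == 0 else "flaky"
    return "failed"
```

Reporting "flaky" separately from "passed" keeps the re-run from hiding the instability, so the test can still be fixed or pruned.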
Visualizing test results – history matrix
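A history matrix has one row per test and one column per build. The sketch below renders such a matrix as text and counts pass/fail flips as a rough flakiness signal. The function names and the flip count are illustrative assumptions, not the actual HERE dashboard.

```python
from typing import Dict, List


def history_matrix(results: Dict[str, List[bool]]) -> str:
    """Render one row per test, one column per build: '.' pass, 'F' fail."""
    width = max((len(name) for name in results), default=0)
    rows = []
    for name in sorted(results):
        cells = "".join("." if ok else "F" for ok in results[name])
        rows.append(f"{name.ljust(width)} {cells}")
    return "\n".join(rows)


def flips(outcomes: List[bool]) -> int:
    """Number of pass/fail transitions between consecutive builds."""
    return sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)
```

A test that flips often without related code changes is a candidate for the flaky list.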
Test status dashboard
Dashboards
• Gathering data and creating dashboards is probably one of the best things we've done to grow awareness
• Key metrics: these are in everyone's (dev + ops) annual objectives!
– Coverage
– Performance
– Stability
• Graphite + Grafana
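Feeding a CI metric into Graphite can be as small as the sketch below, which uses Graphite's plaintext protocol (one `path value timestamp` line per data point, port 2003 by default). The metric path and the helper names are hypothetical, and the deck does not say which ingestion method is used at HERE.

```python
import socket
import time
from typing import Optional


def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one data point for Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"{path} {value} {ts}\n"


def send_metric(host: str, line: str, port: int = 2003) -> None:
    """Send a formatted line to a Carbon plaintext listener over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

For example, a build job could format `graphite_line("ci.presubmit.duration_seconds", 840)` at the end of a run and pass the result to `send_metric`.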
CI performance dashboard
CI stability dashboard
Team dashboard
Jenkins wall monitor
Keeping CI stable
• Wall monitor and team dashboard visible in area
• Stop the line if bug limit reached
• Build Police: rotating developer who spots failures and finds who/what caused it
• HERO*: CI Engineer available on chat for support
* Helpful Emergency Responsible Operator
Releasing
• Release train
– Ship on a regular schedule
– Always release master*
– New features hidden behind flag
• Gradual rollout
– Activate features without App Store release!
– Roll out percentage wise to measure impact
– Works great combined with A/B testing
– Reduces risk and enables learning
* We do create a release branch but just to avoid regressions while we manually test
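Percentage-wise rollout of a flagged feature is often done with stable hash bucketing. The sketch below shows that general technique; it is an assumption for illustration, not a description of how HERE implements its flags, and `rollout_bucket` / `is_enabled` are hypothetical names.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket in 0..99, independently per feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(user_id: str, feature: str, percentage: int) -> bool:
    """Enable the feature for roughly `percentage` percent of users."""
    return rollout_bucket(user_id, feature) < percentage
```

Because the bucket is stable, raising the percentage only adds users; nobody who already has the feature loses it, which keeps impact measurements comparable.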
How we release
• Push to App Store / Play Store still manual but automation planned
• Weekly: Alpha internal release
• Every other week
– Beta
– Production
• Release checklist
– Comms, legal, privacy, platform, ...
• Release results tracking
Release schedule
Production feedback
• Metrics: analytics, logs and crash data
• User feedback, NPS, store reviews
CI as Code
• Jenkins Job DSL plugin
• Treat it as code:
– Version control
– Abstractions, re-use and refactoring
– Tests
– CI for CI code (infinite recursion error)
– Code reviews
– Local testing
// Generate one nightly pipeline per branch (master and release).
[
    Constants.Branches.MASTER,
    Constants.Branches.RELEASE
].each { branch ->
    def pipelineBuilder = new NightlyPipelineBuilder(this)
    pipelineBuilder.with {
        defaultConfiguration([
            client      : Constants.Description.Contacts.WOOKIEE,
            pipelineType: Constants.Description.PipelineType.NIGHTLY
        ])
        baseFolders([Constants.Folders.PROJECT, branch, Constants.Folders.NIGHTLY])

        // Trigger on the manifest branch that matches the pipeline branch.
        triggerManifestRepoBranch = branch == Constants.Branches.MASTER ? 'master' : 'release/heresuite'
        triggerManifestRepoFile = 'heresuite-android.xml'
        triggerPublishArtifacts = ['manifest.xml']
        defaultBuildJobName =
            "${Constants.Folders.PROJECT}/${branch}/${Constants.Folders.NIGHTLY}/" +
            'build_heresuite-android_arm_dev_debug_play'
        addTrigger()

        // Pass the branch on to the build and test phases.
        buildPhase.predefinedParams += [BRANCH: branch]
        testPhase.predefinedParams += [BRANCH: branch]

        addBuildJobs()
        addTestEndurance()
        addTestEndurance10()
        addTestOfflineWithRerun()
        addTestUnitTestCoverage()
    }
}
Challenges & solutions
• Long release cycles for App -> Move to release train + gradual rollout
• Long SDK integration cycle -> Early manual testing and connected CI systems
• Constant bug fighting in App teams -> Tech improvements, refactoring, test automation, RCA
• Flaky tests -> Visualize, prune test suite, mock, component tests
• No insight into state of CI KPIs -> Collect metrics and create dashboards
• Manual job maintenance hell -> CI as Code
• Taking action on failures -> Visible screens, HERO and build police roles
• Device maintenance and capacity -> Buy more hardware, optimize use, move to cloud
Next plans / on the radar
• More tests for CI code
• Jenkins Pipeline plugin
• Fastlane for iOS
• App Store / Play Store automation
• TestDroid cloud testing
• Multiplatform end to end tests
Thank you!
Bonus
Pipelines for Mobile
• The stages of our CI system
– Pre-Submit
– Post-Submit
– Hourly
– Nightly
Mobile specific concerns
• Signing certificates
• Distributing Dev and Beta builds
– HockeyApp
– TestFlight App Store
– Google Play Store Alpha/Beta channel
• App Store / Play Store automation
CI node dashboard
Device lab dashboard
CI building blocks
• Builds
• Static Analysis
• Unit tests
• Functional test
Test types and life cycle
• Gateway
– Set of representative tests run in Pre-Submit
– Very stable and fast
• Hourly
– Big set of regression tests. They take an hour to run, so they run on a timer
• Unproven
– New tests first need to prove that they are stable enough before being added to the hourly test set
• Unstable
– Known unstable tests. Either fix or delete
• End to end
– All of the above tests use mocks, but there is also a set of end-to-end tests. These are more brittle, so maintenance costs are high. They do find more issues.
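The life cycle above can be sketched as a promotion/demotion rule. Everything concrete here is an assumption made for illustration: the `Suite` and `next_suite` names, the threshold of 20 consecutive passes, and the premise that the outcomes come from runs on known-good code, so a failure points at the test rather than the app.

```python
from enum import Enum
from typing import List


class Suite(Enum):
    GATEWAY = "gateway"
    HOURLY = "hourly"
    UNPROVEN = "unproven"
    UNSTABLE = "unstable"


def next_suite(current: Suite, outcomes: List[bool], required_passes: int = 20) -> Suite:
    """Decide where a test belongs after its most recent runs.

    outcomes: pass/fail results, oldest first, on known-good code.
    """
    recent = outcomes[-required_passes:]
    if current is Suite.UNPROVEN:
        # A new test must prove itself with an unbroken run of passes.
        proven = len(recent) == required_passes and all(recent)
        return Suite.HOURLY if proven else Suite.UNPROVEN
    if current is Suite.HOURLY and not all(recent):
        # A failure on known-good code marks the test as unstable.
        return Suite.UNSTABLE
    # Gateway membership and leaving "unstable" (fix or delete) stay manual.
    return current
```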
Non-functional tests
• Performance KPI
• Monkey
• Power consumption
• Memory