expedia 3x3 presentation

Post on 07-Jan-2017

TRANSCRIPT

3x3: Speeding Up Mobile Releases

Drew Hannay

Keqiu Hu

Jingjing Sun

Project Voyager

● New version of flagship LinkedIn app

● 250+ committers across Android & iOS

● ~1 year of development

● Investment in mobile infrastructure at LinkedIn

Before Voyager

● 12 releases per year

● RC build + manual regression suite

● Mad dash to commit code before the RC cutoff

○ Missing the cutoff meant a long wait for the next release

● Product & marketing plans were made around the monthly releases

● Hard to iterate on member feedback

3x3

Release three times per day, with no more than three hours from code commit to member availability

Why three hours?

● Not enough time for manual testing steps

● Not enough time to test everything

○ The goal isn’t 100% automation, it’s faster iterations

○ We don’t want engineers spending most of their time maintaining tests that break whenever a design changes

● UI tests are chosen based on production-critical business flows

● Faster iteration helps emphasize craftsmanship

○ Devs can take the extra time to write quality code since the next release is soon

Commit Pipeline

[Diagram: Feature Development → Code Review → Static Analysis → Unit Tests → Build Release Artifacts → UI Tests → Alpha Release → Beta Release → Production Release]


Static analysis

● Compile-time contract with API server using Rest.li

○ Rest.li data templates are shared between API server & clients

○ Provides static analysis checks that guarantee backwards compatibility

○ Client models are code generated for further safety

● Java Checkstyle

● Android Lint

○ Over 200 checks provided by Google

○ Several custom checks written for LinkedIn-specific patterns

● Swift Lint

○ Forked version of Realm’s SwiftLint

○ Added custom checks for LinkedIn patterns

Building the code

● Over 500k lines of code between Android & iOS

● Building production binaries for a large codebase is slow

● iOS & Swift

○ At one point early on, Swift compilation took over two hours

○ Refactoring into sub-projects and modules led to a more than 50% speedup

● Android Split APKs

○ Separate binary for each combination of screen density and CPU architecture

● Distributed builds

○ Build the release binaries on separate machines while tests are running

○ Same strategy could be used for running automated tests

What do we test?

● Unit tests

● Layout tests

○ Unit tests for views

○ Stress test views with long strings, short strings

○ Make sure views don’t overlap, and render properly in right-to-left mode

● Scenario tests

○ Validate that key business metric flows are working properly

○ Usually flows that span multiple screens in the app

○ App gets mock data from a local fixture server

○ Not an exhaustive suite

● Live tests (experimental)
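
The overlap check described under layout tests can be sketched in a few lines. This is an illustrative sketch only, not the layout testing library used in the talk: the `Frame` type and both function names are hypothetical, and measuring the real view frames after rendering a long or short string is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame:
    """Axis-aligned bounding box of a rendered view (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int


def overlaps(a: Frame, b: Frame) -> bool:
    """True when the two frames share any interior area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def overlapping_views(frames: Dict[str, Frame]) -> List[Tuple[str, str]]:
    """Return every pair of view names whose frames overlap."""
    ordered = sorted(frames.items(), key=lambda item: item[0])
    return [(name_a, name_b)
            for (name_a, frame_a), (name_b, frame_b) in combinations(ordered, 2)
            if overlaps(frame_a, frame_b)]


if __name__ == "__main__":
    # A long title whose right edge runs into the button next to it.
    long_title = {"title": Frame(0, 0, 150, 20), "button": Frame(120, 0, 180, 20)}
    print(overlapping_views(long_title))
```

A layout test would feed the same view several data sets (long strings, short strings, right-to-left locale) and assert that the returned list is empty for each.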

Test stability

● Testing infrastructure stability

● Tooling stability

Testing infrastructure stability

● Stabilize testing environment – Hermetic Testing

● Stabilize testing framework

○ Added reliable wait

○ Fixed unreliable APIs

● Sanitize test suite

Do some Math

99.9%^1000 = ?

99% 95% 90% 80% 50%

Answer: 36.7%! If each of 1,000 tests passes 99.9% of the time, all 1,000 pass together only 36.7% of the time.
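
The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the tests fail independently of one another; the function name `suite_pass_probability` is made up for illustration.

```python
def suite_pass_probability(per_test_reliability: float, test_count: int) -> float:
    """Probability that every test passes, assuming independent failures."""
    return per_test_reliability ** test_count


if __name__ == "__main__":
    # 1,000 tests that each pass 99.9% of the time.
    print(f"{suite_pass_probability(0.999, 1000):.4f}")
```

The same function covers the per-node figures later in the deck, since a run spread across several machines only succeeds when every machine does.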

36.7% Reliability

● Test Quality == Production Quality

● Lost confidence in tests

● Developer unhappiness

● Tests should be used to prevent regressions, not to block development

Flaky Tests Are Worse Than No Tests

Trunk Guardian

● Detect & disable flaky tests

Tooling Stability

● Hardware Stability

● Build Environment Stability

● Parallelized Testing Stability

Pool Guardian

Partner teams

● Historically, several partner teams validated the build before a release

● For example, we needed sign off from the localization team

● Lint checks catch hardcoded or improperly formatted strings

● Layout tests catch strings that are too long and RTL layout bugs

● Semantic correctness of translations is still validated by translators manually

Getting to members

● Every three hours, internal alpha testers get a new build

○ Mainly members of the Flagship team

○ Product managers, devs, and execs who want to see the latest code ASAP

● Every week, the rest of the company gets a new beta build

○ iOS build is submitted to Apple for review

● After a week of beta, the build is promoted to production

○ Assuming Apple’s review is complete, iOS is released

○ Take advantage of Google Play staged rollout for Android

Dogfooding

● Android: Google Play alpha/beta channel

○ Easy upgrades for employees, even while off the corporate network

○ Somewhat difficult to get set up, but easy once registered

● iOS: TestFlight

○ Nice, but limited number of users

● iOS: Custom enterprise distribution

○ Scales to our number of users, but employees must be on corporate wifi to upgrade

● Splash screen in the production app encourages employees to use beta builds

Minimizing risk & enabling experiments

● Take advantage of LinkedIn’s existing A/B testing infrastructure

● New features are developed behind feature flags

○ Code can be ramped dynamically to different groups of members

○ Performance of new features or changes can be monitored

● Dynamic configuration

● Server-controlled kill switch

○ Crashing or buggy code can often be disabled without a new build
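
A feature flag with a server-controlled kill switch might look roughly like the sketch below. It is not LinkedIn's A/B testing infrastructure: the `FeatureFlags` class, the config keys `ramps` and `killed`, and the group names are all hypothetical.

```python
from typing import Dict, Set


class FeatureFlags:
    """In-memory flag state, refreshed from a (hypothetical) server config."""

    def __init__(self) -> None:
        self._ramped_groups: Dict[str, Set[str]] = {}
        self._killed: Set[str] = set()

    def apply_server_config(self, config: dict) -> None:
        """Replace local state with the latest config fetched from the server."""
        self._ramped_groups = {
            feature: set(groups)
            for feature, groups in config.get("ramps", {}).items()
        }
        self._killed = set(config.get("killed", []))

    def is_enabled(self, feature: str, member_group: str) -> bool:
        """The kill switch wins over any ramp; unknown features are off."""
        if feature in self._killed:
            return False
        return member_group in self._ramped_groups.get(feature, set())


if __name__ == "__main__":
    flags = FeatureFlags()
    flags.apply_server_config({"ramps": {"new_feed": ["employees"]}})
    print(flags.is_enabled("new_feed", "employees"))
    # The server later disables the feature without shipping a new build.
    flags.apply_server_config({"ramps": {"new_feed": ["employees"]},
                               "killed": ["new_feed"]})
    print(flags.is_enabled("new_feed", "employees"))
```

Defaulting unknown features to off means code merged behind a flag stays dark until the server explicitly ramps it.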

Android 3x3

Consistent Environments: Android Devices

● Android emulators, Genymotion, physical devices

● Practically infinite number of screen sizes

● Different configs: RAM, heap size, hardware features…

● Tests passed locally, but failed on the build server or other dev machines

● Developers were unhappy

Consistent Environments: Enter Gradle

● Script to start emulators was deployed to each build machine

○ Led to bugs where incorrectly provisioned machines caused random build failures

● Creating a new emulator from scratch every time = SLOW

● Only capable of running one emulator at a time

● Already using Gradle to get consistent builds across machines

● Why should tests be different?

Consistent Environments: Gradle Solution

● Create a standalone bundle

○ Download fresh system image

○ Create sdcard image

○ Run the emulator once to create all user files

○ Bundle system + user files into a tar that can be extracted and run without dependencies

● Custom Gradle plugin

○ Extracts & starts emulators

○ Manages running up to 16 emulators in parallel on one build machine

Test Stability

● Layout & Scenario tests use Google’s Espresso test utility

○ Optimally fast using IdlingResources

● Android testing lifecycle

○ Start the app

○ Run all tests

○ Stop the app

● Tests were unstable due to implicit dependencies

○ Application level objects (like memory cache)

○ Data saved to disk (SharedPreferences, disk cache)

● Tests didn’t always clean up after themselves and trying to fix it was a losing battle

Test Stability

● What if we changed the lifecycle?

○ Start the app

○ Run one test

○ Stop the app & clear package data

○ Repeat (x3000)

● Super stable! And super slow :(
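
The one-test-per-app-launch lifecycle above maps onto two standard `adb` shell commands: `am instrument` to run a single test method, and `pm clear` to stop the app and wipe its data. The sketch below only builds the command lines and does not run them. The function name, package names, and runner string are placeholders, not the values used by the team.

```python
from typing import List


def isolated_test_commands(serial: str,
                           app_package: str,
                           instrumentation: str,
                           test_class: str,
                           test_method: str) -> List[List[str]]:
    """Build the adb commands for one isolated test run on one device.

    `instrumentation` is the "<test package>/<runner class>" component
    declared by the test APK (placeholder values in the example below).
    """
    adb_shell = ["adb", "-s", serial, "shell"]
    run_one_test = adb_shell + [
        "am", "instrument", "-w",
        "-e", "class", f"{test_class}#{test_method}",
        instrumentation,
    ]
    # pm clear force-stops the package and deletes its data and caches.
    clear_app_data = adb_shell + ["pm", "clear", app_package]
    return [run_one_test, clear_app_data]


if __name__ == "__main__":
    for command in isolated_test_commands(
            "emulator-5554",
            "com.example.app",
            "com.example.app.test/androidx.test.runner.AndroidJUnitRunner",
            "com.example.FeedTest",
            "testScroll"):
        print(" ".join(command))
```

Each command list could be handed to `subprocess.run`. Repeating the pair for thousands of tests is what makes this lifecycle stable but slow on a single device.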

Test Stability: Custom Test Harness

● Custom annotation processor that computes the list of all tests to run

● Construct a queue of “Test” objects -> (test method, device, locale, …)

● Start up to 16 device threads which poll the queue for a test to run

○ Much faster than static sharding, since all devices are always busy running tests

● Output a custom html + junit test report

○ Includes logcat data for each test

○ Includes screenshots for failing tests

● Runs 4500+ tests in < 14 minutes on one build machine
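
The queue-polling design can be illustrated with a small thread-based sketch. This is not the team's harness: `run_with_shared_queue` and the `run_test` callback are hypothetical, and a real runner would drive an emulator where the callback is invoked.

```python
import queue
import threading
from typing import Callable, Dict, List, Tuple


def run_with_shared_queue(tests: List[str],
                          devices: List[str],
                          run_test: Callable[[str, str], bool]
                          ) -> Dict[str, Tuple[str, bool]]:
    """Run every test once, letting each device pull work as it frees up."""
    pending: "queue.Queue[str]" = queue.Queue()
    for test in tests:
        pending.put(test)

    results: Dict[str, Tuple[str, bool]] = {}
    results_lock = threading.Lock()

    def device_worker(device: str) -> None:
        while True:
            try:
                test = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to run, so this device is done
            passed = run_test(device, test)
            with results_lock:
                results[test] = (device, passed)

    workers = [threading.Thread(target=device_worker, args=(device,))
               for device in devices]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

With static sharding, a device that draws the slow tests finishes last while the others sit idle. With a shared queue, a device that finishes early simply takes the next test, which is the reason the slide gives for the speedup.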

Android multi-emulator test run

iOS 3x3

Reliability

Scenario Tests

iOS – KIF

Speed

Speed up compile time!

- Compiler

- Buy hardware (Mac Pro)

Speed up tests!

- Speed up KIF

- Distributed build/testing

Parallelized Testing Stability

Running on 10 machines

Each node has a reliability of 95% -> 95%^10 = 60%

Improve node stability

Each node has a reliability of 98% -> 98%^10 = 82%
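
The per-node figures can also be read in the other direction: given a target reliability for the whole run, how reliable must each node be? The sketch below assumes nodes fail independently, and the function name is hypothetical.

```python
def required_node_reliability(target_run_reliability: float, node_count: int) -> float:
    """Per-node reliability needed so that all nodes succeed together
    with the target probability, assuming independent node failures."""
    return target_run_reliability ** (1.0 / node_count)


if __name__ == "__main__":
    # What each of 10 nodes needs for the whole run to succeed 95% of the time.
    print(f"{required_node_reliability(0.95, 10):.4f}")
```

Because the overall figure is a product, small per-node improvements compound: moving each node from 95% to 98% lifts a 10-node run from about 60% to about 82%.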

Multi-sim in iOS

API server 3x3

API server: develop

● Monitor

○ Build a monitoring system to ensure the API server is well covered

○ Monitor JVM stats, user request stats, etc.

● Log

○ Logging context setup

■ Test-specific logs

■ UUID to link request logs

○ Tools support for production

■ ELK (Elasticsearch, Logstash, and Kibana)
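
Linking request logs by UUID usually means attaching one identifier to every log line written while a request is handled. The API server in the talk runs on the JVM; the sketch below uses Python's standard `logging` module only to illustrate the idea, and `start_request`, `build_logger`, and the log format are assumptions.

```python
import contextvars
import logging
import uuid

# Holds the ID of the request currently being handled ("-" outside a request).
_request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every record so formatters can use it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = _request_id.get()
        return True


def start_request() -> str:
    """Generate a fresh UUID for a new request and make it current."""
    request_id = str(uuid.uuid4())
    _request_id.set(request_id)
    return request_id


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger whose lines are prefixed with the request ID."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(request_id)s %(levelname)s %(message)s"))
    handler.addFilter(RequestIdFilter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

With the ID on every line, a search tool such as Kibana can pull up all log lines for one request with a single query on that field.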

API server: build

● Static analysis

○ Rest.li snapshot compatibility checker to ensure API changes are backwards compatible

● Test

○ Unit test

○ Smoke test

API server: canary & deploy

● Canary release candidate

○ Run live tests against canary instances

○ Compare metrics between the canary version and the current released version

○ Error log analysis

● Promote healthy canaries or roll back bad ones
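
The metric comparison step can be sketched as a simple threshold check. The slides do not describe the actual decision rule, so everything here is an assumption: the function name, the metric names, the 10% tolerance, and the choice to treat every metric as "lower is better".

```python
from typing import Dict


def canary_decision(baseline: Dict[str, float],
                    canary: Dict[str, float],
                    max_relative_increase: float = 0.10) -> str:
    """Return "promote" or "rollback" by comparing lower-is-better metrics
    from the canary against the currently released version."""
    for name, baseline_value in baseline.items():
        canary_value = canary.get(name)
        if canary_value is None:
            return "rollback"  # missing data is treated as unhealthy
        if canary_value > baseline_value * (1 + max_relative_increase):
            return "rollback"
    return "promote"


if __name__ == "__main__":
    released = {"error_rate": 0.010, "p95_latency_ms": 200.0}
    healthy = {"error_rate": 0.0105, "p95_latency_ms": 210.0}
    regressed = {"error_rate": 0.020, "p95_latency_ms": 205.0}
    print(canary_decision(released, healthy))
    print(canary_decision(released, regressed))
```

A production check would also account for traffic volume and noise between instances. The fixed tolerance here only illustrates the promote-or-roll-back shape of the decision.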

3x3 after 5 months: areas to improve

● Release automation

○ Production uploads to the app stores are still a manual process

○ Getting release notes & translations is painful

● Automated performance testing

○ We can sample performance of the app in production, but don’t have a great way of catching issues before release

● Android Monkey testing

○ Enables wide range of API level & device coverage with very low overhead cost

● iOS speed improvements

○ Keep up with Swift evolution

● Bring 3x3 framework to other LinkedIn apps

Questions

3x3 blogs & videos

● 3x3: Speeding up mobile releases

● 3x3: iOS Build Speed and Stability

● Test Stability - How We Make UI Tests Stable

● UI Automation: Keep it Functional - and Stable!

● Consistent Android Testing Environments with Gradle (slides)

● Effective Layout Testing Library for iOS

● Managing iOS Continuous Integration at Enterprise Scale
