Automated Testing of Massively Multi-Player Games Lessons Learned from The Sims Online Larry Mellon Spring 2003


Page 1

Automated Testing of Massively Multi-Player Games

Lessons Learned from The Sims Online

Larry Mellon

Spring 2003

Page 2

Context: What Is Automated Testing?

Page 3

Classes Of Testing

System Stress

Load

Random Input

Feature Regression

Developer

QA

Page 4

Automation Components

Startup & Control

Repeatable, Sync’ed Test Inputs

Collection & Analysis

System Under Test

Page 5

What Was Not Automated?

Startup & Control

Repeatable, Synchronized Inputs

Results Analysis

Visual Effects

Page 6

Lessons Learned: Automated Testing

Time (60 Minutes)

1/3: Design & Initial Implementation (Architecture, Scripting Tests, Test Client, Initial Results)

1/3: Fielding: Analysis & Adaptations

1/3: Wrap-up & Questions (What worked best, what didn’t; Tabula Rasa: MMP / SPG)

Page 7

Design Constraints

Load

Regression

Churn Rate

Automation (Repeatable, Synchronized Input)

(Data Management)

Strong Abstraction

Page 8

Test Client

Single, Data Driven Test Client

Regression

Load

Single API

Reusable Scripts & Data

Page 9

Test Client

Data Driven Test Client

Regression

Load

Single API

Reusable Scripts & Data

Single API

Configurable Logs & Metrics

Key Game States

Pass/Fail

Responsiveness

“Testing feature correctness”

“Testing system performance”

Page 10

Problem: Testing Accuracy

• Load & Regression: inputs must be
– Accurate
– Repeatable
• Churn rate: logic/data in constant motion
– How to keep testing client accurate?
• Solution: game client becomes test client
– Exact mimicry
– Lower maintenance costs

Page 11

Test Client == Game Client

(Diagram: the Test Client and the Game Client share the same Presentation Layer and Client-Side Game Logic. Test Control in the Test Client, like the Game GUI in the Game Client, sends Commands to the Presentation Layer and reads State back from it.)

Page 12

Game Client: How Much To Keep?

Game Client

View

Logic

Presentation Layer

Page 13

What Level To Test At?

Game Client

Mouse Clicks

Presentation Layer

View

Logic

Regression: Too Brittle (pixel shift)

Load: Too Bulky

Page 14

What Level To Test At?

Game Client

Internal Events

Presentation Layer

View

Logic

Regression: Too Brittle (Churn Rate vs Logic & Data)

Page 15

Gameplay: Semantic Abstractions

NullView Client

View

Logic

Presentation Layer: Buy Lot, Enter Lot, Use Object, Buy Object, …

~ ¾

~ ¼

Basic gameplay changes less frequently than UI or protocol implementations.
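The idea of a semantic Presentation Layer with a swappable view can be sketched as follows. This is a minimal illustration only: the class names `PresentationLayer`, `GameLogic`, `NullView`, and `RecordingView`, and every method on them, are hypothetical and are not taken from The Sims Online code base.

```python
class NullView:
    """View that draws nothing; lets a client run headless."""

    def show(self, message):
        pass


class RecordingView:
    """Stand-in for a real GUI view: remembers what it was asked to show."""

    def __init__(self):
        self.shown = []

    def show(self, message):
        self.shown.append(message)


class GameLogic:
    """Toy client-side game state (hypothetical)."""

    def __init__(self):
        self.lots = set()
        self.current_lot = None
        self.objects = {}  # object name -> (x, y)


class PresentationLayer:
    """Semantic actions shared by the GUI and by test scripts."""

    def __init__(self, logic, view):
        self.logic = logic
        self.view = view

    def buy_lot(self, lot):
        self.logic.lots.add(lot)
        self.view.show(f"bought lot {lot}")

    def enter_lot(self, lot):
        if lot not in self.logic.lots:
            raise ValueError(f"unknown lot: {lot}")
        self.logic.current_lot = lot
        self.view.show(f"entered lot {lot}")

    def buy_object(self, name, x, y):
        if self.logic.current_lot is None:
            raise RuntimeError("must be in a lot to buy an object")
        self.logic.objects[name] = (x, y)
        self.view.show(f"bought {name} at ({x}, {y})")

    def use_object(self, name, action):
        if name not in self.logic.objects:
            raise KeyError(name)
        self.view.show(f"{action} on {name}")
        return f"{name}:{action}"


def play_session(client):
    """The same scripted session runs against any view."""
    client.buy_lot("alpha_chimp")
    client.enter_lot("alpha_chimp")
    client.buy_object("chair", 10, 10)
    return client.use_object("chair", "sit")
```

Because `play_session` only talks to the semantic actions, it should keep working when the view or the wire protocol underneath changes, which is the stability property the slide describes.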

Page 16

Scriptable User Play Sessions

• SimScript
– Collection: Presentation Layer “primitives”
– Synchronization: wait_until, remote_command
– State probes: arbitrary game state
  • Avatar’s body skill, lamp on/off, …
• Test Scripts: Specific / ordered inputs
– Single user play session
– Multiple user play session
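A synchronization primitive in the spirit of wait_until can be sketched as a polling loop over a state probe. This is an assumption about how such a primitive might work, not the SimScript implementation; the function signature, the `WaitTimeout` exception, and the default timings are all hypothetical.

```python
import time


class WaitTimeout(Exception):
    """Raised when the probed state never reaches the expected value."""


def wait_until(probe, expected, timeout=5.0, poll_interval=0.01):
    """Poll `probe()` until it returns `expected` or `timeout` seconds pass.

    `probe` is any zero-argument callable that reads game state,
    for example a lamp's on/off flag or an avatar's skill level.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if value == expected:
            return value
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"expected {expected!r}, last saw {value!r}")
        time.sleep(poll_interval)
```

Raising on timeout, rather than blocking forever, lets a test harness report a stalled script as a failure with the last observed state attached.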

Page 17

Scriptable User Play Sessions

• Scriptable play sessions: big win
– Load: tunable based on actual play
– Regression: constantly repeat hundreds of play sessions, validating correctness
• Gameplay semantics: very stable
– UI / protocols shifted constantly
– Game play remained (about) the same

Page 18

SimScript: Abstract User Actions

include_script setup_for_test.txt
enter_lot $alpha_chimp
wait_until game_state inlot

chat I’m an Alpha Chimp, in a Lot.

log_message Testing object purchase.

log_objects
buy_object chair 10 10
log_objects
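To make the shape of such a script concrete, here is a small sketch of a line-oriented parser. The grammar is an assumption inferred only from the sample script (one command per line, whitespace-separated arguments, `$name` variables, `#` comments); the real SimScript parser is not described in the talk, and `parse_script` is a hypothetical name.

```python
def parse_script(text, variables=None):
    """Turn script text into a list of (command, [args]) tuples.

    Assumed grammar: one command per line, arguments split on
    whitespace, `$name` replaced from `variables` when known,
    blank lines and lines starting with `#` ignored. Free text
    such as chat messages is split into words like any other
    argument list.
    """
    variables = variables or {}
    commands = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        args = [
            variables.get(arg[1:], arg) if arg.startswith("$") else arg
            for arg in args
        ]
        commands.append((name, args))
    return commands


SAMPLE = """
include_script setup_for_test.txt
enter_lot $alpha_chimp
wait_until game_state inlot
log_objects
buy_object chair 10 10
log_objects
"""
```

Each parsed tuple could then be dispatched to a Presentation Layer primitive of the same name.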

Page 19

SimScript: Control & Sync

# Have a remote client use the chair
remote_cmd $monkey_bot use_object chair sit

set_data avatar reading_skill 80
set_data book unlock
use_object book read
wait_until avatar reading_skill 100
set_recording on

Page 20

Client Implementation

Page 21

Composable Client

Event Generators
- Scripts
- Cheat Console
- GUI

Game Logic

Presentation Layer

Page 22

Composable Client

Event Generators
- Scripts
- Console
- GUI

Viewing Systems
- Console
- Lurker
- GUI

Any / all components may be loaded per instance

Game Logic

Presentation Layer
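One way to picture "any / all components may be loaded per instance" is a client that takes its event generators and viewing systems as plain lists. This is an illustrative sketch; `Client`, `script_source`, and the event format are hypothetical and not taken from the talk.

```python
class Client:
    """Client assembled from optional event generators and viewers."""

    def __init__(self, generators=(), viewers=()):
        self.generators = list(generators)  # callables returning new events
        self.viewers = list(viewers)        # callables receiving each event
        self.log = []                       # stands in for game logic state

    def step(self):
        """Collect events from every generator and fan them out."""
        for generate in self.generators:
            for event in generate():
                self.log.append(event)
                for view in self.viewers:
                    view(event)


def script_source(commands):
    """Event generator that replays a fixed command list, one per step."""
    pending = list(commands)

    def next_events():
        return [pending.pop(0)] if pending else []

    return next_events
```

A load-test instance might be built with a script generator and no viewers at all, while a debugging instance adds a console viewer to the same logic.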

Page 23

Lesson: View & Logic Entangled

Game Client

View

Logic

Page 24

Few Clean Separation Points

Game Client

View

Logic

Presentation Layer

Page 25

Solution: Refactored for Isolation

Game Client

View

Logic

Presentation Layer

Page 26

Lesson: NullView Debugging

Logic

Presentation Layer

?

Without (legacy) view system attached, tracing was “difficult”.

Page 27

Solution: Embedded Diagnostics

Logic

Presentation Layer

Diagnostics

Timeout Handlers

…

Page 28

Talk Outline: Automated Testing

Time (60 Minutes)

1/3: Design & Initial Implementation (Architecture & Design, Test Client, Initial Results)

1/3: Lessons Learned: Fielding

1/3: Wrap-up & Questions

Page 29

Mean Time Between Failure

• Random Event, Log & Execute
• Record client lifetime / RAM
• Worked: just not relevant in early stages of development
– Most failures / leaks found were not high-priority at that time, when weighed against server crashes
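The "random event, log & execute" loop can be sketched as below. This is an assumed reading of the bullet, not the original tool: `random_event_run`, its parameters, and the returned dictionary are hypothetical. Logging each event before executing it means the event that killed the client is always in the log, and a fixed seed makes a run repeatable.

```python
import random


def random_event_run(actions, execute, seed, max_events=1000):
    """Fire random events until one fails or `max_events` is reached.

    Returns the number of events survived ("lifetime"), the full
    event log, and the error (if any) that ended the run.
    """
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    log = []
    for _ in range(max_events):
        event = rng.choice(actions)
        log.append(event)      # log first, then execute
        try:
            execute(event)
        except Exception as exc:
            return {"lifetime": len(log) - 1, "log": log, "error": repr(exc)}
    return {"lifetime": len(log), "log": log, "error": None}
```

Recording memory use alongside the lifetime, as the slide mentions, would need a platform-specific probe and is left out of this sketch.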

Page 30

Monkey Tests

• Constant repetition of simple, isolated actions against servers
• Very useful:
– Direct observation of servers while under constant, simple input
– Server processes “aged” all day
• Examples:
– Login / Logout
– Enter House / Leave House

Page 31

QA Test Suite Regression

• High false positive rate & high maintenance
– New bugs / old bugs
– Shifting game design
– “Unknown” failures

Not helping in day to day work.

Page 32

Talk Outline: Automated Testing

Time (60 Minutes)

¼: Design & Initial Implementation

½: Fielding: Analysis & Adaptations (Non-Determinism, Maintenance Overhead, Solutions & Results, Monkey / Sniff / Load / Harness)

¼: Wrap-up & Questions

Page 33

Analysis: Testing Isolated Features

Page 34

Analysis: Critical Path

Test Case: Can an Avatar Sit in a Chair?

login () → create_avatar () → buy_house () → enter_house () → buy_object () → use_object ()

Failures on the Critical Path block access to much of the game.
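The blocking effect of a critical-path failure can be shown with a short sketch. The step names follow the slide, but `run_critical_path` and the way steps are represented are hypothetical.

```python
def run_critical_path(steps):
    """Run dependent steps in order and stop at the first failure.

    `steps` is a list of (name, action) pairs where `action()` returns
    True on success. Returns (passed, failed, blocked): the steps that
    passed, the name of the failing step (or None), and the steps that
    could not be attempted because an earlier one failed.
    """
    passed = []
    for index, (name, action) in enumerate(steps):
        if action():
            passed.append(name)
        else:
            blocked = [later for later, _ in steps[index + 1:]]
            return passed, name, blocked
    return passed, None, []


def sit_in_chair_path(broken_step=None):
    """Build the chair test case, optionally with one step forced to fail."""
    names = ["login", "create_avatar", "buy_house",
             "enter_house", "buy_object", "use_object"]
    return [(name, (lambda n=name: n != broken_step)) for name in names]
```

With `buy_house` broken, the chair test case never reaches `use_object`, so a defect in one early primitive hides the status of everything behind it.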

Page 35

Solution: Monkey Tests

• Primitives placed in Monkey Tests
– Isolate as much as possible, repeat 400x
– Report only aggregate results
  • Create Avatar: 93% pass (375 of 400)
• “Poor Man’s” Unit Test
– Feature based, not class based
– Limited isolation
– Easy failure analysis / reporting
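A monkey test of this kind reduces to a loop that repeats one primitive and reports only aggregates. The sketch below is an assumption about the general shape; `monkey_test`, the report fields, and the `flaky_create_avatar` stand-in are hypothetical.

```python
def monkey_test(action, repetitions=400):
    """Repeat one isolated action and return aggregate results only.

    `action(attempt)` should return a truthy value on success; any
    exception or falsy return counts as a failure, bucketed by kind.
    """
    passes = 0
    failures = {}
    for attempt in range(repetitions):
        try:
            ok = action(attempt)
            kind = "returned false"
        except Exception as exc:
            ok = False
            kind = type(exc).__name__
        if ok:
            passes += 1
        else:
            failures[kind] = failures.get(kind, 0) + 1
    return {
        "runs": repetitions,
        "passes": passes,
        "pass_rate": passes / repetitions,
        "failures": failures,
    }


def flaky_create_avatar(attempt):
    """Hypothetical primitive that times out on every 16th attempt."""
    if attempt % 16 == 0:
        raise TimeoutError("server did not respond")
    return True
```

Bucketing failures by kind keeps the report to a few lines per primitive, in line with the "report only aggregate results" bullet.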

Page 36

Talk Outline: Automated Testing

Time (60 Minutes)

1/3: Design & Initial Implementation

1/3: Lessons Learned: Fielding (Non-Determinism, Maintenance Costs, Solution Approaches, Monkey / Sniff / Load / Harness)

1/3: Wrap-up & Questions

Page 37

Analysis: Maintenance Cost

• High defect rate in game code
– Code Coupling: “side effects”
– Churn Rate: frequent changes
• Critical Path: fatal dependencies
• High debugging cost
– Non-deterministic, distributed logic

Page 38

Turnaround Time

(Figure labels: Bug Introduced, Development, Checkin, Build, Smoke, Regression; Time to Fix (days); Cost of Detection.)

Tests were too far removed from introduction of defects.

Page 39

Critical Path Defects Were Very Costly

(Figure labels: Bug Introduced, Development, Checkin, Build, Smoke, Regression; Time to Fix (days); Cost of Detection; Impact on Others.)

Page 40

Solution: Sniff Test

Pre-Checkin Regression: don’t let broken code into Mainline.

(Figure labels: Working Code, Candidate Code; Pass / Fail, Diagnostics; Development, Sniff, Checkin, Smoke, Regression.)
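The gating idea behind a pre-checkin sniff test can be sketched as follows. This is a minimal illustration under assumed names: `sniff_test`, `try_checkin`, and the shape of the checks are hypothetical, and a real gate would run the checks against a build of the candidate code.

```python
def sniff_test(checks):
    """Run every check and collect diagnostics for the ones that fail.

    `checks` is a list of (name, callable) pairs; a check fails by
    raising. Returns (ok, diagnostics).
    """
    diagnostics = []
    for name, check in checks:
        try:
            check()
        except Exception as exc:
            diagnostics.append(f"{name}: {type(exc).__name__}: {exc}")
    return (not diagnostics), diagnostics


def try_checkin(checks, commit):
    """Call `commit()` only if the candidate code passes the sniff test."""
    ok, diagnostics = sniff_test(checks)
    if ok:
        commit()
    return ok, diagnostics
```

Running all checks, instead of stopping at the first failure, gives the developer the full list of diagnostics in one pass.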

Page 41

Solution: Hourly Diagnostics

• SniffTest Stability Checker
– Emulates a developer
– Every hour, sync / build / test
• Critical Path monkeys ran non-stop
– Constant “baseline”
• Traffic Generation
– Keep the pipes full & servers aging
– Keep the DB growing

Page 42

Analysis: CONSTANT SHOUTING IS REALLY IRRITATING

• Bugs spawned many, many emails
• Solution: Report Managers
– Aggregates / correlates across tests
– Filters known defects
– Translates common failure reports to their root causes
• Solution: Data Managers
– Information Overload: Automated workflow tools mandatory
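The three Report Manager duties listed above (aggregate, filter known defects, translate to root causes) can be sketched in a few lines. The function name `summarize`, the data shapes, and the example messages in the usage note are hypothetical.

```python
from collections import Counter


def summarize(failures, known_defects, root_causes):
    """Collapse raw failure reports into a short, de-duplicated summary.

    failures:      list of (test_name, message) pairs from many tests
    known_defects: set of causes already tracked, to be filtered out
    root_causes:   mapping from a common failure message to its cause
    Returns a list of (cause, count), most frequent first.
    """
    counts = Counter()
    for _test_name, message in failures:
        cause = root_causes.get(message, message)  # translate if known
        if cause in known_defects:
            continue                               # filter known defects
        counts[cause] += 1                         # aggregate across tests
    return counts.most_common()
```

For example, two tests reporting "connect timeout" and one reporting "db locked" could be collapsed into one line per root cause, so a single outage produces one report entry instead of a flood of emails.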

Page 43

ToolKit Usability

• Workflow automation
• Information management
• Developer / Tester “push button” ease of use
• XP flavour: increasingly easy to run tests
– Must be easier to run than to avoid running
– Must solve problems “on the ground now”

Page 44

Sample Testing Harness Views

Page 45

Load Testing: Goals

• Expose issues that only occur at scale
• Establish hardware requirements
• Establish response is playable @ scale
• Emulate user behaviour
– Use server-side metrics to tune test scripts against observed Beta behaviour
• Run full scale load tests daily
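One simple way to tune load scripts against observed behaviour is to draw actions in proportion to how often real players performed them. This is an assumed approach for illustration, not the method used on The Sims Online; `build_session` and the example counts are hypothetical.

```python
import random


def build_session(observed_counts, length, seed=0):
    """Generate a scripted session whose action mix follows observed usage.

    observed_counts: mapping of action name -> how often it was seen in
                     server-side metrics (must contain a positive count)
    length:          number of actions to emit
    seed:            fixed seed so the same session can be replayed
    """
    rng = random.Random(seed)
    actions = list(observed_counts)
    weights = [observed_counts[action] for action in actions]
    return rng.choices(actions, weights=weights, k=length)
```

Seeding the generator keeps the input repeatable, so a load run that exposes a failure can be replayed with the same sequence of actions.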

Page 46

Load Testing: Data Flow

(Figure labels: Load Testing Team; Load Control Rig; three Test Driver CPUs, each running several Test Clients; Server Cluster; System Monitors; Internal Probes. Data flows: Game Traffic, Client Metrics, Resource Metrics, Debugging Data.)

Page 47

Load Testing: Lessons Learned

• Very successful
– “Scale & Break”: up to 4,000 clients
• Some conflicting requirements w/ Regression
– Continue on fail
– Transaction tracking
– Nullview client a little “chunky”

Page 48

Current Work

• QA test suite automation
• Workflow tools
• Integrating testing into the new features design/development process
• Planned work
– Extend Esper Toolkit for general use
– Port to other Maxis projects

Page 49

Talk Outline: Automated Testing

Time (60 Minutes)

1/3: Design & Initial Implementation

1/3: Lessons Learned: Fielding

1/3: Wrap-up & Questions (Biggest Wins / Losses, Reuse, Tabula Rasa: MMP & SSP)

Page 50

Biggest Wins

• Presentation Layer Abstraction
– NullView client
– Scripted playsessions: powerful for regression & load
• Pre-Checkin Snifftest
• Load Testing
• Continual Usability Enhancements
• Team
– Upper Management Commitment
– Focused Group, Senior Developers

Page 51

Biggest Issues

• Order Of Testing
– MTBF / QA Test Suites should have come last
– Not relevant when early & game too unstable
– Find / Fix Lag: too distant from Development
• Changing TSO’s Development Process
– Tool adoption was slow, unless mandated
• Noise
– Constant Flood Of Test Results
– Number of Game Defects, Testing Defects
– Non-Determinism / False Positives

Page 52

Tabula Rasa

How Would I Start The Next Project?

Page 53

Tabula Rasa

• PreCheckin Sniff Test
– There’s just no reason to let code break.

Page 54

Tabula Rasa

• PreCheckin SniffTest
– Keep Mainline working
• Hourly Monkey Tests
– Useful baseline & keeps servers aging.

Page 55

Tabula Rasa

• PreCheckin SniffTest
– Keep Mainline working
• Hourly Stability Checkers
– Baseline for Developers
• Dedicated Tools Group
– Continual usability enhancements adapted tools to meet “on the ground” conditions.

Page 56

Tabula Rasa

• PreCheckin SniffTest
– Keep Mainline working
• Hourly Stability Checkers
– Baseline for Developers
• Dedicated Tools Group
– Easy to Use == Used
• Executive Level Support
– Mandates required to shift how entire teams operated.

Page 57

Tabula Rasa

• PreCheckin SniffTest
– Keep Mainline working
• Hourly Stability Checkers
– Baseline for Developers
• Dedicated Tools Group
– Easy to Use == Used
• Executive Support
– Radical Shifts in Process
• Load Test: Early & Often

Page 58

Tabula Rasa

• PreCheckin SniffTest
– Keep Mainline working
• Hourly Stability Checkers
– Baseline for Developers
• Dedicated Tools Group
– Easy to Use == Used
• Executive Support
– Radical shifts in Process
• Load Test: Early & Often
– Break it before Live
• Distribute Test Development & Ownership Across Full Team

Page 59

Next Project: Basic Infrastructure

Control Harness For Clients & Components

Reference Client

Self Test

Reference Feature

Regression Engine

Living Doc

Page 60

Building Features: NullView First

Control Harness

Reference Client

NullView Client

Self Test

Reference Feature

Regression Engine

Living Doc

Page 61

Build The Tests With The Code

NullView Client

Login

Self Test

Monkey Test

Nothing Gets Checked In Without A Working Monkey Test.

Control Harness

Reference Client

Reference Feature

Regression Engine

Page 62

Conclusion

• Estimated Impact on MMP: High
– Sniff Test: kept developers working
– Load Test: ID’d critical failures pre-launch
– Presentation Layer: scriptable play sessions
• Cost To Implement: Medium
– Much Lower for SSP Games

Repeatable, coordinated inputs @ scale and pre-checkin regression were very significant schedule accelerators.

Page 63

Conclusion

Go For It…

Page 64

Talk Outline: Automated Testing

Time (60 Minutes)

1/3: Design & Initial Implementation

1/3: Lessons Learned: Fielding

1/3: Wrap-up & Questions