the quality attribute of upgradability


Post on 27-Jan-2015


NICTA Copyright 2012 From imagination to impact

The Quality Attribute of Upgradability

Len Bass with Hiroshi Wada, Ingo Weber, Liming Zhu, Ross Jeffery


About NICTA

National ICT Australia

• Federal and state funded research company established in 2002

• Largest ICT research resource in Australia

• National impact is an important success metric

• ~700 staff/students working in 5 labs across major capital cities

• 7 university partners

• Providing R&D services, knowledge transfer to Australian (and global) ICT industry

NICTA technology is in over 1 billion mobile phones


Consider the following sequence.

• You have prepared an upgrade to an existing large enterprise system
– You have coded it
– You have tested it
– It is ready!!

• Alternatively, the IT department (or you) get a package from a third party – a vendor or open source – that has been coded and tested.


• What happens then?
– ~10% of the time the upgrade will fail.


This is the upgradability problem

• How do we make upgrading a system less problematic?

• Talk outline
– Characteristics of the upgrade problem
– FMEA analysis
  • Possible causes of failure
  • Failure prevention, detection, and recovery
– Relation to existing product and process quality work


Upgrades to enterprise systems are a very common occurrence

Upgrade frequency of some common systems

Application            Average release interval
Facebook (platform)    < 7 days
Google Docs            < 50 days
MediaWiki              21 days
Joomla                 30 days

This frequency would suggest it is important to get the upgrades correct


Unfortunately, Upgrades Fail Often

• 4.6-10 component failures occur each month in three large-scale Internet services, mostly during regular maintenance

• A survey of system administrators reports average and maximum upgrade failure rates of 8.6% and 50%, respectively.

• Some claim that user-visible failures caused by upgrades outweigh user-visible failures caused by software errors.


Why is this?

• Installation is complicated.
– Installation guides for SAS 9.3 Intelligence, IBM i, and Oracle 11g for Linux are ~250 pages each
– The Apache description of addresses and ports (one out of 16 descriptions) has the following elements:
  • Choosing and specifying ports for the server to listen to
  • IPv4 and IPv6
  • Protocols
  • Virtual Hosts
– The number of configuration options that must be set can be large
  • Hadoop has 206 options
  • HBase has 64
– Many dependencies are not visible until execution


Provides Research Agenda

• Indeed, the surprise is not that upgrades fail 8.6% of the time but that they are successful 91.4% of the time.

• Rich area for research.


What kind of problem is this - product?

• ISO 25010 provides
– A quality in use model composed of five characteristics (some of which are further subdivided into subcharacteristics) that relate to the outcome of interaction when a product is used in a particular context of use.

– I.e. is upgradability a quality of the system being upgraded?

• The answer is yes.


What kind of problem is this – process?

• ITIL (Information Technology Infrastructure Library)
– Change Management aims to ensure that standardised methods and procedures are used for efficient handling of all changes.

• SPICE – ISO 15504
– Process assessment provides the means of characterizing the current practice within an organizational unit in terms of the capability of the selected processes.

• Is upgradability a quality of the process used to manage information technology?

• The answer is yes.


Upgradability is a hybrid quality problem

• A hybrid quality problem is one in which improvement involves both product and process and in which the product has process awareness.

• Many product centered conferences
– Dependability
– Security
– …

• Some process centered conferences
– Software Process Improvement
– SPICE
– SPEG
– …


Hybrid quality improvement is not well served by the academic community

• Hybrid quality improvement, as we shall see, involves close interaction between product, process, and the tools that support the process.

• Venues that should emphasize this interaction include
– Profes (Product focused Software Development and Process Improvement)
– ASQ (Conference on Quality and Improvement)

• Yet an examination of the CFPs and proceedings for these conferences shows a distinction between process activities and product characteristics

• We will present the results of an FMEA (Failure Mode and Effects Analysis) style analysis for upgradability and then return to the hybrid quality issue


FMEA

• Failure Mode and Effects Analysis (FMEA) is an inductive failure analysis technique that examines potential failure modes.

• FMEA involves describing
– Potential failure modes
– The severity and likelihood of these failures.

• We will focus on the first portion and generate the potential failure modes as well as potential prevention, detection, and recovery from these failures.

• I.e. we are performing an FMEA style analysis, not an FMEA, per se.


Scenario for Upgradability

• We are concerned with the following scenario
– Version N+1 of an enterprise system is available for deployment.
  • Version N+1 can be deployed by developers
  • Version N+1 can be deployed by the Information Technology Department (the Release Manager, if there is one).
– Version N+1 is completely coded and tested by its developers.

• Measures can include
– Downtime
– Resources (hardware or personnel) required to perform the upgrade
– Number of failed attempts to install the upgrade


Fundamental goals during upgrade

• The literature identifies four fundamental goals while upgrade is occurring.
– Efficiently manage resources
– Completely and correctly specify configurations
– Manage multiple versions to avoid problems with version mismatch.
– Maintain consistency of persistent data.

• Failures are caused by the violation of one of these fundamental goals.
– Our FMEA analysis will look at potential causes for violations of one of these goals.


Activities during an upgrade of a system

• Make the upgrade available.
• Prepare the environment. Ensure that there are sufficient resources available for installation and that assumed software is available.
• Configuration
• Deployment
• Activation


Organization of next portion of the presentation

• For each activity
– Potential fault (a fault is a failure in waiting)
– Prevention of the fault
– Detection of the fault
– Correction of the fault

• Research opportunity
– Blank cell
– Cell with only partial coverage


Make Upgrade available

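Fault possibility, with the prevention, detection, and recovery measures listed against it:

• Element omitted/included incorrectly in installing software
– Prevention: (blank)
– Detection: manifest; bill of lading
– Recovery: recreate distribution
• System corrupted during movement
– Prevention: (blank)
– Detection: hash code, checksum
– Recovery: retransmit
• Source of distribution from an untrusted site
– Prevention: (blank)
– Detection: digital signature
– Recovery: (blank)
• Forgotten/misplaced credentials
– Prevention: (blank)
– Detection: (blank)
– Recovery: separate secret; independent channel for new credentials
• Credential verifier unavailable
– Prevention: (blank)
– Detection: (blank)
– Recovery: codify acceptable credentials in distribution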
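As an illustration of the hash code / checksum detection listed for corruption during movement, the following minimal Python sketch compares a package's digest with the one published by the sender. The function names, the sample payload, and the choice of SHA-256 are assumptions made for illustration and are not part of the talk. A mismatch is the cue for the retransmit recovery.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a distribution payload."""
    return hashlib.sha256(data).hexdigest()


def verify_distribution(data: bytes, published_digest: str) -> bool:
    """Detect corruption in transit by comparing digests.

    A False result is the signal to retransmit the package.
    """
    return hmac.compare_digest(sha256_of(data), published_digest.lower())


package = b"version N+1 payload"   # stand-in for the real archive bytes
digest = sha256_of(package)        # what the sender would publish
corrupted = package[:-1] + b"?"    # simulate damage during movement
```

A checksum of this kind only detects accidental corruption. Establishing that the distribution comes from a trusted site is the job of the digital signature listed separately.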


Prepare environment

Fault possibility, with the prevention, detection, and recovery measures listed against it:

• Incorrect versions of support libraries
– Include version number in specification
– Utilize services to announce incompatibilities
– Encode hash of APIs
• Multiple versions of support libraries simultaneously required
– Include version number in name
– Libraries expose version numbers
– Linkers version aware
• Insufficient resources
– Rolling upgrade
• Schema modification on database
– Convert data to new schema prior to upgrade
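The "include version number in specification" measure can be sketched as a check run while preparing the environment. This is a minimal sketch under stated assumptions: the specification format (a lowest acceptable version and a first too-new version per library), the library names, and the function names are all hypothetical.

```python
def parse_version(text):
    """Turn '1.10.3' into (1, 10, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_incompatibilities(required, installed):
    """Compare installed library versions against the upgrade's specification.

    required:  {library: (lowest_ok, first_too_new)}  (hypothetical format)
    installed: {library: version_string}
    Returns a list of human-readable problems; empty means compatible.
    """
    problems = []
    for name, (lowest, below) in sorted(required.items()):
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed")
        elif not (parse_version(lowest) <= parse_version(found) < parse_version(below)):
            problems.append(f"{name}: found {found}, need >= {lowest} and < {below}")
    return problems


# Hypothetical specification shipped with version N+1.
REQUIRED = {"libfoo": ("1.2", "2.0"), "libbar": ("3.0", "4.0")}
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.10.3" would sort before "1.2".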

 


Configuration

Fault possibility, with the prevention, detection, and recovery measures listed against it:

• Missing parameter
– Parameter database
– Parameter built into tool
– Static analysis of code
• Incorrectly specified parameter
– Abstract specification
– Check syntax
– Validate against a specification
• Inconsistent parameters
– Constraint checker
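The three configuration faults above (missing, incorrectly specified, and inconsistent parameters) can be illustrated with a small checker. This is a minimal sketch, not a tool from the talk: the parameter names, the specification, and the constraints are hypothetical.

```python
# Hypothetical specification: parameter name -> expected type.
SPEC = {
    "listen_port": int,
    "max_workers": int,
    "min_workers": int,
}

# Hypothetical cross-parameter constraints: (description, predicate).
CONSTRAINTS = [
    ("listen_port must be in 1..65535",
     lambda c: 1 <= c["listen_port"] <= 65535),
    ("min_workers must not exceed max_workers",
     lambda c: c["min_workers"] <= c["max_workers"]),
]


def check_configuration(config):
    """Return a list of configuration faults; an empty list means none found."""
    errors = []
    for name, expected in SPEC.items():
        if name not in config:
            errors.append(f"missing parameter: {name}")
        elif not isinstance(config[name], expected):
            errors.append(f"incorrectly specified parameter: {name}")
    if errors:
        # Constraints are only meaningful on a complete, well-typed config.
        return errors
    for message, holds in CONSTRAINTS:
        if not holds(config):
            errors.append(f"inconsistent parameters: {message}")
    return errors
```

Running the constraint check only after the presence and type checks pass keeps each reported fault attributable to one of the three fault categories.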

 


Deployment

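Fault possibility, with the prevention, detection, and recovery measures listed against it:

• Insufficient resources
– Prevention: pre-allocate during preparation; rolling upgrade
– Detection: (blank)
– Recovery: (blank)
• Inconsistent hardware
– Prevention: verify during preparation
– Detection: (blank)
– Recovery: (blank)
• Operator error
– Prevention: (blank)
– Detection: (blank)
– Recovery: undo mechanism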


Activation

Fault possibility, with the prevention, detection, and recovery measures listed against it:

• Discovered hidden dependency
– Prevention: (blank)
– Detection: monitoring
– Recovery: recovery block
• Multiple simultaneous versions
– Separation
– Dynamic Software Update
– Automatic translation of data when old schema is used
– Version aware code and data
– Version aware load balancer


Our activities in this space so far (green cells)

• Mixed version race condition solution
• Operator undo


What is the “mixed version race condition”?

• Common practice when pushing an upgrade to a large number of servers is to upgrade one server (or several servers) at a time

• This means that version N+1 (the new version) will be available on some servers and version N (the old version) will be available on other servers.

• Suppose version N+1 has functionality not available in version N


Now consider the following sequence

1. A client (browser) issues a request that is routed by the load balancer to an instance of version N+1

2. Version N+1 sends JavaScript assuming new functionality back to the client.

3. Client sends an AJAX request that utilizes new functionality and the load balancer routes it to an instance of version N.

4. Error because version N does not have the new functionality.
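The four-step sequence can be reproduced in a few lines. This is a toy simulation, assuming a version-unaware round-robin load balancer and hypothetical endpoint names (`/page`, `/new-feature`); it is not the system studied in the talk.

```python
import itertools


class Server:
    """A server instance that only answers the endpoints its version knows."""

    def __init__(self, version, endpoints):
        self.version = version
        self.endpoints = endpoints

    def handle(self, path):
        # 200 if this version implements the endpoint, 404 otherwise.
        return 200 if path in self.endpoints else 404


def round_robin(servers):
    """Return a routing function that ignores versions entirely."""
    cycle = itertools.cycle(servers)
    return lambda path: next(cycle).handle(path)


old = Server("N", {"/page"})
new = Server("N+1", {"/page", "/new-feature"})
route = round_robin([new, old])    # mid-upgrade: both versions are live

first = route("/page")             # steps 1-2: served by version N+1
second = route("/new-feature")     # steps 3-4: AJAX call lands on version N
```

Here `first` succeeds and `second` fails, even though neither version is faulty on its own. The error comes purely from the routing.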


Mixed Version Race Condition

[Figure: sequence diagram of a client (browser) and a server during a rolling upgrade, with five numbered steps. Labels: Start rolling upgrade; Initial request; HTTP reply with embedded JavaScript; AJAX callback; Old Version; New Version; ERROR.]


What does the solution involve?

1. Label communication between instances and the client with version information

2. Modify load balancer so that messages are routed to an appropriate version

3. Modify load balancer so that messages are balanced across all child instances.
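The three steps can be sketched for a single load balancer as follows. This is a minimal illustration, not the authors' implementation: the class name, the version labels, and the round-robin policy are assumptions, and the sketch deliberately ignores the case of several load balancers that must agree on which versions exist.

```python
import itertools


class VersionAwareBalancer:
    """Route each request to an instance of the version it is labelled with."""

    def __init__(self, servers_by_version, default_version):
        self._default = default_version
        # One round-robin cycle per version keeps load even within a version.
        self._cycles = {
            version: itertools.cycle(servers)
            for version, servers in servers_by_version.items()
        }

    def route(self, request_version=None):
        """Return (version, server) for a request.

        Unlabelled requests (e.g. a client's first request) go to the default
        version; the reply would then carry that version label (step 1).
        """
        version = self._default if request_version is None else request_version
        if version not in self._cycles:
            raise LookupError(f"no known instances of version {version}")
        return version, next(self._cycles[version])
```

For example, `VersionAwareBalancer({"N": ["s1", "s2"], "N+1": ["s3", "s4"]}, default_version="N")` sends every request labelled `"N+1"` alternately to `s3` and `s4`, so a follow-up AJAX call never reaches an old instance.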


Why is this a hard problem?

• Large installations have multiple distributed load balancers that must be kept in sync, i.e. some load balancers may know about the new version and some may not

• It is not enough to put the version number in the message
– Suppose the second request goes to a load balancer that does not yet know about version N+1.

• Messages must be kept balanced so that all servers handle roughly the same number of requests.

[Figure: two load balancers for /service, each in front of several servers. One knows both /service/vN+1 and /service/vN; the other knows only /service/vN.]


Operator undo

• After performing an operation in AWS, the operator may want to go back to the original state, i.e. undo the operation

• This is not always straightforward:
– Attaching a volume is no problem while the instance is running; detaching might be problematic
– Creating / changing auto-scaling rules has an effect on the number of running instances
  • Cannot terminate additional instances, as the rule would create new ones!
– Deleted / terminated / released resources are gone!


Undo for System Operators

[Figure: the administrator issues begin-transaction, a series of do operations, and rollback. Additional labels: + commit; + pseudo-delete.]


Approach

[Figure: the administrator issues begin-transaction, a series of do operations, and rollback. The undo system senses the cloud resource states at begin-transaction and again at rollback, which yields a goal state and an initial state. From these it plans a set of actions, generates code, and executes it.]
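The planning step can be illustrated with a deliberately simple stand-in. The sketch below assumes that the goal state is the resource state sensed at begin-transaction and the initial state is the one sensed at rollback, and it replaces the planner with a plain state diff, so it does not handle ordering constraints such as the auto-scaling case. The resource identifiers, attribute names, and action names are hypothetical.

```python
def plan_undo(initial, goal):
    """Derive a set of actions that moves resources from `initial` to `goal`.

    Both arguments map a resource id to a dict of its attributes.
    Assumption: `goal` was sensed at begin-transaction, `initial` at rollback.
    """
    actions = []
    # Resources created since the checkpoint must be removed.
    for rid in sorted(initial.keys() - goal.keys()):
        actions.append(("delete", rid))
    # Resources removed since the checkpoint must come back; this is only
    # possible if they were pseudo-deleted rather than truly released.
    for rid in sorted(goal.keys() - initial.keys()):
        actions.append(("restore", rid, goal[rid]))
    # Resources present in both states are reset if their attributes differ.
    for rid in sorted(initial.keys() & goal.keys()):
        if initial[rid] != goal[rid]:
            actions.append(("update", rid, goal[rid]))
    return actions


checkpoint = {"vol-1": {"attached_to": None}, "i-1": {"type": "small"}}
current = {
    "vol-1": {"attached_to": "i-1"},
    "i-1": {"type": "small"},
    "i-2": {"type": "small"},
}
undo_plan = plan_undo(initial=current, goal=checkpoint)
```

In this example the plan deletes the extra instance and detaches the volume, which are the two changes made since the checkpoint.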


Upgradability as a process & product quality

• Architecture of the system being upgraded can affect the process of installation
– Suppose the system checks for version information from dependent libraries. Then the process must encompass descriptions of what to do if an error condition occurs.

• Process of upgrade can affect the architecture of the product.
– Suppose the process is supported by a tool that checks the health of the installation of version N+1. Then the system must make visible the information used by this tool.


Summary

• Upgrade is an important problem
– Upgrade failures affect user satisfaction
– Upgrade failures happen frequently

• Upgrade involves the interaction of product and process quality issues.
– Communities are focussed on improving the quality of the process or the product, not the joint process/product quality.

• Multiple opportunities for research exist.


Q&A

Research study opportunities in dependable cloud computing:
• Software Architecture
• Data Management
• Performance Engineering
• Autonomic Computing

To find out more, send your CV and undergraduate details to [email protected]

Thank You!