university of marylandjbg/teaching/lbsc_690_2012/lecture_09.pdf · the waterfall model key insight:...

66
Building, Deploying, and Living With IT Systems LBSC 690: Jordan Boyd-Graber University of Maryland November 19, 2012 Adapted from Jimmy Lin’s Slides LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 1 / 62

Upload: others

Post on 25-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Building, Deploying, and Living With IT Systems

LBSC 690: Jordan Boyd-Graber

University of Maryland

November 19, 2012

Adapted from Jimmy Lin’s Slides

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 1 / 62

Page 2: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Today’s Topics

The system life cycle

The open source model

Cloud computing

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 2 / 62

Page 3: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Take-Away Messages

Not “what are the right answers,” but “what are the right questions”

There is no right answerI It all depends on the exact circumstancesI It’s all about tradeo↵s

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 3 / 62

Page 4: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Outline

1 Designing Systems

2 Open Source and TCO

3 Privacy

4 Cloud Computing

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 4 / 62

Page 5: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The System Life Cycle

Analysis and Design: How do we know what to build?

Implementation: How do we actually build it?

Maintenance: How do we keep it running?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 5 / 62

Page 6: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

User-Centered Design

As opposed to what?

Understanding user needsI Who are the present and future users?I How can you understand their needs?

Understanding the use contextI How does the particular need relate to broader user activities?I How does software fit into the picture?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 6 / 62

Page 7: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Some Library Activities

Acquisition

Cataloging

Reference

Circulation, interlibrary loan, reserves

Recall, fines, . . .

Budget, facilities schedules, payroll, . . .

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 7 / 62

Page 8: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Important Questions

Where does information originate?I Beware of “chicken and egg” problems

What components already exist?I Sometimes it’s easier to start with a clean slate

Which components should be automated?I Some things are easier to do without computers

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 8 / 62

Page 9: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Important Questions

Which components should be integrated?I Pick your poison: centralization vs. decentralizationI Implications for privacy, security, etc.

How will technology impact human processes?I Technology is not neutral

How can we take advantage of the community?I Web 2.0, Library 2.0, etc.

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 9 / 62

Page 10: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Requirements

AvailabilityI Mean Time Between Failures (MTBF)I Mean Time To Repair (MTTR)

CapacityI Number of users (typical and maximum)I Response time

FlexibilityI Upgrade pathI Interoperability with other applications

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 10 / 62

Page 11: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Decisions, Decisions. . .

O↵-the-shelf applications vs. custom-developed

“Best-of-breed” vs. integrated system

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 11 / 62

Page 12: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

More Decisions: Architectures

Desktop applicationsI What we normally think of as software

Batch processing (e.g., recall notices)I Save it up and do it all at once

Client-Server (e.g., Web)I Some functions done centrally, others locally

Peer-to-Peer (e.g., Kazaa)I All data and computation are distributed

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 12 / 62

Page 13: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The Waterfall Model

Key insight: upfront investment in designI An hour of design can save a week of debugging!

Five stages:I Requirements: figure out what the software is supposed to doI Design: figure out how the software will accomplish the tasksI Implementation: actually build the softwareI Verification: makes sure that it worksI Maintenance: makes sure that it keeps working

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 13 / 62

Page 14: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The Waterfall Model

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 14 / 62

Page 15: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The Spiral Model

Build what you think you needI Perhaps using the waterfall model

Get a few users to help you debug itI First an “alpha” release, then a “beta” release

Release it as a product (version 1.0)I Make small changes as needed (1.1, 1.2, .)

Save big changes for a major new releaseI Often based on a total redesign (2.0, 3.0, )

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 15 / 62

Page 16: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The Spiral Model

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 16 / 62

Page 17: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Unpleasant Realities

The waterfall model doesn’t work wellI Requirements usually incomplete or incorrect

The spiral model is expensiveI Redesign leads to recoding and retesting

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 17 / 62

Page 18: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

A Hybrid Model

Goal: explore requirementsI Without building the complete product

Start with part of the functionalityI That will (hopefully) yield significant insight

Build a prototypeI Focus on core functionality

Use the prototype to refine the requirements

Repeat the process, expanding functionality

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 18 / 62

Page 19: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

A Hybrid Model

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 19 / 62

Page 20: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Management Issues

Operating costsI Sta↵ timeI Physical resources (space, cooling, power)I Periodic maintenanceI Equipment replacementI Retrospective conversion

Moving from “legacy systems”I Even converting electronic information is expensive!

Incremental improvementsI No piece of software is perfect

Legal constraints

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 20 / 62

Page 21: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Management Issues

Management informationI Usage logs, audit trails, etc.I Often easy to collect, di�cult to analyze

TrainingI Sta↵I Users

Privacy, security, access control

Backup and disaster recoveryI Periodicity, storage location

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 21 / 62

Page 22: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

TCO

TCO = “Total cost of ownership”

Hardware and software isn’t the only cost!

Other (hidden) costs:I Planning, installation, integrationI Disruption and migrationI Ongoing support and maintenanceI Training (of sta↵ and end users)

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 22 / 62

Page 23: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Legal Requirements

Sarbanes-Oxley (2007) - Corporations must retain and release moreinformation

PATRIOT (2001) - Government can monitor web and e-mail

Gramm-Leach-Bliley (1999) - Financial companies

Digital Millennium Copyright Act (1998) - Illegal to circumventcopyright mechanisms

Health Insurance Portability and Accountability Act (1996, HIPAA) -Rules for handling health information, establishes rules for transferringhealth records electronically

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 23 / 62

Page 24: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Outline

1 Designing Systems

2 Open Source and TCO

3 Privacy

4 Cloud Computing

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 24 / 62

Page 25: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

What is open source?

Proprietary vs. open source software

Open source used to be a crackpot idea:I Bill Gates on GNU/Linux (3/24/1999): “I don’t really think in the

commercial market, we’ll see it in any significant way.”I MS 10-Q quarterly filing (1/31/2004): “The popularization of the

open source movement continues to pose a significant challenge to thecompany’s business model”

Open sourceI For tree hugging hippies?I Make love, not war?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 25 / 62

Page 26: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Basic Definitions

What is a program?

What is source code?

What is object/executable code (binaries)?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 26 / 62

Page 27: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Proprietary Software

Distribution in machine-readable binaries only

Payment for a licenseI Grants certain usage rightsI Restrictions on copying, further distribution, modification

Analogy: buying a car . . .I With the hood welded shutI That only you can driveI That you can’t change the rims on

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 27 / 62

Page 28: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Software is copyrighted

Copyright is a legal monopoly granted by the government for alimited time to promote the arts

Law gives redress to copyright holders whose work has been infringed

Software costs nothing to copy - protects the livelihood of those whowrite software

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 28 / 62

Page 29: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Open Source Principles

Free distribution and redistribution

I “Free as in speech, not as in beer”

Source code availability

Provisions for derived works

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 29 / 62

Page 30: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Open Source vs. Proprietary

Who gets the idea to develop the software?

Who actually develops the software?

How much does it cost?

Who can make changes?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 30 / 62

Page 31: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Distinction: Free vs. Open

“Free” software is not just open source

Open source means you can view the code of a program and use itwithout charge

“Free” software means that if you distribute a program, you must alsodistribute the source

What’s the di↵erence?

Example of Open Source: Apache License1 Free to use code, can use in closed-source products

Example of Free License: Gnu Public License (viral?)1 Free to use code, but if you distribute program, must distribute code

too

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 31 / 62

Page 32: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Distinction: Free vs. Open

“Free” software is not just open source

Open source means you can view the code of a program and use itwithout charge

“Free” software means that if you distribute a program, you must alsodistribute the source

What’s the di↵erence?

Example of Open Source: Apache License1 Free to use code, can use in closed-source products

Example of Free License: Gnu Public License (viral?)1 Free to use code, but if you distribute program, must distribute code

too

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 31 / 62

Page 33: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Examples of Open Source Software

Task Proprietary OpenOS Windows GNU/LinuxIM AIM Adium

Browser Internet Explorer FirefoxImage Editor Photoshop GIMPWeb Server IIS ApacheDatabase Oracle MySQL

O�ce Suite Microsoft O�ce Open O�ce / LibreO�ce

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 32 / 62

Page 34: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Open Source: Pros

Peer-reviewed code

Dynamic community

Iterative releases, rapid bug fixes

Released by engineers, not marketing people

High quality

No vendor lock-in

Simplified licensed management

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 33 / 62

Page 35: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Pros in Detail

Peer-reviewed codeI Everyone gets to inspect the codeI More eyes, fewer bugs

Dynamic communityI Community consists of coders, testers, debuggers, users, etc.I Any person can have multiple rolesI Both volunteers and paid by companiesI Volunteers are highly-motivated to work on something that interests

them

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 34 / 62

Page 36: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Pros in Detail

Iterative releases, rapid bug fixesI Anyone can fix bugsI Bugs rapidly fixed when foundI Distribution of “patches”

Released by engineers, not marketing peopleI Stable versions ready only when they really are readyI Not dictated by marketing deadlinesI High quality

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 35 / 62

Page 37: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Pros in Detail

No vendor lock-inI Lock in: dependence on a specific program from a specific vendorI Putting content in MS Word ties you to Microsoft foreverI Open formats: can use a variety of systems

Simplified licensed managementI Can install any number of copiesI No risk of illegal copies or license auditsI No anti-piracy measures (e.g. CD keys, product activation)I No need to pay for perpetual upgradesI Doesn’t eliminate software management, of course

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 36 / 62

Page 38: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Cons of Open Source

Dead-end software

Fragmentation

Developed by engineers, often for engineers

Community development model

Inability to point fingers

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 37 / 62

Page 39: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Cons in Detail

Dead-end softwareI Development depends on community dynamics: What happens when

the community loses interest?I How is this di↵erent from the vendor dropping support for a product?

At least the source code is available

FragmentationI Code might “fork” into multiple versions: incompatibilities developI In practice, rarely happens

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 38 / 62

Page 40: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Cons in Detail

Developed by engineers, often forengineers

I My favorite “pet feature”I Engineers are not your typical users!

Community development modelI Cannot simply dictate the

development processI Must build consensus and support

within the community

Inability to point fingersI Who do you call up and yell at when

things go wrong?I Buy a support contract from a vendor!

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 39 / 62

Page 41: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Open Source Business Models

Support Sellers

Loss Leader

Widget Frosting

Accessorizing

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 40 / 62

Page 42: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

It comes down to cost. . .

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 41 / 62

Page 43: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The TCO Debate

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 42 / 62

Page 44: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Is open source right for you?

Do you have access to the necessary expertise?

Do you have buy-in from the stakeholders?

Are you willing to retool your processes?

Are you willing to retrain sta↵ and users?

Are you prepared for a period of disruption?

Have you thought through these issues?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 43 / 62

Page 45: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Open source isn’t just about software

Creative commons movement

“Copyleft”

Consistent with our Remix culture (Lawrence Lessig)

Various usage regimesI How you can change itI If you have to acknowledgeI If you charge for it

Within copyright regime

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 44 / 62

Page 46: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Outline

1 Designing Systems

2 Open Source and TCO

3 Privacy

4 Cloud Computing

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 45 / 62

Page 47: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Why Privacy is a Good Thing

Build trust in your service

Protect your users from identity theft / embarassment

Protect yourself from legal action

Philosophical / ethical reasons

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 46 / 62

Page 48: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Why Privacy is not always a Good Thing

More data leads to better serviceI “collaborative filtering”I Targeted advertising/deals

Transparency and accountability

People aren’t good at keeping track of their own information

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 47 / 62

Page 49: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The Good: Netflix Challenge

“sanitized” data

Improved predictions of algorithms on what movies you’d like

Created an invaluable real-world dataset for researchers

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 48 / 62

Page 50: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The Bad: AOL Search Queries

Collection of searches given to AOL

“landscapers in Lilburn, GA”, “shadow lake subdifivsion gwinnettcounty” revealed user id 4417749 was Thelma Arnold, a 62-year-oldwidow

Also revealed other information about users

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 49 / 62

Page 51: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Best Practices

Have a privacy statement and policy

Develop retention policies and audit them

Allow users to export / expunge their data

Consider external privacy audits

Minimize the data you collect

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 50 / 62

Page 52: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Outline

1 Designing Systems

2 Open Source and TCO

3 Privacy

4 Cloud Computing

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 51 / 62

Page 53: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

What is Cloud Computing?

Web-scale problems

Large data centers

Di↵erent models of computing

Highly-interactive Web applications

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 52 / 62

Page 54: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Web-Scale Problems

Characteristics:I Definitely data-intensiveI May also be processing intensive

Examples:I Crawling, indexing, searching, mining the WebI “Post-genomics” life sciences researchI Other scientific data (physics, astronomers, etc.)I Sensor networksI Web 2.0 applications

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 53 / 62

Page 55: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

How much data?

Wayback Machine has 2 PB + 20TB/month (2006)

Google processes 20 PB a day (2008)

“all words ever spoken by humanbeings” 5 EB

NOAA has 1 PB climate data (2007)

CERN’s LHC will generate 15 PB a year(2008)

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 54 / 62

Page 56: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 55 / 62

Page 57: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Large Data Centers

Web-scale problems? Throw more machines at it!

Clear trend: centralization of computing resources in large datacenters

I Necessary ingredients: fiber, juice, and spaceI What do Oregon, Iceland, and abandoned mines have in common?

Important Issues:I RedundancyI E�ciencyI UtilizationI Management

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 56 / 62

Page 58: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 57 / 62

Page 59: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 57 / 62

Page 60: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Key Technology: Virtualization

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 58 / 62

Page 61: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Di↵erent Computing Models

Utility computing

Why buy machines when you can rent cycles?I Examples: Amazon’s EC2, GoGrid, AppNexus

Platform as a Service (PaaS)I Give me nice API and take care of the implementationI Example: Google App Engine

Software as a Service (SaaS)I Just run it for me!I Example: Gmail

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 59 / 62

Page 62: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Web Applications

What is the nature of software applications?

From the desktop to the browserI Rise of Web-based applicationsI Examples: Google Maps, Facebook

How do we deliver highly-interactive Web-based applications?I Ajax (Asynchronous JavaScript and XML)

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 60 / 62

Page 63: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

The Grand Plan

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 61 / 62

Page 64: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Discussion Question

Privacy in a Library

What information do you need and what should you ask for?

How long should you keep each?

What if you wanted to suggest media to a user? Do you need to keepall of their data forever?

What if you wanted to allow users to pay fines online?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 62 / 62

Page 65: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Discussion Question

Privacy in a Library

What information do you need and what should you ask for?

How long should you keep each?

What if you wanted to suggest media to a user? Do you need to keepall of their data forever?

What if you wanted to allow users to pay fines online?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 62 / 62

Page 66: University of Marylandjbg/teaching/LBSC_690_2012/lecture_09.pdf · The Waterfall Model Key insight: upfront investment in design I An hour of design can save a week of debugging!

Discussion Question

Privacy in a Library

What information do you need and what should you ask for?

How long should you keep each?

What if you wanted to suggest media to a user? Do you need to keepall of their data forever?

What if you wanted to allow users to pay fines online?

LBSC 690 (Boyd-Graber) Building, Deploying, and Living With IT Systems November 19, 2012 62 / 62