Computer Science and the Socio-Economic Sciences, Fred Roberts, Rutgers University

Post on 20-Dec-2015


Page 1: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

1

Computer Science and the Socio-Economic Sciences

Fred Roberts, Rutgers University

Page 2: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

2

CS and SS
• Many recent applications in CS involve issues/problems of long interest to social scientists:
    preference, utility
    conflict and cooperation
    allocation
    incentives
    consensus
    social choice
    measurement
• Methods developed in SS are beginning to be used in CS

Page 3: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

3

CS and SS
• CS applications place great strain on SS methods
    Sheer size of problems addressed
    Computational power of agents an issue
    Limitations on information possessed by players
    Sequential nature of repeated applications
• Thus: Need for new generation of SS methods
• Also: These new methods will provide powerful tools to social scientists

Page 4: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

4

CS and SS: Outline

1. CS and Consensus/Social Choice

2. CS and Game Theory

3. Algorithmic Decision Theory

Page 6: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

6

CS and Consensus/Social Choice
• Relevant social science problems: voting, group decision making
• Goal: based on everyone’s opinions, reach a “consensus”
• Typical opinions:
    “first choice”
    ranking of all alternatives
    scores
    classifications
• Long history of research on such problems.

Page 7: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

7

CS and Consensus/Social Choice
Background: Arrow’s Impossibility Theorem: There is no “consensus method” that satisfies certain reasonable axioms about how societies should reach decisions.

Input: rankings of alternatives.
Output: consensus ranking.

Kenneth Arrow, Nobel prize winner

Page 8: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

8

CS and Consensus/Social Choice

There are widely studied and widely used consensus methods.

One well-known consensus method: “Kemeny-Snell medians”: Given a set of rankings, find a ranking minimizing the sum of distances to the other rankings.

Kemeny-Snell medians are having surprising new applications in CS.

John Kemeny, pioneer in time sharing in CS

Page 9: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

9

CS and Consensus/Social Choice
Kemeny-Snell distance between rankings: twice the number of pairs of candidates i and j for which i is ranked above j in one ranking and below j in the other, plus the number of pairs that are ranked in one ranking and tied in another.

Kemeny-Snell median: Given rankings a1, a2, …, ap, find a ranking x so that

d(a1,x) + d(a2,x) + … + d(ap,x)

is minimized.
Sometimes just called Kemeny median.

Page 10: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

10

CS and Consensus/Social Choice
a1        a2        a3
Fish      Fish      Chicken
Chicken   Chicken   Fish
Beef      Beef      Beef

Median = a1. If x = a1:

d(a1,x) + d(a2,x) + d(a3,x) = 0 + 0 + 2 = 2

is minimized.
If x = a3, the sum is 4.
For any other x, the sum is at least 1 + 1 + 1 = 3.

Page 11: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

11

CS and Consensus/Social Choice
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken

Three medians = a1, a2, a3.

This is the “voter’s paradox” situation.

Page 12: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

12

CS and Consensus/Social Choice
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken

Note that sometimes we wish to minimize

d(a1,x)^2 + d(a2,x)^2 + … + d(ap,x)^2

A ranking x that minimizes this is called a Kemeny-Snell mean.

In this example, there is one mean: the ranking declaring all three alternatives tied.

Page 13: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

13

CS and Consensus/Social Choice
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken

If x is the ranking declaring Fish, Chicken and Beef tied, then

d(a1,x)^2 + d(a2,x)^2 + d(a3,x)^2 = 3^2 + 3^2 + 3^2 = 27.

Not hard to show this is minimum.
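The median and mean examples above can be checked by brute force. The following is a minimal sketch, assuming a ranking is stored as a dict from alternative to level (0 = top, equal levels = tied); the helper names (weak_orders, ks_distance, consensus, ranking) are illustrative and not from the talk. It enumerates all 13 rankings-with-ties of three alternatives, which is only feasible because the example is tiny.

```python
from itertools import combinations, product

ITEMS = ("Fish", "Chicken", "Beef")

def weak_orders(items):
    """Yield every ranking with ties as a dict item -> level (0 = top)."""
    seen = set()
    for levels in product(range(len(items)), repeat=len(items)):
        # Compress levels to 0, 1, 2, ... so equivalent assignments coincide.
        dense = {v: r for r, v in enumerate(sorted(set(levels)))}
        key = tuple(dense[v] for v in levels)
        if key not in seen:
            seen.add(key)
            yield dict(zip(items, key))

def sign(n):
    return (n > 0) - (n < 0)

def ks_distance(a, b):
    """2 per pair ordered oppositely, 1 per pair ranked in one and tied in the other."""
    return sum(abs(sign(a[i] - a[j]) - sign(b[i] - b[j]))
               for i, j in combinations(a, 2))

def consensus(profile, power):
    """Return (best score, rankings x) minimizing the sum of d(a, x)**power."""
    scored = [(sum(ks_distance(a, x) ** power for a in profile), x)
              for x in weak_orders(ITEMS)]
    best = min(score for score, _ in scored)
    return best, [x for score, x in scored if score == best]

def ranking(*order):
    """Build a strict ranking from best to worst."""
    return {item: level for level, item in enumerate(order)}

first = [ranking("Fish", "Chicken", "Beef"),
         ranking("Fish", "Chicken", "Beef"),
         ranking("Chicken", "Fish", "Beef")]
cyclic = [ranking("Fish", "Chicken", "Beef"),
          ranking("Chicken", "Beef", "Fish"),
          ranking("Beef", "Fish", "Chicken")]

print(consensus(first, power=1))   # median of the first example
print(consensus(cyclic, power=1))  # medians of the voter's paradox profile
print(consensus(cyclic, power=2))  # mean of the voter's paradox profile
```

With power=1 this searches for Kemeny-Snell medians, with power=2 for Kemeny-Snell means. Exhaustive search is used only for illustration; it does not scale, in line with the hardness result on the next slide.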

Page 14: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

14

CS and Consensus/Social Choice

Theorem (Bartholdi, Tovey, and Trick, 1989; Wakabayashi, 1986): Computing the Kemeny median of a set of rankings is an NP-complete problem.

Page 15: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

15

Meta-search and Collaborative Filtering
Meta-search
• A consensus problem
• Combine page rankings from several search engines
• Dwork, Kumar, Naor, Sivakumar (2000): Kemeny-Snell medians are good at spam resistance in meta-search (a page spams if it causes the meta-search to rank it too highly)
• Approximation methods make this computationally tractable

Page 16: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

16

Meta-search and Collaborative Filtering

Collaborative Filtering

• Recommending books or movies
• Combine book or movie ratings
• Produce ordered list of books or movies to recommend
• Freund, Iyer, Schapire, Singer (2003): “Boosting” algorithm for combining rankings.
• Related topic: Recommender Systems

Page 17: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

17

Meta-search and Collaborative Filtering

A major difference from SS applications:

• In SS applications, number of voters is large, number of candidates is small.

• In CS applications, number of voters (search engines) is small, number of candidates (pages) is large.

• This makes for major new complications and research challenges.

Page 18: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

18

Large Databases and Inference

• Real data often in form of sequences
• GenBank has over 7 million sequences comprising 8.6 billion bases.
• The search for similarity or patterns has extended from pairs of sequences to finding patterns that appear in common in a large number of sequences or throughout the database: “consensus sequences”.

• Emerging field of “Bioconsensus”: applies SS consensus methods to biological databases.

Page 19: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

19

Large Databases and Inference

Why look for such patterns?

Similarities between sequences or parts of sequences lead to the discovery of shared phenomena.

For example, it was discovered that the sequence for platelet-derived growth factor, which causes growth in the body, is 87% identical to the sequence for v-sis, a cancer-causing gene. This led to the discovery that v-sis works by stimulating growth.

Page 20: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

20

Large Databases and Inference
Example

Bacterial Promoter Sequences studied by Waterman (1989):

RRNABP1: ACTCCCTATAATGCGCCA
TNAA:    GAGTGTAATAATGTAGCC
UVRBP2:  TTATCCAGTATAATTTGT
SFC:     AAGCGGTGTTATAATGCC

Notice that if we are looking for patterns of length 4, each sequence has the pattern TAAT.

Page 23: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

23

Large Databases and Inference Example

However, suppose that we add another sequence:

M1 RNA: AACCCTCTATACTGCGCG

The pattern TAAT does not appear here.
However, it almost appears, since the pattern TACT appears, and this has only one mismatch from the pattern TAAT.

So, in some sense, the pattern TAAT is a good consensus pattern.
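The claim that TAAT is a good consensus pattern can be checked directly on the five sequences quoted in the slides. This is a small sketch; the function name closest_windows and the dictionary layout are illustrative choices, not part of Waterman's method.

```python
PROMOTERS = {
    "RRNABP1": "ACTCCCTATAATGCGCCA",
    "TNAA": "GAGTGTAATAATGTAGCC",
    "UVRBP2": "TTATCCAGTATAATTTGT",
    "SFC": "AAGCGGTGTTATAATGCC",
    "M1 RNA": "AACCCTCTATACTGCGCG",
}

def closest_windows(pattern, sequence):
    """Return (fewest mismatches, sorted distinct windows achieving it)."""
    k = len(pattern)
    windows = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    # Count position-by-position mismatches between the pattern and each window.
    counts = [sum(p != w for p, w in zip(pattern, win)) for win in windows]
    fewest = min(counts)
    return fewest, sorted({win for win, c in zip(windows, counts) if c == fewest})

for name, seq in PROMOTERS.items():
    print(name, closest_windows("TAAT", seq))
```

For the four promoter sequences the pattern occurs exactly (zero mismatches); for M1 RNA the closest window is TACT with one mismatch.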

Page 24: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

24

Large Databases and Inference Example

We make this precise using best mismatch distance.

Consider two sequences a and b with b longer than a.

Then d(a,b) is the smallest number of mismatches in all possible alignments of a as a consecutive subsequence of b.

Page 25: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

25

Large Databases and Inference Example

a = 0011, b = 111010

Possible Alignments:
111010    111010    111010
0011       0011       0011

The best-mismatch distance is 2, which is achieved in the third alignment.
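A minimal sketch of the best mismatch distance as defined above, assuming both sequences are plain strings; the function names are illustrative. It slides a along b and counts mismatches at each offset.

```python
def mismatches_by_offset(a, b):
    """Mismatch count for each alignment of a as a consecutive block of b."""
    if len(a) > len(b):
        raise ValueError("a must not be longer than b")
    return [sum(x != y for x, y in zip(a, b[i:i + len(a)]))
            for i in range(len(b) - len(a) + 1)]

def best_mismatch_distance(a, b):
    """Smallest number of mismatches over all alignments of a within b."""
    return min(mismatches_by_offset(a, b))

print(mismatches_by_offset("0011", "111010"))    # one count per alignment
print(best_mismatch_distance("0011", "111010"))
```

For the slide's example the three alignments give 3, 3 and 2 mismatches, so the distance is 2.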

Page 26: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

26

Large Databases and Inference Example

Now given a database of sequences a1, a2, …, an.
Look for a pattern of length k.
One standard method (Smith-Waterman): look for a consensus sequence b that minimizes

Σ_i [k − d(b,ai)]/d(b,ai),

where d is best mismatch distance.

In fact, this turns out to be equivalent to calculating medians like Kemeny-Snell medians.

Algorithms for computing consensus sequences are important in modern molecular biology.

Page 27: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

27

Large Databases and Inference
Preferential Queries

• Look for flight from New York to Beijing
• Have preferences for
    airline
    itinerary
    type of ticket

• Try to combine responses from multiple travel-related websites

• Sequential decision making: Next query or information access depends on prior responses.

Page 28: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

28

Consensus Computing, Image Processing
• Old SS problem: Dynamic modeling of how individuals change opinions over time, eventually reaching consensus.
• Often use dynamic models on graphs
• Related to neural nets.
• CS applications: distributed computing.
• Values of processors in a network are updated until all have same value.

Page 29: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

29

Consensus Computing, Image Processing
• CS application: Noise removal in digital images
• Does a pixel level represent noise?
• Compare neighboring pixels.
• If values beyond threshold, replace pixel value with mean or median of values of neighbors.
• Related application in distributed computing.
• Values of faulty processors are replaced by those of neighboring non-faulty ones.
• Berman and Garay (1993) use “parliamentary procedure” called cloture
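One possible reading of the noise-removal rule above, as a sketch only: the slides do not say exactly what is compared to the threshold, so this version assumes a pixel is noisy when it differs from the median of its (up to eight) neighbors by more than the threshold. The function name denoise and the list-of-lists image format are hypothetical.

```python
from statistics import median

def denoise(image, threshold):
    """Replace pixels that differ from their neighbors' median by more than threshold."""
    rows, cols = len(image), len(image[0])
    cleaned = [row[:] for row in image]  # read from image, write to the copy
    for r in range(rows):
        for c in range(cols):
            neighbors = [image[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))
                         if (rr, cc) != (r, c)]
            consensus_value = median(neighbors)
            if abs(image[r][c] - consensus_value) > threshold:
                cleaned[r][c] = consensus_value
    return cleaned

noisy = [[10, 10, 10],
         [10, 200, 10],
         [10, 10, 10]]
print(denoise(noisy, threshold=50))
```

Using the median rather than the mean keeps a single extreme neighbor from dragging the replacement value, which is the same robustness argument made for medians as consensus methods.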

Page 30: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

30

Computational Intractability of Consensus Functions

• Bartholdi, Tovey and Trick: There are voting schemes where it can be computationally intractable to determine who won an election.

• Computational intractability can be a good thing in an election: Designing voting systems where it is computationally intractable to “manipulate” the outcome of an election by “insincere voting”:
    Adding voters
    Declaring voters ineligible
    Adding candidates
    Declaring candidates ineligible

Page 31: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

31

Electronic Voting

• Issues:
    Correctness
    Anonymity
    Availability
    Security
    Privacy

Page 32: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

32

Electronic Voting
Security Risks in Electronic Voting

• Threat of “denial of service attacks”
• Threat of penetration attacks involving a delivery mechanism to transport a malicious payload to target host (through Trojan horse or remote control program)
• Private and correct counting of votes
• Cryptographic challenges to keep votes private
• Relevance of work on secure multiparty computation

Page 33: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

33

Electronic Voting

Other CS Challenges:

• Resistance to “vote buying”
• Development of user-friendly interfaces
• Vulnerabilities of communication path between the voting client (where you vote) and the server (where votes are counted)

• Reliability issues: random hardware and software failures

Page 34: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

34

Software & Hardware Measurement
• Theory of measurement developed by mathematical social scientists
• Measurement theory studies ways to combine scores obtained on different criteria.
• A statement involving scales of measurement is considered meaningful if its truth or falsity is unchanged under acceptable transformations of all scales involved.

• Example: It is meaningful to say that I weigh more than my daughter.

• That is because if it is true in kilograms, then it is also true in pounds, in grams, etc.

Page 35: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

35

Software & Hardware Measurement
• Measurement theory has studied what statements you can make after averaging scores.
• Think of averaging as a consensus method.
• One general principle: To say that the average score of one set of tests is greater than the average score of another set of tests is not meaningful (it is meaningless) under certain conditions.
• This is often the case if the averaging procedure is to take the arithmetic mean: If s(xi) is score of xi, i = 1, 2, …, n, then arithmetic mean is

Σ_i s(xi)/n.

• Long literature on what averaging methods lead to meaningful conclusions.

Page 36: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

36

Software & Hardware Measurement
A widely used method in hardware measurement:
    Score a computer system on different benchmarks.
    Normalize score relative to performance of one base system.
    Average normalized scores.
    Pick system with highest average.
Fleming and Wallace (1986): Outcome can depend on choice of base system.
    Meaningless in sense of measurement theory
    Leads to theory of merging normalized scores

Page 37: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

37

Software & Hardware Measurement
Hardware Measurement

                    BENCHMARK
PROCESSOR   E      F     G      H        I
R           417    83    66     39,449   772
M           244    70    153    33,527   368
Z           134    70    135    66,000   369

Data from Heath, Comput. Archit. News (1984)

Page 38: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

38

Software & Hardware Measurement
Normalize Relative to Processor R

(raw score, normalized score in parentheses)

                    BENCHMARK
PROCESSOR   E            F           G            H               I
R           417 (1.00)   83 (1.00)   66 (1.00)    39,449 (1.00)   772 (1.00)
M           244 (.59)    70 (.84)    153 (2.32)   33,527 (.85)    368 (.48)
Z           134 (.32)    70 (.85)    135 (2.05)   66,000 (1.67)   369 (.45)

Page 40: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

40

Software & Hardware Measurement
Take Arithmetic Mean of Normalized Scores

(raw score, score normalized relative to processor R in parentheses)

                    BENCHMARK
PROCESSOR   E            F           G            H               I             ARITHMETIC MEAN
R           417 (1.00)   83 (1.00)   66 (1.00)    39,449 (1.00)   772 (1.00)    1.00
M           244 (.59)    70 (.84)    153 (2.32)   33,527 (.85)    368 (.48)     1.01
Z           134 (.32)    70 (.85)    135 (2.05)   66,000 (1.67)   369 (.45)     1.07

Conclude that machine Z is best

Page 41: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

41

Software & Hardware Measurement
Now Normalize Relative to Processor M

(raw score, normalized score in parentheses)

                    BENCHMARK
PROCESSOR   E            F           G            H               I
R           417 (1.71)   83 (1.19)   66 (.43)     39,449 (1.18)   772 (2.10)
M           244 (1.00)   70 (1.00)   153 (1.00)   33,527 (1.00)   368 (1.00)
Z           134 (.55)    70 (1.00)   135 (.88)    66,000 (1.97)   369 (1.00)

Page 43: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

43

Software & Hardware Measurement
Take Arithmetic Mean of Normalized Scores

(raw score, score normalized relative to processor M in parentheses)

                    BENCHMARK
PROCESSOR   E            F           G            H               I             ARITHMETIC MEAN
R           417 (1.71)   83 (1.19)   66 (.43)     39,449 (1.18)   772 (2.10)    1.32
M           244 (1.00)   70 (1.00)   153 (1.00)   33,527 (1.00)   368 (1.00)    1.00
Z           134 (.55)    70 (1.00)   135 (.88)    66,000 (1.97)   369 (1.00)    1.08

Conclude that machine R is best

Page 44: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

44

Software and Hardware Measurement
• So, the conclusion that a given machine is best by taking arithmetic mean of normalized scores is meaningless in this case.
• Above example from Fleming and Wallace (1986), data from Heath (1984)
• Sometimes, geometric mean is helpful.
• Geometric mean is

[Π_i s(xi)]^(1/n)

Page 45: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

45

Software & Hardware Measurement
Normalize Relative to Processor R

(raw score, normalized score in parentheses)

                    BENCHMARK
PROCESSOR   E            F           G            H               I             GEOMETRIC MEAN
R           417 (1.00)   83 (1.00)   66 (1.00)    39,449 (1.00)   772 (1.00)    1.00
M           244 (.59)    70 (.84)    153 (2.32)   33,527 (.85)    368 (.48)     .86
Z           134 (.32)    70 (.85)    135 (2.05)   66,000 (1.67)   369 (.45)     .84

Conclude that machine R is best

Page 46: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

46

Software & Hardware Measurement
Now Normalize Relative to Processor M

(raw score, normalized score in parentheses)

                    BENCHMARK
PROCESSOR   E            F           G            H               I             GEOMETRIC MEAN
R           417 (1.71)   83 (1.19)   66 (.43)     39,449 (1.18)   772 (2.10)    1.17
M           244 (1.00)   70 (1.00)   153 (1.00)   33,527 (1.00)   368 (1.00)    1.00
Z           134 (.55)    70 (1.00)   135 (.88)    66,000 (1.97)   369 (1.00)    .99

Still conclude that machine R is best

Page 47: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

47

Software and Hardware Measurement
• In this situation, it is easy to show that the conclusion that a given machine has highest geometric mean normalized score is a meaningful conclusion.

• Even meaningful: A given machine has geometric mean normalized score 20% higher than another machine.

• Fleming and Wallace give general conditions under which comparing geometric means of normalized scores is meaningful.

• Research area: what averaging procedures make sense in what situations? Large literature.

• Note: There are situations where comparing arithmetic means is meaningful but comparing geometric means is not.
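The base-system effect can be reproduced from the raw Heath data quoted in the tables. This is a sketch with illustrative helper names (normalized, best); it works from unrounded ratios, so its means can differ in the last digit from the slide tables, which average rounded entries.

```python
from math import prod

# Raw benchmark scores (benchmarks E, F, G, H, I) from the slides.
SCORES = {
    "R": [417, 83, 66, 39449, 772],
    "M": [244, 70, 153, 33527, 368],
    "Z": [134, 70, 135, 66000, 369],
}

def normalized(base):
    """Divide every processor's scores by the base processor's scores."""
    return {p: [s / b for s, b in zip(row, SCORES[base])]
            for p, row in SCORES.items()}

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    return prod(xs) ** (1 / len(xs))

def best(base, mean):
    """Processor with the highest mean normalized score for this base."""
    table = normalized(base)
    return max(table, key=lambda p: mean(table[p]))

for base in SCORES:
    print(base, best(base, arithmetic_mean), best(base, geometric_mean))
```

The arithmetic-mean winner changes with the base (Z with base R, R with base M), while the geometric-mean winner does not. The reason is that changing the base divides every processor's geometric mean by the same constant, so ratios between processors are preserved.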

Page 48: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

48

Software and Hardware Measurement

• Message from measurement theory to computer science:

Do not perform arithmetic operations on data without paying attention to whether the conclusions you get are meaningful.

Page 49: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

49

CS and SS: Outline

1. CS and Consensus/Social Choice

2. CS and Game Theory

3. Algorithmic Decision Theory

Page 50: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

50

CS and Game Theory
• Game theory has a long history in economics; also in operations research, mathematics

• Recently, computer scientists discovering relevance to their problems

• Increasingly complex games arise in practical applications: auctions, Internet

• Need new game-theoretic methods for CS problems.

• Need new CS methods to solve modern game theory problems.

Page 51: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

51

CS and Game Theory: Algorithmic Issues

Nash Equilibrium

• Each player chooses a strategy
• If no player can benefit by changing his strategy while others leave theirs unchanged, we are in Nash equilibrium.
• In 1951, Nash showed every finite game has a Nash equilibrium (possibly in mixed strategies).
• How hard is this to compute?

John Nash, Nobel prize winner

Page 52: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

52

Example: Nash Equilibrium
• 2-player game
• Strategy = number between 0 and 3
• Both players win the lower of the two amounts.
• Player with higher amount pays $2 to player with lower amount

Payoffs (player 1, player 2):

                     Player 2 strategy
Player 1 strategy    0       1       2       3
0                    0,0     2,-2    2,-2    2,-2
1                    -2,2    1,1     3,-1    3,-1
2                    -2,2    -1,3    2,2     4,0
3                    -2,2    -1,3    0,4     3,3

Source: Wikipedia

Page 56: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

56

Example: Nash Equilibrium
• 0-0 is the unique Nash equilibrium
• From any other pair of strategies, one player can lower his number and improve his payoff.
• E.g.: From 2-2, player 1 lowers his number to 1 (or player 2 lowers his to 1)

Payoffs (player 1, player 2):

                     Player 2 strategy
Player 1 strategy    0       1       2       3
0                    0,0     2,-2    2,-2    2,-2
1                    -2,2    1,1     3,-1    3,-1
2                    -2,2    -1,3    2,2     4,0
3                    -2,2    -1,3    0,4     3,3

Source: Wikipedia
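The uniqueness claim can be checked by enumerating all 16 strategy pairs. This sketch rebuilds the payoffs from the rules of the game (both win the lower amount, the higher bidder pays $2 to the lower) and only looks at pure strategies; the function names are illustrative.

```python
from itertools import product

STRATEGIES = range(4)  # each player picks a number from 0 to 3

def payoff(s1, s2):
    """Return (player 1 payoff, player 2 payoff) for the chosen numbers."""
    low = min(s1, s2)
    if s1 == s2:
        return (low, low)
    if s1 < s2:
        return (low + 2, low - 2)  # player 2 chose higher and pays $2
    return (low - 2, low + 2)      # player 1 chose higher and pays $2

def is_pure_nash(s1, s2):
    """True if neither player gains by unilaterally changing his number."""
    u1, u2 = payoff(s1, s2)
    no_gain_1 = all(payoff(d, s2)[0] <= u1 for d in STRATEGIES)
    no_gain_2 = all(payoff(s1, d)[1] <= u2 for d in STRATEGIES)
    return no_gain_1 and no_gain_2

equilibria = [pair for pair in product(STRATEGIES, repeat=2) if is_pure_nash(*pair)]
print(equilibria)
```

Exhaustive checking works here because the game is tiny; the next slide is about what happens when it is not.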

Page 57: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

57

CS and Game Theory: Algorithmic Issues

Nash Equilibrium
• 2-player games: can use linear programming methods.
• Recent powerful result (Daskalakis, Goldberg, Papadimitriou 2005): for 4-player games, problem is PPAD-complete.
• (PPAD: class of search problems where solution is known to exist by graph-theoretic arguments.)
• PPAD-complete means: If a polynomial algorithm exists, then one exists for computing Brouwer fixed points, which seems unlikely.

Page 58: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

58

CS and Game Theory: Algorithmic Issues

Other Algorithmic Challenges
• Repeated games.
• Issues of sequential decision making
• Issues of learning to play
• Other “solution concepts” in multi-player games: “power indices” (Shapley, Banzhaf, Coleman)
    Need to calculate them for huge games
    Mostly computationally intractable
    Arise in many applications in CS, e.g., multicasting

Page 59: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

59

Computational Issues in Auction Design

• Auctions increasingly used in business and government.

• Information technology allows complex auctions with huge number of bidders.

• Auctions are unusually complicated games.

Page 60: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

60

Computational Issues in Auction Design

Bidding functions maximizing expected profit can be exceedingly difficult to compute.

Determining the winner of an auction can be extremely hard. (Rothkopf, Pekec, Harstad 1998)

Page 61: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

61

Computational Issues in Auction Design

Combinatorial Auctions

• Multiple goods auctioned off.
• Submit bids for combinations of goods.
• This leads to NP-complete allocation problems.
• Might not even be able to feasibly express all possible preferences for all subsets of goods.
• Rothkopf, Pekec, Harstad (1998): determining winner is computationally tractable for many economically interesting kinds of combinations.

Page 62: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

62

Computational Issues in Auction Design
Some other Issues:

• Internet auctions: Unsuccessful bidders learn from previous auctions.

• Issues of learning in repeated plays of a game.

• Related to software agents acting on behalf of humans in electronic marketplaces based on auctions.

• Cryptographic methods needed to preserve privacy of participants.

Page 63: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

63

Allocating/Sharing Costs & Revenues
• Game-theoretic solutions have long been used to allocate costs to different users in shared projects.
    Allocating runway fees in airports
    Allocating highway fees to trucks of different sizes
    Universities sharing library facilities
    Fair allocation of telephone calling charges among users sharing complex phone systems (Cornell’s experiment)

Page 64: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

64

Allocating/Sharing Costs & Revenues
Shapley Value

• Shapley value assigns a payoff to each player in a multi-player game.

• Consider a game in which some coalitions of players win and some lose, with no subset of a losing coalition winning.

• Consider a coalition forming at random, one player at a time.

• A player i is pivotal if addition of i throws coalition from losing to winning.

• Shapley value of i = probability i is pivotal if an order of players is chosen at random.

• In such games with winners/losers, called Shapley-Shubik power index.

Lloyd Shapley

Page 65: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

65

Allocating/Sharing Costs & Revenues
Shapley Value

Example: Board of Directors of Company
    Shareholder 1 holds 3 shares.
    Shareholders 2, 3, 4, 5, 6, 7 hold 1 share each.
    A majority of shares are needed to make a decision.
    Coalition {1,4,6} is winning.
    Coalition {2,3,4,5,6} is winning.

Shareholder 1 is pivotal if he is 3rd, 4th, or 5th.
So shareholder 1’s Shapley value is 3/7.
Sum of Shapley values is 1 (since they are probabilities).
Thus, each other shareholder has Shapley value (4/7)/6 = 2/21.
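The board-of-directors numbers can be confirmed by enumerating all 7! orderings of the shareholders and counting how often each one is pivotal. This is a brute-force sketch with illustrative names; it uses exact fractions so the results can be compared with 3/7 and 2/21 directly.

```python
from fractions import Fraction
from itertools import permutations

SHARES = {1: 3, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}
QUOTA = sum(SHARES.values()) // 2 + 1  # strict majority of 9 shares = 5

def shapley_shubik(shares, quota):
    """Probability that each player is pivotal in a uniformly random order."""
    pivots = dict.fromkeys(shares, 0)
    orders = 0
    for order in permutations(shares):
        orders += 1
        total = 0
        for player in order:
            total += shares[player]
            if total >= quota:  # this player turns the coalition into a winning one
                pivots[player] += 1
                break
    return {p: Fraction(count, orders) for p, count in pivots.items()}

index = shapley_shubik(SHARES, QUOTA)
print(index)
```

Enumeration over all orderings grows factorially with the number of players, which is one reason power indices are described in this talk as mostly computationally intractable for huge games.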

Page 66: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

66

Allocating/Sharing Costs & Revenues
Shapley Value

Allocating Runway Fees at Airports
    Larger planes require longer runways.
    Divide runways into meter-long segments.
    Each month, we know how many landings a plane has made.
    Given a runway of length y meters, consider a game in which the players are landings and a coalition “wins” if the runway is not long enough for planes in the coalition.

Page 67: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

67

Allocating/Sharing Costs & Revenues
Shapley Value

Allocating Runway Fees at Airports

A landing is pivotal if it is the first landing added that makes a coalition require a longer runway.

The Shapley value gives the cost of the yth meter of runway to a given landing.

We then add up these costs over all runway lengths a plane requires and all landings it makes.
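A sketch of the resulting fee computation, under assumptions not stated in the slides: every runway segment costs the same amount, and each landing is described only by the number of segments it requires. In a random order, a landing is the first one needing the yth segment with probability 1/(number of landings needing it), so each segment's cost is split equally among the landings that need it. The function name runway_fees and the sample data are hypothetical.

```python
from fractions import Fraction

def runway_fees(required_segments, cost_per_segment=1):
    """Fee per landing: each segment's cost is split equally among its users."""
    fees = [Fraction(0)] * len(required_segments)
    for y in range(1, max(required_segments) + 1):
        users = [i for i, need in enumerate(required_segments) if need >= y]
        for i in users:
            fees[i] += Fraction(cost_per_segment, len(users))
    return fees

# Three landings needing 1, 2 and 3 segments respectively (made-up data).
print(runway_fees([1, 2, 3]))
```

The fees add up to the cost of the longest runway needed, and a landing never pays for segments beyond its own requirement.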

Page 68: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

68

Allocating/Sharing Costs & Revenues

Multicasting

• Applications in multicasting.
• Unicast routing: Each packet sent from a source is delivered to a single receiver.
• Sending it to multiple sites: Send multiple copies and waste bandwidth.
• In multicast routing: Use a directed tree connecting source to all receivers.
• At branch points, a packet is duplicated as necessary.

Page 69: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

69

Multicasting

Page 70: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

70

Allocating/Sharing Costs & Revenues

Multicasting

• Multicast routing: Use a directed tree connecting source to all receivers.

• At branch points, a packet is duplicated as necessary.

• Bandwidth is not directly attributable to a single receiver.

• How to distribute costs among receivers?
• One idea: Use Shapley value.

Page 71: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

71

Allocating/Sharing Costs & Revenues
• Feigenbaum, Papadimitriou, Shenker (2001): no feasible implementation for Shapley value in multicasting.
• Note: Shapley value is uniquely characterized by four simple axioms.
• Sometimes we state axioms as general principles we want a solution concept to have.
• Jain and Vazirani (1998): polynomial time computable cost-sharing algorithm
    Satisfying some important axioms
    Calculating cost of optimum multicast tree within factor of two of optimal.

Page 72: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

72

Bounded Rationality

• Traditional game theory assumption: Strategic agents are fully rational; can completely reason about consequences of their actions.

• But: Consider bounded computational power.

Page 73: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

73

Bounded Rationality
Some issues:

• Looking at bounded rationality as bounded recall in repeated games.

• Modeling bounded rationality when strategies are limited to those implementable on finite state automata

• What are optimal strategies in large, complex games arising in CS applications for players with bounded computational power?

• E.g.: How do players with limited computational power determine minimal bid increases in an auction to transform losing bids into winning ones?

Page 74: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

74

Streaming Data in Game Theory

Streaming Data Analysis:
• When you only have one shot at the data as it streams by
• Widely used to detect trends and sound alarms in applications in telecommunications and finance

• AT&T uses this to detect fraudulent use of credit cards or impending billing defaults

• Other relevant work: methods for detecting fraudulent behavior in financial systems

Page 75: 1 Computer Science and the Socio- Economic Sciences Fred Roberts, Rutgers University

75

Streaming Data in Game Theory

Streaming Data Analysis:

• “One pass” mechanism of interest in game theory-based allocation schemes in multicasting: Herzog, Shenker, Estrin (1997)
• Arises in on-line auctions.
    Need to develop bidding strategies if only one pass is allowed

CS and SS: Outline

1. CS and Consensus/Social Choice

2. CS and Game Theory

3. Algorithmic Decision Theory

Algorithmic Decision Theory

• Decision makers in many fields (engineering, medicine, economics, …) have:

Remarkable new technologies to use

Huge amounts of information to help them

Ability to share information at unprecedented speeds and quantities

Algorithmic Decision Theory

• These tools bring daunting new problems:

Massive amounts of data are often incomplete, unreliable, or distributed

Interoperating/distributed decision makers and decision making devices need coordination

Many sources of data need to be fused into a good decision.

• There are few highly efficient algorithms to support decisions.

Sequential Decision Making

• Making some decisions before all data is in.

• Sequential decision problems arise in:

Communication networks: testing connectivity, paging cellular customers, sequencing tasks

Manufacturing: testing machines, fault diagnosis, routing customer service calls

Sequential Decision Making

• Sequential decision problems arise in:

Artificial Intelligence: optimal derivation strategies in knowledge bases, best-value satisficing search, coding decision tables

Medicine: diagnosing patients, sequencing treatments

Sequential Decision Making

Online Text Filtering Algorithms

• We seek to identify “interesting” documents from a stream of documents

• Widely studied problem in machine learning

Sequential Decision Making

Online Text Filtering Algorithms: A Model

• As a document arrives, need to decide whether or not to present it to an oracle

• If document presented to oracle and is interesting, get r reward units.

• If presented and not interesting, get penalty of c units.

• What is a strategy for maximizing expected payoff?

• See Fradkin and Littman (2005) for recent work using sequential decision making methods
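As a rough illustration of the payoff model above, consider the simplest possible strategy: a myopic one that decides on each document in isolation. The sketch below assumes, beyond what the slide states, that the filter already has an estimate `p` of the probability that a document is interesting and that not presenting a document yields payoff 0. It ignores the value of learning from the oracle's feedback, which is what the sequential methods cited above address. The function names are hypothetical.

```python
def expected_payoff(p, r, c):
    """Expected payoff of presenting a document that is interesting
    with (assumed known) probability p: reward r, penalty c."""
    return p * r - (1 - p) * c


def presentation_threshold(r, c):
    """Probability above which presenting has positive expected payoff.

    Solving p*r - (1 - p)*c > 0 for p gives p > c / (r + c).
    """
    return c / (r + c)


def should_present(p, r, c):
    """Myopic rule: present only if the expected payoff is positive."""
    return expected_payoff(p, r, c) > 0
```

For example, with r = 3 and c = 1 the threshold is 0.25, so under these assumptions a document judged 30% likely to be interesting would be presented and one judged 20% likely would not.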

Inspection Problems

• Inspection problem: in what order to do tests to inspect containers for drugs, bombs, etc.?

• Do we inspect? What test do we do next? How do outcomes of earlier tests affect this decision?

• Simplest case: Entities being inspected need to be classified as ok (0) or suspicious (1).

• Binary decision tree model for testing.

• Follow left branch if ok, right branch if suspicious.

• Find cost-minimizing binary decision tree.

Inspection Problems

Follow left branch if ok, right branch if suspicious.

Sequential Decision Making Problem

Some More Details:

• Containers have attributes, each in a number of states

• Sample attributes:

Levels of certain kinds of chemicals or biological materials

Whether or not there are items of a certain kind in the cargo list

Whether cargo was picked up in a certain port

Sequential Decision Making Problem

• Simplest Case: Attributes are in state 0 or 1

• State 1 means the container has the attribute and that is suspicious.

• Then: Container is a binary string like 011001

• So: Classification is a decision function F that assigns each binary string to a category 0 or 1: A Boolean function.

011001 → F(011001)

If attributes 2, 3, and 6 are present and others are not, assign container to category F(011001).

Binary Decision Tree Approach

• Reach category 1 from the root by:

a0 L to a1 R to a2 R to 1, or

a0 R to a2 R to 1

• Container classified in category 1 iff it has a1 and a2 and not a0, or a0 and a2 and possibly a1.

• Corresponding Boolean function: F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.
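The correspondence between the tree and the Boolean function can be checked mechanically. The sketch below encodes the tree described on this slide as nested tuples (an encoding chosen here for illustration, not from the slides), reading each string as a0 a1 a2 and assigning category 0 to every leaf not on one of the two listed paths, as "F(abc) = 0 otherwise" implies.

```python
from itertools import product

# A tree is either a leaf (category 0 or 1) or a tuple
# (attribute_index, left_subtree, right_subtree).
# Left branch = attribute absent (0), right branch = attribute present (1).
TREE = (0,
        (1, 0, (2, 0, 1)),  # a0 = 0: test a1, then a2 if a1 is present
        (2, 0, 1))          # a0 = 1: test a2 only


def classify(tree, bits):
    """Walk the tree for a container given as a string such as '011'."""
    while isinstance(tree, tuple):
        attr, left, right = tree
        tree = right if bits[attr] == "1" else left
    return tree


def F(bits):
    """The Boolean function stated on the slide."""
    return 1 if bits in {"111", "101", "011"} else 0


# Compare the tree and F on all 8 binary strings of length 3.
all_strings = ["".join(b) for b in product("01", repeat=3)]
agrees = all(classify(TREE, s) == F(s) for s in all_strings)
```

Exhaustive comparison is feasible here only because there are three attributes; with n attributes there are 2^n strings to check.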

Binary Decision Tree Approach

• This binary decision tree corresponds to the same Boolean function

F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.

However, it has one fewer observation node ai. So, it is more efficient if all observations are equally costly and equally likely.
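The tree pictured on the slide is not recoverable from this transcript, so the sketch below uses one possible three-node tree for the same function (test a2 first, then a0, then a1) and compares it with the four-node tree described earlier. The tuple encoding and function names are assumptions made for illustration. Unit test costs and equally likely strings are assumed, matching the condition stated above.

```python
from itertools import product

# Trees are leaves (0 or 1) or tuples (attribute_index, left, right);
# left = attribute absent (0), right = attribute present (1).
FOUR_NODE_TREE = (0, (1, 0, (2, 0, 1)), (2, 0, 1))
THREE_NODE_TREE = (2, 0, (0, (1, 0, 1), 1))  # hypothetical equivalent tree


def classify_with_count(tree, bits):
    """Return (category, number of tests performed) for one container."""
    tests = 0
    while isinstance(tree, tuple):
        attr, left, right = tree
        tests += 1
        tree = right if bits[attr] == "1" else left
    return tree, tests


def observation_nodes(tree):
    """Count the internal (test) nodes of a tree."""
    if not isinstance(tree, tuple):
        return 0
    _, left, right = tree
    return 1 + observation_nodes(left) + observation_nodes(right)


def expected_tests(tree, n_attributes=3):
    """Average number of tests when all binary strings are equally likely
    and every test costs one unit."""
    strings = ["".join(b) for b in product("01", repeat=n_attributes)]
    total = sum(classify_with_count(tree, s)[1] for s in strings)
    return total / len(strings)
```

Under these assumptions the four-node tree averages 2.25 tests per container and the three-node tree 1.75. With unequal test costs or a non-uniform distribution over strings, the comparison could come out differently.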

Binary Decision Tree Approach

• Realistic problem much more difficult:

Test result errors

Tests cost different amounts of money and take different amounts of time

There are queues to wait for testing

One can adjust the thresholds of detectors.

There are penalties for false negatives and false positives.

•Challenging problems for computer science

Gamma ray detector

Inspection Problems

• Problem of finding optimal binary decision tree has many other uses:

AI: rule-based systems

Circuit complexity

Reliability analysis

Theory of programming/databases

• In general, problem is NP-complete

Inspection Problems

• Some cases of decision functions where the problem is tractable:

k-out-of-n systems

Certain series-parallel systems

Read-once systems

“regular systems”

Horn systems

• Recent results in case of inspection problems at ports: Stroud and Saeger (2004), Anand, et al. (2006).

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• Successful decision making requires efficient elicitation of information and efficient representation of the information elicited.

• Old problems in the social sciences.

• Computational aspects becoming a focal point because of need to deal with massive and complex information.

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• Example I: Social scientists study preferences: “I prefer beef to fish”

• Extracting and representing preferences is key in decision making applications.

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• “Brute force” approach: For every pair of alternatives, ask which is preferred to the other.

• Often computationally infeasible.
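A quick count shows why brute force breaks down. The sketch below is illustrative: the function names and the `prefers` callback (standing in for asking the decision maker) are hypothetical.

```python
from itertools import combinations


def pairwise_queries(n):
    """Number of unordered pairs among n alternatives: n(n-1)/2."""
    return n * (n - 1) // 2


def brute_force_elicit(alternatives, prefers):
    """Ask `prefers(a, b)` once for every unordered pair of alternatives.

    `prefers` is a hypothetical callback returning True if a is
    preferred to b.
    """
    return {(a, b): prefers(a, b) for a, b in combinations(alternatives, 2)}
```

The count is only quadratic in the number of alternatives, but alternatives are often themselves combinations of features. If, as an assumed example, each alternative is described by 20 binary attributes, there are 2^20 alternatives and roughly 5.5 × 10^11 pairwise queries.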

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• In many applications (repeated games, collaborative filtering), important to elicit preferences automatically.

• CP-nets introduced as tool to represent preferences succinctly and provide ways to make inferences about preferences (Boutilier, Brafman, Domshlak, Hoos, Poole 2004).

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• Example II: combinatorial auctions.

• Decision maker needs to elicit preferences from all agents for all plausible combinations of items in the auction.

• Similar problem arises in optimal bundling of goods and services.

• Elicitation requires exponentially many queries in general.
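The exponential growth is easy to see by enumerating bundles. The sketch below assumes the worst case in which every non-empty combination of items is plausible and each agent is asked for one value per bundle; the function names are introduced here for illustration.

```python
from itertools import combinations


def nonempty_bundles(items):
    """List every non-empty subset (bundle) of the given items."""
    items = list(items)
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


def queries_needed(num_items, num_agents):
    """Value queries if each agent must price every non-empty bundle:
    num_agents * (2**num_items - 1)."""
    return num_agents * (2 ** num_items - 1)
```

Three items already give 7 bundles per agent, and each additional item roughly doubles the count, which is why restricting attention to structured families of bundles matters.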

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• Challenge: Recognize situations in which efficient elicitation and representation is possible.

• One result: Fishburn, Pekec, Reeds (2002)

• Even more complicated: When objects in auction have complex structure.

• Problem arises in:

Legal reasoning, sequential decision making, automatic decision devices, collaborative filtering.

Concluding Comment

• In recent years, interplay between CS and biology has transformed major parts of Bio into an information science.

• Led to major scientific breakthroughs in biology such as sequencing of human genome.

• Led to significant new developments in CS, such as database search.

• The interplay between CS and SS not nearly as far along.

• Moreover: problems are spread over many disciplines.

Concluding Comment

• However, CS-SS interplay has already developed a unique momentum of its own.

• One can expect many more exciting outcomes as partnerships between computer scientists and social scientists expand and mature.
