chess - probabilistic rating theory

8/10/2019 Chess - Probabilistic Rating Theory


Probabilistic Rating Theory

(Odds Ratings)

Formulae

$$A = \tfrac{1}{2}\,(A_o + B_o + R)$$

$$B = \tfrac{1}{2}\,(A_o + B_o - R)$$

$$P = \frac{1}{1 + e^{-KR}}$$

$$R = \frac{1}{K}\,\log\frac{e^{K R_o (R_o > 0)} + S}{e^{-K R_o (R_o < 0)} + 1 - S}$$

$$K = -\frac{1}{R_s}\,\log\left(\frac{1}{P_s} - 1\right)$$

Table of Contents

1. Introduction

2. Point Sequences

3. The Rating Function

4. Concatenation of Probabilities

5. Concatenation Laws

6. Practical Ratings

7. Fundamental Theorem of Games of Skill

8. Individual Games

9. Probability Transformations

10. Rating Changes

11. Rating Tolerance

12. Rating Change Formulae

13. Formulae Summary

14. Playing Strength Distribution

15. Non-chronological Processing



1. Introduction

Chess is one of many games of skill at which the novice finds the accomplished

master invincible. This theory is designed to ascribe a RATING to the playing

strength manifested by sequences of game results. ODDS RATINGS are numbers of

mathematical significance. The ratings of two members of the chess playing

fraternity allow the calculation of winning odds. Ratings (and odds) will change

with each game result unless equal players draw. It is a moot point whether actual

playing strength (and associated rating) is effectively constant. Only playing

strength, as manifested in game results, can be measured. We refer to calculated

ratings, which oscillate between fixed limits (TOLERANCE). Within tolerance a

player will perceive his recent PERFORMANCE, but beyond that a change in playing

strength is indicated. Performance variations are a result of POINT SEQUENCES

associated with playing strength, and the interaction of multiple point sequences when

playing several opponents, as in tournaments.

This theory is applicable to any game (situation) where the fraction of absolute

success is measurable. The probability associated with ratings can be interpreted as

the expected fraction of success per game. It provides the most current and accurate

measure that can be calculated from the available data. A few generations of Sharp

programmable calculators have permitted the evolution of both text book theory and

program code (a Swiss Tournament Manager). The STM doubles as a simulator,

which has verified the efficacy of the system. Randomized results generated

according to implied odds have been used to reconstruct assumed ratings. Accuracy

is achieved as predicted by theoretical tolerance expectations.

The system can be parameterized. For ease of mental odds estimation, as well as a

comparable rating range, these seem best: 0 to 3000 should cover all assumed $10^9$

players, considered Normal with mean 1500. The following table of odds against

rating differences implies a tolerance of plus or minus 50 points. Rapid convergence

allows mean = provisional rating. Zero sum rating changes maintain the mean and

prevent rating inflation. As such, Fischer at 2760 would win 6200 games per loss (or

3100 per draw) against the average player.

TABLE 1

Wins per Loss    Rating Difference

1                0
2                100
4                200
8                300
16               400
32               500
64               600
128              700
256              800
512              900
1024             1000

This table (or its implied function) forms one statement of the fundamental hypothesis

upon which Odds Ratings rests, allowing the deduction of all (seven) necessary

formulae and (five) protocols for non-chronological processing.



Formulae Summary

Ratings

$a_o$ and $b_o$ will denote the old ratings of two players, $a$ and $b$ the new ratings after

player A scores $s \in \{0, \tfrac{1}{2}, 1\}$, the result of one game, against player B.

Rating Difference

$r_o$ will denote the old rating difference

$$r_o = a_o - b_o$$

and $r$ will denote the new rating difference

$$r = a - b$$

after player A scores $s \in \{0, \tfrac{1}{2}, 1\}$, the result of one game, against player B.

System Constant

If $r_s$ is the rating difference used to denote $q_s$ wins per loss, then the system constant

$k$ will be

$$k = \frac{1}{r_s}\,\log q_s$$

$q_s = 2$ and $r_s = 100$ is suggested as the best parameterization (see TABLE 1).

Probability

Given a rating difference $r$, the expected wins per loss $q$ is given by

$$q = e^{kr}$$

or probability $p$ by

$$p = \frac{q}{1 + q} = \frac{1}{1 + e^{-kr}} = \tfrac{1}{2} + \tfrac{1}{2}\tanh\left(\tfrac{1}{2}kr\right)$$

Rating Difference Transformation

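An old rating difference $r_o$ and a game score $s \in \{0, \tfrac{1}{2}, 1\}$ will produce a new rating

difference $r$ given by

$$r = \frac{1}{k}\,\log\frac{e^{k r_o (r_o > 0)} + s}{e^{-k r_o (r_o < 0)} + 1 - s}$$

where the bracketed conditions evaluate as true = 1 and false = 0.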

Rating Change

$a$ and $b$ are the new ratings after a rating difference change from $r_o$ to $r$

$$a = \tfrac{1}{2}\,(a_o + b_o + r)$$

$$b = \tfrac{1}{2}\,(a_o + b_o - r)$$

Note: These formulae are sufficient to process chronological rating changes (as for

instance while managing a Swiss tournament).
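The four summary formulae can be sketched in a few lines of Python. This is only an illustration under the suggested parameterization ($q_s = 2$, $r_s = 100$); the function names `probability`, `new_difference` and `update` are hypothetical and not taken from the Swiss Tournament Manager.

```python
import math

# Suggested parameterization (TABLE 1): 2 wins per loss per 100 rating points.
Q_S, R_S = 2.0, 100.0
K = math.log(Q_S) / R_S  # system constant k


def probability(r, k=K):
    """Probability p for a rating difference r: p = 1 / (1 + e^(-kr))."""
    return 1.0 / (1.0 + math.exp(-k * r))


def new_difference(r_old, s, k=K):
    """Rating difference transformation for one game scored s in {0, 0.5, 1}."""
    # The comparisons act as 0/1 switches on the exponents, as in the formula.
    numerator = math.exp(k * r_old * (r_old > 0)) + s
    denominator = math.exp(-k * r_old * (r_old < 0)) + 1.0 - s
    return math.log(numerator / denominator) / k


def update(a_old, b_old, s, k=K):
    """New ratings (a, b) after player A scores s against player B."""
    r = new_difference(a_old - b_old, s, k)
    total = a_old + b_old  # rating points are conserved
    return (total + r) / 2.0, (total - r) / 2.0
```

For two players at 1500, `update(1500, 1500, 1)` moves the winner up and the loser down by the same amount, so the sum of the two ratings is unchanged.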



Non-Chronological Processing

Non-chronological processing requires special protocols. With Odds Ratings a

chronology must be assumed to allow processing as normal. These protocols

optimize the assumptions and prevent anomalous ones, as explained below. They

effectively average the performance of each player. In fact, non-chronological

processing is the only average of any kind produced by Odds Ratings.

The protocols follow:

1. Maximize Draws: individual pairs of players are assumed to have drawn the

maximum number of games achieving their overall score. Draw tolerances

are smaller than win tolerances, and the corresponding point sequences

converge more rapidly (see below for formula).

2. Score Based Point Sequences: by normalizing the point sequences so that the

wins per loss ratio remains as constant throughout as possible, the

accumulation of losses or wins at one end of the sequence does not occur (by

design or accident). Such would produce the effect of an increasing or

decreasing rating throughout the games. The $i$-th game result $s_i$ in a normal

point sequence, maximizing draws, is given by

$$s_i = \tfrac{1}{2}\left(\operatorname{int}\left(\tfrac{1}{2} + 2\,i\,p_{ab}\right) - \operatorname{int}\left(\tfrac{1}{2} + 2\,(i - 1)\,p_{ab}\right)\right)$$

where:

$\operatorname{int} x$ is the largest whole number less than or equal to $x$,

$$p_{ab} = \frac{s_{ab}}{g_{ab}},$$

$s_{ab}$ is the overall score, and $g_{ab}$ is the total number of games,

by player A against player B.

3. Completed Rounds: if the maximum games by any individual pair of players

is $g_m$, then processing must be executed as $g_m$ rounds, where no pair of players

play more than one game per round. Furthermore, pairs with one game play

in the last round, those with 2 games in the last 2 rounds, those with $g_m - 1$ in

all but the first. That way the players with fewer games are processed against

more accurate ratings.

4. Hierarchical Processing: before each round, players are sorted according to

their progressively recalculated ratings. They are then processed from the

lowest to the highest, each against the lowest to the highest remaining. First

in best dressed effects are thus avoided, the weaker players getting first bite of

the cherry, rather than the field being plundered of points by the strongest first.

5. Two Pass Processing: by using two passes to reprocess a completed Swiss

tournament, the method most effectively rates an entire field of unrated

players (set to the system average = provisional rating). However, a small

number of unrated players among several rated players need no special

treatment. This same method also provides the perfect tie-break, finely

grading relative play on the basis of a single event.
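Protocol 3 (Completed Rounds) lends itself to a short sketch. The Python below is one possible reading, not the author's program: it assumes the cross-table is given as a dictionary from a pair of players to the number of games they played, and the name `completed_rounds` is hypothetical.

```python
def completed_rounds(games):
    """Assign each pair's games to rounds following protocol 3.

    games: dict mapping a (player, player) pair to its number of games.
    Returns a list of g_m rounds; each round lists the pairs playing in it.
    """
    g_max = max(games.values())  # g_m, the most games played by any one pair
    rounds = [[] for _ in range(g_max)]
    for pair, g in sorted(games.items()):
        # A pair with g games plays once in each of the last g rounds.
        for index in range(g_max - g, g_max):
            rounds[index].append(pair)
    return rounds


if __name__ == "__main__":
    table = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 2}
    for number, pairs in enumerate(completed_rounds(table), start=1):
        print(number, pairs)
```

Each pair appears at most once per round, and pairs with fewer games appear only in the later rounds, where the ratings they are processed against have already been adjusted.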



Conclusion

The information here is sufficient to implement Odds Ratings either on a site such as

Chess.com or worldwide. The theoretical proof or derivation from the fundamental

hypothesis is available, if involved. This hypothesis was, in fact, that the relationship

between probabilities $p$ and rating differences $r$ is, in essence

$$p = \tfrac{1}{2} + \tfrac{1}{2}\tanh\left(\tfrac{1}{2}r\right)$$

which, when graphed, will be seen to be intuitively obvious. But the shape came to

me in 1974, and I'm ashamed to say that the exact function came to me only ten years

ago, though others had been tried. The correct function precipitated some

surprisingly simplified mathematics and vastly superior results. That the exponential

nature didn't occur to me sooner I can only blame on the fact that I never did my

homework. I should have noticed the similarity to differential equations for capstans.

But here is the reason for the invincibility of masters of the game, and the many

grades in any form of mastery. TABLE 1 is then a relatively trivial mathematical

implication of this (with added parameterization for the more familiar scaling).

Point sequences form an integral and necessary part of the theory of Odds Ratings.

For the chess player who is also a mathematician (such as Arpad Elo, Max Euwe and

Jose Capablanca) this theory does a lot to provide him with more realistic

expectations of his abilities. The interaction of these sequences during tournaments

also goes far to demonstrate possible outcomes that can be expected. Point

sequences are idealized ordered sets of game results. They are not very far from the

reality, certainly not as far as Professor Arpad Elo imagined. The measurement of

the rating of an individual might be compared with the measurement of the position of

a cork bobbing up and down on the surface of agitated water with a yard stick tied to a

rope swaying in the wind. This is an illusion, not a fact. Point sequences don't

stray so far as the bobbing cork, and the linear functions used are the yard stick, and

way off. But Elo Ratings and its offspring are not based on probability, but on

statistics. Averages take much data even when nothing is changing overall. What is

more changeable than man?

Arpad Elo was commissioned by the USCF in 1959 to replace dysfunctional systems,

and provided it on request in one year. FIDE struggled on 10 years more without Elo

Ratings. Presumably they were adopted in desperation because nothing better was

forthcoming. Remember that Einstein didn't produce Relativity on demand in one

year, rather in ten, and driven by curiosity. He said "It is a wonder that the tender

plant of a young child's curiosity isn't entirely strangled by modern methods of

instruction." Well, that's the best excuse I can come up with for not doing my

homework. But Arpad did the best he could in the time he had. The inductive leap

often required is usually not available on demand. Like the tender plant it must be

nurtured and loved. And beginning with an incorrect hypothesis can leave you

flogging a dead horse, just like placing faith in a flawed chess opening system.

Johnny Chess,

Diploma of Teaching,

Secondary Mathematics & Computing, 1990.



2. Point Sequences

A game of skill is one at which absolute mastery is unobtainable. This mathematics

can accordingly never deliver, nor process, a probability of one or zero. It measures

the relative degree of mastery of players of a particular game.

Chess is a game of skill, although, for a while, Capablanca was thought (even by

himself) to have attained such complete mastery. For six years he did not lose a

single game. While computers are on the verge of invincibility by human masters,

there will be no certainty of outcome amongst themselves. The number of electrons

in the visible universe is less than the number of games possible.

The relative playing strength of any two players A and B is evident in the proportion

of points scored by each. Thus our measurable degree of relative mastery consists of

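the probability $p_{ab}$ of a win (absolute success) by player A. If A is the stronger

player and will risk losing rather than draw, his $m$'th to his $n$'th game results $s_i$

will consist of the point sequence

$$s_i = \operatorname{int}\left(\tfrac{1}{2} + i\,p_{ab}\right) - \operatorname{int}\left(\tfrac{1}{2} + (i - 1)\,p_{ab}\right) \tag{i}$$

where $i = m, m + 1, \ldots, n - 1, n$.

For instance, $p_{ab} = 0.75$, $m = 1$, $n = 4$ is the sequence {1, 1, 0, 1}.

On the other hand, were A a cautious player, who will draw before risking a loss,

never losing against a weaker player, the point sequence becomes

$$s_i = \tfrac{1}{2}\left(\operatorname{int}\left(\tfrac{1}{2} + 2\,i\,p_{ab}\right) - \operatorname{int}\left(\tfrac{1}{2} + 2\,(i - 1)\,p_{ab}\right)\right) \tag{ii}$$

where the same $p_{ab}$, $m$ and $n$ give the sequence {1, ½, 1, ½}.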

These idealizations lose no generality, since the reality deviates but little. The

deviations can be interpreted as fluctuations in playing strength. We need to infer

$p_{ab}$ from these normal point sequences. However, for a given point sequence, $p_{ab}$ is

not unique.

Given $p_{ab} = 0.75$, $m = 1$, $n = 8$, sequence (i) is {1,1,0,1,1,1,0,1} but $p_{ab} = 0.8$ produces

the sequence {1,1,0,1,1,1,1,0} so that $n < 7$ cannot distinguish $p_{ab}$ to this precision.

On the contrary note, too long a sequence implies too long a rating period, where $p_{ab}$

is not likely to have remained constant for these two players. Further, we are only

finding the relative strength of players A and B, and other players need to be

integrated into our calculations. This integration requires Odds Ratings, which we

shall deduce.
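The two normal point sequences can be generated directly from (i) and (ii). The sketch below is illustrative only: it uses exact fractions to avoid rounding at the half-integer boundaries, and the names `win_sequence` and `draw_sequence` are hypothetical.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)


def win_sequence(p, n, m=1):
    """Sequence (i): the stronger player wins (1) or loses (0), never draws."""
    p = Fraction(p)
    return [floor(HALF + i * p) - floor(HALF + (i - 1) * p)
            for i in range(m, n + 1)]


def draw_sequence(p, n, m=1):
    """Sequence (ii): draws maximized, results in steps of one half."""
    p = Fraction(p)
    return [Fraction(floor(HALF + 2 * i * p) - floor(HALF + 2 * (i - 1) * p), 2)
            for i in range(m, n + 1)]


if __name__ == "__main__":
    print(win_sequence(Fraction(3, 4), 8))   # sequence (i) for p = 0.75
    print(win_sequence(Fraction(4, 5), 8))   # sequence (i) for p = 0.8
    print(draw_sequence(Fraction(3, 4), 4))  # sequence (ii) for p = 0.75
```

Comparing the first two outputs shows where the sequences for 0.75 and 0.8 first differ, which is the point made above about the precision available from short sequences.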



3. The Rating Function

Each player will be assigned a rating representing his playing strength. This will

increase when the weaker player in a game wins or draws, and decrease when he

loses. Rating points will not be created or destroyed by rated games. They will be

won or lost like poker chips, where each player starts play with the same value. The

initiate will receive the average rating. The average rating will thereby never change.

We require a rating function $f(r)$ which will convert a rating difference $r$ between

player A (rated $a$) and player B (rated $b$)

$$r = a - b$$

to a probability $p_{ab}$ or simply $p$

$$f(r) = p$$

Intuitively, we expect that the following hold

$$f(0) = \tfrac{1}{2}$$

$$\lim_{r \to \infty} f(r) = 1$$

$$\lim_{r \to -\infty} f(r) = 0$$

Some candidates include

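$$f(r) = \frac{1}{2} + \frac{r}{2\sqrt{r^2 + 1}} \tag{iii}$$

$$f(r) = \frac{1}{2} + \frac{1}{\pi}\tan^{-1} r \tag{iv}$$

$$f(r) = \frac{1}{2} + \frac{1}{2}\tanh\frac{r}{2} \tag{v}$$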

Of these (v) is the simplest and can be simplified further to

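$$\frac{1}{2} + \frac{1}{2}\cdot\frac{e^{r/2} - e^{-r/2}}{e^{r/2} + e^{-r/2}} = \frac{1}{2} + \frac{1}{2}\cdot\frac{e^{r} - 1}{e^{r} + 1} = \frac{e^{r}}{e^{r} + 1}$$

and finally to

$$p = f(r) = \frac{1}{1 + e^{-r}} \tag{v}$$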

This satisfies Einstein's expectation that the laws of nature tend to be relatively

simple. It also allows some surprisingly simple deductions. We shall adopt (v) as

the fundamental axiom relating probabilities and playing strengths in games of

skill.

Some may hasten to dismiss this claim because of the degree of mastery attainable in

chess, compared to, say, darts. The difference between games of skill lies in the

range of probabilities attainable by any population. For instance, a measure of

masterability might be the probability represented by one standard deviation from the

mean. This would require statistics to evaluate. It depends on the nature of the

game, as well as on the capabilities of the players.



4. Concatenation of Probabilities

Let $p_{ab}$ be the probability that player A wins against player B, and $p_{bc}$ that player B

wins against player C. We will call $p_{ac}$, the probability that player A wins against

player C, the concatenation of $p_{ab}$ and $p_{bc}$. The concatenation operator will be the

symbol & thus

$$p_{ab} \,\&\, p_{bc} = p_{ac}$$

The result of this operation depends on our definition of $f(r) = p$ and its inverse

function $f^{-1}(p) = r$ so that:

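from (iii)

$$r = f^{-1}(p) = \pm\sqrt{\frac{4p^2 - 4p + 1}{4p - 4p^2}} \tag{vi}$$

(taking the sign of $p - \tfrac{1}{2}$),

from (iv)

$$r = f^{-1}(p) = \tan\left(\pi\left(p - \tfrac{1}{2}\right)\right) \tag{vii}$$

and from (v)

$$r = f^{-1}(p) = -\log\left(\frac{1}{p} - 1\right) \tag{viii}$$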

Then we have, for concatenation

$$p_{ab} \,\&\, p_{bc} = p_{ac} = f(a - c) = f\big((a - b) + (b - c)\big) = f\big(f^{-1}(p_{ab}) + f^{-1}(p_{bc})\big) \tag{ix}$$

For example, given $p_{ab} = p_{bc} = 0.75$ we have $p_{ac} = 0.8780\ldots,\ 0.8524\ldots,\ 0.9$ using (iii),

(iv) and (v) respectively. Note that the choice of (v) produces a rational formula for

concatenation. Using (viii) in (ix) we have

$$f^{-1}(p_{ab}) + f^{-1}(p_{bc}) = -\log\left(\frac{1}{p_{ab}} - 1\right) - \log\left(\frac{1}{p_{bc}} - 1\right) = \log\left(\frac{p_{ab}}{1 - p_{ab}} \cdot \frac{p_{bc}}{1 - p_{bc}}\right)$$

which when substituted into (v) gives

$$p_{ac} = \frac{p_{ab}\,p_{bc}}{p_{ab}\,p_{bc} + (1 - p_{ab})(1 - p_{bc})}$$

so that we obtain the formula for probabilities p and q

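$$p \,\&\, q = \frac{pq}{pq + (1 - p)(1 - q)} \tag{x}$$

and for rational operands

$$\frac{a}{b} \,\&\, \frac{c}{d} = \frac{ac}{ac + (b - a)(d - c)} \tag{xi}$$

For example

$$\frac{3}{7} \,\&\, \frac{4}{5} = \frac{12}{12 + 4 \cdot 1} = \frac{3}{4}$$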
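The comparison of the three candidate functions can be reproduced numerically from (ix). The following Python is a sketch under the reconstruction of (iii) to (viii) given above; the dictionary `CANDIDATES` and the function `concatenate` are hypothetical names, and the inverse of (iii) is written in its signed form $(2p - 1)/\sqrt{4p - 4p^2}$.

```python
import math

# Each candidate is a pair (f, f_inverse); the labels follow the text.
CANDIDATES = {
    "iii": (lambda r: 0.5 + r / (2.0 * math.sqrt(r * r + 1.0)),
            lambda p: (2.0 * p - 1.0) / math.sqrt(4.0 * p - 4.0 * p * p)),
    "iv": (lambda r: 0.5 + math.atan(r) / math.pi,
           lambda p: math.tan(math.pi * (p - 0.5))),
    "v": (lambda r: 1.0 / (1.0 + math.exp(-r)),
          lambda p: -math.log(1.0 / p - 1.0)),
}


def concatenate(p_ab, p_bc, label):
    """Formula (ix): p_ac = f(f^-1(p_ab) + f^-1(p_bc)) for one candidate."""
    f, f_inverse = CANDIDATES[label]
    return f(f_inverse(p_ab) + f_inverse(p_bc))


if __name__ == "__main__":
    for label in CANDIDATES:
        print(label, round(concatenate(0.75, 0.75, label), 4))
```

Only candidate (v) returns a value that is a simple fraction of the inputs, which is the property exploited in (x) and (xi).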

Mathematical induction easily proves the simple formula for an arbitrary number of

concatenations

$$p_1 \,\&\, p_2 \,\&\, \ldots \,\&\, p_n = \frac{\prod_{i=1}^{n} p_i}{\prod_{i=1}^{n} p_i + \prod_{i=1}^{n} (1 - p_i)} \tag{xii}$$
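Formulae (x) and (xii) are easy to check with exact rational arithmetic. The sketch below is an illustration only; `concat` and `concat_all` are hypothetical names.

```python
from fractions import Fraction
from math import prod


def concat(p, q):
    """Formula (x): the concatenation p & q."""
    return p * q / (p * q + (1 - p) * (1 - q))


def concat_all(probabilities):
    """Formula (xii): concatenation of any number of probabilities."""
    wins = prod(probabilities)
    losses = prod(1 - p for p in probabilities)
    return wins / (wins + losses)


if __name__ == "__main__":
    print(concat(Fraction(3, 7), Fraction(4, 5)))
    chain = [Fraction(3, 4), Fraction(2, 3), Fraction(1, 2)]
    print(concat_all(chain), concat(concat(chain[0], chain[1]), chain[2]))
```

Applying `concat` pairwise along a chain gives the same value as `concat_all` on the whole chain, which is what the induction argument asserts.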



5. Concatenation Laws

Let $p, q, r$ be probabilities, and & the concatenation operator defined by

$$p \,\&\, q = \frac{pq}{pq + (1 - p)(1 - q)} \tag{x}$$

Probabilities under concatenation can be described as an Abelian group with identity

element $\tfrac{1}{2}$, and inverse $p^{-1} = (1 - p)$. A mathematician will then immediately

understand that the following is true for concatenation.

We can show using (x) that the following properties of Abelian groups hold

0. $0 < p \,\&\, q < 1$ (is also a probability) for any probabilities $p, q$

1. $p \,\&\, (q \,\&\, r) = (p \,\&\, q) \,\&\, r$

2. $p \,\&\, \tfrac{1}{2} = p$ for any probability $p$

3. $p \,\&\, (1 - p) = \tfrac{1}{2}$ for any probability $p$

4. $p \,\&\, q = q \,\&\, p$ for any probabilities $p, q$

These laws are useful for dealing with concatenation mathematics. For example,

suppose we need to know what p needs to be concatenated with a particular q to

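produce the probability $r$. We need to solve the equation $p \,\&\, q = r$ by eliminating

$q$ from the left hand side (LHS) of the equation.

$$p \,\&\, q = r$$

$$(p \,\&\, q) \,\&\, q^{-1} = r \,\&\, q^{-1}$$

$$p \,\&\, (q \,\&\, q^{-1}) = r \,\&\, q^{-1} \quad \text{by 1.}$$

$$p \,\&\, \tfrac{1}{2} = r \,\&\, q^{-1} \quad \text{by 3.}$$

$$p = r \,\&\, q^{-1} \quad \text{by 2.}$$

$$p = r \,\&\, (1 - q)$$

For example, the solution to

$$p \,\&\, \frac{3}{4} = \frac{1}{3}$$

is now trivial

$$p = \frac{1}{3} \,\&\, \left(1 - \frac{3}{4}\right) = \frac{1}{3} \,\&\, \frac{1}{4} = \frac{1 \cdot 1}{1 \cdot 1 + 2 \cdot 3} = \frac{1}{7}$$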

This is much simpler than solving using (x) directly. These laws will be used later to

greatly simplify some important derivations.
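The concatenation laws and the worked solution can be verified mechanically. This is a small sketch with exact fractions; `concat`, `inverse` and `solve_left` are hypothetical names introduced for the illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # identity element of the group


def concat(p, q):
    """Formula (x): the concatenation p & q."""
    return p * q / (p * q + (1 - p) * (1 - q))


def inverse(p):
    """Group inverse under concatenation: 1 - p."""
    return 1 - p


def solve_left(q, r):
    """Return the p satisfying p & q = r, using p = r & (1 - q)."""
    return concat(r, inverse(q))


if __name__ == "__main__":
    p = solve_left(Fraction(3, 4), Fraction(1, 3))
    print(p, concat(p, Fraction(3, 4)))
```

The second printed value recovers the required right hand side, confirming that eliminating $q$ with its inverse gives the same answer as solving (x) directly.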



6. Practical Ratings

The rating function (v) supplies all conceivable probabilities for practical play

over a very small domain of rating values. 14 rating points cover probabilities from

0.0001 to 0.9999. A scale constant can be introduced to produce the familiar user

friendly range. This is not some arbitrary constant used to fine tune dysfunction.

We decide that a probability $p_s$ shall be represented by a rating difference $r_s$, and

this is used to generate our system scaling constant $k$. We also need to decide on the

constant system average. A range of 0 to 3000, Normally distributed, will imply a

mean of 1500. Given $p_s = 2/3$ and $r_s = 100$, the range may be exceeded if chess

goes galactic or includes computers, but it should suffice us on middle earth among

men during the fourth age... It makes odds estimation easy, and seems roughly

equivalent to the high maintenance guide to predicting the outcome in use since

1960.

Accordingly, we will define the probability function as the scaled rating function

$$p(r) = \frac{1}{1 + e^{-kr}} \tag{xiii}$$

and its inverse, the rating difference function, as

$$r(p) = -\frac{1}{k}\,\log\left(\frac{1}{p} - 1\right) \tag{xiv}$$

Substituting $p_s$ and $r_s$ into (xiv) and rearranging, the system constant is given by

$$k = -\frac{1}{r_s}\,\log\left(\frac{1}{p_s} - 1\right) \tag{xv}$$

Choosing the suggested parameters,

$$k = \frac{\log 2}{100} = 6.931471806\ldots \times 10^{-3}$$

The mean must be the provisional rating to prevent inflation, and all rating changes

must be to the nearest integer to prevent leakage due to rounding error. This last will

mean that rating differences greater than 717 will cause no direct change, but that

would be poor organization.
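The system constant and the rounding limit can be checked with a short calculation. The sketch below makes two assumptions that the text does not spell out: each player's change is half the change in rating difference, rounded to the nearest integer, and the new difference after an expected win is $\tfrac{1}{k}\log(e^{k r_o} + 1)$ as in the formulae summary. The function names are hypothetical.

```python
import math


def system_constant(p_s, r_s):
    """Formula (xv): k = -(1 / r_s) * log(1 / p_s - 1)."""
    return -math.log(1.0 / p_s - 1.0) / r_s


K = system_constant(2.0 / 3.0, 100.0)


def winner_gain(d, k=K):
    """Unrounded gain of a player rated d >= 0 points above a beaten opponent."""
    q_old = math.exp(k * d)                      # wins per loss before the game
    change = math.log((q_old + 1.0) / q_old) / k  # growth of the rating difference
    return change / 2.0                           # shared equally by both players


def first_frozen_difference(k=K):
    """Smallest whole rating difference at which the rounded gain is zero."""
    d = 0
    while round(winner_gain(d, k)) != 0:
        d += 1
    return d


if __name__ == "__main__":
    print(K, winner_gain(0), first_frozen_difference())
```

Under these assumptions the rounded gain first vanishes at a rating difference of about 717 points, in line with the limit quoted above.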



7. Fundamental Theorem of Games of Skill

If player A wins $m$ games per loss against player B, and player B wins $n$ games per

loss against player C, then player A wins $mn$ games per loss against player C.

This is a direct consequence of (xi) and easily proven by substituting

$$p = \frac{m}{m + 1}$$

and

$$q = \frac{n}{n + 1}$$

giving

$$p \,\&\, q = \frac{mn}{mn + 1}$$

The simplicity of this theorem is intuitively obvious. It convinces me that our

axiomatic basis (v) is the correct mathematical formulation of the material reality.

Its validity can be contested as the theorem provides a statistically verifiable

deduction.
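The theorem can be illustrated with exact fractions. The helper names `from_odds`, `to_odds` and `concat` below are hypothetical; the sketch simply converts wins per loss to probabilities, concatenates with (x), and converts back.

```python
from fractions import Fraction


def from_odds(m):
    """Probability of a win for a player who wins m games per loss."""
    return Fraction(m) / (Fraction(m) + 1)


def to_odds(p):
    """Wins per loss corresponding to probability p."""
    return p / (1 - p)


def concat(p, q):
    """Formula (x): the concatenation p & q."""
    return p * q / (p * q + (1 - p) * (1 - q))


if __name__ == "__main__":
    # A beats B 3:1 and B beats C 5:1, so A should beat C 15:1.
    print(to_odds(concat(from_odds(3), from_odds(5))))
```

Chaining three steps of 2 wins per loss gives 8 wins per loss, which is the 300 point row of TABLE 1.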



Now consider point sequences with draws maximized (ii) for:

p = 1/2

{ ½ ½ ½ ½ ½ ½ }

p = 3/4

{ 1 ½ 1 ½ 1 ½ 1 ½ }

p = 5/6

{ 1 ½ 1 1 ½ 1 1 ½ 1 1 ½ }

p = 7/8

{ 1 1 ½ 1 1 1 ½ 1 1 1 ½ 1 1 1 ½ }

In this instance the corresponding probability transformations will need to be

p = 1/2

{ 1/2 1/2 1/2 1/2 1/2 1/2 }

p = 3/4

{ 4/5 3/4 4/5 3/4 4/5 3/4 4/5 3/4 }

p = 5/6

{ 7/8 5/6 6/7 7/8 5/6 6/7 7/8 5/6 6/7 7/8 5/6 }

p = 7/8

{ 9/10 10/11 7/8 8/9 9/10 10/11 7/8 8/9 9/10 10/11 7/8 8/9 9/10 10/11 7/8 }

These sequences apply to $p \ge \tfrac{1}{2}$. The sequences for the opponents in every case

require replacing scores $s$ with $1 - s$ and probabilities $p$ with the inverse $1 - p$.

Discovering formulae which will produce the required probability transformations is

now a fairly simple matter of trial and algebra.



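9. Probability Transformations

Inspection will show that the following formulae produce the required probability

transformations $p$ for the game results $s$ and the initial probability $p_o$ as indicated.

$$p = \frac{1}{2 - p_o}, \quad p_o \ge \tfrac{1}{2},\ s = 1$$

$$p = \frac{1 + p_o}{4 - 2p_o}, \quad p_o \ge \tfrac{1}{2},\ s = \tfrac{1}{2}$$

$$p = \frac{p_o}{2 - p_o}, \quad p_o \ge \tfrac{1}{2},\ s = 0$$

$$p = \frac{2p_o}{1 + p_o}, \quad p_o \le \tfrac{1}{2},\ s = 1$$

$$p = \frac{3p_o}{2 + 2p_o}, \quad p_o \le \tfrac{1}{2},\ s = \tfrac{1}{2}$$

$$p = \frac{p_o}{1 + p_o}, \quad p_o \le \tfrac{1}{2},\ s = 0 \tag{xvi}$$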

But it will be noted that the score $s$ can be incorporated into the formulae (xvi) to

reduce the number needed to

$$p = \frac{s\,(1 - p_o) + p_o}{2 - p_o}, \quad p_o \ge \tfrac{1}{2}$$

$$p = \frac{(s + 1)\,p_o}{1 + p_o}, \quad p_o \le \tfrac{1}{2} \tag{xvii}$$

It will be apparent that, if the probabilities are incorrect in the sequence, they will

rapidly converge toward the correct values with each cycle. This is easily

demonstrated using a programmable calculator.
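In place of a programmable calculator, the convergence can be demonstrated with a few lines of Python. This is a sketch of (xvii) using exact fractions; the name `transform` is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def transform(p_old, s):
    """Formula (xvii): probability after a game scored s by the p_old player."""
    if p_old >= HALF:
        return (s * (1 - p_old) + p_old) / (2 - p_old)
    return (s + 1) * p_old / (1 + p_old)


if __name__ == "__main__":
    # Normal point sequence (ii) for p = 5/6 repeats the cycle 1, 1/2, 1.
    p = HALF  # deliberately wrong starting probability
    for cycle in range(10):
        for s in (1, HALF, 1):
            p = transform(p, s)
        print(cycle + 1, float(p))  # approaches 6/7 at the end of each cycle
```

Starting from the correct value 6/7 the three results reproduce the cycle 7/8, 5/6, 6/7 exactly; starting from 1/2 the error in wins per loss shrinks by a factor of 2/3 per cycle.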



10. Rating Changes

Probabilities can only apply to pairs of players. The rating difference function (xiv)

will convert the transformed probability into a rating difference

$$r = r(p) = -\frac{1}{k}\,\log\left(\frac{1}{p} - 1\right)$$

Example

Player A and player B are two new members. They are given the system average for

their provisional ratings, so that a = b = 1500. Thus the rating difference is

$r = a - b = 0$. Using the probability function (xiii)

$$p = p(r) = \frac{1}{1 + e^{-kr}} = \tfrac{1}{2}$$

Let us assume that player A wins the first game. All our variables so far become our

original values for recalculation

$$a_o = 1500, \quad b_o = 1500, \quad r_o = 0, \quad p_o = 0.5$$

Now $p_o \ge \tfrac{1}{2}$ and $s = 1$ so from (xvi) we can transform the probability $p$ using

$$p = \frac{1}{2 - p_o} = \frac{1}{2 - 1/2} = \frac{2}{3}$$

By (xiv) the new rating difference $r$ is then

$$r = r(p) = -\frac{1}{k}\,\log\left(\frac{1}{p} - 1\right) = 100$$

Now, since rating points are won and lost conservatively, our new ratings are given by

the simultaneous solution of

$$a + b = a_o + b_o$$

$$a - b = r$$

which gives

$$a = \tfrac{1}{2}\,(a_o + b_o + r)$$

$$b = \tfrac{1}{2}\,(a_o + b_o - r)$$

so that a = 1550 and b = 1450. It is now in the best interest of establishing ratings

quickly, that new opponents be found to match their new ratings. Player A's rating

will rise by 50 points, each time he wins against an equal or higher rated player. This

sets the limit for possible rating increase, depending heavily on available competition.

But this limit of 50 points is also our rating tolerance.
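The worked example follows the chain (xiii), (xvii), (xiv), and the same chain shows the 50 point limit. The sketch below assumes the suggested parameters; `p_of_r`, `r_of_p`, `transform` and `play` are hypothetical names.

```python
import math

K = math.log(2.0) / 100.0  # suggested parameters: 2 wins per loss per 100 points


def p_of_r(r):
    """Probability function (xiii)."""
    return 1.0 / (1.0 + math.exp(-K * r))


def r_of_p(p):
    """Rating difference function (xiv)."""
    return -math.log(1.0 / p - 1.0) / K


def transform(p_old, s):
    """Probability transformation (xvii)."""
    if p_old >= 0.5:
        return (s * (1.0 - p_old) + p_old) / (2.0 - p_old)
    return (s + 1.0) * p_old / (1.0 + p_old)


def play(a_old, b_old, s):
    """New ratings after player A scores s against player B."""
    r = r_of_p(transform(p_of_r(a_old - b_old), s))
    return (a_old + b_old + r) / 2.0, (a_old + b_old - r) / 2.0


if __name__ == "__main__":
    print(play(1500, 1500, 1))  # equal opponents
    print(play(1400, 1700, 1))  # win against a higher rated opponent
    print(play(1700, 1400, 1))  # win against a lower rated opponent
```

A win against an equal or higher rated opponent gains the full 50 points, while a win against a lower rated opponent gains less.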



11. Rating Tolerance

In chess, draws are the result of 50% of master games. In Chinese chess, draws are

rare, largely due to the fact that the stalemated player is deemed to have lost as in

draughts (checkers). Morphy sometimes lost trying to win drawn games. Schlechter

was called the drawing master and may have been Emanuel Lasker's equal, but in

their match together, the condition required by the defending champion was two

games up. This forced Schlechter to play the last game to win, and lose an easy

draw instead. Fischer knew draws do not win tournaments, and achieved an

unprecedented 11-0 round robin result in the USA Championships 1963-64.

Capablanca considered that throwing the game to the winds of chance was a mistake,

and his games show it. Alekhine said that chess was more than just knowledge and

logic when he complicated for creative opportunities. A brilliant tactician will take

lesser mortals out of their depth, but risking somewhat to luck.

Understanding rating tolerances will help explain the greater success of the risk

takers. Win tolerances are far greater than draw tolerances, especially when well

matched. Therefore they win, as well as lose, more spectacularly. This theory

explains why.

Point sequences (i) and (ii) where $p = \dfrac{n}{n + 1}$, $n = 1, 2, 3, \ldots$ are prime sequences.

For n = 2, p = 2/3 and (i) gives { 1 1 0 1 1 0 1 1 0 1 1 0 }

For n = 3, p = 3/4 and (i) gives { 1 1 1 0 1 1 1 0 1 1 1 0 }

For any 2/3 < p < 3/4 such as p = 5/7 our point sequence is no longer regular.

For p = 5/7 (i) gives { 1 1 0 1 1 1 0 1 1 0 1 1 1 0 }

Composite sequences such as this alternate between the behaviour of the bounding

prime sequences. We need only to understand prime sequence tolerance for all

practical purposes, and consider that the rating is fluctuating between them. The

length of time that the composite sequence would emulate one or the other prime

sequence depends on proximity. The same argument applies to the practical

sequence, which would behave similarly, but less regularly again. The ratings can be

interpreted as responding to variations in performance within tolerance, and to a

change in playing strength whenever tolerance is exceeded. It will also become

apparent that the top end (of a field of players who are interacting largely among

themselves) will tend toward ratings erring in excess. The converse is true for the

bottom end. Those in the middle are more accurate and err either way. This is a

fortuitous circumstance as players migrating to better matched groups will

import/export rating points as required. Intergroup matches will then produce

necessary adjustments very efficiently.



Win Tolerance

Our question is, how much does the calculated probability change throughout the

cycle, given that convergence is achieved to precision. If our players A and B with

$a \ge b$ maximize wins by (i), their normal point sequences completely neutralize

maximum error at the convergence point, a loss for player A. This error is then

completely corrected by (xvi) with $p_o \ge \tfrac{1}{2}$, $s = 0$

$$p = \frac{p_o}{2 - p_o}$$

Therefore we need to solve

$$q \,\&\, \frac{p}{2 - p} = p$$

for $q$, because $\dfrac{p}{2 - p}$ is the actual probability and $p$ is the highest calculated

probability obtained. Thus by (xiv) our rating tolerance is given by $r(q)$ since

$$r\left(\frac{p}{2 - p}\right) + r(q) = r(p)$$

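We can use the concatenation laws to isolate $q$ in

$$q \,\&\, \frac{p}{2 - p} = p$$

$$q \,\&\, \frac{p}{2 - p} \,\&\, \left(\frac{p}{2 - p}\right)' = p \,\&\, \left(\frac{p}{2 - p}\right)'$$

$$q = p \,\&\, \left(1 - \frac{p}{2 - p}\right) = p \,\&\, \frac{2 - 2p}{2 - p}$$

and this can be evaluated using (x) to give

$$q = \frac{p\,\dfrac{2 - 2p}{2 - p}}{p\,\dfrac{2 - 2p}{2 - p} + (1 - p)\left(1 - \dfrac{2 - 2p}{2 - p}\right)} = \frac{2p - 2p^2}{(2p - 2p^2) + (p - p^2)} = \frac{2\,(p - p^2)}{3\,(p - p^2)} = \frac{2}{3}$$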

Surprisingly the (win) tolerance is independent of $p$ (and $r$), and for the suggested

parameters we find that the rating difference will have a maximum error of

$$r\left(\tfrac{2}{3}\right) = -\frac{1}{k}\,\log\left(\tfrac{3}{2} - 1\right) = 100$$

and since the error is split between both players (the highest rated with an excess, and

the lowest with a deficit)

$$\text{actual rating} = \text{calculated rating} \pm \tfrac{1}{2}\,r\left(\tfrac{2}{3}\right) \tag{xviii}$$

which using suggested parameters is ±50 rating points.
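That the win tolerance does not depend on $p$ can be confirmed with exact fractions. The sketch below assumes the suggested parameters for the conversion to rating points; `concat`, `win_tolerance_q` and `tolerance_points` are hypothetical names.

```python
import math
from fractions import Fraction

K = math.log(2.0) / 100.0  # suggested parameters


def concat(p, q):
    """Formula (x): the concatenation p & q."""
    return p * q / (p * q + (1 - p) * (1 - q))


def win_tolerance_q(p):
    """The q solving q & p/(2 - p) = p, found as q = p & (1 - p/(2 - p))."""
    actual = p / (2 - p)
    return concat(p, 1 - actual)


def tolerance_points(q, k=K):
    """Half of r(q), the share of the error carried by each player."""
    return -math.log(1.0 / float(q) - 1.0) / k / 2.0


if __name__ == "__main__":
    for p in (Fraction(2, 3), Fraction(3, 4), Fraction(9, 10)):
        q = win_tolerance_q(p)
        print(p, q, tolerance_points(q))
```

Every choice of $p$ between 1/2 and 1 yields the same $q$, and therefore the same tolerance in rating points.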



Draw Tolerance

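If the stronger player suffers at worst draws in his normal point sequences (ii), we

need to use (xvi), $p_o \ge \tfrac{1}{2}$, $s = \tfrac{1}{2}$

$$p = \frac{1 + p_o}{4 - 2p_o}$$

and solve for $q$ in

$$q = p \,\&\, \left(\frac{1 + p}{4 - 2p}\right)' = p \,\&\, \left(1 - \frac{1 + p}{4 - 2p}\right) = p \,\&\, \frac{3 - 3p}{4 - 2p} = \frac{3p - 3p^2}{1 + 3p - 4p^2} = \frac{3p}{1 + 4p}$$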

Unlike win tolerances, draw tolerances are dependent on probability (or rating

difference). As $p$ approaches $\tfrac{1}{2}$, $q$ approaches

$$\lim_{p \to 1/2} \frac{3p}{1 + 4p} = \frac{1}{2}$$

and the tolerance approaches

$$\tfrac{1}{2}\,r\left(\tfrac{1}{2}\right) = 0.$$

On the other hand, as $p$ approaches 1, $q$ approaches

$$\lim_{p \to 1} \frac{3p}{1 + 4p} = \frac{3}{5}$$

and as before

$$\text{actual rating} = \text{calculated rating} \pm \tfrac{1}{2}\,r\left(\tfrac{3}{5}\right) \tag{xix}$$

which using the suggested parameters is ±29.24812503... rating points.
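The draw tolerance and its two limits can be checked in the same way. This is a sketch under the suggested parameters; `draw_tolerance_q` and `tolerance_points` are hypothetical names.

```python
import math
from fractions import Fraction

K = math.log(2.0) / 100.0  # suggested parameters


def concat(p, q):
    """Formula (x): the concatenation p & q."""
    return p * q / (p * q + (1 - p) * (1 - q))


def draw_tolerance_q(p):
    """The q solving q & (1 + p)/(4 - 2p) = p."""
    actual = (1 + p) / (4 - 2 * p)
    return concat(p, 1 - actual)


def tolerance_points(q, k=K):
    """Half of r(q), the share of the error carried by each player."""
    return -math.log(1.0 / float(q) - 1.0) / k / 2.0


if __name__ == "__main__":
    for p in (Fraction(1, 2), Fraction(3, 4), Fraction(99, 100)):
        q = draw_tolerance_q(p)
        print(p, q, tolerance_points(q))
    print(tolerance_points(Fraction(3, 5)))  # limiting value as p approaches 1
```

The tolerance grows from zero for evenly matched players toward the limiting value as the stronger player's probability approaches 1, always staying below the win tolerance.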



An old rating difference $r_o$ and a game score $s \in \{0, \tfrac{1}{2}, 1\}$ will produce a new rating

difference $r$ given by

$$r = R(r_o, s) = \frac{1}{k}\,\log\frac{e^{k r_o (r_o > 0)} + s}{e^{-k r_o (r_o < 0)} + 1 - s} \tag{xxii}$$

where the bracketed conditions evaluate as true = 1 and false = 0.

It may be noted here, that the score $s$ may actually be continuous, $0 \le s \le 1$, as a

measure of absolute success, in any venture. As such, there will be no rating change

if and only if the value of this success variable is the same as the probability $p(r_o)$.
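The fixed point property is easiest to see when (xxii) is rewritten in wins per loss, $q = e^{kr}$, which removes the constant $k$. The sketch below uses that equivalent form with exact fractions; the name `new_odds` is hypothetical.

```python
from fractions import Fraction


def new_odds(q_old, s):
    """Formula (xxii) expressed in wins per loss, q = e^(kr).

    q_old >= 1 corresponds to r_o >= 0; q_old < 1 corresponds to r_o < 0.
    """
    if q_old >= 1:
        return (q_old + s) / (2 - s)
    return (1 + s) / (1 / q_old + 1 - s)


def expected_score(q):
    """Probability p = q / (1 + q)."""
    return q / (1 + q)


if __name__ == "__main__":
    for q in (Fraction(3), Fraction(1), Fraction(1, 3)):
        p = expected_score(q)
        # Scoring exactly the expected fraction leaves the odds unchanged.
        print(q, p, new_odds(q, p))
```

A score above the expected fraction raises the odds and a score below it lowers them, so the rating difference moves only when the result departs from expectation.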



13. Formulae Summary

This has been dealt with in this series, for reasons of utility, in 1. Introduction.

It may be noted here that the natural expression of probability, and easier to relate to

for players, is as wins per loss. A value of 1 indicates the perfectly matched

opponent. The mathematics itself would have been simpler in the compilation of this

theory. And lastly, consider Swiss tournament management. Minimising the total

wins per loss represents the best possible pairing of available players from round to

round. This method was used in the Swiss Tournament Manager simulation program

mentioned earlier. It will certainly have contributed to the verification of Odds

Ratings and its implementation as STM. The code would not be difficult to translate

for computers. The organization of games plays as great a part in rating players as

the rating system itself in terms of efficiency. As for internet chess playing sites with

meaningful feedback (as well as supplying pre-game odds), implementation was easy

and already available from the material in 1. Introduction.



14. Playing Strength Distribution

While the nature of a game of skill might be thought to determine the rating function,

it is my conjecture that this is not the case. The relationship between r and p is

rather the common feature. The nature of the game and that of the race will influence

the distribution of playing strengths. But while our ratings are scaled, probabilities

are absolute, and that probability representing one standard deviation from the mean

has great significance. Only the implementation of meaningful ratings, and the

compilation of statistics using these, can evaluate this measure of complexity or

masterability (reverse sides of the same coin).

Consider, for instance, stud poker. The element of chance being greater would

reduce its masterability. However, anything from working the odds and non verbal

communication skills, to unknown powers of the human psyche, may increase

masterability to surprising degrees. This theory would then quantify such differences

in games of skill.



15. Non Chronological Processing

This has been dealt with in this series, for reasons of utility, in 1. Introduction.

Game results can represent a round robin or a rating period without any regard to the

order of the games. A simple n × n cross-table can be used to tally game results.

Another use of such a table is to deal with an entire unrated field by using two passes.

This same technique (setting all players to the mean before processing) can be used

for a tie-break, as it will rate relative play for that one event, without regard to

previous success or failure.

THE END
