8/10/2019 Chess - Probabilistic Rating Theory
Probabilistic Rating Theory
(Odds Ratings)
Formulae
$$A = \tfrac12 (A_o + B_o + R)$$
$$B = \tfrac12 (A_o + B_o - R)$$
$$P = \frac{1}{1 + e^{-KR}}$$
$$R = \frac{1}{K}\log\frac{e^{K R_o (R_o > 0)} + S}{e^{-K R_o (R_o < 0)} + 1 - S}$$
$$K = -\frac{1}{R_s}\log\left(\frac{1}{P_s} - 1\right)$$
Table of Contents
1. Introduction
2. Point Sequences
3. The Rating Function
4. Concatenation of Probabilities
5. Concatenation Laws
6. Practical Ratings
7. Fundamental Theorem of Games of Skill
8. Individual Games
9. Probability Transformations
10. Rating Changes
11. Rating Tolerance
12. Rating Change Formulae
13. Formulae Summary
14. Playing Strength Distribution
15. Non-chronological Processing
1. Introduction
Chess is one of many games of skill at which the novice finds the accomplished
master invincible. This theory is designed to ascribe a RATING to the playing
strength manifested by sequences of game results. ODDS RATINGS are numbers of
mathematical significance. The ratings of two members of the chess playing
fraternity allow the calculation of winning odds. Ratings (and odds) will change
with each game result unless equal players draw. It is a moot point whether actual
playing strength (and associated rating) is effectively constant. Only playing
strength, as manifested in game results, can be measured. We refer to calculated
ratings, which oscillate between fixed limits (TOLERANCE). Within tolerance a
player will perceive his recent PERFORMANCE, but beyond that a change in playing
strength is indicated. Performance variations are a result of POINT SEQUENCES
associated with playing strength, and the interaction of multiple point sequences when
playing several opponents, as in tournaments.
This theory is applicable to any game (situation) where the fraction of absolute
success is measurable. The probability associated with ratings can be interpreted as
the expected fraction of success per game. It provides the most current and accurate
measure that can be calculated from the available data. A few generations of Sharp
programmable calculators have permitted the evolution of both text book theory and
program code (a Swiss Tournament Manager). The STM doubles as a simulator,
which has verified the efficacy of the system. Randomized results generated
according to implied odds have been used to reconstruct assumed ratings. Accuracy
is achieved as predicted by theoretical tolerance expectations.
The system can be parameterized. For ease of mental odds estimation, as well as a
comparable rating range, these seem best: 0 to 3000 should cover all assumed 910
players, considered Normal with mean 1500. The following table of odds against
rating differences implies a tolerance of plus or minus 50 points. Rapid convergence
allows mean = provisional rating. Zero sum rating changes maintain the mean and
prevent rating inflation. As such, Fischer at 2760 would win 6200 games per loss (or
3100 per draw) against the average player.
TABLE 1
Wins per Loss    Rating Difference
        1                0
        2              100
        4              200
        8              300
       16              400
       32              500
       64              600
      128              700
      256              800
      512              900
     1024             1000
This table (or its implied function) forms one statement of the fundamental hypothesis
upon which Odds Ratings rests, allowing the deduction of all (seven) necessary
formulae and (five) protocols for non-chronological processing.
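The function implied by TABLE 1 can be sketched in a few lines. This is only an illustration under the suggested parameterization (2 wins per loss for every 100 rating points); the names `wins_per_loss` and `win_probability` are hypothetical and do not come from the Swiss Tournament Manager.

```python
import math

# Suggested parameterization (TABLE 1): 2 wins per loss per 100 rating points.
Q_S = 2.0
R_S = 100.0
K = math.log(Q_S) / R_S  # system constant k


def wins_per_loss(rating_diff):
    """Expected wins per loss q = e^(k r) for a rating difference r."""
    return math.exp(K * rating_diff)


def win_probability(rating_diff):
    """Probability p = q / (1 + q) for the same rating difference."""
    q = wins_per_loss(rating_diff)
    return q / (1.0 + q)


if __name__ == "__main__":
    # Reprint TABLE 1, then the 2760 versus 1500 example from the text.
    for diff in range(0, 1100, 100):
        print(f"{wins_per_loss(diff):8.0f} {diff:6d}")
    print(f"{wins_per_loss(2760 - 1500):.0f} wins per loss at a 1260 point difference")
```

A 1260 point difference corresponds to 2 raised to the power 12.6, which is where the figure of roughly 6200 games per loss comes from.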
Formulae Summary
Ratings
$a_o$ and $b_o$ will denote the old ratings of two players, $a$ and $b$ the new ratings after
player A scores $s \in \{0, \tfrac12, 1\}$, the result of one game, against player B.
Rating Difference
$r_o$ will denote the old rating difference
$$r_o = a_o - b_o$$
and $r$ will denote the new rating difference
$$r = a - b$$
after player A scores $s \in \{0, \tfrac12, 1\}$, the result of one game, against player B.
System Constant
If $r_s$ is the rating difference used to denote $q_s$ wins per loss, then the system constant
$k$ will be
$$k = \frac{1}{r_s}\log q_s$$
$q_s = 2$ and $r_s = 100$ is suggested as the best parameterization (see TABLE 1).
Probability
Given a rating difference $r$, the expected wins per loss $q$ is given by
$$q = e^{kr}$$
or probability $p$ by
$$p = \frac{q}{1+q} = \frac{1}{1+e^{-kr}} = \tfrac12 + \tfrac12\tanh\left(\tfrac12 kr\right)$$
Rating Difference Transformation
An old rating difference $r_o$ and a game score $s \in \{0, \tfrac12, 1\}$ will produce a new rating
difference $r$ given by
$$r = \frac{1}{k}\log\frac{e^{k r_o (r_o > 0)} + s}{e^{-k r_o (r_o < 0)} + 1 - s}$$
where $1 \equiv$ true and $0 \equiv$ false.
Rating Change
$a$ and $b$ are the new ratings after a rating difference change from $r_o$ to $r$
$$a = \tfrac12 (a_o + b_o + r)$$
$$b = \tfrac12 (a_o + b_o - r)$$
Note: These formulae are sufficient to process chronological rating changes (as for
instance while managing a Swiss tournament).
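A minimal sketch of one chronological update, assuming the suggested parameters and the reading of the transformation in which the bracketed conditions evaluate to 1 or 0. The function names are hypothetical, and rounding to whole rating points is left out.

```python
import math

K = math.log(2.0) / 100.0  # suggested parameters: q_s = 2, r_s = 100


def new_rating_difference(r_old, score, k=K):
    """Rating difference transformation; score is player A's result (0, 0.5 or 1)."""
    up = math.exp(k * r_old) if r_old > 0 else 1.0     # e^(k r_o (r_o > 0))
    down = math.exp(-k * r_old) if r_old < 0 else 1.0  # e^(-k r_o (r_o < 0))
    return math.log((up + score) / (down + 1.0 - score)) / k


def update_ratings(a_old, b_old, score, k=K):
    """Return the new ratings (a, b); their sum is conserved."""
    r_new = new_rating_difference(a_old - b_old, score, k)
    total = a_old + b_old
    return (total + r_new) / 2.0, (total - r_new) / 2.0


if __name__ == "__main__":
    print(update_ratings(1500, 1500, 1))    # two new members, A wins
    print(update_ratings(1500, 1500, 0.5))  # equal players draw
```

Because only the rating difference is transformed and the sum is held fixed, the update is zero sum by construction.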
Non-Chronological Processing
Non-chronological processing requires special protocols. With Odds Ratings a
chronology must be assumed to allow processing as normal. These protocols
optimize the assumptions and prevent anomalous ones, as explained below. They
effectively average the performance of each player. In fact, non-chronological
processing is the only average of any kind produced by Odds Ratings.
The protocols follow:
1. Maximize Draws: individual pairs of players are assumed to have drawn the
maximum number of games achieving their overall score. Draw tolerances
are smaller than win tolerances, and the corresponding point sequences
converge more rapidly (see below for formula).
2. Score Based Point Sequences: by normalizing the point sequences so that the
wins per loss ratio remains as constant throughout as possible, the
accumulation of losses or wins at one end of the sequence does not occur (by
design or accident). Such an accumulation would produce the effect of an increasing or
decreasing rating throughout the games. The $i$th game result in a normal
point sequence, maximizing draws, is given by
$$s_i = \tfrac12\left(\operatorname{int}\left(\tfrac12 + 2 i p_{ab}\right) - \operatorname{int}\left(\tfrac12 + 2 (i-1) p_{ab}\right)\right)$$
where:
$\operatorname{int} x$ is the largest whole number less than or equal to $x$,
$$p_{ab} = \frac{s_{ab}}{g_{ab}},$$
$s_{ab}$ is the overall score, and $g_{ab}$ is the total number of games,
by player A against player B.
3. Completed Rounds: if the maximum games by any individual pair of players
is $g_m$, then processing must be executed as $g_m$ rounds, where no pair of players
play more than one game per round. Furthermore, pairs with one game play
in the last round, those with 2 games in the last 2 rounds, those with $g_m - 1$ in
all but the first. That way the players with fewer games are processed against
more accurate ratings.
4. Hierarchical Processing: before each round, players are sorted according to
their progressively recalculated ratings. They are then processed from the
lowest to the highest, each against the lowest to the highest remaining. First
in best dressed effects are thus avoided, the weaker players getting first bite of
the cherry, rather than the field being plundered of points by the strongest first.
5. Two Pass Processing: by using two passes to reprocess a completed Swiss
tournament, the method most effectively rates an entire field of unrated
players (set to the system average = provisional rating). However, a small
number of unrated players among several rated players need no special
treatment. This same method also provides the perfect tie-break, finely
grading relative play on the basis of a single event.
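Protocols 2 and 3 can be sketched as follows. This is a hedged illustration only: it assumes the point sequence formula includes the one-half rounding term, the helper names are hypothetical, and exact fractions are used so that the integer part is never disturbed by floating point error.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)


def normal_point_sequence(score, games):
    """Protocol 2: draw-maximizing results for `score` points from `games` games."""
    p = Fraction(score) / games  # p_ab = s_ab / g_ab
    return [HALF * (floor(HALF + 2 * i * p) - floor(HALF + 2 * (i - 1) * p))
            for i in range(1, games + 1)]


def rounds_for_pair(games, max_games):
    """Protocol 3: a pair with `games` games plays in the last `games` rounds."""
    return list(range(max_games - games + 1, max_games + 1))


if __name__ == "__main__":
    # 2.5 points from 4 games, in a field where some pair met 6 times.
    print(normal_point_sequence(Fraction(5, 2), 4))
    print(rounds_for_pair(4, 6))
```

The generated results always add up to the overall score, so no points are created or lost by assuming a chronology.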
Conclusion
The information here is sufficient to implement Odds Ratings either on a site such as
Chess.com or worldwide. The theoretical proof or derivation from the fundamental
hypothesis is available, if involved. This hypothesis was, in fact, that the relationship
between probabilities $p$ and rating differences $r$ is, in essence,
$$p = \tfrac12 + \tfrac12\tanh r$$
which, when graphed, will be seen to be intuitively obvious. But the shape came to
me in 1974, and I'm ashamed to say that the exact function came to me only ten years
ago, though others had been tried. The correct function precipitated some
surprisingly simplified mathematics and vastly superior results. That the exponential
nature didn't occur to me sooner I can only blame on the fact that I never did my
homework. I should have noticed the similarity to differential equations for capstans.
But here is the reason for the invincibility of masters of the game, and the many
grades in any form of mastery. TABLE 1 is then a relatively trivial mathematical
implication of this (with added parameterization for the more familiar scaling).
Point sequences form an integral and necessary part of the theory of Odds Ratings.
For the chess player who is also a mathematician (such as Arpad Elo, Max Euwe and
Jose Capablanca) this theory does a lot to provide him with more realistic
expectations of his abilities. The interaction of these sequences during tournaments
also goes far to demonstrate possible outcomes that can be expected. Point
sequences are idealized ordered sets of game results. They are not very far from the
reality, certainly not as far as Professor Arpad Elo imagined. The measurement of
the rating of an individual might be compared with the measurement of the position of
a cork bobbing up and down on the surface of agitated water with a yard stick tied to a
rope swaying in the wind. This is an illusion, not a fact. Point sequences don't
stray so far as the bobbing cork, and the linear functions used are the yard stick, and
way off. But Elo Ratings and its offspring are not based on probability, but on
statistics. Averages take much data even when nothing is changing overall. What is
more changeable than man?
$$p = \tfrac12 + \tfrac12\tanh\left(\tfrac12 kr\right), \qquad k = \frac{1}{r_s}\log q_s$$
$q_s = 2$ wins per loss, $r_s = 100$ rating difference
Arpad Elo was commissioned by the USCF in 1959 to replace dysfunctional systems,
and provided it on request in one year. FIDE struggled on 10 years more without Elo
Ratings. Presumably they were adopted in desperation because nothing better was
forthcoming. Remember that Einstein didn't produce Relativity on demand in one
year, rather in ten, and driven by curiosity. He said "It is a wonder that the tender
plant of a young child's curiosity isn't entirely strangled by modern methods of
instruction." Well, that's the best excuse I can come up with for not doing my
homework. But Arpad did the best he could in the time he had. The inductive leap
often required is usually not available on demand. Like the tender plant it must be
nurtured and loved. And beginning with an incorrect hypothesis can leave you
flogging a dead horse, just like placing faith in a flawed chess opening system.
Johnny Chess,
Diploma of Teaching,
Secondary Mathematics & Computing, 1990.
2. Point Sequences
A game of skill is one at which absolute mastery is unobtainable. This mathematics
can accordingly never deliver, nor process, a probability of one or zero. It measures
the relative degree of mastery of players of a particular game.
Chess is a game of skill, although, for a while, Capablanca was thought (even by
himself) to have attained such complete mastery. For six years he did not lose a
single game. While computers are on the verge of invincibility by human masters,
there will be no certainty of outcome amongst themselves. The number of electrons
in the visible universe is less than the number of games possible.
The relative playing strength of any two players A and B is evident in the proportion
of points scored by each. Thus our measurable degree of relative mastery consists of
the probability $p_{ab}$ of a win (absolute success) by player A. If A is the stronger
player and will risk losing rather than draw, his $m$th to his $n$th game results $s_i$
will consist of the point sequence
$$s_i = \operatorname{int}\left(\tfrac12 + i p_{ab}\right) - \operatorname{int}\left(\tfrac12 + (i-1) p_{ab}\right) \qquad \text{(i)}$$
where $i = m, m+1, \ldots, n-1, n$.
For instance, $p_{ab} = 0.75$, $m = 1$, $n = 4$ is the sequence {1, 1, 0, 1}.
On the other hand, were A a cautious player, who will draw before risking a loss,
never losing against a weaker player
$$s_i = \tfrac12\left(\operatorname{int}\left(\tfrac12 + 2 i p_{ab}\right) - \operatorname{int}\left(\tfrac12 + 2(i-1) p_{ab}\right)\right) \qquad \text{(ii)}$$
where the same $p_{ab}$, $m$ and $n$ give the sequence {1, ½, 1, ½}.
These idealizations lose no generality, since the reality deviates but little. The
deviations can be interpreted as fluctuations in playing strength. We need to infer
$p_{ab}$ from these normal point sequences. However, for a given point sequence, $p_{ab}$ is
not unique.
Given $p_{ab} = 0.75$, $m = 1$, $n = 8$, sequence (i) is {1,1,0,1,1,1,0,1} but $p_{ab} = 0.8$ produces
the sequence {1,1,0,1,1,1,1,0} so that $n < 7$ cannot distinguish $p_{ab}$ to this precision.
On the contrary note, too long a sequence implies too long a rating period, where $p_{ab}$
is not likely to have remained constant for these two players. Further, we are only
finding the relative strength of players A and B, and other players need to be
integrated into our calculations. This integration requires Odds Ratings, which we
shall deduce.
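The worked sequences above can be reproduced with a short sketch. It assumes that both (i) and (ii) carry a one-half rounding term inside int, which is the reading that matches the quoted examples; the function names are hypothetical and exact fractions are used for the probabilities.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)


def win_sequence(p, m, n):
    """Sequence (i): the stronger player risks losses rather than drawing."""
    return [floor(HALF + i * p) - floor(HALF + (i - 1) * p)
            for i in range(m, n + 1)]


def draw_sequence(p, m, n):
    """Sequence (ii): the stronger player draws rather than risking a loss."""
    return [HALF * (floor(HALF + 2 * i * p) - floor(HALF + 2 * (i - 1) * p))
            for i in range(m, n + 1)]


def first_difference(p1, p2, limit=100):
    """First game index at which sequences (i) for p1 and p2 disagree, if any."""
    for i in range(1, limit + 1):
        if win_sequence(p1, i, i) != win_sequence(p2, i, i):
            return i
    return None


if __name__ == "__main__":
    print(win_sequence(Fraction(3, 4), 1, 8))
    print(win_sequence(Fraction(4, 5), 1, 8))
    print(draw_sequence(Fraction(3, 4), 1, 4))
    print(first_difference(Fraction(3, 4), Fraction(4, 5)))
```

The last call reports the game at which probabilities of 0.75 and 0.8 first produce different results, which is why shorter sequences cannot separate them.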
3. The Rating Function
Each player will be assigned a rating representing his playing strength. This will
increase when the weaker player in a game wins or draws, and decrease when he
loses. Rating points will not be created or destroyed by rated games. They will be
won or lost like poker chips, where each player starts play with the same value. The
initiate will receive the average rating. The average rating will thereby never change.
We require a rating function $f(r)$ which will convert a rating difference $r$ between
player A (rated $a$) and player B (rated $b$)
$$r = a - b$$
to a probability $p_{ab}$ or simply $p$
$$f(r) = p$$
Intuitively, we expect that the following hold
$$f(0) = \tfrac12$$
$$\lim_{r \to \infty} f(r) = 1$$
$$\lim_{r \to -\infty} f(r) = 0$$
Some candidates include
$$f(r) = \tfrac12 + \frac{r}{2\sqrt{r^2 + 1}} \qquad \text{(iii)}$$
$$f(r) = \tfrac12 + \frac{1}{\pi}\tan^{-1} r \qquad \text{(iv)}$$
$$f(r) = \tfrac12 + \tfrac12\tanh\frac{r}{2} \qquad \text{(v)}$$
Of these (v) is the simplest and can be simplified further to
$$\tfrac12 + \tfrac12\cdot\frac{e^{r/2} - e^{-r/2}}{e^{r/2} + e^{-r/2}} = \tfrac12 + \tfrac12\cdot\frac{e^{r} - 1}{e^{r} + 1} = \frac{e^{r}}{e^{r} + 1}$$
and finally to
$$p = f(r) = \frac{1}{1 + e^{-r}} \qquad \text{(v)}$$
This satisfies Einstein's expectation that the laws of nature tend to be relatively
simple. It also allows some surprisingly simple deductions. We shall adopt (v) as
the fundamental axiom relating probabilities and playing strengths in games of
skill.
Some may hasten to dismiss this claim because of the degree of mastery attainable in
chess, compared to, say, darts. The difference between games of skill lies in the
range of probabilities attainable by any population. For instance, a measure of
masterability might be the probability represented by one standard deviation from the
mean. This would require statistics to evaluate. It depends on the nature of the
game, as well as on the capabilities of the players.
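4. Concatenation of Probabilities
Let $p_{ab}$ be the probability that player A wins against player B, and $p_{bc}$ that player B
wins against player C. We will call $p_{ac}$, the probability that player A wins against
player C, the concatenation of $p_{ab}$ and $p_{bc}$. The concatenation operator will be the
symbol & thus
$$p_{ab} \mathbin{\&} p_{bc} = p_{ac}$$
The result of this operation depends on our definition of $f(r) = p$ and its inverse
function $f^{-1}(p) = r$ so that:
from (iii)
$$r = f^{-1}(p) = \pm\sqrt{\frac{4p^2 - 4p + 1}{4p - 4p^2}} \qquad \text{(vi)}$$
(taking the sign of $2p - 1$),
from (iv)
$$r = f^{-1}(p) = \tan\left(\pi\left(p - \tfrac12\right)\right) \qquad \text{(vii)}$$
and from (v)
$$r = f^{-1}(p) = -\log\left(\frac{1}{p} - 1\right) \qquad \text{(viii)}$$
Then we have, for concatenation
$$p_{ab} \mathbin{\&} p_{bc} = p_{ac} = f(a - c) = f\big((a - b) + (b - c)\big) = f\left(f^{-1}(p_{ab}) + f^{-1}(p_{bc})\right) \qquad \text{(ix)}$$
For example, given $p_{ab} = p_{bc} = 0.75$ we have $p_{ac} = 0.8780..., 0.8524..., 0.9$ using (iii),
(iv) and (v) respectively. Note that the choice of (v) produces a rational formula for
concatenation. Using (viii) in (ix) we have
$$f^{-1}(p_{ab}) + f^{-1}(p_{bc}) = -\log\left(\frac{1}{p_{ab}} - 1\right) - \log\left(\frac{1}{p_{bc}} - 1\right) = -\log\left(\frac{1 - p_{ab}}{p_{ab}}\cdot\frac{1 - p_{bc}}{p_{bc}}\right)$$
which when substituted into (v) gives
$$p_{ac} = \frac{p_{ab}\,p_{bc}}{p_{ab}\,p_{bc} + (1 - p_{ab})(1 - p_{bc})}$$
so that we obtain the formula for probabilities $p$ and $q$
$$p \mathbin{\&} q = \frac{pq}{pq + (1-p)(1-q)} \qquad \text{(x)}$$
and for rational operands
$$\frac{a}{b} \mathbin{\&} \frac{c}{d} = \frac{ac}{ac + (b-a)(d-c)} \qquad \text{(xi)}$$
For example
$$\frac37 \mathbin{\&} \frac45 = \frac{12}{12 + 4\cdot 1} = \frac34$$
Mathematical induction easily proves the simple formula for an arbitrary number of
concatenations
$$p_1 \mathbin{\&} p_2 \mathbin{\&} \ldots \mathbin{\&} p_n = \frac{\prod_{i=1}^{n} p_i}{\prod_{i=1}^{n} p_i + \prod_{i=1}^{n} (1 - p_i)} \qquad \text{(xii)}$$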
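The three example values for concatenating 0.75 with 0.75 can be checked numerically. The sketch below assumes the candidate functions (iii), (iv) and (v) and their inverses as read from this section; the dictionary and function names are hypothetical.

```python
import math

# Each entry pairs a candidate rating function f with its inverse.
CANDIDATES = {
    "iii": (lambda r: 0.5 + r / (2.0 * math.sqrt(r * r + 1.0)),
            lambda p: (2.0 * p - 1.0) / math.sqrt(4.0 * p - 4.0 * p * p)),
    "iv": (lambda r: 0.5 + math.atan(r) / math.pi,
           lambda p: math.tan(math.pi * (p - 0.5))),
    "v": (lambda r: 1.0 / (1.0 + math.exp(-r)),
          lambda p: -math.log(1.0 / p - 1.0)),
}


def concatenate_via(label, p_ab, p_bc):
    """Formula (ix): add the two rating differences, then map back to a probability."""
    f, f_inverse = CANDIDATES[label]
    return f(f_inverse(p_ab) + f_inverse(p_bc))


def concatenate(p, q):
    """Closed form (x), which holds for candidate (v) only."""
    return p * q / (p * q + (1.0 - p) * (1.0 - q))


def concatenate_all(probabilities):
    """Formula (xii) for any number of operands."""
    wins = math.prod(probabilities)
    losses = math.prod(1.0 - p for p in probabilities)
    return wins / (wins + losses)


if __name__ == "__main__":
    for label in CANDIDATES:
        print(label, round(concatenate_via(label, 0.75, 0.75), 4))
    print(concatenate(3 / 7, 4 / 5))
```

Only candidate (v) reduces the round trip through rating differences to plain arithmetic on the probabilities themselves.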
5. Concatenation Laws
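Let $p$, $q$, $r$ be probabilities, and & the concatenation operator defined by
$$p \mathbin{\&} q = \frac{pq}{pq + (1-p)(1-q)} \qquad \text{(x)}$$
Probabilities under concatenation can be described as an Abelian group with identity
element $\tfrac12$, and inverse $p^{-1} = (1 - p)$. A mathematician will then immediately
understand that the following is true for concatenation.
We can show using (x) that the following properties of Abelian groups hold
0. $0 < p \mathbin{\&} q < 1$ (is also a probability) for any probabilities $p$, $q$
1. $p \mathbin{\&} (q \mathbin{\&} r) = (p \mathbin{\&} q) \mathbin{\&} r$
2. $p \mathbin{\&} \tfrac12 = p$ for any probability $p$
3. $p \mathbin{\&} (1 - p) = \tfrac12$ for any probability $p$
4. $p \mathbin{\&} q = q \mathbin{\&} p$ for any probabilities $p$, $q$
These laws are useful for dealing with concatenation mathematics. For example,
suppose we need to know what $p$ needs to be concatenated with a particular $q$ to
produce the probability $r$. We need to solve the equation $p \mathbin{\&} q = r$ by eliminating
$q$ from the left hand side (LHS) of the equation.
$$p \mathbin{\&} q = r$$
$$(p \mathbin{\&} q) \mathbin{\&} q^{-1} = r \mathbin{\&} q^{-1}$$
$$p \mathbin{\&} (q \mathbin{\&} q^{-1}) = r \mathbin{\&} q^{-1} \quad \text{by 1.}$$
$$p \mathbin{\&} \tfrac12 = r \mathbin{\&} q^{-1} \quad \text{by 3.}$$
$$p = r \mathbin{\&} q^{-1} \quad \text{by 2.}$$
$$p = r \mathbin{\&} (1 - q)$$
For example, the solution to
$$p \mathbin{\&} \frac34 = \frac13$$
is now trivial
$$p = \frac13 \mathbin{\&} \left(1 - \frac34\right) = \frac13 \mathbin{\&} \frac14 = \frac{1}{1 + 2\cdot 3} = \frac17$$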
This is much simpler than solving using (x) directly. These laws will be used later to
greatly simplify some important derivations.
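The laws and the worked solution can be checked with exact arithmetic. This is a small sketch with hypothetical names (`concat`, `inverse`, `solve_left`), written against formula (x) only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # identity element of the group


def concat(p, q):
    """Concatenation (x) on exact fractions strictly between 0 and 1."""
    return p * q / (p * q + (1 - p) * (1 - q))


def inverse(p):
    """Group inverse: concatenating p with 1 - p gives one half."""
    return 1 - p


def solve_left(q, r):
    """Return the p for which p & q = r, using p = r & (1 - q)."""
    return concat(r, inverse(q))


if __name__ == "__main__":
    p = solve_left(Fraction(3, 4), Fraction(1, 3))
    print(p, concat(p, Fraction(3, 4)))
```

Using `Fraction` keeps every intermediate value rational, which mirrors the rational form (xi) of the operator.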
6. Practical Ratings
The rating function (v) supplies all conceivable probabilities for practical play
over a very small domain of rating values. About 18 rating points cover probabilities from
0.0001 to 0.9999. A scale constant can be introduced to produce the familiar user
friendly range. This is not some arbitrary constant used to fine tune dysfunction.
We decide that a probability $p_s$ shall be represented by a rating difference $r_s$, and
this is used to generate our system scaling constant $k$. We also need to decide on the
constant system average. A range of 0 to 3000, Normally distributed, will imply a
mean of 1500. Given $p_s = 2/3$ and $r_s = 100$, the range may be exceeded if chess
goes galactic or includes computers, but it should suffice us on middle earth among
men during the fourth age... It makes odds estimation easy, and seems roughly
equivalent to the high maintenance guide to predicting the outcome in use since
1960.
Accordingly, we will define the probability function as the scaled rating function
$$p(r) = \frac{1}{1 + e^{-kr}} \qquad \text{(xiii)}$$
and its inverse, the rating difference function, as
$$r(p) = -\frac{1}{k}\log\left(\frac{1}{p} - 1\right) \qquad \text{(xiv)}$$
Substituting $p_s$ and $r_s$ into (xiv) and rearranging, the system constant is given by
$$k = -\frac{1}{r_s}\log\left(\frac{1}{p_s} - 1\right) \qquad \text{(xv)}$$
Choosing the suggested parameters,
$$k = -\frac{1}{100}\log\left(\frac32 - 1\right) = 6.931471806... \times 10^{-3}$$
The mean must be the provisional rating to prevent inflation, and all rating changes
must be to the nearest integer to prevent leakage due to rounding error. This last will
mean that rating differences greater than 717 will cause no direct change, but that
would be poor organization.
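The system constant and the effect of integer rounding can be explored with a short sketch. It assumes that each player's change is half the change in rating difference, rounded to the nearest whole point, and that the case of interest is the higher rated player winning; the function names are hypothetical.

```python
import math


def system_constant(p_s, r_s):
    """Formula (xv): k = -(1 / r_s) log(1 / p_s - 1)."""
    return -math.log(1.0 / p_s - 1.0) / r_s


def favourite_gain(r_old, k):
    """Unrounded points gained by the higher rated player (r_old > 0) for a win."""
    return math.log(1.0 + math.exp(-k * r_old)) / (2.0 * k)


def first_unchanged_difference(k, limit=3000):
    """Smallest whole rating difference at which that gain rounds to zero."""
    for r_old in range(1, limit + 1):
        if round(favourite_gain(r_old, k)) == 0:
            return r_old
    return None


if __name__ == "__main__":
    k = system_constant(2.0 / 3.0, 100.0)
    print(k)
    print(first_unchanged_difference(k))
```

The gain follows from the rating difference transformation with a score of 1: the new difference exceeds the old one by log(1 + e^(-k r)) / k, and each player takes half of it.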
7. Fundamental Theorem of Games of Skill
If player A wins $m$ games per loss against player B, and player B wins $n$ games per
loss against player C, then player A wins $mn$ games per loss against player C.
This is a direct consequence of (xi) and easily proven by substituting
$$p = \frac{m}{m+1}$$
and
$$q = \frac{n}{n+1}$$
giving
$$p \mathbin{\&} q = \frac{mn}{mn+1}$$
The simplicity of this theorem is intuitively obvious. It convinces me that our
axiomatic basis (v) is the correct mathematical formulation of the material reality.
Its validity can be contested as the theorem provides a statistically verifiable
deduction.
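The substitution can be written out in one line. The block below is a worked version of that step, using only the rational form (xi) with numerators $m$, $n$ and denominators $m+1$, $n+1$.

```latex
\begin{align*}
\frac{m}{m+1} \mathbin{\&} \frac{n}{n+1}
  &= \frac{mn}{mn + \big((m+1) - m\big)\big((n+1) - n\big)} \\
  &= \frac{mn}{mn + 1}
\end{align*}
```

A probability of $mn/(mn+1)$ is exactly $mn$ wins per loss, which is the statement of the theorem.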
8. Individual Games
Now consider point sequences with draws maximized (ii) for:
p = 1/2
{ ½ ½ ½ ½ ½ ½ }
p = 3/4
{ 1 ½ 1 ½ 1 ½ 1 ½ }
p = 5/6
{ 1 ½ 1 1 ½ 1 1 ½ 1 1 ½ }
p = 7/8
{ 1 1 ½ 1 1 1 ½ 1 1 1 ½ 1 1 1 ½ }
In this instance the corresponding probability transformations will need to be
p = 1/2
{ 1/2 1/2 1/2 1/2 1/2 1/2 }
p = 3/4
{ 4/5 3/4 4/5 3/4 4/5 3/4 4/5 3/4 }
p = 5/6
{ 7/8 5/6 6/7 7/8 5/6 6/7 7/8 5/6 6/7 7/8 5/6 }
p = 7/8
{ 9/10 10/11 7/8 8/9 9/10 10/11 7/8 8/9 9/10 10/11 7/8 8/9 9/10 10/11 7/8 }
These sequences apply to p ≥ 1/2. The sequences for the opponents in every case
require replacing scores s with 1 − s and probabilities p with the inverse 1 − p.
Discovering formulae which will produce the required probability transformations is
now a fairly simple matter of trial and algebra.
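9. Probability Transformations
Inspection will show that the following formulae produce the required probability
transformations $p$ for the game results $s$ and the initial probability $p_o$ as indicated.
$$p = \frac{1}{2 - p_o}, \quad p_o \ge \tfrac12,\ s = 1$$
$$p = \frac{1 + p_o}{4 - 2p_o}, \quad p_o \ge \tfrac12,\ s = \tfrac12$$
$$p = \frac{p_o}{2 - p_o}, \quad p_o \ge \tfrac12,\ s = 0$$
$$p = \frac{2p_o}{1 + p_o}, \quad p_o \le \tfrac12,\ s = 1$$
$$p = \frac{3p_o}{2 + 2p_o}, \quad p_o \le \tfrac12,\ s = \tfrac12$$
$$p = \frac{p_o}{1 + p_o}, \quad p_o \le \tfrac12,\ s = 0 \qquad \text{(xvi)}$$
But it will be noted that the score $s$ can be incorporated into the formulae (xvi) to
reduce the number needed to
$$p = \frac{s(1 - p_o) + p_o}{2 - p_o}, \quad p_o \ge \tfrac12$$
$$p = \frac{(s + 1)\,p_o}{1 + p_o}, \quad p_o \le \tfrac12 \qquad \text{(xvii)}$$
It will be apparent that, if the probabilities are incorrect in the sequence, they will
rapidly converge toward the correct values with each cycle. This is easily
demonstrated using a programmable calculator.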
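The convergence claim can be tried out with a few lines in place of a programmable calculator. This sketch assumes the two-branch reading of (xvii), split at a probability of one half; `transform` is a hypothetical name and exact fractions are used so the cycle values can be compared directly.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def transform(p_old, score):
    """Formula (xvii): new probability after a game scored 0, 1/2 or 1."""
    if p_old >= HALF:
        return (score * (1 - p_old) + p_old) / (2 - p_old)
    return (score + 1) * p_old / (1 + p_old)


if __name__ == "__main__":
    # Starting from the correct value 3/4, a win then a draw cycles 4/5, 3/4.
    p = Fraction(3, 4)
    for score in (1, HALF, 1, HALF):
        p = transform(p, score)
        print(p)

    # Starting from a wrong value, the same results pull the estimate toward 3/4.
    p = HALF
    for _ in range(10):
        p = transform(transform(p, 1), HALF)
        print(float(p))
```

The second loop begins at one half and repeats the win and draw pattern of a true probability of 3/4; the value printed after each draw moves steadily closer to 0.75.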
10. Rating Changes
Probabilities can only apply to pairs of players. The rating difference function (xiv)
will convert the transformed probability into a rating difference
$$r = r(p) = -\frac{1}{k}\log\left(\frac{1}{p} - 1\right)$$
Example
Player A and player B are two new members. They are given the system average for
their provisional ratings, so that a = b = 1500. Thus the rating difference is
$r = a - b = 0$. Using the probability function (xiii)
$$p = p(r) = \frac{1}{1 + e^{-kr}} = \tfrac12$$
Let us assume that player A wins the first game. All our variables so far become our
original values for recalculation
$$a_o = 1500, \quad b_o = 1500, \quad r_o = 0, \quad p_o = 0.5$$
Now $p_o \ge \tfrac12$ and $s = 1$ so from (xvi) we can transform the probability $p$ using
$$p = \frac{1}{2 - p_o} = \frac{1}{2 - 1/2} = \frac23$$
By (xiv) the new rating difference $r$ is then
$$r = r(p) = -\frac{1}{k}\log\left(\frac{1}{p} - 1\right) = 100$$
Now, since rating points are won and lost conservatively, our new ratings are given by
the simultaneous solution of
$$a + b = a_o + b_o$$
$$a - b = r$$
which gives
$$a = \tfrac12 (a_o + b_o + r)$$
$$b = \tfrac12 (a_o + b_o - r)$$
so that a = 1550 and b = 1450. It is now in the best interest of establishing ratings
quickly, that new opponents be found to match their new ratings. Player A's rating
will rise by 50 points, each time he wins against an equal or higher rated player. This
sets the limit for possible rating increase, depending heavily on available competition.
But this limit of 50 points is also our rating tolerance.
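The fixed 50 point step for a win against an equal or higher rated opponent follows from (xvi) and (xiv). The lines below sketch that step, assuming the suggested parameters so that $\frac{1}{k}\log 2 = 100$.

```latex
\begin{align*}
p &= \frac{2p_o}{1 + p_o} \qquad (p_o \le \tfrac12,\ s = 1) \\
\frac{p}{1 - p} &= \frac{2p_o}{1 - p_o} \\
r &= \frac{1}{k}\log\frac{p}{1 - p} = r_o + \frac{1}{k}\log 2 = r_o + 100 \\
a - a_o &= \tfrac12 (r - r_o) = 50
\end{align*}
```

The wins per loss ratio simply doubles, so the rating difference moves by the 100 points that represent a factor of two, and each player absorbs half of it.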
11. Rating Tolerance
In chess, draws are the result of 50% of master games. In Chinese chess, draws are
rare, largely due to the fact that the stalemated player is deemed to have lost as in
draughts (checkers). Morphy sometimes lost trying to win drawn games. Schlechter
was called the drawing master and may have been Emanuel Lasker's equal, but in
their match together, the condition required by the defending champion was two
games up. This forced Schlechter to play the last game to win, and lose an easy
draw instead. Fischer knew draws do not win tournaments, and achieved an
unprecedented 11-0 round robin result in the USA Championships 1963-64.
Capablanca considered that throwing the game to the winds of chance was a mistake,
and his games show it. Alekhine said that chess was more than just knowledge and
logic when he complicated for creative opportunities. A brilliant tactician will take
lesser mortals out of their depth, but risking somewhat to luck.
Understanding rating tolerances will help explain the greater success of the risk
takers. Win tolerances are far greater than draw tolerances, especially when well
matched. Therefore they win, as well as lose, more spectacularly. This theory
explains why.
Point sequences (i) and (ii) where $p = \frac{n}{n+1}$, $n = 1, 2, 3, \ldots$ are prime sequences.
For n = 2, p = 2/3 and (i) gives { 1 1 0 1 1 0 1 1 0 1 1 0 }
For n = 3, p = 3/4 and (i) gives { 1 1 1 0 1 1 1 0 1 1 1 0 }
For any 2/3 < p < 3/4 such as p = 5/7 our point sequence is no longer regular.
For p = 5/7 (i) gives { 1 1 0 1 1 1 0 1 1 0 1 1 1 0 }
Composite sequences such as this alternate between the behaviour of the bounding
prime sequences. We need only to understand prime sequence tolerance for all
practical purposes, and consider that the rating is fluctuating between them. The
length of time that the composite sequence would emulate one or the other prime
sequence depends on proximity. The same argument applies to the practical
sequence, which would behave similarly, but less regularly again. The ratings can be
interpreted as responding to variations in performance within tolerance, and to a
change in playing strength whenever tolerance is exceeded. It will also become
apparent that the top end (of a field of players who are interacting largely among
themselves) will tend toward ratings erring in excess. The converse is true for the
bottom end. Those in the middle are more accurate and err either way. This is a
fortuitous circumstance as players migrating to better matched groups will
import/export rating points as required. Intergroup matches will then produce
necessary adjustments very efficiently.
Win Tolerance
Our question is, how much does the calculated probability change throughout the
cycle, given that convergence is achieved to precision. If our players A and B with
$a \ge b$ maximize wins by (i), their normal point sequences completely neutralize
maximum error at the convergence point, a loss for player A. This error is then
completely corrected by (xvi) with $p_o \ge \tfrac12$, $s = 0$
$$p = \frac{p_o}{2 - p_o}$$
Therefore we need to solve
$$q \mathbin{\&} \frac{p}{2 - p} = p$$
for $q$, because $\frac{p}{2-p}$ is the actual probability and $p$ is the highest calculated
probability obtained. Thus by (xiv) our rating tolerance is given by $r(q)$ since
$$r\left(\frac{p}{2 - p}\right) + r(q) = r(p)$$
We can use the concatenation laws to isolate $q$ in
$$q \mathbin{\&} \frac{p}{2 - p} = p$$
$$q \mathbin{\&} \frac{p}{2 - p} \mathbin{\&} \left(\frac{p}{2 - p}\right)' = p \mathbin{\&} \left(\frac{p}{2 - p}\right)'$$
$$q = p \mathbin{\&} \left(1 - \frac{p}{2 - p}\right) = p \mathbin{\&} \frac{2 - 2p}{2 - p}$$
and this can be evaluated using (x) to give
$$q = \frac{p\cdot\frac{2 - 2p}{2 - p}}{p\cdot\frac{2 - 2p}{2 - p} + (1 - p)\left(1 - \frac{2 - 2p}{2 - p}\right)} = \frac{2p - 2p^2}{2p - 2p^2 + p - p^2} = \frac{2(p - p^2)}{3(p - p^2)} = \frac23$$
Surprisingly the (win) tolerance is independent of $p$ (and $r$), and for the suggested
parameters we find that the rating difference will have a maximum error of
$$r\left(\tfrac23\right) = -\frac{1}{k}\log\left(\frac32 - 1\right) = 100$$
and since the error is split between both players (the highest rated with an excess, and
the lowest with a deficit)
$$\text{actual rating} = \text{calculated rating} \pm \tfrac12\, r\left(\tfrac23\right) \qquad \text{(xviii)}$$
which using suggested parameters is $\pm 50$ rating points.
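That the win tolerance does not depend on the probability can be confirmed with exact fractions. The sketch below uses hypothetical function names and relies only on (x), the loss branch of (xvi), and (xiv) with the suggested parameters.

```python
import math
from fractions import Fraction


def concat(p, q):
    """Concatenation (x)."""
    return p * q / (p * q + (1 - p) * (1 - q))


def win_tolerance_probability(p):
    """The q solving q & p/(2 - p) = p, obtained as q = p & (1 - p/(2 - p))."""
    return concat(p, 1 - p / (2 - p))


def rating_difference(p, k):
    """Formula (xiv): r(p) = -(1/k) log(1/p - 1)."""
    return -math.log(1.0 / float(p) - 1.0) / k


if __name__ == "__main__":
    k = math.log(2.0) / 100.0  # suggested parameters
    for p in (Fraction(3, 5), Fraction(3, 4), Fraction(9, 10)):
        q = win_tolerance_probability(p)
        print(p, q, rating_difference(q, k) / 2.0)
```

Each starting probability should yield the same q, and half of the corresponding rating difference is the tolerance carried by each player.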
Draw Tolerance
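If the stronger player suffers at worst draws in his normal point sequences (ii), we
need to use (xvi), $p_o \ge \tfrac12$, $s = \tfrac12$
$$p = \frac{1 + p_o}{4 - 2p_o}$$
and solve for $q$ in
$$q = p \mathbin{\&} \left(\frac{1 + p}{4 - 2p}\right)' = p \mathbin{\&} \left(1 - \frac{1 + p}{4 - 2p}\right) = p \mathbin{\&} \frac{3 - 3p}{4 - 2p} = \frac{3(p - p^2)}{1 + 3p - 4p^2} = \frac{3p}{1 + 4p}$$
Unlike win tolerances, draw tolerances are dependent on probability (or rating
difference). As $p$ approaches $\tfrac12$, $q$ approaches
$$\lim_{p \to 1/2} \frac{3p}{1 + 4p} = \frac12$$
and the tolerance approaches
$$\tfrac12\, r\left(\tfrac12\right) = 0.$$
On the other hand, as $p$ approaches 1, $q$ approaches
$$\lim_{p \to 1} \frac{3p}{1 + 4p} = \frac35$$
and as before
$$\text{actual rating} = \text{calculated rating} \pm \tfrac12\, r\left(\tfrac35\right) \qquad \text{(xix)}$$
which using the suggested parameters is $\pm 29.24812503...$ rating points.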
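The dependence of the draw tolerance on the probability can be tabulated directly. This sketch assumes the closed form q = 3p / (1 + 4p) derived above and the suggested parameters; the function names are hypothetical.

```python
import math

K = math.log(2.0) / 100.0  # suggested parameters: q_s = 2, r_s = 100


def concat(p, q):
    """Concatenation (x)."""
    return p * q / (p * q + (1.0 - p) * (1.0 - q))


def draw_tolerance_probability(p):
    """Closed form q = 3p / (1 + 4p), for p between 1/2 and 1."""
    return 3.0 * p / (1.0 + 4.0 * p)


def draw_tolerance(p, k=K):
    """Half of r(q): the tolerance carried by each player, in rating points."""
    q = draw_tolerance_probability(p)
    return 0.5 * (-math.log(1.0 / q - 1.0) / k)


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.75, 0.9, 1.0):
        print(f"p = {p:4.2f}  tolerance = {draw_tolerance(p):8.4f}")
```

The tolerance grows from zero for equal players toward the limiting value quoted above, and stays below the 50 point win tolerance throughout.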
12. Rating Change Formulae
An old rating difference $r_o$ and a game score $s \in \{0, \tfrac12, 1\}$ will produce a new rating
difference $r$ given by
$$r = R(r_o, s) = \frac{1}{k}\log\frac{e^{k r_o (r_o > 0)} + s}{e^{-k r_o (r_o < 0)} + 1 - s} \qquad \text{(xxii)}$$
where $1 \equiv$ true and $0 \equiv$ false.
It may be noted here, that the score $s$ may actually be continuous, $0 \le s \le 1$, as a
measure of absolute success, in any venture. As such, there will be no rating change
if and only if the value of this success variable is the same as the probability $p(r_o)$.
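The no-change condition can be checked case by case from (xxii). In the sketch below $q_o = e^{k r_o}$ is used as shorthand for the old wins per loss, so that $p(r_o) = q_o / (1 + q_o)$; the condition $r = r_o$ is the same as requiring the new wins per loss to equal $q_o$.

```latex
\begin{align*}
r_o > 0:&\quad \frac{q_o + s}{2 - s} = q_o
  \iff s\,(1 + q_o) = q_o
  \iff s = \frac{q_o}{1 + q_o} = p(r_o) \\
r_o < 0:&\quad \frac{1 + s}{q_o^{-1} + 1 - s} = q_o
  \iff s\,(1 + q_o) = q_o
  \iff s = p(r_o) \\
r_o = 0:&\quad \frac{1 + s}{2 - s} = 1
  \iff s = \tfrac12 = p(0)
\end{align*}
```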
13. Formulae Summary
This has been dealt with in this series, for reasons of utility, in 1. Introduction.
It may be noted here that the natural expression of probability, and easier to relate to
for players, is as wins per loss. A value of 1 indicates the perfectly matched
opponent. The mathematics itself would have been simpler in the compilation of this
theory. And lastly, consider Swiss tournament management. Minimising the total
wins per loss represents the best possible pairing of available players from round to
round. This method was used in the Swiss Tournament Manager simulation program
mentioned earlier. It will certainly have contributed to the verification of Odds
Ratings and its implementation as STM. The code would not be difficult to translate
for computers. The organization of games plays as great a part in rating players as
the rating system itself in terms of efficiency. As for internet chess playing sites with
meaningful feedback (as well as supplying pre-game odds), implementation was easy
and already available from the material in 1. Introduction.
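The pairing criterion can be illustrated with an exhaustive search over a small field. This is only a sketch of the stated objective, not the Swiss Tournament Manager's code: it assumes an even number of players, ignores the usual Swiss constraints such as score groups and repeat opponents, and uses hypothetical function names.

```python
import math

K = math.log(2.0) / 100.0  # suggested parameters: q_s = 2, r_s = 100


def wins_per_loss(rating_a, rating_b, k=K):
    """Wins per loss of the stronger player of the pair, never less than 1."""
    return math.exp(k * abs(rating_a - rating_b))


def best_pairing(ratings):
    """Try every pairing of an even-sized list; return (total, pairs) with minimal total."""
    if not ratings:
        return 0.0, []
    first, rest = ratings[0], ratings[1:]
    best_total, best_pairs = math.inf, []
    for i, opponent in enumerate(rest):
        total, pairs = best_pairing(rest[:i] + rest[i + 1:])
        total += wins_per_loss(first, opponent)
        if total < best_total:
            best_total, best_pairs = total, [(first, opponent)] + pairs
    return best_total, best_pairs


if __name__ == "__main__":
    print(best_pairing([1500, 1600, 1700, 1800]))
```

The number of pairings grows very quickly with the field size, so a real pairing program would need a matching algorithm rather than this brute force enumeration.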
14. Playing Strength Distribution
While the nature of a game of skill might be thought to determine the rating function,
it is my conjecture that this is not the case. The relationship between r and p is
rather the common feature. The nature of the game and that of the race will influence
the distribution of playing strengths. But while our ratings are scaled, probabilities
are absolute, and that probability representing one standard deviation from the mean
has great significance. Only the implementation of meaningful ratings, and the
compilation of statistics using these, can evaluate this measure of complexity or
masterability (reverse sides of the same coin).
Consider, for instance, stud poker. The element of chance being greater would
reduce its masterability. However, anything from working the odds and non verbal
communication skills, to unknown powers of the human psyche, may increase
masterability to surprising degrees. This theory would then quantify such differences
in games of skill.
15. Non Chronological Processing
This has been dealt with in this series, for reasons of utility, in 1. Introduction.
Game results can represent a round robin or a rating period without any regard to the
order of the games. A simple n × n cross-table can be used to tally game results.
Another use of such a table is to deal with an entire unrated field by using two passes.
This same technique (setting all players to the mean before processing) can be used
for a tie-break, as it will rate relative play for that one event, without regard to
previous success or failure.
THE END