chess - probabilistic rating theory

Upload: johnny-chess

Post on 02-Jun-2018


  • 8/10/2019 Chess - Probabilistic Rating Theory


    Probabilistic Rating Theory

    (Odds Ratings)

    Formulae

    $$A = \tfrac12 (A_o + B_o + R)$$

    $$B = \tfrac12 (A_o + B_o - R)$$

    $$P = \frac{1}{1 + e^{-KR}}$$

    $$R = \frac{1}{K} \log \frac{e^{K R_o (R_o > 0)} + S}{e^{-K R_o (R_o < 0)} + 1 - S}$$

    $$K = -\frac{1}{R_s} \log\left(\frac{1}{P_s} - 1\right)$$

    Table of Contents

    1. Introduction
    2. Point Sequences
    3. The Rating Function
    4. Concatenation of Probabilities
    5. Concatenation Laws
    6. Practical Ratings
    7. Fundamental Theorem of Games of Skill
    8. Individual Games
    9. Probability Transformations
    10. Rating Changes
    11. Rating Tolerance
    12. Rating Change Formulae
    13. Formulae Summary
    14. Playing Strength Distribution
    15. Non-chronological Processing


    1. Introduction

    Chess is one of many games of skill at which the novice finds the accomplished

    master invincible. This theory is designed to ascribe a RATING to the playing

    strength manifested by sequences of game results. ODDS RATINGS are numbers of

    mathematical significance. The ratings of two members of the chess playing fraternity allow the calculation of winning odds. Ratings (and odds) will change

    with each game result unless equal players draw. It is a moot point whether actual

    playing strength (and associated rating) is effectively constant. Only playing

    strength, as manifested in game results, can be measured. We refer to calculated

    ratings, which oscillate between fixed limits (TOLERANCE). Within tolerance a

    player will perceive his recent PERFORMANCE, but beyond that a change in playing

    strength is indicated. Performance variations are a result of POINT SEQUENCES

    associated with playing strength, and the interaction of multiple point sequences when

    playing several opponents, as in tournaments.

    This theory is applicable to any game (situation) where the fraction of absolute success is measurable. The probability associated with ratings can be interpreted as

    the expected fraction of success per game. It provides the most current and accurate

    measure that can be calculated from the available data. A few generations of Sharp

    programmable calculators have permitted the evolution of both text book theory and

    program code (a Swiss Tournament Manager). The STM doubles as a simulator,

    which has verified the efficacy of the system. Randomized results generated

    according to implied odds have been used to reconstruct assumed ratings. Accuracy

    is achieved as predicted by theoretical tolerance expectations.

    The system can be parameterized. For ease of mental odds estimation, as well as a

    comparable rating range, these seem best: 0 to 3000 should cover all assumed $10^9$

    players, considered Normal with mean 1500. The following table of odds against

    rating differences implies a tolerance of plus or minus 50 points. Rapid convergence

    allows mean = provisional rating. Zero sum rating changes maintain the mean and

    prevent rating inflation. As such, Fischer at 2760 would win 6200 games per loss (or

    3100 per draw) against the average player.

    TABLE 1

    Wins per Loss    Rating Difference
    1                0
    2                100
    4                200
    8                300
    16               400
    32               500
    64               600
    128              700
    256              800
    512              900
    1024             1000

    This table (or its implied function) forms one statement of the fundamental hypothesis

    upon which Odds Ratings rests, allowing the deduction of all (seven) necessary

    formulae and (five) protocols for non-chronological processing.
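    The function implied by TABLE 1 can be sketched in a few lines. This is a minimal illustration, assuming the suggested parameterization of 2 wins per loss for every 100 rating points; the names `wins_per_loss` and `win_probability` are chosen here for illustration and do not come from the Swiss Tournament Manager.

```python
Q_S = 2.0    # wins per loss represented by one scale step (suggested value)
R_S = 100.0  # rating difference of one scale step (suggested value)


def wins_per_loss(rating_difference: float) -> float:
    """Expected wins per loss for the higher-rated player: q = q_s ** (r / r_s)."""
    return Q_S ** (rating_difference / R_S)


def win_probability(rating_difference: float) -> float:
    """Expected fraction of success per game: p = q / (1 + q)."""
    q = wins_per_loss(rating_difference)
    return q / (1.0 + q)


if __name__ == "__main__":
    # Reproduce the rows of TABLE 1.
    for r in range(0, 1100, 100):
        print(f"{wins_per_loss(r):6.0f} wins per loss at a difference of {r}")
    # The Fischer example: 2760 against the assumed mean of 1500.
    print(round(wins_per_loss(2760 - 1500)))
```

    Evaluating the function at a difference of 1260 points gives a little over 6200 wins per loss, in line with the Fischer figure quoted above.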


    Formulae Summary

    Ratings

    $a_o$ and $b_o$ will denote the old ratings of two players, $a$ and $b$ the new ratings after
    player A scores $s \in \{0, \tfrac12, 1\}$, the result of one game, against player B.

    Rating Difference

    $r_o$ will denote the old rating difference

    $$r_o = a_o - b_o$$

    and $r$ will denote the new rating difference

    $$r = a - b$$

    after player A scores $s \in \{0, \tfrac12, 1\}$, the result of one game, against player B.

    System Constant

    If $r_s$ is the rating difference used to denote $q_s$ wins per loss, then the system constant
    $k$ will be

    $$k = \frac{1}{r_s} \log q_s$$

    $q_s = 2$ and $r_s = 100$ is suggested as the best parameterization (see TABLE 1).

    Probability

    Given a rating difference $r$, the expected wins per loss $q$ is given by

    $$q = e^{kr}$$

    or probability $p$ by

    $$p = \frac{1}{1 + \frac{1}{q}} = \frac{1}{1 + e^{-kr}} = \tfrac12 + \tfrac12 \tanh\left(\tfrac12 kr\right)$$

    Rating Difference Transformation

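    An old rating difference $r_o$ and a game score $s \in \{0, \tfrac12, 1\}$ will produce a new rating
    difference $r$ given by

    $$r = \frac{1}{k} \log \frac{e^{k r_o (r_o > 0)} + s}{e^{-k r_o (r_o < 0)} + 1 - s}$$

    where $1 \equiv$ true and $0 \equiv$ false, that is, a bracketed comparison such as $(r_o > 0)$
    takes the value 1 when it holds and 0 otherwise.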

    Rating Change

    $a$ and $b$ are the new ratings after a rating difference change from $r_o$ to $r$

    $$a = \tfrac12 (a_o + b_o + r)$$

    $$b = \tfrac12 (a_o + b_o - r)$$

    Note: These formulae are sufficient to process chronological rating changes (as for

    instance while managing a Swiss tournament).
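    The summary formulae can be chained into a single chronological update. The following is a minimal sketch under the suggested parameterization (2 wins per loss per 100 points). The function names are illustrative, and the way rounding is applied (round player A, give player B the remainder so that the sum is conserved) is an assumption made here, not a procedure stated in the text.

```python
import math

Q_S, R_S = 2.0, 100.0        # suggested parameterization (TABLE 1)
K = math.log(Q_S) / R_S      # system constant k = log(q_s) / r_s


def probability(r: float) -> float:
    """Expected score of the player who is r rating points ahead."""
    return 1.0 / (1.0 + math.exp(-K * r))


def new_rating_difference(r_old: float, s: float) -> float:
    """Rating difference transformation; the comparisons act as 1 (true) or 0 (false)."""
    numerator = math.exp(K * r_old * (r_old > 0)) + s
    denominator = math.exp(-K * r_old * (r_old < 0)) + 1.0 - s
    return math.log(numerator / denominator) / K


def update_ratings(a_old: int, b_old: int, s: float) -> tuple[int, int]:
    """New ratings after player A scores s against player B."""
    r = new_rating_difference(a_old - b_old, s)
    a_new = round((a_old + b_old + r) / 2)
    # Assumption: B receives the remainder so that no rating points leak.
    b_new = round(a_old + b_old) - a_new
    return a_new, b_new


if __name__ == "__main__":
    print(update_ratings(1500, 1500, 1))
```

    Two new players at 1500 each, where A wins, come out at 1550 and 1450, which agrees with the worked example in section 10.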


    Non-Chronological Processing

    Non-chronological processing requires special protocols. With Odds Ratings a

    chronology must be assumed to allow processing as normal. These protocols

    optimize the assumptions and prevent anomalous ones, as explained below. They

    effectively average the performance of each player. In fact, non-chronological
    processing is the only average of any kind produced by Odds Ratings.

    The protocols follow:

    1. Maximize Draws: individual pairs of players are assumed to have drawn the

    maximum number of games achieving their overall score. Draw tolerances

    are smaller than win tolerances, and the corresponding point sequences

    converge more rapidly (see below for formula).

    2. Score Based Point Sequences: by normalizing the point sequences so that the
       wins per loss ratio remains as constant throughout as possible, the
       accumulation of losses or wins at one end of the sequence does not occur (by
       design or accident). Such would produce the effect of an increasing or
       decreasing rating throughout the games. The $i$th game result $s_i$ in a normal
       point sequence, maximizing draws, is given by

       $$s_i = \tfrac12 \left( \operatorname{int}\left(\tfrac12 + 2 i p_{ab}\right) - \operatorname{int}\left(\tfrac12 + 2 (i - 1) p_{ab}\right) \right)$$

       where:
       $\operatorname{int} x$ is the largest whole number less than or equal to $x$,
       $p_{ab} = s_{ab} / g_{ab}$,
       $s_{ab}$ is the overall score, and $g_{ab}$ is the total number of games,
       by player A against player B.

    3. Completed Rounds: if the maximum number of games by any individual pair of players
       is $g_m$, then processing must be executed as $g_m$ rounds, where no pair of players
       play more than one game per round. Furthermore, pairs with one game play
       in the last round, those with 2 games in the last 2 rounds, those with $g_m - 1$ in

    all but the first. That way the players with fewer games are processed against

    more accurate ratings.

    4. Hierarchical Processing: before each round, players are sorted according to
       their progressively recalculated ratings. They are then processed from the
       lowest to the highest, each against the lowest to the highest remaining. First

    in best dressed effects are thus avoided, the weaker players getting first bite of

    the cherry, rather than the field being plundered of points by the strongest first.

    5. Two Pass Processing: by using two passes to reprocess a completed Swiss

    tournament, the method most effectively rates an entire field of unrated

    players (set to the system average = provisional rating). However, a small

    number of unrated players among several rated players need no special

    treatment. This same method also provides the perfect tie-break, finely

    grading relative play on the basis of a single event.
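    Protocols 2 and 3 can be sketched together: each pair's total score is expanded into a draws-maximized normal point sequence, and the games are then placed in the last rounds. This is an illustration only. The input layout (a dictionary keyed by pairs of player names) and both function names are assumptions made for the example, and the half offset inside `floor` follows the protocol 2 formula.

```python
from fractions import Fraction
from math import floor


def normal_point_sequence(score, games: int) -> list:
    """Protocol 2: draws-maximized sequence with results spread evenly."""
    p = Fraction(score) / games
    half = Fraction(1, 2)
    return [half * (floor(half + 2 * i * p) - floor(half + 2 * (i - 1) * p))
            for i in range(1, games + 1)]


def schedule_rounds(totals: dict) -> list:
    """Protocol 3: place each pair's games in the LAST rounds.

    totals maps (player_a, player_b) -> (score of a, number of games).
    Returns one list of (a, b, result_for_a) per round.
    """
    max_games = max(games for _, games in totals.values())
    rounds = [[] for _ in range(max_games)]
    for (a, b), (score, games) in totals.items():
        sequence = normal_point_sequence(score, games)
        first_round = max_games - games   # pairs with fewer games start later
        for offset, result in enumerate(sequence):
            rounds[first_round + offset].append((a, b, result))
    return rounds


if __name__ == "__main__":
    table = {("A", "B"): (Fraction(3), 4), ("A", "C"): (Fraction(1), 1)}
    for number, games in enumerate(schedule_rounds(table), start=1):
        print(number, games)
```

    Exact fractions are used so that scores such as 2½ out of 3 are not disturbed by floating point rounding at the half-point boundaries.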


    Conclusion

    The information here is sufficient to implement Odds Ratings either on a site such as

    Chess.com or worldwide. The theoretical proof or derivation from the fundamental

    hypothesis is available, if involved. This hypothesis was, in fact, that the relationship

    between probabilities $p$ and rating differences $r$ is, in essence, $p = \tfrac12 + \tfrac12 \tanh \tfrac12 r$

    which, when graphed, will be seen to be intuitively obvious. But the shape came to

    me in 1974, and I'm ashamed to say that the exact function came to me only ten years

    ago, though others had been tried. The correct function precipitated some

    surprisingly simplified mathematics and vastly superior results. That the exponential

    nature didn't occur to me sooner I can only blame on the fact that I never did my

    homework. I should have noticed the similarity to differential equations for capstans.

    But here is the reason for the invincibility of masters of the game, and the many

    grades in any form of mastery. TABLE 1 is then a relatively trivial mathematical

    implication of this (with added parameterization for the more familiar scaling).

    Point sequences form an integral and necessary part of the theory of Odds Ratings.

    For the chess player who is also a mathematician (such as Arpad Elo, Max Euwe and

    Jose Capablanca) this theory does a lot to provide him with more realistic

    expectations of his abilities. The interaction of these sequences during tournaments

    also goes far to demonstrate possible outcomes that can be expected. Point

    sequences are idealized ordered sets of game results. They are not very far from the
    reality, certainly not as far as Professor Arpad Elo imagined. The measurement of

    the rating of an individual might be compared with the measurement of the position of

    a cork bobbing up and down on the surface of agitated water with a yard stick tied to a

    rope swaying in the wind. This is an illusion, not a fact. Point sequences don't

    stray so far as the bobbing cork, and the linear functions used are the yard stick, and

    way off. But Elo Ratings and its offspring are not based on probability, but on

    statistics. Averages take much data even when nothing is changing overall. What is

    more changeable than man?

    $$p = \tfrac12 + \tfrac12 \tanh\left(\tfrac12 kr\right), \qquad k = \frac{1}{r_s} \log q_s$$

    $q_s = 2$ wins per loss, $r_s = 100$ rating difference

    Arpad Elo was commissioned by the USCF in 1959 to replace dysfunctional systems,
    and provided it on request in one year. FIDE struggled on 10 years more without Elo
    Ratings. Presumably they were adopted in desperation because nothing better was

    forthcoming. Remember that Einstein didn't produce Relativity on demand in one
    year, rather in ten, and driven by curiosity. He said, "It is a wonder that the tender
    plant of a young child's curiosity isn't entirely strangled by modern methods of
    instruction." Well, that's the best excuse I can come up with for not doing my
    homework. But Arpad did the best he could in the time he had. The inductive leap
    often required is usually not available on demand. Like the tender plant it must be

    nurtured and loved. And beginning with an incorrect hypothesis can leave you

    flogging a dead horse, just like placing faith in a flawed chess opening system.

    Johnny Chess,

    Diploma of Teaching,

    Secondary Mathematics & Computing, 1990.


    2. Point Sequences

    A game of skill is one at which absolute mastery is unobtainable. This mathematics
    can accordingly never deliver, nor process, a probability of one or zero. It measures
    the relative degree of mastery of players of a particular game.

    Chess is a game of skill, although, for a while, Capablanca was thought (even by

    himself) to have attained such complete mastery. For six years he did not lose a

    single game. While computers are on the verge of invincibility by human masters,

    there will be no certainty of outcome amongst themselves. The number of electrons

    in the visible universe is less than the number of games possible.

    The relative playing strength of any two players A and B is evident in the proportion

    of points scored by each. Thus our measurable degree of relative mastery consists of

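    the probability $p_{ab}$ of a win (absolute success) by player A. If A is the stronger
    player and will risk losing rather than draw, his $m$th to his $n$th game results $s_i$
    will consist of the point sequence

    $$s_i = \operatorname{int}\left(\tfrac12 + i p_{ab}\right) - \operatorname{int}\left(\tfrac12 + (i - 1) p_{ab}\right) \qquad \text{(i)}$$

    where $i = m, m + 1, \ldots, n - 1, n$.

    For instance, $p_{ab} = 0.75$, $m = 1$, $n = 4$ is the sequence {1, 1, 0, 1}.

    On the other hand, were A a cautious player, who will draw before risking a loss,
    never losing against a weaker player

    $$s_i = \tfrac12 \left( \operatorname{int}\left(\tfrac12 + 2 i p_{ab}\right) - \operatorname{int}\left(\tfrac12 + 2 (i - 1) p_{ab}\right) \right) \qquad \text{(ii)}$$

    where the same $p_{ab}$, $m$ and $n$ give the sequence {1, ½, 1, ½}.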

    These idealizations lose no generality, since the reality deviates but little. The

    deviations can be interpreted as fluctuations in playing strength. We need to infer

    $p_{ab}$ from these normal point sequences. However, for a given point sequence, $p_{ab}$ is
    not unique.

    Given $p_{ab} = 0.75$, $m = 1$, $n = 8$, sequence (i) is {1,1,0,1,1,1,0,1}, but $p_{ab} = 0.8$ produces
    the sequence {1,1,0,1,1,1,1,0}, so that $n < 7$ cannot distinguish $p_{ab}$ to this precision.
    On the contrary note, too long a sequence implies too long a rating period, where $p_{ab}$
    is not likely to have remained constant for these two players. Further, we are only
    finding the relative strength of players A and B, and other players need to be
    integrated into our calculations. This integration requires Odds Ratings, which we
    shall deduce.
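    The two normal point sequences, and the point about precision, can be checked with a short sketch. It assumes that a half is added inside each `int` before taking the whole part, which is the reading that reproduces the example sequences quoted in this section. The function names are illustrative only.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)


def win_sequence(p, m: int, n: int) -> list:
    """Sequence (i): results of games m..n for a player who risks losses."""
    p = Fraction(p)
    return [floor(HALF + i * p) - floor(HALF + (i - 1) * p)
            for i in range(m, n + 1)]


def draw_sequence(p, m: int, n: int) -> list:
    """Sequence (ii): results of games m..n with draws maximized."""
    p = Fraction(p)
    return [HALF * (floor(HALF + 2 * i * p) - floor(HALF + 2 * (i - 1) * p))
            for i in range(m, n + 1)]


def first_difference(p1, p2, n: int):
    """First game number at which the two win sequences differ, or None."""
    pairs = zip(win_sequence(p1, 1, n), win_sequence(p2, 1, n))
    return next((i for i, (x, y) in enumerate(pairs, start=1) if x != y), None)


if __name__ == "__main__":
    print(win_sequence(Fraction(3, 4), 1, 8))
    print(win_sequence(Fraction(4, 5), 1, 8))
    print(first_difference(Fraction(3, 4), Fraction(4, 5), 8))
```

    Probabilities are passed as exact fractions (or strings such as "0.8") because a binary float for 0.8 is not exact and could shift a result at a boundary.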


    3. The Rating Function

    Each player will be assigned a rating representing his playing strength. This will

    increase when the weaker player in a game wins or draws, and decrease when he

    loses. Rating points will not be created or destroyed by rated games. They will be

    won or lost like poker chips, where each player starts play with the same value. The
    initiate will receive the average rating. The average rating will thereby never change.

    We require a rating function $f(r)$ which will convert a rating difference $r$ between
    player A (rated $a$) and player B (rated $b$)

    $$r = a - b$$

    to a probability $p_{ab}$ or simply $p$

    $$f(r) = p$$

    Intuitively, we expect that the following hold

    $$f(0) = \tfrac12, \qquad \lim_{r \to \infty} f(r) = 1, \qquad \lim_{r \to -\infty} f(r) = 0$$

    Some candidates include

    $$f(r) = \tfrac12 + \frac{r}{2\sqrt{r^2 + 1}} \qquad \text{(iii)}$$

    $$f(r) = \tfrac12 + \frac{1}{\pi} \tan^{-1} r \qquad \text{(iv)}$$

    $$f(r) = \tfrac12 + \tfrac12 \tanh \frac{r}{2} \qquad \text{(v)}$$

    Of these (v) is the simplest and can be simplified further to

    $$\tfrac12 + \tfrac12 \cdot \frac{e^{r/2} - e^{-r/2}}{e^{r/2} + e^{-r/2}} = \tfrac12 + \tfrac12 \cdot \frac{1 - e^{-r}}{1 + e^{-r}}$$

    and finally to

    $$p = f(r) = \frac{1}{1 + e^{-r}} \qquad \text{(v)}$$

    This satisfies Einstein's expectation that the laws of nature tend to be relatively
    simple. It also allows some surprisingly simple deductions. We shall adopt (v) as
    the fundamental axiom relating probabilities and playing strengths in games of
    skill.
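    The three candidates can be compared numerically. The sketch below simply evaluates (iii), (iv) and (v), and checks that the tanh form of (v) and its simplified exponential form agree. The function names are labels chosen for this example.

```python
import math


def f_iii(r: float) -> float:
    """Candidate (iii): 1/2 + r / (2 * sqrt(r^2 + 1))."""
    return 0.5 + r / (2.0 * math.sqrt(r * r + 1.0))


def f_iv(r: float) -> float:
    """Candidate (iv): 1/2 + arctan(r) / pi."""
    return 0.5 + math.atan(r) / math.pi


def f_v(r: float) -> float:
    """Candidate (v) in its simplified form 1 / (1 + e^-r)."""
    return 1.0 / (1.0 + math.exp(-r))


def f_v_tanh(r: float) -> float:
    """Candidate (v) as first stated: 1/2 + 1/2 tanh(r / 2)."""
    return 0.5 + 0.5 * math.tanh(r / 2.0)


if __name__ == "__main__":
    for r in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(r, f_iii(r), f_iv(r), f_v(r), f_v_tanh(r))
```

    All three pass through one half at zero and approach 0 and 1 at the extremes; they differ in how quickly the tails flatten, which is what separates their concatenation behaviour in the next section.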

    Some may hasten to dismiss this claim because of the degree of mastery attainable in

    chess, compared to, say, darts. The difference between games of skill lies in the

    range of probabilities attainable by any population. For instance, a measure of

    masterability might be the probability represented by one standard deviation from the

    mean. This would require statistics to evaluate. It depends on the nature of the

    game, as well as on the capabilities of the players.


    4. Concatenation of Probabilities

    Let $p_{ab}$ be the probability that player A wins against player B, and $p_{bc}$ that player B
    wins against player C. We will call $p_{ac}$, the probability that player A wins against
    player C, the concatenation of $p_{ab}$ and $p_{bc}$. The concatenation operator will be the
    symbol & thus

    $$p_{ab} \& p_{bc} = p_{ac}$$

    The result of this operation depends on our definition of $f(r) = p$ and its inverse
    function $f^{-1}(p) = r$ so that:

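    from (iii)

    $$r = f^{-1}(p) = \sqrt{\frac{4p^2 - 4p + 1}{4p - 4p^2}} \qquad \text{(vi)}$$

    (taking the sign of $2p - 1$),

    from (iv)

    $$r = f^{-1}(p) = \tan\left(\pi\left(p - \tfrac12\right)\right) \qquad \text{(vii)}$$

    and from (v)

    $$r = f^{-1}(p) = -\log\left(\frac{1}{p} - 1\right) \qquad \text{(viii)}$$

    Then we have, for concatenation

    $$p_{ab} \& p_{bc} = p_{ac} = f(a - c) = f\big((a - b) + (b - c)\big) = f\big(f^{-1}(p_{ab}) + f^{-1}(p_{bc})\big) \qquad \text{(ix)}$$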

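    For example, given $p_{ab} = p_{bc} = 0.75$ we have $p_{ac} = 0.8780\ldots,\ 0.8524\ldots,\ 0.9$ using (iii),
    (iv) and (v) respectively. Note that the choice of (v) produces a rational formula for
    concatenation. Using (viii) in (ix) we have

    $$f^{-1}(p_{ab}) + f^{-1}(p_{bc}) = -\log\left(\frac{1}{p_{ab}} - 1\right) - \log\left(\frac{1}{p_{bc}} - 1\right) = -\log\left(\frac{1 - p_{ab}}{p_{ab}} \cdot \frac{1 - p_{bc}}{p_{bc}}\right)$$

    which when substituted into (v) gives

    $$p_{ac} = \frac{p_{ab}\, p_{bc}}{p_{ab}\, p_{bc} + (1 - p_{ab})(1 - p_{bc})}$$

    so that we obtain the formula for probabilities $p$ and $q$

    $$p \& q = \frac{pq}{pq + (1 - p)(1 - q)} \qquad \text{(x)}$$

    and for rational operands

    $$\frac{a}{b} \& \frac{c}{d} = \frac{ac}{ac + (b - a)(d - c)} \qquad \text{(xi)}$$

    For example

    $$\frac{3}{7} \& \frac{4}{5} = \frac{12}{12 + 4 \cdot 1} = \frac{3}{4}$$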

    Mathematical induction easily proves the simple formula for an arbitrary number of
    concatenations

    $$p_1 \& p_2 \& \cdots \& p_n = \frac{\prod_{i=1}^{n} p_i}{\prod_{i=1}^{n} p_i + \prod_{i=1}^{n} (1 - p_i)} \qquad \text{(xii)}$$
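    Concatenation under (v) is easy to exercise with exact fractions. The sketch below implements (x) and (xii) directly; the names `concat` and `concat_many` are illustrative.

```python
from fractions import Fraction
from math import prod


def concat(p, q):
    """Formula (x): p & q = pq / (pq + (1 - p)(1 - q))."""
    return p * q / (p * q + (1 - p) * (1 - q))


def concat_many(probabilities):
    """Formula (xii): concatenation of any number of probabilities."""
    ps = list(probabilities)
    wins = prod(ps)
    losses = prod(1 - p for p in ps)
    return wins / (wins + losses)


if __name__ == "__main__":
    print(concat(Fraction(3, 7), Fraction(4, 5)))
    print(concat(0.75, 0.75))
    print(concat_many([Fraction(2, 3), Fraction(3, 4), Fraction(4, 5)]))
```

    The first call reproduces the rational example from the text, and the second the value 0.9 quoted for candidate (v).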


    5. Concatenation Laws

    Let $p, q, r$ be probabilities, and & the concatenation operator defined by

    $$p \& q = \frac{pq}{pq + (1 - p)(1 - q)} \qquad \text{(x)}$$

    Probabilities under concatenation can be described as an Abelian group with identity
    element ½, and inverse $p^{-1} = 1 - p$. A mathematician will then immediately
    understand that the following is true for concatenation.

    We can show using (x) that the following properties of Abelian groups hold

    0. $0 < p \& q < 1$ (is also a probability) for any probabilities $p, q$
    1. $p \& (q \& r) = (p \& q) \& r$
    2. $p \& \tfrac12 = p$ for any probability $p$
    3. $p \& (1 - p) = \tfrac12$ for any probability $p$
    4. $p \& q = q \& p$ for any probabilities $p, q$

    These laws are useful for dealing with concatenation mathematics. For example,

    suppose we need to know what p needs to be concatenated with a particular q to

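    produce the probability $r$. We need to solve the equation $p \& q = r$ by eliminating
    $q$ from the left hand side (LHS) of the equation.

    $$p \& q = r$$
    $$(p \& q) \& q^{-1} = r \& q^{-1}$$
    $$p \& (q \& q^{-1}) = r \& q^{-1} \quad \text{by 1.}$$
    $$p \& \tfrac12 = r \& q^{-1} \quad \text{by 3.}$$
    $$p = r \& q^{-1} \quad \text{by 2.}$$
    $$p = r \& (1 - q)$$

    For example, the solution to $p \& \tfrac34 = \tfrac13$ is now trivial

    $$p = \tfrac13 \& \left(1 - \tfrac34\right) = \tfrac13 \& \tfrac14 = \frac{1}{1 + 2 \cdot 3} = \frac{1}{7}$$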

    This is much simpler than solving using (x) directly. These laws will be used later to

    greatly simplify some important derivations.
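    The laws can be spot-checked exactly on sample values, and the worked equation solved the same way. This is a sketch for checking, not a proof; the sample set and the function names are arbitrary choices made here.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def concat(p, q):
    """Concatenation (x)."""
    return p * q / (p * q + (1 - p) * (1 - q))


def solve_left(q, r):
    """Solve p & q = r for p using the inverse: p = r & (1 - q)."""
    return concat(r, 1 - q)


def check_group_laws(samples) -> bool:
    """Check laws 0 to 4 exactly on every combination of the sample values."""
    for p, q, r in product(samples, repeat=3):
        ok = (0 < concat(p, q) < 1
              and concat(p, concat(q, r)) == concat(concat(p, q), r)
              and concat(p, HALF) == p
              and concat(p, 1 - p) == HALF
              and concat(p, q) == concat(q, p))
        if not ok:
            return False
    return True


if __name__ == "__main__":
    values = [Fraction(1, 10), Fraction(1, 3), Fraction(5, 7), Fraction(3, 4)]
    print(check_group_laws(values))
    print(solve_left(Fraction(3, 4), Fraction(1, 3)))
```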


    6. Practical Ratings

    The rating function (v) supplies all conceivable probabilities for practical play

    over a very small domain of rating values. About 18.4 rating points (from about -9.21 to +9.21) cover probabilities from

    0.0001 to 0.9999. A scale constant can be introduced to produce the familiar user

    friendly range. This is not some arbitrary constant used to fine tune dysfunction.
    We decide that a probability $p_s$ shall be represented by a rating difference $r_s$, and
    this is used to generate our system scaling constant $k$. We also need to decide on the
    constant system average. A range of 0 to 3000, Normally distributed, will imply a
    mean of 1500. Given $p_s = 2/3$ and $r_s = 100$, the range may be exceeded if chess

    goes galactic or includes computers, but it should suffice us on middle earth among

    men during the fourth age... It makes odds estimation easy, and seems roughly

    equivalent to the high maintenance guide to predicting the outcome in use since

    1960.

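    Accordingly, we will define the probability function as the scaled rating function

    $$p(r) = \frac{1}{1 + e^{-kr}} \qquad \text{(xiii)}$$

    and its inverse, the rating difference function, as

    $$r(p) = -\frac{1}{k} \log\left(\frac{1}{p} - 1\right) \qquad \text{(xiv)}$$

    Substituting $p_s$ and $r_s$ into (xiv) and rearranging, the system constant is given by

    $$k = -\frac{1}{r_s} \log\left(\frac{1}{p_s} - 1\right) \qquad \text{(xv)}$$

    Choosing the suggested parameters,

    $$k = -\frac{1}{100} \log\left(\frac{3}{2} - 1\right) = 6.931471806\ldots \times 10^{-3}$$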

    The mean must be the provisional rating to prevent inflation, and all rating changes

    must be to the nearest integer to prevent leakage due to rounding error. This last will

    mean that a rating difference greater than 717 will cause no direct change, but that

    would be poor organization.
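    The scaling step can be sketched as three small functions following (xiii), (xiv) and (xv). The defaults are the suggested parameters; the function names and the choice to pass `k` as an optional argument are assumptions for the example.

```python
import math


def system_constant(p_s: float = 2.0 / 3.0, r_s: float = 100.0) -> float:
    """Formula (xv): k = -(1 / r_s) * log(1 / p_s - 1)."""
    return -math.log(1.0 / p_s - 1.0) / r_s


K = system_constant()


def probability(r: float, k: float = K) -> float:
    """Formula (xiii): probability for a rating difference r."""
    return 1.0 / (1.0 + math.exp(-k * r))


def rating_difference(p: float, k: float = K) -> float:
    """Formula (xiv): rating difference for a probability p, 0 < p < 1."""
    return -math.log(1.0 / p - 1.0) / k


if __name__ == "__main__":
    print(K)
    print(probability(100.0))
    print(rating_difference(0.9))
```

    A different scale, for example 3 wins per loss (probability 3/4) per 200 points, only requires calling `system_constant(0.75, 200.0)` and passing the result as `k`.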


    7. Fundamental Theorem of Games of Skill

    If player A wins $m$ games per loss against player B, and player B wins $n$ games per
    loss against player C, then player A wins $mn$ games per loss against player C.

    This is a direct consequence of (xi) and easily proven by substituting

    $$p = \frac{m}{m + 1} \qquad \text{and} \qquad q = \frac{n}{n + 1}$$

    giving

    $$p \& q = \frac{mn}{mn + 1}$$

    The simplicity of this theorem is intuitively obvious. It convinces me that our
    axiomatic basis (v) is the correct mathematical formulation of the material reality.
    Its validity can be contested as the theorem provides a statistically verifiable

    deduction.
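    The substitution can be written out in full. The following worked step only expands what the text leaves to the reader, using (xi) with numerators $m$, $n$ and denominators $m + 1$, $n + 1$.

```latex
\begin{align*}
\frac{m}{m+1} \,\&\, \frac{n}{n+1}
  &= \frac{mn}{mn + \bigl((m+1) - m\bigr)\bigl((n+1) - n\bigr)} && \text{by (xi)} \\
  &= \frac{mn}{mn + 1}
\end{align*}
```

    A probability of $mn / (mn + 1)$ is exactly $mn$ wins per loss, so for instance 2 wins per loss followed by 4 wins per loss concatenates to 8 wins per loss.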


    Now consider point sequences with draws maximized (ii) for:

    p = 1/2    { ½ ½ ½ ½ ½ ½ }
    p = 3/4    { 1 ½ 1 ½ 1 ½ 1 ½ }
    p = 5/6    { 1 ½ 1 1 ½ 1 1 ½ 1 1 ½ }
    p = 7/8    { 1 1 ½ 1 1 1 ½ 1 1 1 ½ 1 1 1 ½ }

    In this instance the corresponding probability transformations will need to be

    p = 1/2    { 1/2 1/2 1/2 1/2 1/2 1/2 }
    p = 3/4    { 4/5 3/4 4/5 3/4 4/5 3/4 4/5 3/4 }
    p = 5/6    { 7/8 5/6 6/7 7/8 5/6 6/7 7/8 5/6 6/7 7/8 5/6 }
    p = 7/8    { 9/10 10/11 7/8 8/9 9/10 10/11 7/8 8/9 9/10 10/11 7/8 8/9 9/10 10/11 7/8 }

    These sequences apply to $p \ge \tfrac12$. The sequences for the opponents in every case
    require replacing scores $s$ with $1 - s$ and probabilities $p$ with the inverse $1 - p$.

    Discovering formulae which will produce the required probability transformations is
    now a fairly simple matter of trial and algebra.


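    9. Probability Transformations

    Inspection will show that the following formulae produce the required probability
    transformations $p$ for the game results $s$ and the initial probability $p_o$ as indicated.

    $$p = \frac{1}{2 - p_o}, \quad p_o \ge \tfrac12,\ s = 1$$

    $$p = \frac{1 + p_o}{4 - 2p_o}, \quad p_o \ge \tfrac12,\ s = \tfrac12$$

    $$p = \frac{p_o}{2 - p_o}, \quad p_o \ge \tfrac12,\ s = 0$$

    $$p = \frac{2p_o}{1 + p_o}, \quad p_o \le \tfrac12,\ s = 1$$

    $$p = \frac{3p_o}{2 + 2p_o}, \quad p_o \le \tfrac12,\ s = \tfrac12$$

    $$p = \frac{p_o}{1 + p_o}, \quad p_o \le \tfrac12,\ s = 0 \qquad \text{(xvi)}$$

    But it will be noted that the score $s$ can be incorporated into the formulae (xvi) to
    reduce the number needed to

    $$p = \frac{s(1 - p_o) + p_o}{2 - p_o}, \quad p_o \ge \tfrac12$$

    $$p = \frac{(s + 1)\, p_o}{1 + p_o}, \quad p_o \le \tfrac12 \qquad \text{(xvii)}$$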

    It will be apparent that, if the probabilities are incorrect in the sequence, they will

    rapidly converge toward the correct values with each cycle. This is easily
    demonstrated using a programmable calculator.
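    The convergence claim can be tried on any machine, not only a programmable calculator. The sketch below applies (xvii) repeatedly; `transform` and `run_cycles` are names chosen for this example, and the starting value of one half is an arbitrary wrong guess.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def transform(p_old, s):
    """Formula (xvii): new probability after a result s in {0, 1/2, 1}."""
    if p_old >= HALF:
        return (s * (1 - p_old) + p_old) / (2 - p_old)
    return (s + 1) * p_old / (1 + p_old)


def run_cycles(p_start, results, cycles: int):
    """Apply a repeating list of results and return the final probability."""
    p = p_start
    for _ in range(cycles):
        for s in results:
            p = transform(p, s)
    return p


if __name__ == "__main__":
    # Exact steps of the p = 3/4 cycle with draws maximized.
    print(transform(Fraction(3, 4), 1), transform(Fraction(4, 5), HALF))
    # Start from a wrong value and repeat the cycle {1, 1/2}.
    for cycles in (1, 5, 20, 60):
        print(cycles, run_cycles(0.5, [1, 0.5], cycles))
```

    Starting from 0.5, the value at the end of each {1, ½} cycle moves steadily toward 0.75.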


    10. Rating Changes

    Probabilities can only apply to pairs of players. The rating difference function (xiv)

    will convert the transformed probability into a rating difference

    $$r = r(p) = -\frac{1}{k} \log\left(\frac{1}{p} - 1\right)$$

    Example

    Player A and player B are two new members. They are given the system average for
    their provisional ratings, so that $a = b = 1500$. Thus the rating difference is
    $r = a - b = 0$. Using the probability function (xiii)

    $$p = p(r) = \frac{1}{1 + e^{-kr}} = \tfrac12$$

    Let us assume that player A wins the first game. All our variables so far become our
    original values for recalculation

    $$a_o = 1500, \qquad b_o = 1500, \qquad r_o = 0, \qquad p_o = 0.5$$

    Now $p_o \ge \tfrac12$ and $s = 1$ so from (xvi) we can transform the probability $p$ using

    $$p = \frac{1}{2 - p_o} = \frac{1}{2 - 1/2} = \frac{2}{3}$$

    By (xiv) the new rating difference $r$ is then

    $$r = r(p) = -\frac{1}{k} \log\left(\frac{1}{p} - 1\right) = 100$$

    Now, since rating points are won and lost conservatively, our new ratings are given by
    the simultaneous solution of

    $$a + b = a_o + b_o$$
    $$a - b = r$$

    which gives

    $$a = \tfrac12 (a_o + b_o + r)$$
    $$b = \tfrac12 (a_o + b_o - r)$$

    so that $a = 1550$ and $b = 1450$. It is now in the best interest of establishing ratings
    quickly, that new opponents be found to match their new ratings. Player A's rating

    will rise by 50 points, each time he wins against an equal or higher rated player. This

    sets the limit for possible rating increase, depending heavily on available competition.

    But this limit of 50 points is also our rating tolerance.


    11. Rating Tolerance

    In chess, draws are the result of 50% of master games. In Chinese chess, draws are

    rare, largely due to the fact that the stalemated player is deemed to have lost as in

    draughts (checkers). Morphy sometimes lost trying to win drawn games. Schlechter
    was called the drawing master and may have been Emanuel Lasker's equal, but in
    their match together, the condition required by the defending champion was two
    games up. This forced Schlechter to play the last game to win, and lose an easy
    draw instead. Fischer knew draws do not win tournaments, and achieved an
    unprecedented 11-0 round robin result in the USA Championships 1963-64.

    Capablanca considered that throwing the game to the winds of chance was a mistake,

    and his games show it. Alekhine said that chess was more than just knowledge and

    logic when he complicated for creative opportunities. A brilliant tactician will take

    lesser mortals out of their depth, but risking somewhat to luck.

    Understanding rating tolerances will help explain the greater success of the risk

    takers. Win tolerances are far greater than draw tolerances, especially when well
    matched. Therefore they win, as well as lose, more spectacularly. This theory

    explains why.

    Point sequences (i) and (ii) where $p = \frac{n}{n + 1}$, $n = 1, 2, 3, \ldots$ are prime sequences.

    For n = 2, p = 2/3 and (i) gives { 1 1 0 1 1 0 1 1 0 1 1 0 }

    For n = 3, p = 3/4 and (i) gives { 1 1 1 0 1 1 1 0 1 1 1 0 }

    For any 2/3 < p < 3/4 such as p = 5/7 our point sequence is no longer regular.

    For p = 5/7 (i) gives { 1 1 0 1 1 1 0 1 1 0 1 1 1 0 }

    Composite sequences such as this alternate between the behaviour of the bounding

    prime sequences. We need only to understand prime sequence tolerance for all

    practical purposes, and consider that the rating is fluctuating between them. The

    length of time that the composite sequence would emulate one or the other prime

    sequence depends on proximity. The same argument applies to the practical

    sequence, which would behave similarly, but less regularly again. The ratings can be

    interpreted as responding to variations in performance within tolerance, and to a

    change in playing strength whenever tolerance is exceeded. It will also become

    apparent that the top end (of a field of players who are interacting largely among

    themselves) will tend toward ratings erring in excess. The converse is true for the
    bottom end. Those in the middle are more accurate and err either way. This is a

    fortuitous circumstance as players migrating to better matched groups will

    import/export rating points as required. Intergroup matches will then produce

    necessary adjustments very efficiently.


    Win Tolerance

    Our question is, how much does the calculated probability change throughout the

    cycle, given that convergence is achieved to precision. If our players A and B with
    $a \ge b$ maximize wins by (i), their normal point sequences completely neutralize
    maximum error at the convergence point, a loss for player A. This error is then
    completely corrected by (xvi) with $p_o \ge \tfrac12$, $s = 0$

    $$p = \frac{p_o}{2 - p_o}$$

    Therefore we need to solve

    $$q \& \frac{p}{2 - p} = p$$

    for $q$, because $\frac{p}{2 - p}$ is the actual probability and $p$ is the highest calculated
    probability obtained. Thus by (xiv) our rating tolerance is given by $r(q)$ since

    $$r\left(\frac{p}{2 - p}\right) + r(q) = r(p)$$

    We can use the concatenation laws to isolate $q$ in

    $$q \& \frac{p}{2 - p} = p$$
    $$q \& \frac{p}{2 - p} \& \left(\frac{p}{2 - p}\right)' = p \& \left(\frac{p}{2 - p}\right)'$$
    $$q = p \& \left(1 - \frac{p}{2 - p}\right) = p \& \frac{2 - 2p}{2 - p}$$

    and this can be evaluated using (x) to give

    $$q = \frac{p \cdot \frac{2 - 2p}{2 - p}}{p \cdot \frac{2 - 2p}{2 - p} + (1 - p)\left(1 - \frac{2 - 2p}{2 - p}\right)} = \frac{2(p - p^2)}{2(p - p^2) + (p - p^2)} = \frac{2(p - p^2)}{3(p - p^2)} = \frac{2}{3}$$

    Surprisingly the (win) tolerance is independent of $p$ (and $r$), and for the suggested
    parameters we find that the rating difference will have a maximum error of

    $$r\left(\tfrac23\right) = -\frac{1}{k} \log\left(\tfrac32 - 1\right) = 100$$

    and since the error is split between both players (the highest rated with an excess, and
    the lowest with a deficit)

    $$\text{actual rating} = \text{calculated rating} \pm \tfrac12\, r\left(\tfrac23\right) \qquad \text{(xviii)}$$

    which using suggested parameters is $\pm 50$ rating points.


    Draw Tolerance

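    If the stronger player suffers at worst draws in his normal point sequences (ii), we
    need to use (xvi), $p_o \ge \tfrac12$, $s = \tfrac12$

    $$p = \frac{1 + p_o}{4 - 2p_o}$$

    and solve for $q$ in

    $$q \& \frac{1 + p}{4 - 2p} = p$$
    $$q = p \& \left(\frac{1 + p}{4 - 2p}\right)' = p \& \left(1 - \frac{1 + p}{4 - 2p}\right) = p \& \frac{3 - 3p}{4 - 2p} = \frac{3(p - p^2)}{1 + 3p - 4p^2} = \frac{3p}{1 + 4p}$$

    Unlike win tolerances, draw tolerances are dependent on probability (or rating
    difference). As $p$ approaches ½, $q$ approaches

    $$\lim_{p \to 1/2} \frac{3p}{1 + 4p} = \frac{1}{2}$$

    and the tolerance approaches $\tfrac12\, r\left(\tfrac12\right) = 0$.

    On the other hand, as $p$ approaches 1, $q$ approaches

    $$\lim_{p \to 1} \frac{3p}{1 + 4p} = \frac{3}{5}$$

    and as before

    $$\text{actual rating} = \text{calculated rating} \pm \tfrac12\, r\left(\tfrac35\right) \qquad \text{(xix)}$$

    which using the suggested parameters is $\pm 29.24812503\ldots$ rating points.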
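    Both tolerances can be reproduced numerically from the concatenation formula (x) and the rating difference function (xiv). The sketch below assumes the suggested parameters; `win_tolerance` and `draw_tolerance` are illustrative names, and each solves for q by concatenating with the inverse exactly as in the derivations above.

```python
import math

K = math.log(2.0) / 100.0   # suggested parameterization: 2 wins per loss per 100 points


def rating_difference(p: float) -> float:
    """Formula (xiv)."""
    return -math.log(1.0 / p - 1.0) / K


def concat(p: float, q: float) -> float:
    """Formula (x)."""
    return p * q / (p * q + (1.0 - p) * (1.0 - q))


def win_tolerance(p: float) -> float:
    """Half of r(q), where q & p/(2 - p) = p."""
    q = concat(p, 1.0 - p / (2.0 - p))
    return rating_difference(q) / 2.0


def draw_tolerance(p: float) -> float:
    """Half of r(q), where q & (1 + p)/(4 - 2p) = p."""
    q = concat(p, 1.0 - (1.0 + p) / (4.0 - 2.0 * p))
    return rating_difference(q) / 2.0


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.75, 0.9, 0.99):
        print(p, win_tolerance(p), draw_tolerance(p))
```

    The printed win tolerance stays at 50 points for every p, while the draw tolerance grows from 0 toward the limit of roughly 29.25 points.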


    An old rating difference $r_o$ and a game score $s \in \{0, \tfrac12, 1\}$ will produce a new rating
    difference $r$ given by

    $$r = R(r_o, s) = \frac{1}{k} \log \frac{e^{k r_o (r_o > 0)} + s}{e^{-k r_o (r_o < 0)} + 1 - s} \qquad \text{(xxii)}$$

    where $1 \equiv$ true and $0 \equiv$ false.

    It may be noted here, that the score $s$ may actually be continuous, $0 \le s \le 1$, as a
    measure of absolute success, in any venture. As such, there will be no rating change
    if and only if the value of this success variable is the same as the probability $p(r_o)$.
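    The no-change condition can be verified by direct substitution. The following worked step covers the case $r_o > 0$, writing $q_o = e^{k r_o}$ (a shorthand introduced here) so that $p(r_o) = q_o / (1 + q_o)$; the case $r_o < 0$ follows in the same way from the other branch.

```latex
\begin{align*}
s &= p(r_o) = \frac{q_o}{1 + q_o}, \qquad q_o = e^{k r_o},\ r_o > 0 \\
\frac{e^{k r_o} + s}{1 + 1 - s}
  &= \frac{q_o + \dfrac{q_o}{1 + q_o}}{2 - \dfrac{q_o}{1 + q_o}}
   = \frac{q_o (2 + q_o)}{2 + q_o}
   = q_o \\
r &= \frac{1}{k} \log q_o = r_o
\end{align*}
```

    Since the expression inside the logarithm increases strictly with $s$, no other score leaves the rating difference unchanged.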


    13. Formulae Summary

    This has been dealt with in this series, for reasons of utility, in 1. Introduction.

    It may be noted here that the natural expression of probability, and easier to relate to

    for players, is as wins per loss. A value of 1 indicates the perfectly matched

    opponent. The mathematics itself would have been simpler in the compilation of this
    theory. And lastly, consider Swiss tournament management. Minimising the total

    wins per loss represents the best possible pairing of available players from round to

    round. This method was used in the Swiss Tournament Manager simulation program

    mentioned earlier. It will certainly have contributed to the verification of Odds

    Ratings and its implementation as STM. The code would not be difficult to translate

    for computers. The organization of games plays as great a part in rating players as

    the rating system itself in terms of efficiency. As for internet chess playing sites with

    meaningful feedback (as well as supplying pre-game odds), implementation was easy

    and already available from the material in 1. Introduction.
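    The pairing criterion can be illustrated with a small exhaustive search. This is only a sketch of the idea of minimising total wins per loss: it ignores everything else a Swiss pairing must respect (score groups, repeat pairings, colours), its cost grows very quickly with the number of players, and it is not the method used in the STM, whose code is not given here. The function names and the input layout are assumptions.

```python
import math


def wins_per_loss(r_a: float, r_b: float) -> float:
    """Wins per loss of the stronger player: q = 2 ** (|r_a - r_b| / 100)."""
    return 2.0 ** (abs(r_a - r_b) / 100.0)


def best_pairing(ratings: dict) -> tuple:
    """Exhaustively find the pairing with the smallest total wins per loss.

    ratings maps player name -> rating; the number of players must be even.
    Returns (total wins per loss, list of (player, opponent) pairs).
    """
    players = sorted(ratings)
    if not players:
        return 0.0, []
    first, rest = players[0], players[1:]
    best_total, best_pairs = math.inf, []
    for opponent in rest:
        remaining = {name: ratings[name] for name in rest if name != opponent}
        total, pairs = best_pairing(remaining)
        total += wins_per_loss(ratings[first], ratings[opponent])
        if total < best_total:
            best_total, best_pairs = total, [(first, opponent)] + pairs
    return best_total, best_pairs


if __name__ == "__main__":
    field = {"A": 1500, "B": 1510, "C": 1800, "D": 1790}
    print(best_pairing(field))
```

    Because wins per loss grows exponentially with the rating gap, the total is dominated by the worst mismatch, so minimising it pushes every board toward a close contest.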


    14. Playing Strength Distribution

    While the nature of a game of skill might be thought to determine the rating function,

    it is my conjecture that this is not the case. The relationship between r and p is

    rather the common feature. The nature of the game and that of the race will influence
    the distribution of playing strengths. But while our ratings are scaled, probabilities

    are absolute, and that probability representing one standard deviation from the mean

    has great significance. Only the implementation of meaningful ratings, and the

    compilation of statistics using these, can evaluate this measure of complexity or

    masterability (reverse sides of the same coin).

    Consider, for instance, stud poker. The element of chance being greater would

    reduce its masterability. However, anything from working the odds and non verbal

    communication skills, to unknown powers of the human psyche, may increase

    masterability to surprising degrees. This theory would then quantify such differences

    in games of skill.


    15. Non Chronological Processing

    This has been dealt with in this series, for reasons of utility, in 1. Introduction.

    Game results can represent a round robin or a rating period without any regard to the

    order of the games. A simple $n \times n$ cross-table can be used to tally game results.
    Another use of such a table is to deal with an entire unrated field by using two passes.

    This same technique (setting all players to the mean before processing) can be used

    for a tie-break, as it will rate relative play for that one event, without regard to

    previous success or failure.

    THE END