Rugby and District Bridge League
Handicap 2015-16

RDBL Handicap Tournament: Handicap Calculation
                                    JB/19-01-‘16


Introduction

At the Rugby and District Bridge League AGM in July 2014 a revised system of calculating handicaps for the annual tournament was discussed and adopted. A document was circulated describing the revised system and demonstrating its fairness by analysis of the outcomes of previous matches. Some minor amendments were made soon afterwards and prior to the introduction of the new system in September 2014.

In July of 2015 the outcomes of 2014-15 matches were analysed in the same way, confirming the conclusion that the new system was fair. The present document is an update of the original, incorporating the results of the further analysis.

The assessment of fairness involves basic probability calculations, at the level of calculating the probability of throwing X heads in Y tosses of a coin. Many players will have encountered such calculations in school mathematics, but may not have found them to have any useful application.


Initial study

The necessity for change arose from an initial study of the results of all handicap matches then recorded on the RDBL Bridgewebs website, which included results for the four years up to 2013-14. No earlier results for handicap matches could be traced.

The outcomes of all matches between a lower-ranking team (with a higher handicap) and a higher-ranking team (with a lower handicap) were divided into aggregate wins and losses for the higher-ranking team. Matches between teams with equal handicaps were very few and were disregarded. The remaining 56 matches resulted in 19 wins and 37 losses for the higher-ranking teams. The disparity in outcomes, in the ratio of about 1 to 2, suggested the possibility that an unintended bias in the handicaps favoured lower-ranking teams.
 
The principle of any handicap competition is that differences in strength between teams are compensated by handicaps that offset the expected differences in unadjusted scores. In the RDBL tournament these are determined from the results of the previous year’s divisional matches. If all teams in the handicap competition were fairly handicapped and if all teams were to perform exactly according to past form, then the result of every match would be a draw. That doesn’t happen because of variations in performance and, in principle, the winner of a handicap competition will be the team that most consistently exceeds its previous level of performance.

The outcomes of all matches should nevertheless divide more or less equally between wins and losses for the higher-ranking (or lower-ranking) teams. So 56 matches were expected to produce about 28 wins and 28 losses for the higher-ranking teams. It would not be surprising if there were, say, 26 wins and 30 losses, but 19 wins and 37 losses suggested the need for further investigation. The obvious means of investigation was to calculate the probability of this outcome, given a presumed fair system of handicaps.

The task is much simpler than might be thought because, for two outcomes of equal likelihood in any one trial, the probability is the same as for tossing a coin and finding 19 heads in 56 tosses. A bridge tournament is more fun than a coin-tossing tournament, but the probabilities of unequal aggregate outcomes are identical. The probabilities were therefore calculable by mathematical methods that have been known for over two hundred years. The arithmetic is tedious if performed by hand, but was easily performed with an Excel spreadsheet. The answer that emerged was that the probability of a fairly handicapped tournament producing no more than 19 wins for the higher-ranking teams over 56 matches is about 1%. That didn’t rule out the possibility of the handicaps being fair, but it did mean that a fair system would produce so one-sided an outcome only about once in a hundred such series of matches. It suggested that the existing handicap system was unintentionally weighted towards lower-ranking teams, and might be corrected by reducing the size of handicaps - but further investigation revealed a more complex picture.
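The coin-tossing calculation can be reproduced in a few lines. The following is a minimal sketch in Python rather than the spreadsheet actually used; the function name prob_at_most is illustrative only.

```python
from math import comb


def prob_at_most(wins: int, matches: int) -> float:
    """Probability of `wins` or fewer heads in `matches` tosses of a fair coin."""
    # Count the equally likely head/tail sequences containing 0..wins heads,
    # then divide by the total number of sequences.
    favourable = sum(comb(matches, k) for k in range(wins + 1))
    return favourable / 2 ** matches


# 19 or fewer wins for the higher-ranking teams in 56 matches
print(f"{prob_at_most(19, 56):.4f}")
```

The value printed is close to 0.011, i.e. about 1%. By symmetry the same figure is the probability of 37 or more wins for the lower-ranking teams.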

The results of probability calculations may conveniently be presented as in Figure 1. The lines show relationships between number of wins and number of matches at various levels of probability. The central, black line in Figure 1 shows the expected outcome that the numbers of wins and losses will be equal, but the probability of that precise outcome is small. The blue lines are symmetrical boundaries to the most likely 50% of outcomes, excluding the upper 25% and the lower 25% of outcomes. Aggregate match outcomes in the range between these boundaries are reasonably likely, and would not suggest any unfairness in the handicaps. The green lines bound the most likely 90% of outcomes, excluding the upper and lower 10% of outcomes, and aggregate outcomes outside this range may be regarded as fairly improbable. The orange lines bound the most likely 98% of outcomes, excluding the upper and lower 1% of outcomes, and aggregate outcomes outside this range are highly improbable.
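How the boundary lines were computed for plotting is not stated here. As a rough illustration, the sketch below finds, for a given number of matches, the whole-number boundaries that exclude a chosen fraction of outcomes at each end. The function names and the convention of taking the first count that lies outside the excluded tail are assumptions made for the example.

```python
from math import comb


def lower_boundary(matches: int, tail: float) -> int:
    """Smallest number of wins not inside the lower `tail` fraction of
    outcomes for a fair 50:50 contest."""
    total = 2 ** matches
    cumulative = 0
    for wins in range(matches + 1):
        cumulative += comb(matches, wins)
        if cumulative / total > tail:
            return wins
    return matches


def boundaries(matches: int, tail: float) -> tuple[int, int]:
    """Symmetrical (lower, upper) boundaries excluding `tail` at each end."""
    low = lower_boundary(matches, tail)
    return low, matches - low


# Boundaries for 56 matches, excluding 25% and then 1% at each end
for tail in (0.25, 0.01):
    print(tail, boundaries(56, tail))
```

Repeating the calculation for each number of matches gives a stepped version of each pair of lines.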

 

Handicap tournament results aggregated over the four years to 2013-14, as discussed in the previous section, are shown by the red diagonal cross. As stated above, a fair system would produce so one-sided a result with a probability of only about 1%, which is strong evidence that the handicaps were not fair. The purple diagonal cross shows the aggregated results over the two years up to 2013-14, a similarly improbable outcome for a fair system.

A further assessment of handicap fairness was made by analysis of the results of the 2013-14 divisional matches. These were the only divisional results then available on the Bridgewebs site and no earlier records could be traced. The test performed was to calculate handicaps according to the existing system (based on the 2013-14 results), to apply them retrospectively to the divisional results and to determine the numbers of wins and losses by the lower-ranking and higher-ranking teams. Of the 90 matches played a few results were discarded either because both teams had the same handicap or because the handicapped result was a draw. The remainder showed that only 34/87 matches were won by the team with the higher handicap. This outcome is shown in Figure 1 by the green orthogonal cross. It falls outside the fairly improbable range and approaches the highly improbable boundary. The handicap system was thus again found to be unsatisfactory, but this time it appeared to favour higher-ranking teams.
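The retrospective test can be sketched as follows. This is an illustration only: the team names, scores and handicaps are hypothetical, and it assumes that match results are held as IMP totals and that each team's handicap is simply added to its own total.

```python
def handicapped_tally(results, handicaps):
    """Count handicapped wins for lower- and higher-ranking teams.

    results   -- list of (team_a, team_b, imps_a, imps_b) tuples
    handicaps -- dict mapping team name to handicap in IMPs
    """
    lower_wins = higher_wins = discarded = 0
    for team_a, team_b, imps_a, imps_b in results:
        h_a, h_b = handicaps[team_a], handicaps[team_b]
        # Assumption: each team's handicap is added to its own IMP total.
        margin_a = (imps_a + h_a) - (imps_b + h_b)
        if h_a == h_b or margin_a == 0:
            discarded += 1    # equal handicaps, or a handicapped draw
        elif (margin_a > 0) == (h_a > h_b):
            lower_wins += 1   # winner is the team with the higher handicap
        else:
            higher_wins += 1
    return lower_wins, higher_wins, discarded


# Hypothetical data, not taken from the RDBL results
handicaps = {"A": 0, "B": 12, "C": 30}
results = [("A", "B", 50, 40), ("B", "C", 60, 45),
           ("A", "C", 70, 30), ("A", "B", 52, 40)]
print(handicapped_tally(results, handicaps))
```

With these made-up figures the lower-ranking teams win two matches, the higher-ranking team wins one, and one match is discarded as a handicapped draw.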

What the analysis confirmed unambiguously was the need to examine the structure of the handicap system.


System structure

The existing system for determining handicaps was as follows:

(i) Teams were ranked according to Average Victory Points (Average VPs) scored in the most recent divisional competition,
(ii) Within each division, handicaps were awarded on an incremental scale of 6 International Match Points (IMPs), starting at zero for the highest ranked,
(iii) A scale encompassing all divisions was produced by awarding the highest-ranked team in Div.2 (or Div.3) the same handicap as the lowest-ranked team in Div.1 (or Div.2), then adjusting the handicaps for all lower-ranking teams accordingly.

Thus, for 3 divisions of 6 teams, the nominal range of handicaps was from zero to 90. In practice some teams with very similar Average VPs might be awarded the same handicap. And new teams with no known form might be included, but with no satisfactory basis for determining their handicaps.
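Steps (i) to (iii) can be sketched as below, using the 2013-14 Divisional Average VPs. The function name is illustrative, and the treatment of ties (teams with identical Average VPs share the handicap of the higher position) is an assumption, since the rule for "very similar" averages is not spelled out.

```python
def existing_handicaps(divisions, step=6):
    """divisions: list of divisions, highest first; each division is a list
    of (team, average_vps) sorted from highest to lowest Average VPs."""
    handicaps = {}
    base = 0
    for division in divisions:
        previous_vps = None
        current = base
        for position, (team, vps) in enumerate(division):
            # Assumption: equal Average VPs share the earlier handicap.
            if vps != previous_vps:
                current = base + step * position
            handicaps[team] = current
            previous_vps = vps
        # Top of the next division starts at the bottom of this one.
        base = current
    return handicaps


divisions = [
    [("Eroes", 14.0), ("Discards", 14.0), ("Gambits", 13.4),
     ("Dragons", 10.0), ("Diamonds", 5.3), ("JCBC", 3.3)],
    [("Rabbits", 11.5), ("Royals", 10.6), ("Cavaliers", 10.1),
     ("Rebels", 10.0), ("Clubs", 9.9), ("Imps", 7.9)],
    [("Donuts", 17.0), ("United", 15.3), ("Dream", 8.9),
     ("Jokers", 8.3), ("Lambs", 7.7), ("Pioneers", 2.8)],
]
old = existing_handicaps(divisions)
print(old["JCBC"], old["Rabbits"], old["Pioneers"])
```

The three values printed are the handicaps at the division boundaries and at the bottom of the scale: 30, 30 and 90.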

Two possible flaws in this procedure were apparent:

1) The Average VPs that determine rank exhibited quite irregular intervals, as shown in Table 1 below by the ranges for each division. As one might expect, the distribution of Average VPs resembles the natural distribution of, for example, body weight or IQ, showing relatively little variation between teams of middling rank and greater variation between teams in the upper and lower ranks. The uniform increments in handicap were not commensurate with the variable differences in Average VPs.
2) There is no reason to suppose that the divisions do not overlap in strength. It had been the practice to promote the two teams at the top of Div.2 (or Div.3) and demote the two teams at the bottom of Div.1 (or Div.2), with the satisfactory consequence that some teams alternated between divisions in successive years. The handicaps are inconsistent with this practice unless they are adjusted to a corresponding overlap.

For these reasons an alternative system was investigated in which:

1) Within any one division, the Expected VP Difference between the highest-ranking and any other team was calculated exactly from the difference in Average VPs. Expected VP Difference was calculated as the difference in Average VPs multiplied by 2(N-1)/N, where N is the number of teams in the division. This relationship may not be self-evident, but may easily be verified by reversing the process, i.e. by calculating Average VPs on the assumption that every match produces the Expected VP Difference.
2) Merging of the divisions was achieved by setting the highest in Div.2 (or Div.3) at a level midway between the second-lowest and third-lowest in Div.1 (or Div.2). Merged VP Differences then overlapped by two ranks and were consistent with promotion policy.
3) Merged VP Differences were converted to handicaps in equivalent IMPs, calculated as,

            Handicap IMPs = 2.75 x Merged VP Diff.s

The conversion of Merged VP Diff.s to Handicap IMPs, by linear proportion, approximates to the conversion of IMPs to VPs on the standard scorecard, but extends beyond the scorecard limit of 20 VPs.
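The three steps, together with the reverse check mentioned in step 1, can be sketched as follows for the Div.1 and Div.2 Average VPs of 2013-14. This is an illustration rather than the routine actually used: the function names are invented, rounding to the nearest whole IMP is assumed, and the reverse check assumes that the 20 VPs of a match are shared as 10 + d/2 and 10 - d/2 for an Expected VP Difference d.

```python
def expected_vp_diffs(average_vps):
    """Expected VP Difference of each team below the top team of its
    division; average_vps must be sorted from highest to lowest."""
    n = len(average_vps)
    factor = 2 * (n - 1) / n
    return [factor * (average_vps[0] - vps) for vps in average_vps]


def reversed_average_vps(diffs):
    """Average VPs if every match produced exactly the Expected VP
    Difference (assumed split: 10 + d/2 against 10 - d/2)."""
    n = len(diffs)
    return [
        sum(10 + (diffs[j] - diffs[i]) / 2 for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def merge_divisions(divisions):
    """Merged VP Differences; divisions are listed highest first and each
    needs at least three teams."""
    merged = []
    base = 0.0
    for average_vps in divisions:
        current = [base + d for d in expected_vp_diffs(average_vps)]
        merged.append(current)
        # Top of the next division: midway between the second-lowest
        # and third-lowest teams of this one.
        base = (current[-2] + current[-3]) / 2
    return merged


def handicap_imps(merged_diff, imps_per_vp=2.75):
    """Handicap IMPs, rounded to the nearest whole IMP (assumed)."""
    return round(merged_diff * imps_per_vp)


div1 = [14.0, 14.0, 13.4, 10.0, 5.3, 3.3]
div2 = [11.5, 10.6, 10.1, 10.0, 9.9, 7.9]
merged = merge_divisions([div1, div2])
form_l = [[handicap_imps(m) for m in division] for division in merged]
print(form_l)
print(reversed_average_vps(expected_vp_diffs(div1)))
```

The reverse check returns the original Average VPs because each division's Average VPs themselves average 10; more generally it reproduces the differences between them.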

Table 1 shows handicaps calculated according to these two systems from the divisional match results for 2013-14. The alternative system described above is referred to as the “Form-L” system.


   Table 1: Handicaps based on 2013-14 Divisional Average VPs

                      Divisional   Existing    Form-L handicaps
                      Average      Handicaps   Expected    Merged      Handicap
Division  Team        VPs          IMPs        VP Diff.s   VP Diff.s   IMPs
1         Eroes       14.0         0           0           0           0
1         Discards    14.0         0           0.00        0.00        0
1         Gambits     13.4         12          1.00        1.00        3
1         Dragons     10.0         18          6.67        6.67        18
1         Diamonds    5.3          24          14.50       14.50       40
1         JCBC        3.3          30          17.83       17.83       49
1         Range       10.7
2         Rabbits     11.5         30          0           10.58       29
2         Royals      10.6         36          1.50        12.08       33
2         Cavaliers   10.1         42          2.33        12.92       36
2         Rebels      10.0         48          2.50        13.08       36
2         Clubs       9.9          54          2.67        13.25       36
2         Imps        7.9          60          6.00        16.58       46
2         Range       3.6
3         Donuts      17.0         60          0           13.17       36
3         United      15.3         66          2.83        16.00       44
3         Dream       8.9          72          13.50       26.67       73
3         Jokers      8.3          78          14.50       27.67       76
3         Lambs       7.7          84          15.50       28.67       77
3         Pioneers    2.8          90          23.67       36.83       101
3         Range       14.2


The Form-L system was tested for fairness in exactly the same way as the existing system, by determining win rates from the divisional results after the retrospective application of handicaps. Lower-ranking teams then won 44/88 matches. This outcome is entirely consistent with a fair system, as shown in Fig. 2 by the green orthogonal cross that coincides exactly with the expected outcome.

 


Further study

At the end of the 2014-15 season, during which the Form-L system was first in force, the match outcomes were analysed in the same fashion as before. It was found that lower-ranking teams won 39/80 divisional matches, as indicated by the purple orthogonal cross in Fig. 2 lying very close to the expected outcome. Also shown, by the red orthogonal cross, is the combined outcome for divisional matches in 2013-14 and 2014-15 of 83/168. The nearness of the combined outcome to the expected outcome, for a total of 168 trials, convincingly demonstrates the fairness of the Form-L system.

Also shown in Fig. 2 is the outcome of the 2014-15 handicap tournament matches, in which lower-ranking teams won 6/12 matches, exactly in line with expectation. The number of matches is small, and will not accumulate to a statistically significant number until further handicap tournaments have been played under the same system over another four years. It is desirable that a record of results be kept over that period to facilitate further corroboration of the fairness of the system.


Handicapping of new teams

When the Form-L handicap system was introduced, for the 2014-15 season, the customary scheduling of the Handicap Cup and Open Cup competitions was reversed. The Handicap competition now takes place in the second half of the season and the Open competition takes place in the first half. This allows a handicap to be calculated for any new team from its results in divisional matches up to the end of the calendar year. The calculation awards a VP differential relative to each opponent’s average VPs from the previous season, averages the results and converts that to handicap IMPs in the proportion given above. The calculation also accommodates the results of matches played between two new teams, by iteration of a trial solution.


Conclusion

It is clear from the initial study that the existing handicap system did not satisfactorily compensate for the relative strengths of the competing teams. And it is clear from both the initial and further studies that the Form-L system now in force is a much fairer basis for handicap tournaments.

Any team, from the lowest-ranking to the highest-ranking, will now be able to enter the handicap tournament with the same chance of ultimate victory as any other. The eventual victors will have the satisfaction of knowing that their success was entirely merited – because they most consistently exceeded their performance in the previous season.

 

Footnote: In the course of this investigation an Excel spreadsheet was produced to perform the automatic calculation of Handicap IMPs from the previous season’s Divisional Average VPs. The spreadsheet includes a routine for calculating Handicap IMPs for a new team. A copy of the spreadsheet is in the possession of the RDBL Committee.