The Relative Strengths of the Openings
By C. H. O’D. Alexander and E. T. O. Slater
The British Chess Magazine, LXXV, 6, June 1955
SOME time ago the authors of this article began to get intrigued (partly as a chess interest, partly as a statistical one) by the problem of an alternative approach to an evaluation of the openings. As things are, strong and weak lines for either player are identified within the ambit of any given opening; but opinions vary widely about the relative merits of the main openings, each regarded as a whole. The question arose whether, by analysing the results over a long period of time, one could show that some of the established openings were better than others, for White or for Black as the case might be.
First, the method of tackling the problem. This is by no means as easy as it might appear. Suppose we take the Sicilian, for example. Over our sample of 436 tournament games we found 130 White wins, 157 draws, 149 Black wins (we shall write such a result as 130/157/149 from now on). At first sight this indicates that the Sicilian is in Black's favour. But this fails to take into account that, the Sicilian being an aggressive defence, there is a marked tendency for a stronger player to use it against a weaker one when playing for a win. In fact, close examination of the players involved shows that the average strength of the Black player, in these 436 games, was markedly above that of White; and that, going by the players' strengths, Black should have done even better than he did. This example shows that it is essential to take into account the expected result of each game based on playing strength, if the final statistics are to have any meaning.
The next problem was how to estimate playing strength. Should it be based on the results of the tournament in question (leaving one liable to errors arising from the smallness of the sample), or should it be based on results over a period of years (ignoring the great variations in playing form from tournament to tournament)? We thought that the disadvantages of small samples would be considerably less serious than those involved in the second method, and accordingly adopted the first one. The "expected" result of a game between A and B was, therefore, defined as the chance of a win, draw, or loss for A, based solely on the performances of the two players in the rest of the tournament in question. Thus, if there was on this basis ¼ chance of a win for White, ¼ chance of a draw, ½ chance of a win for Black, this would contribute ¼/¼/½ to the final expected score of the opening under consideration.
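The bookkeeping described above can be sketched in a few lines of Python. This is an illustration only, not the authors' own procedure: the function name `expected_tally` and the three extra games are hypothetical, and the per-game chances are simply assumed as given, since the article does not state how they were derived from tournament performance.

```python
from fractions import Fraction


def expected_tally(games):
    """Sum per-game (White win, draw, Black win) chances into one opening total."""
    white = draw = black = Fraction(0)
    for p_white, p_draw, p_black in games:
        # The three chances for a single game must cover every outcome.
        if p_white + p_draw + p_black != 1:
            raise ValueError("chances for one game must sum to 1")
        white += p_white
        draw += p_draw
        black += p_black
    return white, draw, black


quarter, half = Fraction(1, 4), Fraction(1, 2)

# Hypothetical sample: the first game uses the chances quoted in the text;
# the other three are invented purely to show the accumulation.
games = [
    (quarter, quarter, half),
    (half, quarter, quarter),
    (quarter, half, quarter),
    (half, quarter, quarter),
]

total = expected_tally(games)
print(total)
```

Because every game contributes chances that sum to one, the three totals always add up to the number of games, which is why an expected result can be written in the same White/draw/Black form as an actual one.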
Not to try the patience of readers too much, we will not enlarge on the method of calculating these chances, except to make three remarks for the benefit of statisticians: (1) the initial assumption was made that in the absence of other evidence all "expected results," as defined above, were equally likely (this, though not exactly true, cannot be far out); (2) given this assumption, an accurate formula for sampling error was calculated; (3) in tournaments where draws were disproportionately frequent, the formula in (2) was modified accordingly. Whether or not this sketchy, but still perhaps over‑long, explanation will convince readers that we have considered and dealt with the pitfalls of a statistical analysis of this kind, we cannot say. We do, however, feel quite satisfied ourselves that the method used was fundamentally sound; and various internal checks and verifications were possible which confirmed its correctness.
The table on the opposite page shows the actual and expected results. Before going on to the general deductions to be made, we might say something about detailed interpretation, again taking the Sicilian as an example. The "actual" result over 436 games was 130/157/149, or +19 for Black; the "expected" result, based on playing strength, was 107/177/152, or +45 for Black. Thus White scored 45 - 19 = 26 more points out of 436 than expected. This is about 6 per cent; and it is approximately correct to interpret this as meaning that, amongst equal players, White will score 53 per cent and Black 47 per cent in the Sicilian.
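The Sicilian arithmetic can be checked with a short Python sketch. It rests on one reading of the text that is our assumption: the "+19" and "+45" are Black's surplus of wins over losses, so the swing of 26 is a change in win margin, and half of its percentage is added to an even 50 to give White's score between equal players. The helper name `black_margin` is hypothetical.

```python
def black_margin(white_wins, draws, black_wins):
    """Black's surplus of wins over White's; draws do not affect the margin."""
    return black_wins - white_wins


actual = (130, 157, 149)    # White wins / draws / Black wins, as played
expected = (107, 177, 152)  # the same tally as predicted from playing strength
games = sum(actual)         # 436 games in the sample

actual_margin = black_margin(*actual)      # +19 for Black
expected_margin = black_margin(*expected)  # +45 for Black
swing = expected_margin - actual_margin    # 26 in White's favour
swing_pct = 100 * swing / games            # about 6 per cent

# Assumption: between equal players the baseline is 50/50, and a margin
# swing of s per cent moves each side's score by s/2 per cent.
white_pct = 50 + swing_pct / 2
black_pct = 100 - white_pct

print(f"swing {swing} of {games} games = {swing_pct:.1f} per cent")
print(f"White {white_pct:.0f} per cent, Black {black_pct:.0f} per cent")
```

Counting a win as one point and a draw as half gives the same answer: White made 208½ points against an expected 195½, a gain of 13 points in 436 games, or about 3 per cent above an even score.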
Now the deductions. There are a number that can be made, some being those one would anticipate, others (to us at least) surprising. First, the results are clearly significant: the differences between the actual and the expected results are much greater than could be explained by chance. Of the significant results the most important are:
(1) White has an advantage in every opening except Colle's. On average, between equal players, White will score about 55 per cent.
(2) The Queen's Pawn is stronger than the King's Pawn opening for White. Whereas after 1 P‑Q 4 White scores about 56.4 per cent between equal players, after 1 P‑K 4 he only scores 54.3 per cent.
(3) The less played lines are on the whole good for those who play them. This is a very interesting result. It can hardly be due to those lines being better than the more common ones; and the most plausible explanation is that if a player specializes in one of the rather less common lines, he thus gains an appreciable advantage from his opponent's comparative unfamiliarity with it. The Queen's Gambit Accepted and the Dutch, the two least played of the standard defences to the Queen's Pawn, have the best record against it. Alekhine's Defence, the least played of the main defences, has the best record against the King's Pawn. And Réti's opening, much less played than the King's Pawn or Queen's Pawn, has a better record than either for White. Only Colle's opening is an exception to this rule, and most chess players would agree that it is a very indifferent line of play.
(4) The Lopez has no terrors. Black does better against the Lopez than in any other opening except Alekhine's and the Colle.
(5) The quiet form of the Queen's Gambit (1 P‑Q 4, P‑Q 4; 2 P‑Q B 4, P‑K 3; 3 Kt‑K B 3, Kt‑K B 3 or transpositions leading to this) is very strong for White. On the other hand, the more aggressive form with 3 Kt‑Q B 3 does rather less well for White than the average Queen's Pawn.
(6) The Slav is a bad defence to the Queen's Pawn, with an expected score 60/40 in favour of White between equal players.
(7) The Caro Kann is a bad defence to the King's Pawn, and the French is not by any means as good as 1 . . ., P‑K 4, the Sicilian or Alekhine's Defence.
There are many points of interest not covered by this analysis, notably the effect of time: how does the success of White and Black in various openings change over the years? Our general impression is "not very much," but more data and more analysis are needed for a clear answer. There is also more work to be done in analysing the swings by the strength of the players taking part. In the Sicilian, for instance, it seems that most of Black's loss comes from his losing many more games than he should in encounters in which he is much the stronger player. If this is verified, it is probably accounted for by overpressing (always dangerous with Black); but we should not like to state with confidence that this further breakdown is sound.
Finally, although we firmly believe our results are significant, they are not strong enough to outweigh really strong personal preferences. But we do think that players who have not yet formed set habits might benefit from looking at the figures, and at least think twice before, say, taking up the Slav or the Caro Kann.