Imagine that you’re at your local. It’s a smallish event, so there are only 23 of you in the room, including all the volunteers and staff. All of a sudden, someone produces a birthday cake from underneath a table. But before they can say who it’s for, two people step up to claim it. Wow, you think to yourself, what are the odds?
As it happens, they’re not bad: with 23 people, it’s actually slightly more likely that at least two of them will share a birthday than it is that none of them will share a birthday. This fact has been popularized as “the birthday paradox,” although in fact it’s nothing more than a quirk of statistics.
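For anyone who wants to check that claim, here is a minimal Python sketch. It assumes every day of a 365-day year is an equally likely birthday and ignores leap years; the function name is just an illustrative choice.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumption: every day is equally likely and leap years are ignored.
    """
    if people > days:
        return 1.0  # pigeonhole: a match is guaranteed
    no_match = 1.0
    for i in range(people):
        # Person i has to avoid the i birthdays that are already taken.
        no_match *= (days - i) / days
    return 1.0 - no_match


for n in (10, 22, 23, 30):
    print(f"{n} people: {shared_birthday_probability(n):.1%}")
```

Under those assumptions, 23 is the first group size at which the probability creeps past one half.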
What does this have to do with fighting games, you ask? We’ll get there. First, though, let’s ask a more obviously relevant question: which main-stage fighting game is currently the most balanced?
The Case For Smash Ultimate
Recently, the Smash community put on two S-tier tournaments on the same weekend. Genesis 7 and Evolution (Evo) Japan both took place on the last weekend of January, and both featured exceptionally high-level play. They also featured an uncommon amount of character variety. Neither event had a repeat character in its Smash Ultimate top eight, which wasn’t true of any other main-stage game played at either event. Aside from Ultimate, they all had duplicates: Melee, Street Fighter V, Tekken 7, BlazBlue: Cross Tag Battle, Samurai Shodown, and Soulcalibur VI.
Intuitively, this variety of results marks Ultimate as a uniquely well-balanced game (at least at the highest levels of competition). We can also delve deeper and look at the characters themselves. At Genesis, top eight featured a robbery character (Wario), characters with unorthodox movement and tricky combos (Pikachu and Peach), a steamroller (Zero Suit Samus), and an archetypal brawler (Mario). Meanwhile, at Evo Japan, there were several projectile-based characters (Olimar, Pac-Man, and Duck Hunt), a swordie (Shulk), and a hit-and-run character (Sonic).
No matter how you break it down, Ultimate had outstanding character diversity. At both Genesis and Evo Japan, the first character repeat not only occurred outside the top eight, it was a character that didn’t make top eight at all. (Both tournaments had two Palutena players in ninth place.) Granted, there were repeats between the two tournaments, namely, Fox, Joker, Mario, and Zero Suit Samus. But if we pretended to combine both tournaments, that would still only give us eight repeats in the top sixteen – which is the second-lowest number of all the main-stage games, trailing only Soulcalibur VI.
In short, despite its massive roster, no one character is dominating Ultimate, and the game allows for a wide variety of different playstyles. Surely Ultimate is therefore the most balanced fighter at the moment, right? Well, maybe.
Enter The “Paradox”
While it is important to ask how much character diversity any given game has, it’s also important to know how much character diversity we should expect. If you see a top eight that has only four characters, each of which is represented twice, you might guess that the game is poorly balanced. But if that game only has four characters to begin with, then that top eight would be perfectly balanced.
So how should we calibrate our expectations? This is where the birthday paradox comes back into play. Contrary to our intuitions, a shared birthday among 23 random people is slightly more likely than not. Likewise, when we run the math, we may find that Ultimate’s large roster size blurs the picture painted solely by the results of Evo Japan and Genesis 7.
Using the same math that goes into the birthday paradox, we can come up with a smarter assessment of our main-stage fighting games. We’ll look at some examples next.
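For reference, the birthday-paradox math can be written out as follows. This is a sketch under a simplifying assumption that the article implies but does not spell out: in a "perfectly balanced" game, each of the k top-eight slots is filled by a character chosen uniformly and independently from a roster of N. The symbols N and k are labels introduced here for illustration.

```latex
P(\text{no repeats}) = \prod_{i=0}^{k-1} \left(1 - \frac{i}{N}\right)
                     = \frac{N!}{(N-k)!\, N^{k}},
\qquad
P(\text{at least one repeat}) = 1 - P(\text{no repeats})
```

Setting N = 365 and k = 23 recovers the classic birthday result of roughly 50.7%. For a top eight, k = 8 and N is the size of the game's roster.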
The Case For Smash Ultimate…Maybe
Sometimes the math only tells us what we already know. For example, consider the case of Tekken 7. At Evo Japan, seven out of the top eight Tekken players chose Leroy at some point in the tournament. Instinctively, we see those results and we suspect that Tekken might have a balance problem. And, indeed, the math backs that up: if Tekken were perfectly balanced, there would only be a 44% chance that it would have any repeat characters in its top eight, let alone seven repeats. In fact, in a perfectly balanced version of Tekken, the results from Evo Japan would only have a 0.002% chance of occurring. As such, we can safely say that Tekken is not perfectly balanced and that the results of Evo Japan point to the nature of that imbalance.
Similarly, the number for Melee is 0.37%. Here again, there’s no surprise: “Fox Only” is a meme for a reason. Yet there are some cases in which the math matters. Street Fighter V had only four repeat characters as opposed to Samurai Shodown’s seven, but the statistical numbers flip that around. Per the birthday paradox formula, Samurai Shodown is much more likely to feature repeats of any sort (78% vs. 54%), and its Evo Japan results are more likely as well (49% vs. 34%). As such, Samurai Shodown’s repeats shouldn’t count against it as much as Street Fighter’s. In other words, the math suggests that Samurai Shodown is better balanced relative to its roster size than Street Fighter V is.
What, then, of Ultimate? As it turns out, there’s good news and bad news. The good news is that, according to the statistics, there’s a 69% chance that a perfectly balanced version of Ultimate would feature no repeats. But that number is double-edged. It’s encouraging if you look at the most recent results, but it’s equally discouraging if you look at what happened in previous tournaments. For example, as recently as October, The Big House 9 had nine repeat characters in its top eight (three Wolf players, two Palutenas, two Olimars, and two Misters Game & Watch). The odds of that happening? Well below 1%. Similarly, as noted above, Soulcalibur VI actually had the most diverse top sixteen, but its roster is roughly one-third the size of Ultimate’s. Here again, the math complicates matters.
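To see how strongly roster size drives these numbers, here is a small Python sketch. It assumes a "perfectly balanced" game in which every top-eight slot is an independent, uniformly random pick from the roster. The roster sizes in the loop are placeholders rather than the real rosters of any game, and both function names are hypothetical.

```python
def no_repeat_probability(roster_size: int, slots: int = 8) -> float:
    """Chance that `slots` uniformly random picks from a roster are all different."""
    if slots > roster_size:
        return 0.0  # more slots than characters: a repeat is unavoidable
    p = 1.0
    for i in range(slots):
        # Pick i has to avoid the i characters that were already chosen.
        p *= (roster_size - i) / roster_size
    return p


def smallest_roster_for(target: float, slots: int = 8) -> int:
    """Smallest roster size whose no-repeat chance reaches `target`."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must satisfy 0 <= target < 1")
    size = slots
    while no_repeat_probability(size, slots) < target:
        size += 1
    return size


# Placeholder roster sizes for illustration only, NOT actual game rosters.
for size in (16, 32, 64):
    print(f"roster {size}: no repeats {no_repeat_probability(size):.0%}")

print("roster needed for a 50% no-repeat chance:", smallest_roster_for(0.5))
```

Under this model, a repeat-free top eight is only more likely than not once the roster reaches 43 characters, which is why a small-roster game with a few duplicates can still compare well against a large-roster game with none.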
The Case For Patience
In the Fighting Game Community, balance is a subject that understandably receives a lot of attention. As is also the case with other competitive communities, we get antsy when we see the same results over and over again. Even so, the most that we can say about Smash Ultimate is that it’s capable of a truly beautiful amount of balance and diversity. The results from Genesis 7 and Evo Japan proved that much, but they didn’t prove any more than that.
And perhaps that’s the most that we can ask for. Perfection is never an option, either for the people who play fighting games or for the people who make them. As long as our games are balanced enough, that will give us what we need. For now, let’s all just hope that Leroy isn’t the next Smash Ultimate DLC character.