Wednesday Morning Mythra 8/9: Perpetually Underranked: Japan, Or “How We Got Here” ft. Stuart98

Hugh-Jay "Trade War" Yu
35 min read · Aug 9, 2023


Twitter. YouTube. Neo’s here and he’s currently passed out in the hotel room! I forgot to post or do TMM yesterday due to travel complications, so Stuart, one of the core members of the LumiRank team, reached out and asked if he could get a platform to talk about the LumiRank stuff and a lot of the general Japan bias claims he has to field. What I expected was a short recap of ranking history and how the LumiRank stuff works behind the scenes a little. What I got was…fully comprehensive. Enjoy!

Hi, I’m the guy who wrote the Lumi(!!)Rank algorithm. The ranking’s been releasing! And at least so far, the feedback’s been pretty good; some things don’t look right at first glance, but once they see all the data, most people don’t seem to think anything’s egregious.

Most people. [EDITORS NOTE: You all got owned by this guy.]

One thing that is clear is that LumiRank has a lot of Japan. I mean a lot. Nearly half of the top 50 is from Japan, and in the 51–100 range they form an outright supermajority. To be sure, this is the best year Japan’s ever had in Ultimate or Smash 4, but still, that’s a lot more than normal; should they have that many?

I think the issue here isn’t that there’s so many players from Japan this year; rather, with only a couple exceptions, they’ve been systemically underrated on previous rankings, and this one is one of, if not the first, to get things right.

Brawl

Brawl only had one “global” ranking during the game’s heyday, the SSBBRank 2014, which ranked the top 99 players in the world from 2013–2014. Japan was pretty well represented in the top 20, with 3 players in the top 10 and 2 more in 11–20.

When you’re like “Hey, I know that guy!” but you’d rather not.

This surely means they’d have like, at least 2 dozen more players in the top 100 right?

Wrong.

Of the 79 players ranked 100–21 on the 2014 SSBBRank, only one, Shogun, was from Japan. What was happening? Were those six players just on a whole different level from the rest of their country?

Uh, no.

When you’re like “Hey, I know that guy! He’s been playing for a whole decade? That’s cool, I didn’t know that.”

What actually happened is something pretty common in panel rankings at the time:

Determining that creating an accurate worldwide top 100 would be too difficult, the SSBBRank creators instead created a top 99 players who had competed in North America and called that the top 100 best Brawl Players, denying recognition to top talent like Masha, Edge, and Choco who put up top-notch results in Japan but didn’t travel outside of it. In fairness, it was difficult to even find brackets for many of these Japanese events; even today many of them are nowhere to be found. It’s not like the SSBBRank creators were putting a whole lot of effort in elsewhere either.

POV: You know how counting works. [EDITORS NOTE: I have no clue what the issue he is referring to here is OH IT WAS POINTED OUT TO ME THATS SO FUNNY LOL]

Regardless, the result was that while dozens of North American smashers were recognized by the ranking, many of the best smashers in Japan would be obscured. Some would eventually find recognition through later Smash titles; others would only grow in obscurity through the present day.

The PGRv1

The advent of Smash 4 meant major changes to the smash landscape; events got a lot bigger, with five separate events in Smash 4’s first year alone surpassing Brawl’s 400 entrant record set at Apex 2012. In Japan, Umebura and Sumabato, series which rarely or never exceeded 100 entrants in Brawl, saw 100 entrants become the floor and on occasion even exceeded 200 entrants, unheard of for a more or less monthly tournament series in Japan. With this change in landscape also came a change in the rankings. No world ranking was produced for 2015; eventually, Dom, suar, and Zan would introduce the Panda Global Rankings, a ranking using “objective” metrics to determine the 50 best players in the world using events from the end of January 2015 through the first week of May 2016. To qualify for the ranking, one only needed to attend one evaluated event, and unlike the 2014 SSBBRank, events from all over the world were evaluated and used to qualify players for the ranking. So what was the breakdown of those events?

My favorite countries that were comparably strong in Smash 4, Japan and Sweden.

The PGRv1 only considered events they deemed as “nationals” or “majors” (by modern standards, we’d probably use the terms “majors” and “supermajors” respectively). These definitions nominally required some international attendance, of which Japan had none except at Umebura First Anniversary Tournament — not to say that these standards were consistently applied to the US.

My favorite overseas player, VoiD (he is from Hawaii, that counts)

As a result, only one Japanese event was counted for the ranking. This led to only 7 Japanese players being ranked in the top 50, with players like Rain (who was absolutely dominating Japan) being underranked, while others like Choco, Edge, and SH were left off entirely. Japan wasn’t alone in being impacted by this; the exclusion of big Mexican events like Thunderstruck, True Combo, and The Arena led to top MX talent like Wonf and Serge being unranked, while the exclusion of regional events in North America arbitrarily led to some top NA talent being underranked or unranked. However, no country had as many events, or events as stacked, excluded as Japan, and no country had more players left off the ranking as a result. BarnardsLoop has done a lot of great work cataloging and scoring these excluded events from all over the world in the Retro OrionStats 2015 TTS, finding dozens of should-have-been majors in Japan that didn’t count; it will hopefully be used at some point to create a far more accurate ranking of the best players in the world from early Smash 4.

The PGRv2

Criticism over the PGRv1’s exclusion of international events led to serious changes in tournament evaluation for the PGRv2. Multiple members were added to the team to improve its international coverage, most notably the goat juddy96, who from early Smash 4 through early Ultimate transliterated Japanese brackets into English. The PGRv2 used a greatly expanded set of tournaments from all corners of the world; while the PGRv1 evaluated 22 tournaments in a time-period of approximately 15 months, v2 evaluated 49 tournaments in a time period of approximately 8 months, an over 4x increase in event density. It didn’t evaluate everything it should have — notable Japanese events missed include Karisuma 10 (feat. Earth, Taiheita, and SH), Rikabura 6 (feat. Abadango, KEN, and Kameme), and Waseda Festival 2016 (feat. KEN, Earth, Kameme, and Abadango) — but the list of missed Japanese events wasn’t drastically longer or more stacked than the missed American events, which included events like SCR 2016 (feat. ANTi, VoiD, Zenyou, and Tyrant), 2GG Pay it Forward (feat. VoiD, JK, Tyrant, and Ito), and Don’t Park on the Grass (feat. DKWill, MVD, and ESAM).
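To make the "over 4x" figure concrete, here is a quick sanity check of the arithmetic. It uses only the tournament counts and the approximate season lengths quoted above, so the result is only as precise as those "approximately 15" and "approximately 8" month figures; the function name is just for illustration.

```python
# Event density = evaluated tournaments per month of season.
# The month counts are the article's approximations, not exact season lengths.
def events_per_month(tournaments: int, months: float) -> float:
    return tournaments / months

v1_density = events_per_month(22, 15)  # PGRv1: 22 events over roughly 15 months
v2_density = events_per_month(49, 8)   # PGRv2: 49 events over roughly 8 months
density_ratio = v2_density / v1_density

print(f"PGRv1: {v1_density:.2f} events per month")
print(f"PGRv2: {v2_density:.2f} events per month")
print(f"Increase: {density_ratio:.2f}x")
```

The ratio lands a little above 4, which matches the claim in the text.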

In terms of the Japanese players who did make top 50, I don’t think the PGRv2 underrated any of them drastically; my older ΩRank algorithm, when run on that season with just the PGRv2 tournaments, only differs significantly from the PGRv2 on two Japanese players: Earth is placed 34th instead of 25th, and Taiheita is placed 35th instead of 49th. ΩRank does notably place three Japanese players in the 40s who missed top 50 on the PGR: Ri-ma, Shuton, and Tsu. Ri-ma is there entirely off of a strong 13th place at Big House that came with a Larry Lurr win, while Shuton and Tsu are two players who, unlike every player who made the actual PGRv2 top 50, attended no events outside of Japan. I’m not sure if there was a secret requirement that players have a North American event to make top 50 that season or if the PGRv2 methodology simply thought the 50 players it placed above them were all stronger; if the latter, I suspect undervaluing Japanese depth is at fault, as both players had a majority of their losses be to strong depth players outside of the top 50.

The PGRv3

Early 2017 saw a marked improvement in the results of Japanese depth players. While the old guard players like KEN, Abadango, Komorikiri, and Ranai continued to put up top 10 or top 20 level results, the new wing stacked up strong result after strong result, with Kirihara’s top 8 finish at 2GG Civil War and victory at Frame Perfect Series 2, Tsu’s breakout 2nd place finish at Frostbite 2017, and T’s out of nowhere 3rd place at 2GG Civil War. As a result, the PGRv3 saw by far the most Japanese players of any ranking up to that point, with 15 in the top 50. A number that was very high by the standards of the time — but also very low by the standards of today, in a season where Japan unambiguously excelled. What was going on?

In my head, I thought of the PGR as steadily improving throughout Smash 4 before stagnating more or less with v5 and the Spring 2019 PGRU, and then taking a small jump forward with the Fall 2019 PGRU. I am reminded of the incorrectness of this narrative every time I look at the PGRv3. It simply wasn’t a very good ranking, and I think one central methodological flaw is at fault: attendanceposting. Let’s compare two players the PGRv3 ranked within the top 20: Ranai, ranked 17th, and Kirihara, ranked 19th. Kirihara had a 5–5 record against the PGRv3 top 10, an 8–6 record against the top 20, with a peak of winning major Frame Perfect Series 2 over ZeRo and a worst placement of 13th. Ranai, on the other hand, had a 3–8 record against the top 10, a 7–13 record against the top 20, with a peak of 9th at CEO 2017 and valleys including two 33rds and three 17ths. The big difference in Ranai’s favor? Ranai went to 15 tournaments, Kirihara went to 7. This central flaw pervaded everywhere, from Choco (who had placements of 3rd 3rd 3rd 13th at ranked events and only one loss outside the top 50, Brood) only barely making it onto the ranking at 49th, Rich Brown making top 40, Abadango making top 10 over KEN… all throughout the PGRv3, differences in ranks between players can all too often be explained better by comparing attendance between two players than by looking at the strength of the results at the events they did attend.
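The Ranai and Kirihara comparison is easier to see as win rates. The sketch below uses only the records and attendance counts quoted above; the dictionary layout and the `win_rate` helper are just for illustration and say nothing about how the PGRv3 actually scored players.

```python
# Records quoted in the text for the PGRv3 season, as (wins, losses).
records = {
    "Kirihara": {"events": 7, "vs_top10": (5, 5), "vs_top20": (8, 6)},
    "Ranai": {"events": 15, "vs_top10": (3, 8), "vs_top20": (7, 13)},
}

def win_rate(record):
    """Fraction of sets won, given a (wins, losses) pair."""
    wins, losses = record
    return wins / (wins + losses)

for name, data in records.items():
    print(
        f"{name}: {data['events']} events, "
        f"{win_rate(data['vs_top10']):.0%} vs top 10, "
        f"{win_rate(data['vs_top20']):.0%} vs top 20"
    )
```

Kirihara is ahead on both rates while attending fewer than half as many events, which is the gap the ranking papered over.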

This, of course, systemically underrated Japan by nature of being a ranking that evaluated 44 tournaments in the United States but merely 12 in Japan; American players simply had far more opportunities to build up the attendance necessary to get ranked high. A secondary contributor to Japan’s relatively low ranking compared to what you’d expect by modern standards is non-retroactivity: Japanese players scored very strong results throughout the season, but their tournaments were only evaluated by the standards of their much weaker PGRv2 results. This meant that Japan’s only major for the season was Umebura Japan Major, with a further 4 B-tiers. The retroactive tiering system for ΩRank for that season, by contrast, gives them 3 majors (points >= 1800) and a further 5 B-tiers (points >= 900). The result is that on ΩRank’s system, not only are there two additional Japanese players in the top 50 (and solidly so) in low attendance players Ron and Brood, but nearly every top 50 Japanese player on the PGRv3 is higher, drastically so in many cases, such as Choco (49th -> 29th), Edge (47th -> 34th), Shuton (33rd -> 19th), Kirihara (19th -> 11th), and KEN (16th -> 8th). And this is on an algorithm that I abandoned at the start of this year for attendanceposting too hard at the top level!
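Here is a rough sketch of why retroactivity matters for tiering. Only the two thresholds (1800 for a major, 900 for a B-tier) come from the text. Everything else is an assumption made up for illustration: the player names, the player values, and the idea that an event's value is simply the sum of its attendees' values. It is not the actual ΩRank or PGR tiering formula.

```python
MAJOR_THRESHOLD = 1800   # "points >= 1800" from the text
B_TIER_THRESHOLD = 900   # "points >= 900" from the text

def tier(points):
    """Map an event's point value to a tier using the two quoted thresholds."""
    if points >= MAJOR_THRESHOLD:
        return "major"
    if points >= B_TIER_THRESHOLD:
        return "B-tier"
    return "below B-tier"

def event_points(attendees, player_values):
    """Hypothetical valuation: sum each attendee's value (0 if unranked)."""
    return sum(player_values.get(player, 0) for player in attendees)

# Made-up values for the same bracket, judged two ways.
previous_season_values = {"Player A": 500, "Player B": 300}
current_season_values = {
    "Player A": 700, "Player B": 500, "Player C": 400, "Player D": 300,
}
attendees = ["Player A", "Player B", "Player C", "Player D"]

non_retroactive = event_points(attendees, previous_season_values)
retroactive = event_points(attendees, current_season_values)

print("Valued by last season's results:", non_retroactive, tier(non_retroactive))
print("Valued by this season's results:", retroactive, tier(retroactive))
```

In this toy case the same bracket goes from not even a B-tier to a major once the attendees are valued by what they did during the season being ranked, which is the kind of shift a region with fast-improving depth would see.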

(As an aside, a big factor in Ron, the highest ranked player on the PGR 100 never to make top 50 on an individual Smash 4 PGR, missing PGR this season is the exclusion of Hirosuma Revolution from the list of qualified tournaments. Of course on ΩRank for the season he manages to make top 50 even without it; with it he could have an argument for top 30.)

Ron, my beloved.

The PGRv4

Yeah Japan just gets screwed here. The PGRv4 algorithm was dramatically improved from v3; placements declined greatly in importance, with raw placement giving way to who you outplaced or were outplaced by, there was a single overall score used by the algorithm rather than separate placement and win scores, and attendanceposting was greatly reduced. The ranking also had a sharp reduction in the number of Japanese players in the top 50, from 15 down to only 9. And unlike the prior two seasons, ΩRank run on this season doesn’t have any differences in which Japanese players make top 50, with Kameme only barely making it on. What happened?

It’s a combination of things. Players like HIKARU and T, who had incredible runs at US majors in the first half of the year, couldn’t replicate the magic. Earth, a Japanese mainstay, fell off dramatically. 9B and Ranai, consistently top 25 in the world level players, retired from competition after August’s Dubai Dojo 2 to focus on their new job, one we would eventually learn was playtesting Super Smash Bros. Ultimate. The biggest factor was simply down to event volume, however. The PGRv4 was Smash 4 at its most frenetic, with ten S-tier events and 4 more A-tiers, averaging out to a major every two weeks. All 14 of these majors were held in the United States, the most US-centric a season of smash has ever been. Against this torrent Japan was able to offer up a mere six C-tiers and 3 B-tiers, plus the de-facto Japanese C-tier Dubai Dojo 2. Dudes just got swamped. I’m curious what the LumiRank algorithm outputs for this season since it’s average based rather than additive, but that algorithm was designed for a substantially larger load of data than the PGR used and there’s no easy way to add additional Smash 4 data beyond what the PGR used to a database at present.

The PGRv5

Burnout from the crowded, frenetic PGRv4 season and dissatisfaction with the Smash 4 metagame led to a serious decline in top player attendance in North America in 2018, leading to a large decline in the number of US majors on the PGRv5. Meanwhile, Japan increased its event output compared to the season prior and once again put up strong results in the US, though the decline of 2GG from 2017 meant that their trips were more limited in number. As a result, Japan modestly increased their representation compared to the PGR v4, going from 9 representatives to 11. I believe that this still was substantially underrepresenting them, primarily due to nonretroactivity and a low baseline TTS value due to their low numbers on the PGRv4. Unfortunately, I deleted the version of ΩRank that was restricted to just the PGRv5 season in favor of a full year 2018 version a year ago, so a direct comparison between the PGRv5 and ΩRank’s output is impossible, but I remember that players like HIKARU, Towa/Atelier, and Masashi, unranked by the PGR, were all within the top 50, while players like KEN and Choco were substantially higher than where the PGR placed them. The full year version of ΩRank also places yuzu and zackray into the top 50, but I think that is primarily due to performances from after the PGRv5 season ended. Also placed in top 50 in the full year ranking are Kome, Eim, and T, but I can’t remember with these players if their strongest performances were within the PGRv5 season or not.

One notable issue that persisted through the PGRv5 with even greater prominence than in prior seasons was the Tournament Tier System’s failure to capture relevant events. With the season much less dominated by majors, a greater emphasis was placed on regionals — including on those which failed to qualify for PGR. Maister’s wins over ESAM and Tyroy at COMBO BREAKER 2018, which failed to qualify for PGR, sparked a conversation on what should count, but the list of events with multiple strong players that nonetheless failed to qualify that season included far more than just Combo Breaker. Among the notable events that should have counted but didn’t were Karisuma 19 and Rikabura 9 in Japan, Ys in Europe, Koronbasu Version Black and Push More Buttons in the Midwest, No Man’s LAN XIII in Ontario, Flatiron 3 in Colorado, and the CSL Smash Western in SoCal. The unfortunate conclusion at the time was that it wasn’t worth counting these “D-tier” events since 90% of them didn’t matter — an understandable one, given that this was in the days before Smashdata, when all of the data had to be collected and cleaned manually to create the rankings. This problem of uncounted tournaments would only get worse in the following year.

The PGR 100

Rather than do a normal top 50 for the back end of 2018, PGStats opted instead to produce a top 100 for the entirety of Smash 4’s lifespan. It was an ambitious project and one I’d like to replicate for Ultimate at some point (the only use case where I think ΩRank might still be a better fit than the LumiRank algorithm, as I think ΩRank would better handle players like Sparg0 who existed throughout the game’s lifespan but differed wildly in skill level). The choice to do an all-time ranking meant players like Jakal in the US, MagiMagi in Europe, and Yuzu and Tea in Japan didn’t get recognition for their efforts towards the close of Smash 4’s life. Also, the decision to release the ranking prior to Ultimate’s release meant that there was insufficient time to incorporate Smash 4’s final major, Umebura Smash 4 FINAL, into the ranking. I have some serious issues with the end product (Lima had absolutely zero argument to be over Mistake) but I think Japan was handled by it very well, perhaps better than any other PGR up until Panda’s implosion last year. How this was achieved is interesting, however; rather than being purely algorithmic, the PGR 100 was a blending of the algorithm’s output for PGR tiered events from Apex 2015 through November 2018, and the Japanese power rankings. This of course meant that while Japanese players with strong results who had been underrepresented on the PGR (such as Ron, Rain, Masashi, and SH) were well represented, top Mexican talent that had been underrepresented such as Wonf, Serge, and Waymas found no such relief. This was also a brute force approach to solving the problem, one which, as we’d soon see, would mean the underlying problems causing low Japanese representation on prior PGRs would go unresolved.

The Spring 2019 PGRU

The advent of Ultimate caused much the same shift in the scene as Smash 4 had: massive increases in player count and event count, and massively increased event caps in Japan. PGStats entered the new game primed to make yet another PGR, with minimal changes in the algorithm compared to where they had left off in Smash 4 (which was fine, since the PGR algorithm by late Smash 4 was quite good and the biggest flaws were in the tournament tiering system). Two major problems would hound the first Ultimate PGR, however, and they would combine to make… Tristate low‽‽ Japan low‽‽ It’s a little complicated.

The first problem was the choice of start date: Ultimate started having notable events almost immediately after its release, with Don’t Park on the Grass 2018 and the single-elimination Umebura SP 1 being the first huge ones. PGStats tried to balance the natural need for players to experiment before finding a character they jived with and the desire to create an accurate ranking for the first few months of the game’s lifespan by choosing to start the ranking period with Genesis 6, the first premier event of the game’s lifespan; this was however nearly two months after the game’s release, during which time numerous notable events occurred including Umebura SP 2, Sumabato SP 1, Karisuma SP 1, and Smash Awesome! in Japan, and Super Splat Bros, SoCal Chronicles, Midwest Mayhem Ultimate, NYXL Pop-Up!, and majors Glitch 6 and Let’s Make Moves. This choice in and of itself caused players like Gen, Larry Lurr, ZD, and Choco, who all had strong performances during this time period, to miss the PGRU.

The second problem was the tournament tier system used for Spring 2019 PGRU. The TTS had been introduced in the PGRv3 to tier tournaments based on their general and ranked entrant counts, and refined in the PGRv4 to use the higher value of general entrant points and top 50 entrant points to tier events, after which it was basically left unchanged. Overwhelmed with the increased volume of events, PGStats raised the general entrant threshold to make C-tier and qualify for the ranking from 150 entrants to 200 entrants. More impactfully, they temporarily eliminated ranked player values from the TTS until the end of the first season, in spite of the fact that more than half the PGRv5 would go on to make the first PGRU and many more would come close. The result is that not only were some events, such as Collision 2019, dramatically undertiered (with Collision being evaluated as a C-tier when it had the talent of a major), but many more impactful events were missed entirely. In Japan, this list includes events like Kurobra 15, where Abadango scored a win on Zackray; the TrueGaming Invitational (another de-facto Japanese Middle Eastern event), where Lea scored wins on Shuton and Tea; and Shulla-bra SP 4, where Shuton lost to Ke-ya and Munekin to place 9th.
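A toy model of those two TTS changes shows how a small but stacked event falls through the cracks. Only three things here come from the text: events are tiered on the higher of their general entrant score and their ranked entrant score, the entrant threshold went from 150 to 200, and ranked values were switched off for the first season. The one-point-per-entrant scale, the assumption that ranked points share that scale, and the example event are all invented for illustration; this is not the real PGStats formula.

```python
def event_score(entrants, ranked_points, use_ranked_values=True):
    """Higher of the general-entrant score and the ranked-entrant score.

    Assumption: one point per entrant, with ranked points on the same scale.
    """
    general_points = entrants
    if not use_ranked_values:
        return general_points
    return max(general_points, ranked_points)

def qualifies(entrants, ranked_points, threshold, use_ranked_values=True):
    """True if the event clears the minimum score needed to be counted."""
    return event_score(entrants, ranked_points, use_ranked_values) >= threshold

# Hypothetical event: a small bracket stacked with ranked players.
stacked_regional = {"entrants": 120, "ranked_points": 400}

# Late Smash 4 style: 150 threshold, ranked values count.
pgrv5_style = qualifies(**stacked_regional, threshold=150)

# Spring 2019 style: 200 threshold, ranked values switched off.
spring_2019_style = qualifies(
    **stacked_regional, threshold=200, use_ranked_values=False
)

print("Counts under the late Smash 4 rules:", pgrv5_style)
print("Counts under the Spring 2019 rules:", spring_2019_style)
```

Under these assumptions the same bracket counts comfortably under the older rules and is dropped entirely under the Spring 2019 ones, purely because its talent no longer registers.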

Most egregiously, EGS Cup was not counted for the PGR. This was a tournament featuring MkLeo, KEN, Abadango, Umeki, and Zackray, and a host of Japanese depth talent like HIKARU, yuzu, Shky, Etsuji, Shogun, Kuro, Huto, YOC, and chicken. A retroactive retiering of it at the end of the year would have given it a value of 816, placing it solidly in B-tier. It is possibly, to this day, the best run of KEN’s career: after losing his round 2 to Pokemon Trainer player supa, he won 11 sets in losers, including over YOC, Huto, Etsuji, Kuro, Zackray, and Abadango, before finally becoming the first (and thus far only) player in Ultimate’s history to defeat MkLeo in grands from Losers side.

That run did not count for PGR.

Imagine barely beating YOC for 33rd, winning 5 more sets in losers to make top 8, then doing this once you get to top 8, then you find out none of it counted towards the rankings.

The exclusion of EGS Cup from the Spring 2019 PGRU had a very easy to state effect: it meant that instead of KEN being likely top 30 minimum and possibly top 20, he was unranked.

This means that Japan overall was low, and if everything from late December onward that should have counted had counted, they’d have been a lot higher, right?

As it turns out, the answer depends on the algorithm used. For as many events as Japan had that weren’t counted, the US had more.

A lot more.

Remember that event where Gen beat Light? No? Well it happened.

Or the one where Suarez sent Light to losers to get into Grands winners side? No? Well it happened.

Oh hey, did you know that Venia beat Mr. E at that same event and also beat Rivers at Let’s Make Moves, then didn’t go to anything for the actual PGR season? That’s a thing that happened.

Oh, I forgot that 8BitMan (Area 51 on the actual PGRU) had a Cosmos win at Glitch 6 that didn’t count for PGR. Also a Salem win at a 111 entrant regional that wasn’t counted.

Oh right, one of Cosmos’ best events, Saints Gaming Live, the one where he got a Marss win to place 2nd, wasn’t counted for PGR.

On top of that, Japan wasn’t working off of a very high baseline. Shuton, Zackray, Tea, ProtoBanham, and Lea all performed very well in the US, but some of their other players less so. Nietono, on his sole US appearance that season, got 49th, losing to Dark Wizzy in winners and Glutonny in losers with a best win of Charliedaking. Kameme won two regional US events he traveled to, but lost exclusively to low top 50 and non-top 50 players when he did lose stateside: to Ned and Goblin at Frostbite 2019, and to Fatality and Peabnut at Just Roll With It! 11.

So as it turns out, ΩRank on the PGRU data disliked Japan even more than the PGRU itself did. And then running it with the expanded dataset only made the problem worse.

So I developed an easy solution: since Japan’s low because they have fewer events, make everything count for 60% more. (The only other season I had to do something like this with for ΩRank was 2021.)

And it fixed the problem!

It’s also a really bad way to do rankings, do not recommend. Issues like this were why I eventually moved away from additive algorithms this year into an averaging algorithm for LumiRank.
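To show why event volume swamps an additive algorithm and why a flat multiplier is such a blunt fix, here is a toy comparison. The per-event scores and event counts are made up, and neither function is the real ΩRank or LumiRank algorithm; the only number taken from the text is the 60% bonus.

```python
from statistics import mean

# Hypothetical per-event scores. The second player does better per event
# but lives in a region with far fewer counted events.
many_events_player = [100] * 14
few_events_player = [120] * 6

def additive_score(scores, multiplier=1.0):
    """Additive ranking: total points, optionally scaled by a regional bonus."""
    return sum(scores) * multiplier

def average_score(scores):
    """Averaging ranking: points per event, independent of event count."""
    return mean(scores)

print("Additive:", additive_score(many_events_player), additive_score(few_events_player))
print("Additive, 60% bonus for the second player:", additive_score(few_events_player, 1.6))
print("Average:", average_score(many_events_player), average_score(few_events_player))
```

In the additive version the player with more events wins on volume alone, and in this toy case even the 60% bonus doesn't close the gap; the size of bonus you need depends entirely on how lopsided the event counts are that season, which is why it has to be hand-tuned. Averaging sidesteps the question, at the cost of needing enough data per player for the average to mean something.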

Speaking of LumiRank, it’s been months since I last ran its algorithm on Spring 2019, and I’ve made dozens if not hundreds of changes to it since to improve it, but this is what it outputted the last time I ran it on the expanded spring 2019 dataset. A lot of weird stuff to be sure that later changes to the algorithm probably would fix (hello Peabnut, hello Venia), but it appropriately ranks Japan without any regional multipliers or other special treatment, when the PGRU, ΩRank, and even the first Ultimate OrionRank all failed.

The Fall 2019 PGRU

What’s remarkable about the difference between Japan’s 12 player appearance on the Spring 2019 PGRU and their 20 player appearance on the Fall 2019 PGRU is how they managed to accomplish it. On the surface, a lot of their accomplishments were pretty similar. They won a big supermajor in the states (Shuton’s Prime Saga and Zackray’s Big House), they had a contingent of players who repeatedly traveled to the states and did pretty well (Abadango, Raito, Umeki, and Tsu in the Spring and Kome, Kameme, Raito, and Tea in the fall). Once again they had several notable tournaments fail to qualify for PGRU, this time being events like Waseda Festival 2019, POP-OFF, and Seibugeki 3. So what changed?

Two main things:

  1. On the Spring 2019 PGRU, they had 1 S-tier, 1 A-tier, and 2 B-tiers. On the Fall 2019 PGRU, they had 4 A-tiers and 6 B-tiers.
  2. On the Spring 2019 PGRU, they had six players place in top 16 of a US major. On the Fall 2019 one, they had twelve.

Their increased major volume kept them competitive against a US that was nearly as frenetic as the PGRv4 season of Smash 4, while depth player after depth player came stateside and stacked up very well against their overseas counterparts. Those players who came over and got shut out still managed to put up excellent performances back home, with Etsuji, Umeki, Kuro, and KEN making up for subpar or worse performances stateside with phenomenal performances back home.

In fact, the only Japanese players I can really say got screwed over by the PGRU were Choco (whose 7–7 record against the top 50 and 4–4 record against the top 20 with a worst loss of Shky should probably have been good for top 20, not merely top 30) and Brood (whose loss record was actually really good that season even if all the wins came from that one major he somehow got 2nd at). Really if there’s one region that got screwed by the Fall 2019 PGRU, it’s Europe; quiK’s record against top 50 players was excellent and his record against EU depth was also commendable, yet the PGRU not only didn’t place him top 50, he didn’t even come close. Leffen also was snubbed, partly due to Dreamhack Winter 2019 being two entrants short of qualifying. ΩRank used on an expanded dataset for this season fixes all of these problems. The months-old version of the LumiRank algorithm also fixes these problems, though I think it creates a bunch of new ones in return. I think you can probably get a good idea of why I proceeded to make dozens if not hundreds of changes to the algorithm after running these tests.

Of course, the ensuing backlash to the PGRU’s release, all the “Ron > Leffen lol bad ranking” memes, tristate mad that most of their players fell off, etc. caused a re-evaluation of the PGRU’s methodology that eventually led to the decision to change subsequent PGRUs to a panel. Covid meant that we never saw what that would have looked like in 2020, and PGStats’ failure to produce a global ranking from 2020 until when they imploded meant that community attention shifted onto an entirely different ranking.

The Ranking Eclipsed

OrionRank, the work of BarnardsLoop and EazyFreezie, had been around since early 2017. Their basic methodology remained relatively consistent throughout, even if adjustments were made to the details: tier events based on a combination of general entrant counts, ranked players, unranked players who had good placings, and power ranking values; assign players points based on their average placements at these events; then evaluate wins and losses based on these placement points to determine the overall order for the ranking. In Smash 4, OrionRank generally was somewhat more favorable to Japan than the PGR was, and was much more favorable towards Europe. At the same time, its intraregional ordering would often appear very strange, with consistent placements being rewarded over a player’s wins and losses. OrionRank generally received relatively little attention compared to the PGR. In Ultimate, OrionRank became somewhat more unfavorable towards Japan than the PGRU algorithm was, while still being much more favorable towards Europe. OrionRank’s more pessimistic outlook towards Japan and more positive outlook towards North America for its full year 2019 edition led some disgruntled Americans, annoyed at the perceived lack of North American representation on the PGRU (and, more justifiably, Europeans, for similar reasons), to promote it in early 2020, but the PGRU retained the bulk of the spotlight. After COVID killed the 2020 ranking season, OrionRank released a pre-quarantine Top 100, starting a trend of global rankings released without PGR contemporaries that would last through the end of 2022. It was OrionRank Ultimate: Eclipse, however, that would, for better or for worse, cement OrionRank as the premier ranking system in the community, covering late 2020 events through the end of 2021.

I want so desperately not to find things to criticize here. Unfortunately, uh, there’s a lot.

Some things are easy to explain. Gackt is not top 50 because his best tournament, Mjolner 1, was erroneously excluded (the OrionRank team thought it was a local). Quite a few players are low because EPI 2 was excluded on account of being an invitational (Etsuji, Gackt, Akakikusu, and ProtoBanham all among them). KEN, Kome, and Zackray are perhaps higher than expected on account of 2020 events for them like Kagaribi 1 and Mēsuma.

This ranking has always been bizarre to me because it has the players you’d expect in top 12, then KEN at 16th, then it gets weird. Atelier outside of top 20 isn’t too surprising in a vacuum, but it is very surprising he was 11–12 vs the players above him. Akakikusu has solid losses and wins on every single Japanese player above him but places only at 30th. Then there’s just… no Japanese players in the 31–40 range. Then Hero is 41st. 41st! For the guy with no bad losses, and KEN Yoshidora ProtoBanham wins at majors! I just don’t understand how you can have a ranking claim that there are 5 Japanese players in the top 12, and 2 in the 31–50 range. I’ve got two main explanations for what’s happening here:

  1. Japan didn’t have a whole lot of events (America didn’t either, but they had more proportionally)
  2. Japan’s placements were pretty mixed

OrionRank included some late 2020 events that add to its total, but still excludes the EPI 1 invitational from late 2020, and also excludes EPI 2 as well as Mjolner 1, making it still fewer events than expected.

The rankings outside of Japan are also quite strange: Glutonny is top 10 on OrionRank Eclipse without a single win on other top 10 players. Goblin is bizarrely low after what was unequivocally the best season of his career. Aaron top 50 doesn’t seem odd at first, then you realize this is excluding the summit where he beat Tweek, Riddles, and Cosmos, meaning his only top 50 wins were on Lima, Fatality, and Ned, with a really bad Riptide and Mainstage. Ouch!? 47th was very funny and got all the attention, but that had an easy explanation: the algorithm really liked it when you won things, even if the events you were winning were very small. That only diverted attention from all the somewhat less strange things that were much harder to explain.

What this playercard hides, of course, is that Big D was yet to be considered a notable win, a notable win that Ouch has, without fail, continued to farm throughout the past few years

I had planned to make a big production out of an ΩRank release for this season, following a slew of changes I had made to the algorithm. Of course, that big release never happened, because I really should have learned before this year that doing all the work for a ranking release without a whole team helping out is a bad idea. One problem I ran into when adjusting the algorithm for the season was simple: Japan did kinda meh at the SWT 2021, and since that was the main crossover event, it led to a huge chain devaluing of Japan across the board, compounded by Japan’s low event volume. The quick and dirty fix here, of course, was to simply make Japanese events give 30% more points than their US counterparts; not doing this meant that things spiraled such that Tea would end up over ProtoBanham and Hero would end up outside the top 50 (there’s a reason I eventually abandoned this algorithm!).
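For a sense of what that kind of fix does, here is a minimal sketch of a regional multiplier on an additive point total. Only the 30% figure comes from the paragraph above; the `REGION_MULTIPLIER` table, the `season_points` helper, and the point values are made up for illustration, and this is not ΩRank's actual code.

```python
# Hypothetical illustration: additive scoring with a regional multiplier.
# 1.3 is the 30% bump described above; everything else here is assumed.
REGION_MULTIPLIER = {"JP": 1.3, "US": 1.0}


def season_points(results):
    """Sum one player's event points, scaling each event by its region.

    `results` is a list of (region, base_points) tuples.
    Regions missing from the table count at 1.0.
    """
    return sum(base * REGION_MULTIPLIER.get(region, 1.0) for region, base in results)


# Two made-up players: same quality of results, different event volume.
jp_player = [("JP", 100), ("JP", 80)]
us_player = [("US", 100), ("US", 80), ("US", 40)]

print(season_points(jp_player), season_points(us_player))
```

With these invented numbers the two-event player totals about 234 against 220 for the three-event player; with the multiplier set to 1.0 it would be 180 against 220. That is the whole trick: the bump papers over lower event volume rather than modeling it.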

Here’s what ΩRank outputs for that season. Around the middle of last year I went and made a bunch of additional changes to the algo; it’s that version you’ve seen throughout this article previously. Here’s what that version outputs for 2021 (wow it’s so different!). Lastly, here’s what the months-old version of the LumiRank algorithm seen in the last two sections outputs for 2021. It’s surprisingly coherent compared to what this version outputted for 2019; the only things I’m particularly confused by are why Lea is so low compared to other Japanese players and why there are so many Europeans in the lower end of the top 100. Actually, looking at Lea’s data now, I think I know why he’s low: it’s a problem stemming from good losses but a lack of wins at events, one that should be handled significantly better now.

Summer 2022: The Ranking Wars

Between June 29th, 2022 and August 26th, 2022, seven different Smash top 50s or top 100s were released covering four different time periods. There was OrionRank Mid-year 2022, a ranking that was, like OrionRank Eclipse, very strange, but in very different ways. There was the EchoRank 2022 half-year report; EchoRank was also very strange, but with its highly experimental methodology it had always been strange. There was ΩRank Summer 2022, the only time I managed to both start and finish a release for ΩRank over its duration. There was the RaccoonStats Rank, a global top 100 made by a panel of stats nerds and stats nerds alone. Then there were the three regional PGRU v3 top 50s, each made by a panel primarily composed of people who were definitely not stats nerds.

OrionRank released first and got by far the most attention. Like Eclipse, in spite of a plentiful Japan presence in the top 30 (or top 31), there was almost a total lack of Japanese players outside of that, with a gap of 16 players between Kameme and DIO and another 12 between DIO and Kome. Like Eclipse, the ordering of US players continued to confuse. Goblin, who went from having Tweek, Dabuz, Chag (forma de top 20), and Sonix wins in 2021 to a singular Tea win and not much else, rose 20 spots in the rankings. Ned was also very high, a placement that seems difficult to explain when every single player he beat ranked underneath him. ProtoBanham seems low, until you realize this ranking released before Double Down 2022.

EchoRank’s half-year report, covering essentially the same time-span as OrionRank except without some very end of 2021 stuff, released to… mostly question marks. Though not as odd as the 2021 ranking that actually included both all of 2020 and 2021 (and that controversially ranked Nairo in the 1.5th position), there were still some serious questions around the high rankings of DDee (14th), Ouch!? (13th), and Monte (27th), and the conversely low ranking of Yoshidora (26th). The Ouch!? and Monte rankings would normalize in the back half of the year (to 34th and 71st respectively), though DDee would still remain high (19th). Yoshidora was an interesting case: he had traded 3–4 sets against acola (EchoRank #3). However, EchoRank was a time-linear algorithm whose head-to-head evaluation formula wasn’t directly tied to ranking position. As a result, acola was actually below Yoshidora on the head-to-head evaluation formula when they fought, causing Yoshidora to be at a net loss for trading 3–4 against the #3 player in the world at the time. EchoRank was also relatively low on Japanese depth, ultimately ranking the same number of Japanese players within the top 50 (11) as OrionRank in the same timespan.
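The Yoshidora case is easier to see with numbers. EchoRank's actual head-to-head formula isn't described here, so the sketch below swaps in a plain Elo expected-score calculation as a stand-in, freezes the opponent's rating at the time the sets were played, and uses invented ratings. It only illustrates the general effect: the same 3–4 record is a net loss against someone the system currently rates below you and a net gain against someone it rates above you.

```python
# Stand-in illustration using standard Elo, NOT EchoRank's real formula.
# All ratings below are invented.

def expected_score(rating, opp_rating):
    """Standard Elo expected score for a single set."""
    return 1 / (1 + 10 ** ((opp_rating - rating) / 400))


def net_change(rating, opp_rating, wins, losses, k=32):
    """Total rating change over a run of sets.

    Simplifying assumption: both ratings stay frozen at their values
    from when the sets were played, so later results never revise them.
    """
    sets_played = wins + losses
    return k * (wins - sets_played * expected_score(rating, opp_rating))


# Opponent rated BELOW the player when the sets happened: 3-4 costs points.
print(net_change(1900, 1800, wins=3, losses=4))

# Same record, but evaluated against where the opponent ended up: 3-4 gains points.
print(net_change(1900, 2100, wins=3, losses=4))
```

A system that re-evaluated past sets against final standings would land on the second case; a strictly time-linear one is stuck with the first.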

ΩRank was by a wide margin the most favorable towards Japan of the three algorithms, placing 21 within the top 50 (albeit with a minor multiplier on Japanese events). Unexpectedly, it was also the most favorable of the three algorithms towards Europe, in stark contrast to its behavior the year prior (where Oryon was the only European other than Glutonny to breach the top 100). Feedback was generally positive, with Bloom4Eva’s top 20 ranking being the biggest point of contention, along with the decision to rank Zackray off of only one tournament. Substantial changes made after release to eliminate regional multipliers from the calculation and alter the calculation of outplacements would have actually increased the Japanese representation, boosting Sigma into the top 50 and Yoshidora into the top 10, while lowering Europe. Although ΩRank received less attention than OrionRank overall, it received a particularly large amount of attention from Japan, especially for the top 10.

RaccRank’s largely Japan favorable panel actually ranked fewer JP players in the top 50 than ΩRank, placing Repo in there but (understandably) not Etsuji or Zackray, who had two and one tournaments in the ranking period respectively. More in line with expectations, they were substantially lower on Europe than the algorithms were, placing Raflow and quiK outside the top 50. RaccRank got the same reception it’s likely always going to get: some people think it’s way too high on Japan and way too low on Europe, some people are surprised by unexpected placements but see the vision, most don’t pay enough attention to form any opinion whatsoever.

I don’t think panels of non-stats nerds work well for Ultimate; there’s just too much data, too many good players for most people to keep track of it all. I don’t think there was any universe where the PGRU v3 panel (at least in North America) was going to output a good ranking; there were too many guys on there who were always going to rank based on vibes rather than numbers. Having said all that, I think the panel was set up to fail. I understand why the front third of 2022 was chopped out of the season — the Omicron flare up was a real scare, even if by the time Glitch -Infinite- happened it had died down. On the other hand, you will never convince me that the reason CEO 2022 (let alone GOML and Double Down) was excluded from the season wasn’t so that the entire Panda Cup would be in one season. This led to an absurdly short season where players would get top 50 off of precious little data. The lack of a TTS and an official dataset made it all too easy for panelists who would have looked at numbers were they provided some to fall back on the vibes and the easily digestible numbers, leading to things like Myran ranking abnormally high off a very empty 5th at Genesis. [Editors Note: The Ikan disrespect??? Stuart, I thought you were trying to make amends with the Midwest…]

I understand all that, the reasoning behind it, even if some of the reasoning is badly motivated.

What I don’t understand, what I cannot fathom the reasoning for, is why the Japanese PGRUv3 was changed from an ordered list to an unordered “50 players to watch”.

It wasn’t what TOs requested; they made no such requests. It isn’t what panelists were expecting; they thought they were submitting ballots for an ordered top 50. The ordered top 50 does in fact exist, hidden away; I’ve seen it, it’s kinda weird but probably more coherent than the North American list. I don’t understand why it’s consigned to obscurity, Japan getting an even more scuffed version of an already scuffed ranking by virtue of it not even being a ranking. I don’t think it gave any of their players substantial visibility; the list was overlooked, forgotten, consigned to the dustbin of Smash history to which would be added far more than anyone anticipated by the end of the year.

The Blow-Up

I don’t need to go over the events of November 2022 here; they’ve been covered to death. For the Japanese smash scene, the blow was devastating, as they were set to make up a full 11/32 spots in the finals, including poorly traveled players like Yoshidora and Nietono. The loss of the circuit also tanked the motivation of players like ProtoBanham, whose attendance has dropped off drastically in 2023 as a result. As far as rankings go, the effects were more interesting. The plan up until the blow-up was for the PGRUv4 to be a single global panel ranking made by panelists from around the world, covering a time span from CEO 2022 through the Panda Cup. In the aftermath of Panda’s dissolution, a discussion and vote were held in the panelist discord to determine the future of the rankings. The vote was relatively close, with a majority of voting NA panelists voting to keep the ranking panel-based, EU fairly evenly split, and Japan voting in favor of switching to an algorithm. After the vote concluded, PracticalTAS, designer of the PGR algorithm and then the man in charge of the PGRU balloting process, handed off the reins for the “official” rankings to BarnardsLoop, setting the stage for the formation of a new official ranking. First, however, we should go over the rankings already in the works before this.

The Ranking that wasn’t: ΩRank 2022

It seems paradoxical to say that Japan had a worse second half of 2022 than they had in the first. Japanese players performed well at the Ludwig Smash Invitational, very well at Scuffed World Tour, unexpectedly got to grands at Glitch — Regen, and won Let’s Make Moves Miami and Apex 2022 outright. Despite this they went from having 22 players in top 50 (20 after subtracting players with less than three tournaments) on the Spring 2022 ΩRank to 19 players (18 after subtracting players with less than three tournaments) on the full year one. The switch from half year to full year wasn’t even planned out in advance; it was rather made as a response to relatively few Japanese tournaments in the second half of the year portending drastically skewed regional balance despite small differences in how the Japanese players were actually performing in one half of the year vs the other. Even so, Japanese representation went down, European representation went down slightly, and US representation went up. Comparing OrionRank and ΩRank 2022, ΩRank was drastically higher on Japan in the first half of the year, yet in the full year actually had fewer Japanese players in top 50, although it still had the edge in the top 100. It also had somewhat higher European representation, a stark contrast with expectations for OrionRank. This sensitivity to event volume continuing to plague ΩRank contributed to my decision to move away from that algorithm last year, and experiments earlier this year that accidentally led to the inverse problem — runaway Japan inflation, to the point where an algorithm would give Japan 30 spots in the top 50 last year — led me to abandon that very direction of algorithm as a whole.

Attentive (and less attentive) readers might notice that ΩRank 2022 never actually finished releasing last year. They might also notice that this article is now over 7400 words and there’s still probably at least a thousand left before it gets to the end — these two things are related! I like being comprehensive and covering everything there is to cover, preemptively answering questions people might have. The problem of course is that writing that does this takes time, and it’s now been over 12 hours since I started this article.

[EDITORS NOTE: He finished this while boarding. He DM’d me the final paragraph of this outside of Google Docs.]

I meant to get a head start on writing the ΩRank blurbs ahead of time in the weekend leading up to the release, but instead I spent that time correcting two errors I noticed in the algorithm that moved a couple players around slightly. In hindsight I should have just released the ranking as it was rather than doing that — the changes didn’t affect much, and there was a more impactful error that I didn’t catch (one that lowered Skittles by several spots due to a dq being reported as an 0–2) that made its way onto the ranking anyway. Regardless, I fell behind schedule, then I fell behind to the point where I could never catch back up. I posted all of the remaining playercards Cloudhead made for the project in the Ultimate Stats discord a couple months ago; searching for “from:stuart98 miya playercard” should pull them up.

[EDITORS NOTE: He wasn’t kidding, this worked. Don’t ask about the names, unless you want to learn about who “peranbox” is]

The Last Hurrah: OrionRank 2022

It’s a matter of no small irony that what was by far the best version of the OrionRank algorithm was produced just prior to the algorithm being retired altogether. Gone was the bizarre region composition that had dogged prior versions of the ranking, gone were the “list of players, ostensibly ordered” moments. There were oddities still, to be sure, but the number and severity were much reduced. OrionRank went from placing just 11 Japanese players in top 50 in the mid-year release to a full 19.

And the production! EazyFreezie pulled out all the stops, creating not only by far the highest quality playercards of any OrionRank release yet, but also producing expertly edited videos, the likes of which hadn’t really been seen since the pre-quarantine PGRs.

It’s pretty nuts how crazy dramatic the Byleth reveal is.

Nonetheless, the ranking wasn’t perfect. Vastly improved over prior iterations, but imperfect; most of the players who felt high (such as Anathema, Ned, and Fatality) were still from the US and most of those who felt low (like HIKARU, Yaura, Paseriman, and Shirayuki) were still from Japan, but the magnitude of both effects was far smaller than in prior iterations. If OrionRank had continued on like this into the future, then it would have been no disservice to the community. Instead, decisions were being made that would render OrionRank unnecessary.

Roads taken and not taken: UltRank 2022

The vote of the ranking panelists meant that the four Ultimate ranking admins (me, kenniky, EazyFreezie, and BarnardsLoop), all busy working on releases for our own ranking systems, also had to put together, in a very short amount of time, a new algorithmic ranking that would hopefully serve as the primary one going forward, even if it wasn’t taken as the primary one for 2022. The logical solution to the quandary was to simply reuse the algorithms and infrastructure we had already developed: RankRank.

[EDITORS NOTE: I don’t hate the RankRank methodology. It’s kind of like a panel, for algorithms]

Averaging our existing algorithms is something we could do simply by pasting numbers into a spreadsheet that already existed, with little time needed to do so. More impactful than our decision to use the existing RankRank infrastructure was our decision to abide by the planned PGRU v4 ranking period. We wanted to keep consistent with the expectations the PGRU had already set for the season. In hindsight, I think this was clearly a mistake for two reasons. The ideal wasn’t worth keeping; OrionRank had already largely displaced the PGRU in legitimacy and it had a full year timeline. And then there was the result of this decision. Japan only had one supermajor in the latter half of the year, Maesuma TOP 10, compared to 3 in the first half, and they had fewer majors as well. The loss of many Japanese players’ best events (but relatively few strong events for American players) was not a huge loss for Japan on an attendance-agnostic algorithm like EchoRank. However, EchoRank in 2022 was strongly disfavorable to Japan compared to ΩRank and OrionRank. Meanwhile, Japanese players fell sharply on the additive algorithms of Orion and Ω. The result is that while the full year rankings for ΩRank, OrionRank, and EchoRank respectively gave Japan 36, 34, and 23 players in the top 100, UltRank 2022 featured only 26 in the top 100, with Yoshidora and Hero in particular experiencing sharp declines from their full year positions. To be honest, the reception to the ranking was better than I expected, and that reception was already decidedly mixed.
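As a rough picture of what "averaging our existing algorithms" can look like, here is a small sketch that orders players by their mean position across several ranked lists. Whether RankRank averaged positions, scores, or something else isn't stated above, so treat the choice of mean position as an assumption; the `rank_rank` function and the placeholder player names are likewise invented.

```python
# Hypothetical sketch: combine several rankings by mean position.
# Assumes positions (not raw scores) are what gets averaged.
from statistics import mean


def rank_rank(rankings):
    """Order players by their average position across ranked lists.

    `rankings` is a list of lists, each ordered best to worst.
    Only players present on every list are kept; ties break by name.
    """
    common = set.intersection(*(set(r) for r in rankings))
    avg_pos = {p: mean(r.index(p) + 1 for r in rankings) for p in common}
    return sorted(common, key=lambda p: (avg_pos[p], p))


# Three made-up algorithm outputs over placeholder players.
algo_outputs = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
print(rank_rank(algo_outputs))
```

An average like this inherits whatever the inputs share: if two of three algorithms are additive and the ranking period cuts out a region's biggest events, the combined list moves with them.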

The pitfalls of the half year ranking period sharply imbalancing the regional breakdown led to the decision to adopt the half-year check in/full year release format going forward. It also contributed to the move away from additive algorithms towards an averaging one. The RankRank methodology was never intended to be more than a temporary solution owing to the slapdash nature of UltRank 2022. With the 2022 ranking finished, we moved on to planning out the 2023 ranking, and it was quickly decided that we would switch to using only one algorithm for 2023. With this decision, OrionRank and ΩRank ceased to have a purpose, and the decision was made to fold them into UltRank. Algorithm tests were done through late May, at which point it was decided that what would become the LumiRank algorithm was what was worth focusing our efforts towards.
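The difference between additive and averaging scoring is simple to show in miniature. The sketch below is a toy comparison with invented per-event scores; it is not the OrionRank, ΩRank, or LumiRank formula, none of which are specified here, and the two function names are hypothetical.

```python
# Toy comparison of additive vs. averaging scoring. All numbers are invented.

def additive_score(event_scores):
    """Additive: every event attended adds to the total."""
    return sum(event_scores)


def averaging_score(event_scores):
    """Averaging: the total is normalized by how many events were attended."""
    return sum(event_scores) / len(event_scores)


# Same per-event performance, but one player's region ran twice as many events.
few_events = [90, 85, 80]
many_events = [90, 85, 80, 90, 85, 80]

print(additive_score(few_events), additive_score(many_events))
print(averaging_score(few_events), averaging_score(many_events))
```

Under the additive rule the second player scores double for identical performance, which is the event-volume sensitivity described above; under the averaging rule the two come out level. A plain mean has its own problems (it can reward attending very little), so a real algorithm would need more than this.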

Things done now not done then

The LumiRank algorithm has the lessons of 7 years of ranking algorithms to learn from. With nearly 1,000 events evaluated — compared to the mere 22 on the first PGR — we now have a more accurate picture of the competitive scene than ever before. Ironically, many measures taken that would have in the past prevented Japanese underrepresentation on the rankings now prevent their overrepresentation; the competitive landscape in 2023 is incomparable to any previous time in smash’s history, with the decline of many US major series while new majors and superregionals are appearing in Japan every month leading to them forming a majority of major events for the first time in Smash history. As such, while prior ranking algorithms would likely have led to Japan capturing significantly more than an outright majority on the ranking, LumiRank reflects their strength this season — without crowding out the rest of the world. As the Smash scene continues to evolve, LumiRank will continue to adapt to meet the needs of the scene.

And that’s all he wrote me. Doesn’t this guy have a Dandori Battle bracket to prepare for? Congrats on making it down here! I can’t believe I let my standards for guest columns be Kenniky writing about Klang for three pages. Stuart has some really unique insight, and I’m excited to see where the LumiRank stuff goes from here. Thanks for reading!
