Friday, November 20, 2015

Influence of region, skill and patch on predictability in League of Legends

Last month, I used random forests to predict the winner of League of Legends games. Since then I have downloaded datasets from different regions, different ELOs, and on different patches, and checked how these factors influence predictability. This post will summarize the results, but for details of the analyses, please see this notebook.

Differences between regions

Korea is famed for its great players, and an "open mid" philosophy (in a normal game of League of Legends, you cannot surrender until 20 minutes into the game; in Korea, players accelerate this process by letting the other team win the game quickly through an "open mid"). Are we able to see those attributes in our model? As a first way to look at this, we can plot the length of games in different regions.
The EUW and NA have similar distributions of game lengths, with around 8% of games being surrendered at 20 minutes. The Korean games, on the other hand, have a few percent more games ending before 20 minutes, due to open mids, and a 14% of games ending right after 20 minutes. Since so many games are ending earlier, can we see this in the model's predictions?

Indeed we can, with games being a few percentage points more predictable for games that last 20 minutes or less. However, once the games last 25 minutes, all the regions are similarly predictable. This means that the Koreans are surrendering winnable games.

Skill and predictability

We can run the same type of analysis for skill. In the notebook I compare Bronze V games to challenger solo queue, and show that low skill games are slower, and less predictable than high skill games. Here, I will compare high skill solo queue games to master tier team ranked. The team ranked dataset is quite a bit smaller than the solo queue data (1,000 games vs 30,000), but in my tuning, I've found there is not a big difference in accuracy when the dataset is over 1,000 games.

We can start again by plotting game length.

Team ranked games are about two minutes faster than solo queue games. We can then run the prediction algorithm again on the team ranked games.

Team ranked games are much more predictable than solo queue games! (This is in fact the largest difference in predictability I have seen between different datasets). What is likely happening here is that coordinated teams are more able to capitalize on their advantages, and press them around the map. With fewer mistakes to allow comebacks, the games are faster, and more predictable. It would be interesting to further look at professional games to see how they compare.

Patches and predictability

In recent patches, Riot has made significant changes to the game, which makes it feel like the games are faster, and comebacks more difficult. I scraped challenger and master ranked games from the most recent patch, and compared them to ranked games from Season 5. We can again start with game length.

If you thought games were faster, your intuition was right, as games are around three minutes faster in preseason 2016. What about predictability?

As you'd expect given the previous relationships between speed and predictability, preseason 2016 is a few percentage easier to predict. In a previous blog post, I investigated the importance of different features on predictability, and found that dragons were not important in helping teams win. Putting on my amateur designer hat, this means that dragon is even less important, since games are less likely to last long enough for the accumulated advantages to matter.

In my next post, I'm going to go more in depth on the modeling, investigating random forest hyper-parameters, and maybe trying a few different models.