Introduction
For this project, I decided to run a clustering analysis on heroes from the MOBA game League of Legends. Although I do not play League of Legends, the idea interested me because I could compare the clusters my analysis produces against the game’s pre-defined hero roles. Many games define hero roles; I chose League of Legends because of its large roster (172 heroes). The main question I’m trying to answer is: how do my clusters compare to the pre-defined hero roles?
What is Clustering?
K-means clustering is the machine learning algorithm I’ll be using for this project. It is an unsupervised learning technique that groups similar data points into clusters. It works by placing k random points as initial cluster centers, dividing the data into Voronoi cells around those centers (each data point joins its nearest center), calculating the mean of the data points in each cell, and moving each cluster center to that mean. These steps repeat until the centers stabilize and no data point changes cluster.
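To make the loop concrete, here is a minimal sketch of k-means in plain Python. It is not the code used for this project; the function name `kmeans`, its parameters, and the toy points are assumptions made for illustration, and it starts from k randomly chosen data points rather than arbitrary random positions.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Illustrative k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Start from k distinct data points as the initial cluster centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest center (its Voronoi cell).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers

# Toy data: two loose groups of 2-D points (made-up values, not hero data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(points, k=2)
print(labels, centers)
```

Because the starting centers are random, different seeds can give different cluster numbering and, on harder data, different clusters, which is why library implementations usually run several initializations and keep the best one.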
The Dataset
Some of the things that distinguish heroes in MOBAs can be hard to quantify, primarily their abilities. However, I decided to collect the main numerical information that was available. Hero ratings give some insight into a hero’s strengths, which may partially reflect their abilities. Hero stats are the hero’s raw stats, along with how they scale with character level.
| Feature Category | Explanation | Variables in this Category |
| --- | --- | --- |
| Identifying Information | Will be used to identify each hero. | name |
| Pre-existing Clusters | Will be compared against the clusters produced. | herotype, role, position |
| Hero Rating | Broad ratings of the hero’s playstyle. | damage, toughness, control, mobility, utility, style, difficulty |
| Hero Stats | Base stats of the hero and how they scale with level. | hp_base, hp_lvl, mp_base, mp_lvl, arm_base, arm_lvl, mr_base, mr_lvl, hp5_base, hp5_lvl, mp5_base, mp5_lvl, dam_base, dam_lvl, as_base, as_lvl, range, ms |
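
The table's categories can be written down as column lists, which makes it easy to keep the identifier and the pre-existing labels out of the clustering input. This is only a sketch: the column names come from the table, but the helper `split_row`, the sample values, and the assumption that every rating and stat column is used as a clustering feature are mine.

```python
# Column groups taken from the feature table.
IDENTIFIER = ["name"]
LABELS = ["herotype", "role", "position"]
RATINGS = ["damage", "toughness", "control", "mobility", "utility",
           "style", "difficulty"]
STATS = ["hp_base", "hp_lvl", "mp_base", "mp_lvl", "arm_base", "arm_lvl",
         "mr_base", "mr_lvl", "hp5_base", "hp5_lvl", "mp5_base", "mp5_lvl",
         "dam_base", "dam_lvl", "as_base", "as_lvl", "range", "ms"]

# Assumption: ratings and stats together form the numeric clustering input.
FEATURES = RATINGS + STATS

def split_row(row):
    """Split one CSV row (a dict of strings) into name, labels, and features."""
    name = row["name"]
    labels = {col: row[col] for col in LABELS}
    features = [float(row[col]) for col in FEATURES]
    return name, labels, features

# Placeholder row with made-up values, only to show the shape of the result.
sample = {col: "1" for col in FEATURES}
sample.update({"name": "HeroA", "herotype": "TypeA",
               "role": "RoleA", "position": "PositionA"})
name, labels, features = split_row(sample)
print(name, labels, len(features))
```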
Format Conversion
My original data came as a Lua file, but I needed a CSV for my data analysis, so I had to convert it. I don’t have experience working with Lua files, so I converted the Lua to JSON first and then converted the JSON to CSV. In this case, the Lua file was structurally similar to a JSON file, with only some differences in syntax. To bridge those differences, I wrote a Python program that converted the Lua text (in a .txt file) into JSON-formatted text (also in a .txt file). I then converted that output file to a .json file and manually cleaned up any remaining errors and warnings. Finally, I wrote a second Python program to convert the JSON data to CSV, extracting the information I wanted.
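The two conversion steps could look roughly like the sketch below. It is not the original pair of programs: the Lua excerpt, the hero names, the values, and the helpers `lua_to_json` and `to_rows` are all hypothetical, and the real file's keys and nesting may differ. The regex approach only handles `["key"] = value` tables; list-style Lua tables such as `{"a", "b"}` or comments would still need the kind of manual cleanup described above.

```python
import csv
import io
import json
import re

# Hypothetical excerpt with placeholder heroes and values.
LUA_TEXT = '''
return {
  ["HeroA"] = {
    ["herotype"] = "Fighter",
    ["stats"] = { ["hp_base"] = 600, ["hp_lvl"] = 100, },
  },
  ["HeroB"] = {
    ["herotype"] = "Mage",
    ["stats"] = { ["hp_base"] = 500, ["hp_lvl"] = 90, },
  },
}
'''

def lua_to_json(text):
    """Rewrite simple Lua table syntax as JSON text (keyed tables only)."""
    text = re.sub(r'^\s*return\s*', '', text)             # drop the leading `return`
    text = re.sub(r'\["([^"]+)"\]\s*=', r'"\1":', text)   # ["key"] =  ->  "key":
    text = re.sub(r',(?=\s*})', '', text)                 # remove trailing commas
    return text

def to_rows(data):
    """Flatten each hero into one flat dict, ready to be written as a CSV row."""
    rows = []
    for name, hero in data.items():
        row = {"name": name, "herotype": hero["herotype"]}
        row.update(hero["stats"])  # nested stats become top-level columns
        rows.append(row)
    return rows

data = json.loads(lua_to_json(LUA_TEXT))
rows = to_rows(data)

# Write to an in-memory buffer here; a real script would open a .csv file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Going through JSON keeps the second step simple, since `json.loads` either parses the converted text or reports exactly where the leftover Lua syntax is.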
Diagram
[Placeholder]
Data Understanding
[Placeholder]
Code
[Placeholder]
Output
[Placeholder]
Data Pre-processing
[Placeholder]
Code
[Placeholder]
Modeling & Evaluation
[Placeholder]
Code
[Placeholder]
Output
[Placeholder]
Impact / Conclusion
[Placeholder]
References
[Placeholder]