Solution for a machine learning competition hosted by Analytics Vidhya.
- numpy
- pandas
- sklearn
- matplotlib
- seaborn
- keras
- data_making.ipynb
- hero_encoding.ipynb
- models.ipynb
- player_encoding.ipynb
- Here I have used Keras autoencoders to reduce the number of features for heroes and players.
- These encoded features, together with some other features, are then used to build a predictive model.
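
The idea of compressing features with an autoencoder can be sketched as follows. The notebooks use Keras; this is only a dependency-light illustration in plain NumPy, using a linear autoencoder trained by gradient descent. The feature matrix is random stand-in data, and the sizes, learning rate, and variable names are assumptions, not values taken from the notebooks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 100 rows (e.g. heroes), 8 raw features, 3 encoded features.
n_rows, n_features, n_encoded = 100, 8, 3

# Stand-in data; in practice this would be the standardized hero or player features.
X = rng.normal(size=(n_rows, n_features))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Encoder and decoder weights of a linear autoencoder (no biases, no activations).
W_enc = 0.1 * rng.normal(size=(n_features, n_encoded))
W_dec = 0.1 * rng.normal(size=(n_encoded, n_features))


def reconstruction_loss(X, W_enc, W_dec):
    """Mean squared error between X and its reconstruction."""
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))


initial_loss = reconstruction_loss(X, W_enc, W_dec)

learning_rate = 0.1  # assumed value for this toy problem
for _ in range(1000):
    H = X @ W_enc                       # encoded (compressed) features
    err = H @ W_dec - X                 # reconstruction error
    grad_out = 2.0 * err / err.size     # gradient of the MSE w.r.t. the reconstruction
    grad_dec = H.T @ grad_out
    grad_enc = X.T @ (grad_out @ W_dec.T)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss(X, W_enc, W_dec)

# The encoder output is what would be passed on to the predictive model.
encoded = X @ W_enc
print(encoded.shape, initial_loss, final_loss)
```

Only the encoder half is kept after training; the decoder exists to force the encoded features to retain as much of the original information as possible.
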
Dota 2 is a free-to-play multiplayer online battle arena (MOBA) video game. It is played in matches between two teams of five players, with each team occupying and defending its own base on the map. Each of the ten players independently controls a powerful character, known as a "hero" (chosen at the start of the match), and every hero has unique abilities and a distinct style of play. During a match, players collect experience points and items for their heroes in order to successfully battle the opposing team's heroes, who attempt to do the same to them. A team wins by being the first to destroy a large structure located in the opposing team's base, called the "Ancient".
You are given a dataset of professional Dota 2 players and their 10 most frequently played heroes. The data also includes details about the heroes: their roles (nuker, initiator, and so on), base attack, strength, and movement speed. Both the training and test data are split into two files each: train9.csv and train1.csv, and test9.csv and test1.csv.
train9.csv and train1.csv contain the users' performance with their 9 most frequent heroes and their 10th hero, respectively. Both files have the following fields.

| Feature | Description |
| --- | --- |
| user_id | The id of the user |
| hero_id | The id of the hero the player played with |
| id | Unique id |
| num_games | The number of games the player played with that hero |
| num_wins | The number of games the player won with this particular hero |
| kda_ratio (target) | (Kills + Assists) * 1000 / Deaths, where kills, assists, and deaths are average values per match for that hero |

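
As a quick worked example of the target definition above, the function below computes kda_ratio from per-match averages. The function name and the sample numbers are made up for illustration, and raising an error on zero deaths is an assumption, since the source does not say how that case is handled in the data.

```python
def kda_ratio(avg_kills, avg_assists, avg_deaths):
    """Return (kills + assists) * 1000 / deaths, using per-match averages."""
    if avg_deaths == 0:
        # Assumption: the source does not define the ratio for zero deaths.
        raise ValueError("kda_ratio is undefined when avg_deaths is 0")
    return (avg_kills + avg_assists) * 1000 / avg_deaths


# Hypothetical player: 6 kills, 9 assists, and 5 deaths per match on average.
example = kda_ratio(6.0, 9.0, 5.0)
print(example)  # (6 + 9) * 1000 / 5 = 3000.0
```
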
test9.csv contains the performance of a different set of users (not present in the training set) with their 9 most frequent heroes, and has similar fields to train9.csv. The aim is to predict the performance (kda_ratio) of these test users with their 10th hero, which is listed in test1.csv with the following fields.

| Feature | Description |
| --- | --- |
| user_id | The id of the user |
| hero_id | The id of the hero (of which the kda_ratio has to be predicted) |
| id | Unique id |
| num_games | The total number of games the player played with this hero |

We also have "hero_data.csv", which contains information about the heroes.

| Feature | Description |
| --- | --- |
| hero_id | Id of the hero |
| primary_attr | A string denoting the primary attribute of the hero (int: intelligence, agi: agility, str: strength, and so on) |
| attack_type | String: "Melee" or "Ranged" |
| roles | An array of strings listing the roles of the hero (e.g. Support, Disabler, Nuker) |
| base_health | The basic health the hero starts with |
| base_health_regen, base_mana, base_mana_regen, base_armor, base_magic_restistance, base_attack_min, base_attack_max, base_strength, base_agility, base_intelligence, strength_gain, agility_gain, intelligence_gain, attack_range, projectile_speed, attack_rate, move_speed, turn_rate | The basic stats the heroes start with (some remain the same throughout the game) |

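
Before hero attributes can be fed to a model or an encoder, the categorical and list-valued columns need to become numeric. A minimal sketch with pandas is shown below. The rows are invented stand-ins for hero_data.csv, and the assumption that roles is stored as a list-like string (parsed here with ast.literal_eval) may not match the actual file.

```python
import ast

import pandas as pd

# Hypothetical rows shaped like hero_data.csv (values are made up).
hero_data = pd.DataFrame({
    "hero_id": [1, 2, 3],
    "primary_attr": ["agi", "str", "int"],
    "attack_type": ["Melee", "Melee", "Ranged"],
    "roles": [
        "['Carry', 'Escape', 'Nuker']",
        "['Initiator', 'Disabler']",
        "['Support', 'Disabler', 'Nuker']",
    ],
})

# Assumption: roles is a string that looks like a Python list.
roles = hero_data["roles"].apply(ast.literal_eval)

# Multi-hot encode roles: one 0/1 column per role, one row per hero.
role_flags = pd.get_dummies(roles.explode()).groupby(level=0).max().astype(int)

# One-hot encode the single-valued categorical columns.
categorical = pd.get_dummies(hero_data[["primary_attr", "attack_type"]]).astype(int)

hero_features = pd.concat([hero_data[["hero_id"]], categorical, role_flags], axis=1)
print(hero_features)
```

The resulting table can be joined to the player files on hero_id, for example with `train9.merge(hero_features, on="hero_id")`, where train9 is a DataFrame loaded from train9.csv.
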
The predictions will be evaluated on RMSE.
The public/private leaderboard split is 40:60.
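
For local validation, RMSE (the evaluation metric named above) can be computed as in the sketch below. The function name and the sample values are illustrative only and do not come from the competition data.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted kda_ratio values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical actual and predicted kda_ratio values for three rows.
actual = [3000.0, 2500.0, 4000.0]
predicted = [3100.0, 2400.0, 4200.0]
score = rmse(actual, predicted)
print(round(score, 2))  # sqrt((100^2 + 100^2 + 200^2) / 3) = 141.42
```

Because errors are squared before averaging, a few badly mispredicted rows weigh more heavily on the score than many small misses.
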