Quantify skill with math
iNv|eKComm
Posts: 394Player
Fellas, I'm in the planning stage of developing a pug client. An important part of this client that I want to handle is tracking how players perform and in some way quantifying their skill into a number. We've had this conversation many times before and it's gotten a bit crazy. So I'm looking for opinions supported by math and maybe even some form of proof by example if you're ambitious.
Basically I'm offloading some thinking onto you guys. Let the games begin.
2016 Flank Gaming Network Season 1 Champion
2016 FraggedNation Season 4 Champion
2015 FraggedNation Season 2 Main Champion
2015 ESL Test Cup Second place
2014 FraggedNation UMM Tournament Champion
2014 FraggedNation Old School map Tournament Champion
2014 TWL Season 1 Second Place
2013 TWL 7v7 Beta Tournament Champion
2013 TWL 5v5 Beta Tournament Second Place
Comments
The Ghost Recon: Phantoms team published a whitepaper on doing matchmaking with a trained neural network. With enough data to train on, you can easily have one spit out skill values instead of matchmaking assignments. If you've got the data and a good toolbox, neural networks can be really convenient, but you'll likely never understand how it really comes up with its ratings.
At the team level, Elo or Glicko is really easy to implement based on match outcomes, though neither was really designed for team games.
At the individual player level, you can compute scores with a version of Microsoft's TrueSkill algorithm (I don't think they've ever released the **full** algorithm, but there's enough there to get going). It takes more effort to set up than something like Glicko, but it's able to individually weight player adjustments from the same match.
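To give a feel for how players from the same match can end up with different adjustments, here is a rough sketch of the commonly published two-team, no-draw form of the TrueSkill update. It is a simplification and not the full algorithm: it assumes no draws, no partial play and no dynamics factor, and the function name `rate_two_teams` is made up for this example. The defaults (mean 25, sigma 25/3, beta 25/6) are the ones usually quoted for TrueSkill.

```python
import math
from statistics import NormalDist

MU0 = 25.0          # commonly quoted default starting mean
SIGMA0 = MU0 / 3    # commonly quoted default starting uncertainty
BETA = SIGMA0 / 2   # per-player performance noise

_STD = NormalDist()  # standard normal, used for pdf and cdf


def rate_two_teams(winners, losers, beta=BETA):
    """Return updated (mu, sigma) lists for the winning and losing team.

    winners / losers are lists of (mu, sigma) tuples.
    Sketch assumptions: exactly two teams, no draws, no dynamics factor.
    Very lopsided upsets can underflow cdf(t); a real implementation guards that.
    """
    players = winners + losers
    # Total uncertainty about the match outcome.
    c = math.sqrt(sum(s * s for _, s in players) + len(players) * beta * beta)
    # How far ahead the winning team was expected to be, in units of c.
    t = (sum(m for m, _ in winners) - sum(m for m, _ in losers)) / c
    v = _STD.pdf(t) / _STD.cdf(t)   # size of the mean shift
    w = v * (v + t)                 # size of the uncertainty shrink

    def update(team, sign):
        updated = []
        for mu, sigma in team:
            var = sigma * sigma
            new_mu = mu + sign * (var / c) * v
            new_sigma = math.sqrt(var * (1 - (var / (c * c)) * w))
            updated.append((new_mu, new_sigma))
        return updated

    return update(winners, +1), update(losers, -1)


# A well-known veteran and a brand-new player win against two new players.
veteran, newcomer = (30.0, 2.0), (MU0, SIGMA0)
new_winners, new_losers = rate_two_teams([veteran, newcomer],
                                         [(MU0, SIGMA0), (MU0, SIGMA0)])
```

Each player's shift is scaled by their own variance, so the newcomer's mean moves much more than the veteran's even though both were on the same winning team.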
I'd like to have it heavily weighted on your ability to win, since with a pug client (or at least with mine) teams will be random. That means a high win percentage would say something about how well you work with a team, or how overwhelmingly well you can perform in spite of them. But I would also like to be able to look at someone's stats and assume something about them, so I guess that makes me unsure. Still, I would think win percentage on the pug client is very important, unlike the normal win percentage from pubbing.
First-Person Shooter", Buckley Chen Knowles CIG 2013)
But from what you've posted, TrueSkill will fit what you're looking for really well. It's been used throughout the Halo series, the math is an extension of the Glicko model, big teams have already thought through all of the problems, and you'd probably be able to find implementations floating around online.
http://trueskill.org/
(...and if you're feeling nuts, you can dive into the full derivation of the math behind the algorithm and why it all works: http://www.moserware.com/assets/computing-your-skill/The Math Behind TrueSkill.pdf )
I love this community!
Try importing braces from __future__ and you should be fine
First, are you looking to start tracking the stats yourself, or are you relying on AAPG's current stats? (A lot of people have bugged stats.)
I assume by pug client you're talking about tracking scrims/matches/pugs only, right?
-Round wins IMO are just as important a factor as match wins.
-Another would be kills per minute.
-Score per minute is another sign.
-Kill/death ratio is also a sign, but not a huge factor. Having a good KDR is still an accomplishment regardless of how one achieves it.
-Kills per match is a good one as well.
There are other things you can consider too. Obviously each of the above would have a different amount of score/Elo tied to it.
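One way to picture "a different amount tied to each stat" is a weighted sum. The sketch below is only an illustration: the weights and stat names are invented for the example, and each stat is divided by the pool average so that kills per minute and kills per match end up on comparable scales.

```python
# Hypothetical weights for illustration only; they sum to 1.
WEIGHTS = {
    "round_win_rate": 0.35,
    "kills_per_minute": 0.20,
    "score_per_minute": 0.20,
    "kills_per_match": 0.15,
    "kdr": 0.10,
}


def composite_score(player, pool_average):
    """Weighted sum of each stat relative to the pool average for that stat.

    A player who is exactly average on every stat scores 1.0.
    """
    return sum(weight * player[stat] / pool_average[stat]
               for stat, weight in WEIGHTS.items())


# Made-up numbers for an average pool and one above-average player.
pool_average = {"round_win_rate": 0.50, "kills_per_minute": 0.8,
                "score_per_minute": 40.0, "kills_per_match": 12.0, "kdr": 1.0}
player = {"round_win_rate": 0.60, "kills_per_minute": 1.0,
          "score_per_minute": 50.0, "kills_per_match": 15.0, "kdr": 1.5}

score = composite_score(player, pool_average)
```

Note that a score like this still says nothing about who the stats were earned against.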
Not 100% sure I understand exactly what you're trying to get feedback on, but I hope this helps somewhat.
LMAO! Good one!
Any statistic, including round win/loss, doesn't account for opponent strength. I can beat a 6-year-old at chess all day long and have a perfect win/loss, but I wouldn't deserve to be rated best in the world for it. That's the big advantage of the Glicko / Bayesian inference model over anything based just on player stats.
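The chess example can be put into numbers with the standard Elo formulas. This is plain Elo rather than Glicko, used here only because it is the shortest way to show the effect; the ratings and the K-factor of 32 are illustrative choices.

```python
def expected_score(rating, opponent):
    """Standard Elo win expectancy for `rating` against `opponent`."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def elo_update(rating, opponent, won, k=32):
    """New rating after one game; k controls how fast ratings move."""
    actual = 1.0 if won else 0.0
    return rating + k * (actual - expected_score(rating, opponent))


# Illustrative ratings: a strong player (2000) against a beginner (1000).
gain_vs_beginner = elo_update(2000, 1000, won=True) - 2000
gain_vs_equal = elo_update(2000, 2000, won=True) - 2000
loss_vs_beginner = 2000 - elo_update(2000, 1000, won=False)
```

Beating the beginner is worth a small fraction of a point, beating an equal is worth half of K, and losing to the beginner costs nearly the whole K.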
I'd really love damage/round. Devs pls?
Easy to enter doesn't always mean fun to enter. Especially in such a small community, you'll always have people you can stand and, vice versa, people you can't or don't want to play with because of playstyle, attitude, etc. You'll have people tilting fast and passive-aggressive behaviour. I could go on, but I guess I've made the point. Speaking for myself, I'd probably not enjoy such a mode.
I want players to have a rating. A rating that changes after each match based on the opponents they defeat. Maybe a better way of going about this is to give everyone a base score and have it go up and down based on the average score of the opposing team if they win. Then throw in other small bonuses for kills and stuff.
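A minimal sketch of that idea, assuming an Elo-style update against the opposing team's average rating plus a small capped kill bonus. Every constant here (base rating, K, bonus per kill, bonus cap) is a placeholder picked for the example, not a recommendation.

```python
from statistics import mean

BASE_RATING = 1000   # hypothetical starting score for every player
K = 24               # hypothetical: max points moved by one match result
KILL_BONUS = 0.1     # hypothetical: points per kill
MAX_BONUS = 3.0      # hypothetical cap so kills never outweigh the result


def update_player(rating, opposing_ratings, won, kills=0):
    """Move one player's rating based on the opposing team's average rating."""
    opp_avg = mean(opposing_ratings)
    # Elo-style expectation of winning against the average opponent.
    expected = 1 / (1 + 10 ** ((opp_avg - rating) / 400))
    change = K * ((1.0 if won else 0.0) - expected)
    bonus = min(kills * KILL_BONUS, MAX_BONUS)
    return rating + change + bonus


# A fresh player wins against four other fresh players and gets 10 kills.
new_rating = update_player(BASE_RATING, [BASE_RATING] * 4, won=True, kills=10)
```

Keeping the bonus small and capped means the match result stays the dominant term, so farming kills in a losing effort cannot offset the loss.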
This will be open to anyone who wants to give organized and fair matchmaking a try. It will be third party software, so you can have it open and waiting for 8 or 10 players while you do other things... like play a pub.
On the other hand, I can LOSE at chess all day long to a six year old and it WOULD be representative of my skill...
This has been a test of the emergency flame-fest system. Please do not adjust your set.
I worked conceptually on some Elo algorithms for a ladder some years ago. It's very easy to build this for 1on1 games, both matchmaking-wise and in how you break the result down to individual skill. But with random teams it's more complicated to extract something you can work with. There are some points that separate a good calculation from a bad one. Good matchmaking algorithms reward consistent performance.
In a good roster, there should be fewer players with fluctuating performance or an insufficient dataset. Bad matchmaking algorithms just look at the totals (per team). Good ones keep the skill level near the median with low deviation. Very good ones even take trends into account in the calculation (in particular for the best players on each side).
You have to remember that if there are only game results, every algorithm reacts inflexibly and ultra slowly. This would inevitably result in bad matchmaking. For scrims it's basically impracticable to force some random stranger to input all the necessary data/player stats and double-check them. The usability would be better if, at the end of the match, each player could voluntarily evaluate the performance of the opposing players (= who was crucial), e.g. from 1-6 (6vs6), resulting in 3x6 points which must be awarded, and then taking the average over all who have "voted". That way the individual contribution can be better determined, which is incredibly helpful when determining the new skill levels for the next match.
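The voluntary voting step could be tallied roughly like this. It is a sketch under assumptions: each ballot is a mapping from opposing player to the points that voter awarded, only players who chose to vote submit a ballot, and the function name and data layout are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def average_votes(ballots):
    """Average the points each player received across everyone who voted.

    ballots: list of dicts {opposing_player_name: points}, one per voter.
    Players nobody voted on simply do not appear in the result.
    """
    received = defaultdict(list)
    for ballot in ballots:
        for player, points in ballot.items():
            received[player].append(points)
    return {player: mean(points) for player, points in received.items()}


# Two of the opposing players bothered to vote; scores are on the 1-6 scale.
ballots = [
    {"alpha": 6, "bravo": 1, "charlie": 3},
    {"alpha": 4, "bravo": 3, "charlie": 3},
]
contribution = average_votes(ballots)
```

The averaged score could then scale how much of the team's rating change each player receives, which is where result-only systems have no information at all.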