Quantify skill with math

Fellas, I'm in the planning stage of developing a Pug client. An important part of this client is tracking how players perform and in some way quantifying their skill as a number. We've had this conversation many times before and it's gotten a bit crazy. So I'm looking for opinions supported by math, and maybe even some form of proof by example if you're ambitious.

Basically I'm offloading some thinking onto you guys. Let the games begin.
2016 Flank Gaming Network Season 1 Champion
2016 FraggedNation Season 4 Champion
2015 FraggedNation Season 2 Main Champion
2015 ESL Test Cup Second place
2014 FraggedNation UMM Tournament Champion
2014 FraggedNation Old School map Tournament Champion
2014 TWL Season 1 Second Place
2013 TWL 7v7 Beta Tournament Champion
2013 TWL 5v5 Beta Tournament Second Place

Comments

  • iNv|mikEy2k Posts: 96 Player
    I'm from the south. Since we are talking about math, I will sit out of this conversation.
  • Saccho Posts: 1,577 Player
    edited March 2016
    Stats-based or outcome-based?

    The Ghost Recon: Phantoms team published a whitepaper on doing matchmaking with a trained neural network. With enough data to train on, you can easily have one spit out skill values instead of matchmaking assignments. If you've got the data and a good toolbox, neural networks can be really convenient, but you'll likely never understand how it really comes up with its ratings.

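    At the team level, Elo or Glicko is really easy to implement based on match outcomes, treating each team as a single competitor. They're designed for one-on-one games, though, so they won't tell you anything about the individual players on a team.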
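    As a rough sketch of the team-level approach, here is the standard Elo update with each team treated as one competitor. The function names, the starting rating of 1500 and K = 32 are illustrative assumptions, not anything specified in this thread.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the Elo model (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b); score_a is 1 for a win, 0.5 draw, 0 loss."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two teams treated as single entities, both starting at an assumed 1500.
team_a, team_b = update_elo(1500.0, 1500.0, score_a=1.0)
print(team_a, team_b)
```

    Because the update is zero-sum and only sees the team, every member would share the same number, which is the limitation mentioned above.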

    At the individual player level, you can compute scores with a version of Microsoft's TrueSkill algorithm (I don't think they've ever released the **full** algorithm, but there's enough there to get going). It takes more effort to set up than something like Glicko, but it's able to individually weight player adjustments from the same match.
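    To show what "individually weighting player adjustments" can look like, here is a minimal sketch of the publicly documented closed-form update for one decisive two-team match (no draws). It is not Microsoft's full algorithm; the function name is made up, and the constants are the commonly quoted defaults (mu 25, sigma 25/3, beta sigma/2, tau sigma/100).

```python
import math
from statistics import NormalDist

MU, SIGMA = 25.0, 25.0 / 3.0            # commonly quoted starting values
BETA, TAU = SIGMA / 2.0, SIGMA / 100.0  # performance noise, rating drift
_STD = NormalDist()                     # standard normal distribution


def rate_two_teams(winners, losers):
    """Update lists of (mu, sigma) pairs after a win/loss between two teams."""
    players = winners + losers
    c2 = sum(s * s for _, s in players) + len(players) * BETA ** 2
    c = math.sqrt(c2)
    # Skill gap between the teams, measured in units of total uncertainty.
    t = (sum(m for m, _ in winners) - sum(m for m, _ in losers)) / c
    v = _STD.pdf(t) / _STD.cdf(t)   # large when the result was an upset
    w = v * (v + t)                 # fraction of variance to remove

    def update(team, sign):
        out = []
        for mu, sigma in team:
            var = sigma * sigma + TAU * TAU
            new_mu = mu + sign * (var / c) * v
            new_sigma = math.sqrt(var * (1.0 - w * var / c2))
            out.append((new_mu, new_sigma))
        return out

    return update(winners, +1.0), update(losers, -1.0)


# A well-known veteran and a newcomer win together against two newcomers.
new_winners, new_losers = rate_two_teams([(30.0, 2.0), (MU, SIGMA)],
                                         [(MU, SIGMA), (MU, SIGMA)])
```

    Each player's shift scales with their own variance, so the uncertain newcomer moves much further than the veteran even though both played the same match.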
  • iNv|eKComm Posts: 394 Player
    edited March 2016
    I guess I didn't post requirements exactly.

    I'd like it to be heavily weighted on your ability to win, since with a pug client (or at least with mine) teams will be random. That means a high win percentage would say something about how well you work with a team, or how overwhelmingly well you can perform in spite of them. But I would also like to be able to look at someone's stats and assume something about them, so I guess that makes me unsure. Still, I would think win percentage on the pug client is very important, unlike the normal win percentage from pubbing.
  • Saccho Posts: 1,577 Player
    Another idea, if you'll have an external client monitoring input: one group found they could predict player skill really well just from how players pressed the movement keys ("Predicting Skill from Gameplay Input to a First-Person Shooter", Buckley, Chen, Knowles, CIG 2013).

    But from what you've posted, TrueSkill will fit what you're looking for really well. It's been used throughout the Halo series, the math is an extension of the Glicko model, big teams have already thought through all of the problems, and you'd probably be able to find implementations floating around online.
  • Saccho Posts: 1,577 Player
    edited March 2016
    Here, for example, is a well-documented and free (BSD) implementation in Python.
    http://trueskill.org/


    (...and if you're feeling nuts, you can dive into the full derivation of the math behind the algorithm and why it all works: http://www.moserware.com/assets/computing-your-skill/The Math Behind TrueSkill.pdf )
  • iNv|eKComm Posts: 394 Player
    Interesting. I'm going to take a look. I'm very interested in using this in something other than Python.
  • SSKnecabo Posts: 2,721 Player
  • -=DA=-Movax Posts: 42 Player
    edited March 2016
    I refuse to use a modern programming language that requires whitespace as part of its syntax. :P
  • SSKnecabo Posts: 2,721 Player
    I refuse to use a modern programming language that requires whitespace as part of its syntax. :P

    Try importing braces from __future__ and you should be fine :p
  • .dcG-Colts^ Posts: 1,973 Player
    iNv|eKComm wrote: »
    I guess I didn't post requirements exactly.

    I'd like it to be heavily weighted on your ability to win, since with a pug client (or at least with mine) teams will be random. That means a high win percentage would say something about how well you work with a team, or how overwhelmingly well you can perform in spite of them. But I would also like to be able to look at someone's stats and assume something about them, so I guess that makes me unsure. Still, I would think win percentage on the pug client is very important, unlike the normal win percentage from pubbing.

    First, are you looking to start tracking the stats yourself, or are you relying on AAPG's current stats? (A lot of people have bugged stats.)

    I assume by pug client you're talking about tracking scrims/matches/pugs only, right?
    -Round wins IMO are just as important a factor as match wins.
    -Another would be kills per minute.
    -Score per minute is another sign.
    -Kill/death ratio is also a sign, but not a huge factor. Having a good KDR is still an accomplishment regardless of how one accomplishes it.
    -Kills per match is a good one as well.

    There are other things you can also consider. Obviously each of the above would have a different amount of score/Elo tied to it.

    Not 100% sure I understand exactly what you are trying to get feedback on, but I hope this helps somewhat.
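    One way to read "each one would have a different amount of score tied to it" is a weighted sum over the stats listed above. The sketch below assumes that reading; the class, the weights and the score scaling are all hypothetical placeholders that would need tuning against real match data.

```python
from dataclasses import dataclass


@dataclass
class MatchStats:
    rounds_won: int
    rounds_played: int
    kills: int
    deaths: int
    score: int
    minutes: float


# Hypothetical weights, chosen only to illustrate the idea.
WEIGHTS = {"round_win_rate": 0.5, "kills_per_min": 0.2,
           "score_per_min": 0.2, "kdr": 0.1}


def performance_index(s: MatchStats) -> float:
    """Weighted sum of per-match stats (an illustration, not a skill rating)."""
    features = {
        "round_win_rate": s.rounds_won / s.rounds_played,
        "kills_per_min": s.kills / s.minutes,
        "score_per_min": s.score / s.minutes / 100.0,  # assumed score scale
        "kdr": s.kills / max(s.deaths, 1),             # avoid dividing by zero
    }
    return sum(WEIGHTS[name] * value for name, value in features.items())


strong = MatchStats(rounds_won=7, rounds_played=10, kills=15, deaths=10,
                    score=1500, minutes=20.0)
weak = MatchStats(rounds_won=3, rounds_played=10, kills=5, deaths=10,
                  score=500, minutes=20.0)
print(performance_index(strong), performance_index(weak))
```

    The features sit on different scales, so in practice they would probably need normalizing before the weights mean anything.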
    Pie charts + Graphs= Very Bad.



  • IO_i_OI Posts: 1,107 Player
    iNv|mikE wrote: »
    I'm from the south. Since we are talking about math, I will sit out of this conversation.

    LMAO! Good one!


  • Twitchr.Carbon8 Posts: 288 Player
    I would think that, with enough time, rounds won/rounds lost would be the most significant figure.
  • Saccho Posts: 1,577 Player
    I would think that, with enough time, rounds won/rounds lost would be the most significant figure.

    No raw statistic, round win/loss included, accounts for opponent strength. I can beat a 6-year-old at chess all day long and have a perfect win/loss, but I wouldn't deserve to be rated best in the world for it. That's the big advantage of the Glicko/Bayesian inference model over anything based just on player stats.
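    To put numbers on the chess example, here is the standard Elo expected-score formula (Glicko refines it by adding a rating deviation). The ratings 2000 and 400 and the factor K = 32 are made up for illustration.

```latex
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}, \qquad R_A' = R_A + K\,(S_A - E_A)

R_A = 2000,\ R_B = 400:\quad
E_A = \frac{1}{1 + 10^{-4}} \approx 0.9999, \qquad
K\,(1 - E_A) \approx 32 \times 0.0001 = 0.0032
```

    Under those assumed numbers, each win over the much weaker opponent is worth about 0.003 rating points, so a perfect record against weak opposition barely moves the rating.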
  • Koden Posts: 282 Player
    Who is this actually targeted to? I mean, it's not like you got hundreds of people playing so-called "competitive". Actually, with the end of the ESL Winter cup you got even a few less...
  • ={101st}=Whiplash27 Posts: 2,102 Player
    edited March 2016
    Competitive matchmaking allows people to play more serious matches without having to be on a team. It has its own set of issues (such as people berating bad players), but it has a much lower barrier to entry and helps people with similar skill levels face off against each other.
    You joined the world's greatest army to become a graphic artist? Outstanding!
  • SSKnecabo Posts: 2,721 Player
    .dc-colts^ wrote: »
    iNv|eKComm wrote: »
    I guess I didn't post requirements exactly.

    I'd like it to be heavily weighted on your ability to win, since with a pug client (or at least with mine) teams will be random. That means a high win percentage would say something about how well you work with a team, or how overwhelmingly well you can perform in spite of them. But I would also like to be able to look at someone's stats and assume something about them, so I guess that makes me unsure. Still, I would think win percentage on the pug client is very important, unlike the normal win percentage from pubbing.

    First, are you looking to start tracking the stats yourself, or are you relying on AAPG's current stats? (A lot of people have bugged stats.)

    I assume by pug client you're talking about tracking scrims/matches/pugs only, right?
    -Round wins IMO are just as important a factor as match wins.
    -Another would be kills per minute.
    -Score per minute is another sign.
    -Kill/death ratio is also a sign, but not a huge factor. Having a good KDR is still an accomplishment regardless of how one accomplishes it.
    -Kills per match is a good one as well.

    There are other things you can also consider. Obviously each of the above would have a different amount of score/Elo tied to it.

    Not 100% sure I understand exactly what you are trying to get feedback on, but I hope this helps somewhat.

    I'd really love damage/round. Devs pls?
  • SSKnecabo Posts: 2,721 Player
    Whiplash27 wrote: »
    Competitive matchmaking allows people to play more serious matches without having to be on a team. It has its own set of issues (such as people berating bad players), but it has a much lower barrier to entry and helps people with similar skill levels face off against each other.

    Easy to enter doesn't always mean fun to enter. Especially in such a small community, you'll always have people you can stand and, vice versa, people you can't or don't want to play with because of playstyle, attitude, etc. You'll have people tilting fast and passive-aggressive behaviour. I could go on, but I guess I've made the point. Speaking for myself, I'd probably not enjoy such a mode.
  • iNv|eKComm Posts: 394 Player
    To clear up a few things: I would like to keep my data completely separate from the official database, mainly because their DB has proven to be unreliable, and also because I don't care about a player's statistics when they pub or are in a server running around with a teammate trying stuff. This does not mean I won't utilize their database to track match stats (such as retrieving deltas from before and after a match). However, the historical data means very little to me.

    I want players to have a rating that changes after each match based on the opponents they defeat. Maybe a better way of going about this is to give everyone a base score and have it go up and down based on the average score of the opposing team if they win. Then throw in other small bonuses for kills and stuff.
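    A minimal sketch of that idea, assuming an Elo-style expectation against the opposing team's average rating plus a small capped kill bonus. The function names, base rating of 1000, K factor and bonus sizes are all hypothetical.

```python
def expected(rating: float, opp_avg: float) -> float:
    """Elo-style chance of beating a team with the given average rating."""
    return 1.0 / (1.0 + 10 ** ((opp_avg - rating) / 400.0))


def rate_player(rating, opp_ratings, won, kills=0,
                k=24.0, kill_bonus=0.1, max_bonus=2.0):
    """Return the player's new rating after one match."""
    opp_avg = sum(opp_ratings) / len(opp_ratings)
    # Main term: win/loss measured against how strong the opposition was.
    change = k * ((1.0 if won else 0.0) - expected(rating, opp_avg))
    # Small capped bonus so kills can never outweigh the match result.
    bonus = min(kills * kill_bonus, max_bonus)
    return rating + change + bonus


even = rate_player(1000.0, [1000.0] * 5, won=True, kills=10)
upset = rate_player(1000.0, [1200.0] * 5, won=True, kills=10)
print(even, upset)
```

    Note that the bonus term adds points that no one loses, so ratings would slowly inflate over time unless the bonus is offset somewhere.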

    This will be open to anyone who wants to give organized and fair matchmaking a try. It will be third party software, so you can have it open and waiting for 8 or 10 players while you do other things... like play a pub.

  • Keebler750 Posts: 3,589 Beta Tester
    Saccho wrote: »
    ... I can beat a 6-year-old at chess all day long and have a perfect win/loss, but I wouldn't deserve to be rated best in the world for it. ....

    On the other hand, I can LOSE at chess all day long to a six year old and it WOULD be representative of my skill...

    :D



    ______

    This has been a test of the emergency flame-fest system. Please do not adjust your set.
  • GoA]z!o[N]cya[ Posts: 109 Player
    edited March 2016
    What data would you like to gather? The first source is the official DB (from which you can generate a starting "Elo" for newcomers who have never used your tool before). The second source is your own DB, but what data can you put in there? Match outcome only, or the individual stats too? The latter is best for any algorithm, but you would have a hard time gathering all this data without a client that can track and analyze these stats from the logs. Without that kind of tool, everything additional has to be entered by players and confirmed by both sides... so you have to think carefully about what to add there.

    I worked conceptually on some Elo algorithms for a ladder some years ago. It's very easy to build this for 1on1 games, both for matchmaking and for breaking results down to individual skill. But with random teams it's more complicated to extract something you can work with. There are some points which separate a good calculation from a bad one. Good matchmaking algorithms reward consistent performance.
    In a good roster, there should be fewer players with fluctuating performance or an insufficient dataset. Bad matchmaking algorithms just look at the totals (per team). Good ones keep the skill level near the median with low deviation. Very good ones even take trends into account in the calculation (in particular for the best players on each side).
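    One possible interpretation of "don't just look at the totals" is to compare both the average and the spread of each team when splitting a lobby. The brute-force sketch below assumes that interpretation; the function name and the spread weight are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def balance_teams(ratings: dict, spread_weight: float = 0.5):
    """Try every even split and keep the one with the closest mean and spread."""
    names = sorted(ratings)
    half = len(names) // 2
    best, best_cost = None, float("inf")
    for team_a in combinations(names, half):
        if names[0] not in team_a:      # skip mirror-image splits
            continue
        team_b = [n for n in names if n not in team_a]
        ra = [ratings[n] for n in team_a]
        rb = [ratings[n] for n in team_b]
        cost = (abs(mean(ra) - mean(rb))
                + spread_weight * abs(pstdev(ra) - pstdev(rb)))
        if cost < best_cost:
            best, best_cost = (list(team_a), team_b), cost
    return best, best_cost


teams, cost = balance_teams({"a": 1000, "b": 1000, "c": 1200, "d": 1200})
print(teams, cost)
```

    For 12 players this checks 462 distinct splits, which is cheap enough that no clever search is needed.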

    You have to remember that if there are only game results, every algorithm reacts inflexibly and ultra slowly. This would inevitably result in bad matchmaking. For scrims it's basically impracticable to force some random stranger to input all the necessary data/player stats and double-check them. The usability would be better if, at the end of the match, each player could voluntarily evaluate the performance of the opposing players (= who was crucial), e.g. from 1-6 (6vs6), resulting in 3x6 points which must be awarded, and then take the average over all who have "voted". That way the individual contribution can be better determined, which is incredibly helpful when determining the new skill level for the next match.
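    The averaging step of that voting idea could be sketched as follows. The ballot format (one dict of opponent name to points per voter) and the function name are assumptions; the exact point budget described above is left out.

```python
from collections import defaultdict
from statistics import mean


def average_votes(ballots):
    """ballots: list of {opponent_name: points} dicts, one per voter."""
    received = defaultdict(list)
    for ballot in ballots:
        for player, points in ballot.items():
            received[player].append(points)
    # Players nobody voted on simply do not appear in the result.
    return {player: mean(points) for player, points in received.items()}


# Two voters scored the same two opponents; a third voter abstained.
scores = average_votes([{"x": 6, "y": 1}, {"x": 4, "y": 3}])
print(scores)
```

    Since voting is voluntary, the averages rest on however many ballots came in, so a real client would probably want a minimum vote count before using them.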