Refactor Game.compute_rating #496

cleborys · 2019-09-21T18:40:18Z

Mostly sparked by issue #485.
Based on PR #495 so includes a lot of commits from there.
I'll rebase once that's merged, so it will only include the last two commits.

server/config.py

server/games/game_rater.py

tests/unit_tests/test_game_rater.py

server/games/game_rater.py

tests/unit_tests/test_game_rater.py

Askaholic

Yapf can be helpful, but I'd always double check what it did because sometimes it just isn't worth it. I've been reading about black and it seems pretty good. The thing I'm not convinced by though is how it formats imports and will each imported item on its own line.

Note to self for testing:
We should test to see what happens in the scenario where one player ctrl-k's in order to destroy the other player's ACU. IIRC there was an issue with this a while back where the player who ctrl-k'd would always be awared a win instead of a draw.

server/games/game.py

server/games/game_rater.py

Askaholic · 2019-12-18T22:47:02Z

For rebasing you should just need to:

Pull latest develop into your local develop branch
Checkout this branch
Run git rebase develop

Then resolve conflicts and do git rebase --continue until everything has been resolved.

cleborys · 2019-12-18T23:07:14Z

If I rebase onto develop, it replays all commits in the branch, including ones that were already merged in other PRs and that produces a lot of conflicts because at the start I merged an earlier copy of Wesmania's branch into my local changes that weren't merged into develop yet. I think the correct way would be to cherry-pick all commits after d2d8f9a, but my first attempt wasn't successful. I can give it a shot tomorrow, but I was assuming we will squash anyways so the messed-up history would at least not hit develop?

Askaholic · 2019-12-19T00:07:27Z

No it wont affect develop directly. It just means if you ever go back and look at the PR there will be a bunch of extra commits in it. And IDK how it would affect a revert, but that's unlikely. If there aren't any conflicts, or not many conflicts then I tend to rebase out of OCD at this point.

If you want to cherry pick only a few commits I think you can use git rebase -i which will let you pick exactly which commits you want to replay.

Askaholic

Looks like the logic is pretty solid; you got lots of tests which is great! If you are available to run a few test games at some point let me know.

Yapf has worked nicely in many places, but also not so nicely in others. Would be nice to tidy up the rough spots a bit.

server/games/game.py

tests/unit_tests/test_game.py

Askaholic · 2019-12-22T00:08:39Z

Ok so I tested a game where two acus kill eachother with the ACU explosion. Here is what happened:

First scores getting reported, army 1 reports an acu kill (score of 1):

TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'EnforceRating', 'args': [], 'target': 'game'}
TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'score 1'], 'target': 'game'}
INFO     Dec 22  00:00:30 CustomGame.8572080             161739 reported result for army 1: score 1
TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'EnforceRating', 'args': [], 'target': 'game'}
TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'score 1'], 'target': 'game'}
INFO     Dec 22  00:00:30 CustomGame.8572080             166198 reported result for army 1: score 1
TRACE    Dec 22  00:00:31 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'defeat -10'], 'target': 'game'}
INFO     Dec 22  00:00:31 CustomGame.8572080             166198 reported result for army 1: defeat -10
TRACE    Dec 22  00:00:31 LobbyConnection                <<: {'command': 'GameResult', 'args': [2, 'defeat -10'], 'target': 'game'}
INFO     Dec 22  00:00:31 CustomGame.8572080             166198 reported result for army 2: defeat -10
TRACE    Dec 22  00:00:31 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'defeat -10'], 'target': 'game'}
INFO     Dec 22  00:00:31 CustomGame.8572080             161739 reported result for army 1: defeat -10

Then when rating happens

ERROR    Dec 22  00:02:25 CustomGame.8572080             Error during game end: Attempted to rate game with inconsistent outcome. Teams: defaultdict(<class 'list'>, {1: [Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)), Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004))]}) Outcomes by player: {Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)): <GameOutcome.DEFEAT: 'defeat'>, Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): <GameOutcome.DEFEAT: 'defeat'>} Outcomes by team: [<GameOutcome.DEFEAT: 'defeat'>, <GameOutcome.DEFEAT: 'defeat'>]
Traceback (most recent call last):
  File "/code/server/games/game.py", line 448, in on_game_end
    await self.rate_game()
  File "/code/server/games/custom_game.py", line 19, in rate_game
    new_ratings = self.compute_rating(RatingType.GLOBAL)
  File "/code/server/games/game.py", line 849, in compute_rating
    return rater.compute_rating()
  File "/code/server/games/game_rater.py", line 34, in compute_rating
    ranks = self._ranks_from_team_outcomes(team_outcomes)
  File "/code/server/games/game_rater.py", line 119, in _ranks_from_team_outcomes
    f"Attempted to rate game with inconsistent outcome. Teams: {self._players_by_team} Outcomes by player: {self._outcome_by_player} Outcomes by team: {team_outcomes}"
server.games.game_rater.GameRatingError: Attempted to rate game with inconsistent outcome. Teams: defaultdict(<class 'list'>, {1: [Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)), Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004))]}) Outcomes by player: {Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)): <GameOutcome.DEFEAT: 'defeat'>, Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): <GameOutcome.DEFEAT: 'defeat'>} Outcomes by team: [<GameOutcome.DEFEAT: 'defeat'>, <GameOutcome.DEFEAT: 'defeat'>]

In this situation it should be counted as a draw. So maybe we need to add that as an edge case. If both armies reported defeat then consider the game a draw.

EDIT:
Actually it looks like the error happened because it thought the game results were unknown, even though there was a defeat reported for each player.

…reported UNKNOWN.

Co-Authored-By: Askaholic <askaholic907@gmail.com>

use == for int identity, remove duplicate test reformat using yapf clean up yapf formatting hiccups

cleborys · 2020-01-09T17:21:51Z

Thank you for testing that! I changed the convention to treat games where both parties report defeat as draws.
Could you elaborate on why you think it threw because of unknown outcomes? Both logs you posted seem to suggest GameOutcome.DEFEAT for both players.

Askaholic · 2020-01-09T21:36:32Z

Looking at it now I guess I just interpreted it wrong. It even logged what the outcomes were and they were simply defeat for both teams.

server/config.py

server/games/game_rater.py

Askaholic · 2020-01-17T06:26:02Z

Looks correct now:

INFO     Jan 17  06:21:06 CustomGame.8572100             Game finished normally
DEBUG    Jan 17  06:21:06 CustomGame.8572100             Saving scores from game 8572100
INFO     Jan 17  06:21:06 CustomGame.8572100             Result for army 2, player: Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): -10
INFO     Jan 17  06:21:06 GameResults                    Conflicting scores (Counter({1: 2, -10: 2})) reported for game 8572100
INFO     Jan 17  06:21:06 CustomGame.8572100             Result for army 1, player: Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): 1
INFO     Jan 17  06:21:06 CustomGame.8572100             Score for player Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): -10
INFO     Jan 17  06:21:06 CustomGame.8572100             Score for player Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): 1
DEBUG    Jan 17  06:21:06 GameRater                      Rating groups: [{Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): trueskill.Rating(mu=1226.480, sigma=62.806)}, {Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): trueskill.Rating(mu=1463.550, sigma=89.350)}]
DEBUG    Jan 17  06:21:06 GameRater                      Ranks: [0, 0]
INFO     Jan 17  06:21:06 CustomGame.8572100             Saving rating change stats
DEBUG    Jan 17  06:21:06 CustomGame.8572100             New global rating for Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): trueskill.Rating(mu=1233.975, sigma=62.584)
DEBUG    Jan 17  06:21:06 CustomGame.8572100             New global rating for Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): trueskill.Rating(mu=1448.571, sigma=87.021)

* failing test for issue 485 bug * First refactor of compute_rating. Does not handle games with results reported UNKNOWN. * compute_rating moved to GameRater class * Apply suggestions from code review Co-Authored-By: Askaholic <askaholic907@gmail.com> * Add double victory test, make FFA_TEAM constant use == for int identity, remove duplicate test reformat using yapf clean up yapf formatting hiccups * Treat game as draw if both parties report defeat * yapf cleanup and a few naming refactors * suggested use of player_factory * review suggestions Co-authored-by: Askaholic <askaholic@protonmail.com>

Askaholic reviewed Sep 21, 2019

View reviewed changes

server/config.py Outdated Show resolved Hide resolved

Askaholic reviewed Sep 21, 2019

View reviewed changes

server/games/game_rater.py Outdated Show resolved Hide resolved

server/games/game_rater.py Outdated Show resolved Hide resolved

tests/unit_tests/test_game_rater.py Show resolved Hide resolved

tests/unit_tests/test_game_rater.py Outdated Show resolved Hide resolved

Askaholic reviewed Sep 24, 2019

View reviewed changes

server/games/game_rater.py Outdated Show resolved Hide resolved

tests/unit_tests/test_game_rater.py Outdated Show resolved Hide resolved

tests/unit_tests/test_game_rater.py Outdated Show resolved Hide resolved

Askaholic requested changes Dec 18, 2019

View reviewed changes

server/games/game.py Outdated Show resolved Hide resolved

server/games/game.py Outdated Show resolved Hide resolved

server/games/game.py Outdated Show resolved Hide resolved

server/games/game_rater.py Outdated Show resolved Hide resolved

Askaholic mentioned this pull request Dec 19, 2019

Wrong game result #283

Closed

Askaholic reviewed Dec 21, 2019

View reviewed changes

cleborys and others added 5 commits January 9, 2020 16:16

failing test for issue 485 bug

a056ee4

First refactor of compute_rating. Does not handle games with results …

18a26b7

…reported UNKNOWN.

compute_rating moved to GameRater class

3c0fdb2

Apply suggestions from code review

c0aebf7

Co-Authored-By: Askaholic <askaholic907@gmail.com>

Add double victory test, make FFA_TEAM constant

80bda30

use == for int identity, remove duplicate test reformat using yapf clean up yapf formatting hiccups

cleborys force-pushed the trueskill-ranks branch from 084f5b7 to 80bda30 Compare January 9, 2020 15:41

Treat game as draw if both parties report defeat

3e4f8d6

cleborys added 2 commits January 9, 2020 21:44

yapf cleanup and a few naming refactors

22dae1a

suggested use of player_factory

2734a5d

Askaholic reviewed Jan 9, 2020

View reviewed changes

server/config.py Outdated Show resolved Hide resolved

server/games/game_rater.py Outdated Show resolved Hide resolved

cleborys and others added 2 commits January 9, 2020 23:37

review suggestions

2c46e4a

Merge branch 'develop' into trueskill-ranks

8d3b37f

Askaholic merged commit e591a8c into FAForever:develop Jan 17, 2020

Askaholic mentioned this pull request Jan 17, 2020

Ctrl K for Draw in Global games registering as a win FAForever/fa#2737

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor Game.compute_rating #496

Refactor Game.compute_rating #496

cleborys commented Sep 21, 2019

Askaholic left a comment

Askaholic commented Dec 18, 2019

cleborys commented Dec 18, 2019 •

edited

Loading

Askaholic commented Dec 19, 2019

Askaholic left a comment

Askaholic commented Dec 22, 2019 •

edited

Loading

cleborys commented Jan 9, 2020

Askaholic commented Jan 9, 2020

Askaholic commented Jan 17, 2020

Refactor Game.compute_rating #496

Refactor Game.compute_rating #496

Conversation

cleborys commented Sep 21, 2019

Askaholic left a comment

Choose a reason for hiding this comment

Askaholic commented Dec 18, 2019

cleborys commented Dec 18, 2019 • edited Loading

Askaholic commented Dec 19, 2019

Askaholic left a comment

Choose a reason for hiding this comment

Askaholic commented Dec 22, 2019 • edited Loading

cleborys commented Jan 9, 2020

Askaholic commented Jan 9, 2020

Askaholic commented Jan 17, 2020

cleborys commented Dec 18, 2019 •

edited

Loading

Askaholic commented Dec 22, 2019 •

edited

Loading