Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Game.compute_rating #496

Merged
merged 10 commits into from
Jan 17, 2020
Merged

Conversation

cleborys
Copy link
Member

Mostly sparked by issue #485.
Based on PR #495 so includes a lot of commits from there.
I'll rebase once that's merged, so it will only include the last two commits.

server/config.py Outdated Show resolved Hide resolved
server/games/game_rater.py Outdated Show resolved Hide resolved
server/games/game_rater.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game_rater.py Show resolved Hide resolved
tests/unit_tests/test_game_rater.py Outdated Show resolved Hide resolved
server/games/game_rater.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game_rater.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game_rater.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@Askaholic Askaholic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yapf can be helpful, but I'd always double check what it did because sometimes it just isn't worth it. I've been reading about black and it seems pretty good. The thing I'm not convinced by though is how it formats imports and will each imported item on its own line.

Note to self for testing:
We should test to see what happens in the scenario where one player ctrl-k's in order to destroy the other player's ACU. IIRC there was an issue with this a while back where the player who ctrl-k'd would always be awared a win instead of a draw.

server/games/game.py Outdated Show resolved Hide resolved
server/games/game.py Outdated Show resolved Hide resolved
server/games/game.py Outdated Show resolved Hide resolved
server/games/game_rater.py Outdated Show resolved Hide resolved
@Askaholic
Copy link
Collaborator

For rebasing you should just need to:

  1. Pull latest develop into your local develop branch
  2. Checkout this branch
  3. Run git rebase develop

Then resolve conflicts and do git rebase --continue until everything has been resolved.

@cleborys
Copy link
Member Author

cleborys commented Dec 18, 2019

If I rebase onto develop, it replays all commits in the branch, including ones that were already merged in other PRs and that produces a lot of conflicts because at the start I merged an earlier copy of Wesmania's branch into my local changes that weren't merged into develop yet. I think the correct way would be to cherry-pick all commits after d2d8f9a, but my first attempt wasn't successful. I can give it a shot tomorrow, but I was assuming we will squash anyways so the messed-up history would at least not hit develop?

@Askaholic
Copy link
Collaborator

No it wont affect develop directly. It just means if you ever go back and look at the PR there will be a bunch of extra commits in it. And IDK how it would affect a revert, but that's unlikely. If there aren't any conflicts, or not many conflicts then I tend to rebase out of OCD at this point.

If you want to cherry pick only a few commits I think you can use git rebase -i which will let you pick exactly which commits you want to replay.

@Askaholic Askaholic mentioned this pull request Dec 19, 2019
Copy link
Collaborator

@Askaholic Askaholic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the logic is pretty solid; you got lots of tests which is great! If you are available to run a few test games at some point let me know.

Yapf has worked nicely in many places, but also not so nicely in others. Would be nice to tidy up the rough spots a bit.

server/games/game.py Outdated Show resolved Hide resolved
server/games/game.py Outdated Show resolved Hide resolved
server/games/game.py Outdated Show resolved Hide resolved
server/games/game.py Outdated Show resolved Hide resolved
server/games/game.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game.py Outdated Show resolved Hide resolved
tests/unit_tests/test_game.py Outdated Show resolved Hide resolved
@Askaholic
Copy link
Collaborator

Askaholic commented Dec 22, 2019

Ok so I tested a game where two acus kill eachother with the ACU explosion. Here is what happened:

First scores getting reported, army 1 reports an acu kill (score of 1):

TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'EnforceRating', 'args': [], 'target': 'game'}
TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'score 1'], 'target': 'game'}
INFO     Dec 22  00:00:30 CustomGame.8572080             161739 reported result for army 1: score 1
TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'EnforceRating', 'args': [], 'target': 'game'}
TRACE    Dec 22  00:00:30 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'score 1'], 'target': 'game'}
INFO     Dec 22  00:00:30 CustomGame.8572080             166198 reported result for army 1: score 1
TRACE    Dec 22  00:00:31 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'defeat -10'], 'target': 'game'}
INFO     Dec 22  00:00:31 CustomGame.8572080             166198 reported result for army 1: defeat -10
TRACE    Dec 22  00:00:31 LobbyConnection                <<: {'command': 'GameResult', 'args': [2, 'defeat -10'], 'target': 'game'}
INFO     Dec 22  00:00:31 CustomGame.8572080             166198 reported result for army 2: defeat -10
TRACE    Dec 22  00:00:31 LobbyConnection                <<: {'command': 'GameResult', 'args': [1, 'defeat -10'], 'target': 'game'}
INFO     Dec 22  00:00:31 CustomGame.8572080             161739 reported result for army 1: defeat -10

Then when rating happens

ERROR    Dec 22  00:02:25 CustomGame.8572080             Error during game end: Attempted to rate game with inconsistent outcome. Teams: defaultdict(<class 'list'>, {1: [Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)), Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004))]}) Outcomes by player: {Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)): <GameOutcome.DEFEAT: 'defeat'>, Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): <GameOutcome.DEFEAT: 'defeat'>} Outcomes by team: [<GameOutcome.DEFEAT: 'defeat'>, <GameOutcome.DEFEAT: 'defeat'>]
Traceback (most recent call last):
  File "/code/server/games/game.py", line 448, in on_game_end
    await self.rate_game()
  File "/code/server/games/custom_game.py", line 19, in rate_game
    new_ratings = self.compute_rating(RatingType.GLOBAL)
  File "/code/server/games/game.py", line 849, in compute_rating
    return rater.compute_rating()
  File "/code/server/games/game_rater.py", line 34, in compute_rating
    ranks = self._ranks_from_team_outcomes(team_outcomes)
  File "/code/server/games/game_rater.py", line 119, in _ranks_from_team_outcomes
    f"Attempted to rate game with inconsistent outcome. Teams: {self._players_by_team} Outcomes by player: {self._outcome_by_player} Outcomes by team: {team_outcomes}"
server.games.game_rater.GameRatingError: Attempted to rate game with inconsistent outcome. Teams: defaultdict(<class 'list'>, {1: [Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)), Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004))]}) Outcomes by player: {Player(Gardelouve, 161739, (1301.53, 91.3757), (27.0048, 40.9483)): <GameOutcome.DEFEAT: 'defeat'>, Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): <GameOutcome.DEFEAT: 'defeat'>} Outcomes by team: [<GameOutcome.DEFEAT: 'defeat'>, <GameOutcome.DEFEAT: 'defeat'>]

In this situation it should be counted as a draw. So maybe we need to add that as an edge case. If both armies reported defeat then consider the game a draw.

EDIT:
Actually it looks like the error happened because it thought the game results were unknown, even though there was a defeat reported for each player.

cleborys and others added 5 commits January 9, 2020 16:16
Co-Authored-By: Askaholic <askaholic907@gmail.com>
use == for int identity, remove duplicate test

reformat using yapf

clean up yapf formatting hiccups
@cleborys
Copy link
Member Author

cleborys commented Jan 9, 2020

Thank you for testing that! I changed the convention to treat games where both parties report defeat as draws.
Could you elaborate on why you think it threw because of unknown outcomes? Both logs you posted seem to suggest GameOutcome.DEFEAT for both players.

@Askaholic
Copy link
Collaborator

Looking at it now I guess I just interpreted it wrong. It even logged what the outcomes were and they were simply defeat for both teams.

server/config.py Outdated Show resolved Hide resolved
server/games/game_rater.py Outdated Show resolved Hide resolved
@Askaholic
Copy link
Collaborator

Looks correct now:

INFO     Jan 17  06:21:06 CustomGame.8572100             Game finished normally
DEBUG    Jan 17  06:21:06 CustomGame.8572100             Saving scores from game 8572100
INFO     Jan 17  06:21:06 CustomGame.8572100             Result for army 2, player: Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): -10
INFO     Jan 17  06:21:06 GameResults                    Conflicting scores (Counter({1: 2, -10: 2})) reported for game 8572100
INFO     Jan 17  06:21:06 CustomGame.8572100             Result for army 1, player: Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): 1
INFO     Jan 17  06:21:06 CustomGame.8572100             Score for player Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): -10
INFO     Jan 17  06:21:06 CustomGame.8572100             Score for player Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): 1
DEBUG    Jan 17  06:21:06 GameRater                      Rating groups: [{Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): trueskill.Rating(mu=1226.480, sigma=62.806)}, {Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): trueskill.Rating(mu=1463.550, sigma=89.350)}]
DEBUG    Jan 17  06:21:06 GameRater                      Ranks: [0, 0]
INFO     Jan 17  06:21:06 CustomGame.8572100             Saving rating change stats
DEBUG    Jan 17  06:21:06 CustomGame.8572100             New global rating for Player(Rikai, 58635, (1226.48, 62.806), (1018.36, 67.6586)): trueskill.Rating(mu=1233.975, sigma=62.584)
DEBUG    Jan 17  06:21:06 CustomGame.8572100             New global rating for Player(Askaholic, 166198, (1463.55, 89.3498), (1469.66, 214.004)): trueskill.Rating(mu=1448.571, sigma=87.021)

@Askaholic Askaholic merged commit e591a8c into FAForever:develop Jan 17, 2020
Brutus5000 pushed a commit that referenced this pull request Jan 27, 2020
* failing test for issue 485 bug

* First refactor of compute_rating. Does not handle games with results reported UNKNOWN.

* compute_rating moved to GameRater class

* Apply suggestions from code review

Co-Authored-By: Askaholic <askaholic907@gmail.com>

* Add double victory test, make FFA_TEAM constant

use == for int identity, remove duplicate test

reformat using yapf

clean up yapf formatting hiccups

* Treat game as draw if both parties report defeat

* yapf cleanup and a few naming refactors

* suggested use of player_factory

* review suggestions

Co-authored-by: Askaholic <askaholic@protonmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants