Skip to content

Conversation

@gabrfarina
Copy link
Contributor

@gabrfarina gabrfarina commented Sep 25, 2020

This draft pull request tracks progress on the implementation of the Battleship game. One of the goals of this implementation is to be able to exactly match the game generator that was used to generate game instances for some recent papers on extensive-form correlated equilibria. This particular version of the game was introduced in here.

Things done:

The pull request is pretty big, and given that build is green, I think it would make sense to have a preliminary review to see if everything is on the right track. It would also be great if one of the maintainers could help shed some light on my last three things above.

r? @lanctot

@michalsustr
Copy link
Collaborator

This is great, thank you! I willy likely use this game as well :)

Maybe I can say something about the last points:

Figure out if implementing InformationStateTensor is important (so far, only InformationStateString is implemented)

It is not "important", as in the game will work completely fine without it. It would be however useful for algorithms that make use of learning (neural networks).

Figure out a plan for registering the game as zero-sum only when the loss_multiplier parameter is == 1.0. Is this important?

You can first register it as general sum and "override" the type, like here: https://github.com/deepmind/open_spiel/blob/master/open_spiel/game_transforms/turn_based_simultaneous_game.cc#L267

Figure out what the observers are and whether it makes sense to have them for this implementation

Observers are a recent addition to the library. Basically we ended up with a bunch of methods like (InformationState|Observation|PublicObservation)(String|Tensor) with respective tensor layouts, etc., and the API for State started to be cluttered. The plan is that these existing methods will become deprecated, once all games implement their own observers. If you do not want to implement tensor observations (which I would recommend however), you can skip it, and just implement the current string variant you already have for a BattleShipObserver. You can check out KuhnObserver / LeducObserver / GoofspielObserver (that one is still a PR, but should be merged soon). The usage of observer allows us to parametrize the observations based on the information revealed, and even use custom observer parameters. Example of leduc poker.

@finbarrtimbers
Copy link
Contributor

Ah, this is fantastic, thanks for doing this. I have some research that will use this once it's implemented. I was looking into implementing this myself, so thanks for saving me the work!

@gabrfarina
Copy link
Contributor Author

Wow, thanks so much for the enthusiasm :-)

@michalsustr Thanks for the information about observers, I think I see now. I think it is definitely worthwhile to add the tensor-based representations, though at this point it would make a lot of sense to create an observer to get ahead of the deprecation wave for the non-observer-based methods.

I think that for this game in particular it is OK for InformationState and Observation to be the same, so perhaps only having one observer will be fine. I'm not sure about PublicObservation(String|Tensor) though, do you have a pointer for it? From the name, I guess it would just be a sequence of (player, shot, shot outcome) triplets, with no information about where the players placed their ships?

Thanks for pointing out to an example of how to override the game type, I will add something like that. I think it's important to do that type of override just in case some code down the line checks that the game is zero-sum by only looking into the trait object before doing anything, and would therefore fail on Battleship even when the game is actually zero sum.

Copy link
Collaborator

@lanctot lanctot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you generate a playthrough?

@lanctot
Copy link
Collaborator

lanctot commented Sep 26, 2020

Very nice implementation.

Generally I like the incremental plan for large code changes, so we can even accept this when you're happy with the basic implementation and you can submit more PRs later. Or if it's easier you can just keep adding to this one. Your call.

A few follow-ups:

  • I think it's ok to defer the more general observers until later. The basic game implementation is already large enough, and not many games have them yet. Much easier to get started just implementing the ones you have done now, and you can change them later. I kind of like having the simple ones for the simpler games. We have not really discussed deprecating these as a team yet. I think it will make sense to retain them simpler ones for games where the heavier machinery is not needed, or as a first step so it's not such a burden for writing new games (exactly the situation you're in), and to preserve backward-compatibility. Either way, my recommendation is to start with these.
  • Similarly for the InformationStateTensor and ObservationTensor: you can just add these later. Or I might!

@gabrfarina
Copy link
Contributor Author

Filed #382. Currently blocked on that issue.

@gabrfarina
Copy link
Contributor Author

Generally I like the incremental plan for large code changes, so we can even accept this when you're happy with the basic implementation and you can submit more PRs later. Or if it's easier you can just keep adding to this one. Your call.

I don't mind adding all the low hanging fruits to this implementation, including some tests. Before marking it as ready, I would also like to make sure that this implementation can reproduce some of the results from [1].

A few follow-ups:

  • I think it's ok to defer the more general observers until later. The basic game implementation is already large enough, and not many games have them yet.

Sounds good! It sounds like adding Observers to this game would be smooth sailing, so for the time being, I will focus on the "basic" methods like you suggested.

  • Similarly for the InformationStateTensor and ObservationTensor: you can just add these later. Or I might!

If I understood correctly how these work from the doc in spiel.h, adding those would be pretty easy --- just allocate a big enough vector, and set a few entries to 1. I'm happy to take a first crack. Do you have automatic ways of checking that the String and Tensor implementations are in sync?

@gabrfarina
Copy link
Contributor Author

gabrfarina commented Sep 26, 2020

I opened #386 and applied the fix locally to this branch. Right now the basic tests pass.

@gabrfarina
Copy link
Contributor Author

gabrfarina commented Sep 26, 2020

Tests are now green locally 🎉

The only test that was failing was this
https://github.com/deepmind/open_spiel/blob/91da643188fcacd7a57bf5a64ee392d6c919dcfc/open_spiel/integration_tests/api_test.py#L461-L471

It looks like several games are on the same boat so for now I've just added battleship to the list. @lanctot have you been having many false positives with that test --- is it reasonable to have battleship opt out without much investigation? Do you have a feeling for why it might fail on Battleship in particular?

@lanctot
Copy link
Collaborator

lanctot commented Sep 26, 2020

Yep this is ok for now. To be honest, I don't remember.. @michalsustr can you clarify?

@gabrfarina
Copy link
Contributor Author

I've implemented ObservationString as discussed, so I'm going to close the last unresolved comment from @lanctot's preliminary review.

On second thought, I think that it would make sense for everybody if InformationStateTensor was added in a followup PR. The changeset is already pretty large, and implementing that method can be its own standalone task with its own review. So I think for now I will only implement the string version. The tensor version could even be implemented through an observer, as @michalsustr suggested.

At this point the only big task that remains is to write some tests and make sure that the games generated by this implementation match the implementation used in [1].

@michalsustr
Copy link
Collaborator

Do you have a feeling for why it might fail on Battleship in particular?

I would have to go through the code more carefully, I might take a look tomorrow. I added a few comments though. What was the error message for the failed test?

have you been having many false positives with that test ?

I believe the test is fine, it's just that those games are indeed not implementing observations correctly. I didn't have the capacity to fix all the games back then. I think the most common case was that information state / observation was spanning States with different MoveNumbers, because it did not distinguish between encoding observations when the requested player was acting, or the opponent(s) were.

@gabrfarina
Copy link
Contributor Author

gabrfarina commented Sep 27, 2020

@michalsustr this is the error:

AssertionError: False is not true : Observation consistency failed.
It should hold that partition of 'information_state_string' is Relation.SUBSET_OR_EQUALS of 'move_number'
Problem occurred while evaluating action history '2 2' for player 0.
The state representation is:
/0:h_1_0/1:h_1_0
--------------------------------------------------------------------------------
Detailed information:
The set of histories that have the same 'information_state_string':
{'2', '2 0', '2 2'}
The set of histories that have the same 'move_number':
{'0 0', '0 2', '2 0', '2 2'}
The set difference is:
{'0 0', '0 2'}
The problematic information_state_string is '/h_1_0'
The problematic move_number is '2'

By default the board is 2 x 2, and each player has exactly one ship of length 2. The action string h_1_0 means that the player has placed their ship horizontally, starting from the leftmost corner in (row, col) = (1, 0).

If you want to poke around with the implementation, you can clone my battleship branch and git revert 60dd0fe which was the comment blacklisting battleship from the observation consistency tests.

EDIT Not sure if it's relevant, but the action indices are "reused" in the implementation. So for example action 0 could mean "place a ship horizontally starting from (0, 0)" or "shoot at cell (0, 0)" depending on the state. Is it possible that the consistency check would be confused by this?

@gabrfarina
Copy link
Contributor Author

@lanctot

Can you generate a playthrough?

What's the best way to do that? I noticed that integration_tests contains a few playthroughs. Should we add one for Battleship? How did you generate those files?

@lanctot
Copy link
Collaborator

lanctot commented Sep 27, 2020

What's the best way to do that? I noticed that integration_tests contains a few playthroughs. Should we add one for Battleship? How did you generate those files?

There is a script to do it automatically: https://github.com/deepmind/open_spiel/blob/master/open_spiel/scripts/generate_new_playthrough.sh. Just run it with the game string as the argument.

@lanctot
Copy link
Collaborator

lanctot commented Sep 28, 2020

Ok I did several updates to the master branch, you can rebase now. Note that I slightly changed the formatting of the doubles to remove unnecessary trailing zeroes (new function in spiel_utils.cc called FormatDouble).

@lanctot
Copy link
Collaborator

lanctot commented Oct 6, 2020

If you have any remaining FIXMEs in the code, can you change them all to TODO? They break the import :)

@gabrfarina
Copy link
Contributor Author

Travis failed again on oh_hell. Last time, when @lanctot manually restarted the CI job the error disappeared. I was thinking that it could be possible that the oh_hell test would hit a time limit and that would make the test flaky, but I remember seeing that the test was passing pretty fast on other runs?

@michalsustr
Copy link
Collaborator

Would really prefer state-independent IDs, both to be more consistent with other games and to not hinder the neural net / RL applications

Do you think it would make sense to check this for games in general in the api_test.py ?

@lanctot
Copy link
Collaborator

lanctot commented Oct 6, 2020

Travis failed again on oh_hell. Last time, when @lanctot manually restarted the CI job the error disappeared. I was thinking that it could be possible that the oh_hell test would hit a time limit and that would make the test flaky, but I remember seeing that the test was passing pretty fast on other runs?

Yeah I think something is wrong, happened in another PR too. Thanks for flagging, I've opened an issue: #396

@lanctot
Copy link
Collaborator

lanctot commented Oct 6, 2020

Do you think it would make sense to check this for games in general in the api_test.py ?

Hmm.. how could you?

Also, we didn't strictly disallow it, so it's not really failing to meet the API standard, more of a convention (but maybe we should document it better).

@michalsustr
Copy link
Collaborator

michalsustr commented Oct 6, 2020

Hmm.. how could you?

Get legal action ids from sampled states, call action to string and check that ids have consistently the same strings. But yeah, maybe it's not such a great idea :) Especially if it is not required for games to do right now.

@lanctot
Copy link
Collaborator

lanctot commented Oct 7, 2020

Get legal action ids from sampled states, call action to string and check that ids have consistently the same strings. But yeah, maybe it's not such a great idea :) Especially if it is not required for games to do right now.

Oh yeah I definitely do not want to force state-independent string representations. Many games have game-specific move descriptions (e.g. chess, backgammon) and we use State::ActionToString to turn the opaque action ids into something interpretable and meaningful for that state and game.

@lanctot
Copy link
Collaborator

lanctot commented Oct 7, 2020

@gabrfarina @finbarrtimbers is this good to go? I'd like to import it so we can include it in Monday's update.

Seems good AFAICT but it's still showing a change requested.

@gabrfarina
Copy link
Contributor Author

I'll add dimension tests today to validate the table with game sizes in battleship.h. If everything goes well, this will be ready in a few hours :-)

@finbarrtimbers
Copy link
Contributor

Lgtm!

@gabrfarina
Copy link
Contributor Author

Now blocked on #399 . Marking the PR as draft again to prevent accidental merging. Once #399 is solved, hopefully the new tests on the game dimensions will pass.

@gabrfarina gabrfarina marked this pull request as draft October 7, 2020 19:31
@lanctot
Copy link
Collaborator

lanctot commented Oct 7, 2020

Now blocked on #399 . Marking the PR as draft again to prevent accidental merging. Once #399 is solved, hopefully the new tests on the game dimensions will pass.

This is really easy, you just need to add the shots info to your ToString (but please keep the nice-looking board!) More detail in my response in the issue above.

Edit: This property of ToString is also used in a few other places: in the best response code, and as the default implementation of State::operator==. I will also note to clarify this in the docs if it's not clear. And thanks for your patience!

@gabrfarina
Copy link
Contributor Author

Thanks @lanctot . Based on the discussion in #399 I've reverted the old implementation of ToString, and added the new tests about the game sizes. The game sizes are now computed correctly, and this PR's Battleship generator matches exactly the sizes of the games generated by my generator 🎉

So, I'm now more confident that things are in a good shape.

Marking as ready for review again.

@gabrfarina gabrfarina marked this pull request as ready for review October 7, 2020 21:58
@gabrfarina
Copy link
Contributor Author

Seems good AFAICT but it's still showing a change requested.

@lanctot The change request was created by you a while ago, but I've since resolved it so I think you can go ahead and remove it if it all looks good to you? :-)

@lanctot
Copy link
Collaborator

lanctot commented Oct 8, 2020

@lanctot The change request was created by you a while ago, but I've since resolved it so I think you can go ahead and remove it if it all looks good to you? :-)

Oops, yes there has been a lot of turnaround and comments. I'm importing it now. Thanks again, this PR was useful to us in many ways.

@gabrfarina
Copy link
Contributor Author

Thanks again, this PR was useful to us in many ways.

My pleasure, this has been a fun PR, and it was great to see all this enthusiasm :-)
Thanks all again for the help @lanctot @michalsustr @finbarrtimbers

I'm happy to stick around to answer any question / request for change.
150+ comments and 50+ commits in, I'm excited to see this get merged! 🚀 🚀 🚀

@lanctot lanctot added the imported This PR has been imported and awaiting internal review. Please avoid any more local changes, thanks! label Oct 8, 2020
This was referenced Oct 9, 2020
@OpenSpiel OpenSpiel merged commit 011f850 into google-deepmind:master Oct 11, 2020
@finbarrtimbers
Copy link
Contributor

Hey @gabrfarina, do you have any plans to implement InformationStateTensor? If not, I might take a crack at it as I'd like to run some experiments in Battleship.

@gabrfarina
Copy link
Contributor Author

@finbarrtimbers Thanks for the ping 🙂 I'm a bit swamped with an internal deadline by the end of next week so if you could take a crack at it that would be great, given that I'm also not super familiar with the best practices around that method (will it just be one-hot encoding for each turn? Or something more compact?). If that helps, I'm definitely happy to stick around to help / answer any question you might have!

Also, I will cut out a few hours to port over the Sheriff game generator soon...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes imported This PR has been imported and awaiting internal review. Please avoid any more local changes, thanks!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants