Implement an efficient CFR solver. #393

michalsustr · 2020-10-01T21:31:11Z

This PR is a work-in-progress for implementing efficient CFR solvers for infostate trees.
@elkhrt @gabrfarina I would appreciate your feedback on the work.

Implement infostate tree representation.
Make sure the construction also works for a sequence of states where the same player acts repeatedly: decision and observation nodes must alternate correctly, so that the number of a decision node's children equals the number of actions at that node. The child index in the vector should also correspond to the legal action index.
Add more infostate tree tests. I am not sure how to write them so that they are not brittle. Testing tensor observations is not a good idea at the moment, as they are still changing quite a bit.
Implement CFR that will build infostate trees for both players, and then run iterations only on these lightweight trees.
Run this CFR on the same suite of tests as the current implementation.
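
A minimal sketch of the construction invariant described in the list above, not the code from this PR. `NodeType`, `Node`, `MakeDecisionNode` and `IsWellFormed` are hypothetical names chosen for illustration. The sketch assumes that every child of a decision node is a non-decision node and that the i-th child corresponds to the i-th legal action.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds; the enum used in the PR may differ.
enum class NodeType { kDecision, kObservation, kTerminal };

struct Node {
  NodeType type = NodeType::kObservation;
  std::string label;               // Illustrative only.
  std::vector<int> legal_actions;  // Non-empty only for decision nodes.
  std::vector<std::unique_ptr<Node>> children;

  Node* AddChild(NodeType child_type, std::string child_label) {
    children.push_back(std::make_unique<Node>());
    Node* child = children.back().get();
    child->type = child_type;
    child->label = std::move(child_label);
    return child;
  }
};

// Creates a decision node whose i-th child belongs to legal_actions[i].
// Every child is an observation node, so decision and observation nodes
// alternate even if the same player acts several times in a row.
std::unique_ptr<Node> MakeDecisionNode(std::string label,
                                       std::vector<int> legal_actions) {
  auto node = std::make_unique<Node>();
  node->type = NodeType::kDecision;
  node->label = std::move(label);
  node->legal_actions = std::move(legal_actions);
  for (int action : node->legal_actions) {
    node->AddChild(NodeType::kObservation,
                   "after action " + std::to_string(action));
  }
  assert(node->children.size() == node->legal_actions.size());
  return node;
}

// Checks the two assumed invariants on the whole subtree: a decision node has
// one child per legal action, and none of its children is a decision node.
bool IsWellFormed(const Node& node) {
  if (node.type == NodeType::kDecision) {
    if (node.children.size() != node.legal_actions.size()) return false;
    for (const auto& child : node.children) {
      if (child->type == NodeType::kDecision) return false;
    }
  }
  for (const auto& child : node.children) {
    if (!IsWellFormed(*child)) return false;
  }
  return true;
}
```

To model the same player acting twice in a row, the second decision node is attached below the observation child of the first one, so the policy vector at each decision node can be indexed directly by the child index.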

I expect this CFR implementation to be much faster: making rollouts in the game is slow, and looking up CFRInfoStateValues in a map indexed by strings is good for understanding the algorithm but not for efficiency.
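
To illustrate the efficiency argument, here is a hedged sketch of regret matching over values stored in a flat vector of nodes instead of a string-keyed map. `DecisionNodeValues`, `RegretMatching` and `CurrentPolicies` are hypothetical names, not the types from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-node storage: one cumulative regret per legal action.
struct DecisionNodeValues {
  std::vector<double> cumulative_regrets;
};

// Regret matching: play each action proportionally to its positive cumulative
// regret, or uniformly if no action has positive regret.
std::vector<double> RegretMatching(const DecisionNodeValues& values) {
  const std::size_t n = values.cumulative_regrets.size();
  assert(n > 0);
  double positive_sum = 0.0;
  for (double r : values.cumulative_regrets) positive_sum += std::max(r, 0.0);
  std::vector<double> policy(n);
  for (std::size_t i = 0; i < n; ++i) {
    policy[i] = positive_sum > 0.0
                    ? std::max(values.cumulative_regrets[i], 0.0) / positive_sum
                    : 1.0 / n;
  }
  return policy;
}

// Computes the current policy of every decision node with one linear pass
// over contiguous storage. No string keys are built or hashed.
std::vector<std::vector<double>> CurrentPolicies(
    const std::vector<DecisionNodeValues>& nodes) {
  std::vector<std::vector<double>> policies;
  policies.reserve(nodes.size());
  for (const DecisionNodeValues& node : nodes) {
    policies.push_back(RegretMatching(node));
  }
  return policies;
}
```

The point of the sketch is only the access pattern: once the tree is built, an iteration touches nodes by index and never calls into the game.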

Some things I decided not to do (in this PR):

Allow pruning of observation nodes. Because we cannot be sure about the structure of the infostate tree until its construction is finished, some observation nodes can be redundant. We can optionally prune them to get a smaller tree.
Identify infostate nodes by a sequence (sequence-form). I am not sure whether a vector of ints is good enough, but if more information is needed, you can always use these to traverse the infostate trees.
Compute sequence tuples for any State. I don't need that for my purposes, but I can add it if @gabrfarina would like to have it.
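
A possible shape for the optional pruning, as a sketch under a stated assumption: an observation node is treated as redundant when its only child is another observation node. The PR does not define redundancy, and `Node`, `PruneObservationChains` and `CountNodes` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

enum class NodeType { kDecision, kObservation, kTerminal };

struct Node {
  NodeType type = NodeType::kObservation;
  std::vector<std::unique_ptr<Node>> children;
};

// Splices out every observation node whose only child is also an observation
// node. The order of children is preserved, so a decision node's child index
// still matches its legal action index after pruning.
void PruneObservationChains(Node& node) {
  for (auto& child : node.children) {
    assert(child != nullptr);
    while (child->type == NodeType::kObservation &&
           child->children.size() == 1 &&
           child->children[0]->type == NodeType::kObservation) {
      // Move the grandchild out first; assigning it then frees the old child.
      std::unique_ptr<Node> grandchild = std::move(child->children[0]);
      child = std::move(grandchild);
    }
    PruneObservationChains(*child);
  }
}

// Counts the nodes in the subtree, including the root.
int CountNodes(const Node& node) {
  int count = 1;
  for (const auto& child : node.children) count += CountNodes(*child);
  return count;
}
```

Because only observation-to-observation chains are collapsed, a decision node never ends up with a decision node as a direct child.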

I outlined a number of welcome contributions in the code.

elkhrt

Could you revert the changes to the other algorithms and just add the new algorithm in this PR?

elkhrt · 2020-10-23T09:11:28Z

open_spiel/algorithms/infostate_cfr.h

+// A helper struct that allows to propagate reach probs / cf values
+// up and down the tree.
+struct InfostateTreeValuePropagator {
+  // Tree and the tree structure information. These must not change!


This seems ugly. Shouldn't these values be parameters to the propagation function rather than members here?
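
One way to read this suggestion, as a sketch only: keep the immutable tree structure in one object and pass the values into the propagation function. `TreeStructure` and `PropagateUp` are hypothetical names, and plain summation stands in for the actual reach probability / cf value update.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Immutable structure: parent index of every node. Nodes are stored in
// topological order (each parent before its children); the root is node 0
// and has parent -1.
struct TreeStructure {
  std::vector<int> parent;
};

// Adds each node's value to its parent, from the last node up to the root.
// The values are a parameter, so one structure can serve many value buffers.
void PropagateUp(const TreeStructure& tree, std::vector<float>& values) {
  assert(values.size() == tree.parent.size());
  for (std::size_t i = tree.parent.size(); i-- > 1;) {
    assert(tree.parent[i] >= 0 &&
           static_cast<std::size_t>(tree.parent[i]) < i);
    values[tree.parent[i]] += values[i];
  }
}
```

With this split, the "must not change" requirement applies only to `TreeStructure`, and callers own the value buffers.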

elkhrt · 2020-10-23T09:14:16Z

open_spiel/utils/action_view.h

+  // Collects legal actions at the specified state.
+  ActionView(const State& state);
+
+  // Provides an iterator over flattened actions. This is equivalent to calling


This looks like unnecessary duplication.

elkhrt · 2020-10-23T09:15:27Z

open_spiel/integration_tests/api_test.py

 if __name__ == "__main__":
  absltest.main()
+
+# TODO checks


Please make these TODOs clearer. You can link to github issues if you want, or just write a longer description of each in-line here.

elkhrt · 2020-10-23T09:16:24Z

open_spiel/algorithms/infostate_cfr.h

+  }
+
+ private:
+  std::array<InfostateTreeValuePropagator, 2> propagators_;


What's the array for?

elkhrt · 2020-10-23T09:16:42Z

open_spiel/algorithms/infostate_cfr.h

+
+ private:
+  std::array<InfostateTreeValuePropagator, 2> propagators_;
+  // Map from player 1 index (key) to player 0 (value).


michalsustr · 2020-10-23T12:35:09Z

Thank you for the comments! I will go over them.

Since I have a lot of other code piled up locally, I was thinking about how to incorporate it efficiently. My plan is to split it up into smaller, better-structured PRs. The code mostly has a linear chain of dependencies, and I will post new PRs as we move along the chain. This will allow us to incorporate the review feedback and keep a cleaner git history. I appreciate the quality of the code review, so please do not feel rushed when reviewing the PRs. This change only means that the contributions from our side will come more slowly.

I will close the PR for now, but I will address the comments in the new smaller PRs. Thanks again!

Implement base infostate tree.

bab4a32

googlebot added the cla: yes label Oct 1, 2020

michalsustr changed the title ~~Implement base infostate tree.~~ Implement an efficient CFR solver. Oct 1, 2020

michalsustr mentioned this pull request Oct 2, 2020

Implement Battleship game #376

Merged

14 tasks

michalsustr added 6 commits October 2, 2020 14:50

Add terminal value.

79d827b

Make sure to properly make observation / decision nodes. Update tests.

feb3b95

Expose legal actions in the infostate tree.

2a4ba14

Add couple of checks.

708c7e5

Add identification of sequences.

6d9be9d

Update comments.

3f57900

findmyway mentioned this pull request Oct 4, 2020

Improve CFR JuliaReinforcementLearning/ReinforcementLearningZoo.jl#99

Merged

7 tasks

michalsustr added 19 commits October 5, 2020 14:16

Get the Tree from a Node.

8aa6837

Refactor infostate tree - call a MakeNode function.

6bc84a7

Move constructor order.

f2054a1

Track if the infostate tree is balanced.

e432dbb

Track if the infostate tree is balanced.

4752bdc

Return reference for Root().

0c7d1b0

Add recomputation of the tree balance.

21456dd

Add code to balance the tree.

a625df0

Simplify the code.

74f7bf9

Add a full Kuhn Poker rebalancing test.

ecbb030

Redesign tree/node with CRTP

817fcc7

Add more information into CFRNode

75353d3

Properly use parent node pointer and legal actions

4185460

Add infostate implications.

3773e8a

Add support for simultaneous move nodes.

f3ba202

Supply the player to get player's policy. A fix for sim move games.

779188c

Distinguish between terminal value and terminal chance reach probs.

ec3471c

Update comments, add [[nodiscard]] when returning a templated class.

b7c67ea

Rename enum.

c07cdef

michalsustr added 20 commits October 15, 2020 14:02

Store corresponding states for leaf nodes.

d8a1ba6

Add is leaf node method.

79a978a

Move const requirements.

d58fd04

Move files so they do not share the same prefix.

3156bfd

Add a basic depth-limited tree test.

929f54a

Add leaves iterator.

7232622

Add depth-limited tests for all depths.

30dc360

Rename depth limit to move limit.

cfdd68e

Use more explicit child iterator.

c0aa4de

Make more use of IsLeafNode() method. Formatting changes.

8b96aa6

Fix travis error messages - forward the computation. Maybe this is a wrong fix?

4e3998c

Make sure that GetStatePolicy primarily uses (State, Player) variant.

8471ad0

Use std::unique_ptr<CFRTree> for Propagator. Change types X const* to const X*

2a45250

Collect chance reach probs of corresponding states.

a2a2d87

Prepare reach probs externally.

7e99e85

Use floats instead of doubles.

cb1b387

This is due to future interfacing with neural nets, as we will not need such a high precision. Doubles are still used for CFRInfoStateValues for the moment.

Expose views of reach probs / cf values.

36386c6

Rename methods.

e986e3f

Drop depth limit from infostate cfr - we need to have the whole tree built to work properly anyway.

a0d4072

Add test for depth-limited subgames.

4385fd7

michalsustr mentioned this pull request Oct 22, 2020

Sequence-form LP implementation #417

Closed

elkhrt reviewed Oct 23, 2020


michalsustr closed this Oct 23, 2020

michalsustr mentioned this pull request Oct 31, 2020

Thoughts on C++ optimization (LP solving) #398

Closed
