feature(pu): add 2 mcts related iclr2024 papers
puyuan1996 committed Mar 7, 2024
1 parent 711d718 commit 6d98c0e
Showing 2 changed files with 16 additions and 0 deletions.
8 changes: 8 additions & 0 deletions README.md
@@ -374,6 +374,14 @@ Here is a collection of research papers about **Monte Carlo Tree Search**.
- ExpEnv: USPTO datasets
- [Code](https://github.com/binghong-ml/retro_star)
#### ICLR
- [The Update Equivalence Framework for Decision-Time Planning](https://openreview.net/forum?id=JXGph215fL) 2024
- Samuel Sokota, Gabriele Farina, David J Wu, Hengyuan Hu, Kevin A. Wang, J Zico Kolter, Noam Brown
- Key: imperfect-information games, search, decision-time planning, update equivalence
- ExpEnv: Hanabi, 3x3 Abrupt Dark Hex and Phantom Tic-Tac-Toe
- [Efficient Multi-agent Reinforcement Learning by Planning](https://openreview.net/forum?id=CpnKq3UJwp) 2024
- Qihan Liu, Jianing Ye, Xiaoteng Ma, Jun Yang, Bin Liang, Chongjie Zhang
- Key: multi-agent reinforcement learning, planning, multi-agent MCTS
- ExpEnv: SMAC, LunarLander, MuJoCo, and Google Research Football
- [Become a Proficient Player with Limited Data through Watching Pure Videos](https://openreview.net/pdf?id=Sy-o2N0hF4f) 2023
- Weirui Ye, Yunsheng Zhang, Pieter Abbeel, Yang Gao
- Key: pre-training from action-free videos, forward-inverse cycle consistency (FICC) objective based on vector quantization, pre-training phase, fine-tuning phase.
8 changes: 8 additions & 0 deletions README.zh.md
@@ -374,6 +374,14 @@ and internal state transition dynamics,
- ExpEnv: USPTO datasets
- [Code](https://github.com/binghong-ml/retro_star)
#### ICLR
- [The Update Equivalence Framework for Decision-Time Planning](https://openreview.net/forum?id=JXGph215fL) 2024
- Samuel Sokota, Gabriele Farina, David J Wu, Hengyuan Hu, Kevin A. Wang, J Zico Kolter, Noam Brown
- Key: imperfect-information games, search, decision-time planning, update equivalence
- ExpEnv: Hanabi, 3x3 Abrupt Dark Hex and Phantom Tic-Tac-Toe
- [Efficient Multi-agent Reinforcement Learning by Planning](https://openreview.net/forum?id=CpnKq3UJwp) 2024
- Qihan Liu, Jianing Ye, Xiaoteng Ma, Jun Yang, Bin Liang, Chongjie Zhang
- Key: multi-agent reinforcement learning, planning, multi-agent MCTS
- ExpEnv: SMAC, LunarLander, MuJoCo, and Google Research Football
- [Become a Proficient Player with Limited Data through Watching Pure Videos](https://openreview.net/pdf?id=Sy-o2N0hF4f) 2023
- Weirui Ye, Yunsheng Zhang, Pieter Abbeel, Yang Gao
- Key: pre-training from action-free videos, forward-inverse cycle consistency (FICC) objective based on vector quantization, pre-training phase, fine-tuning phase.