Pickomino-Env

Website for Pickomino Gymnasium Environment for Reinforcement Learning project.

Please access the repository for the most recent README.md

Description

An environment conforming to the Gymnasium API for the dice game Pickomino (Heckmeck am Bratwurmeck). Goal: train a Reinforcement Learning agent for optimal play, that is, to decide which die face to collect, when to roll, and when to stop.

Action Space

The action space is a tuple of two integers: Tuple (int, int).

Action = [dice_face (0-5), action_type (0=roll, 1=stop)].

0-5: Face of the die that you want to take, where:
- 0 -> face 1
- 1 -> face 2
- 2 -> face 3
- 3 -> face 4
- 4 -> face 5
- 5 -> face worm
0-1: Roll (0) or stop (1).
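
The mapping above can be sketched as a small decoding helper. This is only an illustration: `FACE_LABELS`, `ACTION_TYPES` and `describe_action` are hypothetical names and are not part of the pickomino-env package.

```python
# Hypothetical lookup tables mirroring the mapping listed above.
FACE_LABELS = ("1", "2", "3", "4", "5", "worm")  # index 0-5 -> die face
ACTION_TYPES = ("roll", "stop")                  # index 0-1 -> what to do next


def describe_action(action):
    """Translate a (dice_face, action_type) pair into readable labels."""
    dice_face, action_type = action
    if not 0 <= dice_face <= 5 or action_type not in (0, 1):
        raise ValueError(f"action out of range: {action!r}")
    return FACE_LABELS[dice_face], ACTION_TYPES[action_type]


# Take the worms from the current roll, then stop rolling.
print(describe_action((5, 1)))
```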

Observation Space

The observation is a dict with four entries describing the dice, the table, and the players:

Observation	Max	Shape
dice_collected	8	(6,)
dice_rolled	8	(6,)
tiles_table	1	(16,)
tile_players	36	number_of_players

Note: There are eight dice to roll and collect. A die has six sides showing one to five eyes, with a worm instead of a six. The values correspond to the number of eyes, and the worm also has the value five (not six!). The 16 tiles are numbered 21 to 36 and have worm values from one to four, spread over four groups. The game is for two to seven players. Here your Reinforcement Learning Agent is the first player. The other players are computer bots, which play according to a heuristic. When you create the environment, you have to define the number of bots.
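
The values in this note can be worked through in a short sketch. It assumes that index i of `dice_collected` and `dice_rolled` counts the dice showing the face with action index i (so index 5 is the worm), and that the four worm groups are the consecutive blocks 21-24, 25-28, 29-32 and 33-36. The helper names are hypothetical and not part of the package.

```python
# Eye value per face index; the worm (index 5) counts as five, not six.
FACE_VALUES = (1, 2, 3, 4, 5, 5)


def dice_total(dice_counts):
    """Sum the eye values of a per-face count vector such as dice_collected."""
    return sum(count * value for count, value in zip(dice_counts, FACE_VALUES))


def tile_worms(tile_number):
    """Worm value of a tile numbered 21-36, assuming four groups of four tiles."""
    if not 21 <= tile_number <= 36:
        raise ValueError(f"no such tile: {tile_number}")
    return (tile_number - 21) // 4 + 1


# Three 1s, one 3, two 4s and two worms: 3 + 3 + 8 + 10 = 24.
print(dice_total([3, 0, 1, 2, 0, 2]))
print(tile_worms(24), tile_worms(25))
```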

For a more detailed description of the rules, see the file pickomino-rulebook.pdf. You can play the game online here: https://www.maartenpoirot.com/pickomino/. The heuristic used by the bots is described here: https://frozenfractal.com/blog/2015/5/3/how-to-win-at-pickomino/.

Rewards

The goal is to collect tiles in a stack. The winner is the player who has the most worms on their tiles at the end of the game. The Reinforcement Learning Agent receives a reward equal to the value (worms) of a tile when it picks that tile. For a failed attempt (see rulebook), a corresponding negative reward is given. When a bot steals your tile, no negative reward is given. Hence, the total reward at the end of the game can be greater than the score.
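
A small bookkeeping sketch shows why the total reward can exceed the final score. It assumes that a failed attempt costs exactly the worm value of the returned tile and that a bot always steals the top tile of the stack; the event names and `episode_totals` are made up for this illustration.

```python
def episode_totals(events):
    """Return (total_reward, final_score) for a list of (kind, worms) events."""
    total_reward = 0
    stack = []  # worm values of the agent's tiles, top of the stack last
    for kind, worms in events:
        if kind == "pick":
            total_reward += worms        # reward equals the tile's worm value
            stack.append(worms)
        elif kind == "fail" and stack:
            total_reward -= stack.pop()  # assumed penalty: value of the returned tile
        elif kind == "stolen" and stack:
            stack.pop()                  # tile is lost, but no negative reward
    return total_reward, sum(stack)


# Two tiles picked (2 and 3 worms), then a bot steals the top one.
print(episode_totals([("pick", 2), ("pick", 3), ("stolen", None)]))
```

In this run the agent has been rewarded for five worms but ends the game with only two on its stack.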

Starting State

dice_collected = [0, 0, 0, 0, 0, 0].
dice_rolled = [3, 0, 1, 2, 0, 2] Random dice, sum = 8.
tiles_table = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1].
tile_players = [0, 0, 0] (with number_of_bots = 2).
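
As a rough sketch, a starting state of this shape could be produced as follows. It uses plain Python lists and the standard `random` module; the real environment may return different container types, and `starting_observation` is a hypothetical helper, not the package's own reset logic.

```python
import random


def starting_observation(number_of_bots, rng=None):
    """Build a starting state like the one shown above (illustrative only)."""
    rng = rng or random.Random()
    dice_rolled = [0] * 6
    for _ in range(8):                      # roll all eight dice
        dice_rolled[rng.randrange(6)] += 1  # count how often each face shows
    return {
        "dice_collected": [0] * 6,                  # nothing collected yet
        "dice_rolled": dice_rolled,                 # counts always sum to 8
        "tiles_table": [1] * 16,                    # all 16 tiles on the table
        "tile_players": [0] * (number_of_bots + 1), # agent plus bots, no tiles yet
    }


obs = starting_observation(number_of_bots=2, rng=random.Random(42))
print(obs["dice_rolled"], sum(obs["dice_rolled"]))
```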

Episode End

The episode ends if one of the following occurs:

Termination: There are no more tiles to take on the table (game over).
Termination: The action is outside the allowed range [0-5, 0-1].

Truncation

Truncation: An attempt to break the rules. The game continues, and you have to supply a new valid action.
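
The difference between an out-of-range action and a rule violation can be sketched as below. The two rule checks used here (taking a face that was not rolled, or a face that was already collected) come from the standard Pickomino rules and are assumptions about what the environment treats as a rule break; `classify_action` is a hypothetical helper.

```python
def classify_action(action, dice_collected, dice_rolled):
    """Return "terminated", "truncated" or "ok" for a proposed action."""
    dice_face, action_type = action
    # Outside [0-5, 0-1]: the episode terminates.
    if not 0 <= dice_face <= 5 or action_type not in (0, 1):
        return "terminated"
    # Assumed rule checks: the face must be in the roll and not yet collected.
    if dice_rolled[dice_face] == 0 or dice_collected[dice_face] > 0:
        return "truncated"
    return "ok"


rolled = [3, 0, 1, 2, 0, 2]
print(classify_action((6, 0), [0] * 6, rolled))  # face index out of range
print(classify_action((1, 0), [0] * 6, rolled))  # no 2s were rolled
print(classify_action((0, 0), [0] * 6, rolled))  # valid: take the three 1s
```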

Failed Attempt

Note that a Failed Attempt means: if you have a tile, it is put back on the table and you get a negative reward. However, the game continues, so the episode does not end.

Arguments

number_of_bots must be specified; render_mode is optional.

Parameter	Type	Default	Description
`number_of_bots`	int	--	Number of bot opponents (1-6) you want to play against
`render_mode`	str or None	None	Visualization mode: None (training), "human" (display), or "rgb_array" (recording)

Setup

pip install pickomino-env

Usage example

import gymnasium as gym

# Create environment
env = gym.make("Pickomino-v0", render_mode="human", number_of_bots=2)

# Reset and get initial observation
obs, info = env.reset(seed=42)

# Run one episode
terminated = False
truncated = False
total_reward = 0

while not terminated and not truncated:
    # Agent selects action: (dice_face, roll_choice)
    action = env.action_space.sample()  # Random action for demo

    # Step environment
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward

    if truncated:
        print(f"Invalid action: {info['explanation']}")
        break

print(f"Episode finished. Total reward: {total_reward}")
env.close()
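
The demo above stops at the first invalid action. Since the Truncation section says the game continues after a rule violation, a training loop might instead keep stepping and simply count invalid actions. The sketch below assumes the environment accepts further `step` calls after returning `truncated=True`, which differs from the usual Gymnasium convention and should be checked against the repository. `run_episode` and `choose_action` are hypothetical names.

```python
def run_episode(env, choose_action, max_steps=10_000):
    """Play until termination, counting invalid actions instead of stopping."""
    obs, info = env.reset()
    total_reward = 0
    invalid_actions = 0
    for _ in range(max_steps):  # safety cap in case the game never terminates
        obs, reward, terminated, truncated, info = env.step(choose_action(obs))
        total_reward += reward
        if terminated:
            break
        if truncated:
            invalid_actions += 1  # assumed: game continues, choose a new action
    return total_reward, invalid_actions
```

With a random policy this could be called as `run_episode(env, lambda obs: env.action_space.sample())`.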

Resources

Game Rules: Pickomino Rulebook
Play Online: Maarten Poirot's Pickomino
Bot Strategy: How to Win at Pickomino
Repository: smallgig/Pickomino
Gymnasium: https://gymnasium.farama.org/

License

MIT License. See LICENSE for details.
