
Farkle Simulation

Author

Ronen Huang

Time Frame

November 2021 to December 2021 (later modified for Python), January 2025 to present (adding deep Q-learning).

Install Farkle Simulation

The Farkle Simulation can be installed with pip.

pip install farkle-simulation

Farkle

The rules of Farkle are summarized in the In a Nutshell section of https://farkle.games/official-rules/. This simulation has two players: Player 1 moves first and Player 2 moves second.

The scoring system can be seen in the table below.

Dice to Keep            Score
Three 1s or Straight    1000
Three Different Pairs    750
Three 6s                 600
Three 5s                 500
Three 4s                 400
Three 3s                 300
Three 2s                 200
One 1                    100
One 5                     50
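
As an illustration only, a minimal scoring helper consistent with that table might look like the sketch below (combination_score is a hypothetical name, not part of the package, and the handling of extra dice beyond a triple is an assumption).

from collections import Counter

def combination_score(kept: tuple[int, ...]) -> int:
    """Hypothetical sketch: score a kept combination per the table above."""
    counts = Counter(kept)
    if sorted(kept) == [1, 2, 3, 4, 5, 6]:
        return 1000                                     # straight
    if len(kept) == 6 and all(c == 2 for c in counts.values()):
        return 750                                      # three different pairs
    score = 0
    for face, count in counts.items():
        if count >= 3:
            score += 1000 if face == 1 else face * 100  # three of a kind
            count -= 3
        if face == 1:
            score += 100 * count                        # remaining single 1s
        elif face == 5:
            score += 50 * count                         # remaining single 5s
    return score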

A player begins their turn with all six dice.

  • If there are scoring combinations, the player can choose any of them and
    • Either keep rolling or stop
    • When the number of dice reaches 0, it resets to 6
  • Otherwise, the player has "farkled" and the turn score is 0

Once a player reaches 10,000 points, they have won the game.
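
A rough sketch of that turn flow is below, with scoring_options and choose_action passed in as hypothetical callables; the package's actual turn logic lives in strategy.py.

import random

def play_turn(scoring_options, choose_action) -> int:
    """Hypothetical sketch of one turn: roll, keep a scoring combination, reroll or stop."""
    num_dice, turn_score = 6, 0
    while True:
        roll = [random.randint(1, 6) for _ in range(num_dice)]
        options = scoring_options(roll)   # hypothetical: list of (kept dice, score) pairs
        if not options:
            return 0                      # farkled: the whole turn scores 0
        kept, score, keep_rolling = choose_action(options, turn_score)
        turn_score += score
        num_dice -= len(kept)
        if num_dice == 0:
            num_dice = 6                  # all six dice kept: reset to six
        if not keep_rolling:
            return turn_score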

Strategies

The implementation of the strategies is in strategy.py.

The strategies that Player 1 and Player 2 can use are the naive strategy, the simple RL strategy, and the custom (manual) strategy. The turn function plays out one turn.

from farkle_simulation.components.strategy import (
    NaiveStrategy, SimpleRLStrategy, CustomStrategy
)

naive_strategy = NaiveStrategy()
simple_rl_strategy = SimpleRLStrategy()
custom_strategy = CustomStrategy()

# current_score and advantage come from the current game state
naive_turn_score = naive_strategy.turn(current_score, advantage)
simple_rl_turn_score = simple_rl_strategy.turn(current_score, advantage)
custom_turn_score = custom_strategy.turn(current_score, advantage)

Custom Strategy

The player chooses the action on each roll by typing 1 to 7 on the command line.

Naive Strategy

For each roll

  • Choose the action that maximizes the roll score
  • Stop if the number of dice is less than or equal to 2 and the disadvantage is less than 1,000
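
A minimal sketch of that rule follows; naive_choice, options, and the stop condition's exact form are illustrative only, not the package's API.

def naive_choice(options, num_dice, advantage):
    """Sketch of the naive rule: maximize the roll score, stop when few dice remain and the deficit is small."""
    # options: hypothetical list of (kept dice, roll score) pairs for this roll
    kept, score = max(options, key=lambda option: option[1])
    remaining = num_dice - len(kept)
    keep_rolling = not (remaining <= 2 and -advantage < 1000)
    return kept, score, keep_rolling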

Simple Reinforcement Learning Strategy

For each roll

  • Choose the action that maximizes a reward based on (the factors are combined in the sketch after this list)
    • Number of Rolls
      # gamma discounts the reward as the number of rolls grows.
      actual_reward = gamma ** np.log2(num_rolls + 1) * roll_score
    • Distance to 10,000 and Current Score
      # Grows as the current score approaches 10,000.
      distance_factor = distance_scale ** (current_score / 10000)
    • Advantage
      # Grows the further the player is behind.
      advantage_factor = 1
      if advantage < 0:
          advantage_factor *= max(
            np.emath.logn(advantage_scale, -advantage), 1
          )
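
The sketch below shows one way the three factors could combine; multiplying them together and the placeholder constants are assumptions for illustration, and the actual values live in simple_farkle_rl.py.

import numpy as np

def roll_reward(roll_score, num_rolls, current_score, advantage,
                gamma=0.9, distance_scale=2.0, advantage_scale=10.0):
    """Sketch of the reward shaping above; constants and the multiplicative combination are placeholders."""
    # Discount the roll score by how many rolls the turn has taken.
    actual_reward = gamma ** np.log2(num_rolls + 1) * roll_score
    # Weight the reward more heavily as the current score approaches 10,000.
    distance_factor = distance_scale ** (current_score / 10000)
    # Weight the reward more heavily the further the player is behind.
    advantage_factor = 1
    if advantage < 0:
        advantage_factor = max(np.emath.logn(advantage_scale, -advantage), 1)
    return actual_reward * distance_factor * advantage_factor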

The architecture of the deep Q-learning network takes an 11-dimensional input state

  • Distance
  • Advantage
  • Current Turn Score
  • Roll Maxes

and predicts a reward for each of the 7 possible actions

  • Keep 1 to 6 dice
  • Stop

This can be seen in architecture.py.
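
One plausible shape for such a network, sketched as a small fully connected PyTorch model, is below; the hidden layer sizes are assumptions, and the real definition is in architecture.py.

import torch.nn as nn

class QNetwork(nn.Module):
    """Sketch: 11-dimensional state in, one predicted reward per action out."""

    def __init__(self, state_dim: int = 11, num_actions: int = 7, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_actions),  # keep 1 to 6 dice, stop
        )

    def forward(self, state):
        return self.net(state)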

The training process can be seen in simple_farkle_rl.py. The best model state dictionary is saved as simple_action_reward_state_dict.pt.

from farkle_simulation.components.simple_farkle_rl import train

train()
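
Assuming the QNetwork sketch above matches the saved architecture, the trained weights could then be reloaded roughly like this:

import torch

model = QNetwork()  # the sketch above; the real class is in architecture.py
model.load_state_dict(torch.load("simple_action_reward_state_dict.pt"))
model.eval()        # inference only: act by taking the highest predicted reward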

The plot of turn score by current score can be seen in training_simple_rl.jpg and the table in training_simple_rl.csv.

Simulation

The implementation of the simulation is in simulation.py.

The convergence_plot function plots the estimated probability that Player 1 wins with one strategy while Player 2 uses another, as it converges over repeated games.

from farkle_simulation.components.simulation import convergence_plot

convergence_plot(naive_strategy, simple_rl_strategy)
convergence_plot(simple_rl_strategy, naive_strategy)

The convergence plots can be seen in convergence_naive_simple_rl.jpg and convergence_simple_rl_naive.jpg.
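
Conceptually, the convergence plot tracks a running win-rate estimate over simulated games; a minimal sketch of that idea is below, with play_one_game as a hypothetical callable that returns 1 when Player 1 wins.

import numpy as np

def running_win_rate(play_one_game, strategy_1, strategy_2, num_games=1000):
    """Sketch: cumulative fraction of games won by Player 1 after each simulated game."""
    wins = np.array([play_one_game(strategy_1, strategy_2) for _ in range(num_games)])
    return np.cumsum(wins) / np.arange(1, num_games + 1)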

The histogram function plots the average probability that Player 1 wins with one strategy while Player 2 uses another.

from farkle_simulation.components.simulation import histogram

histogram(naive_strategy, simple_rl_strategy)
histogram(simple_rl_strategy, naive_strategy)

The histograms can be seen in histogram_naive_simple_rl.jpg and histogram_simple_rl_naive.jpg.

Pipeline

To train the simple RL agent and compare the strategies through the convergence plot and histogram:

from farkle_simulation.pipeline import train_simulate

train_simulate()

To play a game against the simple RL agent:

from farkle_simulation.pipeline import play_game

play_game()

Input 1 to play as the first player or 2 to play as the second player.

Player 1 (1) or Player (2)? 1

Input the action to choose for the dice combination.

Dice Combination - (1, 1, 1, 1, 1, 4)
Legal Moves - (1) Keep 1 - max 100 remaining dice 5, (2) Keep 2 - max 200 remaining dice 4, (3) Keep 3 - max 1000 remaining dice 3, (4) Keep 4 - max 1100 remaining dice 2, (5) Keep 5 - max 1200 remaining dice 1, (7) No Roll - max 1200 remaining dice 0
Action - 7

The output of the action is:

Current Turn Score - 1200
Current Score - 1200
Turn Score - 1200
Player 1 Score - 1200 Player 2 Score - 0

The output of a "farkle" is:

Dice Combination - (2, 6)
Farkled
Turn Score - 0

References

Farkle Official Rules. (2025). Retrieved from https://farkle.games/official-rules/

Le Cam, L. (1986). The Central Limit Theorem Around 1935. Statistical Science, 1(1), 78–91. Retrieved from http://www.jstor.org/stable/2245503

Mnih, V., Kavukcuoglu, K., Silver, D. et al. (2015). Human-level control through deep reinforcement learning. Nature 518, 529–533. Retrieved from https://doi.org/10.1038/nature14236

About

Simulating two-player Farkle with different strategies.
