This project is a web-based Tic-Tac-Toe game with an option to play against a Q-Learning AI agent. The AI agent uses the Q-Learning algorithm to learn and improve its strategy as it plays the game.
- Play Tic-Tac-Toe against another player or against a Q-Learning AI agent
- Train the AI agent by setting it to play against itself
- Save and load the AI agent's learning progress
- Open the `index.html` file in your web browser
- Select the game mode by clicking on the desired mode button (Player vs Player[PvP], Player vs AI[PvC], or AI vs AI[CvC])
- Click on the desired cell on the game board to make a move
- The game will automatically update the board and switch turns
- The game will declare a winner or a tie if the game has ended
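The win and tie check behind the last step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name `getOutcome`, the `WIN_LINES` table, and the board representation (an array of nine cells holding `"X"`, `"O"`, or `null`) are all assumptions made for the example.

```javascript
// Hypothetical board: array of 9 cells, each "X", "O", or null (empty).
// Indices run left to right, top to bottom: 0 1 2 / 3 4 5 / 6 7 8.
const WIN_LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

// Returns "X" or "O" for a win, "tie" for a full board with no winner,
// and null while the game is still in progress.
function getOutcome(board) {
  for (const [a, b, c] of WIN_LINES) {
    if (board[a] !== null && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return board.every((cell) => cell !== null) ? "tie" : null;
}
```

Checking the eight winning lines before the full-board test matters: a move that fills the last cell can also complete a line, and that game should count as a win, not a tie.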
- Set the game mode to AI vs AI[CvC]
- The AI will play against itself for the specified number of runs
- An alert will be posted once the AI training is done
- The AI's learning progress can be saved and loaded using the "Save State" and "Load State" buttons
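The self-play training described above can be sketched as a simple loop. This is an illustration under stated assumptions, not the code in this repository: the function `trainSelfPlay`, the `playGame` callback, the use of two agent objects, and the three score constants are hypothetical. Only the `feedback` and `clearStateChain` method names come from the QLearn class.

```javascript
// Hypothetical feedback scores; the project's real constants may differ.
const SCORE_WIN = 1;
const SCORE_TIE = 0.2;
const SCORE_LOSS = -1;

// Runs `runs` self-play games. `playGame(agentX, agentO)` is an assumed
// callback that plays one full game and returns "X", "O", or "tie".
// Each agent is assumed to expose feedback(score) and clearStateChain().
function trainSelfPlay(agentX, agentO, runs, playGame) {
  const tally = { X: 0, O: 0, tie: 0 };
  for (let i = 0; i < runs; i++) {
    const outcome = playGame(agentX, agentO);
    tally[outcome] += 1;
    if (outcome === "tie") {
      agentX.feedback(SCORE_TIE);
      agentO.feedback(SCORE_TIE);
    } else {
      const [winner, loser] =
        outcome === "X" ? [agentX, agentO] : [agentO, agentX];
      winner.feedback(SCORE_WIN);
      loser.feedback(SCORE_LOSS);
    }
    // Start the next game with an empty move history on both sides.
    agentX.clearStateChain();
    agentO.clearStateChain();
  }
  return tally;
}
```

In the browser, the caller could pass the returned tally to `alert()` once the loop finishes, which matches the completion alert mentioned above.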
- HTML/CSS/JavaScript for web development
- Q-Learning algorithm for AI training
- The AI agent's learning progress is stored in the `QLearn_StateStore` variable in the `QLearning/script_QLearn.js` file. This variable can be modified and saved using the provided "Save State" and "Load State" buttons.
- The AI agent's training can be customized by modifying the constants in the `TTT_processr/TTT_main.js` file (e.g. number of runs per training session, feedback scores for different outcomes).
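This README does not say how the "Save State" and "Load State" buttons persist `QLearn_StateStore`. One plausible approach is JSON serialization, sketched below. The function names `serializeStateStore` and `parseStateStore`, and the assumed store shape (an object mapping each state name to an object of action-to-probability entries), are hypothetical.

```javascript
// Assumed shape: { stateName: { action: probability, ... }, ... }.
function serializeStateStore(store) {
  return JSON.stringify(store);
}

// Parses saved text and rejects anything that is not a plain object of
// state -> { action -> finite number }.
function parseStateStore(text) {
  const store = JSON.parse(text);
  if (store === null || typeof store !== "object" || Array.isArray(store)) {
    throw new TypeError("Saved state must be a JSON object");
  }
  for (const [state, actions] of Object.entries(store)) {
    if (actions === null || typeof actions !== "object" || Array.isArray(actions)) {
      throw new TypeError(`Bad action table for state "${state}"`);
    }
    for (const [action, p] of Object.entries(actions)) {
      if (!Number.isFinite(p)) {
        throw new TypeError(`Bad probability for "${state}" / "${action}"`);
      }
    }
  }
  return store;
}
```

In a browser, the resulting string could be written to `localStorage` or offered as a file download. Validating on load keeps a hand-edited or truncated save from silently corrupting the agent's probabilities.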
The QLearn class is a JavaScript implementation of the Q-Learning algorithm. It is used to train the AI agent in this Tic-Tac-Toe game project. The QLearn class has the following methods:
- `chooseState(SName, StateAction, addtoShiftChain=true)`: This method chooses an action from the `StateAction` list, given a state represented by `SName`. If `addtoShiftChain` is set to true, the chosen action will also be added to the QLearn object's state chain. The chosen action is selected based on the probabilities stored in the `QLearn_StateStore` global variable, which is a matrix of state-action pairs and their corresponding probabilities.
- `feedback(StateChain_Score)`: This method updates the probabilities in the `QLearn_StateStore` based on the feedback score provided in `StateChain_Score`. The feedback score is used to adjust the probabilities of the actions in the QLearn object's state chain in proportion to their relative distance from the end of the chain. The probabilities are also normalized after being updated.
- `addtoStateChain(stateName, ActList, ActionSelected)`: This method adds an entry to the QLearn object's state chain, consisting of the state represented by `stateName` and the action represented by `ActionSelected`. If the state represented by `stateName` does not exist in the `QLearn_StateStore`, it is added to the `QLearn_StateStore` with equal probabilities for all actions in `ActList`.
- `clearStateChain()`: This method clears the QLearn object's state chain.
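The four methods above can be illustrated with a compact sketch. It is not the code in `QLearning/script_QLearn.js`: the class name `QLearnSketch`, the `ensureState` helper, the `MIN_PROB` floor, the injectable `rng`, and passing the store into the constructor instead of using a global are all assumptions made so the example runs on its own. The weighting rule (moves nearer the end of the chain receive a larger share of the score) is one reading of "relative distance from the end of the chain".

```javascript
const MIN_PROB = 0.01; // assumed floor so no action's probability reaches zero

class QLearnSketch {
  // `store` stands in for the global QLearn_StateStore.
  constructor(store = {}, rng = Math.random) {
    this.store = store;
    this.rng = rng;
    this.stateChain = [];
  }

  // Adds a row with equal probabilities if the state is not yet known.
  ensureState(stateName, actList) {
    if (this.store[stateName] === undefined) {
      const row = {};
      for (const action of actList) row[action] = 1 / actList.length;
      this.store[stateName] = row;
    }
    return this.store[stateName];
  }

  // Samples an action in proportion to its stored probability.
  chooseState(sName, stateAction, addToChain = true) {
    const row = this.ensureState(sName, stateAction);
    let r = this.rng();
    let chosen = stateAction[stateAction.length - 1]; // rounding fallback
    for (const action of stateAction) {
      r -= row[action] ?? 0;
      if (r < 0) { chosen = action; break; }
    }
    if (addToChain) this.addtoStateChain(sName, stateAction, chosen);
    return chosen;
  }

  // Shifts each chained action's probability by a position-weighted share
  // of the score, then renormalizes that state's row to sum to 1.
  feedback(score) {
    const n = this.stateChain.length;
    this.stateChain.forEach(({ state, action }, i) => {
      const row = this.store[state];
      const weight = (i + 1) / n; // last move in the chain has weight 1
      row[action] = Math.max(row[action] + score * weight, MIN_PROB);
      const total = Object.values(row).reduce((sum, p) => sum + p, 0);
      for (const key of Object.keys(row)) row[key] /= total;
    });
  }

  addtoStateChain(stateName, actList, actionSelected) {
    this.ensureState(stateName, actList);
    this.stateChain.push({ state: stateName, action: actionSelected });
  }

  clearStateChain() {
    this.stateChain = [];
  }
}
```

A typical game would call `chooseState` once per move, `feedback` once when the outcome is known, and `clearStateChain` before the next game begins.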