
In this machine learning experiment, I have analyzed the effect of using floating point numbers instead of integers in AI actions.


limeraiin/unity-machine-learning-test-comparison


Continuous Vs. Discrete Actions in AI Behaviour

In this machine learning experiment, I have analyzed the effect of using floating point numbers instead of integers in AI actions.
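To make the distinction concrete, here is a minimal sketch of how an integer action and a float action could each be turned into a jump force. It is not taken from the project: the jump mechanic itself, the `MAX_JUMP_FORCE` value, and both function names are assumptions for illustration only.

```python
# Hypothetical maximum upward force; the real project may use a different value.
MAX_JUMP_FORCE = 5.0


def jump_force_discrete(action: int) -> float:
    """Integer action (assumed): 0 = do nothing, 1 = jump with full force."""
    if action not in (0, 1):
        raise ValueError("discrete action must be 0 or 1")
    return MAX_JUMP_FORCE if action == 1 else 0.0


def jump_force_continuous(action: float) -> float:
    """Float action (assumed range [-1, 1]): clamp, then scale to [0, MAX_JUMP_FORCE]."""
    clamped = max(-1.0, min(1.0, action))
    return (clamped + 1.0) / 2.0 * MAX_JUMP_FORCE


if __name__ == "__main__":
    print(jump_force_discrete(1))
    print(jump_force_continuous(0.0))
```

Under these assumptions the integer action can only choose between no jump and a full jump, while the float action can pick any force in between, which is one plausible reason a float policy could control the jumps more finely.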

About the Experiment

I created a very basic machine learning scenario. The AI has to avoid colliding with obstacles and the ceiling. It gets a small reward for each successful dodge and a large penalty for each collision.

Tools used: PyTorch, ML-Agents (1.1.0-preview 3), TensorBoard, Unity3D.

Test field:

[Figure: test field]

Eight playgrounds run side by side to speed up the learning process. Each agent has a child object with a vertically oriented "Ray Perception Sensor" component. The agents receive no observations other than that.

Behind every obstacle there is a rewarding wall that the agent has to collide with.
(Obstacles are spawned randomly.)

Results

Both tests ran for one million steps. The blue and orange lines represent the mean reward for float and integer actions, respectively.
(Each reward has a value of 0.1f; each punishment has a value of -1f.)
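As a rough illustration of how these two values add up, the sketch below computes the total reward of a run from the number of rewards and punishments it collected. Only the 0.1 and -1 values come from the text above; the function and constant names are hypothetical.

```python
import math

# Values stated in the README.
DODGE_REWARD = 0.1
COLLISION_PENALTY = -1.0


def total_reward(dodges: int, collisions: int) -> float:
    """Sum of all rewards and punishments collected in one run."""
    return dodges * DODGE_REWARD + collisions * COLLISION_PENALTY


def dodges_to_offset_one_collision() -> int:
    """Number of successful dodges needed to cancel a single punishment."""
    return math.ceil(-COLLISION_PENALTY / DODGE_REWARD)


if __name__ == "__main__":
    print(total_reward(dodges=25, collisions=1))
    print(dodges_to_offset_one_collision())
```

With this weighting a single collision costs as much as ten successful dodges earn, so the mean reward stays low until the agent collides only rarely.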

[Figure: mean reward over training steps, float vs. integer actions]

It was pretty hard to control the jumps even when a human was controlling the agent. As can be seen above, using the float data type for AI actions shortened the learning period dramatically. After around 800k steps, test runs started to complete with little to no mistakes, but those runs did not contribute to the mean reward graph.

The mean loss of the value update correlates to how well the model is able to predict the value of each state. This should increase while the agent is learning, and then decrease once the reward stabilizes.
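A simplified sketch of what such a value loss measures is shown below: the mean squared error between the value predicted for each state and the discounted return actually observed from that state. This is an illustration under assumptions, not the trainer's code. The discount factor `gamma`, the function names, and the plain squared-error form are all assumed, and the loss used by the actual trainer may differ in details such as clipping.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def value_loss(predicted_values, rewards, gamma=0.99):
    """Mean squared error between predicted state values and observed returns."""
    targets = discounted_returns(rewards, gamma)
    if len(predicted_values) != len(targets):
        raise ValueError("need one predicted value per step")
    squared_errors = [(v - g) ** 2 for v, g in zip(predicted_values, targets)]
    return sum(squared_errors) / len(targets)


if __name__ == "__main__":
    rewards = [0.1, 0.1, -1.0]  # two dodges, then a collision
    print(value_loss([0.0, 0.0, 0.0], rewards))
```

While the policy is still changing, the returns it produces keep shifting, so the predictions lag behind and the loss grows; once the rewards stabilize, the predictions can catch up and the loss falls.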

[Figure: value loss over training steps]

Working Example

Running a test using the brain of the AI with float-type actions after 1 million learning steps:

test3

test4
