Continuous Vs. Discrete Actions in AI Behaviour

In this machine learning experiment, I have analyzed the effect of using floating point numbers instead of integers in AI actions.

About the Experiment

I have created a very basic machine learning scenario. The AI has to prevent colliding with obstacles and the ceiling. It gets a little reward for each successful dodging and a big punishment for every colliding.

Used tools: Pytorch, ML Agents (1.1.0-preview 3), Tensorboard, Unity3D.

Test field:

8 different playgrounds are used to speed up the learning process. Children of each agent has a vertically attached "Ray Perception Sensor" component. There have been no observations other than that.

There are rewarding walls that agents have to collide with behind every obstacle.
(Obsticles are randomly spawned)

Results

Both tests ran for one million steps. Blue and orange lines represent the mean reward for float and integer actions of the AI respectively.
(Each reward has a value of 0.1f, punishments have -1f)

It was pretty hard to control the jumps even while the agent was in human control. As can be seen above, using float data type for AI actions shortened the learning period dramatically. After around 800k steps, test runs started to complete with little to no mistakes but they didn't contribute to the mean reward graph.

The mean loss of the value update correlates to how well the model is able to predict the value of each state. This should increase while the agent is learning, and then decrease once the reward stabilizes.

Working Example

Running a test using the brain of the AI with float type actions after 1 million learning step:

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
Assets/Project		Assets/Project
ProjectSettings		ProjectSettings
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Continuous Vs. Discrete Actions in AI Behaviour

About the Experiment

Test field:

Results

Working Example

About

Languages

limeraiin/unity-machine-learning-test-comparison

Folders and files

Latest commit

History

Repository files navigation

Continuous Vs. Discrete Actions in AI Behaviour

About the Experiment

Test field:

Results

Working Example

About

Topics

Resources

Stars

Watchers

Forks

Languages