#Behavioral Cloning
The idea of this project is to clone human driving behavior so a model learns how to steer a car around a simulated track. The only information given to the model is the front-facing camera view of the vehicle, and the expected output is the steering wheel angle.
In order to meet this goal, a Convolutional Neural Network is trained on a dataset containing many images, each associated with a steering angle, so that it learns what to do in each scenario.
The figure below presents an image of both tracks of the simulator:
##Resources
There are a few files needed to run the Behavioral Cloning project.
The simulator contains two tracks. Sample driving data for the first track is included below, which can optionally be used to help train the network. It is also possible to collect data using the record button in the simulator.
###Simulator Download
###Beta Simulators
The dataset provided by Udacity, found in this link, contains the following data:
- Folder with 8,036 simulation samples, showing the center, left, and right camera views of the road, totaling 24,108 images
- File driving_log.csv containing a list describing all the images with the following information:
- Center image path
- Left image path
- Right image path
- Steering angle
- Throttle
- Brake
- Speed
Below is an example of the images used to train the CNN; it also shows how the steering angle is adjusted based on each camera view.
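As an illustration, below is a minimal sketch of how the left and right camera angles could be shifted when loading the log; the `STEERING_CORRECTION` value is a hypothetical tuning choice, and the code assumes `driving_log.csv` has no header row.

```python
import pandas as pd

# Hypothetical correction factor; the exact value is a tuning choice.
STEERING_CORRECTION = 0.25

def load_samples(csv_path):
    """Expand driving_log.csv into (image_path, angle) pairs, shifting
    the angle for the left and right camera views."""
    log = pd.read_csv(csv_path, names=['center', 'left', 'right',
                                       'steering', 'throttle', 'brake', 'speed'])
    samples = []
    for _, row in log.iterrows():
        angle = float(row['steering'])
        samples.append((row['center'].strip(), angle))
        samples.append((row['left'].strip(), angle + STEERING_CORRECTION))   # steer back to the right
        samples.append((row['right'].strip(), angle - STEERING_CORRECTION))  # steer back to the left
    return samples
```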
The image below presents the histogram of the given dataset, where it is possible to notice that images with a steering angle equal to zero are by far the most frequent.
In order to have a more balanced dataset, it is necessary to eliminate a good part of the zero-steering-angle examples. It was decided to keep only 15% of them, and the result is presented below.
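A minimal sketch of this balancing step, assuming the samples are stored as `(image_path, angle)` pairs such as those produced by the loader above:

```python
import numpy as np

def balance_zero_angles(samples, keep_ratio=0.15, eps=1e-5):
    """Keep every non-zero-steering sample but only a random 15% of the
    zero-steering ones."""
    zero = [s for s in samples if abs(s[1]) < eps]
    nonzero = [s for s in samples if abs(s[1]) >= eps]
    kept = np.random.choice(len(zero), int(len(zero) * keep_ratio), replace=False)
    return nonzero + [zero[i] for i in kept]
```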
In order to improve the learning task and make it more robust, it is necessary to augment the dataset, so that more data is artificially generated from the given images alone.
The following augmentations are used in this project:
- Flip
- Change image brightness
- Rotate
- Translate
- Shadow
- Shear
- Crop
Examples of each transformation are presented below.
###Flip
In order to keep the dataset balanced, it is useful to flip each image randomly, also inverting the sign of the steering angle.
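A possible implementation with OpenCV; flipping half of the images is an assumption:

```python
import cv2
import numpy as np

def random_flip(image, angle):
    """Flip the image horizontally half of the time and negate the steering angle."""
    if np.random.rand() < 0.5:
        image = cv2.flip(image, 1)  # flipCode=1 mirrors around the vertical axis
        angle = -angle
    return image, angle
```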
###Change image brightness
It is useful to change the image brightness so the model learns to generalize from a sunny day to a rainy day or to night, for example. This can be achieved by converting the image to HSV and changing the V channel.
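A sketch of this brightness change, assuming RGB input images; the scaling range is an arbitrary choice:

```python
import cv2
import numpy as np

def random_brightness(image):
    """Scale the V channel of the HSV representation by a random factor."""
    hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV).astype(np.float32)
    hsv[:, :, 2] *= np.random.uniform(0.4, 1.2)   # darken or brighten
    hsv[:, :, 2] = np.clip(hsv[:, :, 2], 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)
```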
###Rotate
It is also possible to rotate the image by small random angles, so the model learns how to generalize to sloping scenes.
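A possible rotation helper; the 10-degree limit is an assumed value:

```python
import cv2
import numpy as np

def random_rotate(image, max_deg=10):
    """Rotate the image around its center by a small random angle."""
    h, w = image.shape[:2]
    deg = np.random.uniform(-max_deg, max_deg)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), deg, 1.0)
    return cv2.warpAffine(image, M, (w, h))
```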
###Translate
Translating the image randomly makes it possible to generate even more data at different positions on the road, adding a factor proportional to the translation to the steering angle.
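A sketch of the random translation; the shift ranges and the steering adjustment per pixel are assumed values, not the ones used in this project:

```python
import cv2
import numpy as np

def random_translate(image, angle, x_range=50, angle_per_px=0.004):
    """Shift the image and adjust the steering angle proportionally to the
    horizontal shift."""
    h, w = image.shape[:2]
    tx = np.random.uniform(-x_range, x_range)
    ty = np.random.uniform(-10, 10)
    M = np.float32([[1, 0, tx], [0, 1, ty]])
    return cv2.warpAffine(image, M, (w, h)), angle + tx * angle_per_px
```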
###Shadow
Randomly shading part of the image makes the model more robust to shadows on the track, such as those cast by trees, wires, or poles.
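One simple way to do this is to darken a random vertical band of the image; the band shape and darkening factor below are assumptions:

```python
import numpy as np

def random_shadow(image):
    """Darken a random vertical band of the image to simulate a shadow."""
    h, w = image.shape[:2]
    x1, x2 = sorted(np.random.randint(0, w, size=2))
    shadowed = image.astype(np.float32)
    shadowed[:, x1:x2, :] *= 0.5   # halve the brightness inside the band
    return shadowed.astype(np.uint8)
```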
###Shear
Shearing the image is also useful, since it generates more data from the images we already have by changing the borders that the vehicle does not need to learn from.
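A possible shear transform that shifts the top half of the image sideways and compensates the steering angle; the ranges are assumed values:

```python
import cv2
import numpy as np

def random_shear(image, angle, shear_range=100, angle_per_px=0.004):
    """Shear the image by moving the mid-point of the frame sideways and
    compensate the steering angle accordingly."""
    h, w = image.shape[:2]
    dx = np.random.uniform(-shear_range, shear_range)
    src = np.float32([[0, h], [w, h], [w / 2, h / 2]])
    dst = np.float32([[0, h], [w, h], [w / 2 + dx, h / 2]])
    M = cv2.getAffineTransform(src, dst)
    return cv2.warpAffine(image, M, (w, h)), angle + dx * angle_per_px
```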
###Crop
In order to minimize the number of parameters of the CNN, it is possible to crop unnecessary parts of the image, including the bottom, the top, and a few pixels on the sides.
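A minimal crop helper; the pixel counts below are placeholders, not the values actually used in this project:

```python
def crop(image, top=60, bottom=25, sides=10):
    """Remove the sky at the top, the hood at the bottom, and a few pixels
    on each side."""
    h, w = image.shape[:2]
    return image[top:h - bottom, sides:w - sides]
```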
The image below shows an example of several of these transformations composed on a single image.
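Putting it together, a composed treatment could simply chain the helpers sketched above; the order of the transformations is a design choice:

```python
def augment(image, angle):
    """Apply the augmentation steps described above in sequence."""
    image, angle = random_shear(image, angle)
    image, angle = random_translate(image, angle)
    image = random_rotate(image)
    image = random_brightness(image)
    image = random_shadow(image)
    image, angle = random_flip(image, angle)
    return crop(image), angle
```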
This project was tested using two different architectures: CommaAI and NVIDIA. Both were trained using the same configuration (learning rate, optimizer, number of epochs, samples per epoch, and augmentation); the only thing that changed was the model itself.
- Learning rate: 1e-3
- Optimizer: Adam
- Number of epochs: 20
- Samples per epoch: 20000
- Batch size: 50
- Validation split: 30%
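As a sketch, this configuration maps to a Keras 1-style training call roughly as follows; `model`, `train_gen`, and `valid_gen` are hypothetical names for the chosen network and the augmented batch generators:

```python
from keras.optimizers import Adam

# model is either the CommaAI or the NVIDIA network; the generators yield
# batches of 50 augmented (image, steering angle) pairs.
model.compile(optimizer=Adam(lr=1e-3), loss='mse')
model.fit_generator(train_gen,
                    samples_per_epoch=20000,
                    nb_epoch=20,
                    validation_data=valid_gen,
                    nb_val_samples=6000)  # roughly the 30% validation split
```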
Total number of trainable parameters: 2,116,983
Total number of trainable parameters: 592,497
Below is a video of the result running on the same track where the CNN was trained (Track 1). The model was also tested on a track never seen before (Track 2) in order to show that it generalizes to different tracks and conditions.
The task of adjusting the parameters in order to get a satisfactory result is really difficult. Besides the architecture parameters, several other factors influence the result, such as augmentation and dataset balance.
For this task it is important to have a good computer in order to train the model faster. On my computer, with an NVIDIA GeForce GT 730M, it takes about 20 minutes to train, which is a little bit frustrating.