Our work process included collecting data from the [ALOV300++](http://www.alov30

At this point, we've gotten some encouraging results. Below are 10 randomly selected pairs of starting and ending frames (i.e., consecutive frames). Each starting frame on the left shows the originally given bounding box (green), and each ending frame on the right shows the ground truth bounding box (green) as well as the bounding box predicted by our net (red).

![Vid 1 Start](./readme_imgs/start_1.jpg)
![Vid 1 End](./readme_imgs/end_1.jpg)

![Vid 2 Start](./readme_imgs/start_21.jpg)
![Vid 2 End](./readme_imgs/end_21.jpg)

![Vid 3 Start](./readme_imgs/start_41.jpg)
![Vid 3 End](./readme_imgs/end_41.jpg)

![Vid 4 Start](./readme_imgs/start_61.jpg)
![Vid 4 End](./readme_imgs/end_61.jpg)

![Vid 5 Start](./readme_imgs/start_81.jpg)
![Vid 5 End](./readme_imgs/end_81.jpg)

![Vid 6 Start](./readme_imgs/start_108.jpg)
![Vid 6 End](./readme_imgs/end_108.jpg)

![Vid 7 Start](./readme_imgs/start_121.jpg)
![Vid 7 End](./readme_imgs/end_121.jpg)

![Vid 8 Start](./readme_imgs/start_141.jpg)
![Vid 8 End](./readme_imgs/end_141.jpg)

![Vid 9 Start](./readme_imgs/start_161.jpg)
![Vid 9 End](./readme_imgs/end_161.jpg)

![Vid 10 Start](./readme_imgs/start_181.jpg)
![Vid 10 End](./readme_imgs/end_181.jpg)
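
For context, visualizations like the ones above amount to overlaying the boxes on the raw frames. Below is a minimal sketch of that step using OpenCV; the `predict_box` argument is a hypothetical stand-in for the trained network's inference call, not the actual interface in this repo.

```python
# Sketch of rendering a start/end frame pair with bounding boxes.
# `predict_box` is a hypothetical callable (start_frame, start_box, end_frame) -> box;
# the real inference interface in this repo may differ.
import cv2

def draw_box(frame, box, color):
    """Draw an (x0, y0, x1, y1) box on a BGR frame in place."""
    x0, y0, x1, y1 = [int(v) for v in box]
    cv2.rectangle(frame, (x0, y0), (x1, y1), color, 2)

def render_pair(start_frame, end_frame, start_box, truth_box, predict_box):
    GREEN, RED = (0, 255, 0), (0, 0, 255)  # OpenCV uses BGR channel order
    pred_box = predict_box(start_frame, start_box, end_frame)  # hypothetical inference call
    draw_box(start_frame, start_box, GREEN)  # originally given box (left image)
    draw_box(end_frame, truth_box, GREEN)    # ground truth box (right image)
    draw_box(end_frame, pred_box, RED)       # predicted box (right image)
    cv2.imwrite("start.jpg", start_frame)
    cv2.imwrite("end.jpg", end_frame)
```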


## Error Analysis

Plotting the actual versus predicted coordinates below for a random sample of 500 images gives a sense of how our network is learning. At the top-left we have x0, at the top-right y0, at the bottom-left x1, and at the bottom-right y1. These correspond to the upper-left corner (x0, y0) and bottom-right corner (x1, y1) of the bounding box. The kernel density estimates below show that on average we are predicting fairly well (as the images above also suggest), but there is still some variability in how closely those predictions line up with the ground truth.


![Error x0](./readme_imgs/x0.png)
![Error y0](./readme_imgs/y0.png)

![Error x1](./readme_imgs/x1.png)
![Error y1](./readme_imgs/y1.png)
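
Plots of this kind can be generated with something along these lines (a sketch, not the exact script used here). It assumes the ground-truth and predicted boxes for the sampled frames are available as (N, 4) NumPy arrays named `truth` and `pred`, in (x0, y0, x1, y1) order.

```python
# Sketch of the error-analysis plots: one actual-vs-predicted KDE per
# bounding-box coordinate. Assumes `truth` and `pred` are (N, 4) NumPy
# arrays holding (x0, y0, x1, y1) for the same N sampled frames.
import seaborn as sns
import matplotlib.pyplot as plt

def plot_coordinate_errors(truth, pred, names=("x0", "y0", "x1", "y1")):
    for i, name in enumerate(names):
        plt.figure(figsize=(5, 5))
        sns.kdeplot(x=truth[:, i], y=pred[:, i], fill=True)
        lo, hi = truth[:, i].min(), truth[:, i].max()
        plt.plot([lo, hi], [lo, hi], "k--", linewidth=1)  # perfect-prediction reference line
        plt.xlabel(f"actual {name}")
        plt.ylabel(f"predicted {name}")
        plt.savefig(f"./readme_imgs/{name}.png", bbox_inches="tight")
        plt.close()
```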

Moving forward, we hope to continue improving the object tracker through alternative architectures and larger networks.