add tutorial for Img-Seq type input
yamy-cheng committed Apr 26, 2023
1 parent 847ff9a commit 34e51f3
Showing 9 changed files with 39 additions and 7 deletions.
7 changes: 4 additions & 3 deletions README.md
@@ -8,9 +8,10 @@
**Segment and Track Anything** is an open-source project that focuses on the segmentation and tracking of any objects in videos, utilizing both automatic and interactive methods. The primary algorithms utilized include the [**SAM** (Segment Anything Models)](https://github.com/facebookresearch/segment-anything) for automatic/interactive key-frame segmentation and the [**DeAOT** (Decoupling features in Associating Objects with Transformers)](https://github.com/yoxu515/aot-benchmark) (NeurIPS2022) for efficient multi-object tracking and propagation. The SAM-Track pipeline enables dynamic and automatic detection and segmentation of new objects by SAM, while DeAOT is responsible for tracking all identified objects.

## :loudspeaker:New Features
- [2023/4/26] **Image-Sequence input**: The WebUI now has a new feature that allows for input of image sequences, which can be used to test video segmentation datasets. Get started with the [tutorial](./tutorial/tutorial%20for%20Image-Sequence%20input.md) for Image-Sequence input.
- [2023/4/25] **Online Demo:** You can easily use SAMTrack in [Colab](https://colab.research.google.com/drive/1R10N70AJaslzADFqb-a5OihYkllWEVxB?usp=sharing) for visual tracking tasks.

- [2023/4/23] **Interactive WebUI:** We have introduced a new WebUI that allows interactive user segmentation through strokes and clicks. Feel free to explore and have fun with the [tutorial](./tutorial/1.0-Version.md)!
- [2023/4/23] **Interactive WebUI:** We have introduced a new WebUI that allows interactive user segmentation through strokes and clicks. Feel free to explore and have fun with the [tutorial](./tutorial/tutorial%20for%20WebUI-1.0-Version.md)!
- [2023/4/24] **Tutorial V1.0:** Check out our new video tutorials!
  - YouTube links: [Tutorial for Interactively modify single-object mask for first frame of video](https://www.youtube.com/watch?v=DF0iFSsX8KY), [Tutorial for Interactively add object by click](https://www.youtube.com/watch?v=UJvKPng9_DA), [Tutorial for Interactively add object by stroke](https://www.youtube.com/watch?v=m1oFavjIaCM).
  - Bilibili video links: [Tutorial for Interactively modify single-object mask for first frame of video](https://www.bilibili.com/video/BV1tM4115791/?spm_id_from=333.999.0.0), [Tutorial for Interactively add object by click](https://www.bilibili.com/video/BV1Qs4y1A7d1/), [Tutorial for Interactively add object by stroke](https://www.bilibili.com/video/BV1Lm4y117J4/?spm_id_from=333.999.0.0).
@@ -87,15 +88,15 @@ python app.py
```
Users can upload the video directly on the UI and use SegTracker to automatically/interactively track objects within that video. We use a video of a man playing basketball as an example.

![Interactive WebUI](./assets/interactive_weiui.jpg)
![Interactive WebUI](./assets/interactive_webui.jpg)

SegTracker-Parameters:
- **aot_model**: selects which version of DeAOT/AOT to use for tracking and propagation.
- **sam_gap**: controls how often SAM is run to add newly appearing objects, measured in frames. Increasing it reduces how frequently new objects are discovered but significantly speeds up inference.
- **points_per_side**: controls the number of points per side of the grid sampled over the image when generating masks. Increasing it improves the detection of small objects, but larger objects may be split into finer-grained segments.
- **max_obj_num**: limits the maximum number of objects that SAM-Track can detect and track. More objects require more memory; approximately 16 GB can handle up to 255 objects.
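To make the trade-offs above concrete, here is a hedged sketch: the dictionary collects the four parameters using the names from this list (the actual constructor signature in SAM-Track may differ), and the helper illustrates how `sam_gap` trades new-object discovery frequency against speed.

```python
# Illustrative only: parameter names mirror the list above; the real
# SegTracker constructor in SAM-Track may take them differently.
segtracker_args = {
    "aot_model": "deaotb",   # which DeAOT/AOT variant to use for propagation
    "sam_gap": 100,          # run SAM every 100 frames to find new objects
    "points_per_side": 16,   # 16x16 prompt grid for SAM mask generation
    "max_obj_num": 255,      # upper bound on tracked objects (~16 GB memory)
}

def estimate_sam_calls(num_frames: int, sam_gap: int) -> int:
    """Count the frames on which SAM runs: frame 0, then every `sam_gap` frames."""
    return 1 + (num_frames - 1) // sam_gap

# For a 300-frame video, sam_gap=100 means SAM runs on frames 0, 100, 200.
print(estimate_sam_calls(300, segtracker_args["sam_gap"]))  # → 3
```

A larger `sam_gap` shrinks this count, which is where the inference speed-up in the description comes from.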

Usage: To see the details, please refer to the [tutorial for 1.0-Version WebUI](./tutorial/1.0-Version.md).
Usage: To see the details, please refer to the [tutorial for 1.0-Version WebUI](./tutorial/tutorial%20for%20WebUI-1.0-Version.md).

### :full_moon_with_face:Credits
Licenses for borrowed code can be found in `licenses.md` file.
6 changes: 3 additions & 3 deletions app.py
@@ -272,7 +272,7 @@ def seg_track_app():
with gr.Row():
input_img_seq = gr.File(label='Input Image-Seq').style(height=550)
with gr.Column(scale=0.25):
unzip_button = gr.Button(value="unzip")
extract_button = gr.Button(value="extract")
fps = gr.Slider(label='fps', minimum=5, maximum=50, value=30, step=1)

input_first_frame = gr.Image(label='Segment result of first frame',interactive=True).style(height=550)
@@ -330,7 +330,7 @@ def seg_track_app():
)

with gr.Row():
with gr.Tab(label="SegTracker Args", scale=0.5):
with gr.Tab(label="SegTracker Args"):
with gr.Row():
# args for segment-everything tracking in the video
with gr.Column(scale=0.5):
@@ -414,7 +414,7 @@ def seg_track_app():
]
)

unzip_button.click(
extract_button.click(
fn=get_meta_from_img_seq,
inputs=[
input_img_seq
File renamed without changes
Binary file added tutorial/img/select_fps.jpg
Binary file added tutorial/img/switch2ImgSeq.jpg
Binary file added tutorial/img/upload_Image_seq.jpg
Binary file added tutorial/img/use_exa4ImgSeq.jpg
31 changes: 31 additions & 0 deletions tutorial/tutorial for Image-Sequence input.md
@@ -0,0 +1,31 @@
# Tutorial for Image-Sequence input

## Zip the Image-Sequence as input for the WebUI.
**The structure of `test-data-seq.zip` must look like this. Make sure the image names sort in ascending order.**
```
- test-data-seq
- 0.png
- 1.png
- 2.png
- 3.png
....
- x.png
```
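If your frames already sit in a folder, a short script can pack them into a zip with the required layout. This is a convenience sketch, not part of SAM-Track; the sequence name `test-data-seq` follows the example above.

```python
import os
import zipfile

def make_img_seq_zip(frame_dir: str, zip_path: str, seq_name: str = "test-data-seq") -> None:
    """Pack the PNG frames in frame_dir into a zip laid out as <seq_name>/<n>.png."""
    # Sort numerically so 10.png follows 9.png rather than 1.png.
    frames = sorted(
        (f for f in os.listdir(frame_dir) if f.endswith(".png")),
        key=lambda f: int(os.path.splitext(f)[0]),
    )
    with zipfile.ZipFile(zip_path, "w") as zf:
        for f in frames:
            zf.write(os.path.join(frame_dir, f), arcname=f"{seq_name}/{f}")
```

Note the numeric sort key: a plain lexicographic sort would order `10.png` before `2.png`, which breaks the ascending-order requirement.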

## Use the WebUI to test Image-Sequence data
### 1. Switch to the `Image-Seq type input` tab.

<p align="center"><img src="./img/switch2ImgSeq.jpg" width = "600" height = "300" alt="switch2ImgSeq"/> </p>

### 2. Upload the test dataset or use the provided examples directly.
- Once the test dataset has finished uploading, the WebUI will automatically extract the first frame and display it in the `Segment result of first frame` component.
- If you use the provided examples, you may need to manually extract the results by clicking the `extract` button.
- Below are examples of how to upload Image-Sequence data.

<p align="center"><img src="./img/upload_Image_seq.jpg" width = "600" height = "300"> <img src="./img/use_exa4ImgSeq.jpg" width = "600"></p>
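In `app.py` the `extract` button is wired to `get_meta_from_img_seq`; as a rough illustration of the first-frame lookup that step implies (this sketch is not the project's actual implementation), the WebUI needs to find the lowest-numbered frame in the uploaded archive:

```python
import os
import zipfile

def first_frame_name(zip_path: str) -> str:
    """Illustrative sketch: return the archive name of the lowest-numbered frame."""
    with zipfile.ZipFile(zip_path) as zf:
        frames = sorted(
            (n for n in zf.namelist() if n.endswith(".png")),
            # Sort numerically on the basename, e.g. "test-data-seq/0.png" -> 0.
            key=lambda n: int(os.path.splitext(os.path.basename(n))[0]),
        )
    return frames[0]
```

This is also why ascending numeric names matter: the first frame shown in `Segment result of first frame` is whichever file sorts lowest.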

### 3. Select fps for the output video

<p align="center"><img src="./img/select_fps.jpg" width = "600" height = "300"> </p>
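Since the input frames are fixed, the fps slider only determines how long the assembled output video runs; a quick sanity check of the relationship:

```python
def output_duration_seconds(num_frames: int, fps: int) -> float:
    """Duration of the assembled output video: frame count divided by chosen fps."""
    return num_frames / fps

# The same 150-frame sequence plays for 5 s at 30 fps, 30 s at 5 fps.
print(output_duration_seconds(150, 30))  # → 5.0
print(output_duration_seconds(150, 5))   # → 30.0
```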

### 4. You can follow the [tutorial for WebUI-1.0-Version](./tutorial%20for%20WebUI-1.0-Version.md) to obtain your result.
@@ -1,4 +1,4 @@
# 1.0-Version WebUI tutorial
# Tutorial for WebUI 1.0 Version

## Note:
- We recommend reinitializing SegTracker by clicking the `Reset button` after processing each video to avoid encountering bugs.
