Commit

update readme
logicwong committed Jul 25, 2022
1 parent 05975e6 commit 879fb1b
Showing 2 changed files with 65 additions and 43 deletions.
106 changes: 64 additions & 42 deletions README.md
@@ -39,19 +39,14 @@ OFA is a unified multimodal pretrained model that unifies modalities (i.e., cros
(e.g., image generation, visual grounding, image captioning, image classification, text generation, etc.)
to a simple sequence-to-sequence learning framework. For more information, please refer to our paper: [OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework](http://arxiv.org/abs/2202.03052).

We welcome contributions to our project. Feel free to contact us or send us issues/PRs!
<br></br>


# Online Demos
We provide online demos via Hugging Face Spaces for you to interact with our pretrained and finetuned models. Below are the links to the demos:
* [Generic Interface](https://huggingface.co/spaces/OFA-Sys/OFA-Generic_Interface)
* [Image Captioning](https://huggingface.co/spaces/OFA-Sys/OFA-Image_Caption)
* [Text-to-Image Generation](https://huggingface.co/spaces/OFA-Sys/OFA-Text2Image_Generation)
* [Visual Grounding](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Grounding)
* [Visual Question Answering](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Question_Answering)

We also provide Colab notebooks to help you understand the procedures. Click [here](colab.md) to check them out!
In the following, we provide:
* News about our recent updates;
* Online Demos with links to Hugging Face Spaces and Colab notebooks;
* Model card, including the official release of pretrained checkpoints (more can be found at [checkpoints.md](checkpoints.md); checkpoints for Hugging Face Transformers are available at [https://huggingface.co/OFA-Sys](https://huggingface.co/OFA-Sys)) and experimental results of OFA models of different sizes;
* Step-by-step instructions for pretraining and finetuning (covering almost all tasks presented in the paper);
* Case demonstration of OFA.

We sincerely welcome contributions to our project. Feel free to contact us or send us issues / PRs!
<br></br>


@@ -82,6 +77,18 @@ Also we provide Colab notebooks for you to better perceive the procedures. Click
<br></br>


# Online Demos
We provide online demos via Hugging Face Spaces for you to interact with our pretrained and finetuned models. Below are the links to the demos:
* [Generic Interface](https://huggingface.co/spaces/OFA-Sys/OFA-Generic_Interface)
* [Image Captioning](https://huggingface.co/spaces/OFA-Sys/OFA-Image_Caption)
* [Text-to-Image Generation](https://huggingface.co/spaces/OFA-Sys/OFA-Text2Image_Generation)
* [Visual Grounding](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Grounding)
* [Visual Question Answering](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Question_Answering)

We also provide Colab notebooks to help you understand the procedures. Click [here](colab.md) to check them out!
<br></br>


# Model Card
We list the parameters and pretrained checkpoints of OFAs below. For finetuned checkpoints, please refer to [checkpoints.md](checkpoints.md).

@@ -159,7 +166,50 @@ pip install -r requirements.txt
See [datasets.md](datasets.md) and [checkpoints.md](checkpoints.md).
<br></br>

# Pretraining
# Training & Inference
Below we provide methods for training and inference on different tasks. We provide both pretrained OFA-Large and OFA-Base in [checkpoints.md](checkpoints.md). The scripts mentioned in this section are prepared for OFA-Large. To reproduce the downstream results of OFA-Base, we also provide the corresponding finetuning and inference scripts in the `run_scripts/` folder.

We recommend organizing your workspace directory like this:
```
OFA/
├── checkpoints/
│   ├── ofa_base.pt
│   ├── ofa_large.pt
│   ├── caption_large_best_clean.pt
│   └── ...
├── criterions/
├── data/
├── dataset/
│   ├── caption_data/
│   ├── gigaword_data/
│   └── ...
├── fairseq/
├── models/
├── run_scripts/
├── tasks/
├── train.py
├── trainer.py
└── utils/
```
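
Before launching a script, it can help to confirm that the workspace matches the layout above. The following is a minimal sketch and not part of the OFA repository: the function name `missing_workspace_entries` and the two constants are hypothetical, and the expected entries simply mirror the top-level names in the tree shown above.

```python
from pathlib import Path

# Top-level entries taken from the recommended layout above.
EXPECTED_DIRS = [
    "checkpoints", "criterions", "data", "dataset", "fairseq",
    "models", "run_scripts", "tasks", "utils",
]
EXPECTED_FILES = ["train.py", "trainer.py"]


def missing_workspace_entries(root):
    """Return the expected directories and files that are absent under root."""
    root = Path(root)
    missing = [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    # Hypothetical usage: run from the directory that contains OFA/.
    for name in missing_workspace_entries("OFA"):
        print(f"missing: {name}")
```

An empty result only means the top-level names exist; it does not check that the checkpoints or datasets inside them are complete.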


## Image Processing
To process data efficiently, we do not store images as many small files; instead, we encode them as base64 strings.
Transforming an image file into a base64 string is simple. Run the following code:
```python
import base64
from io import BytesIO

from PIL import Image

file_name = "example.jpg"  # placeholder: replace with the path to your image file

img = Image.open(file_name)
img_buffer = BytesIO()
img.save(img_buffer, format=img.format)  # write the image into memory in its original format
byte_data = img_buffer.getvalue()
base64_str = base64.b64encode(byte_data)  # bytes
base64_str = base64_str.decode("utf-8")  # str
```
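
To read a stored string back, reverse the steps. The sketch below is illustrative rather than OFA's own data loader: all three helper names are hypothetical. Unlike the snippet above, the encoder here takes raw bytes directly instead of re-saving the image through PIL, so encoding and decoding round-trip exactly.

```python
import base64
from io import BytesIO


def encode_bytes_to_base64(data: bytes) -> str:
    """Encode raw image bytes as a UTF-8 base64 string."""
    return base64.b64encode(data).decode("utf-8")


def decode_base64_to_bytes(base64_str: str) -> bytes:
    """Recover the raw image bytes from a base64 string."""
    return base64.b64decode(base64_str)


def decode_base64_to_image(base64_str: str):
    """Open the decoded bytes as a PIL image (requires Pillow)."""
    from PIL import Image  # imported lazily so the byte helpers work without Pillow

    return Image.open(BytesIO(decode_base64_to_bytes(base64_str)))
```

For example, passing the `base64_str` produced above to `decode_base64_to_image` should give back an image with the same dimensions as the original file.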

## Pretraining
Below we provide methods for pretraining OFA.

<details>
@@ -194,34 +244,6 @@ INFO: Loaded checkpoint ../../checkpoints/ofa_large.pt
</pre>
</details>

<br></br>

# Finetuning & Inference
Below we provide methods for finetuning and inference on different downstream tasks. We provide both pretrained OFA-Large and OFA-Base in [checkpoints.md](checkpoints.md). The scripts mentioned in this section are prepared for OFA-Large. To reproduce the downstream results of OFA-Base, we also provide the corresponding finetuning and inference scripts in the `run_scripts/` folder.

We recommend organizing your workspace directory like this:
```
OFA/
├── checkpoints/
│   ├── ofa_base.pt
│   ├── ofa_large.pt
│   ├── caption_large_best_clean.pt
│   └── ...
├── criterions/
├── data/
├── dataset/
│   ├── caption_data/
│   ├── gigaword_data/
│   └── ...
├── fairseq/
├── models/
├── run_scripts/
├── tasks/
├── train.py
├── trainer.py
└── utils/
```

## Image Captioning
Below we provide procedures to reproduce the image captioning results reported in our paper.
<details>
2 changes: 1 addition & 1 deletion checkpoints.md
@@ -1,6 +1,6 @@
# Checkpoints

We provide links for you to download our checkpoints. We will release all the checkpoints including pretrained and finetuned models on different tasks.
We provide links for you to download our checkpoints, including pretrained and finetuned models on different tasks. If you would like to use OFA with Transformers, please download checkpoints at [https://huggingface.co/OFA-Sys](https://huggingface.co/OFA-Sys), and check the code in the branch `feature/add_transformers`.

## Pretraining
* <a href="https://ofa-beijing.oss-cn-beijing.aliyuncs.com/checkpoints/ofa_huge.pt"> Pre-trained checkpoint (OFA-Huge) </a> (~930M parameters)