update readme

OFA-Sys · Jul 25, 2022 · 879fb1b · 879fb1b
1 parent 05975e6
commit 879fb1b
Show file tree

Hide file tree

Showing 2 changed files with 65 additions and 43 deletions.
diff --git a/README.md b/README.md
@@ -39,19 +39,14 @@ OFA is a unified multimodal pretrained model that unifies modalities (i.e., cros
 (e.g., image generation, visual grounding, image captioning, image classification, text generation, etc.) 
 to a simple sequence-to-sequence learning framework. For more information, please refer to our paper: [OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework](http://arxiv.org/abs/2202.03052).
 
-We welcome contributions to our project. Feel free to contact us or send us issues/PRs!
-<br></br>
-
-
-# Online Demos
-We provide online demo via Hugging Face Spaces for you to interact with our pretrained and finetuned models. Below are the links to the demos:
-* [Generic Interface](https://huggingface.co/spaces/OFA-Sys/OFA-Generic_Interface)
-* [Image Captioning](https://huggingface.co/spaces/OFA-Sys/OFA-Image_Caption)
-* [Text-to-Image Generation](https://huggingface.co/spaces/OFA-Sys/OFA-Text2Image_Generation)
-* [Visual Grounding](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Grounding)
-* [Visual Question Answering](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Question_Answering)
-
-Also we provide Colab notebooks for you to better perceive the procedures. Click [here](colab.md) to check them out!
+In the following, we provide:
+* News about our recent updates;
+* Online Demos with links to Huggingface spaces and Colab notebooks;
+* Model card (including official release of pretrained checkpoints (more can be found at [checkpoints.md](checkpoints.md)), and we also provide checkpoints for Huggingface Transformers on [https://huggingface.co/OFA-Sys](https://huggingface.co/OFA-Sys)) and experimental results of OFA models of different sizes;
+* Step-by-step instructions of pretraining and finetuning (including almost all tasks presented in the paper);
+* Case demonstration of OFA.
+
+We sincerely welcome contributions to our project. Feel free to contact us or send us issues / PRs!
 <br></br>
 
 
@@ -82,6 +77,18 @@ Also we provide Colab notebooks for you to better perceive the procedures. Click
 <br></br>
 
 
+# Online Demos
+We provide online demo via Hugging Face Spaces for you to interact with our pretrained and finetuned models. Below are the links to the demos:
+* [Generic Interface](https://huggingface.co/spaces/OFA-Sys/OFA-Generic_Interface)
+* [Image Captioning](https://huggingface.co/spaces/OFA-Sys/OFA-Image_Caption)
+* [Text-to-Image Generation](https://huggingface.co/spaces/OFA-Sys/OFA-Text2Image_Generation)
+* [Visual Grounding](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Grounding)
+* [Visual Question Answering](https://huggingface.co/spaces/OFA-Sys/OFA-Visual_Question_Answering)
+
+Also we provide Colab notebooks for you to better perceive the procedures. Click [here](colab.md) to check them out!
+<br></br>
+
+
 # Model Card
 We list the parameters and pretrained checkpoints of OFAs below. For finetuned checkpoints, please refer to [checkpoints.md](checkpoints.md). 
 
@@ -159,7 +166,50 @@ pip install -r requirements.txt
 See [datasets.md](datasets.md) and [checkpoints.md](checkpoints.md).
 <br></br>
 
-# Pretraining
+# Training & Inference
+Below we provide methods for training and inference on different tasks. We provide both pretrained OFA-Large and OFA-Base in [checkpoints.md](checkpoints.md). The scripts mentioned in this section are prepared for OFA-Large. For reproducing the downstreaming results of OFA-Base, we have also provided the corresponding finetuning and inference scripts for OFA-Base in the `run_scripts/` folder.
+
+We recommend that your workspace directory should be organized like this: 
+```
+OFA/
+├── checkpoints/
+│   ├── ofa_base.pt
+│   ├── ofa_large.pt
+│   ├── caption_large_best_clean.pt
+│   └── ...
+├── criterions/
+├── data/
+├── dataset/
+│   ├── caption_data/
+│   ├── gigaword_data/
+│   └── ...
+├── fairseq/
+├── models/
+├── run_scripts/
+├── tasks/
+├── train.py
+├── trainer.py
+└── utils/
+```
+
+
+## Image Processing
+To ensure the efficiency of processing data, we did not store images with small files, but instead we encode them to base64 strings.
+Transforming image files to base64 strings is simple. Run the following code:
+```python
+from PIL import Image
+from io import BytesIO
+import base64
+
+img = Image.open(file_name) # path to file
+img_buffer = BytesIO()
+img.save(img_buffer, format=img.format)
+byte_data = img_buffer.getvalue()
+base64_str = base64.b64encode(byte_data) # bytes
+base64_str = base64_str.decode("utf-8") # str
+```
+
+## Pretraining
 Below we provide methods for pretraining OFA.
 
 <details>
@@ -194,34 +244,6 @@ INFO: Loaded checkpoint ../../checkpoints/ofa_large.pt
 </pre>
 </details>
 
-<br></br>
-
-# Finetuning & Inference
-Below we provide methods for finetuning and inference on different downstream tasks. We provide both pretrained OFA-Large and OFA-Base in [checkpoints.md](checkpoints.md). The scripts mentioned in this section are prepared for OFA-Large. For reproducing the downstreaming results of OFA-Base, we have also provided the corresponding finetuning and inference scripts for OFA-Base in the `run_scripts/` folder.
-
-We recommend that your workspace directory should be organized like this: 
-```
-OFA/
-├── checkpoints/
-│   ├── ofa_base.pt
-│   ├── ofa_large.pt
-│   ├── caption_large_best_clean.pt
-│   └── ...
-├── criterions/
-├── data/
-├── dataset/
-│   ├── caption_data/
-│   ├── gigaword_data/
-│   └── ...
-├── fairseq/
-├── models/
-├── run_scripts/
-├── tasks/
-├── train.py
-├── trainer.py
-└── utils/
-```
-
 ## Image Captioning
 We provide procedures to reproduce our results of image captioning on our paper below.
 <details>

diff --git a/checkpoints.md b/checkpoints.md
@@ -1,6 +1,6 @@
 # Checkpoints
 
-We provide links for you to download our checkpoints. We will release all the checkpoints including pretrained and finetuned models on different tasks. 
+We provide links for you to download our checkpoints, including pretrained and finetuned models on different tasks. If you would like to use OFA with Transformers, please download checkpoints at [https://huggingface.co/OFA-Sys](https://huggingface.co/OFA-Sys), and check the code in the branch `feature/add_transformers`. 
 
 ## Pretraining
 * <a href="https://ofa-beijing.oss-cn-beijing.aliyuncs.com/checkpoints/ofa_huge.pt"> Pre-trained checkpoint (OFA-Huge) </a> (~930M parameters)