Commit 2bd1a66

Merge pull request #20 from OpenGVLab/release
Release
2 parents 7835690 + bfad364 commit 2bd1a66

3 files changed: +8, -7 lines


VisionLLMv2/docs/data_det.md

Lines changed: 2 additions & 2 deletions
@@ -155,7 +155,7 @@ data/reasonseg
 
 ### COCO
 
-Follow the instructions below to prepare the data:
+Follow the instructions below to prepare the data (We follow the evaluation from [PSALM](https://github.com/zamling/PSALM)):
 
 ```
 # Step 1: Create the data directory
@@ -165,7 +165,7 @@ mkdir -p data/coco && cd data/coco
 wget http://images.cocodataset.org/zips/train2017.zip && unzip train2017.zip
 wget http://images.cocodataset.org/zips/val2017.zip && unzip val2017.zip
 
-# Step 3: Download and place the annotation files
+# Step 3: Download and place the annotation files from PSALM
 # Download the annotation files from official website https://drive.google.com/file/d/1EcC1tl1OQRgIqqy7KFG7JZz2KHujAQB3/view
 
 cd ../..
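After running the preparation steps in this diff, a quick sanity check of the resulting layout can be sketched as follows (paths taken from the commands above; adjust if your data root differs):

```shell
# Verify that the COCO image folders from Steps 1-2 exist where the
# docs expect them (data/coco/train2017 and data/coco/val2017).
for d in data/coco/train2017 data/coco/val2017; do
  if [ -d "$d" ]; then
    echo "ok: $d"
  else
    echo "missing: $d"
  fi
done
```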

VisionLLMv2/docs/eval_region-vqa.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ huggingface-cli download --resume-download --local-dir-use-symlinks False senten
 cd ..
 ```
 
-Specify the subset or full set (`lvis`, `paco`) you would like to evaluate in [visionllmv2/eval/eval_region_classification.py](https://github.com/OpenGVLab/VisionLLM/blob/7befe44a38f874fba6835445dbd0177f0b6b46d9/VisionLLMv2/visionllmv2/eval/eval_region_classification.py#L381).
+Specify the datasets (`lvis`, `paco`) you would like to evaluate in [visionllmv2/eval/eval_region_classification.py](https://github.com/OpenGVLab/VisionLLM/blob/7befe44a38f874fba6835445dbd0177f0b6b46d9/VisionLLMv2/visionllmv2/eval/eval_region_classification.py#L381).
 
 ```
 GPUS=8 bash scripts/vllmv2_7b/eval/dist_eval_region_classification.sh work_dirs/VisionLLMv2

VisionLLMv2/docs/install.md

Lines changed: 5 additions & 4 deletions
@@ -37,12 +37,13 @@ conda activate vllmv2
 pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
 ```
 
-Then, please refer to [install.sh](https://github.com/OpenGVLab/VisionLLM/blob/release/VisionLLMv2/docs/install.md) to install the necessary packages step by step.
+Then, please refer to [install.sh](https://github.com/OpenGVLab/VisionLLM/blob/release/VisionLLMv2/install.sh) to install the necessary packages step by step.
 
-- Additional:
+- Additionally:
 
-`pycocoevalcap` is used to evaluate the metrics for image/region captioning. You can install it by yourself.
-For your convenience, you can directly download it and unzip the file.
+`pycocoevalcap` is used to evaluate the metrics for image/region captioning. You can install it by yourself.
+
+For your convenience, we provide the full folder of pycocoevalcap. You can directly download and use it with the following commands.
 ```
 wget https://huggingface.co/OpenGVLab/VisionLLMv2/resolve/main/data/pycocoevalcap.zip
 unzip -qq pycocoevalcap.zip
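The wget/unzip commands above leave a pycocoevalcap folder in the working directory; a minimal check that it landed where the eval scripts can import it (directory name assumed from the zip) could look like:

```shell
# Check for the unzipped pycocoevalcap folder in the current directory;
# the captioning eval imports it from here.
if [ -d pycocoevalcap ]; then
  echo "pycocoevalcap folder present"
else
  echo "pycocoevalcap folder missing: re-run the wget/unzip steps"
fi
```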
