Documents Updates #2574

Merged
merged 13 commits on Dec 7, 2024
Rename Supported-models-datasets.md to s-m-a-d
yrk111222 committed Dec 6, 2024
commit a17ebf1f33fd678090023acc4ec30b8f02a32db3
4 changes: 2 additions & 2 deletions README.md
@@ -155,7 +155,7 @@ You can contact us and communicate with us by adding our group:
- 🔥2024.04.11: Support Model Evaluation with MMLU/ARC/CEval datasets (also user custom eval datasets) with one command! Check [this documentation](docs/source_en/Instruction/LLM-eval.md) for details. Meanwhile, we support a trick way to do multiple ablation experiments, check [this documentation](docs/source_en/Instruction/LLM-exp.md) to use.
- 🔥2024.04.11: Support **c4ai-command-r** series: c4ai-command-r-plus, c4ai-command-r-v01, use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/c4ai_command_r_plus/lora_mp/sft.sh) to train.
- 2024.04.10: Use SWIFT to fine-tune the qwen-7b-chat model to enhance its function call capabilities, and combine it with [Modelscope-Agent](https://github.com/modelscope/modelscope-agent) for best practices, which can be found [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM/Agent-best-practice.md#Usage-with-Modelscope_Agent).
- 🔥2024.04.09: Support ruozhiba dataset. Search `ruozhiba` in [this documentation](docs/source_en/Instruction/Supported-models-datasets.md) to begin training!
- 🔥2024.04.09: Support ruozhiba dataset. Search `ruozhiba` in [this documentation](docs/source_en/Instruction/Supported-models-and-datasets) to begin training!
Collaborator comment: Check whether the link is correct.

- 2024.04.08: Support the fine-tuning and inference of XVERSE-MoE-A4.2B model, use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/xverse_moe_a4_2b/lora/sft.sh) to start training!
- 2024.04.04: Support **QLoRA+FSDP** to train a 70B model with two 24G memory GPUs, use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh) to train.
- 🔥2024.04.03: Support **Qwen1.5-32B** series: Qwen1.5-32B, Qwen1.5-32B-Chat, Qwen1.5-32B-Chat-GPTQ-Int4. Use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/qwen1half_32b_chat/lora_mp/sft.sh) to start training!
@@ -586,7 +586,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
```

### Supported Models
The complete list of supported models and datasets can be found at [Supported Models and Datasets List](docs/source_en/Instruction/Supported-models-datasets.md).
The complete list of supported models and datasets can be found at [Supported Models and Datasets List](docs/source_en/Instruction/Supported-models-and-datasets.md).

#### LLMs

2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Commend-line-parameters.md
@@ -42,7 +42,7 @@ The introduction to command line parameters will cover basic parameters, atomic

### Template Parameters

- 🔥template: Template type, default uses the corresponding template type of the model. If it is a custom model, please refer to [Supported Models and Datasets](./Supported-models-datasets.md) and manually input this field.
- 🔥template: Template type, default uses the corresponding template type of the model. If it is a custom model, please refer to [Supported Models and Datasets](./Supported-models-and-datasets) and manually input this field.
- 🔥system: Custom system field, default is None, uses the default system of the template.
- 🔥max_length: Maximum length of tokens for a single sample, default is None (no limit).
- truncation_strategy: How to handle overly long tokens, supports `delete` and `left`, representing deletion and left trimming, default is left.
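The template parameters above correspond to `swift` command-line flags. As an illustration only (a sketch, not part of this diff), they might be combined as follows when fine-tuning a custom model; the `--model` and `--dataset` flags, the `qwen` template name, and all values here are placeholder assumptions:

```shell
# Hedged sketch: manually setting the template parameters for a custom model.
# <local_model_path> and <dataset_id> are placeholders; the exact flag set is assumed.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model <local_model_path> \
    --template qwen \
    --system 'You are a helpful assistant.' \
    --max_length 2048 \
    --truncation_strategy left \
    --dataset <dataset_id>
```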
docs/source_en/Instruction/Supported-models-datasets.md → docs/source_en/Instruction/Supported-models-and-datasets.md
@@ -1,4 +1,3 @@
<<<<<<< HEAD:docs/source_en/Instruction/Supported-models-datasets.md
# Supported models and datasets

## Models
@@ -1155,7 +1154,6 @@ The table below provides information about the datasets integrated with Swift:
- Dataset Size: Size of the sub-dataset
- Statistic: Statistics of the dataset. We use the number of tokens for statistics, which helps adjust the `max_length` hyperparameter. We concatenate the training and validation sets and then perform statistics. We use Qwen's tokenizer for tokenization. Different tokenizers produce different statistics. If you need token statistics for other models' tokenizers, you can obtain them yourself through [the script](https://github.com/modelscope/swift/tree/main/scripts/utils/run_dataset_info.py).
- Tags: Tags of the dataset
>>>>>>> e0e1aaa0 (Translation):docs/source_en/Instruction/Supported Models and Datasets.md


| Dataset ID | Subset name | Dataset Size | Statistic (token) | Tags | HF Dataset ID |
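The statistics bullet above notes that token counts for other tokenizers can be regenerated with `scripts/utils/run_dataset_info.py`, the script updated in the next hunk. A hedged sketch of that workflow, assuming the script is run from the repository root and takes no required arguments:

```shell
# Assumed invocation: rebuild the dataset statistics tables in both the
# Chinese and English Supported-models-and-datasets documents (the `fpaths`
# targets shown in write_dataset_info below).
python scripts/utils/run_dataset_info.py
```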
2 changes: 1 addition & 1 deletion scripts/utils/run_dataset_info.py
@@ -75,7 +75,7 @@ def run_dataset(key, template, cache_mapping):


def write_dataset_info() -> None:
fpaths = ['docs/source/Instruction/支持的模型和数据集.md', 'docs/source_en/Instruction/Supported-models-datasets.md']
fpaths = ['docs/source/Instruction/支持的模型和数据集.md', 'docs/source_en/Instruction/Supported-models-and-datasets.md']
cache_mapping = get_cache_mapping(fpaths[0])
res_text_list = []
res_text_list.append('| Dataset ID | Subset name | Dataset Size | Statistic (token) | Tags | HF Dataset ID |')
2 changes: 1 addition & 1 deletion scripts/utils/run_model_info.py
@@ -4,7 +4,7 @@


def get_model_info_table():
fpaths = ['docs/source/Instruction/支持的模型和数据集.md', 'docs/source_en/Instruction/Supported-models-datasets.md']
fpaths = ['docs/source/Instruction/支持的模型和数据集.md', 'docs/source_en/Instruction/Supported-models-and-datasets.md']
end_words = [['### 多模态大模型', '## 数据集'], ['### MLLM', '## Datasets']]
result = [
'| Model ID | Model Type | Default Template | '