
Commit 01f49fa

v1

1 parent 88de445 commit 01f49fa

32 files changed: +154350 −19 lines changed


saves/v1/README.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
---
license: other
library_name: peft
tags:
- llama-factory
- lora
- generated_from_trainer
base_model: /root/autodl-tmp/qwen/Qwen-7B-Chat
model-index:
- name: v1
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# v1

This model is a fine-tuned version of [/root/autodl-tmp/qwen/Qwen-7B-Chat](https://huggingface.co//root/autodl-tmp/qwen/Qwen-7B-Chat) on the ft_data_train and the ft_data_test datasets.
It achieves the following results on the evaluation set:
- Loss: 0.6572
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 2.0
- mixed_precision_training: Native AMP
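These settings map onto Hugging Face `TrainingArguments` roughly as sketched below. This is a hypothetical reconstruction for reference only: the run itself was launched through LLaMA-Factory, and the `output_dir` is assumed from this commit's layout, not taken from the actual launch config.

```python
# Sketch: the hyperparameters above as transformers TrainingArguments (v4.40).
# Hypothetical reconstruction; LLaMA-Factory drives the Trainer via its own CLI.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="saves/v1",            # assumed from this commit's layout
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,    # total_train_batch_size = 1 * 4 = 4
    num_train_epochs=2.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    seed=42,
    fp16=True,                        # "Native AMP" mixed precision
)
# Adam betas (0.9, 0.999) and epsilon 1e-8 are the Trainer's optimizer defaults.
```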
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7559        | 0.2949 | 200  | 0.7451          |
| 0.7838        | 0.5898 | 400  | 0.7202          |
| 0.9792        | 0.8846 | 600  | 0.7065          |
| 0.5938        | 1.1795 | 800  | 0.6733          |
| 0.602         | 1.4744 | 1000 | 0.6753          |
| 0.5666        | 1.7693 | 1200 | 0.6572          |
### Framework versions

- PEFT 0.10.0
- Transformers 4.40.1
- Pytorch 2.1.2+cu121
- Datasets 2.18.0
- Tokenizers 0.19.1
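To run inference with this adapter, one would attach it to the base checkpoint via PEFT. A minimal sketch, assuming the paths above and Qwen-7B-Chat's `chat()` interface; none of this code is part of the commit:

```python
# Minimal sketch: load the LoRA adapter in saves/v1 over Qwen-7B-Chat.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "/root/autodl-tmp/qwen/Qwen-7B-Chat"  # base_model path from the card
tokenizer = AutoTokenizer.from_pretrained(base_path, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_path, device_map="auto", trust_remote_code=True
)
model = PeftModel.from_pretrained(base, "saves/v1")  # the 34.2 MB adapter below

# Qwen's remote code exposes chat(); PeftModel forwards the call to the base model.
response, _ = model.chat(tokenizer, "Hello", history=None)
print(response)
```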

saves/v1/adapter_config.json

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "/root/autodl-tmp/qwen/Qwen-7B-Chat",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 16,
  "lora_dropout": 0.0,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 8,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "c_proj",
    "c_attn",
    "w2",
    "w1"
  ],
  "task_type": "CAUSAL_LM",
  "use_dora": false,
  "use_rslora": false
}
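For reference, the same settings expressed as a `peft` `LoraConfig` (a sketch against peft 0.10, not code from this commit):

```python
# Sketch: the adapter_config.json above as a LoraConfig object (peft 0.10).
from peft import LoraConfig

config = LoraConfig(
    r=8,                    # LoRA rank
    lora_alpha=16,          # effective scaling = lora_alpha / r = 2.0
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["c_proj", "c_attn", "w2", "w1"],  # Qwen attention + MLP projections
)
```

Rank-8 updates on these four projection types are what keep the adapter at the 34.2 MB shown below, a small fraction of the full 7B-parameter base model.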

saves/v1/adapter_model.safetensors

34.2 MB
Binary file not shown.

saves/v1/all_results.json

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
{
  "epoch": 1.9992628086988573,
  "eval_loss": 0.6572265625,
  "eval_runtime": 200.1862,
  "eval_samples_per_second": 1.509,
  "eval_steps_per_second": 1.509,
  "total_flos": 931691237146624.0,
  "train_loss": 0.8344805894699772,
  "train_runtime": 10334.0354,
  "train_samples_per_second": 0.525,
  "train_steps_per_second": 0.131
}
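A quick sanity check on these numbers, assuming `eval_loss` is mean token cross-entropy:

```python
import math
print(math.exp(0.6572265625))  # ≈ 1.93: eval perplexity, if eval_loss is mean cross-entropy
print(10334.0354 / 3600)       # ≈ 2.87: training runtime in hours
```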

saves/v1/eval_results.json

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
{
  "epoch": 1.9992628086988573,
  "eval_loss": 0.6572265625,
  "eval_runtime": 200.1862,
  "eval_samples_per_second": 1.509,
  "eval_steps_per_second": 1.509
}
