Software Environment (Mandatory):
MindSpore:2.3.1
mindnlp:0.4.1
from mindnlp.engine import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./vit-base-food101",
    per_device_train_batch_size=16,
    evaluation_strategy="steps",
    num_train_epochs=4,
    fp16=True,
    save_steps=100,
    eval_steps=100,
    logging_steps=10,
    learning_rate=2e-4,
    save_total_limit=2,
    remove_unused_columns=True,
    load_best_model_at_end=True,
)

import numpy as np
import evaluate

metric = evaluate.load("accuracy")

# The compute_metrics function takes a named tuple as input:
# predictions, which are the logits of the model as NumPy arrays,
# and label_ids, which are the ground-truth labels as NumPy arrays.
def compute_metrics(eval_pred):
    """Computes accuracy on a batch of predictions."""
    predictions = np.argmax(eval_pred.predictions, axis=1)
    return metric.compute(predictions=predictions, references=eval_pred.label_ids)
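
# lora_model, train_ds, val_ds and image_processor are created earlier
# in the script and are not shown in this report.
trainer = Trainer(
    model=lora_model,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    tokenizer=image_processor,
)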
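
To rule out the metric function as the source of the problem, the argmax-then-compare logic of compute_metrics can be checked in isolation. The sketch below is only an illustration: ToyEvalPrediction and compute_accuracy are hypothetical names, the logits and labels are made-up toy values, and accuracy is computed directly with NumPy instead of through evaluate so that nothing needs to be downloaded.

```python
from typing import NamedTuple

import numpy as np


class ToyEvalPrediction(NamedTuple):
    # Hypothetical stand-in for the object the trainer passes to compute_metrics.
    predictions: np.ndarray  # logits, shape (num_samples, num_classes)
    label_ids: np.ndarray    # ground-truth class indices, shape (num_samples,)


def compute_accuracy(eval_pred):
    """Same argmax-then-compare logic, without the evaluate dependency."""
    predicted = np.argmax(eval_pred.predictions, axis=1)
    return {"accuracy": float(np.mean(predicted == eval_pred.label_ids))}


# Toy data: four samples, two classes. The third sample is misclassified.
toy = ToyEvalPrediction(
    predictions=np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]),
    label_ids=np.array([1, 0, 0, 0]),
)
result = compute_accuracy(toy)
print(result)  # {'accuracy': 0.75}
```

Note that a function of this shape only ever returns an accuracy entry; it does not produce a loss value.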
Describe the bug (Mandatory)
Calling trainer.train() raises KeyError: 'eval_loss', whereas the equivalent PyTorch code runs without error.
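
One possible explanation, not confirmed in this report: in the Hugging Face Trainer, setting load_best_model_at_end=True without metric_for_best_model makes the trainer track "loss", which it looks up as "eval_loss" in the evaluation metrics at each checkpoint. If the evaluation step returns no loss (for example because the labels never reach the model), that lookup fails. The sketch below reproduces only that lookup pattern in plain Python; pick_best_metric is a hypothetical helper, the metric values are made up, and whether mindnlp 0.4.1 follows the same logic is an assumption.

```python
# Hypothetical stand-in for the trainer's best-model bookkeeping; not mindnlp code.
def pick_best_metric(metrics, metric_for_best_model="loss"):
    key = metric_for_best_model
    if not key.startswith("eval_"):
        key = f"eval_{key}"
    # Raises KeyError when evaluation did not report the tracked metric.
    return metrics[key]


# Made-up evaluation output that contains accuracy but no loss.
metrics_without_loss = {"eval_accuracy": 0.91, "eval_runtime": 12.3}

try:
    pick_best_metric(metrics_without_loss)
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]
    print("missing metric:", missing_key)

# Tracking a metric that is actually present avoids the failing lookup.
best = pick_best_metric(metrics_without_loss, metric_for_best_model="accuracy")
print("best-model metric value:", best)
```

If this is indeed the cause, two things may be worth trying, assuming mindnlp's TrainingArguments accepts the same options as the Hugging Face one: passing metric_for_best_model="accuracy", or passing label_names=["labels"] so that the loss is computed during evaluation of the LoRA-wrapped model.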
Hardware Environment (Ascend/GPU/CPU): Ascend
Execute Mode (Mandatory) (PyNative/Graph): Graph
To Reproduce (Mandatory)
After setting up the Trainer with the code above, running train_results = trainer.train() raises the error.
Expected behavior (Mandatory)
Training should run to completion, but it only reaches epoch 0.36 before failing.
Screenshots / Logs (Mandatory)