Description
Bug summary
In the PyTorch backend, setting batch_size to a list raises the ValueError shown below.
DeePMD-kit Version
v3.0.0a0-28-ged831c88
TensorFlow Version
PT v2.2.0+cu121-g8ac9b20d4b0
How did you download the software?
Built from source
Input Files, Running Commands, Error Log, etc.
Traceback (most recent call last):
  File "/home/jz748/anaconda3/bin/dp", line 8, in <module>
    sys.exit(main())
  File "/home/jz748/codes/deepmd-kit/deepmd/main.py", line 807, in main
    deepmd_main(args)
  File "/home/jz748/anaconda3/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/home/jz748/codes/deepmd-kit/deepmd/pt/entrypoints/main.py", line 306, in main
    train(FLAGS)
  File "/home/jz748/codes/deepmd-kit/deepmd/pt/entrypoints/main.py", line 270, in train
    trainer = get_trainer(
  File "/home/jz748/codes/deepmd-kit/deepmd/pt/entrypoints/main.py", line 166, in get_trainer
    ) = prepare_trainer_input_single(
  File "/home/jz748/codes/deepmd-kit/deepmd/pt/entrypoints/main.py", line 149, in prepare_trainer_input_single
    train_data_single = DpLoaderSet(
  File "/home/jz748/codes/deepmd-kit/deepmd/pt/utils/dataloader.py", line 129, in __init__
    system_dataloader = DataLoader(
  File "/home/jz748/anaconda3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 356, in __init__
    batch_sampler = BatchSampler(sampler, batch_size, drop_last)
  File "/home/jz748/anaconda3/lib/python3.10/site-packages/torch/utils/data/sampler.py", line 267, in __init__
    raise ValueError(f"batch_size should be a positive integer value, but got batch_size={batch_size}")
ValueError: batch_size should be a positive integer value, but got batch_size=[1, 1, 1]
Steps to Reproduce
cd examples/water/se_atten
Make the following modification:
diff --git a/examples/water/se_atten/input_torch.json b/examples/water/se_atten/input_torch.json
index 7e9cf06f..0188228e 100644
--- a/examples/water/se_atten/input_torch.json
+++ b/examples/water/se_atten/input_torch.json
@@ -68,7 +68,7 @@
"../data/data_1",
"../data/data_2"
],
- "batch_size": 1,
+ "batch_size": [1, 1, 1],
"_comment": "that's all"
},
"validation_data": {
Then run
dp --pt train input_torch.json
Further Information, Files, and Links
The documentation needs to be updated if this cannot be resolved before the stable release.
https://docs.deepmodeling.com/projects/deepmd/en/latest/train/train-input.html#argument:training/training_data/batch_size
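The traceback shows the whole list reaching a single torch DataLoader, whose BatchSampler only accepts one positive integer. One possible way to handle a list, sketched below purely for illustration, is to expand batch_size into one integer per system before any loader is built. The helper name expand_batch_size and its n_systems parameter are hypothetical and are not part of DeePMD-kit; string values of batch_size are outside the scope of this sketch.

```python
from typing import List, Sequence, Union


def expand_batch_size(
    batch_size: Union[int, Sequence[int]], n_systems: int
) -> List[int]:
    """Return one positive integer batch size per system.

    Hypothetical helper for illustration only; not DeePMD-kit code.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(batch_size, bool):
        raise TypeError("batch_size must be an int or a list of ints")
    if isinstance(batch_size, int):
        # A single integer applies to every system.
        sizes = [batch_size] * n_systems
    elif isinstance(batch_size, (list, tuple)):
        sizes = list(batch_size)
        if len(sizes) != n_systems:
            raise ValueError(
                f"got {len(sizes)} batch sizes for {n_systems} systems"
            )
    else:
        raise TypeError("batch_size must be an int or a list of ints")
    # Each entry must satisfy the same constraint that BatchSampler enforces.
    for size in sizes:
        if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
            raise ValueError(f"batch size must be a positive integer, got {size!r}")
    return sizes


# The configuration from the diff above: three systems, batch size 1 each.
per_system = expand_batch_size([1, 1, 1], n_systems=3)
print(per_system)  # [1, 1, 1]
```

With such a helper, the loader for system i would receive per_system[i] instead of the full list, which is the condition the ValueError above is checking for.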