Description
I encountered this error:
[2025-02-11 20:20:56,083][lcm.datasets.configs][INFO] - Parsing pretraining_data : card_name=pretraining_data, split=, weight=None
0%| | 0/1 [00:00<?, ?it/s]s3
libmkl_rt.so: cannot open shared object file: No such file or directory
/usr/lib/python3.10/ctypes/__init__.py(374)__init__()
-> self._handle = _dlopen(self._name, mode)
when running this command:
CUDA_VISIBLE_DEVICES=0 torchrun --standalone --nnodes=1 --nproc-per-node=1 \
    -m lcm.train launcher=standalone \
    +finetune=two_tower \
    ++trainer.output_dir="checkpoints/finetune_two_tower_lcm" \
    ++trainer.data_loading_config.max_tokens=1000 \
    +trainer.use_submitit=false \
    ++trainer.model_config_or_name=my_pretrained_two_tower
Could you please help me solve this?
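
In case it helps with diagnosis, here is a minimal sketch for checking, outside the training run, whether the dynamic loader can open libmkl_rt.so in the same Python environment. It uses the same ctypes mechanism that appears in the traceback. The helper name can_load is hypothetical and is not part of lcm; the script only reports whether the library is found and does not fix anything.

```python
import ctypes
import ctypes.util


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can open lib_name (hypothetical helper)."""
    try:
        # Same call path as in the traceback: CDLL() ends up in _dlopen().
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False


if __name__ == "__main__":
    name = "libmkl_rt.so"
    print(f"{name} loadable: {can_load(name)}")
    # find_library expects the name without the "lib" prefix and ".so" suffix.
    print(f"find_library('mkl_rt'): {ctypes.util.find_library('mkl_rt')}")
```

If this reports that the library is not loadable, the file is probably either not installed in this environment or not in a directory the loader searches (for example, one listed in LD_LIBRARY_PATH).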