Replies: 2 comments
-
Try deleting the kohya_ss folder and starting the installation from scratch. Something does not appear to be installed properly.
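Before reinstalling, it may help to confirm what the traceback is complaining about. The sketch below is only a diagnostic, not a fix: run inside the kohya_ss venv, it asks PyTorch which distributed backends the installed build supports. The helper names `probe_backends` and `explain` are made up for this example; the `torch.distributed` calls are the standard availability checks.

```python
import platform


def probe_backends():
    """Return which torch.distributed backends this PyTorch build supports.

    Returns None when PyTorch cannot be imported at all.
    """
    try:
        import torch.distributed as dist
    except ImportError:
        return None
    if not dist.is_available():
        # Build was compiled without any distributed support.
        return {"nccl": False, "gloo": False}
    return {
        "nccl": dist.is_nccl_available(),
        "gloo": dist.is_gloo_available(),
    }


def explain(backends, system=None):
    """Turn the probe result into a short human-readable hint (hypothetical helper)."""
    system = system or platform.system()
    if backends is None:
        return "PyTorch could not be imported; the venv may be broken."
    if not backends["nccl"]:
        return (
            f"This PyTorch build on {system} has no NCCL, so any launch that "
            "requests the 'nccl' backend fails as in the traceback."
        )
    return "NCCL is available; the failure likely has another cause."


if __name__ == "__main__":
    print(explain(probe_backends()))
```

If this reports that NCCL is missing, the traceback suggests accelerate was asked to start distributed training with the `nccl` backend. One thing worth trying (an assumption, not something confirmed in this thread) is re-running `accelerate config` and choosing a single-machine, non-distributed setup.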
-
I have tried.
-
Hi everyone,
When I tried to train with K-SS, I got this message.
What is my mistake?
[Dataset 0]
loading image sizes.
100%|██████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<?, ?it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (512, 512), count: 760
mean ar error (without repeats): 0.0
prepare accelerator
[W ..\torch\csrc\distributed\c10d\socket.cpp:558] [c10d] The client socket has failed to connect to [juju_s_pc]:29500 (system error: 10049 - L'adresse demandée n'est pas valide dans son contexte.).
[W ..\torch\csrc\distributed\c10d\socket.cpp:558] [c10d] The client socket has failed to connect to [juju_s_pc]:29500 (system error: 10049 - L'adresse demandée n'est pas valide dans son contexte.).
Traceback (most recent call last):
  File "C:\Windows\System32\kohya_ss\train_network.py", line 659, in <module>
    train(args)
  File "C:\Windows\System32\kohya_ss\train_network.py", line 108, in train
    accelerator, unwrap_model = train_util.prepare_accelerator(args)
  File "C:\Windows\System32\kohya_ss\library\train_util.py", line 1984, in prepare_accelerator
    accelerator = Accelerator(gradient_accumulation_steps=args.gradient_accumulation_steps, mixed_precision=args.mixed_precision,
  File "C:\WINDOWS\system32\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 308, in __init__
    self.state = AcceleratorState(
  File "C:\WINDOWS\system32\kohya_ss\venv\lib\site-packages\accelerate\state.py", line 150, in __init__
    torch.distributed.init_process_group(backend="nccl", **kwargs)
  File "C:\WINDOWS\system32\kohya_ss\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 602, in init_process_group
    default_pg = _new_process_group_helper(
  File "C:\WINDOWS\system32\kohya_ss\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 727, in _new_process_group_helper
    raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 400) of binary: C:\WINDOWS\system32\kohya_ss\venv\Scripts\python.exe