Open
Description
When I try to run the following command:
torchrun --nproc-per-node=1 projects/inference_seedvr2_3b.py --video_path /home/vinicius/Vídeos/teste_input/teste_in.mkv --output_dir /home/vinicius/Vídeos/teste_output --seed 1 --res_h 1080 --res_w 1440 --sp_size 4
I got the following output:
Traceback (most recent call last):
  File "/home/vinicius/SeedVR/projects/inference_seedvr2_3b.py", line 26, in <module>
    from data.image.transforms.divisible_crop import DivisibleCrop
ModuleNotFoundError: No module named 'data'
E0104 07:15:47.745000 140434805777664 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: 1) local_rank: 0 (pid: 375784) of binary: /home/vinicius/.conda/envs/seedvr/bin/python3.10
Traceback (most recent call last):
  File "/home/vinicius/.conda/envs/seedvr/bin/torchrun", line 7, in <module>
    sys.exit(main())
  File "/home/vinicius/.conda/envs/seedvr/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/home/vinicius/.conda/envs/seedvr/lib/python3.10/site-packages/torch/distributed/run.py", line 879, in main
    run(args)
  File "/home/vinicius/.conda/envs/seedvr/lib/python3.10/site-packages/torch/distributed/run.py", line 870, in run
    elastic_launch(
  File "/home/vinicius/.conda/envs/seedvr/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/vinicius/.conda/envs/seedvr/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 263, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
projects/inference_seedvr2_3b.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2026-01-04_07:15:47
  host      : AlienArchLinux
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 375784)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
I followed all the instructions on the page, but I can't figure out how to fix this. Any help is appreciated.
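A possible explanation, offered as an assumption rather than a confirmed diagnosis: when Python runs a script by path, it puts the script's own directory (here `projects/`) at the front of `sys.path`, not the directory the command was launched from. If the `data` package lives in the repository root, the import on line 26 would fail exactly as shown. The sketch below illustrates one way to check this and work around it. The helper names `find_package_root` and `ensure_importable` are hypothetical and not part of SeedVR or PyTorch.

```python
import sys
from pathlib import Path
from typing import Optional


def find_package_root(start: Path, package: str = "data") -> Optional[Path]:
    """Walk upward from `start` and return the first directory that contains `package`."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / package).is_dir():
            return candidate
    return None


def ensure_importable(script_path: str, package: str = "data") -> Optional[Path]:
    """Prepend the directory holding `package` to sys.path, if one is found.

    Assumption: the package directory sits in an ancestor of the script's folder
    (for example, the repository root above `projects/`).
    """
    root = find_package_root(Path(script_path).parent, package)
    if root is not None and str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

Calling `ensure_importable(__file__)` before the failing import would be one way to try this inside the script. An alternative that needs no code change is to launch from the repository root with that root on `PYTHONPATH`, for example by prefixing the original command with `PYTHONPATH=.`. Either approach only helps if the missing `data` directory really is in the repository root.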