-
Notifications
You must be signed in to change notification settings - Fork 174
Description
Hi
I am trying to train flowtron from scratch using custom data with 107 speakers (english) and I am getting this error. Slogged on it for 2 days but couldnt find a solution. Any help would be highly appreciated.
Number of speakers : 107
Number of speakers : 107
Setting up Tensorboard log in outdir/logs
Epoch: 0
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [96,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [97,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [98,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [99,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [100,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [101,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [102,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [103,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [104,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [105,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [106,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [107,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [108,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [109,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [110,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [111,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [112,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [113,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [114,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [115,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [116,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [117,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [118,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [119,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [120,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [121,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [122,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [0,0,0], thread: [123,0,0] Assertion srcIndex < srcSelectDimSize failed.
Traceback (most recent call last):
File "./train.py", line 443, in
train(n_gpus, rank, **train_config)
File "./train.py", line 321, in train
attn_logprob, mean, log_var, prob) = model(
File "/home/ubuntu/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/flowtron/flowtron.py", line 875, in forward
text = self.encoder(text, in_lens)
File "/home/ubuntu/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/flowtron/flowtron.py", line 498, in forward
mask = get_mask_from_lengths(in_lens).unsqueeze(1) if x.size(0) > 1 else None
File "/home/ubuntu/flowtron/flowtron.py", line 47, in get_mask_from_lengths
max_len = torch.max(lengths).item()
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
When I take a subset of single speaker from the same training data, its working normally. What am I missing?