Hello,
I am using ImageNet 64x64 and running the code with the following command:
python BigGAN-PyTorch/train.py --dataset I64_hdf5 --parallel --shuffle --num_workers 8 --batch_size 128 --num_G_accumulations 1 --num_D_accumulations 1 --num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 --G_attn 32 --D_attn 32 --G_nl relu --D_nl relu --SN_eps 1e-8 --BN_eps 1e-5 --adam_eps 1e-8 --G_ortho 0.0 --G_init xavier --D_init xavier --G_eval_mode --G_ch 32 --D_ch 32 --ema --use_ema --ema_start 2000 --test_every 5000 --save_every 1000 --num_best_copies 5 --num_save_copies 2 --seed 0 --which_best FID --num_iters 200000 --num_epochs 1000 --embedding inceptionv3 --density_measure gaussian --retention_ratio 50
and I get this error:
File "train.py", line 229, in <module>
main()
File "train.py", line 226, in main
run(config)
File "train.py", line 184, in run
metrics = train(x, y)
File "/BigGAN-PyTorch/train_fns.py", line 42, in train
split_D=config['split_D'])
File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 140, in forward
return self.module(*inputs, **kwargs)
File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/BigGAN-PyTorch/BigGAN.py", line 443, in forward
D_out = self.D(D_input, D_class)
File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/BigGAN-PyTorch/BigGAN.py", line 403, in forward
out = out + torch.sum(self.embed(y) * h, 1, keepdim=True)
RuntimeError: CUDA error: device-side assert triggered
Interestingly, when I create a "mini dataset" by randomly selecting 500 images per label from the original ImageNet dataset, the code runs fine. What could be causing the problem, and how can I solve it?
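Since the traceback ends at the self.embed(y) lookup, one thing that may be worth ruling out is a class label outside the range the embedding expects. This is only a guess, not a confirmed cause. Below is a minimal sketch of such a check in plain Python; summarize_labels, the sample labels, and n_classes=1000 are hypothetical names and values used for illustration, not part of the BigGAN-PyTorch code.

```python
def summarize_labels(labels, n_classes):
    """Report basic label statistics and any values outside [0, n_classes)."""
    labels = [int(v) for v in labels]
    # Collect the distinct labels that an embedding with n_classes rows cannot index.
    out_of_range = sorted({v for v in labels if v < 0 or v >= n_classes})
    return {
        "min": min(labels),
        "max": max(labels),
        "num_distinct": len(set(labels)),
        "out_of_range": out_of_range,
    }


# Hypothetical labels: 1000 is invalid if the model expects classes 0..999.
report = summarize_labels([0, 1, 999, 1000], n_classes=1000)
print(report)
```

On the real data, the labels would have to be read from the HDF5 file first (the dataset key holding them is an assumption to verify against the file). Running with the environment variable CUDA_LAUNCH_BLOCKING=1, or running a single batch on the CPU, usually gives a more precise error location than the asynchronous CUDA assert.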