
RuntimeError: Magnitude of gradient is bad: inf #28

@robvanderg

train.conllu.txt
tune.conllu.txt

While tuning some of the parameters across multiple treebanks (1296 total runs), I got this error for about 60 of the runs:

  File "uuparser/parser.py", line 325, in main
    run(experiment,options)
  File "uuparser/parser.py", line 51, in run
    parser.Train(traindata,options)
  File "/data/rob/datasplits/uuparser/uuparser/arc_hybrid.py", line 448, in Train
    self.trainer.update()
  File "_dynet.pyx", line 6198, in _dynet.Trainer.update
  File "_dynet.pyx", line 6203, in _dynet.Trainer.update
RuntimeError: Magnitude of gradient is bad: inf

After rerunning the exact same commands, 24 of these 60 runs still fail with the error. I could not find a clear pattern in why or when this happens, and it occurred on two different machines.

One example where this occurs is the following command:
python3 uuparser/parser.py --trainfile ../ --devfile ../newsplits-v2.7/UD_Arabic-PADT/tune.conllu --learning-rate 0.01 --word-emb-size 100 --char-emb-size 500 --no-bilstms 1 --outdir models/UD_Arabic-PADT.0.0.01.100.500.1

I have attached the files I used. With them, the error occurs at epoch 11.

PS. This can be worked around by resuming training after the crash with --continue (which I have now done for my own experiments).
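
For anyone hitting the same thing in a large sweep, the --continue workaround can be automated roughly as follows. This is only a sketch, not part of uuparser: `run_with_resume`, `max_restarts` and `runner` are names made up for illustration, and it assumes that appending `--continue` to the original command is enough to resume the run (other options may be needed in your setup).

```python
import subprocess

# Substring of the error message reported in the traceback above.
GRADIENT_ERROR = "Magnitude of gradient is bad"


def run_with_resume(cmd, max_restarts=3, runner=subprocess.run):
    """Run cmd; if it crashes with the gradient error, rerun it with --continue.

    Returns the number of restarts that were needed before the run succeeded.
    `runner` is injectable so the logic can be tried without a real training run.
    """
    attempt_cmd = list(cmd)
    for restarts in range(max_restarts + 1):
        result = runner(attempt_cmd, stderr=subprocess.PIPE, text=True)
        if result.returncode == 0:
            return restarts
        if GRADIENT_ERROR not in (result.stderr or ""):
            # A different failure: surface it instead of retrying blindly.
            raise RuntimeError(f"run failed: {result.stderr}")
        # Assumption: adding --continue to the same command resumes training.
        if "--continue" not in attempt_cmd:
            attempt_cmd = attempt_cmd + ["--continue"]
    raise RuntimeError(f"still failing after {max_restarts} restarts")
```

Calling `run_with_resume` with the training command as a list of arguments (the same ones as in the example command above) would rerun it up to three times. Note that this only hides the crash; it does not explain why the gradient becomes inf in the first place.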
