@Aetf
I set up the environment and ran embedding.py on my own computer, following your documentation. The program printed between 1 and 25 log lines and then hung without exiting. The point at which it stalls is different on every run.
2018-04-01 06:01:12.024821: myglobal 1 epoch 1 step 1 loss = 21.25 (0.9 samples/sec; 1.175 sec/batch)
2018-04-01 06:01:12.354372: myglobal 2 epoch 1 step 2 loss = 17.27 (3.2 samples/sec; 0.312 sec/batch)
2018-04-01 06:01:12.787619: myglobal 3 epoch 1 step 3 loss = 10.45 (2.9 samples/sec; 0.346 sec/batch)
2018-04-01 06:01:13.477380: myglobal 4 epoch 1 step 4 loss = 17.19 (1.5 samples/sec; 0.678 sec/batch)
2018-04-01 06:01:14.020272: myglobal 5 epoch 1 step 5 loss = 17.10 (1.9 samples/sec; 0.518 sec/batch)
2018-04-01 06:01:14.258575: myglobal 6 epoch 1 step 6 loss = 10.39 (4.4 samples/sec; 0.228 sec/batch)
2018-04-01 06:01:14.698754: myglobal 7 epoch 1 step 7 loss = 26.52 (2.5 samples/sec; 0.407 sec/batch)
2018-04-01 06:01:14.965694: myglobal 8 epoch 1 step 8 loss = 15.85 (4.1 samples/sec; 0.246 sec/batch)
2018-04-01 06:01:15.259785: myglobal 9 epoch 1 step 9 loss = 17.02 (3.6 samples/sec; 0.274 sec/batch)
<------ it hangs here forever and does nothing; the hang occurs at a different step on each rerun
Ctrl+C does not work, but Ctrl+Z gets me back to the shell.
Using the "top" command, I saw that the host's CPU and memory were idle, so the process was no longer doing any work.
My system is Ubuntu 16.04 LTS, tensorflow=1.0.0, tensorflow_fold=0.0.1, python=3.5, CPU only.
Linux ubuntu 4.13.0-37-generic #42~16.04.1-Ubuntu SMP Wed Mar 7 16:03:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
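In case it helps with diagnosis, here is a minimal sketch of how the hang location could be captured using Python's standard faulthandler module. The helper names install_hang_watchdog and remove_hang_watchdog are my own and hypothetical; they are not part of embedding.py or TensorFlow Fold, and the timeout value is an arbitrary assumption. Because faulthandler writes the stacks from a C-level handler, I would expect it to produce output even when Ctrl+C has no effect, but I have not confirmed that for this particular hang.

```python
import faulthandler
import signal
import sys


def install_hang_watchdog(timeout_sec, log_file=sys.stderr):
    """Dump the stack of every Python thread each time timeout_sec elapses.

    log_file must stay open while the watchdog is installed, because
    faulthandler writes directly to its file descriptor.
    """
    # Repeat the dump every timeout_sec seconds until it is cancelled.
    faulthandler.dump_traceback_later(timeout_sec, repeat=True, file=log_file)
    # On Unix, also allow an on-demand dump with: kill -USR1 <pid>
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=log_file, all_threads=True)


def remove_hang_watchdog():
    """Undo install_hang_watchdog()."""
    faulthandler.cancel_dump_traceback_later()
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.unregister(signal.SIGUSR1)
```

The idea would be to call install_hang_watchdog(60) at the top of embedding.py before training starts. Once the program stalls, the tracebacks printed to stderr should show which Python call each thread is blocked in.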
How do I solve this problem?
Thanks very much!