Description
yunitate.sh seems to produce rttm files with segments that extend beyond (or even lie completely outside) the duration of the source wave file.
The audio I'm using is this:
vagrant ssh -c "sox --i '/vagrant/data/0513.wav'"
Input File : '/vagrant/data/0513.wav'
Channels : 1
Sample Rate : 44100
Precision : 16-bit
Duration : 00:10:04.12 = 26641575 samples = 45308.8 CDDA sectors
File Size : 53.3M
Bit Rate : 706k
Sample Encoding: 16-bit Signed Integer PCM
So that amounts to 604.12 seconds duration.
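For reference, the 604.12 seconds follows directly from the sample count and sample rate that sox reports. A minimal Python sketch of that arithmetic is below; the helper `wav_duration_seconds` is a hypothetical name of mine (not part of the tools), and it assumes an uncompressed PCM wave file that the standard `wave` module can read.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM wave file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# The same arithmetic with the numbers sox reported for 0513.wav:
# 26641575 samples at 44100 Hz.
reported_duration = 26641575 / 44100
print(f"{reported_duration:.2f} s")
```

Calling `wav_duration_seconds("/vagrant/data/0513.wav")` inside the VM should give the same value, assuming the file header is intact.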
After running vagrant ssh -c "yunitate.sh data/", I get the following rttm (only last few lines shown):
SPEAKER 0513.rttm 1 601.4 0.1 CHI
SPEAKER 0513.rttm 1 601.5 1.2 FEM
SPEAKER 0513.rttm 1 602.7 2.1 CHI
where the last segment starts inside the source wave file's duration, but goes beyond the end (602.7 + 2.1 = 604.8, which is past 604.12).
When running vagrant ssh -c "yunitate.sh data/ english" things become even stranger:
SPEAKER 0513.rttm 1 601.6 0.6 FEM
SPEAKER 0513.rttm 1 603.3 0.1 CHI
SPEAKER 0513.rttm 1 603.6 0.1 CHI
SPEAKER 0513.rttm 1 603.9 0.3 CHI
SPEAKER 0513.rttm 1 604.2 0.1 FEM
Here the last segment starts after the end of the original source.
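To make the check reproducible, here is a small Python sketch that flags segments which end or start after the end of the audio. It is my own illustration, not part of the tools: the function name `find_overrunning_segments` is hypothetical, and it assumes whitespace-separated lines with the onset in the fourth field and the duration in the fifth, as in the excerpts above.

```python
def find_overrunning_segments(rttm_lines, audio_duration):
    """Return (onset, duration, problem) for segments not contained in the audio.

    Assumes the onset is the 4th whitespace-separated field and the
    duration the 5th, as in the rttm excerpts in this report.
    """
    problems = []
    for line in rttm_lines:
        fields = line.split()
        if len(fields) < 5 or fields[0] != "SPEAKER":
            continue  # skip blank or unrelated lines
        onset, duration = float(fields[3]), float(fields[4])
        if onset >= audio_duration:
            problems.append((onset, duration, "starts after audio"))
        elif onset + duration > audio_duration:
            problems.append((onset, duration, "ends after audio"))
    return problems


# Last lines of the "english" run, checked against the 604.12 s source file.
english_tail = [
    "SPEAKER 0513.rttm 1 601.6 0.6 FEM",
    "SPEAKER 0513.rttm 1 603.3 0.1 CHI",
    "SPEAKER 0513.rttm 1 603.6 0.1 CHI",
    "SPEAKER 0513.rttm 1 603.9 0.3 CHI",
    "SPEAKER 0513.rttm 1 604.2 0.1 FEM",
]
for onset, duration, problem in find_overrunning_segments(english_tail, 604.12):
    print(onset, duration, problem)
```

With these numbers the segment at 603.9 is also flagged, since 603.9 + 0.3 already exceeds 604.12, in addition to the segment that starts at 604.2.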
This becomes problematic when using the latter file for vagrant ssh -c "~/launcher/WCE_from_SAD_outputs.sh /vagrant/data/ yunitator_english". Here, the tool finishes without an error message, but doesn't produce the word count output. The wav_tmp folder is still present and contains this empty (corrupt?) wav file, for which sox reports no duration:
Input File : '/vagrant/data/wav_tmp/yunitator_english_0513_00604200-00000100.wav'
Channels : 1
Sample Rate : 44100
Precision : 16-bit
Sample Encoding: 16-bit Signed Integer PCM
And finally, if I use this file in the analyze.sh pipeline, I get the following message:
(MSG) [2] in SMILExtract : openSMILE starting!
(MSG) [2] in SMILExtract : config file is: MED_2s_100ms_htk.conf
(MSG) [2] in cComponentManager : successfully registered 96 component types.
(MSG) [2] in cComponentManager : successfully finished createInstances
(19 component instances were finalised, 1 data memories were finalised)
(MSG) [2] in cComponentManager : starting single thread processing loop
(MSG) [2] in cComponentManager : Processing finished! System ran for 60436 ticks.
sox WARN trim: End position is after expected end of audio.
sox WARN trim: Last 1 position(s) not reached.
/home/vagrant/utils/analyze.sh: line 40: /vagrant/data//detailed_outputs/WCE_yunitator_english_0513.rttm: No such file or directory
paste: /vagrant/data//wce.temp: No such file or directory
vcm_0513.rttm and yunitator_english_0513.rttm are present in detailed_outputs, but the corresponding WCE output (WCE_yunitator_english_0513.rttm, per the error above) is missing.
One hackish solution might be to append a second or two of silence to the end of the source wave file, I suppose. I haven't tried that yet.
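A rough sketch of what that padding could look like in Python, using only the standard `wave` module. This is untested against yunitate.sh, so whether it actually avoids the problem is an open question. The function name `append_silence` and the 2-second default are my own choices, and it assumes 16-bit signed PCM input (as sox reports for 0513.wav), where silence is all-zero bytes.

```python
import wave


def append_silence(src_path, dst_path, seconds=2.0):
    """Copy src_path to dst_path with `seconds` of silence appended.

    Assumes signed PCM (e.g. 16-bit), where silence is all-zero bytes;
    8-bit wave files are unsigned and would need 0x80 instead.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    silent_frames = int(seconds * params.framerate)
    silence = b"\x00" * (silent_frames * params.nchannels * params.sampwidth)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The header's frame count is corrected when the file is closed.
        dst.writeframes(frames + silence)
```

For example, `append_silence("data/0513.wav", "data/0513_padded.wav")` would write a padded copy next to the original; the output file name here is only an illustration.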