Hi,
I am using two NAND modules with the Cosmos+ board to implement an OCSSD, and I have successfully created a pblk device on it.
However, when I run the fio benchmark, it fails on the second run.
My environment:
OS: Ubuntu 14.04.5 LTS
Kernel: 4.16.0 / 4.17.7
The first run completes without any problem:
root@osd-62:/home/osd# fio -filename=/dev/nvme_pblk -direct=1 -iodepth 128 -thread -rw=randwrite -ioengine=libaio -bs=4k -size=100% -numjobs=1 -group_reporting -name=mytest
mytest: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
fio-3.3
Starting 1 thread
Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=694MiB/s][r=0,w=178k IOPS][eta 00m:00s]
mytest: (groupid=0, jobs=1): err= 0: pid=1861: Thu Feb 28 16:28:48 2019
write: IOPS=181k, BW=706MiB/s (741MB/s)(909GiB/1317868msec)
slat (nsec): min=1615, max=5591.7k, avg=4283.85, stdev=35077.17
clat (nsec): min=1763, max=9203.9k, avg=702784.25, stdev=416338.61
lat (usec): min=8, max=9206, avg=707.19, stdev=417.64
clat percentiles (usec):
| 1.00th=[ 449], 5.00th=[ 453], 10.00th=[ 453], 20.00th=[ 461],
| 30.00th=[ 502], 40.00th=[ 578], 50.00th=[ 652], 60.00th=[ 701],
| 70.00th=[ 750], 80.00th=[ 824], 90.00th=[ 955], 95.00th=[ 1270],
| 99.00th=[ 1631], 99.50th=[ 1696], 99.90th=[ 8586], 99.95th=[ 8717],
| 99.99th=[ 8848]
bw ( KiB/s): min=624512, max=781056, per=100.00%, avg=723367.19, stdev=19851.74, samples=2635
iops : min=156128, max=195264, avg=180841.82, stdev=4962.94, samples=2635
lat (usec) : 2=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
lat (usec) : 250=0.01%, 500=29.93%, 750=39.87%, 1000=21.24%
lat (msec) : 2=8.77%, 4=0.01%, 10=0.19%
cpu : usr=23.34%, sys=76.67%, ctx=630, majf=0, minf=1
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwt: total=0,238310400,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128
Run status group 0 (all jobs):
WRITE: bw=706MiB/s (741MB/s), 706MiB/s-706MiB/s (741MB/s-741MB/s), io=909GiB (976GB), run=1317868-1317868msec
Disk stats (read/write):
nvme_pblk: ios=0/238305484, merge=0/0, ticks=0/599772, in_queue=0, util=0.00%
The second run, however, fails. Every time, fio stalls at 8.1% and write throughput drops to 0:
root@osd-62:/home/osd# fio -filename=/dev/nvme_pblk -direct=1 -iodepth 128 -thread -rw=randwrite -ioengine=libaio -bs=4k -size=100% -numjobs=1 -group_reporting -name=mytest
mytest: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
fio-3.3
Starting 1 thread
Jobs: 1 (f=1): [w(1)][8.1%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 01h:48m:46s]
System log:
Feb 24 10:56:25 osd-62 kernel: [ 2302.751909] pblk: taking lun semaphore timed out: err 62
Feb 24 10:56:25 osd-62 kernel: [ 2302.863906] nvme nvme0: I/O 96 QID 1 timeout, aborting
Feb 24 10:56:25 osd-62 kernel: [ 2302.863912] nvme nvme0: I/O 97 QID 1 timeout, aborting
Feb 24 10:56:25 osd-62 kernel: [ 2302.863915] nvme nvme0: I/O 98 QID 1 timeout, aborting
Feb 24 10:56:25 osd-62 kernel: [ 2302.863917] nvme nvme0: I/O 99 QID 1 timeout, aborting
Feb 24 10:56:55 osd-62 kernel: [ 2332.767908] pblk: taking lun semaphore timed out: err 62
Feb 24 10:56:55 osd-62 kernel: [ 2332.911904] nvme nvme0: I/O 96 QID 1 timeout, reset controller
Feb 24 10:57:25 osd-62 kernel: [ 2362.783907] pblk: taking lun semaphore timed out: err 62
Feb 24 10:57:26 osd-62 kernel: [ 2363.867903] nvme nvme0: I/O 9 QID 0 timeout, reset controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.579901] nvme nvme0: Device not ready; aborting reset
Feb 24 10:58:12 osd-62 kernel: [ 2409.595996] nvme nvme0: I/O 97 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.619947] nvme nvme0: I/O 98 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.643905] nvme nvme0: Abort status: 0x7
Feb 24 10:58:12 osd-62 kernel: [ 2409.643907] nvme nvme0: I/O 10 QID 0 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.667901] nvme nvme0: Abort status: 0x7
Feb 24 10:58:12 osd-62 kernel: [ 2409.667904] nvme nvme0: I/O 11 QID 0 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.691903] nvme nvme0: I/O 99 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.715903] nvme nvme0: I/O 100 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.739901] nvme nvme0: Abort status: 0x7
Feb 24 10:58:12 osd-62 kernel: [ 2409.739904] nvme nvme0: I/O 12 QID 0 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.763901] nvme nvme0: Abort status: 0x7
Feb 24 10:58:12 osd-62 kernel: [ 2409.787902] nvme nvme0: I/O 101 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.811902] nvme nvme0: I/O 102 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.835903] nvme nvme0: I/O 103 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.859901] nvme nvme0: I/O 104 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.883902] nvme nvme0: I/O 105 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.907902] nvme nvme0: I/O 106 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.931902] nvme nvme0: I/O 107 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.955901] nvme nvme0: I/O 108 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2409.979902] nvme nvme0: I/O 109 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2410.003901] nvme nvme0: I/O 110 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2410.027903] nvme nvme0: I/O 111 QID 1 timeout, disable controller
Feb 24 10:58:12 osd-62 kernel: [ 2410.051902] nvme nvme0: I/O 112 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.075901] nvme nvme0: I/O 113 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.099903] nvme nvme0: I/O 114 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.123902] nvme nvme0: I/O 115 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.147901] nvme nvme0: I/O 116 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.171901] nvme nvme0: I/O 117 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.195901] nvme nvme0: I/O 118 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.219902] nvme nvme0: I/O 119 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.243903] nvme nvme0: I/O 120 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.267901] nvme nvme0: I/O 121 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.291901] nvme nvme0: I/O 122 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.315901] nvme nvme0: I/O 123 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.339902] nvme nvme0: I/O 124 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.363902] nvme nvme0: I/O 125 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.387901] nvme nvme0: I/O 126 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.411901] nvme nvme0: I/O 127 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.435901] nvme nvme0: I/O 128 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.459903] nvme nvme0: I/O 129 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.483901] nvme nvme0: I/O 130 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.499901] nvme nvme0: I/O 131 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.515901] nvme nvme0: I/O 132 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.531901] nvme nvme0: I/O 133 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.547901] nvme nvme0: I/O 134 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.563901] nvme nvme0: I/O 135 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.579901] nvme nvme0: I/O 136 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.595902] nvme nvme0: I/O 137 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.611901] nvme nvme0: I/O 138 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.627901] nvme nvme0: I/O 139 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.643901] nvme nvme0: I/O 140 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.659907] nvme nvme0: I/O 141 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.675901] nvme nvme0: I/O 142 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.699902] nvme nvme0: I/O 143 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.715901] nvme nvme0: I/O 144 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.731901] nvme nvme0: I/O 145 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.747901] nvme nvme0: I/O 146 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.763901] nvme nvme0: I/O 147 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.779901] nvme nvme0: I/O 148 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.803901] nvme nvme0: I/O 149 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.819901] nvme nvme0: I/O 150 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.835902] nvme nvme0: I/O 151 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.851901] nvme nvme0: I/O 152 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.867901] nvme nvme0: I/O 153 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.883901] nvme nvme0: I/O 154 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.899900] nvme nvme0: I/O 155 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.915900] nvme nvme0: I/O 156 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.931901] nvme nvme0: I/O 157 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.947901] nvme nvme0: I/O 158 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.963900] nvme nvme0: I/O 159 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.979900] nvme nvme0: I/O 160 QID 1 timeout, disable controller
Feb 24 10:58:13 osd-62 kernel: [ 2410.995900] nvme nvme0: I/O 161 QID 1 timeout, disable controller
Feb 24 10:58:29 osd-62 kernel: [ 2426.119905] nvme nvme0: Device not ready; aborting reset
Feb 24 10:58:29 osd-62 kernel: [ 2426.120320] nvme nvme0: Removing after probe failure status: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.135908] nvme0n1: detected capacity change from 1099511627776 to 0
Feb 24 10:58:29 osd-62 kernel: [ 2426.135918] pblk: data I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.135968] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.136472] pblk: data I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.136989] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137515] pblk: data I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137548] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137550] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137551] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137558] pblk: I/O submission failed: -19
......
Feb 24 10:58:29 osd-62 kernel: [ 2426.137620] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137622] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137623] pblk: I/O submission failed: -19
Feb 24 10:58:29 osd-62 kernel: [ 2426.137626] pblk: I/O submission failed: -19
Feb 24 10:58:30 osd-62 kernel: [ 2427.167909] pblk: data I/O submission failed: -19
Feb 24 10:58:31 osd-62 kernel: [ 2428.191904] pblk: data I/O submission failed: -19
Feb 24 10:58:34 osd-62 kernel: [ 2431.263903] pblk: data I/O submission failed: -19
Feb 24 10:58:35 osd-62 kernel: [ 2432.287904] pblk: data I/O submission failed: -19
......
Feb 24 10:59:29 osd-62 kernel: [ 2486.559901] pblk: data I/O submission failed: -19
Feb 24 10:59:30 osd-62 kernel: [ 2487.583901] pblk: data I/O submission failed: -19
Feb 24 10:59:31 osd-62 kernel: [ 2488.607901] pblk: data I/O submission failed: -19
Feb 24 10:59:32 osd-62 kernel: [ 2489.631901] pblk: data I/O submission failed: -19
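As a reading aid for the log above, here is a minimal sketch that decodes the two numeric codes it contains. It assumes they are standard Linux errno values (62 in the pblk semaphore message, and -19 in the nvme and pblk messages, where the sign is simply the kernel's negated-errno convention). The table is hardcoded because errno numbering differs between operating systems; the `decode` helper is a hypothetical name used only for illustration.

```python
# Linux errno values, hardcoded so the lookup does not depend on the host OS
# running this script. Only the two codes that appear in the log are listed.
LINUX_ERRNO = {
    19: ("ENODEV", "No such device"),
    62: ("ETIME", "Timer expired"),
}

def decode(code: int) -> str:
    """Return a readable name for a (possibly negated) Linux errno value."""
    name, text = LINUX_ERRNO[abs(code)]
    return f"{name} ({text})"

for raw in (62, -19):
    print(raw, "->", decode(raw))
```

Under that assumption, the sequence reads as: pblk times out waiting for a LUN semaphore, the NVMe driver times out outstanding I/O and fails to reset the controller, the device is removed, and every later pblk submission fails because the device no longer exists.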