Inference time slow #1

@siddagra

Description

I just added a timer using the `time` library, and the inference speed is much worse than the one reported in the paper.

timer code:

t1 = time()
for i in tqdm.tqdm(range(1, num_samples + 1)):
    image_key = b'image-%09d' % i
    label_key = b'label-%09d' % i

    # read the encoded image bytes for this sample out of the LMDB transaction
    imgbuf = txn.get(image_key)
    buf = six.BytesIO()
    buf.write(imgbuf)
    buf.seek(0)

    img_pil = Image.open(buf).convert('RGB')
    img = np.array(img_pil)
    label = txn.get(label_key).decode()
    labels.append(label)
    img, la, width, img_ = data_preprocess(img, label, char2id, args)

    # one sess.run() call per image (batch size 1)
    pred_value, attention_weights_value = sess.run(
        [enc_pred, enc_attention_weights],
        feed_dict={input_images: [img], input_labels: [la]})

    pred_value_str = idx2label(pred_value, id2char, char2id)[0]
    predicts.append(pred_value_str)

    if args.vis_dir is not None and args.vis_dir != "":
        os.makedirs(args.vis_dir, exist_ok=True)
        os.makedirs(os.path.join(args.vis_dir, "errors"), exist_ok=True)
        _ = heatmap_visualize(img_, attention_weights_value, pred_value_str,
                              args.vis_dir, "{}.jpg".format(i))
        if pred_value_str.lower() != label.lower():
            _ = heatmap_visualize(img_, attention_weights_value, pred_value_str,
                                  os.path.join(args.vis_dir, "errors"),
                                  "{}.jpg".format(i))

metrics_value = calc_metrics(predicts, labels, args.metrics_type)
print("Done, {}: {}".format(args.metrics_type, metrics_value))
t2 = time()
print("time taken:", t2 - t1,
      "inference time (ms):", (t2 - t1) / num_samples * 1000,
      "FPS:", 1 / ((t2 - t1) / num_samples))
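Note that the loop timed above also includes the LMDB read, PIL decoding, preprocessing, and heatmap visualization, so the reported ms/sample overstates pure model inference. A minimal sketch of how the time could be broken down per stage (`StageTimer` is a hypothetical helper, not part of the repo):

```python
from time import perf_counter
from collections import defaultdict

class StageTimer:
    """Accumulate wall-clock time per named pipeline stage, so the
    sess.run() cost can be separated from decoding/preprocessing/IO."""

    def __init__(self):
        self.totals = defaultdict(float)

    def measure(self, stage, fn, *args, **kwargs):
        # time one call to fn and add the elapsed time to this stage's total
        t0 = perf_counter()
        result = fn(*args, **kwargs)
        self.totals[stage] += perf_counter() - t0
        return result

    def report(self, num_samples):
        for stage, total in sorted(self.totals.items()):
            print("{}: {:.2f} ms/sample".format(stage, total / num_samples * 1000))

# Usage inside the evaluation loop would look roughly like:
#   img_pil = timer.measure("decode", lambda: Image.open(buf).convert('RGB'))
#   pred_value, attention_weights_value = timer.measure(
#       "sess_run", sess.run, [enc_pred, enc_attention_weights],
#       feed_dict={input_images: [img], input_labels: [la]})
```

If most of the per-sample time turns out to be in the `sess_run` stage, the model itself is the bottleneck; otherwise the data pipeline is.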

logs:

2022-07-14 01:25:41.839559: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8401
2022-07-14 01:25:41.840187: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
100%|█████████████████████████████████████████| 79/79 [00:07<00:00, 10.18it/s]
Done, accuracy: 0.013
time taken: 7.767025709152222 inference time (ms): 98.31678112850913 FPS: 10.17120361876888

Is this supposed to happen? Is there any way to optimise the inference time? Let me know if there are any ways, mentioned in the paper or otherwise, to improve the inference time of this model, because as it stands the accuracy/inference-time tradeoff seems very poor.
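One thing that might help (a hedged suggestion, not something from the paper): the loop pays a fixed per-call overhead for every `sess.run()`, so feeding several images per call can raise throughput, assuming the graph accepts a variable batch dimension and the images are padded to a common width. A minimal chunking helper to sketch the idea:

```python
def batches(items, batch_size):
    """Yield consecutive fixed-size chunks of `items` (the last chunk
    may be shorter), so many images can share one sess.run() call."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Sketch of how the evaluation loop could use it (names hypothetical;
# assumes all images in a chunk have already been padded to equal shape):
# for chunk in batches(preprocessed_images, 32):
#     preds = sess.run(enc_pred, feed_dict={input_images: chunk})
```

Whether this helps in practice depends on how much of the 98 ms/sample is per-call overhead versus actual compute.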
