Drop in mAP after TensorRT optimization #315
Here are all the fixes I made so far:
That sounds good.
I have read through your commit history. I think my current code does not have the issues you've fixed in your own code... I did reference the original AlexeyAB/darknet code to develop my implementation. For example, "scale_x_y", which is used in the yolov4/yolov4-tiny models, affects how the center x/y coordinates of bboxes are calculated, and I implemented that calculation in the "yolo_layer" plugin (tensorrt_demos/plugins/yolo_layer.cu, lines 238 to 239, at commit 793d7ae).
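For reference, a minimal NumPy sketch of that scale_x_y center decoding as described in AlexeyAB/darknet (function and variable names here are illustrative, not the plugin's actual identifiers):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_center(tx, ty, col, row, grid_w, grid_h, scale_x_y=1.05):
    """Decode bbox center coordinates, normalized to [0, 1].

    tx, ty:     raw network outputs for the grid cell at (col, row)
    scale_x_y:  darknet's scale_x_y parameter (1.0 for yolov3;
                e.g. 1.05 / 1.1 / 1.2 for the yolov4 heads)
    """
    # scale_x_y stretches the sigmoid output around 0.5 so the
    # predicted center can actually reach the cell borders.
    bx = (sigmoid(tx) * scale_x_y - 0.5 * (scale_x_y - 1.0) + col) / grid_w
    by = (sigmoid(ty) * scale_x_y - 0.5 * (scale_x_y - 1.0) + row) / grid_h
    return bx, by
```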
I will probably have time this weekend to cross-check the implementations. I will get back to you when I have more info.
@philipp-schmidt Looking forward to your updates. Meanwhile, I'm inclined to think the problem more likely lies in the darknet -> onnx -> TensorRT conversion. I will also review the code when I have time.
Hi, a major source of wrong results and bad accuracy has been fixed for me in Triton Inference Server. It was a server-side race condition... I was hunting ghosts for many weeks: triton-inference-server/server#2339. Now I can focus on mAP; I'll keep you posted.
NVIDIA has this Polygraphy tool, which can be used to compare layer-wise outputs between an ONNX model and the corresponding TensorRT engine. I think that would be an effective way to debug this mAP-drop problem. Here is an example of Polygraphy debugging output: NVIDIA/TensorRT#1087 (comment). I'm not sure when I'll have time to look into this, though.
Unfortunately, I haven't yet been able to make the time to fully tackle this either.
NVIDIA's Polygraphy tool turns out to be very easy to use. I just followed the installation instructions and used the following command to debug the models.
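A sketch of the kind of invocation (the ONNX file name is illustrative; `mark all` asks Polygraphy to compare every layer's output, not just the final ones):

```
polygraphy run yolov3-tiny-416.onnx --trt --onnxrt --fp16 \
    --trt-outputs mark all --onnx-outputs mark all
```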
I summarize the results below. All comparisons are done between TensorRT FP16 and ONNX Runtime.
I am guessing this is where the loss of accuracy occurs? Will there be a fix?
Interesting results, thanks for checking it out, jkjung. I'm curious whether there are any guarantees from TensorRT regarding precision. And taking into account that TensorRT selects from a range of different implementations for each layer, the next question is: will this accuracy drop be reproducible and consistent across different hardware?
I re-ran Polygraphy by specifying the correct input data range for the yolo models ("--float-min 0.0 --float-max 1.0"), e.g.
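For example (the model file name is illustrative; the input-range flags are the ones quoted above):

```
polygraphy run yolov3-tiny-416.onnx --trt --onnxrt --fp16 \
    --float-min 0.0 --float-max 1.0
```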
Here are the results (FP16):
The TensorRT "yolov3-tiny" FP16 engine is the only one which generates an output with >5% mean relative error vs. onnxruntime (all the others are <1%). I think this indeed explains why the TensorRT "yolov3-tiny" engine evaluates to a much worse mAP than its DarkNet counterpart, compared to the other models ("yolov3-608", "yolov4-tiny-416" and "yolov4-608")...
Hello, sorry for not adding anything to the discussion, but I wanted to check: I'm currently trying to deploy this repository on a Jetson Nano. Does the yolov4-tiny model also show the mAP drop that has been discussed mainly for yolov3? If this is unclear, I will conduct my own tests on a custom dataset and report the results back to you.
Based on my mAP evaluation results, "yolov3-tiny" suffers from this problem quite a bit. The other models ("yolov3", "yolov4-tiny" and "yolov4") are probably OK. I would focus on solving the problem for "yolov3-tiny" when I have time.
@jkjung-avt Same problem for the yolov4-mish and yolov4-csp-swish models as well; I'm getting lots of false positives and the results are not the same as darknet's. May I know the reasons behind this, and how can we solve the false-positive problem?
@akashAD98 This is a known issue. I've done my best to make sure the code is correct for both TensorRT engine building and inference. But TensorRT engine optimization does result in an mAP drop for various YOLO models. I have also tried to analyze this problem with Polygraphy, as shown above, but failed to find the root cause or a solution. I don't have a good answer now. That's why I've kept this issue open...
@jkjung-avt Thanks for your kind reply. We all appreciate your great work. I hope you will find a solution in the future.
@jkjung-avt Can we do inference and check the FPS and false predictions of the ONNX model? What do you think about the accuracy (false predictions), is it the same as TensorRT's?
I have done that for MODNet, but not for the YOLO models. Some of the code could be reused, though: https://github.com/jkjung-avt/tensorrt_demos/blob/master/modnet/test_onnx.py In order to check mAP and false detections with the ONNX YOLO models, you'll also have to implement the "yolo" layers in the post-processing code (this part is handled by the "yolo_layer" plugin in the TensorRT case). I don't think I'll have time to do that in the near future...
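For anyone attempting this, a rough NumPy sketch of what such a "yolo" decode could look like for one output head; the tensor layout, anchor format, and normalization here are assumptions that must be matched to the actual ONNX model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_yolo_head(output, anchors, input_size, scale_x_y=1.0):
    """Decode one raw YOLO head of shape (num_anchors*(5+C), H, W)
    into (N, 6) detections: [x, y, w, h, confidence, class_id],
    with x/y/w/h normalized to [0, 1]."""
    num_anchors = len(anchors)
    c, h, w = output.shape
    num_classes = c // num_anchors - 5
    out = output.reshape(num_anchors, 5 + num_classes, h, w)

    cols, rows = np.meshgrid(np.arange(w), np.arange(h))
    dets = []
    for a, (aw, ah) in enumerate(anchors):
        # center x/y with scale_x_y, as in the yolo_layer plugin
        bx = (sigmoid(out[a, 0]) * scale_x_y - 0.5 * (scale_x_y - 1) + cols) / w
        by = (sigmoid(out[a, 1]) * scale_x_y - 0.5 * (scale_x_y - 1) + rows) / h
        # width/height from exp() scaled by the anchor (anchors in input pixels)
        bw = np.exp(out[a, 2]) * aw / input_size
        bh = np.exp(out[a, 3]) * ah / input_size
        obj = sigmoid(out[a, 4])
        cls = sigmoid(out[a, 5:])        # per-class sigmoid, as in darknet
        cls_id = cls.argmax(axis=0)
        conf = obj * cls.max(axis=0)
        dets.append(np.stack([bx, by, bw, bh, conf,
                              cls_id.astype(float)], axis=-1).reshape(-1, 6))
    return np.concatenate(dets, axis=0)
```

Confidence thresholding and NMS would still have to be applied on top, as in the TensorRT demo code.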
Hi @jkjung-avt, do you have any idea how I should solve this issue? onnx/tutorials#253 (comment) This is the script: inference_onnx_yolov4-mish.ipynb.txt
@jkjung-avt Please have a look.
@akashAD98 I already commented: onnx/tutorials#253 (comment) You need to modify the post-processing code yourself.
@jkjung-avt Is there any model which produces almost the same results as darknet? yolov4-csp and yolov4-mish have false-prediction issues, so I'm looking for a good TensorRT model. Is yolov4 the best?
Please refer to the "mAP and FPS" table in Demo #5: YOLOv4.
@jkjung-avt One observation from my experiments: I repeated the experiments with a few more category classes, and found that the model gives fewer false positives when there are more classes and more false positives when there are fewer classes. This is just my experimental observation; if you think it can help us solve this issue, please let us know. Thanks.
@akashAD98 Thanks for sharing the info. I tried to think of possible causes of such results but could not come up with any. I will keep this in mind and share my experience/thoughts when I have new findings.
Hello @jkjung-avt, For my use case, I am trying to detect only one type of object (a single class) with yolov3. After comparing with the code of yolov3 (https://github.com/experiencor/keras-yolo3), I observe that there is a major difference in how the output class probabilities are processed. In the original code (https://github.com/experiencor/keras-yolo3/blob/master/utils/utils.py at line 179), they apply a softmax to all class probabilities. In your code (https://github.com/jkjung-avt/tensorrt_demos/blob/master/plugins/yolo_layer.cu at line 167), you post-process the class probabilities with a sigmoid. In my case, with only one class, a softmax over a single logit always yields 1.0, while a sigmoid does not, so the two post-processings differ the most.
I think this can explain why the mAP is better with more classes (because the softmax becomes more similar to the sigmoid). Thank you for your contribution with tensorrt_demos,
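To illustrate the point numerically (a toy example, not code from either repo), with one class the softmax saturates at 1.0 while the sigmoid still tracks the logit:

```python
import numpy as np

logit = np.array([-2.0])  # raw class score for the single class

sigmoid = 1.0 / (1.0 + np.exp(-logit))         # ~0.119: varies with the logit
softmax = np.exp(logit) / np.exp(logit).sum()  # 1.0: only one class to normalize over

print(sigmoid, softmax)
```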
@ThomasGoud Thanks for sharing your thoughts. But according to the original DarkNet implementation, the objectness and class scores are calculated by applying the LOGISTIC (i.e. sigmoid) activation to the outputs of the previous convolutional layers. You could refer to the source code, as pointed out below.
@jkjung-avt Hi, could we work together on the problem of the reduced accuracy? I believe I have similar issues in my implementation, and I do not use any ONNX conversion whatsoever. I would like to get this fixed and could use additional examples of where it goes wrong to determine the cause. We could start by working on the post-processing method. I started with existing code for the yolo layer plugin, similar to yours, and have already had to fix a few errors. Please let me know if my code increases your precision:
https://github.com/isarsoft/yolov4-triton-tensorrt/blob/master/clients/python/processing.py