OpenVINO backend for INT8 models #23987

dkurt · 2023-07-13T11:05:12Z

Pull Request Readiness Checklist

TODO:

- [x] DetectionOutput layer (DetectionOutput layer on OpenVINO without limitations #24069)
- [x] Fewer FP32 fallbacks (e.g. Sigmoid, eltwise sum)
- [x] Accuracy, performance tests (TFLite models on different backends (tests and improvements) #24039)
- [x] Single layer tests (convolution)
- [x] ~~Fixes for OpenVINO 2022.1 (https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100334)~~

Performance results for object detection model `coco_efficientdet_lite0_v1_1.0_quant_2021_09_06.tflite`:

| backend | performance (median time) |
|---|---|
| OpenCV | 77.42 ms |
| OpenVINO 2023.0 | 10.90 ms |

CPU: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
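
For reference, the two medians in the table imply a speedup of roughly 7x (77.42 / 10.90). The sketch below shows one way a median time could be derived from repeated timings and how the ratio follows from the table values. The `median` and `speedup` helpers are hypothetical and are not taken from the PR's benchmark code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of a set of timing samples (in ms).
// Takes the vector by value so the caller's data is not reordered.
double median(std::vector<double> samples)
{
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    // Odd count: middle element. Even count: mean of the two middle elements.
    return n % 2 ? samples[n / 2]
                 : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}

// Hypothetical helper: how many times faster the candidate is than the baseline.
double speedup(double baselineMs, double candidateMs)
{
    return baselineMs / candidateMs;
}
```

Calling `speedup(77.42, 10.90)` with the reported medians gives a value of about 7.1.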

Serialized model per-layer stats (note that Convolution should use *_I8 primitives if they are quantized correctly): https://gist.github.com/dkurt/7772bbf1907035441bb5454f19f0feef

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

force_builders=Custom

Xbuild_image:Custom=ubuntu-openvino-2021.4.2:20.04
build_image:Custom=ubuntu-openvino-2022.1.0:20.04

test_modules:Custom=dnn,python2,python3,java,gapi,video

buildworker:Custom=linux-1
# disabled due to high memory usage: test_opencl:Custom=ON
test_bigdata:Custom=1
test_filter:Custom=*

DetectionOutput layer on OpenVINO without limitations #24069

### Pull Request Readiness Checklist

required for #23987

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

asmorkalov · 2023-08-02T12:29:11Z

@dkurt related patch for detection layer has been merged.

modules/dnn/src/net_impl_backend.cpp

asmorkalov · 2023-09-14T14:29:13Z

@dkurt @fengyuentau Friendly reminder.

dkurt · 2023-09-19T07:32:42Z

@asmorkalov, can you please re-run the failed opencv/ci-gha-workflow#109? The Buildbot machines have only OpenVINO 2022, but this PR enables INT8 with OpenVINO 2023 and higher.

asmorkalov · 2023-09-20T12:28:19Z

@dkurt I tested the PR manually with the mentioned Docker image. List of failed tests:

[  PASSED  ] 4193 tests.
[  FAILED  ] 36 tests, listed below:
[  FAILED  ] DNNTestNetwork.FastNeuralStyle_eccv16/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Convolution2D/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Convolution3D/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Flatten/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Padding/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.AvePooling/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.MaxPooling/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Reduce/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Softmax_slim_TF/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Concat/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Scale/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.InnerProduct/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Reshape/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Eltwise/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.GoogLeNet/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.DenseNet121/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.SqueezeNet_v1_1/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.MobileNet_v1_SSD/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.MobileNet_v1_SSD_PPN/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.Inception_v2_SSD/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.opencv_face_detector/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.EfficientDet/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_resnet50/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_inceptionv2/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_vgg16/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_zf/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.RFCN/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.TinyYoloVoc/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.YOLOv3/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.YOLOv4/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.YOLOv4_tiny/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.Quantized_Convolution/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.Quantized_Resize/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.Quantized_Concat/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.QLinearSoftmax/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_TFLite.EfficientDet_int8/0, where GetParam() = NGRAPH/CPU

Some logs:

[ RUN      ] Test_Int8_layers.Convolution3D/1, where GetParam() = NGRAPH/CPU
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0311098 vs 0.00734
conv3d  |ref| = 2.9877567291259766
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.469178 vs 0.02434
conv3d  |ref| = 2.9877567291259766
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0131492 vs 0.00353
conv3d  |ref| = 1.308632493019104
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.151915 vs 0.00941
conv3d  |ref| = 1.308632493019104
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.132021 vs 0.00129
conv3d_bias  |ref| = 0.55038714408874512
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.307804 vs 0.00249
conv3d_bias  |ref| = 0.55038714408874512
[  FAILED  ] Test_Int8_layers.Convolution3D/1, where GetParam() = NGRAPH/CPU (50 ms)

[ RUN      ] Test_Int8_layers.Padding/1, where GetParam() = NGRAPH/CPU
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0141513 vs 0.0026
padding_valid  |ref| = 0.98040145635604858
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.106906 vs 0.0064
padding_valid  |ref| = 0.98040145635604858
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0167119 vs 0.0081
padding_same  |ref| = 2.7980570793151855
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.455697 vs 0.032
padding_same  |ref| = 2.7980570793151855
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0744841 vs 0.0078
spatial_padding  |ref| = 2.5320920944213867
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.54478 vs 0.028
spatial_padding  |ref| = 2.5320920944213867
[  FAILED  ] Test_Int8_layers.Padding/1, where GetParam() = NGRAPH/CPU (105 ms)

[ RUN      ] Test_Int8_layers.AvePooling/1, where GetParam() = NGRAPH/CPU
unknown file: Failure
C++ exception with description "OpenCV(4.8.0-dev) /opencv/modules/dnn/src/ie_ngraph.cpp:771: error: (-2:Unspecified error) in function 'initPlugin'
> Failed to initialize Inference Engine backend (device = CPU): Check 'false' failed at src/inference/src/core.cpp:114:
> Supported primitive descriptors list is empty for node: AvgPool_94231
> " thrown in the test body.
[  FAILED  ] Test_Int8_layers.AvePooling/1, where GetParam() = NGRAPH/CPU (30 ms)
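
The `normL1` / `normInf` failures in these logs come from comparing a backend output against a reference under two tolerances (`l1` and `lInf`). A minimal sketch of such a check follows, assuming that `normL1` is the mean absolute difference and `normInf` is the maximum absolute difference. The `NormCheck` struct and `checkNorms` function are hypothetical names, and the sketch works on plain vectors rather than `cv::Mat`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type for the comparison.
struct NormCheck
{
    double normL1;   // mean absolute difference (assumed definition)
    double normInf;  // maximum absolute difference (assumed definition)
    bool passed;     // true if both norms are within their tolerances
};

// Compares `out` against `ref` using the tolerances l1 and lInf.
NormCheck checkNorms(const std::vector<float>& ref, const std::vector<float>& out,
                     double l1, double lInf)
{
    assert(ref.size() == out.size() && !ref.empty());
    double sum = 0.0, maxDiff = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i)
    {
        const double d = std::abs(static_cast<double>(ref[i]) -
                                  static_cast<double>(out[i]));
        sum += d;
        maxDiff = std::max(maxDiff, d);
    }
    NormCheck r;
    r.normL1 = sum / static_cast<double>(ref.size());
    r.normInf = maxDiff;
    r.passed = r.normL1 <= l1 && r.normInf <= lInf;
    return r;
}
```

Under this reading, the first Convolution3D entry fails because the measured `normL1` of 0.0311098 exceeds the tolerance of 0.00734.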

asmorkalov · 2023-09-20T12:28:54Z

The test was executed on an old PC without AVX2. I'll check on another host and publish the logs.

asmorkalov · 2023-09-27T13:43:02Z

@dkurt The pipeline with OpenVINO 2023 has been merged. Please rebase your PR on top of the current 4.x to get a new CI status.

asmorkalov · 2023-09-28T10:28:38Z

modules/dnn/src/int8layers/pooling_layer.cpp

@@ -124,6 +125,10 @@ class PoolingLayerInt8Impl CV_FINAL : public PoolingLayerInt8
                return type == MAX || type == AVE;
            return false;
        }
+        else if (backendId == DNN_BACKEND_INFERENCE_ENGINE_NGRAPH)
+        {
+            return true;


OpenVINO is an optional dependency. Some availability check is needed here.

The checks are inconsistent across layers: some layers check HAVE_INF_ENGINE and some do not.
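
One way to keep the check consistent is to guard the OpenVINO branch with the build-time macro, so that a build without OpenVINO never reports the backend as supported. The sketch below is a simplified, standalone illustration and not the code merged in this PR. The `Backend` enum and the free function `supportBackend` are stand-ins for the real OpenCV types, and only the `HAVE_INF_ENGINE` macro name comes from the discussion above.

```cpp
// Stand-in for OpenCV's backend identifiers (illustrative only).
enum Backend
{
    BACKEND_OPENCV,
    BACKEND_INFERENCE_ENGINE_NGRAPH
};

// Returns true if the (hypothetical) layer can run on the given backend.
bool supportBackend(Backend backendId)
{
    if (backendId == BACKEND_OPENCV)
        return true;
#ifdef HAVE_INF_ENGINE
    // This branch is compiled in only when the build defines HAVE_INF_ENGINE,
    // i.e. when OpenVINO support is available.
    if (backendId == BACKEND_INFERENCE_ENGINE_NGRAPH)
        return true;
#endif
    return false;
}
```

With this shape, a build that does not define `HAVE_INF_ENGINE` falls through to `return false` for the OpenVINO backend.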

modules/dnn/test/test_int8_layers.cpp

modules/dnn/test/test_tflite_importer.cpp

asmorkalov · 2023-09-28T10:46:20Z

The PR looks great! Thanks a lot for the contribution! Could you add INT8-related information to the wiki or documentation? It should help promote your work among OpenCV users.
Candidates:

fengyuentau

👍

asmorkalov

👍

dkurt · 2023-09-28T14:03:01Z

@asmorkalov, I wanted to refer to a quantization tutorial/wiki, but it turned out that there are no such articles in the OpenCV docs.

asmorkalov · 2023-09-28T14:23:28Z

You are welcome to create yet another chapter!


dkurt changed the title ~~Openvino int8 backend~~ OpenVINO backend for INT8 models Jul 13, 2023

asmorkalov added optimization category: dnn labels Jul 13, 2023

asmorkalov requested review from vpisarev and fengyuentau July 13, 2023 11:46

dkurt mentioned this pull request Jul 27, 2023

DetectionOutput layer on OpenVINO without limitations #24069

Merged


dkurt force-pushed the openvino_int8_backend branch from 58225d3 to 4d1f854 August 10, 2023 12:47

dkurt marked this pull request as ready for review August 18, 2023 09:59

asmorkalov added this to the 4.9.0 milestone Sep 4, 2023

dkurt force-pushed the openvino_int8_backend branch 2 times, most recently from 57bef5d to ef72700 September 11, 2023 11:01

dkurt commented Sep 12, 2023

modules/dnn/src/net_impl_backend.cpp

dkurt added 11 commits September 27, 2023 18:56

Run first Quantize layer

cac994d

Conv with bias

9f289e0

Conv with output_zp

4a41a2c

Conv with outputMultiplier

00c8b04

Use FakeQuantize instead

931793b

input_zp at INT8 Convolution

c24adc3

Workaround RELU6

e219b7c

Fix Conv, Relu6. Add Eltwise sum

e44d4b7

Max pooling

bf86754

full inference

8ab5365

Use align corners at NN resize

f332d3b

dkurt added 13 commits September 27, 2023 18:56

Add FakeQuantize to Eltwise

8baf150

FakeQuantize for Pooling

a27b35f

FullyConnected INT8 with OpenVINO

6bb71af

Pooling with SUM

da0af4b

Reduced number of FP32 ops from 103 to 8

b0b0245

Better utilization of FakeQuantize in fully connected layer

c87db1a

Use multivalue FakeQuantize at Convolution and FullyConnected

e047716

Fix tests

b748de0

Skip two failed Ave pooling tests. Improve Dequantize.

ff1b2e8

Use ngraph::Output

224bfb8

Hide quantization to separate methods

72b736f

Require OpenVINO 2023.0

a76b63e

Correct fallback to OpenCV for OpenVINO < 2023

7cd2213

dkurt force-pushed the openvino_int8_backend branch from f009bd8 to 7cd2213 September 27, 2023 15:57

Skip failed tests after rebase

aa94875

asmorkalov reviewed Sep 28, 2023


fengyuentau approved these changes Sep 28, 2023


Add skip comments

49ac114

asmorkalov approved these changes Sep 28, 2023


asmorkalov assigned fengyuentau Sep 28, 2023

asmorkalov merged commit c7ec0d5 into opencv:4.x Sep 28, 2023
24 checks passed

asmorkalov mentioned this pull request Oct 17, 2023

(5.x) Merge 4.x #24416

Merged
