OpenVINO backend for INT8 models #23987

Merged on Sep 28, 2023 (36 commits)

@dkurt dkurt commented Jul 13, 2023

Pull Request Readiness Checklist

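TODO:
- [x] DetectionOutput layer (#24069)
- [x] Fewer FP32 fallbacks (e.g. Sigmoid, eltwise sum)
- [x] Accuracy, performance tests (#24039)
- [x] Single layer tests (convolution)
- [x] ~~Fixes for OpenVINO 2022.1 (https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100334)~~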

Performance results for the object detection model `coco_efficientdet_lite0_v1_1.0_quant_2021_09_06.tflite`:

| backend | performance (median time) |
|---|---|
| OpenCV | 77.42 ms |
| OpenVINO 2023.0 | 10.90 ms |

CPU: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz

Serialized model per-layer stats (note that Convolution should use *_I8 primitives if they are quantized correctly): https://gist.github.com/dkurt/7772bbf1907035441bb5454f19f0feef
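
For reference, the two median times reported above correspond to a speedup of roughly 7x. A minimal sketch of that arithmetic follows. `medianMs` and `speedup` are hypothetical helper names, not part of OpenCV or of this PR, and the sketch does not reproduce how the timings were actually collected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of per-run timings in milliseconds (hypothetical helper).
double medianMs(std::vector<double> samples)
{
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    // Odd count: middle element. Even count: mean of the two middle elements.
    return (n % 2 == 1) ? samples[n / 2]
                        : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}

// Ratio of baseline to candidate median time; a value above 1 means
// the candidate backend is faster than the baseline.
double speedup(double baselineMs, double candidateMs)
{
    assert(candidateMs > 0.0);
    return baselineMs / candidateMs;
}
```

With the reported values, `speedup(77.42, 10.90)` evaluates to about 7.1.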


See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom

Xbuild_image:Custom=ubuntu-openvino-2021.4.2:20.04
build_image:Custom=ubuntu-openvino-2022.1.0:20.04

test_modules:Custom=dnn,python2,python3,java,gapi,video

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
test_bigdata:Custom=1
test_filter:Custom=*

@dkurt dkurt changed the title from "Openvino int8 backend" to "OpenVINO backend for INT8 models" Jul 13, 2023
asmorkalov pushed a commit that referenced this pull request Aug 2, 2023
DetectionOutput layer on OpenVINO without limitations #24069

### Pull Request Readiness Checklist

required for #23987

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
@asmorkalov
Contributor

@dkurt The related patch for the detection layer has been merged.

@dkurt dkurt marked this pull request as ready for review August 18, 2023 09:59
@asmorkalov asmorkalov added this to the 4.9.0 milestone Sep 4, 2023
@dkurt dkurt force-pushed the openvino_int8_backend branch 2 times, most recently from 57bef5d to ef72700 on September 11, 2023 11:01
@asmorkalov
Contributor

@dkurt @fengyuentau Friendly reminder.

@dkurt
Member Author

dkurt commented Sep 19, 2023

@asmorkalov, can you please re-run the failed opencv/ci-gha-workflow#109? The Buildbot machines only have OpenVINO 2022, but this PR enables INT8 only with OpenVINO 2023 and higher.
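
The version requirement in this comment can be pictured as a simple gate on the OpenVINO release. The sketch below is illustrative only: the integer encoding and the names `encodeRelease` and `int8Enabled` are assumptions made for this example, not the macros or functions used in the PR.

```cpp
#include <cassert>

// Hypothetical integer encoding of an OpenVINO release:
// year * 10^6 + minor * 10^4 + patch, e.g. 2022.1.0 -> 2022010000.
long long encodeRelease(int year, int minor, int patch = 0)
{
    assert(minor >= 0 && minor < 100 && patch >= 0 && patch < 10000);
    return year * 1000000LL + minor * 10000LL + patch;
}

// True when the release is 2023.0 or newer; the patch level is ignored
// by dropping the last four digits before comparing.
bool int8Enabled(long long release)
{
    return release / 10000 >= encodeRelease(2023, 0) / 10000;
}
```

Under this encoding, a 2022.1 build reports `false` and a 2023.0 build reports `true`, which matches the situation described for the Buildbot machines.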

@asmorkalov
Contributor

@dkurt I tested the PR manually with the mentioned Docker image. List of failed tests:

[  PASSED  ] 4193 tests.
[  FAILED  ] 36 tests, listed below:
[  FAILED  ] DNNTestNetwork.FastNeuralStyle_eccv16/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Convolution2D/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Convolution3D/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Flatten/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Padding/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.AvePooling/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.MaxPooling/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Reduce/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Softmax_slim_TF/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Concat/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Scale/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.InnerProduct/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Reshape/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_layers.Eltwise/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.GoogLeNet/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.DenseNet121/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.SqueezeNet_v1_1/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.MobileNet_v1_SSD/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.MobileNet_v1_SSD_PPN/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.Inception_v2_SSD/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.opencv_face_detector/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.EfficientDet/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_resnet50/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_inceptionv2/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_vgg16/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.FasterRCNN_zf/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.RFCN/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.TinyYoloVoc/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.YOLOv3/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.YOLOv4/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_Int8_nets.YOLOv4_tiny/1, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.Quantized_Convolution/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.Quantized_Resize/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.Quantized_Concat/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_ONNX_layers.QLinearSoftmax/0, where GetParam() = NGRAPH/CPU
[  FAILED  ] Test_TFLite.EfficientDet_int8/0, where GetParam() = NGRAPH/CPU

Some logs:

[ RUN      ] Test_Int8_layers.Convolution3D/1, where GetParam() = NGRAPH/CPU
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0311098 vs 0.00734
conv3d  |ref| = 2.9877567291259766
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.469178 vs 0.02434
conv3d  |ref| = 2.9877567291259766
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0131492 vs 0.00353
conv3d  |ref| = 1.308632493019104
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.151915 vs 0.00941
conv3d  |ref| = 1.308632493019104
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.132021 vs 0.00129
conv3d_bias  |ref| = 0.55038714408874512
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.307804 vs 0.00249
conv3d_bias  |ref| = 0.55038714408874512
[  FAILED  ] Test_Int8_layers.Convolution3D/1, where GetParam() = NGRAPH/CPU (50 ms)
[ RUN      ] Test_Int8_layers.Padding/1, where GetParam() = NGRAPH/CPU
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0141513 vs 0.0026
padding_valid  |ref| = 0.98040145635604858
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.106906 vs 0.0064
padding_valid  |ref| = 0.98040145635604858
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0167119 vs 0.0081
padding_same  |ref| = 2.7980570793151855
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.455697 vs 0.032
padding_same  |ref| = 2.7980570793151855
/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
Expected: (normL1) <= (l1), actual: 0.0744841 vs 0.0078
spatial_padding  |ref| = 2.5320920944213867
/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
Expected: (normInf) <= (lInf), actual: 0.54478 vs 0.028
spatial_padding  |ref| = 2.5320920944213867
[  FAILED  ] Test_Int8_layers.Padding/1, where GetParam() = NGRAPH/CPU (105 ms)
[ RUN      ] Test_Int8_layers.AvePooling/1, where GetParam() = NGRAPH/CPU
unknown file: Failure
C++ exception with description "OpenCV(4.8.0-dev) /opencv/modules/dnn/src/ie_ngraph.cpp:771: error: (-2:Unspecified error) in function 'initPlugin'
> Failed to initialize Inference Engine backend (device = CPU): Check 'false' failed at src/inference/src/core.cpp:114:
> Supported primitive descriptors list is empty for node: AvgPool_94231
> " thrown in the test body.
[  FAILED  ] Test_Int8_layers.AvePooling/1, where GetParam() = NGRAPH/CPU (30 ms)
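
The `normL1` and `normInf` failures in the log compare a backend's output against a reference. Assuming `normL1` is the mean absolute difference (the L1 norm divided by the element count) and `normInf` is the maximum absolute difference, such a check can be sketched in plain C++ without OpenCV. `Norms`, `computeNorms` and `withinTolerance` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Norms
{
    double l1;   // mean absolute difference
    double inf;  // maximum absolute difference
};

// Compare a test output against a reference of the same size.
Norms computeNorms(const std::vector<float>& ref, const std::vector<float>& test)
{
    assert(ref.size() == test.size() && !ref.empty());
    double sum = 0.0;
    double maxAbs = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i)
    {
        const double d = std::fabs(static_cast<double>(ref[i]) - static_cast<double>(test[i]));
        sum += d;
        maxAbs = std::max(maxAbs, d);
    }
    return Norms{ sum / static_cast<double>(ref.size()), maxAbs };
}

// Both norms must stay within their thresholds for the check to pass.
bool withinTolerance(const Norms& n, double l1, double lInf)
{
    return n.l1 <= l1 && n.inf <= lInf;
}
```

For the first Convolution3D failure above, the measured pair (0.0311098, 0.469178) exceeds the thresholds (0.00734, 0.02434), so a check of this shape fails.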

@asmorkalov
Contributor

The test was executed on an old PC without AVX2. I'll check another host and publish the logs.

@asmorkalov
Contributor

@dkurt The pipeline with OpenVINO 2023 has been merged. Please rebase your PR on top of the current 4.x branch to get a new CI status.

@@ -124,6 +125,10 @@ class PoolingLayerInt8Impl CV_FINAL : public PoolingLayerInt8
            return type == MAX || type == AVE;
        return false;
    }
    else if (backendId == DNN_BACKEND_INFERENCE_ENGINE_NGRAPH)
    {
        return true;
Contributor

OpenVINO is an optional dependency, so some availability check is needed here.

Member Author

The checks are inconsistent across layers: some layers check HAVE_INF_ENGINE and some do not.
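
One possible shape for the availability check discussed in this thread is to gate the OpenVINO branch on a compile-time flag. The sketch below is self-contained and uses stand-in names: the `Backend` enum, `haveOpenVINO` and this `supportBackend` are hypothetical, and only the `HAVE_INF_ENGINE` macro name is taken from the comment above. It is not the code that was merged.

```cpp
#include <cassert>

// Stand-in backend identifiers for this sketch (not the OpenCV enum).
enum Backend
{
    BACKEND_OPENCV = 0,
    BACKEND_NGRAPH = 1
};

// Reports whether the build was configured with OpenVINO support.
// HAVE_INF_ENGINE is assumed to be defined by the build system when it is.
bool haveOpenVINO()
{
#ifdef HAVE_INF_ENGINE
    return true;
#else
    return false;
#endif
}

// A layer advertises the OpenVINO backend only when it is actually available.
bool supportBackend(int backendId)
{
    assert(backendId >= 0);  // backend ids are non-negative in this sketch
    if (backendId == BACKEND_OPENCV)
        return true;
    if (backendId == BACKEND_NGRAPH)
        return haveOpenVINO();
    return false;
}
```

Centralizing the check in one helper is one way to avoid the inconsistency mentioned above, since every layer would then answer the question the same way.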

modules/dnn/test/test_int8_layers.cpp (outdated, resolved)
modules/dnn/test/test_tflite_importer.cpp (outdated, resolved)
@asmorkalov
Contributor

The PR looks great! Thanks a lot for the contribution! Could you add INT8-related information to the wiki or documentation? It should help promote your work among OpenCV users.
Candidates:

Member

@fengyuentau fengyuentau left a comment

👍

Contributor

@asmorkalov asmorkalov left a comment

👍

@asmorkalov asmorkalov merged commit c7ec0d5 into opencv:4.x Sep 28, 2023
24 checks passed
@dkurt
Member Author

dkurt commented Sep 28, 2023

@asmorkalov, I wanted to refer to a quantization tutorial or wiki page, but it turned out that there are no such articles in the OpenCV docs.

@asmorkalov
Contributor

You are welcome to create yet another chapter!

hanliutong pushed a commit to hanliutong/opencv that referenced this pull request Oct 7, 2023
OpenVINO backend for INT8 models opencv#23987

### Pull Request Readiness Checklist

TODO:
- [x] DetectionOutput layer (opencv#24069)
- [x] Less FP32 fallbacks (i.e. Sigmoid, eltwise sum)
- [x] Accuracy, performance tests (opencv#24039)
- [x] Single layer tests (convolution)
- [x] ~~Fixes for OpenVINO 2022.1 (https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100334)~~


Performance results for the object detection model `coco_efficientdet_lite0_v1_1.0_quant_2021_09_06.tflite`:
| backend | performance (median time) |
|---|---|
| OpenCV | 77.42ms |
| OpenVINO 2023.0 | 10.90ms |

CPU: `11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz`

Serialized model per-layer stats (note that Convolution should use `*_I8` primitives if they are quantized correctly): https://gist.github.com/dkurt/7772bbf1907035441bb5454f19f0feef

---

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
@asmorkalov asmorkalov mentioned this pull request Oct 17, 2023
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
DetectionOutput layer on OpenVINO without limitations opencv#24069

thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
OpenVINO backend for INT8 models opencv#23987

thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
DetectionOutput layer on OpenVINO without limitations opencv#24069

thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
OpenVINO backend for INT8 models opencv#23987
