Align NPU to CPU #2560
Conversation
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

@@             Coverage Diff              @@
##           develop    #2560       +/-   ##
============================================
- Coverage    62.07%   29.94%    -32.13%
============================================
  Files          494      494
  Lines        45775    45791        +16
============================================
- Hits         28413    13714     -14699
- Misses       17362    32077     +14715

... and 262 files with indirect coverage changes

Flags with carried forward coverage won't be shown.
Linter fails due to #2561
Force-pushed from 9d4ec97 to 4e6bbf9
Should the NPU config also contain this?
Maybe @alexsu52 and @AlexKoff88 can answer this question.
Good question. Scale unification should be applied only to operations from the CPU hardware config, to align CPU and NPU behavior in INT8 quantization. As for other precisions, we should not change their behavior.
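A minimal sketch of the rule described above (the helper names and the operation set are hypothetical illustrations, not NNCF's actual API): unify scales only for operation types that the CPU hardware config marks for unification, and only at INT8, so other precisions keep their current behavior.

```python
# Hypothetical sketch: restrict scale unification to INT8 and to the
# operation types the CPU hardware config marks for it. Names and the
# example set are illustrative, not NNCF's real API.
CPU_SCALE_UNIFICATION_OPS = {"Concat"}  # assumed example set

def should_unify_scales(op_type: str, num_bits: int) -> bool:
    # Unify only for INT8; other precisions keep their behavior.
    if num_bits != 8:
        return False
    return op_type in CPU_SCALE_UNIFICATION_OPS

assert should_unify_scales("Concat", 8)
assert not should_unify_scales("Concat", 4)   # e.g. W4A4 stays untouched
assert not should_unify_scales("MatMul", 8)
```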
Do you have results of the conformance test for the NPU config?
Please think about what test needs to be added to check the equality of the CPU, GPU, and NPU configs for INT8.
We have no tests for NPU in the conformance suite either. I haven't run these tests.
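One possible shape for the suggested test, as a sketch only: load each hardware config and compare the INT8 quantization entries across devices. The directory path and JSON keys below are assumptions modeled on a typical NNCF hardware-config layout, not verified paths.

```python
# Hypothetical conformance-style check: INT8 entries should match
# across the CPU, GPU, and NPU hardware configs. The path and JSON
# structure are assumptions for illustration.
import json
from pathlib import Path

CONFIG_DIR = Path("nncf/common/hardware/configs")  # assumed location

def int8_entries(device: str) -> list:
    config = json.loads((CONFIG_DIR / f"{device}.json").read_text())
    # Keep only quantization configurations that use 8 bits.
    return [
        op for op in config.get("config", {}).get("quantization", [])
        if op.get("bits") == 8
    ]

def test_int8_configs_aligned():
    cpu = int8_entries("cpu")
    for device in ("gpu", "npu"):
        assert int8_entries(device) == cpu, f"{device} diverges from cpu"
```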
"15 /nncf_model_output_1" [id=15, label="nncf_model_output_#15", style=filled, type=nncf_model_output]; | ||
"16 /nncf_model_output_2" [id=16, label="nncf_model_output_#16", style=filled, type=nncf_model_output]; | ||
"17 /nncf_model_output_3" [id=17, label="nncf_model_output_#17", style=filled, type=nncf_model_output]; | ||
"6 MultiBranchesModel/MaxPool2d[max_pool_b]/SymmetricQuantizer/symmetric_quantize_0" [color=green, id=6, label="AFQ_[B:8 M:S SGN:U PC:N]_#6_G0", style=filled, type=symmetric_quantize]; |
The same for all requant tests.
Other references just changed asymmetric to symmetric in 8-bit activations.
Thanks for the clarification.
But why do we have an asymmetric quantizer after the ReLU layer?
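For context on this question (an illustrative note, not part of the original review): since ReLU outputs are non-negative, an unsigned symmetric quantizer with an implicit zero-point of 0 covers the same [0, max] range that an asymmetric quantizer would pick for such activations. A small NumPy sketch of that point:

```python
# Illustrative sketch: for non-negative (post-ReLU) activations, an
# unsigned symmetric quantizer (zero_point = 0) spans the same range
# [0, max] that an asymmetric quantizer would select.
import numpy as np

def quantize_unsigned_symmetric(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    levels = 2**num_bits - 1                   # 255 for 8 bits
    scale = max(float(x.max()), 1e-12) / levels  # zero_point is implicitly 0
    q = np.clip(np.round(x / scale), 0, levels)
    return q * scale                           # dequantized values

x = np.maximum(np.random.randn(1000).astype(np.float32), 0)  # ReLU output
xq = quantize_unsigned_symmetric(x)
print("max abs error:", np.abs(x - xq).max())
```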
Just for the record, the main motivation for keeping the config is QAT for NPU, which has some custom features such as W4A4 support.
Force-pushed from b741e0b to 6dd4caa
Force-pushed from 1e959af to 54c317e
Force-pushed from 54c317e to e5cfe06
@@ -206,6 +206,7 @@ def __init__(
        else:
            self._preset = QuantizationPreset.PERFORMANCE

        self._override_device()
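The diff only shows the call site; the body of `_override_device` is not part of this excerpt. Purely as a guess at the intent suggested by the discussion (aligning NPU with CPU for INT8), with stand-in class and enum names:

```python
# Hypothetical sketch of the override called above; the real method
# body is not shown in this PR excerpt, and these names are stand-ins.
from enum import Enum

class TargetDevice(Enum):  # stand-in for the real target-device enum
    CPU = "CPU"
    NPU = "NPU"

class QuantizerSketch:
    def __init__(self, target_device: TargetDevice):
        self._target_device = target_device
        self._override_device()

    def _override_device(self) -> None:
        # Assumed behavior, inferred from the discussion: reuse the CPU
        # settings so NPU INT8 quantization is aligned with CPU.
        if self._target_device == TargetDevice.NPU:
            self._target_device = TargetDevice.CPU
```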
@alexsu52, I've added the overriding. Please review.
@alexsu52, please review.
LGTM
Changes
Reason for changes
Related tickets
Tests