
src: attr: quantization refactor (part 1) #2270

Open · wants to merge 5 commits into base: dzarukin/refactor_quant_prereq
Conversation

dzarukin
Contributor

Every year around Christmas time, something happens to quantization:

  • 2022 - the move to runtime happened, leaving lots of obsolete code behind.
  • 2023 - advanced quantization with groups appeared, leaving behind even more code that could use some love.
  • 2023.5 - zero-points were extended for the SRC argument; zero-points became a warehouse of variables accessed directly.
  • 2024 - time for a refactor!

The whole point of this refactor is to move quantization attributes to the C++ way of doing things: provide clear and simple interfaces to operate with the objects, and close off direct member access.
Part 1 covers scales, which were not that bad in terms of interfaces but could be improved around per-argument members, which is what this part does.

The interfaces for both scales and the new underlying object, quant_entry_t, provide getters for the mask, data type, and groups (no need to worry about ndims any longer!), as well as checks for default values.
Initialization is still done through set; reset was replaced with set.

Any operations with masks should happen only after verifying that the scale for a specific argument is not default! This is now enforced through an invalid default mask value which can't be used as is.

Besides the legacy use cases, here are the main changes across the sources:

common:
  • some primitive attribute checkers got more checks added.

cpu:
  • many places that checked mask != 0 (where 0 was the default value for common and non-initialized scales) were updated to mask > 0, since the default mask is now negative.
  • A check for equality is still valid, but it's highly recommended to avoid inequality comparisons (unless you really know what you are doing).

gpu:
  • fixed some bugs found in the gemm_with_post_ops implementations.
  • changed the logic in several generic/cudnn kernels to match the new behavior.

tests:
  • updated gtests to comply with the updated primitive checks.

Part 2 will cover zero-points.

Disclaimer: the change is somewhat fundamental, so bug leaks are quite possible even if all tests pass. Feel free to report anything I missed.

@dzarukin dzarukin requested review from a team as code owners December 13, 2024 21:37
@dzarukin
Contributor Author

make test

@github-actions github-actions bot added platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64 platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel platform:gpu-generic Codeowner: @oneapi-src/onednn-gpu-generic labels Dec 13, 2024
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from ad9b2d3 to 9b4efc6 Compare December 13, 2024 23:33
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 9b4efc6 to 58b979e Compare December 16, 2024 18:10
@dzarukin
Contributor Author

make test

@github-actions github-actions bot added the component:tests Codeowner: @oneapi-src/onednn-arch label Dec 16, 2024
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 58b979e to b196033 Compare January 7, 2025 00:13
Contributor

@mgouicem mgouicem left a comment


Looks good overall, thanks for the refactor.
Just one thing about the mask: moving the default from 0 to -1 seems slippery.

Now developers will always have to check that the value is non-default before comparing it to 0 or doing a binary operation (e.g. or/xor).
I think it might be worth normalizing the mask in get_mask so that the default value becomes an implementation detail.

src/common/matmul_pd.hpp Outdated Show resolved Hide resolved
src/common/primitive_attr_quant.hpp Show resolved Hide resolved
src/cpu/ref_convolution_int8.cpp Show resolved Hide resolved
src/cpu/scale_utils.cpp Show resolved Hide resolved
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from b196033 to a3f8667 Compare January 10, 2025 23:51
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch 5 times, most recently from 3a6d823 to 004ca8b Compare January 17, 2025 04:54
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 004ca8b to 4bdcad2 Compare January 22, 2025 23:40
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 4bdcad2 to 0f09c05 Compare January 22, 2025 23:43
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 0f09c05 to e61f596 Compare January 24, 2025 23:56
@dzarukin dzarukin changed the base branch from main to dzarukin/refactor_quant_prereq January 24, 2025 23:56
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from e61f596 to 1c44dbb Compare January 25, 2025 02:06