
src: attr: quantization refactor (part 1) #2270

Open · wants to merge 5 commits into base: dzarukin/refactor_quant_prereq
Conversation

dzarukin
Contributor

Every year around Christmas time, something happens to quantization:

  • 2022 - the move to runtime happened, leaving lots of obsolete code behind.
  • 2023 - advanced quantization with groups appeared, leaving behind even more code that could use some love.
  • 2023.5 - zero-points were extended for the SRC argument; zero-points became a warehouse of variables accessed directly.
  • 2024 - time for a refactor!

The whole point of this refactor is to move quantization attributes to the C++ way of doing things: provide clear and simple interfaces to operate with the objects, and close off direct member access.
Part 1 covers scales, which were not that bad in terms of interfaces but could be improved around per-argument members, which is what this part does.

The interfaces for both scales and the new underlying object, quant_entry_t, provide getters for the mask, data type, and groups (no need to worry about ndims any longer!), as well as checks for default values.
Initialization is still done through set; reset was replaced with set.

Any operations with masks should happen only after verifying that the scale for a specific argument is not default! This is now enforced through an invalid default mask value which can't be used as is.

Besides the legacy use cases, here are the main changes across the sources:

common:
  • some primitive attribute checkers got more checks added.

cpu:
  • many places that checked mask != 0 (where 0 was the default value for common and non-initialized scales) were updated to mask > 0, since the default mask is now negative.
  • A check for equality is still valid, but it's highly recommended to avoid inequality comparisons (unless you really know what you are doing).

gpu:
  • fixed some bugs found in the gemm_with_post_ops implementations.
  • changed the logic in several generic/cudnn kernels to match the new behavior.

tests:
  • updated gtests to comply with the updated primitive checks.

Part 2 will cover zero-points.

Disclaimer: the change is somewhat fundamental, so bug leaks are quite possible even if all tests pass. Feel free to report anything I missed.

@dzarukin dzarukin requested review from a team as code owners December 13, 2024 21:37
@dzarukin
Contributor Author

make test

@github-actions github-actions bot added platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64 platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel platform:gpu-generic Codeowner: @oneapi-src/onednn-gpu-generic labels Dec 13, 2024
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from ad9b2d3 to 9b4efc6 Compare December 13, 2024 23:33
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 9b4efc6 to 58b979e Compare December 16, 2024 18:10
@dzarukin
Contributor Author

make test

@github-actions github-actions bot added the component:tests Codeowner: @oneapi-src/onednn-arch label Dec 16, 2024
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 58b979e to b196033 Compare January 7, 2025 00:13
Contributor

@mgouicem mgouicem left a comment


Looks good overall, thanks for the refactor.
Just one thing about the mask: moving the default from 0 to -1 seems slippery.

Now developers will always have to check that the value is non-default before comparing it to 0 or doing a binary operation (e.g. or/xor).
I think it might be worth normalizing the mask in get_mask so that the default value becomes an implementation detail.

src/common/matmul_pd.hpp Outdated Show resolved Hide resolved
src/common/primitive_attr_quant.hpp Show resolved Hide resolved
src/cpu/ref_convolution_int8.cpp Show resolved Hide resolved
src/cpu/scale_utils.cpp Show resolved Hide resolved
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from b196033 to a3f8667 Compare January 10, 2025 23:51
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch 5 times, most recently from 3a6d823 to 004ca8b Compare January 17, 2025 04:54
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 004ca8b to 4bdcad2 Compare January 22, 2025 23:40
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 4bdcad2 to 0f09c05 Compare January 22, 2025 23:43
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 0f09c05 to e61f596 Compare January 24, 2025 23:56
@dzarukin dzarukin changed the base branch from main to dzarukin/refactor_quant_prereq January 24, 2025 23:56
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from e61f596 to 1c44dbb Compare January 25, 2025 02:06