[GPU] Revert jit::gemm handling for f32 src, f16 fpmath #2467

Status: Open, wants to merge 2 commits into main from kealanba/disable_f16_f32_fpmath
Conversation

@kealan-barbieri (Contributor) commented Jan 21, 2025

Description

Revert to dispatching cases with f32 activations and the f16 fpmath setting to the reference implementation, as these cases incur additional register pressure from the f32->f16 reorder, which is not supported by the f16 gemm strategies:

onednn_verbose,v1,info,oneDNN v3.7.0 (commit 80b61e36646c9d1ab64d439a5bf1ea0966c6f0d9)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:224
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with float16, Intel DL Boost and bfloat16 support 
onednn_verbose,v1,info,gpu,runtime:OpenCL
onednn_verbose,v1,info,gpu,engine,opencl device count:1 
onednn_verbose,v1,info,gpu,engine,0,name:Intel(R) Data Center GPU Max 1100,driver_version:24.39.31294,binary_kernels:enabled
onednn_verbose,v1,info,experimental features are enabled
onednn_verbose,v1,info,use batch_normalization stats one pass is enabled
onednn_verbose,v1,info,GPU convolution v2 is disabled
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle,src/gpu/intel/jit/gemm/gen_gemm_kernel.cpp:959
Error: Function 'create_primitive' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/dnnl_common.hpp:422) returned 'runtime_error'
Error: Function 'init_prim' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/dnnl_common.hpp:475) returned '1'
Error: Function 'createit' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/matmul/matmul.cpp:884) returned '1'
Error: Function 'create' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/utils/task.hpp:49) returned '1'
0:UNTESTED_FAILED __REPRO: --matmul --engine=gpu --allow-enum-tags-only=false --dt=f32:u8:f32 --stag=ab --wtag=ba --dtag=ab --bia_dt=f32 --attr-scales=wei:per_oc --attr-zero-points=wei:per_oc:u8 --attr-scratchpad=user --attr-fpmath=f16:true 16384x512:512x512

Similar cases should instead use --attr-fpmath=strict:true to leverage the f32 jit::gemm strategies.
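The revert amounts to a simple dispatch rule. A minimal Python sketch of the predicate (hypothetical names, not the oneDNN source; the actual check lives in the jit::gemm kernel-selection code):

```python
def use_ref_gemm(src_dt: str, fpmath: str) -> bool:
    """Hypothetical sketch of the dispatch rule this PR restores.

    f32 activations under f16 fpmath fall back to the reference kernel,
    since the implied f32->f16 reorder adds register pressure that the
    f16 gemm strategies cannot absorb.
    """
    return src_dt == "f32" and fpmath == "f16"

print(use_ref_gemm("f32", "f16"))    # falls back to ref
print(use_ref_gemm("f32", "strict")) # stays on the f32 jit::gemm strategies
print(use_ref_gemm("f16", "f16"))    # native f16 path, no extra reorder
```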

Fixes MFDNN-13045

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Bug fixes

  • Have you included information on how to reproduce the issue (either in a GitHub issue or in this PR)?
  • Have you added relevant regression tests?

@kealan-barbieri kealan-barbieri requested a review from a team as a code owner January 21, 2025 21:45
@github-actions github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Jan 21, 2025
@kealan-barbieri kealan-barbieri force-pushed the kealanba/disable_f16_f32_fpmath branch from 146ea11 to 504505a Compare January 23, 2025 01:38
@kealan-barbieri (Contributor, Author) commented:

make test
disable device_cpu
disable benchdnn_all
enable benchdnn_matmul
enable benchdnn_ip

@kealan-barbieri (Contributor, Author) commented:

make test perf-gpu
set primitive=matmul

Review comment on this diff hunk:

    });
    add_mode_matches(fpmath_f16, [](Type dt) -> const char * {
        if (dt.isF8()) return "H";
        return nullptr;
You should be able to just add these if statements into the corresponding add_mode_matches at lines 560 and 566. Is there a reason that doesn't work?
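A sketch of what the reviewer seems to be suggesting (pseudocode only; the existing add_mode_matches calls at lines 560 and 566 are not shown in this excerpt, so the surrounding mappings are assumptions):

```
// Pseudocode sketch, not the actual source: fold the new f8 check into
// an existing fpmath_f16 handler instead of registering a separate one.
add_mode_matches(fpmath_f16, [](Type dt) -> const char * {
    if (dt.isF16()) return "H";  // assumed existing mapping
    if (dt.isF8()) return "H";   // new check merged in
    return nullptr;
});
```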

Labels
platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel
3 participants