-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x64: utils: jit_io_helper: fix xf16 store from Xmm (fixes MFDNN-12635) #2509
base: main
Are you sure you want to change the base?
Conversation
make test |
host_->vcvtneps2bf16(cvt_lower_vmm, vmm, Xbyak::VexEncoding); | ||
host_->vcvtneps2bf16(cvt_lower_vmm, vmm, | ||
mayiuse(avx512_core) ? Xbyak::EvexEncoding | ||
: Xbyak::VexEncoding); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come an instruction that supports only Evex encoding has an option for Vex encoding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, but according to xbyak it has parameter for encoding:
https://github.com/oneapi-src/oneDNN/blob/main/third_party/xbyak/xbyak_mnemonic.h#L1260
And I see it's not the only place where Vex encoding is used, for example pooling:
https://github.com/oneapi-src/oneDNN/blob/main/src/cpu/x64/jit_uni_pool_kernel.cpp#L931
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should limit that instruction to Evex only, regardless if xbyak provides an opportunity for encoding argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do separate PR for this, as it occurs in many files.
|
||
# Test bf16 with aBcde4b format | ||
--reset | ||
--skip-impl=ref,simple # ! test jit version only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--skip-impl=ref,simple # ! test jit version only | |
--skip-impl=simple #skip non-jit version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
c1c72ef
to
3d10aca
Compare
make test |
This is a short fix for MFDNN-12635.
Function
jit_io_helper_t<Vmm>::store_bf16
doesn't supportXmm
, that's why it crashed when usingbf16
with blocked format with channels blocked by 4.I changed a way it works to use
jit_io_helper_t<Vmm>::store_byte_by_byte
function instead.Though, I have other idea that maybe using
store_bf16
could work if we introduced anotherOpmask
to filter out the lower half ofXmm
. It would require some changes tojit_io_helper_t
class, so I prefer to not submit it now as a bugfix, but after some time (as it requires some analysis), that's why I left aTODO
comment.Also I'm adding reorder regression input to benchdnn, as it was likely not covered on CPU side.