RFC for aligning StableHLO and TOSA arithmetic #1149

Open
eric-k256

Initial version of an RFC to discuss aligning StableHLO and TOSA arithmetic operations.

Signed-off-by: Eric Kunze <eric.kunze@arm.com>

@burmako burmako self-assigned this Feb 10, 2023
@burmako burmako added the RFC label Feb 10, 2023
@burmako burmako mentioned this pull request Feb 10, 2023

@burmako burmako left a comment

Hi folks! Apologies for the late reply. There have been out-of-band conversations about this RFC, and I'd now like to summarize them here on GitHub.

As the RFC mentions, the StableHLO dialect currently supports quantization via:

  1. Supporting quant.uniform element types.
  2. Having dedicated ops like uniform_quantize / uniform_dequantize.
  3. Allowing regular ops like add / convolution to take quantized tensors.

This support was inherited from MHLO when StableHLO was bootstrapped, and MHLO's support was motivated by mobile use cases and inherited from TFLite. TFLite quantization has a specification, but StableHLO quantization does not.

One key aspect of the TFLite quantization spec is that it uses a floating-point scale and an integer zero point as quantization parameters. By comparison, the quantization parameters proposed in this RFC are an integer multiplier and shift. Harmonizing the two, or finding some kind of compromise, looks like the main open question at the moment.
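To make the contrast concrete, here is a minimal Python sketch of the two parameterizations. It is illustrative only: the function names, the int8 range, the 31-bit multiplier width, and the round-half-up shift are assumptions chosen for the example, not definitions taken from the RFC, the TFLite spec, or TOSA.

```python
import math
from typing import Tuple


def quantize(real: float, scale: float, zero_point: int,
             qmin: int = -128, qmax: int = 127) -> int:
    """Affine quantization with a float scale and an integer zero point."""
    q = round(real / scale) + zero_point
    return max(qmin, min(qmax, q))  # saturate to the (assumed) int8 range


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Inverse mapping: real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


def scale_to_multiplier_shift(scale: float, bits: int = 31) -> Tuple[int, int]:
    """Approximate a positive float scale as multiplier * 2**(-shift).

    The multiplier is normalized into [2**(bits-1), 2**bits); the width
    `bits` is an assumption made for this sketch.
    """
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    mantissa, exponent = math.frexp(scale)  # scale == mantissa * 2**exponent
    multiplier = round(mantissa * (1 << bits))
    if multiplier == (1 << bits):  # rounding carried the mantissa up to 1.0
        multiplier >>= 1
        exponent += 1
    shift = bits - exponent
    if shift < 1:
        raise ValueError("scale too large for this sketch")
    return multiplier, shift


def apply_multiplier_shift(value: int, multiplier: int, shift: int) -> int:
    """Integer-only rescale: (value * multiplier) >> shift, rounding half up."""
    return (value * multiplier + (1 << (shift - 1))) >> shift
```

With these helpers, `scale_to_multiplier_shift(0.5)` yields a multiplier of `2**30` and a shift of 31, so a float scale can be replaced by integer-only arithmetic at the cost of a small approximation error for scales that are not exact powers of two.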

Towards that end, I propose that we collaborate on pull requests to the StableHLO specification. @sdasgup3 has created an initial PR #1352 that drafts a specification for QuantizedType and for the semantics of quantized add, modelled after TFLite quantization semantics. Let's get together as a community and discuss the details, with the plan to progress to more involved ops like convolution in future pull requests.

burmako pushed a commit that referenced this pull request Apr 14, 2023
StableHLO dialect currently supports quantization via:
  1) Supporting `quant.uniform` element types.
  2) Having dedicated ops like `uniform_quantize` / `uniform_dequantize`.
  3) Allowing regular ops like `add` / `convolution` to take quantized
tensors.

This support was inherited from MHLO when StableHLO was bootstrapped,
and MHLO support was motivated by mobile use cases and inherited from
TFLite.

As pointed out in #1149, StableHLO specification doesn't support
quantization at the moment, and this is an important gap that we would 
like to fix before StableHLO v1.0 (see #588).

To continue the discussion started in #1149 and to make progress towards
v1.0, this pull request:
  A) Adds QuantizedType to the StableHLO specification, modelled after
[TFLite quantization
spec](https://www.tensorflow.org/lite/performance/quantization_spec).
  B) To start a conversation about the applications of QuantizedType and
the semantics of quantized ops, proposes semantics for quantized `add`.

TFLite quantization spec doesn't cover everything. It specs constraints
on types (which we captured accordingly in this pull request), but it
doesn't go into describing semantics of quantized ops.

As a result, the proposed semantics for quantized `add` is intentionally
naive, as compared with the much more involved implementations in the
TensorFlow repository, e.g.:
  * [tfl.add](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/add.cc).
  * [tf.UniformQuantizedAdd](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/uniform_quant_ops/uniform_quantized_add_op.cc).
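
For readers unfamiliar with the idea, one plausible reading of a "naive" quantized `add` is dequantize, add in floating point, then requantize. The sketch below is an assumption made for illustration, not the text proposed in the pull request; the function name, the `(scale, zero_point)` tuple layout, and the int8 range are all hypothetical.

```python
from typing import Tuple

QParams = Tuple[float, int]  # (scale, zero_point); hypothetical layout


def naive_quantized_add(lhs_q: int, rhs_q: int,
                        lhs_params: QParams, rhs_params: QParams,
                        out_params: QParams,
                        qmin: int = -128, qmax: int = 127) -> int:
    """Dequantize both operands, add as floats, requantize the sum."""
    lhs_scale, lhs_zp = lhs_params
    rhs_scale, rhs_zp = rhs_params
    out_scale, out_zp = out_params
    # Dequantize: real = scale * (q - zero_point)
    real_sum = lhs_scale * (lhs_q - lhs_zp) + rhs_scale * (rhs_q - rhs_zp)
    # Requantize with the output parameters and saturate to the assumed range
    q = round(real_sum / out_scale) + out_zp
    return max(qmin, min(qmax, q))
```

Production kernels such as the ones linked above typically avoid the floating-point round trip, which is part of why they are more involved.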

upd: After community discussion, we removed the spec for quantized
`add` leaving that for future work, since further alignment is required.

---------

Co-authored-by: Eugene Burmako <burmako@google.com>
@burmako burmako assigned GleasonK and unassigned burmako Nov 6, 2023
@burmako burmako requested a review from GleasonK November 9, 2023 22:31