Looking for selective post-training quantization for 8-bit weights and 16-bit activations #395

Open
gaikwadrahul8 opened this issue Nov 28, 2024 · 2 comments

@gaikwadrahul8
System information

TensorFlow version (you are using): TF 2.13.0
Are you willing to contribute it (Yes/No): No
Describe the feature and the current behavior/state.

Dear TF developers, I'm currently experimenting with PTQ using 8-bit weights and 16-bit activations (W8A16), and I've gotten great results. However, after some experimentation I have identified that only a certain part of my network requires the 16-bit activations. In other words, using 16-bit activations for the entire model is sub-optimal for my use case.

Hence, I'm looking for a way to selectively quantize one part of my model to 8-bit weights and activations (W8A8) and the other part to W8A16.

In the current state, would this be possible somehow?

Who will benefit from this feature?
Platforms that support mixed-precision execution of activations.

Any Other info.
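
For reference, a minimal sketch of the whole-model W8A16 PTQ flow described above, assuming the standard TFLite converter 16x8 quantization mode (the SavedModel path and representative dataset below are placeholders):

```python
import tensorflow as tf

# Placeholder path: point this at your own SavedModel.
saved_model_dir = "/path/to/saved_model"

def representative_dataset():
    # Placeholder calibration data; yield samples shaped like the model input.
    for _ in range(100):
        yield [tf.random.uniform((1, 224, 224, 3), dtype=tf.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# 16x8 mode: 8-bit weights, 16-bit activations for the entire graph.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8
]
tflite_model = converter.convert()
```

What I'm missing is a way to apply this 16x8 mode only to a selected subset of the graph while keeping the rest at W8A8.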

@gaikwadrahul8

This issue, originally reported by @Hrayo712, has been moved to this dedicated ai-edge-torch repository to improve issue tracking and prioritization. To ensure continuity, we have created this new issue on your behalf.

We appreciate your understanding and look forward to your continued involvement.

@pkgoogle
Contributor

pkgoogle commented Dec 2, 2024

Original Issue: tensorflow/tensorflow#61720
