Signed-off-by: yuwenzho <[email protected]>
Showing 20 changed files with 761 additions and 273 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,20 +30,20 @@ pip install -r requirements.txt | |
pip install . | ||
``` | ||
|
||
> **Note**: | ||
> **Note**: | ||
> Further installation methods can be found under [Installation Guide](./docs/installation_guide.md). | ||
## Getting Started | ||
|
||
Setting up the environment: | ||
Setting up the environment: | ||
```bash | ||
pip install onnx-neural-compressor "onnxruntime>=1.17.0" onnx | ||
``` | ||
After successfully installing these packages, try your first quantization program. | ||
> Notes: please install from source before the formal pypi release. | ||
> Notes: please install from source before the formal pypi release. | ||
### Weight-Only Quantization (LLMs) | ||
Following example code demonstrates Weight-Only Quantization on LLMs, device will be selected for efficiency automatically when multiple devices are available. | ||
Following example code demonstrates Weight-Only Quantization on LLMs, device will be selected for efficiency automatically when multiple devices are available. | ||
|
||
Run the example: | ||
```python | ||
|
@@ -59,17 +59,16 @@ quant = matmul_nbits_quantizer.MatMulNBitsQuantizer( | |
) | ||
quant.process() | ||
best_model = quant.model | ||
``` | ||
``` | ||
|
||
### Static Quantization | ||
|
||
```python | ||
from onnx_neural_compressor import config | ||
from onnx_neural_compressor.quantization import quantize | ||
from onnx_neural_compressor.quantization import calibrate | ||
from onnx_neural_compressor.quantization import quantize, config | ||
from onnx_neural_compressor import data_reader | ||
|
||
|
||
class DataReader(calibrate.CalibrationDataReader): | ||
class DataReader(data_reader.CalibrationDataReader): | ||
def __init__(self): | ||
self.encoded_list = [] | ||
# append data into self.encoded_list | ||
|
@@ -127,6 +126,6 @@ quantize(model, output_model_path, qconfig) | |
* [Contribution Guidelines](./docs/source/CONTRIBUTING.md) | ||
* [Security Policy](SECURITY.md) | ||
|
||
## Communication | ||
## Communication | ||
- [GitHub Issues](https://github.com/onnx/neural-compressor/issues): mainly for bug reports, new feature requests, question asking, etc. | ||
- [Email](mailto:[email protected]): welcome to raise any interesting research ideas on model compression techniques by email for collaborations. | ||
- [Email](mailto:[email protected]): welcome to raise any interesting research ideas on model compression techniques by email for collaborations. |
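
The Static Quantization hunk above switches the `DataReader` base class from `calibrate.CalibrationDataReader` to `data_reader.CalibrationDataReader`, but the hunk is cut before the class body is shown. The following is a minimal standalone sketch of the reader pattern it suggests. It does not import `onnx_neural_compressor`; the class name `ListDataReader`, the `get_next` and `rewind` method names, and the sample inputs are all assumptions made for illustration, not taken from the diff.

```python
class ListDataReader:
    """Standalone stand-in for a calibration data reader.

    In the README diff the real class subclasses
    data_reader.CalibrationDataReader; that base class is NOT imported here,
    and the get_next/rewind interface below is an assumption.
    """

    def __init__(self, samples):
        # Mirrors the README's `self.encoded_list`: one dict of model inputs per sample.
        self.encoded_list = list(samples)
        self.iter_next = iter(self.encoded_list)

    def get_next(self):
        # Return the next sample, or None once the data is exhausted.
        return next(self.iter_next, None)

    def rewind(self):
        # Restart iteration so the same data can be replayed for another pass.
        self.iter_next = iter(self.encoded_list)


# Hypothetical calibration samples; real ones would hold tokenized model inputs.
reader = ListDataReader([{"input_ids": [1, 2, 3]}, {"input_ids": [4, 5, 6]}])

batches = []
while (batch := reader.get_next()) is not None:
    batches.append(batch)
```

Keeping the samples in a list and re-creating the iterator in `rewind` lets a caller replay the same calibration data more than once without reloading it.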