Serialization for GGML:
- `bentoml.ggml.save_model`
- `bentoml.ggml.load_model`
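A minimal sketch of the round-trip semantics the proposed save/load pair would provide, using a toy in-memory store. The store, tag format, and return types here are assumptions for illustration, not the BentoML implementation:

```python
# Hypothetical sketch only: a toy in-memory store illustrating the
# save/load round trip that bentoml.ggml.save_model /
# bentoml.ggml.load_model would provide. None of this is real BentoML code.
_STORE: dict = {}

def save_model(tag: str, model_bytes: bytes) -> str:
    """Persist GGML weight bytes under a tag and return the tag."""
    _STORE[tag] = model_bytes
    return tag

def load_model(tag: str) -> bytes:
    """Load previously saved GGML weight bytes by tag."""
    if tag not in _STORE:
        raise KeyError(f"model {tag!r} not found in store")
    return _STORE[tag]

# Round-trip example with a made-up tag
tag = save_model("llama-ggml:q4_0", b"\x00ggml-weights")
assert load_model(tag) == b"\x00ggml-weights"
```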
It is worth noting that `bentoml.ggml` also provides an entrypoint for converting model weights from PyTorch, TensorFlow, or Hugging Face directly to GGML:
`bentoml.ggml.convert_weights_to_ggml("/path/to/weight", format: t.Literal['pt', 'tf', 'hf'] = ...)`
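One way the format dispatch behind such an entrypoint could be sketched. Only the `'pt'`/`'tf'`/`'hf'` literals come from the proposal above; the validation logic, output path, and the fact that it returns a path are assumptions:

```python
import typing as t

# The three source-format literals from the proposed signature.
Format = t.Literal["pt", "tf", "hf"]
_SUPPORTED = ("pt", "tf", "hf")

def convert_weights_to_ggml(path: str, format: Format = "pt") -> str:
    """Hypothetical sketch: validate the source format and return the
    output path a real converter might write. No conversion happens here;
    a real implementation would dispatch to a PyTorch / TensorFlow /
    Hugging Face loader and emit GGML weights."""
    if format not in _SUPPORTED:
        raise ValueError(
            f"unsupported format {format!r}; expected one of {_SUPPORTED}"
        )
    return path + ".ggml"
```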
The GGML runner will be available with CoreML, CPU, and CUDA support:
`bentoml.ggml.get().to_runner() -> GGMLRunner`
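The backend selection such a runner implies could be sketched as follows. The class body, the priority order, and the `run` method are all assumptions for illustration; only the three backend names come from the proposal:

```python
class GGMLRunner:
    """Hypothetical sketch of a runner that picks one of the backends
    mentioned above (CUDA, CoreML, CPU), preferring accelerators and
    falling back to CPU, which is always available."""

    BACKENDS = ("cuda", "coreml", "cpu")

    def __init__(self, available):
        # Walk the priority list; CPU is always a valid fallback.
        self.backend = next(
            b for b in self.BACKENDS if b in set(available) | {"cpu"}
        )

    def run(self, prompt: str) -> str:
        # A real runner would call into ggml here; this just tags the
        # prompt with the chosen backend.
        return f"[{self.backend}] {prompt}"

runner = GGMLRunner(available={"cpu"})
```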
Development for this feature will live under bentoml/OpenLLM, and I will port it back to BentoML once the API is more mature.
aarnphm