v0.1.0-rc.3
Pre-release
Automatically generated release for tag v0.1.0-rc.3.
What's Changed
- Add model adapter and multi-node inference docs by @Jeffwan in #222
- add gateway docs by @varungup90 in #232
- [Misc] add Runtime dependency for hf_transfer by @brosoul in #240
- Add validation for username and negative rpm/tpm values by @varungup90 in #241
- [CI] Merge python wheel publish process to release build pipeline by @brosoul in #247
- [CI] Push images to Github container registry by @Jeffwan in #246
- [CI] Fix post-submit container push failure by @Jeffwan in #249
- [Misc] Infer model name from model_uri and check AWS credential by @brosoul in #250
- [Misc] Add runtime API metrics by @brosoul in #251
- [doc] Update release/contribution/quickstart docs by @Jeffwan in #242
- [batch] job FIFO scheduler as baseline by @xinchen384 in #231
- [Misc] Improve the installation component sequence by @Jeffwan in #252
- Fix concurrency issue with gateway RPM plugin by @varungup90 in #244
- Improve model adapter reliability and stability by @Jeffwan in #257
- Remove underscore from dir names and remove account word in rate limiter by @varungup90 in #271
- [Misc] Use klog as the logr implementation by @Jeffwan in #264
- [CI] Unify Dockerfile names and simplify the build scripts by @Jeffwan in #263
- Improve model adapter reconcile workflow stability by @Jeffwan in #260
- Add container override for images by @varungup90 in #273
- Add AIBrix Custom Autoscaling Algorithm APA by @kr11 in #223
- Use vllm metrics for routing by @varungup90 in #274
- Update random routing section and add support for anonymous user by @varungup90 in #276
- Add image build details and examples for multi-host inference by @Jeffwan in #278
- Cut v0.1.0-rc.3 release by @Jeffwan in #280
Full Changelog: v0.1.0-rc.2...v0.1.0-rc.3