v1.0.0 official release of infctx trainer
This is a major release, adding a long list of features over the original infctx trainer:
- HF First dataset configuration (see: https://github.com/RWKV/RWKV-infctx-trainer/tree/main/notebook/dataset-config)
- DeepSpeed 3 support
- Support for world tokenizer
- Included script to initialize new models, for training models from scratch
- RWKV v5 support (to finetune upcoming models)
- BPTT support (default), for training on arbitrary data context lengths
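As a rough sketch of what the HF First dataset configuration style looks like, a minimal YAML fragment might resemble the following. The key names and dataset path here are illustrative assumptions, not the authoritative schema; see the dataset-config notebooks linked above for the actual options supported by the trainer:

```yaml
# Hypothetical dataset section of a trainer config file.
# Key names below are assumptions for illustration only;
# consult the dataset-config notebooks for the real schema.
data:
  # HuggingFace dataset path to download and tokenize from
  source: "teven/enwiki_100k"
  # Tokenizer to use, e.g. the newly supported world tokenizer
  tokenizer: "world"
```

The idea behind the HF-first approach is that you point the trainer at a HuggingFace dataset and tokenizer in the config, and the download/tokenization pipeline is handled for you, rather than preprocessing binidx files by hand.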
Thanks to everyone who helped test the trainer for bugs and issues, even when it was in a very rough early stage. While there are still some features to add, and performance and docs to improve, for the vast majority of use cases you should be able to get started with this new trainer for your finetuning (non-LoRA) needs.
Special thanks to @Blealtan @BlinkDL @Yuzaboto @Bananaman