Slow inference using Python #147
Comments
You don't need an A100, since this doesn't really make use of batching; any consumer GPU will do just fine. You will need an H100 to get to 2x realtime generation speed.
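The "2x realtime" figure above can be made concrete with a small helper. This is a generic sketch (the function name `realtime_factor` is my own, not from Zonos): a real-time factor of 2 means one second of wall-clock compute yields two seconds of audio.

```python
import time


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    # Real-time factor: seconds of audio produced per second of compute.
    # A factor of 2.0 means generating 10 s of audio takes ~5 s of wall time.
    return audio_seconds / wall_seconds


def timed_generate(generate, audio_seconds: float) -> float:
    # Wrap any generation callable and report its real-time factor.
    # `generate` is a hypothetical stand-in for the actual TTS call.
    start = time.perf_counter()
    generate()
    return realtime_factor(audio_seconds, time.perf_counter() - start)
```

Measuring this on your own hardware is more informative than comparing raw tokens/s across different model architectures.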
He never lists his t/s. He should get at least 115+ tokens/s on an A100.
What is the best zero-shot TTS voice-cloning model I can use in a project? I have an A100 with 40 GB of VRAM.
It's this one, GPT-SoVITS, and Fish Audio. I also like Vokan. A weaker clone can be fixed with RVC. All of them sound more like reading aloud than Zonos does; this one is more like Bark.
Thank you, this helps me a lot; I can use it in my project.
Yeah, Zonos is probably the best TTS right now, since it generates very high-quality, clear audio with great voice cloning in 30 languages under a very permissive Apache license. Fish-Speech, GPT-SoVITS, XTTSv2, LLaSA, and CosyVoice should be considerably faster but a bit worse, and many don't allow commercial usage.
When running the model with Python, I'm trying to automate it in my project, but model loading and inference are too slow. I'm currently using an A100 GPU.
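One common cause of "slow inference when automating" is reloading the model checkpoint on every request instead of once at startup. A minimal sketch of the load-once pattern, assuming a hypothetical `load_model`/`synthesize` pair (replace the placeholder body with the actual Zonos loading and generation calls from its README):

```python
import functools


@functools.lru_cache(maxsize=1)
def load_model():
    # Stand-in for the expensive checkpoint load (hypothetical).
    # In a real script this would load weights onto the GPU once;
    # lru_cache ensures every later call returns the same object.
    return {"name": "tts-model"}  # placeholder for the loaded model


def synthesize(text: str):
    model = load_model()  # cheap after the first call
    # Placeholder for the actual generation call on `model`.
    return f"audio for: {text}"
```

With this structure, only the first `synthesize` call pays the load cost; a long-running service or loop reuses the cached model for every subsequent request.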