Add scrolling lyrics for Karaoke videos #8

arsaboo · 2023-08-18T14:25:00Z

I'm looking for a way to create karaoke videos with scrolling lyrics. This tool works well to remove the vocals, but it would be nice to have some way to transcribe the audio and create a video with scrolling lyrics. Any examples that use other tools like Whisper would be appreciated.

beveradb · 2023-08-20T16:32:55Z

Hey @arsaboo - thanks for reaching out! Always nice to meet a fellow home automation and karaoke nerd ;)

I'm actually working on the same thing over in lyrics-transcriber and karaoke-generator 😄

The idea is to make a free and open source CLI tool which can take any local audio file (or a YouTube URL), separate the audio, transcribe the lyrics with word-level timestamps, and generate a karaoke video which is good enough for most people. It's still a work in progress though: the biggest unsolved challenge is still the lyrics transcription, unfortunately, and that part needs a ton more work.

karaoke-generator is just a wrapper around a few other projects:

yt-dlp to fetch videos from YouTube for convenience
audio-separator to separate the audio for an instrumental backing track
lyrics-transcriber which runs whisper-timestamped and attempts to sync up the transcription correctly with known lyrics fetched from Genius or Spotify. Note: this part is very much still a WIP. The transcription and lyrics fetch are both complete, but the hard part (matching these up and attempting to correct the transcribed lyrics based on the real lyrics) is something I haven't started working on yet.
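To make the "matching these up" step a bit more concrete, here is a minimal sketch of one possible approach, using Python's standard `difflib` to align transcribed words against the known lyrics. This is not how lyrics-transcriber works (that part was not written at the time of this thread); the function names and the word dictionary shape (`text`, `start`, `end`) are assumptions made purely for illustration.

```python
import re
from difflib import SequenceMatcher


def normalise(word):
    """Lower-case a word and drop punctuation so 'Darkness,' matches 'darkness'."""
    return re.sub(r"[^\w']", "", word.lower())


def correct_transcription(transcribed, reference_words):
    """Replace mis-heard words with reference words while keeping timestamps.

    transcribed: list of dicts like {"text": str, "start": float, "end": float}
                 (hypothetical shape, not the real whisper-timestamped output).
    reference_words: list of words from the known lyrics.
    """
    heard = [normalise(w["text"]) for w in transcribed]
    known = [normalise(w) for w in reference_words]
    matcher = SequenceMatcher(a=heard, b=known, autojunk=False)

    corrected = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        same_length = (i2 - i1) == (j2 - j1)
        if tag == "equal" or (tag == "replace" and same_length):
            # One-to-one mapping: trust the reference text, keep the timing.
            for i, j in zip(range(i1, i2), range(j1, j2)):
                corrected.append({**transcribed[i], "text": reference_words[j]})
        elif tag in ("replace", "delete"):
            # Ambiguous region: keep what was transcribed rather than guess.
            corrected.extend(transcribed[i1:i2])
        # "insert": reference words with no timing at all; skipped in this sketch.
    return corrected


transcribed = [
    {"text": "hello", "start": 0.0, "end": 0.5},
    {"text": "darkness", "start": 0.5, "end": 1.0},
    {"text": "my", "start": 1.0, "end": 1.2},
    {"text": "old", "start": 1.2, "end": 1.5},
    {"text": "fiend", "start": 1.5, "end": 2.0},
]
reference = "Hello darkness, my old friend".split()
result = correct_transcription(transcribed, reference)
```

The sketch deliberately leaves the difficult cases alone: words missing from the transcription get no timing, and regions where the word counts differ are passed through uncorrected.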

Once I have the lyrics transcriber working and generating reasonably accurate lyrics files in LRC and ASS formats, the generation of the actual video output will likely be a third library, e.g. python-scrolling-lyrics-renderer or something. The intention is for that part to take a lyrics file, audio file and optional background image/video, and output various formats, e.g. a video in MP4 but also a CDG+MP3 for traditional karaoke systems.
Then, the top-level karaoke-generator tool becomes a simple CLI wrapper tying it all together, leaving the individual projects usable for other people with other use cases.
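For reference, the simple line-level LRC format is just a `[mm:ss.xx]` timestamp followed by the lyric text, one line per entry. A minimal sketch of writing it might look like the following; the function names and the `(start_seconds, text)` input shape are hypothetical, not taken from any of the projects mentioned here.

```python
def format_lrc_timestamp(seconds):
    """Format a time in seconds as an LRC tag, e.g. 75.5 -> '[01:15.50]'."""
    total_cs = round(seconds * 100)          # work in whole centiseconds
    minutes, rest = divmod(total_cs, 6000)   # 6000 centiseconds per minute
    secs, cs = divmod(rest, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def to_lrc(lines):
    """Build LRC text from (start_seconds, lyric_text) pairs."""
    return "\n".join(f"{format_lrc_timestamp(start)}{text}" for start, text in lines)


lrc_text = to_lrc([
    (0.0, "Hello darkness, my old friend"),
    (75.5, "I've come to talk with you again"),
])
```

Working in integer centiseconds before splitting into minutes and seconds avoids rounding a value like 59.999 up to an invalid `60.00` seconds field.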

If you'd be interested in collaborating on the bits which aren't yet complete, that's always welcome! I'd even happily hop on a call and talk you through how things work or get you set up etc. Feel free to message me directly if you're keen.

As for examples of other tools, there are two things I know of which are probably of interest to you:

The Tuul (the-tuul.com) is an open-source web based tool for making karaoke videos, and the author is lovely :) It does audio separation (though, currently using the outdated spleeter model) and video generation using ffmpeg, but the lyrics syncing is expected to be handled by the user through the web UI.
Youka (youka.io) does everything fully automatically, and is worth installing and trying so you can see what the limitations are - it does a pretty impressive job all things considered! It's the main evidence I have that this is even viable at all. Sadly it's no longer open-source; the author now has a pretty reasonable monthly subscription pricing model to cover his server costs and dev time etc. However, you can still take some inspiration from a 3-year-old version of the code from before he pulled it, here: https://github.com/beveradb/youka-desktop/tree/master/src/lib

arsaboo · 2023-08-21T01:08:26Z

@beveradb This is phenomenal and exactly what I am after. I will check your repositories out. Hopefully, I can contribute and plug some of the holes. We can get on a call once I have had a chance to explore.

beveradb · 2023-08-21T21:40:34Z

Awesome, glad to hear :)

Hope you don't mind @arsaboo but as this isn't exactly audio-separation related, I'm going to close this issue and move discussion / progress over to nomadkaraoke/python-lyrics-transcriber#1

I've also written up a bunch more detail in that issue, e.g. a rough outline of my proposed approach to improving the quality of synced lyrics, which in my opinion is the main unsolved challenge with fully automatic karaoke generation!

(the scrolling lyrics rendering is actually already pretty much solved using ASS and ffmpeg, as seen in the_tuul by @incidentist, so will be easy to integrate into https://github.com/karaokenerds/karaoke-generator once the lyrics transcription is good enough)
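As a rough illustration of the ASS + ffmpeg approach, the sketch below builds an ASS subtitle file with per-word `\k` karaoke tags (durations in centiseconds) and an ffmpeg command that burns it onto a plain black background. It is not taken from the_tuul or karaoke-generator; the style values, file names, resolution and helper names are all assumptions, and the command is only constructed, not run.

```python
ASS_HEADER = """[Script Info]
ScriptType: v4.00+
PlayResX: 1280
PlayResY: 720

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,48,&H00FFFFFF,&H000000FF,&H00000000,&H64000000,0,0,0,0,100,100,0,0,1,2,0,2,20,20,40,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
"""


def ass_time(seconds):
    """Format seconds as an ASS timestamp, e.g. 61.25 -> '0:01:01.25'."""
    total_cs = round(seconds * 100)
    hours, rest = divmod(total_cs, 360000)
    minutes, rest = divmod(rest, 6000)
    secs, cs = divmod(rest, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def karaoke_dialogue(words):
    """Build one Dialogue line from (text, start, end) tuples for a lyric line.

    Each word's \\k duration runs until the next word starts, so any silence
    between words is absorbed and the highlight stays in sync.
    """
    parts = []
    for index, (text, start, end) in enumerate(words):
        next_start = words[index + 1][1] if index + 1 < len(words) else end
        centiseconds = round((next_start - start) * 100)
        parts.append(f"{{\\k{centiseconds}}}{text}")
    line_start, line_end = words[0][1], words[-1][2]
    return (f"Dialogue: 0,{ass_time(line_start)},{ass_time(line_end)},"
            f"Default,,0,0,0,,{' '.join(parts)}")


def build_ass(lyric_lines):
    """Combine the header with one Dialogue event per lyric line."""
    return ASS_HEADER + "\n".join(karaoke_dialogue(w) for w in lyric_lines) + "\n"


def ffmpeg_command(ass_path, audio_path, output_path):
    """Arguments to render the subtitles over a black 1280x720 background."""
    return [
        "ffmpeg", "-f", "lavfi", "-i", "color=c=black:s=1280x720:r=30",
        "-i", audio_path, "-vf", f"ass={ass_path}",
        "-c:a", "aac", "-shortest", output_path,
    ]


line = karaoke_dialogue([("Hello", 1.0, 1.5), ("world", 1.75, 2.25)])
command = ffmpeg_command("lyrics.ass", "instrumental.mp3", "karaoke.mp4")
```

The resulting list could be passed to `subprocess.run(command, check=True)` once the ASS text has been written to `lyrics.ass`; this requires an ffmpeg build with libass support.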

beveradb mentioned this issue Aug 21, 2023

Correct the synced lyrics heuristically nomadkaraoke/python-lyrics-transcriber#1

Open

beveradb closed this as completed Aug 21, 2023
