Commit
Renamed prompts, added promptfoo config for testing and iterating on LLM prompts, etc.
Showing 7 changed files with 165 additions and 59 deletions.
@@ -0,0 +1,10 @@
To get started, set your OPENAI_API_KEY environment variable.

Next, edit promptfooconfig.yaml.

Then run:
```
promptfoo eval
```

Afterwards, you can view the results by running `promptfoo view`
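
The steps in this README could be wrapped in a small Python helper, for example to fail early when the API key is missing. This is only a sketch: the function names `promptfoo_command` and `run_eval_then_view` are hypothetical and not part of the commit, and it assumes the `promptfoo` CLI is already installed and on `PATH`.

```python
import os
import subprocess


def promptfoo_command(subcommand, env=None):
    """Build the argv for `promptfoo eval` or `promptfoo view`.

    Hypothetical helper: it only covers the two subcommands the README mentions.
    """
    env = os.environ if env is None else env
    if subcommand not in ("eval", "view"):
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    # The README asks for OPENAI_API_KEY before running the evaluation.
    if subcommand == "eval" and not env.get("OPENAI_API_KEY"):
        raise RuntimeError("Set OPENAI_API_KEY before running `promptfoo eval`.")
    return ["promptfoo", subcommand]


def run_eval_then_view():
    """Run the evaluation, then open the results viewer (not called here)."""
    subprocess.run(promptfoo_command("eval"), check=True)
    subprocess.run(promptfoo_command("view"), check=True)
```

Calling `run_eval_then_view()` from the directory that contains `promptfooconfig.yaml` would mirror the two commands above.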
36 changes: 36 additions & 0 deletions
lyrics_transcriber/llm_prompts/llm_prompt_lyrics_correction_gpt_optimised_20231119.txt
@@ -0,0 +1,36 @@
You are a song lyric corrector for a karaoke video studio, specializing in correcting lyrics for synchronization with music videos. Your role involves processing lyrics inputs, making corrections, and generating JSON responses with accurate lyrics aligned to timestamps.

Task:
- Receive lyrics data inputs of varying quality.
- Use one data set to correct the other, ensuring lyrics are accurate and aligned with approximate song timestamps.
- Generate responses in JSON format, to be converted to Python dictionaries for an API endpoint.

Data Inputs:
- Reference Lyrics: Published song lyrics from various online sources, generally accurate but not flawless. Be aware of potentially missing or incorrect sections (e.g., choruses, outros).
- Transcription Segment: Automated machine transcription of a song segment, with timestamps and word confidence scores. Transcription accuracy varies (70% to 90%), with occasional misheard words or misinterpreted phrases.

Additional Context:
- When available, you'll receive the previous 2 corrected lines and the next 1 uncorrected segment for context.

Correction Guidelines:
- Take a deep breath and carefully analyze the transcription segment against the reference lyrics to find corresponding parts.
- Maintain the transcription segment if it completely matches the reference lyrics.
- Correct misheard or similar-sounding words.
- Incorporate symbols (like parentheses) into the nearest word, not as separate entries.
- Removing a word or two for accuracy is permissible.

Segment Considerations:
- Transcription segments may not align perfectly with published lyric lines due to subjective line splitting.
- Be cautious of adding words to the transcription; prioritize correction over completion.
- Avoid duplicating words already present in the "Next (un-corrected) transcript segment".

JSON Response Structure:
- id: Segment ID from input data.
- text: Corrected lyrics for the segment.
- words: List of words with the following details for each:
  - text: Correct word.
  - start: Estimated start timestamp.
  - end: Estimated end timestamp.
  - confidence: Confidence score (0-1) on word accuracy. Retain existing score if unchanged.

Focus on precision and context sensitivity to ensure the corrections are relevant and accurate. Your objective is to refine the lyrical content for an optimal karaoke experience.
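
The "JSON Response Structure" section of this prompt can be checked mechanically once a model reply arrives. The sketch below is one possible validator, not code from the repository: the function name `validate_segment_response` is hypothetical, and the checks that timestamps are numeric and that `end` is not earlier than `start` are assumptions beyond what the prompt states.

```python
import json


def validate_segment_response(raw):
    """Parse a model reply and check it against the prompt's response structure."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    for key in ("id", "text", "words"):
        if key not in data:
            raise ValueError(f"missing top-level key: {key}")
    if not isinstance(data["words"], list):
        raise ValueError("'words' must be a list")

    for index, word in enumerate(data["words"]):
        if not isinstance(word, dict):
            raise ValueError(f"word {index} must be an object")
        for key in ("text", "start", "end", "confidence"):
            if key not in word:
                raise ValueError(f"word {index} is missing key: {key}")
        # Assumption: timestamps and confidence are plain numbers.
        for key in ("start", "end", "confidence"):
            if isinstance(word[key], bool) or not isinstance(word[key], (int, float)):
                raise ValueError(f"word {index}: '{key}' must be a number")
        # The prompt specifies a confidence score in the range 0-1.
        if not 0 <= word["confidence"] <= 1:
            raise ValueError(f"word {index}: confidence out of range")
        # Assumption: a word cannot end before it starts.
        if word["end"] < word["start"]:
            raise ValueError(f"word {index}: end precedes start")
    return data
```

A reply that passes this check is already a Python dictionary, which matches the prompt's stated goal of converting the JSON for an API endpoint.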
File renamed without changes.
@@ -0,0 +1,39 @@
# This configuration runs each prompt through a series of example inputs and checks if they meet requirements.
# Learn more: https://promptfoo.dev/docs/configuration/guide

prompts:
  - file://llm_prompt_lyrics_correction_*.txt
providers: [openai:gpt-3.5-turbo-0613, openai:gpt-4-1106-preview]
tests:
  - description: First test case - automatic review
    vars:
      var1: first variable's value
      var2: another value
      var3: some other value
    # For more information on assertions, see https://promptfoo.dev/docs/configuration/expected-outputs
    assert:
      - type: equals
        value: expected LLM output goes here
      - type: contains
        value: some text
      - type: javascript
        value: 1 / (output.length + 1) # prefer shorter outputs

  - description: Second test case - manual review
    # Test cases don't need assertions if you prefer to manually review the output
    vars:
      var1: new value
      var2: another value
      var3: third value

  - description: Third test case - other types of automatic review
    vars:
      var1: yet another value
      var2: and another
      var3: dear llm, please output your response in json format
    assert:
      - type: contains-json
      - type: similar
        value: ensures that output is semantically similar to this text
      - type: model-graded-closedqa
        value: ensure that output contains a reference to X
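
To make two of the assertion types in this config more concrete, here is a rough Python approximation of what they check. It is an illustration only and not promptfoo's implementation: `length_score` mirrors the config's JavaScript expression, and `contains_json` is a hypothetical stand-in for the idea behind `contains-json` (the output includes at least one parseable JSON value), so promptfoo's actual behaviour may differ in edge cases.

```python
import json


def length_score(output):
    """Python mirror of the config's expression `1 / (output.length + 1)`.

    Shorter outputs score closer to 1. Note that JavaScript's `length` counts
    UTF-16 code units while Python's `len` counts code points, so the two can
    differ for characters outside the Basic Multilingual Plane.
    """
    return 1 / (len(output) + 1)


def contains_json(output):
    """Return True if some substring starting at '{' or '[' parses as JSON."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(output):
        if char in "{[":
            try:
                decoder.raw_decode(output, index)
                return True
            except json.JSONDecodeError:
                continue
    return False
```

For the lyrics-correction prompts, a check in the spirit of `contains_json` is the most relevant one, since the prompt asks for a JSON response.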
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "lyrics-transcriber"
-version = "0.12.6"
+version = "0.12.7"
 description = "Automatically create synchronised lyrics files in ASS and MidiCo LRC formats with word-level timestamps, using Whisper and lyrics from Genius and Spotify"
 authors = ["Andrew Beveridge <[email protected]>"]
 license = "MIT"