Commit

Deleted old example files, tweaked prompt 3
beveradb committed Nov 17, 2023
1 parent 2838dde commit 0dc22fd
Showing 3 changed files with 8 additions and 322 deletions.
210 changes: 0 additions & 210 deletions example-llm-chatcompletion-response.py

This file was deleted.

104 changes: 0 additions & 104 deletions lyrics_transcriber/example-llm-response.json

This file was deleted.

16 changes: 8 additions & 8 deletions lyrics_transcriber/llm_correction_instructions_3.txt
@@ -2,15 +2,15 @@ As a song lyric corrector for a karaoke video studio, your job involves processi
You work with two data sets: a reference data set of published lyrics and a machine-transcribed segment of a song.
Your primary task is to compare these datasets and correct the transcribed lyrics to match the reference data as closely as possible.

-Your response should be formatted in JSON, to be sent to an API endpoint. The JSON output will include:
+Your response should be formatted in JSON, to be sent to an API endpoint. The JSON output must include every field below:

-id: The identifier of the segment from the first data input.
-text: The corrected lyric text for the segment.
-words: A list containing each word in the segment, with fields for:
-- text: The correct word.
-- start: The start timestamp for the word, estimated if necessary.
-- end: The end timestamp for the word, estimated if necessary.
-- confidence: A score (0 to 1) indicating the confidence in the accuracy of the word. Retain existing confidence values for unchanged words.
+- id: The identifier of the segment from the first data input.
+- text: The corrected lyric text for the segment.
+- words: A list containing each word in the segment, with fields for:
+- text: The correct word.
+- start: The start timestamp for the word, estimated if necessary.
+- end: The end timestamp for the word, estimated if necessary.
+- confidence: A score (0 to 1) indicating the confidence in the accuracy of the word. Retain existing confidence values for unchanged words.

The reference data is generally accurate but may have imperfections or missing sections.
The transcribed data includes timestamps and confidence scores for each word, but the accuracy of the words is only about 70-90%.
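
Since the revised prompt now says the JSON output must include every field, a reply can be checked mechanically before it is used. The following is only a minimal sketch of such a check, not code from this repository: the function name `validate_segment`, the constant names, and all sample values in `EXAMPLE_REPLY` are hypothetical, and the assumption that `start`, `end`, and `confidence` are numeric is inferred from the prompt wording.

```python
import json

# Field names are taken from the prompt text in this commit.
SEGMENT_FIELDS = ("id", "text", "words")
WORD_FIELDS = ("text", "start", "end", "confidence")


def validate_segment(raw: str) -> dict:
    """Parse a model reply and check that every field the prompt lists is present."""
    segment = json.loads(raw)
    for key in SEGMENT_FIELDS:
        if key not in segment:
            raise ValueError(f"missing segment field: {key}")
    for index, word in enumerate(segment["words"]):
        for key in WORD_FIELDS:
            if key not in word:
                raise ValueError(f"word {index} is missing field: {key}")
        # The prompt describes confidence as a score from 0 to 1.
        if not 0 <= word["confidence"] <= 1:
            raise ValueError(f"word {index} has confidence outside 0..1")
        # Assumption: a word cannot end before it starts.
        if word["end"] < word["start"]:
            raise ValueError(f"word {index} ends before it starts")
    return segment


# Hypothetical reply with made-up values, shaped as the prompt describes.
EXAMPLE_REPLY = json.dumps(
    {
        "id": 7,
        "text": "hello world",
        "words": [
            {"text": "hello", "start": 12.0, "end": 12.5, "confidence": 0.9},
            {"text": "world", "start": 12.5, "end": 13.0, "confidence": 0.8},
        ],
    }
)

if __name__ == "__main__":
    checked = validate_segment(EXAMPLE_REPLY)
    print(f"segment {checked['id']} passed with {len(checked['words'])} words")
```

Raising `ValueError` on the first missing field keeps the sketch short; a caller that wants to retry the model could instead collect all problems and feed them back in a follow-up message.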
