Real-Time Translation

Overview

This project provides real-time speech translation by combining speech recognition, machine translation, and text-to-speech (TTS). It integrates several models and tools to support spoken communication between Chinese and English.

Repository Structure

melo/
openvoice/
seamless_communication/
finetune_seamless_m4t_medium.ipynb
seamless_translate.py
tri-model_translation.py

Core Functionality

melo/

Originally from MyShell AI's MeloTTS project. Customized for this task.

Contains utilities and APIs for text normalization and text-to-speech (TTS) services.

openvoice/

Originally from MyShell AI's OpenVoice project. Customized for this task.

Customization Features:

Removed watermark generation to reduce inference time
Removed Japanese, Spanish, French, and Korean support to reduce initialization time (since this project only covers Chinese to English)

Includes components for voice processing and manipulation, such as speaker extraction and tone color conversion.

seamless_communication/

Originally from Facebook's Seamless Communication project. Customized for this task.

Customization Features:

Customized training data
Customized dataset classes for the training and validation data

Focuses on integrating different modules for seamless communication, including managing audio input/output and coordinating the translation pipeline.
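The customized dataset classes themselves are not described in this README. As a rough illustration only, a training/validation dataset over paired audio clips and target text could look like the sketch below. The JSON-lines manifest format, the `audio_path` and `text` field names, and the `SpeechTranslationDataset` class are all hypothetical and not taken from this repository.

```python
import json
import random
from pathlib import Path


class SpeechTranslationDataset:
    """Map-style dataset over (audio path, target text) pairs.

    Each manifest line is assumed to look like
    {"audio_path": "clip_0001.wav", "text": "hello"}; both field names
    are hypothetical and not taken from this repository.
    """

    def __init__(self, records):
        self.records = list(records)

    @classmethod
    def from_manifest(cls, manifest_path):
        # One JSON object per non-empty line.
        lines = Path(manifest_path).read_text(encoding="utf-8").splitlines()
        return cls(json.loads(line) for line in lines if line.strip())

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        record = self.records[index]
        return record["audio_path"], record["text"]

    def split(self, val_fraction=0.1, seed=0):
        """Return (train, val) datasets taken from one seeded shuffle."""
        shuffled = self.records[:]
        random.Random(seed).shuffle(shuffled)
        n_val = max(1, int(len(shuffled) * val_fraction))
        return (
            SpeechTranslationDataset(shuffled[n_val:]),
            SpeechTranslationDataset(shuffled[:n_val]),
        )
```

Because the class only relies on `__len__` and `__getitem__`, it should be usable as a map-style dataset with a PyTorch `DataLoader`, although the actual fine-tuning code may expect a different interface.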

finetune_seamless_m4t_medium.ipynb

A Jupyter Notebook for fine-tuning the Seamless M4T model, providing an environment for customizing the model to improve performance on specific datasets.

seamless_translate.py

Main script to perform translation tasks. It initializes and manages the translation pipeline, which includes speech recognition, translation, and text-to-speech conversion.

tri-model_translation.py

Script that integrates multiple models for enhanced translation accuracy. It includes functionalities for real-time speech recognition, translation, and TTS using various pre-trained models.

Getting Started

Prerequisites

Python 3.8+
PyTorch
Transformers library by Hugging Face
Additional dependencies listed in requirements.txt
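
To confirm the prerequisites above before running anything heavier, a small check like the following may help. It is a sketch, not part of the repository: the import names `torch` and `transformers` are the usual ones for PyTorch and Hugging Face Transformers, and the function name `missing_prerequisites` is made up for this example.

```python
import importlib.util
import sys

# Minimum interpreter version stated in the prerequisites.
MIN_PYTHON = (3, 8)

# Import names assumed for the listed packages:
# PyTorch -> "torch", Hugging Face Transformers -> "transformers".
REQUIRED_MODULES = ("torch", "transformers")


def missing_prerequisites(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a list of readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d+ is required, found %d.%d"
            % (min_python + tuple(sys.version_info[:2]))
        )
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append("module %r is not installed" % name)
    return problems


if __name__ == "__main__":
    for issue in missing_prerequisites():
        print("MISSING:", issue)
```

The check only verifies that the modules can be found; it does not verify versions or the remaining entries in requirements.txt.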

Installation

Clone the repository:

git clone https://github.com/ivanhe123/real_time_translation.git
cd real_time_translation

Install the dependencies:
```
pip install -r requirements.txt
```

Running the Project

Fine-tuning the Model: Open finetune_seamless_m4t_medium.ipynb in Jupyter Notebook and follow the instructions to fine-tune the model on your dataset.
Real-Time Translation: Run seamless_translate.py to start the translation pipeline:
```
python seamless_translate.py
```
Multi-Model Translation: Run tri-model_translation.py to use the integrated multi-model approach:
```
python tri-model_translation.py
```

Detailed Workflow

Speech Recognition: Uses the Hugging Face transformers pipeline with a pre-trained Whisper model to convert speech to text.
Translation: Employs a translation model to convert the recognized text from the source language to the target language.
Text-to-Speech: Uses a TTS model to convert the translated text back into speech, facilitating real-time communication.
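
The three stages above can be sketched as a chain of callables. This is a minimal illustration under stated assumptions, not the code in seamless_translate.py or tri-model_translation.py: the `TranslationPipeline` class and `make_whisper_recognizer` helper are hypothetical names, and the `openai/whisper-small` checkpoint is a guess, since this README does not say which Whisper model is loaded.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TranslationPipeline:
    """Chains three stages: audio -> source text -> target text -> audio."""

    recognize: Callable[[Any], str]   # speech recognition
    translate: Callable[[str], str]   # machine translation
    synthesize: Callable[[str], Any]  # text-to-speech

    def run(self, audio):
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


def make_whisper_recognizer(model_name="openai/whisper-small"):
    """Build a recognizer from the Hugging Face ASR pipeline.

    The checkpoint name is an assumption, not taken from this repository.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=model_name)
    # The ASR pipeline returns a dict whose "text" key holds the transcript.
    return lambda audio: asr(audio)["text"]


if __name__ == "__main__":
    # Stub stages so the control flow can be exercised without any models.
    demo = TranslationPipeline(
        recognize=lambda audio: "你好",
        translate=lambda text: {"你好": "hello"}.get(text, text),
        synthesize=lambda text: f"<speech:{text}>",
    )
    print(demo.run(b"raw-audio-bytes"))
```

Keeping each stage behind a plain callable makes it straightforward to swap one model for another, or to test the flow with stubs as in the demo, without touching the other stages.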

Contributing

Contributions are welcome! Please create a pull request or open an issue to discuss any changes or improvements.
