How can I use the trained model to do speech enhancement? #2

Open
tuliang1996 opened this issue Mar 7, 2019 · 6 comments

Comments

@tuliang1996

I would like to feed in noisy speech, such as a wav file, and get enhanced speech as output.
However, I have not found any function for enhancement.

@lifelongeek
Owner

You can add '--mode test --load_path PATH_TO_PRETRAINED_MODEL' to the training script.

For example,
python main.py --mode test --trainer AAS --DB_name chime --rnn_size 500 --rnn_layers 4 --ASR_path ../AM_training/models/librispeech_final.pth.tar --load_path /data/kenkim/AAS_enhancement/model.pth.tar

If you'd like, I can upload a pre-trained model as well.
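As a rough illustration of how flags like these are typically consumed, here is a minimal argparse sketch. Only the flag names `--mode`, `--trainer`, and `--load_path` come from the command above; the defaults, the `parse_config` helper, and the validation rule are assumptions for illustration and may not match the repository's actual config code.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the flag names are taken from the thread.
    parser = argparse.ArgumentParser(description="sketch of train/test mode flags")
    parser.add_argument("--mode", choices=["train", "test", "visualize"], default="train")
    parser.add_argument("--trainer", default="AAS")
    parser.add_argument("--load_path", default="")
    return parser


def parse_config(argv):
    # Parse an explicit argument list so the sketch can run without a shell.
    config = build_parser().parse_args(argv)
    # Assumed rule: test mode is meaningless without a checkpoint to load.
    if config.mode == "test" and not config.load_path:
        raise ValueError("--mode test needs --load_path pointing at a trained checkpoint")
    return config


config = parse_config(["--mode", "test", "--trainer", "AAS", "--load_path", "model.pth.tar"])
print(config.mode, config.trainer, config.load_path)
```

Failing early when `--load_path` is missing is a design choice in this sketch, not something the thread says the script does.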

@tuliang1996
Author

Thank you for your reply.
I found the following code in main.py:
if (config.mode == 'train'):
    trainer.train()  # VAE
elif (config.mode == 'test'):
    trainer.test()
elif (config.mode == 'visualize'):
    trainer.visualize()
So I looked in trainer_AAS.py for these functions. I found train() but not test(), which confused me.
I will train a model myself first. If I run into problems, I will come back and ask for your help.
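For readers hitting the same confusion: in Python, a method that is not defined in a class's own file can still exist if it is inherited from a base class. The sketch below shows that general mechanism along with a dispatch that fails clearly when a mode is genuinely missing. `BaseTrainer`, `DemoTrainer`, and `run` are hypothetical names, and the thread does not confirm whether this is how trainer_AAS.py is organized.

```python
class BaseTrainer:
    # Hypothetical base class: test() lives here, not in the subclass file.
    def test(self):
        return "ran test from base class"


class DemoTrainer(BaseTrainer):
    # Hypothetical subclass: defines only train(), inherits test().
    def train(self):
        return "ran train"


def run(trainer, mode):
    # Look the method up by name so a missing mode gives a clear error.
    method = getattr(trainer, mode, None)
    if not callable(method):
        raise NotImplementedError(f"{type(trainer).__name__} has no '{mode}' method")
    return method()


print(run(DemoTrainer(), "train"))
print(run(DemoTrainer(), "test"))
```

Calling `run(DemoTrainer(), "visualize")` raises `NotImplementedError`, which is easier to diagnose than an `AttributeError` deep inside a training script.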

@tuliang1996
Author

Sorry to disturb you.
Can I add '--mode test --load_path PATH_TO_PRETRAINED_MODEL' for other models, such as FSEGAN?
Also, with the default 300 epochs and a batch size of 20 on the CHiME-4 dataset, how long will training take on a 1080 Ti?
Or can you share some details about how long your own training took?

@lifelongeek
Owner

lifelongeek commented Mar 12, 2019

  1. Training/testing FSEGAN
    You can use '--mode test --trainer FSEGAN --load_path PATH_TO_PRETRAINED_MODEL' to test an FSEGAN model. I found that main.py did not link to trainer_FSEGAN.py, so I have just added that.

  2. Training time
    For the results in the paper, I trained the model with a maximum of 100 epochs. In my case, each experiment took roughly 3 days on a Titan machine.
    The right maximum epoch depends on the model, learning algorithm, and problem complexity, but 100 may be too large for the current setting. You can watch the loss curve and stop training once there is clear overfitting on the validation data.
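The advice to stop once the validation loss shows clear overfitting can be automated with a simple patience rule. The following is a minimal sketch, not code from the repository: `should_stop`, `patience`, and `min_delta` are hypothetical names, and the loss values are made up to show the behavior.

```python
def should_stop(val_losses, patience=5, min_delta=0.0):
    """Return True if validation loss has not improved in the last `patience` epochs."""
    # Not enough history yet to judge.
    if len(val_losses) <= patience:
        return False
    # Best loss seen before the most recent `patience` epochs.
    best_before = min(val_losses[:-patience])
    # Best loss within the most recent `patience` epochs.
    recent_best = min(val_losses[-patience:])
    # Stop if the recent epochs did not beat the earlier best by at least min_delta.
    return recent_best > best_before - min_delta


# Illustrative (made-up) validation losses, one per epoch.
history = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73]
print(should_stop(history, patience=3))
```

A larger `patience` tolerates noisy validation curves at the cost of a few extra epochs; pairing the check with saving the best checkpoint lets training roll back to the epoch before overfitting began.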

@tuliang1996
Author

Thank you.
I think I need your pre-trained model.
I plan to use Chinese speech data for transfer learning.

@lifelongeek
Owner

Sorry for the late upload. Check the main page :)
