
About Learning Rate and Training Data #15

Open
ndisci opened this issue May 24, 2024 · 2 comments

Comments

@ndisci

ndisci commented May 24, 2024

Hello,

Thanks for this nice work. I have some questions. First, when I used TensorBoard to monitor the training curves, I noticed that the learning rate didn't change. Why do you use a constant learning rate instead of learning rate decay? Is there any advantage to using a constant learning rate?
When I took a look at your paper, I couldn't find any explanation of this. I am training the SpecRNet model.

My second question is about the spoof and bona fide data. How much data, or how many hours of spoof and bona fide audio, did you actually use?

Thanks for your time.

@piotrkawa
Owner

Hi,
Yes - we did not use any LR scheduling technique. In the experiments, we focused on the front-ends and the differences between them. This way, we showed that a simple change of front-end from algorithmic features (like MFCC or LFCC) to Whisper features can improve generalization.

The results can be enhanced further by using scheduling techniques, data augmentation (e.g., RawBoost), or a larger dataset (we wanted the training procedure to complete in less than 24 hours, so we used only ~100k samples).

To improve the model's results, I would use larger Whisper models and larger (more diverse) datasets.

Best,
Piotr
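
(For anyone who wants to try the scheduling idea mentioned above: below is a minimal sketch of adding learning-rate decay to a generic PyTorch training loop. The model, the dummy data, the optimizer settings, and the choice of `CosineAnnealingLR` are illustrative assumptions, not the repository's actual training code.)

```python
# Minimal sketch: adding LR decay to a generic PyTorch training loop.
# Assumptions (not the repo's actual code): `model` is any nn.Module,
# `train_loader` yields (features, label) batches, and cosine annealing
# is just one possible schedule; the original training used a constant LR.
import torch
from torch import nn, optim

model = nn.Linear(128, 2)  # stand-in for SpecRNet
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-4)

num_epochs = 10
# Decay the LR from 1e-4 toward zero over the whole run.
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

# Dummy data standing in for a real DataLoader.
train_loader = [(torch.randn(8, 128), torch.randint(0, 2, (8,)))] * 5

for epoch in range(num_epochs):
    for features, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(features), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # step once per epoch so the LR actually changes
    print(f"epoch {epoch}: lr={scheduler.get_last_lr()[0]:.2e}")
```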

@ndisci
Author

ndisci commented Jun 26, 2024

Thank you so much :) For each class, how many hours of data did you actually use? @piotrkawa
