Transliteration of the sentences from devanagari script to ILSL12 convention #1

Open
bharat-patidar opened this issue Sep 29, 2019 · 9 comments

Comments

@bharat-patidar

bharat-patidar commented Sep 29, 2019

Hi Kunal,
Your work is wonderful.
I would like to know how to transliterate sentences from Devanagari script into the ILSL12 convention. It would be great if you could guide me.

Thanks,
Bharat

@bharat-patidar bharat-patidar changed the title Hi Kunal, Transliteration of the sentences from devanagari script to ILSL12 convention Sep 29, 2019
@kkokdari

@KunalDhawan Hi Kunal, I'm curious too! I found many standards for converting Devanagari script to Roman script, but I'm not sure which one you used.

@bharat-patidar
Author

Hi @kkokdari,
You can use this parser:
https://www.iitm.ac.in/donlab/tts/unified.php

@kkokdari

@bharat-patidar Thank you for replying! Do you have the Devanagari script of the 150 sentences * 7 speakers mentioned in this repo? Or have you tested whether this unified parser produces the same output as the input used in this repo?

Thanks, Bharat Patidar!

@bharat-patidar
Author

Yes, I have used this parser, and the results are the same.
Kunal has also used the same parser.

@kkokdari

@bharat-patidar I've read the paper on this parser and have some questions about it.
(Sorry to bother you! I'm interested in Hindi speech synthesis and recognition, but after reading some papers I still cannot work out the answers to the following questions.)
1. If we are only building a Hindi speech recognition/synthesis system, why do we still need to convert the original text to Roman script?
2. The input text of this repo looks like 'aadivaasii'. Is it transliterated using the Common Phone Set? The letter 'v' doesn't exist in the Common Phone Set (see the attached image 1587384233652).
3. In lexicon.txt, for example in 'aashiirvaada aa sh ii r w aa d', is the word 'aashiirvaada' transliterated using the Common Phone Set without rules, while the phones 'aa sh ii r w aa d' are transliterated using the Common Phone Set with the rules described in the paper "A Common Attribute based Unified HTS framework for Speech Synthesis in Indian Languages" (attached as a PDF)?

Thanks in advance for your reply!
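
For anyone following question 3: the quoted lexicon.txt entry looks like a word followed by its space-separated phones. The sketch below shows one way to read such a line in Python. It assumes every line has that whitespace-separated layout, which is inferred only from the single entry quoted in this thread, and the function name `parse_lexicon_line` is hypothetical.

```python
from typing import List, Tuple


def parse_lexicon_line(line: str) -> Tuple[str, List[str]]:
    """Split one lexicon line into (word, phones).

    Assumption: the first whitespace-separated field is the word and the
    remaining fields are its phones, as in the entry quoted in this thread.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected a word and at least one phone: {line!r}")
    return fields[0], fields[1:]


word, phones = parse_lexicon_line("aashiirvaada aa sh ii r w aa d")
print(word)    # aashiirvaada
print(phones)  # ['aa', 'sh', 'ii', 'r', 'w', 'aa', 'd']
```

Seen this way, the word column keeps 'v' and the final 'a', while the phone column has 'w' and no final vowel, which is the difference the question is asking about.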

@kkokdari

kkokdari commented May 7, 2020

> Yes, I have used this parser and results are same.
> Kunal has also used the same parser.

Sorry to bother you again! @bharat-patidar I built the unified parser, but it fails with a segmentation fault when I run it like ./unified-parser 'अंगार' 1 0 0 0.

Do you know how to use it successfully?
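
When reporting a crash like this, it can help to confirm that the process really died from a signal rather than exiting with an error status. Below is a minimal Python sketch, assuming a POSIX system and a binary at ./unified-parser. The command line is copied from this comment, the meaning of the numeric arguments is not documented in this thread, and the helper names `describe_exit` and `run_parser` are hypothetical.

```python
import signal
import subprocess
from typing import List


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable message (POSIX)."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed
        # by the signal with that number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_parser(args: List[str]) -> str:
    """Run a command and report how it ended; its output is not interpreted."""
    result = subprocess.run(args, capture_output=True, text=True)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Command copied from the comment above; adjust the path to your build.
    try:
        print(run_parser(["./unified-parser", "अंगार", "1", "0", "0", "0"]))
    except OSError as exc:
        print(f"could not start ./unified-parser: {exc}")
```

A segmentation fault would show up here as "killed by SIGSEGV".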

@bharat-patidar
Author

> > Yes, I have used this parser and results are same.
> > Kunal has also used the same parser.
>
> Sorry to bother you again! @bharat-patidar I build the unified parser but failed cause of segmental fault when using it like ./unified-parser 'अंगार' 1 0 0 0.
>
> Do you know how to use it successfully?

Yes, I faced a similar error as well.

My issue was caused by the flex and bison dependencies. You have to install these libraries, as mentioned in one of the parser's files.

Also, I didn't hit this segmentation fault when I ran it on an Amazon EC2 instance.

Hope this helps!
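
A quick way to check the dependency point above before rebuilding is to see whether the flex and bison executables are on PATH. This is only a rough sketch: the helper name `missing_tools` is hypothetical, and finding the executables does not guarantee that the versions or development libraries the parser expects are installed.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the tools from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["flex", "bison"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("flex and bison are both on PATH")
```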

@bharat-patidar
Author

> @bharat-patidar I've read the paper on this parser and have some questions about it.
> (Sorry to bother you! I'm interested in Hindi speech synthesis and recognition, but after reading some papers I still cannot work out the answers to the following questions.)
> 1. If we are only building a Hindi speech recognition/synthesis system, why do we still need to convert the original text to Roman script?
> 2. The input text of this repo looks like 'aadivaasii'. Is it transliterated using the Common Phone Set? The letter 'v' doesn't exist in the Common Phone Set (see the attached image 1587384233652).
> 3. In lexicon.txt, for example in 'aashiirvaada aa sh ii r w aa d', is the word 'aashiirvaada' transliterated using the Common Phone Set without rules, while the phones 'aa sh ii r w aa d' are transliterated using the Common Phone Set with the rules described in the paper "A Common Attribute based Unified HTS framework for Speech Synthesis in Indian Languages" (attached as a PDF)?
>
> Thanks in advance for your reply!

I didn't understand your question.
If you still need help with this, we can discuss it offline.

Thanks!

@Chaitanya-Jadhav

Chaitanya-Jadhav commented May 22, 2020

> > Yes, I have used this parser and results are same.
> > Kunal has also used the same parser.
>
> Sorry to bother you again! @bharat-patidar I build the unified parser but failed cause of segmental fault when using it like ./unified-parser 'अंगार' 1 0 0 0.
>
> Do you know how to use it successfully?

Hello, have you solved this issue? I am facing the same problem!
I have also raised it on the IITM TTS group.
