Italian accented words #76

Open
a-meno opened this issue Nov 23, 2024 · 9 comments

@a-meno

a-meno commented Nov 23, 2024

Hi @met4citizen,

I absolutely love what you're doing here! :)

I'm currently experimenting with Google TTS for Italian text and voice. Although there's no official lip-sync module for Italian yet, I've found that the English module still delivers decent results.

This is my setup right now:

head.showAvatar({
  url: '....',
  body: 'M',
  avatarMood: 'neutral',
  ttsLang: "it-IT",
  ttsVoice: "it-IT-Standard-C",
  modelFPS: 60,
  lipsyncLang: 'en'
})

However, I've run into an issue that I'd like to discuss with you. When using the Google TTS API (https://eu-texttospeech.googleapis.com/v1beta1/text:synthesize), the generated voice doesn't correctly capture the intonation of Italian words with accented letters such as à, è, é, ì, ò, and ù. This affects words such as "sanità", "aiuterà", "perché", and "più".

Is there anything that can be done to address this issue?

@a-meno
Author

a-meno commented Nov 23, 2024

I believe I have identified the issue within the English lip-sync module, specifically with the following line of code:

.normalize('NFD').replace(/[\u0300-\u036f]/g, '').normalize('NFC') // Removes non-English diacritics

Removing this line appears to resolve the problem, but it might be more appropriate to develop a separate lip-sync module tailored for the Italian language. Would that be a more effective solution?
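
To make the effect of that line concrete, here is a minimal sketch that applies the same chain to the example words from the first comment. The helper name `stripAllDiacritics` is made up for illustration and is not part of the library.

```javascript
// Same normalize/replace/normalize chain as the quoted line, wrapped in a
// helper. The helper name is illustrative only.
function stripAllDiacritics(s) {
  return s
    .normalize('NFD')                  // split "à" into "a" + combining grave accent
    .replace(/[\u0300-\u036f]/g, '')   // drop every combining diacritical mark
    .normalize('NFC');                 // recompose whatever is left
}

const words = ['sanit\u00e0', 'aiuter\u00e0', 'perch\u00e9', 'pi\u00f9'];
const stripped = words.map(stripAllDiacritics);
console.log(stripped); // [ 'sanita', 'aiutera', 'perche', 'piu' ]
```

If text stripped like this is what reaches the TTS service, the voice has no accent information left to work with.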

@met4citizen
Owner

There is already an open PR for Italian lip-sync. Currently, it doesn't seem to handle diacritics, but perhaps @lupettohf can provide more insight on this.

I don't speak Italian, but I would assume that diacritics marking stress could be ignored, whereas those that affect vowel openness/closedness should probably be handled to get the best result.

Out of curiosity: I know that Italian and Finnish both have largely phonemic orthographies. Have you tried using the Finnish lip-sync module "fi"? While there are, of course, differences between the two languages, I would expect Finnish and Italian to have more in common than Italian and English.

@lupettohf

It's a Google TTS issue... Not happening when using Microsoft voices or Elevenlabs, I still need to find a solution.

@a-meno
Author

a-meno commented Nov 23, 2024

The Finnish lipsync module could potentially be better in my case, and I plan to test it further. However, it currently has the same issue with handling accents and diacritical marks due to the following line of code:

.normalize('NFD').replace(/[\u0300-\u0307\u0309\u030b-\u036f]/g, '').normalize('NFC') // Remove non-Finnish diacritics

This line strips every accent and diacritical mark that Finnish doesn't use, which poses a problem for languages like Italian, where such diacritics are crucial. In Italian, accents can completely alter the meaning of a word.
For example, "però" (/peˈrɔ/) means "however," while "pero" (/ˈpero/) means "pear tree."
These words are spelled with the same letters but have different meanings due to the presence of an accent.
There are many similar cases in Italian, highlighting the importance of preserving these diacritical marks.
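
As a rough illustration, the sketch below runs the Finnish regular expression quoted above on one Finnish and one Italian word. The ranges skip U+0308 (diaeresis) and U+030A (ring above), so ä, ö, and å survive, while the grave accent (U+0300) is removed. The helper name is hypothetical.

```javascript
// Same chain as the quoted Finnish line; the helper name is illustrative only.
function stripNonFinnishDiacritics(s) {
  return s
    .normalize('NFD')
    .replace(/[\u0300-\u0307\u0309\u030b-\u036f]/g, '') // skips U+0308 and U+030A
    .normalize('NFC');
}

const finnish = stripNonFinnishDiacritics('p\u00e4iv\u00e4'); // "päivä"
const italian = stripNonFinnishDiacritics('per\u00f2');       // "però"

console.log(finnish === 'p\u00e4iv\u00e4'); // true: the diaeresis is kept
console.log(italian);                       // 'pero': the grave accent is gone
```

After this step "però" and "pero" are the same string, so the distinction is lost before the text is sent on.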

@a-meno
Author

a-meno commented Nov 23, 2024

> It's a Google TTS issue... Not happening when using Microsoft voices or Elevenlabs, I still need to find a solution.

Hi @lupettohf!
I just tried to change this line:

.normalize('NFD').replace(/[\u0300-\u0307\u0309\u030b-\u036f]/g, '').normalize('NFC')
with this one:
.normalize('NFD').normalize('NFC')

and it seems to work fine even with Google TTS!
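
A side note on the replacement line: composing after decomposing gives the same result as a single `normalize('NFC')`, so the chain now only guarantees the precomposed form and removes nothing. A small sketch using standard `String.prototype.normalize`:

```javascript
const decomposed = 'piu\u0300';  // "u" followed by a combining grave accent
const precomposed = 'pi\u00f9';  // single code point for "ù"

const viaBoth = decomposed.normalize('NFD').normalize('NFC');
const viaNfcOnly = decomposed.normalize('NFC');

console.log(viaBoth === viaNfcOnly);  // true
console.log(viaBoth === precomposed); // true
console.log(viaBoth.length);          // 3
```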

P.S. I am also using your Italian lip-sync module locally, thank you!

@lupettohf

Thanks for the heads up, I will check it out today.

@met4citizen
Owner

met4citizen commented Nov 23, 2024

Yes, you are right. In this case you should not remove all diacritics in the preProcessText method, as the preprocessed results are sent to Google TTS, and removing them would affect pronunciation in Italian. Good catch!

However, note that if you keep the diacritics, you must handle them in the wordsToVisemes method. This can be done either by removing the diacritics before the actual processing (so that É is handled as E) or, preferably, by adding separate rulesets/rules for letters with diacritics (that is, if you keep É, you must add a ruleset specifically for "É", and so on). Otherwise, the viseme sequence will be incorrect.
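
The second option can be pictured with a deliberately simplified lookup table. This is only a sketch: the table, the function name, and the viseme values are assumptions for illustration, and it does not use the library's actual ruleset format. Whether accented and unaccented vowels should really share a viseme is the open question in this thread.

```javascript
// Hypothetical vowel-to-viseme table, NOT the library's rule format.
// Accented letters get their own entries instead of being stripped.
const VOWEL_VISEMES = {
  'A': 'aa', '\u00c0': 'aa',               // A, À
  'E': 'E',  '\u00c8': 'E', '\u00c9': 'E', // E, È, É
  'I': 'I',  '\u00cc': 'I',                // I, Ì
  'O': 'O',  '\u00d2': 'O', '\u00d3': 'O', // O, Ò, Ó
  'U': 'U',  '\u00d9': 'U',                // U, Ù
};

// Returns the visemes for the vowels of a word; consonants are ignored
// to keep the sketch short.
function vowelVisemes(word) {
  const out = [];
  for (const ch of word.normalize('NFC').toUpperCase()) {
    if (Object.hasOwn(VOWEL_VISEMES, ch)) out.push(VOWEL_VISEMES[ch]);
  }
  return out;
}

console.log(vowelVisemes('perch\u00e9')); // [ 'E', 'E' ]
```

With separate entries, an open and a closed vowel could later be mapped to different visemes by changing only the table.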

@a-meno
Author

a-meno commented Nov 23, 2024

Ok, then I will just add this line to the code:

  wordsToVisemes(w) {
  
    let wprocessed = w.replace(/[\u0300-\u036f]/g, '')
    
    let o = { words: wprocessed.toUpperCase(), visemes: [], times: [], durations: [], i: 0 };
    
    ...

Is that right?

@met4citizen
Owner

You need to wrap the replace with normalize('NFD') and normalize('NFC') as it was done in the original preProcessText method. Without this canonical decomposition/composition, it doesn't filter out the diacritics. Otherwise your change seems syntactically correct.
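
A short sketch of why the wrapping matters, assuming the input arrives in precomposed form (for example "é" as the single code point U+00E9). The function names are illustrative only.

```javascript
// Without decomposition, a precomposed "é" is not in the combining-mark
// range, so the replace matches nothing.
function stripWithoutNormalize(w) {
  return w.replace(/[\u0300-\u036f]/g, '');
}

// With NFD first, "é" becomes "e" + U+0301 and the mark can be removed.
function stripWithNormalize(w) {
  return w.normalize('NFD').replace(/[\u0300-\u036f]/g, '').normalize('NFC');
}

const word = 'perch\u00e9'; // "perché" with a precomposed é

console.log(stripWithoutNormalize(word) === word); // true: nothing was removed
console.log(stripWithNormalize(word));             // 'perche'
```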

Unfortunately, I don't know enough about Italian phonology to say whether it will produce the best result. As you probably know, there are far more phonemes than visemes, so it is common for different pronunciations to map to the same lip shape (this is essentially what makes lip-reading difficult). Whether this is the case here, with diacritics in Italian, I don't know. My guess is that if a diacritic indicates stress, it probably doesn't affect the viseme. However, for some vowels, the lip shape might also change. If that is the case, it would be better not to filter out diacritics but instead to add letters with diacritics to the conversion rules.
