Add the language to JSON payload of ASR and NLU topics #11

Closed
koenvervloesem opened this issue May 24, 2020 · 7 comments
Labels
enhancement New feature or request

Comments

@koenvervloesem
Member

koenvervloesem commented May 24, 2020

Multilingual Rhasspy apps need to know what language the original spoken command was in so that they can react to intents in the right language. I propose adding a lang attribute to the JSON payload of the following MQTT topics:

  • hermes/asr/textCaptured
  • hermes/nlu/query
  • hermes/nlu/intentParsed
  • hermes/intent/GetTime

See rhasspy/rhasspy-hermes-app#1 for the motivation for this change.

The ASR and NLU components should then fill in this lang attribute with the right language of the user's profile.
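
As an illustration of the proposal above, here is a minimal sketch of a payload carrying the extra attribute. Only `lang` comes from the proposal; the helper name `with_lang` is hypothetical, and the other field names and values (`text`, `siteId`, `sessionId`) are assumptions chosen for the example, not a statement of the exact Hermes schema.

```python
import json
from typing import Optional


def with_lang(payload: dict, lang: Optional[str]) -> str:
    """Return the payload as a JSON string with a "lang" attribute added."""
    return json.dumps({**payload, "lang": lang})


# Hypothetical payload for hermes/asr/textCaptured; only "lang" is the proposed addition.
captured = {"text": "hoe laat is het", "siteId": "default", "sessionId": "abc123"}
message = with_lang(captured, "nl")
```

An app subscribed to the topic could then read `json.loads(message)["lang"]` to choose the language of its response.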

@Romkabouter

Good idea, I hope to have more spare time to be more actively involved :)

@koenvervloesem
Member Author

@synesthesiam I'm testing my Dutch Rhasspy setup and I see "lang": null in the intent messages. Is this an implementation error, or are the language attributes not yet set by the dialogue manager?

@synesthesiam
Contributor

Right now, the lang attribute is copied by the dialogue manager (and all other services). I wasn't sure where we should initially set it, so the None is just being propagated everywhere.

Whose job do you think it is to set the language? Should it be hinted at by the wake word? Or is the ASR responsible?

@koenvervloesem
Member Author

The ASR seems to me the most sensible choice. Wake words could be language-independent, and a wake word isn't necessary (if you start a session with a button). But it should also work if you enter text in the web interface for intent recognition.

@koenvervloesem
Member Author

koenvervloesem commented Jan 21, 2021

I have been thinking about this. The logic could be: every service in the chain of messages looks at the current lang attribute. If it's already set to a non-null value, the service just propagates it. If it's null and the service wants to set it, it can do so, and the new value will be propagated everywhere. If no service sets the value, you get a null value at the end, which means the language is the default language, English. The hotword service could also set the language if the user wants different hotwords to activate sessions in different languages, but if it leaves the language unspecified, later services can still set it. Something like this?

An alternative way could be that services are able to override non-null language attributes with another non-null language attribute (e.g. the hotword service sets the language to Dutch but the ASR overrides it to English), but I don't think this makes much sense, and it could complicate matters.
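
The first rule described above (a non-null value is never overridden, a null value may be filled in, and a final null means English) could be sketched roughly as follows. The function names and the `DEFAULT_LANG` constant are hypothetical and not part of any Rhasspy library; each list entry stands for the language one service in the chain would like to set, or `None` if it has no opinion.

```python
from typing import Iterable, Optional

DEFAULT_LANG = "en"  # assumption taken from the thread: a final null means English


def propagate_lang(incoming: Optional[str], own: Optional[str] = None) -> Optional[str]:
    """Keep a non-null incoming lang; otherwise let the service set its own."""
    return incoming if incoming is not None else own


def resolve_chain(service_langs: Iterable[Optional[str]]) -> Optional[str]:
    """Pass lang through each service in order, starting from null."""
    lang: Optional[str] = None
    for own in service_langs:
        lang = propagate_lang(lang, own)
    return lang


def effective_lang(lang: Optional[str]) -> str:
    """Interpret a null lang at the end of the chain as the default language."""
    return lang if lang is not None else DEFAULT_LANG
```

With a chain such as hotword, ASR, NLU, `resolve_chain([None, "nl", "en"])` keeps the first non-null value, so a later service cannot override the ASR's choice. The alternative mentioned above, where a later service may override, would amount to dropping the `incoming is not None` check.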

@synesthesiam
Contributor

Yes, this is what I was thinking too (propagate if non-null, optionally set otherwise)!

If we let the wake word service set lang in this scenario, each ASR service could listen to audio only when the utterance that follows is in its language. Alternatively, future ASR systems could implement language detection and set lang themselves for downstream NLU.
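
The wake word scenario could look something like the sketch below, where each ASR service checks the session's lang before listening. The function name is hypothetical, and treating a null lang as English is an assumption carried over from the default discussed earlier in this thread.

```python
from typing import Optional

DEFAULT_LANG = "en"  # assumption: a null lang is treated as the default language, English


def should_transcribe(session_lang: Optional[str], asr_lang: str) -> bool:
    """Return True if an ASR service for asr_lang should listen to this session."""
    effective = session_lang if session_lang is not None else DEFAULT_LANG
    return effective == asr_lang
```

Under these assumptions a Dutch ASR service would listen only to sessions where the wake word service set lang to "nl", and sessions without a lang would fall to the English ASR service.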

@synesthesiam
Contributor

This will be in 2.5.10

3 participants