Phonetic Entity Matcher #29

samhavens · 2019-06-02T20:09:21Z

Users misspell things. Having spell-check and synonyms helps a lot, but doesn't catch everything.

One solution would be to use the Double Metaphone (DM) algorithm, as implemented in the Python metaphone package.

At component train time, it could look at the normal entity lists, find the DM representation of all the synonyms, and store them. At runtime, it would send the user input through DM, and look for matches. This is substantially easier than, for example, the composite entity matcher we've built, so I'm tagging this as "good first issue."
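A minimal sketch of the train-time index and runtime lookup described above. The class name PhoneticEntityMatcher, the entity layout ({entity type: {value: [synonyms]}}), and the n-gram scan are all assumptions for illustration, not anything that exists in this repo. To stay runnable without extra installs it uses a crude stand-in encoder (crude_codes), which is not Double Metaphone; the encoder is pluggable, so a real DM function that returns a tuple of codes could be passed in instead.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set, Tuple

# An encoder maps one word to a tuple of phonetic codes (primary first).
Encoder = Callable[[str], Tuple[str, ...]]


def crude_codes(word: str) -> Tuple[str, ...]:
    """Stand-in encoder, NOT Double Metaphone: keeps the first letter,
    drops later vowels (and y), and collapses adjacent duplicates."""
    letters = [c for c in word.lower() if c.isalpha()]
    out: List[str] = []
    for i, c in enumerate(letters):
        if i > 0 and c in "aeiouy":
            continue
        if out and out[-1] == c:
            continue
        out.append(c)
    return ("".join(out).upper(),)


class PhoneticEntityMatcher:
    """Hypothetical component: index entity synonyms by phonetic key."""

    def __init__(self, encode: Encoder = crude_codes) -> None:
        # Assumption: a real DM function (for example doublemetaphone from
        # the metaphone package) could be passed here instead.
        self.encode = encode
        self.index: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def _keys(self, phrase: str) -> Set[str]:
        # Encode each word, dropping empty codes and words with no code.
        per_word = [tuple(c for c in self.encode(w) if c) for w in phrase.split()]
        per_word = [codes for codes in per_word if codes]
        if not per_word:
            return set()
        width = max(len(codes) for codes in per_word)
        # The i-th key uses each word's i-th code, falling back to its primary.
        return {
            " ".join(codes[i] if i < len(codes) else codes[0] for codes in per_word)
            for i in range(width)
        }

    def train(self, entities: Dict[str, Dict[str, Iterable[str]]]) -> None:
        # Train time: store the phonetic key of every value and synonym.
        for entity_type, values in entities.items():
            for value, synonyms in values.items():
                for surface in (value, *synonyms):
                    for key in self._keys(surface):
                        self.index[key].add((entity_type, value))

    def match(self, text: str, max_words: int = 3) -> List[Tuple[str, str, str]]:
        # Runtime: encode every n-gram of the input and look it up.
        tokens = text.split()
        found: List[Tuple[str, str, str]] = []
        for n in range(min(max_words, len(tokens)), 0, -1):
            for start in range(len(tokens) - n + 1):
                span = " ".join(tokens[start:start + n])
                for key in self._keys(span):
                    for entity_type, value in sorted(self.index.get(key, ())):
                        hit = (span, entity_type, value)
                        if hit not in found:
                            found.append(hit)
        return found


matcher = PhoneticEntityMatcher()
matcher.train({"food": {"pizza": ["pie"], "jalapeno": []}})
print(matcher.match("I want a piza"))  # [('piza', 'food', 'pizza')]
```

With a real DM encoder the keys would differ, but the index-then-lookup shape should stay the same. Short keys (like the one for "pie" here) will over-match, so a real component would probably want a minimum key length or a ranking step.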

The most benefit here would be for bots that:

1. have lots of users who are non-native speakers of the language
2. are powering an Alexa skill

(2) is not an original idea. The 2018 Alexa Prize winner used a similar trick (see their paper). Basically, you try to parse everything, and if that doesn't work, you try again using phonetics. This works well because ASR is imperfect. They call this "ASR Correction." For the lazy, here is what they have to say:

ASR error has a huge impact on NLU quality. ASK provides an overall ASR confidence score by incorporating both the confidence score for each word and the score generated by a language model. The overall score indicates how likely the whole utterance is recognized correctly. However, there are two types of false positives that may trigger error handling signaling ASR errors when the confidence score is low. The first one is when the word mentioned is not frequently seen in the training data, so the word receives a low weight. Another instance of a false positive is with homophone words that the ASR cannot capture even if the user repeats their request.

We used the double metaphone algorithm [18] to compare the noun phrases (ignoring the most frequent stopwords) mentioned by the user and a knowledge base. The knowledge base includes both the context and the domain (e.g. sports genres, movies titles and game names). We stored the primary and the secondary code of the double metaphone for each word as a key with the word as the value. We also added a tertiary code to words with certain patterns based on observations (e.g. the word “jalapeno” will have a tertiary code “HLPN” in addition to “JLPN” and “ALPN”). If the overall confidence from ASR is below a threshold (set to 0.4), we propose a candidate by matching the metaphone code of the noun phrases to that of the knowledge base. For example, we propose the topic “obscure holidays” and in the next turn we receive the ASR input from the user requests, “let’s talk about secure holiday”. The primary code for “secure holiday” is “SKRLT” and the primary code for “obscure holidays” is “APSKRLTS”. Since it is very likely that the ASR does not detect the beginning or the ending of a phrase, we know that this might be a map with relatively high confidence. Another example is “let’s talk about the sport high ally” received from ASR. Because we know the context of sports, we map the code of “high ally” to the code of “jai alai” in the knowledge base (sports list).
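A rough sketch of the low-confidence fallback the excerpt describes. The 0.4 threshold and the two codes ("SKRLT", "APSKRLTS") are taken from the quote; everything else is an assumption: the function name propose_candidate is hypothetical, and treating "the ASR missed the beginning or ending" as a substring test scored by length ratio is my guess at one way to do it, not the paper's stated method.

```python
from typing import Dict, Optional

ASR_CONFIDENCE_THRESHOLD = 0.4  # threshold quoted in the paper excerpt


def propose_candidate(
    heard_code: str,
    confidence: float,
    kb_codes: Dict[str, str],
    threshold: float = ASR_CONFIDENCE_THRESHOLD,
) -> Optional[str]:
    """Return a knowledge-base phrase whose code contains the heard code.

    Only runs when ASR confidence is below the threshold. Substring
    containment and the length-ratio score are assumptions, not the
    paper's documented matching rule.
    """
    if confidence >= threshold or not heard_code:
        return None
    best_phrase: Optional[str] = None
    best_score = 0.0
    for code, phrase in kb_codes.items():
        if heard_code in code:
            # Prefer candidates where the heard code covers more of the KB code.
            score = len(heard_code) / len(code)
            if score > best_score:
                best_phrase, best_score = phrase, score
    return best_phrase


# Codes copied from the excerpt; the knowledge base here is a toy example.
kb = {"APSKRLTS": "obscure holidays"}
print(propose_candidate("SKRLT", 0.3, kb))  # obscure holidays
print(propose_candidate("SKRLT", 0.9, kb))  # None
```

For the matcher proposed in this issue, something like this could run as a second pass only when exact phonetic lookup finds nothing.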

samhavens added the "enhancement" (New feature or request) and "good first issue" (Good for newcomers) labels on Jun 2, 2019
