people() seems to get confused by commas #1111

Closed
sandro-pasquali opened this issue Jun 3, 2024 · 3 comments

Comments

@sandro-pasquali

Great library!

Loving the various entity extraction utilities. They work great. One I use is people(). However, it seems to be unable to separate a list of comma-separated names into individual names, at least in this case.

This is what I'm seeing [ Node.js 22, macOS, "compromise": "^14.13.0" ]:

import Nlp from 'compromise';

const text = `The NAACP’s founding members included white progressives Mary White Ovington, Henry Moskowitz, William English Walling and Oswald Garrison Villard, along with such African Americans as W.E.B. Du Bois, Ida B. Wells, Archibald Grimke and Mary Church Terrell.`;

const processed = Nlp(text);
console.log(processed.people().out('array'));

// [
//     'Mary White Ovington, Henry Moskowitz, William English Walling',
//     'Oswald Garrison Villard,',
//     'Ida B. Wells, Archibald Grimke',
//     'Mary Church Terrell.'
// ]
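
For anyone stuck on an affected version, one possible stopgap is to post-process the array that people() returns. The sketch below is only an illustration: `splitPeople` is a hypothetical helper, not part of compromise, and it assumes names are separated by commas or a standalone "and". It runs on the array reported above, so it does not need the library installed.

```javascript
// Hypothetical helper (not part of compromise): split each extracted
// string on commas and on a standalone "and", then tidy the pieces.
function splitPeople(matches) {
  return matches
    .flatMap((match) => match.split(/\s*,\s*|\s+and\s+/))
    // Strip leftover sentence punctuation such as a trailing period.
    .map((name) => name.replace(/[.,;:]+$/, '').trim())
    // Drop empty pieces left behind by a trailing comma.
    .filter((name) => name.length > 0);
}

// The array reported in this issue for "compromise": "^14.13.0".
const reported = [
  'Mary White Ovington, Henry Moskowitz, William English Walling',
  'Oswald Garrison Villard,',
  'Ida B. Wells, Archibald Grimke',
  'Mary Church Terrell.',
];

console.log(splitPeople(reported));
```

This is a blunt instrument: stripping trailing punctuation would also remove the period from a name that legitimately ends in one (for example a "Jr." suffix), and it does nothing for names the tagger misses entirely.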

As a side note, you can also see it isn't catching W.E.B. Du Bois, but that seems like a complex pattern, and I'm guessing the best option here would be to add it to a custom lexicon.

Thanks again for compromise!

@spencermountain
Owner

hey Sandro - good catch!
Happy to fix this for the next release.
cheers

@spencermountain
Owner

got it fixed in 14.14.0!
cheers

@sandro-pasquali
Author

I'd like to acknowledge and celebrate the value of the work you do to build and maintain this excellent library, so I'll share a concrete example of the positive impact your diligence has.

This is the test output I was seeing which prompted my original question:

  people: [
    'Mary White Ovington Henry Moskowitz William English Walling',
    'Oswald Garrison Villard',
    'Ida B Wells Archibald Grimke',
    'Mary Church Terrell'
  ],

Then you released 14.14.0, and I updated to that version. I did nothing else.

This is now the test output:

  people: [
    'Mary White Ovington',
    'Henry Moskowitz',
    'William English Walling',
    'Oswald Garrison Villard',
    'Ida B Wells',
    'Archibald Grimke',
    'Mary Church Terrell'
  ],

Happy start to the day. Thank you.
