Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(plugins/dates): validate days and months in mm/dd format #1129

Merged
merged 8 commits into from
Jul 25, 2024

Conversation

Fdawgs
Copy link
Contributor

@Fdawgs Fdawgs commented Jul 24, 2024

At present, compromise-dates tags anything 2-digit/2-digit as a Date:

'use strict'

const nlp = require('compromise')
const nlpDates = require('compromise-dates')
nlp.plugin(nlpDates)

const text = 'I saw him 20/24'

const doc = nlp(text)
console.log(doc.out('tags'))
/*
  Outputs:
  [
    {
      i: [ 'Noun', 'Pronoun' ],
      saw: [ 'Verb', 'PastTense' ],
      him: [ 'Noun', 'Pronoun' ],
      '20/24': [ 'Date' ]
    }
  ]
*/

This PR updates the compromise-dates plugin to perform some minor validation of days and months when provided in the mm/dd format, by ensuring the months are between 01-12 and the days are between 01-31.

@Fdawgs
Copy link
Contributor Author

Fdawgs commented Jul 24, 2024

Tests are failing because mac-os-latest is now using ARM.
See fastify/.github#37

@hchiam
Copy link
Contributor

hchiam commented Jul 24, 2024

i like the idea of tightening up the date detection, but i wonder if some english documents ever have the day before the month but specifically without the year, as in dd/mm? i can anecdotally say i've encountered mm/dd and dd/mm/yy etc on physical displays and digital documents, but i'm having no luck finding exact sources online of counter-examples using dd/mm

@hchiam
Copy link
Contributor

hchiam commented Jul 24, 2024

i like the idea of tightening up the date detection, but i wonder if some english documents ever have the day before the month but specifically without the year, as in dd/mm? i can anecdotally say i've encountered mm/dd and dd/mm/yy etc on physical displays and digital documents, but i'm having no luck finding exact sources online of counter-examples using dd/mm

just found an explicit description: https://www.grammarly.com/blog/how-to-write-dates/#:~:text=writing%20dates%20as%20numerals

So remember, if you are American and you write to your British friend inviting him to celebrate Independence Day on 7/4 with you, you can expect your guest to arrive on April 7 (which he will express as 7 April). Likewise, if he invites you to his Guy Fawkes Day party on 5/11, you will need to mark your calendar for November 5 rather than May 11.

@Fdawgs
Copy link
Contributor Author

Fdawgs commented Jul 25, 2024

@hchiam British dates aren't fully supported as stated in the readme.

Example:

const doc = nlp('I have to go to the hellhole that is London on 12/08, by which I mean August').dates().get();

console.log(doc);
/*
  [{ start: '2024-12-08', end: '2024-12-08' }]
*/

I assume the dates function just needs to be updated with a config param to pass the dmy: true option to the underlying spacetime module. But then that no doubt needs additional unit tests etc.

@spencermountain spencermountain changed the base branch from master to dev July 25, 2024 13:58
@spencermountain spencermountain merged commit 59a93ae into spencermountain:dev Jul 25, 2024
0 of 6 checks passed
@Fdawgs Fdawgs deleted the patch-1 branch July 25, 2024 13:59
@spencermountain
Copy link
Owner

hey thanks - yeah, let's tighten it up. ya, definitely the best outcome would be passing a config to compromise, on how best to handle ambiguous dates.

@hchiam
Copy link
Contributor

hchiam commented Jul 25, 2024 via email

@spencermountain
Copy link
Owner

good point Howard - ya, the blurriness in the docs really demonstrates the lack of clarity in the project. As Frasier mentioned, a lot of the date-parsing is done with spacetime, which does an ok job at british formats - and this compromise-dates plugin could simply pass these british-dates through - it's very doable, and would be a welcome PR.

If I remember:

  • 03/26 - march 26
  • 26/03 - march 26
  • 03/04 - march 4
  • 03/04 + {dmy: true} -april 3rd

open to either of you, if you've got some spare cycles. I can put it on my list for the future, otherwise.
cheers

@hchiam
Copy link
Contributor

hchiam commented Jul 25, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants