This repository has been archived by the owner on Aug 13, 2020. It is now read-only.

tryParseEnglish "twentytwenty" returns 40 #28

Open
ploeh opened this issue Feb 3, 2016 · 5 comments

Comments

@ploeh
Owner

ploeh commented Feb 3, 2016

tryParseEnglish "twentytwenty" returns Some 40, which is surprising, to say the least. This input was never an explicit test case, but pairing two numerals like this is a fairly standard idiom in English, particularly when referring to years:

  • nineteen eighty-four (1984)
  • twenty-sixteen (2016)
  • fourteen fifty-three (1453)

There are two potential ways to address such numbers:

  1. If they are unambiguous, a better result from tryParseEnglish "twentytwenty" would be Some 2020. While I suspect that they are unambiguous, a single counter-example would be enough to prove me wrong, and anyone is welcome to provide one.
  2. If such numerals are ambiguous, the correct return value would be None.
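
To make the candidate behaviours concrete, here is a minimal sketch. It is written in Python rather than the library's F#, covers only numerals below one hundred, and is not the library's implementation. Every name in it (`naive_sum`, `parse_strict`, `parse_paired`) is hypothetical, `Optional[int]` stands in for the F# option type, and the summing function is only an assumption about how a result like 40 could arise.

```python
from typing import List, Optional

# Vocabulary for numerals below one hundred (an illustrative subset only).
UNITS = dict(zip("one two three four five six seven eight nine".split(), range(1, 10)))
TEENS = dict(zip(("ten eleven twelve thirteen fourteen fifteen sixteen "
                  "seventeen eighteen nineteen").split(), range(10, 20)))
TENS = dict(zip("twenty thirty forty fifty sixty seventy eighty ninety".split(),
                range(20, 100, 10)))
WORDS = {**UNITS, **TEENS, **TENS}


def tokenize(text: str) -> Optional[List[str]]:
    """Split text into numeral words by greedy longest match; None if stuck."""
    s = text.lower().replace("-", "").replace(" ", "")
    tokens: List[str] = []
    while s:
        word = max((w for w in WORDS if s.startswith(w)), key=len, default=None)
        if word is None:
            return None
        tokens.append(word)
        s = s[len(word):]
    return tokens


def group_value(tokens: List[str]) -> Optional[int]:
    """Value of one well-formed numeral below 100, otherwise None."""
    if len(tokens) == 1:
        return WORDS[tokens[0]]
    if len(tokens) == 2 and tokens[0] in TENS and tokens[1] in UNITS:
        return TENS[tokens[0]] + UNITS[tokens[1]]
    return None


def naive_sum(text: str) -> Optional[int]:
    """Assumed shape of the surprising behaviour: add up every word."""
    tokens = tokenize(text)
    return None if tokens is None else sum(WORDS[t] for t in tokens)


def parse_strict(text: str) -> Optional[int]:
    """Option 2: anything that is not a single numeral yields None."""
    tokens = tokenize(text)
    return group_value(tokens) if tokens else None


def parse_paired(text: str) -> Optional[int]:
    """Option 1: read two adjacent two-digit numerals as hundreds and remainder."""
    tokens = tokenize(text)
    if not tokens:
        return None
    single = group_value(tokens)
    if single is not None:
        return single
    for i in range(1, len(tokens)):
        hi, lo = group_value(tokens[:i]), group_value(tokens[i:])
        if hi is not None and lo is not None and hi >= 10 and lo >= 10:
            return hi * 100 + lo
    return None


print(naive_sum("twentytwenty"))             # 40
print(parse_strict("twentytwenty"))          # None
print(parse_paired("twentytwenty"))          # 2020
print(parse_paired("nineteen eighty-four"))  # 1984
```

Under these assumptions, `parse_paired` only accepts a pair when both halves are at least ten, so an input such as "one twenty" still yields `None` rather than a guess.
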
@ploeh
Owner Author

ploeh commented Feb 3, 2016

Similar to #29

@ncave
Contributor

ncave commented Feb 7, 2016

Is that really a bug? We're not parsing years; we're parsing numbers. There are many other shorthands for particular years, such as the "sixties" or the Chinese years, that we're not parsing either. My point is that years are not pure numbers.

@ploeh
Owner Author

ploeh commented Feb 10, 2016

we're parsing numbers.

Agreed. These numbers may be years, but they may also be other types of numbers.

According to that argument, tryParseEnglish "twentytwenty" should not return Some 2020. On the other hand, neither should it return Some 40, so I'm still inclined to consider the current implementation defective.

@ncave
Contributor

ncave commented Feb 10, 2016

What should it return then: a parsing error, or a list of numbers (if we can detect a logical separation)? It's unusual to have a sequence of numbers without some separator, unless its structure is known beforehand (i.e. treating a number as a list of numbers of a certain fixed size, e.g. 1 or 2).
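
The "list of numbers" alternative could look roughly like the sketch below. It is Python rather than F#, handles only numerals below one hundred, and the function name `split_numerals` is hypothetical. It assumes the only "logical separation" available is grammatical (a tens word may absorb one following unit word), because spaces and hyphens are discarded before parsing.

```python
from typing import List, Optional

# Vocabulary for numerals below one hundred (an illustrative subset only).
UNITS = dict(zip("one two three four five six seven eight nine".split(), range(1, 10)))
TEENS = dict(zip(("ten eleven twelve thirteen fourteen fifteen sixteen "
                  "seventeen eighteen nineteen").split(), range(10, 20)))
TENS = dict(zip("twenty thirty forty fifty sixty seventy eighty ninety".split(),
                range(20, 100, 10)))
WORDS = {**UNITS, **TEENS, **TENS}


def split_numerals(text: str) -> Optional[List[int]]:
    """Return the sequence of sub-100 numerals found in text, or None on unknown input."""
    s = text.lower().replace("-", "").replace(" ", "")
    values: List[int] = []
    open_tens = False  # True while the last value is a bare tens word
    while s:
        # Greedy longest match, so "sixteen" wins over "six".
        word = max((w for w in WORDS if s.startswith(w)), key=len, default=None)
        if word is None:
            return None
        if word in UNITS and open_tens:
            values[-1] += UNITS[word]  # "twenty" + "one" -> 21
            open_tens = False
        else:
            values.append(WORDS[word])  # start a new numeral
            open_tens = word in TENS
        s = s[len(word):]
    return values


print(split_numerals("twentytwenty"))          # [20, 20]
print(split_numerals("nineteen eighty-four"))  # [19, 84]
print(split_numerals("twenty-one"))            # [21]
```

A caller could then decide what a multi-element result means, for example by treating any list longer than one element as a parse failure.
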

@ploeh
Owner Author

ploeh commented Feb 10, 2016

If such numerals are ambiguous, the correct return value would be None.
