📅 Fix date representation (#1470)
Co-authored-by: Rowan Cockett <rowanc1@gmail.com>
agoose77 and rowanc1 authored Aug 22, 2024
1 parent 4e880f3 commit da224b7
Showing 15 changed files with 186 additions and 88 deletions.
7 changes: 7 additions & 0 deletions .changeset/old-rats-help.md
@@ -0,0 +1,7 @@
---
"simple-validators": minor
"myst-frontmatter": patch
"mystmd": patch
---

Reduce scope of date parsing, and validate to ISO8601
18 changes: 7 additions & 11 deletions docs/frontmatter.md
@@ -535,19 +535,15 @@ affiliations:

## Date

The date field is a string and should conform to a valid Javascript data format. Examples of acceptable date formats are:

- `2021-12-14T10:43:51.777Z` - [an ISO 8601 calendar date extended format](https://262.ecma-international.org/11.0/#sec-date-time-string-format), or
- `14 Dec 2021`
- `14 December 2021`
- `2021, December 14`
- `2021 December 14`
- `12/14/2021` - `MM/DD/YYYY`
- `12-14-2021` - `MM-DD-YYYY`
- `2022/12/14` - `YYYY/MM/DD`
The date field is a string and should conform to a well-defined calendar date. Examples of acceptable date formats are:

- `2022-12-14` - `YYYY-MM-DD`
- `01 Jan 2000` - `DD? MON YYYY`
- `Sat, 1 Jan 2000` - `DAY, DD? MON YYYY`

Where the latter examples in that list are valid [IETF timestamps](https://datatracker.ietf.org/doc/html/rfc2822#page-14).
These dates correspond to two main formats:
- A strict (full, extended) calendar date defined by [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) (see also [RFC 3339](https://datatracker.ietf.org/doc/html/rfc3339))
- A date-only variant of [RFC 2822](https://datatracker.ietf.org/doc/html/rfc2822), built using the RFC grammar rules.
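
As a rough illustration of the two formats listed above, the following TypeScript sketch normalizes either form to `YYYY-MM-DD`. The names `ISO_DATE`, `RFC2822_DATE`, `MONTHS`, and `normalizeDate` are hypothetical and are not part of the MyST API. The sketch only checks the shape of the string; it does not verify that the day exists in the given month.

```typescript
// Hypothetical helpers for illustration only; not the MyST implementation.

// Strict ISO 8601 calendar date: YYYY-MM-DD
const ISO_DATE = /^(\d{4})-(\d{2})-(\d{2})$/;

// Date-only variant of RFC 2822: optional "DAY," then "DD? MON YYYY"
const RFC2822_DATE =
  /^(?:(Mon|Tue|Wed|Thu|Fri|Sat|Sun),)?\s*(\d{1,2})\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(\d{4})$/;

const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Zero-pad a number below 10 to two digits
const pad = (n: number): string => (n < 10 ? `0${n}` : `${n}`);

function normalizeDate(input: string): string | undefined {
  const iso = ISO_DATE.exec(input);
  if (iso) {
    return `${iso[1]}-${iso[2]}-${iso[3]}`;
  }
  const rfc = RFC2822_DATE.exec(input);
  if (rfc) {
    const month = MONTHS.indexOf(rfc[3]) + 1; // "Jan" -> 1
    const day = parseInt(rfc[2], 10);
    return `${rfc[4]}-${pad(month)}-${pad(day)}`;
  }
  // Anything else is outside the two documented formats
  return undefined;
}
```

With this sketch, `normalizeDate('Sat, 1 Jan 2000')` and `normalizeDate('01 Jan 2000')` both yield `2000-01-01`, while a string such as `12/14/2021` yields `undefined`.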

(frontmatter:exports)=

4 changes: 2 additions & 2 deletions packages/myst-frontmatter/src/page/page.yml
@@ -40,7 +40,7 @@ cases:
a: b
subtitle: sub
short_title: short
date: 14 Dec 2021
date: '14 Dec 2021'
kernelspec: {}
jupytext: {}
keywords:
@@ -84,7 +84,7 @@ cases:
macro: b
subtitle: sub
short_title: short
date: 14 Dec 2021
date: '2021-12-14'
kernelspec: {}
jupytext: {}
keywords:
4 changes: 2 additions & 2 deletions packages/myst-frontmatter/src/project/project.yml
@@ -27,7 +27,7 @@ cases:
affiliations:
- id: univa
name: University A
date: 14 Dec 2021
date: '14 Dec 2021'
doi: 10.1000/abcd/efg012
arxiv: https://arxiv.org/example
open_access: true
@@ -76,7 +76,7 @@ cases:
affiliations:
- id: univa
name: University A
date: 14 Dec 2021
date: '2021-12-14'
doi: 10.1000/abcd/efg012
arxiv: https://arxiv.org/example
open_access: true
2 changes: 1 addition & 1 deletion packages/mystmd/tests/indices/outputs/first.json
@@ -22,7 +22,7 @@
"id": "contributors-myst-generated-uid-0"
}
],
"date": "20 Jul 2024",
"date": "2024-07-20",
"affiliations": [{ "id": "Curvenote", "name": "Curvenote" }],
"exports": [
{
2 changes: 1 addition & 1 deletion packages/mystmd/tests/indices/outputs/index.json
@@ -22,7 +22,7 @@
"id": "contributors-myst-generated-uid-0"
}
],
"date": "20 Jul 2024",
"date": "2024-07-20",
"affiliations": [{ "id": "Curvenote", "name": "Curvenote" }],
"exports": [
{
2 changes: 1 addition & 1 deletion packages/mystmd/tests/indices/outputs/joke.json
@@ -22,7 +22,7 @@
"id": "contributors-myst-generated-uid-0"
}
],
"date": "20 Jul 2024",
"date": "2024-07-20",
"affiliations": [{ "id": "Curvenote", "name": "Curvenote" }],
"exports": [{ "format": "md", "filename": "joke.md" }]
},
2 changes: 1 addition & 1 deletion packages/mystmd/tests/indices/outputs/recipes.json
@@ -22,7 +22,7 @@
"id": "contributors-myst-generated-uid-0"
}
],
"date": "20 Jul 2024",
"date": "2024-07-20",
"affiliations": [{ "id": "Curvenote", "name": "Curvenote" }],
"exports": [
{
2 changes: 1 addition & 1 deletion packages/mystmd/tests/math-macros/outputs/index.json
@@ -25,7 +25,7 @@
],
"github": "https://github.com/jupyter-book/mystmd",
"keywords": [],
"date": "1 Jan 2024"
"date": "2024-01-01"
},
"mdast": {
"type": "root",
2 changes: 1 addition & 1 deletion packages/mystmd/tests/raw/outputs/index.json
@@ -8,7 +8,7 @@
"title": "Project with raw content",
"content_includes_title": false,
"authors": [{ "id": "Franklin Koch", "name": "Franklin Koch" }],
"date": "9 Aug 2024",
"date": "2024-08-09",
"exports": [
{
"format": "md",
1 change: 0 additions & 1 deletion packages/simple-validators/src/index.ts
@@ -1,5 +1,4 @@
export type { ValidationOptions, KeyOptions } from './types.js';
export { getDate, formatDate } from './utils.js';
export {
defined,
locationSuffix,
11 changes: 0 additions & 11 deletions packages/simple-validators/src/utils.spec.ts

This file was deleted.

28 changes: 0 additions & 28 deletions packages/simple-validators/src/utils.ts

This file was deleted.

59 changes: 42 additions & 17 deletions packages/simple-validators/src/validators.spec.ts
@@ -243,25 +243,50 @@ describe('validateEnum', () => {

describe('validateDate', () => {
it.each([
'2021-12-14T10:43:51.777Z',
'14 Dec 2021',
'14 December 2021',
'2021, December 14',
'2021 December 14',
'12/14/2021',
'12-14-2021',
'2022/12/14',
'2022-12-14',
])('valid date: %p', async (date: any) => {
expect(validateDate(date, opts)).toEqual(date);
['2021-12-14T10:43:51.777Z', 1, 'time'],
['14 Dec 2021', 0],
['Sat, 14 Dec 2021', 0],
['14 December 2021', 1],
['2021, December 14', 1],
['2021 December 14', 1],
['12/14/2021', 1],
['12-14-2021', 1],
['2021/12/14', 1],
['2021-12-14', 0],
])('valid date: %s', async (date: string, warnings: number, message?: string) => {
expect(validateDate(date, opts)).toEqual('2021-12-14');
expect(opts.messages.warnings?.length ?? 0).toEqual(warnings);
if (warnings === 1 && message) {
expect(opts.messages.warnings?.[0].message).toContain(message);
}
});
it('date object is valid', () => {
const date = new Date();
expect(validateDate(date, opts)).toEqual(date.toISOString());
it.each([
['not a date', 1],
['https://example.com', 1],
['2023-02-32', 1],
['2023-02-31', 1],
['2023-02-29', 1], // Not a leap year!
['2021-14-12', 1], // YYYY-DD-MM
])('invalid date: %s', async (date: string, warnings: number) => {
expect(validateDate(date, opts)).toEqual(undefined);
expect(opts.messages.errors?.length ?? 0).toEqual(warnings);
});
it('invalid date errors', () => {
expect(validateDate('https://example.com', opts)).toEqual(undefined);
expect(opts.messages.errors?.length).toEqual(1);
it.each([
['2024', '2024-01-01', 1],
['2024-06', '2024-06-01', 1],
['2024 June', '2024-06-01', 1],
['June 2024', '2024-06-01', 1],
['2024 June 25', '2024-06-25', 1],
['Sat, 2024 June 25', '2024-06-25', 1],
['Fri, 2024 June 25', '2024-06-25', 1], // ??!?!
['2024/06', '2024-06-01', 1],
])('non-standard date: %s', async (date: string, result: string, warnings: number) => {
expect(validateDate(date, opts)).toEqual(result);
expect(opts.messages.warnings?.length ?? 0).toEqual(warnings);
});
it('date object is valid', () => {
const date = new Date('2024-08-22T01:03:52.011Z');
expect(validateDate(date, opts)).toEqual('2024-08-22');
});
});
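
The invalid-date cases above (for example `2023-02-29` and `2021-14-12`) are well-formed strings that name days that do not exist. One way to detect this is a round-trip check: build a `Date` from the parts and confirm the same parts come back. The sketch below is an alternative formulation of that idea, not the code in this commit (which passes the rebuilt string back through `validateDate` and compares the results); `isRealCalendarDate` is a hypothetical name.

```typescript
// Hypothetical helper for illustration; assumes a four-digit year.
// (Date.UTC maps years 0-99 to 1900-1999, so those are out of scope here.)
function isRealCalendarDate(year: number, month: number, day: number): boolean {
  // Date.UTC rolls out-of-range values over into the next month or year,
  // so a nonexistent day comes back as a different calendar date.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}
```

Using UTC getters throughout keeps the check independent of the machine's timezone.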

130 changes: 120 additions & 10 deletions packages/simple-validators/src/validators.ts
@@ -1,5 +1,4 @@
import type { KeyOptions, ValidationOptions } from './types.js';
import { formatDate } from './utils.js';

export function defined<T = any>(val: T | null | undefined): val is T {
return val != null;
@@ -215,22 +214,133 @@ export function validateEnum<T>(
return input;
}

// This pattern implements the date pattern from ISO8601
// Technically, it's also RFC3339 (a particular profile of ISO8601), i.e. YYYY-MM-DD
// There is also a trailing capture group for timestamps
const ISO8601_DATE_PATTERN = /^(\d\d\d\d)(?:-(\d\d))?(?:-(\d\d))?(T.*)?$/;
// This pattern implements the following ABNF from RFC2822: `[ day-of-week "," ] date`
// with a trailing capture group for time-like information
const RFC2822_DATE_PATTERN =
/^(?:(Mon|Tue|Wed|Thu|Fri|Sat|Sun),)?\s*(\d{1,2})\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(\d\d\d\d)\s*([^\s].*)?$/;

const MONTH_TO_NUMBER = new Map(
['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'].map(
(elem, index) => [elem, index + 1],
),
);

/**
* Build an ISO8601-compliant date string
*/
function buildISO8601DateString(year: number, month: number, day: number): string {
const paddedMonth = `${month}`.padStart(2, '0');
const paddedDay = `${day}`.padStart(2, '0');
return `${year}-${paddedMonth}-${paddedDay}`;
}

function dateErrorString(input: string) {
return `invalid date "${input}" - must be a full date "YYYY-MM-DD" (ISO 8601) or calendar date "Sat, 1 Jan 2000" (RFC 2822)`;
}

function revalidateDate(
input: any,
result: string,
opts: ValidationOptions & { dateIsLocal?: boolean },
): string | undefined {
// We put this into the validation date function recursively to see if it comes back with the same date
// For example "2024-02-31" is invalid
const validated = validateDate(new Date(result), {
...opts,
suppressErrors: true,
suppressWarnings: true,
});
if (validated !== result) {
return validationError(dateErrorString(input), opts);
}
return result;
}

/**
* Validate date string or object
*
* Uses javascript Date() constructor; any input to the constructor that returns
* a valid date is valid input. This includes ISO 8601 formatted strings and
* IETF timestamps are valid.
* Parses strings as ISO 8601 dates, or a variant of RFC 2822 dates, falling back to the Date constructor otherwise.
* Parses Date objects as UTC or local dates according to the given options.
*/
export function validateDate(input: any, opts: ValidationOptions) {
const date = new Date(input);
if (!date.getDate()) {
return validationError(
`invalid date "${input}" - must be ISO 8601 format or IETF timestamp`,
export function validateDate(
input: any,
opts: ValidationOptions & { dateIsLocal?: boolean },
): string | undefined {
// String format dates
if (typeof input === 'string') {
// Try ISO 8601
let match = input.match(ISO8601_DATE_PATTERN);
if (match) {
const [year, month, day, tail] = match.slice(1, 5);
// Is a timestamp component present?
if (tail !== undefined) {
validationWarning(
`Date "${input}" should not include a time component ("${tail}"), which has been ignored`,
opts,
);
}
// Rebuild the string, dropping time
const result = [year, month ?? '01', day ?? '01'].join('-');
if (month === undefined || day === undefined) {
validationWarning(
`non-standard date "${input}": interpreting date as "${result}".\nPlease use a full date "YYYY-MM-DD" (ISO 8601).`,
opts,
);
}
return revalidateDate(input, result, opts);
}

// Try a variant of RFC2822
match = input.match(RFC2822_DATE_PATTERN);
if (match) {
const [day, month, year, tail] = match.slice(2, 6);

// Is a timestamp component present?
if (tail !== undefined) {
validationWarning(
`Date "${input}" should not include a time component ("${tail}"), which has been ignored`,
opts,
);
}

const numericYear = parseInt(year);
const numericMonth = MONTH_TO_NUMBER.get(month)!; // Convert Jan to 1 etc.
const numericDay = parseInt(day);

// Build an ISO8601 date string
const result = buildISO8601DateString(numericYear, numericMonth, numericDay);
return revalidateDate(input, result, opts);
}
// Try falling back on JS parsing and assume it's parsed in the local timezone
const parsed = Date.parse(input);
if (isNaN(parsed)) {
return validationError(dateErrorString(input), opts);
}
const localDate = new Date(parsed);
const result = buildISO8601DateString(
localDate.getFullYear(),
localDate.getMonth() + 1,
localDate.getDate(),
);
validationWarning(
`non-standard date "${input}": interpreting date as "${result}".\nPlease use a full date "YYYY-MM-DD" (ISO 8601).`,
opts,
);
return result;
}
// Handle pre-existing date objects
else if (input instanceof Date) {
// Is the given timestamp representative of a UTC calendar date
return opts.dateIsLocal // Default is UTC!
? buildISO8601DateString(input.getFullYear(), input.getMonth() + 1, input.getDate())
: buildISO8601DateString(input.getUTCFullYear(), input.getUTCMonth() + 1, input.getUTCDate());
} else {
return validationError(dateErrorString(input), opts);
}
return typeof input === 'string' ? input : formatDate(date);
}

/**
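
The `Date` object branch of `validateDate` above picks either UTC or local getters depending on `dateIsLocal`. The standalone sketch below isolates that choice so the difference is easier to see; `toCalendarDate` is a hypothetical name and is not exported by `simple-validators`.

```typescript
// Hypothetical helper for illustration; mirrors the UTC-by-default behaviour
// described in the diff above.
function toCalendarDate(date: Date, dateIsLocal: boolean = false): string {
  const pad = (n: number): string => (n < 10 ? `0${n}` : `${n}`);
  const [year, month, day] = dateIsLocal
    ? // Calendar date as seen in the machine's local timezone
      [date.getFullYear(), date.getMonth() + 1, date.getDate()]
    : // Calendar date as seen in UTC (the default)
      [date.getUTCFullYear(), date.getUTCMonth() + 1, date.getUTCDate()];
  return `${year}-${pad(month)}-${pad(day)}`;
}
```

For a timestamp close to midnight UTC, the two modes can disagree by one day depending on the local timezone, which is presumably why the default is pinned to UTC.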