Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HED Validator integration into bids schema based validator. #1648

Merged
merged 46 commits into from
Jul 31, 2024
Merged
Show file tree
Hide file tree
Changes from 39 commits
Commits
Show all changes
46 commits
Select commit Hold shift + click to select a range
b606bdd
add checks that can occur after dataset walk to allow for hed validat…
rwblair Mar 27, 2023
0853771
fix call creating dataset object for hed validation
rwblair Mar 28, 2023
88f1f2d
Merge branch 'master' of https://github.com/bids-standard/bids-valida…
rwblair Sep 26, 2023
398e62e
update hed validator to latest version
rwblair Sep 26, 2023
9a6caf9
update hed-validator map file. Add lookup for old style hed issues
rwblair Sep 26, 2023
f637904
add hed validator issues to main issues object. Await ds level checks…
rwblair Sep 26, 2023
7b672a1
ignore deno deps in codespell
rwblair Sep 26, 2023
9868a5d
codespell wants comma seperated list for skip flag
rwblair Sep 26, 2023
893b2b6
import hed-validator via esm.sh and pin verison, remove old dist file…
rwblair Sep 27, 2023
aa092dd
typescript fixes and simple test
rwblair Sep 27, 2023
a42f3de
remove console logs, add test for validator not being triggered.
rwblair Sep 27, 2023
0fd4a25
pin node version for esm.sh? Build works locally for me.
rwblair Sep 27, 2023
63155b8
move hedargs from global into dscontext
rwblair Sep 28, 2023
2569d48
Merge branch 'master' of github.com:bids-standard/bids-validator into…
rwblair Jan 4, 2024
10ab783
stop casting hed-validator arguments to string. Bump hed validator to…
rwblair Jan 4, 2024
a89d682
fix detect hed to match columns in datastructure at time of call
rwblair Jan 4, 2024
1c4af61
bumped hed valdiator in schema validator. debugging prints, and actua…
rwblair Jan 23, 2024
d927cab
Merge branch 'master' into schema/hedIntegration
effigies Mar 7, 2024
506e3f9
Merge branch 'master' of github.com:bids-standard/bids-validator into…
rwblair Jun 11, 2024
2b86dbc
enable hed validator checks at end of all context validation. update…
rwblair Jun 12, 2024
4546fde
Merge branch 'schema/hedIntegration' of github.com:bids-standard/bids…
rwblair Jun 12, 2024
1e0634f
remove accumulator, run hed valdiation as context level check. Bails …
rwblair Jun 13, 2024
0ac20d3
call hed validate on json files that have hed entries.
rwblair Jun 13, 2024
c277c46
Merge branch 'master' of github.com:bids-standard/bids-validator into…
rwblair Jul 29, 2024
c9036c1
incomplete update using new hed api
rwblair Jul 29, 2024
1395241
Overhaul hed validation calls to be similar to recent legacy validato…
rwblair Jul 30, 2024
d0db309
overhaul integration tests for hed validate
rwblair Jul 30, 2024
30abfe0
remove hedArgs from dataset context since validation is done per file…
rwblair Jul 30, 2024
08ccde0
Forgot to commit some hedargs removals.
rwblair Jul 30, 2024
f2f3b7c
update dataset context instantiations to use new schema arg.
rwblair Jul 30, 2024
fdb5027
Merge branch 'master' of github.com:bids-standard/bids-validator into…
rwblair Jul 31, 2024
9577090
bump hed-validator in deno deps. Pass text of tsv file as a string to…
rwblair Jul 31, 2024
6793466
use node18 for esm.sh import of hed-validator
rwblair Jul 31, 2024
7378d3d
add Buffer polyfill for deno web validator.
rwblair Jul 31, 2024
2a9e8e2
Update bids-validator/src/validators/bids.ts
rwblair Jul 31, 2024
3e36450
Update bids-validator/src/validators/hed.ts
rwblair Jul 31, 2024
11fd91a
Update bids-validator/src/validators/hed.ts
rwblair Jul 31, 2024
7c65cf5
checkout LICENSE file from ffb2d63
rwblair Jul 31, 2024
418d01d
Merge branch 'schema/hedIntegration' of github.com:bids-standard/bids…
rwblair Jul 31, 2024
799f4b2
Merge branch 'master' into schema/hedIntegration
effigies Jul 31, 2024
e492d94
Update import of BIDSFile now that file.ts is removed.
rwblair Jul 31, 2024
52e2a84
Merge branch 'master' of github.com:bids-standard/bids-validator into…
rwblair Jul 31, 2024
899fbf1
mend
rwblair Jul 31, 2024
79b500c
bump hed-javascript, pass columns instead of text to buildHedTsvFile
rwblair Jul 31, 2024
5c94a5e
Merge branch 'master' of github.com:bids-standard/bids-validator into…
rwblair Jul 31, 2024
5beb84f
skip deno.lock from codespell
rwblair Jul 31, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/codespell.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,4 +21,4 @@ jobs:
- uses: codespell-project/actions-codespell@master
with:
ignore_words_list: ro,anser,te,tage,afterall,nwe,nin,nd,falsy
skip: package-lock.json
skip: package-lock.json, deps*
1 change: 1 addition & 0 deletions bids-validator/src/deps/hed-validator.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
export { default } from "https://esm.sh/hed-validator@3.15.2?deps=node@18"
40 changes: 39 additions & 1 deletion bids-validator/src/issues/list.ts
Original file line number Diff line number Diff line change
Expand Up @@ -103,4 +103,42 @@ export const filenameIssues: IssueDefinitionRecord = {
},
}

export const nonSchemaIssues = { ...filenameIssues }
const hedIssues: IssueDefinitionRecord = {
HED_ERROR: {
severity: 'error',
reason: 'The validation on this HED string returned an error.',
},
HED_WARNING: {
severity: 'warning',
reason: 'The validation on this HED string returned a warning.',
},
HED_INTERNAL_ERROR: {
severity: 'error',
reason: 'An internal error occurred during HED validation.',
},
HED_INTERNAL_WARNING: {
severity: 'warning',
reason: 'An internal warning occurred during HED validation.',
},
HED_MISSING_VALUE_IN_SIDECAR: {
severity: 'warning',
reason:
'The json sidecar does not contain this column value as a possible key to a HED string.',
},
HED_VERSION_NOT_DEFINED: {
severity: 'warning',
reason:
"You should define 'HEDVersion' for this file. If you don't provide this information, the HED validation will use the latest version available.",
},
}

export const hedOldToNewLookup: Record<number, Partial<keyof IssueDefinitionRecord>> = {
104: 'HED_ERROR',
105: 'HED_WARNING',
106: 'HED_INTERNAL_ERROR',
107: 'HED_INTERNAL_WARNING',
108: 'HED_MISSING_VALUE_IN_SIDECAR',
109: 'HED_VERSION_NOT_DEFINED'
}

export const nonSchemaIssues = { ...filenameIssues, ...hedIssues }
63 changes: 63 additions & 0 deletions bids-validator/src/tests/local/hed-integration.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
import { formatAssertIssue, validatePath } from './common.ts'
import { assert, assertEquals } from '../../deps/asserts.ts'
import { BIDSFileDeno, readFileTree } from '../../files/deno.ts'
import { DatasetIssues } from '../../issues/datasetIssues.ts'
import { loadSchema } from '../../setup/loadSchema.ts'
import { BIDSContext, BIDSContextDataset } from '../../schema/context.ts'
import { BIDSFile } from '../../types/file.ts'
import { FileTree } from '../../types/filetree.ts'
import { GenericSchema } from '../../types/schema.ts'
import { hedValidate } from '../../validators/hed.ts'

function getFile(fileTree: FileTree, path: string) {
let [current, ...nextPath] = path.split('/')
if (nextPath.length === 0) {
const target = fileTree.files.find(x => x.name === current)
if (target) {
return target
}
const dirTarget = fileTree.directories.find(x => x.name === nextPath[0])
return dirTarget
} else {
const nextTree = fileTree.directories.find(x => x.name === current)
if (nextTree) {
return getFile(nextTree, nextPath.join('/'))
}
}
return undefined
}

Deno.test('hed-validator not triggered', async (t) => {
const PATH = 'tests/data/bids-examples/ds003'
const tree = await readFileTree(PATH)
const schema = await loadSchema()
const issues = new DatasetIssues()
const dsContext = new BIDSContextDataset(undefined, schema, {'HEDVersion': ['bad_version']})
await t.step('detect hed returns false', async () => {
const eventFile = getFile(tree, 'sub-01/func/sub-01_task-rhymejudgment_events.tsv')
assert(eventFile !== undefined)
assert(eventFile instanceof BIDSFileDeno)
const context = new BIDSContext(tree, eventFile, issues, dsContext)
await context.asyncLoads()
await hedValidate(schema as unknown as GenericSchema, context)
assert(issues.size === 0)
})
})

Deno.test('hed-validator fails with bad schema version', async (t) => {
const PATH = 'tests/data/bids-examples/eeg_ds003645s_hed_library'
const tree = await readFileTree(PATH)
const schema = await loadSchema()
const issues = new DatasetIssues()
const dsContext = new BIDSContextDataset(undefined, schema, {'HEDVersion': ['bad_version']})
await t.step('detect hed returns false', async () => {
const eventFile = getFile(tree, 'sub-002/eeg/sub-002_task-FacePerception_run-3_events.tsv')
assert(eventFile !== undefined)
assert(eventFile instanceof BIDSFileDeno)
const context = new BIDSContext(tree, eventFile, issues, dsContext)
await context.asyncLoads()
await hedValidate(schema as unknown as GenericSchema, context)
assert(issues.size === 1)
assert(issues.has('HED_ERROR'))
})
})
11 changes: 9 additions & 2 deletions bids-validator/src/types/check.ts
Original file line number Diff line number Diff line change
@@ -1,8 +1,9 @@
import { GenericSchema } from './schema.ts'
import { BIDSContext } from '../schema/context.ts'
import { BIDSContext, BIDSContextDataset } from '../schema/context.ts'
import { DatasetIssues } from '../issues/datasetIssues.ts'

/** Function interface for writing a check */
export type CheckFunction = (
export type ContextCheckFunction = (
schema: GenericSchema,
context: BIDSContext,
) => Promise<void>
Expand All @@ -13,3 +14,9 @@ export type RuleCheckFunction = (
schema: GenericSchema,
context: BIDSContext,
) => void

export type DSCheckFunction = (
schema: GenericSchema,
dsContext: BIDSContextDataset,
issues: DatasetIssues,
) => Promise<void>
1 change: 1 addition & 0 deletions bids-validator/src/types/context.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ export interface ContextDatasetSubjects {
participant_id?: string[]
phenotype?: string[]
}

export interface ContextDataset {
dataset_description: Record<string, unknown>
files: any[]
Expand Down
13 changes: 10 additions & 3 deletions bids-validator/src/validators/bids.ts
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
import { CheckFunction } from '../types/check.ts'
import { DSCheckFunction, ContextCheckFunction } from '../types/check.ts'
import { FileTree } from '../types/filetree.ts'
import { IssueFile } from '../types/issues.ts'
import { GenericSchema } from '../types/schema.ts'
Expand All @@ -15,17 +15,21 @@ import { emptyFile } from './internal/emptyFile.ts'
import { BIDSContext, BIDSContextDataset } from '../schema/context.ts'
import { BIDSFile } from '../types/file.ts'
import { parseOptions } from '../setup/options.ts'
import { hedValidate } from './hed.ts'

/**
* Ordering of checks to apply
*/
const CHECKS: CheckFunction[] = [
const perContextChecks: ContextCheckFunction[] = [
emptyFile,
filenameIdentify,
filenameValidate,
applyRules,
hedValidate,
]

const perDSChecks: DSCheckFunction[] = []

/**
* Full BIDS schema validation entrypoint
*/
Expand Down Expand Up @@ -85,11 +89,14 @@ export async function validate(
}
await context.asyncLoads()
// Run majority of checks
for (const check of CHECKS) {
for (const check of perContextChecks) {
await check(schema as unknown as GenericSchema, context)
}
await summary.update(context)
}
for (const check of perDSChecks) {
await check(schema as unknown as GenericSchema, dsContext, issues)
}

let derivativesSummary: Record<string, ValidationResult> = {}
await Promise.allSettled(
Expand Down
4 changes: 2 additions & 2 deletions bids-validator/src/validators/filenameValidate.ts
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
import { CheckFunction, RuleCheckFunction } from '../types/check.ts'
import { ContextCheckFunction, RuleCheckFunction } from '../types/check.ts'
import { DatasetIssues } from '../issues/datasetIssues.ts'
import { BIDSContext } from '../schema/context.ts'
import { Entity, Format, GenericSchema, Schema } from '../types/schema.ts'
Expand All @@ -7,7 +7,7 @@ import { hasProp } from '../utils/objectPathHandler.ts'

const sidecarExtensions = ['.json', '.tsv', '.bvec', '.bval']

const CHECKS: CheckFunction[] = [
const CHECKS: ContextCheckFunction[] = [
missingLabel,
atRoot,
entityLabelCheck,
Expand Down
111 changes: 111 additions & 0 deletions bids-validator/src/validators/hed.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
import hedValidator from '../deps/hed-validator.ts'
import { hedOldToNewLookup } from '../issues/list.ts'
import { GenericSchema } from '../types/schema.ts'
import { IssueFile, IssueFileOutput } from '../types/issues.ts'
import { BIDSContext, BIDSContextDataset } from '../schema/context.ts'
import { DatasetIssues } from '../issues/datasetIssues.ts'
import { ColumnsMap } from '../types/columns.ts'

function sidecarHasHed(sidecarData: BIDSContext["sidecar"]) {
if (!sidecarData) {
return false
}
return Object.keys(sidecarData).some(x => sidecarValueHasHed(sidecarData[x]))
}

function sidecarValueHasHed(sidecarValue: unknown) {
return (
sidecarValue !== null &&
typeof sidecarValue === 'object' &&
'HED' in sidecarValue &&
sidecarValue.HED !== undefined
)
}

let hedSchemas: object | undefined | null = undefined

async function setHedSchemas(datasetDescriptionJson = {}) {
const datasetDescriptionData = new hedValidator.bids.BidsJsonFile(
'/dataset_description.json',
datasetDescriptionJson,
null,
)
try {
hedSchemas = await hedValidator.bids.buildBidsSchemas(
datasetDescriptionData,
null,
)
return [] as HedIssue[]
} catch (issueError) {
hedSchemas = null
return hedValidator.bids.BidsHedIssue.fromHedIssues(
issueError,
datasetDescriptionData.file,
)
}
}

export interface HedIssue {
code: number
file: IssueFile
evidence: string
}

export async function hedValidate(
schema: GenericSchema,
context: BIDSContext,
): Promise<void> {
let file
let hedValidationIssues = [] as HedIssue[]

try {
if (context.extension == '.tsv' && context.columns) {
if (!('HED' in context.columns) && !sidecarHasHed(context.sidecar)) {
return
}
hedValidationIssues = await setHedSchemas(context.dataset.dataset_description)

file = await buildHedTsvFile(context)
} else if (context.extension == '.json' && sidecarHasHed(context.json)) {
hedValidationIssues = hedValidationIssues = await setHedSchemas(context.dataset.dataset_description)
file = buildHedSidecarFile(context)
}

if (file !== undefined) {
hedValidationIssues.push(...file.validate(hedSchemas))
}
} catch (error) {
context.issues.addNonSchemaIssue('HED_ERROR', [{ ...context.file, evidence: error}])
}

hedValidationIssues.map((hedIssue) => {
const code = hedIssue.code
if (code in hedOldToNewLookup) {
context.issues.addNonSchemaIssue(
hedOldToNewLookup[code],
[{ ...hedIssue.file, evidence: hedIssue.evidence }],
)
}
})
}

async function buildHedTsvFile(context: BIDSContext) {
const tsvData = await context.file.text()
const eventFile = new hedValidator.bids.BidsTsvFile(
context.path,
tsvData,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would passing the ColumnMap (or a copy) work here, or would that expose too much internal data? The parameter type in hed-validator is Map<string, string[]>, which appears to be ColumnMap's parent type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the columns map here generates an error:
TypeError: Cannot read properties of undefined (reading 'forEach')

This is coming from convertParsedTSVData:
https://github.com/hed-standard/hed-javascript/blob/3be833f929cd3db5e259b315ba6cff6e1c677f59/bids/tsvParser.js#L45
via:
https://github.com/hed-standard/hed-javascript/blob/3be833f929cd3db5e259b315ba6cff6e1c677f59/bids/types/tsv.js#L53

I think this is because Map objects pass the check meant to catch the old style tsv data:

> let x = new Map()
undefined
> x === Object(x)
true

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@happy5214 Does it make sense to go ahead as-is and then submit a PR once hed-validator can handle a ColumnMap directly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2041 should fix that issue. The type check was a bug that proved easy to fix.

context.file,
[],
context.sidecar,
)
return eventFile
}

function buildHedSidecarFile(context: BIDSContext) {
const sidecarFile = new hedValidator.bids.BidsSidecar(
context.path,
context.json,
context.file,
)
return sidecarFile
}
4 changes: 2 additions & 2 deletions bids-validator/src/validators/internal/emptyFile.ts
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
import { CheckFunction } from '../../types/check.ts'
import { ContextCheckFunction } from '../../types/check.ts'

// Non-schema EMPTY_FILE implementation
export const emptyFile: CheckFunction = (schema, context) => {
export const emptyFile: ContextCheckFunction = (schema, context) => {
if (context.file.size === 0) {
context.issues.addNonSchemaIssue('EMPTY_FILE', [context.file])
}
Expand Down
4 changes: 2 additions & 2 deletions bids-validator/src/validators/isBidsy.ts
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,11 @@
*/
// @ts-nocheck
import { BIDSContext } from '../schema/context.ts'
import { CheckFunction } from '../../types/check.ts'
import { ContextCheckFunction } from '../../types/check.ts'
import { BIDSFile } from '../types/file.ts'
import { Schema } from '../types/schema.ts'

export const isBidsyFilename: CheckFunction = (schema, context) => {
export const isBidsyFilename: ContextCheckFunction = (schema, context) => {
// every '.', '-', '_' followed by an alnum
// only contains '.', '-', '_' and alnum
}
Loading
Loading