Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MNT] Deprecate Cognitive Atlas vocab namespace & add check for unsupported namespaces #410

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

alyssadai
Copy link
Contributor

@alyssadai alyssadai commented Dec 19, 2024

Changes proposed in this pull request:

  • add logic to error out if TermURLs in data dictionary use an unsupported vocab namespace prefix
    • include informative message in error if any vocabs were previously supported but are now deprecated
  • point to latest version of neurobagel_examples submodule so tests pass
  • replace cogatlas terms in test data files with snomed equivalents
  • example5 reworked to be an example with unsupported namespace prefixes in the data dictionary
    • previously, was not used anywhere (maybe leftover from early CLI development?) and conceptually was a duplicate of example9
  • add .sh script for regenerating the reference example JSONLDs in neurobagel_examples (script added in this repo b/c the files usually need to be updated right away after CLI changes)

Checklist

This section is for the PR reviewer

  • PR has an interpretable title with a prefix ([ENH], [FIX], [REF], [TST], [CI], [MNT], [INF], [MODEL], [DOC]) (see our Contributing Guidelines for more info)
  • PR has a label for the release changelog or skip-release (to be applied by maintainers only)
  • PR links to GitHub issue with mention Closes #XXXX
  • Tests pass
  • Checks pass

For new features:

  • Tests have been added

For bug fixes:

  • There is at least one test that would fail under the original bug conditions.

Summary by Sourcery

Deprecate the Cognitive Atlas vocabulary namespace and add a mechanism to check for unsupported namespaces in data dictionaries. Update example5 to illustrate unsupported namespace usage and add corresponding tests to validate the new functionality.

Enhancements:

  • Add logic to identify unsupported vocabulary namespace prefixes in data dictionaries and provide informative error messages for deprecated vocabs.

Tests:

  • Add tests to ensure unsupported and deprecated vocabulary namespaces are correctly identified in data dictionaries.

Chores:

  • Rework example5 to demonstrate unsupported namespace prefixes in the data dictionary.

Copy link

sourcery-ai bot commented Dec 19, 2024

Reviewer's Guide by Sourcery

This PR implements a validation mechanism for vocabulary namespace prefixes in data dictionaries, deprecating the Cognitive Atlas (cogatlas) namespace and adding checks to ensure only supported namespaces are used. The implementation includes error handling with informative messages for both unsupported and deprecated namespaces.

No diagrams generated as the changes look simple and do not need a visual representation.

File-Level Changes

Change Details Files
Added namespace validation functionality to ensure data dictionaries only use supported vocabulary prefixes
  • Created function to recursively search for TermURL values in data dictionaries
  • Implemented check for unsupported namespace prefixes
  • Added detection of deprecated namespaces
  • Integrated namespace validation into the data dictionary validation process
  • Added informative error messages for unsupported and deprecated namespaces
bagel/utilities/pheno_utils.py
bagel/mappings.py
Updated test suite to cover new namespace validation functionality
  • Added unit tests for unsupported namespace detection
  • Added unit tests for deprecated namespace identification
  • Updated existing tests to use SNOMED terms instead of Cognitive Atlas terms
  • Added integration test for invalid namespace handling
tests/unit/test_pheno_utils.py
tests/integration/test_cli_pheno.py
Refactored example data to demonstrate namespace validation
  • Reworked example5 to showcase unsupported namespace usage
  • Updated example data files to replace Cognitive Atlas terms with SNOMED equivalents
  • Added script for regenerating reference example JSONLDs
tests/data/example5.json
tests/data/example5.tsv
generate_neurobagel_example_jsonlds.sh

Assessment against linked issues

Issue Objective Addressed Explanation
#407 Error out when unrecognized vocabulary is encountered in data dictionary
#407 Error with informative message when data dictionary includes deprecated vocabulary (e.g. CogAtlas)
#407 Prevent deprecated terms from being added to generated jsonld file

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@alyssadai alyssadai added the pr-minor Non-breaking feature or enhancement, will increment minor version (0.+1.0) label Dec 19, 2024
Copy link

codecov bot commented Dec 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.54%. Comparing base (563d2d9) to head (7d8352b).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #410      +/-   ##
==========================================
+ Coverage   98.49%   98.54%   +0.04%     
==========================================
  Files          18       18              
  Lines        1066     1101      +35     
==========================================
+ Hits         1050     1085      +35     
  Misses         16       16              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alyssadai alyssadai added the release Create a release when this PR is merged label Dec 19, 2024
@alyssadai alyssadai marked this pull request as ready for review December 19, 2024 04:11
@surchs surchs self-requested a review December 19, 2024 16:31
Copy link
Contributor

@surchs surchs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alyssadai, this looks great.

I do think the generate_neurobagel_examples workflow would be better handled via a makefile to reduce potential for mistakes. Leave it up to you if you want to handle that here or in a follow-on PR.

After you decide: 🧑‍🍳

bagel/utilities/pheno_utils.py Show resolved Hide resolved
generate_neurobagel_example_jsonlds.sh Outdated Show resolved Hide resolved
generate_neurobagel_example_jsonlds.sh Outdated Show resolved Hide resolved
generate_neurobagel_example_jsonlds.sh Outdated Show resolved Hide resolved
generate_neurobagel_example_jsonlds.sh Show resolved Hide resolved
generate_neurobagel_example_jsonlds.sh Show resolved Hide resolved
bagel/utilities/pheno_utils.py Show resolved Hide resolved
Co-authored-by: Sebastian Urchs <surchs@users.noreply.github.com>
@alyssadai
Copy link
Contributor Author

Waiting to merge (to release) along with neurobagel/annotation_tool#630.

@surchs
Copy link
Contributor

surchs commented Dec 19, 2024

Hey @alyssadai I think we're ready to merge. Do you want me to start the annotation tool release?

@alyssadai
Copy link
Contributor Author

@surchs - do we want to wait for the related API PR from neurobagel/api#390?

@surchs
Copy link
Contributor

surchs commented Dec 19, 2024

Aha, good point, I missed that. Yes, let's wait for that one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-minor Non-breaking feature or enhancement, will increment minor version (0.+1.0) release Create a release when this PR is merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Warn when dictionary includes unrecognized vocabulary
2 participants