Issue with note_type_concept_id column in notes table #74

Open
stevenbedrick opened this issue Jan 14, 2022 · 0 comments

Hello! I may have found an issue with the MIMIC OMOP CDM mapping for the note table's note_type_concept_id column. The existing ETL script uses a lookup table to populate this column, and at first glance the chosen OMOP concepts all seem to make sense: they are all from the "Note Type" vocabulary and seem semantically appropriate (e.g., OMOP4822279 for discharge summaries).

However, the CDM documentation specifies a set of valid concepts for the note_type_concept_id column, and the ones that the MIMIC ETL process uses are not in that set. On closer inspection, I noticed that the "Note Type" concepts the existing ETL process uses are all marked as "source concepts" (i.e., "non-standard") in the Athena vocabulary, and if I'm reading the relevant section of the OHDSI standardized vocabulary documentation correctly, that means they aren't supposed to be used in fields like note_type_concept_id. So instead of using OMOP4822279 for discharge summaries, we maybe ought to be using OMOP4976897.
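To make the check above concrete, here is a rough sketch of how one might flag mapping entries that point at non-standard concepts. It assumes a tab-separated export shaped like Athena's CONCEPT.csv, whose standard_concept column is 'S' for standard concepts and empty otherwise. The sample rows, the two mapping dictionaries, and the function names are all hypothetical; in particular, the 'S' flag on OMOP4976897 reflects my reading of the documentation rather than a value copied from a vocabulary download.

```python
import csv
import io

# Illustrative stand-in for a few rows of Athena's CONCEPT.csv (tab-separated).
# The real file has more columns; only the two used here are shown.
# ASSUMPTION: the standard_concept values mirror what this issue describes and
# are not copied from an actual vocabulary download.
SAMPLE_CONCEPT_TSV = (
    "concept_code\tstandard_concept\n"
    "OMOP4822279\t\n"
    "OMOP4976897\tS\n"
)


def load_standard_flags(tsv_text):
    """Map each concept_code to its standard_concept flag ('S', 'C', or '')."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["concept_code"]: row["standard_concept"] for row in reader}


def find_non_standard(mapping, flags):
    """Return the source labels whose target concept is not flagged 'S'.

    Codes missing from the vocabulary are reported as well, since they
    cannot be confirmed as standard.
    """
    return sorted(
        label for label, code in mapping.items() if flags.get(code) != "S"
    )


# Hypothetical one-row versions of the ETL lookup table, before and after.
etl_mapping = {"Discharge summary": "OMOP4822279"}
proposed_mapping = {"Discharge summary": "OMOP4976897"}

flags = load_standard_flags(SAMPLE_CONCEPT_TSV)
print(find_non_standard(etl_mapping, flags))       # ['Discharge summary']
print(find_non_standard(proposed_mapping, flags))  # []
```

Run against a full vocabulary download and the complete lookup table, the same check would list every note type that needs a new target concept.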

The situation is a bit confusing to me. My understanding was that "source concepts" exist for external vocabularies that need to be mapped into OMOP-land, but the "Note Type" vocabulary entries are all marked as coming from an OMOP-authored vocabulary even though they are (apparently) "non-standard", so it's not as if they came from some other vocabulary that got pulled in at some point. The concepts don't seem to have any relationships or ancestors, though, so they sit off by themselves in vocabulary-space. Without knowing more, my best guess is that these particular concepts were valid at some earlier point and that things have since been reorganized.

Anyway, that's probably a question for the Athena folks, but bringing it back to MIMIC:

  • Is there a particular reason why the ETL process is using these particular concepts for note_type_concept_id instead of the ones the documentation is (or appears to be) suggesting?
  • If not, would a PR be welcome that updated the mapping table for this field? I'd be happy to take a stab at that, if so.

Apologies if this is ground that has already been covered somewhere else; I looked at issues and documentation and didn't see anything but might have missed it. Thanks for an amazing dataset and tool!
