Use metadata from NA segment in joined metadata when HA segment isn't available #161

Open
huddlej opened this issue Apr 15, 2024 · 0 comments
Labels
bug Something isn't working

huddlej commented Apr 15, 2024

Current Behavior

Our current approach to joining segment-level metadata records into isolate-level metadata records is HA-centric: when an NA record has no matching HA record, the isolate-level record does not get any metadata from the NA record.

Expected behavior

When HA records are missing, we still want to know as much as possible about the NA record, including the isolate id, the collection date, etc. We will use this information in segment-level analyses such as the flu_frequencies workflow, where we estimate NA-specific clade frequencies and want to use all available NA records.

Possible solution

One solution could be to update the join_metadata script to explicitly define all segment-specific columns (e.g., "passage_category" should be segment-specific) and then fill each remaining isolate-level column (e.g., date, region, country, etc.) from the first segment record in which that column is present.
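
As a rough illustration of that idea (not the actual join_metadata implementation), here is a minimal sketch in plain Python. The function name `join_segment_metadata`, the `isolate_id` column, the contents of `SEGMENT_SPECIFIC_COLUMNS`, and the example records are all assumptions made up for this example; the real script's columns and data structures may differ.

```python
# Hypothetical set of columns that describe a single segment rather than the
# whole isolate. The real join_metadata script may define a different set.
SEGMENT_SPECIFIC_COLUMNS = {"accession", "passage_category"}


def join_segment_metadata(records_by_segment, id_column="isolate_id"):
    """Join segment-level records into isolate-level records.

    `records_by_segment` maps a segment name (e.g., "ha", "na") to a list of
    dict records. Segments are processed in the mapping's order, so the first
    segment with a non-empty value for an isolate-level column wins.
    """
    isolates = {}
    for segment, records in records_by_segment.items():
        for record in records:
            isolate_id = record[id_column]
            isolate = isolates.setdefault(isolate_id, {id_column: isolate_id})
            for column, value in record.items():
                if column == id_column:
                    continue
                if column in SEGMENT_SPECIFIC_COLUMNS:
                    # Keep one copy per segment, e.g. "passage_category_na".
                    isolate[f"{column}_{segment}"] = value
                elif not isolate.get(column):
                    # Isolate-level column: fill only if still missing/empty.
                    isolate[column] = value
    return isolates


# Made-up example: isolate_2 has an NA record but no HA record.
records = {
    "ha": [
        {"isolate_id": "isolate_1", "date": "2024-01-05", "region": "Europe",
         "passage_category": "cell"},
    ],
    "na": [
        {"isolate_id": "isolate_1", "date": "2024-01-05", "region": "Europe",
         "passage_category": "egg"},
        {"isolate_id": "isolate_2", "date": "2024-02-10", "region": "Asia",
         "passage_category": "unpassaged"},
    ],
}
joined = join_segment_metadata(records)
```

In this sketch, `joined["isolate_2"]` still carries the date and region from its NA record even though no HA record exists, while `passage_category` is stored separately per segment. Processing HA first keeps HA values as the default when both segments are available.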

huddlej added the bug label Apr 15, 2024