Phenotypic columns in parsed datatable do not have the appropriate `dtype` #93

alyssadai · 2023-10-07T02:24:02Z

Because the dashboard expects the input tabular data to be in long format, an input column that holds values from multiple tasks, each with its own value type, is read in by pandas with dtype `object`. As a result, when the individual tasks in that column are extracted into separate columns of the final datatable shown in the dashboard, each of those columns still has dtype `object`. 😞

For example, if the expected `assessment_score` column of a phenotypic .csv contains scores from two different assessments, one with integer participant scores and one with true/false participant scores, every value in the column is read in as a string. These string values are what ultimately get stored for other back-end data operations in the dashboard, which is a problem whenever we need the original type of the data, e.g., for plotting.
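A minimal sketch of the behaviour described above. Only `assessment_score` comes from this issue; the other column names, the assessment names, and the toy values are assumptions made for illustration, and the pivot step stands in for whatever the dashboard actually does to extract per-task columns.

```python
import io

import pandas as pd

# Toy long-format phenotypic table (hypothetical contents).
csv_text = """participant_id,assessment_name,assessment_score
sub-01,moca,25
sub-02,moca,28
sub-01,is_smoker,true
sub-02,is_smoker,false
"""

long_df = pd.read_csv(io.StringIO(csv_text))

# The mixed column cannot be parsed as all-int or all-bool,
# so every value is kept as a string.
print(long_df["assessment_score"].tolist())

# Extracting each assessment into its own column does not re-infer types:
# the new columns still hold strings.
wide_df = long_df.pivot(
    index="participant_id",
    columns="assessment_name",
    values="assessment_score",
)
print(wide_df.dtypes)
```

Neither `moca` nor `is_smoker` ends up with a numeric or boolean dtype here, even though each one is internally consistent.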

This is mainly a problem for the more recently supported phenotypic bagels which have a more liberal column schema, since with imaging bagels all the values in a given input column are generally expected to have the same type.

Steps to fix

- After processing the input, add a helper function that tries to convert the `object` columns of the processed dataframe into more appropriate types
- Test the above function using a toy dataframe
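One way the steps above could be sketched is to round-trip the processed dataframe through CSV text so that pandas re-runs its type inference on each column separately. This is an illustrative approach, not necessarily the one adopted in the fix; the helper name `reinfer_dtypes` and the toy columns are hypothetical.

```python
import io

import pandas as pd


def reinfer_dtypes(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: re-infer column dtypes via a CSV round trip."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    buffer.seek(0)
    # read_csv infers a dtype per column, so columns that are now
    # homogeneous are parsed as int, float, or bool where possible.
    return pd.read_csv(buffer)


# Toy dataframe (assumed values) where every column holds strings.
toy_df = pd.DataFrame(
    {
        "participant_id": ["sub-01", "sub-02"],
        "moca": ["25", "28"],
        "is_smoker": ["true", "false"],
        "site": ["A", "B"],
    }
)

converted = reinfer_dtypes(toy_df)
print(converted.dtypes)
```

A round trip like this is simple but lossy in some cases: values such as `"007"` would be parsed as the integer 7, and a boolean column with missing values would not be parsed as boolean, so a real helper may need per-column handling.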

Note: the pandas functions `convert_dtypes()` and `infer_objects()` are both insufficient here.
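A short sketch of why these two functions fall short, using assumed toy values: both work on the Python objects already stored in the column and do not parse strings, so a column of numeric-looking strings stays non-numeric.

```python
import pandas as pd

# Toy column of numeric-looking strings (hypothetical values).
str_df = pd.DataFrame({"moca": ["25", "28"]})

# infer_objects() only soft-converts the stored objects; strings stay strings.
inferred = str_df.infer_objects()

# convert_dtypes() picks the best nullable dtype for the stored values,
# which for strings is a string dtype, not an integer dtype.
nullable = str_df.convert_dtypes()

print(inferred.dtypes)
print(nullable.dtypes)
```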


alyssadai added the bug:functional and flag:blocker labels Oct 7, 2023

alyssadai added this to Neurobagel Oct 7, 2023

alyssadai moved this to Implement - Active in Neurobagel Oct 7, 2023

alyssadai self-assigned this Oct 7, 2023

This was referenced Oct 8, 2023

Implement plotting of histogram for a user-selected column #82

Closed

[FIX] Re-infer column dtypes for dataframes processed from input files #94

Merged

alyssadai moved this from Implement - Active to Implement - Done in Neurobagel Oct 8, 2023

alyssadai moved this from Implement - Done to Implement - Active in Neurobagel Oct 8, 2023

alyssadai moved this from Implement - Active to Implement - Done in Neurobagel Oct 10, 2023

rmanaem moved this from Implement - Done to Review - Active in Neurobagel Oct 10, 2023

alyssadai closed this as completed in #94 Oct 10, 2023

github-project-automation bot moved this from Review - Active to Review - Done in Neurobagel Oct 10, 2023
