Phenotypic columns in parsed datatable do not have the appropriate dtype
#93
Labels
flag:blocker
Since the dashboard relies on the input tabular data being in long format, an input column that contains values from multiple tasks, with a distinct value type per task, gets read in by pandas as dtype `object`. This means that when individual tasks in the column are extracted to form separate columns of the final datatable shown in the dashboard, their dtypes all still remain `object`. 😞
e.g., if the expected `assessment_score` column for a phenotypic .csv contains scores from two different assessments, one whose participant scores are integers and one whose participant scores are `true`/`false`, then when the .csv is read in, everything in the column gets turned into a string. These string values are what ultimately get stored for other back-end data operations in the dashboard, which is problematic when we need to know the original type of the data, e.g., for plotting.
This is mainly a problem for the more recently supported phenotypic bagels, which have a more liberal column schema, since with imaging bagels all the values in a given input column are generally expected to have the same type.
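A minimal sketch that reproduces the behaviour described above. Only `assessment_score` comes from the issue; the other column names, the task names, and the values are made up for illustration, and this is not the dashboard's actual parsing code.

```python
import io

import pandas as pd

# Hypothetical long-format phenotypic table: only `assessment_score` is named
# in the issue; the other column names and the task names are assumptions.
csv_text = """participant_id,assessment,assessment_score
sub-01,task_int,25
sub-01,task_bool,true
sub-02,task_int,28
sub-02,task_bool,false
"""

long_df = pd.read_csv(io.StringIO(csv_text))

# The mixed column cannot be parsed as all-int or all-bool,
# so every value in it is kept as a string.
print(long_df["assessment_score"].dtype)
print(long_df["assessment_score"].map(type).unique())

# Extracting each task into its own column does not restore the types:
# both new columns still hold strings.
wide_df = long_df.pivot(
    index="participant_id", columns="assessment", values="assessment_score"
)
print(wide_df.dtypes)
```

Both `task_int` and `task_bool` end up holding strings such as `"25"` and `"true"`, which is the situation the dashboard's back end then has to work with.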
Steps to fix
object
Note: the pandas functions `convert_dtypes()` and `infer_objects()` are both insufficient here, since the extracted values are already strings.
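To illustrate the note, the sketch below shows that neither function turns string values back into integers or booleans, and then tries one possible workaround: re-parsing each extracted column on its own. The helper `reparse_column` and the example data are hypothetical, and this is not necessarily the fix adopted in the dashboard.

```python
import io

import pandas as pd

# Hypothetical wide table as it looks after the tasks are split out:
# every value is still a string.
wide_df = pd.DataFrame(
    {"task_int": ["25", "28"], "task_bool": ["true", "false"]},
    index=pd.Index(["sub-01", "sub-02"], name="participant_id"),
)

# Neither call parses the strings, so the columns do not become int or bool.
print(wide_df.infer_objects().dtypes)
print(wide_df.convert_dtypes().dtypes)


def reparse_column(col: pd.Series) -> pd.Series:
    """Hypothetical helper: round-trip one column through the CSV parser
    so that pandas infers its dtype from that column's values alone."""
    buffer = io.StringIO(col.to_csv(index=False, header=True))
    # skip_blank_lines=False keeps missing values as rows instead of dropping them.
    reparsed = pd.read_csv(buffer, skip_blank_lines=False).iloc[:, 0]
    reparsed.index = col.index
    return reparsed


fixed_df = pd.DataFrame(
    {name: reparse_column(wide_df[name]) for name in wide_df.columns}
)
print(fixed_df.dtypes)
```

Re-parsing column by column works here because each extracted column holds values from a single task. Columns with missing values would need extra care, since integers then fall back to floats and booleans to `object`.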