NIH Long COVID Computational Challenge -- Targeted Machine Learning Analysis Group
This is the formatted competition code for the L3C Challenge entry of the Targeted Machine Learning Analysis Group at UC Berkeley. (TODO: add a link to the writeup describing our analysis plan and results.)
- obtain the synthetic data (contact @trberg for box access)
- extract the synthetic data:
tar -xzf synthetic_data.tar.gz
- add the additional data files to the synthetic data folder:
LL_concept_sets_fusion_everyone.csv LL_DO_NOT_DELETE_REQUIRED_concept_sets_all.csv
- build the docker container
utils/build.sh
- run
utils/do_analysis.sh
- fit models and predictions will be in the output folder
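Before building the container, it can help to confirm that the two additional data files named in the steps above are in place. The following is a minimal sketch, not part of this repository: the helper name missing_files is hypothetical, and the folder name synthetic_data is an assumption based on the archive name, so adjust it to wherever the archive actually extracts.

```python
from pathlib import Path

# File names are taken from the setup steps above.
REQUIRED_FILES = [
    "LL_concept_sets_fusion_everyone.csv",
    "LL_DO_NOT_DELETE_REQUIRED_concept_sets_all.csv",
]


def missing_files(data_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in data_dir."""
    folder = Path(data_dir)
    return [name for name in required if not (folder / name).is_file()]


if __name__ == "__main__":
    # ASSUMPTION: the archive extracts to a folder called "synthetic_data".
    missing = missing_files("synthetic_data")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All additional data files are in place.")
```

Running the script from the repository root prints the names of any files that still need to be copied into the data folder.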
The python module format_code can process raw code exported from the enclave (as in the src_raw folder) and generate runnable python code (as in the src folder). R is not currently supported.