This repository provides a worked example of using wearables data available in UK Biobank for an analysis associating physical activity level with risk of incident cardiovascular disease. Analyses are based on this paper and this paper.
Analytic decisions should be made in the context of each research project. The choices in this repository reflect those of the authors of the linked papers and of the code authors, and should not be interpreted as definitive or widely generalisable. This repository is associated with the authors alone, and is not endorsed by UK Biobank.
This repository contains one Jupyter notebook for accessing data from the RAP, and three R Markdown notebooks for analysing these data on the RAP. These notebooks describe:
- (1) Accessing data using the RAP
- (2) Preparing data for analysis
- (3) Exploring the wearables component of the data
- (4) Conducting an analysis associating physical activity level with risk of incident cardiovascular disease
Visual inspection of the data is invaluable for understanding what the code is doing. The worked example does not contain much visual inspection, to avoid printing participant data on the internet. Add statements to get a feel for the data as you work through the tutorials (e.g. `head()`, `str()` statements in R). But don't commit or publish results of these!
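The same idea applies in the Jupyter notebook step. As a rough Python analogue of `str()`, the sketch below summarises each column of a tabular file (row count, empty cells, whether every non-empty value parses as a number) without printing any individual rows. It is only an illustration: the helper name `summarise_columns`, the toy column names, and the synthetic values are hypothetical and are not taken from this repository or from UK Biobank.

```python
import csv
import io


def summarise_columns(csv_file):
    """Return {column: {"n": rows, "missing": empty cells, "numeric": bool}}.

    "numeric" stays True unless a non-empty value fails to parse as a float.
    """
    reader = csv.DictReader(csv_file)
    summary = {name: {"n": 0, "missing": 0, "numeric": True}
               for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if name not in summary:
                # Skip surplus cells from malformed rows.
                continue
            stats = summary[name]
            stats["n"] += 1
            if value is None or value.strip() == "":
                stats["missing"] += 1
                continue
            try:
                float(value)
            except ValueError:
                stats["numeric"] = False
    return summary


# Synthetic toy data (hypothetical column names, not UK Biobank fields).
toy = io.StringIO(
    "id,overall_activity,sex\n"
    "a1,27.5,Female\n"
    "a2,,Male\n"
    "a3,31.2,Female\n"
)
for column, stats in summarise_columns(toy).items():
    print(column, stats)
```

Because the output is aggregate rather than row-level, it is less likely to expose participant data than printing rows, but the same caution applies: treat it as a local check and clear it before committing.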
Illustration of the processing pipeline in this repository.
If you are on the 2023 Oxford Health Data Science CDT course, follow the instructions here.
If you are in the OxWearables group, we have a video tutorial walking through how to access the RAP. Please reach out to Alaina for the link.
If you are extracting data from the RAP for your own analyses, you can find templates for extracting data within our RAP projects here.
If you have a question, feel free to add an issue on GitHub.
There are probably bugs. If you find them, please let us know! Again, add an issue on GitHub.
This worked example draws on several earlier examples and tutorials:
- https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/using-spark-to-analyze-tabular-data
- https://github.com/dnanexus/OpenBio/blob/master/UKB_notebooks/ukb-rap-pheno-basic.ipynb
The original tutorial was written by Rosemary Walmsley and Junayed Naushad, with contributions and advice from Ondrej Klempir, Aiden Doherty, and Ben Busby.
Changes to this repo have been made for the 2023 CDT cohort by Alaina Shreves and Aidan Acquah.
If you use this repo for a research paper, please cite:
- Ramakrishnan R, Doherty A, Smith-Byrne K, Rahimi K, Bennett D, Woodward M, Walmsley R, Dwyer T (2021) Accelerometer measured physical activity and the incidence of cardiovascular disease in the UK: Evidence from the UK Biobank cohort study. PLOS Medicine 18(1): e1003487
If you use this repo for a technical report, please cite:
- Walmsley R, Naushad J, Klempir O, Busby B and Doherty A. rap_wearables (2022), URL: https://github.com/OxWearables/rap_wearables.