23rd - 27th July 2018: Craik-Marshall Room, Downing Site, University of Cambridge
Functional genomics looks at the dynamic aspects of how the genome functions within cells, particularly in the form of gene expression (transcription) and gene regulation. This workshop surveys current methods for functional genomics using high-throughput technologies.
High-throughput technologies such as next generation sequencing (NGS) can routinely produce massive amounts of data. However, analysing, annotating and interpreting such datasets poses non-trivial challenges that can be daunting for the wet-lab biologist. This course covers state-of-the-art and best-practice tools for bulk RNA-seq and ChIP-seq data analysis, and will also introduce approaches for analysing data arising from single-cell RNA-seq studies.
Enthusiastic and motivated wet-lab biologists who want to gain more of an understanding of NGS data and eventually progress to analysing their own data
The course will include a great deal of hands-on work in R and at the command line. In order for you to make the most of the course we strongly recommend that you take an introductory course, or have sufficient experience in the following areas:
- R
- Unix
- Introductory statistics
More specific requirements and references can be found here
Data files for the course are here. There is a zip file for each course and a sizes.txt file listing the zip sizes (Warning: the single-cell one is BIG!)
- Mark Fernandes (CRUK CI)
- Shamith Samarajiwa (MRC CU)
- Dora Bihary (MRC CU)
- Ashley Sawle (CRUK CI)
- Abigail Edwards (CRUK CI)
- Alistair Martin (CRUK CI)
- Stephane Ballereau (CRUK CI)
- Michael Morgan (CRUK CI)
During this course you will learn about:
- How aligned sequencing reads, genome sequences and genomic regions are represented in R
- Reading sequencing reads into R, performing quality assessment and running standard pipelines for (bulk) RNA-seq and ChIP-seq analysis
- Analysis of transcription factor (TF) and epigenomic (histone mark) ChIP-seq data
- Recent advances in single-cell sequencing
After the course you should be able to:
- Know what tools are available in Bioconductor for HTS analysis and understand the basic object types they use
- Process and quality-control short-read sequencing data
- Given a set of gene identifiers, find out where in the genome they are located, and vice versa
- Produce a list of differentially expressed genes from an RNA-seq experiment
- Import a set of ChIP-seq peaks and investigate their biological context
- Appreciate the differences between bulk and single-cell RNA-seq analyses, and why the same methodologies might not be applicable
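As a rough illustration of the identifier-to-location objective above, the following is a minimal Python sketch and not course material (the course itself works in R with Bioconductor). The annotation table, gene names and coordinates are invented for the example, and `locate` and `genes_in` are hypothetical helper names.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# Hypothetical toy annotation: identifiers and coordinates are invented.
ANNOTATION = {
    "geneA": Region("chr1", 1000, 5000),
    "geneB": Region("chr1", 4500, 9000),
    "geneC": Region("chr2", 200, 800),
}


def locate(gene_ids: list[str]) -> dict[str, Optional[Region]]:
    """Map gene identifiers to regions; unknown identifiers map to None."""
    return {gene: ANNOTATION.get(gene) for gene in gene_ids}


def genes_in(query: Region) -> list[str]:
    """Return the sorted identifiers of genes overlapping the query region."""
    return sorted(
        gene
        for gene, region in ANNOTATION.items()
        if region.chrom == query.chrom
        and region.start <= query.end   # gene starts before the query ends
        and query.start <= region.end   # query starts before the gene ends
    )


if __name__ == "__main__":
    print(locate(["geneA", "geneX"]))
    print(genes_in(Region("chr1", 4800, 4900)))
```

The overlap test is the usual closed-interval check: two regions on the same chromosome overlap when each one starts no later than the other ends. A linear scan is fine for a toy table; real annotations are normally queried through indexed structures.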
**SOCIAL 18:00 onwards:** Informal get-together at The Grain and Hop Store (close to the accommodation in Downing College). Join us for a drink and dinner (self-paying), and to meet your colleagues for the next few days. http://www.grainandhopstore-cambridge.co.uk/
**Note that the Training Room in the Craik-Marshall building (1st floor) will be open from 9am. The School etherpad (e-whiteboard) is here.**
- 09:30 Course Introduction
- 09:30 - 10:30; Introduction to Functional Genomics
- 10:30 - 12:30; Introduction (Recap) of R and Bioconductor
- 12:30 - 13:30; LUNCH
- 13:30 - 14:30 Principles of Experimental Design
- 14:30 - 17:00; Data processing for Next Generation Sequencing
  - Lecture 1: Introduction to next generation sequencing (2.30-2.45pm)
  - Lecture 2: Brief introduction to file formats (2.45-3.00pm)
  - Lecture 3: Quality control and artefact removal (3.00-3.45pm)
  - Practical 1: Learn to use FastQC and Cutadapt on a sample dataset (20 min)
  - Lecture 4: Short read alignment and quality control (3.45-5.00pm)
  - Practical 2: Alignment of a ChIP-seq dataset to a reference genome using BWA or Bowtie2, and of an RNA-seq dataset using STAR (45 min)
Please note that we use several RStudio Notebook HTML files as material for the RNAseq course. To obtain the source code (.Rmd file), simply click on the code button in the top right-hand corner.
- 09:00 - 09:30;
- 09:30 - 11:00;
- 11:00 - 12:30 Linear models & differential expression
- 12:30 - 13:30; LUNCH
- 13:30 - 15:00; Linear models & differential expression
- 15:00 - 17:00
- 09:30 - 11:00; Annotation and Visualisation of Differential Expression
- 11:00 - 12:30; Gene set analysis and Gene Ontology testing
- 12:30 - 13:30; LUNCH
- 13:30 - 16:30; Single Cell RNASeq 'taster'
NB: We do not have sufficient time to teach this entire course in half a day. However, some concepts are covered in the Bulk RNASeq course and we provide the link to the full materials. Please note that on the occasions when all of the material was used, it filled a five-day course(!). We will teach topics that should be of interest even to those not interested in single-cell work.
SOCIAL: Punting trip, leaving from the Mill Lane punting site at 18:00 (~10 min walk from Craik-Marshall). Google Map. Scudamore's website & Mill Lane map.
- ChIP-seq data analysis
  - Lecture 5: Introduction to ChIP-seq (9.30-10.00am)
  - Lecture 6: Peak Calling (10.00-11.00am)
  - Practical 3: Peak calling using MACS2 (30 min)
  - Lecture 7: Differential binding analysis (11.00am-12.30pm)
  - Practical 4: THOR (and DiffBind) (20 min)
  - Lecture 8: Quality control methods for ChIP-seq (1 hr)
  - Practical 5: ChIPQC package (30 min)
  - Practical 6: Integrative Genomics Viewer (30 min)
  - LUNCH (12.30-1.30pm)
  - Lecture 9: Downstream analysis of ChIP-seq (1.30-3.15pm)
  - Practical 7: Downstream analysis of ChIP-seq (30 min)
  - Practical 8: Identifying direct targets of transcription factors with Rcade (30 min)
  - Lecture 10: Useful software utilities for the analysis of genomic data (4.30-5.00pm)
**SOCIAL: Summer School evening meal and reception at the nearby Downing College, 18:00 to 22:30. If you do not wish to attend this meal (free to attendees) then please let us know ASAP. Smart casual dress. http://www.dow.cam.ac.uk Downing College, Regent Street, Cambridge, CB2 1DQ (Site map in link below)
Drinks Reception 18:00 - 19:45, West Lodge
Dinner 19:45 - 22:30, Grace Howard Room**
- 09:30 - 12:30; **A room in the C-M building will be available for storing your baggage (there will be signage).**
- ATAC-seq and Epigenomics
  - Practical 9: Useful software utilities for the analysis of genomic data (9.30-10.30am)
  - Lecture 11: ATAC-seq data analysis (10.30-11.30am)
  - Practical 10: ATAC-seq analysis (30 min)
  - Lecture 12: Introduction to Epigenomics and Chromatin Interactions (11.30am-12.30pm)
- 12:30 - 13:30; LUNCH
- Safe journey home (thank you for participating in the Summer School)