Advancing Epigenetics Towards Systems Biology

EpiGeneSys activities

RNA-Seq/ChIP-Seq workshop - WP5

Tuesday 30 August 2011 - Wednesday 31 August 2011

Location: IGH, Montpellier, France


(By Thomas Sexton, IGH, Montpellier France)

In the few years since their development, massively parallel sequencing technologies have revolutionised epigenetic studies. As these technologies become more affordable, genome-wide profiles of transcriptional output, histone modifications and chromatin protein binding are coming within reach of more research groups. However, analytical tools are required to make biological sense of these rich datasets. The digital nature of sequencing data makes it incompatible with existing microarray analytical tools, and the potential technical limitations of sequencing are only just beginning to be appreciated as the numbers of datasets (and replicates) increase. Bioinformatics research groups have recently developed a number of tools to handle sequencing datasets, some tailored to the specific requirements of ChIP and transcriptome analysis, but this rapidly growing field can be bewildering to the majority of molecular biologists generating the data.

The RNA-Seq/ChIP-Seq analysis course held at the EBI in Cambridge was attended by 10 EpiGeneSys members and provided the “philosophical” groundwork for analysis of high-throughput sequencing datasets, supported by practical workshops in which worked examples were analysed. The emphasis of the workshop was on the Bioconductor series of packages, developed specifically for use within the R environment. These packages automate many of the processes required for ChIP-Seq and RNA-Seq analyses, so that individual researchers need not write their own scripts from scratch, although the nuances of the different R objects that each package creates, along with the syntax for their manipulation, are initially daunting. The course gave an introduction to the Bioconductor packages, sometimes from the people who had developed them, and an overview of their uses and limitations. It then provided worked examples to give attendees a stepping stone towards proficiency in using the packages to analyse their own data.

More specifically, the course entailed:

  • An overview of the current high-throughput sequencing technologies, and the “next next generation” technologies just around the corner, by Harold Swerdlow, Head of Sequencing Technology at the Wellcome Trust Sanger Institute, Cambridge.
  • An overview of the core Bioconductor packages and their use in handling sequencing projects (assessing quality, alignment, assessing coverage, referring to annotations, demultiplexing, exporting files, etc.) by Nicolas Delhomme, a staff member of the Genome Biology Unit, EMBL Heidelberg.
  • A description of the potential applications, technical limitations and recommended normalisation strategies for analysing RNA-Seq datasets by John Marioni, a group leader at the EBI, Cambridge.
  • Appraisal of the Bioconductor packages available for assessing differential expression from RNA-Seq datasets by Thomas Hardcastle, developer of the baySeq Bioconductor package from the Department of Plant Sciences, Cambridge.
  • A description of the technical considerations and appraisal of the different peak calling strategies for identifying binding sites in ChIP-Seq datasets, by EpiGeneSys member Nicholas Luscombe, group leader at the EBI, Cambridge.
  • Training in the use of the Bioconductor packages for downstream processing of the binding sites found in ChIP-Seq experiments (annotations, motif searches, coverage plots, etc.) by Andre Faure and Petra Schwalie, researchers at the EBI, Cambridge.

The EpiGeneSys attendees were predominantly PhD students and postdocs, who benefited not just from the tutors but also from each other as they discussed their own projects over a glass of wine in the hotel bar. The course was rounded off nicely on the evening of Mardi Gras with dinner (and more wine) at a Turkish restaurant in Cambridge.