DRP-HCB Training Courses

This page provides a set of resources for those interested in learning about bioinformatics and acquiring new skills. It is by no means a comprehensive list of online training materials and does not cover everything you need to know to become a skilled bioinformatician. However, it should be enough to get you started, and we will happily add more subjects if they are of interest to members of the centre.

Our dedicated bioinformatics servers provide an expansive list of tools via the Linux command line, an RStudio server and access to programming languages and libraries such as Python and Perl. If you require an account, please contact Shaun Webb.

Linux command line

Linux operating systems are the workhorse for most large-scale genomics applications and are well suited to managing and processing the huge data files produced by high-throughput sequencing technologies. Although command-line computing can be daunting to new users, grasping a few fundamental commands and learning to run software packages and pipelines from the terminal will open new doors.

Beginner

Intermediate

Reproducible research
- git, Conda, Snakemake, R Markdown, Jupyter, Docker and Singularity
Package management with Conda (coming soon)
- Getting started with Conda
- Conda cheatsheet
Running automated pipelines (coming soon)

Programming in R

R is a programming language designed for data retrieval, manipulation and visualisation as well as statistical analysis. Learning R will allow you to move beyond spreadsheets and access the full analysis potential of large datasets. You can build multi-dimensional visualisations as well as interactive web applications and documents. It’s easy to get to grips with the basics of R and the RStudio interface provides an intuitive graphical environment in which to develop your code. The extensive Bioconductor libraries also provide pre-built functions to analyse and interpret all sorts of genomic datasets.

Beginner

WCB Introduction to R course
- RStudio, base R, Tidyverse packages
R Cheatsheets

Beginner - Intermediate

R for Data Science

Advanced topics

Programming in Python

Python is a general-purpose programming language that is becoming increasingly popular with data scientists. Python has a growing number of libraries for data manipulation, exploration and visualisation as well as machine learning and is ideal for deploying applications and reproducible code. Jupyter notebooks provide an interactive development environment for coding in Python and other languages.
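
As a flavour of the data manipulation described above, here is a minimal sketch that computes the GC content of a few DNA sequences using only the Python standard library. The read names and sequences are made up for illustration, and `gc_content` is a hypothetical helper rather than a function from any of the courses listed on this page.

```python
from collections import Counter
from statistics import mean


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty sequence")
    return (counts["G"] + counts["C"]) / total


# Hypothetical example reads, not real data.
reads = {"read1": "ATGCGC", "read2": "AATT", "read3": "GGCC"}

# Per-read GC fraction, then the average across all reads.
gc_by_read = {name: gc_content(seq) for name, seq in reads.items()}
mean_gc = mean(list(gc_by_read.values()))

print(gc_by_read)
print(round(mean_gc, 3))
```

For real sequencing data you would more likely read records from a FASTA or FASTQ file with a dedicated library, but the same pattern of a small function applied across a collection carries over.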

Beginner - Intermediate

Advanced topics

Building pipelines with Snakemake
- Snakemake Tutorial
- Snakemake slides

Statistics for biology

While there are countless resources for learning statistics and a wealth of engaging videos on YouTube, it is important for bioinformaticians to understand statistics in the context of biology. Knowledge of statistical analysis will allow you to correctly describe and make inferences from your data and to design meaningful and useful experiments. Here are some links to get started:
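
To make "making inferences from your data" concrete, the sketch below runs a two-sided permutation test for a difference in means between two groups. It is one simple technique chosen for illustration and is not taken from the resources listed here. The measurements are invented, and `permutation_test` and its parameters are hypothetical names.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Invented expression values for two conditions, not real data.
control = [1.0, 1.1, 0.9, 1.2, 1.0, 0.95]
treated = [2.0, 2.1, 1.9, 2.2, 2.05, 1.95]

p_value = permutation_test(control, treated, n_permutations=2_000)
print(p_value)
```

The test asks how often a random relabelling of the samples produces a difference at least as large as the one observed. A small p-value suggests the observed difference is unlikely under the assumption that the group labels do not matter.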

Introduction to Stats (slides)
Introduction to statistics with R
Points of Significance (Nature collection)
Statistics for biologists (Nature collection)
Statistical learning (comprehensive stats applied in R)
Roslin Institute Video Tutorials

Analysing HTS experiments

The tutorials below provide introductions to analysing specific types of high-throughput sequencing experiments. They are designed to help you understand your data, complete generic processing steps, and explore and interpret the output. Please note the prerequisite skills in brackets.

In most cases, the pipelines provided are simple and generic and may not be suitable for your own analysis. The bioinformatics core facility is available to collaborate on your project and to offer advice on experimental design and data analysis strategies. We also provide an expansive set of software tools and analysis workflows accessible through our bioinformatics servers:

Further Bioinformatics Training Resources

Edinburgh Genomics

Training programme for Edinburgh Genomics workshops and online courses.

Carpentries

Community-driven project delivering training in data science and coding skills. The Software Carpentry and Data Carpentry websites include many free online courses. Edinburgh Carpentries host regular events at the university, including a Data Carpentry for Genomics course.

HarvardX

A comprehensive set of courses covering R, Python, and biostatistics among other things. Includes case studies in analysing RNA-seq, ChIP-seq and Methyl-seq data.

Harvard Chan Bioinformatics Core

Training courses in R and the command line, covering many different HTS techniques.

Babraham Bioinformatics

Training courses offered by the Babraham Bioinformatics group in R, Python, statistics and sequencing applications.
Bitesize bioinformatics modules and videos

Rockefeller Bioinformatics

A collection of bioinformatics training courses.

Biodatascience

A collection of R courses in data science, statistics and computational biology.

Datacamp

Interactive learning with videos, lessons, tests and a built-in programming interface. Datacamp is a subscription service with some free modules for learning R, Python and data science skills.

UoC bioinformatics core

A collection of bioinformatics training courses from the University of Cambridge bioinformatics core facility.

Analysing HTS experiments

ChIP-seq

RNA-seq

Hi-C

CLIP/CRAC

CLASH

Methyl-seq
