This module provides an introduction to the statistical foundations and applied methodologies used to study the genetics of quantitative and complex traits in human populations. The emphasis is on developing a deep conceptual understanding of the main modelling frameworks, inference procedures, and computational tools used in modern statistical genetics. Students will engage with both theoretical material and hands-on computational exercises, with a particular focus on estimation, hypothesis testing, and prediction using genetic markers.
Basic proficiency in R and bash scripting.
Familiarity with introductory statistics (regression, probability distributions) is helpful but not strictly required.
By the end of the course, students will be able to:
An ability to read and execute code in R, Python, or C++. Most demonstrations will be kept simple and done in R. For each objective, a practical will be given in which students work with simple demonstration versions of existing approaches.
R, a Unix-based computing environment for running command-line tools, and Docker. Containers will be provided to ensure reproducibility and ease of installation.
Matthew Robinson’s group focuses on medical genomics and large-scale modelling of human health data. His research aims to understand how genetic and lifestyle factors combine to influence the risk, onset, and progression of common complex diseases. Matt completed his PhD at the University of Edinburgh and held research positions in Australia and Switzerland before joining ISTA in 2020. His group develops statistical and computational methods for analysing biobank-scale datasets, characterising genetic architecture, and predicting health outcomes across the life course.
Duncan Palmer is a SMARTbiomed Senior Research Fellow at the Big Data Institute and the Department of Statistics. He co-leads the Biobank Rare-Variant Consortium (BRaVa), an international collaboration analysing large-scale sequencing data to uncover the genetic basis of complex traits. Duncan’s research focuses on developing statistical and computational methods that integrate genetic, phenotypic, and functional data to refine association signals and understand disease mechanisms.