Inference of transmission trees for epidemics using whole-genome sequence data

Cassidy, Rosanna (2019) Inference of transmission trees for epidemics using whole-genome sequence data. PhD thesis, University of Nottingham.

PDF (Thesis - as examined), 4MB

Abstract

Recently, collection of sequence data has become increasingly rapid and cost-efficient, prompting much research into using this kind of data in the analysis of infectious diseases. There is currently substantial interest in developing epidemic model frameworks which can incorporate this new abundance of data. Whole-genome sequence (WGS) data reveal the unique construction (the 'fingerprint') of the DNA of a sampled pathogen. These high-resolution data introduce the possibility that we may be able to discover who infected whom in an epidemic outbreak, allowing us to better understand transmission dynamics and therefore design improved prevention and intervention measures. WGS data may also prove useful in understanding how levels of infectiousness and susceptibility vary between individuals in a population, or between patients on a hospital ward. Genetic data are becoming increasingly widely available, and it is now possible to sequence isolates of some pathogens in real time in the field with mobile sequencing technologies. Developing the models and methods to best exploit these data is therefore of considerable importance.

The first focus of the research presented here is on antibiotic-resistant nosocomial infections, or 'hospital superbugs', as these still pose a significant problem in hospitals, especially in developing countries. Antibiotic resistance is estimated to kill 700,000 people globally every year. Public attention to the threat of an 'antibiotic apocalypse' centres on the need to reduce the overuse of antibiotics, but another important strategy is to better understand the transmission of such pathogens so that better prevention and intervention strategies can be designed. Hospital wards present a unique environment, and data from them require their own models and methods for analysing outbreaks of infectious disease. Initial research in this thesis has concentrated on outbreaks of methicillin-resistant Staphylococcus aureus (MRSA), as it is the most widespread and most common antibiotic-resistant nosocomial infection.

In this thesis, discrete-time stochastic epidemic models are developed which can be used to analyse both epidemiological and genetic data from an outbreak of MRSA on a hospital ward. These new models can be used to estimate routes of transmission through the hospital ward on the level of individual transmission events by harnessing the information available in the genetic distances between isolate sequences taken from colonised patients. The unobserved transmission dynamics in the models can be inferred using Bayesian inference in a data-augmented MCMC algorithm. Although techniques have been developed to assess the goodness-of-fit of epidemic models in Bayesian settings, they do not assess how well a model fits the genetic data. Methods for doing so are developed in this thesis. An outbreak of MRSA is analysed using the presented models, and the new goodness-of-fit techniques are used to suggest ways to improve the fit of models.
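
As a purely illustrative aid, the following minimal sketch shows what a discrete-time stochastic transmission model on a ward can look like and how it gives rise to a transmission tree. It is not the model developed in the thesis: the function name `simulate_ward`, the parameters `n_patients`, `beta` and `days`, the single index case, and the per-day colonisation probability 1 - exp(-beta * C) are all assumptions chosen for simplicity, and genetic data and inference are omitted entirely.

```python
import math
import random


def simulate_ward(n_patients=10, beta=0.1, days=30, seed=1):
    """Simulate a toy discrete-time outbreak (hypothetical model).

    Returns a dict mapping each colonised patient to a pair
    (day of colonisation, source patient); the index case has source None.
    """
    rng = random.Random(seed)
    # Assumption: patient 0 is the only index case, colonised on day 0.
    colonised = {0: (0, None)}
    for day in range(1, days + 1):
        # Only patients colonised before today can transmit today.
        current = list(colonised)
        # Assumption: a susceptible escapes each colonised patient
        # independently with probability exp(-beta) per day.
        p = 1.0 - math.exp(-beta * len(current))
        for patient in range(n_patients):
            if patient not in colonised and rng.random() < p:
                # Colonised patients are exchangeable in this toy model,
                # so the source is drawn uniformly at random.
                colonised[patient] = (day, rng.choice(current))
    return colonised


tree = simulate_ward()
for patient, (day, source) in sorted(tree.items()):
    print(f"patient {patient}: colonised on day {day}, source {source}")
```

In a real outbreak the (day, source) pairs are unobserved, which is why an inference scheme of the data-augmented kind described above treats them as quantities to be inferred alongside the model parameters.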

The ideas behind the models for genetic data from MRSA outbreaks are also applicable to other epidemic outbreaks for which genetic data are available. In this thesis we present continuous-time stochastic epidemic models for the spread of avian influenza. These models have a spatial aspect and can be used to estimate the transmission events between farms by analysing genetic and epidemiological data from each farm. Avian influenza is carried endemically by wild birds, so it is very difficult to prevent outbreaks entirely. Therefore, it is very useful to better understand the transmission dynamics of outbreaks and to be able to make predictions about the course of a future epidemic.

The combined analysis of both epidemiological and genetic data through novel models and methods allows transmission of pathogens in epidemic outbreaks to be investigated on the level of individuals in the population. This can have a great public health impact, as results about the routes of infection can inform prevention and control measures.

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: O'Neill, Philip; Kypraios, Theodore
Keywords: Epidemics, whole-genome sequencing, statistics, modelling
Subjects: Q Science > QA Mathematics > QA276 Mathematical statistics
Faculties/Schools: UK Campuses > Faculty of Science > School of Mathematical Sciences
Item ID: 57708
Depositing User: Cassidy, Rosanna
Date Deposited: 20 Dec 2019 08:22
Last Modified: 06 May 2020 11:17
URI: https://eprints.nottingham.ac.uk/id/eprint/57708
