Nanopore adaptive sequencing of gigabase length genomes for mixed samples, whole exome capture, and targeted panels

Payne, Stuart Alexander (2022) Nanopore adaptive sequencing of gigabase length genomes for mixed samples, whole exome capture, and targeted panels. PhD thesis, University of Nottingham.

PDF (Thesis - as examined) - Repository staff only
Available under Licence Creative Commons Attribution.

Abstract

Single molecule sequencing technologies, such as nanopore sequencing, provide new ways to investigate genomes and genetics. They permit the detailed analysis of stretches of DNA orders of magnitude larger than previously possible. Studying genomes at this detail allows for a better understanding of genome organisation and structural variants that are typically difficult to resolve using short read sequencing.

Oxford Nanopore Technologies sequencers drive single molecules of DNA through membrane-bound protein nanopores by applying a voltage across the membrane. The applied voltage draws ions and DNA through the nanopore, and the resulting ionic current is measured as a real-time data stream. Inspecting the current data in real time allows specific molecules to be rejected by reversing the voltage across an individual nanopore. This process is called “Read Until”.

Previously, Read Until has been carried out by inspecting and comparing the current data produced during sequencing. This dissertation proposes a method for implementing Read Until using graphics cards to accelerate basecalling and optimised real-time alignment.

To build up to a full system for selective sequencing, the raw signal data that nanopore sequencers output must first be assessed (Chapter 3), specifically to better understand the characteristics of the continuous data stream. This is accomplished by inspecting bulk FAST5 files, for which a visualisation application is first built. This application is then used to assess both DNA and RNA samples, looking specifically at how unblocking is actioned and the impact it has on sequencing.

With a grasp of the raw signal data, an application, readfish, is developed to enable real-time basecalling of read chunks from molecules that are currently being sequenced (Chapter 4). This approach uses GPU-accelerated basecalling and fast alignment to decide whether to select or reject individual molecules. In addition, a schema is designed that allows arbitrary experiments to be devised and multiple experiments to take place simultaneously. An optimised CPU basecaller and barcode demultiplexing are then incorporated, extending the platforms and types of samples that can be considered.
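
The select-or-reject step described above can be illustrated with a minimal sketch. This is not the readfish implementation: the `Alignment` class, the `decide` function, the decision labels, and the `max_chunks` cut-off are all hypothetical names and assumptions chosen for illustration, and basecalling and alignment are assumed to have already happened upstream.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical target panel: contig name -> list of (start, end) intervals.
Targets = Dict[str, List[Tuple[int, int]]]


@dataclass
class Alignment:
    """Minimal stand-in for one alignment of a basecalled read chunk."""
    contig: str
    position: int


def on_target(aln: Alignment, targets: Targets) -> bool:
    """Return True if the alignment start falls inside any target interval."""
    return any(start <= aln.position < end
               for start, end in targets.get(aln.contig, []))


def decide(alignments: List[Alignment], targets: Targets,
           chunks_seen: int, max_chunks: int = 4) -> str:
    """Choose an action for the molecule currently in the pore.

    Assumed policy (illustrative only):
      - no alignment yet: wait for more signal, up to max_chunks chunks,
        then give up and reject the molecule
      - any alignment on target: keep sequencing the molecule
      - alignments exist but none on target: reject the molecule
    """
    if not alignments:
        return "proceed" if chunks_seen < max_chunks else "unblock"
    if any(on_target(a, targets) for a in alignments):
        return "stop_receiving"
    return "unblock"
```

With a hypothetical panel `{"chr1": [(1000, 2000)]}`, a chunk aligning to `chr1` at position 1500 would be kept, while a chunk aligning to `chr2` would be rejected by reversing the voltage on that pore. What to do with reads that never align is a design choice; rejecting them after a fixed number of chunks is only one option.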

As a proof of concept, readfish is used to selectively sequence target panels encompassing thousands of loci, in the form of whole exome sequencing of the human cell line NA12878. This single experiment demonstrates great flexibility in the chosen target panel and the ability to use reference genomes at a gigabase scale. In further experiments using the ZymoBIOMICS mock community, adaptive techniques are introduced in which the experimental parameters are updated dynamically in response to the data generated by the same experiment.
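
One way such a dynamic update could work is sketched below, assuming that the parameter being updated is the set of genomes still selected for and that the trigger is a per-genome coverage threshold. The function name `update_targets`, the genome labels, and the 30x threshold are hypothetical and are not taken from the thesis.

```python
from typing import Dict, Set


def update_targets(coverage: Dict[str, float], targets: Set[str],
                   threshold: float = 30.0) -> Set[str]:
    """Return the genomes that should still be selected for.

    A genome is dropped from the target set once its estimated coverage
    reaches the threshold; genomes with no coverage estimate yet are kept.
    """
    return {name for name in targets if coverage.get(name, 0.0) < threshold}


# Hypothetical mixed sample of three genomes, re-evaluated during the run.
targets = {"genome_a", "genome_b", "genome_c"}
coverage = {"genome_a": 45.0, "genome_b": 12.5}
targets = update_targets(coverage, targets)
```

Here `genome_a` has passed the assumed threshold and would be rejected from then on, so sequencing capacity shifts towards the genomes that are still under-covered.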

Finally, exemplar problems and applications of selective sequencing are considered, as well as other practical mechanisms for real-time feedback that make the whole process adaptive (Chapter 5). These exemplar problems show how the methods developed in this thesis enable time-efficient screening using panels of gene targets, decrease the time to identify fusions in a leukaemic cell line, and reduce sequencing costs through standard library preparation methods.

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Loose, Matthew
Keywords: Nanopore adaptive sequencing, Gigabase length genomes, Whole exome capture, Targeted panels
Subjects: Q Science > QH Natural history. Biology > QH301 Biology (General)
Faculties/Schools: UK Campuses > Faculty of Medicine and Health Sciences > School of Life Sciences
Item ID: 69266
Depositing User: Payne, Alexander
Date Deposited: 31 Jul 2022 04:42
Last Modified: 31 Jul 2022 04:42
URI: https://eprints.nottingham.ac.uk/id/eprint/69266
