Nanopore sequencing and assembly of a human genome with ultra-long reads

Jain, M. and Koren, S. and Miga, K.H. and Quick, J. and Rand, A.C. and Sasani, T.A. and Tyson, J.R. and Beggs, A.D. and Dilthey, A.T. and Fiddes, I.T. and Malla, S. and Marriott, H. and Nieto, T. and O'Grady, J. and Olsen, H.E. and Pedersen, B.S. and Rhie, A. and Richardson, H. and Quinlan, A.R. and Snutch, T.P. and Tee, L. and Paten, B. and Phillippy, A.M. and Simpson, J.T. and Loman, N.J. and Loose, M. (2018) Nanopore sequencing and assembly of a human genome with ultra-long reads. Nature Biotechnology. ISSN 1546-1696

Available under Licence Creative Commons Attribution.

Abstract

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ~30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ~3 Mb). Next, we developed a protocol to generate ultra-long reads (N50 > 100 kb, up to 882 kb). Incorporating an additional 5× coverage of these data more than doubled the assembly contiguity (NG50 ~6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4 Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length and closure of gaps in the reference human genome assembly GRCh38.
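
The abstract reports its results as theoretical coverage, read N50 and assembly NG50. The following is a minimal sketch of how these summary statistics are conventionally defined; it is not the authors' pipeline. The function names are hypothetical, and the 3.1 Gb genome size used in the usage note is an assumption chosen for illustration, not a value stated in the abstract.

```python
def n50(lengths, total=None):
    """Return the length L such that sequences of length >= L
    together cover at least half of `total` bases.

    With total=None, `total` is the sum of the input lengths (classic N50).
    """
    if total is None:
        total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # the sequences never reach half of `total`


def ng50(contig_lengths, genome_size):
    """NG50: like N50, but measured against an assumed genome size
    rather than the total assembly size."""
    return n50(contig_lengths, total=genome_size)


def theoretical_coverage(total_bases, genome_size):
    """Average depth if all sequenced bases were spread evenly over the genome."""
    return total_bases / genome_size
```

Under the assumed 3.1 Gb genome size, `theoretical_coverage(91.2e9, 3.1e9)` gives roughly 29.4, consistent with the ~30× quoted above. NG50 is normalised to the genome size rather than the assembly size, so it does not improve merely because an assembly is smaller or less complete.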

Item Type: Article
Schools/Departments: University of Nottingham, UK > Faculty of Medicine and Health Sciences > School of Life Sciences
Identification Number: 10.1038/nbt.4060
Depositing User: Eprints, Support
Date Deposited: 05 Feb 2018 14:07
Last Modified: 17 Apr 2018 14:53
URI: http://eprints.nottingham.ac.uk/id/eprint/48665
