Novel guidelines for the analysis of single nucleotide polymorphisms in disease association studies

Fiaschi, Linda (2011) Novel guidelines for the analysis of single nucleotide polymorphisms in disease association studies. PhD thesis, University of Nottingham.

Abstract

How genetic mutations such as Single Nucleotide Polymorphisms (SNPs) affect the risk of contracting a specific disease is still an open question for many medical conditions. Two problems related to SNP analysis are (i) the selection of computational techniques to discover possible single and multiple SNP associations; and (ii) the size of the latest datasets, which may contain millions of SNPs.

To find associations between SNPs and diseases, two popular techniques are investigated and enhanced. Firstly, the ‘Transmission Disequilibrium Test’ for family-based analysis is considered. The fixed length of the haplotypes provided by this approach may limit the quality of the results obtained. For this reason, an adaptation is proposed to select the minimum number of SNPs that are responsible for disease predisposition. Secondly, decision tree algorithms for case-control analysis of unrelated individuals are considered. Applying a single tool may yield only a limited analysis of the genetic association with a specific condition. Thus, a novel consensus approach is proposed that exploits the strengths of three different algorithms: ADTree, C4.5 and ID3. The results obtained suggest that the new approach achieves improved performance.
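For readers unfamiliar with the ‘Transmission Disequilibrium Test’, the sketch below shows the classical single-marker statistic only, not the haplotype adaptation proposed in the thesis. It assumes the counts of transmitted and untransmitted copies of an allele from heterozygous parents are already available; the function name `tdt_statistic` is hypothetical and chosen for illustration.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classical single-marker TDT (illustrative sketch, not the thesis method).

    transmitted   -- times heterozygous parents passed the allele to an affected child
    untransmitted -- times heterozygous parents did not pass it on
    Returns the McNemar-type chi-square statistic (1 degree of freedom)
    and its p-value.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Under the null hypothesis of no linkage or association, the two counts are expected to be roughly equal, so a large statistic indicates preferential transmission of the allele to affected offspring.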

The recent explosive growth in the size of SNP databases has highlighted limitations in current techniques. An example is ‘Linkage Disequilibrium’, which identifies redundancy among multiple SNPs. Although this method achieves high accuracy, it scales poorly to large datasets, which severely degrades its performance. Therefore, a new, fast and scalable tool based on ‘Linkage Disequilibrium’ is developed to reduce dataset size by measuring and eliminating redundancy between the SNPs included in the initial dataset. Experimental evidence validates the potentially improved performance of the new method.
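The general idea of removing redundant SNPs can be sketched as a greedy filter on pairwise correlation. This is a minimal illustration and not the tool developed in the thesis: it assumes genotypes coded as 0, 1 or 2 copies of the minor allele, uses the squared genotype correlation as a stand-in for a haplotype-based measure of ‘Linkage Disequilibrium’, and compares every SNP against all retained SNPs, which is exactly the kind of quadratic cost that scales poorly. The names `r_squared`, `prune_redundant_snps` and the default threshold of 0.8 are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (coded 0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic SNP carries no correlation information.
        return 0.0
    return cov * cov / (var_x * var_y)


def prune_redundant_snps(genotypes, threshold=0.8):
    """Greedy filter: keep a SNP only if it is not highly correlated
    (r^2 >= threshold) with any SNP already kept.

    genotypes -- dict mapping SNP name to a list of per-individual genotypes
    Returns the names of the retained SNPs in input order.
    """
    kept = []
    for name, column in genotypes.items():
        if all(r_squared(column, genotypes[k]) < threshold for k in kept):
            kept.append(name)
    return kept
```

With three SNPs where the second is an exact copy of the first, the filter keeps the first and third and discards the duplicate. Practical tools usually restrict comparisons to SNPs within a genomic window to avoid the all-pairs cost.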

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Garibaldi, J.M.
Natalio, K.
Subjects: Q Science > QA Mathematics > QA 75 Electronic computers. Computer science
Q Science > QH Natural history. Biology > QH426 Genetics
Faculties/Schools: UK Campuses > Faculty of Science > School of Computer Science
Item ID: 11808
Depositing User: EP, Services
Date Deposited: 27 Sep 2011 10:08
Last Modified: 08 May 2020 10:47
URI: https://eprints.nottingham.ac.uk/id/eprint/11808
