Application of autoantibody binding curve characteristics and machine learning methods for improving the diagnostic performance of an early detection test for lung cancer

Allen, Jared (2023) Application of autoantibody binding curve characteristics and machine learning methods for improving the diagnostic performance of an early detection test for lung cancer. PhD thesis, University of Nottingham.

PDF (Thesis - as examined)
Available under Licence Creative Commons Attribution.

Abstract

The EarlyCDT®-Lung test has been technically and clinically validated for the early detection of lung cancer, with a sensitivity of ~40% and a specificity of ~90%, through measurement of a panel of seven serum autoantibodies.

The test generates curves of autoantibody binding to a titrated series of capture antigen concentrations, thus providing patient-specific autoantibody titration curves. We postulated that the antibodies responsible for false positive results in healthy individuals exhibit different binding kinetics from the specific autoantibodies present in cancer patients, and that these differences may manifest themselves in the shape of the autoantibody-antigen titration curves.

The EarlyCDT®-Lung test result is currently a simple logic test combination of the results from the seven autoantibodies. The employment of machine learning models to combine the biomarker results, especially with the addition of a number of extra biomarker parameters, may allow improved clinical utility of the test through increased sensitivity and specificity.

A health economic analysis was undertaken to determine the current cost-effectiveness of the EarlyCDT®-Lung test for population screening for lung cancer compared to low-dose computed tomography (LDCT). It showed that the test, at its current performance, was more cost-effective than LDCT screening, at £37,679 per QALY. It also quantified the performance needed to achieve cost-effectiveness at £30,000 per QALY: a sensitivity of 39.8% at 99% specificity, 47.5% at 95% specificity, or 56.2% at 90% specificity.

Serum autoantibodies from three case-control cohorts were measured on the EarlyCDT®-Lung test, as well as on an extended panel of autoantibodies. The titration binding curves returned by the test were analysed for signal magnitude, as well as curve characteristics including Slope, Intercept, Area Under Curve (AUC) and the maximum slope obtained over the curve (SlopeMax). A range of unsupervised and supervised machine learning strategies for combining these biomarker results were explored, including principal components analysis, cluster analysis, logistic regression, decision tree analysis, naïve Bayes, support vector machines, random forest, and extreme gradient boosting. The performance improvements of these optimised models were, however, modest and inconsistent across cohorts.
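
As an illustration of the curve characteristics named above, the following is a minimal sketch of how Slope, Intercept, AUC and SlopeMax could be computed from a single titration curve. The abstract does not give the thesis's exact definitions, so everything here is an assumption: Slope and Intercept are taken from an ordinary least-squares line, AUC from the trapezoidal rule, and SlopeMax as the steepest rise between adjacent titration points. The function name `curve_features` and the example values are hypothetical.

```python
def curve_features(conc, signal):
    """Summarise one titration curve (illustrative definitions, not the thesis's).

    conc   -- antigen concentrations (assumed distinct, any order)
    signal -- measured binding signal at each concentration
    """
    points = sorted(zip(conc, signal))          # order by concentration
    xs = [float(x) for x, _ in points]
    ys = [float(y) for _, y in points]
    n = len(xs)

    # Ordinary least-squares line through the curve (assumed definition
    # of Slope and Intercept).
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Trapezoidal area under the curve (assumed definition of AUC).
    auc = sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
              for i in range(n - 1))

    # Steepest segment between adjacent points (assumed definition of SlopeMax).
    slope_max = max((ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
                    for i in range(n - 1))

    return {"Slope": slope, "Intercept": intercept,
            "AUC": auc, "SlopeMax": slope_max}


# Hypothetical four-point titration series.
features = curve_features([0.0, 1.0, 2.0, 4.0], [0.0, 1.0, 1.5, 2.0])
```

Each autoantibody would yield one such feature set per patient, which could then be passed to any of the machine learning models listed above. Whether concentrations are handled on a linear or logarithmic scale is a further choice that this sketch leaves open.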

Finally, a simulated-annealing-based algorithm for multivariate panel optimisation was developed as an evolution of the Monte Carlo random search strategy previously used to establish panel cutoff thresholds. This algorithm was able to derive optimal panels that compared favourably to both the current commercial thresholds and to the best models derived by machine learning strategies.
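
To make the idea of annealing-based cutoff optimisation concrete, here is a minimal sketch of simulated annealing over a vector of per-marker cutoffs. It is not the algorithm developed in the thesis, whose details the abstract does not give. The assumptions are: a sample is called positive if any marker exceeds its cutoff, the objective is sensitivity with a penalty when specificity falls below a target, and the temperature follows a linear cooling schedule. All function names and parameter values are hypothetical.

```python
import math
import random


def panel_positive(sample, cutoffs):
    # Assumed combination rule: positive if any marker exceeds its cutoff.
    return any(value > cutoff for value, cutoff in zip(sample, cutoffs))


def sens_spec(cases, controls, cutoffs):
    sens = sum(panel_positive(s, cutoffs) for s in cases) / len(cases)
    spec = sum(not panel_positive(s, cutoffs) for s in controls) / len(controls)
    return sens, spec


def score(cases, controls, cutoffs, min_spec):
    # Assumed objective: sensitivity, penalised for any specificity shortfall.
    sens, spec = sens_spec(cases, controls, cutoffs)
    return sens - 10.0 * max(0.0, min_spec - spec)


def anneal_cutoffs(cases, controls, start, min_spec=0.9,
                   steps=2000, t0=0.1, step_size=0.2, seed=0):
    rng = random.Random(seed)
    current = list(start)
    current_score = score(cases, controls, current, min_spec)
    best, best_score = list(current), current_score
    for i in range(steps):
        temp = t0 * (1.0 - i / steps) + 1e-9       # linear cooling
        candidate = list(current)
        j = rng.randrange(len(candidate))           # perturb one cutoff
        candidate[j] += rng.gauss(0.0, step_size)
        cand_score = score(cases, controls, candidate, min_spec)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score
```

Unlike a pure Monte Carlo random search, which samples cutoff sets independently, this scheme proposes small changes to the current cutoffs and occasionally accepts a worse set, which helps it escape local optima early in the run.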

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Grainge, Matthew
Chapman, Caroline
Keywords: Autoantibody, Biomarkers, Lung Cancer, Early Detection, Machine Learning
Subjects: W Medicine and related subjects (NLM Classification) > WF Respiratory system
Faculties/Schools: UK Campuses > Faculty of Medicine and Health Sciences > School of Medicine
Item ID: 76628
Depositing User: Allen, Jared
Date Deposited: 29 Oct 2024 12:03
Last Modified: 29 Oct 2024 12:03
URI: https://eprints.nottingham.ac.uk/id/eprint/76628
