Multivariate analysis of Raman spectroscopy data

Haydock, Richard (2015) Multivariate analysis of Raman spectroscopy data. PhD thesis, University of Nottingham.



This thesis is concerned with developing techniques for analysing Raman spectroscopic images. A Raman spectroscopic image differs from a standard image in that, in place of red, green and blue values, each pixel contains a spectrum of light intensities. These spectra are used to identify the chemical components of which the image subject, for example a tablet, is composed. The study of these types of images is known as chemometrics, with the majority of chemometric methods based on multivariate statistical and image analysis techniques.

The work in this thesis has two main foci. The first is the spectral decomposition of a Raman image, the purpose of which is to identify the component chemicals and their concentrations. The standard method is to fit a bilinear model to the image, where both parts of the model, representing components and concentrations, must be estimated. As the standard bilinear model is non-identifiable, we investigate the range of possible solutions in the solution space with a random walk. We also derive an improved model for spectral decomposition that combines cluster analysis techniques with the standard bilinear model. For this purpose we apply the expectation-maximisation algorithm to a Gaussian mixture model whose bilinear means represent our spectra and concentrations. This reduces noise in the estimated chemical components by separating the Raman image subject from the background.
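The non-identifiability of the bilinear model can be illustrated with a minimal sketch. It is not taken from the thesis: the matrix names `C` (concentrations), `S` (component spectra) and `T` (an arbitrary invertible mixing matrix), the array sizes and the synthetic data are all assumptions chosen for illustration, and noise is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: pixels, spectral channels, chemical components.
n_pixels, n_channels, n_components = 50, 30, 2

# Synthetic concentrations (pixels x components) and
# component spectra (components x channels).
C = rng.uniform(0.1, 1.0, size=(n_pixels, n_components))
S = rng.uniform(0.1, 1.0, size=(n_components, n_channels))

# Noise-free bilinear model: every pixel spectrum is a mixture
# of the component spectra.
X = C @ S

# Any invertible matrix T yields a second factorisation of the same data.
T = np.array([[1.0, 0.2],
              [0.1, 1.0]])
C_alt = C @ T
S_alt = np.linalg.inv(T) @ S

same_fit = np.allclose(X, C_alt @ S_alt)        # identical reconstruction
different_factors = not np.allclose(C, C_alt)   # but different factors
```

Both factor pairs reproduce `X` exactly, so the data alone cannot distinguish between them. Constraints such as non-negativity may narrow the set of admissible solutions, but in general they need not reduce it to a single one, which is why exploring a range of solutions is of interest.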

The second focus of this thesis is the analysis of our spectral decomposition results. To test whether the chemical components are uniformly mixed, we derive test statistics for identifying patterns in the image based on Minkowski measures, grey level co-occurrence matrices and neighbouring pixel correlations. However, with a non-identifiable model, any hypothesis test performed on a solution is specific to that solution only. Therefore, to obtain conclusions for a range of solutions, we combine our test statistics with our random walk. We also investigate the analysis of a time series of Raman images taken as the subject dissolves. Using models composed of Gaussian cumulative distribution functions, we are able to estimate the changes in concentration levels of dissolving tablets between the scan times. These results allow us to describe the dissolution process in terms of the quantities of component chemicals.
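As a rough illustration of how a neighbouring pixel correlation can signal non-uniform mixing, the sketch below computes the correlation between each pixel and its right-hand neighbour in a concentration map. This is an assumed, simplified statistic and not the test statistic derived in the thesis; the function name `horizontal_neighbour_correlation` and the two synthetic images are hypothetical.

```python
import numpy as np

def horizontal_neighbour_correlation(image):
    """Pearson correlation between each pixel and its right-hand neighbour."""
    left = image[:, :-1].ravel()
    right = image[:, 1:].ravel()
    return np.corrcoef(left, right)[0, 1]

rng = np.random.default_rng(1)

# Synthetic "well mixed" map: pixel values are independent of one another.
mixed = rng.uniform(size=(40, 40))

# Synthetic "clumped" map: the component sits in the left half of the
# image, with a little independent noise added to every pixel.
clumped = np.zeros((40, 40))
clumped[:, :20] = 1.0
clumped += 0.05 * rng.normal(size=(40, 40))

r_mixed = horizontal_neighbour_correlation(mixed)
r_clumped = horizontal_neighbour_correlation(clumped)
```

For independent pixels the correlation is close to zero, while the clumped map gives a strongly positive value. A formal test would also need a reference distribution for the statistic under uniform mixing, which this sketch does not provide.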

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Brignell, C.J.; Preston, S.P.
Subjects: Q Science > QA Mathematics > QA276 Mathematical statistics; Q Science > QC Physics > QC350 Optics. Light, including spectroscopy
Faculties/Schools: UK Campuses > Faculty of Science > School of Mathematical Sciences
Item ID: 30697
Depositing User: Haydock, Richard
Date Deposited: 18 Feb 2016 11:36
Last Modified: 16 Dec 2017 14:00
