The narrative of galaxy morphological classification told through machine learning

Cheng, Ting-Yun (2020) The narrative of galaxy morphological classification told through machine learning. PhD thesis, University of Nottingham.

In this thesis, we present a complete study of machine learning applications, both supervised and unsupervised, for galaxy morphological classification using calibrated imaging data. Two main topics are addressed: (1) classification - we discuss the optimal machine learning technique in terms of accuracy, efficiency, and inclusiveness for imaging data from large-scale surveys; (2) exploration - we explore galaxy morphology without human bias and discuss a novel morphological classification scheme defined by machine learning.

In the classification task, we first carry out a thorough comparison of accuracy and efficiency between several common supervised methods using the Dark Energy Survey (DES) imaging data (Chapter 2). The morphology labels from the Galaxy Zoo 1 (GZ1) catalogue (Lintott et al., 2008, 2011) are used to train the supervised methods. We conclude that training our convolutional neural networks (CNN) on a combination of linear and gradient images (the latter produced with the Histogram of Oriented Gradient technique) gives the best performance in terms of accuracy and efficiency amongst the supervised methods tested on imaging data. Because the DES data have better resolution (0.263 arcsec per pixel) and greater depth (i = 22.51) than the Sloan Digital Sky Survey (SDSS) imaging data used in the GZ1 project, we find that ∼2.5% of the galaxies in our dataset are mislabelled by GZ1. After correcting these galaxies’ labels based on the DES imaging data, we reach a final accuracy of over 0.99 for binary classification (ellipticals and spirals) with the CNN (Chapter 3). We then use the CNN to build one of the largest galaxy morphological classification catalogues, which includes over 20 million galaxies from the DES Year 3 data (Chapter 4). However, supervised machine learning techniques are biased towards the training set and the human-defined labels. Therefore, we test the possibility of carrying out a classification task with unsupervised machine learning techniques (Chapter 5 and Chapter 6). In Chapter 5, the combination of a convolutional autoencoder and a Bayesian Gaussian mixture model successfully distinguishes a variety of lensing features, such as different Einstein ring sizes and arcs, in galaxy-galaxy strong lensing systems (GGSL). This unsupervised method categorises simulated images from Metcalf et al. (2019a) into 24 classes without human involvement and recovers ∼63 percent of the lensing images among all lenses in the training set. Additionally, with only a small number of human judgements needed to label the 24 machine classes, we reach an accuracy of 77.3 ± 0.5% in the binary classification of lensing and non-lensing systems.
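
The step from 24 machine classes to a binary lensing/non-lensing accuracy can be illustrated with a minimal sketch. It assumes, purely for illustration, that each machine class is assigned the label held by most of its members; in the thesis the classes are labelled by human judgement, and the function names and toy data below are hypothetical.

```python
from collections import Counter, defaultdict


def majority_label_map(cluster_ids, reference_labels):
    """Map each machine class to the most common reference label among its members.

    Assumption: a majority vote stands in for the human judgement of each class.
    """
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, reference_labels):
        members[cluster][label] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in members.items()}


def binary_accuracy(cluster_ids, reference_labels, mapping):
    """Fraction of objects whose class-level label matches the reference label."""
    hits = sum(mapping[c] == y for c, y in zip(cluster_ids, reference_labels))
    return hits / len(reference_labels)


# Toy data (hypothetical): three machine classes, eight images.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
reference_labels = ["lens", "lens", "non-lens", "non-lens",
                    "non-lens", "lens", "non-lens", "non-lens"]

mapping = majority_label_map(cluster_ids, reference_labels)
accuracy = binary_accuracy(cluster_ids, reference_labels, mapping)
print(mapping, accuracy)
```

Only one decision per machine class is needed in this scheme, so the number of judgements scales with the number of classes rather than with the number of images.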

In the exploration task, unsupervised machine learning techniques are used to objectively explore galaxy morphology using the SDSS imaging data (Chapter 6). We improve the efficiency of the unsupervised method used in Chapter 5 by applying a vector quantisation process in the feature learning phase and, using an uneven iterative hierarchical clustering, achieve a better ‘accuracy’ when compared with current knowledge of galaxy morphology. This unsupervised method can categorise the galaxies in the dataset, which includes 23% early-type galaxies (ETGs) and 77% late-type galaxies (LTGs), into two preliminary classes and reach an accuracy of ∼0.87 for binary classification of ETGs and LTGs. To explore galaxy morphology, our method provides 27 classes based on galaxy shape and structure. We further confirm that, regardless of the morphological mix of galaxies present in the dataset, this unsupervised method captures consistent features. The 27 machine-defined morphological classes are clearly separated in stellar properties such as colour, absolute magnitude, stellar mass, and physical size of the galaxies. Each class has distinctive galaxy features that set it apart from the other classes. Moreover, when comparing the machine classes with visual Hubble types, it is clear that a mix of different galaxy structures can exist in one visual morphological Hubble type. This reveals that visual classification schemes such as the Hubble sequence carry an intrinsic uncertainty when classifying galaxies precisely. Following the investigation in Chapter 6, we propose to rethink the current visual morphological classification scheme, and to consider the possibility of using a novel classification scheme defined by machine learning to re-approach studies of galaxy evolution and formation from a different perspective.
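
As a rough illustration of the vector quantisation step mentioned above, the sketch below assigns each feature vector to its nearest entry in a codebook. It is a generic sketch of the technique, not the thesis's implementation: the codebook, the feature vectors, and the function names are all hypothetical, and how the codebook is learned is not covered here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantise(vectors, codebook):
    """Return, for each vector, the index of the nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]


# Hypothetical 2-D codebook and feature vectors, for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
features = [(0.1, -0.2), (0.9, 1.2), (0.2, 1.9), (0.6, 0.6)]

codes = quantise(features, codebook)
print(codes)
```

Replacing each continuous feature vector with a single integer code shrinks the representation that later clustering has to handle, which is the sort of efficiency gain such a step is generally used for.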

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Conselice, Christopher
Aragon-Salamanca, Alfonso
Keywords: Machine learning, Galaxy morphology, Gravitational lensing
Subjects: Q Science > QB Astronomy
Faculties/Schools: UK Campuses > Faculty of Science > School of Physics and Astronomy
Item ID: 63653
Depositing User: CHENG, Ting-Yun
Date Deposited: 31 Dec 2020 04:40
Last Modified: 31 Dec 2020 04:40
