Li, Ruizhe
(2023)
Semi-supervised Learning for Medical Image Segmentation.
PhD thesis, University of Nottingham.
Abstract
Medical image segmentation is a fundamental step in many computer-aided clinical applications, such as tumour detection and quantification, organ measurement and feature learning. However, manually delineating the target of interest on medical images (2D and 3D) is highly labour-intensive and time-consuming, even for clinical experts. To address this problem, this thesis explores and develops interactive and fully automated methods for efficient and accurate medical image segmentation.
First, an interactive semi-automatic segmentation tool is developed for efficiently annotating any given medical image in 2D or 3D. By converting the segmentation task into a graph optimisation problem using a Conditional Random Field, the software allows interactive image segmentation using scribbles. It can also suggest the best image slice to annotate for segmentation refinement in 3D images. Moreover, a “one size fits all” parameter setting is experimentally determined using different image modalities, dimensionalities and resolutions, so no parameter adjustment is required for unseen medical images. This software can be used to segment individual medical images in clinical applications, or as an annotation tool to generate training examples for machine learning methods. The software can be downloaded from bit.ly/interactive-seg-tool.
The developed interactive image segmentation software is efficient, but annotating a large number of images (hundreds or thousands) for fully supervised machine learning to achieve automatic segmentation is still time-consuming. Therefore, a semi-supervised image segmentation method is developed to achieve fully automatic segmentation by training on a small number of annotated images. The proposed method applies ensemble learning to an encoder-decoder Deep Convolutional Neural Network (DCNN). The network is initially trained using a few annotated training samples. This initially trained model is then duplicated into sub-models, which are improved iteratively using random subsets of unannotated data with pseudo masks generated by the models trained in the previous iteration. The number of sub-models is gradually decreased to one in the final iteration. To the best of our knowledge, this is the first use of ensemble learning and DCNN to achieve semi-supervised learning. Evaluated on a public skin lesion segmentation dataset, the method outperforms both the fully supervised learning method using only annotated data and the state-of-the-art methods using similar pseudo-labelling ideas.
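The iterative ensemble scheme described above can be illustrated with a minimal sketch. It is not the thesis implementation: the linear schedule, the averaging of sub-model probabilities, and the names `submodel_schedule`, `random_subsets` and `ensemble_pseudo_mask` are all assumptions made for illustration, and the DCNN training step itself is omitted.

```python
import numpy as np


def submodel_schedule(n_start, n_iters):
    """Hypothetical linear schedule: sub-model count per iteration, ending at one."""
    if n_iters < 2:
        raise ValueError("need at least two iterations to decrease to one")
    counts = np.linspace(n_start, 1, n_iters)
    return [max(1, int(round(float(c)))) for c in counts]


def random_subsets(n_items, n_subsets, subset_size, rng):
    """Draw one random subset of unannotated image indices per sub-model."""
    return [rng.choice(n_items, size=subset_size, replace=False)
            for _ in range(n_subsets)]


def ensemble_pseudo_mask(prob_maps, threshold=0.5):
    """Assumed aggregation: average sub-model probabilities, then threshold."""
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return (mean_prob >= threshold).astype(np.uint8)


# Example: 4 sub-models shrinking to 1 over 4 iterations.
schedule = submodel_schedule(4, 4)
rng = np.random.default_rng(0)
subsets = random_subsets(n_items=100, n_subsets=schedule[0], subset_size=20, rng=rng)
mask = ensemble_pseudo_mask([np.array([[0.9, 0.2]]), np.array([[0.7, 0.4]])])
```

In a full pipeline, each sub-model would be retrained on its subset with the pseudo masks from the previous iteration before the next, smaller ensemble is formed.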
In the context of medical image segmentation, many targets of interest have common geometric shapes across populations (e.g. brain, bone, kidney, liver). In this case, deformable image registration (alignment) techniques can be applied to annotate an unseen image by deforming an annotated template image. Deep learning methods have also advanced the field of image registration, but many existing methods can only successfully align images with small deformations. In this thesis, an encoder-decoder DCNN based image registration method is proposed to deal with large deformations. Specifically, a multi-resolution encoder is applied across different image scales for feature extraction. In the decoder, multi-resolution displacement fields are estimated at each scale and then successively combined to produce the final displacement field for transforming the source image to the target image space. The method outperforms many other methods on a local 2D dataset and a public 3D dataset with large deformations. More importantly, the method is further improved by using segmentation masks to guide the image registration to focus on specified local regions, which significantly improves the performance of both segmentation and registration.
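A minimal sketch of how displacement fields from several scales might be combined coarse to fine is given below. It assumes a simple additive scheme: each coarser field is upsampled by a factor of two, its displacements are rescaled to the finer grid's pixel units, and the finer field is added. The abstract does not state how the fields are combined, so the additive rule, the nearest-neighbour upsampling and the function names are assumptions.

```python
import numpy as np


def upsample_field(field, factor=2):
    """Nearest-neighbour upsampling of a 2D displacement field of shape (2, H, W).

    Displacements are multiplied by `factor` so they stay expressed in the
    pixel units of the finer grid.
    """
    up = np.repeat(np.repeat(field, factor, axis=1), factor, axis=2)
    return up * factor


def combine_fields(fields_coarse_to_fine):
    """Assumed additive combination: upsample the running total, add the finer field."""
    total = fields_coarse_to_fine[0]
    for finer in fields_coarse_to_fine[1:]:
        total = upsample_field(total) + finer
    return total


# Example: a coarse 2x2 field of unit displacements plus a 4x4 refinement.
coarse = np.ones((2, 2, 2))
fine = np.full((2, 4, 4), 0.5)
final_field = combine_fields([coarse, fine])
```

A more faithful composition would warp the finer field by the upsampled coarse field before adding; the additive form is kept here only for brevity.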
Finally, to combine the advantages of both image segmentation and image registration, a unified framework that combines a DCNN based segmentation model and the registration model developed above is proposed to achieve semi-supervised learning. Initially, the segmentation model is pre-trained using a small number of annotated images, and the registration model is pre-trained using unsupervised learning on all training images. Subsequently, soft pseudo masks of unannotated images are generated by the registration model and the segmentation model. The soft Dice loss function is applied to iteratively improve both models using these pseudo-labelled images. It is shown that the proposed framework allows the two models to improve each other. This approach produces excellent segmentation results using only a small number of annotated images for training, and these results are better than those produced by each model separately. More importantly, once training is finished, the framework is able to perform both image segmentation and image registration with high quality.
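For reference, a common form of the soft Dice loss mentioned above is sketched here in NumPy. The smoothing term `eps` and the exact formulation are assumptions; the thesis may use a different variant (for example, squared terms in the denominator).

```python
import numpy as np


def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between two probability maps with values in [0, 1].

    Both inputs may be soft (non-binary), which is what allows soft pseudo
    masks to be used as training targets.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = np.sum(pred * target)
    denom = np.sum(pred) + np.sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


# Example: identical masks give a loss near 0, disjoint masks a loss near 1.
same = soft_dice_loss([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
disjoint = soft_dice_loss([1.0, 0.0], [0.0, 1.0])
```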