Capsule-based image translation and quality analysis

Yang, Fei (2021) Capsule-based image translation and quality analysis. PhD thesis, University of Nottingham.


Abstract

The research community has witnessed great success in computer vision over the past decades, benefiting from the rapid development of deep learning technologies. Among the many research topics in computer vision, image translation plays an important role: it aims to synthesize image B conditioned on image A. The translation from A to B is achieved by well-designed generative models, which take image A as input and generate image B. It is regarded as domain translation or style translation, with examples such as horse-to-zebra translation, image rendering and noise removal.

The development of deep learning has boosted research on image translation. Advanced network structures have been designed to serve as the translators in these models. Effective objective functions have been proposed to supervise the generation of images. Advanced training strategies have been explored to optimise the training procedure. These techniques improve the ability of translation models to generate realistic images.

However, research on image translation still has a long way to go. The application scenarios of image translation are diverse and numerous. The source domain and the target domain can be defined arbitrarily, which means a horse can be translated into any other object we want, or a painting can be translated into any other style of image we define. It is hard for one general model to handle such varied translation characteristics. Besides this, existing translation models cannot completely avoid the noisy marks that are introduced by the convolution kernels during the translation process.

My contributions are summarised as follows. 1) To push forward the research of image translation, algorithms are developed to enhance model performance. A capsule-based framework is built on the structure of an image-conditioned generative adversarial network, in which the capsule units are responsible for improving the ability to learn part-to-whole relationships and for strengthening feature learning from a global view. 2) Multiple applications are discussed in this report, among which image rendering is a new one, while others such as de-raining, de-snowing and de-hazing are traditional noise removal topics. Considering the specific characteristics of each application, various techniques are proposed. A preservation loss is proposed for image rendering. A two-branch structure with a rain component loss is designed for de-raining. A multi-scale structure is proposed for de-snowing. A depth encoding method is developed for de-hazing. 3) Image quality is closely correlated with model performance, especially when assessing the quality of translated images. Ways of estimating quality levels are explored in this report. Based on the level estimation, a task-oriented image quality assessment method is developed to calculate quality scores.

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Zhang, Qian
Qiu, Guoping
Keywords: computer vision; image translation
Subjects: T Technology > T Technology (General)
Faculties/Schools: UNNC Ningbo, China Campus > Faculty of Science and Engineering > School of Computer Science
Item ID: 66024
Depositing User: YANG, Fei
Date Deposited: 10 Aug 2021 00:56
Last Modified: 10 Aug 2021 00:56
URI: https://eprints.nottingham.ac.uk/id/eprint/66024
