Semantic image understanding: from pixel to word

Fu, Hao (2012) Semantic image understanding: from pixel to word. PhD thesis, University of Nottingham.

Abstract

The aim of semantic image understanding is to reveal the semantic meaning behind image pixels. This thesis investigates problems related to semantic image understanding and makes the following contributions.

Our first contribution is to propose the use of histogram matching in Multiple Kernel Learning (MKL). We treat the two-dimensional kernel matrix as an image and transfer the histogram matching algorithm from image processing to the kernel matrix. Experiments on various computer vision and machine learning datasets show that our method consistently boosts the performance of state-of-the-art MKL methods.
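As an illustration only, histogram matching applied to a kernel matrix can be sketched as a quantile remapping: each entry of a source kernel is replaced by the reference kernel entry at the same quantile. This is a minimal sketch under stated assumptions, not the procedure from the thesis. The function names (`match_kernel_histogram`, `linear_kernel`, `rbf_kernel`), the choice of kernels, and the toy data are all hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
from bisect import bisect_right
import math


def match_kernel_histogram(source, reference):
    """Remap entries of `source` so their value distribution follows `reference`.

    Both arguments are kernel matrices given as lists of lists (hypothetical
    interface). Equal source entries map to equal outputs, so a symmetric
    input yields a symmetric output.
    """
    src_sorted = sorted(v for row in source for v in row)
    ref_sorted = sorted(v for row in reference for v in row)
    n, m = len(src_sorted), len(ref_sorted)

    def remap(value):
        rank = bisect_right(src_sorted, value)   # number of entries <= value, in 1..n
        pos = (rank * m + n - 1) // n - 1        # index of the same quantile in reference
        return ref_sorted[pos]

    return [[remap(v) for v in row] for row in source]


def linear_kernel(X):
    """Gram matrix of inner products (illustrative base kernel)."""
    return [[sum(a * b for a, b in zip(x, z)) for z in X] for x in X]


def rbf_kernel(X, gamma=0.5):
    """Gaussian RBF Gram matrix (illustrative reference kernel)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
             for z in X] for x in X]


# Toy data, assumed for illustration only.
X = [[0.0, 1.0], [1.0, 0.5], [2.0, -1.0], [0.5, 0.5]]
K_lin = linear_kernel(X)
K_rbf = rbf_kernel(X)

# The linear kernel now shares the value distribution of the RBF kernel.
K_matched = match_kernel_histogram(K_lin, K_rbf)
```

The remapping is monotone, so the ordering of similarities in the source kernel is preserved. It does not by itself guarantee that the result stays positive semidefinite, so a practical pipeline would need to check or repair that property before passing the matrix to an MKL solver.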

Our second contribution is to advocate the segment-then-recognize strategy in pixel-level semantic image understanding. We have developed a new framework that integrates semantic segmentation with low-level segmentation to propose object-consistent regions. We have also developed a novel method that integrates semantic segmentation with interactive segmentation. We found that this segment-then-recognize strategy also works well on medical image data, where we designed a novel polar space random field model for proposing gland-like regions.

In the realm of image-level semantic image understanding, our contribution is a novel way to utilize the random forest. Most previous work using random forests stores the posterior probabilities at each leaf node and treats the trees in the forest as independent of one another. In contrast, we store the training samples instead of the posterior probabilities at each leaf node. We consider the random forest as a whole and propose the concepts of semantic nearest neighbor and semantic similarity measure. Based on these two concepts, we devise novel methods for image annotation and image retrieval tasks.
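The idea of keeping training samples at the leaves can be sketched with a toy forest. The code below is a simplified illustration, not the algorithm from the thesis: it assumes small randomized trees that pick the best of a few random splits by Gini impurity, defines semantic nearest neighbours as the training samples that most often share a leaf with the query, and defines semantic similarity as the fraction of trees in which two inputs reach the same leaf. All names, parameters, and data are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, idx, rng, min_leaf=1, n_candidates=5):
    """Grow one randomized tree; each leaf keeps the indices of its training samples."""
    if len(idx) <= min_leaf or len({y[i] for i in idx}) == 1:
        return {"leaf": idx}
    best = None
    for _ in range(n_candidates):
        f = rng.randrange(len(X[0]))          # random feature
        t = X[rng.choice(idx)][f]             # threshold taken from a random sample
        left = [i for i in idx if X[i][f] <= t]
        right = [i for i in idx if X[i][f] > t]
        if not left or not right:
            continue
        score = (len(left) * gini([y[i] for i in left])
                 + len(right) * gini([y[i] for i in right]))
        if best is None or score < best[0]:
            best = (score, f, t, left, right)
    if best is None:                          # no usable split was drawn
        return {"leaf": idx}
    _, f, t, left, right = best
    return {"f": f, "t": t,
            "left": build_tree(X, y, left, rng, min_leaf, n_candidates),
            "right": build_tree(X, y, right, rng, min_leaf, n_candidates)}


def leaf_samples(tree, x):
    """Route x down the tree and return the training indices stored at its leaf."""
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["f"]] <= tree["t"] else tree["right"]
    return tree["leaf"]


def build_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    return [build_tree(X, y, list(range(len(X))), rng) for _ in range(n_trees)]


def semantic_neighbours(forest, x, k=3):
    """Training indices that most often share a leaf with x, with their vote counts."""
    votes = Counter()
    for tree in forest:
        votes.update(leaf_samples(tree, x))
    return votes.most_common(k)


def semantic_similarity(forest, a, b):
    """Fraction of trees in which a and b reach the same leaf."""
    shared = sum(leaf_samples(t, a) is leaf_samples(t, b) for t in forest)
    return shared / len(forest)


# Toy feature vectors and labels, assumed for illustration only.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.1], [5.2, 4.9], [4.8, 5.3]]
y = ["cat", "cat", "cat", "dog", "dog", "dog"]
forest = build_forest(X, y)
neighbours = semantic_neighbours(forest, [0.1, 0.1], k=3)
```

In an annotation setting, the labels of the returned neighbours could be aggregated to tag the query, and in a retrieval setting the similarity score could be used to rank database images. Both uses are sketched here as possibilities rather than as the methods developed in the thesis.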

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Qiu, G.; Bai, L.
Keywords: semantic, image, understanding, pixels, multiple kernel learning, mkl, rkdi, relative kernel distribution invariance, semantic nearest neighbours, snn, semantic similarity measure, ssm
Subjects: T Technology > TA Engineering (General). Civil engineering (General)
Faculties/Schools: UK Campuses > Faculty of Science > School of Computer Science
Item ID: 12847
Depositing User: EP, Services
Date Deposited: 08 Apr 2013 08:33
Last Modified: 15 Dec 2017 05:35
URI: https://eprints.nottingham.ac.uk/id/eprint/12847
