Digital system for bio-inspired visual attention processing fast and efficient information theoretic modelling of saliency

Ngo, Anh Cat Le (2015) Digital system for bio-inspired visual attention processing fast and efficient information theoretic modelling of saliency. PhD thesis, University of Nottingham.

Abstract

Visual attention is a biological mechanism of the human visual system for coping with rich and fast-changing visual information in the surrounding environment. Visual saliency is a strategy that recommends attentive spots to be visited in descending order of interest or information content. This thesis aims to utilise information theory in computational saliency models, on the assumption that more attention is drawn toward more informative locations.

As visual media, i.e. images and videos, are high-dimensional data, information estimation is often computationally infeasible because of the enormous amounts of computation and data samples it requires. This thesis proposes and analyses three practical and innovative information-based saliency models.

The first model, the entropy-based saliency method (ENT), measures salient information with a centre-surround operation using conditional entropy (ENT-CON) or Kullback-Leibler divergence (ENT-KLD). However, ENT only estimates information from local features of fixed-size windows; it does not utilise the multi-scale and global information of visual media, which has been shown to be important in biological visual attention.
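To make the centre-surround idea concrete, here is a minimal sketch of a Kullback-Leibler comparison between the intensity histogram of a small centre patch and that of its surrounding ring. It is an illustration only, not the thesis's ENT-KLD implementation: the window sizes, the bin count, the add-one smoothing and all function names are assumptions introduced for this example.

```python
import math

def histogram(values, bins=8, lo=0, hi=256):
    """Normalised histogram; add-one smoothing (an assumption) keeps every bin non-zero."""
    counts = [1.0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """D(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def centre_surround_kld(image, row, col, centre=1, surround=3, bins=8):
    """KL divergence between the centre patch and the ring around it.

    `image` is a list of rows of intensities in [0, 255]; `centre` and
    `surround` are half-widths of hypothetical fixed-size windows.
    """
    h, w = len(image), len(image[0])
    centre_vals, surround_vals = [], []
    for r in range(max(0, row - surround), min(h, row + surround + 1)):
        for c in range(max(0, col - surround), min(w, col + surround + 1)):
            if abs(r - row) <= centre and abs(c - col) <= centre:
                centre_vals.append(image[r][c])
            else:
                surround_vals.append(image[r][c])
    return kl_divergence(histogram(centre_vals, bins),
                         histogram(surround_vals, bins))

# A bright blob on a flat background scores higher than the flat background.
flat = [[100] * 7 for _ in range(7)]
blob = [row[:] for row in flat]
for r in range(2, 5):
    for c in range(2, 5):
        blob[r][c] = 250
```

Because both windows have a fixed size, a score of this kind only reflects local contrast at one scale, which is the limitation the paragraph above points out.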

To utilise multi-scale information, the second model, Wavelet-based Scale-Saliency (WSS), estimates information from the power distribution of data across wavelet sub-band basis descriptors at multiple dyadic scales. Although WSS benefits from local features at multiple scales, it does not integrate information about global context or the statistical characteristics of natural images.
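One way to picture a "power distribution across wavelet sub-bands" is to decompose an image with a Haar wavelet over a few dyadic levels, sum the squared coefficients of each detail sub-band, and take the Shannon entropy of the normalised energies. The sketch below does exactly that and nothing more; the choice of the Haar wavelet, the two-level depth and every function name are assumptions for illustration and should not be read as the WSS algorithm itself.

```python
import math

def haar_level(img):
    """One orthonormal 2D Haar step on an image with even height and width.

    Returns [approximation, horizontal-difference, vertical-difference,
    diagonal] bands, each at half resolution.
    """
    half_h, half_w = len(img) // 2, len(img[0]) // 2
    bands = [[[0.0] * half_w for _ in range(half_h)] for _ in range(4)]
    for r in range(half_h):
        for c in range(half_w):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            d, e = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            bands[0][r][c] = (a + b + d + e) / 2.0
            bands[1][r][c] = (a - b + d - e) / 2.0
            bands[2][r][c] = (a + b - d - e) / 2.0
            bands[3][r][c] = (a - b - d + e) / 2.0
    return bands

def energy(band):
    """Sum of squared coefficients in one sub-band."""
    return sum(v * v for row in band for v in row)

def detail_energies(img, levels=2):
    """Energies of the three detail bands at each dyadic level.

    Height and width must be divisible by 2 ** levels.
    """
    energies = []
    approx = img
    for _ in range(levels):
        approx, horiz, vert, diag = haar_level(approx)
        energies.extend([energy(horiz), energy(vert), energy(diag)])
    return energies

def energy_entropy(energies):
    """Shannon entropy (in nats) of the normalised sub-band energies."""
    total = sum(energies)
    if total == 0:
        return 0.0
    probs = [e / total for e in energies if e > 0]
    return sum(p * math.log(1.0 / p) for p in probs)

# Vertical stripes put all detail energy into a single sub-band.
stripes = [[0, 1, 0, 1] for _ in range(4)]
```

For the striped example all detail energy falls into one band, so the entropy is zero; an image whose energy is spread evenly over several bands and scales gives a higher value.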

The third model, Multiscale Discriminant Saliency (MDIS), adopts the Wavelet Hidden Markov Tree (WHMT) to unify multi-scale and global information in a comprehensive saliency method. All three models, ENT, WSS and MDIS, are evaluated and compared against well-known saliency methods such as PSS, AIM, DIS, etc., both quantitatively, with standard numerical measures (Normalized Scanpath Saliency (NSS), Linear Correlation Coefficient (LCC), Area Under Curve (AUC)) on N. Bruce’s, Kootstra’s and Judd’s databases with human eye-tracking ground truth, and qualitatively, by visual examination of individual cases. The performance and comprehensiveness of the three models are reflected in the numerical results of an experiment on Bruce’s database. As each successive model is designed to be more comprehensive and computationally complex than the previous one, the three quantitative scores (LCC, NSS, AUC) generally increase in that order, as does the computational time.

                 ENT        WSS        MDIS
LCC              0.02263   -0.01731    0.02382
NSS             -0.17533    0.31782    0.48019
AUC              0.78167    0.70292    0.88335
TIME (s/frame)   0.87040    1.26889    2.32734

Table 1: Quantitative results of ENT, WSS and MDIS on N. Bruce’s database
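For readers unfamiliar with the scores in the table, the following minimal sketch shows how NSS and LCC are commonly computed from a saliency map and eye-tracking data. It is a generic illustration under stated assumptions, not the evaluation code used in the thesis: the maps are plain lists of rows, fixations are hypothetical (row, column) pairs, and the function names are invented for this example.

```python
import math

def _flatten(grid):
    return [float(v) for row in grid for v in row]

def _mean_std(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

def nss(saliency, fixations):
    """Mean of the z-scored saliency map at the fixated (row, col) pixels."""
    mean, std = _mean_std(_flatten(saliency))
    if std == 0:
        return 0.0  # a constant map carries no ranking information
    return sum((saliency[r][c] - mean) / std for r, c in fixations) / len(fixations)

def lcc(saliency, density):
    """Pearson correlation between a saliency map and a fixation density map."""
    a, b = _flatten(saliency), _flatten(density)
    mean_a, std_a = _mean_std(a)
    mean_b, std_b = _mean_std(b)
    if std_a == 0 or std_b == 0:
        return 0.0
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)
    return cov / (std_a * std_b)

# Toy 2x2 map with a single salient pixel.
toy_map = [[0, 0], [0, 1]]
inverted = [[1, 1], [1, 0]]
```

Under these definitions a negative NSS means the fixated pixels scored below the map's mean saliency, and a negative LCC means the map is anti-correlated with the fixation density, which is how the negative entries in the table can be read.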

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Ang, Kenneth Li-Minn
Qiu, Guoping
Seng, Jasmine Kah-Phooi
Subjects: T Technology > TK Electrical engineering. Electronics Nuclear engineering
Faculties/Schools: University of Nottingham, Malaysia > Faculty of Science and Engineering — Engineering > Department of Electrical and Electronic Engineering
Item ID: 30984
Depositing User: Rozario, Margaret
Date Deposited: 14 Jan 2016 04:44
Last Modified: 15 Jul 2021 14:06
URI: https://eprints.nottingham.ac.uk/id/eprint/30984
