Visual tracking over multiple temporal scales

Khan, Muhammad Haris (2015) Visual tracking over multiple temporal scales. PhD thesis, University of Nottingham.

PDF (Thesis - as examined), 7MB

Abstract

Visual tracking is the task of repeatedly inferring the state (position, motion, etc.) of a desired target in an image sequence. It is an important scientific problem, as humans can visually track targets in a broad range of settings. However, visual tracking algorithms struggle to follow a target robustly in unconstrained scenarios. Among the many challenges faced by visual trackers, two important ones are occlusions and abrupt motion variations. Occlusions occur when one or more other objects obscure the camera's view of the tracked target. A target may exhibit abrupt variations in apparent motion due to its own unexpected movement, camera movement, or low-frame-rate image acquisition. Each of these issues can cause a tracker to lose its target.

This thesis introduces the idea of learning and propagating tracking information over multiple temporal scales to overcome occlusions and abrupt motion variations. A temporal scale is a specific sequence of moments in time. Models (describing the appearance and/or motion of the target) can be learned from the target's tracking history over multiple temporal scales and applied over multiple temporal scales in the future. With the rise of tracking frameworks that use multiple motion models, there is a need for a broad range of search methods and for ways of selecting between the available motion models.
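
As a rough illustration of learning and applying models over multiple temporal scales, the sketch below treats a temporal scale as a frame stride s, learns a simple mean-displacement motion model for each stride from a track history, and applies each model to predict the position s frames ahead. This is only a minimal sketch under stated assumptions: the stride interpretation, the mean-displacement model, and the names `learn_motion_models` and `predict` are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def learn_motion_models(history: List[Point], scales: List[int]) -> Dict[int, Point]:
    """Learn one motion model per temporal scale (assumed here to be a frame stride).

    For each stride s, the model is the mean displacement between positions
    that are s frames apart in the tracking history. This simple model is an
    illustrative assumption, not the thesis's method.
    """
    models: Dict[int, Point] = {}
    for s in scales:
        # Pair every position with the position s frames later.
        pairs = list(zip(history[:-s], history[s:]))
        if not pairs:
            continue  # history too short to learn at this scale
        dx = sum(b[0] - a[0] for a, b in pairs) / len(pairs)
        dy = sum(b[1] - a[1] for a, b in pairs) / len(pairs)
        models[s] = (dx, dy)
    return models


def predict(history: List[Point], models: Dict[int, Point]) -> Dict[int, Point]:
    """Apply each model at its own scale: predict the position s frames ahead."""
    last_x, last_y = history[-1]
    return {s: (last_x + dx, last_y + dy) for s, (dx, dy) in models.items()}


# Usage: a target moving one pixel per frame along x.
track = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
models = learn_motion_models(track, scales=[1, 2, 10])
predictions = predict(track, models)
```

In this sketch, a prediction made at a longer stride skips over the intermediate frames, which is one way a model learned at a coarser temporal scale could still offer a location hypothesis when the target is briefly occluded. The stride of 10 is skipped because the history is too short to learn a model at that scale.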

The potential benefits of learning over multiple temporal scales are first assessed by studying both motion and appearance variations in the ground-truth data associated with several image sequences. A visual tracker operating over multiple temporal scales is then proposed that is capable of handling occlusions and abrupt motion variations.

Experiments are performed to compare the performance of the tracker with competing methods, and to analyze the impact on performance of various elements of the proposed approach. Results reveal a simple, yet general framework for dealing with occlusions and abrupt motion variations. In refining the proposed framework, a search method is generalized for multiple competing hypotheses in visual tracking, and a new motion model selection criterion is proposed.

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Pridmore, T.P.
Valstar, M.F.
Subjects: Q Science > QA Mathematics > QA 75 Electronic computers. Computer science
Q Science > QP Physiology > QP351 Neurophysiology and neuropsychology
Faculties/Schools: UK Campuses > Faculty of Science > School of Computer Science
Item ID: 33056
Depositing User: Khan, Muhammad
Date Deposited: 15 Aug 2016 11:09
Last Modified: 23 Sep 2016 01:56
URI: http://eprints.nottingham.ac.uk/id/eprint/33056
