Multiobjective selection hyper-heuristics using reinforcement learning

Li, Wenwen (2018) Multiobjective selection hyper-heuristics using reinforcement learning. PhD thesis, University of Nottingham.



Real-world optimisation problems are often multiobjective in nature and require a search for optimal trade-off solutions, so many multiobjective metaheuristics have been proposed in the scientific literature. As observed in previous studies, different approaches show strengths on different problems. A natural research question is how to combine the strengths of these multiobjective approaches to obtain improved performance across a range of problems. Hyper-heuristics, which emerged as general-purpose search methods with reusable components, are one design philosophy for achieving that goal. Hyper-heuristics search over the space of (meta)heuristics by either selecting an appropriate (meta)heuristic or generating a new one from given components. Hierarchically, the (meta)heuristics are called low-level (meta)heuristics since they work underneath the selection or generation strategies of the hyper-heuristics. This feature raises the level of generality of search methods and enables multiobjective hyper-heuristics to be applied to a wider range of problem domains than a metaheuristic tailored for a particular problem domain. Learning is a crucial component in such hyper-heuristics, and hence various online and offline learning mechanisms have been adopted within them.

In this thesis, the focus is on online learning based selection hyper-heuristics for multiobjective optimisation. To gain insight into the behaviour and role of online learning mechanisms in selection hyper-heuristics, nine hyper-heuristics, including online learning based, predefined sequence based and random choice based ones, are applied to and analysed on an ‘unseen’ real-world problem, wind farm layout optimisation. The empirical results show that selection hyper-heuristics can indeed exploit the strengths of different multiobjective evolutionary algorithms (MOEAs). They also suggest two research directions: finding a reasonable combination of low-level MOEAs for the selection hyper-heuristic framework to search over, and devising a more effective online learning mechanism for the framework to exploit the strengths of different low-level MOEAs.

Therefore, a critical review of different types of MOEAs is carried out in order to develop a better understanding of their nature, advantages and disadvantages. This review leads to a more informed choice of the low-level metaheuristic set that selection hyper-heuristics operate on. In addition, based on the investigation of hypervolume guided EAs, an improved version of such an algorithm is proposed in this thesis, which is later used as one of the low-level MOEAs in the proposed selection hyper-heuristics.

Following this, two learning automata based selection hyper-heuristics for multiobjective optimisation are proposed, which select an appropriate metaheuristic to apply at a given time based on the information gathered during the search. Due to the complicated nature of multiobjective optimisation, the learning automata in the proposed hyper-heuristics are employed in a non-traditional way, and novel components are also designed to make the best use of the learned information.
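
As a rough illustration of how a learning automaton can drive the choice among low-level metaheuristics, the sketch below uses the textbook linear reward-inaction update. It is not the scheme proposed in the thesis, which the abstract says uses learning automata in a non-traditional way. The names `select_action`, `reward_inaction_update`, `run_selection`, the `improves` feedback callback and the learning rate of 0.1 are all assumptions made for this example.

```python
import random


def select_action(probs, rng):
    """Sample an index according to the selection probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reward_inaction_update(probs, chosen, rewarded, rate=0.1):
    """Textbook linear reward-inaction update (illustrative, not the thesis's scheme).

    On a reward, probability mass shifts towards the chosen action;
    without a reward, the vector is returned unchanged.
    """
    if not rewarded:
        return list(probs)
    return [
        p + rate * (1.0 - p) if i == chosen else (1.0 - rate) * p
        for i, p in enumerate(probs)
    ]


def run_selection(heuristics, improves, steps, rate=0.1, seed=0):
    """Repeatedly pick a low-level heuristic and learn from binary feedback.

    `improves(heuristic, rng)` is a hypothetical callback that applies the
    heuristic and reports whether the search improved.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(heuristics)] * len(heuristics)
    for _ in range(steps):
        k = select_action(probs, rng)
        rewarded = improves(heuristics[k], rng)
        probs = reward_inaction_update(probs, k, rewarded, rate)
    return probs
```

In a multiobjective setting the binary feedback would have to be derived from a quality measure of the whole population, which is one reason a plain scheme like this one is only a starting point.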

The proposed hyper-heuristics are compared with a range of multiobjective approaches, including a state-of-the-art online learning based selection hyper-heuristic, on four problem domains comprising two mathematical benchmark functions and two real-world problems. The experimental results demonstrate the superior performance and generality of the proposed approach. To further challenge the proposed hyper-heuristics, different numbers and types of metaheuristics are incorporated as the low-level metaheuristics and combined with different acceptance strategies. The proposed learning automata based hyper-heuristics are the best-performing ones according to the performance indicator, hypervolume µ-norm.
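
For readers unfamiliar with the hypervolume indicator underlying the comparison, the following is a minimal sketch of how it can be computed for two objectives that are both minimised. It shows only the plain hypervolume and does not reproduce the µ-norm aggregation used in the thesis. The function name `hypervolume_2d` and the choice of reference point are assumptions made for this example.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimised).

    Illustrative two-objective version only; `ref` is a reference point
    that is assumed to be worse than every point of interest.
    """
    # Keep only points that are strictly better than the reference point
    # in both objectives, sorted by the first objective.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        # A point adds area only if it improves on the best second
        # objective seen so far; otherwise it is dominated.
        if f2 < best_f2:
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area
```

For example, the front `[(1, 3), (2, 2), (3, 1)]` with reference point `(4, 4)` covers an area of 6, and adding a dominated point such as `(3, 3)` leaves that value unchanged.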

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Ozcan, Ender; John, Robert
Keywords: Hyper-heuristics, multiobjective optimisation, reinforcement learning, wind farm layout optimisation
Subjects: Q Science > QA Mathematics > QA 75 Electronic computers. Computer science
Faculties/Schools: UK Campuses > Faculty of Science > School of Computer Science
Item ID: 53475
Depositing User: Li, Wenwen
Date Deposited: 19 Dec 2018 12:44
Last Modified: 01 Jul 2019 04:30
