Zheng, Wen
(2023)
Modeling context and knowledge for dialogue generation.
PhD thesis, University of Nottingham.
Abstract
Dialogue generation is a cutting-edge AI application, with several AI assistants, such as Google Assistant, Apple Siri, and Microsoft Cortana, already developed by prominent Information and Communications Technology (ICT) companies. However, AI support for open-domain dialogue still presents many challenges, and research continues to address them, including the ultimate challenge of passing the Turing test.
One of the key issues in dialogue generation is the source of content that can be used to train generative models within dialogue systems. Over the past decade, researchers have actively pursued the construction of dialogue-generation datasets (e.g., Wizard of Wikipedia, CMU-DoG, and DailyDialog) and the development of models for context-aware and knowledge-based dialogue generation. A detailed review of the related literature has revealed that the field has evolved rapidly, leading to significant progress but also giving rise to new questions and challenges. This thesis continues this line of research and addresses specific issues in context-aware and knowledge-based dialogue generation, including (1) Context Usage in Dialogue Generation, incorporating intrinsic dialogue properties related to the speakers and content characteristics; (2) Knowledge Injection in Dialogue Generation, enabling the incorporation of multiple sources of knowledge; (3) Knowledge Selection in Dialogue Generation, with the flexibility to separate the injection of knowledge from dialogue generation; (4) Term-Level Knowledge De-noising in Dialogue Generation, simulating response representations that can be used for knowledge de-noising in the test phase; and (5) Differentiating Context Use for Knowledge Selection and Response Generation, supporting distinct uses of context and contextualized knowledge for selecting knowledge and generating responses.
This research resulted in new methods and models that represent the key novel contributions of this thesis. (1) Starting from context-only dialogue generation, a context-aware dialogue generation model named GMATs was implemented, leveraging intrinsic dialogue characteristics such as speaker roles and part-of-speech indicators for the dialogue utterances. (2) Given the evidence that knowledge can help improve dialogue generation, a Transformer-based knowledge injection model, TED, was designed, featuring weights for different knowledge units. This work led to the conclusion that knowledge should not be injected indiscriminately but should be carefully selected. (3) To address this issue, a knowledge selection mechanism, named TPPA, was explored, approximating a post-retrieved knowledge network with a response-retrieved knowledge network, enabling TPPA to emulate the ground-truth response and retrieve the relevant knowledge sentences. (4) Furthermore, an investigation into how to de-noise the injected knowledge at the term level was conducted, and the KTWM model was introduced to filter out noise during model training. (5) Finally, to construct a unified dialogue generation framework, the CKL model was proposed, built on the premise that context plays a different role in the knowledge selection task than in the dialogue generation task.
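The idea of assigning weights to knowledge units before injection can be illustrated with a minimal sketch. This is a hypothetical, simplified illustration and not the thesis's actual TED implementation: knowledge sentence vectors are scored by dot-product similarity to a dialogue context vector, normalised with a softmax, and fused by weighted sum, so that knowledge aligned with the context dominates the injected representation.

```python
import math

def knowledge_weights(context_vec, knowledge_vecs):
    """Softmax over dot-product scores between the context vector and
    each knowledge sentence vector (illustrative scoring, not TED's)."""
    scores = [sum(c * k for c, k in zip(context_vec, kv)) for kv in knowledge_vecs]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_knowledge(context_vec, knowledge_vecs):
    """Weighted sum of knowledge vectors; higher-weighted sentences
    contribute more to the fused representation passed to the generator."""
    w = knowledge_weights(context_vec, knowledge_vecs)
    dim = len(context_vec)
    return [sum(w[i] * knowledge_vecs[i][d] for i in range(len(w)))
            for d in range(dim)]

# Toy example: the first knowledge sentence aligns with the context.
context = [1.0, 0.0]
knowledge = [[1.0, 0.0], [0.0, 1.0]]
weights = knowledge_weights(context, knowledge)
print(weights[0] > weights[1])  # the aligned sentence gets the larger weight
```

In a real model the vectors would be learned Transformer encodings and the weighting a trained attention layer; the sketch only conveys the principle that knowledge units are weighted rather than injected indiscriminately.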
The effectiveness of various models for incorporating context and knowledge into dialogue generation was empirically investigated in this thesis. Specifically, the impact of context awareness, knowledge weighting, knowledge selection, term-level de-noising, and a unified model approach on the performance of the dialogue generation methods proposed in this thesis was evaluated. The experimental results demonstrate that the use of dialogue context is critical for improving the performance of response generation: a 12.8% improvement (over the Syntax-infused BART) in the BLEU-2 score was achieved by the GMATs model, which considers intrinsic dialogue characteristics such as speaker role and part-of-speech. Additionally, compared to previous works, further performance improvements of 23.1% (TED vs. WSeq-Sum) and 4.4% (TPPA vs. TED) on the BLEU-2 and METEOR metrics, respectively, were achieved by assigning different weights to knowledge sentences (TED model) and selecting knowledge using the TPPA method. The proposed KTWM model for term-level de-noising also resulted in a 6.3% improvement in the BLEU-2 score on top of TED. Finally, the CKL model, which incorporates all of these approaches into a unified framework, outperformed the best previous study, DIALKI, by 15.2% on the BLEU-2 score and even surpassed TED by a large margin (86.5%). Overall, these findings suggest that differentiating context usage between the knowledge selection task and the response generation task is critical when designing a dialogue generation model, and that incorporating knowledge into dialogue systems using sentence-level knowledge selection and term-level de-noising can significantly enhance their performance.
Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Zhou, Ke; Milic-Frayling, Natasa
Keywords: Dialogue generation, natural language processing, TED, KTWM, TPPA, GMATs, CKL, response generation, generative model, context-aware, knowledge grounded, knowledge selection
Subjects: Q Science > QA Mathematics > QA 75 Electronic computers. Computer science
Faculties/Schools: UK Campuses > Faculty of Science > School of Computer Science
Item ID: 76364
Depositing User: Zheng, Wen
Date Deposited: 12 Dec 2023 04:40
Last Modified: 12 Dec 2023 04:40
URI: https://eprints.nottingham.ac.uk/id/eprint/76364