1 Introduction ................................................. 1
1.1 Context-Sensitive Situated NLG .......................... 1
1.2 Two Architectures for NLG ............................... 3
1.3 Outline of the Thesis ................................... 8
2 Related Work ................................................ 11
2.1 Introduction ........................................... 11
2.2 Context-Sensitive Language Generation .................. 12
2.2.1 Rule-Based Approaches to Generation ............. 12
2.2.2 Generation As Planning .......................... 14
2.2.3 Trainable Generation ............................ 16
2.2.4 Other Approaches ................................ 19
2.3 Reinforcement Learning for NLG ......................... 20
2.4 Joint Treatment of Subtasks ............................ 24
2.5 Graphical Models for Natural Language Generation ....... 28
2.6 Conclusion ............................................. 31
3 Hierarchical Reinforcement Learning for NLG ................. 33
3.1 Introduction ........................................... 33
3.2 Reinforcement Learning ................................. 35
3.2.1 The Markov Decision Process ..................... 37
3.2.2 Policy Learning ................................. 40
3.2.3 An MDP for Referring Expression Generation ...... 44
3.3 Hierarchical Reinforcement Learning .................... 46
3.3.1 The Semi-Markov Decision Process ................ 47
3.3.2 Policy Learning ................................. 49
3.3.3 An SMDP for Referring Expression Generation ..... 52
3.4 A Joint Learning Agent for Situated Interaction ........ 53
3.4.1 The Domain: Generating Instructions in Virtual
Environments (GIVE) ............................. 55
3.4.2 The GIVE-2 Corpus ............................... 55
3.4.3 Corpus Annotation ............................... 56
3.4.4 Hierarchy of Learning Agents .................... 59
3.5 Experimental Setting ................................... 61
3.5.1 The Simulated Environment ....................... 61
3.5.2 A Data-Driven Reward Function ................... 63
3.5.3 Training Parameters ............................. 69
3.6 Experimental Results ................................... 69
3.7 Conclusion ............................................. 73
4 A Hierarchical Information State for Constrained Learning ... 77
4.1 Introduction ........................................... 77
4.2 Content Selection for Situated Interaction ............. 81
4.2.1 Decision Trees for Content Selection ............ 82
4.2.2 Consistency and Alignment in Human Data ......... 86
4.3 Information State ...................................... 87
4.3.1 Informational Components ........................ 87
4.3.2 Formal Representations .......................... 88
4.3.3 Generation Moves ................................ 88
4.3.4 Update Rules .................................... 88
4.3.5 Update Strategy ................................. 89
4.3.6 Example of an Information State for NLG ......... 89
4.4 Combining Hierarchical RL with a Hierarchical
Information State ...................................... 91
4.4.1 The Semi-Markov Decision Process Using an
Information State ............................... 91
4.4.2 The HSMQ-Learning Algorithm Using Constrained
Actions ......................................... 92
4.5 Experimental Setting ................................... 93
4.5.1 A Reward Function for Consistency ............... 93
4.5.2 Training Parameters ............................. 94
4.6 Experimental Results ................................... 94
4.6.1 Simulation-Based Results ........................ 95
4.6.2 Human Rating Study .............................. 96
4.7 Conclusion ............................................. 98
5 Graphical Models for Surface Realisation ................... 101
5.1 Introduction .......................................... 101
5.2 Variation and Alignment in Human Data ................. 102
5.2.1 Variation and Alignment in the GIVE Corpus ..... 104
5.2.2 A Constituent Alignment Score .................. 107
5.3 Representing Generation Spaces as Graphical Models .... 107
5.3.1 (Probabilistic) Context-Free Grammars .......... 108
5.3.2 Hidden Markov Models ........................... 110
5.3.3 Bayesian Networks .............................. 112
5.3.4 Comparison of Graphical Models ................. 117
5.4 Experimental Setting .................................. 119
        5.4.1 Integrating Surface Realisation: A Three-
              Dimensional Reward Function for Situated NLG ... 119
5.5 Experimental Results .................................. 123
5.5.1 Simulation-Based Results ....................... 123
5.5.2 Similarity with Human Authors .................. 127
5.6 Conclusion ............................................ 129
6 Evaluation ................................................. 133
6.1 Introduction .......................................... 133
6.2 Experimental Setting for Navigation in Virtual
Environments .......................................... 135
6.2.1 Experimental Methodology ....................... 135
6.2.2 Experimental Setup ............................. 137
6.2.3 Experimental Results ........................... 143
6.3 Experimental Setting for Navigation in Real
Environments .......................................... 162
6.3.1 Experimental Methodology ....................... 163
6.3.2 Experimental Setup ............................. 164
6.3.3 Experimental Results ........................... 167
6.4 Conclusion ............................................ 173
7 Conclusions and Future Research ............................ 175
7.1 Contributions and Findings ............................ 175
7.2 Future Directions ..................................... 180