WP3-en – CORTEX

work plan

WP3. Natural Language Generation and Use Cases

This final WP will contribute to fulfilling OB4 and OB5. Its purpose is to apply the knowledge-enhanced NLG approaches proposed and developed in the project to diverse scenarios and use cases, in order to validate them and show their appropriateness in real contexts. Each scenario will integrate the findings and outcomes of WP1 and WP2 and will be evaluated with the specific and standard metrics appropriate for its setting. The scenarios envisaged for this project are explained below.

Task 3.1 Text Summarisation

Text summarisation aims to synthesise information, keeping only what is relevant (Syed, Gaol and Matsuo, 2021). Although extractive approaches predominate in research, they are limited to literally copying information from the input and pasting it into the output summary. Abstractive summarisation, on the other hand, is more powerful but also more challenging. The goal of this task is to address abstractive summarisation, integrating a strong NLG component from the results of WP2. This will open up several domains for experimentation (e.g., finance, journalism, health, or education, to name a few). Integrating a knowledge-enhanced NLG component into the abstractive summarisation process would contribute to producing more human-like summaries, as it will make it possible to detect and infer relevant information even when it is described through complex events spread over several non-consecutive sentences. It will also enable reliable and precise paraphrasing of the source documents from which the summary will be generated.
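To make the limitation of extractive approaches concrete, the following is a minimal sketch of a frequency-based extractive summariser. It is an illustration only and not the approach planned for this task: the stopword list, the scoring rule and the function names (`split_sentences`, `extractive_summary`) are assumptions introduced here. Whatever the input, every sentence it returns is copied verbatim from the source, which is precisely the behaviour that abstractive summarisation is meant to overcome.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "on", "for", "was", "by"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text, n_sentences=1):
    """Return the n highest-scoring sentences, unchanged and in source order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        # Average document frequency of the sentence's content words.
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])  # restore original order
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    doc = (
        "The bank raised interest rates. "
        "Analysts expected the bank to raise rates. "
        "The weather was sunny."
    )
    print(extractive_summary(doc, n_sentences=1))
```

Because the output can only be a subset of the input sentences, such a system cannot merge information scattered across non-consecutive sentences or rephrase it, which is where a generation component is needed.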

Milestone: Analysis and development of an abstractive text summariser that integrates the NLG approach of task 2.3 as one of its main components.

Task 3.2 Creative text generation (storytelling and poetry texts)

One of the most complex NLG scenarios is the production of creative or artistic texts, including storytelling (fictional narrative) or poetry (Barros et al., 2019; Bena and Kalita, 2019; Chakrabarty et al., 2021; Lau et al., 2018; Papay and Padó, 2020; Vicente et al., 2018; Wang et al., 2021). In both cases, an NLG system must deal with specific linguistic phenomena such as the type and structure of the narrative events, temporal or causal relationships, the depiction of mental states, figurative language, or prosodic devices such as metre and rhythm, among others. Although these phenomena also appear in other types of text, it is in creative texts that they are used most frequently.

The objective of this scenario is to go deeper into the computational analysis of these textual phenomena (in the framework of Computational Literary Studies) and to analyse to what extent they affect NLG (Van Heerden & Bas, 2021). Our aim is twofold. First, we will analyse and automatically extract literary events, their structures and the temporal or causal relations between them (Sims et al., 2019; Feder et al., 2021). For this task we will take advantage of the European Literary Text Collection, a multilingual corpus of European novels (Odebrecht et al., 2019). Second, we will explore the formal analysis of metre and rhythm in a corpus of poetry, such as the ADSO corpus (Navarro Colorado et al., 2016), a large corpus of Spanish poetry with metrical information, in order to introduce more realistic prosody into NLG.

Milestone: A formal computational model for the analysis and generation of creative texts.

Task 3.3 Chatbots for emotional intelligence

Emotional intelligence education (Goleman, 1995) is a pending issue for society that could potentially contribute to solving many current social problems. Some of these include bullying, suicide, gender violence, stress, anxiety, depression, anorexia, discrimination, and autism. We will demonstrate the benefits of chatbots for helping users to improve their emotional intelligence and to better manage and understand their emotions. Specifically, the chatbot will work with “tales with a message”. These folk stories or fables are appropriate in that they represent the millennia-long human tradition of skilfully transmitting and understanding knowledge. They are easily understood and often carry a simple moral or associated metaphors. Several studies support the usefulness of these tales in this type of research (Färber & Färber, 2015; Odabasi et al., 2012; Kulikovskaya & Andrienko, 2016). We will apply the text generation techniques researched in the previous WPs to help users develop social cognition (the ability to identify and understand social situations; Uekerman et al., 2010), as well as to improve their reading comprehension level. For instance, a reading comprehension question generator will be able to help the user to better understand the tale.

Milestone: Develop a chatbot for tales with a message that integrates text generation techniques in order to improve emotional intelligence and reading comprehension.

Task 3.4 Making English-language metaphors more intelligible

The study of metaphors in specific domains in the English language is motivated by the desire to promote inclusivity and falls within the area known as English for Specific Purposes. The goal is to facilitate the human assimilation of abstract information when it appears in an unfamiliar context and, ultimately, to provide an equivalent meaning in a simpler and more straightforward manner. This is beneficial to the new generation of digital citizens or “netizens”, who need to operate across national boundaries, often using English as their vehicle. Indeed, according to Rai and Chakraverty (2020), there is a pressing need to process metaphors in a common language for all communities, as they are often ambiguous and require up-to-date global knowledge to understand their meaning and purpose.

In the financial domain, to take one example among many, daily communication very often relies on conceptual metaphors that are hard to unravel, which can result in outsiders feeling excluded from financial trading communities. Phrases such as “bear market bounce” or “dead cat bounce”, among many others, have their own domain-based meaning, which may be difficult to decode, especially for non-expert communities or other interested parties for whom English is a foreign language. Therefore, resources that facilitate access to this knowledge through the development of new technologies would have a positive impact on, for example, an individual’s management of their personal finances and investments.

In this scenario, we will take the conceptual metaphor theory developed by Lakoff and Johnson (1980) as a basis and complement it with the knowledge acquisition techniques from WP1, which will allow us to identify metaphors. We will then employ the methods developed in WP2 to make these metaphors more intelligible by using plain and accessible language, so that both experts and non-experts in the specific domain can understand them.
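The identify-then-simplify pipeline described above can be sketched, in a deliberately naive form, as a glossary lookup. This is not the WP1 or WP2 method: the glossary, its plain-language glosses and the function name `explain_metaphors` are all hypothetical and introduced here only for illustration. A real system would need to detect metaphors that are not listed in advance and to generate the paraphrase rather than look it up.

```python
import re

# Hypothetical hand-written glossary mapping domain metaphors to plain-language glosses.
GLOSSARY = {
    "dead cat bounce": "a brief price recovery during a longer decline",
    "bear market bounce": "a temporary price rise while the market is falling overall",
}


def explain_metaphors(text, glossary=GLOSSARY):
    """Append a plain-language gloss after each known metaphor found in the text.

    Returns the annotated text and the list of metaphors that were identified.
    """
    found = []

    def annotate(match):
        phrase = match.group(0)
        key = phrase.lower()
        found.append(key)
        return f"{phrase} ({glossary[key]})"

    # Try longer phrases first so that overlapping entries resolve to the longest match.
    alternatives = "|".join(re.escape(p) for p in sorted(glossary, key=len, reverse=True))
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    return pattern.sub(annotate, text), found


if __name__ == "__main__":
    annotated, metaphors = explain_metaphors("Traders called it a dead cat bounce.")
    print(annotated)
    print(metaphors)
```

Keeping the original phrase and adding the gloss in parentheses, rather than replacing the metaphor outright, lets non-expert readers learn the domain term while still understanding the sentence.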

Milestone: Development of resources to facilitate the intelligibility of English-language metaphors in specific domains.