Discourse, information structure and prosody in corpora

LPL seminar

Arndt Riester, Ina Rösiger & Uwe Reyle

IMS, University of Stuttgart, Germany

LPL, conference room B011, 5 avenue Pasteur, Aix-en-Provence

Free admission

14h-14h40 Arndt Riester, IMS, University of Stuttgart, Germany
Transforming written or spoken discourse into QUD trees
The idea that topicality (or questions) should be seen as the general organizing principle of discourse can be traced back at least to Polanyi (1988), Stutterheim and Klein (1989), and Van Kuppevelt (1995). Notably, contemporary frameworks of discourse structure, like SDRT (Asher and Lascarides 2003) or RST (Taboada and Mann 2006), base their analyses on the identification of rhetorical relations rather than of questions. By contrast, I will present a new method (Riester et al., to appear) for the structural analysis of discourse ("QUD trees"), which is based on the reconstruction of implicit Questions under Discussion (QUDs, cf. Roberts 2012). This approach has the additional benefit of allowing a comprehensive information-structural analysis of the data under investigation. The method has been successfully applied, for instance, to French conversations, German radio interviews, and fieldwork data from the Austronesian language Sumbawa.

14h40-15h20 Ina Rösiger, IMS, University of Stuttgart, Germany
The DIRNDL and Stuttgart SFB 732 Silver Standard Collection corpora as resources for information status and prosodic analysis
In the first part of the talk, I will present two corpus resources:
(i) DIRNDL, a corpus of German radio news annotated with prosody, information status (i.e. the degree of givenness of referring expressions), coreference, and bridging links, and
(ii) the Stuttgart SFB 732 Silver Standard Collection corpus, a corpus of German radio interviews that is currently being annotated with information status.
I will describe the annotation process, which follows the guidelines on information status annotation by Riester and Baumann (2012/2017), as well as the corpus format and the automatic annotations contained in the final corpora.
In the second part, I will show how computational studies can benefit from these corpora, concentrating on the classic NLP task of coreference resolution.
I will report on two recent experiments on the relation between coreference and prosody, in which we found that manual and automatic prosodic labels can help to improve coreference resolution.

15h20-16h Uwe Reyle, IMS, University of Stuttgart, Germany
Joint information structure and discourse structure analysis in an Underspecified DRT framework
We present major aspects of a method for the analysis of natural language in terms of information structure and discourse structure using Questions under Discussion (QUDs). The main purpose is to describe a semantic implementation of the principles that determine the formulation of QUDs in Underspecified Discourse Representation Theory (UDRT).

