Statistical Identification of Pleonastic Pronouns

Authors

Marcus Stamborg
Pierre Nugues

Summary, in English

This paper describes an algorithm that identifies pleonastic pronouns using statistical techniques. The training step uses a coreference-annotated corpus of English and focuses on a set of pronouns such as "it". As far as we know, there is no corpus with pleonastic annotation. The main idea of the algorithm is therefore to recast the definition of pleonastic pronouns as pronouns that never occur in a coreference chain. We integrated this algorithm into an existing coreference solver (Björkelund and Nugues, 2011) and measured the overall performance gain brought by removing pleonastic "it". We observed an improvement of 0.42 points over a CoNLL score of 59.15. The complete system (Stamborg et al., 2012) participated in the CoNLL 2012 shared task (Pradhan et al., 2012), where it ranked 4th.
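The relabeling idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the data layout, the `Mention` type, and the `label_pronouns` function are hypothetical, and the sketch covers only how training labels might be derived from coreference chains, not the statistical classifier trained on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention span: sentence index plus inclusive token range."""
    sentence: int
    start: int
    end: int


def label_pronouns(sentences, chains, targets=frozenset({"it"})):
    """Label each target pronoun as pleonastic (True) or referential (False).

    Assumption for this sketch: a pronoun is treated as pleonastic exactly
    when it does not appear as a single-token mention in any coreference chain.
    """
    # Collect positions of all single-token mentions that belong to a chain.
    in_chain = {
        (m.sentence, m.start)
        for chain in chains
        for m in chain
        if m.start == m.end
    }
    labels = {}
    for s_idx, tokens in enumerate(sentences):
        for t_idx, token in enumerate(tokens):
            if token.lower() in targets:
                labels[(s_idx, t_idx)] = (s_idx, t_idx) not in in_chain
    return labels


# Toy corpus (invented for illustration, not taken from the paper).
sentences = [
    ["It", "is", "raining", "."],
    ["I", "bought", "a", "book", "and", "read", "it", "."],
]
# One chain linking "a book" (tokens 2-3) and "it" (token 6) in sentence 1.
chains = [[Mention(1, 2, 3), Mention(1, 6, 6)]]

labels = label_pronouns(sentences, chains)
```

In this toy corpus, the "It" of "It is raining" belongs to no chain and is labeled pleonastic, while the "it" that refers to "a book" is labeled referential. Labels derived this way could then serve as training targets for a classifier.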

Department/s

Publishing year

2012

Language

English

Pages

67-68

Publication/Journal/Series

SLTC 2012 : The Fourth Swedish Language Technology Conference

Full text

Available as PDF - 258 kB

Links

Publication in Lund University research portal

Document type

Conference paper

Publisher

SLTC

Subject

Computer Science

Conference name

The Fourth Swedish Language Technology Conference

Conference date

2012-10-24 - 2012-10-26

Conference place

Lund, Sweden

Status

Published
