Publications
Extended constituent-to-dependency conversion for English
Editors:
- Joakim Nivre
- Heiki-Jaan Kaalep
- Kadri Muischnek
- Mare Koit
Publication year: 2007
Language: English
Pages: 105-112
Publication/Journal/Series: NODALIDA 2007 Proceedings
Document type: Conference paper
Publisher: University of Tartu
Abstract
We describe a new method to convert English constituent trees using the Penn Treebank annotation style into dependency trees. The new format was inspired by annotation practices used in other dependency treebanks with the intention to produce a better interface to further semantic processing than existing methods. In particular, we used a richer set of edge labels and introduced links to handle long-distance phenomena such as wh-movement and topicalization.
The resulting trees generally have a more complex dependency structure. For example, 6% of the trees contain at least one nonprojective link, which is difficult for many parsing algorithms. As can be expected, the more complex structure and the enriched set of edge labels make the trees more difficult to predict, and we observed a decrease in parsing accuracy when applying two dependency parsers to the new corpus. However, the richer information contained in the new trees resulted in a 23% error reduction in a baseline FrameNet semantic role labeler that relied on dependency arc labels only.
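To make the notion of a nonprojective link concrete, the following is a minimal sketch of a projectivity check, not code from the paper. It assumes a tree is given as a list of head indices (tokens numbered from 1, with 0 as an artificial root) and uses the common definition that an arc is projective when every token strictly between head and dependent is a descendant of the head. The function name `nonprojective_arcs` and the example head assignment are illustrative assumptions.

```python
from typing import List, Tuple


def nonprojective_arcs(heads: List[int]) -> List[Tuple[int, int]]:
    """Return the (head, dependent) arcs that are nonprojective.

    heads[i] is the head of token i + 1; 0 denotes the artificial root.
    An arc is treated as projective if every token strictly between the
    head and the dependent is a descendant of the head.
    """
    n = len(heads)

    def is_descendant(node: int, ancestor: int) -> bool:
        # Every token descends from the artificial root.
        if ancestor == 0:
            return True
        steps = 0
        # Climb the head chain; the step counter guards against cycles.
        while node != 0 and steps <= n:
            node = heads[node - 1]
            if node == ancestor:
                return True
            steps += 1
        return False

    result = []
    for dep in range(1, n + 1):
        head = heads[dep - 1]
        lo, hi = min(head, dep), max(head, dep)
        if any(not is_descendant(k, head) for k in range(lo + 1, hi)):
            result.append((head, dep))
    return result


# Hypothetical analysis of "A hearing is scheduled on the issue today ."
# in which "on" attaches to "hearing" and "today" to "scheduled".
example_heads = [2, 3, 0, 3, 2, 7, 5, 4, 3]
print(nonprojective_arcs(example_heads))
```

Under this assumed analysis the two crossing arcs, hearing-on and scheduled-today, are the ones reported. A corpus-level figure such as the 6% quoted above would correspond to the share of trees for which this list is non-empty.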
Keywords
- Technology and Engineering
- dependency syntax
- treebanks
- Natural language processing
Other
16th Nordic Conference of Computational Linguistics
2007-05-25/2007-05-26
Tartu, Estonia
Published
Yes
- ISBN: 978-9985-4-0514-7

