
Extended constituent-to-dependency conversion for English

  • Joakim Nivre
  • Heiki-Jaan Kaalep
  • Kadri Muischnek
  • Mare Koit
Publication year: 2007
Language: English
Pages: 105-112
Publication/Journal/Series: NODALIDA 2007 Proceedings
Document type: Conference paper
Publisher: University of Tartu


We describe a new method to convert English constituent trees using the Penn Treebank annotation style into dependency trees. The new format was inspired by annotation practices used in other dependency treebanks with the intention to produce a better interface to further semantic processing than existing methods. In particular, we used a richer set of edge labels and introduced links to handle long-distance phenomena such as wh-movement and topicalization.
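To make the general idea of constituent-to-dependency conversion concrete, here is a minimal sketch of head percolation, the standard technique behind conversions of this kind. It is not the conversion procedure of the paper: the `HEAD_RULES` table, the nested-tuple tree encoding, and the function names are assumptions made for illustration. It produces unlabeled arcs only and does not cover the richer edge labels or the long-distance links described above.

```python
# Toy head rules: for each phrase label, child labels in priority order.
# These are illustrative assumptions, not the rule set used in the paper.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["VBD", "VBZ", "VB", "VP"],
    "NP": ["NN", "NNS", "NNP", "PRP", "NP"],
    "PP": ["IN"],
}


def convert(tree, arcs, counter):
    """Return the token index of the head of `tree`, appending (head, dependent) arcs."""
    label, rest = tree
    if isinstance(rest, str):
        # Preterminal node (POS, word): assign the next 1-based token index.
        counter[0] += 1
        return counter[0]
    child_heads = [convert(child, arcs, counter) for child in rest]
    child_labels = [child[0] for child in rest]
    head_pos = len(rest) - 1  # fallback when no rule matches: rightmost child
    for wanted in HEAD_RULES.get(label, []):
        if wanted in child_labels:
            head_pos = child_labels.index(wanted)
            break
    head = child_heads[head_pos]
    # Every non-head child attaches its own head token to the phrase head.
    for i, child_head in enumerate(child_heads):
        if i != head_pos:
            arcs.append((head, child_head))
    return head


def to_dependencies(tree):
    """Convert a constituent tree to (head, dependent) arcs; 0 is an artificial root."""
    arcs = []
    root = convert(tree, arcs, [0])
    arcs.append((0, root))
    return sorted(arcs, key=lambda arc: arc[1])


# (S (NP (PRP She)) (VP (VBD saw) (NP (DT the) (NN dog))))
example = ("S", [
    ("NP", [("PRP", "She")]),
    ("VP", [("VBD", "saw"), ("NP", [("DT", "the"), ("NN", "dog")])]),
])
print(to_dependencies(example))
```

For the example sentence, the verb "saw" (token 2) becomes the root, with "She" and "dog" as its dependents and "the" attached to "dog".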

The resulting trees generally have a more complex dependency structure. For example, 6% of the trees contain at least one nonprojective link, which is difficult for many parsing algorithms. As can be expected, the more complex structure and the enriched set of edge labels make the trees more difficult to predict, and we observed a decrease in parsing accuracy when applying two dependency parsers to the new corpus. However, the richer information contained in the new trees resulted in a 23% error reduction in a baseline FrameNet semantic role labeler that relied on dependency arc labels only.
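The notion of a nonprojective link can be sketched in code. The check below uses the usual definition: an arc from head h to dependent d is projective if every token strictly between them is a descendant of h. The function name and the head-array encoding are assumptions for illustration, and the input is assumed to be a well-formed tree (no cycles) with 0 as an artificial root.

```python
from typing import List, Tuple


def nonprojective_arcs(heads: List[int]) -> List[Tuple[int, int]]:
    """Return the (head, dependent) arcs that are nonprojective.

    heads[i] is the head of token i + 1 (tokens are 1-based); 0 is the root.
    Assumes the heads form a tree, so every ancestor chain ends at 0.
    """
    result = []
    for dep, head in enumerate(heads, start=1):
        low, high = min(head, dep), max(head, dep)
        for between in range(low + 1, high):
            # Walk up from the intervening token until we reach the head or the root.
            ancestor = between
            while ancestor != 0 and ancestor != head:
                ancestor = heads[ancestor - 1]
            if ancestor != head:
                # The intervening token is not dominated by the head.
                result.append((head, dep))
                break
    return result


# "A hearing is scheduled on the issue today": "on" attaches to "hearing"
# and "today" attaches to "scheduled", so both arcs span outside material.
heads = [2, 3, 0, 3, 2, 7, 5, 4]
print(nonprojective_arcs(heads))
```

A tree counts as nonprojective in the sense used above if this list is non-empty for it.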


  • Computer Science
  • dependency syntax
  • treebanks
  • Natural language processing


16th Nordic Conference of Computational Linguistics
  • ISBN: 978-9985-4-0514-7
