Towards a historical treebank of Middle and Modern Welsh
Syntactic parsing
Keywords:
Middle Welsh language, Historical corpora, Historical syntax
Abstract
This article examines various issues involved in constructing a parsed Penn-style representative historical corpus of Middle and Modern Welsh. Specifically, it focuses on what structures to adopt for constituency-based structural descriptions in three case studies: (i) whether to adopt relatively more or less hierarchical structures at the phrasal level and above; (ii) how to deal with complex prepositional phrases, typically containing a grammaticalizing or grammaticalized noun as one of their elements; and (iii) how to deal with coordination of main clauses and omission of elements shared between clauses. In each case, we see how conventions need to be adopted that facilitate maximal ease of searching for potential users of the corpus; that are robust across many centuries of language change; and that permit efficient and consistent parsing by a team of annotators.
Published
2022-06-27
Section
Annotating Historical Corpora special issue
Copyright (c) 2022 Marieke Meelen, David Willis

This work is licensed under a Creative Commons Attribution 4.0 International License.
Articles appearing in Journal of Historical Syntax are published under a Creative Commons Attribution License. Authors retain copyright.