Het Gesproken Corpus van de zuidelijk-Nederlandse Dialecten


Anne Breitbarth

Melissa Farasyn

Anne-Sophie Ghyselen

Jacques Van Keymeulen


In this paper, we report on the construction of a linguistically annotated pilot corpus of the southern Dutch dialects, based on existing tape recordings from the 1960s and 1970s. The corpus provides audio-aligned transcriptions in two layers, one closer to the dialect and one closer to Standard Dutch; the latter is part-of-speech tagged and syntactically annotated. The corpus is intended to facilitate large-scale research into the syntactic peculiarities of the southern Dutch dialects, which so far could not be investigated systematically and in an easily reproducible way. Two short case studies of such peculiarities, namely V2 violations and the retention of the old preverbal negation particle, are presented to illustrate the need for the corpus.