Linguistik online
 
     


 

Distributed Corpus Query Systems

Tobias Roth (Basel)


 

Abstract

Distributed text corpora have seen little use so far. The Swiss Text Corpus (CHTK) and its partner projects have set up a distributed corpus for German ("Korpus C4"), virtually merging parts of their corpus data and making them available through a common query platform. Based on experience gained during this project, we propose a possible path towards a more standardised interface for distributed corpus queries. Such an interface should make it easier to integrate new as well as existing corpora into distributed corpus systems. Special attention is paid to problems such as responsibility assignment, performance, user management, format unification and metadata synchronisation.

