The Information System for Legal Terminology bistro
bistro is an online Information System for the research, analysis and translation of terms and texts in the Italian, German and Ladin legal and administrative language. bistro was developed by the scientific collaborators of the Institute for Specialised Communication and Multilingualism at the European Academy of Bolzano.

The name of the system recalls the atmosphere of small Parisian restaurants where food is served quickly. It is said that the bistros owe their name to Russian soldiers who reached Paris after Napoleon’s defeated army and used to shout Bystro! Bystro! (Quick! Quick!) to the French waiters in order to be served more speedily. Hence the name of the restaurants and of the bistro Information System that allows the user to perform searches and find complete and useful information very rapidly.

bistro consists of:
The bistro Information System is freely available online.

CATEx corpus

This corpus was built within the CATEx (Computer Assisted Terminology Extraction) project. It is a collection of Italian legal texts with their translations into German. The entire corpus is available online via the bistro Information System. The texts are part of the so-called Blaue Reihe, a series of translations of the main Italian legal codes into German. A collection of local bilingual laws is also part of the corpus.

Contents
The corpus contains approximately 5 million words in Italian and German. The two language versions of each text (Italian and German, the two official languages of the province) are aligned at paragraph level. This means that, for every paragraph in the corpus, the user can access the equivalent text excerpt in the other language.

Purpose

The CATEx corpus is a tool for linguistic and terminological analyses in general. More specifically, it allows users to compare the two languages rapidly and with great flexibility. A dedicated user interface allows users to carry out detailed and specific searches within the corpus. The user can look up single-word or multiword expressions as well as entire collocations, such as delitto, colpa grave or dolo o colpa grave. Combined searches are also possible. For example, this option makes it possible to find all instances of colpa translated as Fahrlässigkeit and not as Schuld.

Further readings

The CATEx project was coordinated by Johann Gamper. For further information read the following articles:
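The combined search described under Purpose can be illustrated with a minimal Python sketch. It is not bistro's actual implementation: the sample sentences, the `ALIGNED` list and the function names `contains` and `combined_search` are hypothetical, and the only assumption taken from the text is that every Italian paragraph is stored together with its aligned German paragraph.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical toy data: each tuple is one aligned paragraph pair (Italian, German).
ALIGNED: List[Tuple[str, str]] = [
    ("Il danno è dovuto a colpa grave.",
     "Der Schaden beruht auf grober Fahrlässigkeit."),
    ("La colpa dell'imputato è stata accertata.",
     "Die Schuld des Angeklagten wurde festgestellt."),
    ("Il delitto è punito a titolo di dolo.",
     "Das Verbrechen wird bei Vorsatz bestraft."),
]


def contains(text: str, expression: str) -> bool:
    """Case-insensitive whole-word match for a single-word or multiword expression."""
    pattern = r"\b" + re.escape(expression) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def combined_search(
    pairs: List[Tuple[str, str]],
    source_term: str,
    target_term: str,
    excluded_target_term: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return aligned pairs whose source paragraph contains source_term and whose
    target paragraph contains target_term but not excluded_target_term."""
    hits = []
    for source, target in pairs:
        if not contains(source, source_term):
            continue
        if not contains(target, target_term):
            continue
        if excluded_target_term and contains(target, excluded_target_term):
            continue
        hits.append((source, target))
    return hits


# All paragraphs where colpa is rendered as Fahrlässigkeit and not as Schuld.
for italian, german in combined_search(ALIGNED, "colpa", "Fahrlässigkeit", "Schuld"):
    print(italian, "|", german)
```

Because the filter works on whole aligned paragraphs, it only shows that both expressions occur in the same paragraph pair; it does not prove that one is the translation of the other, which still has to be checked by the reader.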
CLE corpus

The Ladin corpus of EURAC (CLE) was built within TermLad II, a project on Ladin legal and administrative terminology. It consists of a collection of administrative texts in electronic format, which were provided by the municipalities and public bodies of the Province of Bolzano/Bozen. Every text (mostly with Italian as the source language) was linked to its equivalent translations into Ladin and German, i.e. the texts were aligned at paragraph/sentence level. The corpus contains about 5000 documents (8.5 million words) in German, Italian and Ladin.

Purpose

The CLE corpus is mainly used for the elaboration of terminology entries. Terms, definitions and contexts contained in the corpus are extracted for this purpose. Different translations and their frequency can be analysed. The CLE corpus is freely accessible to the public via a dedicated user interface. The user can search the corpus for German, Italian and/or Ladin terms.

Text collection

All the texts contained in the CLE were provided by the Municipalities of the Ladin valleys Val Badia and Val Gardena, by the Office for Language Issues of the Autonomous Province of Bolzano/Bozen and by the Ladin Pedagogical Institute (IPL).

Further readings

Streiter, O./ Stuflesser, M./ Ties, I.: CLE, an Aligned, Tri-lingual Ladin-Italian-German Corpus. Corpus Design and Interface. LREC 2004, workshop on ‘First Steps for Language Documentation of Minority Languages: Computational Linguistic Tools for Morphology, Lexicon and Corpus Compilation’, Lisbon, 24th May 2004.

Language tools

The language tools available in bistro are the term extraction and term recognition tools. Both aim at speeding up and supporting terminology work. Users can copy and paste a text of their choice (accepted formats are *.doc, *.rtf, or *.txt) or an internet address (URL) into the interface of the term recognition tool. bistro will then process the text and highlight all terms that are already contained in the bistro terminology database.
Each highlighted term gives direct access to its full terminology entry. This tool allows the user to
The term extraction tool can be used in exactly the same way. In this case bistro finds potential ‘term candidates’ in the texts indicated by the user and drafts a list (a glossary) that can be used for further terminology research.
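The difference between the two language tools can be illustrated with a short Python sketch. It does not reproduce how bistro works internally: `KNOWN_TERMS` is a hypothetical stand-in for the terminology database, the `[[...]]` markers stand in for highlighting, and the candidate extraction is a deliberately naive word n-gram count chosen only for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical stand-in for the terminology database: term -> entry identifier.
KNOWN_TERMS: Dict[str, str] = {"colpa grave": "entry-1", "dolo": "entry-2"}


def highlight_known_terms(text: str, known_terms: Iterable[str]) -> str:
    """Term recognition: wrap every known term found in the text in [[...]]."""
    # Longest terms first, so multiword terms win over their shorter parts.
    terms = sorted(known_terms, key=len, reverse=True)
    if not terms:
        return text
    alternatives = "|".join(re.escape(term) for term in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda match: "[[" + match.group(0) + "]]", text)


def extract_candidates(
    text: str,
    known_terms: Iterable[str],
    stopwords: Set[str],
    max_len: int = 2,
) -> List[Tuple[str, int]]:
    """Naive term extraction: count word n-grams (1..max_len words) that do not
    start or end with a stopword and are not already known terms."""
    known = {term.lower() for term in known_terms}
    words = re.findall(r"\w+", text.lower())
    counts: Counter = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in stopwords or gram[-1] in stopwords:
                continue
            candidate = " ".join(gram)
            if candidate not in known:
                counts[candidate] += 1
    return counts.most_common()


sample = "Risponde per dolo o colpa grave."
print(highlight_known_terms(sample, KNOWN_TERMS))
print(extract_candidates(sample, KNOWN_TERMS, stopwords={"per", "o"}))
```

Recognition only marks what the database already contains, whereas extraction proposes new candidates, so the resulting list still needs to be reviewed by a terminologist before any entry is created.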