Corpus is a text bank, but primarily it is a knowledge-sharing tool for handling original and translated texts in several languages. The tool makes it easy to be consistent and correct when working with languages, and it protects the individual company against resource-demanding retranslations.
In Corpus, the user may search systematically for words and collocations in collections of original, translated and parallel texts in several (Western) languages. It is possible to create your own text collection and to acquire new knowledge from the shared texts in the common core. The search is performed either in the company's own text collections or in the common core of texts to which all users have access. Each company chooses whether the texts it uploads should form part of the common core.
The concept behind Corpus is similar to that of TermFlower: a common core of shared texts forms the centre of the tool, and around it the participating organisations have established their own internal text collections. An internal text collection is only accessible to the users in the organisation. If you choose to share your text with other users, the text will appear in the common core provided that the company's super user has validated it. Texts are created in Corpus either by import or by copying from one's own collection of material.
The texts in Corpus are organised by text genre and divided into segments that may be either syntactically or semantically motivated. When a user of Corpus searches for a word or a collocation, the search result shows all the segments containing the word searched for as well as the parallel segment from the translated version of the source text, if such a version exists. A single mouse click opens the chosen segment's source text in full, together with the source's bibliographic data.
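The segment lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Corpus itself is implemented: the `Segment` class, its fields and the `search` function are hypothetical names, the sample sentences are invented, and a case-insensitive substring match stands in for whatever matching Corpus actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """One unit of text in the bank (hypothetical structure, for illustration)."""
    source_id: str                      # identifies the text the segment comes from
    text: str                           # the segment in the language searched
    translation: Optional[str] = None   # parallel segment, if a translated version exists


def search(segments: list[Segment], query: str) -> list[tuple[str, Optional[str]]]:
    """Return (segment, parallel segment) pairs for every segment containing the query."""
    needle = query.lower()
    return [(s.text, s.translation) for s in segments if needle in s.text.lower()]


# Invented sample data: two sources, one segment without a translation.
segments = [
    Segment("leaflet-1", "Shake the bottle before use.", "Agiter le flacon avant emploi."),
    Segment("leaflet-1", "Keep the bottle tightly closed."),
    Segment("leaflet-2", "Do not throw away the bottle.", "Ne pas jeter le flacon."),
    Segment("leaflet-2", "Store in a cool, dry place.", "Conserver dans un endroit frais et sec."),
]

hits = search(segments, "bottle")
for text, translation in hits:
    print(text, "|", translation if translation is not None else "(no parallel segment)")
```

Each hit carries its parallel segment when one exists, which mirrors the result list described above. The `source_id` field is what a real tool would use to open the full source text and its bibliographic data.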
Having access to a shared body of texts from a particular subject area makes it easier for the translator, the secretary and the communication assistant to produce correct and consistent texts. The sentence-based search in a given document collection ensures a homogeneous terminology irrespective of industry, and as a tool for optimising the effectiveness of the company's text production and document handling, Corpus has a number of advantages:
Search in Corpus
On the search page, you type in the word or collocation you wish to search for and choose the language to search in. The search finds all the text segments containing the search term, and it is performed both in the internal texts and in the common core. You may limit your search by using the various filters, e.g. only original sources within a certain genre, so that the search returns only the precise type of text needed. By clicking the icon at the far right, you gain access to the entire text and its bibliographic data.
Create your own text collection
All users from the same company are placed in the same group, which is automatically assigned a 'corner' of Corpus for its own texts. This text collection is only accessible to the company's own users, who may search in and change the texts. It is the company's super user who decides whether a shared text should be validated for the common core or whether the text should only be accessible internally.
Create a new text
To add a text, choose Sources and then Create source. On this page, you enter the bibliographic data, e.g. year, language and genre. This is also where you indicate whether the text should be sent for validation in order to form part of the common core or whether it should only be accessible to the company's users.
If the text you wish to create is a comma-separated file or a Trados WinAlign file, it may be imported automatically. Other texts should be created manually.
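As a rough illustration of what importing a comma-separated file could involve, the sketch below reads aligned segment pairs with Python's standard `csv` module. The column layout Corpus expects is not described here, so the two-column layout (source segment, translated segment), the function name `read_aligned_segments` and the sample rows are all assumptions made for the example.

```python
import csv
import io


def read_aligned_segments(csv_text: str) -> list[tuple[str, str]]:
    """Parse comma-separated text into (source, translation) pairs.

    Assumes exactly two columns per row (an assumption of this sketch);
    rows with any other shape, or with an empty source cell, are skipped.
    """
    pairs = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue
        source, translation = (cell.strip() for cell in row)
        if source:
            pairs.append((source, translation))
    return pairs


# Invented sample: the second row quotes a segment that itself contains a comma.
sample = (
    "Shake before use.,Agiter avant emploi.\n"
    '"Store in a cool, dry place.",Conserver dans un endroit frais et sec.\n'
    "row without a translation column\n"
)

pairs = read_aligned_segments(sample)
for source, translation in pairs:
    print(source, "->", translation)
```

Using a CSV parser rather than splitting on commas matters because segments are running text and will often contain commas themselves; quoted fields keep such segments intact.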
Share texts with other users
On the page Create source, you indicate whether you wish to share your text. If you choose Yes, the text is sent to the company's super user, who decides whether the text should appear in the common core for the benefit of other users. The super user can get an overview of the new texts sent for approval on the page Validate.
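The sharing rule can be summarised as a small visibility check. The sketch below is a hypothetical model, not Corpus's actual data structure: the `Source` class, its `shared` and `validated` flags, and the two helper functions are names introduced for illustration, as are the company names and titles.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A text in the bank (hypothetical model, for illustration only)."""
    title: str
    company: str
    shared: bool = False      # the uploader chose Yes on Create source
    validated: bool = False   # the company's super user has approved the text


def pending_validation(sources: list[Source], company: str) -> list[Source]:
    """Texts that a company's super user still has to review."""
    return [s for s in sources if s.company == company and s.shared and not s.validated]


def visible_to(sources: list[Source], company: str) -> list[Source]:
    """Texts a user may search: the company's own texts plus the validated common core."""
    return [s for s in sources if s.company == company or (s.shared and s.validated)]


# Invented sample data covering the three states a text can be in.
sources = [
    Source("Internal style guide", "Alpha"),
    Source("Patient leaflet", "Alpha", shared=True),
    Source("Product summary", "Alpha", shared=True, validated=True),
    Source("Contract template", "Beta"),
]

print([s.title for s in pending_validation(sources, "Alpha")])
print([s.title for s in visible_to(sources, "Beta")])
```

Under this model a shared text stays internal until it is validated, so users in other companies see it only once the super user has approved it.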
Edit a text
You may edit texts uploaded by yourself or by other users in your company. If you choose Sources, you will see an overview of the company's texts. From this list you may change both the bibliographic data and the source text itself. You may also delete a text from the list and thus from Corpus altogether.
About Medicinal Products Corpus
Medicinal Products Corpus is an example of Corpus used within medicine and medical science. Original and translated texts make up the core within genres such as scientific articles, patient information leaflets and product summaries.
About EU Legal Acts Corpus
EU Legal Acts Corpus is another example of Corpus, this one used within the field of law. The EU Legal Acts Corpus allows you to search and re-use translations from the EU's joint set of rules, the Acquis Communautaire, i.e. laws, treaties, rulings, directives, declarations, international agreements etc.
If you are interested in creating your own Corpus, you are welcome to contact Lab manager Morten Pilegaard at KCL for more information.