Using ParaConc to extract bilingual terminology from parallel corpora: A case of English and Ndebele
Literator
Field | Value | |
Title | Using ParaConc to extract bilingual terminology from parallel corpora: A case of English and Ndebele / Die gebruik van ParaConc om tweetalige terminologie van parallel korpusse te onttrek: ’n Geval van Engels en Ndebele | |
Creator | Ndhlovu, Ketiwe | |
Description | The development of African languages into languages of science and technology is dependent on action being taken to promote the use of these languages in specialised fields such as technology, commerce, administration, media, law, science and education, among others. One possible way of developing African languages is the compilation of specialised dictionaries (Chabata 2013). This article explores how parallel corpora can be interrogated using a bilingual concordancer (ParaConc) to extract bilingual terminology that can be used to create specialised bilingual dictionaries. An English–Ndebele Parallel Corpus was used as a resource and, through ParaConc, an alphabetic list was compiled from which headwords and possible translations were sought. These translations provided possible terms for entry in a bilingual dictionary. The frequency feature and ‘hot words’ tool in ParaConc were used to determine the suitability of terms for inclusion in the dictionary and to identify possible synonyms, respectively. Since parallel corpora are aligned and data are presented in context (Key Word in Context), it was possible to draw examples showing how headwords are used. This approach produced results quickly and accurately, whilst minimising the manual translation of terms. It was noted that the quality of the dictionary is dependent on the quality of the corpus, hence the need for a representative and clean corpus must be emphasised. Although technology has multiple benefits in dictionary making, the research underscores the importance of collaboration between lexicographers, translators, subject experts and target communities so that representative dictionaries are created.
Die ontwikkeling van die Afrikatale as wetenskap- en tegnologietale hang af van wat gedoen word om die gebruik van hierdie tale in gespesialiseerde domeine soos die tegnologie, handel, administrasie, die media, regte, wetenskap en onderwys te bevorder. Een moontlike manier waarop die Afrikatale ontwikkel kan word, is deur vakwoordeboeke saam te stel (Chabata 2013). Hierdie artikel verken die manier waarop parallelle korpora met ’n tweetalige konkordanser (ParaConc) deursoek kan word vir tweetalige terme wat dan gebruik kan word om tweetalige vakwoordeboeke saam te stel. ’n Engels-Ndebele Parallelle Korpus het as bron gedien en ParaConc is gebruik om ’n alfabetiese lys saam te stel waarvoor vertalings verskaf is. Hierdie vertalings het moontlike terme verskaf wat in ’n tweetalige woordeboek opgeneem kan word. Die frekwensielys in ParaConc is tesame met sy ‘hot words’-instrument aangewend om gepaste terme te bepaal wat in die woordeboek opgeneem kan word en om ook moontlike sinonieme te identifiseer. Aangesien parallelle korpora belyn word, en alle data in konteks (Key Word in Context) verskyn, was dit moontlik om voorbeelde uit te lig om aan te toon hoe trefwoorde gebruik kan word. Met hierdie benadering was dit moontlik om resultate baie vinnig en akkuraat te bekom terwyl die vertaling van terme met die hand bykans uitgeskakel word. Daar word aangetoon dat die gehalte van die woordeboek duidelik afhang van die gehalte van die korpus en derhalwe word die behoefte aan ’n verteenwoordigende en skoon korpus beklemtoon. Alhoewel woordeboekmaak op veelvoudige maniere baat vind by moderne tegnologie, beklemtoon die navorsing ook die belangrikheid van samewerking tussen leksikograwe, vertalers, vakkenners en teikengebruikers sodat verteenwoordigende woordeboeke geskep kan word. | |
Publisher | AOSIS | |
Date | 2016-10-26 | |
Identifier | 10.4102/lit.v37i2.1278 | |
Source | Literator; Vol 37, No 2 (2016); 12 pages; ISSN 2219-8237, 0258-2279 | |
Language | eng | |
Relation |
The following web links (URLs) may trigger a file download or direct you to an alternative webpage to gain access to a publication file format of the published article:
https://literator.org.za/index.php/literator/article/view/1278/2106
https://literator.org.za/index.php/literator/article/view/1278/2105
https://literator.org.za/index.php/literator/article/view/1278/2107
https://literator.org.za/index.php/literator/article/view/1278/2099
|
|