Original Research

Using ParaConc to extract bilingual terminology from parallel corpora: A case of English and Ndebele

Ketiwe Ndhlovu
Literator | Vol 37, No 2 | a1278 | DOI: https://doi.org/10.4102/lit.v37i2.1278 | © 2016 Ketiwe Ndhlovu | This work is licensed under CC Attribution 4.0
Submitted: 28 January 2016 | Published: 26 October 2016

About the author(s)

Ketiwe Ndhlovu, Department of Linguistics and Modern Languages, University of South Africa, South Africa


The development of African languages into languages of science and technology depends on action being taken to promote their use in specialised fields such as technology, commerce, administration, media, law, science and education. One possible way of developing African languages is the compilation of specialised dictionaries (Chabata 2013). This article explores how parallel corpora can be interrogated using a bilingual concordancer (ParaConc) to extract bilingual terminology that can be used to create specialised bilingual dictionaries. An English–Ndebele Parallel Corpus was used as a resource and, through ParaConc, an alphabetic list was compiled from which headwords and their possible translations were sought. These translations provided candidate terms for entry in a bilingual dictionary. The frequency feature and the ‘hot words’ tool in ParaConc were used, respectively, to determine the suitability of terms for inclusion in the dictionary and to identify possible synonyms. Because parallel corpora are aligned and data are presented in context (Key Word in Context), it was possible to draw examples showing how headwords are used. This approach produced results quickly and accurately, whilst minimising the manual translation of terms. The quality of the dictionary depends on the quality of the corpus; hence the need for a representative and clean corpus is emphasised. Although technology has multiple benefits in dictionary making, the research underscores the importance of collaboration between lexicographers, translators, subject experts and target communities so that representative dictionaries are created.
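The workflow summarised in the abstract (an alphabetic word list, frequency counts, and Key Word in Context lines shown next to their aligned translations) can be illustrated with a minimal Python sketch. This is not ParaConc's implementation, which is a separate desktop application; the function names, the tokenisation rule and the toy aligned segments are all assumptions made for illustration, and the target-side strings are placeholders rather than real Ndebele text.

```python
import re
from collections import Counter

# Hypothetical aligned segments: (English source, aligned target segment).
# The target strings are placeholders, NOT real Ndebele translations.
ALIGNED = [
    ("The court granted the appeal.", "<target segment 1>"),
    ("The appeal was heard by the court.", "<target segment 2>"),
    ("A new law was passed.", "<target segment 3>"),
]


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens (assumed rule)."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(pairs):
    """Count how often each source-language token occurs in the corpus."""
    counts = Counter()
    for source, _target in pairs:
        counts.update(tokenize(source))
    return counts


def alphabetic_list(counts):
    """Return (word, frequency) pairs sorted alphabetically by word."""
    return sorted(counts.items())


def kwic(pairs, keyword, width=20):
    """Collect Key Word in Context rows for a keyword.

    Each row is (left context, matched word, right context, aligned target),
    so a candidate translation can be read off the aligned segment.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(keyword), re.IGNORECASE)
    rows = []
    for source, target in pairs:
        for match in pattern.finditer(source):
            left = source[max(0, match.start() - width):match.start()]
            right = source[match.end():match.end() + width]
            rows.append((left, match.group(0), right, target))
    return rows


if __name__ == "__main__":
    freqs = word_frequencies(ALIGNED)
    for word, count in alphabetic_list(freqs):
        print(f"{word}\t{count}")
    for left, key, right, target in kwic(ALIGNED, "court"):
        print(f"{left:>20} [{key}] {right:<20} | {target}")
```

In a sketch like this, the frequency counts stand in for the frequency feature used to judge whether a term merits a dictionary entry, and the aligned target segment in each concordance row is where a lexicographer would look for candidate equivalents and usage examples.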






Crossref Citations

1. Multilingual parallel corpus: An institutional resource for terminology development at the University of South Africa (Unisa)
Koliswa Moropa, Bulelwa Nokele
Research in Corpus Linguistics, vol. 11, issue 2, first page 141, 2023
doi: 10.32714/ricl.11.02.08