The use of technology to preserve indigenous languages of South Africa

This article falls within the scope of Literator, which is to publish studies in linguistics and literature with a special focus on South African languages. It offers solutions for how minority languages could be preserved in the context of South Africa.


Introduction
South Africa is a multilingual country with 12 official languages, namely English, Afrikaans, Sepedi, Setswana, Sesotho, isiZulu, isiXhosa, Tshivenḓa, Xitsonga, Siswati, isiNdebele and South African Sign Language. These languages are supposed to enjoy equal official status. In practice, however, they do not receive equal attention in development, as some languages are better resourced than others. This is despite section 6(5) of the South African Constitution (1996), which gave the Pan South African Language Board a mandate to 'promote and create conditions for the development of all official languages' and of all other commonly used non-official languages in South African communities. Indigenous languages face limited digital accessibility because of resource scarcity; they must, however, be technologically preserved so that they can be used in various disciplines without losing their identity. Mercuri (2012) posits that language is perceived as a fundamental component of one's cultural identity and that its loss is equivalent to the collapse of an entire culture. UNESCO (2023) also affirms that: [W]hen an indigenous language is lost, not only does the knowledge accumulated by the community of its speakers fade away, but also the world's cultural and biological diversity is jeopardised. (n.p.) Despite the importance of language in society, many of the world's languages have become extinct, whereas others are endangered to varying degrees (Etim 2016). Therefore, the preservation of languages is an imperative undertaking, and all stakeholders must devise useful strategies to make these efforts successful.
In the context of South Africa, Tshivenḓa, Xitsonga, Siswati and isiNdebele are indigenous languages that have fewer speakers and language resources (Luvhengo 2012). Indigenous languages in South Africa must be preserved to ensure that they do not lose their identity and become extinct. The four indigenous languages with the fewest speakers among South Africa's 12 official languages are Xitsonga, Siswati, Tshivenḓa and isiNdebele. The preservation of these languages in South Africa has been a long-standing challenge because of various social and economic factors. With the advancement of technology, opportunities have arisen to preserve and promote the use of these languages. Therefore, this study explores various technological strategies that can be used to preserve the South African indigenous languages. These languages can be preserved by making them widely accessible to users through various strategies such as localisation of daily used technology, translation through crowdsourcing, digitisation and archiving. Digital learning tools such as machine translation (MT) and online dictionaries can also contribute to preserving these languages. Each of these strategies shows how technology could be employed effectively to facilitate the preservation of indigenous languages. This study demonstrates the significance of technology in preserving indigenous languages and promoting their use around the world.

This study focusses on four South African indigenous languages, namely Tshivenḓa, Xitsonga, Siswati and isiNdebele. The aim is to explore various technological strategies that can be used to preserve these languages. The study is divided into five sections. Section 2 discusses related work. Section 3 describes the methodology and section 4 deals with the analysis and discussion of strategies. Lastly, a conclusion is drawn in section 5.

Related work
Osborn (2006) noted that there is minimal use of African languages in information and communication technologies (ICTs) such as computer applications and the Internet, even though these languages are vital for generating and expressing useful knowledge. The author acknowledged that if African languages are utilised in other fields, their inclusion in technology becomes vitally important, because the reliance on European languages such as English and French to transmit contemporary knowledge and information has been detrimental for societies and people who cannot understand these foreign languages. Osborn (2006) also observed that the implementation of African languages in ICTs is hampered by several factors, such as insufficient resources to incorporate African languages in ICT, the non-existence of standardised orthographies for indigenous languages with fewer speakers, and the use of specialised alphabets in many languages that need specialised fonts. The lack of collaboration between ICT developers and African linguists has also contributed to the minimal presence of these languages in ICTs. Despite these challenges, the author argued that information must be made available in African languages through the localisation of ICTs such as computer software and websites.
Riza (2008) studied indigenous languages in Indonesia and determined that there are languages that are facing extinction on a large scale and at a rapid pace. He predicted that the languages and cultures of communities with very few speakers might not survive beyond the end of the current century. This problem arises from linguistic and cultural assimilation with the majority group, migration to cities and a lack of social and economic support from the government. Riza identified three important tasks that need to be carried out to preserve an endangered indigenous language of Indonesia. These tasks are: applying computational linguistics techniques that allow a multidimensional view of language resources, increasing the orientation of regional research centres towards resource creation, and using statistical or empirical models of language, particularly if the language is near extinction. The author also observed that there is a growing understanding of the relevance of corpus resources in Indonesia, which has assisted in the development of resources that can preserve endangered languages.
Galla (2009) described areas in which technology plays a significant role in language and culture revitalisation. The study discussed efforts made by indigenous communities to preserve, maintain and revitalise their languages using technology. An overview was provided of the significance of language preservation for indigenous communities and the challenges that languages face because of globalisation and cultural assimilation. The author explained that languages can be preserved using technology through the creation of language learning applications and online language resources, and the use of social media and other digital platforms to connect language speakers. Preservation and revitalisation can also draw on technologies ranging from wax cylinder recordings to digital audio recordings, from email to chat, from video recording to interactive audio videoconferencing, and from surfing the Internet to playing interactive computer games. Some of the challenges and limitations of using technology to preserve languages were also highlighted, such as the need for resources and expertise and the need to balance the use of technology with more traditional language learning methods.
Osborn (2010) posited that the unavailability of ICTs in indigenous languages decreases the chances of creating and having content in local languages online. This affects the culture and information that could be made available to scholars interested in studying them. The author proposed that ICTs should be localised by translating and culturally adapting applications and software graphical user interfaces to indigenous languages. Localisation can also be achieved by developing online content in various languages and translating the content from a dominant language to other target languages. The author also elucidated that localisation will help many African language speakers to access ICTs and information easily in their local languages. The target group for localisation would be Africans based in Africa, those who live abroad, as well as non-Africans. However, there are several challenges that delay the use of ICT in African languages, such as the expenses associated with translating materials into these languages and the fact that African languages are very diverse. Negative attitudes among educated African language users are another issue that contributes to the limited use of these languages in ICT. Moreover, African languages are marginalised by educational policies in several countries, resulting in a negative effect on their use in ICTs.
Olaifa (2014) examined the role of libraries in language preservation and development and noted that libraries can play a significant role, especially for indigenous and endangered languages. The author discussed the importance of languages and the challenges they face in the modern world. These challenges are language loss, language shift or even language death, because many languages are still not properly documented. To address these challenges, Olaifa explored the role of libraries in language preservation by highlighting the various ways in which libraries can support language documentation, revitalisation and promotion. These include gathering and preserving language materials, providing language learning resources and hosting language events and activities. The study also discussed some of the issues that libraries face in terms of language preservation, such as a lack of resources and expertise, as well as the need for collaboration with language communities. Olaifa contended that libraries may overcome these problems and effectively preserve languages by adopting a community-centred approach and engaging closely with language speakers and other stakeholders.
Mirza and Sundaram (2016) investigated how collective intelligence techniques can be harnessed and leveraged to support the preservation and learning of endangered languages. Mirza and Sundaram (2016) defined collective intelligence as a group's ability to perceive and solve challenges that cannot be solved individually. The authors argued that using teamwork, cooperation, coordination and cognition within communities can assist in overcoming some of the issues faced by language revitalisation projects, such as a lack of resources and expertise to preserve endangered languages. A variety of ways in which collective intelligence techniques can be employed in language preservation initiatives were outlined. These included the creation of online platforms and crowdsourcing tools to help in language documentation and revitalisation. The utilisation of online communities and peer-to-peer learning networks to enhance language acquisition was also viewed as an advantage of collective intelligence techniques. Some of the challenges and limitations that might arise from collective intelligence approaches in language preservation were noted. These challenges include the requirement for participants' trust and cooperation as well as the importance of ensuring that these approaches are respectful and appropriate for different cultures.
Martín-Mor (2017) explored the use of technology in preserving endangered languages. The author focussed on the minority languages of Sardinia, an Italian island with linguistic diversity. He contended that Sardinian languages are in a state of diglossia because official national language policies do not appear to be capable of reversing the severe situation of these languages, leaving their preservation mostly to individual devotion. Martín-Mor outlined three levels of technological measures that may be utilised to preserve Sardinian languages as endangered languages: translation, localisation and the establishment of language tools and resources. The study emphasised that it is vital to involve communities in the preservation of languages, as each technological measure requires linguistic competencies and technical skills from language speakers. The conclusion was that language preservation initiatives must be culturally sensitive and take into consideration each community's specific demands.
Moodley (2020) explored the significance of localising online computer software interfaces to include Setswana for increased accessibility and inclusivity for South African teachers. The study found that teachers' experiences with dual English-Setswana interface educational software significantly impact their attitudes towards using African languages in ICT. The author also noted that despite teachers' satisfaction with the Setswana software, they expressed concerns about the non-translated words and neglect of dialect accommodation. Teachers also felt that the software could be improved by including more Setswana cultural content and providing more support for teachers in using the software in the classroom. The study concluded that the localisation of online computer software into indigenous African languages is essential in ensuring that all South African teachers have access to the ICT they need to teach effectively. Sundani (2023)

Methodology
This is qualitative research that looks at how technology can be used to preserve indigenous languages of South Africa with fewer speakers. This approach seeks to offer in-depth insights and knowledge of real-world situations (Moser & Korstjens 2017). It requires researchers to employ a systematic and rigorous approach to gather, interpret and analyse non-numerical data. This study utilised systematic self-observation (SSO) as its data collection method. This method involves individuals systematically recording their observations and experiences, often through written or digital means, about a specific phenomenon of interest (Lumma & Weger 2021; Rydberg 2023). It emerges from an intention to observe, describe and clarify the patterns, events and phenomena of everyday life (Mick, Spiller & Baglioni 2012). This study employed SSO to identify technological strategies and describe how they can be implemented for promoting and preserving indigenous languages in South Africa with fewer speakers and digital resources, akin to their global application.

Discussion of strategies
Technologies can be used to preserve indigenous and endangered languages (Martín-Mor 2017). The technological strategies that can preserve South African indigenous languages are discussed in the next subsections: localisation of daily used technology, translation through crowdsourcing, digitisation and archiving, and digital learning tools, namely machine translation (MT) and online dictionaries.

Localisation of daily used technologies and/or application software
The term localisation can be explained as 'a communicative, technological, textual and cognitive process by which interactive digital texts are modified for use by audiences around the world than those originally targeted' (Jiménez-Crespo 2013:1). Sandrini (2008) categorised localisation into software application and websites. Both categories of localisation involve linguistic and cultural adaptation of a specific product for a particular group (Esselink 2000). Sandrini (2008) emphasised that the purpose of localisation is to ensure that individuals from specific areas and settings can easily and effectively use software and websites in their home language.
In the context of South Africa, popular web browsers, digital tools and application software interfaces such as Google, Mozilla, Facebook, Twitter, Slack and Skype can be localised into South African indigenous languages to promote the preservation and use of these languages. This can be done by linguists and speakers of the languages to ensure that the interfaces of these applications are in their indigenous languages. For instance, the localisation of the Facebook interface from English to Siswati has been done as shown in Figure 1a and Figure 1b.
From Figure 1a, we observe that the Facebook interface is in English, and in Figure 1b, the Facebook interface has been localised into Siswati.In Figure 1a, the English words 'Friends', 'Requests', 'Your friends' and 'People you may know' were localised in Figure 1b to Bangani, Ticelo, Bangani bakho and Bantfu lokungenteka uyabati, respectively.
A similar approach can be applied to Tshivenḓa, Xitsonga and isiNdebele. Motsa (2023) quoted Msibi who stated that: usefulness of the localisation of web applications is that it makes digitised documents accessible to many sociolinguistic groups worldwide (Jiménez-Crespo 2013). Martín-Mor (2017) also affirmed that the purpose of localisation is to ensure that certain products are available in the languages of minority groups. Hence, in the case of South Africa, localisation ensures that technologies and application software are available in indigenous languages, which enhances their use and preservation.
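At the level of the software, localising an interface often comes down to looking up each source string in a per-language message catalogue. The following is a minimal sketch of that idea, not the mechanism Facebook uses: the `CATALOGUES` structure and the `localise` function are hypothetical names introduced for illustration, and the only translations included are the four Siswati strings reported above for Figure 1b.

```python
# Hypothetical message catalogue keyed by language code.
# "ss" is the ISO 639-1 code for Siswati; the four translations are the
# interface strings reported for the localised Facebook page.
CATALOGUES = {
    "ss": {
        "Friends": "Bangani",
        "Requests": "Ticelo",
        "Your friends": "Bangani bakho",
        "People you may know": "Bantfu lokungenteka uyabati",
    },
}


def localise(text: str, locale: str) -> str:
    """Return the localised string, falling back to the source text."""
    return CATALOGUES.get(locale, {}).get(text, text)


print(localise("Friends", "ss"))   # string present in the Siswati catalogue
print(localise("Settings", "ss"))  # not translated yet, so the English is kept
```

The fallback to the source string mirrors the concern about non-translated words: an incomplete catalogue still yields a usable, if mixed-language, interface. Catalogues for Tshivenḓa, Xitsonga and isiNdebele could be added under their own language codes in the same structure.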

Translation through crowdsourcing
South African indigenous languages can be preserved using translation through crowdsourcing. Brabham (2013:xix) defined crowdsourcing as 'an online, distributed problem-solving and production model that leverages the collective intelligence of online communities to serve specific organisational goals'. Kalinin and Savchenko (2014) highlighted that crowdsourcing is a quick approach to obtaining high-quality translations in which community members and native speakers of the languages participate willingly. In South Africa, speakers of Xitsonga, Tshivenḓa, Siswati and isiNdebele can voluntarily and collaboratively translate articles on the web, specifically on Wikipedia, which permits the translation of English articles into a specific target language. Wikipedia data shows that there is a huge gap in the number of articles written in indigenous languages of South Africa with fewer speakers, as shown in Table 1.
From Table 1, we observe that the South African indigenous languages with fewer speakers have the fewest articles on Wikipedia: Tshivenḓa has 808, Xitsonga 741 and Siswati 656. Disappointingly, isiNdebele has no articles available. These statistics show that more work and effort must be geared towards the translation of articles into Tshivenḓa, Xitsonga, Siswati and isiNdebele, with the latter language in dire need. Therefore, translation through crowdsourcing will increase the number of articles and preserve these indigenous languages.
The online translation exercise will make information accessible in all disciplines and provide corpora that are useful in developing language resources and tools for under-resourced indigenous languages. The translation must also be verified for quality. Translation quality assurance is a challenge in many translated materials (House 2013). The benefit of using crowdsourcing in translation is that language speakers with a variety of expertise contribute to the translation process and so improve the quality of the translated materials. Regardless of the challenge associated with translation quality, having web publications translated from English into indigenous languages is a beneficial step towards ensuring that knowledge is open, promoted and conserved, as well as used in a range of fields.
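One simple way to combine contributions from several volunteers is to accept a translation only when a majority of contributors agree on it. The sketch below illustrates that idea under stated assumptions: the `consensus` function, its `min_agreement` threshold and the sample submissions are all hypothetical, and majority voting is only one of many possible quality checks, not a procedure prescribed by the sources cited above.

```python
from collections import Counter


def consensus(submissions: list[str], min_agreement: float = 0.5):
    """Return the most frequent submission if more than `min_agreement`
    of the contributors proposed it; otherwise return None so that the
    item can be sent for expert review."""
    if not submissions:
        return None
    # Ignore differences in case and surrounding whitespace.
    normalised = [s.strip().lower() for s in submissions]
    best, votes = Counter(normalised).most_common(1)[0]
    return best if votes / len(normalised) > min_agreement else None


# Three hypothetical volunteers translate the English string "Friends" into Siswati.
print(consensus(["Bangani", "bangani ", "Bangani bakho"]))
# Two volunteers disagree, so no translation is accepted automatically.
print(consensus(["Bangani", "Ticelo"]))
```

Items that return `None` would still need human verification, which is consistent with the point that translation quality assurance remains a challenge.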

Digitisation and archiving
Digitisation and archiving can play an important role in the preservation of South African indigenous languages. Digitisation is a method of transforming numerous kinds of information, such as written material, sound, picture and speech, into electronic forms (Khan, Khan & Aftab 2015). Technological tools such as scanners and voice recorders can be used to digitise language documents, voice and audio so that they can be stored easily in electronic formats. Optical Character Recognition software is suitable for the digitisation of documents as it converts hard-copy documents and images into digital formats (Hocking & Puttkammer 2016). Finlayson and Madiba (2002) noted that all South African indigenous languages have written materials in the form of dictionaries, literary works and terminology lists. These materials exist in hard copy, which makes them not easily accessible to every user. Written materials that contain the multifaceted use of indigenous languages, including those that are out of print, can be digitised to preserve them and make information accessible to everyone. One of the major merits of digitisation is that it increases access to data, information and resources, because many people can reach them without difficulty and there is no need to go to places where printed materials are kept (Khan et al. 2015).
Archiving is also essential in preserving minority and endangered languages (Joshua 2014). Archiving can be done through language documentation in which documents in Xitsonga, Tshivenḓa, Siswati and isiNdebele are deposited into digital archives or repositories for different purposes such as language preservation, research, and orthography and spelling verification. The advantage of using digital archives is that documents are stored in a digital form, which makes them immune to physical deterioration (Henke & Berez-Kroeker 2016). Joshua (2014) also noted that archiving in the digital format ensures that information is easy to retrieve as it is accessible to every user through the Internet. Therefore, language materials must be archived digitally to preserve them so that future language users can access them freely.
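When a digitised document is deposited in an archive, it is common to store descriptive metadata and a checksum alongside it so that the file can be found later and checked for accidental corruption. The following is a minimal sketch of such a deposit record. The field names, the `make_record` and `is_intact` functions and the sample content are assumptions made for illustration; they do not describe any particular repository used for South African languages.

```python
import hashlib


def make_record(content: bytes, title: str, language_code: str) -> dict:
    """Build a hypothetical deposit record for one digitised document."""
    return {
        "title": title,
        "language": language_code,  # e.g. "ss" for Siswati (ISO 639-1)
        "size_bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def is_intact(content: bytes, record: dict) -> bool:
    """Check that the stored bytes still match the checksum in the record."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# Stand-in for the bytes of a scanned page; a real deposit would read a file.
page = "Bangani bakho".encode("utf-8")
record = make_record(page, "Sample page", "ss")
print(record["language"], record["size_bytes"])
print(is_intact(page, record))
```

Digital storage protects against physical deterioration, but files can still be damaged or altered; a stored checksum is one inexpensive way to detect that.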

Digital learning tools
Traditional modes and techniques of learning are being replaced by digital learning, which is a new renaissance in education (Maria et al. 2019). Digital learning involves the use of technology-based resources or platforms to facilitate teaching and learning. Digital learning tools such as MT tools, online dictionaries, audio, visual multimedia content and social media platforms can all facilitate faster and more comprehensive language acquisition. Language preservation may also be accomplished with the use of these tools. For this study, MT tools and online dictionaries will be examined as digital learning tools to show how they might be utilised to preserve indigenous languages in South Africa.

Machine translation tools
El-Banna and Naeem (2016) defined MT as a branch of artificial intelligence (AI) that entails the development of specialised computer systems that can translate text between different human languages. Machine translation goes beyond merely translating words from one language to another by applying sophisticated linguistic analysis to the text and selecting the most likely words and sentence structures from corpora of previously translated texts (Korošec 2011). In simple terms, MT is computer software that automatically translates texts from one natural language into another without the involvement of a human (Baker & Saldanha 2009). Machine translation systems are categorised into two approaches: rule-based and statistical. Rule-based MT is dependent on explicit linguistic data, including grammar, morphological dictionaries, bilingual dictionaries and structural transfer rules (Forcada et al. 2011). Statistical MT tools, on the other hand, rely on sizable parallel corpora of human-produced translations that are used to automatically infer a statistical model of translation (Korošec 2011).
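To make the statistical approach concrete, the sketch below infers word translation probabilities from a parallel corpus using expectation-maximisation, in the style of the classic IBM Model 1 word alignment model. It is an illustration under stated assumptions, not a description of how Google Translate or Autshumato work: the function names are hypothetical, and the two-sentence corpus is built only from the English and Siswati interface strings reported in this article, whereas a usable system would need a far larger corpus.

```python
from collections import defaultdict

# Tiny English-Siswati parallel corpus taken from the localised interface strings.
CORPUS = [
    ("friends", "bangani"),
    ("your friends", "bangani bakho"),
]


def train(corpus, iterations=10):
    """Estimate t(target word | source word) with EM (IBM Model 1 style)."""
    pairs = [(src.split(), tgt.split()) for src, tgt in corpus]
    target_vocab = {w for _, tgt in pairs for w in tgt}
    # Start from a uniform distribution over target words.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words in its sentence.
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # M-step: renormalise the expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return dict(t)


def best_translation(t, source_word, target_vocab):
    """Pick the target word with the highest estimated probability."""
    return max(target_vocab, key=lambda f: t.get((f, source_word), 0.0))


model = train(CORPUS)
vocab = {"bangani", "bakho"}
print(best_translation(model, "friends", vocab))
print(best_translation(model, "your", vocab))
```

Even with two sentence pairs, the model works out that 'bakho' is the more likely counterpart of 'your', because 'bangani' is already explained by 'friends'. This is the sense in which statistical MT infers a translation model from parallel text rather than from hand-written rules, and it also shows why the approach depends on sizable corpora.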
In the context of South Africa, there are two MT tools that support some of the official languages: Google Translate and Autshumato Machine Translation Web Service. Of the indigenous languages with fewer speakers in South Africa, only Xitsonga has been added to these tools. The other indigenous languages with the fewest speakers, namely Tshivenḓa, Siswati and isiNdebele, have been neglected. Machine translation tools are used to translate documents, articles and educational materials into various languages.
Having the other indigenous languages in these tools will not only help for learning purposes but will also play a vital role in their preservation. Bird and Chiang (2012:126) also affirmed that 'when source texts are translated into a major world language, using machine translation, we guarantee that the language documentation will be interpretable even after the language has fallen out of use'. Furthermore, Mager et al. (2018) indicated that MT is ideally suited for indigenous languages as it might facilitate communication with other commonly spoken languages as part of a preservation effort. MT tools can be used to preserve the indigenous languages of South Africa by documenting and translating texts from these languages into languages that are widely used. This can help to increase awareness and understanding of the languages and potentially help in their preservation. Therefore, the development of MT tools for the indigenous languages of South Africa can help to ensure that they continue to be spoken and used in South Africa and around the world.

Online dictionaries
Alberts (2017) described dictionaries as: [I]nformation resources that reflect human knowledge of language and the world. They serve as authoritative reference works on spelling, pronunciations, meaning and usage as well as on the origin of words, and supply translation equivalents in other languages. (p. 32) There are two categories of dictionaries: printed dictionaries and online dictionaries. Lew (2010) stated that printed dictionaries employ a variety of techniques to provide word meanings in a paper format. Online dictionaries, as described by Hilary and Warwick (2000:839), are any sources of information that are electronically stored and provide details on the pronunciation, definition or use of words.
When compared to a hard-copy dictionary, an online dictionary is far more innovative because of its retrieval technology rather than its information content (Hilary & Warwick 2000). Dictionaries are helpful instruments for language documentation and standardisation as they cover and record the vocabulary of a language (Klein 2009).
Few printed dictionaries produced by National Lexicographic Units (NLU) and private companies exist for South African indigenous languages. According to an announcement made by the Pan South African Language Board at the beginning of 2005, bilingual dictionaries for Tshivenḓa, Xitsonga, Siswati and isiNdebele were created (Brand South Africa 2005). Like other printed materials, the printed dictionaries have only partially preserved these languages. The same is true for online dictionary services in Siswati, Xitsonga and isiNdebele. These sites are not adequate to preserve these languages because of various limitations such as a lack of specialised vocabulary. To preserve these indigenous languages effectively, language speakers, the NLU and private companies must develop comprehensive online dictionaries that can be easily accessed and updated. These dictionaries should address the current challenges that indigenous languages face, such as globalisation and cultural assimilation. One of the biggest benefits of an online dictionary is that there are almost no space limitations, allowing it to have more entries and examples than a printed dictionary (Klein 2009). Online dictionaries offer an important avenue for the preservation of languages by providing accessible and comprehensive resources for language users. Such dictionaries can also help to document and preserve the South African indigenous languages, thus making them more accessible to a wider audience.
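Because the advantage of an online dictionary lies in retrieval and ease of updating, a small sketch can show what that involves for languages written with special characters such as the Tshivenḓa letter ḓ. The same letter can be stored as one precomposed character or as 'd' followed by a combining mark, so a lookup should normalise both the headword and the query. The `OnlineDictionary` class below is hypothetical, and the two entries are illustrative only: 'Bangani' is the Siswati interface string reported earlier, and the gloss for 'Tshivenḓa' simply names the language.

```python
import unicodedata


def normalise(word: str) -> str:
    """Apply Unicode NFC normalisation and case folding so that
    visually identical spellings compare as equal."""
    return unicodedata.normalize("NFC", word).casefold()


class OnlineDictionary:
    """A minimal in-memory dictionary that can be updated at any time."""

    def __init__(self):
        self._entries = {}

    def add(self, headword: str, gloss: str) -> None:
        self._entries[normalise(headword)] = gloss

    def lookup(self, query: str):
        return self._entries.get(normalise(query))


d = OnlineDictionary()
d.add("Bangani", "friends")                    # Siswati
d.add("Tshiven\u1e13a", "the Venda language")  # precomposed ḓ (U+1E13)

# The query spells ḓ as 'd' plus the combining circumflex below (U+032D).
print(d.lookup("tshivend\u032da"))
print(d.lookup("BANGANI"))
```

Unlike a printed edition, new entries can be added with a single call to `add`, which reflects the point that online dictionaries have almost no space limitations and can be kept up to date.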

Conclusion
This study demonstrated how technology can be used to preserve the indigenous languages of South Africa with fewer speakers. The technological strategies discussed, namely localisation of daily used technology and/or application software, translation through crowdsourcing, digitisation and archiving, and digital learning tools such as MT and online dictionaries, provide viable means to preserve the languages. The localisation of daily used technology and/or application software will ensure that language users access tools in their mother tongue. Translation through crowdsourcing makes information open and accessible in indigenous languages on the Internet and across various disciplines. Digitisation and archiving preserve language information and enhance accessibility. Machine translation tools and online dictionaries can also be used to preserve languages by documenting and translating texts from these languages into languages that are widely used. This will render them more accessible to a wider audience. If these technological strategies are implemented adequately, the promotion of South African indigenous languages will be facilitated, and their preservation could be sustained.
Sundani (2023) examined the relationship between South African indigenous languages and digital technologies, highlighting the challenges and opportunities in access, promotion and preservation. The author posited that digital technologies can significantly aid in the promotion and preservation of South African indigenous languages through their multimedia capabilities, storage capacity and communication tools. However, limited access to digital technologies supporting these languages negatively impacts their promotion and preservation. Access to digital technologies for indigenous languages in South Africa is hindered by barriers such as a lack of expertise, collaboration, equitable digital services and clear advocacy. The author recommended that the South African government, the Pan South African Language Board and other stakeholders, including language experts, researchers and ICT companies, should implement digital technology strategies to remove these barriers to effective access, promotion and preservation of South African indigenous languages.
To the best knowledge of the authors, the literature consulted demonstrates that the use of technology to preserve indigenous South African languages has not been widely studied, especially for Xitsonga, Tshivenḓa, Siswati and isiNdebele, which are languages with fewer speakers in South Africa. The study fills this gap as it explores the use of technology to preserve these indigenous languages. It describes various technological strategies that could be employed in practice to preserve these languages.

TABLE 1: Number of articles in South African official languages in Wikipedia.
Source: Wikimedia, 2023, All Wikipedias ordered by number of articles, viewed 18 May 2023, from https://meta.wikimedia.org/wiki/List_of_Wikipedias
Ahmadi (2018) postulated that digital learning tools have abilities to maximise successful language learning outcomes and increase student engagement.