Issues in language documentation of under-resourced languages with reference to Sarawak

Authors: Norazuna Norahim, Suhaila Saee, Sarah Flora Samson Juan, Bibi Aminah Abdul Ghani (University Malaysia Sarawak)
Speakers: Norazuna Norahim, Suhaila Saee, Sarah Flora Samson Juan, Bibi Aminah Abdul Ghani
Topic: Language documentation
The (SCOPUS / ISI) SOAS GLOCAL CALA 2019 General Session


This paper reports on an ongoing language documentation project that aims to develop a repository for systematically archiving data on the languages of Sarawak. The project complements current research under SALT (Sarawak Language Technology), initiated by the Faculty of Computer Sciences and Information Technology, University Malaysia Sarawak. Initially, SILA will produce phonetic and morphological inventories of under-resourced languages of the Lower Baram sub-group of North Sarawak languages, namely Bakong, Miriek, Narom, Berawan, Dali’, Kiput, Tutong, Belait or Lemeting (Blust, 1972, p. 13). The language-dialect status of the members of this group has yet to be ascertained; hence, the research also examines the linguistic affiliation between them. The paper also discusses the challenges faced in documenting under-described and under-resourced languages in Sarawak, with reference to the Miriek language. The Miriek project is an attempt to develop a platform for a machine-readable dictionary. Dictionary resource development is a critical first step towards the documentation and revival of the world's disappearing cultures and languages. Many Orang Asli languages and indigenous languages of Sabah and Sarawak, Malaysia, are still under-resourced. Unlike dictionaries of written languages, which can draw on very large corpora as the basis for linguistic analysis, dictionary resource development for indigenous languages has to rely heavily on native speakers as the source of the corpora. This is the greatest challenge of the research project, yet the work has great potential impact on the future development of the indigenous languages. The discussion also covers establishing an initial orthography for these languages.

Keywords: language documentation, Sarawak languages, language affiliation