TY - CONF
AV - none
N2 - The number of online text documents on the Internet is growing rapidly. Documents written in languages from all parts of the world can be found with a single click. This growth increases the availability of information on the Internet, yet no one can understand all the languages in which digital documents are written. Hence, there is a significant need for a language identifier to help users understand the information. To date, language identification research has focused mostly on European languages and remains limited for Asian languages. Meanwhile, language identification for similar languages among popular languages has attracted the attention of many researchers. In this research, a new language identification method for languages with similar topology, Malay and Indonesian, is proposed. The algorithm is tested on a set of Indonesian and Malay text documents to support the limited research on language identification for Asian languages. An experiment on 100 Indonesian and Malay text documents produced satisfactorily accurate results. © 2015 IEEE.
N1 - Cited by 2; Conference of 2015 International Symposium on Mathematical Sciences and Computing Research, iSMSC 2015; Conference Date: 19 May 2015 Through 20 May 2015; Conference Code: 124374
TI - A language identifier for Indonesian and Malay text document
SP - 127
ID - scholars6736
KW - Algorithms; Computer programming; Computer science
KW - Asian languages; Digital Documents; European languages; Indonesian languages; Language identification; N-grams; Natural language processing; nocv1; Text document
KW - Natural language processing systems
Y1 - 2016///
PB - Institute of Electrical and Electronics Engineers Inc.
SN - 9781479978946
A1 - Indra, Z.
A1 - Jaafar, J.
A1 - Zamin, N.
A1 - Bakar, Z.A.
UR - https://www.scopus.com/inward/record.uri?eid=2-s2.0-84995701431&doi=10.1109%2fISMSC.2015.7594040&partnerID=40&md5=d9715785f362d63c5eefd4f58185acc8
EP - 131
ER -