eprintid: 6210
rev_number: 2
eprint_status: archive
userid: 1
dir: disk0/00/00/62/10
datestamp: 2023-11-09 16:17:58
lastmod: 2023-11-09 16:17:58
status_changed: 2023-11-09 16:05:16
type: article
metadata_visibility: show
creators_name: Zamin, N.
creators_name: Bakar, Z.A.
title: Name entity recognition for malay texts using cross-lingual annotation projection approach
ispublished: pub
keywords: Natural language processing systems, Alignment methods; Comparative studies; Dice coefficient; Name entity recognition; Natural language processing; Projection method; State of the art; String similarity, Computational linguistics
note: Cited by 3; Conference of 15th International Conference on Computational Science and Its Applications, ICCSA 2015; Conference Date: 22 June 2015 Through 25 June 2015; Conference Code: 157939
abstract: Cross-lingual annotation projection methods can benefit from rich-resourced languages to improve the performance of Natural Language Processing (NLP) tasks in less-resourced languages. In this research, Malay serves as the less-resourced language and English as the rich-resourced language. The research aims to ease the deadlock in Malay computational linguistics research, caused by the shortage of Malay tools and annotated corpora, by exploiting state-of-the-art English tools. This paper proposes an alignment method known as MEWA (Malay-English Word Aligner) that integrates a Dice coefficient and a bigram string similarity measure with little supervision to automatically recognize three common named entities: person (PER), organization (ORG) and location (LOC). Firstly, a test collection of Malay journalistic articles on Indonesian terrorism is established in three volumes: 646, 5413 and 10002 words. Secondly, a comparative study of selected state-of-the-art tools is conducted to evaluate their performance against the test collection.
Thirdly, MEWA is used to automatically induce annotations from the test collection and the identified English tool. An accuracy rate of 93% is achieved in a series of NE annotation projection experiments. © Springer International Publishing Switzerland 2015.
date: 2015
publisher: Springer Verlag
official_url: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84948970178&doi=10.1007%2f978-3-319-21404-7_18&partnerID=40&md5=f93e12612885f286f1f3b8b50c3400ca
id_number: 10.1007/978-3-319-21404-7_18
full_text_status: none
publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
volume: 9155
pagerange: 242-256
refereed: TRUE
isbn: 9783319214030
issn: 03029743
citation: Zamin, N. and Bakar, Z.A. (2015) Name entity recognition for malay texts using cross-lingual annotation projection approach. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9155. pp. 242-256. ISSN 03029743