relation: https://khub.utp.edu.my/scholars/2476/
title: Characteristics of a Malay journalistic corpus
creator: Zamin, N.
creator: Oxley, A.
creator: Bakar, Z.A.
creator: Farhan, S.A.
description: This paper presents in detail a linguistic study of a Malay journalistic corpus describing Indonesian terrorism. The initial raw text was manually annotated for its parts of speech. It is the first corpus of its nature ever established in Malaysia. The objective of this research is to conduct an empirical analysis of the actual patterns of use in journalistic texts. This paper presents the characteristics of the Malay terrorism corpus, which include its properties, word classes, named entities and word occurrences. The results of this work are given purely in terms of the characteristics of a Malay terrorism corpus. The results are highly useful for solving larger tasks in the Natural Language Processing area, such as Information Retrieval and Information Extraction, in the area of terrorism. © 2012 IEEE.
date: 2012
type: Conference or Workshop Item
type: PeerReviewed
identifier: Zamin, N. and Oxley, A. and Bakar, Z.A. and Farhan, S.A. (2012) Characteristics of a Malay journalistic corpus. In: UNSPECIFIED.
relation: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84875103783&doi=10.1109%2fCCSII.2012.6470503&partnerID=40&md5=6776ad8b5bae336d40307290eb6d1aff
relation: 10.1109/CCSII.2012.6470503
identifier: 10.1109/CCSII.2012.6470503