relation: https://khub.utp.edu.my/scholars/3124/ title: A lazy man’s way to part-of-speech tagging creator: Zamin, N. creator: Oxley, A. creator: Bakar, Z.A. creator: Farhan, S.A. description: A statistical-based approach to word alignment involving automatically projecting part-of-speech (POS) tags is presented. The approach is referred to as the “lazy man’s way” because it improves POS assignment for a resource-poor language by exploiting its similarity to a resource-rich one. This unsupervised learning method combines the N-gram and Dice Coefficient similarity functions in order to align English texts with Malay texts, thus projecting the POS tags from English to Malay. It is a quick method that does not require the laborious effort needed to annotate the Malay dataset. A case study, an experiment done on 25 terrorism news articles written in Malay, has shown that leveraging pre-existing resources from a resource-rich language, i.e. English, to supplement a resource-poor language, i.e. Malay, is feasible and avoids building new text-processing tools from scratch. The system was tested on the Malay corpus, consisting of 5413 word tokens. The results reached values of 86.87 for precision, 72.56 for recall and 79.07 for F1-Score. This shows that the “lazy man’s way”, where a resource-poor language just exploits the rich linguistic information available in English, increases bitext projection accuracy significantly. © Springer-Verlag Berlin Heidelberg 2012. publisher: Springer Verlag date: 2012 type: Article type: PeerReviewed identifier: Zamin, N. and Oxley, A. and Bakar, Z.A. and Farhan, S.A. (2012) A lazy man’s way to part-of-speech tagging. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7457 L. pp. 106-117. 
ISSN 03029743 relation: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84893005114&doi=10.1007%2f978-3-642-32541-0_9&partnerID=40&md5=6398989ce94cbfadf1367af131ac7120 relation: 10.1007/978-3-642-32541-0_9 identifier: 10.1007/978-3-642-32541-0_9
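
note: The description names N-gram and Dice Coefficient similarity functions but gives no formulas. The sketch below is one common reading, not the authors' implementation: it assumes character bigrams and the standard Dice formula 2|X ∩ Y| / (|X| + |Y|) over n-gram multisets. The function names (`char_ngrams`, `dice`, `f1`) and the word pairs are hypothetical illustrations that do not come from the paper. The `f1` helper only checks that the reported F1-Score is the harmonic mean of the reported precision and recall.

```python
from collections import Counter


def char_ngrams(word: str, n: int = 2) -> Counter:
    """Return the multiset of character n-grams of a lowercased word."""
    w = word.lower()
    return Counter(w[i:i + n] for i in range(len(w) - n + 1))


def dice(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient 2|X ∩ Y| / (|X| + |Y|) over character n-gram multisets."""
    x, y = char_ngrams(a, n), char_ngrams(b, n)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both words are shorter than n characters: no n-grams to compare.
        return 0.0
    overlap = sum((x & y).values())  # multiset intersection
    return 2 * overlap / total


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative English/Malay cognate pairs (assumed, not from the paper).
    for en, ms in [("bomb", "bom"), ("police", "polis")]:
        print(en, ms, round(dice(en, ms), 3))
    # Consistency check on the figures reported in the description.
    print(round(f1(86.87, 72.56), 2))
```

A similarity score of this kind could be thresholded to propose candidate English and Malay word alignments, after which the POS tag of the English word would be projected onto its aligned Malay word. The threshold and the alignment procedure are not specified in this record.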