eprintid: 1135
rev_number: 2
eprint_status: archive
userid: 1
dir: disk0/00/00/11/35
datestamp: 2023-11-09 15:49:17
lastmod: 2023-11-09 15:49:17
status_changed: 2023-11-09 15:39:04
type: conference_item
metadata_visibility: show
creators_name: Sari, Y.
creators_name: Hassan, M.F.
creators_name: Zamin, N.
title: Rule-based pattern extractor and Named Entity Recognition: A hybrid approach
ispublished: pub
keywords: Extraction patterns; Hybrid approach; Information extraction; Link grammar; Name entity recognition; Named entities; Named entity recognition; Part-of-speech tagger; Rule based; Self-training; Semi-supervised; Stanford, Information technology, Character recognition
note: Cited by 15; Conference of 2010 International Symposium on Information Technology, ITSim'10; Conference Date: 15 June 2010 through 17 June 2010; Conference Code: 81915
abstract: Named Entity Recognition (NER) is one of the important tasks in Information Extraction (IE) research, which has been flourishing for more than fifteen years. NER enables an IE system to recognize and classify information units in unstructured text. This paper presents a rule-based pattern extractor and a semi-supervised NER approach that automatically generate extraction patterns from a limited corpus and label the predefined entities in a collection of accident documents. The Link Grammar parser and the Stanford Part-of-Speech tagger are used in the pattern extractor to identify named entities and construct extraction patterns. The extraction patterns are then fed to the semi-supervised NER component, which assigns the entities to predefined categories. Performance is evaluated using Exact Match evaluation on two entity types, DATE and LOCATION. Using only two features, the system shows promising results. © 2010 IEEE.
date: 2010
official_url: https://www.scopus.com/inward/record.uri?eid=2-s2.0-78049354555&doi=10.1109%2fITSIM.2010.5561392&partnerID=40&md5=a3eabd49aeac0cd962c57316c99ca0cd
id_number: 10.1109/ITSIM.2010.5561392
full_text_status: none
publication: Proceedings 2010 International Symposium on Information Technology - Engineering Technology, ITSim'10
volume: 2
place_of_pub: Kuala Lumpur
pagerange: 563-568
refereed: TRUE
isbn: 9781424467181
citation: Sari, Y. and Hassan, M.F. and Zamin, N. (2010) Rule-based pattern extractor and Named Entity Recognition: A hybrid approach. In: UNSPECIFIED.