eprintid: 1244
rev_number: 2
eprint_status: archive
userid: 1
dir: disk0/00/00/12/44
datestamp: 2023-11-09 15:49:24
lastmod: 2023-11-09 15:49:24
status_changed: 2023-11-09 15:39:18
type: conference_item
metadata_visibility: show
creators_name: Khan, A.
creators_name: Baharudin, B.
creators_name: Khan, K.
title: Efficient feature selection and domain relevance term weighting method for document classification
ispublished: pub
keywords: Bag of words; Data sets; Document Classification; Efficient feature selections; Feature selection; Feature selection methods; Feature vector; Feature vectors; Inverse Document Frequency; Term dependency; Term weighting; Text classification; Text classifiers; Vector space models, Classification (of information); Information retrieval systems; Ontology; Text processing; Vector spaces; Vectors, Feature extraction
note: cited By 11; Conference of 2nd International Conference on Computer Engineering and Applications, ICCEA 2010; Conference Date: 19 March 2010 Through 21 March 2010; Conference Code: 80409
abstract: Feature selection is of paramount concern in the document classification process, as it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent the "Bag of Words" (BOW) of the documents with term weighting. Representing documents through this model has some limitations: it ignores term dependencies, structure, and the ordering of terms in documents. To overcome this problem, a semantic-based feature vector is proposed, which uses an ontology to extract the concept of a term along with its co-occurring and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with the BOW feature selection method for text classification. © 2010 Crown Copyright.
date: 2010
official_url: https://www.scopus.com/inward/record.uri?eid=2-s2.0-77952774461&doi=10.1109%2fICCEA.2010.228&partnerID=40&md5=02a06ac08bdb055f2ac9cab1fffa4345
id_number: 10.1109/ICCEA.2010.228
full_text_status: none
publication: 2010 2nd International Conference on Computer Engineering and Applications, ICCEA 2010
volume: 2
pagerange: 398-403
refereed: TRUE
isbn: 9780769539829
citation: Khan, A. and Baharudin, B. and Khan, K. (2010) Efficient feature selection and domain relevance term weighting method for document classification. In: UNSPECIFIED.