eprintid: 189
rev_number: 2
eprint_status: archive
userid: 1
dir: disk0/00/00/01/89
datestamp: 2023-11-09 15:15:49
lastmod: 2023-11-09 15:15:49
status_changed: 2023-11-09 15:13:30
type: conference_item
metadata_visibility: show
creators_name: Setiawan, N.A.
creators_name: Venkatachalam, P.A.
creators_name: Hani, A.F.M.
title: Missing data estimation on heart disease using artificial neural network and rough set theory
ispublished: pub
keywords: Backpropagation; Cardiology; Classifiers; Feature extraction; Functions; Fuzzy sets; Knowledge based systems; Learning systems; Neural networks; Set theory, Artificial neural networks; Attribute values; Common problems; Comparative studies; Data sets; Decomposition trees; Estimation methods; Heart diseases; Input features; Missing data estimations; Missing datums; Missing value; Nearest neighbors; Reduced inputs; Reducts; University of California, Rough set theory
note: Cited by 13; Conference of 2007 International Conference on Intelligent and Advanced Systems, ICIAS 2007; Conference Date: 25 November 2007 Through 28 November 2007; Conference Code: 74506
abstract: The objective of this research is to implement a method for estimating the real missing data in heart disease datasets and to show how it affects the resulting knowledge. Missing data is a common problem in Knowledge Discovery from Database (KDD) processes that can lead to significant error in the extracted knowledge. We use a hybridization of Artificial Neural Network and Rough Set Theory (ANNRST) to estimate the real missing data on heart disease from the UCI (University of California, Irvine) datasets. An ANN with reduced input features is used to estimate the missing data. RST is used to reduce the dimensionality of the input features and to extract the knowledge as reducts and rules from the heart disease datasets with estimated missing data.
RST, a decomposition tree, a Local Transfer Function Classifier (LTF-C), and a k-Nearest Neighbor (k-NN) classifier are used to calculate the accuracy. A comparative study with k-NN estimation, most common attribute value filling, and deletion of missing data is made to evaluate the extracted knowledge. ANNRST can be considered the appropriate estimation method when a strong relationship between the original complete datasets and the estimated datasets is important (that is, the estimated datasets really represent the nature of the original complete datasets), as it gives the best accuracy and coverage for almost all the classifiers. ©2007 IEEE.
date: 2007
official_url: https://www.scopus.com/inward/record.uri?eid=2-s2.0-57949088688&doi=10.1109%2fICIAS.2007.4658361&partnerID=40&md5=561e6e9acb8318805267c7ea2b9cc5ff
id_number: 10.1109/ICIAS.2007.4658361
full_text_status: none
publication: 2007 International Conference on Intelligent and Advanced Systems, ICIAS 2007
place_of_pub: Kuala Lumpur
pagerange: 129-133
refereed: TRUE
isbn: 1424413559; 9781424413553
citation: Setiawan, N.A. and Venkatachalam, P.A. and Hani, A.F.M. (2007) Missing data estimation on heart disease using artificial neural network and rough set theory. In: UNSPECIFIED.