eprintid: 15663
rev_number: 2
eprint_status: archive
userid: 1
dir: disk0/00/01/56/63
datestamp: 2023-11-10 03:30:17
lastmod: 2023-11-10 03:30:17
status_changed: 2023-11-10 02:00:03
type: article
metadata_visibility: show
creators_name: Ibrahim, M.B.
creators_name: Mustaffa, Z.
creators_name: Balogun, A.-L.
creators_name: Hamonangan Harahap, I.S.
creators_name: Ali Khan, M.
title: Advanced data mining techniques for landslide susceptibility mapping
ispublished: pub
keywords: Decision making; Landslides; Learning algorithms; Lithology; Mapping; Precipitation (meteorology); Predictive analytics; Sediment transport; Support vector machines, Comparative assessment; Geo-spatial database; Landslide susceptibility; Landslide susceptibility mapping; Mitigation strategy; Planning and development; Predictive capabilities; Traditional learning, Data mining
note: cited By 7
abstract: This paper describes the development and validation of landslide susceptibility models for mountainous regions using advanced data mining techniques. The investigation was carried out to ascertain the effectiveness of Naïve Bayes Multinomial (NBM) and Random Trees (RT) in landslide susceptibility mapping. The NBM is an advancement of the frequently used Naïve Bayes classifiers, while the RT was built to overcome the limitations of the traditional forest classifiers. The geospatial database for this investigation comprises 148 landslide locations influenced by ten (10) landslide conditioning factors. The factors (Slope Angle, Slope Elevation, Slope Aspect, Plan Curvature, Profile Curvature, Lithology, Soil Type, Stream Power Index (SPI), Sediment Transport Index (STI), and Rainfall Precipitation) were drawn using a Multi Collinearity Decision Making (MCDM) technique. A Frequency Ratio (FR) analysis was used to obtain the relative significance of the factors in the slides. Predictive models were also developed by quantifying these models using data mining techniques. A section of the entire geospatial data (70%) was used as training datasets, while the remaining part of the data (30%) was used to validate the trained datasets. SVM, RT, and NBM algorithms were used to produce predicted datasets from the training datasets. These predicted datasets were used to develop the landslide susceptibility models. A comparative assessment of the two classifiers against the well-known traditional learning algorithm, the support vector machine (SVM), was conducted. Model performance evaluators such as the AUROC, RMSE, F-measure, MAE, and ACC were employed to check the predictive capabilities and accuracies of the models. The indices indicated that the SVM model performed better than the other two algorithms on both the training and validation datasets. Further analysis and comparison of the models reveal that the new data mining techniques are reliable for landslide susceptibility mapping. At the same time, the traditional algorithm is also useful and remains relevant, especially for sites with similar conditions. This study has provided insights into better planning and development, the provision of mitigation strategies, and further analysis of landslides in the study area, particularly in cases of limited data availability. © 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
date: 2021
publisher: Taylor and Francis Ltd.
official_url: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85114772600&doi=10.1080%2f19475705.2021.1960433&partnerID=40&md5=a644c5d2ac87bc0d20106189bf1b37ac
id_number: 10.1080/19475705.2021.1960433
full_text_status: none
publication: Geomatics, Natural Hazards and Risk
volume: 12
number: 1
pagerange: 2430-2461
refereed: TRUE
issn: 19475705
citation: Ibrahim, M.B. and Mustaffa, Z. and Balogun, A.-L. and Hamonangan Harahap, I.S. and Ali Khan, M. (2021) Advanced data mining techniques for landslide susceptibility mapping. Geomatics, Natural Hazards and Risk, 12 (1). pp. 2430-2461. ISSN 19475705