eprintid: 17887
rev_number: 2
eprint_status: archive
userid: 1
dir: disk0/00/01/78/87
datestamp: 2023-12-19 03:24:11
lastmod: 2023-12-19 03:24:11
status_changed: 2023-12-19 03:08:51
type: article
metadata_visibility: show
creators_name: Otchere, D.A.
creators_name: Ganat, T.O.A.
creators_name: Ojero, J.O.
creators_name: Tackie-Otoo, B.N.
creators_name: Taki, M.Y.
title: Application of gradient boosting regression model for the evaluation of feature selection techniques in improving reservoir characterisation predictions
ispublished: pub
keywords: Computational efficiency; Efficiency; Feature extraction; Forecasting; Machine learning; Petroleum reservoir evaluation; Porosity; Regression analysis; Reservoirs (water), Decision-tree algorithm; Dimensionality reduction techniques; Ensemble machine learning; Feature selection technique; Features selection; Gradient boosting; Input features; Prediction errors; Reservoir characterization; Selection techniques, Decision trees, algorithm; machine learning; numerical model; pattern recognition; prediction; regression analysis; reservoir characterization
note: cited By 43
abstract: Feature selection, a critical data preprocessing step in machine learning, is an effective way of removing irrelevant variables, thus reducing the dimensionality of the input features. Removing uninformative or, even worse, misinformative input columns helps train a machine learning model on more generalised data, giving better performance on new and unseen data. In this paper, eight feature selection techniques paired with the gradient boosting regressor model were evaluated based on a statistical comparison of their prediction errors and computational efficiency in characterising a shallow marine reservoir. Analysis of the results shows that the best techniques for selecting relevant logs for permeability, porosity and water saturation prediction were the Random Forest, SelectKBest and Lasso regularisation methods, respectively.
These techniques not only reduced the number of features in the high-dimensional dataset but also achieved low prediction errors, based on MAE and RMSE, and improved computational efficiency. This indicates that Random Forest, SelectKBest and Lasso regularisation can identify the best input features for permeability, porosity and water saturation predictions, respectively. © 2021 Elsevier B.V.
date: 2022
publisher: Elsevier B.V.
official_url: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85113218886&doi=10.1016%2fj.petrol.2021.109244&partnerID=40&md5=82a135f80f4b611a342755fe48872095
id_number: 10.1016/j.petrol.2021.109244
full_text_status: none
publication: Journal of Petroleum Science and Engineering
volume: 208
refereed: TRUE
issn: 09204105
citation: Otchere, D.A. and Ganat, T.O.A. and Ojero, J.O. and Tackie-Otoo, B.N. and Taki, M.Y. (2022) Application of gradient boosting regression model for the evaluation of feature selection techniques in improving reservoir characterisation predictions. Journal of Petroleum Science and Engineering, 208. ISSN 09204105