Evaluation of tree-based ensemble learning algorithms to estimate total organic carbon from wireline logs Academic Article

abstract

  • The Total Organic Carbon (TOC) of source/reservoir rocks is of vital importance for evaluating hydrocarbon generation potential. Estimating TOC from well logs is challenging, and measuring it in the laboratory from rock specimens is costly and time-consuming. The Passey method predicts TOC with low accuracy, whereas AI techniques such as Artificial Neural Network (ANN) and Support Vector Machine (SVM) get trapped in local optima, resulting in overfitting, and are also considered ambiguous if the technique is not reasonable. In this paper, we propose four efficient tree-based ensemble techniques, Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), and eXtreme Gradient Boosting (XGB), which are capable of fitting highly non-linear data with minimal data pre-processing for TOC prediction. To evaluate the efficiency of these models, 205 data points and seven well logs from the Goldwyer Formation of the Canning Basin, Australia, were used for training and testing. The results show that these tree-based ensemble techniques estimate TOC with high accuracy, and that the XGB model (testing R² 94.39%, MAE 0.0447, MSE 0.0039) outperformed the other techniques, i.e., RF (testing R² 90.59%, MAE 0.0549, MSE 0.0055), ET (testing R² 90.63%, MAE 0.0583, MSE 0.0058) and GB (testing R² 91.23%, MAE 0.0569, MSE 0.0053). These robust tree-based ensemble techniques not only protected against overfitting but also achieved better prediction results on multidimensional data. © ICIC International 2021.
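The comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and is not the authors' code. The data are synthetic stand-ins that only match the abstract's dimensions (205 samples, seven features). The target function, the 80/20 split, the hyperparameters, and the names `models` and `evaluate` are all assumptions made for illustration. XGB is left out to avoid an extra dependency; a model from the separate `xgboost` package could be added to the dictionary in the same way.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 205 samples, 7 log-like features.
rng = np.random.default_rng(0)
X = rng.normal(size=(205, 7))
# Arbitrary non-linear target with noise; NOT real TOC measurements.
y = (
    0.5 * X[:, 0]
    - 0.3 * X[:, 1] ** 2
    + 0.2 * X[:, 2] * X[:, 3]
    + rng.normal(scale=0.05, size=205)
)

# Hold out 20% of the samples for testing (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Three of the four ensemble families named in the abstract.
models = {
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "ET": ExtraTreesRegressor(n_estimators=200, random_state=0),
    "GB": GradientBoostingRegressor(random_state=0),
}


def evaluate(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its test-set R2, MAE and MSE."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "r2": r2_score(y_test, pred),
            "mae": mean_absolute_error(y_test, pred),
            "mse": mean_squared_error(y_test, pred),
        }
    return results


results = evaluate(models, X_train, y_train, X_test, y_test)
for name, m in results.items():
    print(f"{name}: R2={m['r2']:.4f}  MAE={m['mae']:.4f}  MSE={m['mse']:.4f}")
```

The scores printed by this sketch describe only the synthetic data and are not comparable to the figures reported in the abstract.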

publication date

  • 2021

start page

  • 807

end page

  • 829

volume

  • 17

issue

  • 3