Application of machine learning models to predict cytotoxicity of ionic liquids using VolSurf principal properties

Tabaaza, G.A. and Tackie-Otoo, B.N. and Zaini, D.B. and Otchere, D.A. and Lal, B. (2023) Application of machine learning models to predict cytotoxicity of ionic liquids using VolSurf principal properties. Computational Toxicology, 26.

Full text not available from this repository.
Official URL: https://www.scopus.com/inward/record.uri?eid=2-s2....

Abstract

Ionic Liquids (ILs) are considered greener alternatives to traditional organic solvents due to their unique physical and chemical properties. Nevertheless, recent studies have shown that ILs can induce toxic effects in ecosystems. It is therefore essential to determine the level of risk they pose to aquatic life before these ILs can be used successfully. Measuring the toxicity of various ILs experimentally across a broad spectrum of conditions is highly demanding of time and resources, and is at times impractical. Numerous Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) studies have addressed the prediction of IL toxicity, expressed as EC50. In this study, five supervised machine learning models were trained and tested using nine Principal Properties (PPs) as descriptors to predict cytotoxicity toward the rat leukemia cell line IPC-81. Eight feature selection techniques were then applied to preprocess the data and improve the performance of the best of the preliminarily trained models. Evaluation on the out-of-sample data set showed that the Extreme Gradient Boosting (XGBoost) model predicted best, with the highest test score (R² = 0.79). This model was the most parsimonious (minimum AIC of 46.50), consistent (minimum RMSE of 0.45), and precise (minimum MAE of 0.32) in predicting IPC-81 cytotoxicity. The feature importance attribute of XGBoost confirmed that structural features of the IL cation, such as cationic hydrophilicity and side chain length, have a significant impact on toxicity. Nevertheless, the anionic part of an IL also contributes to its toxicity and needs to be considered in toxicity prediction. Among the tested feature selection techniques, the random forest technique improved model performance the most (i.e., the lowest error metrics: AIC = 41.22, MAE = 0.31, and RMSE = 0.4259), but at a longer execution time. However, the wrapper methods were the most robust in improving computational efficiency (i.e., they improved model performance at the shortest execution time). This study therefore advances QSPR studies on toxicity prediction for new ILs through the application of machine learning and feature selection techniques. © 2023
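
The abstract ranks models by R², RMSE, MAE, and AIC but does not state how these were computed. The following is a minimal sketch of one common way to obtain all four from observed and predicted values. The function name `regression_metrics`, the parameter `n_params`, and the sample numbers are hypothetical, and the AIC expression used here (the least-squares form, n·ln(RSS/n) + 2k) is an assumption rather than the formula confirmed by the authors.

```python
import math


def regression_metrics(y_true, y_pred, n_params):
    """Return R2, RMSE, MAE, and a least-squares AIC for paired values.

    n_params is the number of fitted parameters (k) counted by the AIC penalty.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    rss = sum(r * r for r in residuals)  # residual sum of squares
    mean_true = sum(y_true) / n
    tss = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if rss == 0 or tss == 0:
        raise ValueError("metrics are undefined for a perfect fit or constant targets")

    return {
        "r2": 1.0 - rss / tss,
        "rmse": math.sqrt(rss / n),
        "mae": sum(abs(r) for r in residuals) / n,
        # Assumed AIC form: n * ln(RSS / n) + 2k (Gaussian errors, constants dropped).
        "aic": n * math.log(rss / n) + 2 * n_params,
    }


# Hypothetical log10(EC50) values, for illustration only (not data from the study).
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
metrics = regression_metrics(observed, predicted, n_params=2)
print(metrics)
```

Under this form, a lower AIC indicates a better trade-off between fit and the number of parameters, which matches the abstract's use of the minimum AIC to identify the most parsimonious model. AIC values are only comparable between models evaluated on the same data set.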

Item Type: Article
Additional Information: Cited by 3
Uncontrolled Keywords: Adaptive boosting; Cell culture; Computational chemistry; Computational efficiency; Diseases; Feature Selection; Forecasting; Forestry; Ionic liquids; Learning systems; Statistical tests; Support vector machines, Cell lines; Features selection; Leukemia rat cell line (IPC-81); Machine learning models; Machine-learning; Property; QSPR/QSAR and principal property; Rat cells; Selection techniques; Toxicity predictions, Cytotoxicity, ionic liquid, animal cell; animal experiment; Article; controlled study; cytotoxicity; decision tree; EC50; hydrophilicity; LT12 cell line; machine learning; nonhuman; physical chemistry; prediction; quantitative structure activity relation; quantitative structure property relation; random forest; rat; support vector machine; toxicity; validation process
Depositing User: Mr Ahmad Suhairi UTP
Date Deposited: 04 Jun 2024 14:10
Last Modified: 04 Jun 2024 14:10
URI: https://khub.utp.edu.my/scholars/id/eprint/18597
