Advanced Machine Learning and Gene Expression Programming Techniques for Predicting CO2-Induced Alterations in Coal Strength
Zijian Liu1, Yong Shi2, Chuanqi Li1, Xiliang Zhang3,*, Jian Zhou1, Manoj Khandelwal4,*
1 School of Resources and Safety Engineering, Central South University, Changsha, 410083, China
2 Changsha Institute of Mining Research Co., Ltd., Changsha, 410012, China
3 State Key Laboratory of Safety and Health for Metal Mines, Ma’anshan, 243000, China
4 Institute of Innovation, Science and Sustainability, Federation University Australia, Ballarat, VIC 3350, Australia
* Corresponding Authors: Xiliang Zhang; Manoj Khandelwal
Computer Modeling in Engineering & Sciences https://doi.org/10.32604/cmes.2025.062426
Received 18 December 2024; Accepted 18 February 2025; Published online 12 March 2025
Abstract
Given growing concern over global warming and the central role of carbon dioxide (CO2) in it, CO2-induced alterations in coal strength have attracted significant attention because of their implications for carbon sequestration. Numerous experiments have shown that CO2 interaction time (T), saturation pressure (P), and other parameters significantly affect coal strength. However, accurately evaluating CO2-induced alterations in coal strength remains difficult, which makes accurate and efficient prediction models particularly important. This study explored the application of advanced machine learning (ML) algorithms and Gene Expression Programming (GEP) techniques to predict CO2-induced alterations in coal strength. Six models were developed: three metaheuristic-optimized XGBoost models (GWO-XGBoost, SSA-XGBoost, PO-XGBoost) and three GEP models (GEP-1, GEP-2, GEP-3). Evaluation with multiple metrics showed that all models achieved high predictive accuracy, with the SSA-XGBoost model performing best: coefficient of determination (R²) = 0.99396, root mean square error (RMSE) = 0.62102, mean absolute error (MAE) = 0.36164, mean absolute percentage error (MAPE) = 4.8101%, and residual predictive deviation (RPD) = 13.4741. Model interpretability analyses using SHAP (Shapley Additive exPlanations), ICE (Individual Conditional Expectation), and PDP (Partial Dependence Plot) techniques highlighted the dominant role of fixed carbon content (FC) and significant interactions between FC and CO2 saturation pressure (P). The results demonstrated that the proposed models effectively address the challenges of predicting CO2-induced strength alterations, providing valuable insights for geological storage safety and environmental applications.
Keywords
CO2-induced coal strength; meta-heuristic optimization algorithms; XGBoost; gene expression programming; model interpretability
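As an illustration of the five evaluation metrics named in the abstract, the following is a minimal sketch using their common textbook definitions. It is not the authors' code. The function name `regression_metrics` and the sample values are hypothetical, and the RPD formula (sample standard deviation of the observations divided by RMSE) is an assumption, since the abstract does not define it.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, MAPE (%) and RPD for paired observations.

    Hypothetical helper: the definitions below are the usual textbook
    ones and are assumed, not taken from the paper.
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    # MAPE is undefined when any observed value is zero.
    mape = 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n
    # Assumption: RPD = sample standard deviation of observations / RMSE.
    sd_true = math.sqrt(ss_tot / (n - 1))
    rpd = sd_true / rmse if rmse > 0 else math.inf

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "MAE": mae,
        "MAPE": mape,
        "RPD": rpd,
    }


# Made-up values for illustration only (not data from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

Under these definitions, higher R² and RPD and lower RMSE, MAE, and MAPE indicate a better fit, which is the sense in which the abstract ranks the six models.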