TY - CHAP
T1 - Predicting building-related carbon emissions
T2 - A test of machine learning models
AU - Boateng, Emmanuel B.
AU - Twumasi, Emmanuella A.
AU - Darko, Amos
AU - Tetteh, Mershack O.
AU - Chan, Albert P.C.
PY - 2021
Y1 - 2021
N2 - This chapter evaluates and compares the performance of six machine-learning (ML) algorithms in predicting China’s building-related carbon emissions. The models took into account five input parameters influencing building-related CO2 emissions: urbanisation, R&D, population size, GDP, and energy use. The study used quarterly data spanning 1971Q1–2014Q4 to develop, calibrate, and validate the models. Each model was developed using 140 observations and validated on 36 observations. To tune each ML model for comparative purposes, a 10-fold cross-validation approach was used to select the optimal hyperparameters and their associated arguments. The results indicate that the random forest (RF) model attained the highest coefficient of determination (R²) of 99.88%, followed by the k-nearest neighbour (KNN) (99.87%), extreme gradient boosting (XGBoost) (99.77%), decision tree (DT) (99.63%), adaptive boosting (AdaBoost) (99.56%), and the support vector regression (SVR) model (97.67%). Overall, the RF algorithm is the best-performing ML algorithm in accurately predicting building-related CO2 emissions, whereas the best algorithm in terms of time efficiency is the DT algorithm. The KNN model is highly recommended when practitioners want accurate predictions in a timely manner. RF, KNN, and DT models could be added to the toolkits of environmental policymakers to provide high-quality forecasts and patterns of building-related CO2 emissions in an accurate and real-time manner.
AB - This chapter evaluates and compares the performance of six machine-learning (ML) algorithms in predicting China’s building-related carbon emissions. The models took into account five input parameters influencing building-related CO2 emissions: urbanisation, R&D, population size, GDP, and energy use. The study used quarterly data spanning 1971Q1–2014Q4 to develop, calibrate, and validate the models. Each model was developed using 140 observations and validated on 36 observations. To tune each ML model for comparative purposes, a 10-fold cross-validation approach was used to select the optimal hyperparameters and their associated arguments. The results indicate that the random forest (RF) model attained the highest coefficient of determination (R²) of 99.88%, followed by the k-nearest neighbour (KNN) (99.87%), extreme gradient boosting (XGBoost) (99.77%), decision tree (DT) (99.63%), adaptive boosting (AdaBoost) (99.56%), and the support vector regression (SVR) model (97.67%). Overall, the RF algorithm is the best-performing ML algorithm in accurately predicting building-related CO2 emissions, whereas the best algorithm in terms of time efficiency is the DT algorithm. The KNN model is highly recommended when practitioners want accurate predictions in a timely manner. RF, KNN, and DT models could be added to the toolkits of environmental policymakers to provide high-quality forecasts and patterns of building-related CO2 emissions in an accurate and real-time manner.
KW - Adaptive boosting
KW - Building emissions
KW - Decision tree
KW - Extreme gradient boosting
KW - K-nearest neighbour
KW - Machine learning
KW - Predicting
KW - Random forest
KW - Support vector regression
UR - http://www.scopus.com/inward/record.url?scp=85091703326&partnerID=8YFLogxK
UR - https://link.springer.com/chapter/10.1007/978-3-030-52067-0_11
U2 - 10.1007/978-3-030-52067-0_11
DO - 10.1007/978-3-030-52067-0_11
M3 - Chapter
AN - SCOPUS:85091703326
SN - 978-3-030-52066-3
T3 - Studies in Computational Intelligence
SP - 247
EP - 266
BT - Studies in Computational Intelligence
A2 - Hassanien, A.E.
A2 - Taha, M.H.N.
A2 - Khalifa, N.E.M.
PB - Springer
ER -