Predicting building-related carbon emissions: A test of machine learning models

Emmanuel B. Boateng, Emmanuella A. Twumasi, Amos Darko*, Mershack O. Tetteh, Albert P.C. Chan

*Corresponding author for this work

Research output: Chapter in Book/Published conference output › Chapter


This chapter evaluates and compares the performance of six machine-learning (ML) algorithms in predicting China’s building-related carbon emissions. The models used five input parameters that influence building-related CO₂ emissions: urbanisation, R&D, population size, GDP, and energy use. The study used quarterly data covering 1971Q1–2014Q4 to develop, calibrate, and validate the models. Each model was developed using 140 observations and validated on 36 observations. To tune each ML model on a comparable basis, a 10-fold cross-validation approach was used to select the optimal hyperparameters and their associated arguments. The results indicate that the random forest (RF) model attained the highest coefficient of determination (R²) of 99.88%, followed by the k-nearest neighbour (KNN) model (99.87%), extreme gradient boosting (XGBoost) (99.77%), decision tree (DT) (99.63%), adaptive boosting (AdaBoost) (99.56%), and the support vector regression (SVR) model (97.67%). Overall, RF is the best-performing ML algorithm for accurately predicting building-related CO₂ emissions, whereas DT is the best algorithm in terms of time efficiency. The KNN model is highly recommended when practitioners need accurate predictions in a timely manner. RF, KNN, and DT models could be added to the toolkits of environmental policymakers to provide accurate, real-time, high-quality forecasts and patterns of building-related CO₂ emissions.
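The sample sizes and the evaluation metric in the abstract can be checked with a short sketch. This is only an illustration under stated assumptions: the abstract does not say how the 140/36 split or the cross-validation folds were formed, so the contiguous folds below are an assumption, and the helper names (`quarter_labels`, `k_fold_indices`, `r_squared`) are hypothetical rather than taken from the chapter.

```python
def quarter_labels(start_year, end_year):
    """Label every quarter from start_year Q1 to end_year Q4, inclusive."""
    return [f"{y}Q{q}" for y in range(start_year, end_year + 1) for q in range(1, 5)]


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds whose sizes differ by at most one.

    Assumption: contiguous folds are used here for simplicity; the chapter
    does not specify how its folds were constructed.
    """
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Quarterly observations over the study period 1971Q1-2014Q4.
quarters = quarter_labels(1971, 2014)

# Sample sizes reported in the abstract.
n_develop, n_validate = 140, 36

# Ten folds over the development observations, as in 10-fold cross-validation.
folds = k_fold_indices(n_develop, 10)

print(len(quarters), n_develop + n_validate, [len(f) for f in folds])
```

Under these assumptions the 44 years of quarterly data account for both the development and validation sets, and each of the ten folds holds out 14 development observations in turn.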

Original language: English
Title of host publication: Studies in Computational Intelligence
Editors: A.E. Hassanien, M.H.N. Taha, N.E.M. Khalifa
Number of pages: 20
ISBN (Electronic): 978-3-030-52067-0
ISBN (Print): 978-3-030-52066-3
Publication status: Published - 2021

Publication series

Name: Studies in Computational Intelligence
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503


  • Adaptive boosting
  • Building emissions
  • Decision tree
  • Extreme gradient boosting
  • K-nearest neighbour
  • Machine learning
  • Predicting
  • Random forest
  • Support vector regression

