Waste-to-Resource Transformation: Gradient Boosting Modelling for Organic Fraction Municipal Solid Waste Projection
Eniola Adeogbaǂ, Peter Bartyǂ, Edward O’Dwyer, Miao Guo*
Department of Chemical Engineering, Imperial College London, London, SW7 2AZ, U.K.
ǂ Equivalent contribution
*corresponding author: [email protected]
Abstract
Food and garden waste are important components of organic fraction municipal solid waste (OFMSW), representing carbon
and nutrient rich resources composed of carbohydrates, lipid, protein, cellulose, hemicellulose and lignin. Despite progressive
diversion from landfill, over 50% of landfilled MSW is biodegradable, causing greenhouse gas emissions. In conventional
waste management value chains, OFMSW components have been regarded as by-products as opposed to promising resources
with energy and nutrient values. Full exploitation of waste resources calls for a value chain transformation towards proactive
resource recovery and waste commoditization. This requires robust projection of OFMSW composition and supply variability.
Gradient boosting models are developed here using historical socio-demographic, weather and waste data from UK local
authorities. These models are used to forecast garden and food OFMSW generation for each of the 327 UK local authorities.
The developed methods perform particularly well in forecasting garden waste due to a greater link to measurable
environmental variables. The research highlights the key influences in waste volume prediction and demonstrates the difficulty
in transferring models to local authorities without training data. The predictive performance and spatial granularity of model
projections offer a promising approach to inform decision-making on future waste recovery facilities and OFMSW
commoditization.
Key words: machine learning, organic municipal solid waste, Gradient Boosting model, waste recovery, value chain
Introduction
The projected 50% increase in global population in the 21st century1 combined with non-OECD economic growth is
expected to increase resource (e.g. food and energy) demands as well as lead to rising waste generation. Municipal solid waste
(MSW) is defined as the municipal waste collected and treated by or for municipalities, and includes organic fraction MSW
(OFMSW: food waste, garden and park waste, paper and cardboard, wood) and inorganic MSW (textiles, rubber and leather,
plastics, metal, glass and others)2. Global MSW growth is projected to exceed 11 million tons per day (59%-68% organic
fraction) by 2100 under ‘business as usual’3. Increasing waste trends are particularly intense in less developed countries2. Such
waste trends not only increase the resource stress but also contribute to greenhouse gases (GHGs). Annual global food waste
(equivalent to one-third of food produced globally) is equivalent to the waste of 8.5% annual water withdrawn and 28% of
agricultural lands4; MSW has emerged as a major concern, as post-consumer waste accounts for 5% of global GHG emissions. A
transformation to resource-circular systems and sustainable MSW management is necessary.
Despite the ongoing shift away from a landfill-dominated system, MSW chemical composition variability and conventional
waste value chains hinder the transformation of waste sector towards a resource-circular system. Growing environmental
pressure has resulted in regional/national targets to divert waste from landfills and increase recycling and recovery rate. As
part of a circular economy strategy, the EU has set the targets to recover MSW (recycling 65% MSW by 2035) 5 and restrict
OFMSW sent to landfill (35% of the 1995 baseline by 2020)6. UK OFMSW to landfill represents 22% of the 1995 baseline
value with over 7.7 million tonnes of biodegradable municipal waste ending up in landfill in 2016.6 This implies that over 50%
of landfilled MSW is biodegradable.6,7 The decomposition of organic MSW in landfill is the predominant contributor (92%) to
the GHGs associated with the MSW sector (4% of total GHG) in the UK.8 In the conventional waste management value chain and
market, OFMSW and other waste streams have been regarded as by-products (carrying zero or low value) rather than
marketable commodities with well-defined grades (in contrast to oil products as energy carriers). In fact, OFMSW streams are
not only carbon-rich resources as energy carrier but also contain high nutrient value (e.g. protein, lipid and minerals). The
waste sector presents promising opportunities for resources to be converted to value-added products via thermochemical and
biochemical pathways such as anaerobic digestion. Exploiting waste resource value requires a transformative waste value
chain and commoditization of waste resources, which in turn require quantitative projection of waste composition and supply.
However, the waste composition is highly complex and variable. Take food waste as an example. The UK nationwide analyses
showed significantly varying carbohydrate (30-250 g/kg), lipid (10-128 g/kg), protein (5-140 g/kg), and soluble sodium and
potassium (1.2-55 g/kg) contents.9 These are dependent on spatially-explicit factors (e.g. local diet and behavior) and
seasonal environmental variables (e.g. winter and summer conditions). The analytical experiments needed to quantify such varying waste
composition can be labor intensive and costly, but they are essential to inform technology design that maximizes waste
recovery. Moreover, planning (e.g. sizing and logistics) and operation of waste recovery facilities requires continuous and
consistent waste feedstock supply, whereas it is difficult to quantify waste availability precisely, in particular OFMSW
volumes, because of low traceability and origins in mixed waste collection systems. Notably, household waste (27.3 million tons in
2016) dominates the local authority collected MSW in the UK (~90%)7 and shares over 70% of the food waste stream.10 These
lead to inhomogeneous OFMSW streams driven by household behaviors and consumption trends which are not only affected
by environmental factors but also socio-economic variables at local level (e.g. income). Such challenges have been highlighted
in a previous study11, where the authors pointed out the energy recovery barrier is the inability to quantify the garden waste.
Therefore, technology implementation and waste value chain transformation call for robust projection of waste feedstock
quantity and quality (composition) at spatial and temporal scales.
This study aims to project variability in organic solid waste generated in a given UK local authority, accounting for socio-
economic and other environmental factors. Our research focuses on food and garden waste streams as important components
of OFMSW. The chemical compositions and potential conversion pathways of food and garden waste are presented in Figure
SI-S1 in the Supporting Information (SI).
Methods for forecasting OFMSW
There is an increasing research interest in forecasting MSW generation in the context of informing local governments to
plan efficient waste management systems.12 A variety of advanced techniques have been used to forecast MSW generation,
which can be broadly classified as follows: descriptive statistical methods13 ; material flow models14 ; regression analysis15 ;
time series analysis16,17; and artificial intelligence models18,19. A list of advanced MSW forecasting techniques and their features
is presented in Table SI-S1.
Descriptive statistical methods typically use demographic information such as population growth and average waste
generation per capita as the main predictor; however, this method is prone to inaccuracies due to the dynamics of the MSW
generation process.13
Material flow models have been adapted to predict waste generation under various social and economic scenarios. 14 This
approach fully characterises the dynamic nature of MSW generation; however, it is typically applied to total waste rather than
collected waste.20 Hekkert et al.21 highlighted that comparisons of the results of material flow analysis with real observed
waste data on the highest aggregation levels were questionable due to the presence of different aggregations or low consistency
within the studies.
In regression analysis, MSW generation is correlated with economic and demographic variables. 19 The suitability of this
method for a complex real world problem such as MSW generation is limited due to strict requirements placed on the input
variables. These requirements include independency of explanatory variables, constant variance and normality of errors in
order to conform with fundamental regression assumptions.22
In contrast with the aforementioned methods, time series analysis is independent of demographic and socio-economic
factors and relies only on historical waste data. A non-linear dynamics based prediction technique was used to forecast waste
generation and compared to a traditional time-series approach known as seasonal AutoRegressive and Integrated Moving
Average (sARIMA).16
More recently, machine learning and artificial intelligence techniques such as support vector machines (SVM) and artificial
neural networks (ANN) have been used to predict MSW generation on long, medium and short-term scales. 23,24 Zade and
Noori 25 used ANNs to forecast weekly waste generation and Abbasi et al. 26 used partial least squares (PLS) for feature
selection with SVM for waste forecasting.
Our research differs from previously published literature which modelled total MSW volumes. Our research focuses on
waste composition projection by accounting for two categories of OFMSW – food and garden waste. In the specific context of
waste-to-resource transformation, the organic waste fraction of MSW is unique in the complexity of its environmental impact
and its potential for value-added product recovery.
This research uses a gradient boosting model, which is based on decision tree regression models. Gradient boosting has
demonstrated the ability to model complex non-linear relationships between variables and has proven higher prediction
accuracy than traditional time series approaches such as AutoRegressive and Integrated Moving Average (ARIMA). 27 The
major drawback of the ARIMA model is its assumption of a linear relationship between independent and dependent variables
which does not mirror the complex nature of real world relationships between variables.28
When compared to widely adopted ANN models, the gradient boosting method has many distinct advantages. There are
typically fewer hyper-parameters to be tuned and methods of interpretation are better developed than for ANNs. More
importantly for our application, the gradient boosting implementation used in this paper has an in-built method for handling
missing values in the data, whereas ANNs are less capable in this regard.
Finally, unlike approaches adopted in previous research, which focused on MSW forecasting for a given region, this
study investigates the transferability of the OFMSW prediction models to regions not included in the training set. Such
generality would provide a promising means to quantify the future availability of food and garden waste and inform local
authorities currently without collection schemes in place. Our proposed approach has the potential to benefit ODA countries
where the increasing waste trends are particularly intense.2
Materials and Methods
Data Collection
An extensive literature review of past MSW forecasting research highlighted population, socio-demographic and climate-
related variables as the most important contributors to MSW generation.23,27,29 The selected features (Table 1) were therefore
chosen to reflect each of these factors.
Table 1 External features used in models for food and garden waste
Features Period Source
Population of Authority                        Yearly    WDF
Area
Population Density
Index of Deprivation
Temperature                                    Monthly   CEDA
Rainfall
Solar radiation
Fraction of population living in rural areas   Yearly    ONS
Unemployment rate                              Yearly    ONS
Household income                               Yearly    ONS
Local authority waste data was obtained from the UK municipal waste database Waste-Data-Flow (WDF).30 The data was
provided for each quarter spanning the 7-year period of 2009 - 2016. Each entry in the dataset provided information on the
name of the local authority, period of interest, type of material collected, collection method, number of households served by a
waste collection service and collected waste tonnage. Each entry is provided for a combination of a given authority and a given
quarter, e.g. City of London, January-March 2016. This combination of authority and quarter defines an ‘example’. The actual
number of examples available for training the model was limited by the fact that only a subset of local authorities have
separate collection schemes in place for food and garden waste. The availability of garden and food waste data across different
UK local authorities is illustrated in Fig. 1.
Additional datasets were incorporated into the feature set to provide contextual information about the authorities. This
included meteorological data (e.g. mean temperature, rainfall and solar radiation) obtained from the Centre for Environmental
Data Analysis (CEDA), as well as socio-demographic data (e.g. household disposable income and rural-urban classification)
from the Office for National Statistics (ONS).
Figure 1 Availability of garden waste (A) and food waste (B) data across the UK (number of data points per authority)
Modelling Approach
A Gradient Boosted Regression Tree (GBRT) model was used to predict long-term waste generation. GBRT is used here to
estimate waste volumes using only contextual features of a given local authority. This process is performed in three stages:
training, cross-validation and testing (see Figure SI-S2 for a graphical outline of the process). First the model is trained to
identify patterns between features and the target variable, waste volume. In the cross-validation step, the model parameters are
fine-tuned to improve model performance and ensure that the model can generalise to examples beyond those it was trained
on. Finally, the model is tested on a number of unseen examples and its prediction accuracy is determined. Other model types
were examined and tested, including support-vector-machines and random forests, but the GBRT model was found to be most
effective for the two specific use cases in this paper (see Tables SI-S2,3 for further details).
The GBRT method combines the strengths of two algorithms: regression trees and boosting. The resulting model is an
additive regression model comprised of decision trees fitted in a stage-wise manner. Decision trees partition the feature space
into regions and use a series of rules to identify regions with homogeneous responses to predictors.31 A simple model (typically a constant) is
then fitted to each region. This binary partitioning is performed recursively and at each stage, the splitting variable and split
point are determined using a greedy recursive-partitioning algorithm.
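The greedy choice of splitting variable and split point can be illustrated with a minimal sketch. It assumes a squared-error criterion and a constant prediction in each region; the function names and the toy data (population density, mean temperature, waste) are hypothetical and are not taken from the study.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Greedy search over every feature and threshold for the binary
    split that minimises the total SSE of the two child regions."""
    best = None  # (total_sse, feature_index, threshold)
    n_features = len(rows[0])
    for j in range(n_features):
        thresholds = sorted({row[j] for row in rows})
        for t in thresholds[1:]:  # skip the smallest value: the left region would be empty
            left = [y for row, y in zip(rows, targets) if row[j] < t]
            right = [y for row, y in zip(rows, targets) if row[j] >= t]
            total = sse(left) + sse(right)
            if best is None or total < best[0]:
                best = (total, j, t)
    return best


# Hypothetical toy data: (population density, mean temperature) -> waste
rows = [(10, 5), (12, 18), (40, 6), (45, 19)]
targets = [100.0, 110.0, 240.0, 250.0]
total_sse, feature, threshold = best_split(rows, targets)
```

On this toy data the search selects population density (feature index 0) as the first splitting variable, because waste differs far more between the low- and high-density examples than between the cold and warm ones. A full tree would apply the same search recursively to each child region.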
Fig. 2 illustrates the binary partitioning process in the context of our research. Consider two predictor variables X1 and X2,
representing population density and mean temperature of a specified local authority at a given point in time, and a response Y,
the total waste generated in the area. Splitting occurs at the first node using population density as the splitting variable. The
next step involves splitting at the subsequent node, using temperature as the splitting variable. The recursive splitting process
is stopped once a defined minimum node size is attained and the resulting large tree is pruned using a process called ‘cost-
complexity pruning’. This process involves removing weak links identified through cross-validation. The result is a single
decision tree which best describes the underlying relationship between a set of variables.32
Figure 2: (A) visualization of a decision tree using two demonstrative predictor variables (population density and temperature) to generate a single response (waste volume); (B) regression tree tuning parameters.
GBRT is an extension of traditional regression trees which incorporates a statistical technique known as “boosting”. This
procedure improves prediction accuracy by combining the outputs of “weak” classifiers to form a single consensus model.
Hastie et al.31 defined a weak classifier as ‘one whose error rate is only slightly better than random guessing’. In the context of
regression trees, boosting is a form of functional gradient descent that optimises performance by adding a new tree at each step
that best reduces the loss function.32 The first regression tree is the one that reduces the loss the
most; subsequent decision trees are then fitted to the residuals of the preceding trees. It is also important to note that
although the model is updated each time a new tree is calculated, the existing trees remain unchanged. Instead, the linear
model fitted to each observation is re-calculated to account for the effect of the new tree.31
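The stage-wise procedure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions (squared-error loss, a single feature, depth-1 trees with constant leaves, hypothetical data); it is not the XGBoost implementation used in this study.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: one threshold, two constant leaves."""
    best = None  # (error, threshold, left_mean, right_mean)
    for t in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm


def boost(xs, ys, n_estimators=50, learning_rate=0.1):
    """Stage-wise additive model: each new tree is fitted to the
    residuals of the current ensemble, then added with a small weight."""
    base = sum(ys) / len(ys)  # initial constant prediction
    trees = []
    preds = [base] * len(ys)
    for _ in range(n_estimators):
        # For squared error, the residuals are the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)  # earlier trees are never modified
        preds = [p + learning_rate * tree(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)


# Hypothetical one-feature data with a step change
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]
model = boost(xs, ys)
mse_base = sum((y - sum(ys) / len(ys)) ** 2 for y in ys) / len(ys)
mse_model = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

Comparing `mse_model` with `mse_base` shows how far the ensemble improves on the constant baseline. Lowering `learning_rate` slows this improvement per tree, which is why a low learning rate usually needs more estimators.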
In order to optimise predictive performance, several parameters including number of estimators, maximum tree depth and
learning rate were tuned. Tuning was performed using graphical methods of comparing training and cross validation
accuracies, with the aim being to maximise the R2 score for the cross-validation set. The number of estimators determines the
maximum number of trees in the model, as illustrated in Fig. 2, while tree depth determines the degree of interaction between
features. Friedman et al.33 state that ‘since each new tree builds on the residuals of the previous tree, shallow trees with a
depth of 4–6 are often preferred’. Learning rate is another key parameter that determines the weighting of each tree in the final
model. A low learning rate will increase the number of trees used and allow regularization of results. Ultimately, this results in
better model performance and a reduced risk of over-fitting to the training dataset despite a subsequent trade-off in
computation time.34
The final model is a linear combination of all decision trees whose contribution to the overall model is weighted by the
learning rate to minimise the root mean squared error loss function.34 XGBRegressor was implemented using Python’s
XGBoost package version 0.81.
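For reference, the additive structure can be written in a simplified textbook form for squared-error loss. The symbols are assumptions introduced here for illustration: $F_m$ is the ensemble after $m$ trees, $h_m$ is the $m$-th tree, $\nu$ is the learning rate and $M$ is the number of estimators. XGBoost itself adds regularisation terms and uses second-order gradient information, which this sketch omits.

```latex
F_0(x) = \bar{y}, \qquad
r_i^{(m)} = y_i - F_{m-1}(x_i), \qquad
F_m(x) = F_{m-1}(x) + \nu \, h_m(x), \quad m = 1, \dots, M
```

Here each tree $h_m$ is fitted to the residuals $r_i^{(m)}$, so the final model is $F_M(x) = \bar{y} + \nu \sum_{m=1}^{M} h_m(x)$.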
Feature Engineering
Features were mapped to each UK local authority on a quarterly basis from 2009 to 2016. UK weather data was obtained
from CEDA at a grid resolution of 5 km × 5 km. These were mapped to the relevant regions by locating the nearest grid point to
each local authority with grid point locations determined by minimizing the Euclidean distance between the grid points and the
longitude and latitude given for the local authority. A small number of these authorities were excluded due to discrepancies in
nomenclature between waste authorities and local authorities.
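The nearest-grid-point mapping can be sketched as follows. The coordinates and function name are hypothetical. The distance is taken directly in longitude/latitude space, as described above, which ignores the Earth's curvature but is a reasonable approximation over a fine grid.

```python
import math


def nearest_grid_point(authority_lon_lat, grid_points):
    """Return the grid point with the smallest Euclidean distance
    (in longitude/latitude space) to the authority's coordinates."""
    lon, lat = authority_lon_lat
    return min(grid_points, key=lambda g: math.hypot(g[0] - lon, g[1] - lat))


# Hypothetical coordinates, for illustration only
grid = [(-0.20, 51.45), (-0.10, 51.50), (0.00, 51.55)]
authority = (-0.09, 51.51)
matched = nearest_grid_point(authority, grid)
```

The weather series at `matched` would then be assigned to the authority for every quarter.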
Once the feature mapping stage was completed, features were checked for co-linearity and excluded if their linear
correlation R-squared score exceeded 0.8. Features excluded by this method include local authority dwelling stock and a
number of metrics related to the rural population in local authorities. Feature choices were kept consistent for all model types
to allow comparison of feature importance and more importantly, model accuracy.
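A minimal sketch of such a co-linearity filter is given below. The feature values are hypothetical. The source does not state which member of a correlated pair was dropped, so keeping the first feature encountered is an assumption made here.

```python
def r_squared(a, b):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a constant feature has no defined correlation
    return cov * cov / (va * vb)


def drop_collinear(features, threshold=0.8):
    """Keep features in order, skipping any whose R^2 with an
    already-kept feature exceeds the threshold (assumed tie-break rule)."""
    kept = []
    for name in features:
        if all(r_squared(features[name], features[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values for four authorities
features = {
    "population": [100.0, 200.0, 300.0, 400.0],
    "dwelling_stock": [41.0, 79.0, 122.0, 160.0],  # nearly proportional to population
    "rainfall": [80.0, 60.0, 90.0, 70.0],
}
kept = drop_collinear(features)
```

In this toy case dwelling stock is excluded because it is almost perfectly correlated with population, mirroring the exclusion reported above.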
Models
Two model types were developed - a forecasting model and a “peer” model. For each of these model types, individual sub-
models were created for food and garden waste.
The forecasting model was developed to predict food and garden waste volumes for future time periods. The food and
garden waste sub-models were trained using data from 2009-2013 for all applicable local authorities and then tested for the
period from 2014-2016. The mean waste volume prediction was compared to actual waste data for this test period. The
purpose of the forecast model is to predict waste volumes in the future for local authorities which already have collection
schemes in place to aid in planning.
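The temporal split used for the forecasting model can be sketched as below. The tuple layout (authority, year, quarter, tonnes) and the example values are hypothetical.

```python
def temporal_split(examples, last_train_year=2013):
    """Split (authority, year, quarter, tonnes) examples by year:
    up to and including last_train_year for training, later years for testing."""
    train = [e for e in examples if e[1] <= last_train_year]
    test = [e for e in examples if e[1] > last_train_year]
    return train, test


# Hypothetical examples, for illustration only
examples = [
    ("Authority A", 2012, 4, 510.0),
    ("Authority A", 2013, 1, 480.0),
    ("Authority A", 2014, 1, 495.0),
    ("Authority B", 2016, 3, 820.0),
]
train, test = temporal_split(examples)
```

Splitting by time rather than at random prevents information from the test period leaking into training.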
The peer model was developed in response to an observed gap in the literature regarding the generalisability of these waste
prediction models to regions which the model had not previously been trained on. The purpose of this model was therefore to
predict OFMSW volumes based only on intrinsic information about the local authority for a given period such as population
and weather patterns. In our specific case, the purpose of creating a peer model is to be able to extrapolate forecasts for organic
waste to those local authorities that do not yet have organic waste collection schemes in place, to highlight the ‘lost potential’
for organic waste collection. As such, it could be used to support the decision to implement such a scheme in a local authority.
The peer model was trained on 60% of the 327 local authorities over the entire time period from 2009-2016 and tested on
the other 40% of local authorities which had not been seen by the model. The predicted waste volumes for the unseen local
authorities were compared to actual waste volumes to determine the accuracy of the peer model. Table 2 shows the tuning
parameters used for each model type on the garden and food waste streams.
Dataset splitting
In order to obtain training, cross validation and test sets for the peer model, the original datasets were split such that similar
waste volume distributions were observed across all three sets. Authorities were ranked by their mean waste volume and split
into quintiles based on this metric. Finally, stratified random sampling was used to split the authorities into training, cross
validation and test sets, stratifying on these quintiles.
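The quintile-based stratified split can be sketched as follows. The 60/20/20 proportions, the random seed and the synthetic volumes are assumptions for illustration; the source does not state the exact size of the cross-validation set.

```python
import random


def stratified_split(mean_volumes, fractions=(0.6, 0.2, 0.2), n_bins=5, seed=0):
    """Split authorities into train/cv/test, stratifying on volume quintiles."""
    rng = random.Random(seed)
    ranked = sorted(mean_volumes, key=mean_volumes.get)  # rank by mean waste volume
    splits = {"train": [], "cv": [], "test": []}
    for b in range(n_bins):
        # Slice the ranked list into n_bins contiguous, near-equal bins.
        lo = b * len(ranked) // n_bins
        hi = (b + 1) * len(ranked) // n_bins
        bin_members = ranked[lo:hi]
        rng.shuffle(bin_members)  # random sampling within each quintile
        n_train = round(fractions[0] * len(bin_members))
        n_cv = round(fractions[1] * len(bin_members))
        splits["train"] += bin_members[:n_train]
        splits["cv"] += bin_members[n_train:n_train + n_cv]
        splits["test"] += bin_members[n_train + n_cv:]
    return splits


# Hypothetical mean quarterly volumes for 50 authorities
volumes = {f"authority_{i:02d}": 100.0 + 37.0 * i for i in range(50)}
splits = stratified_split(volumes)
```

Because every quintile contributes to each set in the same proportion, the three sets cover a similar range of waste volumes. Changing `seed` changes which authorities land in each set.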
Model evaluation
The model performance is measured by a coefficient of determination R2 (Eq.(1)) measured against a baseline model (i.e.
mean of the waste volume in the training set).
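\[
R^2 = 1 - \frac{\sum_i \left( y_i - y_{i,\mathrm{pred}} \right)^2}{\sum_i \left( y_i - \bar{y}_{\mathrm{train}} \right)^2} \qquad (1)
\]
where $y_i$ and $y_{i,\mathrm{pred}}$ denote the true and predicted waste volumes, respectively; $\bar{y}_{\mathrm{train}}$ represents the mean waste volume in the
training dataset.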
This coefficient of determination can be calculated for each of the training, cross-validation and testing datasets for each
model. The models’ parameters are tuned with the aim of maximising the cross-validation R2.
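A minimal sketch of this baseline-relative score is given below, with the squared prediction errors in the numerator and the squared deviations from the training-set mean in the denominator. The function name and values are hypothetical.

```python
def r2_against_train_mean(y_true, y_pred, y_train_mean):
    """Coefficient of determination measured against a baseline that
    always predicts the mean waste volume of the training set."""
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_base = sum((y - y_train_mean) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_base


# Hypothetical volumes, for illustration only
y_train = [100.0, 200.0, 300.0]
train_mean = sum(y_train) / len(y_train)
y_test = [150.0, 250.0]
score_perfect = r2_against_train_mean(y_test, y_test, train_mean)
score_baseline = r2_against_train_mean(y_test, [train_mean, train_mean], train_mean)
```

A perfect prediction scores 1, predicting the training mean everywhere scores 0, and a model that does worse than the baseline scores below 0.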
Model Interpretation
A number of model interpretation techniques could be applied in waste forecasting models to provide useful insights into
individual feature contribution to the model’s decision-making process. Lundberg et al.35 showed that widely adopted
approaches such as split count and gain are inconsistent metrics of feature importance and proposed a new method, known as
SHapley Additive exPlanations (SHAP), whose advantages and limitations they discuss thoroughly. SHAP combines a
number of Additive Contribution Explanation algorithms to provide a measure of feature contributions that meets the
requirements of local accuracy, missingness and consistency.35
SHAP values measure the contribution of each feature to the model’s output for a certain example. The mean absolute
SHAP values across all examples for each feature are calculated to evaluate the importance of the features for the model.
Higher mean absolute SHAP values indicate that those features generally have a more significant overall additive contribution
to the model’s output. In this study, SHAP values were calculated using the TreeSHAP algorithm 36, implemented in Python’s
SHAP library, version 0.25.2.
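The aggregation from per-example SHAP values to a global importance ranking can be sketched as follows. The SHAP values here are hypothetical numbers written by hand; in practice they would come from the TreeSHAP computation.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across all examples.
    shap_rows holds one list of per-feature SHAP values per example."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical SHAP values for three examples, for illustration only
feature_names = ["population", "solar_radiation", "rainfall"]
shap_rows = [
    [120.0, -30.0, 5.0],
    [-80.0, 45.0, -10.0],
    [100.0, -15.0, 0.0],
]
ranking = mean_abs_shap(shap_rows, feature_names)
```

Taking absolute values before averaging matters: a feature that pushes predictions up for some examples and down for others would otherwise appear unimportant.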
Results and discussion
Model Performance
Table 2 shows the prediction accuracy for both the forecasting and peer models for each waste type, as measured by the R2
value across the entire training/cross-validation/test set. Overall, the forecasting model shows good prediction accuracy on
both food and garden waste. For both model types, test accuracy is consistently higher for garden than food waste, with a 29%
and 11% increase in R2 value for garden waste compared to food waste from the forecasting and peer model respectively. The
peer model shows significantly lower test accuracies for both waste types with an R2 of just 0.29 for garden waste due to the
inherently difficult nature of predicting the waste output of previously unseen local authorities.
Table 2 Models for food and garden waste

Forecasting and peer model parameters
Model type    Waste type   Tree depth   Number of estimators   Learning rate
Forecasting   Garden       4            8000                   0.025
Forecasting   Food         4            1000                   0.025
Peer          Garden       3            900                    0.004
Peer          Food         2            1800                   0.006

Model prediction results: accuracy (R2 score)
Model type    Waste type   Training   CV      Test
Forecasting   Garden       0.994      0.770   0.651
Forecasting   Food         0.951      0.691   0.506
Peer          Garden       0.719      0.549   0.291
Peer          Food         0.787      0.373   0.262
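The relative gains quoted in the text (29% and 11%) can be checked against the test R2 scores in Table 2. The sketch below assumes the percentages are relative increases of the garden waste score over the food waste score; the function name is illustrative.

```python
def relative_increase(garden_r2, food_r2):
    """Percentage by which the garden waste R2 exceeds the food waste R2."""
    return 100.0 * (garden_r2 - food_r2) / food_r2


# Test-set R2 scores from Table 2
forecasting_gain = relative_increase(0.651, 0.506)
peer_gain = relative_increase(0.291, 0.262)
```

Rounded to the nearest integer, these reproduce the 29% (forecasting) and 11% (peer) figures.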
Figure 3. Model predictive performance on mean quarterly collected garden (A) and food waste (B) per authority for training and test set from 2009 – 2016.
Fig. 3 compares forecasting model prediction results for mean quarterly garden and food waste, to actual collection
volumes. The quarterly prediction results were averaged across all local authorities to demonstrate mean waste volumes for the
UK as a whole.
In aggregate, model accuracy is higher at the national level than at the local authority level, with test R2 scores of
0.766 and 0.899 for garden and food waste models, respectively. The forecasting model for garden waste follows the training
set very closely but is less accurate when applied to the test set. However, it captures the seasonality of garden waste
generation indicated by peaks and troughs. The forecasting model for food waste captures the increasing trend in waste
collection over time as well as short-term changes in the overall trend.
Feature Importance
Fig. 4 shows the relative feature importance for the garden and food waste prediction models. The results indicate that
population is the most important feature. However, the importance of socio-economic and environmental factors varies
significantly between the models. Monthly solar radiation and mean temperature are significant features for the garden waste
prediction models, however they are relatively unimportant for the food waste models. Socio-demographic variables, such as
population dynamics, gross disposable household income per head and index of deprivation appear to be more important
features for food waste prediction models.
Figure 4. Relative feature importance for garden waste (A) and food waste (B) prediction models.
Figure 5. Correlation between magnitude of feature values and impacts on prediction model outputs for food waste (A) and garden waste (B)
Fig. 5 illustrates the degree to which model output is positively or negatively impacted by the magnitude of the value of each
feature. Feature impact on model output was measured using SHAP. Both models demonstrate that more populated regions
have a high positive impact on model output. Lower population values show a receding influence on the output, but the
magnitude of this influence is noticeably smaller than that for high population values. This suggests that for higher
populations, the population size itself becomes a more dominant feature for waste prediction, while for lower populations,
other features are emphasised to a greater degree.
Discussion
The forecasting models provide robust projection for two types of OFMSW i.e. food and garden waste. They represent
carbon and nutrient rich resources for recovery including carbohydrates, lipid, protein, cellulose, hemicellulose and lignin.
However, more robust results were obtained when predicting garden waste volumes compared to food waste. This is largely
due to the availability of the features characterizing each waste type. We hypothesized garden waste collection to be
influenced primarily by weather-related factors and these are represented well by the feature set used for model training.
Conversely, food waste collection could be influenced by individual household behaviours in addition to socio-economic
variables. These are more difficult to quantify on a local authority basis therefore the feature set is less representative for food
waste.
Feature importance results provide insight into the relative weighting of each feature in the model’s decision-making
process. Population is the most important feature for both model types as expected (more inhabitants generate higher levels of
OFMSW), while monthly sunshine is a more dominant feature for garden waste than mean temperature. More generally,
weather reflects the seasonally varying features, acting as the second most important feature for garden waste prediction
models, indicated by high mean SHAP values for monthly solar radiation and temperature. The effect of using weather as a
feature is reflected by the model’s capability to accurately capture the seasonality of garden waste generation (see Fig. 3A).
Additionally, monthly mean temperature and solar radiation are strongly correlated to their impacts on predicted waste
volumes.
Model predictive performance is subject to a number of data and methodology limitations. The presence of incomplete and
noisy data and the inconsistencies in reporting standards across the UK resulted in discrepancies and missing data. Moreover,
variables such as community acceptance of waste-recovery and the differences in waste collection methods between local
authorities are unknown and hence not considered in the models.
Reported accuracies of the peer models are highly dependent on the local authorities included in the separate training, cross
validation and test datasets. As a consequence, variations in prediction accuracy were observed when randomised stratified
sampling was repeated. The generalizability of the models to ‘unseen’ local authorities therefore relies on sufficient
similarities between local authorities in the training and test sets. As only a limited number of local authorities provide
consistent waste data, ensuring a representative split across the training and test sets without introducing external bias is a non-
trivial problem. The results imply that transferring models to local authorities without data with which to retrain is not
straightforward with a data-driven approach such as this. The variability in R2 across different authorities can be seen in
Figures SI-S3,4,5,6.
In addition, forecasting model accuracy is impacted by variations in the number of food and garden waste collections
across local authorities. Authorities with a higher number of waste collections are represented better by the forecasting model
than those with only a few. This is likely to result in increased model accuracy for such authorities compared to those which
are less well represented in the data.
Overall, the strong performance of the forecasting models at the local authority and national level supports their suitability
for projecting OFMSW variability in waste quantity and quality (composition). Such projection could inform the future
OFMSW commoditization where the commodity grade definitions are dependent on waste composition and recovery values as
energy carriers or nutrient substitutes. Furthermore, the spatial granularity of model predictions offers a promising approach to
inform the decision-making on technology choice, sizing, location, and logistics of OFMSW recovery facilities. These
underpin the potential transformation of waste value chains from retrospective waste management to proactive resource
recovery in a coordinated OFMSW value chain, where waste resource and facilities are interconnected via the internet,
supporting real-time decision-making (following an Industry 4.0 framework).
In future research, model performance may be strengthened by improving data quality and incorporating more features
which reflect the dynamics of waste collection, e.g. collection schedules, population demographics and the number of
households registered for OFMSW collection. Further benefits can be realised from the models investigated in this study by
extending their prediction capabilities to account for variability in OFMSW chemical composition. Quantifying model
prediction uncertainty, e.g. by providing lower and upper bounds for waste forecasts, could be another important step towards
industrial application of this research, as such bounds would help to optimise the responsiveness of waste processing
operations.
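As a minimal sketch of what lower/upper bounds on a forecast could look like, the Python fragment below derives an empirical interval from the residuals of a model on held-out data. This is an assumed, deliberately simple approach chosen for illustration; it is not a method used in this study, and alternatives such as quantile-loss gradient boosting may be more appropriate in practice. All function names are hypothetical.

```python
import math


def empirical_quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def residual_bounds(observed, predicted, coverage=0.9):
    """Lower and upper residual offsets spanning the central `coverage` share
    of held-out errors (observed minus predicted)."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    alpha = (1 - coverage) / 2
    return (empirical_quantile(residuals, alpha),
            empirical_quantile(residuals, 1 - alpha))


def forecast_interval(point_forecast, bounds):
    """Shift a point forecast by the residual offsets; waste mass cannot be
    negative, so the lower bound is clamped at zero."""
    lower_offset, upper_offset = bounds
    return max(0.0, point_forecast + lower_offset), point_forecast + upper_offset
```

For example, bounds estimated once on validation data can be applied to each new monthly forecast, giving a waste processing operator a plausible range rather than a single value. The interval is only as reliable as the assumption that future errors resemble held-out errors.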
Acknowledgement
MG would like to acknowledge the UK Engineering and Physical Sciences Research Council (EPSRC) for providing
financial support for EPSRC Fellowship ‘Resilient and Sustainable Biorenewable Systems Engineering Model’
[EP/N034740/1].
References
(1) United Nations. World Population Prospects; 2015.
(2) Hoornweg, D.; Bhada-Tata, P. What a Waste - A Global Review of Solid Waste Management; World Bank, 2012.
(3) Hoornweg, D.; Bhada-Tata, P.; Kennedy, C. Environment: Waste Production Must Peak This Century. Nature 2013.
(4) FAO. Food Wastage: Key Facts and Figures. 2016.
(5) European Commission. 2018 Circular Economy Package. 2018.
(6) Department for Environment, Food and Rural Affairs. UK Statistics on Waste. 2018.
(7) Department for Environment, Food and Rural Affairs. Digest of Waste and Resource Statistics. 2018.
(8) Committee on Climate Change. Reducing UK Emissions 2018 Progress Report to Parliament. 2018.
(9) WRAP. Food Waste Chemical Analysis; 2010.
(10) WRAP. Estimates of Food Surplus and Waste Arisings in the UK; 2018.
(11) Shi, Y.; Ge, Y.; Chang, J.; Shao, H.; Tang, Y. Garden Waste Biomass for Renewable and Sustainable Energy
Production in China: Potential, Challenges and Development. Renew. Sustain. Energy Rev. 2013, 22, 432–437. DOI
10.1016/j.rser.2013.02.003.
(12) Kolekar, K. A.; Hazra, T.; Chakrabarty, S. N. A Review on Prediction of Municipal Solid Waste Generation Models.
Procedia Environ. Sci. 2016, 35, 238–244. DOI 10.1016/j.proenv.2016.07.087.
(13) Sha’Ato, R.; Aboho, S. Y.; Oketunde, F. O.; Eneji, I. S.; Unazi, G.; Agwa, S. Survey of Solid Waste Generation and
Composition in a Rapidly Growing Urban Area in Central Nigeria. Waste Manag. 2007, 27 (3), 352–358. DOI
10.1016/j.wasman.2006.02.008.
(14) Schiller, F.; Raffield, T.; Angus, A.; Herben, M.; Young, P. J.; Longhurst, P. J.; Pollard, S. J. T. Hidden Flows and Waste
Processing – an Analysis of Illustrative Futures. Environ. Technol. 2010, 31 (14), 1507–1516. DOI
10.1080/09593331003777151.
(15) Wei, Y.; Xue, Y.; Yin, J.; Ni, W. Prediction of Municipal Solid Waste Generation in China by Multiple Linear Regression
Method. Int. J. Comput. Appl. 2013, 35 (3), 136–140. DOI 10.2316/Journal.202.2013.3.202-3898.
(16) Navarro-Esbrı́, J.; Diamadopoulos, E.; Ginestar, D. Time Series Analysis and Forecasting Techniques for Municipal
Solid Waste Management. Resour. Conserv. Recycl. 2002, 35 (3), 201–214. DOI 10.1016/S0921-3449(02)00002-2.
(17) Xu, L.; Gao, P.; Cui, S.; Liu, C. A Hybrid Procedure for MSW Generation Forecasting at Multiple Time Scales in
Xiamen City, China. Waste Manag. 2013, 33 (6), 1324–1331. DOI 10.1016/j.wasman.2013.02.012.
(18) Ali Abdoli, M.; Falah Nezhad, M.; Salehi Sede, R.; Behboudian, S. Longterm Forecasting of Solid Waste Generation
by the Artificial Neural Networks. Environ. Prog. Sustain. Energy 2012, 31 (4), 628–636. DOI 10.1002/ep.10591.
(19) Abbasi, H.; Emam-Djomeh, Z.; Ardabili, S. M. S. Artificial Neural Network Approach Coupled with Genetic
Algorithm for Predicting Dough Alveograph Characteristics. J. Texture Stud. 2014, 45 (2), 110–120. DOI
10.1111/jtxs.12054.
(20) Beigl, P.; Lebersorger, S.; Salhofer, S. Modelling Municipal Solid Waste Generation: A Review. Waste Manag. 2008,
28 (1), 200–214. DOI 10.1016/j.wasman.2006.12.011.
(21) Hekkert, M. P.; Joosten, L. A. J.; Worrell, E. Analysis of the Paper and Wood Flow in The Netherlands. Resour.
Conserv. Recycl. 2000, 30 (1), 29–48. DOI 10.1016/S0921-3449(00)00044-6.
(22) Hockett, D.; Lober, D. J.; Pilgrim, K. Determinants of Per Capita Municipal Solid Waste Generation in the
Southeastern United States. J. Environ. Manage. 1995, 45 (3), 205–217. DOI 10.1006/jema.1995.0069.
(23) Noori, R.; Abdoli, M. A.; Ghasrodashti, A. A.; Jalili Ghazizade, M. Prediction of Municipal Solid Waste Generation
with Combination of Support Vector Machine and Principal Component Analysis: A Case Study of Mashhad.
Environ. Prog. Sustain. Energy 2009, 28 (2), 249–258. DOI 10.1002/ep.10317.
(24) Abbasi, M.; Abduli, M. A.; Omidvar, B.; Baghvand, A. Results Uncertainty of Support Vector Machine and Hybrid of
Wavelet Transform-Support Vector Machine Models for Solid Waste Generation Forecasting. Environ. Prog. Sustain.
Energy 2014, 33 (1), 220–228. DOI 10.1002/ep.11747.
(25) Jalili, M.; Noori, R. Prediction of Municipal Solid Waste Generation by Use of Artificial Neural Network: A Case
Study of Mashhad; 2007; Vol. 2.
(26) M, A.; Abdoli, M.; Omidvar, B.; A, B. Forecasting Municipal Solid Waste Generation by Hybrid Support Vector
Machine and Partial Least Square Model; 2013; Vol. 7.
(27) Johnson, N. E.; Ianiuk, O.; Cazap, D.; Liu, L.; Starobin, D.; Dobler, G.; Ghandehari, M. Patterns of Waste Generation:
A Gradient Boosting Model for Short-Term Waste Prediction in New York City. Waste Manag. 2017, 62, 3–11. DOI
10.1016/j.wasman.2017.01.037.
(28) Kane, M. J.; Price, N.; Scotch, M.; Rabinowitz, P. Comparison of ARIMA and Random Forest Time Series Models for
Prediction of Avian Influenza H5N1 Outbreaks. BMC Bioinformatics 2014, 15 (1), 276. DOI 10.1186/1471-2105-15-276.
(29) Tchobanoglous, G. Integrated Solid Waste Management: Engineering Principles and Management Issues; Theisen,
H., Vigil, S., Eds.; McGraw-Hill: New York, 1993.
(30) Department for Environment, Food and Rural Affairs. WasteDataFlow. 2018.
(31) Hastie, T.; Tibshirani, R.; Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction,
2nd ed.; 2009.
(32) Elith, J.; Leathwick, J. R.; Hastie, T. A Working Guide to Boosted Regression Trees. J. Anim. Ecol. 2008, 77 (4), 802–
813. DOI 10.1111/j.1365-2656.2008.01390.x.
(33) Friedman, J. H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29 (5), 1189–1232.
(34) Friedman, J. H. Stochastic Gradient Boosting. Comput. Stat. Data Anal. 2002, 38 (4), 367–378. DOI 10.1016/S0167-
9473(01)00065-2.
(35) Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions; 2017.
(36) Lundberg, S.; Erion, G. G.; Lee, S.-I. Consistent Individualized Feature Attribution for Tree Ensembles; 2018.
Supporting Information
The following material has been presented in Supplementary Information for Publication in order to enhance the reader’s
understanding of this research and the proposed methodology:
1. Tabulated outline of advanced techniques for MSW forecasting including types of application areas and waste streams
2. Superstructure showing compositions and conversion pathways for food and garden waste constituents of organic waste
3. Graphic showing methodology for development of the gradient boosting model
4. Comparison of performance of tested models on forecasting and peer prediction tasks for green garden waste
5. Figures showing variations in R2 scores at the local authority level for food and garden waste across the forecasting and
peer models
For Table of Contents Use Only
Predictive analytical techniques are applied to organic-fraction municipal solid waste, enabling effective waste-to-resource
transformation systems to close the loop on the circular economy.