SHORT-TERM EUROPEAN UNION ALLOWANCE PRICE FORECASTING WITH ARTIFICIAL NEURAL NETWORKS

The European Union Emissions Trading Scheme (EU ETS) was created to reduce greenhouse gas emissions. Companies producing carbon emissions have to manage the associated cash flows by buying or selling carbon allowances. Moreover, future carbon prices could affect company decision making on investments in decarbonization technology. In this paper, we forecast short-term future carbon allowance prices using an artificial intelligence tool: a neural network. The resulting mean error was 1.7617%. This indicates very good performance for a time series whose evolution is influenced by subjective economic and political decisions. Including variables possibly directly related to the evolution of the price of CO2 emission allowances in the forecasting model did not improve prediction accuracy. Therefore, we can assume that emission allowance prices follow a random path. The neural network provided reliable predictions that agents selling or buying allowances can use to make their decisions.


Introduction
The European Union introduced the emissions trading system (EU ETS) in 2005 in order to reduce greenhouse gas emissions, which are largely responsible for climate change. The EU ETS is a market mechanism that determines a price for CO2 emissions and tries to create incentives to reduce emissions from industrial sectors. This system allows companies producing carbon emissions to manage the associated costs effectively by buying or selling emission allowances. There is a great deal of literature concerning the theoretical foundations of greenhouse gas emission allowance trading schemes, their effects as an instrument of environmental policy, and [...] Therefore, the evolution of the price of CO2 allowances is a highly interesting variable for many different agents. As Figure 2 shows, there have been large price fluctuations since the market came into being. There have even been periods when prices reached rock bottom, calling into question the operation of the market and its capability to generate decarbonization incentives. The aim of this paper is to determine a model that helps to understand and predict EUA price evolution. Based on the above, we can identify at least three broad arguments that justify the interest of this research.
ENTREPRENEURSHIP AND SUSTAINABILITY ISSUES ISSN 2345-0282 (online) http://jssidoi.org/jesi/ 2020 Volume 8 Number 1 (September) http://doi.org/10.9770/jesi.2020.8.1(18)
First, the determination of allowance (EUA) prices and their evolution has major implications for companies covered by the EU ETS, as it conditions the cost-effectiveness of their activity and is crucial for decision making on how to organize their system of production, promoting investment in machinery, plant or, generally, less polluting forms of production (with changes that can even affect their choice of suppliers or raw materials). Second, as some authors have recently pointed out (e.g., Atsalakis, 2016), corporations, and their directors and officers, are at increasing risk of incurring severe financial costs for not properly addressing EU environmental issues. Additionally, EUAs can be classed as financial assets, which raises them to the portfolio decision-making level. EUA pricing is vital, not only for the companies covered by the system, but also for agents operating in financial markets. Finally, as already mentioned, the EUA price plays a very important role in EU environmental policy. If the market works properly, the EUA price should generate incentives driving polluting companies to invest in less polluting machinery and plant, use more sustainable raw materials or switch to more environmentally friendly suppliers, moving towards the economy's decarbonization targets.
Economic theory dictates that market fundamentals (basic factors that affect the supply and demand of goods or assets traded on the market) should characterize the evolution of emission allowance prices (EUA). As in other markets, the price of allowances is expected to be determined by the balance between supply and demand (Fezzi and Bunn, 2009). In order to analyze EU ETS operation, the academic literature has tried to identify the factors that shape the price of carbon (Table 1).
Supply and demand elements, such as the number of distributed allowances or expected emissions, can usually be expected to define the CO2 price, but the allowance market can also be affected by macroeconomic or financial market shocks (Chevallier, 2011). Aatola et al. (2013) found that the fundamentals (German electricity and gas and coal prices) had a big impact on the daily future EUA prices from 2005 to 2010. Lutz et al. (2013) also found a strong relationship between EUA prices and fundamentals (prices of gas, coal, oil...) from 2008 to 2012. Their empirical results suggested that the price dynamics behaved nonlinearly. Oberndorfer (2009) and Moreno and Pereira (2016) analyzed and found a significant relationship between CO2 prices and stock market returns (power sector).

Study | Method | Subject
Hammoudeh et al. (2014) | Bayesian structural VAR | Allowance prices and their fundamentals
Hammoudeh et al. (2015) | Nonlinear autoregressive | Allowance prices and their fundamentals
Szoplik (2015) | Neural network | Gas consumption in Poland
Keles et al. (2016) | Neural network | Electricity price forecasting
Moreno et al. (2016) | Panel data model | EUA prices and stock market returns (power sector)
Kuo et al. (2018) | Neural network | Electricity price forecasting
Alblawi et al. (2019) | Neural network | Energy consumption

The above research highlights the strong relationship between fuel and energy prices and allowance prices. Therefore, the evolution of fuel and energy prices is important enough to attract the attention of researchers. Indeed, a lot of research has focused on forecasting energy (mainly electricity) and fuel demand and prices (González-Romera et al., 2006; Azadeh et al., 2011; Szoplik, 2015; Keles et al., 2016; Kuo et al., 2018; Alblawi et al., 2019).
Generally, an analysis of the results reported in the empirical literature does not reveal any clear consensus on either the characterization of EUA prices or the sign or temporal stability of the effects that variables like carbon, fuel or gas prices have on the market. Some of the more recent research suggests that standard time series analysis [...]
Of the many forecasting tools available, neural networks are powerful ones that have been widely used for time series forecasting. They have been shown to reproduce nonlinear time series behavior and provide very accurate predictions, mainly when dealing with nonlinear systems, where other tools fail. They are also a valuable forecasting tool for market agents because they can provide very reliable predictions when enough data are used for network training and the network structure is properly defined.
The aim of this paper is twofold. First, we use a neural network model (a multilayer perceptron or MLP) to forecast short-term carbon allowance prices (one day ahead), providing accurate and reliable predictions. Second, we show that only past values of the forecast variable are needed to provide these predictions. To do this, we also used data related to electricity and iron and steel prices to find out whether or not they improve forecasting accuracy when included in the forecasting model. The model aims to generate more reliable and simpler predictions than earlier models. Besides, we address a more interesting operating period than that used in earlier papers, as we use price data for years when allowances were no longer freely allocated. Again, as opposed to other models conducting medium-term analyses, our predictions refer to the short term, aiming to capture EUA behavior as a financial asset.
The remainder of the paper is organized as follows. Section 2 provides a description of the methodology, that is, the forecasting tool applied (artificial neural networks), and also a description of the dataset and performance indices used. Section 3 analyzes the numerical results and provides the discussion. Finally, Section 4 sets out the conclusions.

Methods and materials
As already mentioned, economic theory dictates that the market fundamentals should characterize allowance price (EUA) evolution. From another point of view, if the allowance market works properly, allowances may evolve, in the short term, like financial assets. This two-sidedness complicates EUA price prediction, which should, besides, be based on techniques suited for analyzing nonlinear relationships between variables. Thus, we use a neural network methodology, that is, a mathematical algorithm that mimics brain structure (Bishop, 1995). A neural network can approximate any system because it can learn the system behavior from a data set (a time series describing the time evolution of allowance prices in this paper) by adaptively modifying its internal parameters (weights) during a training stage prior to use in the task for which it was designed. It is a valuable forecasting tool for market agents because it can provide very reliable predictions when enough data are used to train the network and its structure is properly defined (Rather et al., 2015; Göçken et al., 2016; Ilie et al., 2016; Moghaddam et al., 2016; Qiu et al., 2016). Consequently, neural networks will be used in this paper to forecast future values of carbon allowances.
On the other hand, the influence of magnitudes related to energy or raw materials production on EUA prices needs to be studied in order to find out whether or not they have an impact on allowance price forecasting performance. Some researchers have pointed out that, apart from using past data of a particular variable (future carbon allowance prices in this case), other data that influence this variable may be included in the forecasting model (Pardo et al., 2002; Mirasgedis et al., 2006). Therefore, carbon allowance prices and energy costs (or any other variable relating to allowance prices) may be used to feed a neural network model to forecast future carbon allowance prices. In this way, it will be possible to find out how such energy variables influence forecasting accuracy.

The predictions provided by a neural network fed with only past data of carbon allowance prices should be compared with forecasts provided by others also fed with variables other than price. However, there is no guarantee that the inclusion of such variables in the forecasting model will improve the prediction accuracy.
Feedforward neural networks (FFNN) are the most widely used for time series forecasting. They are made up of a number of processing elements (neurons) arranged in a multilayer structure. They have the property that information flows from the input to the output of the network without feedback to neurons in the previous layers. They have been shown to be universal approximators (Hornik et al., 1989), that is to say, they are able to approximate any continuous function with only one hidden layer (between the input and output layers), provided that this layer contains enough neurons. One member of this class of neural networks is the multilayer perceptron (MLP), which is one of the most popular neural network models for time series forecasting due to its conceptual simplicity and adaptability to different types of forecasting problems. It is made up of an input layer, one or more hidden layers and an output layer. Nevertheless, only one hidden layer is usually defined, because, as mentioned above, this structure should be enough to approximate any system. The first layer is actually the set of network input data, and the hidden layer is the processing layer, where each neuron receives all the network inputs. The output layer is defined to provide a network response adapted to the values of the processed data rather than to provide another processing layer.
In an MLP, every neuron computes the weighted sum of all its inputs plus a bias constant. The result is processed by a function that provides the neuron output:

$$y_k = \varphi\left(\sum_{j} w_{jk}\, x_j + \theta_k\right)$$

where $x_j$ represents the jth input of the kth neuron, $w_{jk}$ stands for the strength (synaptic weight) of the connection between the kth neuron and the jth neuron in the previous layer, $y_k$ is the neuron output and $\theta_k$ signifies a bias constant. $\varphi(\cdot)$ represents the neural activation function that provides its output. This is usually a nonlinear function, enabling the neural network to learn the nonlinear behavior of complex systems. Bounded functions, such as Gaussian, hyperbolic tangent or sigmoid functions, are usually used, as a result of which the neural output behavior is bounded as in natural neurons. A linear function is also used in some models. In MLPs, nonlinear functions are usually used in the hidden layers, whereas a linear function is used in the output layer. This is because the aim of the output layer is to provide a response that fits the values of the processed data.
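The neuron computation described above can be sketched for a network with one hidden layer. This is only an illustration: the function name `mlp_forward`, the choice of the hyperbolic tangent as the hidden activation and all weight values are assumptions made for the example, not values taken from this paper.

```python
import numpy as np

def mlp_forward(x, W_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer MLP: tanh hidden neurons, one linear output neuron."""
    # Each hidden neuron k computes tanh(sum_j w_jk * x_j + bias_k).
    hidden = np.tanh(x @ W_hidden + b_hidden)
    # The output neuron is linear: weighted sum of hidden outputs plus a bias.
    return hidden @ w_out + b_out

# Made-up weights for 2 inputs and 3 hidden neurons (illustrative only).
x = np.array([0.5, -0.2])
W_hidden = np.array([[0.1, -0.3, 0.8],
                     [0.4, 0.2, -0.5]])
b_hidden = np.array([0.0, 0.1, -0.1])
w_out = np.array([1.0, -1.0, 0.5])
b_out = 0.2

y = mlp_forward(x, W_hidden, b_hidden, w_out, b_out)
```

The bounded hidden activation keeps each hidden output between -1 and 1, while the linear output neuron lets the network response take any value in the range of the data.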
A neural network's ability to approximate any dynamical system is provided by a learning process in which the network adjusts its inner parameters to define a model whose time evolution is close to that of the system that it is trying to approximate. Therefore, before a neural network is used to carry out any task, it must be trained by means of a process in which it learns the system behavior. Accordingly, available data must be split into two different sets: one to train the system and another to validate the model.
In an MLP, training data must be arranged into pairs of network inputs and desired outputs. Accordingly, every time the network is fed with an input pattern, it provides an output response that must be compared with the desired response to compute an error. This error is used to modify the neuron weights so as to minimize it. This is done by backpropagating the output error to the previous layers and adjusting the respective synaptic weights in a step-by-step process that is iteratively repeated until a predefined minimum error is reached. The algorithm performing this process is therefore known as backpropagation (Bishop, 1995). There are a number of numerical methods that can be used to implement this process. Our research applies the well-known Levenberg-Marquardt algorithm. This algorithm uses a two-stage procedure to optimize neuron weights. The first stage provides a rough approximation of the optimum weights, while the second stage outputs a closer approximation of this value. It is one of the most widely used methods to train MLPs.
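To make the idea of iterative weight adjustment concrete, the following is a minimal sketch of a Levenberg-Marquardt style update (a damped Gauss-Newton step) applied to a small MLP. It is not the implementation used in this paper: the function names, the finite-difference Jacobian, the rule of halving or doubling the damping factor, and the synthetic training data are all assumptions chosen to keep the example short.

```python
import numpy as np

def predict(params, X, n_hidden):
    """One-hidden-layer MLP (tanh hidden, linear output) from a flat weight vector."""
    n_in = X.shape[1]
    split = n_in * n_hidden
    W1 = params[:split].reshape(n_in, n_hidden)
    b1 = params[split:split + n_hidden]
    w2 = params[split + n_hidden:split + 2 * n_hidden]
    b2 = params[-1]
    return np.tanh(X @ W1 + b1) @ w2 + b2

def numerical_jacobian(params, X, n_hidden, eps=1e-6):
    """Finite-difference derivative of every network output w.r.t. every weight."""
    base = predict(params, X, n_hidden)
    J = np.empty((base.size, params.size))
    for k in range(params.size):
        shifted = params.copy()
        shifted[k] += eps
        J[:, k] = (predict(shifted, X, n_hidden) - base) / eps
    return J

def train_lm(X, y, n_hidden, n_iter=50, damping=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    n_params = X.shape[1] * n_hidden + 2 * n_hidden + 1
    params = rng.normal(scale=0.5, size=n_params)
    residual = predict(params, X, n_hidden) - y
    losses = [float(residual @ residual)]  # sum of squared errors
    for _ in range(n_iter):
        J = numerical_jacobian(params, X, n_hidden)
        # Damped step: solve (J^T J + damping * I) step = -J^T residual.
        A = J.T @ J + damping * np.eye(n_params)
        step = np.linalg.solve(A, -J.T @ residual)
        trial = predict(params + step, X, n_hidden) - y
        if trial @ trial < losses[-1]:
            # Error decreased: accept the step and reduce the damping.
            params, residual = params + step, trial
            damping = max(damping * 0.5, 1e-8)
        else:
            # Error did not decrease: keep the weights and damp more strongly.
            damping *= 2.0
        losses.append(float(residual @ residual))
    return params, losses

# Synthetic data (not EUA prices): 2 inputs, 3 hidden neurons.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]
params, losses = train_lm(X, y, n_hidden=3)
```

Because a step is accepted only when it lowers the squared error, the recorded `losses` never increase from one iteration to the next.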

Before training a neural network, its structure, that is to say, the number of layers and the number of neurons in each layer, must be defined. As mentioned above, a network with one hidden layer suffices to approximate any dynamical system. This is, therefore, the structure that was used in this paper. The number of values to be forecast determines the number of neurons in the output layer. As the aim of this research is to predict the one-day-ahead price of CO2 emission allowances, the output layer will have only one neuron. The number of network inputs (to the first layer of the network) must also be carefully specified in order to provide the network with enough data to accurately perform the forecasting process. Therefore, different numbers of input data should be tested to find out the best option: too many inputs will make the model unnecessarily complex, whereas too few inputs will prevent the model from capturing the system dynamics due to missing information. Note that prediction accuracy will not necessarily improve with a high number of network inputs, as was shown by the simulations carried out (results will be shown later). Indeed, a high number of inputs could degrade forecasting accuracy because unnecessary input data, especially if there are a lot, add noise to the information that the network is to process.
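The arrangement of a price series into network inputs and a single one-day-ahead output can be sketched as a sliding window. The helper name `make_supervised` and the price values are hypothetical and serve only to show the data layout.

```python
import numpy as np

def make_supervised(series, n_inputs):
    """Turn a 1-D series into (inputs, target) pairs for one-step-ahead forecasting."""
    series = np.asarray(series, dtype=float)
    # Row i holds series[i], ..., series[i + n_inputs - 1];
    # its target is the next value, series[i + n_inputs].
    X = np.array([series[i:i + n_inputs] for i in range(len(series) - n_inputs)])
    y = series[n_inputs:]
    return X, y

prices = [5.0, 5.2, 5.1, 5.4, 5.3]  # made-up values, not real EUA data
X, y = make_supervised(prices, n_inputs=2)
# X -> [[5.0, 5.2], [5.2, 5.1], [5.1, 5.4]]
# y -> [5.1, 5.4, 5.3]
```

Changing `n_inputs` is then all that is needed to test network structures with different numbers of inputs.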
It is essential to select the right number of neurons in the hidden layer on two grounds. First, a network with too few neurons will not be able to accurately reproduce the system dynamics. Second, a network with too many neurons will, at best, behave properly (at the expense of an excessive computing load), or, at worst, only learn the input patterns, in which case it will be unable to generalize the knowledge acquired to predict non-learned patterns (an effect known as overfitting). Therefore, different structures must be trained to then select the one that provides the most accurate predictions. Taking these issues into account, several network structures were tested to find out which one provided the most accurate predictions.
The time series used in this research contains 1834 daily future allowance prices of CO2 from 14 October 2009 to 24 October 2016 (Figure 2). Therefore, this time series includes EUA data from Phase III, located at the end of the series, which are the data subsequently forecast. This is a significant contribution of this research with respect to previous studies that used data from the early auctioning period only (Feng et al., 2011).
A question that often arises when forecasting time series is whether or not variables other than the predicted variable can affect its time evolution, and, hence, whether their inclusion in the forecasting model could improve prediction accuracy. Therefore, this paper also looks at the influence of other variables on the time evolution of future CO2 allowance prices. It has already been claimed that such an influence does exist. Indeed, several authors have pointed out the possible influence of energy and fuel prices on the evolution of CO2 allowance prices (Convery and Redmond, 2007; Paolella and Taschini, 2008; Alberola et al., 2009; Keppler and Mansanet-Bataller, 2010; Boersen and Scholtens, 2014; Hammoudeh et al., 2014; Hammoudeh et al., 2015). In Jaramillo and García (2019), the influence of energy and raw material prices on this variable was studied by analyzing their correlation. It was found that only electricity and iron and steel prices were correlated with the price of CO2 allowances. In that paper, an MLP provided twenty-day-ahead predictions. It was shown that the inclusion of electricity and iron and steel prices along with CO2 prices did not significantly improve forecasting accuracy. Only when the trend or fluctuations of one of the exogenous variables were included did the forecasting error undergo a slight decrease. That paper split the time series into two different series: one describing the price trend and the other, fluctuations around that trend. They were independently forecast so that the exogenous variable trends and fluctuations could be used as network inputs. The resulting decomposition led to a complex forecasting model, as all the time series had to be preprocessed before they were used.
In this paper, we use a simpler scheme where the time series is forecast directly. This scheme is, as shown below, just as accurate as the complex model. We test the influence of the use of exogenous variables on forecasting accuracy. Taking into account the results reported in Jaramillo and García (2019), only the electricity and iron and steel price time series are considered here. The prices of these three variables were taken from the Thomson Reuters Datastream and Bloomberg Databases. The forecasting accuracy is measured with four error indices, which are used as figures of merit to find the best forecasting structure and to show how good the predictions are. They are: mean absolute percentage error (MAPE), mean absolute error (MAE), mean squared error (MSE) and root mean squared error (RMSE):

$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{A_i - F_i}{A_i}\right|$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|A_i - F_i\right|$$

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(A_i - F_i\right)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(A_i - F_i\right)^2}$$

where $A_i$ and $F_i$ are the actual and forecasted data and $N$ their number. These indices are widely used in the literature to analyze accuracy in time series forecasting. Although they all provide information about how good predictions are and can compare different tools forecasting a specified data set, only MAPE is able to compare the accuracy of a specified tool when applied to different time series with different value ranges in their data, as this index provides a scale-independent percentage error. Its only drawback is that it cannot be used for time series with zero or near-zero values, because its calculation would then involve a division by zero or by a very small number.
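The four error indices can be computed in a few lines under their usual definitions, with MAPE expressed as a percentage. The function name `error_indices` and the sample values are assumptions made for illustration.

```python
import numpy as np

def error_indices(actual, forecast):
    """Return MAPE (in %), MAE, MSE and RMSE for paired actual/forecast values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    errors = actual - forecast
    mse = np.mean(errors ** 2)
    return {
        # MAPE divides by the actual values, so they must not be zero.
        "MAPE": 100.0 * np.mean(np.abs(errors / actual)),
        "MAE": np.mean(np.abs(errors)),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
    }

# Made-up example: errors of 1 and -2 on actual values of 10 and 20.
scores = error_indices([10.0, 20.0], [9.0, 22.0])
```

In this example both forecasts miss by 10% of the actual value, so MAPE is 10, whereas MAE, MSE and RMSE depend on the scale of the data.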

Results and discussion
We selected a multilayer perceptron (MLP) with one hidden layer to carry out the forecasting process in this paper, because, as stated above, it has been proven to be a powerful and reliable forecasting tool to predict time series and one hidden layer is enough to guarantee that the network can efficiently approximate the time series behavior. Several neural structures were tested in order to find out which performed best: we considered different numbers of inputs and different numbers of neurons in the hidden layer. The training process was carried out using the first 60% of the available data, whereas the remaining 40% was used to test the network performance. Each network provided a single output: the CO2 allowance price on the day following the prices used as network inputs.
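A chronological 60/40 division of the kind described above can be sketched as follows. The helper name `chronological_split` and the rounding rule are assumptions; the point is that the order of the observations is preserved, so the network is always tested on data that come after the training period.

```python
def chronological_split(samples, train_fraction=0.6):
    """Split an ordered sequence into an earlier training part and a later test part."""
    cut = int(round(len(samples) * train_fraction))
    return samples[:cut], samples[cut:]

# Ten placeholder samples, ordered in time.
train, test = chronological_split(list(range(10)))
# train -> [0, 1, 2, 3, 4, 5]
# test  -> [6, 7, 8, 9]
```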
The best performance was provided by a network with two inputs and three neurons in the hidden layer (Figure 3, Table 2). Other structures with similar numbers of inputs and neurons in the hidden layer provided slightly higher errors. Therefore, we can conclude that, although the best structure has two inputs and three hidden neurons, other neural networks with low values for these two parameters may also provide reliable predictions.

Structures with higher values of both parameters provided worse results. This is hardly surprising bearing in mind that adding more elements (both inputs and neurons) to a network structure capable of learning the system behavior with a small number of neurons and inputs could actually degrade, rather than improve, network performance (as pointed out above). In fact, the extra input data and neurons may increase global error as a result of the accumulation of the individual errors that each element adds to the network response. The resulting errors can be regarded as very low, given that the evolution of future carbon allowance prices is governed by both objective circumstances, for which a deterministic model can account, and subjective conditions (for instance, political decisions), which are unpredictable and could disturb the inherent behavior of the system. Therefore, the neural network can be said to be able to capture the time evolution of the future allowance prices and, hence, provide a reliable prediction of its short-term evolution.
These results were obtained using only 1434 data points. The first 400 were discarded because they were very high and could have degraded network forecasting accuracy, since they would have been learned along with the lower values that appear at the end of the training interval. This could worsen the accuracy of predicting low-value data (contained in the validation set). In order to check the validity of this assumption, several data sets with different numbers of data points were tested. The best results were achieved with the aforementioned data set. Simulations with the whole data set slightly worsened the forecasting accuracy. On the other hand, data sets with fewer elements led to higher errors. This is hardly surprising, since, in this case, the supplied data were insufficient for the network to be able to efficiently learn system behavior. Different sizes of the training and validation data sets were also tested, and the best results were achieved with 60% used for training and 40% for validation. This is the division used for all the results presented here.

Having shown that a neural network is able to provide reliable predictions of future carbon allowance prices by taking into account only past values of this variable, we can test the effect of adding exogenous variables (that is to say, variables other than the predicted variable) to the forecasting model. The aim is to find out whether or not they can improve the performance of the model with only one variable. As discussed above, only electricity and iron and steel price indices are correlated with future carbon allowance prices. Therefore, only they were used along with carbon prices as network inputs. The network providing the best performance (the network with two inputs and three neurons in the hidden layer) was used to find out whether or not adding these variables improved network performance. For the sake of simplicity, the same number of future carbon allowance and exogenous variable prices were used as network inputs (two). Three options were tested: adding only one exogenous variable (two cases) and adding them both. The results are shown in Table 3, which illustrates that adding these exogenous variables does not improve forecasting accuracy. The networks including these variables provide more or less the same accuracies as the network using only past data on future allowance prices. Therefore, we can conclude that there is no point in including exogenous variables in the model because doing so does not lead to any improvement in the forecasting accuracy and merely complicates the network structure.
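One way to feed exogenous variables to the network alongside past carbon prices is to concatenate the same number of lagged values from each series into a single input vector. The sketch below assumes this layout; the helper name `make_supervised_with_exog` and all price values are hypothetical.

```python
import numpy as np

def make_supervised_with_exog(target, exog_series, n_lags):
    """Build inputs from n_lags past values of the target and of each exogenous series."""
    target = np.asarray(target, dtype=float)
    columns = [target] + [np.asarray(s, dtype=float) for s in exog_series]
    rows = []
    for i in range(len(target) - n_lags):
        # One input vector: n_lags target values followed by n_lags values per exogenous series.
        rows.append(np.concatenate([c[i:i + n_lags] for c in columns]))
    # The output is still only the next value of the target series.
    return np.array(rows), target[n_lags:]

carbon = [5.0, 5.2, 5.1, 5.4]           # made-up values
electricity = [40.0, 41.0, 39.5, 42.0]  # made-up values
X, y = make_supervised_with_exog(carbon, [electricity], n_lags=2)
# X -> [[5.0, 5.2, 40.0, 41.0], [5.2, 5.1, 41.0, 39.5]]
# y -> [5.1, 5.4]
```

With two lags and one exogenous series the network has four inputs instead of two, which shows how each added variable enlarges the structure.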
This could, initially, be surprising because it is logical to think that adding variables correlated with the predicted variable to the model should provide additional information to help the neural structure provide more accurate predictions. Nevertheless, a more detailed analysis shows that the observed results are consistent: the correlation between the above three variables means that they evolve in the same way. Therefore, we can assume that, as previously stated in Section 2, the same market decisions drive their values (Pardo et al., 2002; Mirasgedis et al., 2006). In other words, they evolve together following the same market rules, but they do not in fact influence each other. This explains why adding the exogenous variables did not improve the predictions.
We believe that the explanation necessarily depends on the time range of the price predictions and the ability of companies to make decisions that affect their production in the medium or long term. Jaramillo and García (2019) found that allowance prices from EU ETS auctioning phases are related to pollutant sector information in the midterm (the time horizon that is expected to affect company production or investment decisions). Nevertheless, companies do not have the chance to adapt production to a change in fuel prices or to a new technical situation from day to day. In the short term, then, agents are bound to play the role of financial investors. Agents will decide whether to sell or buy allowances bearing in mind asset profitability exclusively. In this situation, rational expectations will lead EUA prices to follow a random path. If this is the case, perhaps the best specification will be to use only recent past data to forecast future data, as additional variables will not improve the model's predictive power. This result has been broadly studied in the efficient market hypothesis for the financial market (Lo and MacKinlay, 2002), but the intuitive idea was also used in consumption theory (Hall, 1978). In a simple approach, an optimal temporal distribution of allowances can be expected. Market decisions are driven by rational agents looking for an optimal temporal EUA distribution in order to maximize profits. The corresponding optimality condition is written in terms of $P_t^{EUA}$, the carbon price at time $t$; $\pi'(\cdot)$, the firm's marginal profit function; and $E_t$, the conditional mathematical expectation at time $t$ ($I_t$). This result has major policy implications. In the short term, only unknown information or unexpected policy can affect company behavior, while, in the long term, companies could change their production decisions, in which case changes in fuel prices or stock returns could affect CO2 prices.
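If prices do follow a random path, the natural point of reference is a naive forecast that simply carries today's price forward to tomorrow. The sketch below shows how such a baseline could be scored with MAPE; the function name `persistence_forecast` and the prices are made up, and this baseline is an illustration rather than one of the experiments reported here.

```python
def persistence_forecast(prices):
    """Naive one-day-ahead forecast: the price on day t+1 is predicted to equal the price on day t."""
    return prices[:-1]

prices = [5.0, 5.2, 5.1, 5.4]  # made-up values, not real EUA data
forecasts = persistence_forecast(prices)
actuals = prices[1:]  # the values that the forecasts are compared against

# MAPE (in %) of the naive forecast over the three forecasted days.
mape = 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)) / len(actuals)
```

A forecasting model that uses only recent past prices can be compared against this figure to see how much it adds beyond repeating the last observation.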

Conclusions
In this paper a type of neural network known as multilayer perceptron has been used to forecast carbon allowance prices. It was able to reproduce their nonlinear behavior, providing accurate predictions. It is very difficult, if not impossible, to define an accurate functional form for this time series. Therefore, the approximation provided by the neural network is very useful, as it gives reliable predictions of future values. It was able to successfully capture the nonlinear behavior of the time evolution of carbon allowance prices and provide a one-day ahead prediction, taking into account only past data of the actual time series. The neural model works like a black box that processes data but does not provide a mathematical model of time series behavior. Accordingly, it is not, unfortunately, able to provide a description of the variables driving the system evolution, which could facilitate its understanding and provide information that could be used by agents in the EUA market to make their decisions. Nevertheless, as it is intractable to define a deterministic mathematical model characterizing the carbon allowance price evolution, the neural network approach reported in this paper can be regarded as the best option for predicting EUA price evolution.
Unlike other empirical studies (Feng et al., 2011), this paper uses data from the second and third EU ETS phases, when allocation by auctioning increased. This is a significant difference from previous forecasting models, which provided predictions from Phase I driven by grandfathering allocation. Besides, we provide predictions for a long data set and show the robustness and accuracy of the forecasting model. Compared with previous forecasting systems, our model provides a simpler analytic tool for companies to manage environmental decisions related to the EUA market. Its accuracy can help companies to manage the cost of carbon emissions.
It has also been proved that adding energy variables closely related to carbon allowance prices (electricity and iron and steel prices) to the forecasting model did not improve prediction accuracy, despite the fact that those variables had been found to be highly correlated with carbon allowance prices. This result is hardly surprising since the correlations indicate that these three variables evolve in the same way, but not that they influence each other. Thus, they can be said to be driven by the same market decisions that affect them all in the same way. Therefore, past carbon allowance prices alone suffice to forecast future ones.
Hence, EUA prices operate in the short term like a kind of financial asset, following a random-walk-like process. This result contradicts the findings of Feng et al. (2011), who used data from the first phase of the market, suggesting that, once allowances began to be allocated by auctioning, market operation changed. In contrast to Jaramillo and García (2019), the model proposed in this paper provides short-term forecasting prices, that is, it considers the allowances as financial assets, whose time evolution is not affected by the inclusion of additional variables apart from allowance price. Price evolution is compatible with an optimal temporal distribution of allowances. Market decisions are made by rational agents looking for an optimal temporal EUA distribution in order to achieve profit maximization. Carbon markets may work properly by taking into account only their time evolution, as, in the short run, only unknown information or unexpected policy could affect company behavior.
Furthermore, the EUA market provides new business development opportunities for financial intermediaries, since, in the short term, companies are unable to adapt production to changes in the environment (fuel prices, technical innovation…) or even to changes in allowance prices. Current, as well as predicted, EUA prices are critical for financial market brokers and traders. Our model provides accurate and flexible estimations to improve knowledge of short-term price evolution and can, therefore, provide support for decision making.
According to the Porter effect, environmental policies that stimulate green innovation may lead to positive innovation-related outcomes and affect company competitiveness (Lundgren and Zhou, 2017). Thus, a more stringent emission trading system has the potential to stimulate company innovation. Consequently, it is necessary to continue researching the impact of the pollutant sectors on allowance prices, building on Jaramillo and García (2019), as it is directly related to EU ETS effectiveness in the context of EU environmental policy.