DISTRICT HEATING NETWORKS: ENHANCEMENT OF THE EFFICIENCY

For decades, the advantages of district heating (DH), namely more cost-efficient heat generation and reduced air pollution, outweighed the additional costs of transmitting and distributing centrally produced thermal energy to consumers. The rapid increase in the efficiency of low-power heaters and the development of separated low heat density areas in cities reduce the competitiveness of large centralized DH systems in comparison with distributed cluster-size networks and even local heating. Reducing transmission costs and enhancing network efficiency by optimizing the design of DH networks have become critical issues. This study develops a methodology for determining the key drivers of the cost-efficiency of DH networks in order to implement the most efficient (cost-minimal) thermal energy transmission. Inductive benchmarking modelling was applied: the general causal regularity is derived from observations of specific cases, thus determining the relationships between the network's design and thermal indicators as predictors and transmission costs as the target variable. Two key drivers of network efficiency were identified: the network length and the largest inner diameter of the pipes. The methodology can be used by municipalities and heat providers for heating planning of new housing developments as well as for renovation and/or expansion of existing DH networks.


Introduction
For decades, the advantages of large district heating (DH) systems were accepted almost as an axiom for densely populated (urban) areas. Even at the start of the new millennium, the recognized development trend was the replacement of isolated small DH systems, each with adequate heat sources, by a powerful central heat station and a main transit line carrying large volumes of heat over long distances (e.g., King, & Shaw, 2010). The advantages were based on cost-efficient heat generation (due to a higher efficiency factor and cheaper fuel as a result of the effect of scale) as well as on reduced air pollution and carbon emissions (due to a small number of controlled sources and extensive use of renewable energy). These benefits outweighed the additional costs of transmission and distribution (hereinafter: transmission) of the centrally produced thermal energy to consumers.
Recent studies (e.g., Delangle, Lambert, Shah, Acha, & Markides, 2017; Song, Li, & Wallin, 2017) and analyses (e.g., DEA, 2016a; DEA, 2016b), however, show the decreasing competitiveness of large DH systems in the light of new technologies. There has been a rapid increase in the efficiency of low-power heaters in the last decade. DH systems using natural gas are losing their advantage of scale; the effect is only partly retained if biomass is used in both central and local plants. Modelling results (Fakhri, Ahlgren, & Ekvall, 2016) also show that even with a very large difference in heat costs, already at a very small network length (around 3 km), the urban-size centralized network becomes economically disadvantageous in comparison with distributed cluster-size local networks.
In addition, modern cluster-based urban planning principles (agglomeration, metropolitan area, periurbanisation, etc.) and heat-efficient buildings create low energy density areas that significantly increase specific transmission costs. Moreover, advanced green heat generation technologies (e.g., solar thermal, ground-source heat pumps, geothermal, waste) are well suited for lower thermal power applications; they are becoming acceptable for use even in densely populated areas. All this reduces the advantages of large centralized DH systems; in fact, only the difference in fuel prices due to large-scale purchases remains noteworthy (and such purchases can often be implemented in a distributed system too).
The rapid development of heat generation technologies directly increases the importance of the design of DH networks. The network-related costs are a significant component of total heat supply costs: they account for up to 40% of the total heating tariff, including distribution losses (10.4% of produced heat in EU28 and 12.4% in Latvia in 2016; Eurostat, 2018). Reducing transmission costs and enhancing the cost-efficiency of the network become the decisive factors for the competitiveness of DH in the heating sector.
At the same time, DH is currently well developed in the Nordic and Eastern European countries as well as in the largest densely populated cities of EU countries with a mild climate (e.g., ETSAP, 2013; Werner, 2017). Planning and design of new and renovated networks are long-term projects (30-40 years). Evolving over decades, the DH systems have accumulated large investments; therefore, a rapid dismantling of the DH sector is hard to imagine, as it would mean colossal sunk investments.
As in other network industries, heating network operators are natural monopolists in specific areas. There is therefore neither direct competition nor a commercially open market, and operators have little motivation to enhance the efficiency of the network and the business. For this reason, the DH transmission segment and its pricing structure are fully regulated. In practice, the consumers' choice is limited by administrative and/or fiscal instruments, which is not a cost-based approach. Nevertheless, consumers would opt for a local heat source, especially in the case of a new building connection. Their benefit (value-in-behaviour) will be based on rationality and economic self-interest (Butler, Gordon, Roggeveen, Waitt, & Cooper, 2016). Consequently, economic values are becoming the core message in DH marketing (Smaliukiene, 2019); the reduced environmental effect alone would be an inadequate motivator for choosing DH.
The European Commission has established a single, detailed and strict legal framework for the electricity and gas sectors, which has been developed over several decades. DH, in contrast, is very fragmented even within one country and is a distinctly sub-regional issue; there is a lack of a substantial EU-level regulatory framework devoted to it. The development and functioning of DH has traditionally been under the responsibility and control of municipalities. It is the local government (mostly the city municipality) that is the planner, the normative and financial facilitator and the supervisor of modern heating networks (Saeima, 1994; UNEP, 2015); it is a supplier and a large consumer of thermal energy at the same time. In Latvia, as in Estonia and Lithuania, the DH sector is dominated by companies owned by local governments, although there are also some private equity companies. A similar situation can be seen in the Nordic countries, where municipal enterprises continue to be relatively strong (e.g., Aberg, Falting, & Forssell, 2016).
There is also government support for further deployment of heat networks on the national scale and political willingness on the European and global scale (e.g., UNEP, 2015; UN, 2017; Euroheat & Power, 2019), taking into account national interests (e.g., energy policy, social policy, climate targets) and recognizing national benefits (modern district energy systems as a key solution to integrate energy efficiency and renewables in cities).
Various factors that increase the efficiency of DH systems have driven the further strengthening of their competitiveness over many years. The rapidly growing amount of academic research also shows the topicality of DH-related issues (e.g., for the search term "district heating", the ScienceDirect database returned 13 times more items published in 2017 than in 2008, while the total number of annual records increased by only 46% over this period). The research has been focused mostly on various aspects of heat generation (CHP, renewables, thermal storage, heat pumps, etc.) as well as on regulatory issues (e.g., Sarma, & Bazbauers, 2017); among network issues, the heat transmission temperature is the only practical one addressed, alongside hydrodynamic issues as a more theoretical topic (see, e.g., Sayegh, Danielewicz, Nannou, Miniewicz, Jadviszczak, Piekarska, & Jouhara, 2017). Current studies are linked to the development of the fourth generation DH (e.g., Lund, Werner, Wiltshire, Svendsen, Thorsen, Hvelplund, & Mathiesen, 2014).
Each case and each DH network is unique in terms of the amount of heat transported, the operating area, the consumers and other factors. The aim of the current research is to develop a methodology for determining the key drivers of the cost-efficiency of DH networks in order to implement the most efficient (cost-minimal) thermal energy transmission. The methodology will be applicable to heating planning of new housing developments as well as to renovation and/or expansion of existing DH networks, to increasing the degree of heating centralisation by merging smaller and remote DH systems into a larger one, or, on the contrary, to fragmenting or even dismantling existing DH systems. The methodology was developed by analysing various performance aspects and creating the general causative regularity using advanced benchmarking modelling procedures on the basis of practical cases (Latvian DH utilities).

Basic principles of the methodology
A direct theoretical calculation of transmission costs is an extremely difficult and even practically unachievable task. There are several multivariate analytic methods that are appropriate in principle for studying cause and effect relationships between the input (network parameters) and output (costs) variables without investigating the internal aspects of the network (a black box principle). Intelligent data mining methods are among them; they are well suited to discovering existing regularities in the network's data set, which particularly applies to benchmarking data mining algorithms.
On the other hand, mathematical modelling procedures are generally accepted instruments for predicting scenarios in a wide variety of fields. To achieve the set goal, we applied inductive benchmarking modelling, using the declared design, thermal and financial indicators of networks that function in homogeneous conditions (in a particular geographical area with similar environment, climate and business factors). Benchmarking modelling means the search for a mathematical model that is based on observations of specific cases (e.g., Moriarty, 2011), thus determining the relationships between the network's design and thermal indicators as independent variables (predictors) and some cost indicator as the dependent (target) variable; a reduction of the latter should reflect efficiency progress in the corresponding network. Because the impact of external factors is analogous across networks, benchmarking provides a comparative assessment of transmission costs; in this way we disclose the impact of network indicators on costs and extract the most significant ones (key drivers), reducing a large number of predictors.
The authors have already successfully used several benchmarking modelling tools for cost assessment in a related industry, water supply (Zuters, Valainis, Karnitis, G., & Karnitis, E., 2016; Karnitis E., Karnitis G., Zuters, & Bobinaite, 2017). The results obtained have shown the suitability of the method for solving similar tasks.
The necessary performance of the network k is determined (fig. 1a) by the requested heat capacity Ph(k) and the thermal energy Qcon(k) that is actually supplied to consumers through N(k) connection points. Due to the losses, a higher amount of heat Qpr(k) has to be produced and delivered into the network; the price of Qpr(k) is determined by the regulated heat production tariff Tpr(k). The network consists of pipes of different sizes (the inner diameter varies from Dmin(k) to Dmax(k); Dmed(k) is the most frequently used size), and their total length is L(k). The total costs Ctr(k) for heat transmission include all capital expenditures and operational expenses for the development, maintenance and functioning of the network; a reduction of these costs directly characterizes an enhancement of the network's efficiency. The regulated transmission tariff Ttr(k) could be determined as Ttr(k) = Ctr(k) / Qcon(k); in our case it does not provide any new information. All these indicators are clearly and unambiguously defined; they are quantitatively measurable and controllable, which is important for the quality of modelling.
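The indicator set described above can be sketched as a small record type. This is only an illustration: the field names follow the symbols in the text, while the units and the sample values are hypothetical and do not come from the declared data of any utility.

```python
from dataclasses import dataclass


@dataclass
class Network:
    """Declared indicators of one DH network k (symbols as in the text)."""
    Ph: float    # requested heat capacity
    Qcon: float  # heat actually supplied to consumers
    Qpr: float   # heat produced and delivered into the network
    N: int       # number of connection points
    L: float     # total pipe length
    Dmin: float  # smallest inner pipe diameter
    Dmax: float  # largest inner pipe diameter
    Dmed: float  # most frequently used inner pipe diameter
    Tpr: float   # regulated heat production tariff
    Ctr: float   # total transmission costs

    def transmission_tariff(self) -> float:
        """Ttr(k) = Ctr(k) / Qcon(k), as defined in the text."""
        return self.Ctr / self.Qcon

    def loss_share(self) -> float:
        """Share of produced heat that does not reach consumers."""
        return (self.Qpr - self.Qcon) / self.Qpr


# Hypothetical sample values, for illustration only.
example = Network(Ph=5.0, Qcon=8800.0, Qpr=10000.0, N=120, L=6.5,
                  Dmin=32.0, Dmax=300.0, Dmed=80.0, Tpr=40.0, Ctr=132000.0)
```

With these sample values the derived transmission tariff is 15.0 cost units per unit of heat and the loss share is 12%, which shows why Ttr adds no information beyond Ctr and Qcon.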

For the development of the benchmark model, the DH network k is formalized as a multiple-input single-output determined search space (fig. 1b); its input data set of predictors Π(k) defines some output (dependent) cost variable C(k). The creation of the model then means the search for the multi-functional regularity (1); the internal structure and operation of the network are irrelevant for performing this task.

C(k) = fk(Π(k)) (1)
There are u different functions f1, f2, …, fk, …, fu for u networks; they form the factual basis for the modelling, i.e., a set of specific cases that can be used to search for the general regularity. Benchmark modelling then means an inductive process: the search for the general modelled regularity C(m) = f(Π) on the basis of u specific cases C(k) = fk(Π(k)) by navigation in the multidimensional search space.
The total transmission costs Ctr consist of a number of components: heat loss costs, labour costs, depreciation costs, payments for electricity and chemicals, etc. An analysis of the structure of the declared basic cost components has been made; it identified a great variety in these components. There are several objective (e.g., various business models, wide but uneven investments) and subjective (e.g., accuracy and uniformity of the data on cost components) reasons for this. The Ctr data can be regarded as more accurate and reliable than the separate cost components, since the latter depend strongly on the cost allocation. In addition, it is the total costs that serve as the efficiency criterion as well as the basic regulatory focus, being the determinant of the transmission tariff Ttr. Therefore, Ctr was chosen as the dependent variable:
C(k) = Ctr(k)
The other mentioned design and performance indicators of the network form the set of independent input indicators Π(k):
Π(k) = {Ph(k), Qpr(k), Qcon(k), N(k), L(k), Tpr(k), Dmin(k), Dmax(k), Dmed(k)}
It can be expected that no achievable model will be completely adequate to all real networks. The leading motive for practical purposes is to create the function f with the best possible quality. To achieve the maximum conformity of the found regularity, the correlation of the modelled values with the corresponding declared costs (correl(Ctr(m): Ctr)) is used as the quality criterion during the creation of the model. We have also applied the Average Normalized Squared Deviation from the Mean value (ANSDM) to assess the quality of the models; the normalization was used because of the large range of network sizes.
The number of input data sets u, and consequently of output data, is finite (in our case, data on 23 DH networks were available). Nevertheless, in order to be able to evaluate new and/or modified undertakings, it is necessary to create a regularity that is continuous and monotonous for any predictor (the economic logic does not indicate any reason for an extremum) against any Π(k) in the determined ranges of input data.
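The two quality measures named above can be sketched as follows. The correlation is the usual Pearson coefficient. The exact ANSDM formula is not reproduced in the text, so the function `normalized_squared_deviation` is only one assumed reading (squared deviation of modelled from declared costs, normalized by the declared costs and averaged over networks); it should not be taken as the authors' definition.

```python
import math


def pearson(actual, modelled):
    """Pearson correlation between declared and modelled costs."""
    n = len(actual)
    ma = sum(actual) / n
    mm = sum(modelled) / n
    sam = sum((a - ma) * (m - mm) for a, m in zip(actual, modelled))
    saa = sum((a - ma) ** 2 for a in actual)
    smm = sum((m - mm) ** 2 for m in modelled)
    return sam / math.sqrt(saa * smm)


def normalized_squared_deviation(actual, modelled):
    """ASSUMED reading of a normalized squared deviation: the mean of
    ((actual - modelled) / actual)^2 over all networks. Normalizing by the
    declared cost keeps small and large networks on a comparable scale."""
    terms = [((a - m) / a) ** 2 for a, m in zip(actual, modelled)]
    return sum(terms) / len(terms)
```

The normalization matters because a fixed absolute error is negligible for a large network but substantial for a small one, which a plain correlation over a wide cost range tends to hide.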

First stage of modelling: determination of key drivers
Regression analysis, a method at the crossroads of data mining and modelling, was chosen as the preferred statistical modelling tool, since it is directly focused on the relationships between the dependent variable and several predictors. For modelling we chose the well-developed, powerful and at the same time user-friendly R statistics environment, which provides flexible modelling algorithms; the modelling was started with the simpler linear regression algorithm. The post-modelling analysis of residuals shows whether it is worthwhile to continue the modelling with more complicated nonlinear regression algorithms in order to obtain a stronger cause-effect relationship.
For the linear modelling we used the built-in linear regression model, implemented in R as the function lm. The target was to detail the general regularity (2) and to create the benchmarking model as the linear mathematical expression:
Ctr(lin) = α + Σ βi * Πi (3)
where: α is a constant; βi is the estimated optimal weight of the corresponding predictor Πi.
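For readers without R at hand, the least-squares fit behind a linear expression of this form can be sketched in plain Python via the normal equations. This is a minimal illustration and not the authors' code; the helper names `solve` and `fit_linear` and the sample data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_linear(rows, y):
    """Least-squares fit of y = alpha + sum(beta_i * x_i).

    rows: one tuple of predictor values per network.
    Returns [alpha, beta_1, ..., beta_p].
    """
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 carries the constant
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


# Hypothetical data: two predictors per network (e.g. L and Dmax).
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
costs = [10.0 + 2.0 * a + 3.0 * b for a, b in rows]
coefficients = fit_linear(rows, costs)
```

Because the sample costs are generated exactly from α = 10, β1 = 2 and β2 = 3, the fit recovers these values up to rounding; with real declared data the residuals are what the later diagnostics examine.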
Outliers (data sets that lie too far from the central data cluster) were identified before the modelling by means of a standard outlier detection procedure; outliers may be of a different nature and therefore cannot be used to create the general regularity. After the outliers were rejected from the total data set, the data samples of the remaining 20 DH networks were used for the modelling (u = 20).
Checking the level of cross-independence of the indicators was another preparatory activity; the use of strongly correlated predictors would lead to the so-called multicollinearity problem, i.e., the possibility of obtaining unreliable results because of interdependent predictors. To examine the potential problem, a typical tool was used: the Variance Inflation Factor (VIF), which characterises the mutual correlation between any pair of predictors. A strict reliability criterion based on VIF = 1 / (1 - R²) was chosen, where R² is the coefficient of determination.
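The pairwise check described above can be sketched as follows, using the standard definition VIF = 1 / (1 - R²) with R² taken as the squared correlation between two predictors. The critical threshold applied in the study is not stated in the text, so `threshold` below is a caller-supplied, hypothetical parameter.

```python
def r_squared(x, y):
    """Coefficient of determination for a pair of predictors
    (squared Pearson correlation)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def pairwise_vif(x, y):
    """VIF = 1 / (1 - R^2); grows without bound as the pair becomes collinear."""
    return 1.0 / (1.0 - r_squared(x, y))


def correlated_pairs(predictors, threshold):
    """Return the names of predictor pairs whose VIF exceeds the threshold.

    predictors: dict mapping a predictor name to its list of values.
    threshold: hypothetical cut-off chosen by the analyst.
    """
    names = sorted(predictors)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if pairwise_vif(predictors[a], predictors[b]) > threshold]
```

For example, two predictors with R² = 0.64 give a VIF of about 2.78, while R² = 0.9 already gives 10.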
The VIFs were calculated for the full matrix of input indicators; the mutual correlations of five variables (Ph, Qpr, Qcon, L, N) were found to be well above the critical threshold. Consequently, only one of the five strongly correlated indicators can be used for modelling; the others should be excluded from the data sets. At the same time, this confirms that the structure and specific load of all networks are similar, which significantly increases trust in the results of the benchmarking modelling.
To define the most preferable predictor among the five strongly interconnected indicators and thus avoid the potential multicollinearity, we carried out preliminary modelling with the same linear regression algorithm, using all nine independent variables as predictors. The regularity (3) in this case becomes:
Ctr(lin9) = α + β1 * Tpr + β2 * L + β3 * Dmax + β4 * Dmin + β5 * Dmed + β6 * Ph + β7 * Qpr + β8 * Qcon + β9 * N (4)
Although the correlations between any of the five interconnected predictors and Ctr are very strong (> 0.8), their impact on Ctr(lin9) is different (their mutual correlation is one of the basic reasons). An examination of the p-values shows that the network length L is the most significant predictor among the five mutually correlated ones (tab. 1); there is only a 3.8% probability that the impact of L on Ctr(lin9) is random. The chances that the impact of the other indicators is accidental are incomparably higher (27% to 85%). Therefore, the network length L was chosen as the predictor for the basic modelling together with Tpr, Dmax, Dmin and Dmed.
The modelling process itself was started by creating the linear 5-predictor model in the form:
Ctr(lin5) = α + β1 * Tpr + β2 * L + β3 * Dmax + β4 * Dmin + β5 * Dmed (5)
The obtained correlation is extremely strong (tab. 2, columns Ctr(lin5)); the p-value for the model is very small. At the same time, the excellent correlation is partly misleading because of the small number of data sets in the 30-fold range of the actual Ctr values; this is confirmed by the quite large ANSDM (7.36%) of the actual costs Ctr from those of the modelled regularity Ctr(lin5).

The final result of the linear modelling (tab. 2, columns Ctr(lin2)) shows that the correlation parameters are virtually unchanged in comparison with Ctr(lin5); the extremely small p-value confirms the validity of the choice of predictors. The obtained excellent correlation (fig. 2a) confirms that L and Dmax are the key drivers of transmission costs, i.e., of the efficiency of the network; the three abandoned indicators do not have a significant impact on the searched regularity. The results are consistent with the conclusion of the techno-economic model developed in (Pusat, & Erdem, 2014): the transmission costs increase with the area size, i.e., with the length of the network. On the other hand, the costs of laying pipes as well as the total cost of piping per meter (ETSAP, 2013; Delangle, Lambert, Shah, Acha, & Markides, 2017) depend first of all on the pipe diameter, which explains the significance of the largest diameter.
Nevertheless, the R diagnostic plots point to an incomplete compliance of the actual data with the linear model Ctr(lin2). Plot 2b shows that the residuals are not equally spread around a horizontal line (especially at small fitted values). This indicates that the linear model does not fully capture the existing non-linear relationship between the predictors and the target variable. Plot 2c identifies a data point that lies very near the threshold of the so-called Cook's distance. This point is not an outlier; nevertheless, it can become influential on the general regularity. In addition, the ANSDM even increased; its reduction is necessary for a more accurate evaluation of the impact of the key drivers on the network's efficiency. In general, the model needs to be developed further to assess the impact of the key drivers on the efficiency of the transmission network.
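The influence measure behind such a diagnostic plot can be sketched for the simplest case. The function below computes Cook's distance for a regression with a single predictor, where the leverage has a closed form; this is a simplification for illustration (the model in the text has two predictors), and the sample data are hypothetical.

```python
def cooks_distances(x, y):
    """Cook's distance of every observation for the fit y = a + b * x.

    D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2, with p = 2 fitted
    parameters, s^2 the residual variance and h_i the leverage.
    """
    n = len(x)
    p = 2
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)
    leverage = [1.0 / n + (a - mx) ** 2 / sxx for a in x]
    return [e * e / (p * s2) * h / (1.0 - h) ** 2
            for e, h in zip(residuals, leverage)]


# Hypothetical data: the last point lies far from the others along x.
distances = cooks_distances([1.0, 2.0, 3.0, 4.0, 10.0],
                            [1.1, 1.9, 3.2, 3.9, 15.0])
```

In this sample the isolated last point has by far the largest distance even though its residual is moderate, which is the situation described above: a point that is not an outlier but can still dominate the fitted regularity.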

The second stage of modelling: evaluation of the impact of the key drivers
Using the traditional least squares method, it was found that a nonlinear trend line in the scatter plot of the created linear model (fig. 2a) provides a slightly higher coefficient of determination R² than the standard linear one. This confirms the need to add some nonlinearity to the model. To find the correct way, we checked the scatter plots (Ctr vs L) and (Ctr vs Dmax) (fig. 3). The impact of each individual predictor on the overall result is, of course, different from the individual correlation (e.g., due to some mutual impact of the predictors). Nevertheless, the qualitative differences between the two scatter plots provide some comparative indication. The optimum trend line on the scatter plot (fig. 3a) is linear, while the scatter plot (fig. 3b) shows a moderate non-linearity and indicates that the inclusion of a power function is preferable.

Two mutually unrelated modelling procedures have been applied to reduce the mentioned inconsistencies, to increase the agreement of the searched regularity with the declared transmission costs, and to ensure a cross-check of results, thus improving the quality of the model.
To define the non-linearity that best matches the regularities of the actual costs Ctr, as the first step of the nonlinear regression process we used the R function nls, which determines the nonlinear (weighted) least-squares estimates of the parameters of the nonlinear model (7), where:
- the start values of β1 and β2 were determined according to the linear model (5);
- the optimum values of the coefficients β1 and β2 were calculated by the nls function for a number of picked γ values, using the correlation correl(Ctr:Ctr(nolin)) as the quality criterion.
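The search over picked γ values can be sketched as follows. The functional form of the nonlinear model is not reproduced in the text, so the sketch ASSUMES Ctr ≈ β1 * L + β2 * Dmax^γ (linear in L, power function of Dmax, no free term), which agrees with the scatter-plot discussion but is not confirmed as the authors' model. For a fixed γ this form is linear in β1 and β2, so a closed-form least-squares solution stands in for the iterative nls routine; all names and data below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation, used here as the quality criterion."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_for_gamma(L, D, y, gamma):
    """Least-squares beta1, beta2 for y ~ beta1 * L + beta2 * D**gamma
    (ASSUMED model form) with gamma held fixed."""
    P = [d ** gamma for d in D]
    a11 = sum(l * l for l in L)
    a12 = sum(l * p for l, p in zip(L, P))
    a22 = sum(p * p for p in P)
    b1 = sum(l * t for l, t in zip(L, y))
    b2 = sum(p * t for p, t in zip(P, y))
    det = a11 * a22 - a12 * a12
    beta1 = (b1 * a22 - b2 * a12) / det
    beta2 = (a11 * b2 - a12 * b1) / det
    fitted = [beta1 * l + beta2 * p for l, p in zip(L, P)]
    return beta1, beta2, fitted


def search_gamma(L, D, y, gammas):
    """Try each picked gamma and keep the one with the highest correlation
    between declared and modelled costs. Returns (gamma, beta1, beta2)."""
    best = None
    for gamma in gammas:
        beta1, beta2, fitted = fit_for_gamma(L, D, y, gamma)
        r = pearson(y, fitted)
        if best is None or r > best[0]:
            best = (r, gamma, beta1, beta2)
    return best[1], best[2], best[3]


# Hypothetical data generated with gamma = 1.5, beta1 = 2, beta2 = 0.5.
lengths = [5.0, 1.0, 8.0, 3.0, 10.0, 2.0]
diameters = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
costs = [2.0 * l + 0.5 * d ** 1.5 for l, d in zip(lengths, diameters)]
result = search_gamma(lengths, diameters, costs, [g / 10 for g in range(10, 21)])
```

On this synthetic data the grid search returns the generating values; on declared data the correlation curve over γ is typically flat near its maximum, so the picked grid should be reported together with the result.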
As the second step we used linear modelling to find the free term of the final model Ctr(nolin) by processing the function (7) that was found in the first step:
Ctr(nolin) <- lm(Ct ~ predict(mod(1))) (8)
The slightly improved formal correlation and p-value (tab. 2, columns Ctr(nolin)) in comparison with the linear model Ctr(lin2) are not the main achievement. The ANSDM is significantly reduced (3.03%); the deviation of the actual costs Ctr from the modelled Ctr(nolin) has decreased for 14 of 20 networks, while it has increased, and only insignificantly, for 4 networks. The highlighted confidence interval (shown at the 95% confidence level) is satisfactorily narrow. One can see that the nonlinear model (fig. 4a) coincides much better with the actual costs of the utilities than the linear one.
The other modelling procedure applied was a classical version of the multi-layer perceptron with the error backpropagation training algorithm. Multi-layer perceptrons have proven themselves in a variety of data mining tasks (e.g., Daranda, 2016; Jan, & Ali, 2018; Briedis, & Freivalds, 2018). To obtain the neural model for our case, in addition to standard procedures we introduced techniques to cope with the task-specific conditions: (a) the small amount of training data; (b) the monotony requirements for the resulting function with regard to each of the input parameters.
It was found in the pre-processing phase that just three neurons (computing units) in the hidden layer of the network are enough to achieve approximately the maximum possible accuracy; input data normalization was made by the S-type function because of the very diverse scales of data values. It was also experimentally confirmed that just 2 input parameters out of 5 are enough to achieve almost the same quality of the model: Dmax and [...] is greater; also, as the total network length decreases, the effect increases, as the Dmax-sized pipes make up a larger part of the total network length.
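The structure of such a small perceptron can be sketched as a forward pass with two inputs, three hidden neurons and a linear output. The techniques actually used to enforce monotony are not described in the text; the sketch ASSUMES one common option, keeping all connection weights non-negative, and all weights, centres and scales below are hypothetical values rather than trained ones.

```python
import math


def sigmoid(z):
    """S-type (logistic) function."""
    return 1.0 / (1.0 + math.exp(-z))


def normalize(value, center, scale):
    """S-type normalization of a raw input into the interval (0, 1)."""
    return sigmoid((value - center) / scale)


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: sigmoid hidden layer followed by a linear output unit."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical non-negative weights (ASSUMED way of keeping the output
# monotonically increasing in both inputs); biases may take any sign.
HIDDEN_WEIGHTS = [[0.8, 0.3], [0.2, 0.9], [0.5, 0.5]]
HIDDEN_BIASES = [-0.5, -0.4, -0.6]
OUTPUT_WEIGHTS = [1.0, 2.0, 0.5]
OUTPUT_BIAS = 0.1


def predict(length, dmax):
    """Modelled cost (arbitrary units) for a network length and largest diameter."""
    inputs = [normalize(length, 20.0, 10.0), normalize(dmax, 300.0, 100.0)]
    return forward(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES,
                   OUTPUT_WEIGHTS, OUTPUT_BIAS)
```

With non-negative weights and increasing activation functions, raising either input can only raise the output, so the modelled cost has no extremum inside the input ranges, as the economic logic requires.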

Conclusions
The created methodology and the developed benchmarking models clearly indicate the total network length and the largest inner diameter of pipes as the key drivers of the heat transmission costs and thus of the network efficiency. The very high coefficient of determination and the low ANSDM confirm that the modelled causal regularity is very near to the true one in the full range of combinations of predictors. The correlation between the models, which were created by mutually independent modelling tools, is excellent. Hence, the trustworthiness of the results is high; it confirms the quality and credibility of the developed methodology as well as its appropriateness for evaluation and optimization of the network design. The methodology is a reasonable compromise between accuracy and simplicity, which is a strong advantage for practical applications.
The results of the current study clearly indicate that optimum planning of the pipe routes and the choice of adequate pipe sizes, first of all the size of the large-diameter transmission pipes, are the key drivers in designing modern efficient transmission and distribution heating networks for the development, expansion and/or renovation of DH systems. The Dmax pipes still in use match the volume of actual Pcon in only 3 of the 20 networks (see, e.g., ETSAP 2013), even taking into account a reasonable reserve for potential future growth of consumption. In the other networks, the Dmax in use exceeds the optimal Dmax by up to 4 standardized sizes.
Replacing the existing Dmax-size pipes with pipes of the matching size during network renovation would significantly increase the network's efficiency, reducing Ctr by up to 35% and thus the overall heat supply tariff (Tpr + Ttr) by up to 12%.
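The link between the two percentages can be made explicit with a short worked relation. It is only a consistency check under stated assumptions: the consumed heat Qcon and the production tariff Tpr stay unchanged, Ttr = Ctr / Qcon as defined earlier, and s denotes the share of Ttr in the overall tariff (a symbol introduced here for illustration).

```latex
\frac{\Delta (T_{pr} + T_{tr})}{T_{pr} + T_{tr}}
  = s \cdot \frac{\Delta C_{tr}}{C_{tr}},
\qquad
s = \frac{T_{tr}}{T_{pr} + T_{tr}}
\quad\Rightarrow\quad
s \approx \frac{0.12}{0.35} \approx 0.34
```

A transmission share of roughly one third of the overall tariff is in line with the statement in the Introduction that network-related costs account for up to 40% of the total heating tariff. Since both percentages are upper bounds ("up to"), the implied share is indicative only.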
In practice, of course, actual DH systems are not as homogeneous as the theoretical cases discussed above. However, similar principles should be observed when planning their optimal development. An example is the DH system of Riga, which according to its efficiency and other performance indicators ranks among the best city energy systems in the Baltic Sea region. Its structure consists of two large DH systems and more than ten small and very small systems with boiler houses of adequate capacity (RS, 2018). The small systems supply remote clusters to which it was not economically justified to build long, poorly loaded transit pipelines; in some cases such pipes were even removed during reconstruction.
The complete rebuilding of existing DH systems is, of course, very expensive and therefore practically impossible. Another option would be to intensify the transition to fourth generation DH networks with a low heat carrier temperature, which will require pipes of larger size. However, the heating systems of the buildings are not yet prepared for this transition. Therefore, the analysis of various DH system models, possible re-planning of the network design and selection of adequate pipe sizes is the only realistic option for enhancing the efficiency of the network.
Nevertheless, the current study has to be regarded as the first stage of the research; it demonstrates the correctness of the working hypothesis and the prospects of the research. There is a strong basis for continuing the research to further enhance the methodology, to increase its quality and to narrow the standard segment; further improvement of the methodology is certainly possible and necessary.
Currently, data on only 23 of 56 functioning networks were available for modelling; in addition, three of them were classified as outliers. An increase in the number of data sets would be highly desirable to better reflect the large range of network sizes; the mentioned difference between the correlation and the ANSDM is a direct sign of this. Our previous experience shows that the availability of data over several consecutive years is a substantial positive factor. It will make it possible to identify and remove from the full data pack several unreliable and/or even contradictory (bad) data sets, which distort the regularity of the declared data on the networks. Modelling will then be based on the most accurate and reliable (good) data sets; the obtained general regularity will be applicable to the low-confidence (bad) networks too. Fig. 5 clearly shows that some data of several networks seem to be too far from the general regularity; it is known that the operators of these small networks perform a number of other municipal functions too, so the cost accounting in these utilities is evidently far from perfect. Attempting to accommodate such data in the common model deteriorates the quality of the model. For example, in our case the exclusion of only two bad data sets reduces the ANSDM by one third.
Overall, the results of the study emphasize the municipalities' duty to be proactive in fulfilling their heating supply obligations, ensuring the most efficient functioning of the DH networks and thus minimizing the heat tariffs for the population and other consumers in their area. It is time to revise the approach of local governments to heat supply; each municipality needs to consider carefully whether to give preference to a centralized energy model with long, inefficient large-sized pipes, or to several local heat generation stations and efficient distribution networks. Each individual DH project must be carefully analysed by reviewing stagnant habits and finding the optimal solution; the work and resources invested will pay off through the most effective network design and implementation for the specific area.

Table 1. Statistical characteristics of Ctr(lin9) model and its predictors

Table 2. Characteristics of the models and predictors