

energies

Article

Feature Extraction, Ageing Modelling and Information Analysis of a Large-Scale Battery Ageing Experiment

Jose Genario de Oliveira, Jr. 1,*, Vipul Dhingra 2 and Christoph Hametner 1

Citation: de Oliveira, J.G., Jr.; Dhingra, V.; Hametner, C. Feature Extraction, Ageing Modelling and Information Analysis of a Large-Scale Battery Ageing Experiment. Energies 2021, 14, 5295. https://doi.org/10.3390/en14175295

Academic Editor: Carlos Miguel Costa

Received: 23 July 2021
Accepted: 23 August 2021
Published: 26 August 2021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1 Christian Doppler Laboratory for Innovative Control and Monitoring of Automotive Powertrain Systems, TU Wien, 1010 Vienna, Austria; [email protected]

2 AVL List GmbH, 8010 Graz, Austria; [email protected]

* Correspondence: [email protected]; Tel.: +43-681-207-00814

Abstract: Large-scale testing of newly developed Li-ion cells is associated with high costs for the interested parties, and ideally, testing time should be kept to a minimum. In this work, an ageing model was developed and trained with real data from a large-scale testing experiment in order to answer how much testing time and data would really have been needed to achieve similar model generalisation performance on previously unseen data. A linear regression model was used, and the feature engineering, extraction and selection steps are shown herein, alongside accurate prediction results for the majority of the accelerated ageing experiments. Information analysis was performed to achieve the desired data reduction, obtaining similar model properties with a fifth of the number of cells and half of the testing time. The proposed ageing model uses features commonly found in the literature, and the structure is simple enough for the training to be performed online in an EV. It has good generalisation capabilities. Lastly, the data reduction approach used here is model-independent, allowing a similar methodology to be used with different modelling assumptions.

Keywords: battery ageing; battery modelling; capacity fade estimation; feature engineering

1. Introduction

The usage of appliances that rely on Li-ion storage technology has skyrocketed in recent years, with examples ranging from mobile phones to large-scale energy storage and electric/hybrid vehicles. As a result, there is an ever-growing need to study the causes and consequences of the degradation processes associated with Li-ion batteries. From a safety standpoint, battery cells can degrade in ways that lead to, for example, thermal runaways that can pose a serious risk to users, and it is paramount that the ageing processes that drive the cells into these states are well understood. Additionally, considering that the battery is currently estimated to represent a significant portion of the total cost of an electric vehicle [1], models that correctly predict the capacity fade and are used to develop operating strategies that minimise cell degradation are useful for reducing the total ownership costs of electric vehicles (EV) and hybrid electric vehicles (HEV), as shown in [2].

There are two important research areas associated with battery state of health (SoH). The first is SoH estimation, which tries to estimate the current battery degradation during normal vehicle operation. Examples of such approaches can be seen in [3–5]. The other area is SoH prediction, which predicts how the battery will degrade over time, depending on how it is used. There are several use cases when it comes to ageing prediction itself. The most common goal is to predict the remaining useful life (RUL) of the battery at a given state and presumed load. In this work, the main target was to predict the capacity fading trajectory, yielding additional insights into what happens between the current battery state and the end of life. For that purpose, an ageing prediction model that maps the future inputs to some predicted capacity and/or internal resistance change was required. In general, these models can have different structures and focuses, and are usually divided into model-based, data-driven and hybrid approaches. An excellent overview of the topic is given in [6].

The main advantage of the mechanistically focused approaches is that they are able to explain how the ageing process occurs in operating regimes not necessarily covered in the training set or for other battery chemistries, given enough knowledge of the physical parameters and underlying ageing mechanisms. However, the electrochemical side reactions are often complex, and the resulting models are usually based on the solutions of partial differential equations, hard to parametrise correctly and time-consuming to simulate. A thorough analysis of several ageing phenomena that normally occur in a Li-ion battery cell was done in [7], showing that the intensity of each ageing process differs according to chemistry. Often these different degradation effects are modelled separately, and the integration of those mechanisms also poses a significant challenge. An example of an approach where the degradation effects are modelled individually can be seen in [8]: a first-principles model was developed, and the authors investigated the loss of active material due to the formation of a film over the surface of the negative electrode when the cell was in charge mode. Reference [9] simplifies a pseudo-2D electrochemical degradation model in order to improve computational speed while retaining accuracy similar to that of a first-principles model, with the final model lying somewhere in between the two approaches. However, the parametrisation of such a model is done with measurements of cells kept at constant storage and cycling conditions.

On the other hand, data-driven models are normally restricted to a specific cell chemistry and use curve-fitting tools to analyse the influences of effects such as time, number of cycles, temperature and voltage on ageing. They are also usually divided into cycling and calendar ageing. An example of such an approach can be seen in [10], which uses an Arrhenius-type function to model the capacity fade over the ampere-hour (Ah) throughput. In [11], a capacity fade model due to cycling and calendar ageing, restricted to constant experimental conditions, is investigated. A common limitation of some of the approaches mentioned in the literature is that extending these models to a more general framework that considers changes in ageing factors, such as average C-rate or temperature, is often not straightforward. One example of a more general approach can be seen in [12]. The authors used a Gaussian process model to predict the capacity fade, with features extracted directly from the input signal.

Additionally, with new technologies associated with Li-ion cells being developed constantly, there is an ever-pressing need to test and develop models quickly, not just for these new cells, but for their ageing processes as well. One vital step of this testing procedure is the ability to make an informed decision, often based on past experience, on how much testing is required to incorporate these new cells into existing methodologies, and to perform these tests accordingly. Specifically concerning ageing models, these tests are quite time-consuming, even when accelerated ageing tests are conducted, and it is a challenge to determine how much and for how long to test.

The main contribution of this work is in helping to answer the question: how much and for how long should new Li-ion cells be tested in order to obtain an acceptable ageing prediction model? This was done in a qualitative fashion, using the developed model to show the trade-off between validation performance and quantity of data used. This analysis was based on information theory; thus, it can be extended to other models as well. The other main contribution is the ageing modelling approach itself. It aims to build an incremental capacity model that extracts specific features from arbitrary known future loads in the form of current $I$, voltage $V$ and temperature $T$ signals, and then maps those to a capacity change $\Delta Q$ associated with this interval. This is done in such a fashion that the model training could be performed online and with limited hardware, such as in a battery management system (BMS), as shown in Figure 1. In order to parametrise the ageing model, a dataset of roughly two hundred LiNiMnCoO$_2$ (NMC) cells was used. They were subjected to a wide array of different accelerated ageing profiles.


Figure 1. A block diagram of the state estimation and ageing prediction structure.

This paper is divided into the following sections. In Section 2, the dataset used is presented alongside the available measurements from one of the cells. Section 3 explains how the features used in the models were extracted and selected from the measurements, and also presents the model structure considered for the ageing model. Section 4 presents the validation results of the ageing model compared against the dataset, Section 5 contains an information-based analysis using the model developed in the previous sections and Section 7 is the conclusion.

2. Battery Ageing Data

2.1. Dataset Description and Testing Equipment

A dataset consisting of about two hundred NMC cells (18650) was used in this work. The nominal capacity of the cells was 330 mAh, and they were aged under diverse operating conditions for a period of approximately two years. The experiments were conducted at two different test sites. The first used a battery tester from ARBIN, containing six channels with a current range of ±300 A (5 V). The other test site used a DIGATRON lithium cell tester with access to four circuits that were able to source ±400 A from 0–6 V. For each test in this dataset, the available measurements are current $I$, temperature $T$ and voltage $V$. More information on the cells can be seen in Table 1.

Table 1. Additional information on the lithium-ion cells.

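| Group | Parameter | Value |
|---|---|---|
| Cell chemistry | Positive electrode | NMC |
| Cell chemistry | Negative electrode | Graphite |
| Cell chemistry | Nominal capacity | 330 mAh |
| Upper voltage limits | Constant current | 4.085 V |
| Upper voltage limits | 5 s pulse | 4.2 V |
| Upper voltage limits | Safety limit | 4.5 V |
| Lower voltage limits | Constant current | 3.354 V |
| Lower voltage limits | 5 s pulse | 2.8 V |
| Lower voltage limits | Safety limit | 2.5 V |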


2.2. Testing Overview

The design of experiments for the tests falls into two categories: the calendar and the accelerated ageing experiments. The two main impact factors defined for the calendar ageing experiments were the storage SoC, which varied from 5 to 95%, and the temperature, which ranged from −10 to 60 °C. For all tests, reference cycles were carried out at different temperature levels and were repeated from time to time in order to extract different cell parameters over time. A reference cycle profile at 25 °C is shown in Figure 2. In order to cause the accelerated ageing behaviour, load cycles were repeated exhaustively. The main impact factors for the accelerated ageing tests are shown in Table 2. They were temperature (T), constant charging current (CC), peak discharging current (PDC), average SoC (SOC) and delta DoD (dDoD). The load cycles were generated by charging the cell with a predefined constant current and discharging it in such a way that the remaining impact factors were met; this was then repeated for a predefined time.

Table 2. Impact factors for the accelerated ageing tests.

| Variable | Minimum | Maximum |
|---|---|---|
| T | −20 °C | 45 °C |
| CC | 0.2C | 2.4C |
| PDC | 0.2C | 10C |
| SOC | 15% | 80% |
| dDoD | 2.5% | 80% |

Figure 2. Current (A), voltage (V) and temperature (°C) profiles over time t (h) from a reference test cycle showing characterisation pulses at varying SoCs.

Examples of these load cycles are depicted in Figures 3 and 4. Additionally, there were variable resting times between load and reference cycles that differed from cell to cell and from test to test.


Figure 3. Current (A), voltage (V) and temperature (°C) profiles over time t (h) of a section of a load cycle at high temperatures, displaying uneven charge and discharge behaviour.

Figure 4. A snapshot of current (A), voltage (V) and temperature (°C) profiles over time t (h) on a section of a load cycle at low temperatures with similar charge and discharge pulses.

2.3. Capacity Estimation and Initial Analysis

The capacity was estimated by Coulomb counting between the end of a CC–CV charge (constant current, constant voltage at $V_{max}$) and the end of a CC–CV discharge at $V_{min}$. This can be seen at the beginning of the voltage plot in Figure 2. The voltages $V_{min}$ and $V_{max}$ for the CV parts are usually defined by the manufacturer; here they were 3.354 and 4.085 V, corresponding to 15% and 95% SoC at 25 °C. Since the battery terminal voltage depends on the temperature, the discharge capacity estimated in this fashion is also temperature-dependent. The reference temperature for the capacity estimation was assumed to be 25 °C. Thus, only the capacity values estimated at this temperature were used for the ageing analysis. This was important in order to separate the effect of temperature on the estimated discharge capacity from the effect of ageing itself. Table 3 shows the average ageing conditions for some selected cells in the dataset; e.g., one can get an idea of the temperature influence by comparing the capacity losses of cells 1, 4 and 33. A similar analysis can be done for the other average features shown here, such as the influence of the total Ah when looking at cells 5 and 17.


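Table 3. Data for multiple cells: an overview of the average conditions of the ageing tests. Temperatures are in °C, overall experimental durations in days and capacities in mAh.

| Cell Nr. | Temperature | Duration | Mean Ah | Total Ah | Initial Cap. | Cap. Loss |
|---|---|---|---|---|---|---|
| 1 | −9 | 540 | $1.33 \times 10^{4}$ | $2.83 \times 10^{6}$ | 315 | 33 |
| 4 | 8 | 546 | $1.27 \times 10^{4}$ | $2.93 \times 10^{6}$ | 314 | 40 |
| 5 | 8 | 310 | $9.83 \times 10^{3}$ | $7.20 \times 10^{6}$ | 329 | 115 |
| 17 | 8 | 395 | $1.47 \times 10^{4}$ | $2.84 \times 10^{6}$ | 321 | 38 |
| 33 | 41 | 546 | $1.20 \times 10^{4}$ | $2.28 \times 10^{6}$ | 319 | 69 |
| 46 | −8 | 148 | $1.33 \times 10^{4}$ | $3.62 \times 10^{6}$ | 320 | 65 |
| 63 | 29 | 547 | $1.55 \times 10^{4}$ | $4.06 \times 10^{6}$ | 322 | 35 |
| 141 | 40 | 418 | $1.24 \times 10^{4}$ | $3.50 \times 10^{6}$ | 324 | 96 |
| 217 | 42 | 176 | $1.10 \times 10^{4}$ | $9.62 \times 10^{6}$ | 319 | 120 |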

3. Model Based Analysis

One of the goals of this work was to develop a model that predicts the capacity loss based on measurement data from a cell. To do that, two key questions need to be answered: what causes a cell to age, and how can such information be extracted from the measurements available in the dataset? As mentioned earlier, extensive work has been done in this area from an electrochemical point of view. The main challenge associated with this approach is how to observe such effects without having to conduct a thorough inspection of the battery. Ideally, this information should be seen from the current, voltage and temperature data, which are the measurements that would usually be available online in a battery management system (BMS). Hence, a data-driven approach backed up by expert knowledge was pursued. The following equation is used to describe the incremental capacity loss associated with one interval:

$$\Delta Q_i = f(x_i \mid \theta), \tag{1}$$

where $\theta$ is the model parameter vector and $x_i$ is the feature vector extracted from the $I_i$, $V_i$ and $T_i$ signals. The subscript $i$ denotes that the measurements were taken in time interval $i$, between an arbitrary initial time instant $t_{ini}$ and the final time $t_{end}$. These features are mostly those we expect to be relevant, based on different Li-ion cell ageing processes. It is also relevant to point out the difference between the incremental capacity loss $\Delta Q_i$ and the capacity itself, $Q_i$, which are related as

$$Q_i = Q_{i-1} - \Delta Q_i. \tag{2}$$
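To make the relation in Equation (2) concrete, the following minimal Python sketch accumulates predicted incremental losses into a capacity trajectory. The function name `capacity_trajectory` and the numerical values are illustrative assumptions and are not taken from the dataset.

```python
from itertools import accumulate


def capacity_trajectory(q0, delta_q):
    """Apply Q_i = Q_{i-1} - dQ_i once per interval, starting from q0."""
    # accumulate() carries the running capacity and subtracts one loss per step
    return list(accumulate(delta_q, lambda q, dq: q - dq, initial=q0))


# Hypothetical example: a 315 mAh cell and three predicted losses in mAh
trajectory = capacity_trajectory(315.0, [2.0, 1.5, 1.0])
print(trajectory)
```

The first entry of the returned list is the initial capacity, so the list has one more element than the number of intervals.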

From the measurements, the main features that drive the ageing of the cells are both the time interval $\Delta t$, defined as

$$\Delta t = t_{end} - t_{ini}, \tag{3}$$

with $\Delta t$ being the duration of the cycle, and the absolute amount of charge transferred to and from the cell, in Ah, resulting in

$$\Delta Ah = \frac{1}{3600} \int_{t_{ini}}^{t_{end}} |I(t)| \, dt. \tag{4}$$

Another feature, which is more complex, arises from the reconstruction of the state of charge (SoC) from the current signal via standard Coulomb counting, such that

$$SoC(t) = \frac{1}{C_{nom}} \int_{t_{ini}}^{t} I(\tau) \, d\tau + SoC(t_{ini}), \tag{5}$$


where $C_{nom}$ is the cell capacity at the beginning of the load profile, $t_{ini}$ is the initial time, $t_{end}$ is the final time, $I$ is the current signal and $SoC(t_{ini})$ is the initial value of the SoC in the considered interval. While it would be possible to use an observer to reconstruct the SoC, as presented in [13,14], the results obtained using standard Coulomb counting were considered sufficient for the scope of this work. The reconstructed SoC obtained from this procedure was then analysed with a rainflow counting algorithm, as suggested in [15], in order to break down the SoC profile into $N$ elementary cycles, each with a different amplitude $\Delta DoD_i$; this is depicted in Figure 5. The DoD, or depth of discharge, is defined here as $DoD = 1 - SoC$, and this nomenclature is used in order to maintain conformity with other works. This allows for the extraction of more features, such as

$$\overline{\Delta DoD} = \frac{1}{N} \sum_{i=1}^{N} \Delta DoD_i, \tag{6}$$

namely, the average $\Delta DoD$, and the frequency of cycles associated with this procedure:

$$\omega_{\Delta DoD} = \frac{N}{t_{end} - t_{ini}}. \tag{7}$$
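The steps behind Equations (5) to (7) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the rainflow routine follows the common three-point counting scheme, half cycles are counted with a weight of 0.5 when computing $N$ and the average, positive current is taken as charging, and time is in seconds with the capacity in Ah (hence the factor 3600). All function names are hypothetical.

```python
from collections import deque


def coulomb_count_soc(t_s, current_a, soc_ini, c_nom_ah):
    """Eq. (5) with a left-rectangle rule; assumes t in s, I in A, C_nom in Ah."""
    soc = [soc_ini]
    for k in range(1, len(t_s)):
        step = current_a[k - 1] * (t_s[k] - t_s[k - 1]) / (3600.0 * c_nom_ah)
        soc.append(soc[-1] + step)
    return soc


def reversals(x):
    """First sample, every turning point (plateaus are skipped), last sample."""
    out, direction = [x[0]], 0
    for prev, cur in zip(x, x[1:]):
        if cur == prev:
            continue
        new_dir = 1 if cur > prev else -1
        if direction != 0 and new_dir != direction:
            out.append(prev)  # the previous sample was a local extremum
        direction = new_dir
    out.append(x[-1])
    return out


def rainflow(x):
    """Three-point rainflow counting; returns (range, weight) pairs."""
    cycles, stack = [], deque()
    for point in reversals(x):
        stack.append(point)
        while len(stack) >= 3:
            x_rng = abs(stack[-1] - stack[-2])
            y_rng = abs(stack[-2] - stack[-3])
            if x_rng < y_rng:
                break
            if len(stack) == 3:
                # range Y contains the starting point: count it as a half cycle
                cycles.append((y_rng, 0.5))
                stack.popleft()
            else:
                # count Y as a full cycle and discard its two reversals
                cycles.append((y_rng, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    while len(stack) > 1:
        # whatever is left over is counted as half cycles
        cycles.append((abs(stack[1] - stack[0]), 0.5))
        stack.popleft()
    return cycles


def cycle_features(soc, duration_s):
    """Weighted mean cycle depth (Eq. 6), cycle frequency (Eq. 7) and N."""
    cycles = rainflow([1.0 - s for s in soc])  # DoD = 1 - SoC
    n = sum(weight for _, weight in cycles)
    mean_ddod = sum(rng * weight for rng, weight in cycles) / n
    return mean_ddod, n / duration_s, n


# Hypothetical SoC profile over two hours
mean_ddod, omega, n = cycle_features([0.8, 0.4, 0.6, 0.4, 0.8], 7200.0)
print(mean_ddod, omega, n)
```

The same `rainflow` function can be applied to the current signal to obtain the analogous current-based cycle features.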

The rainflow analysis was also done for the current signal $I$, resulting in additional features $\overline{\Delta I}$ and $\omega_{\Delta I}$, defined analogously to $\overline{\Delta DoD}$ and $\omega_{\Delta DoD}$. Additional features proportional to the power dissipation in the cell were also considered, resulting in

$$\overline{I^2} = \frac{1}{t_{end} - t_{ini}} \int_{t_{ini}}^{t_{end}} I(t)^2 \, dt \tag{8}$$

and

$$\sum I^2 = (t_{end} - t_{ini}) \, \overline{I^2}. \tag{9}$$
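For sampled measurements, the integrals in Equations (3), (4), (8) and (9) have to be approximated numerically. The sketch below uses the trapezoidal rule, which is an assumption made here for illustration; the function names and the sample values are likewise hypothetical. Time is assumed to be in seconds and current in amperes.

```python
def trapezoid(t, y):
    """Trapezoidal approximation of the integral of y over the time stamps t."""
    return sum(0.5 * (y[k] + y[k + 1]) * (t[k + 1] - t[k]) for k in range(len(t) - 1))


def throughput_features(t_s, current_a):
    """Return dt (s), dAh (Ah), mean squared current (A^2) and sum of I^2 (A^2 s)."""
    dt = t_s[-1] - t_s[0]                                        # Eq. (3)
    d_ah = trapezoid(t_s, [abs(i) for i in current_a]) / 3600.0  # Eq. (4)
    mean_i2 = trapezoid(t_s, [i * i for i in current_a]) / dt    # Eq. (8)
    sum_i2 = dt * mean_i2                                        # Eq. (9)
    return dt, d_ah, mean_i2, sum_i2


# Hypothetical interval: one hour of constant 2 A discharge, sampled every 30 min
print(throughput_features([0.0, 1800.0, 3600.0], [-2.0, -2.0, -2.0]))
```

Because Equation (4) integrates the absolute current, charge and discharge both add to the throughput regardless of the sign convention.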

The complete list of features that were considered for the model can be seen in Table 4. For this work, the main drive behind this feature extraction procedure was that we wanted to have an underlying physical motivation for the inclusion of a given feature. This resulted in the elimination of features commonly used in standard time-series analysis that would have been hard to interpret and/or that are arbitrary, such as the second coefficient of a Fourier decomposition of the voltage signal with a given sampling time, which is significantly harder to interpret than, e.g., the average temperature in a given cycle. Note that at this step there was no consideration of whether a feature is important for capacity fade prediction or not; this is investigated in more detail in the next section.

Figure 5. An example of the feature extraction procedure ($\overline{\Delta DoD}$ and $\omega_{\Delta DoD}$) from the SoC signal with a rainflow counting algorithm. The plot shows the state of charge, SoC (%), over time t (h), together with its cycle breakdown into $\Delta DoD_1$, $\Delta DoD_2$ and $\Delta DoD_3$.


Table 4. Extracted features.

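| Feature | Description |
|---|---|
| $\Delta t$ | Time interval |
| $\Delta Ah$ | Charge in a given interval |
| $\bar{T}$ | Average temperature |
| $t_{ini}$ | Absolute time at the beginning |
| $Ah_{ini}$ | Absolute total charge at the beginning |
| $\bar{V}$ | Average voltage |
| $\overline{SoC}$ | Average SoC |
| $\overline{\Delta DoD}$ | Average cycle magnitude extracted from SoC |
| $\overline{\Delta I}$ | Average cycle magnitude extracted from the current $I$ |
| $\omega_{\Delta DoD}$ | Cycling frequency extracted from SoC |
| $\omega_{\Delta I}$ | Cycling frequency extracted from the current $I$ |
| $\bar{I}_{ch}$ | Average charging current |
| $\bar{I}_{dis}$ | Average discharging current |
| $\overline{I^2}$ | Average squared current |
| $\sum I^2$ | Sum of squared current |
| $N$ | Number of cycles from DoD analysis |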

4. Model Structure Selection

With the list of features considered for the ageing model presented in the last section, the question of how to combine and select such features remains. These topics are discussed in the next subsections, both regarding model structure and which features to use.

4.1. Structure Considerations

In order to keep some degree of interpretability in the model, certain considerations were kept in mind concerning its structure:

1. Given an arbitrary $\Delta Q$ with elapsed time $\Delta t$ and charge throughput $\Delta Ah$, if we split the interval into parts such that

   $$\Delta t = \Delta t_1 + \Delta t_2, \tag{10}$$

   $$\Delta Ah = \Delta Ah_1 + \Delta Ah_2, \tag{11}$$

   then

   $$\Delta Q = \Delta Q_1 + \Delta Q_2 \tag{12}$$

   must hold for any positive $\Delta t_1$, $\Delta t_2$, $\Delta Ah_1$, $\Delta Ah_2$.
2. If $\Delta t = 0$ and $\Delta Ah = 0$, then $\Delta Q = 0$. Note that it is not possible that $\Delta Ah \neq 0$ and $\Delta t = 0$.
3. In [7], the ageing effects were described together with accelerating factors. One possible way to take that into consideration is that these accelerating factors should be linked to a main ageing feature in a regression structure.
4. Ideally, the model’s structure should be kept simple in order to, if necessary, perform the parametrisation in an online fashion, with limited hardware, such as in a BMS.
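As a worked illustration of what the first two requirements rule out and allow (the function $F$ below is a hypothetical placeholder and the accelerating factors are assumed constant), a plain power law in the interval length is not additive for positive $\Delta t_1$, $\Delta t_2$, whereas a difference of any function of absolute time telescopes:

$$
\begin{aligned}
(\Delta t_1 + \Delta t_2)^p &\neq \Delta t_1^p + \Delta t_2^p \qquad (p \neq 1), \\
\underbrace{F(t_{ini} + \Delta t_1) - F(t_{ini})}_{\Delta Q_1} + \underbrace{F(t_{ini} + \Delta t_1 + \Delta t_2) - F(t_{ini} + \Delta t_1)}_{\Delta Q_2} &= F(t_{ini} + \Delta t) - F(t_{ini}).
\end{aligned}
$$

The second form also satisfies the second requirement, since $\Delta t = 0$ gives $F(t_{ini}) - F(t_{ini}) = 0$.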

The intuitive reason behind the first requirement is that, given any arbitrary load, the sum of the capacity losses due to each partial load obtained by splitting the original load into smaller pieces should be the same as the overall capacity loss. The second requirement constrains the model so that if no time has passed, there is no capacity loss, which is also intuitive. However, these requirements restrict the possible model structures for the ageing data. In Table 4, it is easy to verify that some features are intrinsically associated with the rate of decay of the capacity, such as $T$, $V$, SoC, $\Delta\mathrm{DoD}$, $\Delta I$, $\omega_{\Delta\mathrm{DoD}}$, $\omega_{\Delta I}$, $I_{\mathrm{ch}}$, $I_{\mathrm{dis}}$, $I^2$ and $N$. These features are modelled as accelerating factors for the elapsed time $\Delta t$ and charge throughput $\Delta \mathrm{Ah}$, in a similar fashion to [15]. An initial generic structure of a model based on the previous discussion is

$$\Delta Q_i = f_1(t_{\mathrm{ini}}, \Delta t)\, g_1(v_i) + f_2(\mathrm{Ah}_{\mathrm{ini}}, \Delta \mathrm{Ah})\, g_2(v_i), \tag{13}$$

with $v_i$ being the feature vector associated with the rates, as described previously, for a given interval $i$, and $f_1$, $f_2$, $g_1$ and $g_2$ being functions of their respective arguments. Note that by doing this, the model is split into one part associated with calendric ageing and another associated with cycling. It is also interesting to point out that some features shown in Table 4 are linked to calendric ageing and/or cycling. The temperature is often considered as an Arrhenius-type accelerating factor that is independent from the rest. This consideration is somewhat relaxed here by considering a second-order relationship, linear in the parameters, to model the temperature effect. By taking into account these structural considerations, the accelerating factor functions $g_1$ and $g_2$ were chosen as

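$$g_1 = \mathrm{conv}\left([1 \;\; T \;\; T^2],\; [1 \;\; \mathrm{SoC} \;\; V]\right)\theta_{g_1}, \tag{14}$$

$$g_2 = \mathrm{conv}\left([1 \;\; T \;\; T^2],\; [1 \;\; \mathrm{SoC} \;\; V \;\; \Delta\mathrm{DoD} \;\; \Delta I \;\; \omega_{\Delta\mathrm{DoD}} \;\; \omega_{\Delta I} \;\; I_{\mathrm{ch}} \;\; I_{\mathrm{dis}} \;\; I^2 \;\; \textstyle\sum I^2 \;\; N]\right)\theta_{g_2}, \tag{15}$$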

where conv(u, v) denotes the convolution between two vectors u and v, modelling the interactions between an accelerating factor function of temperature and the other features. The parameter vectors associated with $g_1$ and $g_2$ are $\theta_{g_1}$ and $\theta_{g_2}$, respectively. The functions $f_1(t_{\mathrm{ini}}, \Delta t)$ and $f_2(\mathrm{Ah}_{\mathrm{ini}}, \Delta \mathrm{Ah})$ are taken as

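$$f_1 = (t_{\mathrm{ini}} + \Delta t)^p - t_{\mathrm{ini}}^p, \tag{16}$$

$$f_2 = (\mathrm{Ah}_{\mathrm{ini}} + \Delta \mathrm{Ah})^q - \mathrm{Ah}_{\mathrm{ini}}^q. \tag{17}$$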

The underlying reason for this choice was to have the functions $f_1$ and $f_2$ model behaviours similar to mixed kinetic-diffusion processes, as seen in [15]. This assumption is flexible enough to represent both linear and accelerated behaviour during the cell beginning of life, which in general can be modelled by such processes. The parameters p and q are regressed in conjunction with the feature selection procedure that will be presented next.
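The power-law form of Equations (16) and (17) satisfies the first two structural requirements because the increments telescope when an interval is split. The following minimal sketch checks this numerically. The function name and the time values are hypothetical, and the check assumes that the accelerating factors are the same over both sub-intervals.

```python
def power_law_increment(x_ini, dx, exponent):
    """Increment of x**exponent over an interval of length dx starting at x_ini."""
    return (x_ini + dx) ** exponent - x_ini ** exponent


p = 0.5514  # exponent reported for the calendric part in Section 4.2
t_ini, dt1, dt2 = 100.0, 30.0, 50.0  # hypothetical times, e.g., in days

# Requirement 1: splitting the interval must not change the total increment.
whole = power_law_increment(t_ini, dt1 + dt2, p)
split = power_law_increment(t_ini, dt1, p) + power_law_increment(t_ini + dt1, dt2, p)

# Requirement 2: no elapsed time means no increment.
zero = power_law_increment(t_ini, 0.0, p)

print(abs(whole - split) < 1e-9, zero == 0.0)
```

The same argument applies to $f_2$ with the charge throughput and the exponent q.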

4.2. Feature Selection

When selecting the features relevant for the model, all possible subsets of features were enumerated. For each combination, the parameters p and q were found by minimising the root mean squared error (RMSE) for the capacity loss over the overall training set. The approach selected here to ensure that the model generalisation performance would be good was to select the subset of features that minimised the k-fold cross-validation error defined in [16] and given by

$$\mathrm{CV}(f) = \frac{1}{N}\sum_{i=1}^{N} L\left(y_i, f^{-k(i)}(x_i)\right), \tag{18}$$

with the K folds being selected randomly, $f^{-k(i)}$ being the function fitted with the k(i)-th part removed, where k(i) is the fold containing observation i, and K being the number of folds. Additionally, the RMSE and NRMSE on Q are defined as

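$$\mathrm{RMSE}_Q = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(Q_i - \hat{Q}_i\right)^2}, \tag{19}$$

$$\mathrm{NRMSE}_Q = \frac{1}{Q_{\mathrm{nom}}}\,\mathrm{RMSE}_Q, \tag{20}$$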


with N being the number of data points, $Q_i$ the i-th measurement of Q, $\hat{Q}_i$ the i-th prediction of Q and $Q_{\mathrm{nom}}$ the nominal value of Q across the dataset, 330 mAh. The metrics $\mathrm{RMSE}_{\Delta Q}$ and $\mathrm{NRMSE}_{\Delta Q}$ are defined analogously to Equations (19) and (20), with the normalisation still being done with respect to the nominal capacity. In general, there are well-known feature selection approaches available in the literature, such as filter-based methods, as seen in [17], and wrapper-based methods, as exemplified in [18], that would be able to solve the task at hand. However, given the specialised restricted structure of the model presented here, the extremely fast model training and the limited number of features, all possible combinations of features for $g_1$ (with and without SoC and/or V, for example) and $g_2$ were tested, resulting in 8192 combinations. Additionally, for each combination, the optimal parameters p and q were found by solving a minimisation problem with the 10-fold cross-validation MSE as the objective function. The model which gave the best performance was found by eliminating the average SoC associated with the cycling part $g_2$ of the model and is given by

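$$g_2 = \mathrm{conv}\left([1 \;\; T \;\; T^2],\; [1 \;\; V \;\; \Delta\mathrm{DoD} \;\; \Delta I \;\; \omega_{\Delta\mathrm{DoD}} \;\; \omega_{\Delta I} \;\; I_{\mathrm{ch}} \;\; I_{\mathrm{dis}} \;\; I^2 \;\; \textstyle\sum I^2 \;\; N]\right)\theta_{g_2}, \tag{21}$$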

with Equations (14), (16) and (17) remaining unchanged. The values for p and q were also regressed, yielding p = 0.5514 and q = 0.5178. Note that with p and q fixed, the full problem can be written in a standard linear regression form with respect to the parameters

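$$Y = X\theta, \tag{22}$$

with

$$X = \begin{bmatrix} f_1(t_{\mathrm{ini},1}, \Delta t_1)\, g_1^*(v_1) & f_2(\mathrm{Ah}_{\mathrm{ini},1}, \Delta \mathrm{Ah}_1)\, g_2^*(v_1) \\ \vdots & \vdots \\ f_1(t_{\mathrm{ini},N}, \Delta t_N)\, g_1^*(v_N) & f_2(\mathrm{Ah}_{\mathrm{ini},N}, \Delta \mathrm{Ah}_N)\, g_2^*(v_N) \end{bmatrix}, \tag{23}$$

$$\theta = \begin{bmatrix} \theta_{g_1} \\ \theta_{g_2} \end{bmatrix}, \tag{24}$$

$$Y = \begin{bmatrix} \Delta Q_1 \\ \vdots \\ \Delta Q_N \end{bmatrix}, \tag{25}$$

where

$$g_1 = g_1^*\theta_{g_1}, \tag{26}$$

$$g_2 = g_2^*\theta_{g_2} \tag{27}$$

and N denotes the number of data points.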
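As an illustration of how one row of X in Equation (23) could be assembled, the sketch below builds the regressors for a single interval. It rests on an assumption that should be stated clearly: the interaction terms written as conv(u, v) are read here as all pairwise products of the temperature polynomial and the feature vector (a Kronecker product). A literal discrete convolution (`np.convolve`) would instead sum some of these products, and the text does not say which reading is intended. All function names and numerical values are hypothetical. The final lines count the feature subsets under one reading that reproduces the reported 8192 combinations (two optional features in $g_1$ and eleven in $g_2$).

```python
import itertools

import numpy as np


def power_increment(x_ini, dx, exponent):
    """Equations (16)/(17): increment of x**exponent over the interval."""
    return (x_ini + dx) ** exponent - x_ini ** exponent


def interaction_terms(temperature, features):
    """All pairwise products of [1, T, T^2] and [1, features...] (assumed reading of conv)."""
    temp_poly = np.array([1.0, temperature, temperature ** 2])
    base = np.concatenate(([1.0], np.asarray(features, dtype=float)))
    return np.kron(temp_poly, base)


def design_row(t_ini, dt, ah_ini, dah, temperature, cal_feats, cyc_feats, p, q):
    """One row of X: calendric regressors followed by cycling regressors."""
    g1_star = interaction_terms(temperature, cal_feats)
    g2_star = interaction_terms(temperature, cyc_feats)
    return np.concatenate((power_increment(t_ini, dt, p) * g1_star,
                           power_increment(ah_ini, dah, q) * g2_star))


# Hypothetical interval: 2 calendric features (SoC, V) and 10 cycling features.
row = design_row(t_ini=100.0, dt=30.0, ah_ini=50.0, dah=10.0, temperature=25.0,
                 cal_feats=[0.5, 3.7], cyc_feats=np.ones(10), p=0.5514, q=0.5178)

# Exhaustive enumeration of on/off choices for 13 optional features.
n_optional = 13
n_subsets = sum(1 for r in range(n_optional + 1)
                for _ in itertools.combinations(range(n_optional), r))

print(row.shape, n_subsets)
```

Stacking one such row per interval gives X, after which any linear solver can be applied for fixed p and q.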

5. Model Training and Validation

To validate the modelling approach, the dataset was split into two parts, one only for training and the other only for validation. This division was made so as to keep the training and validation sets similar to each other, including in terms of total capacity loss. To assess the model quality, two measures are key for ageing prediction: the error regarding ∆Q, the incremental capacity loss between two consecutive points, and the error regarding Q, the actual cell capacity. In other words, the model needs to perform well on both of these metrics for its quality to be considered good. It was found heuristically that, due to the possibly high number of features, the best validation results simultaneously on ∆Q and Q were obtained when, instead of minimising the ℓ2 norm of the error, the regularised ℓ1 norm was minimised. The cost function is written as


$$J_1 = \lVert Y - X\theta \rVert_1 + \lambda \lVert \theta \rVert_1, \tag{28}$$

where λ is a regularisation parameter, chosen here to minimise the 10-fold cross-validation error in the training set. The problem of minimising the cost shown in Equation (28) can be rewritten as a linear program (LP), as seen in [19].
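For reference, one standard way to obtain such an LP is sketched below. The auxiliary vectors u and s are introduced here for illustration only (they are not symbols used in the original derivation), and the inequalities are meant element-wise.

```latex
\begin{aligned}
\min_{\theta,\,u,\,s}\quad & \mathbf{1}^{T} u + \lambda\, \mathbf{1}^{T} s \\
\text{subject to}\quad & -u \le Y - X\theta \le u, \\
& -s \le \theta \le s .
\end{aligned}
```

At the optimum, each $u_i$ equals the absolute residual $|(Y - X\theta)_i|$ and, for $\lambda > 0$, each $s_j$ equals $|\theta_j|$, so the objective coincides with $J_1$.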

Figure 6 shows the model's prediction as a solid line and the data as crosses for the capacities of four different cells in the validation set. Some important average ageing features for each cell are shown next to the four plots. The upper left plot depicts a scenario where the cell was aged with fairly mild accelerated ageing tests, and the effect of temperature is seen when comparing it to the upper right plot, where the cycling conditions were on average not as harsh as in the first, but the temperature increased significantly: the cell reached the same capacity loss in roughly two thirds of the time and with less than half the number of equivalent cycles. The model was able to explain what happened satisfactorily. The $\mathrm{NRMSE}_Q$ values for both of these cells indicate that the prediction is very good compared to the average cell of the dataset. Again, it is worth emphasising that the average values were used for the analysis and not for the ageing model prediction. This is noticeable from the lower left plot, which shows only a slightly higher total capacity loss than the upper right plot, despite having almost double the average ∆DoD, triple the number of cycles and almost double the total testing time. The $\mathrm{NRMSE}_Q$ value for this cell is significantly higher than that of the upper left plot, which is due to an offset. The model was still very good, as indicated by the $\mathrm{NRMSE}_{\Delta Q}$ metric, especially if this model were combined with a state of health observer. The lower right plot shows the model's prediction for rather severe average ageing conditions, with a high temperature, a high rate of cycling and a high ∆DoD. Almost the same total capacity loss occurred as in the other cells in a third of the testing time. Looking only at the $\mathrm{NRMSE}_Q$, it would seem that the model prediction errors of the two lower cells are similar, but that is not true, simply because the $\mathrm{NRMSE}_Q$ does not give the full picture of the prediction quality.

Figure 7 shows the histogram of the performance on the validation set. There was a substantial number of cells for which the performance was very good, an intermediate segment with average results and some outlier cells, which raised the average $\mathrm{NRMSE}_Q$ by a significant margin. Given that the tests performed on the cells varied substantially, this could indicate that some underlying ageing effects were well explained by the model, e.g., SEI growth, whereas others, such as current collector degradation, were simply not covered; while not a significant phenomenon for most cells, such an effect degrades the cells in which it is present rather quickly. Additionally, there might have been a correlation between model prediction error and temperature simply because these more extreme degradation effects are usually associated with harsh operating conditions at very high or very low temperatures. The accelerating fade towards the end of life that is usually present in some Li-ion cells is not seen in the results simply because the vast majority of the cells did not experience it; thus, it was not significantly present in the dataset. The performance metrics for ∆Q and Q, for both training and validation sets, are shown in Table 5.

Table 5. Training and validation performance.

| Dataset | NRMSE_∆Q | NRMSE_Q |
|---|---|---|
| Training Data | 0.0087 | 0.0353 |
| Validation Data | 0.0084 | 0.0384 |
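The metrics of Equations (19) and (20) can be computed as in the following minimal sketch. The capacity values are hypothetical and only illustrate the calculation; the nominal capacity of 330 mAh is the value stated in the text, and the function names are illustrative.

```python
import math

Q_NOM_MAH = 330.0  # nominal capacity stated in the text


def rmse(measured, predicted):
    """Root mean squared error between measurements and predictions."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


def nrmse(measured, predicted, q_nom=Q_NOM_MAH):
    """RMSE normalised by the nominal capacity."""
    return rmse(measured, predicted) / q_nom


# Hypothetical capacities in mAh for one cell.
q_measured = [330.0, 325.0, 320.0, 316.0]
q_predicted = [330.0, 326.0, 318.0, 316.0]

print(rmse(q_measured, q_predicted), nrmse(q_measured, q_predicted))
```

The same functions apply to the incremental losses ∆Q, again normalised by the nominal capacity.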

[Figure 6 plot: four panels of capacity versus time (days); the panel NRMSE_Q values are listed as 0.0036594, 0.0045392, 0.014468 and 0.019448.]


Figure 6. Predictions (solid) for cells under different average temperatures, delta DoD and numbers of equivalent cycles, with varying levels of accuracy.

[Figure 7 plot: histogram of the number of cells versus NRMSE_Q on the validation set, with the 25%, 50% and 75% quantiles marked.]


Figure 7. Histogram of the validation performance, showing good validation for most cells and some unexplained ageing.

A detail that might go unnoticed is that usually the only signal known a priori for a future load profile is the current. To correctly estimate the voltage signal, an accurate cell model is needed, often with parameters changing over time due to ageing. This might be problematic if the features used to predict ageing require processing an unknown voltage signal. Table 6 shows the results for training and validation performance without considering features associated with the voltage signal, i.e., omitting the average voltage V as a feature. It is visible from the same table that the performance of the model on validation and training decreased only minimally.

A similar argument could be made with respect to the future temperature measurements. However, since the temperature is often controlled in a BMS, and the seasonal average temperatures are available, it is not unreasonable to assume that at least the average temperature of the cell should be known with acceptable precision.


Table 6. Training and validation performance without the voltage signal.

| Dataset | NRMSE_∆Q | NRMSE_Q |
|---|---|---|
| Training Data | 0.0088 | 0.0364 |
| Validation Data | 0.0086 | 0.0395 |

6. Information Analysis

Usually the process to obtain the testing data required to parametrise such ageing models is quite time-consuming and expensive. It becomes paramount then to answer a key question:

• How much data and testing time are needed in order to parametrise the model?

This is usually the goal of "Design of Experiments" approaches, which are performed before the testing phase. However, since the dataset was available here beforehand, the question that will actually be answered here instead is:

• How much of the dataset and how much testing time would have been needed to sufficiently parametrise the model?

The striking difference between these two questions is that for the first, the tests for each cell would have been designed in such a way as to maximise the information, and for the second, the testing was already done. Techniques based on the concept of Fisher information will be used next to analyse the data.

6.1. Fisher Information

The Fisher information matrix, defined in [20] as

$$I(\theta) = E\left\{-\frac{d^2 \ell(\theta)}{d\theta^2}\right\}, \tag{29}$$

where $\ell(\theta)$ denotes the log-likelihood function, θ the parameter vector and E{} the expected value, allows us to quantify the amount of information contained within a dataset, being directly linked to the uncertainty in an estimated parameter vector θ. In the special case where the model is linear in the parameters, i.e., when it can be written as

$$Y = X\theta + \varepsilon, \tag{30}$$

where ε is the measurement error following a normal distribution with variance $\sigma^2$, the Fisher information simplifies to

$$I = \frac{X^T X}{\sigma^2}, \tag{31}$$

being independent from the parameter vector θ. In Design of Experiments, a specific design is said to be efficient when it maximises the information contained within the experiments. Since the Fisher information is a matrix, there are different scalar measures of it that can be maximised or minimised when doing such a task, the so-called optimality criteria. Among those, e.g., A, D and E, the D-optimality criterion was chosen due to the results and good numerical properties, such as the lack of matrix inversion, which other optimality criteria may require. A design is said to be D-optimal when it maximises the determinant

$$D = |X^T X|. \tag{32}$$

This can be interpreted as making the information matrix "big". The matrix X for the ageing model analysis is given in Equation (23); hence, it depends only on extracted features of the measurements and, as mentioned earlier, not on the parameters, being completely independent from the model training step.
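The step from the general definition of the Fisher information to the linear-model result can be made explicit with a short derivation. It is a standard result rather than a reproduction of the original derivation, and it assumes independent, identically distributed Gaussian errors; m denotes the number of parameters, a symbol introduced here for illustration.

```latex
\begin{aligned}
\ell(\theta) &= -\frac{N}{2}\log\left(2\pi\sigma^2\right)
               - \frac{1}{2\sigma^2}\,(Y - X\theta)^T (Y - X\theta), \\
\frac{\partial \ell}{\partial \theta} &= \frac{1}{\sigma^2}\, X^T (Y - X\theta), \\
\frac{\partial^2 \ell}{\partial \theta\, \partial \theta^T} &= -\frac{1}{\sigma^2}\, X^T X, \\
I &= E\left\{-\frac{\partial^2 \ell}{\partial \theta\, \partial \theta^T}\right\}
   = \frac{X^T X}{\sigma^2}, \\
\det\left(\frac{X^T X}{\sigma^2}\right) &= \sigma^{-2m} \det\left(X^T X\right).
\end{aligned}
```

The last line shows that the noise variance only rescales the determinant, so comparing designs through $|X^T X|$ gives the same ranking as comparing them through the determinant of the full information matrix.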


6.2. Optimal Cell Selection

Based on the discussion above, a good heuristic for selecting the minimum number of cells needed to parametrise the model is to find the optimal combination of k cells for the model training, i.e., the one that maximises the information, leaving the remaining cells outside of the training set.

The data were originally split into a validation and a training set, from which training subsets were selected. In each iteration, the cell whose removal led to the smallest loss of information was removed from the training set, as shown in Figure 8. A comparison between the validation RMSE of the capacity when the training subsets were selected at random versus using D-optimality is depicted in Figure 9. The ageing model parameters were estimated based on the partitions of the training set with the k best cells. In this case, selecting the correct 20 cells in the dataset already gave 95% of the validation RMSE that would be achieved by using the complete training set. While the information ratio and validation performance ratio are not the same, this made it possible to obtain a good guess of how many different cells this dataset would need in order to obtain a similar performance with respect to ageing modelling. Another important point is that these results depend on the model structure and on the features that were extracted from the data. When using another ageing modelling approach, the guideline on how to conduct the analysis is still valid; however, the results are not. This means that, with another ageing model, 60 cells instead of 20 might have been needed in order to achieve the same RMSE ratio between the partial and full dataset.
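The iterative removal described above can be sketched as a greedy backward elimination on the log-determinant of $X^T X$. This is a minimal illustration under stated assumptions, not the implementation used for the results: each cell is assumed to contribute a block of rows to X, the noise variance is ignored because it is a common factor, and the function name and toy data are hypothetical.

```python
import numpy as np


def greedy_d_optimal_cells(cell_rows, n_keep):
    """Repeatedly drop the cell whose removal keeps log det(X^T X) largest."""
    # Per-cell information contributions X_c^T X_c.
    info = {}
    for cell, rows in cell_rows.items():
        x_c = np.asarray(rows, dtype=float)
        info[cell] = x_c.T @ x_c
    total = sum(info.values())

    removal_order = []
    while len(info) > n_keep:
        best_cell, best_logdet = None, -np.inf
        for cell, contribution in info.items():
            sign, logdet = np.linalg.slogdet(total - contribution)
            # Skip removals that would make the information matrix singular.
            if sign > 0 and logdet > best_logdet:
                best_cell, best_logdet = cell, logdet
        if best_cell is None:
            break  # no cell can be removed without losing identifiability
        total = total - info.pop(best_cell)
        removal_order.append(best_cell)
    return sorted(info), removal_order


# Toy example with two parameters: cell "C" adds little information.
cells = {"A": [[1.0, 0.0]], "B": [[0.0, 1.0]], "C": [[0.1, 0.1]]}
kept, removed = greedy_d_optimal_cells(cells, n_keep=2)
print(kept, removed)
```

A greedy search does not guarantee the globally D-optimal subset, but it avoids enumerating all combinations of k cells.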

[Figure 8 plot: normalised information versus number of cells for the information-based selection and two random realisations.]


Figure 8. Normalised logarithm of D-optimality when optimal selection was done compared to random realisations.

[Figure 9 plot: validation RMSE_Q versus number of cells for the information-based selection and two random realisations.]

Figure 9. RMSE of the validation set when D-optimal selection was done compared to random realisations.

6.3. Impact of the Testing Time

The 40 best D-optimal cells were selected and the training data were split into intervals corresponding to the total amounts of time available for testing (2 to 20 months). Figure 10 shows the validation RMSE as more data became available, until the full testing time was used. Different cell selections were also investigated to verify that even with the full testing time and wrong cell selection, the validation RMSE did not converge to the lower bound, which was computed using information-based selection. As expected, there was a trade-off between the amount of testing available and the relative error, so the decision on how much testing time was required was made by defining a maximum acceptable relative error, which was 10% relative error for 10 months and 3% for 15 months in this case.
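The decision rule described above can be sketched as a simple threshold search. The relative error is assumed here to be measured against the lower bound, which the text does not define explicitly, and the RMSE values are hypothetical placeholders rather than the values plotted in Figure 10.

```python
def shortest_testing_time(rmse_by_month, lower_bound, max_relative_error):
    """Return the smallest testing time whose RMSE is within the tolerance, or None."""
    for months in sorted(rmse_by_month):
        relative_error = (rmse_by_month[months] - lower_bound) / lower_bound
        if relative_error <= max_relative_error:
            return months
    return None


# Hypothetical validation RMSE values per total testing time (months).
rmse_by_month = {2: 0.30, 5: 0.12, 10: 0.054, 15: 0.051, 20: 0.050}
lower_bound = 0.050  # hypothetical RMSE obtained with the full testing time

print(shortest_testing_time(rmse_by_month, lower_bound, 0.10))
print(shortest_testing_time(rmse_by_month, lower_bound, 0.03))
```

With these placeholder numbers, a looser tolerance accepts a shorter test campaign, mirroring the trade-off discussed in the text.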

[Figure 10 plot: validation RMSE_Q versus testing time (months), showing the lower bound, the information-based selection and two random realisations.]

Figure 10. Validation RMSE versus total testing time for the 40 best cells and different selection strategies.


7. Conclusions

The methodology presented here starts with an analysis of the dataset, the so-called feature engineering steps, which often depend on expert knowledge to extract the correct features, followed by a feature selection procedure and model validation. Then, the model was used to help answer two key questions: How much testing is needed to parametrise such an ageing model? Which tests proved to be more important in doing so? In fact, the results provided an insight into which cells from the dataset were more important. The same applies to the load test types. The key difference is that for the latter, the dataset was already obtained, and the key assumption is that some of these results are extendable to other cell types/chemistries, thereby providing a guideline or good starting point on where to look when designing new tests and defining how many cells should be tested.

Further research on this topic could be done by trying out a more sophisticated ageing modelling approach, since, while the modelling approach presented here works for the vast majority of the cells in the dataset, there are some ageing phenomena that are not explainable using the current methods. On the other hand, this would also imply an information matrix that is possibly dependent on the model parameters, posing an additional challenge. Finally, it is important to point out that the methodology presented here assumes, in the general case, that future signals of voltage, temperature and SoC are available, when in reality only the future current is exactly known a priori. This issue can be addressed by using electrical and thermal models for the cell in order to obtain such signals if they are really needed.

Author Contributions: Conceptualisation, V.D., C.H. and J.G.d.O.J.; methodology, J.G.d.O.J. and C.H.; software, J.G.d.O.J.; validation, J.G.d.O.J.; formal analysis, J.G.d.O.J. and C.H.; investigation, J.G.d.O.J., C.H. and V.D.; resources, V.D. and C.H.; data curation, J.G.d.O.J. and V.D.; writing (original draft preparation), J.G.d.O.J.; writing (review and editing), J.G.d.O.J. and C.H.; visualisation, J.G.d.O.J. and C.H.; supervision, C.H.; project administration, C.H.; funding acquisition, V.D. and C.H. All authors have read and agreed to the published version of the manuscript.

Funding: The financial support by the Austrian Federal Ministry for Digital and Economic Affairs; the National Foundation for Research, Technology and Development; and the Christian Doppler Research Association is gratefully acknowledged.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Restrictions apply to the availability of these data. Data were obtained from AVL List GmbH, Graz, Austria.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Lutsey, N.; Nicholas, M. Update on electric vehicle costs in the United States through 2030. Int. Counc. Clean Transp. 2019, 2, 1–12.
2. Hoke, A.; Brissette, A.; Smith, K.; Pratt, A.; Maksimovic, D. Accounting for Lithium-Ion Battery Degradation in Electric Vehicle Charging Optimization. IEEE J. Emerg. Sel. Top. Power Electron. 2014, 2, 691–700.
3. Plett, G.L. Sigma-point Kalman filtering for battery management systems of LiPB-based HEV battery packs: Part 2: Simultaneous state and parameter estimation. J. Power Sources 2006, 161, 1369–1384.
4. Hametner, C.; Jakubek, S.; Prochazka, W. Data-Driven Design of a Cascaded Observer for Battery State of Health Estimation. IEEE Trans. Ind. Appl. 2018, 54, 6258–6266.
5. Chen, C.; Xiong, R.; Shen, W. A Lithium-Ion Battery-in-the-Loop Approach to Test and Validate Multiscale Dual H Infinity Filters for State-of-Charge and Capacity Estimation. IEEE Trans. Power Electron. 2018, 33, 332–342.
6. Hu, X.; Xu, L.; Lin, X.; Pecht, M. Battery Lifetime Prognostics. Joule 2020, 4, 310–346.
7. Vetter, J.; Novák, P.; Wagner, M.; Veit, C.; Moller, K.C.; Besenhard, J.; Winter, M.; Wohlfahrt-Mehrens, M.; Vogler, C.; Hammouche, A. Ageing mechanisms in lithium-ion batteries. J. Power Sources 2005, 147, 269–281.
8. Ramadass, P.; Haran, B.; Gomadam, P.M.; White, R.; Popov, B.N. Development of First Principles Capacity Fade Model for Li-Ion Cells. J. Electrochem. Soc. 2004, 151, A196–A203.
9. Jin, X.; Vora, A.; Hoshing, V.; Saha, T.; Shaver, G.; García, R.E.; Wasynczuk, O.; Varigonda, S. Physically-based reduced-order capacity loss model for graphite anodes in Li-ion battery cells. J. Power Sources 2017, 342, 750–761.
10. Wang, J.; Liu, P.; Hicks-Garner, J.; Sherman, E.; Soukiazian, S.; Verbrugge, M.; Tataria, H.; Musser, J.; Finamore, P. Cycle-life model for graphite-LiFePO4 cells. J. Power Sources 2011, 196, 3942–3948.
11. Schmalstieg, J.; Käbitz, S.; Ecker, M.; Sauer, D.U. A holistic aging model for Li(NiMnCo)O2 based 18650 lithium-ion batteries. J. Power Sources 2014, 257, 325–334.
12. Richardson, R.R.; Osborne, M.A.; Howey, D.A. Battery health prediction under generalized conditions using a Gaussian process transition model. J. Energy Storage 2019, 23, 320–328.
13. Wu, X.; Li, X.; Du, J. State of Charge Estimation of Lithium-Ion Batteries Over Wide Temperature Range Using Unscented Kalman Filter. IEEE Access 2018, 6, 41993–42003.
14. Hametner, C.; Jakubek, S. State of charge estimation for Lithium Ion cells: Design of experiments, nonlinear identification and fuzzy observer design. J. Power Sources 2013, 238, 413–421.
15. Santhanagopalan, S.; Smith, K.; Neubauer, J.; Kim, G.H.; Keyser, M.; Pesaran, A. Design and Analysis of Large Lithium-Ion Battery Systems, 1st ed.; Artech House: Norwood, MA, USA, 2015; Volume 4.
16. Friedman, H.; Tibshirani. The Elements of Statistical Learning, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2008.
17. Ferreira, A.J.; Figueiredo, M.A. Efficient feature selection filters for high-dimensional data. Pattern Recognit. Lett. 2012, 33, 1794–1804.
18. Liu, H.; Motoda, H. Feature Selection for Knowledge Discovery and Data Mining, 1st ed.; Springer: New York City, NY, USA, 1998.
19. Boyd, S.; Vandenberghe, L. Convex Optimization, 7th ed.; Cambridge University Press: Cambridge, UK, 2009.
20. Davison, A.C. Statistical Models, 1st ed.; Cambridge University Press: Cambridge, UK, 2003.