Detections and SIR simulations of the COVID-19 pandemic waves in Ukraine


 
 Background. The COVID-19 pandemic is still far from stabilizing. Of particular concern are the sharp increases in the number of cases in June-July and September-October 2020 and in February-March 2021. The causes and consequences of these increases are still to be investigated, but there is already an urgent need to assess the possible duration of the pandemic and the expected numbers of patients and deaths. Correct simulation of infectious disease dynamics requires complicated mathematical models and considerable effort to identify unknown parameters. Constant changes in the pandemic conditions (in particular, the peculiarities of quarantine and its violation, and the situation with testing and isolation of patients) cause various epidemic waves and lead to changes in the parameter values of the mathematical models.
 
 Objective. In this article, pandemic waves in Ukraine will be detected, calculated and discussed. The estimations for durations and final sizes of the epidemic waves will be presented.
 
 Methods. We propose a simple method for detecting epidemic waves, based on differentiation of the smoothed number of cases. We use the generalized SIR (susceptible-infected-removed) model for the dynamics of the epidemic waves, together with the known exact solution of the SIR differential equations and a statistical approach to parameter identification. Different data sets for the accumulated number of cases are used in order to compare the results of simulations and predictions.
 
 Results. Nine pandemic waves were detected in Ukraine and corresponding optimal values of the SIR model parameters were identified. The number of cases and the number of patients spreading the infection versus time were calculated. In particular, the pandemic in Ukraine probably began in January 2020. If current trends continue, the end of the pandemic should be expected no earlier than in summer 2021.
 
 Conclusions. Differentiation of the smoothed number of cases, the SIR model and the statistical approach to parameter identification help to identify COVID-19 pandemic waves and to make some reliable estimations and predictions. The obtained information will be useful for regulating quarantine activities and for predicting the medical and economic consequences of the pandemic.


Introduction
Here we consider the COVID-19 pandemic dynamics in Ukraine with the use of the official WHO data sets on the confirmed number of cases [1], the Ukrainian national statistics (UNS) [2, 3] and the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) [4]. The classical SIR model, connecting the numbers of susceptible S, infected and spreading the infection I, and removed R persons, was developed in [5,6,7]. The unknown parameters of this model can be estimated with the use of the cumulative number of cases V=I+R and the statistics-based method of parameter identification [8,9].
This approach was used in [9,10,11,12,13,14,15,16,17,18,19,20] to estimate the first waves of pandemic dynamics in Ukraine, the city of Kyiv, China, the Republic of Korea, Italy, Austria, Spain, Germany, France, the Republic of Moldova, the UK, USA and in the world. Usually the number of cases registered during the initial period of an epidemic is not reliable, since many infected persons are not detected. That is why the correct estimations of epidemic parameters can be done with the use of data sets obtained for later periods of the epidemic when the number of detected cases is closer to the real one. On the other hand, changes in quarantine conditions, human behavior, pathogen activity, weather etc. can cause changes in the course of the pandemic, namely the so-called epidemic waves. The mathematical simulation of these waves needs development of some criteria of the waves detection, modification of models and methods of parameter identification.
In this paper, simple criteria for identifying subsequent pandemic waves are discussed. The SIR model and the parameter identification procedure are modified in order to simulate these waves. The results of calculations for nine waves of the COVID-19 pandemic in Ukraine are presented and discussed.

Data
The accumulated numbers of confirmed COVID-19 cases V j in Ukraine from the WHO daily situation reports [1], UNS [2, 3] (denoted by V j and V j ) and JHU [4] (denoted V j ) are presented in Tables 1-3. The corresponding moments of time t j (measured in days; the zero point corresponds to January 20, 2020) are also shown in Tables 1 and 2. All these data sets were used for the identification of the epidemic waves. Only some time periods, corresponding to specific waves, were used for the SIR simulations; the other values were used only to verify the simulations.

Epidemic waves detection
Changes in quarantine conditions, human behavior, pathogen activity, weather etc. can cause changes in the epidemic dynamics, namely the so-called epidemic waves. The simplest way to detect them is to find changes in the dependence of the number of registered cases on time. Since the number of cases is random, its time dependence needs some smoothing. We can use the method proposed in [19,20]:
$$\bar V_i = \frac{1}{7}\sum_{j=i-3}^{i+3} V_j \tag{1}$$
and the derivatives of the smoothed values
$$\left.\frac{d\bar V}{dt}\right|_{t=t_i} \approx \frac{1}{2}\left(\bar V_{i+1} - \bar V_{i-1}\right) \tag{2}$$
$$\left.\frac{d^2\bar V}{dt^2}\right|_{t=t_i} \approx \bar V_{i+1} - 2\bar V_i + \bar V_{i-1} \tag{3}$$
in order to detect the changes in the epidemic dynamics. Some mathematical background of this method can be found in [21,22,23,24].
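The detection step can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes a centered seven-day moving average for the smoothing and unit-step (one-day) central differences for the derivatives. The function names `smooth` and `derivatives` and the synthetic series `cases` are hypothetical and are not taken from the data in Tables 1-3.

```python
def smooth(v, half_window=3):
    """Centered moving average over 2*half_window+1 days (assumed smoothing rule).

    Returns a dict {day index: smoothed value}; days too close to either
    end of the series, where the window does not fit, are skipped.
    """
    width = 2 * half_window + 1
    return {
        i: sum(v[i - half_window:i + half_window + 1]) / width
        for i in range(half_window, len(v) - half_window)
    }


def derivatives(vbar):
    """First and second central differences of the smoothed values (step = 1 day)."""
    d1, d2 = {}, {}
    for i in vbar:
        if i - 1 in vbar and i + 1 in vbar:
            d1[i] = 0.5 * (vbar[i + 1] - vbar[i - 1])
            d2[i] = vbar[i + 1] - 2.0 * vbar[i] + vbar[i - 1]
    return d1, d2


# Synthetic cumulative counts (NOT real data): 10 new cases per day up to
# day 14, then 30 new cases per day, which mimics the start of a new wave.
cases = [10 * t for t in range(15)] + [140 + 30 * (t - 14) for t in range(15, 30)]

vbar = smooth(cases)
d1, d2 = derivatives(vbar)

# Days on which the second derivative is clearly non-zero mark the change.
jump_days = [i for i in sorted(d2) if abs(d2[i]) > 1e-9]
print(jump_days)
```

In this toy series the first derivative moves from 10 to 30 cases per day, and the second derivative is non-zero only in the seven-day window around the change of slope, which is the kind of jump used in the text to separate waves.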
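In the generalized SIR model, the numbers of susceptible S, infected I and removed R persons in the i-th epidemic wave satisfy the differential equations
$$\frac{dS}{dt} = -\alpha_i S I \tag{4}$$
$$\frac{dI}{dt} = \alpha_i S I - \rho_i I \tag{5}$$
$$\frac{dR}{dt} = \rho_i I \tag{6}$$
The parameters $\rho_i$ characterize the patient removal rates, since eq. (6) gives the rate of increase of R. The inverse values $1/\rho_i$ are estimates of the time of spreading the infection τ. So we are interested in increasing the values of $\rho_i$ and decreasing $1/\rho_i$. People and public authorities should work on this and organize immediate isolation of suspicious cases.
Since the derivative d(S + I + R)/dt is equal to zero (this follows from summing Eqs. (4)-(6)), the sum
$$N_i = S + I + R \tag{7}$$
must be constant for every wave and is not the volume of the population (see also [20]).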
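As a numerical illustration of this conservation property, the sketch below integrates the SIR equations with a classical fourth-order Runge-Kutta step and checks that S + I + R stays constant. It assumes the standard form dS/dt = -αSI, dI/dt = αSI - ρI, dR/dt = ρI. All names and parameter values are hypothetical, chosen for illustration only and not fitted to any wave.

```python
def sir_rhs(s, i, r, alpha, rho):
    """Right-hand sides of the SIR equations (assumed standard form)."""
    new_infections = alpha * s * i
    removals = rho * i
    return -new_infections, new_infections - removals, removals


def rk4_step(state, dt, alpha, rho):
    """One classical Runge-Kutta step for the state (S, I, R)."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))

    k1 = sir_rhs(*state, alpha, rho)
    k2 = sir_rhs(*shifted(state, k1, dt / 2), alpha, rho)
    k3 = sir_rhs(*shifted(state, k2, dt / 2), alpha, rho)
    k4 = sir_rhs(*shifted(state, k3, dt), alpha, rho)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


# Hypothetical parameters: N = 10000 persons, one initial spreader.
alpha, rho = 5e-5, 0.2          # 1/rho = 5 days of spreading the infection
state = (9999.0, 1.0, 0.0)      # (S, I, R) at the start of the wave
total = sum(state)

steps_per_day, days = 20, 120
dt = 1.0 / steps_per_day
for _ in range(days * steps_per_day):
    state = rk4_step(state, dt, alpha, rho)

s, i, r = state
print(f"S={s:.1f} I={i:.3f} R={r:.1f} drift={abs(s + i + r - total):.2e}")
```

The drift printed at the end is only rounding error, because the three right-hand sides sum to zero at every stage of the step. A larger ρ (a shorter spreading time 1/ρ) leaves more susceptible persons at the end of the wave.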

Analytical solution of SIR equations
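To determine the initial conditions for the set of equations (4)-(6), let us suppose that at the beginning of every epidemic wave $t_i^*$:
$$I(t_i^*) = I_i, \quad R(t_i^*) = R_i, \quad S(t_i^*) = N_i - I_i - R_i \tag{8}$$
In particular, when the first wave of the epidemic starts with one infected person, the initial conditions (8) can be written as follows:
$$I_1 = 1, \quad R_1 = 0, \quad S(t_1^*) = N_1 - 1 \tag{9}$$
Equations (9) were used in [9,10,11,12,13,14,15,16,17,18,19,20] to simulate the first waves of the COVID-19 pandemic in different countries.
It follows from (4) and (5) that
$$\frac{dI}{dS} = \frac{\nu_i}{S} - 1, \qquad \nu_i = \frac{\rho_i}{\alpha_i} \tag{10}$$
Integration of (10) with the initial conditions (8) yields:
$$I = \nu_i \ln S - S + N_i - R_i - \nu_i \ln(N_i - I_i - R_i) \tag{11}$$
It follows from (11) that the function I has a maximum at $S = \nu_i$ and tends to zero at infinity. The corresponding number of susceptible persons at infinity $S_{i\infty} > 0$ can be calculated from the non-linear equation
$$S_{i\infty} = (N_i - I_i - R_i)\exp\left(\frac{S_{i\infty} - N_i + R_i}{\nu_i}\right) \tag{12}$$
Formula (12) follows from (11) at I=0.
As in [8], we solve (4)-(6) by introducing the function V(t) = I(t) + R(t), corresponding to the number of victims or the cumulative confirmed number of cases. It follows from (5)-(7) and (11) that:
$$\frac{dV}{dt} = \alpha_i (N_i - V)\left[\nu_i \ln(N_i - V) + V - R_i - \nu_i \ln(N_i - I_i - R_i)\right] \tag{13}$$
Integration of (13) yields an analytical solution for the set of equations (4)-(6):
$$F_{1i}(V, N_i, I_i, R_i, \nu_i) = \alpha_i (t - t_i^*) \tag{14}$$
$$F_{1i} = \int_{R_i + I_i}^{V} \frac{dU}{(N_i - U)\left[\nu_i \ln(N_i - U) + U - R_i - \nu_i \ln(N_i - I_i - R_i)\right]} \tag{15}$$
Thus, for every set of parameters $N_i$, $I_i$, $R_i$, $\nu_i$, $\alpha_i$, $t_i^*$ and a fixed value of V, the integral (15) can be calculated and the corresponding moment of time can be determined from (14). Then the functions I(t) and R(t) can be easily calculated with the use of formula (11) and:
$$S = N_i - V, \qquad R = V - I \tag{16}$$
The final number of victims (the final accumulated number of cases in the i-th epidemic wave) can be calculated from:
$$V_{i\infty} = N_i - S_{i\infty} \tag{17}$$
To estimate the final day $t_{if}$ of an epidemic wave, we can use the condition:
$$V(t_{if}) = V_{i\infty} - 1 \tag{18}$$
which means that at $t > t_{if}$ less than one person still spreads the infection.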
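The way the exact solution is evaluated in practice can be sketched as follows. This is a hedged illustration, not the author's code. It assumes that I depends on S as I = ν ln S - S + N - R_i - ν ln(N - I_i - R_i), which follows from integrating dI/dS = ν/S - 1. It also assumes that time is recovered from the integral of dU / ((N - U) I(N - U)) divided by α. The final day uses the assumed criterion V = V∞ - 1. Bisection and composite Simpson's rule are swapped in here as simple numerical tools. All names and parameter values are hypothetical.

```python
import math


def infected(s, n, i0, r0, nu):
    """I as a function of S (assumed closed form, see the lead-in)."""
    return nu * math.log(s) - s + n - r0 - nu * math.log(n - i0 - r0)


def s_infinity(n, i0, r0, nu):
    """Susceptibles left at the end of the wave: root of I(S) = 0 below the peak."""
    hi = min(nu, n - i0 - r0)      # I > 0 here; I increases with S on (0, nu)
    lo = hi
    while infected(lo, n, i0, r0, nu) > 0.0:
        lo /= 2.0                  # shrink until I(lo) < 0
    for _ in range(200):           # plain bisection
        mid = 0.5 * (lo + hi)
        if infected(mid, n, i0, r0, nu) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def f1(v, n, i0, r0, nu, panels=20000):
    """Integral of dU / ((N-U) * I(N-U)) from I0+R0 to V (composite Simpson)."""
    a = i0 + r0
    h = (v - a) / panels           # panels must be even

    def g(u):
        return 1.0 / ((n - u) * infected(n - u, n, i0, r0, nu))

    total = g(a) + g(v)
    for k in range(1, panels):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0


def time_of(v, t_start, alpha, n, i0, r0, nu):
    """Day on which the cumulative number of cases reaches v."""
    return t_start + f1(v, n, i0, r0, nu) / alpha


# Hypothetical wave parameters, for illustration only.
n, i0, r0, nu, alpha = 10000.0, 1.0, 0.0, 4000.0, 5e-5

s_inf = s_infinity(n, i0, r0, nu)
v_inf = n - s_inf                                  # final size of the wave
t_half = time_of(0.5 * v_inf, 0.0, alpha, n, i0, r0, nu)
t_end = time_of(v_inf - 1.0, 0.0, alpha, n, i0, r0, nu)  # assumed end criterion
print(round(v_inf), round(t_half, 1), round(t_end, 1))
```

The integrand changes quickly near both limits, where I is small, so a fine grid is used. No differential equation is solved numerically: each requested value of V costs one quadrature, which is what makes scanning many parameter sets cheap.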

Parameter identification procedures
In the case of a new epidemic, the values of its six independent parameters are unknown and must be identified with the use of limited data sets. For the first wave of an epidemic starting with one infected person, the number of unknown parameters is only four, since $I_1 = 1$ and $R_1 = 0$. The corresponding statistical approach was used in [8,9,10,11,12,13,14,15,16,17,18,19,20] to estimate the values of the four unknown parameters. For the next epidemic waves (i > 1), the moments of time $t_i^*$ corresponding to their beginning are known. Therefore the exact solution (14)-(16) depends only on five parameters: $N_i$, $I_i$, $R_i$, $\nu_i$, $\alpha_i$. Then the registered numbers of victims $V_j$ corresponding to the moments of time $t_j$ can be used in eq. (15) in order to calculate the values of the integral $F_{1i}(V_j, N_i, I_i, R_i, \nu_i)$ for every fixed set of values $N_i$, $\nu_i$, $I_i$, $R_i$, and then to check how well the registered points fit the straight line (14).
Eq. (14) can be rewritten as follows:
$$F_{1i}(V, N_i, I_i, R_i, \nu_i) = \gamma t + \beta, \qquad \gamma = \alpha_i, \quad \beta = -\alpha_i t_i^* \tag{19}$$
We can estimate the values of the parameters γ and β by treating the values $F_{1i}(V_j, N_i, I_i, R_i, \nu_i)$ and the corresponding time moments $t_j$ as random variables. Then we can use the observations of the accumulated number of cases and linear regression in order to calculate the coefficients $\hat\gamma$ and $\hat\beta$ of the regression line using the standard formulas from, e.g., [25]. The values $\hat\gamma$ and $\hat\beta$ can be treated as statistics-based estimations of the parameters γ and β from relationships (19). The reliability of the method can be checked by calculating the correlation coefficient $r_i$ (see, e.g., [25]) for every epidemic wave and checking how close its value is to unity. We can also use the F-test for the null hypothesis that the proposed linear relationship (19) fits the data set. The experimental values of the Fisher function can be calculated for every epidemic wave with the use of the formula:
$$F_i = \frac{r_i^2 (n_i - m)}{(1 - r_i^2)(m - 1)}$$
where $n_i$ is the number of observations for the i-th epidemic wave and m = 2 is the number of parameters in the regression equation. The corresponding experimental value $F_i$ has to be compared with the critical value $F_C(k_1, k_2)$ of the Fisher function at a desired significance or confidence level ($k_1 = m - 1$, $k_2 = n_i - m$).
When the values $n_i$ and m are fixed, the maximum of the Fisher function coincides with the maximum of the correlation coefficient. Therefore, to find the optimal values of the parameters $N_i$, $\nu_i$, $I_i$, $R_i$, we have to find the maximum of the correlation coefficient for the linear dependence (19). To compare the reliability of different predictions (with different values of $n_i$), it is useful to use the ratio $F_i/F_C(1, n_i - 2)$ at a fixed significance level. We will use the level 0.001; the corresponding values of $F_C(1, n_i - 2)$ can be taken from [26]. The most reliable prediction yields the highest $F_i/F_C(1, n_i - 2)$ ratio.
The exact solution (14)-(16) allows us to avoid numerical solution of the differential equations (4)-(6) and significantly reduces the time spent on calculations. In the case of sequential calculation of epidemic waves i = 1,2,3 . . . , it is possible to avoid determining all four optimal unknown parameters $N_i$, $\nu_i$, $I_i$, $R_i$, thereby reducing the amount of calculations and the difficulties in isolating a maximum of the correlation coefficient. For the parameters $I_i$, $R_i$ it is possible to use the values of I and R calculated for the previous epidemic wave at the moment of time when the following wave began. Then we need to calculate the values (20), the correlation coefficient $r_i$ and the ratio $F_i/F_C(1, n_i - 2)$, and to isolate the values of the parameters $N_i$ and $\nu_i$ corresponding to the maximum of $r_i$. Knowing the optimal values of the five parameters $N_i$, $I_i$, $R_i$, $\nu_i$, $\alpha_i$, the SIR curves and other characteristics of the corresponding epidemic wave can be calculated with the use of formulas (10)-(18). This approach has been successfully used in [20]. In particular, six waves of the COVID-19 epidemic in Ukraine and four pandemic waves in the world were calculated.
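The regression step for one trial set of parameters can be sketched as below. This is a minimal illustration under stated assumptions: the values of the integral (15) at the observation days are replaced by synthetic numbers lying near a straight line. The slope is read as the estimate of α and -β/γ as the start of the wave. The Fisher value is computed as r²(n - m)/((1 - r²)(m - 1)) with m = 2. The names `linear_fit`, `fisher_value` and `F_CRIT` are hypothetical, and the critical value is an approximate tabulated number for 1 and 12 degrees of freedom.

```python
import math


def linear_fit(t, f):
    """Least-squares line f = gamma*t + beta and the correlation coefficient r."""
    n = len(t)
    mean_t = sum(t) / n
    mean_f = sum(f) / n
    s_tt = sum((x - mean_t) ** 2 for x in t)
    s_ff = sum((y - mean_f) ** 2 for y in f)
    s_tf = sum((x - mean_t) * (y - mean_f) for x, y in zip(t, f))
    gamma = s_tf / s_tt
    beta = mean_f - gamma * mean_t
    r = s_tf / math.sqrt(s_tt * s_ff)
    return gamma, beta, r


def fisher_value(r, n, m=2):
    """Experimental F value of the regression (assumed standard formula)."""
    return r * r * (n - m) / ((1.0 - r * r) * (m - 1))


# Synthetic example (NOT real data): 14 daily points lying close to the
# line 0.3*t - 1.5, standing in for the values of the integral (15).
days = list(range(14))
f_values = [0.3 * t - 1.5 + 0.01 * (-1) ** t for t in days]

gamma, beta, r = linear_fit(days, f_values)
alpha_est = gamma                  # the slope estimates alpha
t_start_est = -beta / gamma        # the intercept gives the start of the wave
f_exp = fisher_value(r, len(days))
F_CRIT = 18.64                     # approximate tabulated F(1, 12) at level 0.001
print(alpha_est, t_start_est, r, f_exp / F_CRIT)
```

In a full identification run this fit would sit inside a search over the trial parameters (for example N and ν), keeping the set that gives the largest correlation coefficient.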
Segmentation of epidemic waves and their sequential SIR simulations require a lot of effort. To avoid this, a new method of obtaining the optimal values of the SIR parameters was proposed in [20]. First of all, we can use the relationship
$$V_i = I_i + R_i$$
To estimate the value $V_i$, we can use the smoothed accumulated number of cases (e.g., formula (1)). Then
$$I_i + R_i = \bar V_i$$
where i corresponds to the moment of time $t_i^*$. To obtain one more relationship, let us use (5)-(7). Then
$$\left.\frac{dV}{dt}\right|_{t=t_i^*} = \alpha_i I_i (N_i - I_i - R_i)$$
To estimate the average number of new cases dV/dt at the moment of time $t_i^*$, we can use (2). Thus we have only two independent parameters, $N_i$ and $\nu_i$. To calculate the value of the parameter $\alpha_i$, some iterations can be used (see details in [20]). In this study we will use both methods of identification of the SIR model parameters.

Results and Discussion
Detection of the COVID-19 pandemic waves in Ukraine
Applications of formulae (1)-(3) to the pandemic dynamics in Ukraine are shown in Figs. 1 and 2. The accumulated numbers of cases ("circles") were smoothed with the use of eq. (1) and are shown by lines. "Triangles" and "stars" represent the results of differentiation (2) and (3), respectively. To make the results more visible, the first derivative (2) is multiplied by 100 and the second one (formula (3)) by 1000. Fig. 1 demonstrates that the second derivatives increase after the epidemic outbreak, then become smaller and negative. Such behavior is typical for the first wave of the epidemic (before May 17, 2020). The jumps in the values of the second derivative indicate changes in the conditions of the pandemic (for example, the weakening of quarantine) and the transitions to the next waves with other values of the parameters of mathematical models. These jumps occurred on May 16, May 29, June 8, July 3 and July 19-20 (see Fig. 1). Therefore these days can be treated as the beginning of the second, third etc. waves.
[Figure caption: accumulated number of cases smoothed with eq. (1). Red markers represent the first derivative (eq. (2)), black ones show the second derivative (eq. (3)).]
The second epidemic wave in Ukraine was caused by the cancellation of the national lockdown on May 10. After the incubation period (approximately after May 16), the number of new cases began to grow faster. Further waves of the epidemic in Ukraine are associated with further easing of quarantine and mass non-compliance with social distancing. The fifth epidemic wave in Ukraine can be explained by the consequences of the holiday season, which increased the number of trips and violations of social distancing.
The COVID-19 pandemic characteristics for Ukraine in autumn 2020 are shown in Fig. 2. Differentiation of the smoothed number of accumulated cases (eq. (1), line) with the use of formulas (2) ("triangles") and (3) ("stars") allows us to detect the changes in epidemic dynamics. It can be seen that after November 25 the average daily number of new cases ("triangles") started to decrease. Similar short periods of epidemic stabilization occurred in May, June and August 2020 (see Fig. 1).
The values of the second derivative (3) allow us to detect the changes in the epidemic characteristics and to separate its different waves. The jump in the $d^2V/dt^2$ values corresponding to September 11 (see "stars" in Fig. 2) can be explained by the beginning of classes in schools and universities (on September 1). Children and young people are often asymptomatic carriers of the infection and bring it to their families. For example, employees of two kindergartens and two schools in the Ukrainian city of Chmelnytskii were tested for antibodies to COVID-19 [27]. In total, 292 people work in the surveyed institutions. Some of the staff had already fallen ill with COVID-19 or had been hospitalized and were therefore not tested. Of the 241 educators tested, antibodies were detected in 148, or 61.4%. These results indicate the important role of children in the spread of COVID-19 infection and show that the number of people in Ukraine who have been ill and have antibodies to the coronavirus is obviously much higher than the official statistics (presented in Tables 1-3) state.
Severe jumps in the $d^2V/dt^2$ values also occurred in October and November 2020 (see "stars" in Fig. 2). Probably, this is due to the local elections and a presidential poll, which were held throughout Ukraine on October 25, 2020 and involved hundreds of thousands of people in campaigning and in the work of election commissions (their number was about 30 thousand). This obviously increased the number of contacts and the likelihood of additional infections. The corresponding seventh epidemic wave in Ukraine was considered in [20].
In November 2020, the Ukrainian government introduced a weekend lockdown. In the period from 00:00 on Saturday to 00:00 on Monday, from November 14 to November 30, the following were prohibited: the work of catering establishments (except for takeaway services), the work of shopping and entertainment centers and entertainment establishments, and the activities of economic entities engaged in trade and consumer services, except for food trade in retail space at least 60 percent of which is intended for trade in food, fuel, medicines and medical devices, veterinary drugs and feed. It was also prohibited to operate cultural institutions and hold cultural events. Gyms, fitness centers and swimming pools also had to be closed [28]. The positive results of this lockdown are probably visible in Fig. 2: between November 21 and December 10 (after some incubation period) we can see a decrease in the second derivative ("stars").
[Figure caption: accumulated number of cases smoothed with eq. (1). "Triangles" show the first derivative (eq. (2)) multiplied by 100, "stars" show the second derivative (eq. (3)) multiplied by 1000.]
The SIR simulations of corresponding epidemic waves in Ukraine will be presented in the next Section. In some cases the distance between jumps of the second derivative was too small to make statistical estimates of the parameters (e.g., in the second half of May 2020), so the individual waves of the epidemic during these periods were not isolated.
The COVID-19 pandemic characteristics for Ukraine in December 2020 and January 2021 are shown in Figs. 3 and 4. Red color corresponds to the national statistics (UNS) [2, 3]; black corresponds to the JHU data [4]. "Circles" show the corresponding accumulated numbers of cases; lines represent the smoothed number of accumulated cases (eq. (1)); "crosses" show the first derivative (eq. (2)) multiplied by 10; "dots" show the second derivative (eq. (3)) multiplied by 1000. Fig. 3 illustrates the data sets presented in Table 3. A one-day shift can be seen between the values of the second derivatives calculated with the use of the different data sets. If the numbers of cases reported by JHU are attributed to the previous day, the differences in the values of the second derivatives become almost imperceptible (see Fig. 4), but the numbers of accumulated cases according to JHU are still much higher than for UNS (see Fig. 4).
The jump of the second derivatives that occurred on January 10-12, 2021 (see "dots" in Fig. 4) can be explained by the New Year and Christmas celebrations. The increase in the number of contacts caused an increase in the number of new cases after some incubation period (see "crosses" in Figs. 3 and 4). The national lockdown in the period January 8-24, 2021 stopped this tendency.
To illustrate the influence of data on the results of SIR simulations, different estimations of the first epidemic wave in Ukraine are presented in Table 4. It can be seen that the use of more recent (and complete) data has changed the estimation of the pandemic beginning. Table 4 illustrates that prediction 21, calculated with the use of the number of cases from the period May 3-16 (immediately before the start of the second wave), yields a much longer hidden period of the epidemic outbreak in Ukraine in comparison with the previous prediction 8 [16]. Prediction 19 [19] yields an even longer hidden period, but it was obtained with the use of the data set for the period May 13-26, which corresponds to the transition from the first to the second wave. This prediction yielded the smallest values of $F_i$ and $F_i/F_C(1, n_i - 2)$ in comparison with predictions 8 and 21 (see Table 4). The maximum values of these parameters, obtained for prediction 21, demonstrate that this prediction, which places the epidemic outbreak in Ukraine at the beginning of January 2020, is probably the most reliable. Similar simulations of the global dynamics show that the COVID-19 pandemic probably started in the beginning of August 2019 [20].
The characteristics of the pandemic waves 2-4 for Ukraine are presented in Table 5. The results of SIR simulations of the next pandemic waves in Ukraine are shown in Tables 6 and 7 and in Figs. 5-7. It can be seen that the optimal values of the model parameters are rather different for different pandemic waves. In particular, the final sizes and durations of the pandemic differ significantly. This is not surprising, since different time periods $T_{ci}$ with different conditions were used for the calculations. The value $n_i = 14$ and the general SIR model were used in all cases. The periods $T_{ci}$ taken for calculations were selected according to the time periods of the corresponding epidemic waves (see Tables 5-7). Fig. 5 illustrates the first six epidemic waves in Ukraine.
The accumulated number of cases V=I+R (solid lines) increases with each subsequent wave. Every new wave also increases the number of infected persons spreading the infection, I. The calculated dependences 10*I(t) are shown by dashed lines. The accuracy of the simulations is rather good, especially for the second and further waves of the pandemic, since the blue "stars" (showing the accumulated numbers of confirmed cases used only to check the calculations) are located very close to the corresponding solid lines showing the calculated V=I+R values. There are some discrepancies for the early stages of the first waves, when the number of registered cases is lower than the real one due to problems with the identification of infected persons.
Fig. 6 illustrates the rather good accuracy of the SIR simulations. In particular, the V j values ("triangles") started to deviate from the solid red line only after September 16. This curve was calculated with the use of the data set from the period August 9-22. The deviation can be explained by the beginning of classes in schools and universities (as mentioned above): the increased number of contacts caused a rapid increase in the number of cases. This increase intensified in October and November 2020 because of the elections and a presidential poll. The very irregular nature of the epidemic dynamics (particularly the large values of the second derivative, see "stars" in Fig. 1) led to the fact that the V=I+R curve for the seventh wave very quickly began to deviate from the recorded number of cases V j (compare the solid black line and "stars" in Fig. 6). The absence of sharp jumps of the second derivative after November 21, 2020 (see "stars" in Fig. 2) allowed us to predict the further epidemic dynamics quite accurately (compare the solid blue line and "stars" in Fig. 6). The blue dashed line in Fig. 6 shows that the number of persons spreading the infection diminished in December 2020.
It can be seen that after December 16, the number of reported cases slightly exceeds the theoretical estimates (compare the blue solid line and "stars" in Fig. 6). As the calculations of the eighth epidemic wave were carried out for the period influenced by the weekend lockdown, its cancellation probably led to an increase in the number of cases.
The results of SIR simulations with the use of two data sets (UNS and JHU, presented in Table 3) are shown in Table 7 and Fig. 7. The optimal parameters of SIR model and other epidemic characteristics can be compared with the results obtained for the eighth wave (see Table 7). The number of observations taken for calculations n i was 14 in all the cases.
It can be seen that the two data sets yield rather different values of the optimal parameters for the ninth wave (especially for $N_9$, $S_{9\infty}$ and $\nu_9$); nevertheless, the final sizes of this wave $V_{9\infty}$ and the values of $\rho_9$ are rather close. The duration based on the national statistics is one month longer than in the calculations based on the JHU data set. Both simulations for the ninth epidemic wave in Ukraine yield slightly higher final sizes in comparison with the eighth wave (see Table 7). It can be seen that the accuracy of the simulations based on the national statistics is rather good (the deviations between the red "stars" and the red solid line are small). The use of the JHU data set yields worse accuracy. Nevertheless, the real numbers of cases already exceed the predicted saturation levels for both data sets and the corresponding simulations.
[Figure caption: SIR simulations for the two data sets from Table 7: national statistics [2, 3] (red) and the data set reported by JHU [4] (black). Numbers of victims V(t)=I(t)+R(t): solid lines; numbers of infected and spreading I(t) multiplied by 10: dashed lines; derivatives dV/dt (eq. (13)) multiplied by 10: dotted lines. Markers show the accumulated numbers of cases V j and V j from Table 3. "Circles" correspond to the accumulated numbers of cases taken for calculations (during the period of time T c ); "triangles" to the numbers of cases before T c ; "stars" to the numbers of cases after T c .]
The smoothed dependence of the accumulated number of cases and its differentiation can provide fairly accurate and useful information about the course of the epidemic, identify important changes in its dynamics and support timely recommendations for quarantine measures or control of social distancing. To simulate different pandemic waves (periods with more or less constant values of the parameters of its dynamics), the generalized SIR model and the presented procedures for identifying its parameters can be applied.
Incomplete data and different methods of recording cases can give quite different values of model parameters and forecasts, as was demonstrated by the example of two data sets for Ukraine. The forecast of the duration of the epidemic in Ukraine is not optimistic. New cases will appear at least until July 2021. Let us hope that vaccinations can change these sad predictions. The obtained results can also be useful for assessing the effectiveness of mass vaccination in Ukraine and other countries.
The very long duration of the pandemic requires us to correct our behavior; we cannot live as we did before it occurred. A decreased sense of danger and non-compliance with social distancing may further increase the pandemic duration and the number of coronavirus victims. Total closure of settlements or regions can be recommended only in the event of a sharp increase in the number of cases. There are many things that can be done without loss to the economy and our daily lives:
1. Minimize the number of contacts and trips, and do not visit crowded places. Work and study remotely where possible.
2. Refrain from shaking hands and kissing during meetings. Use masks in transport and crowded areas.
3. If you (or others) have any suspicious symptoms, do your best to avoid the spread of the infection.