7-day regression: linear and log-linear
For each county, the time series of cumulative confirmed cases (CCC) is fitted over the latest seven days of data using both linear and log-linear regression. The linear regression extrapolates the pandemic trend over a short-term duration.
In the log-linear regression, the common logarithm of CCC is regressed against time, i.e., the number of days after the first case occurred. The logarithmic transformation is used because pandemic time series often follow exponential growth or decay over a medium-term duration.
A general equation for the linear regression is\[y=ax+b\]where \(a\) and \(b\) are the slope and intercept of the linear equation. In our case, the common logarithm of CCC and the reduced time \(t\), i.e., the number of days after the first occurrence, take the roles of \(y\) and \(x\), respectively, so that\[\log_{10}(\mathrm{CCC}) = At+B\]where \(A\) and \(B\) are calculated using the data points of the last seven days to include weekly variations. Using the calculated values of \(A\) and \(B\), the CCC is forecasted using\[\mathrm{CCC} = 10^{At+B}\]and the numerical values are plotted as a prediction/forecasting line together with the reported CCC data.
The standard regression process calculates \(a\) and \(b\). Because we use only seven days of data, the two regressions provide similar forecasts for the next few days. The linear and log-linear regression values for the past seven days are superimposed on the scatter plot of the CCC time series.
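The two 7-day fits described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the study: the function names and the synthetic series (a hypothetical 2% daily growth) are invented here for demonstration only.

```python
import math

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def fit_last_seven_days(days, ccc):
    """Fit the linear (a, b) and log-linear (A, B) models to the latest 7 points."""
    t = days[-7:]
    y = ccc[-7:]
    a, b = least_squares_line(t, y)
    A, B = least_squares_line(t, [math.log10(v) for v in y])
    return (a, b), (A, B)

def forecast_linear(t, a, b):
    """Linear forecast: CCC = a*t + b."""
    return a * t + b

def forecast_log_linear(t, A, B):
    """Log-linear forecast: CCC = 10**(A*t + B)."""
    return 10 ** (A * t + B)

# Hypothetical series for illustration only (not real county data):
# days counted from the first case, CCC growing by 2% per day.
days = list(range(100, 110))
ccc = [1000 * 1.02 ** (d - 100) for d in days]

(a, b), (A, B) = fit_last_seven_days(days, ccc)
next_day = days[-1] + 1
print("linear forecast:    ", forecast_linear(next_day, a, b))
print("log-linear forecast:", forecast_log_linear(next_day, A, B))
```

For this synthetic series the two one-day-ahead forecasts differ by well under one percent, which is consistent with the remark that both regressions give similar forecasts over the next few days.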
A physical interpretation of these mathematical analyses is as follows. The log-linear regression presumes that the pandemic time series follows an exponential growth pattern, which implies that, while the virus is spreading, the infection trend resembles biological population growth by cell division. On the other hand, if the pandemic trend increases linearly, it can be understood that the conditions surrounding the coronavirus primarily govern the CCC growth.
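As a short worked step connecting the log-linear fit to exponential growth, the forecast equation can be rewritten with base \(e\). The symbols \(C_0\) and \(k\) are introduced here only for illustration and are not part of the original notation:\[\mathrm{CCC} = 10^{At+B} = 10^{B}\,e^{(A\ln 10)\,t} = C_0\,e^{kt},\qquad C_0 = 10^{B},\quad k = A\ln 10\]so a positive \(A\) corresponds to exponential growth with rate \(k\), and a negative \(A\) would correspond to exponential decay.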
Mathematical Approaches to Predict Pandemic Trends in the State of Hawaiʻi: version 1 for conservative modeling
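In mathematics, the hyperbolic tangent function is defined as the ratio of the hyperbolic sine to the hyperbolic cosine,\[\tanh(x)=\frac{\sinh(x)}{\cosh(x)}=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\]The hyperbolic sine and cosine are related to the sine and cosine functions, respectively, through the imaginary unit \(i=\sqrt{-1}\): \(\sinh(x)=-i\sin(ix)\) and \(\cosh(x)=\cos(ix)\). The function of interest here is \(y=\tanh(\beta t)\).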
If we change the value of \(\beta\) and plot the hyperbolic tangent with respect to \(t\), the graphs look like the figure below. A small \(\beta\) gives a gradual (i.e., almost linear) increase in \(y\) with \(t\), whereas a large \(\beta\) gives a step-wise variation.
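The effect of \(\beta\) can also be checked numerically. The sketch below simply evaluates \(\tanh(\beta t)\) at a few points; the \(\beta\) values and sample times are arbitrary choices made here for illustration and are not taken from the study.

```python
import math

def tanh_curve(beta, times):
    """Evaluate y = tanh(beta * t) at each time in `times`."""
    return [math.tanh(beta * t) for t in times]

# Illustrative sample points and beta values (assumptions, not from the study)
times = [-4, -2, -1, 0, 1, 2, 4]
for beta in (0.1, 1.0, 10.0):
    values = tanh_curve(beta, times)
    print(f"beta = {beta:5.1f}:", [round(v, 3) for v in values])
```

With the smallest \(\beta\) the values stay close to \(\beta t\) (nearly linear), while with the largest \(\beta\) they jump from about \(-1\) to about \(+1\) around \(t=0\).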
In our study, the general functional form used to fit the pandemic data of the cumulative confirmed cases (CCC) is \[y=\frac{1}{2}\alpha\left[\tanh(\beta (t-\tau))+1\right]\] where \(\alpha\) is the plateau-state number at the end of a pandemic wave, \(\beta\) indicates the growth rate of \(y\) (i.e., CCC), and \(\tau\) is the day on which a pandemic wave is in the middle of its growth, i.e., when \(y=\alpha/2\).
To predict COVID-19 cases across the islands, we've created this general hyperbolic tangent equation, as described above, which allows us to change three parameters to make our predictions more accurate: \(\alpha\), \(\beta\), and \(\tau\).
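To make the roles of the three parameters concrete, the following sketch evaluates the hyperbolic tangent model for one wave. The parameter values are hypothetical and chosen only for illustration; they are not fitted values from the Hawaiʻi data, and the function name `ccc_model` is likewise an assumption.

```python
import math

def ccc_model(t, alpha, beta, tau):
    """Cumulative confirmed cases: y = 0.5 * alpha * (tanh(beta * (t - tau)) + 1)."""
    return 0.5 * alpha * (math.tanh(beta * (t - tau)) + 1.0)

# Hypothetical parameters for a single wave (illustration only)
alpha = 5000.0   # plateau-state number at the end of the wave
beta = 0.08      # growth rate, in 1/day
tau = 60.0       # day at which the wave is in the middle of its growth

for t in (0, 30, 60, 90, 120, 200):
    print(t, round(ccc_model(t, alpha, beta, tau)))
```

At \(t=\tau\) the model returns exactly \(\alpha/2\), and for \(t\) far beyond \(\tau\) it approaches the plateau \(\alpha\). Changing \(\beta\) changes only how quickly the curve moves between those two levels.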
Using this equation, we separated our data into "waves." We consider a COVID wave to be a period in which cases rise significantly and then, after some time, slowly reach a plateau during which there are few to no new cases. As of now, that is how we are separating "waves." We are currently preparing new data that will allow us to better justify a "wave" by plotting its parameters separately and analyzing them. These data will be shared once we have analyzed and assessed our findings.
Data Source
All the pandemic data for the State of Hawaiʻi are obtained from the USAFacts website: https://usafacts.org/visualizations/coronavirus-covid-19-spread-map/
Related previous work
The first COVID-19 research on the CCC prediction focused on the universality of CCC time-series over the G20 countries.
- Albert S. Kim, Transformed time series analysis of first-wave COVID-19: universal similarities found in the Group of Twenty (G20) Countries, medRxiv: https://www.medrxiv.org/content/10.1101/2020.06.11.20128991v1 (DOI: https://doi.org/10.1101/2020.06.11.20128991). A full pre-print manuscript is available: PDF.