Development of Bayesian segmented Poisson regression model to forecast COVID-19 dynamics based on wastewater data: a case study in Nanning City, China

Abstract

Introduction

COVID-19 has caused tremendous hardships and challenges around the globe. Due to the prevalence of asymptomatic and pre-symptomatic carriers, relying solely on disease testing to screen for infections is not entirely reliable, which may affect the accuracy of predictions about pandemic trends. This study aims to develop a predictive model that estimates the dynamics of COVID-19 at an early stage based on wastewater data, to assist in establishing an effective early warning system for disease control.

Method

Viral load in wastewater and the number of daily reported COVID-19 cases were collected from Nanning CDC and the Chinese Disease Prevention and Control Information System, respectively. We used the viral load to estimate daily reported cases by a Bayesian linear regression model. Subsequently, a Bayesian (segmented) Poisson regression model was developed, using data from the first wave of the epidemic as prior information, to predict the COVID-19 epidemic trend of the second wave. Finally, in order to explore the optimal training data for predicting outbreak dynamics during the pandemic, we fitted the model using various training sets.

Results

The estimated cases, using the viral load with a 3-day lag, were consistent with the actual reported cases, with an adjusted R² of 0.935 (p < 0.001). Our model successfully predicted the epidemic peak time and provided early warnings on the third day after the outbreak began. Furthermore, after using data from the first 6 days of the outbreak, the model's MAPE rapidly decreased to a lower level (MAPE = 29.34%) and eventually stabilized at approximately 20%. Compared with non-informative priors, using first-wave data as prior information allowed an advance warning of approximately two weeks. Importantly, as more data from the early outbreak were included, the predictive results of the model became more stable and accurate.

Conclusion

This study demonstrates the potential of wastewater-based epidemiology combined with Bayesian methods as a monitoring and predictive tool during infectious disease outbreaks.


Introduction

The rapid global spread of coronavirus disease 2019 (COVID-19), beginning at the end of 2019, evolved into a major pandemic [1]. As of September 24, 2023, over 770 million people have been reported infected, resulting in more than 6 million reported deaths worldwide [2]. Therefore, staying informed about COVID-19 and understanding its modes of transmission remains important. Currently, most research focuses on predictions using reported case data. However, the increasing number of asymptomatic and mildly symptomatic individuals, coupled with their perception that testing is not worthwhile, raises the risk that reported cases underestimate the true number of infections [3,4,5]. Continuing to rely solely on reported case data to build models might lead to inaccurate predictions. Our objective is to develop a model that uses viral load from wastewater to forecast the transmission dynamics of COVID-19 outbreaks through Bayesian methods, thus supporting an effective early warning system for disease control.

Wastewater data are considered a complement to, or even a substitute for, disease surveillance data, particularly in times of low human testing activity [6]. Compared to traditional clinical testing, Wastewater-Based Epidemiology (WBE) effectively circumvents the inherent high under-reporting rates [7], objectively revealing early infection situations in specific areas and providing early warnings for potential disease outbreaks. Additionally, WBE is more cost-effective, easier to operate, and non-invasive, making it highly suitable for large-scale implementation [8]. During the COVID-19 pandemic, WBE has been increasingly utilized to predict the dynamics of COVID-19 spread [9, 10]. For instance, Li et al. utilized an Artificial Neural Network (ANN) model, based on observed prevalence of COVID-19 together with other variables including viral load in wastewater, wastewater temperature and population size, to predict the prevalence of COVID-19 up to 2 to 4 days in advance [11]. Another study by Proverbio et al. utilized the Susceptible-Exposed-Infectious-Recovered (SEIR) model, integrating wastewater with reported cases, to predict the dynamics of COVID-19 over a 7-day period [12]. Their results indicated that predictions derived from wastewater closely align with actual reported data. These methods demonstrate that wastewater-based predictions can provide early warnings for outbreaks. However, both of the aforementioned methods provide warnings close to the peak of the outbreak, which may limit their ability to provide sufficient early warning during the initial stages of the pandemic. Additionally, the ANN model requires more input factors, such as wastewater and air temperature, to achieve more accurate predictions. Similarly, the predictive performance of the SEIR model is highly dependent on parameter selection [12].
Moreover, existing research indicates that SARS-CoV-2 evolves toward more infectious variants [13], causing the transmission parameters of COVID-19 to change with viral mutations, which could potentially affect the predictive performance of the SEIR model. Zhang et al. introduced a segmented Poisson model, integrating the power law and the exponential law, to predict the dynamics of the COVID-19 outbreak [14]. This model addresses the SEIR model's heavy reliance on parameter selection and offers a simple and efficient forecasting method. Bayesian methods can integrate prior knowledge with new observational data to update probability estimates [15], which allows more accurate inferences when data are limited. The Integrated Nested Laplace Approximation (INLA) efficiently estimates the posterior distributions of Bayesian models, significantly enhancing the speed and capabilities of Bayesian inference [16]. By integrating Bayesian methods with wastewater data, it is possible to objectively reflect the dynamics of COVID-19 in its early stages when data are limited. This is particularly important for epidemics that require a rapid response.

We selected Nanning City, Guangxi Province, China, as our study area and explored the relationship between viral load and the number of daily reported COVID-19 cases. Following this, we developed a Bayesian Poisson regression model to predict the dynamics of COVID-19. Furthermore, we explored how the model's predictive performance was affected by the number of early outbreak days included in the training data.

Method

Method overview

In this study, our primary aim was to predict the epidemic dynamics of the COVID-19 outbreak based on viral load in wastewater using a Bayesian Poisson regression model. Firstly, we quantified the relationship between the number of daily reported COVID-19 cases and the viral load. Subsequently, we established a Bayesian model that utilized viral load in wastewater and prior information from the first outbreak to predict the dynamics of the second outbreak of COVID-19.

Data source

The main data sources include the viral load in wastewater and the number of daily reported COVID-19 cases in Nanning City. The wastewater data were provided by the Nanning Center for Disease Control and Prevention, with samples collected from March 2 to June 15, 2023, from 11 wastewater treatment plants located in densely populated areas. The COVID-19 case data, covering the period from November 20, 2022, to June 15, 2023, were obtained through the Chinese Disease Prevention and Control Information System. The case data include details such as age, gender, occupation, date of birth, date of onset, vaccination status, current residential address with national standard code, and case classification.

Statistical analysis

Estimation of reported cases based on wastewater

We selected the wastewater and case data collected from April 1 to June 15, 2023, during which Nanning experienced the second wave of the COVID-19 outbreak. Initially, we averaged wastewater data from different locations to represent the viral load for Nanning City. Because reported cases were updated daily while wastewater data were collected only twice a week, we interpolated the viral load to align it with the daily reported cases. We compared several interpolation methods, including spline interpolation, moving average interpolation, and linear interpolation. As the results showed minimal differences, we ultimately chose linear interpolation. Subsequently, we explored the relationship between daily reported cases and viral load at various lag times through cross-lagged and correlation analysis. Additionally, we assumed that the daily reported cases \(Y_t\) followed a Poisson distribution, \(Y_t \sim \text{Poisson}(\mu_t)\), where \(Y_t\) denoted the daily reported cases at day \(t\) and \(\mu_t\) was the expectation of \(Y_t\). We then used a Bayesian linear regression model on the log scale to quantify the relationship between the expectation of reported cases and the viral load, as

$$\log\left(\mu_t\right) = \epsilon \cdot \log\left(c_{t+i}\right)$$
(1)

Here \(\epsilon\) is the coefficient in the logarithmic relationship, \(c_{t+i}\) is the viral load at day \(t+i\), and \(i\) represents the lag time from the cross-lagged analysis. This equation allows us to predict daily reported cases from viral load, and these predictions serve as input for the subsequent Bayesian Poisson regression model.
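As a minimal sketch of the two steps described above (linear interpolation of twice-weekly viral loads onto a daily grid, followed by Eq. (1) with a lag), the Python fragment below uses made-up sample values. The function names `interpolate_daily` and `estimate_cases` and the sample data are illustrative assumptions; the coefficient 0.939 and the 3-day lag are the values reported in the Results. The study itself was carried out in R, so this is not the authors' code.

```python
import math

def interpolate_daily(sample_days, sample_values):
    """Linearly interpolate sparse samples onto every integer day they span.

    sample_days must be sorted integers (day indices).
    """
    daily = {}
    points = list(zip(sample_days, sample_values))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        for d in range(d0, d1 + 1):
            w = (d - d0) / (d1 - d0)      # fraction of the way from d0 to d1
            daily[d] = v0 + w * (v1 - v0)
    return daily

def estimate_cases(daily_load, epsilon, lag):
    """Apply Eq. (1): Y_hat_t = exp(epsilon * log(c_{t+lag}))."""
    return {
        t: math.exp(epsilon * math.log(daily_load[t + lag]))
        for t in daily_load
        if (t + lag) in daily_load        # skip days whose lagged load is unknown
    }

# Hypothetical twice-weekly viral loads (arbitrary units), NOT study data.
days = [0, 3, 7, 10]
loads = [100.0, 400.0, 800.0, 500.0]

daily = interpolate_daily(days, loads)
cases = estimate_cases(daily, epsilon=0.939, lag=3)
```

Because the lagged load `c_{t+3}` must be available, the last three days of the interpolated series yield no estimate.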

Estimation of outbreak dynamics

Identification of turning point

Our preliminary exploration indicated that using a portion of the sporadic period data to build the model could enhance its predictive performance (Supplementary Method 1). Given the distinct transmission characteristics of COVID-19 during the sporadic and outbreak phases, identifying the critical turning point from the sporadic phase to the outbreak phase is important. Firstly, after the turning point, the transmission speed of the epidemic accelerates significantly. Timely identification of the turning point helps to implement targeted measures in the early stages of the outbreak, effectively mitigating the risk of further spread. Secondly, since the transmission characteristics of the epidemic vary across different phases, identifying the turning point allows for the optimization of model parameters, enabling the model to more accurately reflect the dynamics of the epidemic's transmission. Based on this, we designed an analysis method using a 7-day sliding time window (Supplementary Method 2). Starting from the sporadic period, we analyzed data from seven consecutive days at a time to calculate the regression slope. If the slopes of seven consecutive time windows were significantly nonzero, we identified the last day of the first window as the turning point (\(t_s\)) of the epidemic.
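The sliding-window rule above can be sketched as follows. This is only an illustration under stated assumptions: the exact procedure is in Supplementary Method 2, which is not reproduced here, so the sketch assumes an ordinary least-squares slope per window and a two-sided t-test at the 5% level (critical value 2.571 for 5 degrees of freedom). The function names and the synthetic series are hypothetical.

```python
import math

T_CRIT_DF5 = 2.571  # two-sided 5% critical value of Student's t, 5 d.f. (assumed test)

def slope_t_stat(y):
    """Return (slope, t statistic) of an OLS line fitted to y against 0..n-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (yi - y_mean) for x, yi in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - (intercept + slope * x)) ** 2 for x, yi in enumerate(y))
    se = math.sqrt(sse / (n - 2) / sxx)
    if se == 0:
        # Perfect fit: a nonzero slope is treated as infinitely significant.
        return slope, (math.inf if slope != 0 else 0.0)
    return slope, slope / se

def find_turning_point(cases, window=7, run=7):
    """Index of the last day of the first window in a run of `run`
    consecutive windows whose slopes are all significantly nonzero."""
    significant = []
    for start in range(len(cases) - window + 1):
        _, t = slope_t_stat(cases[start:start + window])
        significant.append(abs(t) > T_CRIT_DF5)
    for start in range(len(significant) - run + 1):
        if all(significant[start:start + run]):
            return start + window - 1
    return None

# Synthetic series, NOT study data: 10 flat sporadic days, then steady growth.
series = [3] * 10 + [3 + 4 * k for k in range(1, 13)]
turning_point = find_turning_point(series)
```

Requiring a run of consecutive significant windows, rather than a single one, guards against flagging a turning point from a brief random fluctuation in the sporadic period.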

Bayesian segmented Poisson regression model

We classified the COVID-19 transmission states into sporadic and outbreak periods based on \(t_s\). Building on this classification, we developed a Bayesian Poisson regression model using wastewater data to estimate the epidemic dynamics of the COVID-19 outbreak. We first used viral load in wastewater to estimate the number of daily reported cases \(\widehat{Y}_t\), as described in the section "Estimation of reported cases based on wastewater". Subsequently, we assumed that the estimated reported cases \(\widehat{Y}_t\) followed a Poisson distribution, \(\widehat{Y}_t \sim \text{Poisson}(\lambda_t)\), where \(\lambda_t\) represents the expectation of the reported cases during the epidemic:

$$\lambda_t = \begin{cases} \alpha_1 t^{\beta_1} e^{-\gamma_1 t}, & t < t_s \\ \alpha_2 t^{\beta_2} e^{-\gamma_2 t}, & t \ge t_s \end{cases}$$
(2)

Here \(\alpha_k\), \(\beta_k\), and \(\gamma_k\) are regression parameters, with k = 1, 2 indicating the periods before and after the epidemic transitions from sporadic cases to an outbreak. The day on which the derivative of Eq. (2) is zero corresponds to the peak of the daily reported cases, defined as \(t_{peak} = \beta_2/\gamma_2\). On this day, the maximum number of reported cases, \(Y_{max}\), can be calculated as

$$Y_{max} = \alpha_2 \left(\frac{\beta_2}{\gamma_2}\right)^{\beta_2} e^{-\beta_2}$$
(3)
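The peak formulas follow from the log of the outbreak-phase curve: \(\log \lambda_t = \log \alpha_2 + \beta_2 \log t - \gamma_2 t\), whose derivative \(\beta_2/t - \gamma_2\) vanishes at \(t = \beta_2/\gamma_2\); substituting this value back gives Eq. (3). The short Python sketch below checks this numerically. The parameter values are hypothetical and chosen only for illustration; they are not estimates from the study.

```python
import math

def lam(t, alpha, beta, gamma):
    """Expected daily cases: alpha * t**beta * exp(-gamma * t)."""
    return alpha * t ** beta * math.exp(-gamma * t)

def peak(alpha, beta, gamma):
    """Closed-form peak day and peak height of the curve above."""
    t_peak = beta / gamma
    y_max = alpha * (beta / gamma) ** beta * math.exp(-beta)
    return t_peak, y_max

# Hypothetical parameter values, NOT estimates from the study.
alpha, beta, gamma = 0.5, 4.0, 0.2
t_peak, y_max = peak(alpha, beta, gamma)

# The closed form should agree with the curve evaluated at t_peak,
# and nearby days should lie below the peak.
height_at_peak = lam(t_peak, alpha, beta, gamma)
below_before = lam(t_peak - 0.5, alpha, beta, gamma) < y_max
below_after = lam(t_peak + 0.5, alpha, beta, gamma) < y_max
```

Because the log of the curve is linear in \(\log \alpha\), \(\beta\), and \(\gamma\), the model is a Poisson regression with a log link on the covariates \(\log t\) and \(t\).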

We fitted the model using the estimated reported cases derived from wastewater data. To explore the optimal amount of training data required during the outbreak period for effectively predicting the outbreak dynamics, we constructed training datasets covering different numbers of outbreak days. For the data from the outbreak period, we incrementally expanded the training data in 2-day intervals, starting with an initial span of 3 days from the outbreak and steadily increasing to 22 days, with the remaining reported case data treated as the validation set. We then compared the predicted epidemic scale with the actual scale of the second wave of COVID-19. Additionally, we conducted further analysis based on the same data using the ANN and SEIR methods, and compared their performance with that of our Bayesian model (Supplementary Method 3).

Prior distributions

All models were built under a Bayesian framework. For Formula 1, we assigned a non-informative prior for the regression parameter, \(\epsilon \sim N(0,1000)\). For Formula 2, we used non-informative priors to fit the first wave COVID-19 data, obtained posterior means and standard deviations for each parameter (\(\alpha_k\), \(\beta_k\), and \(\gamma_k\)), and used these as priors for the second wave prediction. Additionally, for Formula 2, we conducted a sensitivity analysis using priors of different ranges.
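To give an intuition for why a first-wave posterior helps when only a few outbreak days are available, the sketch below uses the textbook precision-weighted update for a single normally distributed parameter. This is a deliberate simplification and not the INLA procedure used in the study; the function name `normal_update` and all numbers are hypothetical.

```python
import math

def normal_update(prior_mean, prior_sd, obs_mean, obs_se):
    """Combine a normal prior with a normal estimate by precision weighting."""
    prior_prec = 1.0 / prior_sd ** 2
    obs_prec = 1.0 / obs_se ** 2
    post_var = 1.0 / (prior_prec + obs_prec)
    post_mean = post_var * (prior_prec * prior_mean + obs_prec * obs_mean)
    return post_mean, math.sqrt(post_var)

# Hypothetical numbers: a noisy early-outbreak estimate of one parameter.
early_estimate, early_se = 6.0, 2.0

# Informative prior (e.g. a first-wave posterior) versus a vague prior.
informative = normal_update(4.0, 0.5, early_estimate, early_se)
vague = normal_update(0.0, 100.0, early_estimate, early_se)
```

With the informative prior, the posterior stays close to the prior mean and is tighter than either source alone; with the vague prior, it simply reproduces the noisy early estimate. The same logic explains why the informative prior can mislead if the new outbreak differs strongly from the previous one.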

Model validation

For Formula 1, we randomly selected 80% of the reported data as the training set and the remaining 20% as the validation set. We compared the actual number of reported cases with the estimated number and calculated the adjusted R-squared (R²) value to assess the fit of the model. For Formula 2, we calculated the mean absolute percentage error (\(\text{MAPE} = \frac{1}{T}\sum_{t=1}^{T}\left|\frac{Y_t - \mu_t}{Y_t}\right| \times 100\)), the mean absolute error (\(\text{MAE} = \frac{1}{T}\sum_{t=1}^{T}\left|Y_t - \mu_t\right|\)), and the adjusted R² to validate the robustness of the model.
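The two error measures can be computed directly from their definitions, as in the following sketch. The function names and the example numbers are illustrative assumptions rather than study data. Note that MAPE is undefined on days with zero reported cases, so such days would need to be excluded or handled separately.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)

def mae(actual, predicted):
    """Mean absolute error, in the same units as the data."""
    terms = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(terms) / len(terms)

# Hypothetical daily case counts and predictions, NOT study data.
actual = [100, 200, 400]
predicted = [110, 180, 400]

mape_value = mape(actual, predicted)
mae_value = mae(actual, predicted)
```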

The data were processed and analyzed using R software (INLA package, version 4.3.3) and ArcMap 10.8.

Results

The spatial distribution of reported cases in Nanning City from April 1 to June 15, 2023, along with temporal variations in both reported cases and viral load, is shown in Fig. 1. Spatially, individual cases were primarily distributed in densely populated urban centers, which were also where the wastewater treatment plants under study were located (Fig. 1a). The peaks of daily reported cases and viral load occurred on May 15, 2023, and May 18, 2023, respectively (Fig. 1b). We found that wastewater data with a three-day lag could most effectively explain the variations in reported cases through cross-lagged analysis (Table 1). At the same time, the highest correlation occurred when the wastewater data were advanced by three days (r: 0.967; 95% CI: 0.953 to 0.977; p < 0.001) (Table 2). Through the Bayesian regression analysis between reported cases and viral load in wastewater, the regression coefficient was 0.939 (95% Bayesian credible interval (BCI): 0.937 to 0.941), with the model:

$$\widehat{Y}_t = \exp\left[0.939 \cdot \log\left(c_{t+3}\right)\right]$$

Fig. 1

Spatial and temporal distribution of reported COVID-19 cases and viral load in Nanning City from March 2 to June 15, 2023. a Spatial distribution of reported cases and wastewater treatment plants in Nanning City; b Temporal trends of daily reported case numbers and viral load in wastewater

Table 1 Results of cross-lagged analysis for viral load data delays

Table 2 Results of time-lagged correlation analysis for viral load data delays

Additionally, this model performed well in the validation set (adjusted \(R^2\): 0.935; p < 0.001), indicating that using viral load in wastewater to estimate the daily reported cases is reasonable.
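A time-lagged correlation scan of the kind summarized in Table 2 can be sketched as below: for each candidate lag, the case series is paired with the viral load shifted by that many days, and the lag with the highest Pearson correlation is selected. The function names and the synthetic series (built so that the load trails the cases by exactly three days) are assumptions for illustration; they are not the study's data or code.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_lag(cases, load, max_lag):
    """Correlate cases[t] with load[t + lag] for lag = 0..max_lag."""
    scores = {}
    for lag in range(max_lag + 1):
        n = len(cases) - lag
        scores[lag] = pearson(cases[:n], load[lag:lag + n])
    return max(scores, key=scores.get), scores

# Synthetic epidemic curve, NOT study data.
cases = [1, 2, 4, 8, 15, 25, 30, 26, 18, 10, 5, 3, 2, 1, 1]
# Viral load follows the cases three days later: load[t + 3] = 10 * cases[t].
load = [10.0] * 3 + [10.0 * c for c in cases[:-3]]

lag, scores = best_lag(cases, load, max_lag=5)
```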

We estimated the turning point of the second outbreak to be April 23, 2023 (Supplementary Fig. 2). The seventh day before this was chosen as the model's start date, as it minimized the overall MAE (Supplementary Method 1). Using posterior estimates from the first wave (Supplementary Table 1) as priors, we estimated the second wave's regression parameters and their 95% BCIs across various training datasets (Table 3). Using data from only the first three days of the outbreak, the model predicted the peak time of the epidemic (May 15, BCI: May 14 to May 16) (Fig. 2), and in all subsequent predictions the error between the predicted and actual peak times remained within three days (Supplementary Table 2). The model effectively estimated the scale of the COVID-19 epidemic using data from the first six days of the outbreak (Fig. 2), with the MAPE rapidly decreasing to 29.34%. As the volume of training data increased, the model's MAPE first fluctuated and eventually stabilized at around 20% (Fig. 3). Overall, the model's predictive performance remained good.

Table 3 Parameter estimation and 95% BCI of Bayesian segmented Poisson regression model with informative prior based on training sets with different included data of early period
Fig. 2

Model prediction of daily reported cases covering the second wave of pandemic with informative priors (A series of subplots (a to t) represent the results based on training sets with different included data of early period. The blue dots indicate the actual reported number of cases. The green line and shaded area depict the predicted curve and its associated 95% BCI, respectively. Similarly, the red line and shaded area demonstrate the fitted curve and its associated 95% BCI)

Fig. 3

The performance of Bayesian segmented Poisson regression models with informative priors based on training sets with different included data of early period

Supplementary Fig. 3 and Supplementary Table 3 present the predictive performance of the Bayesian model with both non-informative and informative priors. The results reveal that using data from the first wave of the epidemic as prior knowledge allows for an advance warning of about two weeks. Supplementary Fig. 4 presents the results of the sensitivity analysis. Under conditions of limited data, the adjusted R² gradually decreases as the prior information becomes less informative. Figure 4 compares the performance of the Bayesian, ANN, and SEIR models in predicting COVID-19 dynamics. The results indicate that the Bayesian model exhibits higher stability and efficiency, maintaining a consistently high adjusted R² value, particularly when data availability is limited, performing better than or comparably to the ANN and SEIR models.

Fig. 4

Comparison of the predictive performance of Bayesian, ANN, and SEIR models

Discussion

In this study, we developed an innovative model that combines wastewater data and Bayesian methods to estimate the spread of COVID-19. Results showed a strong positive correlation between wastewater viral load and daily reported cases, consistent with the findings of Hoffman et al. [17], confirming the effectiveness of WBE for real-time monitoring of COVID-19 dynamics. Additionally, this method can provide about two weeks of lead time and effectively estimates the scale of the epidemic. This discovery is crucial for public health workers, as it enables them to allocate medical resources more effectively in response to surges in hospital admissions. Especially after the cancellation of large-scale testing programs in early 2023 in China [18], wastewater monitoring is expected to become a key source of information for the healthcare system in managing the COVID-19 pandemic.

Unlike most studies, we did not find the viral load to be a leading indicator of reported cases; instead, it lagged three days behind, consistent with the findings of Hoffman et al. [17]. This time difference may be related to the specific conditions of the monitoring sites, such as the retention time of sewage, the geographical distance from the wastewater monitoring sites to the laboratory, and laboratory processing time [19,20,21]. Additionally, the poor performance of our model within the first three days may be attributed to the lack of early data, especially during phases of rapid epidemic change. Finally, according to the Bayesian model, the peak of the epidemic was predicted around 1 to 2 days before the actual peak. This slight bias in prediction may be attributed to the low monitoring frequency, which fails to capture the actual peak, or to inherent limitations within the model itself. Nevertheless, the predicted peaks maintain a high degree of consistency with the actual peaks, demonstrating strong stability. Therefore, these predictions continue to offer significant guidance value for public health decision-making.

Compared to previous studies, our model excels in providing early warnings; it accurately predicts the peak of the epidemic three days after the turning point and anticipates the overall scale of the outbreak. Moreover, our model is data-driven, with parameters estimated from the data rather than predetermined, and it processes rapidly, allowing updates as new data becomes available. The model is also simple to operate, making it easy for healthcare workers to understand and implement. Apart from being suitable for predicting COVID-19 trends, it may also be applied to other common infectious diseases, such as influenza [22] and norovirus [23], allowing for timely adjustments to predictions and intervention strategies based on the latest data.

This study has some limitations that need to be acknowledged. Firstly, higher temperatures [24] or the reduction of organic matter and suspended solids [22] in wastewater may accelerate the degradation of SARS-CoV-2. Additionally, increased rainfall could dilute the viral concentration. These factors could potentially affect the estimation of case numbers based on viral load, but they were not considered in our model. Secondly, on January 8, 2023, China reclassified COVID-19 as a Category B infectious disease [18]. After that, more people began using at-home rapid COVID-19 antigen tests, which resulted in fewer cases being reported than the actual number of cases. However, during our study period, the collection of information on reported cases remained relatively systematic. Therefore, this does not notably affect our primary objective, which is to use wastewater data as a valuable supplement to predict the transmission dynamics of COVID-19 and provide a reliable method for epidemic monitoring when reported data are incomplete or inaccurate. It is important to be aware that, over time, with the increased use of at-home COVID-19 tests and the impact of pandemic fatigue, laboratory-based reporting systems may under-report more actual infections [25]. In this context, wastewater-based data predictions will become more advantageous. Thirdly, our wastewater monitoring data were collected twice a week and the viral load for non-monitored dates was generated through linear interpolation, which could bias the viral load estimates for those dates. This limitation might account for the slight underestimation of the peak number of cases (Fig. 2p-t). Therefore, increasing the frequency of wastewater monitoring during outbreaks could be important for improving the accuracy of the estimation in future studies.
Additionally, due to the lack of specific reported cases corresponding to each treatment plant, we were unable to analyze the epidemic transmission trends in the areas surrounding each wastewater treatment plant in detail. Instead, we averaged the viral loads from 11 wastewater treatment plants to represent the situation across the entire city of Nanning. This approach might limit our ability to determine COVID-19 transmission patterns at a finer spatial scale. Therefore, future research can explore the application of this model at a higher spatial resolution. Finally, the sensitivity analysis revealed that prediction accuracy decreases with less informative priors. Therefore, significant differences between the characteristics of the previous and current outbreaks might lower the model's performance.

Conclusion

In summary, this study proposes a wastewater data-based method to predict the dynamics of the COVID-19 pandemic under a Bayesian statistical framework. The method addresses the limitations of relying solely on reported case data and provides early warning signals for public health authorities, thereby assisting early healthcare planning and resource allocation. Additionally, the model is computationally efficient, easy to operate, and suitable for frontline workers. Future research could focus on exploring the model's applicability in other cities or at finer spatial scales.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

COVID-19: Coronavirus Disease 2019
WHO: World Health Organization
WBE: Wastewater-Based Epidemiology
ANN: Artificial Neural Network
SEIR: Susceptible-Exposed-Infectious-Recovered
CDC: Center for Disease Control and Prevention
MAPE: Mean Absolute Percentage Error
MAE: Mean Absolute Error
BCI: Bayesian Credible Interval

References

  1. Li J, Lai S, Gao GF, Shi W. The emergence, genomic diversity and global spread of SARS-CoV-2. Nature. 2021;600(7889):408–18.

  2. WHO. COVID-19 Epidemiological Update – 29 September 2023 [update 2023; cited 2024 Aug 5].

  3. Rahmandad H, Lim TY, Sterman J. Estimating COVID-19 under-reporting across 86 nations: implications for projections and control. Europe PMC. 2020.

  4. Alvarez E, Bielska IA, Hopkins S, Belal AA, Goldstein DM, Slick J, et al. Limitations of COVID-19 testing and case data for evidence-informed health policy and practice. Health Res Policy Syst. 2023;21(1):11.

  5. Lippi G, Mattiuzzi C, Henry BM. Uncontrolled confounding in COVID-19 epidemiology. Diagnosis (Berl). 2023;10(2):200–2.

  6. McManus O, Christiansen LE, Nauta M, Krogsgaard LW, Bahrenscheer NS, von Kappelgaard L, et al. Predicting COVID-19 incidence using Wastewater Surveillance Data, Denmark, October 2021-June 2022. Emerg Infect Dis. 2023;29(8):1589–97.

  7. Jones TC, Biele G, Mühlemann B, Veith T, Schneider J, Beheim-Schwarzbach J, et al. Estimating infectiousness throughout SARS-CoV-2 infection course. Science. 2021;373(6551):eabi5273.

  8. Shrestha S, Yoshinaga E, Chapagain SK, et al. Wastewater-based epidemiology for cost-effective mass surveillance of COVID-19 in low-and middle-income countries: challenges and opportunities. Water. 2021;13(20):2897.

  9. Street R, Malema S, Mahlangeni N, Mathee A. Wastewater surveillance for Covid-19: an African perspective. Sci Total Environ. 2020;743:140719.

  10. Mao K, Zhang K, Du W, Ali W, Feng X, Zhang H. The potential of wastewater-based epidemiology as surveillance and early warning of infectious disease outbreaks. Curr Opin Environ Sci Health. 2020;17:1–7.

  11. Li X, Kulandaivelu J, Zhang S, Shi J, Sivakumar M, Mueller J, et al. Data-driven estimation of COVID-19 community prevalence through wastewater-based epidemiology. Sci Total Environ. 2021;789:147947.

  12. Proverbio D, Kemp F, Magni S, Ogorzaly L, Cauchie H-M, Gonçalves J, et al. Model-based assessment of COVID-19 epidemic dynamics by wastewater analysis. Sci Total Environ. 2022;827:154235.

  13. Wang R, Chen J, Gao K, Wei G-W. Vaccine-escape and fast-growing mutations in the United Kingdom, the United States, Singapore, Spain, India, and other COVID-19-devastated countries. Genomics. 2021;113(4):2158–70.

  14. Zhang X, Ma R, Wang L. Predicting turning point, duration and attack rate of COVID-19 outbreaks in major western countries. Chaos Solitons Fractals. 2020;135:109829.

  15. van de Schoot R, Depaoli S, King R, Kramer B, Märtens K, Tadesse MG, et al. Bayesian statistics and modelling. Nat Reviews Methods Primers. 2021;1(1):1.

  16. Rue H, Martino S, Chopin N. Approximate Bayesian inference for latent Gaussian models by using Integrated Nested Laplace approximations. J Royal Stat Soc Ser B: Stat Methodol. 2009;71(2):319–92.

  17. Hoffman K, Holcomb D, Reckling S, Clerkin T, Blackwood D, Beattie R, et al. Using detrending to assess SARS-CoV-2 wastewater loads as a leading indicator of fluctuations in COVID-19 cases at fine temporal scales: correlations across twenty sewersheds in North Carolina. PLOS Water. 2023;2(10):e0000140.

  18. The State Council, The People's Republic of China. Major adjustment! COVID-19 infection management category will be adjusted from Category B with A-level control to Category B with B-level control [update 2022; cited 2024 Aug 5].

  19. Bibby K, Bivins A, Wu Z, North D. Making waves: plausible lead time for wastewater based epidemiology as an early warning system for COVID-19. Water Res. 2021;202:117438.

  20. Zhu Y, Oishi W, Maruo C, Saito M, Chen R, Kitajima M, et al. Early warning of COVID-19 via wastewater-based epidemiology: potential and bottlenecks. Sci Total Environ. 2021;767:145124.

  21. Schill R, Nelson KL, Harris-Lovett S, Kantor RS. The dynamic relationship between COVID-19 cases and SARS-CoV-2 wastewater concentrations across time and space: considerations for model training data sets. Sci Total Environ. 2023;871:162069.

  22. Wolfe MK, Duong D, Bakker KM, Ammerman M, Mortenson L, Hughes B, et al. Wastewater-based detection of two influenza outbreaks. Environ Sci Technol Lett. 2022;9(8):687–92.

  23. Xagoraraki I, O'Brien E. Wastewater-based epidemiology for early detection of viral outbreaks. In: O'Bannon DJ, editor. Women in water quality: investigations by prominent female engineers. Kansas: Springer; 2020. pp. 75–97.

  24. Schussman MK, McLellan SL. Effect of time and temperature on SARS-CoV-2 in municipal wastewater conveyance systems. Water. 2022;14(9):1373.

  25. MMWR. Use of At-Home COVID-19 Tests – United States, August 23, 2021–March 12, 2022 [update 2022; cited 2024 Aug 5].

Acknowledgements

Not applicable.

Funding

This research was supported by the 2020 Talent Highland Special Foundation of Nanning (2020021), the Health Commission Self-Foundation of Guangxi (Z-A20241133), and the National Natural Science Foundation of China (82073665).

Author information

Contributions

B.X. obtained the data and contributed to funding acquisition. X.S. contributed to the conceptualization of the study, data curation, formal analysis, drafting the manuscript, and interpretation of the results, along with critically reviewing the final manuscript. C.L. obtained the data and contributed to funding acquisition, the conceptualization of the study, and interpretation of the results. C.S. and C.P. participated in data curation, formal analysis, and interpretation of the results. Y.L. contributed to funding acquisition and the conceptualization of the study, guided the formal analysis and data curation, supervised the manuscript writing, performed language checks, and provided intellectual support during the critical review of the final manuscript. B.X., C.L., and Y.L. coordinated and facilitated the project implementation activities. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Yingsi Lai.

Ethics declarations

Ethical approval

Not applicable.

Consent for publication

Not Applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit .

About this article

Cite this article

Xu, B., Shi, X., Liang, C. et al. Development of Bayesian segmented Poisson regression model to forecast COVID-19 dynamics based on wastewater data: a case study in Nanning City, China. BMC Public Health 25, 118 (2025). https://doi.org/10.1186/s12889-024-20968-x

  • DOI: https://doi.org/10.1186/s12889-024-20968-x
