COVID Winter Forecast
Things are getting worse, but how bad will they get? I'll explore three different scenarios and forecast the number of cases, deaths and hospitalizations over the next four months.
We’ve experienced two separate surges of COVID infections and deaths, one peaking in April and one over the summer peaking in July.
Now, we are on the cusp of another surge, leading to the inevitable question: How bad will it get?
This post is a forecast of COVID infections, deaths and health care utilization over the next 4 months. It is based on our work at DataContours, which provides daily COVID and health forecasts to local hospitals and health departments across the country. Background on the models and methods is at the bottom of the post.
I’ll look at regional forecasts in future posts, so subscribe or follow me on Twitter at @datareich for the latest.
Forecast summary
Here is the quick summary of my latest forecast, current as of December 4, 2020.
We are about 2 months from the peak of new cases and deaths, expected in late January/early February.
At the peak, expect somewhere around 290,000 new positive tests and 4,500 reported deaths per day.
13 million active infections by late January. This is about 3.5% of the US population infected at the same time.
Hospital utilization will peak at about 175,000 patients, with around 40,000 of those patients needing ICU-level support. This is about 150% capacity nationwide, with some areas seeing 200% surges.
Back in October we estimated that the US would see 2 to 4 times the level of infections as we saw in the summer peaks, or 140 - 280k cases per day at the worst point this winter. Things are looking worse than they were a few months ago.
I’ll spend the rest of the post exploring why.
Where are we today?
In a nutshell, we go into the winter in a pretty bad place. Cases and hospitalizations have reached new highs in the last four weeks, roughly doubling the previous peak in July/August. The recent dip in new cases is likely due to reporting issues around the Thanksgiving holiday, rather than a real downward trend in the data. The last two days have shown rates rebounding to the 14-day growth trend.
Deaths are increasing rapidly and have surpassed the previous peak reached in April. We now lose about the same number of people each day that died on 9/11.
Deaths, as a rate per 100k population, have largely been concentrated in the Midwest. North and South Dakota are among the worst-hit areas in the world by death rate.
The real cause for concern is how widespread the virus is in our communities, despite near constant containment measures.
To illustrate that, here are three maps. First, the per-capita case rate on April 12, the first peak.
You can see a spike around NYC, but very few cases elsewhere. This is why shutdowns were successful; there was very little community spread, so isolation and contact tracing were generally able to suppress infections at the community level.
Second, per-capita case rate on July 22.
More spread than in April, largely in urban areas in the Southeast. Other parts of the country were still successfully mitigating spread.
Finally, November 23.
Endemic community spread everywhere. The reports I’ve seen from various states estimate that anywhere from 40-60% of cases have no known point of infection.
We’re going into the winter with widespread infections, very limited health care capacity, and social interventions which are failing to contain the spread.
Deaths, which typically lag cases by about 3 weeks, are also on the rise just about everywhere. As with cases, some of the highest death rates are currently being seen in the Midwestern states.
Three different scenarios
Our forecast is based on three different scenarios of disease spread: best case, worst case, and a moderate baseline case. Each scenario is deterministic, in that the model is based on a set of fixed assumptions about virus spread over 4 months.
A note on vaccines: none of our models account for widespread distribution of a COVID vaccine. When this post was published, the earliest prediction for widespread availability of COVID vaccines was Q2 of 2021, beyond our forecast horizon. I’ve got another post in the works on how a vaccine might play out in the spring.
The Best Case
The best case scenario assumes strong social distancing measures are in place nation-wide, but no stay-at-home orders (R(0) is 1.05). In this scenario, we see cases peak at 210,000 per day in late February. This is about a 25% increase over the current 7-day average for reported cases.
Daily deaths continue to increase through February, reaching a peak of just over 4,000/day in early March. This is about a 33% increase over current levels. Note: our deaths forecast includes all deaths, not just those that are reported, which tend to undercount the real death rate by 33%.
Hospitalizations continue to stay relatively flat at just over 100,000 hospitalized and 25,000 of those receiving ICU-level care. This is 10% above the current hospitalization rates.
The Worst Case
The worst case scenario shows what would happen if all restrictions were lifted today (and R(0) is 2.0). This is why we ‘flatten the curve’: with no measures we see cases peak at 1.245 million positive tests in early February.
With no mitigation, daily deaths would also reach a catastrophic peak of 24,000 per day on February 10. This is about 6x higher than the peak of just over 4,000 per day in the best case scenario above.
Hospitalizations experience a similar peak, with just over 400k hospitalized by mid-February and almost 100k of those receiving ICU-level care.
The Middle-of-the-Road Case
Currently, the United States has some social distancing in place, but adherence is spotty and there is no nation-wide stay-at-home order. This is the middle-of-the-road scenario (R(0) of ~1.5).
In this case, we would expect cases to peak at just over 500k cases per day in late March.
Deaths would similarly peak in early March at just over 10,000 per day.
Hospitalizations would double to about 200k, with nearly 50,000 people receiving ICU-level care.
What is most likely to happen?
One of the challenging things about forecasting is that there are an unknown number of unknowns. Despite our best efforts, we don’t know what the actual reproduction rate is, or where outbreaks will pop up. A probabilistic model can provide some insight into this uncertainty by running an ‘ensemble’: the model is run once for each of the different observed transmission rates (each run is called an ensemble member) to determine which outcome is most likely. This can help us ‘true’ the forecast by better understanding where we currently are relative to the different scenarios outlined above.
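To make the ensemble idea concrete, here is a minimal sketch in Python. It is not our production model: it swaps in a bare-bones SIR stand-in, and the population, starting infections, infectious period and list of transmission rates are all hypothetical values chosen for illustration.

```python
import statistics


def peak_daily_infections(r_eff, population=330e6, infectious0=2e6,
                          infectious_days=7.0, days=120):
    """Peak daily new infections from a minimal SIR stand-in (daily steps).

    All default values are illustrative assumptions, not fitted parameters.
    """
    gamma = 1.0 / infectious_days   # daily recovery rate
    beta = r_eff * gamma            # daily transmission rate implied by r_eff
    s, i = population - infectious0, infectious0
    peak = 0.0
    for _ in range(days):
        new_infections = beta * s * i / population
        s -= new_infections
        i += new_infections - gamma * i
        peak = max(peak, new_infections)
    return peak


def run_ensemble(transmission_rates):
    """Run one ensemble member per transmission rate and summarize the peaks."""
    peaks = sorted(peak_daily_infections(r) for r in transmission_rates)
    return {"min": peaks[0], "mean": statistics.mean(peaks), "max": peaks[-1]}


# Hypothetical set of observed transmission rates, one per ensemble member.
observed_rates = [1.05, 1.10, 1.15, 1.20, 1.25]
summary = run_ensemble(observed_rates)
```

A tight gap between the minimum and maximum in a summary like this is what gives confidence in the mean; a wide gap means the forecast is sensitive to which transmission rate turns out to be right.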
Based on these case numbers, we’re roughly a third of the way to the peak in late January or early February. One important thing to note: we likely won’t see a dramatic drop-off in cases after the peak, but rather a slow, gradual decline over a couple of months.
The probabilistic model shows a fairly tight spread of solutions, between 240k and 340k cases per day. This means relatively high confidence in the average (mean) result.
Based on the probabilistic model, we are trending somewhere between the best and moderate cases. Though it is also worth noting that things are trending worse each week.
Going forward, I see the high-side risk of exceeding the numbers as greater than the low-side odds of coming in better than expected.
What can change?
There are a couple of unknowns that will impact how COVID infections play out over the next couple of months.
What will spread look like due to Thanksgiving and Christmas travel and gatherings? Initial data from Thanksgiving suggests that air travel wasn’t as bad as feared, down 60% from last year. However, given that this is still tens of millions of people traveling, the likelihood of additional cases is high.
How and where will additional social distancing measures be put in place? Los Angeles County in California just announced new stay-at-home orders. More interventions like this can reduce the infections we see in January/February.
When will vaccines become widely available enough to effectively reduce the spread? Right now, most doses won’t be available until after February/March. On this timeline, we don’t really expect vaccines to reduce cases during the surge.
I hope this provides some insight into what the next two or three months may look like. Thanks for making it through a long post! If you have a question or feedback, add it to the comments.
Background
A bit of background: we run a gridded, stratified SEIR variant to forecast the number of cases and deaths due to COVID for the next 120 days under different scenarios.
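For readers unfamiliar with SEIR models, the sketch below shows the basic, unstratified version in Python. It is only an illustration of the general technique, not our gridded, stratified variant: the incubation and infectious periods, the population and the starting counts are assumed values, and only the R(0) inputs (1.05, 1.5 and 2.0) come from the scenarios in this post.

```python
def simulate_seir(r0, population, exposed0, infectious0, recovered0=0.0,
                  incubation_days=5.0, infectious_days=7.0, days=120):
    """Basic SEIR model with daily Euler steps.

    Returns the number of people becoming infectious on each day.
    The incubation and infectious periods are illustrative assumptions.
    """
    sigma = 1.0 / incubation_days   # rate of moving from Exposed to Infectious
    gamma = 1.0 / infectious_days   # rate of moving from Infectious to Recovered
    beta = r0 * gamma               # transmission rate implied by R0
    s = population - exposed0 - infectious0 - recovered0
    e, i, r = exposed0, infectious0, recovered0
    daily_new = []
    for _ in range(days):
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_recovered = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        daily_new.append(new_infectious)
    return daily_new


# Hypothetical starting point; only the R(0) values are taken from the post.
start = dict(population=330e6, exposed0=1e6, infectious0=2e6)
scenarios = {"best": 1.05, "middle": 1.5, "worst": 2.0}
forecasts = {name: simulate_seir(r0, **start) for name, r0 in scenarios.items()}
```

Each scenario is deterministic: the same R(0) and starting counts always produce the same 120-day curve, which is why the scenarios differ only in their assumptions.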
This model is different in that we can generate forecasts for local areas, like cities or combinations of counties. We also scale our model to fit the entire US; rather than run as a single aggregate, we run lots of different models for each age group in different parts of the country, then add them up. This way we can see local and regional effects that might otherwise be lost.
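The "run lots of models, then add them up" step can be sketched as follows. The region names, age bands and daily counts are made up for illustration, and the function names are hypothetical; in practice each series would come from its own model run.

```python
from collections import defaultdict


def aggregate_forecasts(stratum_forecasts):
    """Sum per-stratum daily forecasts into one national daily series."""
    lengths = {len(series) for series in stratum_forecasts.values()}
    if len(lengths) != 1:
        raise ValueError("all strata must cover the same number of days")
    return [sum(day) for day in zip(*stratum_forecasts.values())]


def aggregate_by_region(stratum_forecasts):
    """Sum forecasts across age groups within each region."""
    by_region = defaultdict(dict)
    for (region, age_group), series in stratum_forecasts.items():
        by_region[region][age_group] = series
    return {region: aggregate_forecasts(groups)
            for region, groups in by_region.items()}


# Hypothetical daily case forecasts keyed by (region, age group).
stratum_forecasts = {
    ("Midwest", "0-49"): [100, 120, 150],
    ("Midwest", "50+"): [40, 50, 65],
    ("Northeast", "0-49"): [80, 90, 95],
    ("Northeast", "50+"): [30, 35, 45],
}
national = aggregate_forecasts(stratum_forecasts)
regional = aggregate_by_region(stratum_forecasts)
```

Keeping the per-stratum series around, rather than only the national total, is what lets local and regional effects stay visible.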
We model three different scenarios: best, likely and worst cases. One of the modifications we made to the standard SEIR model is to introduce more exogenous effects that better mirror the COVID restrictions in place in most parts of the country. Each scenario includes a different set of assumptions about the effectiveness of those interventions, and the willingness of sub-groups (e.g. young people) to adhere to restrictions, as represented by a total max infectable population coefficient.
I appreciate all the work you've put into this, particularly given the horrific topic and our need to understand all our risks. It's kind of you to share. I hoped you might have included Puerto Rico (although was just being optimistic); I live in the District of Columbia, and that is often left off of these inquiries so am used to that. I find it jarring however to see any modern map of the United States that excludes Hawaii and Alaska, particularly since one of the maps you do use shows the latter as visible but empty of data. I am sure that you must have those data (or the graphs also would be missing two states) so perhaps the maps might be corrected?
This is both frightening and fascinating. Thank you. It does seem more straight forward than some models and in line with what I see. Is there any further detail available on the "under the hood" assumptions you use? In particular, ascertainment rate over time, IFR by age, and age prevalence over time. I've spent time modeling Florida's Covid death history (I'm a dabbling retired numbers guy) and find working with a model provides a better understanding of what the various pieces are and how they might be fitting together. A big question of course is what cumulative infection level this will get to. I'm starting to think based on Youyang Gu's numbers maybe 30-40% might be a common end point. Yours may be higher. Also wonder if the declines in general could end up being more precipitous than portrayed here (a la ND, IA, WI). Anyway, thanks for this post. Subscribed - harrison1244@gmail.com