Covid-19 and the Vaccination efforts

Note: This is a notebook about something that came up in one of my courses. It involves looking at data from multiple sources over the past 2 years, primarily Covid data. Altogether it is over 100 million data points, and it is a nice indication of how some people choose their own outcomes based on which side of the US political divide they find themselves on. I have up-to-date data, but this notebook is written using data up to Feb 23rd 2022. My sources are linked below.

At a basic level, I now have data for every Federal Information Processing Standards (FIPS) county in America since Jan 2020. Some data does not show up until March 2020, but it is all there.

I can get the vaccination rate for different types of people (old, young, poor, and so on), but I am mainly interested in the people who completed their vaccination sequence. Too much parsing of vaccination types leads to too much nuance, and you lose the big picture. As an example of what data there is, let us look at the vaccination rate of Buchanan County, Missouri over time. We will also print the data from the last day of this data set.
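As a rough sketch of that lookup, assuming the vaccination data sits in a pandas DataFrame with one row per county per day: the column names and the numbers below are made up for illustration and are not the notebook's actual schema or results. Buchanan County's FIPS code is 29021 (Missouri is state 29, Buchanan is county 021).

```python
import pandas as pd

# Hypothetical rows: values are invented, column names are assumptions.
vaccinations = pd.DataFrame({
    "date": pd.to_datetime(["2022-02-22", "2022-02-23", "2022-02-23", "2022-02-21"]),
    "fips": ["29021", "29021", "29095", "29021"],  # "29095" stands in for a second county
    "series_complete_pct": [40.1, 40.2, 55.0, 40.0],
})

BUCHANAN_FIPS = "29021"

# Keep only Buchanan County and put the rows in time order for plotting.
buchanan = (
    vaccinations[vaccinations["fips"] == BUCHANAN_FIPS]
    .sort_values("date")
    .reset_index(drop=True)
)

# The last row is the most recent day in the data set.
latest = buchanan.iloc[-1]
print(latest)
```

FIPS codes are kept as strings here so that leading zeros survive (many states have codes that start with 0).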

Note: The booster percentage is a percentage of the people who have completed the series, not of the county population.
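To put the booster figure on the same footing as the vaccination rate, it can be multiplied through. A small sketch of that arithmetic, with a hypothetical function name and invented numbers:

```python
def booster_share_of_county(series_complete_pct: float, booster_pct_of_completed: float) -> float:
    """Convert a booster percentage measured against fully vaccinated residents
    into a percentage of the whole county population."""
    return series_complete_pct * booster_pct_of_completed / 100.0


# Invented example: 40% of the county is fully vaccinated and
# 30% of those people are boosted, so 12% of the county is boosted.
print(booster_share_of_county(40.0, 30.0))  # 12.0
```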

One could think that this is good until you look comparatively. 50 miles south is the KC metro region, and we can compare each county in the metro region to Buchanan.

Let's compare with the KC region

An interesting thing is the kink in the graph: those vertical lines in August 2021 are essentially new records being re-classified. As you can see, Buchanan County is not keeping pace with the other counties. My next thought was to figure out whether it matters: do the vaccines actually make a difference, not just individually but as a policy? So I joined those sets together and ran some regressions.

Let's take a look at the vaccination and Covid sets together

So I have all the raw numbers but not the percentages. I want to see whether, proportionally, the vaccines made a difference in the 3,133 counties that I am tracking in the USA.
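Turning raw counts into proportions is a one-line operation per column. A minimal sketch, assuming a pandas DataFrame with one row per county; the column names, FIPS codes, and counts are all hypothetical:

```python
import pandas as pd

# Hypothetical county rows: every value and column name here is invented.
counties = pd.DataFrame({
    "fips": ["00001", "00002", "00003"],
    "population": [100_000, 50_000, 20_000],
    "deaths": [300, 100, 80],
    "series_complete": [60_000, 20_000, 7_000],
})

# Deaths per 100,000 residents and fully vaccinated share of the population.
counties["deaths_per_100k"] = counties["deaths"] * 100_000 / counties["population"]
counties["vaccination_pct"] = counties["series_complete"] * 100 / counties["population"]

print(counties[["fips", "deaths_per_100k", "vaccination_pct"]])
```

Scaling by population is what lets a county of 20,000 be compared with a county of 2 million on the same axes.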

Question: Do the vaccination rates of a county have anything to do with its death rates?

This clearly is statistically significant. I would also note that I did not parse the data or choose which subgroup of the data to use. Simply put, the slope of the regression line indicates that the more a county is vaccinated, the fewer of its people die. One can argue about correlation versus causation, but when the two variables are vaccination rates for a disease and deaths from that same disease, I think it is reasonable to infer causation.
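For readers who want to see what sits behind a statement like "the slope is statistically significant", here is a minimal sketch of a one-variable least-squares fit with the t-statistic of the slope. It runs on synthetic data with a negative slope built in, so the numbers it prints say nothing about the real counties; `simple_ols` and the variable names are my own, not the notebook's.

```python
import numpy as np


def simple_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns the slope, the intercept, and the t-statistic of the slope.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_centered = x - x.mean()
    sxx = np.sum(x_centered ** 2)
    slope = np.sum(x_centered * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    residual_var = np.sum(residuals ** 2) / (n - 2)
    slope_se = np.sqrt(residual_var / sxx)
    return slope, intercept, slope / slope_se


# Synthetic stand-in data: death rate falls by 2 per point of vaccination, plus noise.
rng = np.random.default_rng(0)
vaccination_pct = rng.uniform(20, 80, size=500)
deaths_per_100k = 400 - 2.0 * vaccination_pct + rng.normal(0, 30, size=500)

slope, intercept, t_stat = simple_ols(vaccination_pct, deaths_per_100k)
print(f"slope={slope:.2f}, intercept={intercept:.1f}, t={t_stat:.1f}")
```

A t-statistic far from zero is what "statistically significant" means here: the fitted slope is many standard errors away from a flat line.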

I decided to look deeper into the data to see if there are any leading indicators of behavior, for example the type of county.

The data includes a classification of each county as Metro or Non-Metro (CDC classification).

So according to the above graph, people who live in a "metro" area get a greater advantage from vaccination than people in non-metro areas. You can see this explicitly in the linear regressions below: the coefficient for metro is -0.0034, compared to -0.0025 for non-metro.
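Fitting one regression per county type can be sketched as follows. This is an illustration on synthetic data, with slopes chosen arbitrarily so the two groups differ; it does not reproduce the coefficients quoted above, and the column names are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 400

# Synthetic counties: half metro, half non-metro (column names are assumptions).
df = pd.DataFrame({
    "metro_status": np.repeat(["Metro", "Non-metro"], n // 2),
    "vaccination_pct": rng.uniform(20, 80, size=n),
})
# Slopes of -3 and -1 are invented purely so the groups differ.
true_slope = df["metro_status"].map({"Metro": -3.0, "Non-metro": -1.0})
df["deaths_per_100k"] = 500 + true_slope * df["vaccination_pct"] + rng.normal(0, 20, size=n)


def fit_slope(group: pd.DataFrame) -> float:
    """Least-squares slope of death rate on vaccination rate within one group."""
    slope, _intercept = np.polyfit(group["vaccination_pct"], group["deaths_per_100k"], 1)
    return slope


# One regression per classification.
slopes = {status: fit_slope(group) for status, group in df.groupby("metro_status")}
print(slopes)
```

A more negative slope for one group means each extra point of vaccination is associated with a larger drop in death rate in that group.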

This difference piqued my interest, so I started looking at other classifications. The one that came to mind was the ferocity of people's views on vaccines in relation to their political leanings. So I went and got voting data for each county in the USA.

I needed to join the data sets I had, combining data on Covid infections, deaths, and vaccination rates with the new voting records. I created a "Trump margin" value, which is the percentage by which Trump won (or lost) the county in the 2020 election. So -10 would mean the county went Biden 60% and Trump 40%.
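A sketch of how such a margin could be computed and joined onto the Covid data by FIPS code. The exact formula used in this notebook is not shown, so the one below (Trump's vote share minus Biden's) is an assumption, as are the column names and vote counts. Under this definition a county that went 60% Biden and 40% Trump scores -20; if the margin is instead measured as Trump's share relative to 50%, the values are half as large.

```python
import pandas as pd

# Hypothetical vote counts (column names and numbers are invented).
votes = pd.DataFrame({
    "fips": ["00001", "00002"],
    "votes_trump": [4_000, 7_000],
    "votes_biden": [6_000, 3_000],
    "votes_total": [10_000, 10_000],
})
votes["trump_pct"] = 100 * votes["votes_trump"] / votes["votes_total"]
votes["biden_pct"] = 100 * votes["votes_biden"] / votes["votes_total"]
# Assumed definition: positive means Trump won the county, negative means Biden won.
votes["trump_margin"] = votes["trump_pct"] - votes["biden_pct"]

# Hypothetical Covid table keyed by the same FIPS codes.
covid = pd.DataFrame({
    "fips": ["00001", "00002", "00003"],
    "vaccination_pct": [62.0, 41.0, 55.0],
})

# Inner join keeps only counties present in both tables;
# validate raises if a FIPS code is duplicated on either side.
merged = covid.merge(
    votes[["fips", "trump_margin"]], on="fips", how="inner", validate="one_to_one"
)
print(merged)
```

The `validate="one_to_one"` check is a cheap guard against a duplicated county silently doubling rows in the joined table.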

Then I plotted each county's vaccination rate versus its death rate. I color coded by Trump margin, with more red meaning a larger Trump margin and more blue a larger Biden margin. I also sized each point by the population of the county.

As a mathematician I knew I was on to something from this picture, so I started to dig. The first thing I did was plot Trump margin versus vaccination rate.

Looking at all counties and all types of counties, this is impressive. The more your county voted for Trump, the less likely it was that a random person in your county was fully vaccinated. If you look at the linear regression, you can see it is statistically significant and the confidence interval is very tight. Essentially, for every 2 percentage points a county's vote shifted toward Trump, its vaccination rate dropped by one percentage point.

I was worried that this data was skewed by the size of these counties, with more Democratic counties being "metro" and more Republican ones being "non-metro". So I split all 3,133 counties into quartiles by population, plus an extra group for very large cities (500k+).
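One way to build such size classes is sketched below on synthetic populations. Whether the quartiles in this notebook were computed before or after setting aside the 500k+ group is not stated, so computing them on the remaining counties only is an assumption, as are the labels.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Synthetic, roughly log-normal county populations (not the real counties).
population = pd.Series(
    np.round(rng.lognormal(mean=10.3, sigma=1.4, size=3000)).astype(int),
    name="population",
)

LARGE_CUTOFF = 500_000
is_large = population >= LARGE_CUTOFF

# Assumption: quartiles are computed only on counties below the cutoff.
quartile = pd.qcut(population[~is_large], q=4, labels=["Q1", "Q2", "Q3", "Q4"])

# Start everyone in the large group, then overwrite the smaller counties.
size_class = pd.Series("500k+", index=population.index, dtype=object)
size_class.loc[~is_large] = quartile.astype(object)

print(size_class.value_counts())
```

Each regression can then be repeated per class with a `groupby` on `size_class`, which is how one checks that the pattern is not driven by county size alone.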

Let's see if it is the same across all population sizes

For each of these classifications the results were the same: statistically significant, with approximately a 2:1 ratio of Trump vote to vaccination decrease.

So the morbid part of me decided to look at death rates instead of vaccination rates. I already know that they are connected, but I wanted to see the relationship between voting for Trump and the chance of death.

Question: Did this have any effect on deaths in those counties?

At last I saw a change.

When you look at counties with populations below 11,000, there was no relationship between Trump margin and the death rate. The regression line is horizontal, indicating that it does not matter which way the county voted: you have an equal chance of dying. Unfortunately, this lack of effect applied to only 4.6 million people.

For every other size of county the result was statistically significant, and all showed that the more a county voted for Trump, the greater the chance of dying.

The other thing I wanted to see was whether this was still the case once the vaccines were freely available.

Let's have a look at this again, but from June 1st 2021, when vaccines were freely available

Essentially the same thing happened. It had more of a relative effect, since the proportion of each county's population that was dying was smaller, but the comparative slope of the regression line was steeper. What this means is that the more a county went Democratic, the better its residents' chance of not dying, compared to before the vaccines were available.
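Restricting deaths to the period since June 1st 2021 can be sketched as a difference of cumulative totals. This assumes the death data is stored as a running total per county per day, which may not match the actual source files; the column names and counts are invented.

```python
import pandas as pd

# Hypothetical cumulative death counts for two counties.
daily = pd.DataFrame({
    "fips": ["00001"] * 3 + ["00002"] * 3,
    "date": pd.to_datetime(["2021-05-31", "2021-06-01", "2022-02-23"] * 2),
    "cumulative_deaths": [100, 101, 160, 40, 40, 90],
})

START = pd.Timestamp("2021-06-01")  # vaccines freely available
END = pd.Timestamp("2022-02-23")    # last day used in this notebook

# Running total on the last day before the window opens.
before = (
    daily[daily["date"] < START]
    .sort_values("date")
    .groupby("fips")["cumulative_deaths"]
    .last()
)
# Running total on the last day inside the window.
at_end = (
    daily[daily["date"] <= END]
    .sort_values("date")
    .groupby("fips")["cumulative_deaths"]
    .last()
)

# Counties with no rows before the window are treated as starting from zero.
recent_deaths = (at_end - before.reindex(at_end.index, fill_value=0)).rename(
    "deaths_since_june_2021"
)
print(recent_deaths)
```

Dividing `recent_deaths` by population then gives the post-availability death rate used on the vertical axis.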

I wanted to see whether there was something else going on, so I did a multiple regression using Trump margin and vaccination rates.

What is strange is that the vaccination rate has less of an effect on dying than voting for Trump. Trump margin, rather than vaccination rate, was the leading indicator of dying. Now this is obviously partially explained by the fact that there is such a strong relationship between Trump voting and vaccination rates, but I speculate it must be more than vaccination rates, or else the vaccination rate would have been the leading indicator. So there is something inherent in the relationship between counties that voted for Trump and death rates. One could speculate that it is the age of people in those counties, and it might be that way for some counties, but this result holds for each of the subgroups of counties with populations of 12,000 and over.
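To compare the weight of two predictors measured in different units, one common approach is to standardize them before fitting. The sketch below does that on synthetic data in which the two predictors are strongly correlated, as they are in the discussion above. It is not the notebook's regression, every coefficient in the data generation is invented, and the variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Synthetic counties: vaccination falls about 1 point per 2 points of margin.
trump_margin = rng.normal(0, 30, size=n)
vaccination_pct = 55 - 0.5 * trump_margin + rng.normal(0, 8, size=n)
# Invented relationship: both predictors contribute to the death rate.
deaths_per_100k = 300 + 1.0 * trump_margin - 1.5 * vaccination_pct + rng.normal(0, 40, size=n)


def zscore(values):
    """Rescale to mean 0 and standard deviation 1."""
    return (values - values.mean()) / values.std()


# Design matrix: intercept column plus the two standardized predictors.
X = np.column_stack([np.ones(n), zscore(trump_margin), zscore(vaccination_pct)])
coef, *_ = np.linalg.lstsq(X, deaths_per_100k, rcond=None)
intercept, beta_margin, beta_vax = coef

# How strongly the two predictors move together.
correlation = np.corrcoef(trump_margin, vaccination_pct)[0, 1]
print(f"beta_margin={beta_margin:.1f}, beta_vax={beta_vax:.1f}, "
      f"predictor correlation={correlation:.2f}")
```

Each standardized coefficient is the change in death rate per one standard deviation of that predictor, holding the other fixed. When the predictors are this strongly correlated, the split of credit between them becomes sensitive to noise, which is one reason to treat the ranking of the two with some caution.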

What about voter turnout and civic awareness?

This was something I came up with while drinking some wine and looking at these graphs. What if it is not about Democrat or Republican? What if it is about civic engagement, measured in part by voter turnout?

The next graph just satisfied my curiosity about something talking heads used to say: the more people turn out, the more people vote Democrat. I wanted to see if this is true.

The negative slope indicates this: the higher the voter turnout, the higher the Biden percentage.

What's more interesting is vaccination rates compared with voter turnout

This in itself was interesting, but predictable and statistically significant.

Another thing was just to look at infection rates and voter turnout. In this graph there seems to be a nice uniform distribution of voter types, but the inference is going in the same direction.

The higher the voter turnout, or civic awareness, the fewer infections there have been.

This also translates into deaths versus voter turnout

Also, recent deaths versus voter turnout

I hope you enjoyed this notebook.

All of the code can be used by anybody, and the data is available via the links given at the top.