tl;dr: i repeated frank's work from scratch today, and it's terrible. perhaps he was once a talented chemist, but he's clearly an awful statistician. his claim that he can predict election turnout is utterly bogus: he simply fits a line to 2020 county-level voter turnout data within a state, then says "wow this line predicts 2020 county-level voter turnout in this state!" well, no shit.
all the voter data i used can be found here.
the first point frank makes is to show a plot like this and talk about how it's impossible for the registered voters curve to so closely match all the bumps and wiggles of the turnout curve. he also shows a plot of population vs age from census data and points out that the same bumps and wiggles are present. he never says why it should be impossible, he just shows it and says "what are the odds???"
pretty good odds, actually. just think about it: are 55-year-olds significantly more or less likely to vote than 56-year-olds? of course not. the voter participation rate doesn't swing wildly over small age differences. so, if there are more 56-year-olds than 55-year-olds, and if the voter participation rate is the same for both ages, then we should expect to see more total ballots from 56-year-olds.
to give some intuition for this, let's imagine that voter participation doesn't vary with age at all; that is, everyone is equally likely to vote regardless of age. here's what frank's plot for hamilton county, ohio, would look like if the voter participation rate were 80%.
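here's a minimal sketch of that thought experiment in python. the registration counts are made up (arbitrary bumps, not the real hamilton county roll), and `simulate_ballots` and `pearson` are just helper names i picked for illustration. every registered voter gets the same 80% chance of voting, and the ballots curve ends up tracking the registered curve almost perfectly.

```python
import random
from math import sqrt

def simulate_ballots(registered_by_age, rate, seed=0):
    """Each registered voter independently votes with probability `rate`."""
    rng = random.Random(seed)
    return {
        age: sum(rng.random() < rate for _ in range(n))
        for age, n in registered_by_age.items()
    }

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# made-up registration counts with arbitrary bumps; NOT real voter-roll data
registered = {age: 1000 + 300 * ((age * 7) % 5) for age in range(18, 91)}
ballots = simulate_ballots(registered, rate=0.80)

ages = sorted(registered)
r = pearson([registered[a] for a in ages], [ballots[a] for a in ages])
print(f"correlation between registered and ballots: {r:.4f}")
```

the bumps in the ballots curve come entirely from the bumps in the registration counts. nothing in the simulation knows about age beyond how many people are registered at each age.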
in other words, frank's plot isn't showing anything impossible. he's simply demonstrating that his data are consistent with the null hypothesis, namely that voter participation barely changes between neighboring ages. 55-year-olds are about as likely to vote as 56-year-olds. nothing surprising there.
next, frank talks about his so-called "key" to predicting voter turnout in a state. he shows a plot like the first figure and says that every county in a state has the same ratio of ballots received to registered voters at each age. in other words, for every county, the red line sits at the same proportional distance from the black line. here's an example for hamilton county:
frank says the curve on the right is identical for all ohio counties, but that's not correct. here's the same plot for hamilton and franklin counties. they're similar, but not identical.
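for anyone who wants to check the "identical" claim themselves, here's a rough sketch of the comparison. the counts below are invented for two hypothetical counties (they are not the hamilton or franklin numbers), and `turnout_by_age` and `max_gap` are names i made up. the idea is just ballots divided by registered voters at each age, then the biggest difference between two counties' curves.

```python
def turnout_by_age(registered, ballots):
    """Ballots cast divided by registered voters, for each age."""
    return {
        age: ballots.get(age, 0) / n
        for age, n in registered.items()
        if n > 0  # skip ages with nobody registered
    }

def max_gap(curve_a, curve_b):
    """Largest absolute difference between two curves over their shared ages."""
    shared = curve_a.keys() & curve_b.keys()
    return max(abs(curve_a[age] - curve_b[age]) for age in shared)

# made-up numbers for two hypothetical counties, not real voter-roll data
county_a = turnout_by_age({30: 1000, 50: 1200, 70: 900},
                          {30: 600, 50: 900, 70: 765})
county_b = turnout_by_age({30: 400, 50: 500, 70: 300},
                          {30: 232, 50: 380, 70: 249})

print(f"largest gap between the two curves: {max_gap(county_a, county_b):.3f}")
```

if the curves really were identical, the gap would be zero at every age. similar-but-not-identical curves give a small nonzero gap, which is what the plot shows.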
he then makes these proportional difference curves for every county in ohio. i was too lazy to download every county voter roll, but here are the curves for a random-ish sample of ohio counties large and small:
here's where shit really goes awry. he fits a polynomial curve to these data (fyi there is absolutely nothing special about a sixth-order polynomial), but he doesn't specify how. not a huge deal, i assume he just minimizes mse or rms error or some other loss function. the way i did it was much simpler: take the average voter participation at each age. lol that is literally all he's actually done. see for yourself:
this is exactly the "6th order polynomial" he shows at around 11:43 of the lindell video. it's just the voter participation at each age, averaged across counties. i added the 2-sigma standard error in blue shading to mine, but otherwise they're the same curve.
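the averaging step is about as simple as it sounds. here's a sketch, assuming an unweighted mean across counties at each age and a band of 2 standard errors of that mean (the exact weighting isn't something i'm claiming frank used). the turnout rates are invented, and `average_curve` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def average_curve(county_curves):
    """Per-age mean turnout across counties, plus a 2x standard-error band.

    `county_curves` maps county name -> {age: turnout rate}.
    Returns {age: (mean_rate, two_standard_errors)}.
    """
    curves = list(county_curves.values())
    shared_ages = sorted(set.intersection(*(set(c) for c in curves)))
    result = {}
    for age in shared_ages:
        rates = [c[age] for c in curves]
        # standard error of the mean; undefined for a single county, so use 0
        se = stdev(rates) / sqrt(len(rates)) if len(rates) > 1 else 0.0
        result[age] = (mean(rates), 2 * se)
    return result

# made-up turnout rates for three hypothetical counties
curves = {
    "county_a": {30: 0.60, 50: 0.70, 70: 0.85},
    "county_b": {30: 0.58, 50: 0.80, 70: 0.83},
    "county_c": {30: 0.62, 50: 0.90, 70: 0.84},
}
avg = average_curve(curves)
for age, (rate, band) in avg.items():
    print(f"age {age}: {rate:.3f} +/- {band:.3f}")
```

no polynomial needed. a smooth curve fit through the per-county points will land on roughly the same place as this per-age average, because that's what a least-squares fit is pulled toward.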
he concludes by basically saying that this is incredible because you can use this line to predict 2020 voter turnout for any county in ohio: for each age you simply multiply the number of registered voters by the average turnout rate for that age... lmao no fucking shit, dude. you fit a curve to a bunch of data, and now you're acting shocked that the curve fits your data lolololololol. that's not prediction. that's just working forward to an average, then working backward to the original number, all with the same data.
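to make the circularity concrete, here's a small sketch with invented counts for two hypothetical counties. one assumption to flag: i use a pooled (registration-weighted) average rate per age here, which may not be exactly how frank averages, and `pooled_rate_by_age` and `predict_ballots` are names i made up. with that choice, the "predicted" ballots summed over the counties that built the curve equal the actual ballots by pure algebra.

```python
def pooled_rate_by_age(counties):
    """Turnout rate per age, pooled over every county passed in.

    `counties` maps name -> (registered_by_age, ballots_by_age).
    """
    registered_total, ballots_total = {}, {}
    for registered, ballots in counties.values():
        for age, n in registered.items():
            registered_total[age] = registered_total.get(age, 0) + n
            ballots_total[age] = ballots_total.get(age, 0) + ballots.get(age, 0)
    return {
        age: ballots_total[age] / registered_total[age]
        for age in registered_total
        if registered_total[age] > 0
    }

def predict_ballots(registered, rate_by_age):
    """The 'prediction': registered voters times the average rate, per age."""
    return {age: n * rate_by_age[age] for age, n in registered.items()}

# made-up counts for two hypothetical counties, not real voter-roll data
counties = {
    "county_a": ({30: 1000, 50: 1200}, {30: 600, 50: 900}),
    "county_b": ({30: 400, 50: 500}, {30: 232, 50: 380}),
}
rates = pooled_rate_by_age(counties)

for age in (30, 50):
    predicted = sum(predict_ballots(reg, rates)[age] for reg, _ in counties.values())
    actual = sum(bal[age] for _, bal in counties.values())
    print(f"age {age}: predicted {predicted:.1f}, actual {actual}")
```

the match says nothing about the election. it only says that dividing by a number and then multiplying by the same number gets you back where you started. a real test would build the curve on one set of data and check it against data the curve has never seen.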
the best part is that in the lindell video he says explicitly that each state has its own curve. the same curve doesn't work for all states. lol so it has absolutely no predictive power whatsoever.
end note: i'm still not sure how this information would be useful to anyone who wants to rig the election with fake ballots. the data he and i are using includes the vote tallies. so if the extra votes don't show up in the tallies, then what difference do all these age patterns make? none of this makes sense. anyone who falls for it is simply failing to think critically.