For PubAffr819: What not to say to a policy analyst

(1) Do not make absolutist claims without knowing the nature of the data; (2) do not abuse statistical terminology; (3) do not claim a conspiracy just because the data don’t match your preferred narrative.

First, consider a comment on the death toll from Hurricane Maria:

This [assertion that thousands of American citizens have died] is flat out false, Menzies. Excess deaths in PR through the end of the year, as recorded by the Bureau of Statistics, amounted to only 654 people. Most of them fell in the last ten days of September and the whole of October. Although the power outages there were exacerbated by government ownership of the PR utility, most of the excess deaths would likely have occurred regardless, given the terrain and the strength of the hurricane. So perhaps 300-400 excess deaths would have occurred regardless of the steps anyone could have taken to fix the power supply. The remainder can be attributed mainly to state ownership of the electricity company.

I would like to note that in December the excess mortality fell by half. Thus, the evidence suggests that the hurricane hastened the death of sick and dying people rather than killing them all at once. I expect excess deaths over a yearly horizon (say before October 1, 2018) to be perhaps 200-400. Still a noticeable number, but certainly not 4600.

See analysis:

I would point out that the official death toll is 2,975, according to a GWU report commissioned by the Commonwealth of Puerto Rico. See the discussion of estimates here.

Second, a 2018 post regarding uncertainty in statistical inference.

Mr. Steven Kopitz objects to the point estimate (4645) and confidence interval (793, 8498) from the study led by the Harvard School of Public Health for excess mortality in Puerto Rico after Maria, thus:

Is Harvard behind the research or not?

That is, does Harvard SPH think the central estimate of excess deaths as of December 31st is 4645 or not? Does it stand behind the confidence interval or not? Is there a 50% or greater chance that the death toll is over 4600? If there is, then the PR people should start looking for the 3,250 missing, or the press should assume that the PR people are lying. These are implied actions.

Or should we just take whatever number HSPH releases in the future and divide by 3 to get a realistic estimate of the actual value?

Let me show the detail of the graph displayed earlier (in this post):

Figure 1: Estimates from Santos-Lozada and Geoffrey Howard (November 2017) for September and October (calculated as the difference between the average estimates), Nashant Kishore et al. (May 2018) for December 2017 (blue triangles), Roberto Rivera and Wolfgang Rolke (February 2018) (red square), and a Santos-Lozada estimate based on administrative data released 6/1 (large dark blue triangle); end-of-month figures, all on a logarithmic scale. + indicates upper and lower bounds of 95% confidence intervals. The orange triangle is Steven Kopitz’s year-end estimate as of June 4. The cumulative October figure for Santos-Lozada and Howard is the author’s calculation based on reported monthly figures.

The middle paragraph (highlighted in red) shows a misunderstanding of what a confidence interval is. The true parameter is either in the confidence interval or not. Rather, this would be the best characterization of the 95% CI:

“If this procedure were repeated over multiple samples, the proportion of calculated confidence intervals (which will be different for each sample) that capture the true population parameter would tend to 95%.”

In other words, it would be a mistake to talk about a 50% chance that the actual number is higher than the point estimate. But this is precisely how Mr. Kopitz interprets the confidence interval. He is wrong in this regard. From Politifact:

University of Puerto Rico statistician Roberto Rivera, who, along with his colleague Wolfgang Rolke, used death certificates to estimate a much lower number of deaths, said proxy estimates should be interpreted with caution.

“Note that, according to the study, the true number of deaths due to Maria could be any number between 793 and 8498: 4645 is no more likely than any other value in that range,” Rivera said.
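The repeated-sampling reading of a 95% CI quoted above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the population (normal, mean 100, standard deviation 15), the sample size, and the function name `coverage_rate` are all hypothetical and have nothing to do with the Puerto Rico data. It draws many samples, builds a normal-approximation 95% interval from each, and counts how often the interval captures the true mean.

```python
import random
import statistics


def coverage_rate(true_mean=100.0, sd=15.0, n=100, trials=2000, seed=0):
    """Share of 95% intervals (one per simulated sample) that contain true_mean.

    All parameter values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    z = 1.96  # normal-approximation critical value for a 95% interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        std_err = statistics.stdev(sample) / n ** 0.5
        lower, upper = mean - z * std_err, mean + z * std_err
        # For any single sample, the true mean is either inside or outside.
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals covering the true mean: {rate:.3f}")
```

The share should land close to 0.95. The 95% describes the long-run behavior of the procedure across samples, not the probability that any one computed interval contains the truth, and it says nothing about where within a given interval the true value lies.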

Once again, I think it’s best if those who want to comment on the estimates are familiar with statistical concepts.

Third, here is an example of data paranoia from a recent post.

Reader Steve Kopitz writes about the debate over the employment numbers:

At the same time, I considered it possible that both surveys were in fact correct, but skewed by the effect of suppression recovery, which created a misleading impression because we misinterpreted the data. It still seems possible, although I’ve read that others think CES was manipulated to create a rosier picture ahead of the election.

This statement joins a long list of such statements, for example from Senator Barrazo, Jack Welch, former Rep. Allan West, Zerohedge, and Mick Mulvaney, among others. All I can say is that if there was a conspiracy, they didn’t do a very good job. Thanks to the January benchmark revision, we can update our assessment of how poorly the alleged conspirators did their jobs.

Figure 1: Nonfarm payroll employment from the January 2023 release (red) and from the October 2022 release (blue), in thousands, s.a. Source: BLS via FRED.

Now, over time, it may turn out (after the next benchmark revision, the results of which will be made public in February 2024) that second-quarter NFP was lower than indicated by the CES. But as a way of rigging the numbers ahead of the November 2022 election, it seems like a lousy way to go.

In any case, before people start screaming about the data being manipulated, I would like them to read the BLS technical notes on (1) revisions and mean absolute revisions, (2) benchmark revisions, (3) calculation of seasonal adjustment factors, and (4) application of population controls in the CPS. Before they start quoting different series, I would like them to understand the informational content (relative to business cycle fluctuations) of the CPS employment series compared to that of the CES employment series. This understanding can be gained by reading the work of people who understand the characteristics of macro data (Furman (2016); CEA (2017); Goto et al (2021)).
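To make the idea of a mean absolute revision concrete, here is a minimal sketch. It compares two vintages of the same series level by level. The numbers are invented for illustration and are not BLS data, the function name is hypothetical, and the definition BLS uses in its technical notes may differ (for instance, by working with over-the-month changes rather than levels).

```python
def mean_absolute_revision(earlier_vintage, later_vintage):
    """Average absolute difference between two vintages of the same series."""
    if len(earlier_vintage) != len(later_vintage):
        raise ValueError("vintages must cover the same periods")
    if not earlier_vintage:
        raise ValueError("vintages must not be empty")
    revisions = [abs(new - old) for old, new in zip(earlier_vintage, later_vintage)]
    return sum(revisions) / len(revisions)


# Hypothetical employment levels in thousands (NOT actual BLS figures).
october_vintage = [152_000, 152_300, 152_700]
january_vintage = [152_100, 152_500, 153_000]

mar = mean_absolute_revision(october_vintage, january_vintage)
print(f"Mean absolute revision: {mar:.1f} thousand")
```

Comparing the size of a given revision against the typical revision is one way to judge whether a change between releases is unusual or simply routine.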

From a sociological standpoint, I wonder why conspiracy theories are so appealing to some people. Here is a Scientific American article laying out some character traits that are associated with adherence to conspiracy theories.