Paper suggests that even second-digit analysis cannot be used #16

Open
ghost opened this issue Nov 7, 2020 · 1 comment

ghost commented Nov 7, 2020

Please refer to chapter 2 in the following paper:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.697.5592&rep=rep1&type=pdf

The paper suggests precinct results in previous elections in a number of countries do not seem to follow the second-digit Benford distribution.

Let me try to outline why this does not hold for second digits either. If the precincts in a city are designed so that the votes for a certain candidate follow a chi-squared distribution with an expected value of 5000 and a certain deviation, then the most likely result is close to 5000 (2nd digit: 0). The next most likely results are 4999 and 5001 (2nd digits: 9 and 0), then 4998 and 5002 (2nd digits: 9 and 0), and so on, so the second digits pile up on 9 and 0. (edit: I got this wrong the first time)
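A rough simulation can make the argument above concrete. This is only a sketch under stated assumptions: the function names are hypothetical, the number of precincts is arbitrary, and the chi-squared distribution is taken with 5000 degrees of freedom (mean 5000, standard deviation 100), drawn through the equivalent gamma distribution in Python's standard library.

```python
import random
from collections import Counter


def second_digit(n: int) -> int:
    """Return the second most significant decimal digit of n (requires n >= 10)."""
    return int(str(n)[1])


def simulate_second_digits(num_precincts: int = 10_000,
                           mean_votes: int = 5000,
                           seed: int = 0) -> dict:
    """Tabulate second-digit frequencies for simulated precinct vote counts.

    Assumption: votes follow a chi-squared distribution with `mean_votes`
    degrees of freedom, which is Gamma(shape=k/2, scale=2) with mean k.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_precincts):
        votes = round(rng.gammavariate(mean_votes / 2, 2.0))
        counts[second_digit(votes)] += 1
    return {d: counts[d] / num_precincts for d in range(10)}


freqs = simulate_second_digits()
for digit in range(10):
    print(f"second digit {digit}: {freqs[digit]:.3f}")
```

Under these assumptions most simulated counts fall between 4900 and 5099, so the second digits 9 and 0 should account for well over half of the precincts, while the middle digits barely appear.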

On the other hand, for a Benford distribution, the most likely result is 1. The second most likely result is 2. The third most likely result is 3. Etc.
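For reference, the digit probabilities that Benford's Law predicts can be computed directly from the standard formulas: P(first digit = d) = log10(1 + 1/d), and P(second digit = d) is the sum of log10(1 + 1/(10k + d)) over first digits k = 1..9. The function names below are hypothetical; this is a small sketch, not part of any analysis in this repository.

```python
import math


def benford_first_digit(d: int) -> float:
    """Benford probability that the leading digit equals d (d in 1..9)."""
    return math.log10(1 + 1 / d)


def benford_second_digit(d: int) -> float:
    """Benford probability that the second digit equals d (d in 0..9)."""
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


for d in range(1, 10):
    print(f"first digit {d}:  {benford_first_digit(d):.4f}")
for d in range(10):
    print(f"second digit {d}: {benford_second_digit(d):.4f}")
```

The second-digit probabilities decline only gently from 0 to 9, which is quite different from a distribution concentrated on one or two digits.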

Hence, using second digits does not fix the problem with planned precinct sizes. We can perhaps see from the example how Benford's Law will only work if the expected value of the distribution is 0. With rational planning of precinct sizes inside cities, that won't happen. Countryside precincts are more likely to follow the Benford pattern, as the number of votes in each precinct will be more "organically" determined and less planned.

It thus seems that the methodology cannot be applied inside cities.


dshield55 commented Nov 7, 2020

> with an expected value of 5000 and a certain deviation, then the most likely result is 5000 (2nd digit: 0).

I'm not sure how well this applies to the data I've been looking at. At first glance, I don't think it does.

The precinct sizes seem non-conforming to me as I've been looking around. Here in Milwaukee, for example, you can see they try to target precincts of around 1,000 voters, but the actual number of registered voters per precinct looks pretty random. I'm thinking the number of registered voters per precinct probably fits its own Benford curve, but on top of that, each precinct has its own turnout rate, which in Milwaukee ranges from 50% to 97% (lol @ 97% turnout rate, 680/702 voters).

[Image: milwaukeehistregpre]
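The idea that widely varying precinct sizes multiplied by varying turnout spread out the second digits can be sketched as follows. Everything here is an assumption for illustration, not the Milwaukee data: registered voters are drawn from a log-normal distribution centred near 1,000 with an arbitrary spread, turnout is drawn uniformly between 50% and 97% (the range quoted above), and the function names are hypothetical.

```python
import math
import random
from collections import Counter


def second_digit(n: int) -> int:
    """Return the second most significant decimal digit of n (requires n >= 10)."""
    return int(str(n)[1])


def simulate_turnout_second_digits(num_precincts: int = 10_000,
                                   target_size: float = 1000.0,
                                   spread: float = 0.5,
                                   seed: int = 0) -> dict:
    """Second-digit frequencies of votes = registered voters * turnout.

    Assumptions: log-normal precinct sizes (median `target_size`, log-scale
    standard deviation `spread`) and uniform turnout between 0.50 and 0.97.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_precincts):
        registered = rng.lognormvariate(math.log(target_size), spread)
        turnout = rng.uniform(0.50, 0.97)
        votes = round(registered * turnout)
        if votes >= 10:  # a second digit only exists for two or more digits
            counts[second_digit(votes)] += 1
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(10)}


freqs = simulate_turnout_second_digits()
for digit in range(10):
    print(f"second digit {digit}: {freqs[digit]:.3f}")
```

Whether real precinct data behaves this way depends on how tightly the sizes actually cluster around the target, so the output is only as meaningful as the assumed spread.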
