By-Election Calculations: Sorcery or Science?

Updated May 2020

Basics

Since I started this project, folk have occasionally told me they find my “re-calculation for a single councillor” quite dubious. How can I possibly know? Is it educated guesswork? Do I use some kind of complex statistical model? Have I got a direct line to Professor John Curtice? Do I just make it up for funsies? The short answer is simple: I literally count the ballots again.

STV calculations are complex. Counting up later preferences in full to get the correct fractional transfer is time-consuming. That’s why it can take a few days to finalise Assembly elections in Northern Ireland. And in Australia, where there can be dozens upon dozens of candidates on the ballot, it’s not unusual for it to take a month to confirm the Senate’s composition.

Alternatively, you can opt not to do a full count of later preferences, and instead take a sample and apply that to the whole surplus. So if you take a small bundle of votes for a successfully elected candidate with a 2000 vote surplus to transfer, and 8% of the next preferences in that bundle go to Jane Bloggs, you just fire over 160 votes. But that’s imprecise, and in Ireland it’s not unheard of for candidates in tight races to demand recounts on a different sample.
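To make that arithmetic concrete, here is a minimal Python sketch of a sampled transfer. The function name, the bundle size of 100 and the second candidate's name are illustrative assumptions, not taken from any real count rules; only the 2000 vote surplus and the 8% share come from the example above.

```python
def sampled_transfer(surplus, sample_ballots, candidate):
    """Scale a candidate's share of a sample bundle up to the whole surplus."""
    hits = sample_ballots.count(candidate)
    return round(surplus * hits / len(sample_ballots))


# Hypothetical bundle of 100 ballots, 8 of which name Jane Bloggs next.
sample = ["Jane Bloggs"] * 8 + ["Someone Else"] * 92
print(sampled_transfer(2000, sample, "Jane Bloggs"))  # 160

# A different bundle can give a different answer, hence the recount demands.
other_sample = ["Jane Bloggs"] * 11 + ["Someone Else"] * 89
print(sampled_transfer(2000, other_sample, "Jane Bloggs"))  # 220
```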

In Scotland, where we like our election results both quick and perfectly accurate, we decided to forgo hand counts in favour of computerisation, at least for full elections. That means you usually know the makeup of every council in Scotland within 24 hours of polls closing, with absolute certainty about transfers. But computerisation has a really useful side effect too: you end up with a wealth of public data that it isn’t feasible to collect for any other election.

For example, anyone can access a complete list of votes broken down by individual ballot box in each ward, which you can then use to identify areas where a given party did particularly well or badly. Another of these documents, the Preference Profile Report, is what I use to do my single councillor re-calcs. Here’s an excerpt from Clackmannanshire South’s:

As you can see in the above image, it’s an excruciatingly dull and at first glance really cryptic document. It’s just a list of numbers; what on earth do they mean? Well, those numbers represent absolutely every single ballot that was counted and the exact order candidates were marked. The first number on each line tells us how many ballots were marked in this order, the numbers after it are the candidates in the order they were ranked, and a 0 means no further preferences.

So the first line tells us that 12 voters decided to only vote for the first candidate on the ballot. Second from bottom tells us 2 voters marked candidates 1, 2, 3 and 7 in that order, then no more. And we can see that 3 voters actually opted to rank candidates as they were on the ballot, 1, 2, 3, 4, 5, 6, 7! This goes on for lines and lines and lines, showing every single ballot.
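As a rough illustration of how lines like these can be read by a program, here is a minimal Python sketch. The function name is hypothetical, and the three example lines are written out from the descriptions above rather than copied from the real report; the real file may also contain header or footer lines that this sketch does not handle.

```python
def parse_profile_lines(lines):
    """Turn 'count pref pref ... 0' lines into (count, [prefs]) tuples."""
    ballots = []
    for line in lines:
        numbers = [int(token) for token in line.split()]
        if len(numbers) < 2:
            continue  # skip blank or malformed lines
        count, prefs = numbers[0], numbers[1:]
        if 0 in prefs:
            prefs = prefs[:prefs.index(0)]  # 0 marks the end of the preferences
        ballots.append((count, prefs))
    return ballots


# Assumed layout, matching the three ballots described in the text.
example = ["12 1 0", "2 1 2 3 7 0", "3 1 2 3 4 5 6 7 0"]
ballots = parse_profile_lines(example)
for count, prefs in ballots:
    print(count, "ballots ranked", prefs)
```

Once the ballots are in this shape, every later step is just adding up the counts for whichever candidate each ballot currently sits with.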

In other words, you can completely re-run a previous election as if fewer councillors were being elected. No guesswork, no theories, no magic – just counting the ballots as normal. And after two whole years of doing this by hand, I finally cracked how to automate it! It seems blindingly obvious now, but I spent ages racking my brain for the solution. You can now have complete confidence that the numbers reported are spot on. Previously you could have been about 99.9% confident, as the mind-numbing work of re-calculating by hand would occasionally cause me to lose a handful of ballots in a given round, though never enough to change the outcome! The new auto-calculated versions will be found in preview posts dated from May 2020 and later.
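For the curious, a single-seat re-count can be sketched in Python roughly as follows. This is not the tool used for the previews, just a minimal illustration under stated assumptions: ballots are `(count, [prefs])` pairs, the quota is the standard Droop quota for one seat fixed from the first round, and ties for last place are broken arbitrarily by ballot number (real count rules handle ties differently). The ballots in the example are invented.

```python
def single_seat_count(ballots, candidates):
    """Eliminate the lowest candidate each round until someone meets quota."""
    remaining = set(candidates)
    quota = None
    rounds = []
    while True:
        tally = {c: 0 for c in remaining}
        for count, prefs in ballots:
            # A ballot sits with its highest-ranked candidate still in the race.
            top = next((p for p in prefs if p in remaining), None)
            if top is not None:
                tally[top] += count
        rounds.append(tally)
        if quota is None:
            # Droop quota for one seat, fixed from first-round valid votes.
            quota = sum(tally.values()) // 2 + 1
        leader = max(remaining, key=lambda c: tally[c])
        if tally[leader] >= quota or len(remaining) == 1:
            return leader, rounds
        # Assumed tie-break: among the lowest, the highest ballot number goes.
        loser = min(remaining, key=lambda c: (tally[c], -c))
        remaining.remove(loser)


# Invented ballots: (how many, candidates in ranked order).
demo = [(40, [1, 2]), (35, [2, 1]), (25, [3, 2])]
winner, rounds = single_seat_count(demo, [1, 2, 3])
print(winner)  # 2
for stage in rounds:
    print(stage)
```

In the invented example nobody reaches the quota of 51 on first preferences, candidate 3 is eliminated, and their 25 ballots move to candidate 2, who then wins with 60.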

Caveats

It is important to bear in mind, however, that voters would likely have behaved slightly differently if they had been electing just one councillor at the full election. A small but not insignificant number of voters will “bullet vote” – vote for just one candidate – even when there are multiple candidates from the same party on the ballot. Additionally, some voters will go across party lines for their later preferences rather than stay within one party for their first few. That can occasionally result in some oddities: very rarely, a party ends up with fewer votes by the end of the transfer process than it had in first preferences.

It’s almost certain that some of that is down to voters not really being sure how the system works; faced with just one candidate from their preferred party, they’d vote for that candidate. But it’s also certain that some of it accurately reflects voter preference, and so we can’t assume that all first preferences for a party would have applied to a single candidate. Since we only know how people voted but not why, it’s best just to stick strictly to what the numerical data says. The effect this will have on the general thrust of the result will be minimal in almost every case.

Additionally, in those circumstances where a party has stood multiple candidates, it’s useful to eliminate the lower-placed candidates first. This helps replicate the conditions of a by-election by getting the calculation to a point where each party has one candidate. This has no impact on the overall final result; it’s just a matter of presentation. You’ll notice this process in preview posts dating from early 2020.
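Picking out which candidates to eliminate first can be sketched like this in Python. The function name, vote figures and party labels are all hypothetical, and "lower-placed" is assumed here to mean fewer first preferences than the party's best-placed candidate; independents would each need their own unique label so they are never treated as running mates.

```python
def lower_placed_candidates(first_prefs, party_of):
    """List every candidate except each party's best-placed one."""
    best = {}
    for cand, votes in first_prefs.items():
        party = party_of[cand]
        if party not in best or votes > first_prefs[best[party]]:
            best[party] = cand
    keep = set(best.values())
    # Eliminate the rest in order, fewest first preferences first.
    extras = [c for c in first_prefs if c not in keep]
    return sorted(extras, key=lambda c: first_prefs[c])


# Invented ward: parties A and C each stood two candidates.
first_prefs = {1: 900, 2: 400, 3: 700, 4: 650, 5: 300}
party_of = {1: "A", 2: "A", 3: "B", 4: "C", 5: "C"}
print(lower_placed_candidates(first_prefs, party_of))  # [5, 2]
```

After those eliminations each party is down to a single candidate, and the count carries on as normal from there.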