Thursday 25 September 2014

Significant figures

This was one of my early radio talks, broadcast in about 1990, I think. 
 
High precision wombats ahead!
I have a peeve. It's been my pet for many years, and it doesn't look like going away. That being so, I don't see why I shouldn't try to share it around.

I got my peeve when I was quite young, about twelve or so. I used to wonder why newspaper reports of fires in America always seemed to mention damage that came to such odd amounts. Why were there so many fires costing four hundred and forty six thousand pounds, I used to ask myself. Then I found a table for converting foreign currencies, and the penny dropped.

Somewhere in America, a soot-stained and water-damaged Fire Chief had rasped out, through a smoke-filled larynx, his estimate that the cost was "a million dollars". Or maybe it was the jubilant and over-insured proprietor, secretly rubbing his hands at a successful outcome. Let's just say that it was the Fire Chief. I don't really want a defamation suit on my hands.

Eager reporters had seized on this figure of a million dollars, and scribbled it in their note-books, then rushed back to wherever they go to put things "on the wire". In distant Australia, a hack sub-editor sat, blue pencil in hand, eagerly waiting for a chance to earn his pittance by making the story seem more local.

In those far-off days, when my peeve was just a pup, we had pounds, not dollars, and one million American dollars was four hundred and forty six thousand of our pounds. Plus a few odd shillings and pence, but even sub-editors could see that would be a bit silly. So four hundred and forty six thousand pounds it became.

The Fire Chief's "million" really meant something like "closer to one million than to half a million or one and a half million". So it would not have been an undue liberty to say that the fire damage had cost half a million pounds. That estimate would at least have been consistent with the level of accuracy that the Fire Chief had claimed.
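For anyone who likes to see the sums, here is a minimal sketch in Python. It assumes the rate implied above (446 pounds per thousand dollars) and reads the Chief's "million" as anything between three quarters of a million and one and a quarter million dollars. The names `to_pounds` and `round_sig` are made up for the illustration.

```python
from math import floor, log10

# Assumed rate, taken from the figures quoted in the talk:
# one million US dollars = 446,000 pounds.
POUNDS_PER_1000_DOLLARS = 446


def to_pounds(dollars: int) -> int:
    """Convert whole dollars to whole pounds at the assumed rate."""
    return dollars * POUNDS_PER_1000_DOLLARS // 1000


def round_sig(value: int, sig: int) -> int:
    """Round a whole number to `sig` significant figures."""
    if value == 0:
        return 0
    exponent = floor(log10(abs(value)))
    return round(value, sig - 1 - exponent)


# "Closer to one million than to half a million or one and a half million"
low_pounds = to_pounds(750_000)
high_pounds = to_pounds(1_250_000)
naive_pounds = to_pounds(1_000_000)

print(f"Sub-editor's figure: {naive_pounds:,} pounds")
print(f"What the Chief's guess supports: {low_pounds:,} to {high_pounds:,} pounds")
print(f"To one significant figure: {round_sig(naive_pounds, 1):,} pounds")
```

On these assumptions the honest range is a couple of hundred thousand pounds wide, so both "four hundred thousand" and "half a million" sit comfortably inside it, and the trailing "forty six thousand" carries no information at all.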

So that was when I got my peeve. I just couldn't see how people could justify the spurious claims to accuracy that they made with their naive conversions. Not of course that it stopped when we went to decimal currency. The penny couldn't drop any more, and all over the nation, people had to settle for heaven sending zero point eight three repeaters of a cent down. That silliness eventually fell away, just in time for us to go metric.

How often have you driven along a country road, and seen a sign that indicated how far it was to the next petrol station? In the good old days, these signs usually said something like "One mile to the next Oilygas station", and they were stuck to a convenient tree, about a mile from the next Oilygas station. What do you think happened when we went metric?

You guessed it, the sign now tells us that the petrol station is located just one point six zero nine kilometres away!


I for one don't blame electronic calculators for all this. If those aids to laziness were behind these spurious claims to accuracy, we would have had one point six zero nine three four seven two at least. That brings the accuracy level down to the nearest tenth of a millimetre, with the sign needing to be changed every time the tree sways in the breeze! No, I don't blame the calculators.

Because if I did, that would not explain a recent iron man race, near my home, which was publicised as forty two point one nine five kilometres long. Funny, I thought: how do they stop the marker buoys from drifting around by at least a metre when the wind changes? I looked further, to see where the figure came from.

I didn't have to look far. The first leg of the race was a sprint of 195 metres. Everything else fell into place from that, because the next smallest unit was three point five kilometres. By now, I hope that you, the listener, will be adding, sotto voce, "to the nearest half kilometre". Or, if you are really sophisticated, you might even go so far as to say "plus or minus two hundred and fifty metres".

You see the fallacy, I hope. There were half a dozen legs to the race, all but one being guesstimated to the nearest half kilometre. Forty two point one nine five be blowed!
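A rough sketch of the arithmetic, in Python, shows how far the claimed precision outruns the measurements. It assumes exactly five rounded legs (half a dozen, less the sprint), takes the 195 metre sprint as exact, and treats each plus or minus 250 metres as if it were a standard error when combining independent errors, which is only an approximation.

```python
from math import sqrt

TOTAL_KM = 42.195      # the publicised race length
ROUNDED_LEGS = 5       # assumption: "half a dozen" legs, minus the 195 m sprint
HALF_WIDTH_KM = 0.25   # each rounded leg is good to plus or minus 250 m

# Worst case: every leg is out by the full 250 m in the same direction.
worst_case_km = ROUNDED_LEGS * HALF_WIDTH_KM

# Gentler case: the errors are independent and partly cancel,
# so they are combined in quadrature.
independent_km = sqrt(ROUNDED_LEGS) * HALF_WIDTH_KM

print(f"{TOTAL_KM} km, give or take {worst_case_km:.2f} km (worst case)")
print(f"{TOTAL_KM} km, give or take {independent_km:.2f} km (independent errors)")
```

Either way the uncertainty is more than half a kilometre, so "about 42 kilometres" is all that the figures can bear.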

If the competitors had been blown out to sea by a strong wind, they would probably have been rescued 160 kilometres away from the coast. Why? The answer ought to be obvious by now: just ask yourself what is 100 miles in kilometres?

Maybe the real cause is a mistrust of rounded figures. They say that when Mount Everest was first measured, the surveyors came up with a height of exactly twenty nine thousand feet. "Nobody will believe that!" they said, and so they reported it as twenty nine thousand and two feet. That's the story, anyhow.

I suppose that the ability to see the wrongness of the false claims to accuracy comes partly from training. A friend who studied physics at University mentioned recently how frustrating it was to do ordinary practical work. She had calculated Joule's Equivalent, which was a nice traditional thing to do, and she had enjoyed messing around with the equipment.

What she didn't like was that she had to make allowance for all of the possible errors that might have crept in during the experiment. The end result of her measurements was something like seven plus or minus seventeen. Not very satisfying to somebody whose aim in life was to unravel the mysteries of space-time.

Nor could it have been very satisfying to Joule, when he first carried out his exercise. Joule's original apparatus is to be seen in the Science Museum in London, and it's much more complex than anything that young physics students use. Obviously, Joule spent a lot of time improving his level of accuracy, and he would have understood a thing or two about significant figures.

Still, when she is older, I expect my young student to have a peeve about the next petrol station being one point six zero nine kilometres away. Just like Joule would have done.

Sometimes, the temptation to pretend about accuracy leads to delightful silliness. My favourite example comes from a study that I did some years ago of readability formulae.

Readability is one of those things that people like to talk about, but not act on. It's far easier to say that something lacks readability than to write something readable. Writing readable material is an art, not a science, but it is an art which has been cluttered up with a whole load of pseudo-scientific gobbledygook.
The mumbo-jumbo of the readability formula comes down to this: short sentences are good, long sentences are bad. Short words are good, long words are bad.

Sounds a bit like the chant of the sheep in "Animal Farm", doesn't it, but there it is. As the readability witch-doctors see it, reading passages have no intrinsic interest. The only things that determine readability are a few simple items like word length and sentence length.

The mumbo-jumbo workers have come up with various ways of combining word length and sentence length to calculate the age (or school grade) for which a piece of text is suitable.

One of the more popular examples of a readability formula was actually developed for use with Ugandans whose English was a second language, but it has been widely used on native English speakers around the world for years. The measure also has other faults that I shan't go into here. The method is dead easy to use, so who cares that it's probably invalid?

But that's a formula that fails because it was badly developed. The accuracy claimed is spurious, but it doesn't reflect a wrong use of significant figures. So my favourite failure among the readability formulae is one that falls short because it offends that peeve of mine.

Remember that each readability formula is supposed to find an age year or a school grade for which a piece of text is most suited. So what would you do with a formula for school grade which asked you to multiply the word length by four point five five, and add that to zero point zero seven eight times the sentence length, and then to take away two point two zero two nine of a grade?

I would have thought that four and a half times the word length, plus point oh eight times the sentence length, minus two point two, would have been fair enough. Accurate to the nearest nine ten-thousandths of a grade? They've got to be kidding!
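To see how little the extra digits buy, here is a small Python sketch that runs both versions of the formula side by side. The coefficients are the ones quoted above; the sample word and sentence lengths are invented for the illustration, as are the function names.

```python
def grade_as_published(word_length: float, sentence_length: float) -> float:
    """School grade using the coefficients as quoted in the talk."""
    return 4.55 * word_length + 0.078 * sentence_length - 2.2029


def grade_rounded(word_length: float, sentence_length: float) -> float:
    """The same formula with the coefficients sensibly rounded."""
    return 4.5 * word_length + 0.08 * sentence_length - 2.2


# Hypothetical passage: these two values are assumptions, not data.
sample_word_length = 1.5
sample_sentence_length = 20.0

published = grade_as_published(sample_word_length, sample_sentence_length)
rounded = grade_rounded(sample_word_length, sample_sentence_length)

print(f"As published: grade {published:.4f}")
print(f"Rounded:      grade {rounded:.4f}")
print(f"Difference:   {abs(published - rounded):.4f} of a grade")
```

For this made-up passage the two answers differ by a few hundredths of a grade, which is far less than anybody could hope to measure in a real classroom.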

Why did somebody engage in this silliness? I suspect that it was somebody after a Paper-Hanger's Delight, also known as a Ph.D.

Now a thesis can be a real slog. If it requires somebody to go and study things, to think about things, and create things, it can be extremely hard work. The alternative is to measure things, which is at once both scientific and easy.

It may not be accurate, but that's not the point. If it feels scientific, do it! The researcher has only to feed enough figures into a computer, and then press the button labelled "factor analysis", and out comes a dinky little formula, just raring to go. Or maybe the button labelled "Multiple Regression", or some other related button.

These methods have a common effect of producing multi-number factors which can be thrown into an argument (or the bin, for that matter, but they usually aren't), multi-number factors which can be thrown into an argument to produce the best available prediction of some other value. The problem is that they are only as good as the data that spawned them. If the data are even a bit wonky, then the last couple of digits in the magic numbers become something of a joke.

And speaking of jokes, I recall a Goon Show sequence which relied on the spurious accuracy principle. An expert gravely announces that a skull is, as I recall it, "one hundred thousand years old". There is a pause, while the audience absorbs this statement, and then Milligan and Sellers break into the least mellifluous version of the birthday song ever to be regularly re-played over the ABC. The ridiculous becomes apparent to all, each time it is played.

Even so, guides in caves around the world will state, in all seriousness, that "these caves are two hundred and twenty five million and thirty years old". If one dares to ask how they know this, they will explain that the caves were dated thirty years ago at two hundred and twenty five million years. It happened to me: nobody guffawed, nobody laughed, there was not even a hint of a snigger.

Without the Goons to point it up, the sense of the ridiculous was lacking. Politicians know this, and they rely on it. Next time you see that a new bridge is going to cost one billion and fifty three thousand dollars, be a bit suspicious. The billion is a ball-park figure that probably won't be within a Zurich gnome's cooee of the final cost.

The fifty thousand will be for bill-boards with politicians' names on them. The three thousand may well be there to cover the cost of free beer for the workers at the end of the job. More probably, it is there to make it look as though the whole lot has been carefully costed in incredible detail: the Everest measurement principle in living colour, as it were.

So who or what do we blame? I know that there are conservative critics of education who take delight in blaming the educational system. We find the Americans among them blaring forth things like "First we had old Math, then we had New Math, now we have aftermath!".

I don't believe the conservatives, nor do I sympathise with them. The conservatives' crime is as hideous as using unjustified significant figures. They are extrapolating from a biased perception, in a moving frame of reference, from too short a base-line.

I think the answer is a whole lot simpler. I think people have just taken far too literally the old sheepish adage that there is safety in numbers.
