Thursday, October 14, 2004

Yet more biased sampling

There is a story that during WWII Allied Bomber Command noticed they were losing too many bombers over Germany. They set up a team to study the problem. The team requisitioned a collection of squads of soldiers who went all over southern Britain looking at shot-up bombers. They recorded every bullet-hole and flak-hole in the airplanes, and eventually produced a map of the bomber showing where the planes were hit the most. They recommended armor at those spots. The armor would make the planes heavier so they would have shorter range or less payload or less speed, but it was agreed that this price was worth paying if fewer bombers were lost. The recommendations were followed.

Another team kept track of bomber losses. The missions changed, and there were more missions, so it was complicated, but this other team eventually concluded that losses were at least as high as before and maybe higher. The armor was useless or worse than useless. So the first team was assigned to repeat their work and if possible find out what had gone wrong.

While the repeat study was in progress, one of the soldiers they had counting holes in aircraft pointed out to them, "You're only making me count the holes in the planes that made it back." Sure enough, when they looked at the data they found that the spots that were shot up the most often were spots that had no vital function. The most vital spots (like the pilot's seat) were hardly shot up at all, because the planes hit there didn't make it back across the Channel to get measured.
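Here is a small simulation of that story, as a rough sketch only. The section names, the chance of losing a plane per hit, and the three hits per plane are all numbers I made up for illustration; nothing here comes from real wartime data. Every section gets hit equally often, but the hole count taken from returning planes alone makes the vital sections look nearly untouched.

```python
import random
from collections import Counter

# Hypothetical sections and loss chances, invented for this illustration.
SECTIONS = ["cockpit", "engine", "fuselage", "wingtips"]
LOSS_CHANCE_PER_HIT = {"cockpit": 0.8, "engine": 0.6, "fuselage": 0.05, "wingtips": 0.02}


def simulate(n_planes=10_000, hits_per_plane=3, seed=0):
    """Return (holes counted on all planes, holes counted on returning planes)."""
    rng = random.Random(seed)
    all_hits = Counter()
    returned_hits = Counter()
    for _ in range(n_planes):
        # Each hit lands on a section chosen uniformly at random.
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_plane)]
        # The plane gets home only if none of its hits brings it down.
        survived = all(rng.random() >= LOSS_CHANCE_PER_HIT[s] for s in hits)
        all_hits.update(hits)
        if survived:
            returned_hits.update(hits)
    return all_hits, returned_hits


all_hits, returned_hits = simulate()
for section in SECTIONS:
    print(f"{section:9s} all planes: {all_hits[section]:6d}  returned only: {returned_hits[section]:6d}")
```

The "all planes" column is what the team needed. The "returned only" column is what the soldiers could actually count.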

More biased sampling

A writer once decided to write a book about how self-made millionaires did it. He used his contacts to find a sample of self-made millionaires and fourteen of them agreed to tell him how they made their money starting from essentially nothing. He cleaned up the interviews a little and put them together into a book. He recommended their methods as ways for others to make money. The book turned into an easy $15,000 for him, once the royalties came in.

When I read the book I noticed that every one of those men had, early in the game, done something that was likely to get him jailed, and had gotten away with it.

To actually get an idea how likely those methods were to make money, the author should have started with a bunch of people just like the self-made men and watched how many of them became millionaires versus how many of them became inmates. He was only counting the ones who made it.

Of course it's impractical to do that kind of study. That's why such books give mostly useless advice.

Saturday, September 11, 2004

Biased sampling

Whatever you believe about the world, I hope it's connected to experience and facts and such.

People want to believe that the experience they've had is representative of the potential experience available to them. Usually it isn't. This causes trouble when you want to predict the future based on your experience with the past. Or to predict, well, anything.

Here is a story. Some MDs thought that diet affected heart attacks, and they had ideas about just which kinds of food promoted heart attacks. So they went to a nearby hospital and gave questionnaires to people who had recently had heart attacks, and also to people who were in the hospital for broken legs and such. The people with the broken legs were a control group. Nobody thought diet made people break their legs. So if the two groups of people ate obviously different food, that would say something. It turned out they did eat significantly different foods and so the researchers published a paper about their work.

But then doubters appeared. One of the doubts went like this: The people who had had heart attacks were all people who had *survived* their first heart attack. Many do not. Maybe the food that they ate didn't give them heart attacks. Maybe that food helped them survive heart attacks. So they were still around to fill out questionnaires.
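The two readings can be put side by side in a toy model. This is only a sketch: the risks and survival rates are invented, "food X" is a stand-in for whatever the questionnaire asked about, and the function name is mine. Half the simulated population eats food X. In one run X doubles the chance of a heart attack; in the other X has no effect on attacks but doubles the chance of surviving one. The function reports what share of the surviving heart attack patients eat X, which is all a hospital questionnaire can see.

```python
import random


def share_eating_x_among_survivors(attack_risk, survival_rate, n_people=200_000, seed=1):
    """Share of heart attack survivors who eat food X.

    attack_risk and survival_rate are dicts keyed by True (eats X) and
    False (does not eat X). All values are hypothetical.
    """
    rng = random.Random(seed)
    survivors = 0
    survivors_eating_x = 0
    for _ in range(n_people):
        eats_x = rng.random() < 0.5
        if rng.random() >= attack_risk[eats_x]:
            continue  # no heart attack, so never surveyed as a patient
        if rng.random() >= survival_rate[eats_x]:
            continue  # did not survive, so cannot fill out a questionnaire
        survivors += 1
        if eats_x:
            survivors_eating_x += 1
    return survivors_eating_x / survivors


# Reading 1: X causes heart attacks; survival is the same for everyone.
share_if_x_causes_attacks = share_eating_x_among_survivors(
    attack_risk={True: 0.2, False: 0.1},
    survival_rate={True: 0.5, False: 0.5},
)

# Reading 2: X has no effect on attacks, but helps people survive them.
share_if_x_aids_survival = share_eating_x_among_survivors(
    attack_risk={True: 0.1, False: 0.1},
    survival_rate={True: 0.8, False: 0.4},
)

print(f"X causes attacks: {share_if_x_causes_attacks:.2f} of surviving patients eat X")
print(f"X aids survival:  {share_if_x_aids_survival:.2f} of surviving patients eat X")
```

With these made-up numbers both runs should come out near two thirds, against one half in the general population, so the questionnaire alone cannot tell the two readings apart.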

Both explanations fit the data very well. Eventually researchers set up the Framingham Study, where they looked at a whole town and kept track of a variety of things on everybody in the town of Framingham, Massachusetts, and then waited to see which of them would develop heart disease. It was the only way to be sure. And even that doesn't prove cause. It could be that the people who eat the "bad" foods have an inherited disposition to heart disease that also predisposes them to eat those particular foods.

There is no easy way to avoid biased data. My best suggestion is to look for the possibility of bias, and try to avoid being too certain of your conclusions.

Sunday, September 05, 2004


If you haven't read _Systemantics_ by John Gall, I highly recommend it. Here is a review.

Here is a publisher.

It covers fundamentals of dealing with systems, particularly large systems. It's written in the style of C. Northcote Parkinson, with grandiose language. Here are some valuable things from it:

A complex system that works is invariably found to have evolved from a simple system that works.

A complex system designed from scratch never works and cannot be patched up to make it work; you have to start over, beginning with a working simple system.

This fits my experience. Does it fit yours?