Tuesday, July 17, 2007

Statistics and Polls - The numbers and assumptions they do not reveal

Political campaign season is upon us - a year earlier than normal - and all sorts of polls will be published, brandished or banished for each and every possible person and topic.
These are always treated and presented as "facts" with a margin of error attached, often 3%. Given that the vast majority of the US population (I'd say at least 70%) have never taken a college level statistics class, these margin of error numbers really have no meaning to them. And because most people do not know how a statistical average is calculated, or how the standard deviation is used to arrive at a 95% confidence level with a 3% margin of error, the polls are widely used as a pure propaganda tool by all political parties.
The newscasters, when using these polls, also state the number of people polled - with a count of 1003 people often being cited along with the margin of error "fact."
What a "scientific" poll is trying to say is that if they poll 1003 people and get these results, then poll another group of 1003 people, they should get about the same set of results. However, the accuracy is only within +/- 3 points, so if 74% of people liked something it really could be 71% or 77%, and the pollsters are fairly certain that 95% of the time the result will fall between 71% and 77%.
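To make the arithmetic concrete, here is a rough sketch of the usual textbook formula for the margin of error of a percentage. It assumes a simple random sample and a 95% confidence level (z = 1.96); the function name `margin_of_error` is made up for this example, and real pollsters may apply weighting or other adjustments not shown here.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p measured on a
    simple random sample of n people (z = 1.96 is the 95% level)."""
    return z * math.sqrt(p * (1 - p) / n)

# The 74% result from 1003 people used in the text.
p, n = 0.74, 1003
moe = margin_of_error(p, n)
low, high = p - moe, p + moe

# Pollsters usually quote the worst case, which occurs at p = 0.5.
worst_case = margin_of_error(0.5, n)

print(f"margin of error at 74%: +/- {moe * 100:.1f} points")
print(f"95% interval: {low * 100:.1f}% to {high * 100:.1f}%")
print(f"worst case (50/50 split): +/- {worst_case * 100:.1f} points")
```

Under these assumptions the worst case for 1003 people comes out at roughly 3 points, which is where the commonly quoted figure comes from.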
Polling just 1003 out of 301 million people in the US (June 2007) and then stating as fact what percentage of everyone believes something is a stretch to most people. Statistics is just that: a stretch, by mathematical means, to claim that polling 1003 people would give the same results as polling 100,300,000 adults. The formulas behind all of this are based on probability - confidence - that as you approach X number of people the accuracy increases to the point where polling more people does not significantly change the percentage found in the smaller sample (which means by more than the expected sampling error around what was observed).
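The "polling more people does not change much" claim can be checked with a short sketch. It uses the standard worst case formula for a simple random sample at 95% confidence; the sample sizes in the list are picked only for illustration, and the helper name `worst_case_margin` is hypothetical.

```python
import math

def worst_case_margin(n, z=1.96):
    """Worst case (50/50 split) margin of error for a simple random
    sample of n people at the 95% confidence level."""
    return z * math.sqrt(0.25 / n)

# Illustrative sample sizes, not taken from any real poll.
sample_sizes = [100, 1003, 10_000, 100_000, 1_000_000]
margins = {n: worst_case_margin(n) for n in sample_sizes}

for n, m in margins.items():
    print(f"{n:>9,} people: +/- {m * 100:.2f} points")
```

Note that the size of the population does not appear in the formula at all, only the size of the sample. The formula also says nothing about WHO is in the sample, only how many.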
This 1003 sounds good, and you might think that 1003 people would be enough of a sample "universe," but this sample can hide a lot. What pollsters often do not tell you is where these 1003 people come from. With 50 states, an even split means just 20 people from each state would be answering the poll. Is it a telephone, paper, or web based poll? Each of those in turn has its own unique universe of views, abilities, and outlooks. A web based poll is not going to reach people in poverty who have no web access. 20 registered voters from Boston to represent all of Mass., compared to 20 registered voters from Harvard? It is still 20 people, but those 20 from Harvard have way different outlooks than 20 from Boston. Separate out teenage voters, 20, 30, 40, and 50 year olds, and WW II era retirees, each with their own outlook, and you soon see that having just 20 from a single state to reflect all of those age groups, then adding in gender, makes using 1003 people to reflect a NATIONAL view of anything really false.
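The head count behind that argument can be worked out in a few lines. This sketch assumes the 1003 people are spread evenly across states, six age groups (teens, 20s, 30s, 40s, 50s, retirees), and two genders; that even split is a simplification taken from the paragraph above, not how any particular pollster is known to sample.

```python
import math

SAMPLE = 1003
STATES = 50
AGE_GROUPS = 6   # teens, 20s, 30s, 40s, 50s, retirees (assumed grouping)
GENDERS = 2

per_state = SAMPLE / STATES            # people per state under an even split
cells = STATES * AGE_GROUPS * GENDERS  # state x age x gender combinations
per_cell = SAMPLE / cells              # people per combination

def worst_case_margin(n, z=1.96):
    """Worst case margin of error for a simple random sample of n
    people at the 95% confidence level."""
    return z * math.sqrt(0.25 / n)

state_margin = worst_case_margin(20)

print(f"people per state: {per_state:.1f}")
print(f"subgroups: {cells}, people per subgroup: {per_cell:.2f}")
print(f"margin of error for 20 people: +/- {state_margin * 100:.0f} points")
```

With fewer than two people per subgroup, and a margin of error above 20 points for any one state, the sample cannot say much about any single slice of the country.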
The math people will trot out formulas stating that it is valid, but unless they actually DO sample each of those subgroups, they are really GUESSING, using effective propaganda wrapped around a trust in science to present what large groups of people believe.
One of the basic principles that a statistician learns is that how a question is asked, and the sequence in which questions are presented, will also influence the outcome. Do these polls mix up the question sequence, reword the questions, or have word bias built into them (the motive of the pollster), along with other hidden items that affect answers? To account for these you would have to start asking even more people, to ensure that the mere act of asking, and how it is asked, does not affect the answers received.
If you polled 1003 people in a national poll and the answer came back that the drinking age should be lowered back to 18, would you believe it when reported by a news organization? What if others repeated the poll and got the same result many times? Now what if you found out that they only polled 18 to 20 year olds to get that statistically valid answer? It is perfectly valid, repeatable, and represents the view of people nationwide - in that universe of people polled.
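A small simulation shows how a repeatable poll can still mislead when the universe polled is narrow. Every number here is invented for illustration: the share of 18 to 20 year olds (5%) and the support rates (80% among them, 30% among everyone else) are assumptions, not real survey data, and the `poll` function is a hypothetical stand-in for a real survey.

```python
import random

random.seed(2007)  # fixed seed so the run is repeatable

# All three figures below are made up for this example.
YOUNG_SHARE = 0.05    # assumed share of 18 to 20 year olds
SUPPORT_YOUNG = 0.80  # assumed support among 18 to 20 year olds
SUPPORT_OTHER = 0.30  # assumed support among everyone else

def poll(n, young_only):
    """Simulate asking n people; return the fraction who say yes."""
    yes = 0
    for _ in range(n):
        is_young = young_only or random.random() < YOUNG_SHARE
        support = SUPPORT_YOUNG if is_young else SUPPORT_OTHER
        yes += random.random() < support
    return yes / n

everyone = poll(1003, young_only=False)
young = poll(1003, young_only=True)
repeat_young = [poll(1003, young_only=True) for _ in range(5)]

print(f"poll of everyone:      {everyone * 100:.0f}% in favor")
print(f"poll of 18 to 20 only: {young * 100:.0f}% in favor")
print("repeats of 18 to 20 poll:",
      ", ".join(f"{r * 100:.0f}%" for r in repeat_young))
```

The repeated polls of the narrow group agree closely with each other, so the result looks solid, yet it sits far from the answer for the whole population.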
So the next time you see any statistic being presented as "fact," think about these (and other) requisites that must be met for polling to work before you decide to believe these "facts" on how people think - or ANY fact stated by anyone or any government.
