Know the quality of your data sources


“There are liars … damned liars … and statisticians.” So goes the old cliché.

We are (or can be) awash in numbers. Some, as the cliché suggests, mean little and serve mostly to mislead us. After all, I can “correlate, tabulate, process and screen; program, printout, regress to the mean” many more things than actually should be quantified. I can find all sorts of cross-correlations and apparent causations just by picking which variables are independent and which are dependent.

Well, if you’re undergoing your own program of personal due diligence, you’ll be paying attention to trends, which means you’ll be looking at numbers from time to time. So it makes sense to know what you’re looking at.

After all, if you were a regular viewer of the Fox News Channel, which liked to use the Rasmussen polls, last Tuesday’s US election probably served up a few “say, what?!?” moments. Viewers or readers depending on other broadcasters or a single newspaper, in turn, were similarly locked into one pollster or another. (As it happens, Nate Silver’s FiveThirtyEight analysis, done for the New York Times, called just about everything correctly, and most other polls tracked the final results more closely than Rasmussen did.)

Before you stop and say something nasty, let me also point out that someone’s got to be closer to right and someone’s got to be closer to wrong: Nanos, who’s come closest in Canadian federal politics during the last three elections, is now the outlier in Canadian polling. We’ll find out eventually whether Nanos has picked up a change earlier than everyone else — or is enjoying its turn in the unlucky box, where Rasmussen is parked this morning.

So there’s the first due diligence thing to note: have multiple sources. (In Canadian politics, the work Éric Grenier does at threehundredeight.com in blending and smoothing all the published polls makes for a better one-stop shop for that kind of data than any individual broadcaster’s or newspaper’s favourite polling company can give you. Unsurprisingly, he takes Nate Silver as his inspiration, since that’s the same approach used at FiveThirtyEight.)
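To make the blending idea concrete, here is a minimal sketch in Python. It is not Grenier’s or Silver’s actual method (both do considerably more than this). It simply weights each poll by its sample size and reports how far apart the pollsters are. The pollster names and figures are invented for illustration, and the `Poll`, `blended_share` and `spread` names are my own.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    pollster: str     # hypothetical pollster name
    share: float      # reported support for one party, in percent
    sample_size: int  # number of respondents


def blended_share(polls):
    """Sample-size-weighted average of the reported shares."""
    total = sum(p.sample_size for p in polls)
    if total == 0:
        raise ValueError("need at least one respondent")
    return sum(p.share * p.sample_size for p in polls) / total


def spread(polls):
    """Gap between the highest and lowest reported share."""
    shares = [p.share for p in polls]
    return max(shares) - min(shares)


# Invented figures, for illustration only.
polls = [
    Poll("Pollster A", 34.0, 1000),
    Poll("Pollster B", 31.0, 2000),
    Poll("Pollster C", 36.0, 1000),
]

print(round(blended_share(polls), 2))  # blended estimate across all three
print(spread(polls))                   # how far apart the pollsters are
```

A wide spread is the signal to go and read each pollster’s methodology before trusting any single number.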

It’s also important to actually look at the numbers as opposed to the commentary. In 2004, George W. Bush was seen to have “thumped” John Kerry; in 2012 Barack Obama was seen to have “squeaked out a win” against Mitt Romney. The numbers show that Obama actually won by a wider margin than Bush did. So always go to the data, and ignore the adjectives.

Industry statistics, on the other hand, can be trickier to navigate.

Take the personal computer. Smartphones, tablets and netbooks have all eroded the old desktop-or-laptop model of the industry. But no two surveys count the market in the same way.

Some, for instance, now lump netbooks in with PCs as “ultralights”. Others don’t. A year from now, some will undoubtedly lump in Surface-style Windows tablets (regardless of which vendor makes them) because “they’re all Windows 8”, whereas the Android and Apple equivalents don’t run “PC operating systems”. Others won’t.

Here’s the trend, when you actually crunch some numbers. Desktop and laptop sales have slowed. Apple, whose Mac sales aren’t growing at anywhere near the rate of the iPhone and iPad, has nonetheless taken an increasing share of that slowing market from Windows PCs. So the Dells, Lenovos, HPs, etc. have lost some business to non-PCs and some to Apple, whereas Apple has lost business only to non-PCs.

But here’s the other part: machines (despite being increasingly flaky due to poor components out of cut-price factories) are remaining useful longer. That’s a function of software: there are fewer “need to upgrade” moments. So everyone is also losing to “the one bought five years ago”. (This piece is being written on an early 2006 MacBook with 512 MB of RAM running Tiger, a six-year-old version of Mac OS X. It works just fine as a writing machine. Why buy a new one?)

No one tracks “sales that don’t happen”, and yet that may be the most important factor driving your industry and its markets.

The bottom line is this: know what data you need to track, figure out the quality of the sources (and what they’re including or excluding), and then, to be a real star, figure out what’s just plain not in the numbers that you also need to know.
