Thursday, March 27, 2003

PC Forum, Day 2 (Monday): A minor screed on using one's noggin

Somewhere leading into the 10:50 a.m. panel, I started to think about the value of using one's brain. My musings began with someone's statement that it is possible to develop “classified” information by aggregating and making sense of unclassified information. In other words, if you use your head you can take data sets that seem quite neutral and piece them together with other neutral data sets, only to see a fuller picture emerge. All of a sudden, it becomes information you're not supposed to have. So using your brain is subversive?! Well, of course it's subversive. It always has been. This is one of our greatest sources of power as individuals.

During this particular panel, I noticed that the gentlemen on the stage were speaking from a set of assumptions about data that were (in my opinion) rather ill-informed. For instance, they seemed quite pleased with themselves because they look at "links" in the data (whatever these are, mathematically speaking--correlations? path analyses? what?) and this helps them identify consumer behavior patterns, potential miscreants, etc. This really started to get under my skin.

In a sea of data, how do you know if a "link" is important or meaningful? It is very easy to achieve statistical significance at the p<.05 level if your data set is large enough. But how do you know what--if anything--it means, and whether or not it is important (back to Gilman Louie)? Suppose you interpret a "link" incorrectly? Notwithstanding the fact that being wrong is expensive and inconvenient, who will be hurt by your mistake? How do you establish or test meaning when it is likely that--at least in matters of race, age, or nationality--there is a good deal of bias involved? There are known ways of doing this, but they are not widely known in this setting. Not that there's anything wrong with that--at least it helps me stave off feelings of personal irrelevance.
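
To make the sample-size point concrete, here is a minimal Python sketch, not anything the panel showed. It assumes a hypothetical, practically negligible Pearson correlation of r = 0.01 and approximates the two-sided p-value with a normal distribution (reasonable only for large samples). The function name `two_sided_p` and the sample sizes are my own choices for illustration.

```python
import math

def two_sided_p(r, n):
    """Approximate two-sided p-value for a Pearson correlation r on n pairs.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and a normal approximation
    to the t distribution, which is reasonable only for large n.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))

r = 0.01  # hypothetical, practically negligible correlation
for n in (100, 10_000, 1_000_000):
    p = two_sided_p(r, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n={n:>9,}  p={p:.3g}  {verdict} at p<.05")
```

The correlation never changes; only the number of records does. At r = 0.01 the two variables share 0.01 percent of their variance, so crossing the p<.05 line here says something about the size of the pile and nothing about whether the "link" matters.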

To borrow from the title of the meeting, making "data come alive" should be about far more than managing, sorting, or even conducting significance tests on massive piles of data. Good data and lots of it are necessary but not sufficient conditions. One's ability to develop and test hypotheses, and one's skill at interpreting results, are the primary differentiators between a pile of data resembling (along numerous dimensions) the Augean stables, and a thing of functionality and beauty. This complex interweaving of art, science, ethical judgment, commerce, creativity, and more is fundamentally a human endeavor.

I said something to this effect on day two and was greeted with a sea of blank stares and an awkward silence. I don't know that I said it as clearly as I did just now, and I certainly wasn't as succinct. But at least I knew what I meant...and I felt compelled to say it. Etiquette tip to panels--you need to develop a PC Forum equivalent for "thank you for sharing."
