Not often am I drawn to begin a piece with a quotation, but after reading Chris Anderson’s ‘The End of Theory’ in this month’s WIRED, which he begins by quoting my beloved George E. P. Box, my reaction is best summarized in poetry, and I turn to an excerpt from the venerable T. S. Eliot and his Choruses from ‘The Rock’…
Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?
Mr. Anderson is well advised to consider these questions posed by Eliot, as his assertion that the deluge of data stored in the “clouds” of Google means the “scientific method is becoming obsolete” is not only misinformed; it presents a dangerous framework for undermining our ability to truly understand individuals as they exist, and have existed, in all aspects of life.
‘The End of Theory’ bases its argument for abandoning the scientific method, and as a result formal modeling, on the fact that Google, and its ilk, now possess so massive an archive of data that we no longer require hypothesis and testing: we simply ask the data, and it provides an answer. This view is flawed on multiple levels. First, Google’s data is itself a model, and nothing more. At best, these petabytes of data can inform our understanding of human behavior in one tiny niche (the Internet) of our existence, in which only a fraction of the people in the world actually operate, and which has furthermore existed for only a micron of our collective history. Even conceding this, it would still be impossible to gain this sliver of knowledge without first constructing a hypothesis to test against this mountain of data.
Next, given that this repository of data is so massive, we must consider how such large numbers affect the outcome of analysis. Poisson’s Law of Large Numbers has been an axiom of caution for almost 200 years, but never has it been more important than now in the Petabyte Age. Given such an enormous data sample, the answers provided will so quickly converge to the mean, or expected outcome, that any proposed knowledge gained will only reinforce assumptions. This, by definition, is not knowledge; instead, the theory proposed by Mr. Anderson constructs an infinite loop of knowledge degradation. By his model (dare I use the term), one simply dips into this near-infinitely deep well of data to pull out the answer that fits one’s assumptions, and then proposes this finding as knowledge. The danger is that, from there, increasing layers of misunderstanding and assumption can be built, until eventually our perspective on the question from which we began has strayed so far from the truth that we cannot turn back.
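The convergence at the heart of the Law of Large Numbers can be seen in a toy simulation. What follows is only a minimal sketch under stated assumptions: the fair six-sided die, the function name `running_mean_error`, and the fixed seed are all hypothetical choices made for illustration, and nothing here is drawn from Google's data or from Mr. Anderson's article. It shows only that the average of many independent draws settles toward the expected value (3.5 for a fair die) as the sample grows.

```python
import random


def running_mean_error(n_draws, seed=0):
    """Return |sample mean - 3.5| for n_draws rolls of a fair six-sided die.

    The die, the seed, and this function name are illustrative assumptions.
    """
    rng = random.Random(seed)  # private generator so results are repeatable
    total = sum(rng.randint(1, 6) for _ in range(n_draws))
    return abs(total / n_draws - 3.5)


if __name__ == "__main__":
    # The gap between the sample mean and the expected value tends to
    # shrink as the number of draws grows.
    for n in (10, 1_000, 100_000):
        print(n, round(running_mean_error(n), 4))
```

The sketch says nothing about rare or extreme outcomes; it simply illustrates why an ever larger sample, queried without a hypothesis, tends to hand back the expected answer.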
One final point relates to the danger of developing knowledge from data alone. Though I do not agree wholeheartedly with Nassim Nicholas Taleb’s black swan theory, the idea is extremely well taken in this context. The most important and interesting people and events occur at the extremes of our understanding of normal, and these extremities provide the motivation for much research. For Anderson, however, all we know will come from the evidence provided by massive amounts of data; but how then will we cope with the unstructured random events that shape our existence? Consider my area of research, and forgive my bias. The study of terrorism is one for which little or no data exists. In order to increase our understanding of these unstructured and seemingly random events, models and the data in these models must be constructed and tested. However, these models can never be validated by data that does not exist. In fact, it is our ultimate goal to prevent data from existing at all, because at the point when data does exist (e.g., a terrorist event occurs), somewhere our understanding was flawed, and we must begin anew.
Unfortunately for Mr. Anderson, but thankfully for the rest of us, we are far from the end of theory. The scientific method has served to build our knowledge for hundreds of years, and now more than ever, we need it to guide us through the cloud of data and deliver us to knowledge.
Drew, you just said exactly what I was thinking when I read Chris’ article, but far better than I could say it.
Great post and great blog. Keep up the good work!
[Reply]
Chris,
Thank you for the kind words. I had not run across War and Health before, and found it fascinating. You keep it up as well!
[Reply]
The good news for me is that someone (Mr. Anderson) exists at a higher level of Luddite than I do. The bad news for all of us is that the Internet has been accepted as the resource/reference gold standard for millions, perhaps billions of people. Your point is well taken, not only as a matter of argument but also as a matter of caution. As an academic, I can absolutely confirm that undergraduates almost never go beyond Google in their searches for data. And they feel victimized when their evaluator asks if they had considered looking at such-and-such source, which happens to be a hard copy book or an unlinked database. A journalist once told me that one of his most valuable resources was the bookstore, where one can browse and take notes and never spend a dime.
[Reply]
I don’t defend Anderson’s peculiar essay, but this statement of yours is very strange:
“At best, these petabytes of data can inform our understanding of human behavior in one tiny niche (the Internet)…”
That’s like saying research in a science library can at best inform our understanding of human behavior in science libraries.
[Reply]
Clyde,
The point of that statement is to convey that for the purposes of researching human behavior, studying the activity people engage in online will only inform our understanding of how people act in that setting.
That is to say, the Internet is not a controlled lab for human study. It is a very new, and very different medium, in which people’s behavior and interactions are unlike anything else we experience.
I submit that we can learn a lot from these studies; however, I am very cautious as to the general conclusions that we can draw–hence the stated caveats.
Also, another excellent response to the Anderson piece
[Reply]
Drew,
You are reinforcing my point.
It is ludicrous to conflate searching the content of the Web with “studying the activity people engage in online,” just as it would be to conflate the contents of a library with the behavior of its denizens.
“[S]tudying the activity people engage in online” is a new and extraneous subject, quite irrelevant to all of Anderson’s essay and to the rest of yours.
[Reply]
Quite the opposite. I am trying to avoid such conflation by limiting the conclusions that could be drawn from this mountain of data. To use your science library analogy: yes, the Internet can teach a lot of things (Wikipedia), as the books in a science library can, but in studying how people interact (Google’s SocialGraph API) it is only a very rich model (as was noted by kio in Andrew Gelman’s post on the Anderson article).
[Reply]
A Few Non-Connected Thoughts and Links
The three of us along with Ray Carman traveled to New York City for the weekend to enjoy a few hours of Eddie Izzard performing at Radio City Music Hall. As such, the trip is still fresh in my mind
[Reply]
This was some really excellent brainfood…found you via Tim Stevens and I’m glad I did. Everything here has been very useful, tasty and relevant. Thank you.
[Reply]
Justin, thank you, and I hope I can continue to serve up some tasty morsels.
[Reply]
Drew, you may enjoy the piece I recently wrote in which I reference “The Hubris of The End of Theory.” My post is titled “Signs of the Singularity and Why Chris Anderson and Nicholas Carr Won’t Make the Next Cut,” and you can find it here …
http://phaneron.rickmurphy.org/?p=26
An excerpt follows …
I noticed a similarity recently in posts from Chris Anderson and Nicholas Carr. Over the past few months both of these widely read authors published a thought provoking post that calls into question humanity’s stewardship of knowledge in today’s 2.0 world. And each post contains signs of the singularity. Read on brave traveler, but don’t forget to bring your towel !
[Reply]
Rick, thank you for the reference. I will put my comments on your piece over at your website.
[Reply]
Drew, I think you have come across some interesting and also worrying trends in the way education, research and the acceptance of facts are taken for granted by the current and next generation of students using Google as a bible of all knowledge.
In terms of conditioning, there have been studies recently pointing to growing laziness among surfers, who are abandoning the address field for the Google search field in internet browsers.
This is particularly worrying. Typos and laziness, added to the possibility of good trusted resources not ranking at the top of page one for their own name, brand name, research documentation etc., encourage surfers to accept the results they find as gospel.
The fact that Google controls the mindset of so many gullible and naive surfers is a mild form of propaganda. The current and next generation of surfers need to be educated and conditioned to use Google like a traditional library. Sometimes you need to be patient, spending hours walking up and down aisles checking out different books and sources for the research and information you require. Train your brain to interpret which books are useful and most likely truthful and which ones are perhaps economical with the truth, or just useless.
Unfortunately, too many surfers just accept page one results of websites as the most trusted and credible sources of information. Google has a lot to answer for.
[Reply]