CONTENT - Selected Essays on Technology, Creativity, Copyright and the Future of the Future
Cory Doctorow (2008-09-15)

21. Wikipedia: a genuine Hitchhikers' Guide to the Galaxy -- minus the editors

(Originally published in The Anthology at the End of the Universe, April 2005)

“Mostly Harmless” -- a phrase so funny that Adams actually titled a book after it. Not that there's a lot of comedy inherent in those two words: rather, they're the punchline to a joke that anyone who's ever written for publication can really get behind.

Ford Prefect, a researcher for the Hitchhiker's Guide to the Galaxy, has been stationed on Earth for years, painstakingly compiling an authoritative, insightful entry on Terran geography, science and culture, excerpts from which appear throughout the H2G2 books. His entry improved upon the old one, which noted that Earth was, simply, “Harmless.”

However, the Guide has limited space, and when Ford submits his entry to his editors, it is trimmed to fit:

“What? Harmless? Is that all it's got to say? Harmless! One word!” Ford shrugged. “Well, there are a hundred billion stars in the Galaxy, and only a limited amount of space in the book's microprocessors,” he said, “and no one knew much about the Earth of course.” “Well for God's sake I hope you managed to rectify that a bit.” “Oh yes, well I managed to transmit a new entry off to the editor. He had to trim it a bit, but it's still an improvement.” “And what does it say now?” asked Arthur. “Mostly harmless,” admitted Ford with a slightly embarrassed cough.

[fn: My lifestyle is as gypsy and fancy-free as the characters in H2G2, and as a result my copies of the Adams books are thousands of miles away in storage in other countries, and this essay was penned on public transit and in cheap hotel rooms in Chile, Boston, London, Geneva, Brussels, Bergen, Geneva (again), Toronto, Edinburgh, and Helsinki. Luckily, I was able to download a dodgy, re-keyed version of the Adams books from a peer-to-peer network, which network I accessed via an open wireless network on a random street-corner in an anonymous city, a fact that I note here as testimony to the power of the Internet to do what the Guide does for Ford and Arthur: put all the information I need at my fingertips, wherever I am. However, these texts are a little on the dodgy side, as noted, so you might want to confirm these quotes before, say, uttering them before an Adams truefan.]

And there's the humor: every writer knows the pain of laboring over a piece for days, infusing it with diverse interesting factoids and insights, only to have it cut to ribbons by some distant editor. (I once wrote thirty drafts of a 5,000-word article for an editor who ended up running it in three paragraphs as accompaniment for what he decided should be a photo essay with minimal verbiage.)

Since the dawn of the Internet, H2G2 geeks have taken it upon themselves to attempt to make a Guide on the Internet. Volunteers wrote and submitted essays on various subjects as would be likely to appear in a good encyclopedia, infusing them with equal measures of humor and thoughtfulness, and they were edited together by the collective effort of the contributors. These projects -- Everything2, H2G2 (which was overseen by Adams himself), and others -- are like a barn-raising in which a team of dedicated volunteers organize the labors of casual contributors, piecing together a free and open user-generated encyclopedia.

These encyclopedias have one up on Adams's Guide: they have no shortage of space on their “microprocessors” (the first volume of the Guide was clearly written before Adams became conversant with PCs!). The ability of humans to generate verbiage is far outstripped by the ability of technologists to generate low-cost, reliable storage to contain it. For example, Brewster Kahle's Internet Archive project (archive.org) has been making a copy of the Web -- the whole Web, give or take -- every couple of days since 1996. Using the Archive's Wayback Machine, you can now go and see what any page looked like on a given day.

The Archive doesn't even bother throwing away copies of pages that haven't changed since the last time they were scraped: with storage as cheap as it is -- and it is very cheap for the Archive, which runs the largest database in the history of the universe off of a collection of white-box commodity PCs stacked up on packing skids in the basement of a disused armory in San Francisco's Presidio -- there's no reason not to just keep them around. In fact, the Archive has just spawned two “mirror” Archives, one located under the rebuilt Library of Alexandria and the other in Amsterdam. [fn: Brewster Kahle says that he was nervous about keeping his only copy of the “repository of all human knowledge” on the San Andreas fault, but keeping your backups in a censorship-happy Amnesty International watchlist state and/or in a floodplain below sea level is probably not such a good idea either!]

So these systems did not see articles trimmed for lack of space; for on the Internet, the idea of “running out of space” is meaningless. But they were trimmed, by editorial cliques, and rewritten for clarity and style. Some entries were rejected as being too thin, while others were sent back to the author for extensive rewrites.

This traditional separation of editor and writer mirrors the creative process itself, in which authors are exhorted to concentrate on either composing or revising, but not both at the same time, for the application of the critical mind to the creative process strangles it. So you write, and then you edit. Even when you write for your own consumption, it seems you have to answer to an editor.

The early days of the Internet saw much experimentation with alternatives to traditional editor/author divisions. Slashdot, a nerdy news-site of surpassing popularity [fn: Having a link to one's website posted to Slashdot will almost inevitably overwhelm one's server with traffic, knocking all but the best-provisioned hosts offline within minutes; this is commonly referred to as “the Slashdot Effect.”], has a baroque system for “community moderation” of the responses to the articles that are posted to its front pages. Readers, chosen at random, are given five “moderator points” that they can use to raise or lower the score of posts on the Slashdot message boards. Subsequent readers can filter their views of these boards to show only highly ranked posts. Other readers are randomly presented with posts and their rankings and are asked to rate the fairness of each moderator's moderation. Moderators who moderate fairly are given more opportunities to moderate; likewise message-board posters whose messages are consistently highly rated.

It is thought that this system rewards good “citizenship” on the Slashdot boards through checks and balances that reward good messages and fair editorial practices. And in the main, the Slashdot moderation system works [fn: as do variants on it, like the system in place at Kuro5hin.org (pronounced “corrosion”)]. If you dial your filter up to show you highly scored messages, you will generally get well-reasoned, or funny, or genuinely useful posts in your browser.
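For the technically inclined, the scoring-and-filtering loop described above can be sketched in a few lines of Python. This is only an illustration, not Slashdot's actual code: the names Post, Moderator and visible, the starting score of 1, and the one-point-per-vote rule are all assumptions, and the “rate the moderators” step is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    score: int = 1  # assumed starting score, not taken from the essay


@dataclass
class Moderator:
    name: str
    points: int = 5  # the five "moderator points" the essay mentions

    def moderate(self, post: Post, delta: int) -> bool:
        """Spend one point to raise (+1) or lower (-1) a post's score."""
        if delta not in (1, -1) or self.points == 0:
            return False
        post.score += delta
        self.points -= 1
        return True


def visible(posts, threshold):
    """Return only the posts at or above the reader's chosen threshold."""
    return [p for p in posts if p.score >= threshold]


posts = [Post("alice", "A well-reasoned reply"), Post("bob", "First post!")]
mod = Moderator("carol")
mod.moderate(posts[0], +1)
mod.moderate(posts[0], +1)
mod.moderate(posts[1], -1)

print([p.author for p in visible(posts, threshold=2)])  # ['alice']
print(mod.points)  # 2
```

The point of the sketch is that the reader, not an editor, picks the threshold: the same board looks different to everyone who dials the filter differently.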

This community moderation scheme and ones like it have been heralded as a good alternative to traditional editorship. The importance of the Internet's ability to “edit itself” is best understood in relation to the old shibboleth, “On the Internet, everyone is a slushreader.” [fn: “Slush” is the term for generally execrable unsolicited manuscripts that fetch up in publishers' offices -- these are typically so bad that the most junior people on staff are drafted into reading (and, usually, rejecting) them]. When the Internet's radical transformative properties were first bandied about in publishing circles, many reassured themselves that even if printing's importance was de-emphasized, good editors would always be needed, and doubly so online, where any mouth-breather with a modem could publish his words. Someone would need to separate the wheat from the chaff and help keep us from drowning in information.

One of the best-capitalized businesses in the history of the world, Yahoo!, went public on the strength of this notion, proposing to use an army of researchers to catalog every single page on the Web even as it was created, serving as a comprehensive guide to all human knowledge. Less than a decade later, Yahoo! is all but out of that business: the ability of the human race to generate new pages far outstrips Yahoo!'s ability to read, review, rank and categorize them.

Hence Slashdot, a system of distributed slushreading. Rather than professionalizing the editorship role, Slashdot invites contributors to identify good stuff when they see it, turning editorship into a reward for good behavior.

But as well as Slashdot works, it has this signal failing: nearly every conversation that takes place on Slashdot is shot through with discussion, griping and gaming on the moderation system itself. The core task of Slashdot has become editorship, not the putative subjects of Slashdot posts. The fact that the central task of Slashdot is to rate other Slashdotters creates a tenor of meanness in the discussion. Imagine if the subtext of every discussion you had in the real world was a kind of running, pedantic nitpickery in which every point was explicitly weighed and judged and commented upon. You'd be an unpleasant, unlikable jerk, the kind of person that is sometimes referred to as a “slashdork.”

As radical as Yahoo!'s conceit was, Slashdot's was more radical. But as radical as Slashdot's is, it is still inherently conservative in that it presumes that editorship is necessary, and that it further requires human judgment and intervention.

Google's a lot more radical. Instead of editors, it has an algorithm. Not the kind of algorithm that dominated the early search engines like Altavista, in which laughably bad artificial intelligence engines attempted to automatically understand the content, context and value of every page on the Web so that a search for “Dog” would turn up the pages most relevant to the query.

Google's algorithm is predicated on the idea that people are good at understanding things and computers are good at counting things. Google counts up all the links on the Web and affords more authority to those pages that have been linked to by the most other pages. The rationale is that if a page has been linked to by many web-authors, then they must have seen some merit in that page. This system works remarkably well -- so well that it's nearly inconceivable that any search-engine would order its rankings by any other means. What's more, it doesn't pervert the tenor of the discussions and pages that it catalogs by turning each one into a performance for a group of ranking peers. [fn: Or at least, it didn't. Today, dedicated web-writers, such as bloggers, are keenly aware of the way that Google will interpret their choices about linking and page-structure. One popular sport is “googlebombing,” in which web-writers collude to link to a given page using a humorous keyword so that the page becomes the top result for that word -- which is why, for a time, the top result for “more evil than Satan” was Microsoft.com. Likewise, the practice of “blogspamming,” in which unscrupulous spammers post links to their webpages in the message boards on various blogs, so that Google will be tricked into thinking that a wide variety of sites have conferred some authority onto their penis-enlargement page.]
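The counting idea above is simple enough to sketch. What follows is a toy, assuming a hypothetical four-page web held in a dictionary; it implements only the plain inbound-link count the essay describes. Google's published PageRank method goes further and also weighs each link by the authority of the page it comes from, which this sketch does not attempt.

```python
from collections import Counter

# Hypothetical web: each page maps to the list of pages it links to.
links = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example", "a.example"],
}


def inbound_counts(links):
    """Count how many distinct other pages link to each page."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):   # repeated links from one page count once
            if target != source:      # a page cannot vouch for itself
                counts[target] += 1
    return counts


def rank(links):
    """Order pages by inbound links, most-linked first (ties by name)."""
    counts = inbound_counts(links)
    pages = set(links) | set(counts)
    return sorted(pages, key=lambda page: (-counts[page], page))


print(rank(links))
# ['c.example', 'a.example', 'b.example', 'd.example']
```

No step in this requires the computer to understand what any page says. It only counts the judgments that human linkers have already made.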

But even Google is conservative in assuming that there is a need for editorship as distinct from composition. Is there a way we can dispense with editorship altogether and just use composition to refine our ideas? Can we merge composition and editorship into a single role, fusing our creative and critical selves?

You betcha.

“Wikis” [fn: Hawai'ian for “fast”] are websites that can be edited by anyone. They were invented by Ward Cunningham in 1995, and they have become one of the dominant tools for Internet collaboration in the present day. Indeed, there is a sort of Internet geek who throws up a Wiki in the same way that ants make anthills: reflexively, unconsciously.

Here's how a Wiki works. You put up a page:

Welcome to my Wiki. It is rad. There are OtherWikis that inspired me.

Click “Publish” and bam, the page is live. The word “OtherWikis” will be underlined, having automatically been turned into a link to a blank page titled “OtherWikis.” (Wiki software recognizes words with capital letters in the middle of them as links to other pages. Wiki people call this “camel-case,” because the capital letters in the middle of words make them look like humped camels.) At the bottom of the page appears this legend: “Edit this page.”

Click on “Edit this page” and the text appears in an editable field. Revise the text to your heart's content and click “Publish” and your revisions are live. Anyone who visits a Wiki can edit any of its pages, adding to it, improving on it, adding camel-cased links to new subjects, or even defacing or deleting it.
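The camel-case trick described above takes very little machinery. Here is a minimal sketch, assuming one common convention (two or more capitalized word-parts run together) and a hypothetical URL scheme in which each page lives at “/PageName”; real Wiki engines differ in the details, and the names WIKI_WORD, find_links and render are invented for illustration.

```python
import re

# Assumed convention: a capital followed by lowercase letters, repeated
# at least twice with no space between the parts, e.g. "OtherWikis".
WIKI_WORD = re.compile(r"\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b")


def find_links(text):
    """Return every camel-cased word in the text, in order."""
    return WIKI_WORD.findall(text)


def render(text):
    """Turn each camel-cased word into an HTML link to a page of that name."""
    return WIKI_WORD.sub(
        lambda m: f'<a href="/{m.group(0)}">{m.group(0)}</a>', text
    )


page = "Welcome to my Wiki. It is rad. There are OtherWikis that inspired me."
print(find_links(page))  # ['OtherWikis']
print(render(page))
```

Ordinary capitalized words like “Welcome” and “Wiki” are left alone, because they have no capital letter in the middle.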

It is authorship without editorship. Or authorship fused with editorship. Whichever, it works, though it requires effort. The Internet, like all human places and things, is fraught with spoilers and vandals who deface whatever they can. Wiki pages are routinely replaced with obscenities, with links to spammers' websites, with junk and crap and flames.

But Wikis have self-defense mechanisms, too. Anyone can “subscribe” to a Wiki page, and be notified when it is updated. Those who create Wiki pages generally opt to act as “gardeners” for them, ensuring that they are on hand to undo the work of the spoilers.

In this labor, they are aided by another useful Wiki feature: the “history” link. Every change to every Wiki page is logged and recorded. Anyone can page back through every revision, and anyone can revert the current version to a previous one. That means that vandalism only lasts as long as it takes for a gardener to come by and, with one or two clicks, set things to right.
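The subscribe, history and revert features fit together neatly, as this small sketch tries to show. It is a toy model and not any real Wiki engine's code: the WikiPage class, its method names, and the choice to store a revert as a brand-new revision are all assumptions made for illustration.

```python
class WikiPage:
    def __init__(self, title, text=""):
        self.title = title
        self.history = [text]   # every revision is kept; nothing is overwritten
        self.subscribers = []   # callbacks for the page's "gardeners"

    @property
    def text(self):
        """The current text is simply the newest revision."""
        return self.history[-1]

    def subscribe(self, callback):
        """Ask to be told whenever this page changes."""
        self.subscribers.append(callback)

    def edit(self, new_text):
        """Save a new revision and notify every subscriber."""
        self.history.append(new_text)
        for notify in self.subscribers:
            notify(self.title, len(self.history) - 1)

    def revert(self, revision):
        """Restore an earlier revision by saving it again as the newest one."""
        self.edit(self.history[revision])


page = WikiPage("OtherWikis", "A list of wikis that inspired me.")
alerts = []
page.subscribe(lambda title, rev: alerts.append((title, rev)))

page.edit("BUY CHEAP PILLS")   # a vandal strikes: revision 1
page.revert(0)                 # a gardener restores the original: revision 2

print(page.text)           # A list of wikis that inspired me.
print(len(page.history))   # 3
print(alerts)              # [('OtherWikis', 1), ('OtherWikis', 2)]
```

Because the revert is itself logged as a revision, the vandalism is undone without being erased from the record, and anyone paging back through the history can still see that it happened.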

This is a powerful and wildly successful model for collaboration, and there is no better example of this than the Wikipedia, a free, Wiki-based encyclopedia with more than one million entries, which has been translated into 198 languages [fn: That is, one or more Wikipedia entries have been translated into 198 languages; more than 15 languages have 10,000 or more entries translated].

Wikipedia is built entirely out of Wiki pages created by self-appointed experts. Contributors research and write up subjects, or produce articles on subjects that they are familiar with.

This is authorship, but what of editorship? For if there is one thing a Guide or an encyclopedia must have, it is authority. It must be vetted by trustworthy, neutral parties, who present something that is either The Truth or simply A Truth, but truth nevertheless.

The Wikipedia has its skeptics. Al Fasoldt, a writer for the Syracuse Post-Standard, apologized to his readers for having recommended that they consult Wikipedia. A reader of his, a librarian, wrote in and told him that his recommendation had been irresponsible, for Wikipedia articles are often defaced or worse still, rewritten with incorrect information. When another journalist from the Techdirt website wrote to Fasoldt to correct this impression, Fasoldt responded with an increasingly patronizing and hysterical series of messages in which he described Wikipedia as “outrageous,” “repugnant” and “dangerous,” insulting the Techdirt writer and storming off in a huff. [fn: see http://techdirt.com/articles/20040827/0132238_F.shtml for more]

Spurred on by this exchange, many of Wikipedia's supporters decided to empirically investigate the accuracy and resilience of the system. Alex Halavais made changes to 13 different pages, ranging from obvious to subtle. Every single change was found and corrected within hours. [fn: see http://alex.halavais.net/news/index.php?p=794 for more] Then legendary Princeton engineer Ed Felten ran side-by-side comparisons of Wikipedia entries on areas in which he had deep expertise with their counterparts in the current electronic edition of the Encyclopedia Britannica. His conclusion? “Wikipedia's advantage is in having more, longer, and more current entries. If it weren't for the Microsoft-case entry, Wikipedia would have been the winner hands down. Britannica's advantage is in having lower variance in the quality of its entries.” [fn: see http://www.freedom-to-tinker.com/archives/000675.html for more] Not a complete win for Wikipedia, but hardly “outrageous,” “repugnant” and “dangerous.” (Poor Fasoldt -- his idiotic hyperbole will surely haunt him through the whole of his career -- I mean, “repugnant?!”)

There has been one very damning and even frightening indictment of Wikipedia, which came from Ethan Zuckerman, the founder of the GeekCorps group, which sends volunteers to poor countries to help establish Internet Service Providers and do other good works through technology.

Zuckerman, a Harvard Berkman Center Fellow, is concerned with the “systemic bias” in a collaborative encyclopedia whose contributors must be conversant with technology and in possession of same in order to improve on the work there. Zuckerman reasonably observes that Internet users skew towards wealth, residence in the world's richest countries, and a technological bent. This means that the Wikipedia, too, is skewed to subjects of interest to that group -- subjects where that group already has expertise and interest.

The result is tragicomical. The entry on the Congo Civil War, the largest military conflict the world has seen since WWII, which has claimed over three million lives, has only a fraction of the verbiage devoted to the War of the Ents, a fictional war fought between sentient trees in JRR Tolkien's Lord of the Rings.

Zuckerman issued a public call to arms to rectify this, challenging Wikipedia contributors to seek out information on subjects like Africa's military conflicts, nursing and agriculture and write these subjects up in the same loving detail given over to science fiction novels and contemporary youth culture. His call has been answered well. What remains is to infiltrate the Wikipedia into the academe so that term papers, Masters and Doctoral theses on these subjects find themselves in whole or in part on the Wikipedia. [fn: See http://en.wikipedia.org/wiki/User:Xed/CROSSBOW for more on this]

But if Wikipedia is authoritative, how does it get there? What alchemy turns the maunderings of “mouth-breathers with modems” into valid, useful encyclopedia entries?

It all comes down to the way that disputes are deliberated over and resolved. Take the entry on Israel. At one point, it characterized Israel as a beleaguered state set upon by terrorists who would drive its citizens into the sea. Not long after, the entry was deleted holus-bolus and replaced with one that described Israel as an illegal state practicing Apartheid on an oppressed ethnic minority.

Back and forth the editors went, each overwriting the other's text with his or her own doctrine. But eventually, one of them blinked. An editor moderated the doctrine just a little, conceding a single point to the other. And the other responded in kind. In this way, turn by turn, all those with a strong opinion on the matter negotiated a kind of Truth, a collection of statements that everyone could agree represented as neutral a depiction of Israel as was likely to emerge. Whereupon, the joint authors of this marvelous document joined forces and fought back to back to resist the revisions of other doctrinaires who came later, preserving their hard-won peace. [fn: This process was just repeated in microcosm in the Wikipedia entry on the author of this paper, which was replaced by a rather disparaging and untrue entry that characterized his books as critical and commercial failures -- there ensued several editorial volleys, culminating in an uneasy peace that couches the anonymous detractor's skepticism in context and qualifiers that make it clear what the facts are and what is speculation]

What's most fascinating about these entries isn't their “final” text as currently present on Wikipedia. It is the history page for each, blow-by-blow revision lists that make it utterly transparent where the bodies were buried on the way to arriving at whatever Truth has emerged. This is a neat solution to the problem of authority -- if you want to know what the fully rounded view of opinions on any controversial subject looks like, you need only consult its entry's history page for a blistering eyeful of thorough debate on the subject.

And here, finally, is the answer to the “Mostly harmless” problem. Ford's editor can trim his verbiage to two words, but they need not stay there -- Arthur, or any other user of the Guide as we know it today [fn: that is, in the era where we understand enough about technology to know the difference between a microprocessor and a hard-drive] can revert to Ford's glorious and exhaustive version.

Think of it: a Guide without space restrictions and without editors, where any Vogon can publish to his heart's content.

Lovely.




License: This entire work (with the exception of the introduction by John Perry Barlow) is copyright 2008 by Cory Doctorow and released under the terms of a Creative Commons US Attribution-NonCommercial-ShareAlike license (http://creativecommons.org/licenses/by-nc-sa/3.0/us/). Some Rights Reserved.
The introduction is copyright 2008 by John Perry Barlow and released under the terms of a Creative Commons US Attribution-NonCommercial-ShareAlike license (http://creativecommons.org/licenses/by-nc-sa/3.0/us/). Some Rights Reserved.

