Folk Wisdom

Web geeks have long fantasized about a Web taxonomy, a classification scheme that would encompass the entire Web--not just sites, but also content such as images and blog posts. The hierarchical directory set up by Yahoo Inc. is an impressive attempt at a kind of Web taxonomy, but it's a weighty construction that doesn't make finding things on the Web all that much easier or faster.

Even worse, it's a classification scheme that has been imposed from on high by Yahoo's information mavens. But we now live in the age of the long tail, the collective influence and power of the small sites and users that make up the vast majority of the Web. The big corporations like Yahoo or Time Warner Inc. might make up the "head" of the Web, but the hundreds of millions of personal sites, blogs, BitTorrent peers, and Flickr photo albums--not to mention the hundreds of millions of users roaming the Web--form the beast's massively long tail.

What does this long tail model have to do with a Web taxonomy? It tells us that hundreds of millions of people can probably classify what they see and interact with on the Web more efficiently, more comprehensively, and more usefully than a small group of Yahoo managers. In other words, we won't get a true Web taxonomy until the process switches from top down to ground up.

And that's just what we're starting to see happen all over the Web. At sites such as Flickr, del.icio.us, and Furl, ordinary users are creating their own taxonomic schemes. Only this isn't taxonomy. It's folksonomy, an ad hoc classification scheme that Web users invent as they surf to categorize the data they find online. It's also called (take a deep breath) folk categorization, communal categorization, ethnoclassification, distributed classification, social classification, faceted hierarchy, and mob indexing.

Folksonomists apply descriptive keywords, or tags, to the objects they come across. The term explains a few other synonyms for folksonomy: folk tagging, open tagging, social tagging, and free tagging. Social software--software that enables users to share information and collaborate online--makes these tags available to other users, who can then take advantage of all this tagging to search for the information they need.

At the del.icio.us site, for example, users bookmark interesting pages and assign tags to each site, and those tags can then be searched. This is called social bookmarking, and it has caught the attention of some big players, not least of whom are the taxonomists at Yahoo, which not long ago launched My Web 2.0, a social bookmarking service.
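The mechanics of social bookmarking can be sketched in a few lines of Python. This is only an illustration, not how del.icio.us or My Web 2.0 is actually built: the `BookmarkStore` class, its method names, and the ranking rule (pages tagged by more users come first) are all assumptions made for the example.

```python
from collections import defaultdict


class BookmarkStore:
    """Toy social-bookmarking store (hypothetical, for illustration only)."""

    def __init__(self):
        # tag -> url -> set of users who applied that tag to that url
        self._taggers = defaultdict(lambda: defaultdict(set))

    def bookmark(self, user, url, tags):
        """Record that `user` saved `url` under each of `tags`."""
        for tag in tags:
            # Tags are compared case-insensitively (an assumption).
            self._taggers[tag.lower()][url].add(user)

    def search(self, tag):
        """Return URLs carrying `tag`, most-tagged first, ties by URL."""
        urls = self._taggers.get(tag.lower(), {})
        return sorted(urls, key=lambda u: (-len(urls[u]), u))


store = BookmarkStore()
store.bookmark("ann", "http://example.com/a", ["Python", "web"])
store.bookmark("bob", "http://example.com/b", ["python"])
store.bookmark("cat", "http://example.com/b", ["python"])
print(store.search("python"))
```

Because every user's tags land in the same shared index, one person's bookmarking becomes another person's search result.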

Sites such as Flickr (for photos) and Technorati (for blogs) maintain tag clouds: lists of the tags used on the site, with some kind of visual indication of each tag's relative popularity. The most popular tags, for instance, are often shown in the largest font. (The United Kingdom's Guardian newspaper calls its tag cloud a folksonomic zeitgeist.) Some sites even keep track of the tags that each user has applied in the past, the idea being that the user might be inclined to reuse those tags in the future. Del.icio.us calls each user's tag cloud a tagroll (a play on blogroll, a blogger's list of links to other blogs that he or she reads).
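One plausible way to turn tag counts into font sizes is a simple linear scale, sketched below in Python. The function name `tag_cloud`, the point-size range, and the linear mapping are assumptions for illustration; the sites mentioned here may well use a different formula, such as a logarithmic scale.

```python
from collections import Counter


def tag_cloud(tags, min_pt=10, max_pt=32):
    """Map each tag to a font size (in points) based on how often it occurs.

    Hypothetical helper: the least-used tag gets `min_pt`, the most-used
    gets `max_pt`, and everything else is placed linearly in between.
    """
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # If every tag is equally popular, show them all at the smallest size.
        scale = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


applied = ["cat"] * 5 + ["dog"] * 3 + ["bird"]
print(tag_cloud(applied))
```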

But how can nonprofessional taggers hope to create a taxonomy that's as sophisticated as one that professional specialists would make? The answer lies in something called the architecture of participation: services get better as the number of users increases. The canonical example is BitTorrent, where each user acts as both client (peer) and server (seed). Files are downloaded by taking small chunks from any peers who have the file and who are online. The more peers online, the faster the download. For folksonomy, the more folks applying tags, the more sophisticated the result. The writer Bruce Sterling calls the folksonomically enhanced Web "common wisdom squared."

Folksonomies are not perfect, to be sure. Nonstandard tags are a problem--one Flickr user might tag a photo of a certain kind of retriever as "flat-coated," another as "flatcoated," and a third as "flatcoat"--and one- or two-word tags lack a certain amount of precision.
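A site could soften the nonstandard-tag problem by folding near-identical spellings together. The Python sketch below is one possible approach, not something Flickr is known to do: the helper names, the punctuation-stripping rule, and the 0.85 similarity threshold are all assumptions, and a threshold that loose could also merge tags that really are different.

```python
import re
from difflib import SequenceMatcher


def normalize(tag):
    """Lowercase a tag and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())


def merge_tags(tags, threshold=0.85):
    """Group tag spellings whose normalized forms are nearly identical.

    Returns a dict mapping the first-seen normalized form to the list of
    original spellings assigned to it. `threshold` is an illustrative value.
    """
    groups = {}
    for tag in tags:
        key = normalize(tag)
        for canon in groups:
            if SequenceMatcher(None, key, canon).ratio() >= threshold:
                groups[canon].append(tag)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups[key] = [tag]
    return groups


print(merge_tags(["flat-coated", "flatcoated", "flatcoat", "golden"]))
```

Stripping the hyphen handles "flat-coated" versus "flatcoated" exactly; the fuzzy comparison is what pulls in "flatcoat."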

On the other hand, folksonomy isn't meant to be a Google-killer. It is, instead, a kind of experiment in collective intelligence, the hallmark of what some people are calling Web 2.0 (a prolific language factory that will be the topic of a future column). One person can be pretty smart, but 10 000 or 100 000 people are almost always going to be smarter. The New Yorker's finance writer James Surowiecki calls it the wisdom of crowds, and we're starting to see some pretty big crowds in the folksonomy space: Technorati and del.icio.us each have tens of thousands of users, while Flickr boasts more than 400 000. That's a lot of folks.

PAUL MCFEDRIES is a technical and language writer with more than 40 books to his credit. He also runs Word Spy, a Web site and mailing list that tracks new words and phrases (https://www.wordspy.com).
