[Updated] Over in my aviation weblog, I find myself more and more linking to Wikipedia whenever I’m discussing a concept, person, place, or anything else that doesn’t have its own canonical home page. If, as I suspect, lots of other bloggers are doing the same, then links to Wikipedia articles may soon be the blogosphere’s answer to subject codes.
News wire services like Reuters or Dow Jones put a lot of time and money into maintaining long lists of subject codes to attach to their news products. Unlike the simple categories used in blogs, subject codes tell you not just that an article is about (say) computer technology, but that it is about specific companies, industries, people, places, and concepts. News customers use the codes to classify stories automatically, routing them to the appropriate editorial sections, displaying them on trading screens, sorting them into categories on web sites, or using them to improve searches. The providers are constantly sending out updated lists, keeping their customers’ technical departments very busy.
Should weblogs be using some kind of subject code (beyond categories)? Some areas already have standard identifiers that we could use, such as ICAO codes for airports, UPCs for retail products, ISBNs for books, CUSIPs for financial instruments, or ISO codes for countries, languages, and currencies. However, each of those requires some surrounding context: you need not only the code, but some indication that it refers to a currency or an airport. They’re also managed by central authorities, making them less attractive to the weblog community.
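To make the context problem concrete, here is a minimal sketch in Python. The `SubjectCode` class and the scheme labels `"iso3166"` and `"iso639"` are hypothetical names invented for this illustration. The two-letter code "ca" means Canada as an ISO 3166-1 country code and Catalan as an ISO 639-1 language code, so the bare value says nothing until a scheme is attached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectCode:
    scheme: str  # hypothetical label naming the code list, e.g. "iso3166"
    value: str   # the code itself, lowercased here for easy comparison


# The same two letters, two unrelated subjects.
country = SubjectCode("iso3166", "ca")   # Canada
language = SubjectCode("iso639", "ca")   # Catalan

# The bare values collide; only the (scheme, value) pair is unambiguous.
same_value = country.value == language.value
same_subject = country == language
```

Here `same_value` is true while `same_subject` is false, which is the extra bookkeeping every consumer of these codes has to carry around.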
Enter Wikipedia. If I’m posting about Washington the U.S. state, I can link to the Wikipedia article about the state; if I’m posting about Washington the U.S. president, I can link to the article about the president; if I’m posting about Washington the U.S. capital, I can link to the article about the city; and if I’m using the word Washington by metonymy to refer to the U.S. government, I can link to the article about the government.
Bingo: subject codes, just like the big newswires use, only a lot more useful and totally open. I can link to abstract subjects like love or communism, or to time periods like the Middle Ages, just as easily as I can link to concrete people, places, or things; if there’s not already a Wikipedia article on my subject, I can always start a stub. If people keep linking to Wikipedia, search engines like Technorati and aggregators like Bloglines might start taking advantage of those links to do some automatic categorization, right down to offering links to other postings on the same subject (“Click here for other postings about Open Source”). Once people know the search engines are doing that, they’ll be bound to link to Wikipedia even more than they already are, creating a virtuous circle where both Wikipedia and the blogosphere become more valuable.
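As a rough idea of what that automatic categorization could look like, here is a minimal sketch in Python using only the standard library. It is not how Technorati or Bloglines work. The names `WikipediaLinkParser`, `subjects_of`, and `index_by_subject`, the sample postings, and the assumption that article URLs have the form `https://<lang>.wikipedia.org/wiki/<Title>` are all illustrative. For simplicity it also ignores which language edition a link points to.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class WikipediaLinkParser(HTMLParser):
    """Collect the article titles of all Wikipedia links in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.subjects = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parts = urlparse(href)
        host = parts.hostname or ""
        # Assumption: article URLs look like https://<lang>.wikipedia.org/wiki/<Title>
        if host.endswith(".wikipedia.org") and parts.path.startswith("/wiki/"):
            title = unquote(parts.path[len("/wiki/"):])
            if title:
                self.subjects.add(title)


def subjects_of(html):
    """Return the set of Wikipedia article titles linked from one posting."""
    parser = WikipediaLinkParser()
    parser.feed(html)
    parser.close()
    return parser.subjects


def index_by_subject(postings):
    """Map each subject to the sorted list of posting ids that link to it."""
    index = defaultdict(list)
    for post_id, html in postings.items():
        for subject in subjects_of(html):
            index[subject].append(post_id)
    return {subject: sorted(ids) for subject, ids in index.items()}


# Made-up postings: the link text is the same word, the subjects are not.
postings = {
    "post-1": '<p>Flying into <a href="https://en.wikipedia.org/wiki/Washington,_D.C.">Washington</a>.</p>',
    "post-2": '<p>Hiking in <a href="https://en.wikipedia.org/wiki/Washington_(state)">Washington</a>.</p>',
    "post-3": '<p>More on <a href="https://en.wikipedia.org/wiki/Washington,_D.C.">the capital</a> and <a href="https://example.com/">something else</a>.</p>',
}
index = index_by_subject(postings)
```

The resulting `index` groups the first and third postings under the capital and leaves the second under the state, even though all three use the same word in their visible text. A "other postings about this subject" link is then just a lookup in that index.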
Of course, like anything that people actually do on the web (as opposed to drawing-board architectures that never get implemented), this approach is far from perfect. Once the search engines are paying attention to Wikipedia links, some people will deliberately include misleading links to have their weblog entries miscategorized, though rankings like Technorati’s should help make sure that the most relevant ones stay near the top of the list. Furthermore, Wikipedia URLs do change, especially for the sake of disambiguation, so they will never be 100% accurate as subject codes. And finally, the Wikipedia project itself could shut down, leaving all of the subject codes orphaned. Still, since linking to Wikipedia is something many of us do anyway, it looks like a good, quick-and-dirty, webby alternative to the news industry’s subject codes, and it might even work better.
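The changing-URL problem is the one a consumer of these links could soften with a little code. Below is a minimal sketch in Python, assuming the consumer keeps its own table of old titles and the titles they moved to. The `redirects` table, its placeholder titles, and the `canonical_title` function are hypothetical; how such a table would be built and kept current is left open.

```python
def canonical_title(title, redirects):
    """Follow an old-title -> new-title table until the title stops changing.

    The `seen` set guards against accidental loops in the table.
    """
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


# Hypothetical table: an article that was renamed twice.
redirects = {
    "Old_Title": "Newer_Title",
    "Newer_Title": "Current_Title",
}

resolved = canonical_title("Old_Title", redirects)
untouched = canonical_title("Some_Other_Title", redirects)
```

With this table, `resolved` ends up as `"Current_Title"` and `untouched` comes back unchanged, so postings that linked to any of the three names would land in the same category.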