REST: the quick pitch

Now that the Java world is noticing REST, the low-pain alternative to RPC standards like WS-*, people are starting to blog about it again. Gossip with other IT folks also tells me that people’s customers are actually asking for REST explicitly (rather than having to be convinced to use it). With that in mind, I’m going to try to explain what I think matters about REST, and what you can safely ignore.

The elevator pitch

With REST, every piece of information has its own URL.

If you just do that and nothing else, you’ve got 90%+ of REST’s benefits right off the bat. You can cache, bookmark, index, and link your information into a giant, well, web. It works — you’re reading this, after all, aren’t you? Betcha got here by following a link somewhere, not by parsing a WSDL to find what ports and services were available.

Real best practices

If you want to do REST well (rather than just doing REST), you can spend 2-3 minutes after your elevator ride learning a few very simple best practices to get most of the remaining 10% of REST’s benefits:

Use HTTP POST to update information. Here’s the simple rule: GET to read, POST to change. That way, nobody deletes or modifies something by accident when trying to read it.
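The GET/POST rule can be sketched in a few lines, using a hypothetical in-memory resource store (the paths and values here are invented for illustration):

```python
# GET reads and never mutates; POST is the only verb that changes state.
store = {"/systems/foo": "original"}

def handle(method, path, body=None):
    if method == "GET":
        return store.get(path)   # safe: repeatable, cacheable, no side effects
    if method == "POST":
        store[path] = body       # the state change happens here, and only here
        return body
    raise ValueError("unsupported method: " + method)
```

The payoff is that anything crawling or prefetching your GETs (a search engine, a cache, a link checker) can never destroy data by accident.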

Make sure your information contains links (URLs) for retrieving related information. That’s how search engines index the web, and it can work for other kinds of information (XML, PDF, JSON, etc.) as well. Once you have one thing, you can follow links to find just about everything else (assuming that you understand the file format).
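For instance, a JSON representation might carry its links like this (the field names and URLs are my own invention, not any standard): a client holding one document can discover the rest by following URLs rather than by out-of-band knowledge.

```python
import json

# A hypothetical component representation that embeds links to related
# resources, so clients can navigate from one thing to everything else.
doc = json.loads("""
{"name": "bar",
 "links": {"system": "http://www.example.org/systems/foo/",
           "components": "http://www.example.org/systems/foo/components/"}}
""")

def related(doc):
    """Return the URLs a client could follow next."""
    return sorted(doc.get("links", {}).values())
```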

Try to avoid request parameters (the stuff after the question mark). It’s much better to have a URL like

http://www.example.org/systems/foo/components/bar/

than

http://www.example.org/get-component.asp?system=foo&component=bar

Search engines are more likely to index it, you’re less likely to end up with duplicates in caches and hash tables (e.g. if someone lists the request parameters in a different order), URLs won’t change when you refactor your code or switch to a different web framework, and you can always switch to static, pregenerated files for efficiency if you want to. Exceptions: searches (http://www.example.org/search?q=foo) and paging through long lists (http://www.example.org/systems/?start=1000&max=200) — in both of these cases, it’s really OK to use the request parameters instead of tying yourself in a knot trying to avoid them.

Avoid scripting-language file extensions. If your URLs end with “.php”, “.asp”, “.jsp”, “.pl”, “.py”, etc., (a) you’re telling every cracker in the world what exploits to use against you, and (b) the URLs will change when your code does. Use Apache mod_rewrite or an equivalent to make your resources look like static files, ending in “.html”, “.xml”, etc.
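One way this could look with mod_rewrite (a sketch only; the script name and URL pattern are invented, and the details vary with your Apache setup):

```apache
RewriteEngine On
# Serve the clean URL /systems/foo/components/bar/ from the real script,
# without ever exposing the ".php" extension to the public.
RewriteRule ^systems/([^/]+)/components/([^/]+)/$ get-component.php?system=$1&component=$2 [QSA,L]
```

If you later replace the script with pregenerated static files, the public URLs never change.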

Avoid cookies and URL rewriting. Well, maybe you can’t, but the idea of REST is that the state is in the thing the server has returned to you (an HTML or XML file, for example) rather than in a session object on the server. This can be tricky with authentication, so you won’t always pull it off, but HTTP authentication (which doesn’t require cookies or session IDs tacked onto URLs) will work surprisingly often. Do what you have to do to make your app work, but don’t use sessions just because your web framework tells you to (they also tie up a lot of resources on your server).

Speculative stuff (skip this)

The strength of REST is that it’s been proven through almost two decades of use on the Web, but not everything that some of the hard-core RESTafarians (and others) try to make us do has been part of that trial. Stop reading now if you just want to go ahead and do something useful with REST. Really, stop! Some of this stuff is moderately interesting, but it won’t really help you, and will probably just mess up your project, or at least make it slower and more expensive.

[maybe some day] Use HTTP PUT to create a resource, and DELETE to get rid of one. These sound like great ideas, and they add a nice symmetry to REST, but they’re just not used enough for us to know if they’d really work on a web scale, and firewalls often block them anyway. In real-life REST applications, rightly or wrongly, people just use POST for creation, modification, and deletion. It’s not as elegant, but we know it works.

[don’t bother] Use URLs to point to resources rather than representations. Huh? OK, a resource is a sort-of Platonic ideal of something (e.g. “a picture of Cairo”), while a representation is the resource’s physical manifestation (e.g. “an 800×600 24-bit RGB picture of Cairo in JPEG format”). Yes, as you’d guess, it was people with or working on Ph.D.’s who thought of that. For a long time, the W3C pushed the idea of URLs like “http://www.example.org/pics/cairo” instead of “http://www.example.org/pics/cairo.jpg”, under the assumption that web clients and servers could use content negotiation to decide on the best format to deliver. I guess that people hated the fact that HTTP was so simple, and wanted to find ways to make it more complicated. Fortunately, there were very few nibbles, and this is not a common practice on the web. Screw Plato! Viva materialism! Go ahead and put “.xml” at the end of your URLs.

[blech] Use URNs instead of URLs. I think even the hard-core URN lovers have given up on this now — it’s precisely the kind of excessive abstraction that sent people running screaming from WS-* into REST’s arms in the first place (see also “content negotiation”, above), and it would be a shame to scare them away from REST as well. URLs are fine, as long as you make some minor efforts to ensure that they don’t change.

[n/a] REST needs security, reliable messaging, etc. The RESTafarians don’t say this, but I’m worried that the JSR (the Java REST group) will. We already have a secure version of HTTP (TLS/SSL), and it works fine for hundreds of thousands or millions of web sites. Reliable messaging can be handled fine in the application layer, since everyone’s requirements are different anyway, or maybe we want a reliable-messaging spec for HTTP in general. In either case, please don’t pile this stuff on REST.

So to sum up, just give every piece of information its own URL, then have fun.

Posted in REST | 19 Comments

Thinking about structure

Douglas Crockford left an excellent comment on my recent posting All markup ends up looking like XML, which he later made into its own blog posting, For the trees. I agree with his reworking of the structure: given the data that I provided, the JSON, LISP, and XML markup all could have been simpler.

If he’s right about the examples, though, he’s wrong about two things. First, my posting doesn’t represent any kind of softening toward JSON among its opponents in the XML community, simply because I’ve never been one of those opponents. Second, I spend at least one order of magnitude more time working with SQL and programming languages (not processing XML) than I do with XML, so if anything, my perspective on XML would likely be tainted by them rather than the other way around. Instead, I think the examples were complicated because I built for tomorrow instead of today.

Tomorrow

So what might tomorrow look like for an application dealing with names? Consider, for example, this XML markup, moving gender out of the element/property name as Doug suggests, and eliminating the other attributes (since they don’t add much to the discussion):

<names>
  <name gender="male"><surname>Saddam</surname> Hussein</name>
  <name gender="female">Susan B. <surname>Anthony</surname></name>
  <name gender="male">Al <surname>Unser</surname> Jr.</name>
  <name gender="male">Don Alonso <surname>Quixote</surname> de la Mancha</name>
</names>

It’s surprisingly messy breaking each name down into a simple property list. If we tried the approach Doug used for my simpler examples, we’d end up with this (note that this is a list of names, not of people):

{"names": [
    {"gender": "male", "given-name": "Hussein", "surname": "Saddam"},
    {"gender": "female", "given-name": "Susan B.", "surname": "Anthony"},
    {"gender": "male", "given-name": "Al Jr.", "surname": "Unser"},
    {"gender": "male", "given-name": "Don Alonso Quixote de la",
      "surname": "Mancha"}
]}

This list needs a bit of patching. First, if we reconstruct the names as strings, we don’t want to end up with “Hussein Saddam” instead of “Saddam Hussein”, so we’ll have to add a property specifying whether the surname comes first or last:

{"gender": "male", "given-name": "Hussein", "surname": "Saddam",
  "surname-after-given-name": false}

Great — that’s all we need to fix that, and now we know to print “Saddam Hussein”. Now, let’s look at Susan — there’s no problem recreating the string “Susan B. Anthony” from these properties, but we probably should rename the property given-name to given-names, just to avoid confusion:

{"gender": "female", "given-names": "Susan B.", "surname": "Anthony",
  "surname-after-given-names": true}

Al Unser Jr. is a bit trickier, because there’s no obvious place to put the “Jr.”. Strictly speaking, it’s neither a given name nor a surname, so for now, let’s just call it a postfix (although that assumes a physical position that might not apply to all languages):

{"gender": "male", "given-names": "Al", "surname": "Unser",
  "surname-after-given-names": true, "postfix": "Jr."}

Don Quixote, however, forces us to reconsider some of our assumptions, because “Don” is not a given name but an honorific. Assuming, however, that we don’t care whether it’s a name or an honorific, let’s just call it prefix for now, to go with postfix:

{"gender": "male", "prefix": "Don", "given-names": "Alonso",
  "surname": "Quixote", "surname-after-given-names": true,
  "postfix": "de la Mancha"}

Finally, just to throw a wrench into things, let’s assume that our list might contain things other than names, so that we need to add a type property:

{"type": "name", "gender": "male", "prefix": "Don",
  "given-names": "Alonso", "surname": "Quixote",
  "surname-after-given-names": true, "postfix": "de la Mancha"}
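To see how the property list would actually be consumed, here’s a sketch of a helper (my own, not part of any spec) that rebuilds the display string; it tolerates both the singular and plural spellings of the properties that accumulated along the way:

```python
def display_name(p):
    """Rebuild a display string from the flat property list above."""
    given = p.get("given-names") or p.get("given-name") or ""
    after = p.get("surname-after-given-names",
                  p.get("surname-after-given-name", True))
    # Order the core words, then wrap with the prefix/postfix if present.
    core = [given, p["surname"]] if after else [p["surname"], given]
    parts = [p.get("prefix", "")] + core + [p.get("postfix", "")]
    return " ".join(w for w in parts if w)
```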

Granted, that sort-of works, but it’s really not very nice, and it’s extremely brittle: there are names with extra words in the middle (such as “de”) that are properly not part of the given name or surname, for example. Then again, why overtag it? Perhaps we don’t need to know what’s a given name or honorific, as long as we can distinguish the surname. One possibility is simply to break it down into four properties:


{"type": "name", "gender": "male", "presurname": "Don Alonso",
  "surname": "Quixote", "postsurname": "de la Mancha"}

While I’m a big fan of Agile development in principle, I’ve worked on enough broken legacy systems to want to leave a little wiggle room for future requirements, like, say, a need to isolate the primary given name for a mail merge or index, even if we’re not going to isolate it right now. Fortunately JSON, like XML, has a natural ability to represent ordered information much more elegantly — let’s make the name into an ordered array:

{"type": "name", "gender": "male",
  "value": ["Don Alonso", {"type": "surname", "value": "Quixote"},
    "de la Mancha"]}

This approach provides us with almost limitless flexibility (for example, if we start isolating honorifics, we can deal with a language where the honorific comes at the end of the name with no extra trouble), and is just as simple and easy to read as the much less flexible presurname/postsurname approach. Building for today is great, but if you have a choice between two roughly equivalent approaches where one provides an easy future upgrade path and the other doesn’t, which is the better choice? JSON is new enough that the JSON community hasn’t yet had to deal much with the life cycle of information — once enough people have built apps relying on specific JSON formats, it will be very, very hard to make any changes: v.2 of any popular data format generally results in enormous costs (in money and goodwill), and v.3 rarely happens.
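Consuming the ordered-array form stays simple: flattening it to a display string is a plain walk over the array, and the surname remains individually addressable. A minimal sketch (the helper is mine, not part of any format):

```python
def flatten(value):
    """Join the ordered name parts into a display string."""
    words = []
    for item in value:
        # Tagged parts are maps with their text under "value";
        # untagged parts are plain strings.
        words.append(item["value"] if isinstance(item, dict) else item)
    return " ".join(words)

name = {"type": "name", "gender": "male",
        "value": ["Don Alonso", {"type": "surname", "value": "Quixote"},
                  "de la Mancha"]}
```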

Some people might prefer to shorten the above example a bit by following a simple convention: the first member of each array is a label, the second is a map with properties describing the rest of the array, and the remainder is the value, where order may be significant:

["name", {"gender": "male"},
  "Don Alonso", ["surname", {}, "Quixote"],  "de la Mancha"]

That is trickier to dump straight into a data structure or database table, but it’s a much more natural way to represent the information, and a lot easier to read on the screen. And just in case it doesn’t look familiar, compare:

<name gender="male">Don Alonso <surname>Quixote</surname>
  de la Mancha</name>
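The similarity is no accident: the [label, attribute-map, content…] convention maps mechanically onto element markup. Here’s a rough converter to show it (the converter is mine; note that the whitespace inside the strings is explicit, since the array form makes you decide where the spaces live):

```python
def to_markup(node):
    """Serialize a [label, attrs, *content] array as element markup."""
    if isinstance(node, str):
        return node                       # plain character content
    label, attrs, *content = node
    attr_s = "".join(f' {k}="{v}"' for k, v in attrs.items())
    inner = "".join(to_markup(c) for c in content)
    return f"<{label}{attr_s}>{inner}</{label}>"

quixote = ["name", {"gender": "male"},
           "Don Alonso ", ["surname", {}, "Quixote"], " de la Mancha"]
```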

If your information isn’t this complicated, JSON, XML, or LISP can be simple, as Doug pointed out — the XML could just as easily be


<name gender="male" presurname="Don Alonso" surname="Quixote"
  postsurname="de la Mancha"/>

The reason you don’t see that much is not because XML people never thought of it — browse the xml-dev archives from ten years ago for megabytes of discussion — but because it kept breaking in production systems as soon as the customer (or users) thought of a new requirement. When the information gets complicated, as I pointed out, there’s a bit of a tendency for all markup to end up looking like XML; when the information is simple, of course, XML can just as easily look like JSON or LISP.

Posted in General | 10 Comments

Tech botox


Elliotte Harold is absolutely right when he suggests that people should leave Java alone. New technologies compete on features; mature technologies compete on deployment.

Let’s value our mature, middle-aged technologies for what they are, rather than destroying their dignity by pumping them full of feature botox and slashing them up with plastic-surgery keyword changes to try to trick people into thinking they’re young and immature.

With some minor lapses, the W3C has done well avoiding the temptation to improve XML to death. XML is still, in every way that matters, the same as it was when the initial recommendation came out nine years ago — warts and all — and that’s why it’s so widely used. Sun should pay very close attention, since Java’s around the same age, and is deployed in many of the same places. The people who actually decide to use Java and XML to run organizations and do real work (not bloggers, but architects, project managers and even sometimes CTOs) appreciate them for precisely that stability and dependability.

4 Comments

MSIE MIA?

What happened to Windows Internet Explorer?

Browser stats

I just took a peek at my server stats for megginson.com (I’m pretty lazy about following them) and had a huge surprise. I’ve adjusted these to exclude “Unknown”, which I assume are mostly spiders and blog aggregators:

  • MS Internet Explorer: 42%
  • Firefox: 37%
  • Mozilla: 7%
  • NetNewsWire: 6%
  • Opera: 3%
  • Safari: 2%
  • Netscape: 1%

I cut off the list at 1%. MSIE is still in the lead, but it has suffered a huge drop from a few months ago — could it be that my ISP’s version of AWStats doesn’t recognize MSIE 7 and is lumping it with “Unknown”, or is there a chance that the movement to Firefox is becoming a stampede? The last time I remember changes this fast was when MSIE was crushing Netscape in the late 1990s.

Operating system stats

My site is not primarily a Linux or Open-Source software site, and it does not seem to attract a disproportionately high share of non-Windows users. Again, excluding “Unknown”, here is the OS distribution for visitors:

  • MS Windows: 74%
  • Linux: 15%
  • MacOS: 12%

Linux might be a touch high here, but megginson.com is no SlashDot. Something’s going on — either Firefox is doing to MSIE what MSIE did to Netscape (knocking off a stale browser sitting smugly on its assumed monopoly), or, as I mentioned, it’s just a reporting glitch.

5 Comments

XML 2006 pickled and preserved

The XML 2006 site is now pickled and preserved for long-term storage. Almost all of the presenters got their papers or slides in for the proceedings, if not on time, at least in time. Unfortunately, if you want to see a paper or slides from one of the few who didn’t send us anything, you’ll now have to pester them directly.

Recipe for pickling a web site

The original site was a hand-rolled LAMP implementation, but it was designed from the start to be amenable to a static copy. To pickle it, I started by doing a recursive slurp of the live site using wget (with the -m option) — that generated permanent, static HTML copies of the dynamic, database-driven pages on the site. At that point, I had an almost, but not quite perfect static copy of the site, because there were two things that wget missed:

  1. Images referred to only in CSS stylesheets (such as the banner).
  2. CSS stylesheets referred to by other CSS stylesheets.

It took only a few minutes to add all of that by hand, and the site was ready to go.
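The whole recipe fits in one command plus a manual pass (the URL here is a placeholder, and flag spellings vary a little between wget versions — check wget --help on yours):

```sh
# Recursive slurp of the live site into static files:
#   -m                  mirror (recursion + timestamping)
#   --convert-links     rewrite links to work in the local copy
#   --page-requisites   also fetch images/CSS the pages reference
wget -m --convert-links --page-requisites http://www.example.org/xml2006/

# Then add by hand what wget misses: images referenced only from CSS,
# and stylesheets imported by other stylesheets.
```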

Why it worked

This will be old news to a lot of people reading, but a few simple advance steps (during site design) made later static preservation easy. Here’s what I did:

  • Every page has its own URL, period, end of discussion. No AJAX, no POST.
  • Every page (or at least, every page that we want to archive) is reachable, directly or indirectly, from the home page.
  • Script names are not shown to the public, so there are no URLs ending in “php” (hint: exposed script extensions like “php”, “asp”, or “jsp” are signs of gross incompetence in web design).
  • No web pages rely on exposed GET request parameters: for example, the URLs looked like /programme/presentations/123.html, not /programme/presentation?code=123, or even worse, /show-presentation.php?code=123.

And that’s it. Of course, if the site had included live forms, I would have had to remove those as well (and any links to them), but that wouldn’t have been much extra work.

On a final note, while the live site was hosted on an Apache server (the “A” in “LAMP”), the pickled site is hosted on a Microsoft IIS server. It made no difference at all — that’s the way Web standards are supposed to work.

5 Comments

Jon Bosak's XML 2006 keynote now online

I’m happy to announce that Jon Bosak’s closing keynote from the XML 2006 conference is now online. We don’t require keynote speakers to contribute text to the proceedings, but we received a large number of requests for Jon’s talk and he kindly obliged.

In case anyone reading this doesn’t know, Jon chaired the original W3C group that developed XML. In his closing, post-dinner keynote, Jon gives a playful account of the controversies, strange behaviour, and general atmosphere leading up to the first public XML draft released in 1996. He then goes on to contrast the pioneer attitude (my phrase) of the implementors at the time with the vendor-dependence of most XML users today. It’s well worth a read, if you weren’t able to be there to listen — just remember to picture Jon saying everything with a slight smile at the edges of his mouth.

By the way, most of the other conference presentations also have slides and/or text available now. See the programme for links to papers or slides in the proceedings. And if you’re one of the few delinquent authors who has not yet sent in your proceedings, please get them to me as soon as possible.

Comments Off

Who's searching for "XML"?

Here are the top ten locations as of January 9 2007, according to Google trends:

  1. Pune, India
  2. Bangalore, India
  3. Hyderabad, India
  4. Chennai, India
  5. Mumbai, India
  6. Singapore, Singapore
  7. Delhi, India
  8. Tokyo, Japan
  9. Chiyoda, Japan
  10. Hong Kong, Hong Kong

Note that the top cities are all Asian. A search for “J2EE” returns almost exactly the same list. Now, compare the list for a representative new, trendy technology, Ruby on Rails:

  1. San Francisco, CA, USA
  2. Austin, TX, USA
  3. Pleasanton, CA, USA
  4. Seattle, WA, USA
  5. Salt Lake City, UT, USA
  6. Portland, OR, USA
  7. Vancouver, Canada
  8. Denver, CO, USA
  9. Oslo, Norway
  10. Auckland, New Zealand

This time, it’s 80% North American and 0% Asian, and more interestingly, all of those cities are west of the Mississippi. The easiest interpretation of this very small sample is that the Asian companies concentrate on established technologies that they can be paid for using, while the North American west coast companies are disproportionately interested in new, unproven technologies. What about a new technology that’s designed to work with an older one? Could we expect a mix of Asian and North American west coast cities? Here are the top cities searching for “XQuery”:

  1. San Jose, CA, USA
  2. Bangalore, India
  3. Singapore, Singapore
  4. Chennai, India
  5. San Francisco, CA, USA
  6. Mumbai, India
  7. Pleasanton, CA, USA
  8. San Diego, CA, USA
  9. Washington, DC, USA
  10. Hong Kong, Hong Kong

The implication of this very unscientific survey is that you can determine the relative maturity of a technology by looking at the weighting of search origins between western North America and eastern Asia.

3 Comments

Sneak peek at XML 2007

With XML 2006 barely over, we’re already deep into planning XML 2007. Here’s your first peek at what we have planned.

Time and place

XML 2007 is confirmed for Monday 3 December to Wednesday 5 December 2007. We’ll be meeting in Boston again, but at a different hotel, the Boston Marriott Copley Place (located at the opposite end of the Prudential Centre from the 2006 hotel).

A lot of people asked about moving the conference to early November. I think that’s an excellent idea, but unfortunately, we have to book the hotel over a year in advance, so we cannot make that change until 2008.

Program

There will be a few significant program changes for 2007. First, there will be no tutorial day before XML 2007 begins. Attendance for the tutorial day has been declining for several years, and with the obvious lack of interest from our attendees, it no longer makes sense for IDEAlliance to offer it. However, we will try to incorporate more beginner-level and tutorial-style presentations into the main program.

The vendor pecha-kucha went very well in 2006, but for 2007, we’re considering replacing it with a standards pecha-kucha, either in the evening or during one of the days. Each standards committee will have 20 slides (at 20 seconds each) to give us a quick update on what they’ve been doing over 2007 and what to expect in 2008 — that will make it possible for attendees to learn a bit about a lot of standards in a relatively short time.

The publishing and web tracks at XML 2006 were extremely well attended (often overflowing out of the space), and the enterprise track put up a more modest but still respectable showing. However, with only a couple of exceptions, the hands-on track did not attract the same number of people, and we’ve decided to discontinue it in 2007. While we haven’t made a final decision, we may replace it with a vendor track. I personally don’t object to a vendor track as long as it’s well labeled — slipping vendor presentations into the main program is analogous to letting advertisers buy search-engine placement, while having a separate vendor track is more analogous to Google text ads, since it’s clearly distinct. In any case, it turns out that there are lots of people who do want to hear product-specific information and even sales pitches.

We will end the formal program on Wednesday 5 December with a closing keynote around noon. The afternoon will be available for user-organized activities, such as BOFs, committee meetings, or even pub crawls and karaoke — we’ll provide an online forum to help you organize these activities well in advance, and we’ll publicize them on the conference web site. In the past, these activities have been confined to evenings, when people are already tired; moving them to the afternoon should make it possible for more people to participate.

Speakers

XML 2007 will not have a late-breaking call for papers; instead, we’ll open the regular call for papers early (probably at XTech 2007 in Paris), and will keep it open to the end of August or even into September. As with XML 2006, I’m hoping for a mix of veteran and rookie speakers at the conference — I especially like it when we can bring people in from other fields.

Also, by popular request, we’re looking at providing individual evaluation forms for each speaker, so that attendees can help us identify the best and most entertaining among you. We’ll also go back to asking for proceedings before the conference, since that was overwhelmingly what people wanted; however, we will continue to accept papers in PDF or XHTML format so that speakers do not have to try to set up their own XML mini-publishing systems.

Comments?

I was very happy with how XML 2006 turned out, and I’m looking forward to an even better conference in 2007. Please let me know what you think about these changes — and if you have any new suggestions — by leaving a comment here.

3 Comments

ReiserFS

A number of years ago I was working on the scenery system for the open source FlightGear flight simulator. Due to the nature of geodata and the scenery building system, I ended up with tens of thousands of tiny files on my hard drive, many only a few bytes long, and I was constantly running out of disk space.

Then I read about an alternative filesystem for Linux called ReiserFS, part of a new generation of journaling filesystems. Unlike the others, however, ReiserFS had a special innovation: it allowed multiple very small files to share the same block, so that a 5-byte file would not automatically take up 512 bytes (or whatever your block size was). I switched over, and bingo! There was suddenly a huge amount of free space on my previously-full hard drive, and I noticed no performance problems (aside from the occasional tiny zombie file that I couldn’t delete).
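The arithmetic behind that free space is simple: without tail packing, every file is rounded up to whole blocks. A back-of-envelope sketch (numbers illustrative):

```python
def slack(file_size, block_size=512):
    """Bytes wasted when a file must occupy whole blocks."""
    blocks = -(-file_size // block_size)   # ceiling division
    return blocks * block_size - file_size
```

A 5-byte file wastes 507 bytes at a 512-byte block size (and 4091 at 4096), so tens of thousands of tiny scenery files add up fast.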

I’ve been running Reiser ever since, but the filesystem has fallen on hard times. On 14 September 2006 (via Tony Coates), Jeff Mahoney announced that the SuSE Linux distribution would no longer use ReiserFS as its default. Mahoney is also one of the principal ReiserFS developers, and he wrote that ReiserFS3 does not scale, that it has a small and shrinking developer community inadequate to maintain it, and that ReiserFS4 is “an interesting research file system, but that’s about as far as it goes.” Then, on 10 October 2006 Hans Reiser, the principal maintainer, was arrested and charged with the murder of his estranged wife Nina.

SuSE was the only Linux distribution that used Reiser as its default filesystem. This c|net story links the SuSE decision with the murder charges, but it’s worth noting that Mahoney’s message predates the charges by almost a month. Whatever the cause, however, Novell (SuSE’s owner) had contributed significant resources towards the maintenance of ReiserFS. It no longer looks like ReiserFS has any future at all, and in its current state, it has performance and scalability problems that prevent its use in high-demand environments. ReiserFS was a big help to me when I needed it a few years back, but the next time I install Ubuntu, I’ll use the default ext3 filesystem instead. Hard disks — even for notebook computers — are a lot bigger and cheaper now, anyway.

6 Comments

In praise of architecture astronauts

Six years ago, Joel Spolsky wrote a piece on Architecture Astronauts, people who get so obsessed with the big picture that they miss the important little details that actually make things work. More recently, Dare Obasanjo pointed to Spolsky’s piece in his posting XML Has Too Many Architecture Astronauts.

I’d like to start by agreeing with Dare: XML does have too many architecture astronauts, and almost everything that’s bad, ugly, or simply scary about the huge number of standards built around XML (WS-* springs immediately to mind, but it’s not alone) comes from gross overgeneralization. That said, architecture astronauts do have their place, and we ignore them at our peril.

Case 1: Napster

Let’s start by turning Spolsky’s main example (which Dare cites) on its head. Here are two different perspectives on Napster circa 2001:

Architecture pedestrian: Napster lets people find and download songs.

Architecture astronaut: Peer-to-peer networks let people find and download songs. Napster is (was) a peer-to-peer network.

Spolsky writes about how the architecture-astronaut perspective helped fuel a mini-P2P bubble at the time, with investors pouring money into P2P-everything, when Napster’s success was due not to the fact that it was P2P but to the fact that it let people get songs easily. However, consider what was happening at the same time in the music industry. Rightly or wrongly, they wanted to stop people from sharing songs. The architecture-pedestrian perspective (my term, not Spolsky’s) told them that Napster lets people find and download songs, so the industry spent millions of dollars in legal fees, PR, etc. shutting down Napster. The result? People downloaded even more music. After all, as the astronauts said, it was P2P networks that let people share music, not Napster in particular. Since then, the music industry has been fighting the equivalent of an insurgency, putting down one uprising after another with no end in sight.

Case #2: The Netscape IPO

My second example took place over 11 years ago, kicking off the much larger dot.com bubble (the P2P mini-bubble was just a tiny part of its tail). It was around 1995 that most non-techies noticed the web, mostly through the lens of the Netscape browser. Again, the architecture pedestrian and the architecture astronaut looked at this differently:

Architecture pedestrian: Netscape lets people see text and pictures online.

Architecture astronaut: The web allows people to put text and pictures online. Netscape is a web browser.

This time, the investors listened to the architecture pedestrian rather than the architecture astronaut: Netscape was set to open at $14/share, doubled to $28/share, and climbed to $75/share on the first day, and eventually reached a peak market cap of $8 billion. The astronauts knew all along, however, that while people (at the time) thought of the web in terms of the Netscape browser, the web wasn’t Netscape. If Internet Explorer hadn’t knocked Netscape off its perch (resulting in layoffs as early as January 1998), some other browser soon would have.

Case #3: XML

So how does this all apply to XML? I think that there are two ways that architecture astronauts can approach XML, one good and one bad. The bad one is in line with Spolsky’s original piece, where people miss what made XML popular (relative simplicity, no need to create DTDs, etc.) and believe that if a bit of standardization is good, a lot must be even better. The good one is to step back and point out that most of the advantages that appear to come from XML actually come from generic tree markup, and that holy wars between XML, JSON, YAML, etc. are really beside the point. In various situations, one syntax may have an advantage due to software support — for example, web browsers have built-in support for parsing XML or styling it using CSS, and they can convert JSON directly to JavaScript data structures using the eval() function — but when you look at the whole world of generic markup, those are small blips on a very large screen, and all of the markup languages more-or-less look the same.
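The “it’s all generic tree markup” point is easy to demonstrate: take the same tiny tree in JSON (using the label/attribute-map/content array convention from the earlier posting) and in XML, parse both, and the pieces line up one for one.

```python
import json
import xml.etree.ElementTree as ET

j = json.loads('["name", {"gender": "male"}, "Quixote"]')
x = ET.fromstring('<name gender="male">Quixote</name>')

assert j[0] == x.tag      # element name / array label
assert j[1] == x.attrib   # attribute map
assert j[2] == x.text     # character content
```

Once parsed, an application walking either tree sees the same information; the syntax differences are surface-deep.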

5 Comments