Now that the Java world is noticing REST, the low-pain alternative to RPC standards like WS-*, people are starting to blog about it again. Gossip with other IT folks also tells me that people’s customers are actually asking for REST explicitly (rather than having to be convinced to use it). With that in mind, I’m going to try to explain what I think matters about REST, and what you can safely ignore.
The elevator pitch
With REST, every piece of information has its own URL.
If you just do that and nothing else, you’ve got 90%+ of REST’s benefits right off the bat. You can cache, bookmark, index, and link your information into a giant, well, web. It works — you’re reading this, after all, aren’t you? Betcha got here by following a link somewhere, not by parsing a WSDL to find what ports and services were available.
Real best practices
If you want to do REST well (rather than just doing REST), you can spend 2-3 minutes after your elevator ride learning a few very simple best practices to get most of the remaining 10% of REST’s benefits:
Use HTTP POST to update information. Here’s the simple rule: GET to read, POST to change. That way, nobody deletes or modifies something by accident when trying to read it.
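Here is a minimal sketch of the GET-to-read, POST-to-change rule, assuming a toy in-memory store rather than a real web framework. The `store` dictionary, the `handle` function, and the path are all made up for illustration.

```python
# Hypothetical in-memory "server": the names and paths are illustrative only.
store = {"/systems/foo": "original"}

def handle(method, path, body=None):
    """Return (status, content); only POST is allowed to change the store."""
    if method == "GET":
        # Reading never modifies anything, so it is safe to repeat, cache, or crawl.
        if path in store:
            return 200, store[path]
        return 404, None
    if method == "POST":
        # Every change goes through POST.
        store[path] = body
        return 200, body
    # Anything else is rejected in this sketch.
    return 405, None
```

Because the GET branch never writes to `store`, a crawler or a prefetching browser that follows every link it sees cannot change anything by accident.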
Make sure your information contains links (URLs) for retrieving related information. That’s how search engines index the web, and it can work for other kinds of information (XML, PDF, JSON, etc.) as well. Once you have one thing, you can follow links to find just about everything else (assuming that you understand the file format).
Try to avoid request parameters (the stuff after the question mark). It’s much better to have a URL like
Search engines are more likely to index it, you’re less likely to end up with duplicates in caches and hash tables (e.g. if someone lists the request parameters in a different order), URLs won’t change when you refactor your code or switch to a different web framework, and you can always switch to static, pregenerated files for efficiency if you want to. Exceptions: searches (http://www.example.org/search?q=foo) and paging through long lists (http://www.example.org/systems/?start=1000&max=200). In both of these cases, it’s really OK to use the request parameters instead of tying yourself in a knot trying to avoid them.
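To see why request parameters can lead to duplicates in caches and hash tables, here is a small sketch using Python's standard `urllib.parse`. The `canonical_url` helper is my own illustration, and the second URL is just the paging example with its parameters swapped.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonical_url(url):
    """Sort the request parameters so equivalent URLs compare equal."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )

a = "http://www.example.org/systems/?start=1000&max=200"
b = "http://www.example.org/systems/?max=200&start=1000"  # same request, different order

# As plain strings the two URLs are different cache keys...
print(a == b)
# ...and only match after someone takes the trouble to normalize them.
print(canonical_url(a) == canonical_url(b))
```

A cache that keys on the raw string would store the same list twice. A URL without request parameters has nothing to reorder, so it does not run into this.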
Avoid scripting-language file extensions. If your URLs end with “.php”, “.asp”, “.jsp”, “.pl”, “.py”, etc., (a) you’re telling every cracker in the world what exploits to use against you, and (b) the URLs will change when your code does. Use Apache mod_rewrite or equivalent to make your resources look like static files, ending in “.html”, “.xml”, etc.
Avoid cookies and URL rewriting. Well, maybe you can’t, but the idea of REST is that the state is in the thing the server has returned to you (an HTML or XML file, for example) rather than in a session object on the server. This can be tricky with authentication, so you won’t always pull it off, but HTTP authentication (which doesn’t require cookies or session IDs tacked onto URLs) will work surprisingly often. Do what you have to do to make your app work, but don’t use sessions just because your web framework tells you to (they also tie up a lot of resources on your server).
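As a rough illustration of how HTTP authentication avoids cookies and session IDs, here is a sketch that builds the header for HTTP Basic authentication with Python's standard `base64` module. The `basic_auth_header` helper and the credentials are assumptions made up for this example.

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}

# Hypothetical credentials: the client sends this header with every request,
# so the server does not need a session object to remember who is asking.
headers = basic_auth_header("alice", "s3cret")
```

Basic authentication only encodes the credentials and does not encrypt them, so it belongs on an HTTPS connection.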
Speculative stuff (skip this)
The strength of REST is that it’s been proven through almost two decades of use on the Web, but not everything that some of the hard-core RESTafarians (and others) try to make us do has been part of that trial. Stop reading now if you just want to go ahead and do something useful with REST. Really, stop! Some of this stuff is moderately interesting, but it won’t really help you, and will probably just mess up your project, or at least make it slower and more expensive.
[maybe some day] Use HTTP PUT to create a resource, and DELETE to get rid of one. These sound like great ideas, and they add a nice symmetry to REST, but they’re just not used enough for us to know if they’d really work on a web scale, and firewalls often block them anyway. In real-life REST applications, rightly or wrongly, people just use POST for creation, modification, and deletion. It’s not as elegant, but we know it works.
[don’t bother] Use URLs to point to resources rather than representations. Huh? OK, a resource is a sort-of Platonic ideal of something (e.g. “a picture of Cairo”), while a representation is the resource’s physical manifestation (e.g. “an 800×600 24-bit RGB picture of Cairo in JPEG format”). Yes, as you’d guess, it was people with or working on Ph.D.’s who thought of that. For a long time, the W3C pushed the idea of URLs like “http://www.example.org/pics/cairo” instead of “http://www.example.org/pics/cairo.jpg”, under the assumption that web clients and servers could use content negotiation to decide on the best format to deliver. I guess that people hated the fact that HTTP was so simple, and wanted to find ways to make it more complicated. Fortunately, there were very few nibbles, and this is not a common practice on the web. Screw Plato! Viva materialism! Go ahead and put “.xml” at the end of your URLs.
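Here is a sketch of what putting the format in the URL buys you, assuming a small hand-written suffix table. The `MEDIA_TYPES` mapping and the `media_type_for` helper are illustrative and not taken from any particular server.

```python
import posixpath
from urllib.parse import urlsplit

# Illustrative suffix table; a real server would have a longer one.
MEDIA_TYPES = {
    ".html": "text/html",
    ".xml": "application/xml",
    ".json": "application/json",
    ".jpg": "image/jpeg",
}

def media_type_for(url):
    """Return the media type implied by the URL's suffix, or None if there is none."""
    path = urlsplit(url).path
    suffix = posixpath.splitext(path)[1].lower()
    return MEDIA_TYPES.get(suffix)

print(media_type_for("http://www.example.org/pics/cairo.jpg"))
print(media_type_for("http://www.example.org/pics/cairo"))
```

With the suffix, the URL alone tells you which representation you will get. Without it, the lookup comes back empty, and client and server have to work out a format between them through content negotiation.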
[blech] Use URNs instead of URLs. I think even the hard-core URN lovers have given up on this now: it’s precisely the kind of excessive abstraction that sent people running screaming from WS-* into REST’s arms in the first place (see also “content negotiation”, above), and it would be a shame to scare them away from REST as well. URLs are fine, as long as you make some minor efforts to ensure that they don’t change.
[n/a] REST needs security, reliable messaging, etc. The RESTafarians don’t say this, but I’m worried that the JSR (the Java REST group) will. We already have a secure version of HTTP (HTTPS, running over TLS/SSL), and it works fine for hundreds of thousands or millions of web sites. Reliable messaging can be handled fine in the application layer, since everyone’s requirements are different anyway, or maybe we want a reliable-messaging spec for HTTP in general. In either case, please don’t pile this stuff on REST.
So to sum up, just give every piece of information its own URL, then have fun.
Very nice summary. Quick question, though. Who’s advocating the use of URNs? In what context? If you were to use URNs to name a resource, how would you dereference it? Sorry, but I haven’t seen this bit of perceived wisdom before.
I hate URNs too.
Content-negotiation may come into its own yet, especially with something like yadis for OpenID, but suffixes are ok, so long as they’re the format being returned, not the authoring tool – i.e. .xml, .json, .html is fine, .php, .pl, .asp, .asmx sucks. Much better than ?format=json as seen in Yahoo! services, anyway.
Your question “how would you dereference it” was exactly what killed URNs in the end (or at least put them into a permanent coma). They were a big deal in the late 1990s among the web technorati, who wanted to give us identifiers that were not subject to the whims of DNS (for example, you could give an ISBN-based URN to identify a book, instead of the Amazon link), but nobody ever figured out how they would actually work. Since the XML spec was written at that time, it retains the oddity that a system identifier (such as the external DTD subset) can be a URN reference as well as a URL reference, though I’m not sure what most XML parsers would do if they ever ran across something like this:
Actually, I know what URNs are (and in fact kinda like ’em for namespace names, that or URLs pointing to RDDL documents). I was wondering what use they would have in a RESTian world outside of that?
Very nice, but I disagree with your description of POST and PUT. While POST can do lots of things, it should be used to create when PUT is available, and PUT should be used for updates.
I’m glad that you said “_Try_ to avoid request parameters” (emphasis on try), since I see this as an example of where the ideals of REST taken to the extreme would make life a real pain in the ass. I “try” to do this when designing REST protocols, but only up to a certain point. Query strings are a practical convenience and are so broadly supported that to force every URL to be in the form /foo/bar/foo2/bar2/abc/def would just make things a lot harder than they need to be (for both clients and servers).
If the net/web ever transforms into something that passes blocks of data rather than packets between end-points (kind of like the distributed hash table overlay networks of today), I guess URNs, resolved to SHA identifiers for a suitable representation by some Google-like service that parses RDF webs, would be better than direct SHAs.
I’m not totally sold on getting rid of querystrings. They have some advantages. A major one is being accessible via forms. Another is that they provide a bit of self description.
Which is better?
I’m not totally sold on getting rid of querystrings. Querystrings have some advantages, including being accessible from forms and offering a bit of implicit description.
Which is better?
I agree with “Avoid scripting-language file extensions” but not the first reason you give for it. Attempting to conceal your framework is illusory protection at best; it’s “security through obscurity”. Your second reason is sufficient: it’s just encapsulation. Choice of scripting language or framework is an implementation detail that should not be reflected in the URL.
I’m sorry, but
is way less self explaining than
What is http://www.example.org/systems/foo/bar/components/
Only the developer knows what he meant.
I also think the risk of unintentional overwriting should be minimized by authorisation, not by syntax.