My first REST design question is about the fact that RESTafarians seem to treat identification and location as the same thing, and, following from that, how to make identification persistent in XML resources. For example, assume that http://www.example.org/airports/ca/cyow.xml is both the unique identifier of an XML data object and the location of that object on the web. That’s the whole point of REST, really. RESTafarians don’t like interfaces where, for example, identifiers are hidden inside XML objects returned from POST requests to unrelated URLs (in fact, they get angry in quite an amusing way).
GET and PUT
So, here’s a simple use case. Let’s say that I download the XML data file at http://www.example.org/airports/ca/cyow.xml and it looks like this:
<airport>
  <icao>CYOW</icao>
  <name>Macdonald-Cartier International Airport</name>
  <political>
    <municipality>Ottawa</municipality>
    <region>ON</region>
    <country>CA</country>
  </political>
  <geodetic>
    <latitude-deg>45.322</latitude-deg>
    <longitude-deg>-75.669167</longitude-deg>
    <elevation-msl-m>114</elevation-msl-m>
  </geodetic>
</airport>
I then copy it onto a USB memory stick, bring it home from work, copy it onto my notebook computer, and work on it while offline during a business flight. The file no longer has any direct connection with its URL: it has gone through other transfers since the HTTP GET request I used to download it. How do I know what I’m working on or where I should PUT it when I’m done?
If this information has to be kept out of line, then some of REST’s advantages evaporate, because now I have to start using custom-designed clients again instead of simply piggybacking on existing web technologies. As an identifier, the URL is clearly part of the resource’s state and belongs in the XML data file; as a location, however, it is superfluous information and belongs only at the protocol (HTTP) level.
Where does the document identifier go?
Let’s assume that I get over my squeamishness and decide that the URL is a proper identifier and belongs in the XML representation. Now, how do I do that in a fairly generic way? xml:id is out of the question, since it’s designed only to hold an XML name for identifying part of a document, not a URL to identify an entire document. I could use (or abuse) xml:base, like this:
<airport xml:base="http://www.example.org/airports/ca/cyow.xml"> ... </airport>
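To make the round trip concrete, here is a minimal sketch in Python of how a client might recover the URL from a file that has been copied around offline, assuming the xml:base convention shown above. The function name document_url is hypothetical, and the sketch uses only the standard library's xml.etree.ElementTree. It shows that the attribute is readable by ordinary application code in this particular API, not that this use of xml:base is sanctioned.

```python
import xml.etree.ElementTree as ET

# The "xml:" prefix is always bound to this namespace, so no
# declaration is needed in the document itself.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def document_url(xml_text):
    """Return the xml:base value on the root element, or None if absent.

    Hypothetical helper: assumes the document identifier is stored in
    xml:base on the root element, as in the example above.
    """
    root = ET.fromstring(xml_text)
    return root.get(f"{{{XML_NS}}}base")


sample = """<airport xml:base="http://www.example.org/airports/ca/cyow.xml">
  <icao>CYOW</icao>
</airport>"""

# The recovered value is where the edited file would be PUT.
print(document_url(sample))
```

A document without the attribute yields None, which is the case where the client has no in-band way to know where the file came from.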
I’m not certain, though, how XLink processors would deal with that. Would the relative URL “cyyz.xml” end up being resolved to http://www.example.org/airports/ca/cyow.xmlcyyz.xml? There’s also the possibility that some highly-cooked APIs might predigest the xml:base attribute so that application code never sees it. Do the XML standards people believe this kind of xml:base usage is legit?
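On the resolution question, generic URI reference resolution (RFC 3986) replaces the last path segment of the base rather than appending to it. The short Python check below uses the standard library's urllib.parse.urljoin to illustrate that rule with the URLs from this post. Whether any particular XLink processor behaves the same way is an assumption I have not tested.

```python
from urllib.parse import urljoin

# Base URL taken from the xml:base example above.
base = "http://www.example.org/airports/ca/cyow.xml"

# Resolve the relative reference against the base. The final path
# segment ("cyow.xml") is dropped before "cyyz.xml" is merged in.
resolved = urljoin(base, "cyyz.xml")

print(resolved)  # http://www.example.org/airports/ca/cyyz.xml
```

So a base that names the document itself, rather than its directory, should still resolve sibling documents correctly under the generic rules.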
If xml:id is unusable, and xml:base is problematic, it looks like there might be no standard way to identify RESTful XML documents, and each XML document type will need its own ad-hoc solution. Any suggestions? Does the world need one more xml:* attribute (I hope not)?
I’d be interested in hearing how REST developers have dealt with identifier persistence and round-tripping when the identifier is the URL.