Of Dilbert and Torture

[I normally stick to technical issues on this weblog. This posting is about logic, which is sort-of related to tech; apologies in advance to anyone who came here hoping for a short break from personal pontification about current events.]

Over on The Dilbert Blog, Scott Adams has just declared himself the winner of a debate. He asked the following question:

If you think there’s no moral justification for torture, would you accept the nuclear destruction of NYC (for example) to avoid torturing one known terrorist? (No fair extending my question to more ambiguous hypotheticals.)

Most people who commented objected to the question itself; as a result, today Adams declared himself the winner by a knockout and went on to insult his opponents:

… a scary number of people offered comments that were the logical equivalent of punching themselves unconscious in the first round. I don’t need to point them out because they’re somewhat obvious. The point is that most of those people are eligible to vote.

Let’s put aside the issue of torture, and simply look at the question itself. Adams has structured his question so that whether you answer ‘yes’ or ‘no’, you’re forced first to accept the premise that torture is an effective way to get information — in other words, there’s no way to answer the question directly without agreeing with him. This trick is called the Fallacy of many questions — the classic (somewhat disturbing) example is the question “when did you stop beating your wife” — and in a formal debate, it would result in a severe penalty.

To show how this fallacy distorts an argument, substitute a premise that (I hope) no one reading this posting would agree with, and try to come up with a straight ‘yes’ or ‘no’ answer:

If you think there’s no moral justification for murdering children, would you accept the nuclear destruction of NYC (for example) to avoid pushing one live baby slowly into a wood chipper? (No fair extending my question to more ambiguous hypotheticals.)

I do believe that it’s important to debate all issues openly, even touchy ones such as whether torture is an effective form of interrogation — I believe that the answer is ‘no’, but in my personal, offline life, I’m not afraid to hear legitimate evidence and reasonable arguments from people who disagree with me. I promise not to introduce any logical fallacies to try to trip those people up.

And I don’t plan an ad hominem attack against Adams either. He seems to be a smart guy, and I enjoy his comics. I’ll look forward to hearing his legitimate arguments on the torture issue.

Posted in General | 7 Comments

Mind your colons …

… and make friends with a technical writer.

Prescriptive grammarians — the ones who argue that the English language should follow a single standard that is both correct and eternal (at least since Fowler) and attempt to impose that standard on people around them — have generally had, at most, a very limited exposure to serious language study. To put it bluntly, folks, we laugh at you behind your backs. Alexander Pope’s famous quip about dim-witted, self-important critics applies here as well:

A little Learning is a dang’rous Thing;
Drink deep, or taste not the Pierian Spring:
There shallow Draughts intoxicate the Brain,
And drinking largely sobers us again.

Technical writers

To a software engineer, the person who often seems the most drunk on shallow drafts of prescriptive grammar is the technical writer. The engineer sends the tech writer a spec, hoping to have the spelling corrected or the prose tidied a bit, and gets back pages covered in red ink, pointing out apparently minor details like ambiguous pronoun reference, comma splices, and colon usage. Sadly, there do exist dim-witted, self-important technical writers, but in fact, most of them are not closet prescriptive grammarians; instead, they are trying to do two things:

  1. make the phrases, clauses, sentences, and paragraphs consistent and intuitive in the documentation, just as you try to make the class APIs, GUI components, and interfaces consistent and intuitive in the code (see the sketch after this list); and
  2. bridge the gap between engineers, who know a lot about the application, and users, who know little to nothing about it.
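
To put that first goal in code terms, here is a hypothetical interface, invented purely to make the analogy concrete (the odd method name turns up again in the colon examples below):

// A hypothetical billing interface, invented for illustration only.
// Two accessors follow the getXxx convention; the third breaks it,
// which is the kind of inconsistency a code reviewer would flag,
// and the kind a tech writer flags in prose.
public interface TransactionRecord {
    java.util.Date getDate();
    String getAuthorization();
    java.math.BigDecimal retrieveAmount(); // inconsistent: getAmount() would match
}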

Colon usage

As a gift to technical writers, in keeping with the holiday spirit, I’m going to descend a little into the underworld of prescriptive grammar and point out one item that gives tech writers no end of frustration: the use of the colon (:). Take a look at this sentence:

[no]

The three functions are: create, edit, and delete.

Tech writers, copy editors, and English teachers will not accept this use of the colon, any more than a software engineer would accept a method named retrieveAmount beside getDate and getAuthorization. On the other hand, a tech writer would have no objection to this sentence:

[yes]

There are three functions: create, edit, and delete.

Can you spot the difference? If not, here’s another example of colon usage that is unacceptable to most tech writers:

[no]

To enable editing, select:

  • authenticate users,
  • enable backups, and
  • enable page modification.

Without the colon, the example would be perfectly acceptable:

[yes]

To enable editing, select

  • authenticate users,
  • enable backups, and
  • enable page modification.

Here’s an alternative version that is acceptable with a colon:

[yes]

To enable editing, select the following options:

  • authenticate users,
  • enable backups, and
  • enable page modification.

There is a very simple rule of thumb that you can apply: use a colon only if what appears before it could be a sentence on its own. “The three functions are” and “To enable editing, select” cannot stand on their own as sentences; “There are three functions,” “To enable editing, select the following options,” and Pope’s “A little Learning is a dang’rous Thing; Drink deep, or taste not the Pierian Spring” can.

Best practice for punctuation changes fast, and some day (likely soon), this rule of thumb will be completely obsolete. For now, though, why not make a tech writer’s day a little brighter, and mind the colons?

Posted in General | 3 Comments

Bob DuCharme

Bob DuCharme, who is well known in the XML community, now has a weblog. Welcome.

Tagged | Comments Off on Bob DuCharme

Can SOAP hide XML?

[update: fix affiliations]

I just stumbled on an interesting paper from the IEEE Web Services conference last July, Rethinking the Java SOAP Stack (PDF), written by Steve Loughran at HP Laboratories Bristol and Edmund Smith at the University of Edinburgh. Steve and Edmund believe that JAX-RPC (the main Java-based SOAP interface) takes a disastrously wrong approach for a couple of reasons:

  1. automated Object/XML mapping is badly broken for non-trivial applications, and cannot be fixed; and
  2. by generating WSDL automatically from Java code, rather than the other way around, JAX-RPC is following a contract-last antipattern.

I agree with them on both points. As I mentioned in my talk at XML 2005, I believe that most attempts to hide XML from developers are doomed because XML really does represent a new way of thinking about information, not just a new way of encoding it. Like the authors, I’ve also noticed that most real-world SOAP implementations end up using document/literal encoding anyway. Here’s a particularly cutting passage from the paper:

We believe that only two categories of web service developer exist: those who are comfortable with XML and want to work with it, and those who aren’t but end up doing so anyway. JAX-RPC provides a sugar-coated wrapping that encourages developers who are relatively unfamiliar with XML to bite. Yet, as anyone who has written a web service of any complexity knows, the XML must be faced and understood eventually. In practise, the task of creating a real web service is made more difficult, not less, by the huge volume of code JAX-RPC introduces into a project.

This is a much bigger issue than JAX-RPC or even SOAP, though. In the end, it points to one of the biggest dividing lines in the XML world, the line between people who think XML is a detail that can be hidden (many big vendors talk that talk) and those who think that XML is an enabling technology that should be brought into the foreground. Interestingly, the rank and file seem to fall more into the second camp, at least judging by the popularity of raw REST HTTP+XML interfaces among PHP/Perl/Python web site developers.
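
For what it’s worth, the raw HTTP+XML style doesn’t take much more code in Java, either. Here’s a minimal sketch (the endpoint URL and element names are invented for illustration) that fetches an XML document and works with the markup directly, using nothing but java.net and DOM, with no generated stubs in sight:

import java.io.InputStream;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RawXmlClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical REST-style endpoint that returns an XML document.
        URL url = new URL("http://example.org/orders/1234");
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        InputStream in = url.openStream();
        try {
            Document doc = builder.parse(in);
            // Work with the markup itself rather than a generated binding class.
            NodeList items = doc.getDocumentElement().getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                System.out.println(item.getAttribute("sku"));
            }
        } finally {
            in.close();
        }
    }
}

When the document format changes, the change shows up here in plain sight instead of in a pile of regenerated stub classes, which is exactly the authors’ point about having to face the XML eventually.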

Tagged | 2 Comments

White-texting Google

[Update: the white text no longer helps Oxcyon — they’re not even the top hit for their own company name any more.]

I’ve just stumbled across the most extreme white-text example I’ve ever seen, and it belongs not to a porn site but to an enterprise software vendor. Check it out:

http://www.oxcyon.com/

Scroll to the bottom of the visible text and then start highlighting with your mouse to bring into view screen after screen of white-text key words and phrases custom-designed to get the attention of Google and other search engines. Or, if you prefer, just bring the site up in lynx.

I guess this kind of thing still works. I tried a Google search for “stellent taxonomy support”, and Oxcyon — not Stellent (a competitor) — was the first hit.

Tagged | 7 Comments

Thanks, Lauren

Lauren Wood (by Tim Bray).

[Update: Lauren has posted her farewell message.]

On Thursday night in front of a packed banquet hall in Atlanta, Lauren Wood announced her retirement as chair of the annual fall XML conference, the world’s largest XML event.

Lauren has done an outstanding job organizing and building up this conference over the past five years, but that’s only one of her many contributions to the XML community. Lauren also chaired the W3C DOM working group from its inception to the release of DOM level 2, and before that, she worked for SoftQuad, producers of one of the leading SGML and XML editors.

Lauren is now at Sun Microsystems and starting to spend some of her time on Liberty Alliance work — given Lauren’s track record so far, I’m suddenly a little more optimistic about the prospects of shared digital identity and single sign-on for the web.

Photo by Tim Bray, copyright (c) Lauren Wood, used under a Creative Commons license.

Tagged , | 1 Comment

Must-Ignore and Must-Understand

I was listening to Tim Bray’s excellent talk On Language Creation today at the XML 2005 conference in Atlanta. Tim was talking about creating new XML-based markup languages (summary: “please don’t”), and in passing he mentioned the must-ignore/must-understand design pattern. For the first time, it occurred to me that this pattern has a serious flaw.

The pattern

The pattern works this way: you want to let people extend your XML-based language with new elements, and you want forward compatibility so that systems don’t break if or when you upgrade the language. It’s usually a good idea, then, to let applications simply ignore what they don’t understand (as is the case with HTML). That’s called must-ignore. For example, if your application sees this XML document

<record>
 <a>xxx</a>
 <b>xxx</b>
 <w>xxx</w>
 <c>xxx</c>
</record>

but it does not understand the w element (maybe you added it to hold extra information for a different application), it will just pretend that the w element wasn’t there, and might process the document as if it read

<record>
 <a>xxx</a>
 <b>xxx</b>
 <c>xxx</c>
</record>

On the other hand, if w contained some kind of crucial information that would change the application’s processing — say, by reversing the outcome or specifying an essential prerequisite (“turn off the oxygen first”) — it would be better to have the application quit and report an error instead of chugging on ahead. That’s called must-understand. Some specifications, like SOAP, actually encode these rules inside the XML instance on an instance-by-instance basis, but most simply state them in general terms in the spec’s prose.
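
In code, the pattern might look something like the following sketch. The element names come from the example above; the mustUnderstand attribute is modeled loosely on SOAP’s, and the unqualified attribute name here is just for illustration, not from any particular specification:

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RecordReader {

    // Element names this application actually knows how to process.
    private static final java.util.Set<String> KNOWN =
        new java.util.HashSet<String>(java.util.Arrays.asList("a", "b", "c"));

    public void process(Element record) {
        NodeList children = record.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element child = (Element) node;
            String name = (child.getLocalName() != null)
                ? child.getLocalName() : child.getNodeName();
            if (KNOWN.contains(name)) {
                handle(child);
            } else if ("true".equals(child.getAttribute("mustUnderstand"))) {
                // Must-understand: stop and report an error rather than
                // silently dropping information the sender marked as essential.
                throw new IllegalStateException(
                    "Unknown element marked mustUnderstand: " + name);
            }
            // Must-ignore: otherwise skip the unknown element entirely.
        }
    }

    private void handle(Element child) {
        // Application-specific processing for a, b, and c goes here.
    }
}

The catch, as the next section argues, is deciding what “understand” means in the first place.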

The problem

I realized today, however, that there’s a huge problem with this approach: must-ignore and must-understand are properties of a processing model, not a markup language. Consider an XML language for a business report: if I designate an element as must-understand, what do I really mean?

  1. An application must understand this element to copy this information into a database?
  2. A search engine must understand this element to index it?
  3. A formatting engine must understand this element to generate a PDF?
  4. An XML editing tool must understand this element to open the document?
  5. An XSLT engine must understand this element to do a transformation?
  6. An archiver must understand this element to save the report for auditing purposes (say, Sarbanes-Oxley requirements)?

Each of these represents a different processing model for the same XML document. The must-understand and must-ignore constraints will likely be different for each one, so they’re obviously not properties of the XML-based markup language. Some XML languages, like SOAP and Atom, are specified explicitly as parts of protocols, so the must-understand/must-ignore constraints are part of the protocol specification, but even then, once you have XML, you never know what clever things people will decide to do with it.

Posted in General | 7 Comments

First mover (dis)advantage

I recently heard from an older computer user who was delighted that his hotel’s free WiFi simply worked with his notebook computer. Internet access on the road didn’t use to be so easy, either for hotels or their guests. Consider these three (hypothetical) hotels:

  1. In 1995, hotel #1 spent a lot of money to redo its digital phone system to make it compatible with computer modems.
  2. In 2000, hotels #1 and 2 spent even more money to run Cat 5 (Ethernet) cable to all of their rooms.
  3. In 2005, hotels #1, 2, and 3 spent much less money to set up a few WiFi hotspots.

A quickie moral would be that hotel #3 came off better, since it ended up in the same place for a fraction of the cost, while the other two suffered from a first mover disadvantage. Reality, of course, is more complicated: hotels #1 and 2 had five years to amortize each of their earlier investments. If those investments allowed them to steal guests from hotel #3, or to charge higher rates, then the investments may well have turned a net profit for the hotels.

The real moral is the one that the extreme programming advocates push: build for today. As long as hotels #1 and 2 were investing in technology that their guests needed right away (rather than at some ill-defined point in the future), they probably came out OK. On the other hand, if a hotel were putting in technology just because, some day, it might be needed, it probably saw that technology superseded before it could bring in any return.

If this moral seems simple and obvious when applied to hotels, then why do architects ignore it sometimes when designing information systems for big enterprise and government? When we sell them on something like WS-* (or a REST-based data architecture), what criteria do we use to figure out whether we’re building for today, or for a tomorrow that may never come?

Posted in General | Tagged | 3 Comments

GET requests and "wings fall off" buttons

Bill de hÓra is outraged that people are blaming Google Web Accelerator (GWA) for following HTTP GET links, rather than blaming the morons^H^H^H^H^H^Hweb developers who built web sites that use innocent-looking GET requests for actions with side effects, like (say) delete or launch missile attack.

I don’t know if GWA itself is useless hype, an evil conspiracy, or a good thing (I suspect some combination of the first two), but Bill’s right that the assumption that it’s always safe to follow a GET link is one of the basic pillars of the web. Initiating a potentially dangerous action in response to a GET request is on the same level as putting a “wings fall off” button on the arm of an airliner seat — sure, we’d prefer that the passenger not hit the button, but why is the button there in the first place?
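
In servlet terms, the convention Bill is defending looks something like this hypothetical sketch: reads happen on GET, and anything with side effects is only reachable through POST (the paths, parameters, and persistence call are all invented):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical item-management servlet: GET only reads, while the one
// action with side effects (delete) requires an explicit POST.
public class ItemServlet extends HttpServlet {

    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // Safe: render the item; no state changes, so prefetchers and
        // accelerators can follow this link freely.
        // (Real code would also escape id before echoing it.)
        String id = request.getParameter("id");
        response.setContentType("text/html");
        response.getWriter().println("<p>Item " + id + "</p>");
    }

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // Unsafe action: only reachable through a deliberate form submission.
        String id = request.getParameter("id");
        deleteItem(id); // hypothetical persistence call
        response.sendRedirect("items");
    }

    private void deleteItem(String id) {
        // ... delete from the data store ...
    }
}

With that split in place, a prefetching proxy like GWA can follow every link on a page without deleting anything.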

Tagged | 2 Comments

Sputtering down to XML 2005

My creaky little Piper Warrior has been grounded since a lightning strike (while tied down on the apron) back in July, but the engine’s finally back from overhaul, and I plan to be in the air soon — just in time, in fact, to sputter my way down from Ottawa to Atlanta to speak at the XML 2005 conference. I’m planning a 7-8 hour flight down if weather permits, with stops in Watertown NY (to clear customs) and in either Pittsburgh PA or somewhere in West Virginia to refuel. I flew myself to XML 2003 in Philadelphia as well, but that was a much shorter (and non-stop) flight.

Is anyone else flying to the conference in a small plane? Perhaps we can set up an informal general aviation BOF. I’m looking forward to seeing you all there — even the non-pilots, of course.

Posted in General | Comments Off on Sputtering down to XML 2005