Some comments on Edd Dumbill’s “the state of xml” (updated)

Tags: aec

Edd Dumbill delivered a talk at a recent XML conference in which he summed up the state of XML. This seemed like a nice opportunity to put some miscellaneous observations I've made lately in writing. The subdivision follows the sections of the article.

Standards development broadened from the W3C

Standards developed by the World Wide Web Consortium (W3C) are seen by many as the real, definitive standards. Not all of them are equally successful, but most are. Some outside specifications challenge work done by the W3C, which is a good thing in principle.

From the article: The big triumph on this front has been the RELAX NG schema language. Its simplicity and ease of authoring have proved a compelling contrast to W3C XML Schema. (It's interesting that the commitment to manual authoring of schemas is so strong that RELAX NG has a non-XML syntax for convenience.) Various W3C working groups and even Microsoft are using RELAX NG internally, even if they have to convert to W3C XML Schema for interchange later.

Yes, RELAX NG is a nice, simple schema language. A W3C XML Schema is a horrible thing to read; RELAX NG does a much better job in this regard.
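To make the readability point a bit more concrete, here is a small sketch that puts the same constraints side by side in RELAX NG's compact (non-XML) syntax and in W3C XML Schema. The wall element with its id attribute and height child is made up for illustration and is not taken from the article or from any real building-industry schema. The Python around it only counts lines and checks that the XML Schema version is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema for <wall id="..."><height>...</height></wall>,
# written in RELAX NG compact (non-XML) syntax.
RELAX_NG_COMPACT = """\
element wall {
  attribute id { xsd:integer },
  element height { xsd:decimal }
}
"""

# The same hypothetical constraints in W3C XML Schema.
W3C_XML_SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="wall">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="height" type="xs:decimal"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:integer" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def line_count(text):
    """Count the non-blank lines of a schema."""
    return sum(1 for line in text.splitlines() if line.strip())


# The XML Schema version is itself XML, so it should at least parse.
ET.fromstring(W3C_XML_SCHEMA)

print("RELAX NG compact:", line_count(RELAX_NG_COMPACT), "lines")
print("W3C XML Schema:  ", line_count(W3C_XML_SCHEMA), "lines")
```

Line counts are a crude measure, of course, but even in this tiny example the difference in noise is easy to see.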

But most documents in the building industry's ICT research happily mention only XML Schema without a look at the alternatives. XML Schema is the standard, right? Well, yes, but my suggestion is that there's a better alternative. Many smart XML guys are recommending it. And most of the libraries support it, too. At the very least, the documents should mention it in the same breath. (I'll bug the econstruction workshop, which I'm a member of, to include it in their doc.)

Update: Just when finishing this, I stumbled upon something by Sean McGrath where he's praising RELAX NG (especially the part "I've been a markup specialist every working day of my life for about 20 years and I only understand a subset of W3C XML schema. I have a pretty thick skin for complexity but I have my limits :-) Relax NG by contrast is SIMPLE."). He also points at James Clark's piece on the difference between RELAX NG and XML Schema.

Divergence of web services

Developed mostly outside of the W3C, the current deluge of higher-level web services standards seems like an example of outside development going horribly wrong. From the article: The higher-level web services specifications taken outside of the W3C are a disaster in the making, with no real underlying strategy or guarantee of longevity. It all represents much that is hateful: overblown hype, poor specification, and spiraling complexity.

The REST way of dealing with web services is much better: it uses plain HTTP (GET/PUT/POST/DELETE). Every web browser already uses HTTP's GET index.html to fetch a web page. Works every time. The web-services-gone-mad deluge of specifications, by contrast, almost tells us that it would be much better to send an XML file to the web server and to specify the function (getDocument or doc or fetchDoc or fetchDocument, different for every web server) and the parameter (pagename=index.html or id=index.html etc.) in that XML file.
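A rough sketch of that contrast in Python, using only the standard library. Nothing is actually sent over the network; the two requests are just built and printed. The host example.org, the /service endpoint and the little XML envelope are all invented for illustration. The envelope in particular is not SOAP or any real web services format, it only mimics the "function name plus parameter inside an XML file" idea.

```python
from urllib.request import Request

# Hypothetical server name, used for illustration only; nothing is sent.
BASE = "http://example.org"

# REST style: the URL names the document, the HTTP method says what to do.
rest_request = Request(BASE + "/index.html", method="GET")

# RPC-over-XML style: one endpoint, with the function name and the parameter
# wrapped in an XML body. Both names may differ from server to server.
rpc_body = (
    "<call>"
    "<function>fetchDocument</function>"
    "<parameter name='pagename'>index.html</parameter>"
    "</call>"
).encode("utf-8")
rpc_request = Request(
    BASE + "/service",
    data=rpc_body,
    headers={"Content-Type": "text/xml"},
    method="POST",
)

for request in (rest_request, rpc_request):
    print(request.get_method(), request.full_url, request.data)
```

The first request can be understood by anything that speaks HTTP. For the second one, a client first has to learn this particular server's function and parameter names.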

Eeeeeek. That way, the web would never ever have taken off. The REST proponents (there are a lot of them) state that it's better to map databases, programs, whatever onto the web's generic URL+HTTP model. So GET should get you an ifcxml file for a drawing. GET gets you an ifcxml file with all walls in that drawing. Etc. DELETE deletes the wall with id=1234. With POST you can insert data into an item.
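Here is a minimal sketch of what such a mapping could look like, with a plain dictionary standing in for the drawing database. The /walls and /walls/<id> paths, the XML fragments and the handle() function are assumptions made up for this example. They are not real ifcXML and not an existing API. There is no web server involved either: the function just takes a method and a path and returns a status code plus a body.

```python
# In-memory stand-in for a drawing database: wall id -> list of XML fragments.
# The fragments are invented placeholders, not real ifcXML.
walls = {
    "1234": ["<height>2.7</height>"],
    "5678": ["<height>3.0</height>"],
}


def render(wall_id):
    """Return one wall as a small XML document."""
    return "<wall id='%s'>%s</wall>" % (wall_id, "".join(walls[wall_id]))


def handle(method, path, body=None):
    """Map an HTTP method plus URL path onto the wall store.

    Returns a (status code, response body) tuple.
    """
    parts = [part for part in path.split("/") if part]
    if not parts or parts[0] != "walls" or len(parts) > 2:
        return 404, ""
    if len(parts) == 1:
        if method == "GET":  # all walls in the drawing
            return 200, "<walls>" + "".join(render(w) for w in walls) + "</walls>"
        return 405, ""
    wall_id = parts[1]
    if wall_id not in walls:
        return 404, ""
    if method == "GET":  # one wall
        return 200, render(wall_id)
    if method == "POST":  # insert data into an existing wall
        if body is None:
            return 400, ""
        walls[wall_id].append(body)
        return 200, render(wall_id)
    if method == "DELETE":  # remove the wall
        del walls[wall_id]
        return 204, ""
    return 405, ""


print(handle("GET", "/walls/1234"))
print(handle("POST", "/walls/1234", "<material>brick</material>"))
print(handle("DELETE", "/walls/1234"))
print(handle("GET", "/walls/1234"))  # the wall is gone now, so this is a 404
```

The point of the sketch is that the only things a client needs to know are the URL layout and the document format. The verbs are the same four that every HTTP library already has.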

To me, that's the way forward. Explain how you've mapped out your URLs, explain what kind of document I get when I call that URL, explain what I have to send to put some new data in. Definitely better than explaining an API. Especially when that API is different from the other project's API. And the third project's API. And why my program doesn't yet support two out of three of those APIs.

Note: I'm in the last year of my PhD and my professor is a bit worried that I'll give away too much of my thinkwork so that there's nothing really new left by the time the dissertation's finished. So, when you're really going to do something with this information, please provide a pointer to me or reference an "upcoming dissertation by RR". I've also got half a paper (ECPPM 2002) on it. I get the idea that everybody is mentioning/using SOAP/WSDL/WS-* and that not many in this building ICT field are looking at REST. So in that sense there's something newish in pushing this from my point of view.

XML's great strengths endure

This section contains something I need to do some further thinking on. At one point it lauds some project's return to nice, human-readable XML. Then Edd makes the following remark: As an RDF fan, the realization of this truth causes me some pain. The way out is to stop thinking of RDF as an XML application, and look to easier syntaxes such as Turtle and N3.

The fact is that RDF's XML syntax can be horrible to read. Perhaps we're better off using the more plain-text-like formats (Turtle, N3) for it instead of sticking to XML (which doesn't completely fit).
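As a small illustration, here is one and the same statement ("the talk was created by Edd Dumbill") written in RDF/XML and in Turtle. The http://example.org/talk URL is invented for the example. The tiny triples() helper is my own sketch: it only handles this one simple pattern with the standard library's XML parser and is nowhere near a real RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# One statement in RDF's XML syntax (the subject URL is hypothetical).
RDF_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/talk">
    <dc:creator>Edd Dumbill</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

# The same statement in Turtle.
TURTLE = """\
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<http://example.org/talk> dc:creator "Edd Dumbill" .
"""


def triples(rdf_xml):
    """Yield (subject, predicate, literal) for simple rdf:Description blocks.

    Handles only the plain-literal pattern used above, not full RDF/XML.
    """
    root = ET.fromstring(rdf_xml)
    for description in root.iter("{%s}Description" % RDF_NS):
        subject = description.get("{%s}about" % RDF_NS)
        for prop in description:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            yield subject, namespace + local, prop.text


for triple in triples(RDF_XML):
    print(triple)
print(TURTLE)
```

The Turtle version is more or less the triple itself, written down. The XML version needs a wrapper element, two namespace declarations and a nested element to say the same thing.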

Doing something non-XML like sure feels like a sin nowadays. More thinkwork.

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.
