Work at home: finally time to read those papers

Tags: phd

Because of Floris I have to spend more time at home instead of at work. But, OK, I've got loads of papers and other documents to read. So I made sure I had a heavy stack of them at home before Floris was born. Below are some of the things I picked up from the latest batch of papers and blog posts.

Vocabulary design

James Tauber, more on xml and rdf. He believes that when you design a vocabulary you need three things:

  • an rdf schema (owl or so)
  • a schema for the xml format (relaxNG for instance)
  • a defined mapping between the two

Sounds OK to me. For some things a well-defined xml format is handy, for other things you want rdf. This is a hint I'm probably going to use.
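
To make the third ingredient (the defined mapping) a bit more concrete, here is a minimal Python sketch. It is my own illustration, not something from Tauber's post: the namespace, the `paper`/`title`/`author` element names and the `ELEMENT_TO_PROPERTY` table are all made up, and a real vocabulary would have the rdf schema and the relaxNG schema next to it.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, invented for this illustration.
EX = "http://example.org/vocab#"

# The "defined mapping": xml child element name -> rdf property URI.
ELEMENT_TO_PROPERTY = {
    "title": EX + "title",
    "author": EX + "author",
}


def xml_to_triples(xml_text):
    """Turn <paper id="..."> elements into (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for paper in root.iter("paper"):
        subject = EX + paper.get("id")
        for child in paper:
            prop = ELEMENT_TO_PROPERTY.get(child.tag)
            if prop is not None:  # elements without a mapping are skipped
                triples.append((subject, prop, child.text))
    return triples


SAMPLE = """<papers>
  <paper id="p1">
    <title>Example title</title>
    <author>Example author</author>
    <note>not mapped</note>
  </paper>
</papers>"""

triples = xml_to_triples(SAMPLE)
for triple in triples:
    print(triple)
```

The point of keeping the mapping in one explicit table is that the xml format and the rdf schema can each change on their own, as long as the table is kept up to date.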

E-procurement in the construction sector

Alexander van Lomwel sent me a paper by Enrico Cagno and colleagues about e-procurement in the Italian engineering and contracting sector. The interesting things I got out of the article:

  • 10-20% of the construction goods bought in Italy are plain standard stuff; most of the rest is tailor-made. (Note: this is for "commissioning companies", a term that isn't completely clear to me; it sounds like the firms that hand out the construction jobs.)
  • E-procurement concentrates on that 10-20%. The easy part.
  • The rest (80-90%) is unmined ground. There is serious money to be made on the transactions that are harder than "give me 2000 bricks", for instance when you want doors of specific sizes in a specific kind of wood.

OSI network layers and the semantic web

Sergey Melnik and Stefan Decker have a paper, "A Layered Approach to Information Modeling and Interoperability on the Web", which has a good analogy.

There is the OSI network layer model, which we all use without knowing it. The internet is built on TCP/IP. OK, from the back of my memory: IP is in layer 3 and TCP is in layer 4 (correction: my brother Maurits told me I had originally swapped 3 and 4). IP transports TCP packets. All the internet applications talk layer 3 or 4. But they don't need to worry about the lower levels 1 and 2: those are handled by the hardware and, more importantly, by the level 3 protocol: IP!

They draw the analogy to semantic web-like systems. The lowest layer is the syntax layer, then the object layer, semantic layer and the application layer.

  • Syntax (like xml) deals with all the pesky details like character encoding, file format, etc.
  • The object layer is what you get when you parse the xml (or rdf) file. For rdf you get a graph (a set of triples). For xml you get something like a DOM tree.
  • Semantic layer? The object layer stuff is interpreted. You draw conclusions from the rdf info and add information, for instance.
  • Finally, the application does something with it all.
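
The four layers above can be sketched as four small Python functions that only talk to their direct neighbour. This is my own toy illustration under loud assumptions, not code from the Melnik and Decker paper: the `<triple s= p= o=>` xml format, the class names and the single inference rule (transitive `subClassOf`) are all invented.

```python
import xml.etree.ElementTree as ET


def syntax_layer(raw_bytes):
    """Syntax: bytes in, parsed xml out (the parser deals with the encoding)."""
    return ET.fromstring(raw_bytes)


def object_layer(root):
    """Object: turn the parsed tree into a set of triples."""
    return {(e.get("s"), e.get("p"), e.get("o")) for e in root.iter("triple")}


def semantic_layer(triples):
    """Semantic: add inferred triples, here only transitive 'subClassOf'."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub = [(s, o) for s, p, o in inferred if p == "subClassOf"]
        for a, b in sub:
            for c, d in sub:
                if b == c and (a, "subClassOf", d) not in inferred:
                    inferred.add((a, "subClassOf", d))
                    changed = True
    return inferred


def application_layer(triples, cls):
    """Application: answer a question using the interpreted data."""
    return sorted(o for s, p, o in triples if s == cls and p == "subClassOf")


RAW = b"""<?xml version="1.0" encoding="utf-8"?>
<graph>
  <triple s="OakDoor" p="subClassOf" o="Door"/>
  <triple s="Door" p="subClassOf" o="BuildingPart"/>
</graph>"""

result = application_layer(
    semantic_layer(object_layer(syntax_layer(RAW))), "OakDoor"
)
print(result)
```

Swapping the xml syntax for another format would only touch `syntax_layer` and `object_layer`; the inference and the application code stay as they are.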

The good thing is that every layer performs a specific function. Adding info. Calculating something. Reading bytes. This way you can swap one part for another. Or at least you've got a chance to do it.

Ontology-driven information systems

Nicola Guarino has a paper about formal ontology and information systems (pdf). A good thing in that paper is the emphasis on ontology-driven information system development: keeping the structural data out of the application and in an ontology. That's the way to do it. It makes the application way more maintainable.

Another thing I got out of the paper is about the cooperation between a number of ontologies. According to the paper, it is reasonable to expect unified top-level ontologies for large communities of users. Something generic on top, I can agree to that. On the second level you'll see domain ontologies, task ontologies, etc. On the third level you'll see the actual application ontologies. They will mix and match objects and properties from the second level ontologies.

Ok, that's a pretty clean subdivision. Especially on level two. And explicitly combining what you need on level three... I'll see whether that holds up to practice. It smells good.
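
A rough Python sketch of how I picture that three-level split. Everything here is hypothetical: the term names, the two second-level ontologies and the `build_application_ontology` helper are my own invention, not something from Guarino's paper, and a real ontology is of course much more than a dictionary.

```python
# Level one: a (tiny) shared top-level ontology. All names are hypothetical.
TOP_LEVEL = {"Object", "Process"}

# Level two: domain and task ontologies; every term points to a top-level parent.
DOMAIN_CONSTRUCTION = {"Door": "Object", "Brick": "Object"}
TASK_PROCUREMENT = {"Order": "Process", "Delivery": "Process"}


def build_application_ontology(*picks):
    """Level three: combine selected terms from second-level ontologies.

    Each pick is a (ontology, [term, ...]) pair. Unknown terms raise KeyError.
    """
    combined = {}
    for ontology, terms in picks:
        for term in terms:
            parent = ontology[term]
            if parent not in TOP_LEVEL:
                raise ValueError(f"{term} is not anchored in the top level")
            combined[term] = parent
    return combined


app = build_application_ontology(
    (DOMAIN_CONSTRUCTION, ["Door"]),
    (TASK_PROCUREMENT, ["Order"]),
)
print(app)
```

The application only names what it needs from level two, which is the "explicitly combining" part.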

Documents on the wire

Jon Udell wrote about web services' human touch (apparently behind some password now...). What I got out of it was to see most of what you send over the wire as documents. For me, it's easy to think in UML and modeling terms. Remembering that it should, in the end, be considered a document helps me keep in mind what that document should be about. And that it should be moderately human-consumable. At least human-understandable :-)

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.
