Chapter 5. Initiatives which influenced ceXML


People are stupid and will believe anything because either they want to believe it or are afraid it might be true - wizard's first rule

Table of Contents
Electronic Business XML (ebXML)
Global Engineering Networking (GEN)

The purpose of this chapter is to provide a basis for the design and implementation described in Chapter 6 and Chapter 7. My design and implementation are influenced mainly by three initiatives, which serve as their underlying basis.

This chapter introduces ebXML (electronic business XML), GEN (Global Engineering Networking) and the LexiCon (a set of programs made by STABU). These are initiatives which provide useful ideas on how to design and use a vocabulary. Each initiative is introduced. Some interesting aspects will be discussed at a deeper level, and at the end of each part the influence on ceXML is summed up. The chapter concludes with a summary of all the influences distilled from the initiatives, to serve as an input to Chapter 6.

Electronic Business XML (ebXML)

ebXML (Electronic Business XML) was started by the United Nations, which earlier developed the UN/EDIFACT standard for EDI. It is an initiative in which many important and acknowledged global players co-operate. This results in work of high quality, fuelled by extensive experience. In this section, the interesting parts of ebXML are introduced, resulting in a list of ideas that will be adopted in ceXML.

Introduction on ebXML

The mission of ebXML is to provide an open XML-based infrastructure enabling the global use of electronic business information in an interoperable, secure and consistent manner by all parties []. The project was jointly initiated by the UN (United Nations) and OASIS (Organization for the Advancement of Structured Information Standards) and aims to produce a framework for sending and receiving electronic business information within an 18-month time frame (the planned end date is mid 2001). This framework includes XML standards, protocols and software.

ebXML's context mechanism

The collective experience available in ebXML is huge. Most of the large companies that either work with or provide frameworks for electronic business are present. Current systems mostly deal with information in one of two ways:

  • You make a model that allows you to describe precisely all the information needed for one specific industry. The disadvantage is that such a system stops working well the moment you have to exchange information with another industry. No industry stands on its own: a steel mill needs to order pencils and needs to deal with an accounting system for payments. Neither of these fields is covered by the industry-specific information model (which deals with steel).

  • You make a generic model that is applicable to a large number of industries in order to provide a solve-all solution. But this model will not be able to capture the level of detail needed to exchange information regarding e.g. the steel fabrication process.

ebXML acknowledges that neither the industry-specific initiatives nor the solve-all generic approaches work. Initial research pointed out the need for a different approach. As a solution to the above problems, it was decided to work with the concept of contexts, explained below.

A context might be a geographical one (USA, Europe, Germany), it might indicate an industry (insurance, building, automobile), etc. Using the contexts, a basic set of core components provided by ebXML can be extended to facilitate speaking about e.g. a [building project] in [Europe] using the [Dutch] classification system. This mechanism allows for a notion of inheritance. A core component can be viewed as a tag (like <address>), but it is more a concept of its own. It can be used (when properly extended through the context mechanism) to hold street-city-state-type addresses in the US in one context, internet addresses in another context, etcetera. Phrased in technical terms: the context mechanism modifies and extends an initial small, very generic model[1] in such a way that it is suited for holding information in that specific context.

For example, ebXML has given one of its core components the name party (party as in "this contractor is one of the parties involved in building this bridge"). The tag party always contains information on the identity of the party and how to contact the party. This is achieved through a name and an address tag. For some situations this might be enough, but party can be extended to include tags like street and state in the USA, while in the Canadian context province will be used instead of state. As a side note, the context is not supposed to be hierarchical. In this case that means that first selecting the context USA and then insurance should give the same result as first selecting insurance and then USA: both should mean "I'm talking about insurance in the USA setting".
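The party example can be sketched in a few lines of Python. This is only an illustration of the idea, not ebXML's actual rule format: the names CORE, CONTEXT_RULES and apply_contexts, and the tag policyNumber, are assumptions made for this sketch. Contexts are modelled as sets of extra tags, so that the order in which contexts are selected cannot matter.

```python
# Hypothetical core component vocabulary: every party has a name and an address.
CORE = {"party": {"name", "address"}}

# Hypothetical context rules: per context, the tags added to each component.
CONTEXT_RULES = {
    "USA": {"party": {"street", "city", "state"}},
    "Canada": {"party": {"street", "city", "province"}},
    "insurance": {"party": {"policyNumber"}},
}


def apply_contexts(core, contexts):
    """Return a copy of the core vocabulary extended for the given contexts."""
    vocabulary = {component: set(tags) for component, tags in core.items()}
    for context in contexts:
        for component, extra_tags in CONTEXT_RULES[context].items():
            # Set union is commutative, so the selection order is irrelevant.
            vocabulary.setdefault(component, set()).update(extra_tags)
    return vocabulary


usa_insurance = apply_contexts(CORE, ["USA", "insurance"])
insurance_usa = apply_contexts(CORE, ["insurance", "USA"])
print(sorted(usa_insurance["party"]))
```

Because each rule only adds tags, selecting USA and then insurance yields the same vocabulary as selecting insurance and then USA. A rule format that also removes or renames tags would need extra care to keep this property.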

Work is underway to come up with an XML format specifying these context rules. A rule consists of the actions to be taken to expand a given vocabulary so that it contains the tags needed to talk in the context the rule is valid for. Normally, the starting vocabulary will be ebXML's core component vocabulary. This vocabulary contains basic tags like address and contract. An early try-out (shown at a recent ebXML conference) demonstrated this extension of the core component vocabulary with additional terms.

Because XML Schema, the language that will be used in the near future to create a vocabulary in XML, is itself written in XML, this process could well be done using XSL/T (see the section called eXtensible Stylesheet Language/Transformation part (XSL/T) in Appendix A), an XML technology for transforming XML files. The advantage is that it can be done completely with XML technology, making it usable in a lot of circumstances.
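The point that a schema is itself an XML document, and can therefore be extended with ordinary XML tooling, can be illustrated as follows. Note that this sketch does not use XSL/T: it swaps in Python's standard xml.etree.ElementTree module purely for brevity. The schema fragment, the function extend_component and the choice of xs:string for every new tag are assumptions for illustration, not part of ebXML.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Keep the conventional "xs" prefix when the schema is written out again.
ET.register_namespace("xs", XS)

# Hypothetical core schema with a single component, party.
CORE_SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="party">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="address" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def extend_component(schema_text, component, new_tags):
    """Append one string-typed child element per new tag to a component."""
    root = ET.fromstring(schema_text)
    path = (
        f"{{{XS}}}element[@name='{component}']"
        f"/{{{XS}}}complexType/{{{XS}}}sequence"
    )
    sequence = root.find(path)
    if sequence is None:
        raise KeyError(f"component {component!r} not found in schema")
    for tag in new_tags:
        ET.SubElement(
            sequence, f"{{{XS}}}element", {"name": tag, "type": "xs:string"}
        )
    return ET.tostring(root, encoding="unicode")


usa_schema = extend_component(CORE_SCHEMA, "party", ["street", "state"])
print(usa_schema)
```

An XSL/T stylesheet could express the same extension declaratively, which would keep the whole chain inside XML technology as the text suggests.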

This mechanism is powerful, but it also means that the resulting schemas might not be stable enough: something that is generated is not set in stone. If the mechanism is well-defined and predictable, however, this disadvantage will not be a problem. For ceXML (the vocabulary which is the subject of my research), the context mechanism seems to be the right choice. Because ceXML has to accommodate different languages, classification systems and uses, as described in Chapter 3, flexibility might be the right approach.

Because no ambiguity is allowed in any core component tag, ebXML provides a strict definition for every element it creates (at the moment about 30).

Semantic level

Semantics is basically the "meaning"-part that is associated with language. Grammar is the syntax, the structure of a language. Semantics is what ties the word tree to the big brown piece of wood with green leaves (or needles).

ebXML leaves the semantics mostly to industry groups. By semantics, I mean the tags that are needed to specify, to give words to, items defined in the industry. To rephrase that, semantics are the words - the words that are needed to communicate meaningfully about the things that are relevant to an industry.

In the ebXML system, the semantics that are needed are provided from two sides:

  • ebXML provides the few tags that are needed by everybody: the core components. They are the ones used for very generic tasks, like identifying the two trading parties, their addresses, contract, signature etc.

  • Specific industries must take care of the task of creating a set of tags needed to communicate about things specific to their industry. For the building and construction industry, eConstruct should provide these tags.

You can draw an analogy to the development of a normal (human) language. There, too, you have a basic set of words, sayings and concepts. Every industry or group adds its own semantics to the pool of words. From the King James Version bible-speak in some churches to the unintelligible ramblings of computer enthusiasts and the technical phrases of a civil engineer, every group has group-specific semantics.

To make it practical for eConstruct: eConstruct should create the message, while ebXML takes care of the envelope and the transport. ebXML creates the semantics needed to talk about where to send the message and what to do with it; eConstruct has to create the semantics needed to talk about the required strength of a hollow-core slab with a span of 5 meters.
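The division of labour between envelope and message can be made concrete with a small sketch. All tag names here (envelope, header, from, to, body, hollowCoreSlab, span) and the function build_message are invented for illustration. They are taken neither from the ebXML specifications nor from eConstruct.

```python
import xml.etree.ElementTree as ET


def build_message(sender, receiver, payload):
    """Wrap an industry-specific payload in a generic envelope."""
    envelope = ET.Element("envelope")
    # Generic part: who sends what to whom (ebXML's territory).
    header = ET.SubElement(envelope, "header")
    ET.SubElement(header, "from").text = sender
    ET.SubElement(header, "to").text = receiver
    # Industry-specific part: passed through untouched (eConstruct's territory).
    body = ET.SubElement(envelope, "body")
    body.append(payload)
    return envelope


# Hypothetical building and construction payload.
slab = ET.Element("hollowCoreSlab")
ET.SubElement(slab, "span", {"unit": "m"}).text = "5"

message = build_message("contractor", "supplier", slab)
print(ET.tostring(message, encoding="unicode"))
```

The envelope code never inspects the payload, which is the property that lets the two vocabularies evolve independently.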

eConstruct has to watch ebXML's development carefully in order to be able to integrate when the specifications are stable. Obviously, ceXML cannot integrate at the moment with something that does not exist yet. It is good, however, to keep in mind that the trading and communication infrastructure will be provided by initiatives like ebXML. ceXML (and its big brother bcXML) can concentrate on the specific building and construction semantics.

Business processes

ebXML also has a technology called business processes that defines what is to be done with the data and what can be done with the data. When business processes are strictly specified, those processes also become subject to automation. As an example, you can specify what has to be done to get permission to build a house. If that is known, several steps can be automated, making it easier to deal with them. The building and construction industry has specific issues regarding the way a conversation between two possible partners proceeds, which eConstruct should integrate into the business processes framework. Instead of relying on purpose-built one-time solutions, it is good to follow ebXML's example: devise a mechanism for dealing with business processes.

Again, this is not directly feasible for ceXML because ebXML is not finished yet, but the idea is sound. Therefore, in ceXML too, transactions will be driven by data, not by programming logic.
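What "driven by data, not by programming logic" could look like is sketched below, using the building-permit example. The states, events and the names PERMIT_PROCESS and run are hypothetical. The sketch does not reflect ebXML's business process specification, only the general idea that the process lives in a table that a generic engine interprets.

```python
# Hypothetical permit process, described purely as data:
# state -> {event that is allowed in that state -> next state}
PERMIT_PROCESS = {
    "start": {"submit_application": "submitted"},
    "submitted": {
        "request_changes": "start",
        "approve": "granted",
        "reject": "rejected",
    },
    "granted": {},
    "rejected": {},
}


def run(process, events, state="start"):
    """Feed events through the process table and return the final state."""
    for event in events:
        allowed = process.get(state, {})
        if event not in allowed:
            raise ValueError(f"{event!r} is not allowed in state {state!r}")
        state = allowed[event]
    return state


print(run(PERMIT_PROCESS, ["submit_application", "approve"]))
```

Changing the procedure then means editing the table, not the engine; the same run function can interpret any process described in this form.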

Influence on ceXML

As indicated earlier, ebXML cannot be used as a finished product because it is not finished yet. Much of the work done by ebXML, however, seems extremely useful. Therefore the following parts are chosen for inclusion in the ceXML prototype.

  • The context mechanism: using contexts to adapt the vocabulary to the situation in which it is used.

  • The distinction between general addressing-like items (the core components) and the building/construction specific items.

  • The use of specified business processes to ease the processing of the XML requests and answers sent back and forth.



[1] ... or XML Schema, or DTD, depending on the way you look at the problem.