
Using XML and the Internet: eConstruct

This section analyses the EU research project `eConstruct' [7], which used the Internet and XML as central project ingredients and therefore provides a good way to get BC input on those technologies. The section starts with an introduction to the project, followed by the central vocabulary: the LexiCon. Then the bcXML data format is analysed, followed by the web services. The section closes with a discussion of the conclusions that can be drawn from eConstruct's experience.

This section represents a large portion of the work that was done in the first two years of this research. It can be considered a case study that evaluates the first part of this chapter.

Introduction

The aim of the eConstruct project was to develop a vocabulary called bcXML that would `know' what is meant by user terminology. A computer would therefore understand what is meant by terms such as `wall', `floor', or `roof' in conjunction with concepts such as `cleaning' and `painting', making web-based communication and interrogation possible. In addition, concepts used in bcXML like `door' and `height' would be available in, and translatable into, different national languages and Classification and coding systems.

Initially bcXML only provided a limited set of terms, but these were capable of supporting real end-user cases. The first applications using bcXML focused on the `shopping phase' of eCommerce, allowing such questions as `Where can I buy a new boiler and what will it cost me?' to be answered. The intention was that the second generation of bcXML applications would be more influential and would concentrate more on the underlying goal of the project: to help increase the competitiveness of the European BC industry by supporting business-to-business transactions.

eConstruct as a two-year project did not itself increase the competitiveness of Europe's BC, but it showed the way. The final demonstration was convincing in the sense that a full scenario was played through: existing catalogs were converted to a Semantics-based model and stored in an on-line catalog server; the Taxonomy server allowed multi-lingual browsing and searching using the Semantics (see figure 4.4); an IFC building model could, using the Taxonomy server's Semantics, query a catalog for matching items and insert the matching item (a door in this case) back into the IFC model [74].

Figure 4.4: eConstruct's Taxonomy server, a web application made by the author. It shows part of the inheritance hierarchy, it links to versions in different languages and it allows you to enter values in order to search in catalogs (a possibility provided by a separate catalog server).

eConstruct results have seen use in other projects, the most recent being an EU project for an electronic marketplace for equipment with the aim of allowing collaborative use of equipment in order to prevent (costly) downtime [75].

This section analyses the eConstruct development and its results, to provide feedback on the Internet and XML technologies used. First, the vocabulary is discussed, including the XML format behind it; second, the bcXML data format that mimics RDF; third, the web services used.

Vocabulary: the LexiCon taxonomy

From the start of the eConstruct project, STABU's ISO 12006-3 development `LexiCon', called a Taxonomy, was intended as the basis for communication. The most attractive features were: (1) the rigorous formality in specifying objects and properties, all with unique IDs, in the PDT tradition that sets it apart from the earlier Classification efforts; (2) the built-in multilinguality, attractive for a European research project with partners from six countries.

LexiCon contents were made available to the various parts of the eConstruct system by means of the bcXML Taxonomy format [76], after a conversion from LexiCon's own XML export format. A separate format was used because of a slightly different emphasis, but also because the non-LexiCon researchers wanted to research XML's possibilities more deeply. The bcXML Taxonomy format was also more explicitly focused on later use of the vocabulary by actual data. Because both were XML formats, tool support was excellent, which helped in the implementation.

There are two versions of the bcXML Taxonomy format. The first version was quite elaborate and kept growing in size because of cross-breeding with the then-developing ifcXML format [77]. Another reason for the size was that it catered for both Taxonomy needs and data (catalogs, project information) needs. The second version was a much simpler model: complexity was halved. It focused only on the Taxonomy level (the data format is analysed in the next section). Halfway through the project, it replaced the initial elaborate version [47]. This decision helped the project along, as development of the prototype started to gain steam immediately. For an overview of the model, see figure 4.6. This simple Taxonomy format was used a couple of times by other projects after the end of the eConstruct project [75] [78]; it was used most directly in the EU projects e-cognos [79] [80] and funsiec [81]. A code example of the bcXML Taxonomy format can be seen in figure 4.5.

Figure 4.5: bcXML Taxonomy XML code example.
Figure 4.6: UML diagram of the most recent version of the bcXML Taxonomy format. It concentrates on objects with properties and on the possibility to translate everything.
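The kind of model that figure 4.6 describes (objects with properties, everything translatable, single inheritance) can be approximated in a few lines of Python. This is only a sketch under assumptions: the class names, the field names, the IDs and the `opening' supertype are hypothetical and are not taken from the bcXML Taxonomy format itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Property:
    id: str
    names: Dict[str, str]            # language code -> translated name
    unit: Optional[str] = None


@dataclass
class TaxonomyObject:
    id: str
    names: Dict[str, str]            # language code -> translated name
    supertype: Optional["TaxonomyObject"] = None   # single inheritance
    properties: List[Property] = field(default_factory=list)

    def all_properties(self):
        """Own properties plus those inherited along the supertype chain."""
        inherited = self.supertype.all_properties() if self.supertype else []
        return inherited + self.properties


# Hypothetical content: IDs and the 'opening' supertype are made up.
height = Property("p1", {"en": "height", "nl": "hoogte"}, unit="m")
width = Property("p2", {"en": "width", "nl": "breedte"}, unit="m")
opening = TaxonomyObject("o1", {"en": "opening"}, properties=[height])
door = TaxonomyObject("o2", {"en": "door", "nl": "deur"},
                      supertype=opening, properties=[width])

print([p.names["en"] for p in door.all_properties()])
```

Because every object and property carries its own ID and a table of names, an application can look up a label in the language it needs without touching the structure.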

Adding Taxonomy data to the LexiCon did not proceed swiftly: in the end, effectively only data on doors was available for use in the prototype. The `LexiCon Explorer' tool (see figure 3.6) was a stand-alone application, so collaborative work was not possible, which made it a bottleneck. On a more fundamental level, it turned out to be difficult to integrate the data needs of the practical prototype with the careful creation of one big single-inheritance Taxonomy spanning BC as a whole.

The multi-linguality provided much added value to the final demonstration: showing the same content in multiple applications in multiple languages (English, Greek, Dutch, German, French, Norwegian) makes for an attractive demonstration. For the eConstruct system, the LexiCon's multi-linguality was a real asset.

The national differences, though, also caused difficulties. Regulations and norms differ between countries: fire resistance in the Netherlands is measured in minutes according to some norm (fire resistance = 30 min), whereas fire resistance in Great Britain is expressed as a textual class (fire resistance = `class 30 minutes'). This means that the property fire resistance can be measured in at least two ways, if it is not an entirely different property altogether. As a second example, the first floor of a building is at ground level in some countries (the United States, for example), whereas in the Netherlands it is one floor up. And so on.

The idea of using a separate vocabulary proved advantageous, and the multi-linguality was also a plus. The national differences, however, caused difficulties. Using XML for LexiCon's export and for bcXML's Taxonomy format worked well and allowed simple implementation.

Data format: bcXML

eConstruct's system used two formats: one for the Taxonomy, described above, and one for the actual data, like catalogs. The data format was designed to be a simple, readable format, called bcXML (the Taxonomy format is called bcXML Taxonomy format).

Figure 4.7: Example of bcXML data: a door with height = 2.10m and width = 0.80m, followed by the relevant part of the bcXML Taxonomy on which the bcXML data example is based.

bcXML was a direct application of a result of the author's Master's thesis [5], where the idea of having a separate Taxonomy and a separate data format generated from that Taxonomy was pioneered [6]. The acceptance of a simple, readable data format had the effect of speeding up the eConstruct project's prototype development. For catalog data (the main use case for eConstruct), bcXML proved good enough [78].

A conversion generated the data format from the Taxonomy data. Suppose the Taxonomy (using the bcXML Taxonomy format) has an object named `door' with a property named `fire-resistance' measured in unit `meter'. From this information, a data format is generated that allows <Door> elements with the properties as subelements, like <fire-resistance>, and with the units added as attributes, like <fire-resistance unit="m" value="...">. Having a separate Taxonomy format and data format is elegant and allows for simple, focused formats. The data format can be seen in figure 4.7.
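The generation step can be illustrated with a short Python sketch. It is not eConstruct's actual conversion: the dictionary layout of the Taxonomy and the function name `make_data_element` are assumptions made for illustration. Only the shape of the result (an object element whose property subelements carry `unit` and `value` attributes) follows the description above, with the door from figure 4.7 as input.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory Taxonomy: object name -> {property name: unit}.
# The real bcXML Taxonomy format is richer (IDs, translations, inheritance).
TAXONOMY = {
    "Door": {"height": "m", "width": "m"},
}


def make_data_element(taxonomy, object_name, values):
    """Build a data element for one object, allowing only Taxonomy properties."""
    allowed = taxonomy[object_name]
    element = ET.Element(object_name)
    for prop, value in values.items():
        if prop not in allowed:
            raise ValueError(f"{prop!r} is not a property of {object_name!r}")
        # The unit comes from the Taxonomy, the value from the data.
        ET.SubElement(element, prop, unit=allowed[prop], value=str(value))
    return element


door = make_data_element(TAXONOMY, "Door", {"height": "2.10", "width": "0.80"})
print(ET.tostring(door, encoding="unicode"))
```

The point of the design is that the data side never defines units or property names itself; it can only use what the Taxonomy offers.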

On a technical level, data formats from different Taxonomies were given their own namespace, so that different Taxonomies with objects with the same ID would not overlap on the data level. In practice, only one Taxonomy was used, though simple additional Taxonomies to assist visualisation were attempted on a prototype level. The functionality to combine data from various Taxonomies in one data file was, however, not fully developed and was only used in a small visualisation prototype (see the 2D and 3D examples in figure 4.8); this point should receive attention in further efforts.
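How namespaces keep same-named objects from different Taxonomies apart can be shown with Python's standard XML library. The two namespace URIs below are hypothetical placeholders, not the ones used in eConstruct.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, one per Taxonomy.
LEXICON_NS = "http://example.org/taxonomy/lexicon"
VISUAL_NS = "http://example.org/taxonomy/visualisation"

# Both Taxonomies define an object that ends up with the same local name.
catalog = ET.Element("catalog")
ET.SubElement(catalog, f"{{{LEXICON_NS}}}Door")
ET.SubElement(catalog, f"{{{VISUAL_NS}}}Door")

# The qualified names differ, so the two elements can be told apart.
lexicon_doors = catalog.findall(f"{{{LEXICON_NS}}}Door")
visual_doors = catalog.findall(f"{{{VISUAL_NS}}}Door")
print(len(lexicon_doors), len(visual_doors))
```

Each lookup finds exactly one element, although both elements are called `Door` locally.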

Within the eConstruct project, a variety of bcXML visualisations were explored [50]; see figure 4.8 for examples. Most prototype applications used the Taxonomy directly to query for the name (and the description) in the correct language. Another development created simple visualising (XSLT) stylesheets from the Taxonomy (figure 4.9). These stylesheets could then be used for textual or visual formatting of bcXML data, using standard XML tools.

Figure 4.8: bcXML visualisation: multilingual text, SVG 2D and VRML 3D. The text and 2D images are from Joost Fleuren.
Figure 4.9: bcXML stylesheet-based visualisation architecture. The Taxonomy holds all the translation and description information for objects and properties. The catalogs and other bcXML data files conform to the bcXML schema that has been generated from that Taxonomy. So a visualiser XSLT that knows how to handle bcXML data files and how to transform them into HTML can be generated from the same Taxonomy.
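The idea behind figure 4.9 (translated labels come from the Taxonomy, values come from the data file) can be sketched without XSLT. The Python function below is a plain stand-in for a generated stylesheet, not the eConstruct implementation; the label table, the data layout and the function name `render_html` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical translation table, standing in for the Taxonomy's
# multilingual names: element name -> {language code: label}.
LABELS = {
    "Door": {"en": "door", "nl": "deur"},
    "height": {"en": "height", "nl": "hoogte"},
    "width": {"en": "width", "nl": "breedte"},
}

# Assumed data layout: property subelements with unit and value attributes.
BCXML_DATA = (
    '<Door><height unit="m" value="2.10"/>'
    '<width unit="m" value="0.80"/></Door>'
)


def render_html(data_xml, labels, language):
    """Turn one data element into an HTML list, using translated labels."""
    obj = ET.fromstring(data_xml)
    rows = [
        f"<li>{escape(labels[prop.tag][language])}: "
        f"{escape(prop.get('value'))} {escape(prop.get('unit'))}</li>"
        for prop in obj
    ]
    title = escape(labels[obj.tag][language])
    return f"<h1>{title}</h1><ul>{''.join(rows)}</ul>"


html_nl = render_html(BCXML_DATA, LABELS, "nl")
html_en = render_html(BCXML_DATA, LABELS, "en")
print(html_nl)
```

The same data file yields a Dutch or an English page depending only on the language argument, which is the effect the multilingual demonstrations relied on.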

Overall, the simplicity of the data format was a plus. The format is human-readable, which is valuable for one of the first implementations using such a new technology. In hindsight, the mechanism used resembles a simplified RDF notation, which also yields readable element names. What did not work well was the mixing of various Taxonomies, something that was not really researched. For this, a look at RDF could prove advantageous, as RDF has built-in formal means to mix Taxonomies (Ontologies).

With the bcXML Taxonomy and the bcXML data format now described, attention in the following subsection goes to the web services that used them.

bcXML web services

In eConstruct, two web services were created: the Taxonomy server and the catalog server. The advertised way to connect to the servers was by SOAP.

The catalog server allowed you to retrieve the list of catalogs and individual catalogs. You could also search for matching items in the catalogs by sending a bcXML-formatted query, and new catalogs could be uploaded.

An interesting original idea for the catalog servers, which was not implemented due to time constraints, was the ability to tier the servers. `Higher level' catalog servers could aggregate multiple `lower level' catalogs or catalog servers, to provide a single entry point for catalog queries. This would enable individual bigger companies to aggregate their preferred suppliers in one company-internal catalog server, so that it would be easy to order from those preferred suppliers, probably including a pre-calculated lower price because of the higher volumes. Alternatively, aggregators could provide an additional service by doing some quality filtering on the catalogs they aggregate, removing untrustworthy suppliers.
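Since tiering was never implemented, the following Python sketch only illustrates the concept. All class names, method names and catalog contents are hypothetical. A higher-level server offers the same `search` interface as a leaf catalog and simply forwards the query to its sources.

```python
class Catalog:
    """A leaf catalog: a supplier name plus a list of items (plain dicts)."""

    def __init__(self, supplier, items):
        self.supplier = supplier
        self.items = items

    def search(self, **criteria):
        """Return items whose properties equal all given criteria."""
        return [
            dict(item, supplier=self.supplier)
            for item in self.items
            if all(item.get(key) == value for key, value in criteria.items())
        ]


class AggregatingServer:
    """A higher-level server that forwards a query to all of its sources."""

    def __init__(self, sources):
        self.sources = sources  # catalogs or other aggregating servers

    def search(self, **criteria):
        results = []
        for source in self.sources:
            results.extend(source.search(**criteria))
        return results


supplier_a = Catalog("A", [{"object": "Door", "width": "0.80"}])
supplier_b = Catalog("B", [{"object": "Door", "width": "0.90"},
                           {"object": "Door", "width": "0.80"}])
regional = AggregatingServer([supplier_b])
company = AggregatingServer([supplier_a, regional])

matches = company.search(object="Door", width="0.80")
print([m["supplier"] for m in matches])
```

Because leaf and aggregator share one interface, tiers can be stacked to any depth, and a quality filter could be added by leaving a supplier out of the list of sources.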

The Taxonomy server allowed textual search (in the various languages) for matching items. Individual Taxonomy items with all their information could be retrieved. The same actions could be done via HTTP using the HTML interface, with the HTML being automatically generated from the Taxonomy server's XML results (using XSLT), so it was natural to also provide the possibility to download these XML results directly. This direct HTTP method proved much simpler than SOAP and was therefore used a lot for testing. In fact, the SOAP interface was a small wrapper around the HTTP interface.
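What it means for the SOAP interface to be a small wrapper can be sketched as follows. The result elements `searchResult` and `item` are hypothetical stand-ins for the Taxonomy server's real XML; the envelope namespace is the standard SOAP 1.1 one.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1


def plain_xml_result(term):
    """Stand-in for the XML that a direct HTTP request would return."""
    result = ET.Element("searchResult", query=term)
    ET.SubElement(result, "item", name="door")
    return result


def soap_result(term):
    """Wrap the very same XML result in a SOAP envelope and body."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    body.append(plain_xml_result(term))
    return envelope


plain = plain_xml_result("door")
wrapped = soap_result("door")

# Unwrapping the SOAP body gives back exactly the plain XML result.
payload = wrapped.find(f"{{{SOAP_ENV_NS}}}Body")[0]
same_payload = ET.tostring(payload) == ET.tostring(plain)
print(same_payload)
```

In this sketch the envelope adds no information of its own, which is consistent with the observation that the direct HTTP method was the simpler one to test against.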

In the final project demo, both web services were used from within various programs, running in various locations. Each of them was able to use the same data, which made for a convincing demo. Having data available in centrally accessible locations was a plus: it seemed to offer a definite advantage over the otherwise-needed distribution of data and the subsequent problems of keeping it up to date. The use of SOAP instead of HTTP+XML is more questionable. SOAP did not offer anything extra, as exemplified by the Taxonomy server. SOAP did, however, necessitate the installation of extra software, and it opened up some potential security holes.

Evaluation

Looking at the eConstruct experience, some conclusions seem warranted. Using a separate vocabulary (Taxonomy, Ontology) is advantageous. Multi-linguality is attractive, though national differences make the creation of an international vocabulary difficult. Building one big single-inheritance Taxonomy of all construction terms seems an impossible task when looking at eConstruct: the number of terms produced after two years was disappointing. A context mechanism (for instance a country-dependent definition) seems the most reasonable approach.
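A country-dependent definition could, for instance, look like the Python sketch below. This is an assumption about what such a context mechanism might be, not something eConstruct built; the table layout and the function name `validate` are hypothetical. The two entries mirror the fire-resistance example from the vocabulary discussion earlier in this section.

```python
# Hypothetical context table: one property, defined per country context.
FIRE_RESISTANCE = {
    "NL": {"kind": "quantity", "unit": "min"},
    "GB": {"kind": "class", "allowed": ["class 30 minutes"]},
}


def validate(definitions, country, value):
    """Check a value against the definition valid in the given country."""
    definition = definitions[country]
    if definition["kind"] == "quantity":
        # A quantity must be a non-negative number, in the context's unit.
        return isinstance(value, (int, float)) and value >= 0
    # A class must be one of the textual classes allowed in that context.
    return value in definition["allowed"]


print(validate(FIRE_RESISTANCE, "NL", 30))
print(validate(FIRE_RESISTANCE, "GB", "class 30 minutes"))
print(validate(FIRE_RESISTANCE, "GB", 30))
```

The property keeps a single identity in the vocabulary, while its measurable form is chosen by context.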

Using XML worked well and allowed implementation using generally available means. Simple, focused formats helped implementation greatly; eConstruct's success comes mostly from the rapid implementation that was possible (and that was done!) after the choice for simple formats was made.

For future efforts, a solution that can comfortably use multiple Taxonomies at the same time is advisable, as that was missing in eConstruct's system.

eConstruct's experience with web services was positive: the eConstruct-provided data was always up to date and accessible from multiple applications. In the eConstruct system, SOAP did not offer anything extra over HTTP+XML, so the latter alternative seems better suited, as it is simpler.

Reinout van Rees 2006-12-13