
Generic communication medium: the Internet

The two base characteristics of the Internet, as analysed here, are its generic access method (http://...) and generic addressing mechanism (vanrees.org/research/). Their analysis is followed by an analysis of the original textual nature of the web (HTML). RDF is the last technology analysed in this section: the most recent addition, which enables even richer communication.

Identifying and addressing everything: URL

Internet addresses are URLs, for example http://www.tudelft.nl and http://en.bcoweb.org/bridges/overpass. The first identifies the homepage of Delft University of Technology; the second is a definition of 'overpass' in an Ontology. http://objecttree.org/delft-hospital/1st-floor/door/42 could indicate door #42 on the first floor of a hospital in Delft, inside an object tree web application. Every one of these addresses is unique: an Internet address effectively functions as a GUID.

In practice, the address space is unlimited. There is no technical limitation on what can be uniquely identified.

When addresses are unlimited, every relevant BC object can receive its own address. This might sound like a lot of work, but for applications, which will do it most often, it is no problem: computers are very patient. With no technical restrictions, attention can be shifted entirely to the needs of BC.

There is no technical reason not to identify every single relevant object if doing it is possible. If the software employed stores the individual objects, it has a built-in mechanism to identify these objects; this can be converted to an Internet address.

There can be non-technical reasons, however. What is the needed level of detail? Not every nut and bolt is interesting; perhaps the detail level should be 'all the doors on the first floor'. But when an individual door has to be replaced, that specific door should be identifiable.

To illustrate the conceptual idea, let's take an IFC file. Normally, the IFC data as a whole would be accessed through a single STEP database. Using the Internet, http://example.org/project2 could identify the entire IFC file. http://example.org/project2/floor/2 would be a data file with every object on the second floor. http://example.org/project2/objects/door could mean all the doors in the building. http://example.org/project2/objects/door/5, lastly, could be the individual door with id 5.
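The address hierarchy of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify` and the exact path layout are hypothetical and simply mirror the four example addresses above; no existing IFC tool is implied.

```python
from urllib.parse import urlsplit


def classify(url: str) -> str:
    """Describe what an address in the example scheme identifies.

    The path layout (project / 'floor' or 'objects' / ...) is an assumption
    taken from the four example addresses, not an existing standard.
    """
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if len(parts) == 1:
        return f"entire IFC file of {parts[0]}"
    if len(parts) == 3 and parts[1] == "floor":
        return f"all objects on floor {parts[2]} of {parts[0]}"
    if len(parts) == 3 and parts[1] == "objects":
        return f"all objects of type {parts[2]} in {parts[0]}"
    if len(parts) == 4 and parts[1] == "objects":
        return f"{parts[2]} with id {parts[3]} in {parts[0]}"
    return "unknown"


print(classify("http://example.org/project2/objects/door/5"))
```

The same address thus carries both the identity of the object and its place in the project, which is what makes fine-grained access possible.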

The increase in computer-supported communication possibilities is striking when a BC industry using the above method is compared with the current BC industry, which uses, for example, a purely text-based Specification that does not allow fine-grained object identification and access.

One characteristic of the Internet is its lack of centralised organisation: it is an endpoint-to-endpoint medium [56]. In the building industry, endpoints can be either actors or documents. This built-in focus on endpoints is a good fit with the building industry, as it consists of many small companies. Every company or project can be as much an endpoint as any other.

Access method: HTTP

When you visit a web page (like http://www.tudelft.nl/), your web browser internally uses the standardised HTTP access method to request that page. With this single request method, every web page on the Internet can be retrieved.

There are four standard access methods: retrieve info, delete info, update info and add new info. These four methods are typically enough for all applications. Retrieving or deleting information is typically straightforward; inserting or updating is more involved.

HTTP basically defines four operations on URLs. Of those four, normally only GET is used. If you request a certain page in Firefox or Internet Explorer, you technically send a GET request to the web server hosting that page. Like SQL's SELECT, UPDATE, INSERT and DELETE on database rows, HTTP offers GET, PUT, POST and DELETE on URLs [57]. GET a door catalogue item, PUT a different door in your project's data.
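The correspondence between the SQL verbs and the HTTP methods can be made concrete with a small Python sketch. It only builds the requests and does not send them; the helper `build_request`, the catalogue address and the XML-like body are hypothetical and serve purely as illustration.

```python
import urllib.request

# The SQL-to-HTTP correspondence named in the text.
SQL_TO_HTTP = {
    "SELECT": "GET",
    "UPDATE": "PUT",
    "INSERT": "POST",
    "DELETE": "DELETE",
}


def build_request(sql_verb, url, body=None):
    """Build (but do not send) the HTTP request matching a SQL-style verb."""
    return urllib.request.Request(url, data=body, method=SQL_TO_HTTP[sql_verb])


# GET a door catalogue item (hypothetical address).
catalogue_item = build_request("SELECT", "http://example.org/catalogue/door/17")

# PUT a different door in the project's data (hypothetical body).
new_door = build_request(
    "UPDATE",
    "http://example.org/project2/objects/door/5",
    body=b"<door type='fire-resistant'/>",
)

print(catalogue_item.get_method(), catalogue_item.full_url)
print(new_door.get_method(), new_door.full_url)
```

Passing either request to `urllib.request.urlopen` would actually perform it, provided a server answers at that address.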

Using HTTP's four access methods for accessing BC data also adds the following benefits:

HTTP's four access methods are used for so-called web services [58], explained more fully in the next section. The currently much-hyped SOAP approach is seen as an alternative to HTTP. With SOAP you use every application's own access methods, which is typically easier for the software vendors. The drawback is that this can be done in a large number of ways. The interaction complexity of this alternative rises with the square of the number of applications, O(n^2) [59] [60].

When using the standard Internet access method, the complexity of retrieving and deleting information is constant, whatever the number of applications. The interaction complexity of updating or adding information rises linearly with the number of applications, O(n).
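The difference between the two growth rates can be illustrated with a simple counting model. The model is an assumption made for this sketch, not taken from the cited sources: with application-specific access methods every pair of applications needs its own interface, whereas with a shared access method every application only has to describe its own data once.

```python
def custom_interfaces(n: int) -> int:
    """Pairwise interfaces between n applications: n(n-1)/2, which is O(n^2)."""
    return n * (n - 1) // 2


def standard_interfaces(n: int) -> int:
    """One data description per application on top of a shared method: O(n)."""
    return n


for n in (2, 10, 50):
    print(n, custom_interfaces(n), standard_interfaces(n))
```

Under this model, 50 applications already need 1225 pairwise interfaces against 50 data descriptions.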

The BC industry is heavily fragmented and the last thing it needs is a large number of custom-made interfaces instead of a completely standardised data access and modification mechanism as offered by HTTP.

With the large variety in applications used in BC, the advantages of using the standard Internet access method are clear. It is therefore recommended for use in the concept solution.

Data protection

When looking at a building project, you can distinguish four kinds of information in two dimensions. The first dimension is whether the information is specific to the project or not. The second dimension is whether the information is specific to a certain company or not [29]. Figure 4.1 shows the four intersections of the two dimensions.

Building codes are an example of information that is neither project-specific nor company-specific.

Project-specific is the project information. The not-company-specific part of this includes the client's brief, government plans regarding this project, the building Specification, etc.

Company-specific is the proprietary information. The not-project-specific part includes internal regulations, internal recipes-for-work, a database of past projects and so on.

Figure 4.1: Information can be specific or not specific to projects and companies. This table shows some examples for all four combinations.

The proprietary information and the project information partly overlap. This is an area where problems loom. Part of the proprietary information could be very handy for the project and therefore for the project partners. But do you want to give them that valuable information for free? Do you want to share it? It might be a competitive advantage in later projects, which you give away by sharing.

Also, part of the project information will be necessary for the company: the specification drawings and the specification text, for instance. These are essential for coupling with internal information such as work planning. But the project information has to be available to all partners in the project. The information therefore has to be shared, so it either has to be kept in one location or different copies have to be kept synchronised.

The essential property here is that the information has to be shared between the quadrants. The information should at least be partly accessible.

Data can be protected, on a technical level, using standard Internet means. Data encryption (using the https:// protocol variant) prevents data snooping. User/password protection safeguards the data.
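Both measures can be combined in a single request, as the following Python sketch suggests. It assumes HTTP Basic authentication as the user/password mechanism, which the text does not prescribe; the helper name `protected_request` and the credentials are hypothetical. The request is only built, not sent.

```python
import base64
import urllib.request


def protected_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build a request that is encrypted (https) and carries Basic credentials."""
    if not url.startswith("https://"):
        # Basic credentials are only encoded, not encrypted, so they need https.
        raise ValueError("refusing to send credentials over an unencrypted connection")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


request = protected_request(
    "https://example.org/project2/objects/door/5", "alice", "secret"
)
print(request.get_header("Authorization"))
```

Which partner may read or change which address is then a matter of server-side configuration, outside the scope of this sketch.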

Information presentation: HTML

HTML was invented by researcher Tim Berners-Lee to allow researchers to easily share their notes and articles [61]. It therefore allowed for headings (in various levels), links (to other documents), quotations, bold, italic and code examples. Later additions brought images, ways of indicating fonts and colours, and so on.

HTML's text-oriented nature meant a limited emphasis on syntax and a relatively greater emphasis on presentation. The syntax available is 'heading', 'paragraph', 'quotation'. Presentation is 'bold', 'italic', 'image'. A text viewpoint is assumed when calling 'heading' syntax: from a data viewpoint, a 'paragraph' with a Specification item is indistinguishable from one with a catalogue item. Mixing syntax and presentation makes HTML unusable for data transfer, as opposed to its great use as a textual medium.
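The point about indistinguishable paragraphs can be demonstrated with Python's standard HTML parser. The sample page and the names `TagCollector` and `tagged_text` are made up for this sketch; the two paragraphs stand for a Specification item and a catalogue item respectively.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect (tag, text) pairs: all that the markup reveals about the text."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._tag = None

    def handle_starttag(self, tag, attrs):
        self._tag = tag

    def handle_endtag(self, tag):
        self._tag = None

    def handle_data(self, data):
        if self._tag and data.strip():
            self.items.append((self._tag, data.strip()))


def tagged_text(html: str):
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.items


# Hypothetical page: a Specification item followed by a catalogue item.
page = (
    "<h1>Doors</h1>"
    "<p>Door 42: fire-resistant, 30 minutes.</p>"
    "<p>Model D-17: oak veneer, 930 mm wide.</p>"
)
items = tagged_text(page)
for tag, text in items:
    print(tag, "|", text)
```

Both items come out tagged as `p`: the markup says that they are paragraphs, not what kind of data they contain.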

The hypertext part of HTML's acronym indicates the familiar links you can click on in HTML documents. Hypertext means you can point towards another text from a starting document. HTML was not the first technology to attempt hypertext; it was, though, the first to gain such widespread acceptance. The two reasons mentioned most often are the simplicity of the format and the one-way linking mechanism.

The simplicity of the format allowed you to do a 'view source' and to try to come up with your own document, see figure 4.2. The one-way linking mechanism was a departure from then-current hypertext systems, which had bi-directional links. Bi-directional links meant internal consistency: no broken links. Their drawback, avoided by HTML's use of one-directional links, is that an elaborate coordination mechanism is needed. Allowing everybody to link to every document (one-directionally, without coordination) is clearly the simplest linking mechanism. The rise of the Internet has shown that the advantages of simplicity won out over more elaborate, consistent systems [62].

Based on HTML's success, a more data-oriented standard was created: XML, explained in section 4.3.1. Like HTML, XML was a simplification of then-current technologies, done with the aim of securing much greater adoption.

Figure 4.2: Short HTML code example.

Connected information: RDF

The NG Internet has a generic linking mechanism built in; the technology is called RDF [63]. If BC standardised on this linking mechanism, a lot of opportunities would open up.

Figure 4.3: RDF example: two objects linked using a named relation. The source, the target and the link are all identified with URLs. This means that the link itself can also be the target or source of another link. Source, target and link together form a so-called 'triple'.

The linking mechanism is the simplest mechanism possible, as it connects one object with another object using a named relation. Both the objects and the relation are identified using Internet addresses (as unique identifiers). The result is that every object that is identified using an Internet address can be linked to another object. Because the link itself is also identified, different kinds of relation can be told apart.

For example, an object in an IFC file (http://example.org/project2/door42) can be linked (http://example.org/belongs-to) to http://stabu.org/chapter/30, meaning that door #42 belongs in STABU's Classification, chapter 30. For a depiction, see figure 4.3.
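The example triple can be written down directly in Python. This is a minimal sketch using plain standard-library types rather than an RDF library; the names `Triple` and `to_ntriples` are hypothetical, and the output line follows the N-Triples notation, one of the standard serialisations of RDF.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Source, link and target, each identified by an Internet address."""
    source: str
    link: str
    target: str


def to_ntriples(triple: Triple) -> str:
    """Serialise one triple as a line in N-Triples notation."""
    return f"<{triple.source}> <{triple.link}> <{triple.target}> ."


door_in_stabu = Triple(
    source="http://example.org/project2/door42",
    link="http://example.org/belongs-to",
    target="http://stabu.org/chapter/30",
)
print(to_ntriples(door_in_stabu))
```

Because the link is an address like any other, it could in turn appear as the source or target of a further triple.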

The data format characteristics of RDF are analysed in section 4.3.2.

Reinout van Rees 2006-12-13