Europython: Pyphant, semantic web and JSON-RPC

Tags: europython, europython2006, plone, rdf

Klaus Zimmerman - Pyphant

Pyphant is a framework designed for the easy modelling of reusable data processing and analysis workflows.

One of the items of a workflow ("recipe") is a "worker", which takes input from a "socket", accepts parameters and gives back output on a "plug". In the GUI you can then connect a number of workers by wiring one worker's plug to another worker's socket. They used this, for instance, for pushing an image (from a microscope) through a number of filters. There are also workers with multiple input sockets.

If you select a worker, you get an input box where you can tweak the parameters. The GUI has a list of all the available workers.

Plugs are the actual computation entities. They receive their input from the sockets of their worker and they cache their results in a cache-safe way. Recipes aren't really executed, but evaluated lazily. If you have a branched recipe it automatically gets executed in threads.
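To make the lazy evaluation and caching a bit more concrete, here is a minimal sketch of the idea. It is not Pyphant's actual API: the class names `Socket`, `Plug` and `Worker`, their methods and the `calls` counter are all made up for illustration, and it is single-threaded, so it leaves out the threading of branched recipes.

```python
class Socket:
    """Input side of a worker; hypothetical, not Pyphant's real class."""

    def __init__(self):
        self.source = None

    def connect(self, plug):
        self.source = plug

    def pull(self):
        if self.source is None:
            raise RuntimeError("socket is not connected")
        # Asking for input triggers the upstream computation on demand.
        return self.source.result()


class Plug:
    """Output side of a worker: computes on demand and caches the result."""

    def __init__(self, compute):
        self._compute = compute
        self._has_result = False
        self._cached = None
        self.calls = 0  # only here to show how often we really computed

    def result(self):
        if not self._has_result:
            self.calls += 1
            self._cached = self._compute()
            self._has_result = True
        return self._cached


class Worker:
    """Wraps a function with input sockets, parameters and one output plug."""

    def __init__(self, func, n_inputs=0, **params):
        self.params = params
        self.sockets = [Socket() for _ in range(n_inputs)]
        self.plug = Plug(
            lambda: func(*[s.pull() for s in self.sockets], **self.params)
        )


# Build a small recipe: load -> scale -> total. Nothing is computed yet.
load = Worker(lambda: [1, 2, 3, 4])
scale = Worker(lambda data, factor: [x * factor for x in data], n_inputs=1, factor=10)
total = Worker(lambda data: sum(data), n_inputs=1)

scale.sockets[0].connect(load.plug)
total.sockets[0].connect(scale.plug)

# Only asking the last plug for its result pulls data through the chain.
answer = total.plug.result()  # answer is 100
```

Asking `total.plug.result()` a second time returns the cached value without calling any of the worker functions again.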

Pyphant allows researchers to do quite a lot of tasks themselves. They plan to add more workers and also to allow loops, which then needs more controls, etc. Looks good.

Marian Babik - Deep integration of python with semantic web technologies

The semantic web is an extension of the current web, providing infrastructure for the integration of data on the web. The idea is to make the actual data available: not just binary files, but the actual data.

RDF is the basis for the semantic web. Every statement consists of a source, a target and a link between them. All three have a URI (the target can be a simple string, though). As everything is identified by a URI (think URL), you can merge data from different sources. If I'm identified by http://vanrees.org/reinoutvanrees, two files can be merged when the one says that I have a website at vanrees.org and the other says that I'm a europython speaker. So: this europython speaker has that website.
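The merge example can be sketched with plain Python tuples, without any RDF library. The two link URIs under `http://example.org/terms#` are invented for this illustration; only the identifying URI comes from the example above.

```python
# The shared identifier that makes the merge possible.
ME = "http://vanrees.org/reinoutvanrees"

# Hypothetical link (predicate) URIs, made up for this sketch.
HAS_WEBSITE = "http://example.org/terms#hasWebsite"
SPEAKS_AT = "http://example.org/terms#speaksAt"

# Two independent "files", each a set of (source, link, target) triples.
file_a = {(ME, HAS_WEBSITE, "http://vanrees.org")}
file_b = {(ME, SPEAKS_AT, "EuroPython")}  # the target is a simple string here

# Merging is just a set union, because both files use the same URI for me.
merged = file_a | file_b


def describe(graph, source):
    """Collect everything the graph says about one source."""
    return {link: target for (src, link, target) in graph if src == source}


facts = describe(merged, ME)
```

After the merge, `facts` holds both the website and the speaker statement for the same person, which neither file could tell you on its own.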

RDFS allows you to define classes and subclasses.

OWL builds on RDFS. RDFS can't handle everything elegantly: subclassing only gets you so far. OWL also handles equality of classes, enumerations, datatypes, etc.

An important difference between an object oriented worldview and a semantic web worldview: OO has a closed world assumption, the semantic web has an open world assumption. OO assumes that it has all the info, semweb assumes that there is more information somewhere else, so it is more conservative in concluding something.
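A toy illustration of the two assumptions, heavily simplified: real reasoners do far more than a set lookup, and the fact names and function names are made up.

```python
# Everything this program happens to know.
KNOWN_FACTS = {("reinout", "speaks_at", "europython")}


def closed_world_ask(fact):
    """Closed world: what is not known is considered false."""
    return fact in KNOWN_FACTS


def open_world_ask(fact):
    """Open world: what is not known is simply unknown (None)."""
    return True if fact in KNOWN_FACTS else None


question = ("reinout", "speaks_at", "pycon")
closed_answer = closed_world_ask(question)  # False: we have no such fact
open_answer = open_world_ask(question)      # None: someone else might know
```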

Python has a lot of RDF programs: RDFLib, CWM, Pychinko, MetaLog, Sparta, Tramp.

His work on Seth tries to map between python and the semantic web. OWL classes become python classes, for instance. Ontologies are turned into modules. Looks pretty nice.
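I don't have Seth's API at hand, so the following is only a rough sketch of the general idea of turning a class hierarchy from an ontology into Python classes. The `ontology` dictionary and the `build_classes` helper are hypothetical and have nothing to do with how Seth really does it.

```python
# A tiny stand-in for an ontology: class name -> parent class name (or None).
ontology = {
    "Person": None,
    "Speaker": "Person",
}


def build_classes(hierarchy):
    """Create one Python class per ontology class, keeping the subclassing."""
    classes = {}

    def build(name):
        if name not in classes:
            parent = hierarchy[name]
            base = build(parent) if parent else object
            # type(name, bases, namespace) creates a class at runtime.
            classes[name] = type(name, (base,), {})
        return classes[name]

    for name in hierarchy:
        build(name)
    return classes


classes = build_classes(ontology)
Speaker = classes["Speaker"]
someone = Speaker()
```

An instance of `Speaker` is then automatically also an instance of `Person`, just like in the ontology.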

Jan-Klaas Kollhof - JSON-RPC

Why another RPC? He didn't like SOAP, XML-RPC missed some things, etc: he wanted to write his own version :-) Some of the goals were: easy to understand, easy to build, buildable in JavaScript (so: text-based, not binary).

At a high level in JSON-RPC, you have two peers who are allowed to bug each other at any time, either with results or requests. Asynchronous. If something takes long on the server, it just takes a lot of time to send back the response. If you send in an easy request in the meantime, its response might just be returned earlier than that of the time-intensive request.

Basically, you encode dictionaries in JSON. A request is the name of the method, some parameters and an ID. A response is the actual result plus the ID, so that you know which request it is the response to. There's a third one: a notification for just telling the other peer something without expecting a response.
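The three message types can be sketched with nothing but Python's `json` module. The field names (`method`, `params`, `id`, `result`, `error`) follow the JSON-RPC 1.0 specification as far as I know it; the `error` field and the `null` ID for notifications aren't mentioned in the talk notes above. The helper functions and the `handlers` dictionary are made up for this example, and there is no real network transport involved.

```python
import itertools
import json

_ids = itertools.count(1)


def make_request(method, *params):
    """A request: method name, positional parameters and an ID."""
    return json.dumps({"method": method, "params": list(params), "id": next(_ids)})


def make_notification(method, *params):
    """A notification: like a request, but with a null ID and no response."""
    return json.dumps({"method": method, "params": list(params), "id": None})


def make_response(request_text, handlers):
    """Run the requested method and wrap the outcome with the request's ID."""
    request = json.loads(request_text)
    handler = handlers[request["method"]]
    if request["id"] is None:
        handler(*request["params"])
        return None  # notifications never get an answer
    try:
        result, error = handler(*request["params"]), None
    except Exception as exc:
        result, error = None, str(exc)
    return json.dumps({"result": result, "error": error, "id": request["id"]})


handlers = {"add": lambda a, b: a + b, "echo": lambda text: text}

slow = make_request("add", 2, 3)
fast = make_request("echo", "hi")

# Remember which method belongs to which outstanding ID.
pending = {}
for message in (slow, fast):
    decoded = json.loads(message)
    pending[decoded["id"]] = decoded["method"]

# Pretend the responses come back in the "wrong" order: the ID sorts it out.
answers = {}
for raw in (make_response(fast, handlers), make_response(slow, handlers)):
    response = json.loads(raw)
    answers[pending.pop(response["id"])] = response["result"]
```

Because every response carries the ID of its request, the caller doesn't care in which order the answers arrive.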

Question: why positional arguments instead of named arguments? Answer: not all languages support named arguments, but he'll probably switch to named arguments altogether as they make it all much more readable.

There was some discussion with the audience. Jan-Klaas dislikes doing everything with plain HTTP POST. That will go through all firewalls, but it is harder to send messages in both directions: the client'll have to poll the server as the server can't really send anything itself. The alternative (that he likes himself) is to use socket connections, which give you a reliable data stream and allow the server to send stuff to the client. But that'll probably die on many firewalls or proxies and you can't have too many persistent connections.

Some flaws. Notifications have no error handling, but they're basically a silly idea anyway: just drop them and use normal requests. The message itself isn't JSON. There's no introspection, but that's not such a big issue as you should write applications for known interfaces, you can't really write an application on the fly at runtime :-)

Some future options: named parameters, remove notifications, some standard errors, HTTP binding in the spec (POST+GET), make response/request ID optional, etc.