Tuesday morning talks from the EuroPython conference.
See also the complete write-up overview.
Included here: plone4artists, zope CMS projects discussion, my talk, accessing huge distributed data sets and component-based programming.
Plone4artists is a pre-customised plone site that works out of the box, aimed specifically at artists and especially at artist community sites.
A lot of use is made of the zope support for WebDAV folders, which makes integration with, for instance, a desktop machine (mac/linux/windows) handy. Likewise, there is integration with calendar applications via iCal.
A fun one is "plodcasting", a podcasting product for plone. This way much more use can be made of all the content that gets uploaded to the site.
As it is for artist community sites, they integrated the "creative commons license assignment" product to assign the correct CC licenses to the various bits of content.
Nate recently started looking at archgenxml UML-driven development (yeah!). He showed some examples, like easily generating a new member type with some more attributes than the normal name/email pair.
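To give an idea of what that looks like (my own illustration, not Nate's actual code), archgenxml turns such a UML class into an Archetypes content type roughly along these lines; the class and field names below are made up:

    # A rough sketch of the kind of Archetypes content type that archgenxml
    # generates from a UML class; class and field names are invented here.
    from Products.Archetypes.atapi import (BaseContent, BaseSchema, Schema,
                                           StringField, registerType)

    ArtistMemberSchema = BaseSchema.copy() + Schema((
        StringField('fullname'),
        StringField('email'),
        StringField('portfolio_url'),  # extra attribute beyond name/email
        StringField('discipline'),     # e.g. painting, photography, music
    ))


    class ArtistMember(BaseContent):
        """Member type with a couple of artist-specific extra attributes."""
        schema = ArtistMemberSchema


    registerType(ArtistMember, 'plone4artists')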
They're planning on re-purposing an iPhoto-to-flickr open source project to be able to easily publish files from your mac iPhoto application into your plone4artists site.
Question from Paul Everitt: "I don't believe that you've got WEBDAV to work". Nate acknowledged that it took a lot of work and that it more or less works, though not always reliably.
He's been looking at PloneMall, which is a plone product for electronic shopping. This could help both the artists and the site to earn money by selling merchandise.
Participants: Tres Seaver (CMF), Philipp von Weitershausen (zope3), Steve Alexander (zope3), Martijn Faassen (silva), Joel Burton (plone), Florent Guillaume (CPS).
The idea is to cooperate much more than is currently customary among the various CMS-like projects in the zope3 era.
Florent: zope3 is suitable for many things; what we want to code cooperatively is only the content management stuff, so we want to focus.
Martijn: the ECM discussion has turned into too much of a political discussion; he wants to have more technical discussion.
Joel: Plone is focussed on a good out-of-the-box CMS experience. He hopes that ECM allows them to focus much more on that instead of on the more framework-like things such as archetypes. Comment by Limi: it would be great to give the low-level technical people inside the plone community a better place to put their infrastructural work.
Steve: focus on the python code and don't focus too much on the zope object database, as that might not be the best choice in every situation.
Philipp: the development model of zope3 might be something to imitate. Zope3 is already very good and elegant. The focus of ECM should be to lift zope3 up so that it is also usable for CMS work. Also: don't be too afraid of throwing away existing, working code. They did that within zope3 as well and it worked out fine in that case.
Tres: one of the issues with projects like this is the need to trust each other's code, so a lot of testing is needed. Also: the existing projects solve real problems, so we've already got the use cases. And that is often the hard part.
There was a bit of discussion on backward compatibility. There was an attitude in zope2 to be really, really backward compatible. Do we need to try that hard (which has drawbacks) or can we aim more at good migration support? Limi: look at the PEP or PLIP process, with strict deprecation rules. That should be good enough. Florent: at the moment the number of core zope3 developers is a bit low, so things have been quiet for a while. Martijn: for ECM it's only needed the moment we actually have something which we want to preserve.
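As an aside from me: "strict deprecation rules" in practice boil down to something like the standard python deprecation warnings. A minimal illustration of my own, not an actual zope or plone snippet:

    # My own minimal illustration of a strict deprecation rule: the old name
    # keeps working for a couple of releases, but warns loudly about removal.
    import warnings

    def new_and_shiny(document):
        return document.upper()  # stand-in for the real, renamed functionality

    def old_and_busted(document):
        warnings.warn("old_and_busted() is deprecated and will be removed "
                      "in release 2.0; use new_and_shiny() instead.",
                      DeprecationWarning, stacklevel=2)
        return new_and_shiny(document)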
Martijn: we really need a solid, fixed release schedule for the core parts. The current situation is not good.
Steve: frequent releases are important for packaging with OSs like Ubuntu, as they only pick up versions that are properly released in time for their own fixed release dates.
Martijn: re-use can also happen on the python level. Python modules will still work in 8 years' time, but a zope module? Probably not. Reach out to all those python developers. So, also for ECM, try to look much more towards python.
Steve: for something like ECM you need to have "conceptual integrity", and for that you almost need a "zope pope": a single vision. Watch out for grabbing several fragments and putting them into one ECM project.
Martijn: but... you almost need the python reach-out.
Martin Aspeli: you have "conceptual integrity" problems at various levels. Different naming conventions on the low level, for instance, and also the way in which you use several components on the higher level... It is important that the examples that are the first things people look at are pretty much consistent.
Question: is ECM aimed at the end user or is it a stepping stone for the plones and the CPSes? (I expected an answer about "plones and CPSes", but there was discussion on versions instead). Also a lot of discussion on software stacks. In fact, Plone and CPS mostly use the same stack in the sense of a recent python, the latest zope 2.7, cmf 1.4 and so on.
Limi: a document on how to package zope would be useful. Debian, gentoo, etcetera: you don't want to know how they package it....
(The rest of the discussion was a bit hard to write down).
Martin Aspeli: do we need to build some community around ECM? There is the risk that it looks like a project for Really Good Programmers, thereby scaring away potential developers. Martijn: I'm taking this very seriously; this is something that we need to actively avoid.
I won't write down my own talk here, but the sheets plus the complete text are available as a PDF. It is actually a better story on paper than how I presented it. Well, it is a bit of a scientific subject, which is always hard. As long as some people are going to take a look at archgenxml, I'm already happy. If people start to export more data as computer-readable xml, I'm happier still. And if somehow someone got interested in my plone+semweb+construction work... Ecstatic :-)
But after every presentation I think to myself "why did I ever do this".
Basically: about storing lots of data.
Problem: how to manage large scientific datasets? A student generated 30 gig of data and you either just leave it where it is or throw it away. So you lose a lot of data and need to manage it somehow.
A scientist doesn't like databases and doesn't like interacting with them, not to mention on the SQL level. So Steven's trying to bring the modern database world to the researchers. He's in the BioSimGrid project that tries to make it all easier for scientists.
The biomolecular simulations are often 10GB of data, but they can be split into two groups. First the metadata, which is small: time, date, parameters. They'll access this a lot. Second the actual molecule simulation data, which is huge but not accessed that often.
So they made a grid with some 45TB of capacity split over 5 sites. Some key features: you can deposit data at any site, read data from any site, and there's redundant data replication (it's simply copied to one other node). They've got a set of python analysis tools for accessing and mining the data in the grid.
The metadata is small and accessed often, so that's replicated to every site in an oracle database (which was a political choice). The simulation data is stored in the SRB (storage resource broker), which is more or less like a distributed file system.
A key point is to use the right tool for the job. The simulation results don't need to be stored in some granular database format; the researchers "just want their big blob of data back". So: search the database for the metadata and retrieve the file afterwards.
SRB. Not really a distributed file system, more like a bittorrent-like thing. There wasn't a good interface to SRB, so he's currently working on an object-oriented python interface. So far it's got an SRB connection, standard python file object support and so on.
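To make the "query the metadata, then fetch the blob" workflow concrete, here is how I imagine using such an interface. All the module, class and method names below are my own guesses for illustration, not the real BioSimGrid or SRB python API:

    # Hypothetical sketch: metadata lives in the replicated oracle database,
    # the bulk simulation data comes back as one big blob via SRB.
    import cx_Oracle
    from srbclient import SRBConnection  # assumed name for the OO SRB wrapper

    # 1. Small, often-accessed metadata: a normal SQL query.
    db = cx_Oracle.connect("researcher/secret@biosimgrid")
    cursor = db.cursor()
    cursor.execute("SELECT srb_path FROM simulations "
                   "WHERE molecule = :mol AND temperature = :temp",
                   mol="lysozyme", temp=300)
    (srb_path,) = cursor.fetchone()

    # 2. Huge, rarely-accessed simulation data: standard python file behaviour.
    connection = SRBConnection(host="srb.example.org", user="researcher")
    remote_file = connection.open(srb_path)
    try:
        blob = remote_file.read()  # "just give me my big blob of data back"
    finally:
        remote_file.close()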
But still... pulling 30gig over the wire isn't too handy. So he's got some ideas to associate code with files (the code is normally python) in order to calculate the results at the database. The results (mostly pretty small) and the parameters are stored, so you've got the possibility of caching and sharing of results.
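My own sketch of that caching idea as I understood it: the dataset, the analysis code and its parameters form the cache key, so the (small) result can be stored and shared instead of re-running the whole computation. The helper names are hypothetical:

    # Illustrative sketch of caching computed analysis results.
    import hashlib

    _result_cache = {}  # in the real system this would live in the database


    def execute_at_storage_site(srb_path, code, parameters):
        # Placeholder: the real system would ship this python code to the node
        # that holds the data and run it there, next to the 30 gig file.
        namespace = {"srb_path": srb_path, "parameters": parameters}
        exec(code, namespace)
        return namespace.get("result")


    def run_analysis(srb_path, code, parameters):
        """Return a cached result if this exact analysis was requested before."""
        key = hashlib.sha1(
            repr((srb_path, code, sorted(parameters.items()))).encode()
        ).hexdigest()
        if key not in _result_cache:
            _result_cache[key] = execute_at_storage_site(srb_path, code, parameters)
        return _result_cache[key]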
As the process is pretty much automated, it wasn't that hard to hook it up to the research calculation grid either: big calculations can be farmed out to the calculation grid.
Raphaël normally works on distributed computing.
When we're dealing with objects, we're programming in the small; with components we're programming in the large. Components do not replace objects. Components have contractually specified interfaces and an explicitly stated external context.
Explaining components to students is hard, as the existing stuff is either too complex or too specific. So there is a need for a "component activity kit": picolo.
It is written in python and the core is, partly thanks to python, pretty small: 300 lines, so students can understand everything that goes on.
(He gave a small demo. What I got out of it is that a component architecture is aimed at keeping everything neat and well-defined. Nice to see this presentation after some presentations earlier today where the zope3 architecture was explained - which uses a component architecture! And, yes, to keep everything more neat and well-ordered.)
As zope3 is component based, I asked whether it was a good idea to put the configuration of the connections between the components in a separate (XML) file, like zope3 does. According to Raphaël, that was the conclusion of a component-related conference a while ago. Making the components and tying them together are two separate concerns.
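To illustrate that separation (a toy example of my own, not picolo's or zope3's actual mechanism): the components themselves are plain python with their dependencies stated explicitly, and a separate XML file says which implementation gets plugged in where:

    # Toy example of "write components in python, wire them up in a separate
    # XML file"; this is not the actual picolo (or zope3 zcml) syntax.
    import xml.etree.ElementTree as ET


    class FileStorage:
        def store(self, name, data):
            print("storing %s (%d bytes)" % (name, len(data)))


    class WebFrontend:
        def __init__(self, storage):
            self.storage = storage  # dependency handed in, not looked up inside


    AVAILABLE = {"FileStorage": FileStorage, "WebFrontend": WebFrontend}

    WIRING = """
    <wiring>
      <component name="storage" class="FileStorage"/>
      <component name="frontend" class="WebFrontend" uses="storage"/>
    </wiring>
    """


    def wire(xml_text):
        """Build the components and connect them as the XML file prescribes."""
        registry = {}
        for node in ET.fromstring(xml_text).findall("component"):
            cls = AVAILABLE[node.get("class")]
            uses = node.get("uses")
            registry[node.get("name")] = cls(registry[uses]) if uses else cls()
        return registry


    components = wire(WIRING)
    components["frontend"].storage.store("photo.jpg", b"...")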