Wednesday morning talks from the EuroPython conference.
See also the complete write-up overview.
Included here: research on fun and software development; sprints; choosing good names; selenium functional testing; using tests for motivation.
Results of the study "fun and software development" (FASD). Sometimes software seems a by-product of some process that provides fun :-) One of the hypotheses was that "developing open source is more fun".
What is fun? For his research he needed a more unequivocal term. He uses the term "flow" (from Csikszentmihalyi). Flow is characterised by knowing what to do (like in sport), having the right challenge, a good match between capabilities and requirements, concentration, a perception of increased control, etc.
He did his research by making a questionnaire for open source developers and developers of purely commercial software. (Hm. I'm wondering now whether I've filled in this questionnaire: I remember something like that...)
Funny: apparently there is a linear relationship between "fun" and "engagement for open source". The more fun, the more readiness to work more on open source. It doesn't wear off quadratically with more and more fun.
After a complaint about calling it "commercial" instead of "proprietary", he explained that he asked open source programmers in general (whether they got paid to do OS development or not) and developers in a couple of Swiss companies.
Some differences: in open source you've got an optimal challenge (you do what you can) and you've got more project vision. Commercial: more formal authority, monetary incentives and deadlines.
(I didn't get everything, I didn't understand the lists of numbers that weren't too well explained. Ah well, there's probably a paper that does it.)
The results are here
She showed a picture of a school class sitting neatly in rows, looking uncreative. That's the optimum of 1000s of years of educational development?!? Definitely not how our open source projects are looking.
The PyPy project makes a nice Python-in-Python compiler. They're now funded by the EU for two years. They also have to do research on agile methods.
One of the things is a "sprint". It originated at Zope Corporation for the Zope 3 development. A sprint is a multi-day session of intense development, 2-5 days long, with no more than 10 people and using aspects of extreme programming.
Some good points of sprinting. The productivity comes mostly from teambuilding. Imagine a ball falling and falling and falling... You need to bounce back up, otherwise the developer leaves. During the "falling" stage, you orient yourself ("who are you"), start to build trust, align the goals and roles. And then the bounce: commitment. Then moving up: implementing, WOW, restart (do we continue, do we do something else, etc.).
This means we need to structure the sprint. You have to have a process, steer it a bit, and have some social skills to get a successful sprint.
How is it done? There is actually quite a lot of info on the net (mostly Zope-centric) on how to do it regarding content+logistics. You need the infrastructure: connectivity, coffee, a room. Otherwise the sprint won't work. On procedure: have an introduction, followed by a tutorial to get new contributors up to speed. Important: tracking! Track what everyone is doing, keep showing interest. Track against the goal.
Michael Hudson once asked whether it was actually possible to do open source sprinting 10 years ago: no Ryanair, no wireless, no affordable laptops? He had a point.
Within PyPy they try to learn by doing - by reflecting on what they've done. Trying to design a process, to adjust the process. And disseminate the process. So document the process.
Funny: they're also working on integrating project management (=meetings) with the software development during the sprints.
Issue: functioning within EU funding.
There are challenges regarding sprinting. For instance expectations and participants: the expectations must not be too heterogeneous. Vision versus implementation: during the sprint you're creative, but most ideas won't get implemented during the sprint, so they have to be done in between the sprints. This must be taken care of. Issue: leadership and process management, which depends a lot on the participants - whether they're proactive or not. Another issue: funding. Ryanair and hotels cost money. Especially for unfunded open source projects.
If you've got experiences or comments on sprinting, please help Bea with your info ( bea@changemaker.nu ).
"And God called the light "light" and it was good". Naming things is not just something God is allowed to do, programmers do too. After showing sheets with some assembly or internal computer bytecode, the need for "names" as such was pretty clear :-)
Humans deal with complexity by means of abstraction. So the computer still uses numbers, but we've put a layer on top of it. First procedural languages, like COBOL. Comment about COBOL: devised so that management could read the code that the programmers wrote...
Even further: object oriented programming. Object orientation gives you arbitrary levels of abstraction.
With the tools that we have available and the possibilities that we have, when programming, our audience is human. The computer couldn't care less what we do. And the machines are fast enough not to have to use machine-centric languages most of the time. This is the core of his talk.
Observation: there is no substitute for experience and taste. You notice it when code is written by beginners!
For communication you need a common vocabulary. For good names we can look at lots of existing software. Input comes from computer science, the problem domain and our experience.
The "gang of four" software pattern book made an important contribution. Not so much their explanation of how to implement the patterns, but naming them. Now we can talk about an Iterator or an Adapter and get a pretty good idea of what is meant.
When coding, we are creating instances of our tools (lists, classes) and we use names from the problem domain. So a list could be called products. When choosing names, we must not mix tool-space and problem-space. The so-called "Hungarian notation" is probably one of the worst things you could do.
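A small sketch of the difference between tool-space and problem-space names. The variable names are my own illustration, not something shown in the talk:

```python
# Hungarian-style names encode the tool (list, string, int) in the name
# and say little about the problem being solved.
lst_str_prod = ["bike", "train"]
int_cnt = len(lst_str_prod)

# Problem-domain names say what the values mean. The list is still a
# list, but the name does not need to repeat that.
products = ["bike", "train"]
product_count = len(products)
```

Both halves do exactly the same thing; only the second half can be read without a decoder ring.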
Hints. Code must be understandable by itself, comments don't help. Use English language names. So know your English. Be consistent throughout the program, the specs and the docs. Do not rename things halfway, that is counter-productive.
Short names are good, but don't overdo it. Saving keystrokes is not a strategy, though: use count, not cnt. If it looks like line noise, you won't understand your own code half a year down the road. Important hint: grepability counts, which also rules out too-short names.
In your code, tell a story. A set of plain numbers in an argument list is way less "telling" than keyword arguments: xyz(3, 8) versus xyz(product=3, quantity=8).
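To make the keyword-argument point concrete, here is a minimal sketch. The function order_line() is a hypothetical stand-in for the xyz() of the talk:

```python
def order_line(product, quantity):
    """Return one line of an order as a dictionary."""
    return {"product": product, "quantity": quantity}


# Positional call: the reader has to look up what 3 and 8 stand for.
positional = order_line(3, 8)

# Keyword call: the call site explains itself.
keyword = order_line(product=3, quantity=8)
```

Both calls return the same value. The only difference is how much the reader learns at the call site.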
Module names should inform about the contents. Lowercase them. Martijn Faassen says that module names should always be singular, that way you don't have to guess.
Classes start with an uppercase character. Often it is UpperCasedLikeThis.
Method names signify what the method does or what it returns. They start with a lowercase letter. Either mixedCase() or under_scores(). The first part should be an action, like trim_spaces(), not space_trimmer().
Variable names inform about the value. Booleans should start with has or is.
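Put together, the naming hints for classes, methods and booleans could look like the sketch below. ShoppingCart and its members are names I made up for illustration:

```python
class ShoppingCart:
    """Class names are UpperCasedLikeThis."""

    def __init__(self):
        self.products = []         # problem-domain name for a list
        self.has_discount = False  # boolean attribute starts with "has"

    def add_product(self, product):
        """The method name starts with an action: add_product, not product_adder."""
        self.products.append(product)

    def is_empty(self):
        """A method returning a boolean starts with "is"."""
        return not self.products
```

Following the module hint, such a class would live in a lowercase, singular module like cart.py (again a hypothetical name).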
Like testing, we choose good names primarily for our own benefit.
(I ran in five minutes late, so I missed part of the demo. Wrong room :-) There was a lot of good demonstration, so I won't write much here. By the way: the room was full, a good indication of the testing-readiness of the developers present, which in turn is a good indication of the probable quality of the code!)
There are two modes: in-browser mode with the tests run by html+javascript in one frame and the website under test in another frame. The tests are written in html tables. The other mode is the "driven mode", with the browser being steered by an application on the same machine. That program has a Python interface.
Selenium has actions for opening a page, clicking somewhere, typing in text somewhere, and so on. Checks can be for text that's present or not present, for instance. How do you locate elements (for clicking/text entering)? By identifiers (id="something"), DOM, XPath.
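The three locator styles can be tried out without a browser. The sketch below is not Selenium's own API: it uses Python's standard xml.etree.ElementTree on a made-up page, purely to show what locating by id, by XPath and by walking the DOM means:

```python
import xml.etree.ElementTree as ET

# A made-up, well-formed page to search in.
PAGE = """
<html>
  <body>
    <form>
      <input id="something" type="text" />
      <input id="submit" type="submit" />
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# By identifier: the first element anywhere in the tree with id="something".
by_id = root.find(".//*[@id='something']")

# By XPath-style path (ElementTree only supports a subset of XPath).
by_xpath = root.find("./body/form/input[@type='submit']")

# DOM-style: walk the tree by position, html -> body -> form -> first input.
by_dom = root[0][0][0]
```

Locating by id is usually the most robust of the three: the path and the DOM walk break as soon as the page layout changes.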
Selenium uses a javascript "bot" inside the browser. There is no need to change the core selenium if you want to do customisation, as it checks for a selenium-custom.js or so on startup.
Tres Seaver made Zelenium, a Zope product that makes this easier for the Zope world. PloneSelenium is an alternative, especially for Plone. PloneSelenium has a portlet which you can use to create tests. The tests consist of one Python script per test.
For Plone 2.2 there is a plan (PLIP 100) to add Selenium testing to Plone. Great.
Testing by hand is boring. Automated tests might be boring. But you want happy customers. Those customers, though, aren't cheering you on 10 times a day. Automated tests, however, can cheer you on 10 times a day! See a running test as a customer cheering you on because they're getting good software.
Extreme programming has a big emphasis on unit tests and not so much on acceptance tests. They found out that unit tests made them sad and acceptance tests made them happy. Acceptance tests focus on customer approval instead of on programmer intent (a unit test tests that you did what you intended).
Acceptance tests typically take a long time. Ouch. The long cycles resulted in more "coding by guessing" instead of "coding by testing". Hm. That needed to be faster. They tried different things to get the time down, but in the end they just piled on a lot of machines. Hardware costs less than people. They already had the infrastructure to distribute calculations over different machines. Tests slow? Throw in more machines.
Some tests (at least in their business) took a lot of time and effectively were run outside the edit-compile-feedback short cycle. They had to compensate for that.
Really important: write a test first to demonstrate the wrong or missing functionality, start satisfying the test afterwards. This really strengthens all the other extreme programming points like collective ownership, simple design, continuous integration and refactoring.
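A minimal sketch of that test-first order, using Python's standard unittest module. The trim_spaces() function is borrowed from the naming talk earlier in the morning; the test cases are my own illustration. The test class is written first and fails until the function is filled in:

```python
import unittest


def trim_spaces(text):
    """Written second: just enough code to satisfy the tests below."""
    return text.strip()


class TrimSpacesTest(unittest.TestCase):
    """Written first: demonstrates the missing functionality."""

    def test_surrounding_spaces_are_removed(self):
        self.assertEqual(trim_spaces("  europython  "), "europython")

    def test_inner_spaces_are_kept(self):
        self.assertEqual(trim_spaces(" two words "), "two words")
```

Saved in a file, this can be run with python -m unittest followed by the file name.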
Regarding retrofitting tests onto an existing application: with unit tests, code coverage is an issue and getting it can be a massive undertaking. Introducing and modifying tracing output to become more deterministic is much less work and can be gradual. Testing for customer approval is thus probably much easier to pull off.
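The talk didn't show how they made their tracing output deterministic, so the following is only my guess at one possible approach: replace everything that changes between runs (timestamps and memory addresses here) with fixed placeholders, then compare with a stored expected trace. All names and the trace format are made up:

```python
import re


def normalize_trace(trace):
    """Replace values that differ between runs with fixed placeholders."""
    trace = re.sub(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", "<TIMESTAMP>", trace)
    trace = re.sub(r"0x[0-9a-fA-F]+", "<ADDRESS>", trace)
    return trace


# The stored expected trace, already in normalized form.
EXPECTED = "<TIMESTAMP> order created at <ADDRESS>\n<TIMESTAMP> order shipped"

# Two runs of the same scenario: same behaviour, different raw output.
run_1 = "2005-06-29 09:15:02 order created at 0x7f3a10\n2005-06-29 09:15:03 order shipped"
run_2 = "2005-06-30 14:01:44 order created at 0x55e0c8\n2005-06-30 14:01:45 order shipped"


def trace_matches(trace):
    """Return True if the normalized trace equals the expected one."""
    return normalize_trace(trace) == EXPECTED
```

The nice part is that this can be introduced gradually: every trace line that is made deterministic adds a bit of regression coverage without touching the application's structure.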
The key issue for agile teams is social skills and the ability to adapt to a highly communicative environment. Some tips for the road to agility. Number one is to establish automated regression testing. Learn to estimate in 1 to 3 day pieces. Once the pieces are small enough, you can have integration meetings (every 2 or 3 weeks) and daily standups. That's also nice for the manager, who has a much easier task of tracking everything! Continuous integration, having something running all the time. Let your tests drive your development (they still haven't implemented that 100%). On pair programming: see it as continuous code review! Consolidate test coverage to enable courageous refactoring and collective ownership.
Important (according to him): Acquire a customer on-site to use user stories. User stories are normally short, just a few lines. When you actually start to program, almost always you've got questions you want to ask to clarify them.
Comment from the audience: pair programming is like rally driving. One person is steering and pushing the gas pedal, the other one is reading the map and showing the way. In pair programming, one is coding, the other is thinking strategically, keeping the goal in mind and trying to maintain the overview. (Hm. As a human you have only about 7 things you can keep in your head at any one time, so pair programming might give you some extra ones, say 10 in total. And one of the differences between ordinary people and really smart and productive people is the number of things they can keep in their head. 7 is the average. 8 is a lot. So if you're pairing and get to 10 or so... Dunno, just philosophizing.)
Johan: I spend about 30-40% of my time pair programming. The rest is administrative work, debugging and so on. But every piece of code that gets into our system gets in there by pair programming.
My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.