Reinout van Rees’ weblog

Software releases 6: skeletons for easy best practices

2010-07-30

Tags: python, softwarereleasesseries, buildout, django

This is the last post in my software releases series (at least, the last as I originally planned the series).

In the last 5 articles, you’ve seen a couple of places where there’s some irritating boilerplate. What’s supposed to go into your setup.py? How did you start off with a buildout.cfg again? Oh, don’t forget the CHANGES.txt and a basic README.txt. The solution for that is something I normally call a skeleton. I mentioned skeletons earlier in a Dutch python user group talk on practical project automation.

A skeleton is a basic project structure with a README, a couple of directories, some python files to start off with and whatever else you want to put in there. You give it some parameters (project name, website address, whatever) and it’ll fill in the blanks with those parameters.
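To make the "fill in the blanks" idea concrete, here is a minimal sketch using Python's string.Template. It is not how PasteScript does it internally; the template text and the parameter names (project, url) are made up for the example.

```python
from string import Template

# Hypothetical skeleton file: the $placeholders get replaced by the parameters.
README_TEMPLATE = Template(
    "$project\n"
    "$underline\n"
    "\n"
    "See $url for more information.\n"
)


def render_readme(project, url):
    """Fill in the blanks of the README template."""
    return README_TEMPLATE.substitute(
        project=project,
        underline="=" * len(project),
        url=url,
    )
```

Calling render_readme("mysite", "http://example.org") gives you a README with the project name as an underlined title. A real skeleton does the same for every file and directory name in the structure.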

Django has a manage.py startapp, but there’s a generic python tool called PasteScript. PasteScript’s documentation almost exclusively talks about running wsgi applications, but it actually can create project skeletons for you. The best way to get started is to look at ZopeSkel, which has a large collection of skeletons. Download it and see what it can do.

There are two core advantages to using skeletons either for yourself or within your community or within your company:

  • Boilerplate reduction. Basic files get created and set up for you. Laziness works to your advantage. “Just start the project from a skeleton” gives you something reasonably solid and neat. The alternative is to just start off with a python file and a subdirectory and “remember” to add the readme and so on later.
  • Best practice. Pour your knowledge into a skeleton. Figure out the way you want to set up your projects once and reap the fruits many times over. You’re probably copy-pasting apache config files from one project to the other: why not keep the latest and greatest config file in the skeleton? For me, this is the number one advantage: you’ve got a place to collect your project setup best practice. And having that place to put it means that more best practice gets attracted!

Want to start your own skeleton? What I’d recommend is to start with a ZopeSkel download and to look at their code and how to set it all up. Then start your own. I worked for Zest software and started “zestskel” there. I worked for The Health Agency afterwards and started “thaskel” there. Now I work at Nelen en Schuurmans so I’ve started “nensskel” here :-)

Some examples of what I get when I create a new django site with nensskel (after giving it a project name):

  • Basic setup.py with project name filled in. Long description is read from the readme and changelog. Some common dependencies are pre-filled.
  • buildout.cfg which sets up django, creates apache config files, contains package-versioning best practice setup, contains some commonly used tools, etc.
  • Readme, changelog, todo file. Of course with the project name in there.
  • Basic apache config file. In here there’s also the configuration that’s needed for django’s static files. And some caching setup. And the wsgi configuration. And the setup needed for django-compressor.
  • A directory with the actual django app (empty models.py, views.py and so on). A bit like how django’s manage.py startapp does it, but now with our own defaults.
  • Our own defaults? Yes, for instance the boilerplate needed in settings.py for our django-staticfiles css/js setup. You definitely don’t want to type that by hand.
  • Pre-created PROJECTNAME/templates/PROJECTNAME, PROJECTNAME/media/PROJECTNAME and PROJECTNAME/fixtures/ directories.
  • Ready-made test setup just the way we like it.

So: make it easy to do the right thing. Let laziness work in your favour. Start a skeleton today!

Dutch Django meeting in Amsterdam

2010-06-10

Tags: django, pun, plone, buildout

Djangocon.eu 2011 news

Djangocon.eu 2011 is going to be organized in... Amsterdam!

My djangocon.eu summary

I opened the evening with a summary of last month’s djangocon.eu conference. I won’t write down a summary of the summaries that I already made of all the talks :-) Some main points:

  • Go to such a conference! Good experience. What to do? How to get there? What happens?
  • Django’s own development. The django technical panel, Jacob’s keynote, the bad pony talk, etc.
  • NoSQL. A big hit at the conference with 3 talks and 1 panel.

Some links I gave:

www.academictransfer.com (Jan Murre)

Subtitle: “an academic Jobboard with Django. Also featuring: Plone and SOLR”. Academictransfer is a spin-off from the VSNU, the combined Dutch universities. Academic transfer has a lot of academic jobs in the Netherlands and they’re expanding abroad. It is really focused on academic jobs.

Pareto, his company, uses Plone a lot. Plone is a really good content management system, but people sometimes abuse it for tasks for which it is not suited. For a few years now, they have used Django for those kinds of non-Plone-suited projects.

The old site was a 10-year-old ColdFusion site. A classic case of vendor lock-in by the company that built the original site for them. Performance problems and big problems at every change request.

They originally asked Pareto for a Plone website (“it is a lot of content”). Plone is a good CMS, but this is mostly a database-driven site, so they advised Django.

They also use SOLR. SOLR is a high-end open source search engine (from the apache world). Till now, this has proven to be a good choice.

They use buildout (YEAH! See my buildout articles). It is their workhorse for setting up Django websites. With a buildout config, you can have a fresh, running django website within a minute. (Note: read Jacob Kaplan-Moss’s article about Django and buildout).

People are looking for a job, so the search functionality is the most important part of the site. SOLR works great. It is good at searching with facets. A facet can be an organisation or a keyword or an academic field for instance.
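What searching with facets boils down to can be sketched in plain Python. SOLR computes this server-side and much more efficiently; the vacancies and field names below are made up for the example.

```python
from collections import Counter

# Made-up vacancies; the field names are assumptions for this example.
VACANCIES = [
    {"title": "PhD position", "organisation": "TU Delft", "field": "physics"},
    {"title": "Postdoc", "organisation": "TU Delft", "field": "chemistry"},
    {"title": "Lecturer", "organisation": "Utrecht University", "field": "physics"},
]


def facet_counts(documents, facet):
    """Count how many documents fall under each value of a facet."""
    return Counter(doc[facet] for doc in documents)


def filter_by_facet(documents, facet, value):
    """Narrow the result list down to one facet value."""
    return [doc for doc in documents if doc[facet] == value]
```

The counts are what you show next to the search results ("TU Delft (2)"); clicking one of them applies the filter.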

There’s login and personalization. Storing your queries; getting mails with new vacancies. Universities also can have their own page in the site. Most of the website is made with Django, but Plone is also used. An apache config divides up the requests between the two of them.

The UI is reused between Django and Plone. They did this by having Django ask plone for a base template. What plone returns is a plone browser view, but it is a Django template with Django’s tags! The base template is downloaded with a cron job every night as the Plone site doesn’t change that much. Another option would have been deliverance. Nice system.

A localization problem: Plone and Django use different methods for setting the language. Django stores it in the session and Plone uses a separate i18n cookie. The solution was to do the language switching on the Django side and get Django to set the Plone cookie, too. (An alternative could have been some django i18n middleware).

User management is Django-only. In the Plone pages, you want to show the username, too, though. The solution was to add an ajax call to the Plone interface that asks Django for the username :-)

The whole website is cached with varnish (which is basically the standard in the Plone world). For the username stuff, they had to use aggressive cache invalidation. An alternative they’re going to pursue is SSI (server side includes). A main reason is that such ajax injection goes against the webrichtlijnen (Dutch accessibility guidelines).

They want to show some Django content, like recent job entries, in the Plone homepage. The solution: load those entries via RSS/Atom feeds in Plone portlets. You have to make those feeds anyway.

Website statistics are of course important. The counters are stored in a Redis store. Redis is a very simple noSQL database that’s basically only a key/value store. Lightning fast. It uses http for connections. He had some doubts about that, but it turns out to be fast, too. Google analytics wasn’t an option as academic transfer had some custom requirements. Especially very custom “funnels”. Funnels are specific sequential steps that a customer can take through the site.
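A funnel counter could look roughly like the sketch below. This is my own illustration, not their code: a plain Counter stands in for the counters in the key/value store, and the step names are invented.

```python
from collections import Counter

# Hypothetical funnel: the steps a visitor takes, in order.
FUNNEL = ["search", "view_vacancy", "apply"]


def funnel_counts(visits, funnel=FUNNEL):
    """Count per step how many visitors got that far, in sequence.

    ``visits`` maps a visitor id to the ordered list of pages seen.
    The returned Counter stands in for counters in a key/value store.
    """
    counters = Counter()
    for pages in visits.values():
        position = 0
        for step in funnel:
            try:
                # Only look for this step after the previous step's position.
                position = pages.index(step, position) + 1
            except ValueError:
                break
            counters[step] += 1
    return counters
```

A visitor that applies before searching does not count for the "apply" step, as the steps have to be taken in order.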

Some extra features:

  • Pluggable publication model so that you can submit to/from several universities’ job sites.
  • Chinglish plugin: automatic English-Chinese translation. The quality isn’t perfect, but it suffices to get translated job vacancies into China’s main “baidu” search engine. The people that find it then can switch to English for the “real” job vacancy.
  • Several import/exports.

Question: when would you use Django, when Plone? Answer: for this site, they’d have gone with all-Django. Plone is used for only 15 pages. But if your site needs real CMS functionality, use Plone. So if you need workflows, per-page and per-folder security, stuff like that. Don’t build anything like that yourself: it is already there in Plone.

Lightning talks

Updating your django site, the easy way (Dave de Fijter)

When you update your site, there are a lot of manual steps you need to do. That’s not DRY (don’t repeat yourself). Some options:

  • Shell scripts (but you still need ssh).
  • Fabric (pretty good, but not that friendly)
  • He proposes django-siteupdate. You can update your sites from the django admin. It keeps logs of your updates. Custom update scripts are possible, as is postprocessing. You see the output of your scripts in the django admin interface.

(I mentioned a djangocon talk about capistrano.)

Dynamic models in Django (Joeri Bekker)

Dynamic models are not defined in your python code. For instance, they’re created on the fly at runtime. You could create a flexible admin this way. Or you can dynamically prototype models without having python knowledge.

Or you can, before a syncdb, create dynamic copies of your old data models so you can still access them. Or, what he uses it for, generate additional models for localization.

How? Something like:

Person = type('Person', (models.Model,), {..field...})

In the end, it is more involved of course. Most of django’s machinery still works just fine with those classes. You can run syncdb just fine from within your code, creating the tables in your database.
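A runnable sketch of the three-argument type() call, without Django. The Model class below is only a stand-in for django.db.models.Model and make_model is a made-up helper name; real Django models need actual field objects and more setup.

```python
class Model:
    """Stand-in for django.db.models.Model, so the example runs without Django."""

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)


def make_model(name, fields):
    """Create a new class at runtime: type(name, bases, attributes)."""
    attrs = dict(fields)
    # Django's model machinery additionally expects a __module__ attribute.
    attrs["__module__"] = __name__
    return type(name, (Model,), attrs)


# The "fields" here are plain default values, not Django field objects.
Person = make_model("Person", {"first_name": "", "last_name": ""})
```

Note that the second argument must be a tuple of base classes and the third a dictionary of attributes.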

So: do you have a use case for this? He said there was a nice article about this subject on the Django website.

My App Engine has more horsepower than your Pony (Alper Cugun)

Alper’s been programming Django for 5 years. Recently he’s been working with google’s app engine.

Django is good for content-heavy websites and for complex websites that are composed of many apps.

App engine is made for scalability and availability.

The good things about app engine:

  • The database (nosql!) is great. SQL sucks. NoSQL doesn’t give you migrations, so you don’t have migration issues :-)
  • There’s a python script that deploys to google in no time. No configuration, only programming. It just works.
  • App engine comes with batteries included. Memcache and all sorts of other tools. Handy.
  • Google makes sure it is scalable and always available and safe.

The bad things:

  • No serious queries. No normalization. Partially bad, but it enables some of the good things.
  • There’s a 30 second deadline for everything. Something that takes too long doesn’t ever complete.
  • They are a bit focused on enterprisey stuff.

Some tips: use indexes, memcache and create json endpoints.

It is ideal for getting stuff up and running quickly. Instant gratification. He wouldn’t really tie his whole company to google at the moment.

Just a millisecond, achieving performance without ramping up the hardware (Rick van Hattem & Thierry Schellenbach)

They got way too much traffic to one of their sites (http://www.fashiolista.com/) as it got waaaay too popular.

A default approach is to cache pages in Django: the @cache_page decorator.

NGINX and memcached are also a great combination.

Their problem: most of the content is user-specific and thus uncacheable.

The solution: combination of SSI (server side includes) and javascript. Memcached stores an anonymous page and the javascript knows how to turn it into a personalized page. SSI mixes in a bit of data with personalization instructions into the page. Javascript then applies those instructions. This SSI mixed-in data is just a very small Django request.

He showed result statistics that were pretty sweet.

A bigger tutorial and code will be up soon at http://mellowmorning.com

Class based views (Roald de Vries)

His goal was to find an object oriented way to create reusable views for his project. Django had almost the same goal for its “generic views”.

He used the __init__() method and django is starting to use the __call__() method. The __call__() calls the actual methods that you can override in subclasses. The url pattern is a problem. Thread-safety is an issue; they’re still thinking about an optimal solution.

An advantage of his __init__ method is that the url pattern can stay the same. There are some disadvantages with subclassing and returning self.
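To show the __call__() flavour, here is a minimal sketch in plain Python. It is neither Roald's code nor Django's eventual API: the request is just a dictionary and all names are invented for the example.

```python
class View:
    """Callable view: __call__ dispatches to methods subclasses can override."""

    template = "Hello %(name)s"

    def get_context(self, request):
        return {"name": request.get("name", "world")}

    def render(self, context):
        return self.template % context

    def __call__(self, request):
        # No per-request state is stored on self, so one shared instance
        # can serve requests from several threads.
        return self.render(self.get_context(request))


class ShoutingView(View):
    """Subclass that overrides just one step."""

    def render(self, context):
        return super().render(context).upper()


# One instance per url pattern entry, created once at import time.
hello = View()
shout = ShoutingView()
```

The thread-safety issue shows up as soon as you store something request-specific on self: the single instance is shared by all requests.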

Search the django developer mailing list for “class based generic views in 1.3” to get the full discussion.

Eric Holscher: Getting the most out of your test suite

2010-05-26

Tags: django, djangocon

How to run tests? It differs per application. “setup.py test”, “nosetests”, “run_tests.py”, “manage.py test myapp”... So we need a standard way to run python tests. The answer is “setup.py test”. Problem: it is not ready yet. It is part of setuptools/distribute (not in distutils), it will be part of distutils2. When that lands, things will be a bit easier. Django has no support for “setup.py test” as it needs a settings file and so on.

Document how to run your tests, preferably in a shell script or so. Make it clear how to run the tests.

Run your tests after every commit, for instance in a continuous integration tool (Pony build (a simple one), hudson, buildbot, devmason). A continuous integration tool needs some script to set up the environment and run the tests. Use virtualenv and install your package through your setup.py so that also your setup.py is tested.

Hudson is the best continuous integration tool. Great plugin community. It is becoming the de-facto tool. Worthwhile if you have the machine resources. See Honza’s earlier talk which covers hudson, too. There’s a doesn’t-run-perfectly-yet one at http://hudson.djangoproject.com.

Handy plugins: cobertura, violations, irc. Get a coverage report by running coverage your_test_command. The code coverage plugin needs an xml report to work from. django-test-extensions helps set it up for you. django-nose also makes it easy.

Django Kong is a functional testing tool (a Django application, so it stores historical data in the database) that can check if your live website is still running by checking certain pages. It is amazing how many problems get caught this way. It is useful for finding the crazy mistakes. And it records the time every request takes, so you can get historical data on the speed of your site.
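The basic idea (check a page, record how long it took) fits in a few lines. This is only a sketch of the idea and not Kong's API; the fetch function is passed in so that the example runs without network access.

```python
import time


def check_page(fetch, url, expected_text):
    """Fetch a page, check its content and record how long it took.

    ``fetch`` is any callable that takes a url and returns the page body.
    """
    start = time.perf_counter()
    try:
        body = fetch(url)
        ok = expected_text in body
    except Exception:
        # A page that cannot be fetched at all counts as a failed check.
        ok = False
    duration = time.perf_counter() - start
    return {"url": url, "ok": ok, "seconds": duration}
```

Store the returned dictionaries somewhere (Kong uses the database) and you have your historical speed data.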

Tip: get a dashboard. (They use devmason). Display test results. Display CPU server usage statistics. You get an overview in one screen. (Personal note: wow, that all looked very neat. Also the UI and the interface workflow. Small “sparklines” here and there to give a quick historical overview. Got to make sure I find some screenshots to show internally in my company as we’re also working on a dashboard for some completely different use case.)

Devmason can serve as a reporting platform. It is open source.

Summary:

  • Make sure one single command will run your tests.
  • Get a continuous integration setup.
  • What you don’t measure, you don’t improve. So measure to get yourself to improve what needs improving.
  • Get some basic monitoring going for your sites.

Photo: Karl Marx Allee (just behind the conference location)

Benoît Chesneau: Relax your project with CouchDB

2010-05-26

Tags: django, djangocon

Benoît is one of the CouchDB developers and the maintainer of CouchDBkit and a couple of other packages.

Why go with a NoSQL database? Often our data isn’t 300GB and the data fits one machine. Scalability isn’t always needed. And regular SQL databases are often fast enough, right? Some reasons for NoSQL:

  • Different kinds of data. Often in Django you have lots of models.
  • Easy denormalisation. You no longer have to make your data match the relational normalisation theory.
  • Easy data migration.
  • Your data is safe. With enough budget and machines, you can keep your sql data safe. Apparently it is easier and cheaper with a NoSQL database.
  • Flexibility and simplicity.

Some CouchDB characteristics: map/reduce; REST interface; document oriented database; append-only; replication.

Document oriented, what’s that? It means schema-less json with attachments. So you can put any json you want into it, you don’t have to define a schema beforehand. There’s a web frontend called Futon to introspect your data (basically phpmyadmin for CouchDB).
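To get a feeling for schema-less documents and map/reduce, here is a plain Python sketch. CouchDB views are normally written in javascript and run inside the database, so this only illustrates the idea; the documents are made up.

```python
from collections import defaultdict

# Schema-less documents: not every document has the same keys.
DOCUMENTS = [
    {"type": "talk", "speaker": "Alice", "tags": ["couchdb", "nosql"]},
    {"type": "talk", "speaker": "Bob", "tags": ["testing"]},
    {"type": "photo", "tags": ["nosql"]},
]


def map_tags(doc):
    """Map step: emit a (key, value) pair for every tag in a document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def map_reduce(documents, map_function, reduce_function):
    """Group the emitted values per key and reduce every group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_function(doc):
            grouped[key].append(value)
    return {key: reduce_function(values) for key, values in grouped.items()}


TAG_COUNTS = map_reduce(DOCUMENTS, map_tags, sum)
```

The map function decides per document what to emit, so documents without a certain key are simply skipped instead of breaking a schema.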

There’s more. For instance sharding. And geocouch, full r-tree.

CouchDBkit is a CouchDB python framework. It is a simple client and it also comes with a django extension. The django extension allows you to define models (if you want) so that you also can generate forms for it.

For the next version, he wants admin integration, some new mappings (objects with annotations), eventlet support, multidb support.

Question: the current couchdb version is 0.11 or so. Should we wait for 1.0? Answer: just use the 0.11, it is stable.

Maciej Pasternacki: Deploying Django applications with Capistrano and supervisord

2010-05-26

Tags: django, djangocon, python

Should we only use python tools together with Django or can we use the best of both worlds? For instance, can we use ruby scripts and perl tools? Background: http://setjam.com, an online TV guide, is what he uses it all for.

Release management. They have weekly iterations: a hard pace. They go from devel to staging (where it sits for a week) to production. And everything is in git branches. Every new feature gets its own branch during development. Production and staging also have their own branch. Releasing new code is as easy as “git pull”.

Capistrano is a ruby tool for automating the deployment. Capistrano does separate, independent releases each in a separate directory. It is mature and very complete. And... you have transactions and rollbacks for your deployment!

Of course, there’s data you need to share between the various releases (database, uploaded images, etc). For that, capistrano uses a shared/ directory and a bunch of symlinks.
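The directory layout with separate releases, a shared/ directory and symlinks can be sketched in Python as follows. Capistrano itself does this in ruby and does a lot more; the function names are mine and the sketch assumes a filesystem that supports symlinks.

```python
import os


def deploy(base_dir, release_name):
    """Create releases/<release_name> and point the 'current' symlink at it."""
    release_dir = os.path.join(base_dir, "releases", release_name)
    shared_dir = os.path.join(base_dir, "shared")
    os.makedirs(release_dir)
    os.makedirs(shared_dir, exist_ok=True)
    # Data that must survive releases lives in shared/ and is linked in.
    os.symlink(shared_dir, os.path.join(release_dir, "shared"))
    switch_current(base_dir, release_name)
    return release_dir


def switch_current(base_dir, release_name):
    """Repoint 'current'; calling it with an older release is a rollback."""
    target = os.path.join(base_dir, "releases", release_name)
    current = os.path.join(base_dir, "current")
    temporary = current + ".tmp"
    # Create the new link next to the old one and rename it over the old one,
    # so 'current' never points to nothing.
    os.symlink(target, temporary)
    os.replace(temporary, current)
```

As every release keeps its own directory, rolling back is nothing more than pointing the symlink at the previous release again.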

Maciej made capistrano-offroad, a package of capistrano utils. One thing that it does is to reset the rails-like defaults. And there are django and supervisord recipes. He wants and needs to improve it, though.

Capistrano allows you to specify dependencies: so you can say you need PIL to be installed. There are certain dependencies that he doesn’t want to handle on the OS level, as several sites might need a different version. In the end, he went with regular Makefiles to compile upstream packages that partially come directly from version control systems and that also sometimes have custom patches applied to them. The makefile uses timestamp files for dependencies and finished tarballs are stored on S3. A drawback of Makefiles is that there’s quite some magic going on behind the scenes, so you need to understand what’s happening.

Process management: supervisord. Supervisord (built in python) is the best tool for handling processes. Extremely reliable. It is easy to use locally. It manages foreground processes, which is more robust than handling background processes. And... it is easy to configure with a .ini file.

Scaling in the cloud. Keep the central server count (for instance DNS server) down to a minimum. Symmetrical processes: configure the other servers to do mostly the same, that way it is easier to add extra servers. And queue jobs. django-sqs (simple queuing service) makes it easy to queue jobs.

Logfiles are written to a central mysql database and phplogcon is used to browse and filter the logfiles. Also a mail is sent with a summary of the log messages, for instance a list of the top 20 tracebacks.

Future work:

  • Monitoring and status. Nagios or so?
  • Rake instead of Make.
  • Polish and gemify capistrano-offroad.

More info and slides will be at http://www.setjam.com/blog

Wednesday lightning talks

2010-05-26

Tags: django, djangocon

Jan Lehnhardt: mustache

Mustache is a template language that works in almost every programming language. The programming work is done in the programming language: there’s a clear separation of concerns.

http://mustache.github.com

Some examples:

{{variable}}
{{{unescaped_variable}}}
{{!comment}}

{{#section}}
  {{section_variable}}
{{/section}}
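To show what the first three tags do, here is a toy renderer in Python. It is my own sketch and not one of the real mustache implementations: it only handles escaped variables, unescaped variables and comments, and leaves sections alone.

```python
import html
import re

# Triple braces first, then comments, then regular (escaped) variables.
TAG = re.compile(r"\{\{\{(\w+)\}\}\}|\{\{!.*?\}\}|\{\{(\w+)\}\}")


def render(template, context):
    """Replace mustache-style variable tags with values from context."""

    def replace(match):
        unescaped, escaped = match.group(1), match.group(2)
        if unescaped is not None:
            return str(context.get(unescaped, ""))
        if escaped is not None:
            return html.escape(str(context.get(escaped, "")))
        return ""  # A {{!comment}} renders as nothing.

    return TAG.sub(replace, template)
```

The template itself contains no logic at all: everything that needs programming happens before the context is handed to the renderer.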

Jan Lehnhardt: CouchDB

You can use CouchDB directly from your browser without a web server! CouchDB has a REST interface over http, so you could put your html in the database and call it directly from the browser.

CouchDB has replication... So you could do a distributed flickr or facebook this way... FAST response times!

Russell Keith-Magee: django is a server side framework

Django is a server side framework, so we don’t do ajax ourselves. But we do have forms and thus widgets. And we don’t really provide good examples on how to use it.

Lots of questions on django-users are “how do I plug in this or that calendar widget?”. There are all sorts of snippets lying around ready for using. But it isn’t really pluggable.

Html5 is great and it is going to change the landscape. But we’re not going to support it directly.

What he wants to get going: “Ray’s widget exchange”. It is designed to be a set of applications that explicitly makes client-side decisions. It will include ajax.

There’s already an empty repository. http://bitbucket.org/freakboy3742/django-rays Do what you can. Help out! Code skills, graphical skills, javascript skills.

Ville Säävuori: patches welcome

We’re all contributors to Django, just by being here.

You can contribute in loads of ways. Triage tickets. Answer questions on the mailing list. Write tutorials or documents. It is not just about the code. Arrange local meetups.

Btw: they’re trying to arrange pycon Finland later this year: http://fi.pycon.org

Remco Wendt: demoserver management command

A new management command: demoserver. They wanted a simple command for showing a quick demo.

It runs runserver, but beforehand it loads all fixtures called demo_*. And it starts a special middleware which automatically logs you in.

Soon to be released.

Btw: next Dutch django meetup is on 9 June. http://wiki.python.org/moin/DjangoMeetingNL

Horst Gutmann: Localized documentation

A German documentation translation project started before 1.0 was out. People ask for it. Other languages do it, too.

Problem: translating the whole documentation takes too much time. contributing.txt has 1200 lines of text, that is about a week of work. Combined with all the other documentation it can take a year or so.

  • Do we need everything? “No” Tutorials, howtos and contributing.txt are probably enough. We don’t need the references and detailed release notes.
  • Who maintains it after translating?

Are there alternatives for the documentation? Yes, smaller focused articles. Django advent worked really well.

Lukasz Dobrzanski: Continuous performance testing

See his slides.

You want to see how well your code performs over time. Did a certain commit improve or worsen the performance? We need a simple tool for that. Integrate it with your continuous integration tools and show graphs.
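The measuring part can be sketched with the standard library's timeit. This is not the tool from the talk; the function names and the 10% tolerance are assumptions for the example.

```python
import timeit


def measure(func, repeat=5, number=1000):
    """Return the best time in seconds for ``number`` calls of func."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


def compare(baseline_seconds, current_seconds, tolerance=0.10):
    """Classify a measurement against the previous commit's baseline."""
    change = (current_seconds - baseline_seconds) / baseline_seconds
    if change > tolerance:
        return "slower"
    if change < -tolerance:
        return "faster"
    return "unchanged"
```

Run measure() after every commit in your continuous integration tool, store the number and you have the data for the graphs.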

The tool: performatic. Example: http://graphs.mozilla.org

Jacob Kaplan-Moss, Eric Holscher, Idan Gazit: Hidden hires

Hiring in the open source community is different than in most circumstances. A commit count is more important than everything else on a resume for many people. Reputation.

http://hiddenhires.com/ tries to connect the best django companies to the best django developers. Some informal reputation-based system. Filter out the best people and companies. It ought to work much better than regular recruiters and regular stacks of resumes.

Konrad Delong: Mockity mock mock, a little love for the mock library

Most people test, some people use a mocking framework.

There are two approaches to mocking. The first: prepare an object that expects a certain behaviour. The other is to have a generic object that remembers everything and then afterwards you ask what happened and check that.

http://python-mock.sourceforge.net/ is of the second kind.

(He gave a quick demo).
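The second approach (record everything, ask afterwards) looks roughly like the sketch below. I use unittest.mock from Python's standard library here as a stand-in, which is an assumption: it may not be the same library as the one in the demo. The notify function and the mailer are made up.

```python
from unittest import mock


def notify(mailer, address):
    """Code under test: sends one welcome mail."""
    mailer.send(to=address, subject="Welcome")


# The mock remembers every call made on it...
mailer = mock.Mock()
notify(mailer, "user@example.org")

# ...and afterwards you ask what happened and check that.
mailer.send.assert_called_once_with(to="user@example.org", subject="Welcome")
```

No expectations are set up beforehand: the generic mock object accepts any call and the checking happens at the end.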

It also contains mocks for stdin! (Note to self: test it!)

Reinout van Rees: zest.releaser

I myself showed zest.releaser, a handy tool for getting rid of all the manual work in releasing a package:

  • Removing the ‘dev’ marker from the version string in your setup.py.
  • Recording the release date in your changelog.
  • Committing those changes.
  • Tagging the release (svn, hg, git, bzr).
  • Incrementing the version number and adding a ‘dev’ marker (so from 1.0 to 1.1dev).
  • Adding a new header in the changelog.

All that’s done automatically (well, it asks for your permission after showing the changes) by zest.releaser. Many people in the zope/grok/plone community use it, so it ought to work just fine in the Django community too :-)
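The two version number steps are simple string manipulation. The sketch below only illustrates the idea with made-up function names; it is not zest.releaser's actual code, which handles more version formats.

```python
def release_version(version):
    """Strip the 'dev' marker: '1.0dev' -> '1.0'."""
    if version.endswith("dev"):
        version = version[: -len("dev")]
    # Also cope with a '1.0.dev' style marker.
    return version.rstrip(".")


def next_dev_version(version):
    """Increment the last number and add 'dev': '1.0' -> '1.1dev'."""
    parts = version.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts) + "dev"
```

So a release goes from 1.0dev to 1.0 (which gets tagged) and then on to 1.1dev for further development.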

Oh, and it has 95%+ code coverage.

Jonas Obrist: a quick look at django-cms

People said they missed the high-level overview after the earlier talk about django-cms.

So Jonas gave a demo of http://django-cms.org :-)

Benny Daon: django gov

Benny started a google group for discussing open government projects and apps.

http://groups.google.com/group/django-gov

Jörg Kress: Red square

2010-05-26

Tags: django, djangocon

An enterprise success story. He works for sky tec. Red square was made for BMW. So he won’t show real production data as Audi and friends might be watching :-)

Red square is for all employees of BMW to share ideas. All. Regardless of position in the company. The idea came from senior management, so having senior management support for the site was not a problem. The project manager made the choice for Django and he was high enough in the hierarchy to make that choice.

Red square is an anonymous social network for idea creation and refinement. The goal is focused open innovation. The problem with open innovation is often lots of noise and not a lot of signal. It is for employees of BMW, so it is focused. In a big company like BMW, cracking silos, so cracking the walls between the various departments, is a problem you have to solve. And you need barrier-free communication, so anonymity is important as nobody will contradict the boss if he posts a wrong idea as he’s the boss.

Brainstorming: you need an uninterrupted flow of ideas. So they don’t allow negative ratings on ideas, only positive. Only comments on ideas can be downrated.

It is quite an unexpected application. “How did you get that approved?”:

  • It was a pilot.
  • It was allowed to fail.
  • It stayed focused.
  • Backed by senior management.
  • It used as much existing infrastructure as possible.

Why Django?

  • Good for iterative development?
  • Provides sustainable imperfection :-) You can keep parts of the system in an imperfect state and it doesn’t fall over.
  • Good enough.
  • Coherent.

Success factors:

  • Happy users! We listen to them.
  • No major fails.
  • Trusted by users.
  • Trusted by the departments!
  • They could run on a shoestring-budget.
  • Focus. Reluctant to add new features.

They can measure their success: other companies also want it. BMW wants more of it. Users demand other systems to work just like red square.

Thanks a lot to all Django developers!

Idan Gazit: Design for developers

2010-05-26

Tags: django, djangocon, plone

Why does he care about telling us about design? He can’t teach us design in half an hour. What he can do is tell us about the design that we can do something about. Design is just one of the things that you should know a little bit about. Just like you must know a bit about databases, optimization, caching, deploying, testing and so on, you must know a bit about design. As a programmer. Don Norman quote: attractive things work better.

The goal is sucking less. You’re not going to be a great designer. A good way of sucking less is doing less. Be minimal. You simply have fewer opportunities to suck. It is also a bit of a trend nowadays to do more with white space and to put less on a page. “Minimalist design”.

He showed some examples. Just a few colors. A solid background. A couple of 1px dashed lines to add structure.

Create a visual hierarchy. Try to steer the visitor. Some things are more important, some are less important. Add contrast for that. Add discoverability.

Put your content front and center. That’s the most important part. Squint your eyes and look at your page: can you still make out the main content block? Also: give your content room to breathe. So add white space between things. It helps set them off.

Contrast (color, or bold, or upper/lowercase) helps show your hierarchy. Emphasize (big, color) and de-emphasize (smaller and a bit grayer, for instance).

Play with it. Steal ideas. Keep a folder with screenshots of (parts of) pages that you like. You’ll get better with time.

Design with a grid. You don’t need a unique structure every time. You can use one of the available css frameworks, for instance http://960.gs/ or http://www.blueprint.org/. Such a grid seems to fit our brain and it almost automatically looks OK. Just go with a grid and start with pencil and paper.

Typography is not only font selection. It also means laying out your letters on a page. There’s loads of best-practices from our 574 years of typesetting, but most of that got forgotten when we moved to the web. There’s more attention lately. Some things NOT to do:

  • Lines are normally set “solid”. Set a line height of 1.3em to 2.0em so that the lines are set a little bit apart. This is the biggest issue on most websites.
  • Don’t make the columns too wide (and thus the lines too long). Rule of thumb: make your line about two alphabets (a-z a-z) long.
  • 12 pt fonts. That’s ok for books, but a computer screen is further away, so 16pt is better.
  • Too many fonts. Rule of thumb: just 2 or 3 typefaces, maximum. A good guide is a sans-serif (“helvetica”) for titles and serif for body text (for instance Georgia).

Two highly recommended links: http://typographyforlawyers.com and http://webtypography.net .

Most browsers (including IE since version 4.0) allow web fonts. You can use more than those 8 standard fonts now. You can DIY (for instance with the FontSquirrel font-face generator) or you can go with a hosted solution like http://typekit.com/ (note that google also jumped into this lately, together with typekit, so those fonts are now hosted using google’s CDN).

Color. Choosing a good color scheme. Color theory is hard, it is a lot like music theory. Watch out for localization as colors have different meanings: in the west, white means purity, in the east it often means death...

If you don’t know what to do, go with a monochromatic scheme. White/black plus red, medium red, light red.

You can go with complementary (colors on the opposite sides of the color wheel) and many others.
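Both schemes are easy to compute with Python's colorsys module. This is a minimal sketch; the function names are mine and colors are (red, green, blue) tuples with values between 0 and 1.

```python
import colorsys


def complementary(rgb):
    """Return the color on the opposite side of the color wheel."""
    hue, lightness, saturation = colorsys.rgb_to_hls(*rgb)
    return colorsys.hls_to_rgb((hue + 0.5) % 1.0, lightness, saturation)


def monochromatic(rgb, steps=3):
    """Return tints of one hue, from the color itself towards white."""
    hue, lightness, saturation = colorsys.rgb_to_hls(*rgb)
    tints = []
    for index in range(steps):
        # Move part of the remaining distance to white (lightness 1.0).
        lighter = lightness + (1.0 - lightness) * index / steps
        tints.append(colorsys.hls_to_rgb(hue, lighter, saturation))
    return tints
```

monochromatic((1.0, 0.0, 0.0)) gives you red, medium red and light red; complementary() of red is the cyan on the other side of the wheel.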

Two suggested sites for color schemes: http://colourlovers.com and http://kuler.adobe.com .

Steal and learn. Look at http://patterntap.com/ or sites like that to get design ideas. And read a free book http://designingfortheweb.co.uk/book .

Rules are meant to be broken. Just don’t break too many at the same time.

Question: stealing? What are the legal issues? Answer: I didn’t mean steal steal, I meant lifting ideas. Don’t copy someone’s layout wholescale. Change things.

Question: what about color blindness? Answer: hire an expert, this is hard. Colors are hard. There’s biology, culture, ...

NoSQL panel

2010-05-26

Tags: django, djangocon, python

Question: do you want to support special features of NoSQL DBs in the ORM even if relational databases don’t support those features?

  • Yes.
  • Yes.
  • You choose a NoSQL database for its specific features, so those features need to be supported.
  • Warning: there won’t be a common set of features that lets you move a DB transparently from one NoSQL db to another. The API will work the same way, but you will probably have to do some specific stuff.

Question: How many if any of the various NoSQL databases should Django support?

  • All of them. Even filesystem or RCS...
  • Try to find commonality. Querying will probably have to be specific per DB.
  • Will there be an SQL for NoSQL databases? Relational databases existed before SQL was there...
  • Lots of talking on “what are common ways of querying?”. Javascript could end up as the default query language.
  • For someone knowledgeable in SQL, mapreduce is a brainwreck.

Question: should the Django ORM try to emulate JOINs on a NoSQL DB?

  • JOINs aren’t used all that often. So it makes no real sense to emulate things like that.
  • Just tell everyone that it is limited support. Manage expectations, then they’re not going to complain that loudly. Just return a UseADifferentDatabaseBackend traceback.
  • On some of these NoSQL databases, you could emulate JOINs. So people will try it.
  • What are the write operations like? SQL loses some speed in reading, but is efficient in writing. NoSQL seems to be the other way around with, depending on the database, weird hacks and caveats regarding writing.
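
To make "emulating a JOIN" a bit more concrete: on a store that only supports lookups by key, the join has to happen in application code, with one extra lookup per row. The sketch below is only an illustration under my own assumptions. Plain dictionaries stand in for a real NoSQL database, and both the `emulated_join` helper and the exception class (named after the joke above) are hypothetical, not part of Django.

```python
class UseADifferentDatabaseBackend(NotImplementedError):
    """Hypothetical error for operations a backend refuses to emulate."""


# Two "collections" keyed by primary key; plain dicts stand in for a real store.
authors = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
posts = {
    10: {"title": "Hello", "author_id": 1},
    11: {"title": "NoSQL", "author_id": 2},
    12: {"title": "Orphan", "author_id": 3},  # no matching author
}


def emulated_join(rows, related, foreign_key, allow_emulation=True):
    """Yield (row, related_row) pairs, like an inner JOIN done client-side."""
    if not allow_emulation:
        raise UseADifferentDatabaseBackend("this backend does not support JOINs")
    for row in rows.values():
        match = related.get(row[foreign_key])  # one extra lookup per row
        if match is not None:
            yield row, match


joined = [
    (post["title"], author["name"])
    for post, author in emulated_join(posts, authors, "author_id")
]
```

The extra lookup per row is exactly the cost that makes people say "manage expectations": it works, but it will not perform like a JOIN inside a relational database.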

Question: what sort of API should Django have for a pure key/value store like Redis?

  • Map to dictionaries?
  • Nothing.
  • Allow a bit of javascript. Allow that to be plugged in by whoever uses an “ORM”-like mapper. That provides lots of (necessary) customization.
  • There are a lot of new ways of querying in NoSQL that someone new is not immediately familiar with. So ask around: for most problems there’s an existing simple solution.
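
The "map to dictionaries" answer could look roughly like the sketch below. It is an assumption-laden illustration, not an existing Django or Redis API: `InMemoryClient` is a hypothetical stand-in for a real key/value client, and `KeyValueDict` simply serializes values to JSON so that the store can be used like a normal Python dictionary.

```python
import json
from collections.abc import MutableMapping


class InMemoryClient:
    """Hypothetical stand-in for a key/value client that stores strings."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def keys(self):
        return list(self._data)


class KeyValueDict(MutableMapping):
    """Expose a key/value store as a dictionary of JSON-serializable values."""

    def __init__(self, client, prefix=""):
        self.client = client
        self.prefix = prefix  # namespace, so several dicts can share one store

    def __getitem__(self, key):
        raw = self.client.get(self.prefix + key)
        if raw is None:
            raise KeyError(key)
        return json.loads(raw)

    def __setitem__(self, key, value):
        self.client.set(self.prefix + key, json.dumps(value))

    def __delitem__(self, key):
        if self.client.get(self.prefix + key) is None:
            raise KeyError(key)
        self.client.delete(self.prefix + key)

    def __iter__(self):
        for full_key in self.client.keys():
            if full_key.startswith(self.prefix):
                yield full_key[len(self.prefix):]

    def __len__(self):
        return sum(1 for _ in self)


store = KeyValueDict(InMemoryClient(), prefix="session:")
store["user42"] = {"visits": 3}
```

Because `MutableMapping` fills in the rest (`in`, `.get()`, `.items()`, `.update()` and so on), only five methods need to talk to the actual store.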

Question: how are other high-level web frameworks approaching this issue?

  • Nobody has done it yet. Rails isn’t doing it. SQLAlchemy said “which part of SQLAlchemy didn’t you understand?” :-) (Update: see Mike Bayer’s comment below)
  • We call it a “persistence layer for models” instead of an “ORM”. We have a shitty ORM but a good persistence layer for models. So by accident we’re pretty well positioned to at least attempt to support NoSQL as NoSQL is “just” another way of persisting models, technically.

Question: what about focusing on integrating with the forms, generic views and so on first and only worry about the ORM (and corner cases) later?

  • That’s basically how we normally work.
  • The first support for NoSQL in a release will be very basic. The next release will probably mostly have various fixes and cleanups all over the place for the benefit of NoSQL. So no big commit messages “NoSQL support”, but behind-the-scenes work.
  • We need NoSQL support NOW. Model forms support for NoSQL can wait. First the basic stuff. Something like i18n pages are way simpler with schema-less NoSQL databases with their sub-structure support.

Question: one of Django’s strong points is the number of reusable applications. What about getting them to work transparently with NoSQL backends?

  • Try it out from application to application and see what comes out.
  • Piston helped to get not an object, but a dictionary out of django. That was the answer in that case, as a dictionary can be stored just fine in NoSQL.
  • Switching won’t happen. It is a choice that you make in the beginning. Switching from MySQL to PostgreSQL can already be a pain. And I think something big like Satchmo won’t ever be able to support both SQL and NoSQL.
  • Class-based views (instead of view methods) would help a lot. It makes it much easier to selectively override/fix parts of a view (in order to support NoSQL in this case).
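
Why class-based views would help is easiest to see in a small sketch. The classes below are hypothetical and deliberately do not use Django at all: the point is only that when a view is a class with one method per step, a NoSQL variant overrides the data access step and inherits everything else.

```python
class ListView:
    """Hypothetical class-based view: each step is a separate method."""

    def get_objects(self):
        raise NotImplementedError

    def render(self, objects):
        return ", ".join(sorted(obj["title"] for obj in objects))

    def __call__(self, request):
        return self.render(self.get_objects())


class SqlPostList(ListView):
    def get_objects(self):
        # Stand-in for an ORM query against a relational database.
        return [{"title": "first"}, {"title": "second"}]


class NoSqlPostList(SqlPostList):
    # Stand-in for documents fetched from a NoSQL store.
    documents = {"a": {"title": "first"}, "b": {"title": "second"}}

    def get_objects(self):
        # Only the data access step is overridden; rendering is inherited.
        return list(self.documents.values())
```

With a view function you would have to copy the whole function to change that one step.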

Question: realistically, in which version can we expect some sort of NoSQL support?

  • Not in 1.3. We already announced that it would be a quick short release with mostly bugfixes and cleanups.
  • We’ll need much testing with NoSQL support. At least 3-4 months of testing after a feature freeze. So if we’d want it in 1.3, 1.3 would only come out 1.5-2 years from now and that’s waaaaaay too late. So probably 1.4.
  • There is code around that helps you to already get it working. So there’ll be code in 1.3 to support that better. But the real full NoSQL support will come later.

Question: what about keeping NoSQL backends out of the core and focusing instead on making adding backends easier?

  • Rails already did this.
  • We’re thinking about cutting the core up a bit and decoupling the release cycles of the parts.
  • It is quite easy to make a backend! The backend API is very good and friendly. Those who made that API did a great job. In that sense, it is a solved problem.

Question: how to keep model forms and so on working with the new databases?

  • We’ll lock Alex up in a room and wait for him to solve it.
  • We’re waiting for Simon to post more complete thoughts to the mailing list.
  • Mongokit already has some model forms support.
  • Various people are already working with NoSQL in django. So it is plainly possible.

Michael P. Jung: Efficient Django hosting for the masses

2010-05-26

Tags: django, djangocon, python

Hosting a huge number of php sites is no problem for a hoster. It gets a bit more complicated when django sites are involved. He’s co-founder of http://pyrox.eu that tries doing django hosting The Right Way (fully replicated python hosting).

Mass hosting means many websites from different users sharing CPU/disk/memory resources. Suitable for small and medium sites. So “mass” hosting, not “massive” hosting.

Python implications: WSGI applications are long-running processes so they consume memory 24/7, for instance. For developers, going with a mass hosting solution should mean reduced costs and less maintenance (no OS level upgrades). Being developer friendly is important. SSH access, svn/git/hg/bzr tools must be available.
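
For those who haven't looked at WSGI itself: the sketch below is my own minimal illustration (not from the talk) of why such an application consumes memory all the time. The application is just a callable that the container keeps loaded, so module-level state, here a hypothetical request counter, survives from one request to the next. The `call` helper only imitates what a container does, using the standard library's wsgiref.

```python
import itertools
from wsgiref.util import setup_testing_defaults

# Module-level state is created once per process and survives between
# requests, which is why an idle WSGI process still occupies memory.
request_counter = itertools.count(1)


def application(environ, start_response):
    """Minimal WSGI application: a callable that the container keeps loaded."""
    body = f"request number {next(request_counter)}\n".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(app):
    """Invoke the app once, roughly the way a WSGI container would."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a fake request
    statuses = []
    chunks = app(environ, lambda status, headers: statuses.append(status))
    return statuses[0], b"".join(chunks)


first = call(application)
second = call(application)
```

The second call sees the counter where the first call left it. A php script, in contrast, starts from scratch on every request, which is what makes php so cheap to host in bulk.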

A good efficient stack is apache with mod_wsgi and daemon processes. That last bit is essential. Excellent performance due to persistent apps. But restarting apache (for a config change) means re-starting all wsgi processes. It is apache after all :-)

The best stack would be a webserver separate from the wsgi container. So he searched for the perfect wsgi container. Low memory usage and simple configuration please.

He also wants threads. The normal response is “just use more processes”. But the application might require quite some memory. And firing up a couple of extra processes costs memory.

He looked at many wsgi containers, but in the end mod_wsgi was pretty good after all, provided you switched off as many apache modules as possible. Note: apache is only used as a wsgi container, there’s a separate nginx web server in front!

Eventual setup. Web server: nginx. Wsgi container: apache+mod_wsgi. And a separate apache (with the FollowSymLinks option switched off) for the static files, as nginx follows symlinks by default, which is a security risk.

Replication: database, filesystem. Of course you have to use RAID. Backups are important, but they are an entirely different story. You don’t want to wait hours for a restore to finish: that’s why you have replication. HAProxy and nginx deal with failover well enough.