Migrating django(-cms) from sqlite to postgres

Tags: django

We’ve got a django-cms site that started out on sqlite. Now we’re adding GIS shapefiles so we wanted to move to postgres/postgis. Ok, switch over the DATABASES definition in settings.py, make a fixture and load it, right? Ahem, no.

There were a couple of problems, partially with our code, partially django-cms, partially django. Some solutions below might not be needed in every case, but I’m just going to list everything in the hope that it helps someone.

South migration dependencies

Lots of our code uses south for database migrations. But when creating our database from scratch, “migrate” would complain that a certain table of application A didn’t exist yet when migrating application B. Uh oh, a dependency.

Turns out that’s easily solved. In the failing migration step, just add a dependency on the missing application’s relevant step:

...
class Migration(SchemaMigration):

    depends_on = (
        ("application_a", "0001_initial"),
        )

    def forwards(self, orm):
        ...

After that, all migrations run fine.

Small sqldiff limitation

We’ve got django-extensions in our projects. One of the handy extra management commands it provides is “sqldiff” (so manage.py sqldiff or, as we use buildout, bin/django sqldiff). (The best extension is “graph_models” which creates a picture of your model structure).

We got some complaints when importing a fixture. In the end it turned out to be that someone somehow copied an old and wrong fixture instead of a fresh one. But we suspected that some older migration had mis-fired. “sqldiff” will tell you the difference between the current database schema inside the actual database and what it should be according to django.

We got a bunch of missing fields in some custom user model inside django-cms! What! Faulty south migrations? What?

In the end, it turned out to be a false alarm. Django-cms’s user model inherits from django’s own user model and adds one field. In the database this ends up as a small table with a pointer to a corresponding django user item’s ID and the one single extra field. But sqldiff complains about all the regular django user model fields being missing :-)

Note: I don’t fault django-extensions very much here. They note it is an experimental feature and we just happened to hit a corner case. Most of the time it is a real helper!

IntegrityError: duplicate key value violates unique constraint

See for instance a stackoverflow question and especially Mark van Lent’s recent explanation of what can happen.

There are two basic ways, I think, such an IntegrityError can crop up when loading a fixture:

  • You’re loading them out of order or with gaps in the primary key IDs. This can mess up the automatic ID generation.

  • You’ve really already got an item with that ID in the database and the complaint is right.

IntegrityError solution 1: natural keys

See django ticket #7052. Solved. Solved, that is, when the model that gives you a problem supports so-called natural keys. Basically, your fixture doesn’t point at a user with primary key “54”, but at a user with username “pietje”, whatever the ID.

This takes away a too-hard dependency on primary key ID numbers. To get it going in your fixture, add the --natural flag:

$> bin/django dumpdata --indent=2 --natural > all.json

IntegrityError solution 2: make sure there are no duplicates

One of our problems was that django’s content_type table was already pre-filled with several items after the regular “syncdb”. Quite sensibly. But it conflicted when loading our fixture. (It might be that the natural key solution mentioned above would have solved this, but we tried it in a different order).

The solution: remove those content types from the table before loading the fixture. The fixture is a full fixture of everything in the database, so that’s OK.

In your manage.py dbshell or bin/django dbshell, run:

truncate django_content_type cascade;

IntegrityError solution 3: split the fixture

Django-cms’s cms.placeholder items continued to give grief no matter what. Perhaps they ought to start to support the natural key solution? Not sure if that would help, but I guess so.

The solution? Grab the whole exported fixture and split that all.json into two json files, one with only “cms.placeholder” items and one without them. (Make sure the last dict-like item in the json should be without a closing comma!

all_only_placeholders.json:

[
  {
    "pk": 1,
    "model": "cms.placeholder",
    "fields": {
      "slot": "leftbar",
      "default_width": null
    }
  },
  {
    "pk": 2,
    "model": "cms.placeholder",
    "fields": {
      "slot": "content",
      "default_width": null
    }
  },
  ...
  }
]

all_without_placeholders.json:

[
  # Right, everything apart from cms.placeholder :-)
  ...
  ]

After that, load the data:

$> bin/django loaddata all_only_placeholders
$> bin/django loaddata all_without_placeholders

Closing comment

In the end, we got it running. I hope someone gets a good tip out of this entry when faced with a similar problem.

One django advantage I noticed: everything is visible. You can always inspect the source code. “What exactly is in that one model that’s giving me grief?” “Which migration steps are available?” Etc.

The advantage of open source!

Unrelated model railway photo. 'pays perdu' layout.
 
vanrees.org logo

Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.

Weblog feeds

Most of my website content is in my weblog. You can keep up to date by subscribing to the automatic feeds (for instance with Google reader):