Django UnicodeDecodeError in a model’s __unicode__

Tags: django

This might sound stupid, but I ran into the infamous UnicodeDecodeError yesterday. The stupid thing was that it happened in my Django model’s __unicode__() method. Just a simple one:

# -*- coding: utf-8 -*-
from django.db import models


class DataSet(models.Model):
    name = models.CharField('name', max_length=80, blank=True)

    def __unicode__(self):
        return u'<DataSet %s>' % self.name

Django promises to always return unicode from the database, so u'<DataSet %s>' % self.name should be perfectly fine. No UnicodeDecodeErrors.

Something nagged in the back of my brain, so I added a test for it:

class DataSetTest(TestCase):

    ... other tests ...

    def test_unicode(self):
        non_ascii_data_set = DataSet(name='täääst')
        non_ascii_data_set.save()
        self.assertTrue(unicode(non_ascii_data_set))
        # ^^^ Just testing that something non-error-like comes
        # out of it.

And the sorry results:

Traceback (most recent call last):
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3
    in position 1: ordinal not in range(128)

What? So there’s no unicode coming out of the database?

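What’s really happening: I was getting the byte string 'täääst' instead of a unicode string when I accessed non_ascii_data_set.name, because I had just put that byte string in there myself. In Python 2, a plain 'täääst' literal in a UTF-8 source file is a byte string, and formatting it into u'<DataSet %s>' makes Python decode it with the ASCII codec first, which fails on the first non-ASCII byte. The core of the matter is that I created the object myself instead of Django creating it when pulling it out of the database.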
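To make the failure concrete, here is a small sketch that does explicitly what Python 2 does implicitly when a byte string is formatted into a unicode string. The variable names (raw_name, message, label) are made up for illustration and are not part of the model code above.

```python
# -*- coding: utf-8 -*-
# The bytes that a plain 'täääst' literal holds in a UTF-8 encoded
# Python 2 source file.
raw_name = u'täääst'.encode('utf-8')

# Python 2 decodes byte strings with the ASCII codec when they are
# mixed with unicode. Doing that decode by hand shows the same error.
try:
    raw_name.decode('ascii')
    message = None
except UnicodeDecodeError as error:
    message = str(error)

# Decoding with the right codec first makes the formatting safe.
label = u'<DataSet %s>' % raw_name.decode('utf-8')
```

The error text in message names the same byte (0xc3) and position (1) as the traceback above: the "ä" right after the "t" is the first byte that ASCII cannot handle.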

So: in normal usage, Django would pull the object out of the database and put everything into the object as unicode. To mimic that, I modified the (now too elaborate) test:

class DataSetTest(TestCase):

    ... other tests ...

    def test_unicode(self):
        non_ascii_data_set = DataSet(name='täääst')
        non_ascii_data_set.save()
        # Fetch the object again so Django fills in the fields as unicode.
        pk = non_ascii_data_set.pk
        data_set_from_django = DataSet.objects.get(pk=pk)
        self.assertTrue(unicode(data_set_from_django))

So the problem was only with the way I ran the test, and with my wanting to test the __unicode__() method in an overly elaborate way.
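One simpler option, which is not from the test above but follows from the explanation, is to hand the object a unicode literal in the first place, so no round trip through the database is needed. A minimal sketch, assuming a hypothetical plain-Python stand-in class (FakeDataSet) instead of the real model so that it runs without Django:

```python
# -*- coding: utf-8 -*-


class FakeDataSet(object):
    """Hypothetical stand-in for the DataSet model, no Django involved."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u'<DataSet %s>' % self.name


# Note the u prefix: the name is unicode from the start, so there is
# no implicit ASCII decoding when it is formatted.
data_set = FakeDataSet(name=u'täääst')
text = data_set.__unicode__()
```

With the real model, the equivalent change would be DataSet(name=u'täääst') in the test.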

(But I do want to test that method, as I’ve seen errors in there. And those errors only occurred when rendering error messages, so the actual error would be hidden behind a UnicodeDecodeError.)


Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.
