This might sound stupid, but I ran into the infamous UnicodeDecodeError yesterday. The stupid thing was that it happened in my Django model’s __unicode__() method. Just a simple one:
    # -*- coding: utf-8 -*-
    from django.db import models


    class DataSet(models.Model):
        name = models.CharField('name', max_length=80, blank=True)

        def __unicode__(self):
            return u'<DataSet %s>' % self.name
Django promises to always return unicode from the database, so the u'<DataSet %s>' % self.name should be perfectly fine. No UnicodeDecodeErrors.
Something nagged in the back of my brain, so I added a test for it:
    class DataSetTest(TestCase):

        ... other tests ...

        def test_unicode(self):
            non_ascii_data_set = DataSet(name='täääst')
            non_ascii_data_set.save()
            self.assertTrue(unicode(non_ascii_data_set))
            # ^^^ Just testing that something non-error-like comes
            # out of it.
And the sorry results:
    Traceback (most recent call last):
      ...
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
What? So there’s no unicode coming out of the database?
What’s really happening: self.name held the byte string 'täääst' instead of a unicode string, because I put that byte string in there myself. When Python 2 formats a byte string into a unicode template, it first tries to decode it as ASCII, and that fails on the first non-ASCII byte. The core of the matter is that I created the object myself, instead of Django creating it when it pulls it out of the database.
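A minimal sketch of that mechanism, without Django. It assumes the name was stored as UTF-8 bytes (which is what the coding line at the top of the file implies) and spells out the ASCII decode that Python 2 performs implicitly. The helper name implicit_py2_decode is made up for this illustration.

```python
# UTF-8 bytes of the test name: 't', three times a-umlaut (0xc3 0xa4), 's', 't'.
raw_name = b't\xc3\xa4\xc3\xa4\xc3\xa4st'


def implicit_py2_decode(byte_string):
    """Hypothetical helper: the decode Python 2 does behind the scenes
    when a byte string is mixed into a unicode string."""
    return byte_string.decode('ascii')


try:
    implicit_py2_decode(raw_name)
    error_message = None
except UnicodeDecodeError as error:
    # Fails on byte 0xc3 at position 1, right after the 't'.
    error_message = str(error)

# Decoding with the codec the bytes were actually written in gives a
# proper unicode string, and then the formatting is harmless.
decoded_name = raw_name.decode('utf-8')
formatted = u'<DataSet %s>' % decoded_name
```

Here error_message ends up holding the same complaint about byte 0xc3 in position 1 as the traceback above, while formatted is built without any trouble.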
So: in normal usage, Django would pull the object out of the database and put everything into the object as unicode. To mimic that, I modified the (now too elaborate) test:
    class DataSetTest(TestCase):

        ... other tests ...

        def test_unicode(self):
            non_ascii_data_set = DataSet(name='täääst')
            non_ascii_data_set.save()
            id = non_ascii_data_set.id
            data_set_from_django = DataSet.objects.get(pk=id)
            self.assertTrue(unicode(data_set_from_django))
So the problem was only with the way I ran the test, and with my wanting to test the __unicode__() method in too elaborate a way.
(But I do want to test that method, as I’ve seen errors in there. And those errors only occurred in the rendering of error messages, so the actual error would be hidden behind a UnicodeDecodeError.)
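A less elaborate variant, as a sketch: hand the object a unicode literal right away, so the name is already unicode and no round trip through the database is needed. To keep the example runnable on its own, it uses a plain stand-in class (FakeDataSet, a made-up name) instead of the real Django model, and it calls __unicode__() directly. I'm assuming the same idea carries over to DataSet(name=u'täääst') in the real test.

```python
import unittest


class FakeDataSet(object):
    """Hypothetical stand-in for the Django model; no database involved."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u'<DataSet %s>' % self.name


class FakeDataSetTest(unittest.TestCase):

    def test_unicode(self):
        # u'...' literal: the name is unicode from the start, just like
        # the values Django hands back from the database.
        data_set = FakeDataSet(name=u't\xe4\xe4\xe4st')
        self.assertEqual(data_set.__unicode__(),
                         u'<DataSet t\xe4\xe4\xe4st>')
```

This only checks the string formatting. It does not check that Django really returns unicode from the database, which is what the more elaborate test above exercises.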
My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.