UnicodeDecodeError in Plone’s catalog

Tags: plone

For the second time in a few weeks I've been bitten by a UnicodeDecodeError in a collection (or smart folder or topic):

   Module Products.PluginIndexes.common.UnIndex, line 393, in _apply_index
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 9: ordinal not in range(128)

I ran into the error in the following situation:

  • A catalog index (say, a custom "organization" FieldIndex) contains both plain string and unicode values, for example both "zest software" and u"Universit\xe9 de Paris".
  • You have a criterion that selects Université de Paris.
  • You view the collection... boom!
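
To see where "byte 0xc3 in position 9" comes from, here is a minimal sketch. It assumes the failing value is the UTF-8 encoded form of the university name, and it spells out the ASCII decode that Python 2 performs implicitly when a byte string is compared or combined with unicode. The variable names are mine; nothing here is actual catalog code.

```python
# -*- coding: utf-8 -*-
# The UTF-8 encoded form of the organization name (assumed for illustration).
utf8_bytes = u"Universit\xe9 de Paris".encode("utf-8")

# Python 2 does this decode behind the scenes when a byte string meets unicode.
# Doing it explicitly gives the same error on Python 2 and Python 3.
try:
    utf8_bytes.decode("ascii")
    message = None
except UnicodeDecodeError as error:
    message = str(error)
    position = error.start

print(message)
```

"Universit" is nine ASCII characters, so the first byte of the encoded é (0xc3) sits at index 9, which matches the traceback.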

In my case, I parsed an XML file and the parser returned everything perfectly as unicode. Afterwards, I did some string processing on it, such as organizations = orgfield.split(",") to split a string on commas. The result was, surprisingly, a mix of normal strings (the entries containing only ASCII characters) and unicode (the university with the accented character).

The solution was to call organization = organization.encode("utf-8") on each value before handing it to Plone, so that the index only ever sees one type.
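
A rough sketch of that normalization, applied to the whole list at once. The helper name to_utf8 and the sample orgfield value are made up for illustration. The type check is there because on Python 2 calling .encode() on a byte string that already contains non-ASCII bytes triggers the same implicit ASCII decode and fails again.

```python
# -*- coding: utf-8 -*-
def to_utf8(value):
    """Return value as a UTF-8 byte string (hypothetical helper)."""
    if isinstance(value, bytes):
        # Already a byte string (plain str on Python 2): leave it alone.
        return value
    # Text (unicode on Python 2, str on Python 3): encode it explicitly.
    return value.encode("utf-8")


# Sample input, assumed for illustration only.
orgfield = u"zest software,Universit\xe9 de Paris"

# Every entry ends up as the same type before it goes into the catalog.
organizations = [to_utf8(org.strip()) for org in orgfield.split(",")]
print(organizations)
```

Whether you normalize to UTF-8 byte strings, as here, or to unicode matters less than picking one and applying it to every value that ends up in the index.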


About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.