Doctesting Windows filenames

Tags: python, nelenschuurmans

I wrote a function today for checking Windows filenames. Apparently Arcview (a GIS package) is picky about names, causing occasional errors. So for our turtle add-on for Arcview I got to code a function that checks the filenames beforehand (the actual knowledge of what’s allowed and what’s not comes from my colleagues, btw):

  • No spaces in the path or in the filename.

  • Only a-z, digits, underscores, dashes and dots are allowed. No other characters.

  • Oh, and also no two dashes or two underscores in a row.

  • The actual filename cannot be longer than 28 characters.
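The four rules above could be sketched roughly like this. This is not the actual turtle code, just a minimal illustration under a few assumptions: "a-z" is read literally (so uppercase is rejected), the 28-character limit applies to the last path component including its extension, and the error messages are made up. The `InvalidFilenameError` name is borrowed from the doctest further down; everything else is hypothetical.

```python
import ntpath
import re

MAX_FILENAME_LENGTH = 28
# Assumption: "a-z" is read literally, so uppercase letters are rejected.
ALLOWED_CHARS = re.compile(r'[a-z0-9_.-]+')


class InvalidFilenameError(Exception):
    """Raised when a path breaks one of the naming rules."""


def check_filename(path):
    """Raise InvalidFilenameError if ``path`` breaks a rule, else return None."""
    # ntpath applies Windows path rules on any platform; drop the drive part.
    rest = ntpath.splitdrive(path)[1]
    parts = [part for part in rest.split('\\') if part]
    for part in parts:
        if ' ' in part:
            raise InvalidFilenameError('%r contains a space' % part)
        if not ALLOWED_CHARS.fullmatch(part):
            raise InvalidFilenameError(
                '%r contains a character that is not allowed' % part)
        if '--' in part or '__' in part:
            raise InvalidFilenameError(
                '%r contains two dashes or underscores in a row' % part)
    # Assumption: the length limit covers the filename plus its extension.
    if parts and len(parts[-1]) > MAX_FILENAME_LENGTH:
        raise InvalidFilenameError(
            '%r is longer than %d characters' % (parts[-1], MAX_FILENAME_LENGTH))
```

With this sketch, `check_filename(r'c:\turtle\work\shape.shp')` returns silently, while `check_filename(r'c:\my turtle\shape.shp')` raises an `InvalidFilenameError`.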

To make sure my check covered all cases I of course wrote a test. In my case a doctest. That was surprisingly difficult as Windows filenames contain backslashes, and backslashes are used for escaping in Python (and many other languages): for instance, \t is a tab. So 'c:\turtle' turns into 'c:      urtle', sigh. The solution is either to escape the backslash itself with an extra backslash ('c:\\turtle') or to use a so-called raw string that doesn’t do any escaping (r'c:\turtle').
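A quick way to see the difference between the three spellings (the variable names are just for illustration):

```python
plain = 'c:\turtle'     # \t is interpreted as a single tab character
escaped = 'c:\\turtle'  # the doubled backslash yields one literal backslash
raw = r'c:\turtle'      # raw string: the backslash is left alone

print('\t' in plain)           # True: the string contains a tab
print('\\' in plain)           # False: no backslash is left
print(escaped == raw)          # True: both spell the same nine characters
print(len(plain), len(raw))    # 8 9
```

The plain version is one character shorter because the backslash and the t have collapsed into a single tab.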

In this particular case I placed the doctest inside the Python file. Normally I use a separate file as the doctest tends to get a bit long, but this was just a one-function module. This gave some problems:

  • I first added that r in front of the entire triple-quoted docstring so that I did not need escaping in the entire doctest. Hurray. Only, it didn’t work. The filenames and the output (error messages I’m testing) themselves were OK, but the actual check function got passed a tab when I passed 'c:\turtle'. Huh?

  • Turns out that I also had to make the string I passed into the checker function a raw string. Makes sense actually, as that’s literal Python code that gets executed. Here’s an example:

    r"""
    Start of the doctest.  Note the raw triple quoted doctest itself.
    
    The function call itself also has a raw string:
    
      >>> check_filename(r'c:\1turtle\work\shape.shp')
      Traceback (most recent call last):
      ...
      InvalidFilenameError: '1turtle' starts with a digit: 1
    
    """
    
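To see the double layer of escaping in isolation, here is a small self-contained sketch. The `has_tab` function is a hypothetical stand-in for the real checker; it only reports whether a tab ended up in the string. The docstring is raw in both examples, yet only the example with the inner r keeps its backslash, because doctest compiles each example line as ordinary Python source.

```python
import doctest


def has_tab(path):
    r"""Report whether ``path`` contains a tab character.

    Raw string inside the raw docstring: the backslash survives.

    >>> has_tab(r'c:\turtle')
    False

    Plain string inside the raw docstring: doctest compiles the example
    as ordinary Python source, so the escape becomes a tab after all.

    >>> has_tab('c:\turtle')
    True
    """
    return '\t' in path


# Parse and run the docstring's examples directly, without a test file.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    has_tab.__doc__, {'has_tab': has_tab}, 'has_tab', None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

Both examples pass, which confirms that the second call really did receive a tab instead of a backslash followed by a t.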

Hope this helps someone!

 
About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.
