(One of my summaries of the May 2023 Dutch PyGrunn conference).
You might need files for testing. Real test files are perhaps not available when you need them, and real data might not even be allowed, due to privacy concerns for instance. Synthetic data will do just fine for most use cases.
You could use Faker to generate fake names, addresses and so on for your tests. You have control over what you generate: faker.zip_code(), faker.company_email(). Faker helps when you need to generate separate fields.
But sometimes you need actual files. For that you can use faker-file. faker-file works with Faker and factory_boy. It is added as a “faker provider”. It supports text, csv, docx, mp3, png, pdf, epub and more, including .eml email files.
You can have it generate random text, but you can also pass sample text. You can also pass a template in which you use Faker’s regular methods like first_name and address. Handy!
If you generate a png, it will be a png with a bitmap of the text. A zipfile with some folders and docx files is also possible. And zipfiles with folders and zipfiles with pngs in them. So: basically everything.
Normally, the files are stored in some tempfile directory. You can also get the raw byte contents if you need to pass it as test data to an API method, for instance.
In the case of Django: Django needs the files to be inside its MEDIA_ROOT, otherwise you can get a “suspicious file error” exception (SuspiciousFileOperation). There’s support in faker-file to handle that.
Handy: you can also call faker-file on the command line.
My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.