Enhanced mysqldump to support data cleansing compliance requirements by nullifying specific fields

At Pascal Metrics, my team collects large sets of data, including ePHI. That makes us subject to HIPAA, and federal law dictates how we handle that data. Before I can even consider moving a database export out of our hardened production environment, I need to cleanse the export of any personally identifying information, including but not limited to ePHI.

There are a number of ways to achieve that aim, but once our database crossed the threshold into "large" territory, cleansing the data became a real chore. I decided to roll my own mysqldump binary and add a parameter called "nullify-field", which is modeled on the "ignore-table" parameter from the official release.

I've posted my modified source to my GitHub account for any who may be interested.

You'll have to compile your own copy, but that's relatively easy:

  1. Grab the latest MySQL 5.0.x source
  2. Copy my two modified source files over the ones that come with your source tarball
  3. Run "./configure --without-server" to build just the mysql clients
  4. Run "make" to generate your custom mysqldump under ./client/
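
The steps above can be sketched as a small shell function. Treat it as an outline under assumptions rather than a tested build script: the function name is made up for illustration, the first argument is an already unpacked MySQL 5.0.x source tree, the second is a directory holding only the two modified files, and both files are assumed to belong in the tree's client/ directory. Adjust the copy step if the files live elsewhere.

```shell
# Sketch only: build a client-only MySQL tree containing the modified mysqldump.
# Assumption (not stated in the post): the two modified files sit directly in
# $patch_dir and replace files under $src_dir/client/.
build_custom_mysqldump() {
    src_dir=$1      # unpacked MySQL 5.0.x source tree
    patch_dir=$2    # directory holding the two modified source files

    [ -d "$src_dir/client" ] || {
        echo "no client/ directory in $src_dir" >&2
        return 1
    }

    # Step 2: overwrite the stock files with the modified ones.
    cp "$patch_dir"/* "$src_dir/client/" || return 1

    # Steps 3 and 4: configure without the server, then build the clients.
    (
        cd "$src_dir" &&
        ./configure --without-server &&
        make
    ) || return 1

    # The custom binary should now exist under client/.
    [ -x "$src_dir/client/mysqldump" ]
}
```

Called as `build_custom_mysqldump /path/to/mysql-source /path/to/modified-files`, it returns non-zero if any step fails, so it can gate whatever runs next.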

Example Usage:

./client/mysqldump -u backupuser -p --nullify-field=my_db.my_sensitive_table.name --nullify-field=my_db.my_sensitive_table.email my_db
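
As the example shows, each field is named as database.table.column and the option is repeated once per field. With more than a handful of fields the command line gets long, so a small helper can generate the options. This is a hypothetical convenience wrapper, not part of the modified mysqldump; the function name is my own.

```shell
# Sketch: print one --nullify-field option per "table.column" argument.
# nullify_args is a hypothetical helper, not something mysqldump provides.
nullify_args() {
    db=$1       # database name, prepended to every field
    shift
    for field in "$@"; do
        printf '%s=%s.%s\n' --nullify-field "$db" "$field"
    done
}
```

It could then be used as `./client/mysqldump -u backupuser -p $(nullify_args my_db my_sensitive_table.name my_sensitive_table.email) my_db`. The unquoted command substitution relies on word splitting, so this only works when the database, table, and column names contain no whitespace.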

I love open source software.