This package is a work in progress and is not yet available on PyPI. Treat this documentation as a design document for what scrubadub will eventually do, not as a specification of what it can do today.


Remove personally identifiable information from free text. Sometimes we have additional metadata about the people we wish to anonymize; other times we don’t. This package makes it easy to scrub personal information from free text without compromising the privacy of the people we are trying to protect.

scrubadub currently supports removing:

  • names
  • email addresses

Quick start

Getting started with scrubadub is as easy as running pip install scrubadub and importing it into your Python scripts like this:

>>> import scrubadub

# John may be a cat, but he doesn't want other people to know it.
>>> text = "John is a cat"

# Replace names with the {{NAME}} placeholder. This is the scrubadub default
# because it omits as much information about people as possible.
>>> placeholder_text = scrubadub.clean_with_placeholders(text)
>>> placeholder_text
'{{NAME}} is a cat'

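The same placeholder idea can be illustrated for email addresses. The following is only a minimal sketch under stated assumptions: `replace_emails`, `EMAIL_PATTERN`, and the `{{EMAIL}}` placeholder are hypothetical names chosen for illustration, not part of scrubadub's API, and the regular expression is deliberately simplified rather than a full address validator.

```python
import re

# Hypothetical helper, NOT part of scrubadub's API. The pattern is a
# simplified approximation of an email address: a local part, an "@",
# and a domain made of at least two dot-separated labels.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def replace_emails(text, placeholder="{{EMAIL}}"):
    """Replace every email-like substring in text with a placeholder.

    The "{{EMAIL}}" default is an assumption modeled on the {{NAME}}
    placeholder shown in the quick start.
    """
    return EMAIL_PATTERN.sub(placeholder, text)


print(replace_emails("Contact John at john@example.com for details."))
```

Note that this sketch leaves the name "John" untouched. Detecting names generally requires more than a regular expression, which is why it is not attempted here.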
Indices and tables