Data Deanonymizer

The Data Deanonymizer restores the original personal details in data that was previously anonymized. It's useful when you need to work with the real values again for analysis, debugging, or reporting, while still keeping a record of how the data was anonymized.

How it Works

The component receives two pieces of information:

  1. Data – the dataset that contains anonymized values (for example, names, emails, or IDs that have been replaced with placeholders).
  2. Deanonymize Mapping – a separate dataset that tells the component which placeholder corresponds to which real value.

When you run the component, it looks at the field specified by Source data input key to deanonymize (default is text) and replaces each placeholder with the real value found in the mapping. The key used to look up the mapping is set by Deanonymize Mapping Key (default is anonymizer_mapping). The result is a new dataset that contains the original personal information.

The process happens entirely inside Nappai, so no external services are called. The component simply reads the two inputs, performs the lookup, and outputs the restored data.
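Conceptually, the replacement step boils down to a dictionary lookup. The sketch below is not Nappai's internal code, just a minimal Python illustration of the idea; the field name mirrors the component's default source key (text).

```python
# Minimal illustration of the lookup-and-replace step
# (not Nappai's internal implementation).

def deanonymize_record(record: dict, mapping: dict, source_key: str = "text") -> dict:
    """Replace every placeholder found in record[source_key] with its real value."""
    value = record[source_key]
    for placeholder, real in mapping.items():
        value = value.replace(placeholder, real)
    return {**record, source_key: value}

record = {"text": "Ticket opened by User_123 (mail_001)"}
mapping = {"User_123": "Alice Smith", "mail_001": "alice@example.com"}

print(deanonymize_record(record, mapping))
# {'text': 'Ticket opened by Alice Smith (alice@example.com)'}
```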

Inputs

  • Deanonymize Mapping: A dataset that contains the mapping of anonymized keys to real values.
  • Data: The dataset that you want to deanonymize.
  • Source data input key to deanonymize: The name of the field in the data that holds the anonymized values (e.g., text).
  • Deanonymize Mapping Key: The key used to locate the mapping inside the Deanonymize Mapping dataset (e.g., anonymizer_mapping).
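To make the relationship between these inputs concrete, here is one hypothetical shape the two datasets could take; the exact structure depends on how your upstream components produce them. The point is that the placeholder table lives under the key named by Deanonymize Mapping Key.

```python
# Hypothetical input shapes (your upstream components may differ).
data = {"text": "Contact User_123 at mail_001"}

deanonymize_mapping = {
    "anonymizer_mapping": {          # the Deanonymize Mapping Key (default)
        "User_123": "Alice Smith",
        "mail_001": "alice@example.com",
    }
}

# The component locates the table via the mapping key, then replaces
# each placeholder it finds in the "text" field of the data.
table = deanonymize_mapping["anonymizer_mapping"]
```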

Outputs

  • Data: The deanonymized dataset. The output is produced by the deanonymize_data method and can be connected to any downstream component that needs the original personal information.

Usage Example

  1. Prepare your data

    • Data: A CSV file with a column text that contains anonymized names like User_123.
    • Deanonymize Mapping: A CSV file with two columns: anonymizer_mapping (e.g., User_123) and real_name (e.g., Alice Smith).
  2. Configure the component

    • Drag the Data Deanonymizer into your workflow.
    • Connect the Data input to the CSV component that reads your anonymized file.
    • Connect the Deanonymize Mapping input to the CSV component that reads your mapping file.
    • Leave the default values for Source data input key to deanonymize (text) and Deanonymize Mapping Key (anonymizer_mapping).
  3. Run the workflow

    • The component will replace each User_123 in the text column with Alice Smith.
    • The output Data can now be sent to a reporting component, stored in a database, or used for further analysis.
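If you want to reproduce the same transformation outside Nappai, for example to sanity-check your mapping file before running the workflow, a rough pandas equivalent might look like this. The file names and the real_name column are taken from the example above and are not fixed by the component.

```python
# Rough pandas equivalent of the workflow above (illustrative only).
import pandas as pd

data = pd.read_csv("anonymized.csv")      # contains a "text" column
mapping_df = pd.read_csv("mapping.csv")   # columns: anonymizer_mapping, real_name

# Build a placeholder -> real-value dictionary from the mapping file.
mapping = dict(zip(mapping_df["anonymizer_mapping"], mapping_df["real_name"]))

def restore(text: str) -> str:
    for placeholder, real in mapping.items():
        text = text.replace(placeholder, real)
    return text

data["text"] = data["text"].map(restore)
data.to_csv("deanonymized.csv", index=False)
```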

Related Components

  • Data Anonymizer – The counterpart that replaces personal data with placeholders.
  • Data Validator – Checks that your data meets quality standards before processing.
  • Data Cleaner – Removes duplicates, missing values, and other inconsistencies.

Tips and Best Practices

  • Test on a small sample first to confirm that the mapping works correctly before running on large datasets.
  • Keep the mapping file secure; it contains sensitive personal information.
  • Use consistent key names in both the data and the mapping to avoid mismatches.
  • Document the mapping so that future users understand which placeholders correspond to which real values.

Security Considerations

  • The component handles personal data, so ensure that your workflow complies with your organization’s privacy policies.
  • Store the mapping file in a protected location and restrict access to authorized users only.
  • After deanonymization, consider whether the resulting data still needs to be anonymized for downstream processes.