This repository has been archived by the owner on Aug 15, 2023. It is now read-only.

API Reference

Sid Mohan edited this page May 17, 2023 · 3 revisions

/api/process_text

Performs simple PII detection: pass in text, and the endpoint returns a boolean indicating whether PII was detected, along with the redacted text. Note that this endpoint still needs further optimization.

curl -X POST -H "Content-Type: application/json" -d '{"text": "My name is John Doe, and my email is john.doe@example.com"}' http://127.0.0.1:5000/api/detect_pii
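The same request can be issued from Python. The following is a minimal sketch using only the standard library. It assumes the host, port, and path shown in the curl example above, and the helper names build_text_request and detect_pii are hypothetical. The response field names are not documented on this page, so the sketch returns the parsed JSON unchanged rather than guessing at keys.

```python
import json
import urllib.request

# Assumption: same host, port, and path as the curl example above.
DETECT_URL = "http://127.0.0.1:5000/api/detect_pii"


def build_text_request(text, url=DETECT_URL):
    """Build (but do not send) a JSON POST request carrying the text to scan."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect_pii(text, url=DETECT_URL):
    """Send the request and return the decoded JSON response unchanged."""
    with urllib.request.urlopen(build_text_request(text, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running server.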

/api/process_csv

The /api/process_csv endpoint accepts a CSV file and lets you choose whether to:

  • identify and redact information using one of three methods:
    • Insert the fixed string “[REDACT]”
    • Replace with anonymized (random) values
    • Replace with a hash of the data
  • substitute in synthetic data

The output can be downloaded in the same format as the original file.
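To make the three redaction methods listed above concrete, here is a rough Python illustration of what each might do to a single detected value. This is not the service's implementation: the function name redact_value, the use of SHA-256 for hashing, and the UUID used as the random replacement are all assumptions made for illustration. The method names are the ones given in the Request Body section of this page.

```python
import hashlib
import uuid


def redact_value(value, method="fixed_string"):
    """Illustrative only: return a replacement for one sensitive value."""
    if method == "fixed_string":
        # Simple insertion of a fixed marker.
        return "[REDACT]"
    if method == "random_value":
        # Assumption: any random token unrelated to the input will do here.
        return uuid.uuid4().hex
    if method == "hash":
        # Assumption: SHA-256; the page does not name the hash function.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    raise ValueError(f"unknown redaction method: {method!r}")
```

One practical difference: a hash maps equal inputs to equal outputs, so repeated values remain matchable after redaction, whereas random values do not.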

Sample Curl Request - PII Redaction

curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.csv" -F "redaction_method=hash" http://127.0.0.1:6000/api/process_csv --output redacted_output.csv

Sample Curl Request - Synthetic Data Insertion

curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.csv" -F "synthetic_data_generation=true" http://127.0.0.1:6000/api/process_csv --output redacted_output.csv
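For scripted use, the upload can also be assembled in Python. The sketch below hand-builds a multipart/form-data body with the standard library so that it has no third-party dependencies. encode_multipart and build_csv_request are hypothetical helper names, and the URL is taken from the curl examples above.

```python
import urllib.request
import uuid

# Assumption: same host, port, and path as the curl examples above.
CSV_URL = "http://127.0.0.1:6000/api/process_csv"


def encode_multipart(fields, filename, file_bytes):
    """Return (content_type, body) for text fields plus one 'file' part."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    # Closing delimiter that terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


def build_csv_request(file_bytes, filename="your.csv", url=CSV_URL, **fields):
    """Build (but do not send) the upload request; fields are form parameters."""
    content_type, body = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

Calling build_csv_request(data, redaction_method="hash") mirrors the redaction example, and build_csv_request(data, synthetic_data_generation="true") mirrors the synthetic data example. Passing the result to urllib.request.urlopen and writing the response bytes to disk plays the role of curl's --output flag.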

/api/process_excel

Similar to /api/process_csv, you can use the /api/process_excel endpoint to upload an Excel file (.xls, .xlsx) and specify a redaction method to remove sensitive data from the file. The redacted file can be downloaded in the same format as the original file. The endpoint also supports synthetic data generation to replace sensitive data with realistic but fake data.

Sample Curl Request - PII Redaction

curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.xls" -F "redaction_method=hash" "http://127.0.0.1:5000/api/process_excel" --output redacted_output.xls

Sample Curl Request - Synthetic Data Insertion

curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.xlsx" -F "synthetic_data_generation=true" "http://127.0.0.1:6000/api/process_excel" --output redacted_output.xlsx

/api/process_json

The /api/process_json endpoint allows users to upload a JSON file and specify a redaction method to remove sensitive data from the file. The redacted file can be downloaded in the same format as the original file. The endpoint also supports synthetic data generation.

Sample Curl Request - PII Redaction

curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.json" -F "redaction_method=hash" http://localhost:5000/api/process_json -o redacted_output.json

Sample Curl Request - Synthetic Data Insertion

Coming soon!

Request Body

The request body for the three file endpoints (/api/process_csv, /api/process_excel, and /api/process_json) contains the following parameters:

  • file: The file to be redacted. The file must be of the appropriate type for the endpoint being used (CSV, Excel, or JSON).
  • redaction_method: The method to be used for redacting sensitive data from the file. The following methods are available:
    • fixed_string: Replace sensitive data with a fixed string.
    • random_value: Replace sensitive data with a random value.
    • hash: Replace sensitive data with a hash of the data.
  • synthetic_data_generation: Whether to generate synthetic data to replace sensitive data in the file. This parameter is optional and defaults to false.
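A client can check these parameters before uploading. The following sketch encodes the rules listed above. build_form_fields is a hypothetical helper; treating redaction_method as optional and sending synthetic_data_generation as the lowercase string "true" both follow the curl examples on this page rather than any documented requirement.

```python
REDACTION_METHODS = ("fixed_string", "random_value", "hash")


def build_form_fields(redaction_method=None, synthetic_data_generation=False):
    """Return the non-file form fields as a dict of strings."""
    fields = {}
    if redaction_method is not None:
        if redaction_method not in REDACTION_METHODS:
            raise ValueError(
                f"redaction_method must be one of {REDACTION_METHODS}, "
                f"got {redaction_method!r}"
            )
        fields["redaction_method"] = redaction_method
    if synthetic_data_generation:
        # Optional parameter; omitted when false because that is the default.
        fields["synthetic_data_generation"] = "true"
    return fields
```

Rejecting an unknown method on the client side avoids a round trip that would most likely end in a 400 response.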

Response Body

The response body for the three file endpoints contains the redacted file in the same format as the original file.

Status Codes

The API returns the following status codes:

  • 200: The request was successful.
  • 400: The request was malformed or invalid.
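Client code can branch on these two codes. The sketch below shows one way to do that. save_or_raise is a hypothetical helper, and treating any other status as an unexpected error is an assumption, since only 200 and 400 are listed here.

```python
def save_or_raise(status, body, out_path):
    """Write the redacted file on 200; raise a descriptive error otherwise."""
    if status == 200:
        with open(out_path, "wb") as fh:
            fh.write(body)
        return out_path
    if status == 400:
        # The error body format is not documented, so decode it leniently.
        detail = body.decode("utf-8", errors="replace")
        raise ValueError(f"request was malformed or invalid: {detail}")
    raise RuntimeError(f"unexpected status code: {status}")
```

When the request is sent with urllib.request.urlopen, a 400 surfaces as urllib.error.HTTPError; its code attribute and read() method supply the status and body to pass in here.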