API Requests Quickstart

Overview

This is a quickstart guide for making classification requests to the Fidescls API.

Prerequisites

This guide assumes that the Fidescls API server is already up and running. If it is not, see the installation options first.

Postman Collection

If you have access to Postman and would like to use it to experiment with the API, the Fidescls Postman Collection is a good place to start.

Classification

Classification requests have the following commonalities:

The requests are made to the /classify endpoint (localhost:8765/text/classify in the examples below)
The requests are POST requests with a JSON payload
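As a minimal sketch of those two points, the following Python snippet builds and sends a POST request with a JSON body using only the standard library. The http scheme, the helper names build_classify_request and classify, and the 30 second timeout are assumptions made for illustration; the host, port, and path follow the examples in this guide.

```python
import json
import urllib.request

# Assumed address: host, port, and path follow the examples in this guide;
# the http scheme is an assumption for a local server.
CLASSIFY_URL = "http://localhost:8765/text/classify"


def build_classify_request(payload: dict, url: str = CLASSIFY_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(payload: dict, url: str = CLASSIFY_URL, timeout: float = 30.0) -> dict:
    """Send the payload and decode the JSON response body."""
    request = build_classify_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling classify(payload) with any of the request payloads shown in this guide should return the decoded response as a Python dict, provided the server is reachable at the assumed address.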

For more information about the classification interface contracts, see the classifiers development documentation as well as the API docs.

Context Classification

See the classifiers guide for more information about the Fidescls context classification paradigm.

Example Requests

To make a context classification request to the API, send a POST request to the localhost:8765/text/classify endpoint with a JSON payload in either of the following formats (a single string or a list of strings).

Single string data

{
    "context": {
        "data": "email_address",
        "method": "similarity",
        "method_params": {
            "possible_targets": [
                "user.device.ip_address",
                "user.financial.account_number",
                "user.contact.email",
                "user.contact.phone_number",
                "user.contact.address.street",
                "user.contact.address.city",
                "user.contact.address.state",
                "user.contact.address.country",
                "user.contact.address.postal_code"
            ],
            "top_n": 1,
            "remove_stop_words": false
        }
    }
}

The response to the above request payload will be similar to the following:

{
  "context": [
    {
      "input": "email address",
      "labels": [
        {
          "label": "user provided identifiable contact email",
          "score": 0.791374585498101,
          "position_start": null,
          "position_end": null
        }
      ]
    }
  ]
}
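A response of this shape can be reduced to the best label per input with a few lines of Python. This is a sketch that assumes the response has already been decoded into a dict; the function name top_label is hypothetical.

```python
def top_label(response: dict) -> dict:
    """Map each input string to its highest-scoring (label, score) pair."""
    results = {}
    for item in response.get("context", []):
        labels = item.get("labels", [])
        if not labels:
            # No label was returned for this input.
            results[item["input"]] = None
            continue
        best = max(labels, key=lambda entry: entry["score"])
        results[item["input"]] = (best["label"], best["score"])
    return results


# The example response shown above, as a Python dict.
sample_response = {
    "context": [
        {
            "input": "email address",
            "labels": [
                {
                    "label": "user provided identifiable contact email",
                    "score": 0.791374585498101,
                    "position_start": None,
                    "position_end": None,
                }
            ],
        }
    ]
}

print(top_label(sample_response))
```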

Array of strings data

{
    "context": {
        "data": [
            "email_address",
            "phone_num",
            "credit_card"
            ],
        "method": "similarity",
        "method_params": {
            "possible_targets": [
                "user.device.ip_address",
                "user.financial.account_number",
                "user.contact.email",
                "user.contact.phone_number",
                "user.contact.address.street",
                "user.contact.address.city",
                "user.contact.address.state",
                "user.contact.address.country",
                "user.contact.address.postal_code"
            ],
            "top_n": 2,
            "remove_stop_words": false
        }
    }
}

The response to the above request payload will be similar to the following:

{
  "context": [{
      "input": "email address",
      "labels": [{
          "label": "user provided identifiable contact email",
          "score": 0.791374585498101,
          "position_start": null,
          "position_end": null
        },
        {
          "label": "account contact postal code",
          "score": 0.7402522077965934,
          "position_start": null,
          "position_end": null
        }
      ]
    },
    {
      "input": "phone num",
      "labels": [{
          "label": "user provided identifiable contact phone number",
          "score": 0.5770164988785474,
          "position_start": null,
          "position_end": null
        },
        {
          "label": "account contact postal code",
          "score": 0.44817613132976103,
          "position_start": null,
          "position_end": null
        }
      ]
    },
    {
      "input": "credit card",
      "labels": [{
          "label": "user provided identifiable financial account number",
          "score": 0.5742921242220389,
          "position_start": null,
          "position_end": null
        },
        {
          "label": "account contact postal code",
          "score": 0.5587338672966902,
          "position_start": null,
          "position_end": null
        }
      ]
    }
  ]
}
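With top_n greater than 1, the lower-ranked labels can have noticeably lower scores, as in the response above. One way to handle that is to drop labels below a cutoff. The sketch below assumes a decoded response dict; the function name confident_labels and the 0.5 default threshold are illustrative choices, not values recommended by Fidescls.

```python
def confident_labels(response: dict, min_score: float = 0.5) -> dict:
    """Keep, for each input, only the labels scoring at least min_score."""
    kept = {}
    for item in response.get("context", []):
        kept[item["input"]] = [
            entry["label"]
            for entry in item.get("labels", [])
            if entry["score"] >= min_score
        ]
    return kept


# One entry from the example response above, trimmed to the fields used here.
sample_response = {
    "context": [
        {
            "input": "phone num",
            "labels": [
                {
                    "label": "user provided identifiable contact phone number",
                    "score": 0.5770164988785474,
                },
                {
                    "label": "account contact postal code",
                    "score": 0.44817613132976103,
                },
            ],
        }
    ]
}

print(confident_labels(sample_response))
```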

Content Classification

See the classifiers guide for more information about the Fidescls content classification paradigm.

Decision Method

Content classification supports two main 'decision methods': pass-through and direct-mapping.

The pass-through method instructs the system to return the 'raw' classification results. This means that each assigned label is a PII entity type (e.g. CREDIT_CARD, NAME).

The direct-mapping method instructs the system to return the 'mapped' taxonomy data-categories (e.g. user.contact.email, user.contact.address.street). This mapping is specified in the fidescls/cls/entity_map.py file.

Note: when there is no direct mapping between a PII entity type and a taxonomy data-category, the PII entity type is returned.
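The difference between the two decision methods, including the fallback described in the note, can be sketched in Python as follows. This is not the Fidescls implementation: EXAMPLE_ENTITY_MAP is a hypothetical stand-in for the contents of fidescls/cls/entity_map.py, and the function name decide_label is made up for illustration. Returning a list for direct-mapping mirrors the shape of the example responses in this guide.

```python
# Hypothetical mapping for illustration only; the real mapping lives in
# fidescls/cls/entity_map.py and may differ.
EXAMPLE_ENTITY_MAP = {
    "EMAIL_ADDRESS": ["user.contact.email"],
}


def decide_label(entity_type: str, decision_method: str, entity_map: dict = EXAMPLE_ENTITY_MAP):
    """Return the label for a detected PII entity type under a decision method."""
    if decision_method == "pass-through":
        # Raw result: the PII entity type itself.
        return entity_type
    if decision_method == "direct-mapping":
        # Mapped data-categories, falling back to the entity type
        # when no mapping exists.
        return entity_map.get(entity_type, [entity_type])
    raise ValueError(f"unknown decision method: {decision_method!r}")


print(decide_label("EMAIL_ADDRESS", "pass-through"))
print(decide_label("EMAIL_ADDRESS", "direct-mapping"))
print(decide_label("DOMAIN_NAME", "direct-mapping"))
```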

Example Requests

To make a content classification request to the API, send a POST request to the localhost:8765/text/classify endpoint with a JSON payload in either of the following formats (an array of strings or a JSON object of values).

Array of strings data

{
    "content": {
        "data": ["sample@aol.com"],
        "method_params": {
            "decision_method": "direct-mapping"
        }
    }
}

An example response to this request body:

{
    "content": [
        {
            "input": "sample@aol.com",
            "labels": [
                {
                    "label": [
                        "user.contact.email"
                    ],
                    "score": 1.0,
                    "position_start": 0,
                    "position_end": 14
                },
                {
                    "label": [
                        "DOMAIN_NAME"
                    ],
                    "score": 1.0,
                    "position_start": 7,
                    "position_end": 14
                }
            ]
        }
    ]
}
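The position_start and position_end fields can be used to recover the text that triggered each label. The sketch below assumes the offsets are zero-based with an exclusive end, which is consistent with the example above (0 to 14 for a 14-character input); the function name matched_spans is hypothetical.

```python
def matched_spans(item: dict) -> list:
    """Pair each label with the slice of the input text it covers."""
    text = item["input"]
    return [
        (entry["label"], text[entry["position_start"]:entry["position_end"]])
        for entry in item["labels"]
    ]


# The single result from the example response above.
response_item = {
    "input": "sample@aol.com",
    "labels": [
        {
            "label": ["user.contact.email"],
            "score": 1.0,
            "position_start": 0,
            "position_end": 14,
        },
        {
            "label": ["DOMAIN_NAME"],
            "score": 1.0,
            "position_start": 7,
            "position_end": 14,
        },
    ],
}

for label, span in matched_spans(response_item):
    print(label, span)
```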

JSON values

Sometimes it is helpful to process content data where multiple samples are taken from a particular group (for example, a column in a database). This can be done by passing the data in as a JSON object of the following format:

{
    "content": {
        "data": {
            "emails": ["sample@aol.com", "sample@aol.com"],
            "phones": ["(555) 555-5555", "(555) 555-5555"],
            "ccs": ["4242-4242-4242-4242", "4242-4242-4242-4242"]
        },
        "method_params": {
            "decision_method": "pass-through"
        }
    }
}

An example response to this request body:

{
    "content": {
        "emails": [
            {
                "input": "sample@aol.com",
                "labels": [
                    {
                        "label": "EMAIL_ADDRESS",
                        "score": 1.0,
                        "position_start": 0,
                        "position_end": 14
                    },
                    {
                        "label": "DOMAIN_NAME",
                        "score": 1.0,
                        "position_start": 7,
                        "position_end": 14
                    }
                ]
            },
            {
                "input": "sample@aol.com",
                "labels": [
                    {
                        "label": "EMAIL_ADDRESS",
                        "score": 1.0,
                        "position_start": 0,
                        "position_end": 14
                    },
                    {
                        "label": "DOMAIN_NAME",
                        "score": 1.0,
                        "position_start": 7,
                        "position_end": 14
                    }
                ]
            }
        ],
        "phones": [
            {
                "input": "(555) 555-5555",
                "labels": [
                    {
                        "label": "PHONE_NUMBER",
                        "score": 0.4,
                        "position_start": 0,
                        "position_end": 14
                    }
                ]
            },
            {
                "input": "(555) 555-5555",
                "labels": [
                    {
                        "label": "PHONE_NUMBER",
                        "score": 0.4,
                        "position_start": 0,
                        "position_end": 14
                    }
                ]
            }
        ],
        "ccs": [
            {
                "input": "4242-4242-4242-4242",
                "labels": [
                    {
                        "label": "CREDIT_CARD",
                        "score": 1.0,
                        "position_start": 0,
                        "position_end": 19
                    }
                ]
            },
            {
                "input": "4242-4242-4242-4242",
                "labels": [
                    {
                        "label": "CREDIT_CARD",
                        "score": 1.0,
                        "position_start": 0,
                        "position_end": 19
                    }
                ]
            }
        ]
    }
}
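Because the response keeps the group keys from the request, the labels can be summarized per group (or per database column). The sketch below counts how often each label appears in each group. It assumes a decoded pass-through response, where each label is a plain string; the function name label_counts is hypothetical.

```python
from collections import Counter


def label_counts(response: dict) -> dict:
    """Count label occurrences for each group in a grouped content response."""
    summary = {}
    for group, items in response.get("content", {}).items():
        counts = Counter(
            entry["label"]
            for item in items
            for entry in item.get("labels", [])
        )
        summary[group] = dict(counts)
    return summary


# A trimmed version of the example response above (two groups, labels only).
sample_response = {
    "content": {
        "emails": [
            {"input": "sample@aol.com",
             "labels": [{"label": "EMAIL_ADDRESS"}, {"label": "DOMAIN_NAME"}]},
            {"input": "sample@aol.com",
             "labels": [{"label": "EMAIL_ADDRESS"}, {"label": "DOMAIN_NAME"}]},
        ],
        "ccs": [
            {"input": "4242-4242-4242-4242", "labels": [{"label": "CREDIT_CARD"}]},
            {"input": "4242-4242-4242-4242", "labels": [{"label": "CREDIT_CARD"}]},
        ],
    }
}

print(label_counts(sample_response))
```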

Context and Content Classification

Both context and content classification can be performed within the same request. This is accomplished by combining the JSON payloads so that the top-level keys are the classification methods:

{
    "content": {
        "data": [
            "sample@aol.com",
            "(555) 555-5555",
            "4242-4242-4242-4242"
        ],
        "method_params": {
            "decision_method": "pass-through"
        }
    },
    "context": {
        "data": "email_address",
        "method": "similarity",
        "method_params": {
            "possible_targets": [
                "user.device.ip_address",
                "user.financial.account_number",
                "user.contact.email",
                "user.contact.phone_number",
                "user.contact.address.street",
                "user.contact.address.city",
                "user.contact.address.state",
                "user.contact.address.country",
                "user.contact.address.postal_code"
            ],
            "top_n": 1,
            "remove_stop_words": false
        }
    }
}
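A combined payload like the one above can be assembled from separately prepared context and content sections. The following sketch is one way to do that; the function name combine_payloads is hypothetical, and the possible_targets list is shortened here for brevity.

```python
import json


def combine_payloads(context: dict = None, content: dict = None) -> dict:
    """Merge context and content sections under their top-level keys."""
    payload = {}
    if context is not None:
        payload["context"] = context
    if content is not None:
        payload["content"] = content
    if not payload:
        raise ValueError("at least one of context or content is required")
    return payload


context_section = {
    "data": "email_address",
    "method": "similarity",
    "method_params": {
        "possible_targets": ["user.contact.email", "user.contact.phone_number"],
        "top_n": 1,
        "remove_stop_words": False,
    },
}
content_section = {
    "data": ["sample@aol.com", "(555) 555-5555", "4242-4242-4242-4242"],
    "method_params": {"decision_method": "pass-through"},
}

combined = combine_payloads(context=context_section, content=content_section)
print(json.dumps(combined, indent=4))
```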