Connecting to a Database

Fidescls can connect to a variety of databases to detect and label potential sources of PII. The scan and db commands allow you to easily evaluate your connected databases from the command line.

Supported databases

Fidescls is built with the necessary driver support for the following databases:

  • PostgreSQL
  • MySQL
  • Microsoft SQL Server
  • BigQuery

Classify a database

The db classify command accepts a connection string to your database in the following format:

fidescls db classify [OPTIONS] CONNECTION_STRING

For more information on the available options, like types of classification, see the CLI docs.

Example connection with arguments
fidescls db classify --context --filename "output.txt" "redis://:testpassword@redis:6379/1"

The above command runs the context classification engine against the database at the given connection string, and writes possible labels for its columns to output.txt.
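As a rough sketch, the connection string can be assembled from separate variables so that credentials stay out of your shell history. The variable names, their default values, and the `postgresql://` scheme are assumptions for illustration only. They follow the same URL shape as the example above and are not taken from the fidescls docs. The classification step only runs if fidescls is on your PATH.

```shell
# Hypothetical connection details; replace with your own values.
DB_USER="${DB_USER:-fides_user}"
DB_PASSWORD="${DB_PASSWORD:-example-password}"
DB_HOST="${DB_HOST:-localhost}"
DB_PORT="${DB_PORT:-5432}"
DB_NAME="${DB_NAME:-example_db}"

# Assumed URL shape: scheme://user:password@host:port/database
CONNECTION_STRING="postgresql://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"

# Run the context classifier only when fidescls is installed.
if command -v fidescls >/dev/null 2>&1; then
  fidescls db classify --context --filename "output.txt" "$CONNECTION_STRING"
fi
```

Special characters in a password generally need to be percent-encoded before they are placed in a URL of this kind.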

Connect to BigQuery

To connect to BigQuery, pass a credentials file via the CLI. Fidescls uses these credentials to access your BigQuery warehouse and scan it for possible sensitive data.

Create a service account and key

To connect to BigQuery, a service account must provide fidescls with access. Follow the Google Cloud guide for creating a service account, and then create a service account key for fidescls to use.

Once your service account key is created, download the associated keyfile. By default, fidescls will look for a credential file named bigquery.json in the .fides/ directory.
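One way to put the keyfile in the default location is sketched below. The download path in `KEYFILE` is a hypothetical placeholder. The sketch also assumes that `.fides/` is resolved relative to the directory where you run fidescls, which the text above does not state explicitly.

```shell
# Hypothetical path to the downloaded service account key; adjust as needed.
KEYFILE="${KEYFILE:-$HOME/Downloads/service-account-key.json}"

# Assumption: .fides/ lives in the directory where fidescls is run.
mkdir -p .fides

if [ -f "$KEYFILE" ]; then
  # Copy the key to the default name and restrict access to the current user.
  cp "$KEYFILE" .fides/bigquery.json
  chmod 600 .fides/bigquery.json
else
  echo "keyfile not found: $KEYFILE" >&2
fi
```

Restricting the file permissions is a general precaution for credential files, not a fidescls requirement.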

Scan your dataset

The scan bigquery command accepts a dataset name and an optional path to a credentials keyfile in the following format:

fidescls scan bigquery [OPTIONS] DATASET_NAME [CONNECTION_KEYFILE_PATH]

The BigQuery dataset to be inspected is a required positional argument, and has no default value. Additional options, like types of classification, are available in the CLI docs.

Example command line arguments
fidescls scan bigquery --content --filename "output.txt" "dataset_name"

The above command runs the content classification engine against the provided dataset, and writes possible labels for the rows it contains to output.txt.

To specify an alternate path to your credentials file, a positional argument may be provided:

Example path to credentials file
fidescls scan bigquery "dataset_name" "/path/to/credential/keyfile.json"
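A small wrapper can check the keyfile before a scan starts, so that a wrong path fails early with a clear message. The `scan_dataset` function below is a hypothetical helper, not part of fidescls. It only mirrors the positional format shown earlier: a required dataset name followed by an optional keyfile path.

```shell
# Hypothetical helper: scan a dataset, optionally with an explicit keyfile.
scan_dataset() {
  dataset="$1"
  keyfile="${2:-}"

  # The dataset name is required and has no default.
  if [ -z "$dataset" ]; then
    echo "usage: scan_dataset DATASET_NAME [CONNECTION_KEYFILE_PATH]" >&2
    return 2
  fi

  # If a keyfile path was given, make sure it is readable before scanning.
  if [ -n "$keyfile" ] && [ ! -r "$keyfile" ]; then
    echo "keyfile not readable: $keyfile" >&2
    return 1
  fi

  if [ -n "$keyfile" ]; then
    fidescls scan bigquery "$dataset" "$keyfile"
  else
    # No path given: fidescls falls back to its default credential file.
    fidescls scan bigquery "$dataset"
  fi
}
```

With the function defined, `scan_dataset "dataset_name" "/path/to/credential/keyfile.json"` would run the same command as the example above once the path has been checked.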