Google BigQuery Integration
There are two ways of integrating Google BigQuery with Entropy Data.
1. Ingestion-based Integration (Built-in)
The integration is managed within Entropy Data. Configure the connection and sync schedule to start syncing with Google BigQuery.
No additional deployments are needed.
2. Connector-based Integration
If you have a complex network topology or direct integration with Entropy Data is not possible due to governance restrictions, the Google BigQuery connector can be deployed in a DMZ of your network to provide additional security.
| Feature | Ingestion-based Integration | Connector-based Integration |
|---|---|---|
| Direct integration into Entropy Data | ✅ | |
| Syncing of assets | ✅ | ✅ |
| Syncing of permissions | ✅ | |
| Complete control of deployment | ✅ | |
| Support for different network topologies | | ✅ |
Read more about how to use the integrations below.
1. Ingestion-based Integration (Built-in)
You can directly integrate Google BigQuery with Entropy Data.
Prerequisites
You need an Entropy Data Enterprise License or the Cloud Edition. To enable the integration, set APPLICATION_INGESTIONS_ENABLED to true in your environment. See Configuration for more information.
To start, navigate to Settings > Integrations > Add Integration.
This opens a wizard that guides you through configuring the integration.
Select the Integration Type

Configure the Credentials
The BigQuery integration authenticates with a JSON token file. To obtain one with the access rights needed to sync assets with Entropy Data, see: Authenticate for using REST.
One of the following scopes is needed for reading the relevant data:
- https://www.googleapis.com/auth/bigquery
- https://www.googleapis.com/auth/cloud-platform
- https://www.googleapis.com/auth/bigquery.readonly
- https://www.googleapis.com/auth/cloud-platform.read-only
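As a minimal sketch of the "one of the following scopes" rule, the check below tests whether a set of granted scopes contains at least one accepted scope. The function name `has_read_scope` is hypothetical and not part of Entropy Data or any Google library; only the scope URLs are taken from the list above.

```python
# Scopes taken from the list above; any single one is sufficient.
ACCEPTED_SCOPES = frozenset({
    "https://www.googleapis.com/auth/bigquery",
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/bigquery.readonly",
    "https://www.googleapis.com/auth/cloud-platform.read-only",
})


def has_read_scope(granted_scopes):
    """Return True if at least one granted scope is in the accepted list.

    `granted_scopes` is an iterable of scope URL strings (hypothetical helper).
    """
    return any(scope in ACCEPTED_SCOPES for scope in granted_scopes)
```

For least privilege, the read-only scopes are usually the narrower choice, since the integration only needs to read the relevant data.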
Note: Credentials are stored encrypted in the Entropy Data database. To enable encryption in your environment, set APPLICATION_ENCRYPTION_KEYS to a value of 64 hexadecimal characters (see Configuration).
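One possible way to produce a value of 64 hexadecimal characters is to hex-encode 32 random bytes. The sketch below uses Python's standard `secrets` module. The helper names are hypothetical, and whether Entropy Data imposes further requirements on the key (for example letter case) is not stated here, so treat the validation as an assumption.

```python
import secrets
import string


def generate_encryption_key():
    """Return 32 random bytes encoded as 64 hexadecimal characters."""
    return secrets.token_hex(32)


def is_valid_encryption_key(value):
    """Check that the value is exactly 64 hexadecimal characters.

    Assumption: both upper- and lower-case hex digits are acceptable.
    """
    return len(value) == 64 and all(c in string.hexdigits for c in value)


if __name__ == "__main__":
    # Print a fresh key that can be copied into the environment configuration.
    print(generate_encryption_key())
```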

Drop the downloaded key file on the upload area or select it via "Upload a file". The credentials are then shown in the dialog. Use the "Test Connection" button to verify that the connection works as expected.
Configure Filters
Configure filters to limit which assets are synchronized. Both include and exclude filters are supported. For Google BigQuery, filters can be applied to Projects, DataSets, and Tables.
Filters support '*' as a wildcard character to match any number of characters.
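The filter behavior described above can be illustrated with a small sketch. It treats '*' as the only wildcard, matching any number of characters, and every other character literally. How Entropy Data combines include and exclude filters is not specified here, so the sketch assumes that an exclude match always wins, that an empty include list selects everything, and that matching is case-sensitive. All function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a filter pattern into a regex where only '*' is special."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts), re.DOTALL)


def matches_any(name, patterns):
    """Return True if the whole name matches at least one pattern."""
    return any(wildcard_to_regex(p).fullmatch(name) for p in patterns)


def is_selected(name, include=(), exclude=()):
    """Decide whether an asset name passes the filters.

    Assumptions: exclude takes precedence over include, and an empty
    include list means "include everything".
    """
    if exclude and matches_any(name, exclude):
        return False
    return not include or matches_any(name, include)
```

With these assumptions, `is_selected("sales_tmp", include=["sales_*"], exclude=["*_tmp"])` is rejected because the exclude filter matches, while `is_selected("sales_eu", include=["sales_*"])` is accepted.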

Configure Schedule
Set a schedule to automatically synchronize assets. You can choose from predefined schedules or define a custom schedule using the cron expression format.
Note: All schedules use the UTC time zone, so take this into account when configuring your schedule. Please do not synchronize the assets more than once or twice per day. We reserve the right to disable the integration if this happens. If you need an immediate update, you can trigger a synchronization manually.
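Because schedules are evaluated in UTC, a local wall-clock time has to be shifted before it goes into a cron expression. The sketch below converts a local time with a fixed UTC offset into a daily expression. It assumes the common five-field cron layout (minute, hour, day of month, month, day of week); the wizard may expect a different number of fields, so check the custom schedule input before copying the result. The function name is hypothetical, and daylight saving time is deliberately ignored.

```python
from datetime import datetime, timedelta, timezone


def daily_cron_utc(local_hour, local_minute, utc_offset_hours):
    """Return a five-field daily cron expression in UTC.

    `utc_offset_hours` is the fixed offset of the local time zone,
    for example 2 for UTC+2 or -5 for UTC-5 (no daylight saving handling).
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time of day matters for a daily run.
    local_time = datetime(2024, 1, 1, local_hour, local_minute, tzinfo=local_tz)
    utc_time = local_time.astimezone(timezone.utc)
    return f"{utc_time.minute} {utc_time.hour} * * *"
```

For example, a run at 06:30 in a UTC+2 time zone corresponds to 04:30 UTC, so `daily_cron_utc(6, 30, 2)` returns `"30 4 * * *"`.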

Complete the Integration Configuration

Choose a unique name for the integration, review your configuration, and click Create Integration.
Next Steps
The integration is now configured and will run according to the schedule. To check the integration status, navigate to Settings > Integrations. Here you'll find the current status and the last 10 integration runs.

You can adjust the integration configuration and credentials at any time. The configuration is saved in YAML format with syntax validation support in the editor.

Deselecting the Enabled checkbox disables the automatic schedule. Manual integration runs are still possible.
2. Connector-based Integration: The BigQuery Connector
The BigQuery Connector is an open-source component that integrates Entropy Data with BigQuery. It is based on the SDK and is available as a Docker image. The source code can also be forked to implement custom integrations.
Features
- Asset Synchronization: Sync tables and datasets of BigQuery projects to Entropy Data as Assets.
- Access Management: Listen for AccessActivated and AccessDeactivated events in Entropy Data and grant access on BigQuery datasets to the data consumer.
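The access management flow above can be pictured as a small event dispatcher. The following is an illustrative sketch only, not the connector's actual code. The event payload shape (`type`, `dataset`, `consumer`) and the `grant` and `revoke` callbacks are hypothetical, and treating AccessDeactivated as a revocation is an assumption. Only the two event names come from the feature list.

```python
def handle_event(event, grant, revoke):
    """Dispatch an access event to the matching callback.

    `event` is a dict with hypothetical keys "type", "dataset", "consumer".
    `grant` and `revoke` are caller-supplied functions that would apply the
    change on the BigQuery dataset (not implemented in this sketch).
    """
    event_type = event.get("type")
    if event_type == "AccessActivated":
        grant(event["dataset"], event["consumer"])
        return "granted"
    if event_type == "AccessDeactivated":
        # Assumption: deactivation removes the consumer's access again.
        revoke(event["dataset"], event["consumer"])
        return "revoked"
    # All other event types are left untouched.
    return "ignored"
```

Passing the callbacks in as parameters keeps the dispatch logic independent of any BigQuery client, which makes it easy to test with plain functions.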
Links
- Source Code: Entropy Data Connector for BigQuery on GitHub