Google BigQuery
Prerequisite
Please note that Google BigQuery is supported only in GoodData.CN. Before you can use BigQuery with GoodData.CN, you must install the JDBC driver into GoodData.CN. Due to licensing restrictions, GoodData is not allowed to distribute the JDBC driver for BigQuery.
You can download the driver from the Google BigQuery website. Alternatively, you can download the same driver from the Simba website for BigQuery (registration required).
Important
Make sure that the version of the JDBC driver for BigQuery is 1.2.22 or newer.
To install the driver, follow the procedure for injecting custom JDBC drivers that corresponds to your edition of GoodData.CN:
- Inject custom JDBC drivers into GoodData.CN
- Inject custom JDBC drivers into GoodData.CN Community Edition
Data Source Details
Use the following information when creating a data source for your BigQuery database:
- The JDBC URL must be in the following format:
jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<Your Project ID>;OAuthType=0;Timeout=150
- BigQuery JDBC driver cancels queries automatically after 10 seconds if Timeout parameter is omitted
- Schema
- Dataset name
- Token
- Only service account authentication is supported
- Encode service account JSON by Base64BashPowerShell 7Mac
base64 -w0 service_account.json
$File = "c:\secrets\service_account.json" $Content1 = get-content $File $Bytes = [System.Text.Encoding]::UTF8.GetBytes($Content1) $Encoded = [System.Convert]::ToBase64String($Bytes) Write-Host "ENCODED: " $Encoded
base64 service_account.json
- Username & password
- Leave it empty.
Token
(service account) is used for authentication
- Leave it empty.
Unsupported Features
GoodData.CN does not support the following features:
- The PERCENTILE function.
- The REGR_R2 function.
- Referential integrity:
- BigQuery does not support referential integrity (primary and foreign keys).
- Primary and foreign keys cannot be utilized when generating a logical data model (LDM) from a physical data model (PDM).
Note
If you want to generate primary keys and references into the LDM automatically, you can utilize database naming conventions as an alternative solution.
User Access Rights
To connect your BigQuery data warehouse and GoodData, we recommend that you use Google Cloud Platform service account. For sufficient level of access, ensure your service account has the following user roles and permissions:
Steps:
Grant your service account the following user roles:
bigquery.dataViewer
bigquery.jobUser
or if you intend to enable pre-aggregation caching:
bigquery.dataEditor
bigquery.jobUser
For more information see Google Cloud documentation Service Accounts 🡕 and Access Control 🡕.
Ensure your service account has the following permissions:
bigquery.jobs.create
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
Note
Custom BigQuery roles are supported. If you create a custom role in BigQuery, then you must also assign the
bigquery.datasets.get
permission to your users and the custom role or else you will be unable to import any data.
Performance Tips
If your database holds a large amount of data, consider the following practices:
- Denormalize the relational data model of your database.
- This helps avoid large JOIN operations. Because BigQuery is a columnar database, queries read only the required columns and each column is compressed separately.
- Utilize clustered tables.
- Data can be pruned when using clustered columns.
- Utilize partitioned tables.
- To map the BigQuery provided pseudo-columns
_PARTITIONDATE
and_PARTITIONTIME
onto the LDM:- Reference your date dataset in the dataset mapped to the partitioned table.
- Map the foreign key representing the reference to your date dataset to one of the pseudo-columns mentioned above.
- To map the BigQuery provided pseudo-columns
- Utilize materialized views.
- Changes to underlying tables are propagated to related materialized views. They are always consistent.
- Materialize the results of JOINs and aggregations. These are executed very often as a result of using dashboards or visualizations.
- Map materialized views and their columns to datasets and the LDM. You can utilize them in metrics, visualizations, and dashboards.
Query Timeout
Query timeout is not supported for Google BigQuery yet.
Permitted parameters
- ProjectId
- InitiateOAuth