Google BigQuery
Prerequisite
Before you can use BigQuery with GoodData.CN, you must install the JDBC driver into GoodData.CN. Due to licensing restrictions, GoodData is not allowed to distribute the JDBC driver for BigQuery.
You can download the driver from the Google BigQuery website. Alternatively, you can download the same driver from the Simba website for BigQuery (registration required).
To install the driver, follow the procedure for injecting custom JDBC drivers that corresponds to your edition of GoodData.CN:
- Inject custom JDBC drivers into Community Edition
- Inject custom JDBC drivers into Production K8S Edition
Data Source Details
Use the following information when creating a data source for your BigQuery database:
- The JDBC URL must be in the following format:
jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<Your Project ID>;OAuthType=0
- Schema
- Dataset name
- Token
- Only service account authentication is supported
- Encode service account JSON by Base64BashPowerShell 7Mac
base64 -w0 service_account.json
$File = "c:\secrets\service_account.json" $Content1 = get-content $File $Bytes = [System.Text.Encoding]::UTF8.GetBytes($Content1) $Encoded = [System.Convert]::ToBase64String($Bytes) Write-Host "ENCODED: " $Encoded
base64 service_account.json
- Username & password
- Leave it empty.
Token
(service account) is used for authentication
- Leave it empty.
Unsupported Features
GoodData.CN does not support the following features:
- The PERCENTILE function.
- The REGR_R2 function.
- Period-over-period combination with years and weeks is not available.
- Example: A metric filtered by this week compared against the same metric filtered by a week in the previous year.
- Referential integrity:
- BigQuery does not support referential integrity (primary and foreign keys).
- Primary and foreign keys cannot be utilized when generating a logical data model (LDM) from a physical data model (PDM).
Note
If you want to generate primary keys and references into the LDM automatically, you can utilize database naming conventions as an alternative solution.
Performance Tips
If your database holds a large amount of data, consider the following practices:
- Denormalize the relational data model of your database.
- This helps avoid large JOIN operations. Because BigQuery is a columnar database, queries read only the required columns and each column is compressed separately.
- Utilize clustered tables.
- Data can be pruned when using clustered columns.
- Utilize partitioned tables.
- To map the BigQuery provided pseudo-columns
_PARTITIONDATE
and_PARTITIONTIME
onto the LDM:- Reference your date dataset in the dataset mapped to the partitioned table.
- Map the foreign key representing the reference to your date dataset to one of the pseudo-columns mentioned above.
- To map the BigQuery provided pseudo-columns
- Utilize materialized views.
- Changes to underlying tables are propagated to related materialized views. They are always consistent.
- Materialize the results of JOINs and aggregations. These are executed very often as a result of using dashboards or insights.
- Map materialized views and their columns to datasets and the LDM. You can utilize them in metrics, insights, and dashboards.
Query Timeout
Query timeout is not supported for Google BigQuery yet.
Permitted parameters
- ProjectId
- InitiateOAuth