Google BigQuery

Prerequisite

Before you can use BigQuery with GoodData.CN, you must install the JDBC driver into GoodData.CN. Due to licensing restrictions, GoodData is not allowed to distribute the JDBC driver for BigQuery.

You can download the driver from the Google BigQuery website. Alternatively, you can download the same driver from the Simba website for BigQuery (registration required).

To install the driver, follow the procedure for injecting custom JDBC drivers that corresponds to your edition of GoodData.CN:

Data Source Details

Use the following information when creating a data source for your BigQuery database:

The JDBC URL must be in the following format:
- jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<Your Project ID>;OAuthType=0
Schema
- Dataset name

Token

Only service account authentication is supported

Encode service account JSON by Base64

Bash

PowerShell 7

Mac

base64 -w0 service_account.json

$File = "c:\secrets\service_account.json"
$Content1 = get-content $File
$Bytes = [System.Text.Encoding]::UTF8.GetBytes($Content1)
$Encoded = [System.Convert]::ToBase64String($Bytes)
Write-Host "ENCODED: " $Encoded

base64 service_account.json

Username & password
- Leave it empty. Token(service account) is used for authentication

Unsupported Features

GoodData.CN does not support the following features:

The PERCENTILE function.
The REGR_R2 function.
Period-over-period combination with years and weeks is not available.
- Example: A metric filtered by this week compared against the same metric filtered by a week in the previous year.
Referential integrity:
- BigQuery does not support referential integrity (primary and foreign keys).
- Primary and foreign keys cannot be utilized when generating a logical data model (LDM) from a physical data model (PDM).
  Note
  If you want to generate primary keys and references into the LDM automatically, you can utilize database naming conventions as an alternative solution.

Performance Tips

If your database holds a large amount of data, consider the following practices:

Denormalize the relational data model of your database.
- This helps avoid large JOIN operations. Because BigQuery is a columnar database, queries read only the required columns and each column is compressed separately.
Utilize clustered tables.
- Data can be pruned when using clustered columns.
Utilize partitioned tables.
- To map the BigQuery provided pseudo-columns _PARTITIONDATE and _PARTITIONTIME onto the LDM:
  - Reference your date dataset in the dataset mapped to the partitioned table.
  - Map the foreign key representing the reference to your date dataset to one of the pseudo-columns mentioned above.
Utilize materialized views.
- Changes to underlying tables are propagated to related materialized views. They are always consistent.
- Materialize the results of JOINs and aggregations. These are executed very often as a result of using dashboards or insights.
  - Map materialized views and their columns to datasets and the LDM. You can utilize them in metrics, insights, and dashboards.

Query Timeout

Query timeout is not supported for Google BigQuery yet.

Permitted parameters

ProjectId
InitiateOAuth

Vertica