Amazon Redshift

Data Source Details

Use the following information when creating a data source for your Amazon Redshift database:

The JDBC URL must be in the following format:
jdbc:redshift://<host>:<port>/<databaseName>
Basic authentication is supported. Specify user and password.
If you use native authentication inside your cloud platform (for example, Google Cloud Platform, Amazon Web Services, or Microsoft Azure), you do not have to provide the username and password.
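The URL format above can be assembled and checked programmatically. The following is a minimal sketch, not part of GoodData: the helper names, the example host, and the default port 5439 (the usual Redshift port) are assumptions made for illustration.

```python
import re

# Pattern for the documented format: jdbc:redshift://<host>:<port>/<databaseName>
JDBC_URL_PATTERN = re.compile(
    r"^jdbc:redshift://(?P<host>[^:/]+):(?P<port>\d+)/(?P<database>[^/?;]+)$"
)


def build_redshift_jdbc_url(host: str, database: str, port: int = 5439) -> str:
    """Assemble a JDBC URL in the documented format (hypothetical helper)."""
    return f"jdbc:redshift://{host}:{port}/{database}"


def parse_redshift_jdbc_url(url: str) -> dict:
    """Split a JDBC URL into host, port, and database, or raise ValueError."""
    match = JDBC_URL_PATTERN.match(url)
    if match is None:
        raise ValueError(f"Not a valid Redshift JDBC URL: {url!r}")
    return match.groupdict()


# The host below is a placeholder, not a real cluster endpoint.
url = build_redshift_jdbc_url("example-cluster.example.com", "dev")
print(url)
print(parse_redshift_jdbc_url(url))
```

Parsing the URL before you save the data source catches a missing port or database name early, without a round trip to the cluster.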

GoodData uses driver version 2.1.0.9.

Unsupported Features

GoodData does not support the following features:

Statistical functions:
- regr_slope
- regr_intercept
- covar_samp
- corr
- regr_r2
Statistical running functions:
- stdev
- stdevp
- var
- varp
Window functions (running aggregations like RUNSUM, RUNMAX or RUNVARP) with unbounded beginning of frames.

User Access Rights

We recommend that you create a dedicated user and user role for integration with the GoodData platform.

Steps:

Create a user role and grant the following access rights to it:

GRANT USAGE ON SCHEMA {schema_name} TO ROLE {role_name};
GRANT SELECT ON ALL TABLES IN SCHEMA {schema_name} TO ROLE {role_name};

If you intend to enable pre-aggregation caching, also grant the following rights on your pre-aggregation schema {cache_schema_name}:

GRANT USAGE, CREATE ON SCHEMA {cache_schema_name} TO ROLE {role_name};

Create a user and grant the user role to it:

GRANT ROLE {role_name} TO USER {user_name};

Make the user role default for the user:

ALTER USER {user_name} SET DEFAULT_ROLE={role_name};
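To apply these steps with concrete names, the placeholders can be filled in by a small script. This is a sketch only: the function name, the identifier check, and the example names are hypothetical, and the statements are copied as-is from the steps above, so verify them against your Redshift version before running them.

```python
import re

# Conservative identifier check (an assumption): letters, digits, underscores only.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def render_grants(schema_name, role_name, user_name, cache_schema_name=None):
    """Fill the placeholders of the documented statements (hypothetical helper)."""
    names = [schema_name, role_name, user_name]
    if cache_schema_name is not None:
        names.append(cache_schema_name)
    for name in names:
        if not IDENTIFIER.match(name):
            raise ValueError(f"Unsafe identifier: {name!r}")

    statements = [
        f"GRANT USAGE ON SCHEMA {schema_name} TO ROLE {role_name};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema_name} TO ROLE {role_name};",
    ]
    # Only needed when pre-aggregation caching is enabled.
    if cache_schema_name is not None:
        statements.append(
            f"GRANT USAGE, CREATE ON SCHEMA {cache_schema_name} TO ROLE {role_name};"
        )
    statements.append(f"GRANT ROLE {role_name} TO USER {user_name};")
    statements.append(f"ALTER USER {user_name} SET DEFAULT_ROLE={role_name};")
    return statements


# Example names are placeholders.
for statement in render_grants("sales", "gooddata_role", "gooddata_user", "gd_cache"):
    print(statement)
```

Rejecting anything that is not a plain identifier keeps the rendered statements safe to paste into a SQL client.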

If you use AWS Identity and Access Management (IAM) for Redshift authentication, do the following:
1. Create an IAM role or user with permissions to call GetClusterCredentials (see https://docs.aws.amazon.com/redshift/latest/mgmt/generating-iam-credentials-role-permissions.html).
2. (Optional) Create a database user and database groups (see https://docs.aws.amazon.com/redshift/latest/mgmt/generating-iam-credentials-user-and-groups.html).

Performance Tips

If your database holds a large amount of data, consider the following practices:

Denormalize the relational data model of your database.
This helps avoid large JOIN operations. Because Amazon Redshift is a columnar database, queries read only the required columns and each column is compressed separately.
Choose the best sort key.
- Use the columns that are most frequently used for JOIN and aggregation operations. Those columns are typically mapped to attributes that are most frequently used for aggregations in visualizations.
- If you have to build analytics for multiple mutually exclusive use cases, prepare a separate table for each use case.
Choose the best distribution style. At a minimum, distribute on a column with high cardinality so that loaded data is spread evenly across your cluster.
Spin up databases/clusters based on user needs.
- Users with similar needs populate caches that are likely to be reused.
- Isolate data transformation operations running in your database from the analytics generated by GoodData.
Because Amazon Redshift does not support partitioning, use a related DATE or TIMESTAMP column as one of the sort keys to improve performance of visualizations using only the recent data.
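The sort key and distribution advice above can be illustrated with a table definition. The sketch below renders Redshift DDL from Python. The table, its columns, and the helper name are hypothetical, and the choice of customer_id as the high-cardinality distribution key is an assumption about the example data.

```python
def render_create_table(table, columns, dist_key, sort_keys):
    """Render a CREATE TABLE statement with a distribution key and a compound sort key."""
    column_names = [name for name, _ in columns]
    missing = [key for key in [dist_key, *sort_keys] if key not in column_names]
    if missing:
        raise ValueError(f"Key columns not defined in the table: {missing}")

    column_sql = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n{column_sql}\n)\n"
        f"DISTSTYLE KEY\n"
        f"DISTKEY ({dist_key})\n"
        f"COMPOUND SORTKEY ({', '.join(sort_keys)});"
    )


ddl = render_create_table(
    "orders_denormalized",  # hypothetical denormalized fact table
    [
        ("order_date", "DATE"),
        ("customer_id", "BIGINT"),
        ("region", "VARCHAR(64)"),
        ("amount", "DECIMAL(12,2)"),
    ],
    dist_key="customer_id",  # high cardinality, so rows spread evenly
    sort_keys=["order_date", "customer_id"],  # date first, for recent-data queries
)
print(ddl)
```

Putting the DATE column first in the sort key follows the last tip: queries that filter on recent dates can skip blocks that hold older data.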

Query Timeout

The query timeout is configurable per application instance. It is a parameter of the sql-executor service, and its default value is 160 seconds.

The query timeout is closely related to the ACK timeout. For the system to work correctly, the ACK timeout must be longer than the query timeout. The default ACK timeout is 170 seconds.
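The relationship between the two timeouts can be expressed as a simple check. This is an illustrative sketch: the constant and function names are hypothetical and do not correspond to actual sql-executor configuration keys. Only the default values come from this page.

```python
# Defaults stated in this section, in seconds.
DEFAULT_QUERY_TIMEOUT_S = 160
DEFAULT_ACK_TIMEOUT_S = 170


def timeouts_are_consistent(
    query_timeout_s: int = DEFAULT_QUERY_TIMEOUT_S,
    ack_timeout_s: int = DEFAULT_ACK_TIMEOUT_S,
) -> bool:
    """Return True when the ACK timeout is strictly longer than the query timeout."""
    return ack_timeout_s > query_timeout_s


print(timeouts_are_consistent())  # defaults: 170 > 160
print(timeouts_are_consistent(query_timeout_s=300))  # ACK timeout left at 170
```

If you raise the query timeout, raise the ACK timeout with it. In the second call the check fails because the ACK timeout was left at its default.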

Note

When a query fails because it exceeds the query timeout, the REST API call returns error code 500. This behavior is subject to change in a future release.

Permitted parameters

adaptiveFetch
adaptiveFetchMaximum
adaptiveFetchMinimum
allowEncodingChanges
ApplicationName
assumeMinServerVersion
autosave
binaryTransferDisable
binaryTransferEnable
cleanupSavepoints
connectTimeout
currentSchema
defaultRowFetchSize
disableColumnSanitiser
escapeSyntaxCallMode
gssEncMode
hostRecheckSeconds
loadBalanceHosts
localSocketAddress
loggerFile
loggerLevel
loginTimeout
logServerErrorDetail
logUnclosedConnections
maxResultBuffer
options
preferQueryMode
preparedStatementCacheQueries
preparedStatementCacheSizeMiB
prepareThreshold
readOnly
receiveBufferSize
reWriteBatchedInserts
sendBufferSize
socketFactory
socketFactoryArg
socketTimeout
ssl
sslcert
sslfactory
sslfactoryarg
sslhostnameverifier
sslmode
sslpassword
sslpasswordcallback
sslrootcert
targetServerType
tcpKeepAlive
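A set of connection parameters can be checked against the list above before it is submitted. The following is a minimal sketch under assumptions: the helper name is hypothetical, and matching is assumed to be case-sensitive, which this page does not state.

```python
# The permitted parameter names, copied from the list above.
PERMITTED_PARAMETERS = frozenset("""
adaptiveFetch adaptiveFetchMaximum adaptiveFetchMinimum allowEncodingChanges
ApplicationName assumeMinServerVersion autosave binaryTransferDisable
binaryTransferEnable cleanupSavepoints connectTimeout currentSchema
defaultRowFetchSize disableColumnSanitiser escapeSyntaxCallMode gssEncMode
hostRecheckSeconds loadBalanceHosts localSocketAddress loggerFile loggerLevel
loginTimeout logServerErrorDetail logUnclosedConnections maxResultBuffer
options preferQueryMode preparedStatementCacheQueries
preparedStatementCacheSizeMiB prepareThreshold readOnly receiveBufferSize
reWriteBatchedInserts sendBufferSize socketFactory socketFactoryArg
socketTimeout ssl sslcert sslfactory sslfactoryarg sslhostnameverifier
sslmode sslpassword sslpasswordcallback sslrootcert targetServerType
tcpKeepAlive
""".split())


def split_parameters(params: dict) -> tuple:
    """Separate permitted parameters from the names that are not on the list."""
    permitted = {k: v for k, v in params.items() if k in PERMITTED_PARAMETERS}
    rejected = sorted(k for k in params if k not in PERMITTED_PARAMETERS)
    return permitted, rejected


# "fetchEverything" is a made-up name used to show a rejected parameter.
permitted, rejected = split_parameters(
    {"ssl": "true", "socketTimeout": "30", "fetchEverything": "1"}
)
print(permitted)
print(rejected)
```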
