Create Data Sources
A data source is a logical object in GoodData that represents the database where your source data is stored. To integrate your database into GoodData, you connect it to a workspace. As a result, a data source is registered as an entity in GoodData.
Data sources can be created using the web UI, the API or the Python SDK. To create a data source, refer to one of the following articles:
- Create a Snowflake Data Source
- Create an Amazon Redshift Data Source
- Create an Azure SQL Data Source
- Create a Google BigQuery Data Source
- Create a Greenplum Data Source
- Create a Databricks Data Source
- Create a Microsoft SQL Server Data Source
- Create a PostgreSQL Server Data Source
- Create a Synapse SQL Data Source
- Create a Vertica Data Source
- Create a Dremio Data Source
- Create an Apache Drill Data Source
Anatomy of a Data Source
A data source has the following properties:
Type
The type of database you are connecting to. For the exact list of available types, submit a GET request to /api/v1/options/availableDrivers.
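As a rough illustration of the request above, the following Python sketch builds the GET call with the standard library only. The host name, the API token, and the use of a Bearer token in the Authorization header are assumptions made for this example; substitute the values and authentication scheme that apply to your deployment.

```python
import json
import urllib.request


def build_available_drivers_request(host: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that lists available driver types.

    `host` and `api_token` are placeholders supplied by the caller; the
    Bearer-token header is an assumption of this sketch.
    """
    url = f"{host.rstrip('/')}/api/v1/options/availableDrivers"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_token}"},
        method="GET",
    )


def fetch_available_drivers(host: str, api_token: str):
    """Send the request and return the decoded JSON response body."""
    request = build_available_drivers_request(host, api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling fetch_available_drivers with your own host and token performs the network request. The response format is not described here, so the sketch returns the decoded JSON unchanged.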
URL
The URL uses the standard JDBC URL format. For information about how to set up the JDBC URL for your database, review the Create Data Source article specific to your database.
- The URL must not be empty.
- The URL must start with jdbc: and be constructed in the following format: jdbc:<database-type>://<host>:<port>/<path>?<param-1>=<value-1>&<param-2>=<value-2>
  You can use ; instead of ? and &.
- The query cannot contain parameters that have been forbidden for security reasons. For more information, review the Create Data Source article specific to your database.
- Depending on the database, the path and query can be optional. For more information, review the Create Data Source article specific to your database.
- The combination of (database type, host, database) cannot be the same as the combination used for hosting any of the internal data storages:
  - Analytics metadata (visualizations, metrics, …)
  - The internal OIDC identity provider (authentication)
  - Caching metadata (evidence of cache tables and their expiry)
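The URL rules above can be sketched as a small client-side check. This is an illustrative approximation, not the validation GoodData performs. The function name, the regular expression, and the forbidden_params argument are hypothetical, and the actual list of forbidden parameters depends on your database. The sketch also treats the port as optional, which is more lenient than the format shown above, and it does not check the rule about internal data storages.

```python
import re

# Approximation of jdbc:<database-type>://<host>:<port>/<path>?<params>.
# The query may start with "?" or ";" as described above.
_JDBC_URL = re.compile(
    r"^jdbc:(?P<db_type>[A-Za-z0-9_+.-]+)://"
    r"(?P<host>[^:/?;]+)"
    r"(?::(?P<port>\d+))?"       # port treated as optional (assumption)
    r"(?:/(?P<path>[^?;]*))?"    # path is optional for some databases
    r"(?:[?;](?P<query>.*))?$"   # query is optional for some databases
)


def parse_jdbc_url(url, forbidden_params=frozenset()):
    """Split a JDBC URL into its parts, raising ValueError on a rule violation."""
    if not url:
        raise ValueError("The URL must not be empty")
    if not url.startswith("jdbc:"):
        raise ValueError("The URL must start with 'jdbc:'")
    match = _JDBC_URL.match(url)
    if match is None:
        raise ValueError("The URL does not follow the expected format")

    params = {}
    query = match.group("query") or ""
    # Parameters may be separated by "&" or ";".
    for pair in re.split(r"[&;]", query):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        if key in forbidden_params:
            raise ValueError(f"Forbidden parameter: {key}")
        params[key] = value

    port = match.group("port")
    return {
        "db_type": match.group("db_type"),
        "host": match.group("host"),
        "port": int(port) if port else None,
        "path": match.group("path") or "",
        "params": params,
    }
```

For example, parse_jdbc_url("jdbc:postgresql://db.example.com:5432/sales?sslmode=require") yields the database type postgresql, the host db.example.com, the port 5432, the path sales, and a single parameter sslmode.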
Username and password
The username and password are optional. These can be empty if implicit authentication (for example, AWS Identity and Access Management) is used.
Tokens
Tokens are an alternative to usernames and passwords.
- Example: BigQuery service account encoded by Base64
Review the Create Data Source article specific to your database to find out if tokens are an option or a requirement for your setup.
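To make the BigQuery example concrete, the following sketch Base64-encodes the contents of a service account key file. The function name and the sample JSON are hypothetical; the exact fields of a real key file and how the resulting token is passed to GoodData are covered in the database-specific article.

```python
import base64
import json


def encode_service_account(key_json: str) -> str:
    """Return the Base64 encoding of a service account key given as JSON text."""
    # Fail early if the input is not valid JSON (raises json.JSONDecodeError).
    json.loads(key_json)
    return base64.b64encode(key_json.encode("utf-8")).decode("ascii")


# Hypothetical, heavily shortened key content used only for illustration.
sample_key = '{"type": "service_account", "project_id": "my-project"}'
token = encode_service_account(sample_key)
```

Decoding the token with base64.b64decode returns the original JSON text, which is a quick way to confirm that the value was encoded correctly.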
Schema
A schema is required. A data source can be connected with only one schema. If you want to access multiple schemas in your database, create a separate data source for each schema.
Note
For data source managers, all schemas are used when you set schema to an empty value (""). Additionally, it is possible to specify only part of the schema to use all schemas starting with the specified value. For example, specifying the schema mypostgres will use the schemas mypostgres.demo and mypostgres.tpch.
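The matching behavior described in the note can be modeled as a simple prefix check. This is only a sketch of the rule as stated, not GoodData's implementation. The function name and the sample schema names, other than mypostgres.demo and mypostgres.tpch, are hypothetical.

```python
def schemas_in_scope(schema_setting: str, available: list[str]) -> list[str]:
    """Return the schemas that a given schema setting would cover.

    An empty setting matches every schema, because every string starts
    with the empty string.
    """
    return [name for name in available if name.startswith(schema_setting)]


available_schemas = ["mypostgres.demo", "mypostgres.tpch", "other.sales"]

prefix_match = schemas_in_scope("mypostgres", available_schemas)
# -> ["mypostgres.demo", "mypostgres.tpch"]

all_schemas = schemas_in_scope("", available_schemas)
# -> ["mypostgres.demo", "mypostgres.tpch", "other.sales"]
```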
Using AWS PrivateLink
GoodData Cloud can be used in conjunction with AWS PrivateLink to establish a secure connection to your data. To set up PrivateLink, follow these steps:
1. Reach out to GoodData support and let us know that you want to set up PrivateLink. We will provide you with your GoodData account ID, which you will need when setting up your VPC endpoint.
2. In your AWS account, create a load balancer and a VPC endpoint service. See Use AWS PrivateLink with GoodData Cloud on our community portal for a worked example of how to set up PrivateLink for an Amazon Redshift data source.
3. Pass the name of your VPC endpoint service to us. We will finalize setting up the PrivateLink connection on our end.
4. Create your data source and use the DNS name that we provide as the host name.