Generate a Logical Data Model Automatically Using the API
Once you have generated the physical data model (PDM) either using the API or in the LDM Modeler, generate the logical data model (LDM) based on the PDM automatically using the API.
If you cannot or do not want to use the API, you can work in the LDM Modeler.
Generating the LDM automatically is suitable for quick onboarding and for exploring the content of your database, or when your database is already prepared for building analytics and does not involve complex analytical scenarios. If this is not the case, create the LDM manually in the LDM Modeler.
To generate the LDM automatically using the API, do the following:
- Learn how the PDM is transformed to the LDM.
- Generate the LDM from the stored PDM.
- Load the generated LDM definition to the workspace.
Learn how the PDM Is Transformed to the LDM
When a declarative definition of the LDM is generated, certain rules are applied that affect how LDM entities are created based on the PDM.
Understanding these rules will help you adjust your database and make the process of generating the LDM smoother and more accurate.
Default Configuration
By default, if you have not provided any customized parameters in the API request for generating the LDM, entities for the LDM are derived from the PDM based on data types assigned to table columns and referential integrity in the database.
LDM Entity | Expected Database Entity |
---|---|
Dataset | Table, view |
Attribute | `CHAR`-like, `INT` columns |
Fact | `NUMERIC`-like columns |
Date dataset | `DATE`, `TIMESTAMP(TZ)` columns |
Grain | `PRIMARY KEY` |
Reference | `FOREIGN KEY` |
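For instance, a request that relies purely on this default behavior might look like the following sketch. It reuses the `demo-ds` data source ID and the bearer token from the examples later in this article, and it assumes that an empty JSON body is accepted because all generation parameters are optional:

# Generate the LDM using only the default rules (no custom prefixes or separator)
curl $ENDPOINT/api/actions/dataSources/demo-ds/generateLogicalModel \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
  -X POST \
  -d '{}' \
  | jq . > default-ldm.json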
The default configuration has the following limitations:
- It always reads the whole PDM; you cannot select just a part of the PDM to read.
- It cannot detect advanced LDM entities such as attribute labels.
- It cannot generate facts from columns with the `INT` data type.
- It cannot generate grains and references if referential integrity is not maintained in the database.
- It cannot generate proper LDM entities (such as facts, labels, and so on) from columns with incorrectly defined data types.
To overcome these limitations, use the naming conventions in your database to make sure that PDM entities are identified and interpreted correctly regardless of the column data types and referential integrity.
Database Naming Conventions
Use the following naming conventions to name objects in your database:
Entity | Name Format | Example |
---|---|---|
Table/view (standard) | `tablePrefix__baseName` | `gd_table__customer` |
View | `viewPrefix__baseName` | `gd_view__customer` |
Label | `labelPrefix__attributeBaseName__labelBaseName` | `ls__country_id__country_name` |
Fact | `factPrefix__baseName` | `f__sales_amount` |
Grain | `grainPrefix__baseName` | `gr__id_customer_pk` |
Reference | `referencePrefix__tableBaseName__referenceBaseName` | `r__customer__surname` |
Workspace Data Filter | `wdfPrefix__baseName` | `wdf__region` |
The prefixes (`tablePrefix`, `labelPrefix`, and so on) are not case-sensitive.
A Workspace Data Filter is used to filter data in workspaces. To learn more, go to Workspace Data Filters.
Attributes and labels are not generated for columns that have been filtered using the workspace data filter.
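As a sketch of how these conventions translate into the generation request: the `separator`, `viewPrefix`, `grainPrefix`, `secondaryLabelPrefix` (the label prefix), and `referencePrefix` parameters appear in the example later in this article, while `tablePrefix`, `factPrefix`, and `wdfPrefix` are assumed here to follow the same naming pattern; check the API reference for the exact parameter names.

# tablePrefix, factPrefix, and wdfPrefix are assumed parameter names (see the note above)
curl $ENDPOINT/api/actions/dataSources/demo-ds/generateLogicalModel \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
  -X POST \
  -d '{"separator": "__", "tablePrefix": "gd_table", "viewPrefix": "gd_view", "secondaryLabelPrefix": "ls", "factPrefix": "f", "grainPrefix": "gr", "referencePrefix": "r", "wdfPrefix": "wdf"}' \
  | jq . > ldm.json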
Separators
In the table above, a double underscore (`__`) is used as a separator between prefixes and entity names, and between the individual sections of reference names and label names. In the API request for generating the LDM, you can define any string as a separator.
We recommend that you choose a separator string that is unlikely to appear anywhere else in your column names. For example, if you use a single underscore (`_`) as a separator and you have a column named `referencePrefix_table_1_fk_column`, the column name cannot be unambiguously split into the three required sections (`prefix`, `referencedTable`, `referencedColumn`).
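As an illustration only (this runs locally, outside the API), the following snippet shows why the double underscore splits cleanly while a single underscore does not:

# A double underscore splits a reference column name into exactly three sections:
echo 'r__customer__surname' | awk -F'__' '{print $1, $2, $3}'
# r customer surname

# A single underscore does not, because the base names themselves contain underscores:
echo 'referencePrefix_table_1_fk_column' | awk -F'_' '{print NF, "sections"}'
# 5 sections - no unambiguous mapping to prefix, referencedTable, referencedColumn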
Naming Conventions vs. Referential Integrity
Do not apply the naming conventions to the columns that are used as or are included in a primary key or a foreign key in your database. Such columns will be interpreted as grains and references based on referential integrity.
Entity Names in the Generated LDM
In the generated LDM, entity names and descriptions are generated based on the corresponding column names. To make an entity name more human-readable, the following happens when the entity name is injected into the LDM:
- The prefix is removed from the name.
- The first character is upper-cased.
- Any non-alphanumeric character is replaced by a space.
Column Name in PDM | Entity ID in LDM | Entity Name in LDM |
---|---|---|
`gd_table__customer` | `customer` | Customer |
`customer.f__amount` | `customer.amount` | Amount |
`customer.ls__customer_key__customer_name` | `customer.customer_name` | Customer name |
`customer.region_name` (attribute) | `attr.customer.region_name` | Region name |
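As a rough local illustration of these three rules (not part of the API; assumes bash 4 or later), here is how the label column from the table above ends up with the name "Customer name":

col='customer.ls__customer_key__customer_name'   # label column from the table above
base="${col##*__}"            # remove the prefix sections -> customer_name
name="${base//_/ }"           # replace the non-alphanumeric underscore with a space -> customer name
echo "${name^}"               # upper-case the first character -> Customer name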
Duplicate label IDs may be generated for different columns in the same table in the database (and, consequently, for different labels in the corresponding dataset in the LDM).
For example, both `lineitem.ls__customer_id__url` and `lineitem.ls__supplier_id__url` will be transformed into `lineitem.url`.
While an LDM with duplicate label IDs can be generated, publishing this LDM will fail.
To avoid this situation, make sure that the base label names differ from each other, for example:
- Use `lineitem.ls__customer_id__customer_url` instead of `lineitem.ls__customer_id__url` so that the label ID is generated as `lineitem.customer_url`.
- Use `lineitem.ls__supplier_id__supplier_url` instead of `lineitem.ls__supplier_id__url` so that the label ID is generated as `lineitem.supplier_url`.
Generate the LDM from the Stored PDM
To generate the LDM from the stored PDM, submit a POST request to `/api/actions/dataSources/<data-source-id>/generateLogicalModel`, where `<data-source-id>` is the ID of the data source that corresponds to the database from which the PDM was generated.
As a result, a JSON file with a declarative definition of the LDM is generated.
Example: Generating the LDM based on the PDM stored under the `demo-ds` data source, which represents the pre-installed PostgreSQL database with the sample data prepared in the GoodData.CN Community Edition image.
The naming convention will be used for views, grains (primary keys), secondary labels, and references. The LDM definition will be saved to the `ldm.json` file.
curl $ENDPOINT/api/actions/dataSources/demo-ds/generateLogicalModel \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
-X POST \
-d '{"separator": "__", "viewPrefix": "mtt", "grainPrefix": "gr", "secondaryLabelPrefix": "ls", "referencePrefix": "r"}' \
| jq . > ldm.json
Invoke-RestMethod -Method Post -Uri "$ENDPOINT/api/actions/dataSources/demo-ds/generateLogicalModel" `
  -ContentType 'application/json' `
  -Headers @{
    'Accept' = 'application/json'
    'Authorization' = 'Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz'
  } `
  -Body '
  {
    "separator": "__",
    "viewPrefix": "mtt",
    "grainPrefix": "gr",
    "secondaryLabelPrefix": "ls",
    "referencePrefix": "r"
  }' | ConvertTo-Json -Depth 15 > ldm.json
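To get a quick overview of what was generated, you can list the dataset IDs from the saved file. The `.ldm.datasets[].id` path below assumes that the generated document uses the declarative LDM layout with a top-level `ldm` object; adjust it if your output differs:

# List the IDs of the generated datasets (JSON path assumed, see the note above)
jq '.ldm.datasets[].id' ldm.json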
Load the Generated LDM Definition to the Workspace
To load the LDM definition to the workspace, submit a PUT request to `/api/layout/workspaces/<workspace-id>/logicalModel`, where `<workspace-id>` is the ID of the workspace to which you want to load the LDM.
Example: Loading the LDM definition to the `demo` workspace
curl $ENDPOINT/api/layout/workspaces/demo/logicalModel \
-H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
-H "Content-Type: application/json" \
-X PUT -d @ldm.json
Invoke-RestMethod -Method Put -Uri "$ENDPOINT/api/layout/workspaces/demo/logicalModel" `
  -ContentType 'application/json' `
  -Headers @{ 'Authorization' = 'Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz' } `
  -InFile ldm.json
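To verify that the LDM was loaded, you can read it back from the workspace with a GET request to the same endpoint (a sketch; the `.ldm.datasets[].id` path is the same assumption as in the previous section):

curl $ENDPOINT/api/layout/workspaces/demo/logicalModel \
  -H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
  -X GET | jq '.ldm.datasets[].id'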
Once the LDM is loaded to the workspace, you can start building dashboards and insights.