Deployment Architecture

This article provides an overview of GoodData.CN’s microservice architecture and the flow of a report computation request.

Microservices

GoodData.CN is primarily composed of microservices that communicate synchronously through gRPC and asynchronously via Apache Pulsar. External communication with GoodData.CN is possible through the public REST API.

Most of these microservices are built using Kotlin (JVM) with the Spring Boot framework, while others are based on Python.

GoodData Architecture

Authorization Service

This service handles authentication-related use cases (based on OAuth 2.0) through the OAuth2 Library. The service also provides user context, containing basic information about the organization, authenticated user, and more. This context is useful for all API clients.

GoodData.CN stores session data (Access and ID tokens) as encrypted HTTP cookies to enhance security.

Metadata Service

This service manages all GoodData.CN metadata, such as report and metric definitions, and exposes a public REST API for managing it. Metadata is stored in a PostgreSQL database.

The Calcique Service and SQL Executor cache some metadata. Therefore, the Metadata Service must notify these services about specific metadata changes asynchronously through Apache Pulsar.

For real-time metadata requests, dedicated gRPC calls are used.

AFM Service

The AFM (Attributes, Filters, and Metrics, formerly called measures) Service serves as the gateway for report computation and related use cases. It allows REST clients to define report executions more easily than by creating complex metrics. The service handles the following use cases:

  • Report execution: Computation of Reports
  • Valid objects: Returns compatible Logical Data Model (LDM) entities (attributes, facts, metrics) based on the current context (AFM).
  • Label elements: Returns label values (distinct values in the mapped table column).

The service utilizes the MAQL parser, collects relevant metadata from the Metadata Service, and transforms AFM requests into MAQL ASTs. These ASTs can be used by the Calcique Service to generate corresponding SQL statements.

Scan Model Service

This service forwards requests to the SQL Executor Service, which communicates exclusively with the data source). Once the SQL Executor Service retrieves all data source metadata, it returns the result to the Scan Model Service. This service then uses the metadata to generate an LDM based on naming convention parameters. The Scan Model Service is also responsible for testing connectivity to data sources.

Calcique Service

The Calcique Service is the core component of GoodData.CN. Built on the Apache Calcite project, it provides features for manipulating SQL trees and generating valid SQL statements for all supported data sources.

Inputs for this service include MAQL AST, LDM. The Calcique Service retrieves these inputs from the Metadata Service and caches them (cache invalidation is done through Apache Pulsar). The inputs are then used to generate optimized query trees, which are sent via Apache Pulsar to the SQL Executor Service.

Result Cache Service

The service receives results from the SQL Executor Service (streamed through gRPC) and stores them as raw results under rawId. Afterward, a process called cross-tabulation begins, which is responsible for pivoting, sorting, and pagination (dividing results into multiple pages). Once the cross-tabulation is completed, the results are stored under resultId in FlexQuery.

Once the service has fully utilized it’s filestorage (1GB by default) all new caches trigger last recently used caches eviction to keep filestorage under it’s limit.

Redis is used to store metadata about the cached data in FlexQuery.

SQL Executor Service

This service is responsible for executing queries, retrieving results, storing results in the cache, managing database connection pooling, scanning database metadata (schemas, tables, columns, constraints), and testing data source configuration correctness.

Export Services

Services responsible for exporting visualizations as PDF, XLSV and CSV files.

Report Computation Request

Report computation is a common operation in GoodData.CN. The following text and flow diagram illustrate how a report computation request is processed.

The entire flow consists of:

  1. A POST request to /api/v1/actions/workspaces/{workspaceId}/execution/afm/execute with an AFM (Attributes, Filters, and Measures) in the request body. The response includes a resultId, which is predictably generated. This means that repeated requests for the same report will utilize the result cache and avoid additional computations on the data source.
  2. The AFM body is translated into MAQL.
  3. Using the LDM (logical data model), object identifiers (facts, metrics, attributes, etc.) are resolved in MAQL. The Calcique Service can then generate corresponding SQL statements.
  4. SQL statements are executed against the data source, and raw results are stored in the cache under rawId.
  5. Crosstabulation is applied, and the final results are stored in the cache under resultId.
  6. A GET request to /api/v1/actions/workspaces/{workspaceId}/execution/afm/execute/result/{resultId} retrieves the result.
Computation of Request

UI Applications

This service includes all GoodData UI applications: