1. Connect Data

To perform analytics with GoodData, you must first establish a connection to a database. This process involves creating a data source object in GoodData, which represents the external database containing your source data. We query your database in real time to compute analytics, and only cache the results.

This tutorial uses our sample Snowflake database. However, you are welcome to connect to any of the supported databases or upload a CSV file. For the best learning experience, we recommend completing this tutorial with the sample Snowflake database before using your own data.

UI
API

Steps:

  1. Go to Data sources and click Connect data.

    The Data sources page with the Connect data button to add a new data source.
  2. Select the type of database you are using, we recommend connecting to our sample Snowflake database.

    Connect data dialog displaying supported database options, with Snowflake highlighted.
  3. Name your data source, fill in the connection credentials:

    • Account Name: gooddata
    • Username: gooddata_demo
    • Password: 4m62f7hSXAayAisZ
    • Database name: GOODDATA_DEMO_DATABASE
    • Warehouse: GOODDATA_DEMO_WAREHOUSE

    and click Connect.

    Data source creation dialog with fields for connection credentials.
  4. Fill in the schema name ECOMMERCE_DEMO_SCHEMA and click Save.

    Second part of the Data source creation dialog for setting up schemas and caching.

    The data source is created and appears in the list of data sources.

    The Data source page listing the newly created data source.

You can connect to our sample Snowflake database using the following API call:

curl $HOST_URL/api/v1/entities/dataSources \
  -H "Content-Type: application/vnd.gooddata.api+json" \
  -H "Accept: application/vnd.gooddata.api+json" \
  -H "Authorization: Bearer <API_TOKEN>" \
  -X POST \
  -d '{
      "data": {
          "attributes": {
              "name": "Demo Snowflake DB",
              "url": "jdbc:snowflake://gooddata.snowflakecomputing.com?warehouse=GOODDATA_DEMO_WAREHOUSE&db=GOODDATA_DEMO_DATABASE",
              "schema": "ECOMMERCE_DEMO_SCHEMA",
              "type": "SNOWFLAKE",
              "username": "gooddata_demo",
              "password": "4m62f7hSXAayAisZ"
          },
          "id": "demo-ds",
          "type": "dataSource"
      }
  }' | jq .

To confirm that the database has been connected, the server returns the following response:

{
  "data": {
    "id": "demo-ds",
    "type": "dataSource",
    "attributes": {
      "name": "Demo Snowflake DB",
      "type": "SNOWFLAKE",
      "url": "jdbc:snowflake://gooddata.snowflakecomputing.com?warehouse=GOODDATA_DEMO_WAREHOUSE&db=GOODDATA_DEMO_DATABASE",
      "schema": "ECOMMERCE_DEMO_SCHEMA",
      "username": "gooddata_demo"
    }
  },
  "links": {
    "self": "$HOST_URL/api/v1/entities/dataSources/demo-ds"
  }
}

About the Sample Database

The data represent a simple scenario of an e-commerce system that tracks customers, how they order certain products, and how much money was spent on various marketing campaigns in different campaign channels.

The data is stored and organized according to the following physical schema of tables and their relationships:

Sample data structure showing tables and their relationships in the e-commerce schema.

The database schema describes the structure of the physical tables in the sample database, forming the foundation for building your LDM. The relationships between the sample datasets are structured as follows:

  1. Customer Table

    • Contains details such as customer city, country, email, ID, and state.
    • Can be connected to the Order lines and Returns tables through the Customer ID primary key.
    • Includes Customer created date which can be mapped to a corresponding date dimension dataset in the LDM.
  2. Product Table

    • Contains product-related information such as product brand, category, image, rating, and ID.
    • Can be connected to the Monthly Inventory and Order Lines tables via Product ID as the primary key to track which products are included in orders.
  3. Orders Table

    • Stores order information, including order ID and order status.
    • Can be connected to the Order Lines and Customer tables via the Order ID primary key.
  4. Monthly Inventory Table

    • Tracks inventory levels over time, including inventory ID, quantity at beginning (BOM) and end (EOM) of the month.
    • Can be connected to the Product table via the Product ID foreign key to track inventory per product.
  5. Order Lines Table

    • Central table that stores details about each item within an order, such as customer age, order line ID, unit cost, discount, unit price, and unit quantity.
    • Can be connected to the Customer, Product, and Order tables using the Order ID, Product ID, and Order ID foreign keys.
    • Connected to the Date and Order Date date dimension datasets.
  6. Returns Table

    • Tracks returned items, including return ID, unit cost, paid amount, and unit quantity.
    • Can be connected to the Customer and Orders via the Customer ID and Order ID foreign keys to associate returns with specific orders.
    • Connected to the Date and Return Date date dimension datasets.
  7. Date Tables
    Several Date dimension tables are used to track various time-related aspects, including Date, Order Date, Inventory Month, Return Date, and Customer Created Date. These tables function as date entities and do not physically exist in the database schema. Instead, data fields in the database tables are mapped to date dimension datasets within GoodData. Since the date dimension is handled virtually, there is no need to create physical tables in the database.