Advanced Modelling Use Cases

Every database is different, and sometimes one or more of the following edge case scenarios may need to be considered when modelling the data.

Multiple Datasets Mapped to the Same Table

This is typically needed when one table represents multiple logical entities, and each entity should be represented by a separate dataset in the LDM.

While the LDM Modeler supports mapping of multiple datasets to the same table, publishing an LDM with such mapping fails.

To avoid this issue, create multiple views on top of the table and map each dataset to a separate view.

For example, you have two tables, users and tickets.

  • The users table contains ticket creators and assignees.
  • The tickets table contains the assignee_id and creator_id columns.

To avoid mapping multiple datasets to the users table, do the following:

  1. In the database, create two views on top of the users table: v_users_assignees and v_users_creators.
  2. In the LDM, create three datasets: assignees, creators, and tickets.
  3. Map the tickets dataset to the tickets table.
  4. Map the assignees dataset to the v_users_assignees view.
  5. Map the creators dataset to the v_users_creators view.
  6. Create a relationship from the assignees dataset to the the tickets dataset using the assignee_id column as a primary key in the assignees dataset.
  7. Create a relationship from the creators dataset to the tickets dataset using the creator_id column as a primary key in the creators dataset.

Using Multiple Data Sources in an LDM

  • You can use datasets from different data sources in an LDM. They can be connected using common attribute or date dimension datasets.
  • The list of data sources prioritizes those already used in the current model. You can change the data source in dataset mapping.

Limitations

Understanding Data Federation, Data Blending, and Data Joins

  • Data Federation:
    Data federation creates a virtual layer that lets users access and query data from multiple sources without moving or copying it.

  • Data Blending:
    Data blending combines specific subsets of data from multiple sources for analysis in a single visualization.

  • Data Joins:
    The key difference between data blending and traditional joins is the timing of the operation:

    • In traditional joins, the join is usually performed before aggregation within the database.
    • In data blending, the operation is typically applied after aggregation, using data that has already been processed from different sources.

Blending Data from Multiple Sources in a Single Visualization

You can create visualizations that use multiple metrics from different data sources. These metrics are blended into one common result set using shared conformed dimension attributes. This approach allows you to connect data from multiple sources in real-time, giving you a complete view of your business without the need for complex data movement.

A visualization showing data from multiple data sources

The common attributes can be:

  • An attribute of a date dimension dataset referenced in each data source used for data blending.
  • An attribute defined as a primary key in a dimension dataset, present as a foreign key in each data source used for data blending.

These common dimension attributes can be used for breaking down metrics, grouping, slicing, or filtering in a visualization that uses blended data.

LDM with common dimension datasets

Attributes are defined by their primary label (e.g., Federal Information Processing Standard state code). Optionally, you can define a secondary label (e.g., a user-friendly display name like “California”) in the dimension dataset. In this case, blended visualizations will display this label, even if the secondary label is not present in all data sources.

However, using secondary labels for common attributes requires retrieving all dimension attribute values during queries, which can slow down processing. For this reason, we recommend avoiding secondary labels for attributes with high cardinality.

Limitations

  • Ranking Filters and Metric Value Filters in Analytics Designer cannot be used with visualizations that combine metrics from multiple sources.

  • Metrics from different sources can only be sliced or filtered using attributes that are:

    • References to common dimension datasets.
    • Present in each data source involved in the visualization.
  • The Show Missing Values feature is not supported.

  • Only fact-based metrics can be used.

  • A fact table can be directly linked to a dataset, but there can only be one data source change in a single path of relationships.

    For example:
    Country (Snowflake) -> County (BigQuery) -> City (Snowflake) -> Facts (Snowflake)
    In this example, the path changes data sources twice (from Snowflake to BigQuery and back to Snowflake), which is not supported.