Power BI - Automatic Data Lineage

Michał Trybulec - Dataedo Team Michał Trybulec 3rd June, 2024

What to Expect

Dataedo imports automatically:

  1. Repors, visualizations and fields
  2. Data sets, tables and columns
  3. Column-level lineage from datasets - reports
  4. Column-level lineage from source database tables to data sets

SQL Server → Power BI

Lineage is created from Tables and Views in SQL Server to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: Import and Direct Query

Image title

Microsoft Fabric Data Lakehouse → Power BI

Lineage is created from Tables in Microsoft Fabric Data Lakehouse to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: OneLake, Import and Direct Query.

Limitation: If the Dataset uses MS Fabric Data Warhouse and MS Fabric Data Lakehouse at the same time, the lineage of only one of these two will be adjusted. Manually correcting Linked Source will solve the problem.

Image title

Microsoft Fabric Data Warehouse → Power BI

Lineage is created from Tables in Microsoft Fabric Data Warehouse to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: OneLake, Import and Direct Query.

Image title

PostgreSQL → Power BI

Lineage is created from Tables and Views in PostgreSQL to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: Import and Direct Query.

Image title

MySQL → Power BI

Lineage is created from Tables and Views in MySQL to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: Import and Direct Query.

Image title

SQL Server Analysis Services Tabular - import mode → Power BI

Lineage is created from Tables in SQL Server Analysis Services Tabular to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: Import and Live Connection

Image title

SQL Server Analysis Services Tabular - live connection mode → Power BI

Image title

SQL Server Analysis Services Multidimensional → Power BI

Lineage is created from Cubes and Dimensions in SQL Server Analysis Services Tabular to Dataset Tables in Power BI (only for cubes measures) and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: Import and Live Connection

Limitation: SSAS Multidimensional does column lineage only for cube measures

Image title

Snowflake → Power BI

Lineage is created from Tables in Snowflake to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: Import and Direct Query.

Image title

Databricks Unity Catalog → Power BI

Lineage is created from Tables in Databricks to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level. Supported Data Connectivity modes: Import and Direct Query.

Image title

File → Power BI

Lineage is created from File (csv, xlsx) to Dataset Tables in Power BI and from Dataset Tables to Power BI Reports. It is created at the column level.

Image title

SQL Server → Power BI Paginated Report

Lineage is created from Tables and Views in SQL Server to Paginated Reports Datasets in Power BI and from Paginated Reports Datasets to Paginated Reports. It is created at the column level.

Image title

Power BI Dataset → Power BI Paginated Report Dataset → Power BI Paginated Report

Lineage is created from Datasets in Power BI to Paginated Reports Datasets in Power BI and from Paginated Reports Datasets to Paginated Reports. It is created at the object level.

Image title

Lineage in measures

The lineage view shows the lineage from columns to measures in Dataset Tables and from measures to visualizations in the Reports.

Image title

Lineage in dataflows

Lineage is created from data sources to Dataflow Tables in Power BI, and then from Dataflow Tables to Datasets, followed by the flow from Datasets to Reports.

Limitation: The building of this lineage only works when using Service Principal authentication.

Image title

Lineage when dataset is created by a SQL Query

Dataedo creates a lineage when a Dataset is created by a SQL query. We use an advanced SQL Parser for this. More about our Parser here. Parser finds column source even if query uses Common Table Expressions.

Image title

Lineage when dataset is created from Multiple Sources

Dataedo traces lineage effectively, even when data is pulled from various sources. It identifies the origin of each table within a dataset, ensuring thorough Linked Sources alignment. Each table is individually mapped to its respective source, providing detailed tracking for datasets with multiple origins.

Example: Image title

This is what the same flow looks like in Dataedo: Image title

Data Connectivity mode

Data Connectivity mode is a way of storing tables in datasets. It is determined when creating a dataset in Power BI.

Image title

Dataedo creates lineage in the above cases for this data connectivity modes: - Import - DirectQuery - Live Connection

Summary

Dataedo provides insight into data lineage from the source to the final destination. In the example shown below, a column originates from a file, passes through SSIS (SQL Server Integration Services), is loaded into SQL Server, and from there is used in a Power BI report.

This flow illustrates the end-to-end data movement, allowing users to trace the data's journey from its original source to its ultimate usage in reporting, ensuring transparency and facilitating debugging or optimization efforts.

Image title

Known Limitations

  • For Power Bi on prem (Power Bi Report Server) we do not create a lineage .

Troubleshooting

Lineage is created based on a matched Linked Source, check the Dataedo desktop to make sure it is matched properly. Image title