Amazon Redshift - Automatic Data Lineage

Wojtek Bialek, Dataedo Team, 17th July, 2024

What to expect

Views

Dataedo analyzes the SQL of Redshift views with its built-in PostgreSQL parser and builds column-level lineage from the tables and views queried by a view to the view itself.

Column-level lineage

Learn more about PostgreSQL parser
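
To make "column-level lineage" concrete, here is a small hand-written sketch in Python. The view, schema, table, and column names are all hypothetical, and the mapping is an illustration of the kind of result to expect, not Dataedo output or a Dataedo API. The exact lineage shown in Dataedo may differ.

```python
# Hypothetical Redshift view; all object names are made up for illustration.
VIEW_SQL = """
CREATE VIEW sales.v_order_totals AS
SELECT o.order_id,
       c.customer_name,
       o.quantity * o.unit_price AS total_amount
FROM sales.orders o
JOIN sales.customers c ON c.customer_id = o.customer_id;
"""

# Hand-written column-level lineage for the view above (an assumption, not
# tool output): each view column maps to the source columns it is built from.
EXPECTED_LINEAGE = {
    "sales.v_order_totals.order_id": ["sales.orders.order_id"],
    "sales.v_order_totals.customer_name": ["sales.customers.customer_name"],
    "sales.v_order_totals.total_amount": [
        "sales.orders.quantity",
        "sales.orders.unit_price",
    ],
}


def upstream_objects(lineage):
    """Collapse column-level lineage to the sorted list of source tables/views."""
    return sorted(
        {src.rsplit(".", 1)[0] for sources in lineage.values() for src in sources}
    )


if __name__ == "__main__":
    for target, sources in EXPECTED_LINEAGE.items():
        print(f"{', '.join(sources)} -> {target}")
    print("Upstream objects:", upstream_objects(EXPECTED_LINEAGE))
```

Reading the mapping from right to left answers the usual lineage question: which source columns feed a given view column.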

External Tables

Dataedo builds object-level lineage between Redshift External Tables and objects from other databases based on Linked Sources. Dataedo automatically assigns a linked source to the appropriate objects, but you can also pick it manually.

Stored Procedures

Dataedo creates column-level data lineage for stored procedures based on the procedure script. The script is divided into steps, each represented as a separate process. Data lineage is created only for supported steps; each unsupported step is named after the first word of its script, followed by three dots. You can see the steps on the data lineage configuration tab in Dataedo Desktop.

Stored Procedures column-level lineage
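
The naming rule for unsupported steps can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not Dataedo's implementation: the function names are hypothetical, and splitting the script on semicolons is a simplifying assumption (it would break on semicolons inside string literals or nested blocks).

```python
def split_into_steps(script: str) -> list[str]:
    """Naively split a procedure body into steps on ';' (assumption only)."""
    return [step.strip() for step in script.split(";") if step.strip()]


def unsupported_step_name(step_script: str) -> str:
    """Name a step after the first word of its script, followed by three dots."""
    words = step_script.split()
    first_word = words[0] if words else ""
    return first_word + "..."


if __name__ == "__main__":
    # Hypothetical procedure body; object names are made up for illustration.
    body = """
    INSERT INTO sales.daily_totals SELECT order_date, SUM(amount) FROM sales.orders GROUP BY order_date;
    ANALYZE sales.daily_totals;
    """
    for step in split_into_steps(body):
        print(unsupported_step_name(step))
```

Under this rule a step whose script starts with `ANALYZE` would appear as `ANALYZE...` if it is not supported; which statements are actually supported depends on the parser limitations listed below.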

Known Limitations

  1. Check the limitations for view lineage from the PostgreSQL parser.
  2. Check the limitations for stored procedure lineage from the SQL parser.

Troubleshooting

I don't see data lineage for views

  1. Make sure you have selected the right SQL dialect, in this case PostgreSQL (the SQL Dialect field at the Data Source level).
  2. Rerun the import of the source. The schema may have been imported in an older version, or the configuration may have been incorrect.

I don't see data lineage for external tables

  1. Make sure the source object has a Linked Source with a correctly assigned database.
  2. Rerun the import of the source. The schema may have been imported in an older version, or the configuration may have been incorrect.

I don't see data lineage for stored procedures

  1. Make sure Dataedo supports the SQL syntax of the procedure. Check Known Limitations above.
  2. Rerun the import of the source. The schema may have been imported in an older version, or the configuration may have been incorrect.

Cross database lineage is not built

  1. Make sure the source object has a Linked Source with a correctly assigned database.
  2. Rerun the import of the source. The schema may have been imported in an older version, or the configuration may have been incorrect.