What to Expect
Entire column-level data lineage within the Snowflake instance
Users with Snowflake Enterprise Edition (or higher) can automatically generate column-level data lineage based on all executed scripts that moved data between Snowflake objects up to 1 year back. Read more about this feature in the Snowflake Enterprise Lineage connector documentation.
You can view column-level data lineage along with the scripts that created this lineage within the Snowflake instance.
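The connector reads this history from Snowflake's ACCOUNT_USAGE schema (see Troubleshooting below). As a rough illustration of the column-level write history Snowflake keeps for up to a year - not necessarily the exact query the connector runs - you can inspect it yourself with something like:

```sql
-- Illustrative only: column-level write history exposed by Snowflake itself.
-- Assumes the role can read the SNOWFLAKE.ACCOUNT_USAGE schema.
SELECT
    ah.query_start_time,
    tgt.value:"objectName"::string AS target_object,
    col.value:"columnName"::string AS target_column
FROM SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY ah,
     LATERAL FLATTEN(input => ah.objects_modified) tgt,
     LATERAL FLATTEN(input => tgt.value:"columns") col
WHERE ah.query_start_time >= DATEADD(month, -12, CURRENT_TIMESTAMP());
```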
Views/Materialized views
Data lineage for views/materialized views is created using our SQL Analyzer (Parser). The analyzer creates column-level data lineage based on the view script. The data flow is created between Snowflake objects (tables, views, etc.) and the view.
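For example, a view script like the sketch below (table and column names are illustrative) is parsed into column-level flows such as SALES.AMOUNT → V_SALES_SUMMARY.TOTAL_AMOUNT:

```sql
-- Illustrative view script parsed by the SQL Analyzer into column-level lineage.
CREATE OR REPLACE VIEW V_SALES_SUMMARY AS
SELECT
    c.CUSTOMER_NAME,
    SUM(s.AMOUNT) AS TOTAL_AMOUNT
FROM SALES s
JOIN CUSTOMERS c ON c.CUSTOMER_ID = s.CUSTOMER_ID
GROUP BY c.CUSTOMER_NAME;
```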
Dynamic tables
Data lineage for dynamic tables is created using our SQL Analyzer (Parser) and using Snowflake metadata (INFORMATION_SCHEMA.DYNAMIC_TABLE_GRAPH_HISTORY()).
- Our analyzer creates column-level data lineage based on the dynamic table script.
- Snowflake metadata provides only object-level data lineage.
The data flow is created between Snowflake objects (tables, views, etc.) and the dynamic table.
Because dynamic tables behave more like views than tables, their main type is View (with subtype Dynamic table), not Table. That is why they are displayed in screenshots as views.
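As an illustration (object names are hypothetical), the SELECT part of a dynamic table definition is what the analyzer parses, while the graph history function provides the object-level picture:

```sql
-- Illustrative dynamic table; the analyzer parses the SELECT for column-level lineage.
CREATE OR REPLACE DYNAMIC TABLE DT_DAILY_ORDERS
    TARGET_LAG = '1 hour'
    WAREHOUSE  = COMPUTE_WH
AS
SELECT ORDER_DATE, SUM(AMOUNT) AS DAILY_AMOUNT
FROM ORDERS
GROUP BY ORDER_DATE;

-- Object-level dependencies reported by Snowflake metadata.
SELECT * FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_GRAPH_HISTORY());
```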
Stages
Data lineage for stages is created using Snowflake metadata (INFORMATION_SCHEMA.STAGES). The data flow is created at the object level between external storage objects (CSV, JSON, and Parquet files, folders, etc. in storage such as S3 or Azure Blob Storage) and the Snowflake stage.
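For example, a stage like the sketch below (bucket path and integration name are illustrative) gets object-level flows from the files and folders under s3://example-bucket/events/ to the stage:

```sql
-- Illustrative external stage; lineage links the storage objects under the URL to the stage.
CREATE OR REPLACE STAGE RAW_EVENTS_STAGE
    URL = 's3://example-bucket/events/'
    STORAGE_INTEGRATION = MY_S3_INTEGRATION;
```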
External tables
Data lineage for external tables is created using Snowflake metadata (INFORMATION_SCHEMA.EXTERNAL_TABLES). The data flow is created at the object level between the Snowflake stage and the external table.
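For example, an external table defined on top of such a stage (names are illustrative) yields the object-level flow stage → external table:

```sql
-- Illustrative external table built on a stage; lineage is stage -> external table.
CREATE OR REPLACE EXTERNAL TABLE EXT_EVENTS
    LOCATION = @RAW_EVENTS_STAGE/2024/
    AUTO_REFRESH = TRUE
    FILE_FORMAT = (TYPE = PARQUET);
```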
Snowpipes (Snowflake Data Pipelines)
Data lineage for Snowpipes is created using simple regex pattern matching on the Snowpipe script. The data flow is created at the object level between the Snowflake stage and the Snowpipe, and then between the Snowpipe and the table: stage → snowpipe → table.
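For example, a Snowpipe with a plain COPY INTO and no SELECT (names are illustrative) yields the flow RAW_EVENTS_STAGE → EVENTS_PIPE → RAW_EVENTS:

```sql
-- Illustrative Snowpipe with a plain COPY INTO (no transformations).
CREATE OR REPLACE PIPE EVENTS_PIPE
    AUTO_INGEST = TRUE
AS
COPY INTO RAW_EVENTS
FROM @RAW_EVENTS_STAGE
FILE_FORMAT = (TYPE = CSV);
```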
Direct loading
Dataedo supports data lineage created by loading data directly into tables from stages using the COPY INTO command. The data flow is created based on Snowflake metadata (ACCOUNT_USAGE.COPY_HISTORY, which requires elevated permissions). The data flow is created at the object level between the Snowflake stage and the table, with SQL Script as the processor.
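For example, a direct load like the sketch below (names are illustrative) is recorded in COPY_HISTORY and yields the object-level flow stage → table:

```sql
-- Illustrative direct load from a stage; recorded in ACCOUNT_USAGE.COPY_HISTORY.
COPY INTO RAW_EVENTS
FROM @RAW_EVENTS_STAGE/2024/
FILE_FORMAT = (TYPE = CSV);
```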
Streams
Dataedo supports column-level data lineage for streams, where the data flow is created between the table/view and the stream. The lineage is established based on column names. Note that Dataedo does not support lineage for streams created on directory tables (stages).
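For example, a stream like the sketch below (names are illustrative) gets column-level lineage from the source table to the stream, matched by column names:

```sql
-- Illustrative stream on a table; column-level lineage is matched by column names.
CREATE OR REPLACE STREAM EVENTS_STREAM ON TABLE RAW_EVENTS;
```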
Known limitations
- Check the limitations of view lineage created by the SQL parser.
- With more complex Snowpipe COPY statements, the data lineage may not be created correctly.
- Regex patterns defined in Snowpipes and stages are ignored when lineage is created.
- Lineage from Snowpipes with transformations is not supported.
- Snowflake Enterprise Lineage connector only creates data lineage for data moved between Snowflake objects. It does not support lineage for data moved between Snowflake and external storage.
Troubleshooting
Snowflake Enterprise Lineage connector
- Make sure that your Snowflake instance is running on Enterprise Edition or higher.
- Make sure that the role used to connect to Snowflake has the necessary permissions to access the ACCOUNT_USAGE schema (see the grant sketch after this list).
- Check if objects used in the scripts are still available in Snowflake and are imported to Dataedo.
- Check if the scripts are supported by the connector (see the connector documentation).
- Check the date of script execution - the connector can only retrieve scripts executed up to 1 year back, and the underlying metadata has up to 3 hours of latency.
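A minimal sketch of granting that access (the role name is illustrative; this is the standard grant for reading the SNOWFLAKE.ACCOUNT_USAGE views):

```sql
-- Grants read access to the SNOWFLAKE database, including the ACCOUNT_USAGE schema.
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE DATAEDO_ROLE;
```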
Views/Materialized views
- Make sure the view script was imported correctly and is valid (e.g. referenced objects exist).
- Check if the view script contains any unsupported SQL syntax (see the parser documentation).
- Make sure that the dialect is set correctly and that the "Automatic lineage" and "Parse SQL" checkboxes are checked in the Snowflake import window.
- Try to reparse the view script. You can do it by clicking the "Parse Script to Lineage" button or by rerunning the import process.
Dynamic tables
- Make sure the dynamic table script was imported correctly and is valid (e.g. referenced objects exist).
- Check if the dynamic table script contains any unsupported SQL syntax (see the parser documentation).
- Make sure that the dialect is set correctly and that the "Automatic lineage" and "Parse SQL" checkboxes are checked in the Snowflake import window.
- Verify in Snowflake whether SELECT * FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_GRAPH_HISTORY()) returns any results using the same user, role, and database as provided in Dataedo.
- Try to reparse the dynamic table script. You can do it by clicking the "Parse Script to Lineage" button or by rerunning the import process.
Stages
- Make sure that referenced storage objects are imported to Dataedo and contain valid paths.
- Make sure the "Automatic lineage" checkbox is checked in the Snowflake import window.
- Verify in Snowflake whether SELECT * FROM INFORMATION_SCHEMA.STAGES returns any results using the same user, role, and database as provided in Dataedo.
- Try rerunning the import process (especially if referenced storage objects were added after the last import).
External tables
- Make sure the "Automatic lineage" checkbox is checked in the Snowflake import window.
- Verify in Snowflake whether SELECT * FROM INFORMATION_SCHEMA.EXTERNAL_TABLES returns any results using the same user, role, and database as provided in Dataedo.
- Try rerunning the import process.
Snowpipes (Snowflake Data Pipelines)
- Make sure the Snowpipe script was imported correctly and is valid (e.g. referenced objects exist).
- Make sure the "Automatic lineage" checkbox is checked in the Snowflake import window.
- Verify that the Snowpipe script does not contain transformations (it should not contain any SELECT).
- Try rerunning the import process.
Direct loading
- Make sure the "Automatic lineage" and "Copy history" checkboxes are checked in the Snowflake import window.
- Verify in Snowflake whether SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.COPY_HISTORY returns any results using the same user, role, and database as provided in Dataedo. Note that SNOWFLAKE.ACCOUNT_USAGE requires elevated permissions.
- Try rerunning the import process.
Streams
- Make sure the "Automatic lineage" checkbox is checked in the Snowflake import window.
- Verify if the object used in the stream is imported to Dataedo.
- Verify in Snowflake whether SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.COLUMNS returns any results using the same user, role, and database as provided in Dataedo. Note that SNOWFLAKE.ACCOUNT_USAGE requires elevated permissions.
- Check if the stream is created on a table or view. If it is created on a directory table (stage), the lineage will not be created.
- Try rerunning the import process.