Databricks is a cloud-based data processing platform that simplifies collaboration among data analysts, data engineers, and data scientists. It is available on Microsoft Azure, Amazon Web Services, and Google Cloud Platform.
Dataedo connects to a single Unity Catalog catalog via the API and documents objects and data lineage within the connected catalog.
Instructions on how to connect to Databricks using Dataedo can be found at: Connecting to Databricks Unity Catalog
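For context on what a single-catalog API connection involves, here is a minimal sketch that lists the tables and columns of one catalog through the Unity Catalog REST API. It is an illustration only: the workspace URL, personal access token, and catalog name are placeholders, and Dataedo's own implementation may differ.

```python
import os
import requests

# Assumptions: DATABRICKS_HOST (e.g. https://adb-123.azuredatabricks.net) and
# DATABRICKS_TOKEN environment variables are set, and the catalog is named "main".
HOST = os.environ["DATABRICKS_HOST"].rstrip("/")
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}
CATALOG = "main"

def uc_get(path, **params):
    """Call a Unity Catalog REST endpoint and return the parsed JSON."""
    resp = requests.get(f"{HOST}/api/2.1/unity-catalog/{path}", headers=HEADERS, params=params)
    resp.raise_for_status()
    return resp.json()

# Walk one catalog: schemas -> tables -> columns.
for schema in uc_get("schemas", catalog_name=CATALOG).get("schemas", []):
    tables = uc_get("tables", catalog_name=CATALOG, schema_name=schema["name"]).get("tables", [])
    for table in tables:
        columns = ", ".join(col["name"] for col in table.get("columns", []))
        print(f'{CATALOG}.{schema["name"]}.{table["name"]} ({table.get("table_type", "?")}): {columns}')
```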
Connector features
Data Source | Support | Schema | Lineage | Profiling | Classification | Export comments | FK tester | DDL import |
---|---|---|---|---|---|---|---|---|
Databricks Unity Catalog | Native | ✅ | Column Level | ❌ | ✅ | ✅ | NA | NA |
Data Catalog
Dataedo documents the following objects and their respective properties from Databricks:
Object Name | Metadata | Lineage |
---|---|---|
Delta Live Tables | ✅ | ✅ |
Pipelines | Limited | ✅ |
Tables | ✅ | ✅ |
Views | ✅ | ✅ |
Columns | ✅ | ✅ |
External locations | ✅ | ✅ |
External Tables | ✅ | ✅ |
Primary keys | ✅ | |
Foreign keys | ✅ | |
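The column-level lineage referenced above is exposed by Databricks through its lineage tracking REST API. The sketch below uses a hypothetical table `main.analytics.orders` with a column `amount`; the endpoint path and request fields follow the Databricks lineage tracking API as documented at the time of writing, and the exact request shape and response keys should be checked against the current API reference.

```python
import os
import requests

# Assumptions: DATABRICKS_HOST and DATABRICKS_TOKEN environment variables are set;
# main.analytics.orders and the column amount are placeholders.
HOST = os.environ["DATABRICKS_HOST"].rstrip("/")
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

resp = requests.get(
    f"{HOST}/api/2.0/lineage-tracking/column-lineage",
    headers=HEADERS,
    json={"table_name": "main.analytics.orders", "column_name": "amount"},
)
resp.raise_for_status()
lineage = resp.json()

# The response lists the columns feeding this column and the columns fed by it.
print("upstream:", lineage.get("upstream_cols", []))
print("downstream:", lineage.get("downstream_cols", []))
```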
Objects Properties Configuration & Support
Documentation is created for one selected catalog from Databricks Unity Catalog.
Known Limitations
Documentation Functionality
- Data Profiling is not available for Databricks; we are working on this feature for future releases.
- Connecting to multiple catalogs at once or to a regional metastore is not yet supported (it is on the roadmap).
- For pipelines, Dataedo discovers only the pipeline name, not the script (see the sketch after this list).
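To illustrate the name-only scope for pipelines: Delta Live Tables pipelines can be listed by name through the Pipelines REST API, as sketched below. The workspace URL and token are the same placeholders as in the earlier example; notebook or script contents are not part of this listing.

```python
import os
import requests

# Assumptions: DATABRICKS_HOST and DATABRICKS_TOKEN environment variables are set.
HOST = os.environ["DATABRICKS_HOST"].rstrip("/")
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

resp = requests.get(f"{HOST}/api/2.0/pipelines", headers=HEADERS)
resp.raise_for_status()

# Each entry carries pipeline metadata such as its name, but not the pipeline's code.
for status in resp.json().get("statuses", []):
    print(status["pipeline_id"], status["name"])
```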
Lineage Functionality
- Column-level lineage for external tables is created only if the schema of the data source (for example, a JSON file) is automatically discovered by Databricks and the column names are not changed (see the sketch below).
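As an example of the case where column-level lineage can be built, the sketch below (assuming a Databricks notebook where `spark` is provided by the runtime, and placeholder storage path and table names) defines an external table over JSON files without an explicit column list, so Databricks discovers the schema and the source column names are carried through unchanged.

```python
# Minimal sketch: run in a Databricks notebook where `spark` is available.
# The storage location and table name are placeholders.

# No explicit column list: Databricks infers the schema from the JSON files and keeps
# the discovered column names, which is the precondition for column-level lineage.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.analytics.events_ext
    USING JSON
    LOCATION 'abfss://raw@mystorage.dfs.core.windows.net/events/'
""")

# Downstream objects that keep these column names (for example, a view selecting the
# columns as-is) can get column-level lineage; renaming the columns breaks that link.
```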