Types of connectors
Dataedo ships with a number of native connectors that connect to specific technologies (databases, applications, BI tools, ETL tools, and others) and extract their metadata.
- Database connectors - SQL Server, Snowflake, PostgreSQL and more
- BI connectors - Power BI, Tableau, SSRS and more
- Application connectors - Microsoft Dynamics 365, Salesforce
- Metadata repository connectors - AWS Glue Data Catalog, Apache Hive Metastore, Microsoft Dataverse
- File connectors - Apache Parquet, Delta Lake, CSV, JSON, and more
- ETL connectors - Azure Data Factory, SSIS, dbt
- Storage access - Local disk, FTP/SFTP, Amazon S3, Azure Blob Storage/Data Lake Storage
ODBC (Open Database Connectivity) is a standard way of accessing databases. Most databases, applications, and other sources provide ODBC drivers that you install and configure on your workstation. Once the driver is configured, select the Dataedo ODBC connector and metadata will be imported from your source. Please note that the metadata available through most ODBC drivers is limited.
Custom SQL Connectors (from 23.2)
Version 23.2 introduces a mechanism for creating custom connectors for data sources that support SQL and expose metadata through data dictionary tables (as most relational databases do). Each custom connector is defined by a series of predefined SQL queries that extract metadata.
These connectors can also support data profiling, reference data, and the primary key/foreign key tester.
For now, custom connectors are created only by the Dataedo team. We are working on letting users create their own custom connectors in future releases.
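The idea of a connector defined by predefined SQL queries can be illustrated with a minimal sketch. It uses SQLite and Python's built-in sqlite3 module purely as a stand-in source. The SQLITE_CONNECTOR dictionary and the extract_metadata function are hypothetical names made up for this illustration and do not reflect Dataedo's actual connector format.

```python
import sqlite3

# Hypothetical connector definition: one named SQL query per kind of metadata.
# The layout is an assumption for illustration, not Dataedo's real format.
SQLITE_CONNECTOR = {
    "tables": (
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' ORDER BY name"
    ),
    "columns": (
        "SELECT m.name, p.name, p.type, p.pk "
        "FROM sqlite_master AS m, pragma_table_info(m.name) AS p "
        "WHERE m.type = 'table' ORDER BY m.name, p.cid"
    ),
}


def extract_metadata(conn, connector):
    """Run every predefined query and collect the rows under the query name."""
    return {name: conn.execute(sql).fetchall() for name, sql in connector.items()}


# Throwaway in-memory database that plays the role of the data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

metadata = extract_metadata(conn, SQLITE_CONNECTOR)
print(metadata["tables"])
print(metadata["columns"])
```

Supporting another SQL source would, under this assumption, mean swapping in a different set of queries against that source's own data dictionary while leaving the extraction loop unchanged.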
Metadata import with interface tables
Dataedo lets users import various metadata into the catalog using interface tables: a set of predefined tables where users upload metadata they have extracted on their own and then run an import that loads it safely into the repository.
SQL DDL imports
Sometimes you cannot allow third-party software to connect to your database, but you can dump the database structure as a set of SQL DDL scripts (create statements). Dataedo can import a schema from DDL scripts in selected SQL dialects.
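One simple way to read structure out of a DDL script without touching the source database is to replay the script in a throwaway in-memory database and inspect the result. The sketch below assumes the script is valid in the SQLite dialect. The schema_from_ddl function and the sample tables are hypothetical, and this is not a description of how Dataedo parses DDL.

```python
import sqlite3

# Sample DDL dump; the tables are made up for the illustration.
DDL_SCRIPT = """
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE employee (
    id INTEGER PRIMARY KEY,
    full_name TEXT NOT NULL,
    department_id INTEGER REFERENCES department(id)
);
"""


def schema_from_ddl(ddl):
    """Replay DDL in an in-memory SQLite database and return {table: [(column, type)]}."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    conn.close()
    return schema


schema = schema_from_ddl(DDL_SCRIPT)
print(schema)
```

Replaying only works when a local engine understands the dialect; scripts written for other dialects would need a dialect-aware parser instead.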
Metadata extraction techniques
Dataedo uses the following techniques to extract metadata from your source:
Scanning data dictionary tables
Most databases and data platforms provide System Catalog / Data Dictionary tables. Dataedo scans those tables to identify tables, columns and other structures.
Some data sources expose their metadata through APIs.
Sources such as document stores (MongoDB, for instance) require sampling documents to identify their structure.
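Structure inference by sampling can be sketched as follows. The example works on plain Python dictionaries standing in for documents fetched from a store. The infer_structure function, its sample_size parameter, and the sample data are assumptions for illustration only.

```python
from collections import defaultdict


def infer_structure(documents, sample_size=100):
    """Return {field_path: sorted type names} seen in the first sample_size documents."""
    fields = defaultdict(set)

    def visit(value, path):
        if isinstance(value, dict):
            # Descend into nested documents, building dotted paths.
            for key, child in value.items():
                visit(child, f"{path}.{key}" if path else key)
        else:
            fields[path].add(type(value).__name__)

    for doc in documents[:sample_size]:
        visit(doc, "")
    return {path: sorted(types) for path, types in fields.items()}


# Made-up documents with differing shapes, as is typical for a document store.
sample = [
    {"_id": 1, "name": "Ann", "address": {"city": "Oslo"}},
    {"_id": 2, "name": "Bob", "tags": ["a", "b"]},
    {"_id": 3, "name": None},
]

structure = infer_structure(sample)
for path, types in structure.items():
    print(path, types)
```

Because documents in one collection can differ, a field may be reported with several types or be missing from some documents; a larger sample reduces the chance of overlooking rare fields.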
Parsing report/ETL code
To extract metadata from BI or ETL tools, Dataedo needs to parse the code of reports and packages.
To build lineage for SQL code, Dataedo parses the SQL to identify the objects used, their columns, and data movement.
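To give a feel for what identifying objects and data movement means, here is a deliberately naive sketch that pulls the target and source tables out of a simple INSERT ... SELECT statement with regular expressions. The table_lineage function is hypothetical. A real SQL parser works from the full grammar; this sketch ignores subqueries, CTEs, quoted identifiers, comments, and column-level lineage.

```python
import re


def table_lineage(sql):
    """Return (target_table, sorted source tables) for a simple INSERT ... SELECT."""
    target = re.search(r"\binsert\s+into\s+([\w.]+)", sql, re.IGNORECASE)
    sources = re.findall(r"\b(?:from|join)\s+([\w.]+)", sql, re.IGNORECASE)
    return (target.group(1) if target else None, sorted(set(sources)))


# Made-up statement that moves data from two staging tables into a summary table.
sql = """
INSERT INTO dw.sales_summary (customer_id, total)
SELECT o.customer_id, SUM(o.amount)
FROM staging.orders AS o
JOIN staging.customers AS c ON c.id = o.customer_id
GROUP BY o.customer_id
"""

target, sources = table_lineage(sql)
print(target, "<-", sources)
```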
To extract information about a column's data profile (min-max, average, number of nulls, etc.), Dataedo executes a number of predefined data profiling SQL queries.
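A profiling query of this kind might look like the sketch below, run here against an in-memory SQLite table with Python's sqlite3 module. The profile_column function, the query text, and the sample table are assumptions for illustration, not Dataedo's actual profiling queries.

```python
import sqlite3


def profile_column(conn, table, column):
    """Compute min, max, average, null count, and row count for one column."""
    # Identifiers are interpolated into the SQL, so they must come from
    # trusted metadata (for example an earlier catalog scan), never user input.
    sql = (
        f'SELECT MIN("{column}"), MAX("{column}"), AVG("{column}"), '
        f'SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END), COUNT(*) '
        f'FROM "{table}"'
    )
    min_v, max_v, avg_v, nulls, rows = conn.execute(sql).fetchone()
    return {"min": min_v, "max": max_v, "avg": avg_v, "nulls": nulls, "rows": rows}


# Made-up table with one NULL value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?)", [(10.0,), (30.0,), (None,), (20.0,)]
)

profile = profile_column(conn, "orders", "amount")
print(profile)
```

Note that SQL aggregates such as AVG skip NULL values, so the average is taken over the three non-null rows while the row count still includes all four.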