dbt Core™ is an open-source command-line tool that enables data teams to transform data in their warehouses by writing select statements, which lets analysts work more like software engineers. Dataedo supports documenting dbt projects that you can access locally. Support for connecting to dbt Cloud™ will be added in future releases.
Supported elements and metadata
All objects from dbt are displayed as views in Dataedo; the type indicates the actual materialization. Dataedo automatically creates data lineage between dbt objects (always) and between dbt objects and sources (only if you connect the data warehouse used in this dbt project). Dataedo also supports two dbt tests: if a column has the not_null test, the Nullable marker in Dataedo is unchecked, and if it has the unique test, the corresponding constraint is added.
Lineage is based on the dependencies that dbt objects declare with {{ source(source_name, table_name) }} or {{ ref(package_name, model_name) }}. A lineage link is therefore imported into Dataedo only if it also appears in the documentation generated by dbt.
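To preview which dependencies dbt has recorded before importing, you can read them from manifest.json. The following is a minimal sketch, not a description of how Dataedo performs the import: it assumes the manifest keeps each object's resolved source() and ref() calls under depends_on.nodes, and the function name lineage_edges and the path target/manifest.json are illustrative.

```python
import json
from pathlib import Path


def lineage_edges(manifest_path):
    """Return sorted (upstream, downstream) pairs recorded in a dbt manifest.json."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    edges = []
    for node_id, node in manifest.get("nodes", {}).items():
        # Assumption: resolved ref()/source() calls are listed in depends_on.nodes.
        for upstream in node.get("depends_on", {}).get("nodes", []):
            edges.append((upstream, node_id))
    return sorted(edges)


if __name__ == "__main__":
    # Illustrative path; adjust to the project's target-path.
    for upstream, downstream in lineage_edges("target/manifest.json"):
        print(f"{upstream} -> {downstream}")
```

The sketch lists every entry under nodes, which is not limited to models, so the output may contain more links than the objects you choose to import.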
Prerequisites
You need access to the folder on your computer where the dbt project is located. The folder must contain the dbt_project.yml file, in which the correct target-path name is specified (target by default). The folder that target-path points to must contain a catalog.json and a manifest.json file (as a last resort, manifest.json alone will suffice).
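The prerequisites above can be checked with a short script before opening Dataedo. This is a sketch under stated assumptions: the helper name check_dbt_project is hypothetical, and target-path is read with a simple line match rather than a full YAML parser, so unusual formatting in dbt_project.yml may not be recognized.

```python
import re
from pathlib import Path


def check_dbt_project(project_dir):
    """Return a list of problems; an empty list means the folder looks ready."""
    project_dir = Path(project_dir)
    project_file = project_dir / "dbt_project.yml"
    if not project_file.is_file():
        return ["dbt_project.yml not found"]

    # Simplification: match a top-level "target-path: value" line instead of
    # parsing the YAML; fall back to the default folder name, "target".
    match = re.search(
        r"^target-path:[ \t]*['\"]?([^'\"#\n]+?)['\"]?[ \t]*(?:#.*)?$",
        project_file.read_text(encoding="utf-8"),
        re.MULTILINE,
    )
    target_dir = project_dir / (match.group(1) if match else "target")

    problems = []
    if not (target_dir / "manifest.json").is_file():
        problems.append(f"manifest.json missing in {target_dir}")
    if not (target_dir / "catalog.json").is_file():
        problems.append(f"catalog.json missing in {target_dir}")
    return problems


if __name__ == "__main__":
    # "." assumes the script is run from the dbt project folder.
    for problem in check_dbt_project("."):
        print(problem)
```

A report that only catalog.json is missing corresponds to the last-resort case described above, where manifest.json alone is used.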
Creating a catalog.json file
The catalog.json file is created by the dbt docs generate command. Open a terminal, navigate to the folder where the dbt project is located, and execute dbt docs generate. If everything is configured properly, the catalog.json file is generated. If you have problems, refer to the dbt documentation.
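If you prefer to script this step, the same command can be launched from Python. This is a minimal sketch that assumes the dbt executable is on your PATH and that the project uses the default target folder; the function name generate_docs and its run parameter (which exists only to make the function testable) are illustrative.

```python
import subprocess
from pathlib import Path


def generate_docs(project_dir, target_dir="target", run=subprocess.run):
    """Run `dbt docs generate` in project_dir; return True if catalog.json exists afterwards."""
    project_dir = Path(project_dir)
    # Equivalent to navigating to the project folder and running: dbt docs generate
    run(["dbt", "docs", "generate"], cwd=project_dir, check=True)
    return (project_dir / target_dir / "catalog.json").is_file()


if __name__ == "__main__":
    # "." assumes the script is started from the dbt project folder.
    print("catalog.json created:", generate_docs("."))
```

With check=True, a failing dbt run raises subprocess.CalledProcessError instead of returning silently, so configuration problems surface immediately.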
Connect Dataedo to dbt project
To connect to a dbt project, create new documentation by clicking Add documentation and choosing Database connection.
In the Add documentation window, choose dbt (beta) and click Next >.
On the next screen, click on the three dots next to the path field, then find the dbt project folder (it must contain the dbt_project.yml file), click on it and confirm with Ok.
For best results, link the data warehouse in which dbt was running, provided it is already documented in Dataedo. To do this, click the dropdown next to the Database (optional) field and select that database.
Click on the Connect button at the bottom right of the window.
In the next window, you will be asked which objects you want to import into Dataedo. If you want to omit some objects in bulk, refer to the advanced import filter. After selecting the objects to be imported, click Next >.
Add a document title and click Import.
Outcome
The dbt project has been imported into a new documentation, and automatic data lineage has been created.