dbt is a tool that helps data teams transform data in their warehouses by writing select statements, which lets analysts work more like software engineers. dbt Cloud is a hosted service that helps data analysts and engineers productionize dbt deployments. It comes with turnkey support for scheduling jobs, CI/CD, serving documentation, monitoring, alerting, and an integrated development environment (IDE).
Supported elements and metadata
All objects from dbt are imported as views in Dataedo; the type indicates the actual materialization. Dataedo automatically creates data lineage between dbt objects (always) and sources (only if you connect the data warehouse used in this dbt project). Two dbt tests are also supported: if a column has the not_null test, the Nullable marker in Dataedo will be unchecked, and if a column has the unique test, the corresponding constraint will be added.
Lineage follows the references declared with {{ source(source_name, table_name) }} or {{ ref(package_name, model_name) }}, so a lineage between objects is imported into Dataedo only when you can see it in the documentation generated by dbt.
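To illustrate how source and ref dependencies turn into lineage, here is a minimal Python sketch. It is not Dataedo's implementation. It assumes that each entry under nodes in manifest.json carries a depends_on.nodes list of upstream object IDs, and the sample manifest and its object names are hypothetical and heavily trimmed.

```python
from typing import Any


def lineage_edges(manifest: dict[str, Any]) -> list[tuple[str, str]]:
    """Return sorted (upstream, downstream) pairs from a manifest-like dict.

    Assumption: every node lists its upstream objects under
    node["depends_on"]["nodes"]; nodes without that key contribute no edges.
    """
    edges = []
    for node_id, node in manifest.get("nodes", {}).items():
        for upstream in node.get("depends_on", {}).get("nodes", []):
            edges.append((upstream, node_id))
    return sorted(edges)


# Hypothetical, trimmed manifest used only for illustration.
sample_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "depends_on": {"nodes": ["source.shop.raw.orders"]}
        },
        "model.shop.orders": {
            "depends_on": {"nodes": ["model.shop.stg_orders"]}
        },
    }
}

for upstream, downstream in lineage_edges(sample_manifest):
    print(f"{upstream} -> {downstream}")
```

An object that is never referenced through source or ref produces no pair here, which matches the behavior described above: no dependency in dbt, no lineage in Dataedo.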
Prerequisites
- dbt Cloud Service Token with at least read-only permissions.
- Run Id of a run that contains the manifest.json file and, for the best results, the catalog.json file as well.
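If you want to confirm that a run exposes both artifacts before importing, the following Python sketch builds the requests you could send. It rests on several assumptions: that your account is served from cloud.getdbt.com (the host differs by region and plan), that run artifacts are available under the dbt Cloud API v2 path shown, and that the service token is accepted in a Token authorization header. The account ID 1234 and the token value are placeholders. The block only constructs the requests and does not contact dbt Cloud.

```python
from urllib.request import Request


def artifact_request(account_id: int, run_id: int, token: str,
                     artifact: str, host: str = "cloud.getdbt.com") -> Request:
    """Build (but do not send) a GET request for one run artifact.

    Assumptions: the v2 artifacts path and the "Token" authorization scheme;
    check both against the dbt Cloud API documentation for your account.
    """
    url = (
        f"https://{host}/api/v2/accounts/{account_id}"
        f"/runs/{run_id}/artifacts/{artifact}"
    )
    headers = {"Authorization": f"Token {token}", "Accept": "application/json"}
    return Request(url, headers=headers)


# Placeholder values: replace with your own account ID, Run Id, and token.
for name in ("manifest.json", "catalog.json"):
    request = artifact_request(1234, 12345678, "your-service-token", name)
    print(request.full_url)
```

To actually send a request, pass it to urllib.request.urlopen. An error response for catalog.json would suggest that the run did not execute dbt docs generate.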
Preparations
Creating a service token
In the upper right edge, click on the gear wheel, then select Account Settings.
On the left panel, select Service Token, then choose + New Token.
Name the token, add member permissions, select the projects you want to import, and click save.
Copy the token and save it in a safe place. Once you close it, you will no longer be able to see its value in dbt Cloud.
Creating a run and reading the Run Id
Select Deploy, then Jobs.
In the upper right corner, click on Create Job.
Name the job and specify the environment. In the command field, write dbt docs generate. You can uncheck Run on schedule, then click save.
In the upper right corner, click on Run now.
When the run is over, click on it.
In the field marked with an arrow you will be able to read the Run Id - here it's 12345678 (without the hashtag). Save it because it will be necessary for the import.
Connect Dataedo to dbt project
To connect to the dbt project, create a new documentation by clicking Add and choosing Database connection.
Choose dbt (beta) from the list and click Next >:
Select dbt Cloud (beta) and click Next >:
Paste the Service Token, click the three dots next to Account, and choose the account. Paste the Run Id. For best results, if the data warehouse in which dbt was running is already documented in Dataedo, select it in the Database field. Click Connect.
In the next window, you will be asked which objects you want to import into Dataedo. If you want to omit some objects in bulk, refer to the advanced import filter. After selecting the objects, click Next >.
Add a document title and click Import.
Outcome
The dbt project has been imported to new documentation, and automatic data lineage has been created.