Fivetran

Hubert Książek - Dataedo Team, 31st July, 2024

Introduction

Fivetran is a cloud-based data integration platform designed to simplify the process of consolidating and analyzing data from a multitude of sources. It streamlines the traditionally complex ELT (Extract, Load, Transform) process, making data movement effortless and automated. Fivetran supports a wide array of data sources, including on-premises databases, cloud storage solutions, SaaS applications, and more.

The platform's architecture is built around a cloud-native, fully managed service model. Key components of Fivetran's data integration process include:

  1. Connection: Fivetran provides pre-built connectors for various data sources, allowing users to establish connections easily without the need for custom integrations.
  2. Data Extraction: Utilizing intelligent change data capture (CDC) mechanisms, Fivetran efficiently extracts only the incremental changes from source systems, minimizing load and ensuring swift data transfer.
  3. Data Loading: Data is loaded into the destination system, such as a data warehouse or data lake. Fivetran handles schema changes, data type conversions, and error management to ensure data consistency and reliability.
  4. Data Transformation: Fivetran offers built-in capabilities for data transformation, enabling users to clean, filter, and reshape data. It supports SQL-based transformations and provides a user-friendly interface for defining custom processes.
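
The incremental extraction and loading steps above can be illustrated with a minimal sketch. It is not Fivetran's implementation: it swaps log-based CDC for a simpler cursor (high-water mark) on a hypothetical `updated_at` column, and the names `SyncState`, `extract_changes`, and `load` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SyncState:
    """Remembers the highest cursor value seen so far (hypothetical helper)."""
    cursor: int = 0


def extract_changes(source_rows, state):
    """Return only the rows modified after the stored cursor."""
    return [row for row in source_rows if row["updated_at"] > state.cursor]


def load(destination, changes, state):
    """Upsert changed rows by primary key, then advance the cursor."""
    for row in changes:
        destination[row["id"]] = row
    if changes:
        state.cursor = max(row["updated_at"] for row in changes)


source = [
    {"id": 1, "name": "Alice", "updated_at": 10},
    {"id": 2, "name": "Bob", "updated_at": 12},
]
destination = {}
state = SyncState()

# First sync: everything is new, so both rows are copied.
first_batch = extract_changes(source, state)
load(destination, first_batch, state)

# One row changes in the source; the next sync picks up only that row.
source[1] = {"id": 2, "name": "Robert", "updated_at": 15}
second_batch = extract_changes(source, state)
load(destination, second_batch, state)
```

Only the changed row is moved on the second run, which is the property that keeps load on the source system low.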

Fivetran is designed for simplicity: a typical workflow consists of selecting connectors, supplying credentials, specifying a sync schedule, and starting the sync.

Connecting to Fivetran

You can find instructions on how to connect to Fivetran in this article.

What's imported

Imported metadata

One Dataedo documentation corresponds to one Fivetran group (destination).

Destinations and sources are imported to Dataedo as linked sources. To build a data lineage, these linked sources must be assigned to the corresponding Source Database in Dataedo.

Linked source

Data movement created by Fivetran is represented as an ETL program in Dataedo. Each ETL program illustrates the flow of data from a source table to a destination table. These objects have the same name and schema as the corresponding objects in the destination database.
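
As a rough illustration of this mapping, the sketch below models one ETL program per replicated table. The names `TableRef`, `EtlProgram`, and `build_etl_programs`, as well as the sample schemas and tables, are hypothetical; they are not part of Dataedo's or Fivetran's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    schema: str
    name: str


@dataclass(frozen=True)
class EtlProgram:
    # Name and schema mirror the destination object, as described above.
    schema: str
    name: str
    source: TableRef
    destination: TableRef


def build_etl_programs(table_pairs):
    """Create one ETL program per (source table, destination table) pair."""
    return [
        EtlProgram(schema=dst.schema, name=dst.name, source=src, destination=dst)
        for src, dst in table_pairs
    ]


pairs = [
    (TableRef("dbo", "Customers"), TableRef("sql_server_dbo", "customers")),
    (TableRef("dbo", "Orders"), TableRef("sql_server_dbo", "orders")),
]
programs = build_etl_programs(pairs)
for program in programs:
    print(f"{program.source.schema}.{program.source.name} -> "
          f"{program.schema}.{program.name}")
```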

ETL program

Supported Dataedo features

Feature                            Supported
Data profiling                     NA*
Data classification
Data lineage (manual)
Data lineage (automatic)
Reference data (import lookups)    NA*
Importing from DDL                 NA*
Generating DDL                     NA*
FK relationship tester             NA*

*NA - not applicable

Automatic Data lineage

You can find information about automatic data lineage in this article.

Column-level lineage from Database to Destination with Fivetran in the middle

Known limitations

Dataedo imports only Fivetran sources that are applications (such as Salesforce) or databases (such as SQL Server) and that Dataedo itself supports. Other source types are not supported: events (such as Apache Kafka), files (such as Amazon S3), functions (such as AWS Lambda), and sources without a corresponding connector in Dataedo (such as Google Analytics).
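
The rule above amounts to two checks: the source type must be an application or a database, and Dataedo must have a matching connector. A minimal sketch follows; the type labels, the service identifiers, and the `DATAEDO_CONNECTORS` set are illustrative assumptions, not an official list.

```python
SUPPORTED_TYPES = {"application", "database"}

# Hypothetical subset of sources that have a Dataedo connector.
DATAEDO_CONNECTORS = {"salesforce", "sql_server"}


def is_importable(source_type, service):
    """Return True if a Fivetran source passes both checks."""
    return source_type in SUPPORTED_TYPES and service in DATAEDO_CONNECTORS


sources = [
    ("application", "salesforce"),
    ("database", "sql_server"),
    ("event", "apache_kafka"),
    ("file", "amazon_s3"),
    ("function", "aws_lambda"),
    ("application", "google_analytics"),
]
importable = [service for kind, service in sources if is_importable(kind, service)]
```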

Linked sources are not assigned automatically; you must assign them manually.

Dataedo only imports information from the source connectors that are currently connected.

Transformations should be imported separately using the dbt connector.