Amazon Redshift support

Applies to: Dataedo 23.x (current) versions, Article available also for: 10.x

Connctor Features

Data Catalog

Dataedo documents metadata of following objects:

  • tables
  • external tables
  • views
  • materialized views
  • functions
  • copy Commands

Data Lineage

For Redshift Dataedo supports both manual and automatic lineage. Automatic lineage is available for views and external tables.

Data Profiling

Users will be able to run data profiling for a table, view or materialized view, then save selected data in the repository. This data will be available from Desktop and Web.

Relationships (PK/FK) Tester

With Dataedo you can check if selected columns are viable candidates to create a relationship between tables.

Data Classification

Users will be able to run classification on Redshift database in the repository in serach of columns containing potentially sensitive data. All built in functions are supported.

Connection requirements

Cluster VPC Option

In order to import Redshift metadata. Cluster have to have turned on Public accessibility of cluster.

Redshift Action Menu

Public accessibility menu

How to connect

  1. Host - provide a address of redshift endpoint,
  2. Port - change the default port of Amazon Redshift instance if required,
  3. User - provide username of user (either root or IAM) that has access to Redshift database,
  4. Password - provide password for given username,
  5. SSL mode:
    • Disable - don't use SSL,
    • Require - conect with SSL. If server doesn't support SSL connection won't be established,
  6. Database - type in database name.

Redshift Connector Control

Connector specification

Imported objects

Object Imported as
Table Table
External Table Table
View View
Materialized View View
Function Function
Copy Command Sql Script

Imported metadata

Imported Editable
Tables
  Columns
   Data types
   Nullability
   Default value
   Column comments
  Table comments
  Foreign keys
  Primary keys
  Unique indexes
Views, Materialized Views
  Script
  Columns
   Data types
   Nullability
   Default value
    Column comments
  View comments
User-defined Functions
  Script
  Parameters
  Returned Value
  Parameter comments
  Function comments
Copy Commands
  Script

Supported features

Feature Imported
Import comments
Write comments back
Data profiling
Reference data (import lookups)
Importing from DDL
Generating DDL
FK relationship tester

Data profiling

Datedo supports following data profiling in Redshift:

Profile Support
Table row count
Table sample data
Column distribution (unique, non-unique, null, empty values)
Min, max values
Average
Variance
Standard deviation
Min-max span
Number of distinct values
Top 10/100/1000 values
10 random values

Read more about profiling in a Data Profliling documentation.

Data Lineage

Source Method Version
Views - object level From dependencies 10.4
Views - object level From SQL parsing 10.4
Views - column level From SQL parsing 10.4
External Tables - object level From dependencies 23.2
External Tables - object level From Linked Sources 24.1

Known issues and limitations

  • Copy Commands - Due to retention time in Redshift Copy Logs, after one week (default, can be changed in cluster options) Copy Commands would be impossible to import
Found issue with this article? Comment below
Comments are only visible when the visitor has consented to statistics cookies. To see and add comments please accept statistics cookies.
0
There are no comments. Click here to write the first comment.