Amazon Redshift support

Applies to: Dataedo 23.x (current) versions. Article also available for: 10.x

Catalog and documentation

Data Dictionary

Dataedo imports tables, external tables, views, materialized views, columns, user-defined functions and Copy Commands.

Descriptions, aliases and custom fields

When technical metadata is imported, users will be able to edit descriptions of each object and element, provide meaningful aliases (titles) and document everything with additional custom fields.

Import and export comments

When importing metadata from Redshift, Dataedo reads comments of tables, views, columns, and user-defined functions and their parameters. Dataedo does not write comments back to Redshift at the moment.
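These are the standard PostgreSQL-style object comments that Redshift supports. Below is a minimal, hypothetical sketch of setting them before an import; connection details, table and column names are placeholders, and psycopg2 is assumed since Redshift accepts PostgreSQL-protocol connections.

    # Hypothetical example: set comments in Redshift so that Dataedo can import them.
    import psycopg2

    conn = psycopg2.connect(host="<cluster-endpoint>", port=5439, dbname="dev",
                            user="awsuser", password="***")
    with conn, conn.cursor() as cur:
        # COMMENT ON syntax is inherited from PostgreSQL and supported by Redshift.
        cur.execute("COMMENT ON TABLE public.sales IS 'Fact table, one row per order line';")
        cur.execute("COMMENT ON COLUMN public.sales.amount IS 'Net order value in USD';")
    conn.close()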

Business Glossary

Users will be able to link a Business Glossary term to any Redshift object.

Table relationships and keys

Dataedo imports table relationships (foreign keys), primary and unique keys with their columns.

ER Diagrams

Using imported and manually created foreign keys, Dataedo allows you to build your own ER diagrams (ERDs).

Data Profiling

Users will be able to run data profiling for a table, view or materialized view, then save selected data in the repository. This data will be available from Dataedo Desktop and the Web Catalog.

Lookups / Reference data

Users will be able to build Lookups for columns in Redshift tables and views and feed them with distinct values from a column.
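To preview what such a lookup would be fed with, you can list the distinct values of a column yourself. A minimal sketch, assuming psycopg2 and a hypothetical public.customers.country_code column:

    # Hypothetical preview of the distinct values that would populate a lookup.
    import psycopg2

    conn = psycopg2.connect(host="<cluster-endpoint>", port=5439, dbname="dev",
                            user="awsuser", password="***")
    with conn.cursor() as cur:
        cur.execute("SELECT DISTINCT country_code FROM public.customers ORDER BY 1;")
        for (value,) in cur.fetchall():
            print(value)
    conn.close()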

Data Classification

Users will be able to run data classification on a Redshift database in the repository in search of columns containing potentially sensitive data. All built-in functions are supported.

Importing changes and schema change tracking

Once imported, the documentation can be updated with any changes made to the Redshift schema by reimporting metadata, and those schema changes are tracked (see Importing changes below).

Description changes

Changes to descriptions made in Dataedo Desktop and Web Catalog are tracked and saved in the repository. Users are able to manually edit object descriptions.

Share in Web Catalog or export to HTML, PDF or Excel

Documentation can be shared in the Web Catalog or exported to HTML, PDF or Excel.

Subject areas

Users can manually create multiple ERDs in subject areas, as a diagram of the whole database or only a part of it.

Connection requirements

Cluster VPC Option

In order to import Redshift metadata, the cluster has to have public accessibility turned on.

Redshift Action Menu

Public accessibility menu
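If you manage the cluster from code, the same setting can be checked with the AWS SDK. A minimal sketch, assuming boto3 with configured AWS credentials; the cluster identifier and region are placeholders:

    # Check whether the cluster is publicly accessible before attempting an import.
    import boto3

    redshift = boto3.client("redshift", region_name="eu-west-1")
    cluster = redshift.describe_clusters(ClusterIdentifier="my-redshift-cluster")["Clusters"][0]
    print("Publicly accessible:", cluster["PubliclyAccessible"])
    print("Endpoint:", cluster["Endpoint"]["Address"], "port", cluster["Endpoint"]["Port"])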

Connecting to Redshift

To connect to Redshift, create a new documentation by clicking Add documentation and choosing Database connection.

Add documentation

On the connection screen, choose Amazon Redshift as the DBMS, then provide database connection details (a quick connectivity check is sketched after the list):

  • Host - provide the host name or address of the cluster endpoint, e.g. redshift-cluster.r5gf5.eu-west-1.redshift.amazonaws.com
  • Port - type in the Redshift instance port number (the default is 5439)
  • User and password - provide your username and password (for a database from the target cluster)
  • Database - database name
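The same details can be verified outside Dataedo before running the import. A minimal sketch, assuming psycopg2 (Redshift accepts PostgreSQL-protocol connections); user, password and database names are placeholders:

    # Quick connectivity check with the details entered on the connection screen.
    import psycopg2

    conn = psycopg2.connect(
        host="redshift-cluster.r5gf5.eu-west-1.redshift.amazonaws.com",  # Host
        port=5439,           # Port (Redshift default)
        user="awsuser",      # User (placeholder)
        password="***",      # Password (placeholder)
        dbname="dev",        # Database (placeholder)
    )
    with conn.cursor() as cur:
        cur.execute("SELECT current_database(), version();")
        print(cur.fetchone())
    conn.close()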

Redshift Connector Control

Saving password

You can save the password for later connections by checking the Save password option. Passwords are saved in the repository database.

Importing schema

When the connection is successful, Dataedo will read the schema and show a list of objects found. You can choose which objects to import. You can also use the advanced filter to narrow down the list of objects.

Objects to import

Confirm the list of objects to import by clicking Next. The next screen will allow you to change the default name of the documentation under which your schema will be visible in the Dataedo repository.

Change title

Click Import to start the import.

Importing documentation

When done, close the import window with the Finish button.

Import succeeded

Your database schema has been imported as a new documentation in the repository.

Importing changes

To sync any changes made to the schema in Redshift and reimport technical metadata, simply choose the Import changes option. You will be asked to connect to Redshift again and changes will be synced from the source.

Scheduling imports

You can also schedule metadata updates using command line files. To do it, after creating the documentation use the Save update command option. The downloaded file can be run from the command line, which will reimport changes to your documentation.

Specification

Imported metadata

Imported Editable
Tables
  Columns
   Data types
   Nullability
   Default value
   Column comments
  Table comments
  Foreign keys
  Primary keys
  Unique indexes
Views, Materialized Views
  Script
  Columns
   Data types
   Nullability
   Default value
   Column comments
  View comments
User-defined Functions
  Script
  Parameters
  Returned Value
  Parameter comments
  Function comments
Copy Commands
  Script

Supported features

Feature Imported
Import comments
Write comments back
Data profiling
Reference data (import lookups)
Importing from DDL
Generating DDL
FK relationship tester

Comments

Dataedo reads comments from the following Redshift objects:

Object Read Write back
Table comments
  Column comments
View comments
  Columns
User-defined function comments
  Parameters

Data profiling

Dataedo supports the following data profiling in Redshift:

Profile Support
Table row count
Table sample data
Column distribution (unique, non-unique, null, empty values)
Min, max values
Average
Variance
Standard deviation
Min-max span
Number of distinct values
Top 10/100/1000 values
10 random values

Read more about profiling in the Data Profiling documentation.
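The profiles listed above correspond to ordinary aggregate queries. A minimal, hypothetical sketch of a few of them (placeholder table and column names; this illustrates the kind of statistics collected, not Dataedo's actual queries):

    # Illustrative profiling queries against a hypothetical public.sales table.
    import psycopg2

    conn = psycopg2.connect(host="<cluster-endpoint>", port=5439, dbname="dev",
                            user="awsuser", password="***")
    with conn.cursor() as cur:
        cur.execute("SELECT COUNT(*) FROM public.sales;")           # table row count
        print("rows:", cur.fetchone()[0])
        cur.execute("""
            SELECT COUNT(DISTINCT amount),                          -- number of distinct values
                   SUM(CASE WHEN amount IS NULL THEN 1 ELSE 0 END), -- null values
                   MIN(amount), MAX(amount),                        -- min, max
                   AVG(amount), VARIANCE(amount), STDDEV(amount)    -- average, variance, std dev
            FROM public.sales;
        """)
        print("column profile:", cur.fetchone())
        cur.execute("SELECT amount, COUNT(*) FROM public.sales "
                    "GROUP BY 1 ORDER BY 2 DESC LIMIT 10;")         # top 10 values
        print("top values:", cur.fetchall())
    conn.close()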

Data Lineage

Source                           Method              Version
Views - object level             From dependencies   10.4
Views - object level             From SQL parsing    10.4
Views - column level             From SQL parsing    10.4
External Tables - object level   From dependencies   23.2

Under development

Known issues and limitations

  • Copy Commands - due to the retention time of Redshift copy logs, Copy Commands older than one week (the default, which can be changed in cluster options) cannot be imported (a quick check of what is still available is sketched below)
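To see how much copy history is still available in your cluster before an import, you can query the system logs yourself. A minimal sketch, assuming psycopg2 and the STL_QUERY system table (placeholder connection details); entries older than the log retention window are simply absent:

    # Find the oldest COPY statement still present in the Redshift system logs.
    import psycopg2

    conn = psycopg2.connect(host="<cluster-endpoint>", port=5439, dbname="dev",
                            user="awsuser", password="***")
    with conn.cursor() as cur:
        cur.execute("SELECT MIN(starttime), COUNT(*) FROM stl_query "
                    "WHERE querytxt ILIKE 'copy%';")
        oldest, total = cur.fetchone()
        print(f"{total} COPY statements logged, oldest from {oldest}")
    conn.close()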