Catalog and documentation
Data Dictionary
Dataedo imports tables, external tables, views, materialized views, columns and Copy Commands.
Descriptions, aliases and custom fields
Once technical metadata is imported, users can edit the description of each object and element, provide meaningful aliases (titles) and document everything with additional custom fields.
Import and export comments
When importing metadata from Redshift, Dataedo reads table, view, column, user-defined function and parameter comments. Dataedo does not currently write comments back to Redshift.
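Since Dataedo does not write comments back, you could apply descriptions to Redshift yourself using the standard `COMMENT ON` syntax. A minimal sketch, assuming you have exported descriptions into a simple mapping (the table and column names below are made up for illustration):

```python
def comment_statements(table, table_comment=None, column_comments=None):
    """Generate Redshift COMMENT ON statements for a table and its columns.

    Single quotes inside comments are escaped by doubling, per SQL rules.
    """
    def quote(text):
        return "'" + text.replace("'", "''") + "'"

    stmts = []
    if table_comment:
        stmts.append(f"COMMENT ON TABLE {table} IS {quote(table_comment)};")
    for col, comment in (column_comments or {}).items():
        stmts.append(f"COMMENT ON COLUMN {table}.{col} IS {quote(comment)};")
    return stmts

# Hypothetical example table and descriptions
stmts = comment_statements(
    "public.orders",
    table_comment="Customer orders",
    column_comments={"order_id": "Primary key"},
)
```

The generated statements can then be executed against the cluster with any SQL client.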
Business Glossary
Users can link a Business Glossary term to any Redshift object.
Table relationships and keys
Dataedo imports table relationships (foreign keys), primary and unique keys with their columns.
ER Diagrams
Using imported and manually created foreign keys, you can build your own ER diagrams (ERDs) in Dataedo.
Data Profiling
Users can run data profiling on a table, view or materialized view, then save selected results in the repository. This data is available from both Desktop and Web.
Lookups / Reference data
Users can build lookups for columns in Redshift tables and views and populate them with distinct values from a column.
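Conceptually, feeding a lookup with distinct column values amounts to a `SELECT DISTINCT` over the column. A minimal sketch on in-memory sample rows standing in for a query result (the data is made up):

```python
# Sample rows as a stand-in for a Redshift query result (one column per row).
rows = [("US",), ("DE",), ("US",), (None,), ("FR",), ("DE",)]

# Distinct non-null values, sorted - the kind of list a lookup is fed with.
distinct_values = sorted({r[0] for r in rows if r[0] is not None})
```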
Data Classification
Users can run classification on a Redshift database in the repository in search of columns containing potentially sensitive data. All built-in functions are supported.
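The core idea behind classification can be sketched as pattern matching on column names. The patterns below are illustrative only and not Dataedo's actual built-in rules:

```python
import re

# Illustrative patterns only - Dataedo's built-in classification rules differ.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"e[_-]?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "ssn": re.compile(r"\bssn\b|social[_-]?security", re.IGNORECASE),
}

def classify_columns(columns):
    """Return {column: matched_category} for column names that look sensitive."""
    matches = {}
    for col in columns:
        for category, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(col):
                matches[col] = category
                break
    return matches

found = classify_columns(["customer_email", "order_id", "mobile_phone"])
```

Real classifiers typically also sample column values, not just names, before flagging a column as sensitive.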
Importing changes and schema change tracking
Dataedo allows you to reimport metadata at any time and track schema changes synced from Redshift in the repository.
Description changes
Changes to descriptions made in Dataedo Desktop and Web Catalog are tracked and saved in the repository. Users can edit object descriptions manually.
Share in Web Catalog or export to HTML, PDF or Excel
Documentation can be shared in the Web Catalog or exported to HTML, PDF or Excel.
Subject areas
Users can manually create multiple ERDs as subject areas, covering the whole database or only part of it.
Connection requirements
Cluster VPC Option
To import Redshift metadata, the cluster must have Public accessibility turned on.
Connecting to Redshift
To connect to Redshift, create a new documentation by clicking Add documentation and choosing Database connection.
On the connection screen choose Amazon Redshift as DBMS. Provide database connection details:
- Host - a host name or address of the cluster, e.g. redshift-cluster.r5gf5.eu-west-1.redshift.amazonaws.com
- Port - the Redshift instance port number (5439 by default)
- User and password - your username and password for a database user in the target cluster
- Database - the database name
Saving password
You can save the password for later connections by checking the Save password option. Passwords are saved in the repository database.
Importing schema
When the connection is successful, Dataedo reads the schema and shows a list of objects found. You can choose which objects to import. You can also use the advanced filter to narrow down the list of objects.
Confirm the list of objects to import by clicking Next. The next screen allows you to change the default name of the documentation under which your schema will be visible in the Dataedo repository.
Click Import to start the import.
When done, close the import window with the Finish button.
Your database schema has been imported to new documentation in the repository.
Importing changes
To sync any changes in the Redshift schema and reimport technical metadata, simply choose the Import changes option. You will be asked to connect to Redshift again, and changes will be synced from the source.
Scheduling imports
You can also schedule metadata updates using command line files. To do so, after creating the documentation, use the Save update command option. The downloaded file can be run from the command line, which will reimport changes to your documentation.
Specification
Imported metadata
Object | Imported | Editable |
---|---|---|
Tables | ✅ | ✅ |
Columns | ✅ | ✅ |
Data types | ✅ | |
Nullability | ✅ | |
Default value | ✅ | ✅ |
Column comments | ✅ | ✅ |
Table comments | ✅ | ✅ |
Foreign keys | ✅ | ✅ |
Primary keys | ✅ | ✅ |
Unique indexes | ✅ | ✅ |
Views, Materialized Views | ✅ | ✅ |
Script | ✅ | ✅ |
Columns | ✅ | ✅ |
Data types | ✅ | |
Nullability | ✅ | |
Default value | ✅ | ✅ |
Column comments | ✅ | ✅ |
View comments | ✅ | ✅ |
User-defined Functions | ✅ | ✅ |
Script | ✅ | ✅ |
Parameters | ✅ | |
Returned Value | ✅ | |
Parameter comments | ✅ | ✅ |
Function comments | ✅ | ✅ |
Copy Commands | ✅ | ✅ |
Script | ✅ | ✅ |
Supported features
Feature | Supported |
---|---|
Import comments | ✅ |
Write comments back | |
Data profiling | ✅ |
Reference data (import lookups) | ✅ |
Importing from DDL | |
Generating DDL | ✅ |
FK relationship tester | ✅ |
Comments
Dataedo reads comments from the following Redshift objects:
Object | Read | Write back |
---|---|---|
Table comments | ✅ | |
Column comments | ✅ | |
View comments | ✅ | |
View column comments | ✅ | |
User-defined function comments | ✅ | |
Parameter comments | ✅ | |
Data profiling
Dataedo supports the following data profiling in Redshift:
Profile | Support |
---|---|
Table row count | ✅ |
Table sample data | ✅ |
Column distribution (unique, non-unique, null, empty values) | ✅ |
Min, max values | ✅ |
Average | ✅ |
Variance | ✅ |
Standard deviation | ✅ |
Min-max span | ✅ |
Number of distinct values | ✅ |
Top 10/100/1000 values | ✅ |
10 random values | ✅ |
Read more about profiling in the Data Profiling documentation.
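Most of the metrics above are simple aggregates. A minimal sketch computing a few of them in plain Python over a made-up column sample (Dataedo runs equivalent SQL against Redshift instead; variance here is the population variance):

```python
import statistics
from collections import Counter

values = [4, 8, 8, 15, 16, 23, 42, None, 8]  # made-up column sample

non_null = [v for v in values if v is not None]
profile = {
    "row_count": len(values),
    "null_count": values.count(None),
    "min": min(non_null),
    "max": max(non_null),
    "min_max_span": max(non_null) - min(non_null),
    "average": statistics.mean(non_null),
    "variance": statistics.pvariance(non_null),   # population variance
    "std_dev": statistics.pstdev(non_null),       # population std deviation
    "distinct_count": len(set(non_null)),
    "top_values": Counter(non_null).most_common(3),
}
```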
Data Lineage
Source | Method | Version |
---|---|---|
Views - object level | From dependencies | 10.4 |
Views - object level | From SQL parsing | 10.4 |
Views - column level | From SQL parsing | 10.4 |
External Tables - object level | From dependencies | 23.2 |
Under development
Known issues and limitations
- Copy Commands - due to the retention time of Redshift copy logs, Copy Commands older than one week (the default, which can be changed in cluster options) cannot be imported