Catalog and documentation
Dataedo imports tables, external tables, views, materialized views, columns and Copy Commands.
Descriptions, aliases and custom fields
Once technical metadata is imported, users can edit the description of each object and element, provide meaningful aliases (titles) and document everything with additional custom fields.
Import and export comments
When importing metadata from Redshift, Dataedo reads comments on tables, views, columns, user-defined functions and their parameters. Dataedo does not currently write comments back to Redshift.
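Because comments are not written back, descriptions maintained in Dataedo have to reach Redshift by other means. The following is a minimal sketch, assuming the descriptions are available as a plain dictionary; the `comment_statements` helper and its key format are hypothetical and not part of Dataedo. It generates standard `COMMENT ON` statements that could be reviewed and run against the cluster:

```python
def quote_ident(name: str) -> str:
    """Quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text: str) -> str:
    """Minimal literal quoting: doubles single quotes only.
    Other escaping (for example backslashes) is not handled in this sketch."""
    return "'" + text.replace("'", "''") + "'"


def comment_statements(descriptions: dict[str, str]) -> list[str]:
    """Build COMMENT ON statements from a mapping of object path to text.

    Assumed key format (hypothetical): 'schema.table' for a table,
    'schema.table.column' for a column.
    """
    statements = []
    for path, text in descriptions.items():
        parts = path.split(".")
        if len(parts) == 2:
            kind = "TABLE"
        elif len(parts) == 3:
            kind = "COLUMN"
        else:
            raise ValueError(f"Unsupported object path: {path}")
        target = ".".join(quote_ident(p) for p in parts)
        statements.append(f"COMMENT ON {kind} {target} IS {quote_literal(text)};")
    return statements
```

Calling `comment_statements({"public.orders": "Customer orders"})` yields one `COMMENT ON TABLE` statement; nothing is executed by the helper itself.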
Users will be able to link a Business Glossary term to any Redshift object.
Table relationships and keys
Dataedo imports table relationships (foreign keys), primary and unique keys with their columns.
Using imported and manually created foreign keys, Dataedo lets you manually build your own ER diagrams (ERDs).
Users will be able to run data profiling for a table, view or materialized view, then save selected data in the repository. This data will be available from Desktop and Web.
Lookups / Reference data
Users will be able to build Lookups for columns in Redshift tables and views and feed them with distinct values from a column.
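To get a feel for what feeding a Lookup with distinct values involves, here is a small sketch that builds the kind of query one might run by hand. It is an illustration only and not necessarily the query Dataedo issues; the function name and the `limit` parameter are assumptions.

```python
def distinct_values_query(schema: str, table: str, column: str, limit: int = 1000) -> str:
    """Return a SELECT DISTINCT query for one column (illustrative only)."""

    def q(name: str) -> str:
        # Quote an identifier, doubling embedded double quotes.
        return '"' + name.replace('"', '""') + '"'

    return (
        f"SELECT DISTINCT {q(column)} FROM {q(schema)}.{q(table)} "
        f"WHERE {q(column)} IS NOT NULL ORDER BY 1 LIMIT {int(limit)}"
    )
```

The `LIMIT` guards against columns with very high cardinality, which are rarely good candidates for reference data.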
Importing changes and schema change tracking
Changes to descriptions in Dataedo Desktop and Web Catalog are tracked and saved in the repository.
Users can also edit object descriptions manually.
Share in Web Catalog or export to HTML, PDF or Excel
Documentation can be shared in the Web Catalog or exported to HTML, PDF or Excel.
Users can manually create multiple ERDs in subject areas, each showing the whole database or only part of it.
Cluster VPC Option
To import Redshift metadata, the cluster must have Public accessibility turned on.
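The setting can be checked before attempting a connection. The sketch below assumes a response shaped like the one returned by the boto3 Redshift client's `describe_clusters` call, where each cluster entry carries a `PubliclyAccessible` flag; the helper names are hypothetical.

```python
def publicly_accessible(cluster: dict) -> bool:
    """True when a cluster description reports public accessibility.

    `cluster` is assumed to be one entry of the "Clusters" list in a
    describe_clusters response.
    """
    return bool(cluster.get("PubliclyAccessible", False))


def private_clusters(response: dict) -> list[str]:
    """Identifiers of clusters that are not publicly accessible."""
    return [
        c["ClusterIdentifier"]
        for c in response.get("Clusters", [])
        if not publicly_accessible(c)
    ]


# Usage with boto3 (not executed here; requires AWS credentials):
#   import boto3
#   response = boto3.client("redshift").describe_clusters()
#   print(private_clusters(response))
```

Any cluster listed by `private_clusters` would need the option changed before the import can reach it.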
Connecting to Redshift
To connect to Redshift, create a new documentation by clicking Add documentation and choosing Database connection.
On the connection screen choose Amazon Redshift as DBMS. Provide database connection details:
- Host - provide the host name or address of the database server, e.g. redshift-cluster.r5gf5.eu-west-1.redshift.amazonaws.com
- Port - type in the Redshift instance port number
- User and password - provide your username and password (for the database in the target cluster)
- Database - database name
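The connection details above can be collected and sanity-checked before they are typed into the connection screen. This is a minimal sketch; the `RedshiftConnection` class and its checks are hypothetical and are not something Dataedo provides. The default of 5439 is Redshift's standard port.

```python
from dataclasses import dataclass


@dataclass
class RedshiftConnection:
    host: str
    database: str
    user: str
    password: str
    port: int = 5439  # Redshift's default port

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the values look usable."""
        problems = []
        if not self.host.strip():
            problems.append("Host is empty")
        if "://" in self.host:
            problems.append("Host should be a bare host name, without a scheme")
        if not 1 <= self.port <= 65535:
            problems.append("Port must be between 1 and 65535")
        if not self.database.strip():
            problems.append("Database is empty")
        if not self.user.strip():
            problems.append("User is empty")
        return problems
```

The checks are purely syntactic; they do not confirm that the cluster is reachable or that the credentials are valid.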
You can save the password for later connections by checking the Save password option. Passwords are saved in the repository database.
Confirm the list of objects to import by clicking Next. The next screen will allow you to change the default name of the documentation under which your schema will be visible in the Dataedo repository.
Click Import to start the import.
When done, close the import window with the Finish button.
Your database schema has been imported to new documentation in the repository.
To sync any changes in the schema in Redshift and reimport any technical metadata, simply choose the Import changes option. You will be asked to connect to Redshift again and changes will be synced from the source.
You can also schedule metadata updates using command line files. To do so, after creating the documentation use the Save update command option. The downloaded file can be run from the command line, which will reimport changes to your documentation.
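A scheduler (cron, Windows Task Scheduler, or similar) can call a thin wrapper around the saved file. The sketch below is one possible wrapper, assuming the saved command signals failure with a non-zero exit code; the file path in the usage comment is hypothetical.

```python
import subprocess
import sys


def run_update(command: list[str], timeout_seconds: int = 3600) -> bool:
    """Run an update command and report whether it exited successfully.

    Assumption: a non-zero exit code means the update failed.
    """
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout_seconds
    )
    if result.returncode != 0:
        # Surface the error output so the scheduler's log captures it.
        print(result.stderr, file=sys.stderr)
    return result.returncode == 0


# Example (hypothetical file name and location):
#   run_update(["cmd", "/c", r"C:\dataedo\update_redshift.cmd"])
```

Returning a boolean lets the calling job decide whether to retry or raise an alert.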
|Views, Materialized Views||✅||✅|
|Write comments back|
|Reference data (import lookups)||✅|
|Importing from DDL|
|FK relationship tester||✅|
Dataedo reads comments from the following Redshift objects:
Dataedo supports the following data profiling in Redshift:
|Table row count||✅|
|Table sample data||✅|
|Column distribution (unique, non-unique, null, empty values)||✅|
|Min, max values||✅|
|Number of distinct values||✅|
|Top 10/100/1000 values||✅|
|10 random values||✅|
Read more about profiling in the Data Profiling documentation.
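To make the profiling metrics listed above concrete, here is a small in-memory sketch that computes comparable figures for a single column. It is not Dataedo's implementation, and the definitions used are assumptions: "unique" counts values that occur exactly once and "non-unique" counts rows whose value occurs more than once.

```python
from collections import Counter


def profile_column(values: list, top_n: int = 10) -> dict:
    """Compute simple profiling metrics for one column's values."""
    non_null = [v for v in values if v is not None]
    non_empty = [v for v in non_null if v != ""]
    counts = Counter(non_empty)
    return {
        "rows": len(values),
        "null": len(values) - len(non_null),
        "empty": len(non_null) - len(non_empty),
        "distinct": len(counts),
        # Rows whose value appears exactly once.
        "unique": sum(1 for c in counts.values() if c == 1),
        # Rows whose value appears more than once.
        "non_unique": sum(c for c in counts.values() if c > 1),
        "min": min(non_empty) if non_empty else None,
        "max": max(non_empty) if non_empty else None,
        "top": counts.most_common(top_n),
    }
```

With these definitions, the unique, non-unique, null and empty counts always add up to the row count, which makes the distribution easy to read as shares of the table.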
|Views - object level||From dependencies||10.4|
|Views - object level||From SQL parsing||10.4|
|Views - column level||From SQL parsing||10.4|
|External Tables - object level||From dependencies||23.2|
Known issues and limitations
- Copy Commands - Due to the retention time of Redshift Copy Logs, Copy Commands cannot be imported after one week (the default, which can be changed in cluster options)
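In practice this means imports should run more often than the log retention period. A minimal sketch of that check, assuming the default seven-day retention mentioned above (the function name is hypothetical):

```python
from datetime import datetime, timedelta


def still_importable(copy_time: datetime, now: datetime, retention_days: int = 7) -> bool:
    """True when a Copy Command ran within the assumed log retention window."""
    return now - copy_time <= timedelta(days=retention_days)
```

A Copy Command that ran eight days before the import would fall outside the default window, so a weekly or more frequent schedule is the safer choice.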