Release notes 23.2 - Domains, AI and more

Piotr Kononow - Dataedo Team, 16th November, 2023
Applies to: Dataedo 23.x (current) versions, Article available also for: 24.x

I am proud to present the long-awaited Dataedo 23.2! This is probably our biggest release so far. We have a lot of exciting new features and updates for you.

Domains and hierarchy of Subject Areas

The most groundbreaking new feature of Dataedo, something many of our users have been waiting for, is the new domains. With the new domains we replaced the old subject areas and met the following requirements:

  1. Added ability to group assets into domains (domains are now separate from the sources, unlike the old subject areas),
  2. Added the hierarchy of Subject Areas (domains can have a hierarchy of subject areas up to 3 levels),
  3. Now it is possible to define access to groups of data assets (each domain/subject area has its own permissions and objects linked inherit those permissions),
  4. Data teams can build a wiki-like portal for business users with the hierarchy of business domains.

We built not one, but two types (hierarchies) of domains: Business and Data Domains.

Business Domains

Business Domains, and the hierarchy of Subject Areas, allow you to break down your organization's business processes and act as a wiki-like portal to the data and concepts in your organization. This can be an easy start with data for business users, or even the place where they learn about various areas and processes of your business.

Data Domains

Data Domains, with the hierarchy of data areas, are meant to represent the Data Team's/IT view and organization of the data and applications. You can use it to represent Data Mesh domains or different departments, teams, applications, modules or even data products.

Permissions

From 23.2, objects inherit permissions from domains and areas (business and data). When you assign an object to a domain or area, users who have access to that domain/area gain access to all the objects linked to it. Therefore, domains can be a way of controlling access at the object level.
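
The inheritance rule can be pictured with a small sketch. This is only an illustration, not Dataedo's implementation: the `Area` class, its fields and the example names are hypothetical, and the rule that access to a parent domain also covers its sub-areas is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Area:
    """A domain or a nested area (hypothetical model, not Dataedo's schema)."""
    name: str
    parent: Optional["Area"] = None
    allowed_users: Set[str] = field(default_factory=set)
    objects: Set[str] = field(default_factory=set)

    def can_access(self, user: str) -> bool:
        # Assumption: access granted on a parent also covers its sub-areas.
        node: Optional[Area] = self
        while node is not None:
            if user in node.allowed_users:
                return True
            node = node.parent
        return False


def accessible_objects(user: str, areas: List[Area]) -> Set[str]:
    """Collect every object linked to an area the user can access."""
    result: Set[str] = set()
    for area in areas:
        if area.can_access(user):
            result |= area.objects
    return result


sales = Area("Sales", allowed_users={"anna"})
orders = Area("Orders", parent=sales, objects={"dbo.orders", "dbo.order_lines"})
finance = Area("Finance", allowed_users={"ben"}, objects={"dbo.invoices"})

print(accessible_objects("anna", [sales, orders, finance]))
```

In this sketch a user never gets permissions on an object directly. Linking the object to an area is what makes it visible.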

Auto documentation with AI

One of the most exciting features of 23.2 is AI Autodocumentation. Now you can plug your OpenAI GPT subscription into Dataedo and ask it to generate descriptions for you.

Keyword Explorer

Keyword Explorer is our own unique approach to browsing, searching and discovering metadata found in your databases and other sources. Dataedo analyzes different keywords that occur in table and column names. You can then explore keywords, finding their occurrences in data set names, finding glossary terms that exist with that name, and finding similar keywords.
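
To give a feel for the idea, here is a minimal sketch of keyword extraction from object names. It is an assumption-based illustration, not the algorithm Dataedo uses: the splitting rules (underscores, other separators and camelCase) and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import List


def split_keywords(name: str) -> List[str]:
    """Split an identifier such as 'CustomerOrder_id' into lowercase keywords."""
    words: List[str] = []
    # First break on anything that is not a letter or digit (underscores, spaces, dots).
    for part in re.split(r"[^A-Za-z0-9]+", name):
        # Then break camelCase humps, keeping acronyms such as 'HTTP' together.
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part)
    return [w.lower() for w in words]


def keyword_occurrences(names: List[str]) -> Counter:
    """Count in how many names each keyword occurs."""
    counts: Counter = Counter()
    for name in names:
        counts.update(set(split_keywords(name)))  # count each name once per keyword
    return counts


names = ["customer", "CustomerOrder", "order_line", "order_id", "customer_id"]
counts = keyword_occurrences(names)
print(counts.most_common())
```

A keyword index like this is what makes it possible to jump from a keyword to every table or column whose name contains it.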

Import improvements

Performance

In 23.2 we redesigned our import engine and now imports should be much faster.

There are still some suboptimal elements in the current import mechanism that we will be correcting in subsequent versions. We are planning to remove the first step (counting objects) and continue to improve loading time.

User Experience (UX)

We built better import status tracking and logging.

Interface tables import improvements

Interface tables are a set of tables in the Dataedo repository database to which you can safely write raw metadata that you would like to import into the repository. To load data from interface tables, you need to run the Interface tables connector.

Changes:

  1. Import of reports
  2. Import of linked sources
  3. Additional fields for importing data lineage (including custom fields)
  4. Ability to build a lineage between different databases

New connectors

In 23.2 we added the following connectors:

  1. Custom (SQL) connectors
  2. MS Fabric
  3. Azure Synapse Pipelines
  4. Power BI Report Server
  5. SSAS Multidimensional (Cubes)
  6. Local disk, FTP/SFTP connector
  7. MySQL Core connector

Custom (SQL) connectors

To ship connectors faster and without the need to release a new version of Desktop, we built a "custom (SQL) connector" functionality. Custom connectors are saved and distributed in proprietary .dataedocon files (XML format). Connectors are built and delivered by the Dataedo team or can be built by users (they currently require signing by the Dataedo team).

We built and shipped with 23.2 custom connectors to:

  1. ClickHouse
  2. IBM Informix
  3. IBM iSeries
  4. SAP Advantage Server (Sybase ADS)
  5. Starburst/Trino

Learn more about custom SQL connectors.

MS Fabric

We created connectors to MS Fabric, a unified analytics platform. They allow you to connect to Power BI and Data Warehouse inside the Fabric Workspace.

Power BI

Metadata imported with this connector is the same as for the conventional Power BI connector.

Learn more about Microsoft Fabric - Power BI.

Data Warehouse

The MS Fabric Data Warehouse connector imports the same metadata as the Azure Synapse connector.

Learn more about Microsoft Fabric - Data Warehouse.

Learn more about Microsoft Fabric connector.

Azure Synapse Pipelines

We have created a separate connector for Pipelines in Azure Synapse Analytics (Azure Synapse Pipelines) that imports pipelines and builds lineage automatically.

What is imported:

  1. Pipelines
  2. Activities
  3. Source and sink datasets
  4. Linked services
  5. Lineage between pipelines, activities, datasets, and external data sources

Learn more about Azure Synapse Pipelines connector.

Power BI Report Server

We built a new connector to Power BI - Power BI Report Server. It is an on-premises version of the Power BI service.

We import the following objects:

  1. Power BI Reports
  2. Paginated Reports
  3. Paginated Report Datasets

We haven't figured out how to import Power BI Datasets and build lineage for Power BI reports due to limitations of what's provided by the platform.

Learn more about Power BI Report Server connector.

SSAS Multidimensional (Cubes)

Dataedo 23.2 comes with a brand new connector to the SQL Server Analysis Services (SSAS) Multidimensional model (cubes, measures and dimensions). Previously you could connect only to SSAS Tabular, which is a completely different technology.

We import the following objects:

  1. Cubes
  2. Measures
  3. Dimensions
  4. Data Source View
  5. Data Sources (as Linked Sources)
  6. Column level lineage!

Learn more about SSAS Multidimensional connector.

Local disk, FTP/SFTP connector

In previous versions you could import big data files from Amazon S3, Azure Blob Storage and Azure Data Lake Storage. Now we have extended this list with local disk, FTP and SFTP connections.

Learn more about FTP/SFTP connector.

MySQL Core connector

There are many databases derived from MySQL, and many of them are not compatible with the most recent version of MySQL. We created a MySQL Core connector that scans just the most basic metadata, which gives you the best chance of connecting successfully to all those other types of databases.

Learn more about MySQL Core connector.

Connector improvements

In 23.2 we made improvements to the following connectors:

  1. SQL Server Polybase
  2. Azure Synapse
  3. SSIS
  4. SSRS
  5. Tableau
  6. Power BI
  7. Azure Data Factory
  8. Snowflake
  9. Amazon Redshift
  10. Oracle
  11. CSV/Delimited text files
  12. Dbt

SQL Server Polybase

In 23.2 we added import of object-level lineage for external tables in SQL Server Polybase. Supported sources of external tables are the following: SQL Server, Oracle, MongoDB, Excel, S3, Azure Blob Storage, Azure Data Lake Gen2.

Azure Synapse

Lineage based on COPY INTO history - Synapse Dedicated instances

Dataedo will create lineage from the source file to the destination table of a COPY INTO command. This requires documenting the source file before importing Synapse Dedicated metadata.

Lineage for external tables - Synapse Dedicated and Synapse Serverless instances

External tables will have full lineage from source file to external table in Dataedo 23.2.

Learn more about Azure Synapse Analytics connector.

SSIS

From 23.2 Dataedo imports the following object types:

  1. Packages (now moved to Packages folder)
  2. Sources
  3. Destinations
  4. Connections (as Linked Sources)

More detailed lineage

For some types of SSIS processes, such as Balanced Data Distributor, Conditional Split, Lookup, Multicast Transform and many more (more details in the SSIS documentation), Dataedo can extract column-level data lineage.

Learn more about SSIS connector.

SSRS

Learn more about SSRS connector.

Tableau (Prep Builder data flows)

In 23.2 we now read Tableau Prep Builder data flows into repository.

Upgrades include:

  1. Importing of flows with lineage
  2. Filtering out system tables during import
  3. Building of column level lineage from Custom SQL (represented as datasets) with SQL parsing
  4. Importing of fields hierarchy in Data sources

Learn more about Tableau connector.

Power BI

We upgraded our Power BI connector:

  1. Added support for column-level data lineage for live connection
  2. Support for Datasets that have been created as a Live connection
  3. We added Linked Sources for better control over automatic lineage building (see Data Lineage improvements section)
  4. Importing Dashboards as "Dashboard" not a "Report"

Learn more about Power BI connector.

Azure Data Factory

  1. Importing of Linked Services (as Linked Sources)
  2. Importing of Data sources and destinations into Datasets
  3. Better way of fetching dataset columns

Learn more about Azure Data Factory connector.

Snowflake

The 23.2 Snowflake connector includes the following upgrades:

  1. Loading lineage from data loading history of COPY INTO <table>
  2. Loading of lineage for external tables
  3. Distinguished external functions as a separate type

Learn more about Snowflake connector.

Amazon Redshift

We upgraded automated lineage creation for the Amazon Redshift connector. The upgrade includes:

  1. Import of external tables with lineage
  2. Import of COPY commands (not COPY Jobs)
  3. Import of Data Ingestion (as Linked Sources)

Learn more about Amazon Redshift connector.

Oracle

In 23.2 we moved Oracle packages and their stored procedures and functions into the Packages folder.

Learn more about Oracle connector.

CSV/Delimited text files

Dataedo 23.1 and older supported import of CSV (comma-separated values) files, which store values separated by a comma (","). Some files use the same format with other separators, such as a semicolon (";"), a pipe ("|") or a tab. Now you can provide the separator character during the import.
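
For readers less familiar with delimited files, the sketch below shows why the separator matters, using Python's standard csv module. It only illustrates the file format and says nothing about how the Dataedo connector reads files; the `read_delimited` helper and the sample data are made up for the example.

```python
import csv
import io
from typing import Dict, List


def read_delimited(text: str, separator: str = ",") -> List[Dict[str, str]]:
    """Parse delimited text into rows keyed by the header line."""
    reader = csv.DictReader(io.StringIO(text), delimiter=separator)
    return [dict(row) for row in reader]


semicolon_file = "id;name\n1;Alice\n2;Bob\n"

# With the right separator the two columns are recognized.
rows = read_delimited(semicolon_file, separator=";")

# With the default comma the whole line is treated as a single column.
wrong = read_delimited(semicolon_file)

print(rows)
print(wrong)
```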

Dbt

Added option to add manifest.json and catalog.json files manually.

Learn more about Dbt connector.

Data Lineage improvements

Lineage from SQL parsing

  1. Snowflake dialect support
  2. CTE support for PostgreSQL, MySQL & Oracle

Linked Sources

In 23.2 we introduced a new object type - Linked Source. A Linked Source represents a connection from one data source to another data source. In BI and ETL tools those are simply connections. This allows you to manually identify links across various sources in the Dataedo repository in case Dataedo is not able to detect them automatically (for instance, when the address is identified by URL in one source and by IP in another). This gives you more control over automated lineage creation and increases the scope of what is created automatically.

Selectable SQL dialect

Now you can tell the parser and lineage builder which parser/SQL dialect should be used. This gives you more control over parsing and lineage automation and allows you to use one of the existing parsers on sources we don't provide dedicated SQL parsers for.

You can assign SQL dialect to:

  1. Data source
  2. Linked source - link to data source from another data source
  3. Data Lineage process

Column level lineage fields

From 23.2 you can add additional metadata to column mapping:

  1. Description - you can provide a description for each column mapping
  2. Transformation (as text)
  3. Custom fields

Process level fields

We added a number of fields to the process level of the lineage:

  1. Description - you can describe each process
  2. Script - script is imported automatically or added manually
  3. Linked Source

Marking as deleted

The automated data lineage process now marks processes, objects and column mappings as deleted, rather than removing them. This is because processes and column mappings now have extra manual metadata that would be lost otherwise.
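
The pattern described here is commonly called a soft delete. Below is a minimal sketch of the idea, under our own assumptions: the `ColumnMapping` class, its `is_deleted` flag and the `sync_mappings` function are hypothetical and do not reflect the actual repository schema.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class ColumnMapping:
    source: str
    target: str
    description: str = ""   # manual metadata added by a user
    is_deleted: bool = False


def sync_mappings(existing: List[ColumnMapping], detected: Set[Tuple[str, str]]) -> None:
    """Reconcile stored mappings with what the latest automated run detected."""
    known = {(m.source, m.target) for m in existing}
    for mapping in existing:
        # Flag instead of removing, so the manual description survives.
        mapping.is_deleted = (mapping.source, mapping.target) not in detected
    for source, target in sorted(detected - known):
        existing.append(ColumnMapping(source, target))


mappings = [
    ColumnMapping("stg.a", "dw.a", description="trimmed and upper-cased"),
    ColumnMapping("stg.b", "dw.b"),
]
sync_mappings(mappings, {("stg.b", "dw.b"), ("stg.c", "dw.c")})
print(mappings)
```

The first mapping is no longer detected, yet it stays in the list with its description intact and only the flag changes.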

Data Profiling improvements

Refreshed UI

In 23.2 we refreshed the UI of data profiling window:

  1. Ribbon options have been grouped
  2. Sparklines and progress icons have been updated
  3. Context menu has been updated

Blocking classified fields

Now, Dataedo will not profile and save data for columns that have a classification.

You can define which classification labels disable which profiling steps.
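
Conceptually, such a configuration is a mapping from labels to the steps they switch off. The sketch below is a guess at the shape of that logic: the step names, the labels and the `allowed_steps` function are all hypothetical and are not taken from Dataedo.

```python
from typing import Dict, List, Set

# Hypothetical profiling steps.
ALL_STEPS: Set[str] = {"row_count", "min_max", "top_values", "sample_values"}

# Hypothetical rules: classification label -> profiling steps it disables.
BLOCKED_STEPS: Dict[str, Set[str]] = {
    "PII": {"top_values", "sample_values"},
    "Secret": set(ALL_STEPS),
}


def allowed_steps(labels: List[str]) -> Set[str]:
    """Return the profiling steps still allowed for a column with these labels."""
    blocked: Set[str] = set()
    for label in labels:
        blocked |= BLOCKED_STEPS.get(label, set())
    return ALL_STEPS - blocked


print(allowed_steps(["PII"]))
```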

Filtering tables

Now Dataedo allows you to add a WHERE clause to tables (and views) to limit the rows that will be profiled.
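
The effect of such a filter can be shown with a small, self-contained sketch that uses an in-memory SQLite database. The `profile_column` function, the `orders` table and the statistics chosen are assumptions for the example, not the queries Dataedo runs.

```python
import sqlite3
from typing import Optional


def profile_column(conn: sqlite3.Connection, table: str, column: str,
                   where: Optional[str] = None) -> dict:
    """Compute simple profile statistics, optionally limited by a WHERE filter."""
    # Identifiers and the filter are interpolated as-is; only use trusted input.
    query = (
        f"SELECT COUNT(*), COUNT(DISTINCT {column}), MIN({column}), MAX({column}) "
        f"FROM {table}"
    )
    if where:
        query += f" WHERE {where}"
    rows, distinct, minimum, maximum = conn.execute(query).fetchone()
    return {"rows": rows, "distinct": distinct, "min": minimum, "max": maximum}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, year INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, 2022), (2, 25.0, 2023), (3, 25.0, 2023), (4, 40.0, 2023)],
)

full = profile_column(conn, "orders", "amount")
filtered = profile_column(conn, "orders", "amount", where="year = 2023")
print(full)
print(filtered)
```

Besides limiting the scope, filtering also reduces the amount of data scanned, which can matter on large tables.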

Value distribution chart

Dataedo now builds a value distribution chart for numeric columns, and a length distribution chart for text columns.
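
The data behind such charts can be approximated as follows. This is a simplified sketch under our own assumptions (equal-width bins, a fixed number of bins), not the exact method Dataedo uses, and both function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def numeric_distribution(values: List[float], bins: int = 4) -> List[int]:
    """Count values in equal-width bins between the minimum and maximum."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1  # avoid a zero width when all values are equal
    counts = [0] * bins
    for value in values:
        # The maximum would land one past the end, so clamp it into the last bin.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


def length_distribution(texts: List[str]) -> Dict[int, int]:
    """Count how many text values have each length."""
    return dict(sorted(Counter(len(t) for t in texts).items()))


print(numeric_distribution([1, 2, 3, 4, 5, 6, 7, 8, 9], bins=4))
print(length_distribution(["a", "bb", "cc", "dddd"]))
```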

Sample data as a tab

We added a Sample data tab directly to the table form so it is easier to preview sample rows of a table. This information is stored only in Desktop memory and is not saved in the repository or shared with other users.

Re-run profiling

Now you can re-run profiling for the same set of tables, columns and settings of profiling scope with one button.

Running profiling from command line

You can also re-run profiling from the command line, which means you can schedule the profiling operation.

Other improvements

  1. You can now also profile saved SQL Queries
  2. Profiling skips empty tables

Dataedo Portal

This section lists changes in Dataedo Portal (formerly known as Dataedo Web Catalog).

Preview pane

Dataedo 23.2 Portal has a convenient preview pane that presents basic information about the selected object (in the example below, a column in a table) without the need to open a new page. Click the arrow next to the name to go to the object page.

Image title

History of changes in Portal

Since 10.2 and 10.3 we have been saving changes to descriptions, titles and custom fields, both from Desktop and Portal. There is also an option to view this history for each field in Desktop. Now we have added this capability to Portal.

Customization

From now on, you can customize the following elements of Dataedo Portal:

  1. Logo
  2. Login page background
  3. Login page additional information box
  4. Portal name
  5. Home page additional information box

To customize the Portal, log in as an admin and go to System Settings > Customization.

Search redesigned

The new search is faster and groups different types of objects in separate tabs.

Open in Desktop

In 23.1 we added in Dataedo Desktop an "Open in Web" button, that allows you to quickly open the same page in Dataedo Portal (formerly Dataedo Web). Now, we added a similar button in Portal that allows you to open the same object in Dataedo Desktop - "Open in Desktop".

Business Glossary types configuration

Now, you can add/edit glossary entry types directly in the Portal.

Improved code diff

In the schema change tracking report we improved the code diff:

  1. Before and after code is now aligned to the same lines
  2. Removed and new fragments are marked in red/green
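
The line alignment described in the list above can be sketched with Python's standard difflib module. This is only an illustration of how a side-by-side diff keeps unchanged lines on the same row; it is not the code used by the Portal, and the `aligned_diff` function is hypothetical.

```python
import difflib
from typing import List, Tuple


def aligned_diff(before: List[str], after: List[str]) -> List[Tuple[str, str, str]]:
    """Return (tag, left, right) rows so that matching lines share a row."""
    rows: List[Tuple[str, str, str]] = []
    matcher = difflib.SequenceMatcher(a=before, b=after)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left, right = before[i1:i2], after[j1:j2]
        # Pad the shorter side with empty cells to keep both columns aligned.
        for k in range(max(len(left), len(right))):
            rows.append((
                tag,
                left[k] if k < len(left) else "",
                right[k] if k < len(right) else "",
            ))
    return rows


before = ["SELECT id, name", "FROM customers", "WHERE active = 1"]
after = ["SELECT id, name, email", "FROM customers", "WHERE active = 1"]

for tag, left, right in aligned_diff(before, after):
    print(f"{tag:8} | {left:25} | {right}")
```

A viewer could then color rows tagged "delete" or "replace" on the left in red and rows tagged "insert" or "replace" on the right in green.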

Dataedo Desktop

This section lists changes in Dataedo Desktop.

New navigation and double click

In 23.2 Desktop got a complete overhaul of the navigation. The repository explorer now has two layers. Items are grouped into categories, which makes it easier to navigate.

Now, you have to double click on the object to open it.

Breadcrumbs, history and Back button

A long-awaited new feature - an option to go back to the last visited object. We have made a full package - Windows-like breadcrumbs (that in the next version will allow copy and paste), a back button and a history of viewed objects.

You can navigate the repository in the breadcrumb:

Image title

Browser-like stacked history:

Image title

Full history (for current session):

Image title

Secure repository connection

To make it more secure to host Dataedo Portal and the repository in the cloud, Desktop now allows you to connect to the repository over an SSH tunnel.

Before you upgrade

Before you upgrade, consider the following:

  1. Upgrade requirements - this release requires upgrade of the following:
    • Repository
    • Dataedo Portal (formerly Web Catalog)
    • Dataedo Desktop
  2. Check your plan - not all the features added in 23.2 are available in all the plans. Check the overview of plans and check your plan in Dataedo Account.