Azure Purview

Azure Purview is a unified data governance solution that enables organizations to catalog, manage, and discover data across their cloud and on-premises environments. It provides tools for organizing and tracking data assets such as databases and tables, helping users understand and control their data landscape. Starting with version 24.4, Dataedo includes an Azure Purview connector that lets users import single databases stored within Purview, streamlining data cataloging and strengthening data governance.

Documenting Azure Purview

Dataedo imports the following Purview elements:

Purview             | Dataedo
Asset               | Object
Technical Metadata  | Fields + Custom Fields
Classification      | Fields + Custom Fields
Managed Attributes  | Fields + Custom Fields

Data Lineage

  • The Purview connector builds lineage only within a single import (it won't create lineage to other technologies or to other Purview imports).
  • The Purview connector builds object-level lineage.
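
The single-import rule can be pictured as a filter over lineage edges: an edge is kept only when both of its endpoints belong to the current import. The sketch below is illustrative only. The function `internal_lineage`, the `(source, target)` edge shape, and the object names are hypothetical and are not part of Dataedo's or Purview's API.

```python
def internal_lineage(edges, imported_objects):
    """Keep only lineage edges whose source and target are both in this import.

    `edges` is an iterable of (source, target) object names. The data shape
    and this function are hypothetical, for illustration only.
    """
    imported = set(imported_objects)
    return [(src, dst) for src, dst in edges if src in imported and dst in imported]


# Hypothetical example: the second edge points at an object outside the import.
edges = [
    ("sales.orders", "sales.v_orders"),
    ("sales.v_orders", "reporting.daily_orders"),  # target was not imported
]
imported_objects = ["sales.orders", "sales.v_orders"]
kept = internal_lineage(edges, imported_objects)
print(kept)
```

Only the edge between the two imported objects survives; the edge to the object outside the import is dropped.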

Connecting to Purview

Add new connection

To connect to a Purview instance, create new documentation by clicking Add and choosing New connection.

On the connection screen, choose Purview (it can be found under the Catalogs folder).

Purview connection details

Connection details:

  • Sign in - authenticate with your Azure account.
  • Subscription - the Azure subscription assigned to your Purview instance.
  • Resource group - the resource group that contains your Purview instance.
  • Account name - the account that has access to Purview.

Setting Service and Object Types

In order to correctly import Purview, Dataedo needs to know three things:

  • Service - the technology you would like to import, e.g. Hive or HBase.
  • Object Types - which object type from the selected Service is the main one, e.g. Hive Database. Because Purview is structured as a graph, Dataedo needs to know which object type should be treated as the Database/Storage.
  • Database/Storage - the instance of the selected object type that you would like to import, e.g. the database AdventureWorks.

After setting this up, the Purview connection is ready to use.
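
To see why all three settings are needed, consider a rough model of the Purview graph as a flat list of entities that point at their parents. The sketch below is an assumption-laden illustration, not Dataedo's implementation: `ImportSelection`, `objects_to_import`, and the `name`/`type`/`parent` keys are hypothetical, and entity names are assumed to be unique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportSelection:
    service: str      # technology, e.g. "Hive" (kept for context only here)
    object_type: str  # type treated as Database/Storage, e.g. "Hive Database"
    database: str     # chosen instance, e.g. "AdventureWorks"


def objects_to_import(entities, selection):
    """Return names of the selected database and everything beneath it.

    `entities` is a list of dicts with hypothetical keys: name, type, parent.
    """
    frontier = [
        e["name"] for e in entities
        if e["type"] == selection.object_type and e["name"] == selection.database
    ]
    selected = []
    while frontier:  # breadth-first walk down the parent/child links
        current = frontier.pop(0)
        selected.append(current)
        frontier.extend(e["name"] for e in entities if e["parent"] == current)
    return selected


entities = [
    {"name": "AdventureWorks", "type": "Hive Database", "parent": None},
    {"name": "customers", "type": "Hive Table", "parent": "AdventureWorks"},
    {"name": "customer_id", "type": "Hive Column", "parent": "customers"},
    {"name": "Staging", "type": "Hive Database", "parent": None},
    {"name": "tmp_load", "type": "Hive Table", "parent": "Staging"},
]
selection = ImportSelection("Hive", "Hive Database", "AdventureWorks")
result = objects_to_import(entities, selection)
print(result)
```

Without the object type, nothing in the graph marks which node plays the role of the database; without the instance, there is no way to choose between AdventureWorks and Staging.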

Mapping Purview Service to Dataedo

A Purview import requires mapping Purview attributes onto Dataedo fields. This lets users choose how they want their data to be loaded into Dataedo.

  • Import - a checkbox indicating whether to import the given type.
  • External Object Type - the name of the Purview object type.
  • Dataedo Object Type - the Dataedo object type to which the Purview type will be mapped.
  • Parent Object Type - Purview tends to split certain objects into two separate entities, whereas Dataedo treats them as one (for example, a view and its corresponding script). During mapping, a dropdown lets you mark a specific Purview object type as a "Child" (a sub-object or part of another object). Enabling this setting merges the two objects. For instance, when mapping a view and its script, you can designate the view as the parent object of the script. The application then treats the script and its attributes as attributes of the view. The view's attributes take precedence, so the imported object's name, schema, and other metadata are sourced from the view. However, if the view lacks certain attributes (or they are empty) and the script contains them, the script's values are loaded into the mapped fields.
  • Map attributes - a button that opens the attribute mapping window.
  • Attributes Mapping Percentage - a bar showing what percentage of all attributes available for a given type is mapped to Dataedo fields.
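
The precedence rule described under Parent Object Type can be sketched as a dictionary merge. This is a minimal illustration, not Dataedo's actual code: `merge_parent_child` is a hypothetical helper, the attribute names are made up, and treating `None` and the empty string as "empty" is an assumption.

```python
def merge_parent_child(parent_attrs, child_attrs):
    """Merge a child object's attributes into its parent.

    Parent values win; a child value is used only when the parent's value is
    missing or empty. Treating None and "" as "empty" is an assumption.
    """
    merged = dict(child_attrs)
    for key, value in parent_attrs.items():
        if value not in (None, ""):
            merged[key] = value          # non-empty parent value takes precedence
        elif key not in merged:
            merged[key] = value          # keep the key even if both sides lack a value
    return merged


# Hypothetical view (parent) and its script (child).
view = {"name": "v_orders", "schema": "sales", "definition": ""}
script = {"name": "v_orders_script", "definition": "SELECT * FROM orders", "language": "SQL"}
merged = merge_parent_child(view, script)
print(merged)
```

In this example the name and schema come from the view, while the definition and language come from the script because the view has no value for them.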

Mapping attributes

Mapping attributes is a separate step, opened by clicking Map attributes. In this form you can browse the Purview attributes and choose the Dataedo attribute or field each one should be loaded into. You can also create new Custom Fields and use them for mapping.
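
Conceptually, the result of this step is a lookup from Purview attribute names to Dataedo fields, and the Attributes Mapping Percentage bar reflects how much of that lookup is filled in. The sketch below assumes a plain dictionary and hypothetical attribute and field names; `mapping_percentage` is not a Dataedo function.

```python
def mapping_percentage(purview_attributes, mapping):
    """Percentage of available Purview attributes mapped to a Dataedo field."""
    if not purview_attributes:
        return 0.0
    mapped = sum(1 for attr in purview_attributes if mapping.get(attr))
    return 100.0 * mapped / len(purview_attributes)


# Hypothetical attributes of one Purview type and their chosen Dataedo fields.
purview_attributes = ["name", "description", "owner", "createTime"]
mapping = {
    "name": "Name",
    "description": "Description",
    "owner": "Owner (custom field)",
    # "createTime" is left unmapped
}
pct = mapping_percentage(purview_attributes, mapping)
print(f"{pct:.0f}% of attributes mapped")
```

Three of the four attributes are mapped here, so the bar would sit at 75%.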

Automated mapping

Some services are mapped automatically by Dataedo. For these services, the import no longer requires manually mapping fields from Purview to Dataedo, although manual mapping remains available if needed. The services below have predefined mapping models.

  • AmazonRdsDatabaseMySql
  • AmazonRdsDatabasePostgreSql
  • AmazonRdsDatabaseSql
  • AmazonRedshift
  • AzureBlobStorage
  • AzureDatabaseForMariaDb
  • AzureDatabaseForMySql
  • AzureDatabaseForPostgreSql
  • AzureSqlDatabase
  • AzureTableStorage
  • Cassandra
  • Dataverse
  • Db2
  • Fabric
  • GoogleBigquery
  • Hbase
  • Hive
  • MySql
  • Oracle
  • PostgreSql
  • SqlServer
  • Snowflake
  • TableauServer
  • Teradata

Specifications

Imported objects

Imported | Editable
Assets
Technical properties
Classification
Managed Attributes

Supported features

Feature | Is supported
Writing changes back
Data Profiling
CMD Import
PK/FK relationship tester
Linked Sources

Data Lineage

Source                          | Method | Version
Internal lineage (object-level) | SDK    | 24.4 (2024)

Limitations

  • Once a property has been mapped to the Dataedo Title field and imported, subsequent imports will not override it: after the Title field is set, Dataedo does not modify it.
  • All classifications are loaded into a single custom field. This means that if a table in Purview has classifications A and B assigned to it, both values will be stored in the same custom field in Dataedo.
  • Automated mapping models do not map managed attributes and classifications; these must be mapped manually.
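
The first two limitations can be sketched in a few lines. Both helpers are hypothetical illustrations of the behavior described above, not Dataedo code; in particular, the separator used when several classifications share one custom field is an assumption, since it is not specified here.

```python
def title_after_import(existing_title, imported_title):
    """Return the Title to keep: once set, it is not overridden by later imports."""
    return existing_title if existing_title else imported_title


def classifications_field_value(classifications, separator=", "):
    """Combine all classifications of one object into a single field value.

    The separator and ordering are assumptions, not documented behavior.
    """
    return separator.join(classifications)


first_import = title_after_import(None, "Customer Orders")
second_import = title_after_import(first_import, "Orders (renamed in Purview)")
combined = classifications_field_value(["A", "B"])
print(second_import, "|", combined)
```

After the second import the Title is still the value set by the first one, and classifications A and B end up together in the same field.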

Plans for future releases

  • Importing multiple services at once