
Apache Atlas

Apache Atlas is an open-source data catalog. It allows users to organize and manage information about different types of data, like databases and tables, making it easier to understand and control data assets.

Starting with version 24.2, Dataedo supports an Apache Atlas connector that imports a single database stored within Atlas.

Documenting Apache Atlas

Dataedo imports the following Atlas elements:

Apache Atlas | Dataedo
Entity | Object
Technical Metadata | Fields + Custom Fields
Classification | Fields + Custom Fields
Business Metadata | Fields + Custom Fields

Data Lineage

  • The Apache Atlas connector builds lineage within a single import only; it does not create lineage to other technologies or to other Atlas imports.
  • The Apache Atlas connector builds object-level lineage.
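
To illustrate what object-level lineage means, the sketch below collapses an Atlas-style lineage graph (data object to process to data object) into direct object-to-object edges. It is not Dataedo's implementation. The input mirrors the `guidEntityMap` and `relations` fields of an Atlas lineage response; the sample GUIDs, names, and the `PROCESS_TYPES` set are assumptions made for the example.

```python
from collections import defaultdict

# Assumption: entity types treated as processes rather than data objects.
PROCESS_TYPES = {"hive_process"}


def object_level_edges(lineage, process_types=PROCESS_TYPES):
    """Collapse object -> process -> object paths into object -> object edges."""
    entities = lineage["guidEntityMap"]
    inputs = defaultdict(set)   # process guid -> guids feeding it
    outputs = defaultdict(set)  # process guid -> guids it produces
    for rel in lineage["relations"]:
        src, dst = rel["fromEntityId"], rel["toEntityId"]
        if entities[dst]["typeName"] in process_types:
            inputs[dst].add(src)
        if entities[src]["typeName"] in process_types:
            outputs[src].add(dst)

    edges = set()
    for proc, sources in inputs.items():
        for src in sources:
            for dst in outputs.get(proc, ()):
                edges.add((entities[src]["displayText"],
                           entities[dst]["displayText"]))
    return sorted(edges)


# Hypothetical sample data shaped like an Atlas lineage response.
sample = {
    "guidEntityMap": {
        "g1": {"typeName": "hive_table", "displayText": "orders_raw"},
        "g2": {"typeName": "hive_process", "displayText": "load_orders"},
        "g3": {"typeName": "hive_table", "displayText": "orders_clean"},
    },
    "relations": [
        {"fromEntityId": "g1", "toEntityId": "g2"},
        {"fromEntityId": "g2", "toEntityId": "g3"},
    ],
}

print(object_level_edges(sample))
```

The process node `load_orders` disappears from the result, leaving a single edge between the two tables.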

Connecting to Atlas

Add new connection

To connect to an Apache Atlas instance, create new documentation by clicking Add and choosing New connection.

On the connection screen, choose Apache Atlas (it can be found under the Catalogs folder).

Provide the connection details:

  • Host - the host name or address of the server, e.g. server17, server17.ourdomain.com or 192.168.0.37
  • Port - specify the port number where Apache Atlas is running.
  • User and Password - provide the user and password used to authenticate to Apache Atlas.
  • SSL mode - choose the SSL mode based on your security needs.
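
Before entering these values, it can help to confirm that they reach Atlas at all. The sketch below is independent of Dataedo: it builds a base URL and a Basic authentication header from the same inputs and can call the Atlas admin version endpoint. It assumes the instance accepts HTTP Basic authentication at `/api/atlas/admin/version`; the host, the port 21000, and the helper names are placeholders chosen for the example.

```python
import base64
import json
import urllib.request


def atlas_base_url(host, port, use_ssl):
    """Build the Atlas base URL from the connection fields."""
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}"


def basic_auth_header(user, password):
    """Encode user and password as an HTTP Basic Authorization value."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def fetch_atlas_version(host, port, user, password, use_ssl=False, timeout=10):
    """Call the admin version endpoint (assumed path) and return parsed JSON."""
    request = urllib.request.Request(
        atlas_base_url(host, port, use_ssl) + "/api/atlas/admin/version",
        headers={
            "Authorization": basic_auth_header(user, password),
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder values; no network call is made here.
print(atlas_base_url("server17.ourdomain.com", 21000, use_ssl=True))
```

If `fetch_atlas_version` raises an HTTP 401 error, the user or password is wrong; a connection error usually points at the host, port, or SSL setting.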

Saving password

You can save the password for later connections by checking the Save password option. Passwords are saved in the repository database.

Setting Service Type and Objects

In order to correctly import Apache Atlas, Dataedo needs to know three things:

  • Service Type - the technology you would like to import, e.g. Hive or HBase.
  • Main Object Type - the object type from the selected service that acts as the main one, e.g. Hive Database. Because Apache Atlas has a flat structure, Dataedo needs to know which object should be treated as the Database/Source of every object in the service.
  • Specific Object - the object of the selected type that you would like to import, e.g. the database AdventureWorks.

After setting this up, the Atlas import is ready to begin.
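
The reason the main object matters can be shown with a small sketch. Atlas returns entities as a flat list, so something has to decide which of them belong to the chosen source. The code below walks parent references up to the selected main object; it is an illustration, not Dataedo's logic. The entity dictionaries are simplified (real Atlas references are objects, not bare GUIDs), and `PARENT_ATTR` and the sample data are assumptions.

```python
# Assumption: which attribute points at the parent, per entity type.
PARENT_ATTR = {"hive_table": "db", "hive_column": "table"}


def select_source_objects(entities, main_type, specific_name, parent_attr):
    """Return names of entities that sit, directly or indirectly, under one main object."""
    by_guid = {e["guid"]: e for e in entities}
    root = next(
        e for e in entities
        if e["typeName"] == main_type and e["name"] == specific_name
    )

    def belongs(entity):
        seen = set()  # guards against reference cycles
        while entity is not None and entity["guid"] not in seen:
            if entity["guid"] == root["guid"]:
                return True
            seen.add(entity["guid"])
            attr = parent_attr.get(entity["typeName"])
            entity = by_guid.get(entity.get(attr)) if attr else None
        return False

    return [e["name"] for e in entities
            if e["guid"] != root["guid"] and belongs(e)]


# Hypothetical flat list, as a single service might return it.
entities = [
    {"guid": "d1", "typeName": "hive_db", "name": "AdventureWorks"},
    {"guid": "d2", "typeName": "hive_db", "name": "Staging"},
    {"guid": "t1", "typeName": "hive_table", "name": "customer", "db": "d1"},
    {"guid": "t2", "typeName": "hive_table", "name": "tmp_load", "db": "d2"},
    {"guid": "c1", "typeName": "hive_column", "name": "customer_id", "table": "t1"},
]

print(select_source_objects(entities, "hive_db", "AdventureWorks", PARENT_ATTR))
```

Only `customer` and `customer_id` are selected; `tmp_load` belongs to a different database and is left out.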

Mapping Atlas Objects to Dataedo

This is a crucial step in importing Apache Atlas into Dataedo. It allows users to do the following:

  • Pick a Dataedo object type for every imported Atlas entity type.
  • Select whether to import certain Atlas entities.
  • Map attributes.

Object mapping

Apart from editing the mapping, you can see what percentage of Atlas attributes has been mapped to Dataedo. After mapping attributes for one object type, you can right-click to Copy the mapping and then Paste it to other object types.
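
As a rough model of these two features, the sketch below treats a mapping as a dictionary from Atlas attribute name to Dataedo field, computes the mapped percentage, and copies a mapping to another object type. The field names are hypothetical, and the rule that a pasted mapping keeps only attributes present on the target type is an assumption, not documented Dataedo behavior.

```python
def mapped_percentage(atlas_attributes, mapping):
    """Share of Atlas attributes that have a Dataedo target, in percent."""
    if not atlas_attributes:
        return 0.0
    mapped = sum(1 for name in atlas_attributes if mapping.get(name))
    return round(100.0 * mapped / len(atlas_attributes), 1)


def paste_mapping(source_mapping, target_attributes):
    """Copy a mapping, keeping only attributes the target type has (assumed rule)."""
    return {name: field for name, field in source_mapping.items()
            if name in target_attributes}


# Hypothetical attributes and Dataedo fields.
table_attributes = ["owner", "createTime", "comment", "tableType"]
table_mapping = {"owner": "Owner", "comment": "Description"}

column_attributes = ["owner", "type", "position"]
column_mapping = paste_mapping(table_mapping, column_attributes)

print(mapped_percentage(table_attributes, table_mapping))
print(column_mapping, mapped_percentage(column_attributes, column_mapping))
```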

Mapping attributes

Attribute mapping is done in a separate form, which you open by clicking Map attributes. In this form, you can browse all Atlas attributes and set the Dataedo attribute or field that each one should be loaded into. You can also create new Custom Fields and use them for mapping.


After setting up the object and attribute mapping, you can run the Atlas import. Once you have mapped objects for a given service, the mapping can be reused in Import Changes as well as in Copy Connection.

Specifications

Imported objects

Object | Imported | Editable
Entities
Technical properties
User defined properties
Label
Classification
Business Metadata

Supported features

Feature | Is supported
Writing changes back
Data Profiling
CMD Import
PK/FK relationship tester
Linked Sources

Data lineage

Source | Method | Version
Internal lineage (object-level) | REST API | 24.2 (2024)

Plans for future releases

  • Import terms and glossaries
  • Import object labels
  • Import user defined properties
  • Import multiple services at once
  • Import propagated classifications
  • Lineage improvements, lineage across multiple Apache Atlas documentations