Apache Atlas is an open-source data catalog. It lets users organize and manage information about different types of data assets, such as databases and tables, making those assets easier to understand and control. Starting with Dataedo 24.2, an Apache Atlas connector is available that imports a single database stored in Atlas.
Documenting Apache Atlas
Dataedo imports the following Atlas elements:
Apache Atlas | | Dataedo |
---|---|---|
Entity | → | Object |
Technical Metadata | → | Fields + Custom Fields |
Classification | → | Fields + Custom Fields |
Business Metadata | → | Fields + Custom Fields |
Data Lineage
- The Apache Atlas connector builds lineage within a single import (it won't create lineage to other technologies or to other Atlas imports).
- The connector builds object-level lineage.
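To illustrate what object-level lineage means here, the sketch below collapses an Atlas lineage graph (dataset to process to dataset) into direct object-to-object edges. It assumes the response shape of the Atlas REST API v2 lineage endpoint (`guidEntityMap` plus `relations` with `fromEntityId` and `toEntityId`). The function name, the `process_types` parameter, and the sample data are hypothetical, and this is not a description of how Dataedo itself builds lineage.

```python
from collections import defaultdict


def object_level_edges(lineage, process_types):
    """Collapse dataset -> process -> dataset chains into dataset -> dataset edges.

    `process_types` is an assumed, caller-supplied set of Atlas type names
    that represent processes (e.g. "hive_process").
    """
    entities = lineage["guidEntityMap"]

    def is_process(guid):
        return entities[guid]["typeName"] in process_types

    def name(guid):
        return entities[guid]["attributes"]["qualifiedName"]

    inputs = defaultdict(set)   # process guid -> guids of its input datasets
    outputs = defaultdict(set)  # process guid -> guids of its output datasets
    for rel in lineage["relations"]:
        src, dst = rel["fromEntityId"], rel["toEntityId"]
        if is_process(dst) and not is_process(src):
            inputs[dst].add(src)
        elif is_process(src) and not is_process(dst):
            outputs[src].add(dst)

    edges = set()
    for proc, sources in inputs.items():
        for src in sources:
            for dst in outputs.get(proc, ()):
                edges.add((name(src), name(dst)))
    return sorted(edges)


# Illustrative sample, not real Atlas output.
sample = {
    "guidEntityMap": {
        "g1": {"typeName": "hive_table",
               "attributes": {"qualifiedName": "sales.orders@cl1"}},
        "g2": {"typeName": "hive_process",
               "attributes": {"qualifiedName": "load_orders_summary@cl1"}},
        "g3": {"typeName": "hive_table",
               "attributes": {"qualifiedName": "sales.orders_summary@cl1"}},
    },
    "relations": [
        {"fromEntityId": "g1", "toEntityId": "g2"},
        {"fromEntityId": "g2", "toEntityId": "g3"},
    ],
}

edges = object_level_edges(sample, process_types={"hive_process"})
print(edges)
```

Because every edge is derived from entities returned in one lineage response, objects that live outside the imported source never appear, which matches the single-import limitation noted above.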
Connecting to Atlas
Add new connection
To connect to an Apache Atlas instance, create new documentation by clicking Add and choosing New connection.
On the connection screen, choose Apache Atlas (it can be found under the Catalogs folder).
Provide the connection details:
- Host - the host name or address of the server where Apache Atlas is running, e.g. server17, server17.ourdomain.com or 192.168.0.37
- Port - the port number on which Apache Atlas is running
- User and password - the user name and password used to authenticate to Apache Atlas
- SSL mode - choose the SSL mode based on your security needs
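Before entering these values in Dataedo, it can help to confirm that they reach Atlas at all. The sketch below builds a request to the Atlas admin version endpoint (`/api/atlas/admin/version`) from the same four inputs. It assumes HTTP Basic authentication and reduces SSL mode to a simple on/off flag; the helper names, credentials, and port 21000 (a common Atlas default for HTTP) are illustrative assumptions, and your instance may differ.

```python
import base64
import urllib.request


def build_base_url(host, port, use_ssl):
    """Compose the Atlas base URL from host, port, and an on/off SSL flag."""
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}"


def build_version_request(host, port, user, password, use_ssl=False):
    """Prepare (but do not send) a Basic-auth request to the version endpoint."""
    url = build_base_url(host, port, use_ssl) + "/api/atlas/admin/version"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder values; replace with your own host, port, and credentials.
request = build_version_request("server17.ourdomain.com", 21000, "admin", "secret")
print(request.full_url)

# To actually send the request (requires network access to the Atlas host):
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status, response.read().decode("utf-8"))
```

A successful response suggests the host, port, and credentials are usable; an authentication or connection error points to the detail that needs correcting.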
Saving password
You can save the password for later connections by checking the Save password option. Passwords are saved in the repository database.
Setting Service Type and Objects
To import Apache Atlas correctly, Dataedo needs to know three things:
- Service Type - the technology you would like to import, e.g. Hive, HBase
- Main Object Type - which object type from the chosen Service is the main one, e.g. Hive Database. (Because Apache Atlas has a flat structure, Dataedo needs to know which object should be treated as the Database/Source of every object in the Service.)
- Specific Object - the object of the chosen object type that you would like to import, e.g. Database AdventureWorks
After setting this up, the Atlas import is ready to begin.
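The three choices above correspond loosely to concepts exposed by the Atlas REST API v2: entity type definitions carry a `serviceType`, and the basic search endpoint accepts a `typeName` and a `query`. The sketch below shows that relationship using a hand-written typedef sample. Whether Dataedo uses these endpoints is not stated in this document, and the helper names and sample values are hypothetical.

```python
from urllib.parse import urlencode


def entity_types_for_service(typedefs, service_type):
    """List entity type names that belong to one service type (e.g. "hive")."""
    return sorted(
        d["name"]
        for d in typedefs.get("entityDefs", [])
        if d.get("serviceType") == service_type
    )


def basic_search_url(base_url, type_name, name_query):
    """Build a basic-search URL for one object of the main object type."""
    params = urlencode({"typeName": type_name, "query": name_query})
    return f"{base_url}/api/atlas/v2/search/basic?{params}"


# Illustrative excerpt shaped like a typedefs response; not real Atlas output.
typedefs = {
    "entityDefs": [
        {"name": "hive_db", "serviceType": "hive"},
        {"name": "hive_table", "serviceType": "hive"},
        {"name": "hbase_table", "serviceType": "hbase"},
    ]
}

hive_types = entity_types_for_service(typedefs, "hive")       # Service Type
main_type = "hive_db"                                          # Main Object Type
url = basic_search_url("http://server17:21000", main_type,
                       "AdventureWorks")                       # Specific Object
print(hive_types)
print(url)
```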
Mapping Atlas Objects to Dataedo
This is a crucial step in importing Apache Atlas into Dataedo. It allows users to do the following:
- Pick Dataedo object type for every Atlas entity type imported
- Select whether to import certain Atlas entities
- Map attributes
Apart from editing the mapping, you can see what percentage of Atlas attributes has been mapped to Dataedo. After mapping attributes for one object type, you can right-click to Copy and then Paste your mapping to other object types.
Mapping attributes
Attribute mapping is a separate control that can be accessed by clicking Map attributes. In this form you can browse through all Atlas attributes and set the Dataedo attribute/field each one should be loaded into. You can also create new Custom Fields and use them for mapping.
After setting up object and attribute mapping, you can import your Atlas. Once you map objects for a given service, the mapping can be reused in Import Changes as well as in Copy Connection.
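As a mental model for the mapping described above, the sketch below represents one entity type's mapping as a small data structure, computes the percentage of mapped attributes, and copies a mapping to another type. All class, function, attribute, and field names are hypothetical, and matching attributes by name on paste is an assumption rather than documented Dataedo behavior.

```python
from dataclasses import dataclass, field


@dataclass
class TypeMapping:
    """Mapping for one Atlas entity type (hypothetical structure)."""
    dataedo_object_type: str
    import_enabled: bool = True
    # Atlas attribute name -> Dataedo field name, or None when unmapped.
    attributes: dict = field(default_factory=dict)

    def mapped_percent(self):
        """Share of Atlas attributes that have a Dataedo target, in percent."""
        if not self.attributes:
            return 0.0
        mapped = sum(1 for target in self.attributes.values() if target)
        return round(100.0 * mapped / len(self.attributes), 1)


def paste_attribute_mapping(source, target):
    """Copy targets for attributes present on both types (assumed behavior)."""
    for attr, dataedo_field in source.attributes.items():
        if attr in target.attributes:
            target.attributes[attr] = dataedo_field


hive_table = TypeMapping(
    "Table",
    attributes={
        "name": "Name",
        "comment": "Description",
        "owner": "Owner (custom field)",
        "retention": None,
    },
)
hive_view = TypeMapping(
    "View",
    attributes={
        "name": None,
        "comment": None,
        "viewOriginalText": None,
        "viewExpandedText": None,
    },
)

table_percent = hive_table.mapped_percent()
paste_attribute_mapping(hive_table, hive_view)
view_percent = hive_view.mapped_percent()
print(table_percent, view_percent)
```

Attributes that exist only on the target type stay unmapped after the paste, so the percentage shows how much manual mapping is left.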
Specifications
Imported objects
Element | Imported | Editable |
---|---|---|
Entities | ✅ | ✅ |
Technical properties | ✅ | |
User defined properties | | |
Label | | |
Classification | ✅ | |
Business Metadata | ✅ | |
Supported features
Feature | Is supported |
---|---|
Writing changes back | |
Data Profiling | |
CMD Import | ✅ |
PK/FK relationship tester | |
Linked Sources | |
Data lineage
Source | Method | Version |
---|---|---|
Internal lineage (object-level) | REST API | 24.2 (2024) |
Plans for future releases
- import terms and glossaries
- import object labels
- import user defined properties
- import multiple services at once
- import propagated classifications
- lineage improvements, including lineage across multiple Apache Atlas documentations