Cloudera - Automatic Data Lineage

Wojtek Bialek - Dataedo Team, 4th December, 2024

What to expect

Object-level lineage

Dataedo builds object-level lineage for the Cloudera Data Catalog data source. It analyzes relations between objects returned by the Apache Atlas API (connected through the Atlas endpoint).

Known limitations

  1. Dataedo won't build column-level lineage (the Atlas API doesn't return such information).
  2. If the API lacks data about a relation, lineage for that relation won't be built.
  3. The API returns lineage and other relations inside the same JSON property. Dataedo checks only two properties when building lineage (lineage in and lineage out); if lineage is defined under another tag, it won't be built.
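
The third limitation can be illustrated with a minimal sketch. It is not Dataedo's actual implementation: the property names `lineage_in` and `lineage_out`, the `name` field, and the sample payload are hypothetical placeholders chosen for illustration, and the real names in the Atlas response may differ. The sketch only shows the general idea of reading two known properties and ignoring everything else.

```python
from typing import Any, Dict, List

# Hypothetical property names; the real ones used by Atlas or Dataedo may differ.
LINEAGE_KEYS = ("lineage_in", "lineage_out")


def extract_lineage(relations: Dict[str, List[Dict[str, Any]]]) -> Dict[str, List[str]]:
    """Keep only the two lineage properties and return the referenced object names."""
    return {
        key: [ref["name"] for ref in relations.get(key, [])]
        for key in LINEAGE_KEYS
    }


# Made-up relations for a single object, mixing lineage with other relations.
sample = {
    "lineage_in": [{"name": "staging.orders"}],
    "lineage_out": [{"name": "reports.orders_daily"}],
    "columns": [{"name": "order_id"}],            # not lineage, ignored
    "custom_flow": [{"name": "archive.orders"}],  # lineage under another tag, also ignored
}

result = extract_lineage(sample)
```

In this sketch the `custom_flow` entry never reaches the result, which mirrors why lineage defined outside the two checked properties does not appear.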

Troubleshooting

I don't see object-level lineage

  1. Make sure the mapping is set correctly (lineage won't be built between objects and columns)
  2. Rerun the import of the source
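
Because lineage is only built when the API returns relation data, it can also help to check what Atlas itself reports for an object. The sketch below assumes the standard Apache Atlas v2 lineage endpoint (`/api/atlas/v2/lineage/{guid}`) and basic authentication. The base URL, GUID, and credentials are placeholders, and a Cloudera deployment may expose Atlas under a different path or use a different authentication method.

```python
import base64
import json
import urllib.parse
import urllib.request


def lineage_url(base_url: str, guid: str, direction: str = "BOTH", depth: int = 3) -> str:
    """Build the Atlas v2 lineage URL for one entity GUID."""
    query = urllib.parse.urlencode({"direction": direction, "depth": depth})
    return f"{base_url.rstrip('/')}/api/atlas/v2/lineage/{urllib.parse.quote(guid)}?{query}"


def has_lineage(payload: dict) -> bool:
    """Return True when the lineage response lists at least one relation."""
    return bool(payload.get("relations"))


def fetch_lineage(base_url: str, guid: str, user: str, password: str) -> dict:
    """Call the endpoint with basic auth (an assumption) and parse the JSON body."""
    request = urllib.request.Request(lineage_url(base_url, guid))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

If `has_lineage(fetch_lineage(...))` is false for an object, Atlas has no relation data for it, so no lineage would be expected in Dataedo either.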