Apache Spark SQL is currently not officially supported. You can connect to it with a generic ODBC driver. The metadata returned depends on the driver version and provider.
We have successfully connected to Apache Spark SQL and imported metadata from it using the ODBC drivers listed below. It is highly likely that other drivers will work as well.
Tested ODBC drivers and environments
We have tested the connection and metadata import with the following distributions:
- Cloudera
- Hortonworks
Hive ODBC version: 2.6.1 (64-bit)
Supported schema elements and metadata
Dataedo reads the following metadata from Apache Spark SQL:
- Tables
  - Columns
    - Data type with length
    - Nullable
- Views (displayed as a table)
  - Columns
    - Data type with length
    - Nullable
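Because the metadata returned depends on the driver version and provider, it can help to check what a given ODBC driver reports for these elements before running an import. The following is a minimal sketch, not part of Dataedo. It assumes Python with the pyodbc package, whose `cursor.columns()` method wraps the ODBC `SQLColumns` catalog call. The function name `describe_columns`, the DSN name `SparkSQL`, and the table name `my_table` are hypothetical.

```python
def describe_columns(cursor, table_name):
    """Return (column name, data type with length, nullable) for a table or view.

    `cursor` is expected to behave like a pyodbc cursor: cursor.columns()
    yields rows exposing column_name, type_name, column_size and nullable.
    """
    result = []
    for row in cursor.columns(table=table_name):
        data_type = row.type_name
        # Some drivers do not report a size for every type.
        if row.column_size is not None:
            data_type = f"{data_type}({row.column_size})"
        # ODBC reports nullability as 0 (not nullable), 1 (nullable) or 2 (unknown).
        nullable = {0: False, 1: True}.get(row.nullable)
        result.append((row.column_name, data_type, nullable))
    return result


# Usage against a real source (assumes pyodbc is installed and a DSN
# named "SparkSQL" is configured; both names are placeholders):
#
#   import pyodbc
#   conn = pyodbc.connect("DSN=SparkSQL", autocommit=True)
#   for column in describe_columns(conn.cursor(), "my_table"):
#       print(column)
```

If the driver leaves the size or nullability empty here, the same gaps are likely to appear in the imported documentation.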
Data Profiling
Dataedo does not support profiling in Apache Spark SQL.
Data Lineage
Dataedo does not support data lineage in Apache Spark SQL.