Apache Parquet support

Applies to: Dataedo 23.x versions, Article available also for: 24.x (current), 10.x

Dataedo 9.3 added support for Parquet files. Dataedo scans a Parquet file and builds a structure from the metadata described below.

Supported Metadata

Metadata

  • Primitive data type fields,
  • Nested structs,
  • Arrays,
  • Maps.

Each field contains:

  • Name,
  • Data type,
  • Nullability.
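
The structure described above can be pictured as a tree of fields. The following is a minimal sketch only: the `Field` class, the `describe` function, the type labels and the sample schema are all hypothetical and do not reflect Dataedo's internal model or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """Hypothetical model of one scanned field (not Dataedo's internal type)."""
    name: str
    data_type: str   # illustrative labels, e.g. "INT64", "STRUCT", "ARRAY", "MAP"
    nullable: bool
    children: List["Field"] = field(default_factory=list)


def describe(fields, depth=0):
    """Return one indented line per field: name, data type, nullability."""
    lines = []
    for f in fields:
        null = "NULL" if f.nullable else "NOT NULL"
        lines.append(f"{'  ' * depth}{f.name}: {f.data_type} {null}")
        # Nested structs, arrays and maps carry their members as children.
        lines.extend(describe(f.children, depth + 1))
    return lines


# Made-up schema with a primitive field, a struct, an array and a map.
schema = [
    Field("id", "INT64", False),
    Field("customer", "STRUCT", True, [
        Field("name", "STRING", True),
        Field("age", "INT32", True),
    ]),
    Field("tags", "ARRAY", True, [Field("element", "STRING", True)]),
    Field("attributes", "MAP", True, [
        Field("key", "STRING", False),
        Field("value", "STRING", True),
    ]),
]

print("\n".join(describe(schema)))
```

Each printed line carries the three attributes listed above (name, data type, nullability), and the indentation shows how nested structs, arrays and maps hold their own fields.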

Data profiling

Dataedo does not support data profiling in Parquet files.

How to import a Parquet file

To add a Parquet file:

  • right-click any database or the Structures folder, choose Add Object, then Add/Import Structure, or
  • on the main ribbon, select Add Object, then Structure/File, or
  • select the Structures folder and, on the main ribbon, select Add Structure/File.

Then select Import from file and choose the PARQUET format. Point to a Parquet file on the disk and click Next. Dataedo scans the file content and opens the Structure designer with the parsed structure. In this window you can edit names, data types and field types, then click the Save button to save the structure.

Parquet Structure Designer

Guide: Adding files to the catalog
