Apache ORC support

Applies to: Dataedo 23.x versions, Article available also for: 24.x (current), 10.x

Dataedo 9.3 added support for ORC files. Dataedo scans an ORC file and builds a structure from the metadata described below.

Supported Metadata

Metadata

  • Primitive types (boolean, byte, int, long, float, string, ...)
  • Compound types:
    • Struct
    • List
    • Map
    • Union

Each field contains:

  • Name,
  • Data type,
  • Nullability.
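To make the nesting of primitive and compound types concrete, here is a minimal Python sketch that turns an ORC-style type string (for example `struct<id:int,tags:array<string>>`) into a tree of named fields. It is an illustration only and not Dataedo's parser: the `Field` class, `parse_type` function, and the placeholder child names (`element`, `key`, `value`, `option0`, ...) are assumptions made for this example. The sketch also assumes the usual ORC textual type names, where a list is written as `array<...>` and a union as `uniontype<...>`.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Field:
    """One node of the structure: a name, a data type, and nested children."""
    name: str
    data_type: str
    children: List["Field"] = field(default_factory=list)


def parse_type(text: str, name: str = "root") -> Field:
    """Parse an ORC-style type string into a Field tree (hypothetical helper)."""
    compact = text.replace(" ", "")
    node, pos = _parse(compact, 0, name)
    if pos != len(compact):
        raise ValueError(f"unexpected trailing text at position {pos}")
    return node


def _parse(s: str, pos: int, name: str) -> Tuple[Field, int]:
    start = pos
    while pos < len(s) and (s[pos].isalnum() or s[pos] == "_"):
        pos += 1
    kind = s[start:pos]
    if not kind:
        raise ValueError(f"expected a type name at position {start}")
    node = Field(name, kind)

    # Parameterized primitives such as decimal(10,2) or varchar(20).
    if pos < len(s) and s[pos] == "(":
        end = s.index(")", pos)
        node.data_type = s[start:end + 1]
        return node, end + 1

    # A primitive type has no "<...>" part.
    if pos >= len(s) or s[pos] != "<":
        return node, pos

    pos += 1  # skip "<"
    if pos < len(s) and s[pos] == ">":  # empty compound type, e.g. struct<>
        return node, pos + 1

    index = 0
    while True:
        if kind == "struct":
            # Struct children are written as name:type.
            colon = s.index(":", pos)
            child_name = s[pos:colon]
            pos = colon + 1
        elif kind == "map":
            child_name = "key" if index == 0 else "value"
        elif kind == "array":
            child_name = "element"
        else:  # uniontype or any other compound type
            child_name = f"option{index}"

        child, pos = _parse(s, pos, child_name)
        node.children.append(child)
        index += 1

        if pos >= len(s):
            raise ValueError("type string ended before closing '>'")
        if s[pos] == ",":
            pos += 1
        elif s[pos] == ">":
            return node, pos + 1
        else:
            raise ValueError(f"unexpected character {s[pos]!r} at position {pos}")


def describe(node: Field, depth: int = 0) -> List[str]:
    """Return one indented 'name: type' line per field, parents before children."""
    lines = [f"{'  ' * depth}{node.name}: {node.data_type}"]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines
```

Calling `describe(parse_type("struct<id:int,tags:array<string>>"))` returns one line per field, indented by nesting depth, which roughly mirrors how nested fields appear under their parent in a structure.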

Data profiling

Dataedo does not support data profiling in ORC files.

How to import an ORC file

To import an ORC file into Dataedo:

  • right-click any database or the Structures folder, choose Add Object, then Add/Import Structure, or
  • on the main ribbon, select Add Object, then Structure/File, or
  • select the Structures folder and, on the main ribbon, select Add Structure/File.

Then select Import from file and choose the ORC format. Point to an ORC file on the disk and click Next. Dataedo scans the file content and opens the Structure designer with the parsed structure. In this window you can edit names, data types, and field types, then click Save to save the structure.
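Before pointing the import wizard at a file, it can be useful to confirm that the file really is in ORC format. The sketch below relies on the fact that an ORC file begins with the three-byte header `ORC`. The helper name `looks_like_orc` is hypothetical, and the check is only a quick sanity test run outside Dataedo; it does not validate the rest of the file and says nothing about how Dataedo itself reads it.

```python
ORC_MAGIC = b"ORC"  # an ORC file starts with these three bytes


def looks_like_orc(path: str) -> bool:
    """Return True if the file at `path` starts with the ORC header bytes."""
    with open(path, "rb") as handle:
        return handle.read(len(ORC_MAGIC)) == ORC_MAGIC
```

A file that fails this check (for example a CSV export saved with an `.orc` extension) is unlikely to be read as ORC by any tool.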

ORC Structure designer

Guide: Adding files to the catalog