Data Dictionary vs Data Catalog

Data Catalogs are becoming popular, but there seems to be some confusion about what they are and, especially, how they relate to Data Dictionaries. In this article I'd like to cover the basics of these two terms and show their differences and relationship.

Data Catalog is a relatively new term; it was coined just a few years ago. Before that, the industry used a different term: metadata repository (maybe I'll do a comparison in another post).

What is a Data Dictionary?

A Data Dictionary is a specification and description of the data structures in a database, data model, or data source. It consists of a list of entities/tables/data sets and their fields/columns/data elements. The scope of information a data dictionary contains varies with the use case; it may include data types, descriptions, relationships, aliases, constraints, sources, etc.
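To make the definition concrete, here is a minimal sketch of a data dictionary as a plain data structure. The class names (Table, Column), the render helper, and the sample customers/orders tables are all hypothetical, chosen only for illustration; real tools store far more attributes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    """One field/column/data element (hypothetical structure)."""
    name: str
    data_type: str
    description: str = ""
    nullable: bool = True
    references: Optional[str] = None  # "table.column" this column points to


@dataclass
class Table:
    """One entity/table/data set and its columns."""
    name: str
    description: str = ""
    columns: List[Column] = field(default_factory=list)


def render(dictionary: List[Table]) -> str:
    """Format the dictionary as human-readable text."""
    lines = []
    for table in dictionary:
        lines.append(f"{table.name}: {table.description}")
        for col in table.columns:
            null = "NULL" if col.nullable else "NOT NULL"
            ref = f" -> {col.references}" if col.references else ""
            lines.append(
                f"  {col.name} ({col.data_type}, {null}){ref}: {col.description}"
            )
    return "\n".join(lines)


# Sample content, invented for this example.
dictionary = [
    Table("customers", "People or companies that placed an order", [
        Column("id", "INTEGER", "Surrogate key", nullable=False),
        Column("email", "TEXT", "Primary contact address"),
    ]),
    Table("orders", "One row per submitted order", [
        Column("id", "INTEGER", "Surrogate key", nullable=False),
        Column("customer_id", "INTEGER", "Customer who placed the order",
               nullable=False, references="customers.id"),
    ]),
]

print(render(dictionary))
```

Data types, descriptions, constraints, and relationships each become one attribute of a column, which is essentially what a data dictionary records.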

Example of Data Dictionary (built with Dataedo):

What is a Data Catalog?

A Data Catalog is an inventory of the data assets in an organization. The term also refers to a category of software that allows organizations to build data catalogs.

Data Catalog (database) includes:

  1. Inventory of data assets in the database,
  2. Information about the quality of the data.

Data Catalog (software) includes:

  1. Metadata repository with the data inventory assets,
  2. Metadata scanners that connect to sources and,
  3. Web interface for data analysts,
  4. Search functionality.
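The software components listed above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular product works: the scan_sqlite function and the Catalog class are hypothetical names, an in-memory SQLite database stands in for a real source, and the web interface is omitted.

```python
import sqlite3


def scan_sqlite(conn, source_name):
    """Toy metadata scanner: read table and column metadata from SQLite."""
    assets = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        assets.append({
            "source": source_name,
            "table": table,
            "columns": [{"name": c[1], "type": c[2]} for c in cols],
        })
    return assets


class Catalog:
    """Toy metadata repository with keyword search (hypothetical)."""

    def __init__(self):
        self.assets = []

    def register(self, assets):
        self.assets.extend(assets)

    def search(self, term):
        """Return assets whose table name or any column name contains term."""
        term = term.lower()
        return [
            a for a in self.assets
            if term in a["table"].lower()
            or any(term in c["name"].lower() for c in a["columns"])
        ]


# Sample source, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")

catalog = Catalog()
catalog.register(scan_sqlite(conn, "sales_db"))

hits = [a["table"] for a in catalog.search("customer")]
print(hits)
```

Searching for "customer" matches the customers table by name and the orders table through its customer_id column, which is the kind of cross-source lookup a catalog is meant to support.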

Screenshot of Data Catalog software (Dataedo Web Catalog):

The differences

| | Data Dictionary | Data Catalog |
| Definition | Definition of data sets and elements | Inventory of enterprise-wide data assets |
| What it is | Metadata (information) | Software or software with actual database |
| Scope | Data source or data model | Data in organization |
| Metadata | Data sets, fields, relationships, definitions, etc. | Data assets, business glossary, classifications, data lineage |
| Purpose | Describe data in a database | Catalog enterprise data for analytics |

The relationship

Data Catalogs usually include a Data Dictionary of the data assets. Therefore, a Data Dictionary can be thought of as a building block of a Data Catalog.

Piotr Kononow

CEO and founder of Dataedo. For many years he worked as a business analyst, software architect and project manager in various industries: asset management, heavy industry, telco, utilities/gas and tourism. His specialties are data warehousing/BI and business applications.

