Documenting data mapping and sources in Data Warehouse

Data warehouses are databases populated from other databases (in contrast to applications where data is mostly created with UI) and therefore it is important to know where does the data come from for each column.

Besides documenting data model and creating data dictionary for your data warehouse, Dataedo enables you to document data mapping / sources for individual columns. This guide will show you how.

Prepare

Before you begin you must already have created repository and imported your data warehouse schema. If you haven't already, this tutorial will get you started:

Generate your first database documentation

1. Create custom fields

When you have imported your DWH schema (tables and columns) you now must create custom fields to store information about data sources.

Dataedo has a feature called custom fields that allows you to add additional columns to your data dictionary.

To define custom fields click Custom fields button on the ribbon.

This will open custom fields definition window. You can define any fields you need to document sources, in our case we are adding:

  • Source System - to indicate database or application data is loaded from
  • Source Table - to indicade table data is loaded from
  • Source Column - to indicate column name data is loaded from
  • Transformation - additional information on data transformation

Those fields are predefined in Dataedo so you can add them by clicking their names from the list on the bottom of the window.

Please note that each field has a definition for which objects it will be available for. In our case we are interested only in columns, as we are documenting column sources.

You can add your custom fields by clicking Add button and providing a unique name and visibility definition (make sure to select Columns).

Save when done.

2. Document mapping

When you have added custom fields to your repository you can now go table by table and type definiton of sources for each column.