You Don't Need an ER Diagram to Understand Your Database

Table of Contents:


    Now that I have your attention, let me clarify - ERDs are great for visualizing databases but are not suitable as the documentation in itself. What you really need to for database documentation is a Data Dictionary. Why I feel that way about ERDs? Let me show you by example.

    Reverse engineering of a sample database

    ER Diagram reverse engineered with Oracle SQL Developer Data Modeler

    So what’s wrong with this picture? Well, it doesn’t help me understand the data model, where to find data and how to use it. Here are a few reasons:

    Too many details

    There are too many tables and columns. This diagram even shows data types but instead of explaining the model, it buries the information under irrelevant details.

    Unknown purpose of tables and columns

    Diagram only shows me tables, their columns and relationships between tables. It doesn’t say anything about the purpose of those elements. Even if I find RA_CUST_TRX_LINE_GL_DIST_ALL table and its ATTRIBUTE1 column it really doesn’t tell me much.

    Done right: Oracle EBS docs

    I want to show you one example of database documentation done right - Oracle E-Business Suite (EBS) applications technical documentation. EBS is an ERP application for enterprises with huge database (24 thousand tables). Its documentation is broken down to application (a module), each having its own document. Each document has High Level Design and Detailed Design.

    High Level Design has a list of small diagrams each showing entities and relationships between them of one aspect/topic of the module. Diagrams only include primary key columns.

    Low level design section includes a list of tables with a description of their purpose, definition of foreign keys and a list of columns with the description of their purpose.

    I cannot show you all the details because it is copyrighted material but I want to show you how they structured their documentation:

    Structure of Oracle EBS database documentation

    I used it when I built a data warehouse many years ago. Data warehouse included many tables from multiple modules (accounts payable, accounts receivable, general ledger, projects, purchasing, inventory and a few more) and this documentation saved me weeks or months of figuring out where the data was, the purpose of columns, how to join tables, or which views to use.

    When are ERDs good enough?

    ERDs are good enough for documentation of data models for small, simple, well designed databases with obvious domain. Such as sample MySQL database - Sakila. I guess that by looking at the diagram below you can figure out what Sakila is for, purpose of tables and columns and how to query the database.

    ER diagram of sample MySQL database - Sakila

    Small databases - not too many tables

    ERDs may work if you have a couple dozen tables at most. If you put more it's going to be confusing rather than informative.

    An attempt to visualize a typical (non-sample) database

    Small tables - not too many columns

    ERDs communicate well if they have a minimum set of columns. Some tables have over a hundred of columns and this information hides the big picture.

    Sample tables in typical (non-sample) database

    Well designed with meaningful names

    Sakila is a small database designed probably in one sitting as a demo for MySQL database engine. Tables and columns were designed as a simple example that everyone would grasp. In real life, databases have confusing, misleading names and unused tables and columns and just by looking at a name doesn't tell you anything.

    What data you think this sample table in SAP holds?

    Sample table in SAP

    Read: Reasons why tables and columns have confusing names

    Obvious model and domain

    You understood Sakila data model easily because it describes something you know well and understand - movie rental business - films, actors, rentals, etc. Those are relatively simple processes we have contact within our lives. However, there are many domains that we don’t know, understand or are very complex with many caveats. Diagram will not explain to you what the data represents and what is its logic.

    Oracle EBS vs Sakila

    Let’s now compare Sakila and Oracle EBS databases.

    Sakila Oracle EBS
    Tables 16 24,063
    Columns 90 616,699
    Average columns per table 5 25
    Sample tables Customer, Payment, Rental HZ_CUST_ACCOUNTS, RA_CUST_TRX_LINE_GL_DIST_ALL
    Domain Simple - movie rental Very complex and wide business domain - all processes of an enterprise
    Fit to domain Designed for the purpose Designed as a generic application to fit multiple businesses and models. Many objects have different meaning, purpose or naming in particular enterprise.

    Here is a visualization of the size of data models.

    If you put them together then data model of Sakila simply doesn’t exist :)

    So would a Sakila-like ERD work for EBS? I wish I could see the attempt to do it...

    The right tool

    You can document your databases like Oracle did with their huge EBS application what I believe helped thousands of consultants and developers around the world. Create a set of simple diagrams and attach complete data dictionary.

    You can do it very simply with Dataedo and share with entire team easily with HTML pages like this:

    Database documentation (ERD + Data Dictionary) created with Dataedo [Live sample]

    See sample documentation

    Give it a try (for free)

    Piotr Kononow
    Piotr Kononow

    Product Manager and founder of Dataedo. For many years business analyst, software architect and project manager in various industries - asset management, heavy industry, telco, utilities/gas and tourism. His specialties are data warehousing/BI and business applications.

    0
    There are no comments. Click here to write the first comment.