Why It's Hard to Find Data And Why You Need a Map: Data Dictionary


    Today's enterprises have a vast, complex and ever-growing landscape of data sets, and the demand for data access and analytics grows every year. I want to show what obstacles data analysts, BI/DWH developers and all other data specialists face when navigating this sea of data.

    Let's take an imaginary manufacturing company that produces boxes as an example. Let's call it Box Inc.

    1. There Are Many Databases

    Any enterprise has a number of different loosely connected applications and databases - legacy databases, custom applications, packaged applications, data warehouses and many more.

    Box Inc has the following data architecture:

    1. ERP packaged application with manufacturing, procurement, inventory, financial modules
    2. Data Warehouse with Planning, Budgeting, Consolidated Reporting and Analytical CRM modules
    3. Packaged CRM application
    4. Packaged Human Resources Management (HRM) application
    5. Packaged Project Management application
    6. Custom Time Tracking application
    7. Custom Technology Management application
    8. Custom Facility Maintenance application
    9. Sales Support application
    10. Packaged Customer Service application integrated with CRM
    11. Packaged Invoice Workflow application integrated with main financial system
    12. Intranet Portal
    13. Access Control database (physical gates and doors)
    14. Subcontractor Portal
    15. Suppliers Integration Platform that enables automatic quotes on materials
    16. Standard eCommerce platform where customers can place orders
    17. Customer Portal where customers can access documents and request support
    18. Master Data Management solutions for customers and suppliers to keep data consistent across systems
    19. LDAP with all user accounts
    20. Document Store

    And a dozen other smaller applications and databases.

    As you can see, just navigating through the databases themselves is not a trivial task. But let's continue our search for the data.

    Sample Enterprise Application Architecture

    2. Databases Are Large And Complex

    Many enterprise applications have very large and complex databases. Packaged applications, and ERP systems in particular, are a great example of this. It might be hard to believe, but popular ERP applications have tens or even hundreds of thousands of tables and views. Let's have a look at a few examples:

    Number of tables and views in popular applications:

    - TETA (HRM): 9,000
    - Oracle e-Business Suite (ERP): 55,000
    - SAP (ERP): 130,000!

    To give you a sense of how much that is, this is what the 42k tables of one particular installation of Oracle e-Business Suite look like:

    List of the 42k tables in a sample Oracle eBS (ERP) installation

    And those tables are large and complex themselves. Here is the list of columns of the order lines table in that Oracle database:

    Columns of the sample Oracle OE_ORDER_LINES_ALL table

    I hope this gives you an idea of how difficult it is to find a piece of data or to understand what you are looking at. It's as if you were looking for it in Manhattan (there are approx. 134,000 buildings in Manhattan).

    Manhattan from space. Photo by NASA, Expedition 10 Commander Leroy Chiao

    Your data is in one of those apartments. And you even have an address - good luck!

    Manhattan address

    You Need a Map!

    I hope I have shown that having data is not enough to be able to use it. If you want to make any use of your data, you need a map. This map of your databases is called a Data Dictionary. If you haven't already, you should start building one today.
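
    As a rough illustration of what the simplest possible map could look like, here is a minimal Python sketch. It assumes a SQLite database (an in-memory one with two made-up tables stands in for a real application database), and the function name build_data_dictionary is hypothetical. Other database engines expose their catalogs differently, so treat this as a sketch under those assumptions, not a finished tool.

```python
import sqlite3


def build_data_dictionary(conn):
    """Return one entry per column: table, column, declared type, description."""
    entries = []
    # sqlite_master is SQLite's own catalog; other engines use different catalogs.
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            entries.append(
                {
                    "table": table,
                    "column": column,
                    "type": col_type,
                    # The catalog cannot supply meaning; people have to write it.
                    "description": "",
                }
            )
    return entries


# Hypothetical two-table schema, invented purely for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE order_lines (
        line_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        quantity REAL
    );
    """
)

dictionary = build_data_dictionary(conn)
for entry in dictionary:
    print(f"{entry['table']}.{entry['column']}: {entry['type']}")
```

    The empty description field is the point: the database catalog can tell you names and types, but what each table and column actually means has to be written down by people who know the data.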

    Aliens metadata discovery. Frame from the movie "Aliens" (1986), J. Cameron

    Piotr Kononow

    Product Manager and founder of Dataedo. For many years he worked as a business analyst, software architect and project manager in various industries: asset management, heavy industry, telco, utilities/gas and tourism. His specialties are data warehousing/BI and business applications.