Today's enterprises have a vast, complex and ever-growing landscape of data sets, and the demand for data access and analytics grows every year. I want to show what obstacles data analysts, BI/DWH developers and all other data specialists face when navigating this sea of data.
Let's take as an example an imaginary manufacturing company that produces boxes. Let's call it Box Inc.
1. There Are Many Databases
Any enterprise has a number of different loosely connected applications and databases - legacy databases, custom applications, packaged applications, data warehouses and many more.
Box Inc has the following data architecture:
- ERP packaged application with manufacturing, procurement, inventory, financial modules
- Data Warehouse with Planning, Budgeting, Consolidated Reporting and Analytical CRM modules
- Packaged CRM application
- Packaged Human Resources Management (HRM) application
- Packaged Project Management application
- Custom Time Tracking application
- Custom Technology Management application
- Custom Facility Maintenance application
- Sales Support application
- Packaged Customer Service application integrated with CRM
- Packaged Invoice Workflow application integrated with main financial system
- Intranet Portal
- Access Control database (physical gates and doors)
- Subcontractor Portal
- Suppliers Integration Platform that enables automatic quotes on materials
- Standard eCommerce platform where customers can place orders
- Customer Portal where customers can access documents and request support
- Master Data Management solutions for customers and suppliers to keep data consistent across systems
- LDAP with all users accounts
- Document Store
And a dozen other smaller applications and databases.
As you can see, just navigating through the databases themselves is not a trivial task. But let's continue our search for the data.
2. Databases Are Large And Complex
Many enterprise applications have very large and complex databases. Packaged applications, and ERP systems in particular, are a great example of this. It might be hard to believe, but popular ERP applications have tens or hundreds of thousands of tables and views. Let's have a look at a few examples:
Number of tables and views in popular applications:
- TETA (HRM): 9,000
- Oracle e-Business Suite (ERP): 55,000
- SAP (ERP): 130,000!
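If you want to know where your own database sits on this scale, the system catalog can usually tell you. The snippet below is only a minimal sketch: it uses SQLite and a tiny made-up schema (the `customers`, `orders` and `customer_orders` names are hypothetical) so that it runs anywhere. Other database engines expose the same information through their own catalog views, with different names.

```python
import sqlite3


def count_tables_and_views(conn):
    """Count user-defined tables and views listed in the SQLite catalog."""
    rows = conn.execute(
        "SELECT type, COUNT(*) FROM sqlite_master "
        "WHERE type IN ('table', 'view') "
        "AND name NOT LIKE 'sqlite_%' "  # skip SQLite's internal objects
        "GROUP BY type"
    ).fetchall()
    counts = {"table": 0, "view": 0}
    counts.update(dict(rows))
    return counts


# Hypothetical miniature schema standing in for a real application database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE VIEW customer_orders AS
        SELECT c.name, o.id AS order_id
        FROM customers c JOIN orders o ON o.customer_id = c.id;
    """
)

counts = count_tables_and_views(conn)
print(counts)
```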
To give you a sense of how much that is, this is what the 42k tables of one particular installation of Oracle e-Business Suite look like:
List of sample Oracle eBS (ERP) 42k tables
And those tables are large and complex themselves. Here is the list of columns of the order lines table in that Oracle database:
Columns of sample Oracle table
I hope this gives you an idea of how difficult it is to find data or to understand what you are looking at. It's as if you were looking for it in Manhattan (there are approx. 134,000 buildings in Manhattan).
Photo by NASA, taken by Expedition 10 Commander Leroy Chiao
Your data is in one of those apartments. And you even have an address - good luck!
You Need a Map!
I hope I have shown you that having data is not enough to be able to use it. If you want to make any use of your data, you need a map. This map of your databases is called a Data Dictionary. If you haven't already, you should start building it today.
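To make the idea a bit more concrete, here is one possible starting point, not a definitive implementation: read the table and column metadata the database already holds and leave an empty description field for people to fill in. The sketch assumes SQLite; the `build_data_dictionary` function, the `orders` and `order_lines` tables and the field names of each entry are all made up for illustration.

```python
import sqlite3


def build_data_dictionary(conn):
    """Collect table and column metadata from a SQLite catalog."""
    dictionary = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, column, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            dictionary.append(
                {
                    "table": table,
                    "column": column,
                    "type": col_type,
                    "nullable": not notnull,
                    "primary_key": bool(pk),
                    "description": "",  # to be written by someone who knows the data
                }
            )
    return dictionary


# Hypothetical two-table schema used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_name TEXT
    );
    CREATE TABLE order_lines (
        line_id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL,
        quantity REAL NOT NULL
    );
    """
)

data_dictionary = build_data_dictionary(conn)
for entry in data_dictionary:
    print(f"{entry['table']}.{entry['column']}: {entry['type']}")
```

The metadata is the easy part. The real value of a data dictionary comes from the descriptions, which no catalog query can generate for you.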
Frame from movie "Aliens" (1986), J. Cameron