Lack of Comments Makes Database Documentation Useless

Recently my team was assigned to map a data source to a data warehouse that we develop and maintain. We received a well-defined data extract from one of the major banks. Along with the sample data we received documentation listing all tables and columns in the data set. This documentation included many details such as data type, precision, format, etc., but to my surprise it did not include the most important one - a description of the meaning of each column. To be honest, without it both the documentation and the data were useless to us.

A few examples

1. Salesforce

Salesforce is the most popular CRM in the world, with the largest market share. It is a cloud application with a relational database in the back end. Customers don't have direct access to this database with database tools, but they can query it from the application (and dedicated tools) using SOQL - a query language similar to SQL. Understanding the database model would therefore be useful.

To be fair, for each database field they document how the field is presented in the UI (the Field Label column), which is very useful, and there are descriptions of fields in the API documentation. Still, more explicit descriptions for each field would make this documentation easier to use, and the Salesforce database easier to understand and query.

Source: Salesforce Field Reference Guide

2. WordPress

WordPress is the most popular open source blog engine in the world. In fact, it is the most popular CMS (content management system) on the Internet today - it is estimated that 27.6% of all websites run on WordPress! WordPress has hundreds if not thousands of developers - as of today, there are 53k plugins. You would expect such popular open source software to be well documented. Well, I found no column descriptions.

Source: Wordpress Database Description

A few good examples

There are also notable examples where architects and technical writers did a good job. Here are some of them.

1. Oracle e-Business Suite

Oracle e-BS is a large packaged ERP application from Oracle. It has a very large and complex database schema with over 42k tables. And yet, Oracle managed to provide a meaningful description for each table and column! Well done, Oracle.

Source: Oracle Purchasing Technical Reference Manual

2. phpBB

Another good example comes from an open source project - phpBB. It is one of the oldest and most popular open source forum engines. It was released in 2000, and chances are good that you have visited one or more forums powered by this platform. Their online documentation includes a description of each table and column.

Well, upon further inspection I noticed that at least half of the columns have "tbd" as a description. Not that good of an example after all. Still, probably better than most commercial projects.

Source: phpBB Tables

What about your databases?

Do you have comments for the tables and columns in your databases? Do you know where to look for them? Do you know who is responsible for them? Are you setting a good or a bad example?
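One rough way to answer these questions is to measure how many columns in a data dictionary carry a real description. The sketch below is only an illustration: the `ColumnDoc` structure, the placeholder list and the sample tables and columns are all hypothetical, and it assumes the dictionary has already been exported from wherever your database or documentation tool keeps it. Placeholders such as "tbd" are counted as missing.

```python
from dataclasses import dataclass
from typing import List

# Values that should not count as documentation (an assumed, adjustable list).
PLACEHOLDERS = {"", "tbd", "todo", "n/a"}


@dataclass
class ColumnDoc:
    """One row of a data dictionary (hypothetical structure)."""
    table: str
    column: str
    data_type: str
    description: str = ""


def is_documented(col: ColumnDoc) -> bool:
    """True only if the description is something other than a placeholder."""
    return col.description.strip().lower() not in PLACEHOLDERS


def undocumented(columns: List[ColumnDoc]) -> List[str]:
    """Return 'table.column' names that still need a description."""
    return [f"{c.table}.{c.column}" for c in columns if not is_documented(c)]


def coverage(columns: List[ColumnDoc]) -> float:
    """Fraction of columns with a meaningful description (0.0 for no columns)."""
    if not columns:
        return 0.0
    return sum(is_documented(c) for c in columns) / len(columns)


# Made-up sample: types and formats are present, meaning is often missing.
SAMPLE = [
    ColumnDoc("customer", "cust_id", "NUMBER(10)", "Unique customer identifier"),
    ColumnDoc("customer", "sts_cd", "CHAR(1)", "tbd"),
    ColumnDoc("customer", "opn_dt", "DATE"),
    ColumnDoc("account", "bal_amt", "NUMBER(18,2)", "Current balance in account currency"),
]

if __name__ == "__main__":
    print(f"description coverage: {coverage(SAMPLE):.0%}")
    for name in undocumented(SAMPLE):
        print("missing description:", name)
```

A check like this says nothing about whether the descriptions are any good, but it does make the gap visible and gives whoever owns the documentation a concrete list to work through.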

Read: How to Check if Your Database Has Comments

Takeaway

A plain specification of a data schema is close to meaningless - if you are documenting a database, this information is accessible with database tools anyway. What matters most is the meaning of the data - what is actually kept in each column and what it all means. Please make life easier for everyone who will need to query the database after you, and provide a meaningful description of each table and column.

Need a tool? Try Dataedo
