All articles · Metadata Management · Database Design & Metadata · Application Metadata · Metadata Tools · Products and News

6 Typical Metadata Fields Stored by Applications

Many packaged and custom applications store a few standard metadata elements in the database. This article lists them and explains their purpose and how they are implemented.

Those metadata fields help to track and audit data. In a perfect world all the databases would have those fields by default. In this imaginary world database platforms would support them out of the box for each entity/table.

1. Date of Record Creation

One of the most common metadata fields you can find in databases is a data and time when record was created (inserted into table).

Why is This Useful?

Knowing when specific record was created might be very useful for auditing or even analytics. It tells you how old particular row is and what was the order of creating data rows.

How to Implement

In most database systems it is enough to create a date or datetime column and set up a default value with current time:

  • SQL Server: created_date default getdate()
  • Oracle: created_date date default SYSDATE
  • MySQL: created_date timestamp default CURRENT_TIMESTAMP
  • PostgreSQL: created_date timestamp default now()

2. Who Created Record

Why is This Useful?

Knowing who created a record is very useful in auditing. You know who to talk to. Or blame.

How to Implement

There are two approaches to this:

  1. Use database login - All databases can automatically extract information about logged in user name and save this information in new row using default value (or a trigger). This approach doesn't always work, it doesn't when data is entered with application that uses one login to connect to database for all users.
  2. Use application - When application uses one login to connect to the database developers may pass user login explicitly along with the rest of fields.

To use database login create a varchar column and set up a default value with function that returns logged in user (or use a trigger):

  • SQL Server: created_by varchar(100) default suser_name()
  • Oracle: created_by varchar(100) default USER
  • MySQL:
    • created_by varchar(100);
    • create trigger table_insert_trig before insert on table for each row begin set new.created_by := current_user(); end;

3-4. Date and Who Updated Record Last

Similarly to created by and time fields databases often have time and login of user who last updated row. Again, very useful for auditing and trying to figure out the origin of the data.

Why is This Useful?

Having information who and when updated row gives you information if it was ever updated after its creation. You know how fresh the data is and who to talk with if you want to get more information about its status.

How to Implement

To use database login create a varchar column and set up a default value with function that returns logged in user:

  • SQL Server: created by varchar(100) default suser_name()
  • Oracle: created by varchar(100) default
  • MySQL: created by varchar(100) DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

5. Source of a Record

Less obvious and less common metadata field is a source of a record. This can be an identifier of an application/module, interface or program that created each particular row.

Why is This Useful?

It is a wise strategy to have this field in your tables, for key tables at least. Data can be collected over a long period of time from various sources with various programs.

Examples of different sources:

  • Form ID - when data is entered by users with UI
  • Database/Application ID - when data is gathered from various sources, for instance customer record can come from Online store, CRM, Help desk, Accounts receivables, and so on. The same goes for application consolidation. Often companies have multiple parallel applications storing the same information. This happens for various reasons - geographical distribution or mergers. When databases get consolidated into one it can be very useful to know from which database it came from exactly.
  • Program/routine ID and version - for rows that are generated with some kind of processing
  • Manual import - when data was manually imported to the database from spreadsheets. You could use name of the spreadsheet.
  • Legacy database ID - for data that has been migrated from legacy database

Hot to Implement

It is very easy to create such field (or fields) but making sure it holds the right information takes a lot of conscientiousness at each stage of data management - application design, development, data integration and migrations. This needs to be considered with each insert statement written or executed at any stage. It a good practice to have a reference documentation with a list of possible values.

6. Row Version Number

This field is quite rare. This is used mostly for the implementation of optimistic locking mechanism where application handles multiple users by checking by row version number whether row was updated by someone else before saving it.

Why is This Useful?

Besides handling concurrent updates this also brings extra information and can be used for auditing and analytics. It tells you how many times specific row was updated by users.

Hot to Implement

This can be implemented with a trigger or in application layer. Optimistic locking needs to be implemented in the latter.

Any Other Standard Metadata Fields?

I hope you found my compilation useful. Do you know any other standard metadata field is stored by applications? Please share in the comment.

Document Your Table Columns

Recommendations