A business glossary is business metadata: a glossary of business terms related to data. It sits at the core of data governance and streamlines reporting, analytics, data warehousing, data protection, and other data initiatives.
A business glossary is tool agnostic - you can build one in any word processor, spreadsheet, or wiki tool. However, it is most often implemented with a dedicated metadata or data governance tool, and those tools do it in a very similar fashion. In this article, I’d like to guide you through the elements of a typical business glossary implementation.
A (business) glossary is a collection of business terms, policies, rules and categories/folders that arrange them.
A key entry in the business glossary is a business term. It is a business concept or entity identified by a unique name and defined by a meaningful description specific to the organization, in a language understood by everyone, IT and business people.
Glossary term in Apache Atlas.
Rules are data governance rules for data assets.
Policies define how, where, and by whom data will be managed.
A category is a folder that is used to hold terms, policies, and rules.
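To make these building blocks concrete, here is a minimal sketch of how they could be modeled. It is not how any particular tool stores its glossary; the class names, fields, and the sample "Finance" category and "Customer" term are my own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str        # unique name of the business concept
    definition: str  # description in plain business language


@dataclass
class Policy:
    name: str
    description: str


@dataclass
class Rule:
    name: str
    description: str


@dataclass
class Category:
    """A folder that holds terms, policies, and rules."""
    name: str
    terms: list[Term] = field(default_factory=list)
    policies: list[Policy] = field(default_factory=list)
    rules: list[Rule] = field(default_factory=list)


@dataclass
class Glossary:
    categories: dict[str, Category] = field(default_factory=dict)

    def add_term(self, category_name: str, term: Term) -> None:
        # Enforce the "unique name" requirement across the whole glossary.
        for category in self.categories.values():
            if any(t.name == term.name for t in category.terms):
                raise ValueError(f"duplicate term name: {term.name}")
        # Create the category on first use, then file the term under it.
        category = self.categories.setdefault(category_name, Category(category_name))
        category.terms.append(term)


# Hypothetical sample data.
glossary = Glossary()
glossary.add_term("Finance", Term("Customer", "A party that has bought at least one product."))
```

Rejecting duplicate names at insert time is one simple way to keep term names unique; a real tool may scope uniqueness differently, for example per category.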
Glossary term relationships
Relationships between terms, policies, and rules are an important element of the business glossary.
Examples of such relationships:
- Is a synonym to
- Is calculated from
- Replaced by
- Related to
- Is modifier of
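The relationship types listed above can be sketched as typed links between terms. This is only an illustration: the term names ("Net Revenue", "Client", and so on) are made up, and storing relationships as simple triples is my assumption, not the model of any specific tool.

```python
from enum import Enum


class RelationType(Enum):
    SYNONYM_OF = "is a synonym to"
    CALCULATED_FROM = "is calculated from"
    REPLACED_BY = "replaced by"
    RELATED_TO = "related to"
    MODIFIER_OF = "is modifier of"


# Each relationship is a (source term, relation type, target term) triple.
# All term names here are hypothetical.
relationships = [
    ("Net Revenue", RelationType.CALCULATED_FROM, "Gross Revenue"),
    ("Net Revenue", RelationType.CALCULATED_FROM, "Discounts"),
    ("Client", RelationType.SYNONYM_OF, "Customer"),
    ("Legacy Customer ID", RelationType.REPLACED_BY, "Customer Number"),
]


def related(term, relation):
    """Return the targets that `term` points to through `relation`."""
    return [target for source, rel, target in relationships
            if source == term and rel == relation]


print(related("Net Revenue", RelationType.CALCULATED_FROM))
# prints ['Gross Revenue', 'Discounts']
```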
Term relationships in IBM Information Governance Catalog
Sample diagram of terms and relationships expressed in VOWL notation (for ontologies).
In addition to the relationships described above, glossary entities can be arranged into a hierarchy.
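A hierarchy can be represented very simply by recording the parent of each term. The sketch below assumes a single parent per term and uses invented term names; tools that allow several parents would need a different structure.

```python
# Hypothetical hierarchy: each term maps to its parent; None marks a root.
parent_of = {
    "Asset": None,
    "Financial Asset": "Asset",
    "Bond": "Financial Asset",
    "Equity": "Financial Asset",
}


def path_to_root(term):
    """Return the chain of terms from `term` up to the root of the hierarchy."""
    path = [term]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path


def children_of(term):
    """Return the direct children of `term`."""
    return [child for child, parent in parent_of.items() if parent == term]


print(path_to_root("Bond"))
# prints ['Bond', 'Financial Asset', 'Asset']
```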
Each term has a standard set of attributes.
Name is how the term should be referred to by the organization. It should be a meaningful, unequivocal and unique identifier of the concept/entity.
Glossary term in Semanta
Every term needs a definition. It is an explanation and specification of a concept/entity understandable by both business and technical users. It can be split into a short definition, long definition, and calculation. Name and definition constitute minimum core elements of any business glossary.
Glossary term in Semanta
Synonyms and abbreviations are optional fields - a list of other names and abbreviations the term is also known by.
For instance, the capital markets term Assets Under Management can also be known as Funds Under Management and abbreviated as AUM or FUM.
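One practical use of synonyms and abbreviations is resolving whatever name a user types to the canonical term. Here is a small sketch using the example above; the `resolve` helper and the case-insensitive matching are my own assumptions.

```python
# Canonical term -> other names and abbreviations (example from the text above).
aliases = {
    "Assets Under Management": ["Funds Under Management", "AUM", "FUM"],
}

# Build a case-insensitive index from every known name to the canonical term.
index = {}
for canonical, names in aliases.items():
    for name in [canonical, *names]:
        index[name.casefold()] = canonical


def resolve(name):
    """Return the canonical term for `name`, or None if it is unknown."""
    return index.get(name.strip().casefold())


print(resolve("fum"))
# prints Assets Under Management
```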
Links to data assets
While a business glossary itself is not tied to any particular IT solution, the option of linking business terms with data assets can bring immense value. Thus, links to data assets (the data dictionary) are a very important element of a glossary in data catalogs, metadata management, and data governance tools. The benefit of using a tool such as Dataedo is that it allows you to automatically catalog data elements (tables, columns) from your databases and then link them to business terms.
This allows you to easily go from logical definition to physical representation in various databases (CRM, HRM, ERP, data warehouse, etc.) and quickly find the data.
It also works the other way around - it helps data engineers and analysts working with databases understand what a particular table or column represents and gives them a shortcut to the organization's global definition.
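The two directions of navigation described above can be sketched with a pair of lookups. The database, table, and column names below are invented, and this is not how Dataedo or any other tool stores its links; it only shows the idea of going from term to columns and back.

```python
from collections import defaultdict

# Hypothetical links: business term -> physical columns as (database, table, column).
term_to_columns = {
    "Customer": [
        ("crm", "accounts", "account_id"),
        ("dwh", "dim_customer", "customer_key"),
    ],
    "Assets Under Management": [
        ("dwh", "fact_positions", "aum_amount"),
    ],
}

# Reverse index: physical column -> business terms that describe it.
column_to_terms = defaultdict(list)
for term, columns in term_to_columns.items():
    for column in columns:
        column_to_terms[column].append(term)


def databases_for(term):
    """From logical definition to physical representation: where does the term live?"""
    return sorted({database for database, _table, _column in term_to_columns.get(term, [])})


def terms_for(database, table, column):
    """From a physical column back to the business terms that define it."""
    return column_to_terms.get((database, table, column), [])


print(databases_for("Customer"))
# prints ['crm', 'dwh']
```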
Many tools and glossary implementations have other attributes and metadata assigned to business terms.
A term owner is a person or organization that has the highest authority on the name and definition.
A steward is a person who is responsible for the process of gathering information, agreeing on the definition between all stakeholders and publishing the term in the dictionary.
Building a glossary takes considerable time and is a team effort. Each term can go through different stages of the process of definition and approval. A status field can clearly indicate the stage of the definition of the term, for instance: Suggestion, Draft, Pending Approval, Approved, Deprecated.
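The status values above can be treated as a small workflow. In the sketch below the statuses come from the list above, but the allowed transitions are purely my assumption; every organization defines its own approval process.

```python
from enum import Enum


class Status(Enum):
    SUGGESTION = "Suggestion"
    DRAFT = "Draft"
    PENDING_APPROVAL = "Pending Approval"
    APPROVED = "Approved"
    DEPRECATED = "Deprecated"


# Assumed transitions; a real approval process may allow different moves.
ALLOWED = {
    Status.SUGGESTION: {Status.DRAFT},
    Status.DRAFT: {Status.PENDING_APPROVAL},
    Status.PENDING_APPROVAL: {Status.APPROVED, Status.DRAFT},  # approve or send back
    Status.APPROVED: {Status.DEPRECATED},
    Status.DEPRECATED: set(),
}


def advance(current, new):
    """Return `new` if the move is allowed, otherwise raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


status = Status.SUGGESTION
status = advance(status, Status.DRAFT)
status = advance(status, Status.PENDING_APPROVAL)
status = advance(status, Status.APPROVED)
```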
For the purposes of data protection and governance, organizations must assign a level of classification to data assets (e.g. Public, Private, Restricted). Providing a classification for a term is a top-level requirement for data protection and can help classify specific physical data assets and find sensitive data in databases.
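Here is a rough sketch of how a term's classification could flow down to the physical columns linked to it. The terms, column names, the ordering Public < Private < Restricted, and the rule that a column takes the strictest level of its linked terms are all assumptions made for this example.

```python
from enum import IntEnum


class Classification(IntEnum):
    # Assumed ordering: a higher value means more sensitive.
    PUBLIC = 1
    PRIVATE = 2
    RESTRICTED = 3


# Hypothetical term classifications and links to physical columns.
term_classification = {
    "Product Name": Classification.PUBLIC,
    "Customer Name": Classification.PRIVATE,
    "Tax ID": Classification.RESTRICTED,
}
term_links = {
    "Product Name": ["erp.products.name"],
    "Customer Name": ["crm.contacts.full_name"],
    "Tax ID": ["crm.contacts.tax_id", "erp.vendors.tax_id"],
}


def classify_columns():
    """Give each column the strictest classification among the terms linked to it."""
    result = {}
    for term, columns in term_links.items():
        level = term_classification[term]
        for column in columns:
            result[column] = max(level, result.get(column, level))
    return result


def sensitive_columns(minimum=Classification.PRIVATE):
    """List columns classified at `minimum` or stricter."""
    return sorted(column for column, level in classify_columns().items()
                  if level >= minimum)


print(sensitive_columns(Classification.RESTRICTED))
# prints ['crm.contacts.tax_id', 'erp.vendors.tax_id']
```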
I hope that after reading this article, you have a good idea of what a business glossary is, what its essential elements are, and that it is something an organization of any scale can afford. You can start collecting your key terms in a spreadsheet or a word processor, and when you're ready, move to a tool such as Dataedo and find key data in your databases. Try it for free now.