A Data Dictionary is a description of data structures, and you can distinguish between logical and physical Data Dictionaries.
Data Dictionary and Data Model
The difference between logical and physical Data Dictionaries is the same as the difference between logical and physical data models:
- A logical data model is created during requirements gathering, system analysis and top-level design. It is a communication and specification tool for business analysts and business users.
- A physical data model is created when you translate the top-level design into physical tables in the database. It differs from the logical model because you have to settle many implementation details. This model is created by database or software architects, software developers or database administrators.
The same goes for the Data Dictionary: the logical one describes elements of the logical data model, and the physical one documents the actual tables and columns in your database schema.
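To make the distinction concrete, here is a minimal sketch of what one entry in each kind of dictionary might record. The class names, the table name `rcv_transactions`, the system name `erp` and the data type are assumptions made for illustration only; they do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogicalElement:
    """One entry of a logical Data Dictionary (business-facing)."""
    entity: str        # business entity the field belongs to
    name: str          # user-friendly field name
    definition: str    # business meaning in plain language


@dataclass
class PhysicalColumn:
    """One entry of a physical Data Dictionary (one column in one schema)."""
    database: str      # system that holds the data
    table: str
    column: str
    data_type: str     # exact type, including length
    nullable: bool
    describes: Optional[LogicalElement] = None  # link back to the logical entry


invoice_number = LogicalElement(
    entity="Invoice",
    name="Invoice number",
    definition="Number printed on the invoice sent to the customer.",
)

rcv_trx_no = PhysicalColumn(
    database="erp",             # hypothetical system name
    table="rcv_transactions",   # hypothetical table name
    column="rcv_trx_no",
    data_type="VARCHAR(20)",    # assumed type and length
    nullable=False,
    describes=invoice_number,
)

print(f"{rcv_trx_no.table}.{rcv_trx_no.column} -> {rcv_trx_no.describes.name}")
```

The logical entry carries only business meaning, while the physical entry carries the details a developer or administrator needs, plus an optional link back to the logical element it implements.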
Different audience and users
A Logical Data Dictionary is used by business analysts or business users. A Physical Data Dictionary is used by database administrators, software developers and database architects.
Different authors
Not only is the audience different; the authors are different too. A Logical Data Dictionary is created by business analysts. A Physical Data Dictionary is created and maintained by database architects and database administrators.
Relation to applications and databases
A Logical Data Dictionary is system agnostic, while a Physical Data Dictionary represents data in a specific database.
Different purpose
A Logical Data Dictionary is used during business analysis and top-level design as a communication tool. A Physical Data Dictionary serves as documentation of databases and systems for technical users.
Different development stage
A Logical Data Dictionary is usually created at the beginning of system design: top-level design, data modeling and requirements gathering. It is then reviewed at launch and validation. A Physical Data Dictionary is created during database design and implementation and should be maintained throughout the whole system lifecycle, as the database schema evolves.
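One way to keep a physical dictionary in step with an evolving schema is to regenerate its skeleton from the database catalog. The sketch below assumes an SQLite database and uses Python's standard `sqlite3` module; the `customer` table and the function name `physical_dictionary` are hypothetical, and other database engines expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_no   TEXT NOT NULL,
        created_by    TEXT,
        modified_date TEXT
    )
    """
)


def physical_dictionary(conn):
    """Return one tuple per column: (table, column, type, nullable, is_pk)."""
    entries = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, column, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            nullable = not notnull and not pk
            entries.append((table, column, col_type, nullable, bool(pk)))
    return entries


before = physical_dictionary(conn)

# The schema evolves: a new column appears, and a rerun picks it up.
conn.execute("ALTER TABLE customer ADD COLUMN deleted_flag INTEGER")
after = physical_dictionary(conn)

for entry in after:
    print(entry)
```

The catalog supplies only names, types and constraints; the descriptions that make the dictionary useful still have to be written and maintained by people.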
Different scope
A Physical Data Dictionary covers one database or one schema, while a Logical Data Dictionary covers one business domain or even the entire organization. Therefore, there should be one Logical Data Dictionary per domain, but there may be more than one Physical Data Dictionary, since the same data may exist in several applications. A good example is customer data: it can be held in multiple databases, such as eCommerce, CRM, ERP or a data warehouse.
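The one-to-many relationship between a logical element and its physical locations can be sketched as a simple mapping. All table and column names below are hypothetical; only the four kinds of system come from the example above.

```python
from collections import defaultdict

# logical element -> list of (system, table, column); all names are hypothetical
physical_locations = defaultdict(list)


def register(logical_element, system, table, column):
    """Record that a physical column implements a logical element."""
    physical_locations[logical_element].append((system, table, column))


register("Customer.Email", "eCommerce", "users", "email")
register("Customer.Email", "CRM", "contacts", "email_address")
register("Customer.Email", "ERP", "customer_master", "eml_addr")
register("Customer.Email", "data warehouse", "dim_customer", "email")

for system, table, column in physical_locations["Customer.Email"]:
    print(f"{system}: {table}.{column}")
```

A single logical entry stays stable while each system's physical dictionary records its own table and column for the same piece of data.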
Different level of details
A Logical Data Dictionary focuses on key data elements (entities and fields), while a Physical Data Dictionary describes all tables and columns.
A few examples
Here are some differences between the two models and dictionaries:
- Logical model uses user-friendly names, while physical is subject to naming limitations and often uses a different notation (e.g. rcv_trx_no instead of Invoice number)
- Logical model doesn’t need to determine data type while physical has to (including data length)
- Physical model needs to specify the exact data format (e.g. the format of a date stored as text)
- Physical model often introduces surrogate primary keys (e.g. customer_id) where logical uses business keys (e.g. Customer number)
- The same goes for foreign keys – logical model just defines relationship while physical one needs to introduce linking columns
- Logical model may specify many-to-many relations while physical needs to use associative tables
- On the other hand, physical model may use some denormalization to improve performance, while logical, as a conceptual model, is mostly normalized
- Logical model can define redundant calculated fields (like Is employment active which can be derived from start, end dates) while physical often doesn’t (developers embed logic in code)
- Both models may use different value sets (e.g. logical uses ‘Male’ and ‘Female’ while physical stores ‘M’ and ‘F’)
- Values can also be implemented as a separate dictionary table (sometimes a common table for many dictionaries), while logical model doesn’t have to specify such detail
- Physical data can be stored in several tables or even systems (e.g. customer billing data is held in the ERP system and marketing information in the CRM, while logical model has just one Customer entity)
- Physical model often has system or metadata columns (e.g. created_by, modified_date, deleted_flag) that logical one doesn’t require
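Several of the differences listed above can be seen together in a small sketch. It assumes an SQLite database accessed through Python's `sqlite3` module, and every table, column and function name is hypothetical: a surrogate key next to a business key, dates stored as text, coded values, metadata columns, an associative table for a many-to-many relation, and a calculated field that exists only in code.

```python
import sqlite3
from datetime import date

# Logical value set mapped from the codes stored physically
GENDER_LABELS = {"M": "Male", "F": "Female"}


def is_employment_active(start_date, end_date, today):
    """Derive the logical field 'Is employment active' from stored dates."""
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date) if end_date else None
    return start <= today and (end is None or today <= end)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,     -- surrogate key
        employee_no   TEXT NOT NULL UNIQUE,    -- business key
        gender        TEXT CHECK (gender IN ('M', 'F')),
        start_date    TEXT NOT NULL,           -- date stored as text, YYYY-MM-DD
        end_date      TEXT,
        created_by    TEXT,                    -- metadata columns
        modified_date TEXT
    );
    CREATE TABLE project (
        project_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- associative table implementing a many-to-many relation
    CREATE TABLE employee_project (
        employee_id INTEGER NOT NULL REFERENCES employee (employee_id),
        project_id  INTEGER NOT NULL REFERENCES project (project_id),
        PRIMARY KEY (employee_id, project_id)
    );
    """
)

conn.execute(
    "INSERT INTO employee (employee_no, gender, start_date, end_date, created_by) "
    "VALUES (?, ?, ?, ?, ?)",
    ("E-1001", "F", "2020-03-01", None, "hr_import"),
)

employee_no, gender, start_date, end_date = conn.execute(
    "SELECT employee_no, gender, start_date, end_date FROM employee"
).fetchone()

# Translate the physical row back into logical terms
logical_view = {
    "Employee number": employee_no,
    "Gender": GENDER_LABELS[gender],
    "Is employment active": is_employment_active(
        start_date, end_date, today=date(2024, 1, 15)
    ),
}
print(logical_view)
```

The logical view has three user-friendly fields, while the physical schema needs seven columns in one table plus an extra linking table to express the same information.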
There are more differences between the two, but I hope you get the idea.