Data in today's information systems are organized into databases. A database is a logical subset of data tables and scripts hosted on a server. This division is often artificial and there are no strict logical boundaries between data assets.
There are various reasons for creating such data silos: technical (it is easier to backup and manage smaller databases), organizational (different teams and departments can work on separate databases), there are legacy applications and databases, companies buy third party applications and so on.
But data in most environments, from business and logical perspective constitutes one continuous landscape, tables are related to one another and there are no strict boundaries.
But in fact, because of technical limitations of today's DBMS platforms, data is put into distinct siloed islands and architects cannot define cross-database table relationships/foreign keys.
Some engines, like SQL Server or MySQL, don't even allow to create foreign key constraints across databases on the same server.
Why is that a problem?
Not being able to define foreign key constraints (FKs) across databases brings a few problems:
Data integrity
FKs ensure data referential integrity handled by the database server. Without them it has to be ensured in higher levels and that is not as reliable.
Understanding data model
Without FKs developers have a hard time understanding and discovering data models and this means problems and costs for reporting, data warehousing or big data projects.
Let's document it
There is one thing (at least) you can do about this - document those relationships. Let's use a great tool for this - Dataedo. It enables you to describe schema of various databases in one repository and define non-existent foreign keys.
Let's do it.
1. Install
First, we need to download and install Dataedo tool: https://dataedo.com/download-dataedo
You can work with the local file as your repository or set up a server repository, which is recommended for teamwork. You will need SQL Server instance for this. Create repository and connect to it.
2. Import database schemas
Now you need to connect to your databases (the ones you want to document) and import schema. Just click Add documentation button, provide connection details and Dataedo will read tables, columns, foreign keys etc.
In my example I have imported following databases:
You can see that Dataedo reads foreign keys and shows you existing table relationships:
3. Define relationships
OK, let's now get to documenting those cross-database relationships. Once all required databases and tables are imported we can create new relationship from table in one database to the table in another database.
In our case, we have a time tracking database that has RES table that holds resources. Those are any entities that organization can track time against, including employees. But employees are defined in HR database. This database is not only hosted on different server, but also operates on different database engine.
There are no foreign keys, but there is a logical link between them - RES.EMP_NO -> HR.EMP.EMP_NO. So let's define it. Let's right click EMP_NO column and choose Add relation.
A new window will open with where we need to define our new relationship. Let's choose HR database as a Primary Key database (PK Database field):
Then EMP table in PK Table:
And finally EMP_NO column in PK Column:
If your key is composite (consists of more than one column add more rows in Columns table)
Our relationship is now defined and we can save it in the repository. As you can see that it shows just as a regular foreign key, except with a prefix with the name of the database (HR).
Now anyone working with this table will be aware of this data logic assumption. It could be used for reporting, building data warehouse, ensuring data consistency or development of this database.
Please note that no change was applied to any of the documented databases.
4. Create diagrams
OK, so by now we have extended our Data Dictionary with our relationship, but that's not all - we can also create a cross-database ER diagram. To do it, go to Modules & ERDs item under any database and create new module. On ERD tab you can add tables to the pane with a drag & drop from the list on the right. I have already added 3 tables from Time tracking database to my diagram. Once I added RES table, EMP table from HR database is also available.
Let's add it to the pane as well. You may notice that relationship was added automatically.
I used colors to distinguish database. You may notice that name of the other database appears over table name.
5. Export & share
OK, so we have created pretty convenient documentation by far. Now it is time to share it with our team. We can share it in HTML, PDF and Excel files. Let's export it as HTML. To do it, click Export documentation, choose format and path. I selected all databases in my export to provide an overview of our company databases.
I hope you found it useful. If you are ready to understand your databases more and share this knowledge: