Amazon DocumentDB is a fully managed database service by Amazon Web Services (AWS) designed to support MongoDB-compatible workloads. By using Dataedo, you can create comprehensive and well-organized documentation for your Amazon DocumentDB database, making it easier to understand, manage, and maintain.
Catalog and Documentation
Dataedo imports information about the collections in the DocumentDB database and metadata about their fields (or attributes), including their names, data types, and descriptions. After the import, users can explore the schema of DocumentDB documents using ER diagrams, as with MongoDB.
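To make the idea of field metadata concrete, here is a minimal sketch of how field names and data types could be derived from a handful of sample documents. It is an illustration only, not Dataedo's actual import logic; the `infer_fields` helper and the sample documents are hypothetical.

```python
from collections import defaultdict


def infer_fields(documents):
    """Map each top-level field name to the sorted type names seen for it."""
    seen = defaultdict(set)
    for document in documents:
        for name, value in document.items():
            # Record the Python type name of every value observed for the field.
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical sample documents from a single collection.
sample = [
    {"_id": 1, "name": "Ada", "tags": ["x"]},
    {"_id": 2, "name": "Bob", "age": 41},
]
fields = infer_fields(sample)
```

Because documents in one collection need not share the same fields, the result is the union of all fields seen across the sample.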
Among other things, users will also be able to:
- run Data Classification on the DocumentDB database in the repository to search for columns containing potentially sensitive data
- change all descriptions in Dataedo Desktop and Dataedo Portal
- link a Business Glossary term to any documented DocumentDB object
- build Lookups for columns and views and feed them with distinct values from a column
Required permissions
Because Dataedo only scans existing documents, only read permissions are required.
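As a rough illustration of a read-only account, the sketch below builds a MongoDB-style `createUser` command document that grants only the built-in `read` role on one database. The user name, password, and database name are hypothetical placeholders, and whether you create users this way depends on how access control is set up on your cluster.

```python
def read_only_user_command(username, password, database):
    """Build a createUser command document granting only the read role."""
    return {
        "createUser": username,  # the command name must be the first key
        "pwd": password,
        "roles": [{"role": "read", "db": database}],
    }


# Hypothetical values; replace them with your own.
command = read_only_user_command("dataedo_reader", "change-me", "sales")
```

A client library such as PyMongo could send this document with `Database.command`; that step is left out here so the sketch stays free of external dependencies.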
Connecting to DocumentDB
Prerequisites
If you connect with TLS enabled, the appropriate certificates from the Amazon PKCS7 certificate bundle at https://truststore.pki.rds.amazonaws.com/global/global-bundle.p7b must be installed beforehand.
Connection
To connect to DocumentDB, first click the Add button in the upper-left corner and choose New connection.
From the sources choose Amazon DocumentDB.
Next, provide the data needed to establish a connection between Dataedo and the DocumentDB database. Two connection types are available: Values (shown on the screen below) and Connection string.
When connecting with the Values option, you must specify:
- Host - hostname of the server where the DocumentDB instance is located, for example, documentdb-3-6-sch.cluster-abcdefghijkl.us-west-1.docdb.amazonaws.com
- Username - name of the user
- Password - user password
- Timeout (s) - number of seconds after which the connection attempt will be interrupted
Optionally, you can also specify:
- Auth database - the authentication database to use; if unspecified, the client attempts to authenticate the specified user against the admin database
- Replica set - name of the replica set
- Use TLS - enables TLS
- SRV - enables SRV
When connecting with the Connection string option, you must specify:
- Connection string - URI that defines the connection between applications
To learn more about the connection options that can be set in the connection string, visit: Connection Options
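The two connection types carry the same information. As a minimal sketch, the fields of the Values form could map onto a MongoDB-style URI as shown here. The `build_connection_string` helper is hypothetical, the port 27017 is an assumption (it is not part of the form described above), and Dataedo's own mapping may differ.

```python
from urllib.parse import quote, urlencode


def build_connection_string(host, username, password, auth_database=None,
                            replica_set=None, use_tls=False, srv=False,
                            port=27017):
    """Assemble a MongoDB-style URI from the fields of the Values form."""
    # Percent-encode credentials so characters such as '@' or ':' do not
    # break the URI.
    credentials = f"{quote(username, safe='')}:{quote(password, safe='')}"
    options = {}
    if auth_database:
        options["authSource"] = auth_database
    if replica_set:
        options["replicaSet"] = replica_set
    if use_tls:
        options["tls"] = "true"
    # SRV URIs use a different scheme and must not include a port.
    scheme = "mongodb+srv" if srv else "mongodb"
    address = host if srv else f"{host}:{port}"
    query = f"?{urlencode(options)}" if options else ""
    return f"{scheme}://{credentials}@{address}/{query}"


# Hypothetical values for illustration.
uri = build_connection_string(
    "example.cluster-abc.us-west-1.docdb.amazonaws.com",
    "doc user", "p@ss:word",
    auth_database="admin", replica_set="rs0", use_tls=True,
)
```

Encoding the credentials matters in practice: a password containing `@` or `:` would otherwise be read as part of the host portion of the URI.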
Regardless of the connection type, you must specify:
- Database - name of database to be documented
After you click Connect, Dataedo starts retrieving database objects. When it finishes, it displays a window that lets you choose the object types to import. Checking the Advanced filters checkbox lets you define more complex filters.
Click Next to confirm which objects to import. The next window lets you change the default name under which the documentation will appear in the Dataedo repository. This name can be changed later.
Click Import to start the import process. Wait until the import process is completed.
Close the import window using the Finish button. Your DocumentDB database has been imported into new documentation in the repository.
Connector specification
Supported versions: 3.6, 4.0, 5.0
Imported metadata
Metadata | Imported | Editable |
---|---|---|
Collections | ✅ | ✅ |
Fields | ✅ | ✅ |
Data types | ✅ | |
Field descriptions | ✅ | ✅ |
Required (as not nullable) | ✅ | |
Collection descriptions | ✅ | ✅ |
Primary keys | ✅ | ✅ |
Supported features
Feature | Supported |
---|---|
Data profiling | |
Data classification | ✅ |
Data lineage | |
Reference data (import lookups) | ✅ |
Importing from DDL | |
Generating DDL | |
FK relationship tester | |
Known issues and limitations
- When connecting from outside an AWS Virtual Private Cloud (VPC) through an SSH tunnel that uses an EC2 instance as a bastion host (How can I use an SSH tunnel to connect to my Amazon DocumentDB cluster from outside an Amazon VPC?), certificate name mismatches may occur. In that case, you may need to add the tlsInsecure=true option to the connection string; it relaxes validation so that the server only needs to present an X.509 certificate. To learn more about this option, visit: Enable TLS on a Connection.
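For the workaround above, here is a minimal sketch of adding `tlsInsecure=true` to an existing connection string without disturbing options that are already set. The `with_tls_insecure` helper and the example URI (a tunnel endpoint on `localhost:27017`) are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_tls_insecure(connection_string):
    """Return the connection string with tlsInsecure=true set."""
    parts = urlsplit(connection_string)
    # Keep the existing options and add (or overwrite) tlsInsecure.
    options = dict(parse_qsl(parts.query, keep_blank_values=True))
    options["tlsInsecure"] = "true"
    return urlunsplit(parts._replace(query=urlencode(options)))


# Hypothetical URI pointing at a local SSH tunnel endpoint.
tunneled = with_tls_insecure(
    "mongodb://user:secret@localhost:27017/?tls=true&replicaSet=rs0"
)
```

Since this option weakens certificate checks, it is probably best limited to the tunneled setup described in this item.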