SQL Server Integration Services (SSIS) is a Microsoft's ETL platform for building data transformations and data ingrations solutions. Dataedo supports documenting the SSIS packages contained in ISPAC deployment project files, DTSX package files, and connecting to a SSIS server.
Supported elements and metadata
Packages will be imported as Packages in Dataedo. Tasks within a package will be imported as data processes in Data Lineage, so that the data lineage can be easily extended with data flows.
Sources and Destinations types of objects from SSIS which are in Data Flow are imported into dataedo as Sources and Destinations.
If the source of the data is an SQL query then Dataedo parses that query, and based on that parsing a lineage is created.
Connections that are in Connection manager in SSIS Package will be imported into Dataedo as Linked Sources.
Linked source contains information about the connection such as server, host, etc. If Dataedo detects that the database to which the connection relates was previously documented, it will assign the matching database to the Linked source. More about Linked Sources in this article
Dataedo imports project connections and package connections
Automatic data lineage
Before documenting SSIS project/package make sure that all of the data sources and destinations are imported as data sources in Dataedo. Furhtermore, the data source and destination should be created on package level. If all of the prerequisites are met, Dataedo will automatically create Data Lineage for following tasks (Control Flow):
- Data Flow Task (source and destination)
- Bulk Insert Task (destination)
- Execute SQL Task
- FTP Task
The data lineage for Sources and Destinations is also predicted when data sources are documented in Dataedo, and is created in SSIS by these objects from the Toolbox:
- ADO NET Source
- Flat File Source
- OLE DB Source
ODBC Source
ADO NET Destination
- Flat File Destination
- OLE DB Destination
- SQL Server Destination
Automatic column data lineage
This type of lineage is provided for specific objects. At the Process level in Dataedo, a column lineage is created when a process contains these objects from the SSIS Toolbox:
- Balanced Data Distributor
- Conditional Split
- Lookup
- Multicast
- Catche Transform
- Export Column
- Procentage Sampling
Column lineage, but not for columns newly created in the process:
- Data Conversion
- Derived Column Script Component
- Character Map
- Copy Column
- Fuzzy Grouping
There is no column lineage, but there is lineage to the table:
- Aggregate
- Sort
- Union
- Pivot
- Row Sampling
- Unpivot
At the Sources and Destinations object level in Dataedo, the column lineage is also available.
Connect Dataedo to SQL Server Integration Services SSIS
To connect to SSIS create new documentation by clicking Add and choosing New connection.
On the Add documentation window choose SQL Server Integration Services (SSIS):
On the next screen select Import Type:
- ISPAC project deployment file - if you want to document all packages in a project file exported from SSIS SQL Server instance.
- DTSX package file - if you want to document just one package, either extracted from ISPAC file, or created in Visual Studio.
- SSIS Server - this option allows you to connect to a SQL Server instance holding an SSIS project.
ISPAC project deployment file and DTSX package file import
For first two options select the path of the appropriate file by clicking Browse button.
Click Connect. Dataedo will now read all the packages and will list them as Package type. Select packages to import and click Next. Set the Documentation Title and click Import.
SSIS Server import
For SSIS Server option you will need to fill in connection details:
SSIS Server connection details
- Server name - SQL Server instance where SSIS is stored, e.g. localhost.
- Port - port under which SQL Server is available. In most cases you can leave default value of 1433
- Authentication - select authentication type for connection. These are same as authentication types for SQL Server. For selected authentication type you will need to fill in appropriate information:
- Windows Authentication - Dataedo will use your Windows account, hence no further details are needed
- SQL Server Authentication - user and password
- Azure Active Directory - Password - Azure AD user and password
- Azure Active Directory - Integrated - Azure AD login associated with Windows account
- Azure Active Directory - Universal with MFA - two-step authentication where you are redirected to Microsoft MFA login webpage.
- Database - select SSISDB (Catalog Database) from list or type its' name
- Folder - select Folder in which project is stored
- Project - select Project which is to be documented
How to export an ISPAC project file
Connect to a SQL Server instance containing the SSIS project you want to document with SQL Server Management Studio. Then expand Integration Services Catalog, look for the project you want to document (on screenshot it's Daily ETL). Then right-click on the project and choose Export.
The windows explorer window will appear. Choose a location for the file and save.
Outcome
Specification
Imported metadata
Dataedo reads following metadata from SSIS.
Imported | Editable | |
---|---|---|
Packages | ✅ | ✅ |
Control Flows as Processes | ✅ | ✅ |
Column data lineage | ✅ | ✅ |
Process description | ✅ | ✅ |
Process script | ✅ | ✅ |
Variables | ✅ | |
Script | ✅ | |
Description | ✅ | ✅ |
Sources and Destinations | ✅ | ✅ |
Data Lineage | ✅ | ✅ |
Column data lineage | ✅ | ✅ |
Process description | ✅ | ✅ |
Process script | ✅ | ✅ |
Script | ✅ | |
Description | ✅ | ✅ |
Shared metadata | ||
Dependencies | ✅ | ✅ |
Created time | ✅ | |
Last updated time | ✅ |