Subscribe to Windows IT Pro
May 01, 1999 12:00 AM

Unleash the Power of DTS

SQL Server Pro
InstantDoc ID #5248
Rating: (6)
Downloads
5248.zip

How to use DTS import, export, and data- manipulation capabilities

Data Transformation Services (DTS) is a SQL Server 7.0 utility service that provides import, export, and data-manipulation capabilities between OLE DB, ODBC, and ASCII data stores. Like many SQL Server tools, the DTS Package Designer and wizards are nothing more than wrappers that graphically expose an object model. Although these tools offer an extensive array of options that you can use to develop elaborate transformations, they don't expose the complete functionality and power of the DTS object model. To unleash this power, you must interact with DTS via its native COM interfaces. This article examines the DTS object model and illustrates it with a Visual Basic 6.0 (VB6) application that uses the DTS COM interfaces to perform its data transformations.

The Object Model
The DTS object model offers a complete set of components that let you move data among multiple data stores. Figure 1, page 44, provides a summary of the major collections that compose the Package object model: Connections, GlobalVariables, Steps, and Tasks. All these collections play equally important roles in supporting a Package.

The Connections collection contains all the OLE DB data provider information in a Package. Each Connection object describes the data provider. With connection pooling, you can then reuse these connections across multiple Steps or Tasks, so that you can do more work with fewer resources. Connection pooling is a provider-supplied function that enables an application to reuse an existing connection rather than physically establishing a new connection for each use.

The GlobalVariables collection provides an internal storage area where you can store data and share it across different Steps within a Package. The GlobalVariable object provides a variant data type for storage. Each Package object contains a collection of GlobalVariable objects that you can dynamically add or remove during execution.

The Steps collection contains all the Task workflow information in a Package. You can associate each Step object with a collection of Task objects; however, if you want to execute a Task, it must have at least one Step. Unlike the DTS Package Designer, which implicitly creates any needed Steps, applications using the DTS COM interfaces must explicitly define the Steps.

Whether a Step object executes depends on its PrecedenceConstraint objects; a Step can execute in parallel with other Steps, serially to them, or conditionally based on their execution. A PrecedenceConstraint defines the conditions that prior Steps must meet before the current Step can execute. Thus, a Step with more than one PrecedenceConstraint cannot execute until it satisfies all its PrecedenceConstraints. The three types of PrecedenceConstraints are

  • Completion: This Step executes regardless of the previous Step's success or failure.
  • Success: This Step executes only after the previous Step's successful completion.
  • Failure: This Step executes only when the prior Step fails.

The Tasks collection, as Figure 2 shows, contains all the defined Tasks in a Package. Each Task object contains information describing a unit of work that the Package object must perform as part of the Transformation process. The CustomTask object is necessary to implement any Task. Using the CustomTask, you can extend DTS by creating Tasks to supplement the built-in Tasks that ship with SQL Server. CustomTasks can include their own customized user interface and dialogs, which you can add to the DTS Designer. The DTS interfaces also let you use CustomTasks directly in COM applications.

Table 1, page 45, lists the eight built-in Tasks that ship with DTS. This article focuses on the three Tasks most directly related to performing Transformations: the DataPumpTask, Data-DrivenQueryTask, and ExecuteSQLTask.

Transformations
The DataPumpTask object defines the information necessary to create and execute an instance of the DTS data pump, which transforms data as it is moved from the source to the destination data store. The data pump, which Figure 3 shows, is a multithreaded, high-speed, in-process COM server that moves and transforms rowsets between a source and a destination data store. ActiveX scripts can perform the Transformations as the data pump moves the data. If your Transformation is too complicated for the ActiveX scripting language, you can develop CustomTransformations with any COM-compliant language. Note that DTS applies Transformations row by row as the data passes through the pump. Therefore, when designing CustomTransformations, you need to evaluate and minimize any performance hits.

The DataDrivenQueryTask object, which Figure 4 shows, defines the information necessary to move data between the source and destination data stores via data-driven queries. The DataDrivenQueryTask differs from the DataPumpTask in how the Task applies the data to the destination data store.

DTS provides two methods of interaction with the destination data store. The first and fastest method uses insert-based data movement: The Task reads the data from the source, transforms it, and then uses INSERT statements or OLE DB's IRowsetFastLoad interface to insert the data or bulk copy it to the destination data store. The second method is data-driven-query-based: Every source row passing through the data pump applies a query against the destination data store. The query can be any valid SQL code including UPDATE, DELETE, and stored procedure statements.

When designing your application, keep in mind that these two methods are mutually exclusive; therefore, apply the following rule when choosing between the two. If moving data into the destination data store is an insert-based scenario—always a complete refresh or add—implement the Transformation using DataPumpTask. If moving data into the destination data store is a query-based scenario—requiring UPDATE, DELETE, or stored procedures—implement the Transformation with DataDrivenQueryTask. Unless you implement DTS in complete refresh scenarios, most of your development work will use DataDrivenQueryTask.

Both DataPumpTask and DataDrivenQueryTask support the Lookup object, which lets you perform queries that use one or more named parameterized queries. The Lookup object lets a Transformation retrieve data from locations other than the immediate source or destination row being transformed. For example, you might use a Lookup if your source and destination data stores are SQL Server databases, but one of the values needed at the destination resides in an alternative location (e.g., Oracle, DB2, Excel). For each source row, the Lookup performs a query against the third-party location to retrieve the needed value.

The Interfaces
Microsoft provides two .DLL files that custom applications use to directly access DTS Packages and the data pump. First, the Microsoft DTSPackage Object Library (DTSPkg), implemented through DTSPKG.DLL, exposes an OLE Automation interface that you can use with any language that supports OLE Automation to create and modify DTS Packages. The second interface, the Microsoft DTSDataPump Scripting Object Library (DTSPump), which DTSPUMP.DLL implements, provides additional OLE Automation interfaces that developers can use to write directly to the data pump. The ActiveXScriptTask object uses the DTSPump interface. In addition, C and C++ developers can reference the header file DTSPUMP.H to directly interface with the data pump.

Although both interfaces—DTSPkg and DTSPump—let you create Packages and CustomTasks, you can implement Custom Transformations via C or C++ only by using DTSPUMP.H. Because DTS processes Transformations one row at a time, Microsoft felt the performance overhead of an OLE Automation interface would be too overwhelming. Therefore, Microsoft restricted Custom Transformations development to the native OLE DB interface that DTSPUMP.H includes.

Related Content:

ARTICLE TOOLS

Comments
  • Anonymous User
    7 years ago
    Apr 16, 2005

    should have elaborated a lil more

  • Brian Pregler
    9 years ago
    Jul 30, 2003

    I want to use a comma delimited list as my source and insert the data into an SQL server 2000 table. Here is my problem: The list is not always going to be named the same thing and I also need to run a SP during the insert to check and see if the data from the list already exists in the table. There will be many rows of data from the lists that will be inserted and many will not because they are already present in the table. Can you help me?

  • mickster
    11 years ago
    Apr 19, 2001

    straight outta msdn or what!

You must log on before posting a comment.

Are you a new visitor? Register Here

advertisement

advertisement

Windows is a trademark of the Microsoft group of companies. Windows IT Pro is used by Penton Media Inc. under license from owner.