The process for designing SQL Server Integration Services (SSIS) packages is typically iterative. You start by getting the components working individually or in small sets, then concentrate on ensuring that the components will work in the correct sequence. During later iterations, you add in more components or adjust properties to perform error handling. Then, in a final pass, you might add in abstractions, taking advantage of variables and expressions to enable runtime changes for your package. But your work is not yet complete at this stage.
Before you put your package into production, you need to take some more time to review your package with an eye toward preventing, or at least mitigating, performance problems. Sooner or later, the business environment is likely to change, requiring you to perform the same work within a shorter time span or to process higher data volumes than originally planned. I’ll describe areas within a package that can cause performance problems and offer suggestions for modifying the package design to get better performance.
Bear in mind that there are various factors that can affect the performance of SSIS packages. Several of these factors are completely external to SSIS, such as the source systems that you’re using for data extractions, disk speed and configuration, NICs, network bandwidth, and the amount of memory available on the server executing the packages. I won’t address how to resolve bottlenecks caused by external factors, but rather concentrate on specific strategies you can take within the design of your packages. I assume that you already have a general understanding of SSIS architecture and package development practices.
Understanding Control Flow Performance
Every SSIS package has at least one task in the control flow. If you add multiple tasks to the control flow, you can direct the sequence of tasks by adding precedence constraints to connect a series of tasks from beginning to end. You can even group these tasks together in containers. Apart from the Data Flow Task, which I’ll describe in more detail later, the performance of each individual task depends on the external systems with which the task must communicate to get its job done. Therefore, the only way within the package to speed up control flow processing is to have tasks or containers (collectively called executables) run in parallel. Figure 1 illustrates parallelism in Container A and sequential workflow in Container B. In this example, the containers themselves also execute in parallel.

Figure 1: Running executables in parallel to speed up control flow processing
The package property that determines how many executables can run in parallel is MaxConcurrentExecutables, as shown in Figure 2. The default is -1, which means that the control flow engine will use the number of logical processors on the server plus 2. For example, if you’re executing the package with the default setting on a quad-core server, you can have up to six executables running in parallel.

Figure 2: Using the MaxConcurrentExecutables property to configure how many executables can run in parallel
In some cases, you won’t see much change if you try to increase the MaxConcurrentExecutables value. However, one scenario with potential for improvement is when you have a package with tasks that must wait for a response from external systems and your server is dedicated to executing packages. In that case, you might be able to boost parallelism by replacing the MaxConcurrentExecutables value with a higher number. Start by increasing the value to the number of processors plus 3, then test the package in Business Intelligence Development Studio (BIDS) to see whether the number of tasks executing in parallel increased. If so, continue to increment the property value by 1, test, and repeat until no further parallelization is possible, or you have no more tasks to run in parallel.
Understanding Data Flow Performance
The Data Flow Task is used to retrieve data from one or more sources, optionally change the structure or content of the data in transit, and send the data to one or more destinations. In business intelligence (BI) applications, this task is used to perform extraction, transformation, and loading (ETL) operations.
SSIS uses a pipeline engine with an in-memory buffer architecture to efficiently manage Data Flow Task operations. Performance of the pipeline engine largely depends on the number of records moving through the pipeline and the number of buffers required to accomplish the transformations and to move the data to the destinations. Therefore, to optimize performance, you need to understand how data flow components and the Data Flow Task properties affect pipeline throughput and buffer requirements.