Subscribe to Windows IT Pro
December 07, 2009 12:00 AM

SSIS Logging Best Practices

What information should be logged, where should it be logged, and how should it be logged?
SQL Server Pro
InstantDoc ID #103213
Rating: (16)

One thing software developers often forget after we write brilliant code is that someone will need to actually operate the applications we create. No matter how spectacular the code, we can’t anticipate everything that developers and users might change in an application.

Issues concerning system changes, data changes, and errors are particularly acute in any extraction, transformation, and loading (ETL) work. ETL code, by definition, must bridge two or more systems. Those systems change over time. Even without new development, upgrades, and patches, changes can cause problems. In addition, the user community continually adds data to source systems. Data that’s allowed on the source system could violate a business rule set on the target system, causing your elegantly designed ETL code to throw up like a frat boy on his 21st birthday.

There are two laws of ETL development that are as inviolate as the laws of thermodynamics. They are:

  • Source systems change constantly. No one involved in the changes will ever tell the ETL developers that the change is coming until after the change has been made and your extractions fail. This is usually accompanied by someone with a "C" in their title (as in CIO) getting upset over data marts not having data and indiscriminately lopping off various appendages.
  • Users continually mess up data input. They do it in new and varied ways that boggle the mind and make no logical sense. You’ll need to deal with all of these issues while management screams at you that they need their reports. You’ll do this until someone takes pity on you and you, blessedly, retire a shattered and broken human being.

Therefore, we need to design systems that can recover from these errors. It also must provide sufficient information to the appropriate people so that they can address these issues on an ongoing, operational basis. This raises the questions of what should be logged and, sometimes, where it should be logged, versus how you should log it. Broadly speaking, there are three main areas for logging:

  • Things that go bump in the night
  • WTF (as in "Where’s this From?" …. Get your minds out of the gutter)
  • Event driven logging, or Audit logs

Let’s take a look at these three logging areas in a bit more depth.

Things That Go Bump in the Night
Things can go bump at any time, but the typical ETL job runs late in the evening or during the early morning hours. That’s when data is extracted from a transactional system and loaded into a data mart or data warehouse for analysis, or when different systems are synchronized during a maintenance window. I’m referring specifically to unrecoverable errors. As good developers, we need to provide the operations staff with enough information for them to find and fix the problem. As self-centered megalomaniacs (we leave out megalomaniacs who aren’t self-centered), we need to provide this data so that we aren’t getting phone calls in the early morning hours or on the weekend asking us to come in and fix a broken system.

These problems should be (but often aren’t) rare. For example, it’s difficult to imagine how one can programmatically recover from someone kicking the power cord out of the source system’s server. We simply need to build our code with a nod to the fact that it runs on a machine and machines occasionally break.

When these events occur, the error needs to be raised in such a way that it is visible to the folks who can begin to address the problem. So the first thing required here is an interface for the operations staff to view the status of each job as it runs. Let me say the following in big bold letters, so I'm very clear—NEVER USE EMAIL TO REPORT ERRORS!

Why? First, email lists for notification purposes are rarely, if ever, maintained or kept up-to-date. Second, everyone, without exception, puts rules on their email accounts to route such email messages to some folder. No matter what this folder is named, the name means "Ignore Until Management Calls." The result is the error is ignored until you have annoyed users. Third, bad things come in threes. If the first thing bad is that the application died, the second thing bad is that the person who gets the email about it is unavailable (e.g., out sick, on vacation, in some remote location with no cell phone coverage). The third bad thing will be the 2AM call you get to come in to fix the problem.

Related Content:

ARTICLE TOOLS

Comments
  • Bhudev
    2 years ago
    Mar 05, 2010

    This is what, I was looking for. I'm working one of major DW requirements on SSIS, where counting of packages can cross 100 mark. Source can be Web Site, Structural DB tables, Flat files & other formats, which will come from multiple applications. My manager asked to give common framework for Exceptional Handling & which will be better out of Exceptional Handling using SSIS Features vs Using Web Services or using C# APIs.

    As you have discribed why & why not in details, it was really helpfull to me.

You must log on before posting a comment.

Are you a new visitor? Register Here

advertisement

advertisement

Windows is a trademark of the Microsoft group of companies. Windows IT Pro is used by Penton Media Inc. under license from owner.