Webinar: Getting Started with Change Tracking in SQL Server

Start your summer off right by brushing up on a highly effective change detection technique! We will be hosting a webinar, Getting Started with Change Tracking in SQL Server, on Friday, June 8th at 11:00am CDT.

In this webinar, I’ll walk you through the essentials of change tracking in SQL Server: what it is, why it’s important, and how it fits into your data movement strategy. I’ll also run demos to give you realistic examples of how to use change tracking.

Registration is free and is open now. I hope to see you there!

Data Lineage Tracking in ETL Processes

Anyone who has worked in data integration for a significant length of time has almost certainly been confronted with the following question:

“Where did this data come from?”

Confronting this question brings to the front burner the need for establishing data lineage tracking in ETL processes.

Data Lineage Tracking in ETL Processes

Extract, transform, and load (ETL) operations frequently involve the movement and consolidation of data from multiple sources through numerous transformation steps before being routed to its final resting place. As the ETL touchpoints increase in number and complexity, so does the difficulty in tracking the data back to its origins. The concept of data lineage in ETL is intended to make this process easier. Data lineage tracking involves building ETL elements in such a way that each row of data in the destination tables is unambiguously traceable back to its source, including any transformation process through which it passed during its travels.
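To make the idea concrete, here is a minimal sketch of row-level lineage using Python's built-in sqlite3 module as a stand-in for a real destination database. The table name, the lineage column names (source_system, source_key, load_id, transform_step), and the sample data are all assumptions I made up for illustration; a real design would pick columns to suit its own sources and tooling.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory destination table that carries lineage columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE dim_customer (
            customer_name  TEXT NOT NULL,
            -- Lineage columns (hypothetical names, for illustration only)
            source_system  TEXT NOT NULL,
            source_key     TEXT NOT NULL,
            load_id        INTEGER NOT NULL,
            transform_step TEXT NOT NULL
        )
        """
    )
    return conn


def load_rows(conn, load_id, source_system, rows):
    """Apply a trivial transformation and stamp each row with its lineage."""
    for source_key, raw_name in rows:
        cleaned = raw_name.strip().title()
        conn.execute(
            "INSERT INTO dim_customer VALUES (?, ?, ?, ?, ?)",
            (cleaned, source_system, source_key, load_id, "trim_and_title_case"),
        )
    conn.commit()


def trace(conn, customer_name):
    """Answer 'where did this data come from?' for one destination value."""
    return conn.execute(
        "SELECT source_system, source_key, load_id, transform_step "
        "FROM dim_customer WHERE customer_name = ?",
        (customer_name,),
    ).fetchall()


conn = build_warehouse()
load_rows(conn, 1, "crm", [("C-100", "  ada lovelace ")])
load_rows(conn, 2, "billing", [("B-7", "grace hopper")])
print(trace(conn, "Ada Lovelace"))
```

The point of the sketch is only that the lineage travels with the row: every destination record names the system and key it came from, the load that brought it in, and the transformation it passed through.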

Building ETL processes that include data lineage tracking takes extra work and careful planning. Sadly, the majority of ETL processes I’ve found in the wild do not have provisions for capturing data lineage. Because data lineage tracking does not add to the core functions of extraction, transformation, and loading, this design is often skipped during architecture or build. Much like the related ETL best practices of logging and auditing, data lineage is a typically unseen yet still valuable component of a well-designed ETL architecture.

Why Use Data Lineage Tracking?

Proper ETL data lineage tracking brings several benefits:

  • Trustworthiness. When the origin of each row of data and the path it took to arrive is systematically tracked, users and administrators of the data will have more reason to trust the data.
  • Easier troubleshooting. When the ETL data path is self-describing, it makes testing and troubleshooting far easier.
  • Expose leaky processes. When each row of data is trackable from source to destination, it helps to reveal any holes in the process where data might be lost.
  • Visibility. I can’t count the number of times I have been asked by a client to help them find and document business rules in their ETL processes. A side benefit of data lineage tracking is that the places where the business rules hide become more evident.
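As a rough illustration of the "leaky processes" point, the sketch below compares the keys extracted from the sources with the lineage keys stamped on the destination rows. The function name find_leaks and the (source_system, source_key) pairs are hypothetical; in practice this comparison would more likely be a query against the source and destination tables.

```python
def find_leaks(extracted_keys, destination_lineage):
    """Return source rows that never arrived in the destination.

    Both arguments are iterables of (source_system, source_key) pairs:
    the first from the extract step, the second read back from the
    lineage columns on the destination rows.
    """
    arrived = set(destination_lineage)
    return sorted(key for key in set(extracted_keys) if key not in arrived)


# Hypothetical sample data: one CRM row is lost somewhere in the pipeline.
extracted = [("crm", "C-100"), ("crm", "C-101"), ("billing", "B-7")]
loaded = [("crm", "C-100"), ("billing", "B-7")]

print(find_leaks(extracted, loaded))
```

Without lineage columns on the destination rows, this kind of check is limited to comparing row counts, which can tell you that rows went missing but not which ones.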

Not all load mechanisms require data lineage tracking. Very simple processes, those that load volatile tables, and some single-use, “throwaway” code do not have the same need for tracking data lineage. However, these are the exception rather than the rule.

Conclusion

Data lineage tracking in ETL processes is a best practice for most loads. Although it takes time to properly design and implement this pattern, the value gained is almost always worth the effort.

DevConnections 2016

I am excited to share that I will be presenting at the DevConnections conference in Las Vegas in October of this year. This year I will present one workshop and two regular presentations:

Building Better SSIS Packages – Full-day workshop (Monday, October 10)

Making the Most of the SSIS Catalog (Tuesday, October 11)

Change Detection in SQL Server (Thursday, October 13)

I’ll be spending most of the week in the city, so if you’re attending let me know! I’d be happy to meet up and chat.

Introducing the Pinch Hit Service

I am happy to announce the launch of a new service designed to help with very short term consultation needs. Although most consulting engagements are weeks or months in duration, we’ve discovered that some client needs are simple and do not require a traditional consulting approach. In response to this need, Tyleris has created the Pinch Hit service as a simple, no-commitment, 2-hour remote consultation.

The Pinch Hit was created to assist clients who are handling their own data warehousing, ETL, and reporting infrastructure. They may be looking for a second set of eyes to look at a problem, assistance with troubleshooting a specific problem, or a focused training session. Much like the use of a pinch hitter in baseball, Tyleris brings a specialized skillset to help deal with a clutch situation.

Not every business or technical need is suitable for this service, but in cases where the problem domain is narrow, the Pinch Hit can deliver outstanding value in a short time. If you find yourself in need of a Pinch Hit engagement from Tyleris, just let us know how we can help.

Request a Pinch Hit

SSIS Classroom Training – Boston and Denver

For those looking for classroom training in SSIS, I’ve got an exciting announcement: I have a brand new course entitled “Building Better SSIS Packages” which I’ll be delivering in Denver and Boston this fall. Here’s a brief intro to this course:

There’s nothing magical about building rock-solid SSIS packages, but it does take some discipline, experience, and a library of best practices. That is exactly the aim of this course: to demonstrate a set of proven practices that help frame the development of enterprise-ready SSIS packages.

In this full-day presentation, we will walk through each of these five facets of well-built packages, discussing and then demonstrating ways of applying these practices to design better SSIS packages.

I’ll be teaching this course in Denver, Colorado on Friday, September 18th, and again in Boston, Massachusetts on Friday, October 16th. Registration is open for both courses.