Professional Microsoft SQL Server 2012 Integration Services
Buy Rights Online Buy Rights

Rights Contact Login For More Details

  • Wiley

More About This Title Professional Microsoft SQL Server 2012 Integration Services

English

An in-depth look at the radical changes to the newest release of SISS

Microsoft SQL Server 2012 Integration Services (SISS) builds on the revolutionary database product suite first introduced in 2005. With this crucial resource, you will explore how this newest release serves as a powerful tool for performing extraction, transformation, and load operations (ETL). A team of SQL Server experts deciphers this complex topic and provides detailed coverage of the new features of the 2012 product release. In addition to technical updates and additions, the authors present you with a new set of SISS best practices, based on years of real-world experience that have transpired since the previous edition was published.

Details the newest features of the 2012 SISS product release, which is the most significant release since 2005 Addresses the keys to a successful ETL solution, such as using the right enterprise ETL tool and employing the right ETL architecture in order to meet the system requirements Includes additional case studies and tutorial examples to illustrate advanced concepts and techniques

Professional Microsoft SQL Server 2012 Integration Services is a valuable resource that meets the demands and high expectations of experienced SSIS professionals.

English

Brian Knight is the founder of Pragmatic Works and co-founder of SQLServerCentral.com and JumpstartTV.com.

Erik Veerman is a consultant, author, trainer, and a mentor with SolidQ. His award-winning industry recognition includes Microsoft's BI Solution of the Year and SQL Server Magazine's Innovator Cup.

Jessica M. Moss is a well-known architect, speaker, author, and Microsoft MVP of SQL Server business intelligence who has created data warehousing solutions for a variety of industries.

Mike Davis, MCTS, MCITP, is the Managing Project Lead at Pragmatic Works. He has co-authored several business intelligence books and regularly speaks at SQL events.

Chris Rock is a software developer and program manager at Pragmatic Works, where he develops custom SSIS components and software.

English

INTRODUCTION xxix

CHAPTER 1: WELCOME TO SQL SERVER INTEGRATION SERVICES 1

SQL Server SSIS Historical Overview 2

What’s New in SSIS 2

Tools of the Trade 2

Import and Export Wizard 3

The SQL Server Data Tools Experience 4

SSIS Architecture 5

Packages 5

Control Flow 6

Data Flow 9

Variables 13

Parameters 14

Error Handling and Logging 14

Editions of SQL Server 14

Summary 15

CHAPTER 2: THE SSIS TOOLS 17

Import and Export Wizard 17

SQL Server Data Tools 23

Creating Your First Package 25

The Solution Explorer Window 26

The Toolbox 27

The Properties Windows 28

The SSIS Package Designer 28

Control Flow 29

Connection Managers 32

Variables 32

Data Flow 34

Parameters 34

Event Handlers 34

Package Explorer 35

Executing a Package 36

Management Studio 36

Summary 37

CHAPTER 3: SSIS TASKS 39

SSIS Task Objects 40

Using the Task Editor 40

The Task Editor Expressions Tab 40

SSDTCommon Properties 41

Looping and Sequence Tasks 42

Script Task (.NET) 43

Analysis Services Tasks 45

Analysis Services Execute DDL Task 45

Analysis Services Processing Task 46

Data Mining Query Task 46

Data Flow Task 47

Data Preparation Tasks 48

Data Profi ler 48

File System Task 50

Archiving a File 52

FTP Task 53

Getting a File Using FTP 54

Web Service Task 55

Retrieving Data Using the Web Service Task and XML Source Component 57

XML Task 60

Validating an XML File 62

RDBMS Server Tasks 64

Bulk Insert Task 64

Using the Bulk Insert Task 67

Execute SQL Task 68

Workfl ow Tasks 80

Execute Package Task 80

Execute Process Task 81

Message Queue Task 82

Send Mail Task 83

WMI Data Reader Task 84

WMI Event Watcher Task 86

Polling a Directory for the Delivery of a File 86

SMO Administration Tasks 87

Transfer Database Task 88

Transfer Error Messages Task 89

Transfer Logins Task 89

Transfer Master Stored Procedures Task 90

Transfer Jobs Task 91

Transfer SQL Server Objects Task 91

Summary 92

CHAPTER 4: CONTAINERS 93

Task Host Containers 93

Sequence Containers 94

Groups 95

For Loop Container 95

Foreach Loop Container 97

Foreach File Enumerator Example 98

Foreach ADO Enumerator Example 100

Summary 103

CHAPTER 5: THE DATA FLOW 105

Understanding the Data Flow 105

Data Viewers 106

Sources 106

OLE DB Source 107

Excel Source 109

Flat File Source 110

Raw File Source 114

XML Source 115

ADO.NET Source 115

Destinations 115

Excel Destination 116

Flat File Destination 116

OLE DB Destination 116

Raw File Destination 117

Recordset Destination 117

SQL Server and Mobile Destinations 118

Data Mining Model Training 118

DataReader Destination 118

Dimension and Partition Processing 118

Common Transformations 119

Synchronous versus Asynchronous Transformations 119

Aggregate 119

Conditional Split 121

Data Conversion 122

Derived Column 122

Lookup 123

Row Count 124

Script Component 125

Slowly Changing Dimension 126

Sort 126

Union All 127

Other Transformations 128

Audit 128

Character Map 129

Copy Column 130

Data Mining Query 130

DQS Cleansing 131

Export Column 131

Fuzzy Lookup 133

Fuzzy Grouping 138

Import Column 142

Merge 144

Merge Join 144

Multicast 145

OLE DB Command 145

Percentage and Row Sampling 147

Pivot Transform 147

Unpivot 150

Term Extraction 152

Term Lookup 156

Data Flow Example 157

Summary 160

CHAPTER 6: USING VARIABLES, PARAMETERS, AND EXPRESSIONS 161

Dynamic Package Objects 162

Variable Overview 162

Parameter Overview 162

Expression Overview 163

Understanding Data Types 164

SSIS Data Types 164

Date and Time Type Support 166

How Wrong Data Types and Sizes Can Affect Performance 167

Unicode and Non-Unicode Conversion Issues 167

Casting in SSIS Expressions 169

Using Variables and Parameters 170

Defi ning Variables 170

Defining Parameters 171

Variable and Parameter Data Types 172

Working with Expressions 173

C#-Like? Close, but Not Completely 174

The Expression Builder 175

Syntax Basics 176

Using Expressions in SSIS Packages 190

Summary 200

CHAPTER 7: JOINING DATA 201

The Lookup Transformation 202

Using the Merge Join Transformation 203

Contrasting SSIS and the Relational Join 203

Lookup Features 206

Building the Basic Package 207

Using a Relational Join in the Source 209

Using the Merge Join Transformation 211

Using the Lookup Transformation 216

Full-Cache Mode 216

No-Cache Mode 219

Partial-Cache Mode 220

Multiple Outputs 223

Expressionable Properties 226

Cascaded Lookup Operations 227

Cache Connection Manager and Cache Transform 229

Summary 231

CHAPTER 8: CREATING AN END-TO-END PACKAGE 233

Basic Transformation Tutorial 233

Creating Connections 234

Creating the Control Flow 237

Creating the Data Flow 237

Completing the Package 239

Saving the Package 239

Executing the Package 240

Typical Mainframe ETL with Data Scrubbing 241

Creating the Data Flow 242

Handling Dirty Data 242

Finalizing 246

Handling More Bad Data 247

Looping and the Dynamic Tasks 250

Looping 250

Making the Package Dynamic 250

Summary 252

CHAPTER 9: SCRIPTING IN SSIS 253

Introducing SSIS Scripting 253

Getting Started in SSIS Scripting 255

Selecting the Scripting Language 255

Using the VSTA Scripting IDE 256

Example: Hello World 257

Adding Code and Classes 259

Using Managed Assemblies 260

Example: Using Custom .NET Assemblies 261

Using the Script Task 264

Confi guring the Script Task Editor 265

The Script Task Dts Object 266

Accessing Variables in the Script Task 267

Connecting to Data Sources in a Script Task 271

Raising an Event in a Script Task 281

Writing a Log Entry in a Script Task 287

Using the Script Component 288

Diff erences from a Script Task 288

Confi guring the Script Component Editor 289

Accessing Variables in a Script Component 291

Connecting to Data Sources in a Script Component 292

Raising Events 292

Logging 293

Example: Data Validation 294

Synchronous versus Asynchronous 302

Essential Coding, Debugging, and Troubleshooting Techniques 305

Structured Exception Handling 305

Script Debugging and Troubleshooting 308

Summary 311

CHAPTER 10: LOADING A DATA WAREHOUSE 313

Data Profiling 315

Initial Execution of the Data Profiling Task 315

Reviewing the Results of the Data Profiling Task 317

Turning Data Profi le Results into Actionable ETL Steps 321

Data Extraction and Cleansing 322

Dimension Table Loading 322

Loading a Simple Dimension Table 323

Loading a Complex Dimension Table 327

Considerations and Alternatives to the SCD Transformation 335

Fact Table Loading 337

SSAS Processing 345

Using a Master ETL Package 350

Summary 351

CHAPTER 11: ADVANCED DATA CLEANSING IN SSIS 353

Advanced Derived Column Use 354

Text Parsing Example 355

Advanced Fuzzy Lookup and Fuzzy Grouping 357

Fuzzy Lookup 357

Fuzzy Grouping 363

DQS Cleansing 366

Data Quality Services 366

DQS Cleansing Transformation 370

Summary 373

CHAPTER 12: USING THE RELATIONAL ENGINE 375

Data Extraction 376

SELECT * Is Bad 376

WHERE Is Your Friend 377

Transform during Extract 378

Many ANDs Make Light Work 381

SORT in the Database 382

Modularize 384

SQL Server Does Text Files Too 385

Using Set-Based Logic 389

SQL Server Change Data Capture 391

Benefi ts of SQL Server CDC 392

Preparing CDC 393

Capture Instance Tables 394

The CDC API 396

Using the New SSIS CDC Tools 398

Querying CDC in SSIS 401

Data Loading 405

Database Snapshots 406

The MERGE Operator 408

Summary 411

CHAPTER 13: ACCESSING HETEROGENEOUS DATA 413

Excel and Access 415

64-Bit Support 415

Working with Excel Files 417

Working with Access 421

Importing from Oracle 427

Oracle Client Setup 428

Importing Oracle Data 429

Using XML and Web Services 431

Confi guring the Web Service Task 431

Working with XML Data as a Source 439

Flat Files 442

Loading Flat Files 443

Extracting Data from Flat Files 444

ODBC 447

Other Heterogeneous Sources 450

Summary 451

CHAPTER 14: RELIABILITY AND SCALABILITY 453

Restarting Packages 453

Simple Control Flow 455

Containers within Containers and Checkpoints 457

Variations on a Theme 459

Inside the Checkpoint File 461

Package Transactions 463

Single Package, Single Transaction 464

Single Package, Multiple Transactions 466

Two Packages, One Transaction 468

Single Package Using a Native Transaction in SQL Server 469

Error Outputs 471

Scaling Out 473

Architectural Features 474

Scaling Out Memory Pressures 474

Scaling Out by Staging Data 475

Scaling Out with Parallel Loading 479

Summary 485

CHAPTER 15: UNDERSTANDING AND TUNING THE DATA FLOW ENGINE 487

The SSIS Engine 488

Understanding the SSIS Data Flow and Control Flow 488

Handling Workfl ows with the Control Flow 491

Data Processing in the Data Flow 491

Memory Buff er Architecture 492

Types of Transformations 493

Advanced Data Flow Execution Concepts 501

SSIS Data Flow Design and Tuning 508

Data Flow Design Practices 508

Optimizing Package Processing 513

Troubleshooting Data Flow Performance Bottlenecks 516

Pipeline Performance Monitoring 518

Summary 520

CHAPTER 16: SSIS SOFTWARE DEVELOPMENT LIFE CYCLE 521

Introduction to Software Development Life Cycles 523

SDLCs: A Brief History 524

Types of Software Development Life Cycles 524

Versioning and Source Code Control 525

Subversion (SVN) 526

Team Foundation Server, Team System, and SSIS 533

Summary 547

CHAPTER 17: ERROR AND EVENT HANDLING 549

Using Precedence Constraints 549

Precedence Constraint Basics 550

Advanced Precedence Constraints and Expressions 551

Event Handling 557

Events 558

Using Event Handlers 560

Event Handler Inheritance 567

Breakpoints 569

Error Rows 572

Logging 576

Logging Providers 577

Log Events 577

Catalog Logging 582

Summary 584

CHAPTER 18: PROGRAMMING AND EXTENDING SSIS 585

The Sample Components 586

Component 1: Source Adapter 586

Component 2: Transform 587

Component 3: Destination Adapter 588

The Pipeline Component Methods 588

Design-Time Functionality 589

Runtime 593

Connection Time 594

Building the Components 596

Preparation 596

Building the Source Component 602

Building the Transformation

Component 614

Building the Destination Adapter 625

Using the Components 633

Installing the Components 633

Debugging Components 634

Design Time 635

Building the Complete Package 636

Runtime Debugging 637

Upgrading to SQL Server 2012 641

Summary 641

CHAPTER 19: ADDING A USER INTERFACE TO YOUR COMPONENT 643

Three Key Steps for Designing the UI: An Overview 644

Building the User Interface 644

Adding the Project 645

Implementing IDtsComponentUI 647

Setting the UITypeName 651

Building the Form 653

Extending the User Interface 658

Runtime Connections 658

Component Properties 661

Handling Errors and Warnings 663

Column Properties 665

Other UI Considerations 667

Summary 667

CHAPTER 20: EXTERNAL MANAGEMENT AND WMI TASK IMPLEMENTATION 669

External Management of SSIS with Managed Code 670

Setting Up a Test SSIS Package for Demonstration Purposes 670

The Managed Object Model Code Library 671

Catalog Management 672

Folder Management 673

Environments 674

The DTS Runtime Managed Code Library 676

SSIS Deployment Projects 676

Parameter Objects 677

Server Deployment 679

Executing SSIS Packages Deployed to the SSIS Catalog 680

EnvironmentReferences 681

Package Operations 682

Application Object Maintenance Operations 683

Package Operations 683

Package Monitoring 686

Project, Folder, and Package Listing 688

A Package Management Example 689

Package Log Providers 699

Specifying Events to Log 701

Programming to Log Providers 702

SQL Server 2012 Operation Logs 703

Package Confi gurations 705

Creating a Confi guration 706

Programming the Confi guration Object 707

Confi guration Object 708

Windows Management Instrumentation Tasks 709

WMI Reader Task Explained 709

WMI Data Reader Example 710

WMI Event Watcher Task 715

WMI Event Watcher Task Example 716

Summary 718

CHAPTER 21: USING SSIS WITH EXTERNAL APPLICATIONS 719

InfoPath Documents 720

ASP.NET Applications 727

Winform .NET Applications 731

Executing a Package with T-SQL 736

Summary 741

CHAPTER 22: ADMINISTERING SSIS 743

Using the SSIS Catalog 743

Setting the SSIS Catalog Properties 744

SSISDB 747

Deployment Models 748

Project Deployment Model 749

Package Deployment Model 751

Using T-SQL with SSIS 757

Executing Packages 757

Using Parameters 758

Querying Tables for Parameter Values 759

Using Environments 760

Using Data Taps 765

Creating a Central SSIS Server 766

Clustering SSIS 768

Package Confi guration 770

Command-Line Utilities 774

DTExec 774

DTExecUI 775

DTUtil 780

Security 782

Securing the SSIS Catalog 782

Legacy Security 785

Scheduling Packages 787

SQL Server Agent 787

Proxy Accounts 789

64-Bit Issues 790

Monitoring Package Executions 791

Built-in Reporting 791

Custom Reporting 795

Performance Counters 796

Summary 796

CHAPTER 23: CASE STUDY: A PROGRAMMATIC EXAMPLE 797

What You Will Take Away 798

Background 798

Business Problem 799

Solution Summary 799

Solution Architecture 801

Naming Conventions and Tips 804

Additional SSIS Tips Before You Start a Large Project 805

Data Architecture 805

File Storage Location Setup 806

Bank ACH Payments 806

Lockbox Files 807

PayPal or Direct Credits to the Corporate Account 808

Case Study Database Model 808

Database Setup 810

Case Study Load Packages 819

Bank File Load Package 819

ACH Load Package 845

E-mail Load Package 861

Testing 866

Case Study Invoice Matching Process 867

Matching Process Control Flow 867

Matching Process High-Confidence Data Flow 870

Matching Process Medium-Confidence Data Flow 875

Interpreting the Results 879

Creating a Parent Driver Package 880

Driver Package Setup 880

Summary 881

INDEX 883

loading