Friday 31 December 2010

Downloading all packages from your SQL Server

Case
A few month ago I had a job to add some adjustments to a SSIS project, but they lost the Visual Studio project file. So I created a empty project and started adding all the packages (over a 150) from SQL Server to the project one by one. After five packages I realised this could take ages. A while ago I created a package to upload all Visual Studio project files to SQL Server and thought I can do that vica versa!

Solution
I created a package with a single Script Task and used the SSIS API to download all packages to the Visual Studio SSIS project folder. After that I edited the Visual Studio SSIS project file (.dtproj, an XML file) to add the packages to the project itself.
// C# code
// Script to download packages from a SQL Server and add them
// to a Visual Studio SSIS Project file. The script assumes
// that the package name and package object name are equal.
using System;
using System.Data;
using System.Xml; // Added
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_fe8263e907c94504a4c41974a46fc623.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            /////////////////////////////////////////
            // The Project file and folder
            String ProjectFilePath = @"d:\Integration Services Project\Integration Services Project.dtproj";
            String ProjectFolderPath = ProjectFilePath.Substring(0,ProjectFilePath.LastIndexOf(@"\")+1);

            // The SQL Server
            String SSISServer = "Name_Of_Your_Server";

            /////////////////////////////////////////
            // Variable that contains the Visual Studio SSIS Project file, an XML document     
            XmlDocument doc = new XmlDocument();
            doc.Load(ProjectFilePath);

            // Get DTSPackages to add a package
            XmlNode DTSPackages = doc.SelectSingleNode("/Project/DTSPackages");

            /////////////////////////////////////////
            // Set application context
            Microsoft.SqlServer.Dts.Runtime.Application app = new Microsoft.SqlServer.Dts.Runtime.Application();

            // Create a package variable to temporary store the server packages
            Package serverPackage;

            /////////////////////////////////////////
            // Loop through packages and folders in the root: / (or change the path for your folder)
            foreach (PackageInfo serverPackageInfo in app.GetPackageInfos("/", SSISServer, null, null))
            {
                // Get only the packages. You could make a recursive function to loop through all folders
                if (serverPackageInfo.Flags.ToString().ToLower().Equals("package"))
                {
                    /////////////////////////////////////////
                    // Fill the package variable with a server package
                    serverPackage = app.LoadFromSqlServer(serverPackageInfo.Folder + serverPackageInfo.Name, SSISServer, null, null, null);

                    // Save the package to xml (.dtsx) in the project folder.
                    app.SaveToXml(ProjectFolderPath + serverPackageInfo.Name + ".dtsx", serverPackage, null);

                    /////////////////////////////////////////
                    // Dirty/lazy check to see if the package already exists in the project file (can't add them twice)
                    bool fireAgain = true;
                    if (DTSPackages.InnerXml.Contains(serverPackage.Name + ".dtsx"))
                    {
                        Dts.Events.FireInformation(0, "Download package", "Existing local package had been overwritten by Server package", string.Empty, 0, ref fireAgain);
                    }
                    else
                    {
                        /////////////////////////////////////////
                        // Add package to project xml file
                        // <dtspackage formatversion="3"></dtspackage>
                        XmlNode newDtsPackageNode = doc.CreateNode(XmlNodeType.Element, "DtsPackage", "");
                        XmlAttribute newDtsPackageAttribute = doc.CreateAttribute("FormatVersion");
                        newDtsPackageAttribute.Value = "3";
                        newDtsPackageNode.Attributes.Append(newDtsPackageAttribute);

                        // <name>Datastaging.dtsx</name>
                        XmlNode newDtsPackageNameNode = doc.CreateNode(XmlNodeType.Element, "Name", "");
                        newDtsPackageNameNode.InnerText = serverPackage.Name + ".dtsx";

                        // <fullpath>Datastaging.dtsx</fullpath>
                        XmlNode newDtsPackageFullPathNode = doc.CreateNode(XmlNodeType.Element, "FullPath", "");
                        newDtsPackageFullPathNode.InnerText = serverPackage.Name + ".dtsx";

                        // <references />
                        XmlNode newDtsPackageReferencesNode = doc.CreateNode(XmlNodeType.Element, "References", "");

                        // Add child nodes to <dtspackage>
                        newDtsPackageNode.AppendChild(newDtsPackageNameNode);
                        newDtsPackageNode.AppendChild(newDtsPackageFullPathNode);
                        newDtsPackageNode.AppendChild(newDtsPackageReferencesNode);

                        // Add <dtspackage> to <dtspackages>
                        DTSPackages.AppendChild(newDtsPackageNode);
                        Dts.Events.FireInformation(0, "Download package", "Package from server has been added to the project.", string.Empty, 0, ref fireAgain);
                    }
                }
            }
            /////////////////////////////////////////
            // Save project
            doc.Save(ProjectFilePath);
           
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}
I hard coded the project file and server name, but you can change that by using variables and configure those.

Thursday 30 December 2010

How to use variables in a Script Task

Case
Every now and then you need a variable in a script task. For instance to avoid a hardcoded path or connectionstring in your code. You can use Package Configuration to fill a variable and use that variable in your script task.

Solution
There are two ways to read and write variables in a Script task. I will show you both.

1) Create variables
Let's create to string variables called ReadVariable and WriteVariable.
Right mouse click in Control Flow












2) Script task
Add a script task to your package and give it a suitable name.
The Script Task
















3a) Simple version
Add ReadVariable to the ReadOnlyVariables and add WriteVariable to the ReadWriteVariables:
Let the script know what variables you want to use.



















Now the code is only one row:
// C# code
public void Main()
{
    // Fill WriteVariable with value from ReadVariable
    Dts.Variables["User::WriteVariable"].Value = Dts.Variables["User::ReadVariable"].Value;

    Dts.TaskResult = (int)ScriptResults.Success;
}

' VB.Net code
Public Sub Main()
    ' Fill WriteVariable with value from ReadVariable 
    Dts.Variables("User::WriteVariable").Value = Dts.Variables("User::ReadVariable").Value

    Dts.TaskResult = ScriptResults.Success
End Sub

3b) Advanced Version
In the advanced version you don't add the variables to the list. We will do that in the code:
// C# code
public void Main()
{
    // Lock variables
    Dts.VariableDispenser.LockForRead("User::ReadVariable");
    Dts.VariableDispenser.LockForWrite("User::WriteVariable");

    // Create a variables 'container' to store variables
    Variables vars = null;

    // Add variables from the VariableDispenser to the variables 'container'
    Dts.VariableDispenser.GetVariables(ref vars);

    // Now you can use the variables
    vars["User::WriteVariable"].Value = vars["User::ReadVariable"].Value;

    // Release the locks
    vars.Unlock();

    Dts.TaskResult = (int)ScriptResults.Success;
}

' VB.Net code
Public Sub Main()
    ' Lock variables 
    Dts.VariableDispenser.LockForRead("User::ReadVariable")
    Dts.VariableDispenser.LockForWrite("User::WriteVariable")

    ' Create a variables 'container' to store variables 
    Dim vars As Variables = Nothing

    ' Add variables from the VariableDispenser to the variables 'container' 
    Dts.VariableDispenser.GetVariables(vars)

    ' Now you can use the variables 
    vars("User::WriteVariable").Value = vars("User::ReadVariable").Value

    ' Release the locks 
    vars.Unlock()

    Dts.TaskResult = ScriptResults.Success
End Sub
You can even add a Try Catch Finally construnction to avoid unexpected errors (variable doesn't exists or is already locked). And with the finally you can release the locks even if your script fails, so you can still use the variable in for example an event handler.
// C# code
public void Main()
{
    // Create a variables 'container' to store variables 
    Variables vars = null;
    try
    {
        // Lock variables 
        Dts.VariableDispenser.LockForRead("User::ReadVariable");
        Dts.VariableDispenser.LockForWrite("User::WriteVariable");

        // Add variables from the VariableDispenser to the variables 'container' 
        Dts.VariableDispenser.GetVariables(ref vars);

        // Now you can use the variables 
        vars["User::WriteVariable"].Value = vars["User::ReadVariable"].Value;
    }
    catch (Exception ex)
    {
        // Throw an exception or add some logging
        throw ex;
    }
    finally
    {
        // Release the locks  (even if your script task fails)
        vars.Unlock(); 
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}

' VB.Net code
Public Sub Main()
    ' Create a variables 'container' to store variables 
    Dim vars As Variables = Nothing

    Try
        ' Lock variables 
        Dts.VariableDispenser.LockForRead("User::ReadVariable")
        Dts.VariableDispenser.LockForWrite("User::WriteVariable")

        ' Add variables from the VariableDispenser to the variables 'container' 
        Dts.VariableDispenser.GetVariables(vars)

        ' Now you can use the variables 
        vars("User::WriteVariable").Value = vars("User::ReadVariable").Value
    Catch ex As Exception
        ' Throw an exception or add some logging
        Throw ex
    Finally
        ' Release the locks (even if your script task fails)
        vars.Unlock()
    End Try

    Dts.TaskResult = ScriptResults.Success
End Sub

Which version is best? The result of both version is the same. The advanced version gives you a little more control over what is happening and at what stage the variables are locked and released, but the simple method is used more often in blogs and forums. I recommend using the simple method.

Note: Variables in a Script Component will be discussed at a later time.
Update: I wrote an SSIS WIKI about this where I also show how to use parameters in a Script Task

Wednesday 29 December 2010

Passing variables from a Parent Package To a Child

Case
Occasionally you need to pass some value from the parent package to the child package. Of cource you could temporary store it in a database or file, but that could be a little too much for a single value.

Solution
The easiest solution is to store the value in a variable in the parent package and use a script task in the child package to copy it.

1) Create parent variable
Create a variable in your parent package. Right click in the Control Flow if the variables are not visible. I used a string variable named FilePath in this test case to store some filepath.
Create new variable, think about the scope.











2) Create Execute Package Task
Drag an Execute Package Task in your Control Flow and configure it to start your child package.
Execute Package Task



















3) Create child variable
Goto to your child package and create a variable, but be sure not to use the same name as in the parent package. Otherwise it won't work! I used LocalFilePath in this case.


4) Script task
Drag a Script Task in the Control Flow of your Child package. Choose your script language (I will show both C# and VB.net for this short script) and select your Child package variable as a ReadWriteVariable and type (you can't select it) your Parent package variable as ReadOnlyVariable. The Scope (User::) is optional.
Language and Variables




















5) The Script
The simple version of this script is only 1 row of code. I will show the more advanced version at a later time.
' VB.Net code
Public Sub Main()
    ' Fill local variable with value from parent variable
    Dts.Variables("User::LocalFilePath").Value = Dts.Variables("User::FilePath").Value

    Dts.TaskResult = ScriptResults.Success
End Sub
// C# code
public void Main()
{
    // Fill local variable with value from parent variable
    Dts.Variables["User::LocalFilePath"].Value = Dts.Variables["User::FilePath"].Value;

    Dts.TaskResult = (int)ScriptResults.Success;
}
That's it. The child variable is now filled with the value of the parent variable.
The end result













The SSIS solution can be downloaded here.

Note: the child package cannot run by itself, because it will fail finding the parent variable. A simple solution is to change the Constraint behind the Script task from Success to Completion (see blue line in last picture).

Update: All the roads lead to Rome. So here is another road/solution without .Net Coding

Tuesday 28 December 2010

Start and stop batchfile services via a batchfile

Case
Constantly running all Microsoft SQL Services on your development desktop/laptop could be a little too much. Especially when you're not developing, but starting and stopping via the control panel is tiresome...

Solution
You can stop/start a service via the command (or even better, put it in a batchfile):
Syntax
      NET START [service]
      NET STOP [service]
      NET PAUSE [service]
      NET CONTINUE [service] 
   
Key
   service : The service name as shown in Control Panel, Services

Apparently it does not matter whether you use the Service name or the Display name. 
Service name or Display name















Start batchfile SQL 2005:
@ECHO OFF
NET START "SQL Server (MSSQLSERVER)"
NET START "SQL Server Agent (MSSQLSERVER)"
NET START "SQL Server Integration Services"
NET START "SQL Server Analysis Services (MSSQLSERVER)"
NET START "SQL Server Reporting Services (MSSQLSERVER)"

Stop batchfile SQL 2005:
@ECHO OFF
NET STOP "SQL Server (MSSQLSERVER)"
NET STOP "SQL Server Agent (MSSQLSERVER)"
NET STOP "SQL Server Integration Services"
NET STOP "SQL Server Analysis Services (MSSQLSERVER)"
NET STOP "SQL Server Reporting Services (MSSQLSERVER)"



Start batchfile SQL 2008:
@ECHO OFF
NET START "MSSQLSERVER"
NET START "SQL Server Agent (MsSqlServer)"
NET START "SQL Server Integration Services 10.0"
NET START "MSSQLServerOLAPService"
NET START "ReportServer"

Stop batchfile SQL 2008:
@ECHO OFF
NET STOP "MSSQLSERVER"
NET STOP "SQL Server Agent (MsSqlServer)"
NET STOP "SQL Server Integration Services 10.0"
NET STOP "MSSQLServerOLAPService"
NET STOP "ReportServer"

Note: It works the same for SQL 2012. Only the SSIS name has been changed to SQL Server Integration Services 11.0 and the rest of the names depend on the instance name you choose.

If you try to start a service that is already running you will get an error message like:
"The requested service has already been started. More help is available by typing NET HELPMSG 2182"

If you're running Windows Vista or Windows 7, your will have to start the batch file as an administrator or disable the User Account Control (UAC).
User Account Control (UAC)

Monday 27 December 2010

Microsoft SQL Server code-named 'Denali' - Community Technology Preview 1

CTP 1 of SQL 2012 (code-named 'Denali') has been released. After some installation problems I got it working on a VPC. Here is a first quick summary:

General
Undo & Redo
Finally! Need I say more:
They really work





Cut & Paste
Yes! If you paste with your mouse, the item will be pasted at your mouse position instead of somewhere below.

New icons
There are new icons and rounded corners for tasks and transformations
Rounded corners













Zoom
Very handy is the zoom function in the lower right corner which fades when not active.











Parameters
Parameters allow you to assign values to properties within packages at the time of package execution. More information.

Left Expression
The Right was always there, but now we even got a Left method in the string expression. In previous versions you need a Substring to accomplish this.
Left!















Project Build
If you build a project all project output is zipped in an .ispac file. I guess (didn't found it yet) there is some related deployment wizard to process this .ispac file.


Controlflow
SSIS Toolbox
They regrouped the items with some new categories. And new custom tasks and components should automatically appear in the toolbox.

Notice the description box





















Foreach Loop
Finally the Foreach File enumerator configuration is instantly visable (personal irritation).
SSIS 2008



















Execute Package
ReferenceType has been added to the Execute Package Task. Which you can use to select a package from your current project. If you choose External Reference, you will get the SQL Server / File system choice.
Package Reference Type


















And there also is a new parameter binding tab  to support the new parameter option (see parameters).

Dataflow
Flexible Authoring
It's now possible to edit a component even when its input path is disconnected.

Source & Destination Assistant
New in the dataflow are the Source Assistant and Destination Assistant. Feels a little bit like SSIS for Dummies... and there isn't a flatfile target.
They are extra.












After the OK, you will get the regular screens.











Grouping in dataflow
Very useful for large dataflows is the new collapsible grouping option.
Dataflow grouping













Data Correction
Very cool new data flow object, but not yet working. I'm very curious about this new item...

Pivot
I expected a better editor. Unfortunately no improvements yet...

Simplified Data Viewer
The Data Viewer can now be configured directly from the Data Flow Path editor. The data viewer only supports a grid view, the histogram and scatter plot views have been removed (but who has ever used them?).

Flexible XML Destination

Case
A client wants to export data to XML files, but SSIS has no XML destination. A script component could help him but he doesn't want to create a new script for each file/source just because the file/source lay-out is different.

Starting point
Col1Col2Col3
testTestTEST
blaBlaBLA
xxxXxxXXX

Desired situation
<?xml version="1.0" ?>
<ROOT>
 <ROW>
  <col1>test</col1>
  <col2>Test</col2>
  <col3>TEST</col3>
 </ROW>
 <ROW>
  <col1>bla</col1>
  <col2>Bla</col2>
  <col3>BLA</col3>
 </ROW>
 <ROW>
  <col1>xxx</col1>
  <col2>Xxx</col2>
  <col3>XXX</col3>
 </ROW>
</ROOT>

Solution
There is a request for an XML destination on the Microsoft site which you can support. There is an open source component: XML Destination and there are some third party components, but you can do it yourself with a Script component.
Let's create a flexible XML destination that is the same for each file/source. This could be resolved by reflection (again).

1) Source
Create a random source (flatfile / OLE DB).

2) Script component
Drag a script component to your data flow and select Destination. Connect it to the Source component.
Script component destination




















3) Input columns
Edit the Script components and select all input columns (or just the ones you need)
Select all columns as input



















4) Destination
Create a new connection named xmldocument in the Connection Managers tab. The file type should be FILE and create a new file.
FILE


















Create file











5) The Script
Now the script (C# and VB.net for this example). SSIS creates 3 methods which we will adjust and we need a fourth method to remove forbidden xml characters such as < and >.
// C# Code
// Flexible script that creates a XML document
// using the SSIS columns as nodes.
using System;
using System.Data;
using System.Reflection; // Added
using System.Xml;        // Added
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Variable that contains the XML document
    XmlTextWriter textWriter;

    // Start of XML document
    public override void PreExecute()
    {
        base.PreExecute();

        // Create a new XML document and use the filepath in the connection as XML-file
        textWriter = new XmlTextWriter(this.Connections.xmldocument.ConnectionString.ToString(), null);

        // Start writing the XML document: 
        textWriter.WriteStartDocument();

        // Create root element 
        textWriter.WriteStartElement("ROOT");
    }

    // Close of XML document
    public override void PostExecute()
    {
        base.PostExecute();

        // Close root element: 
        textWriter.WriteEndElement();

        // Stop writing the XML document
        textWriter.WriteEndDocument();

        // Close document
        textWriter.Close();
    }

    // Method that will be started for each record in you dataflow  
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Row type to get the value of a column
        Type rowType = Row.GetType();
        String columnValue = "";

        // Create row element: 
        textWriter.WriteStartElement("ROW");

        // Loop through all columns and create a column element: valuevalue
        foreach (IDTSInputColumn100 column in this.ComponentMetaData.InputCollection[0].InputColumnCollection)
        {
            // Use the SSIS column name as element name: 
            textWriter.WriteStartElement(column.Name);
            
            // Get column value, will fail if null
            try
            {
                columnValue = rowType.GetProperty(column.Name).GetValue(Row, null).ToString();
            }
            catch
            {
                // Default value for null values: "null", "" or null
                columnValue = "null";
            }
            finally
            {
                textWriter.WriteString(removeForbiddenXmlChars(columnValue));
               }
            // Close column element: 
            textWriter.WriteEndElement();
        }
        // Close row element: 
        textWriter.WriteEndElement();


        // Output the number of processed rows. 103 = RowsWritten
        this.ComponentMetaData.IncrementPipelinePerfCounter(103, 1); 
    }

    // Remove forbidden chars that could damage your XML document
    private string removeForbiddenXmlChars(string columnValue)
    {
        return columnValue.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");
    }
}

And the VB.net code
' VB.Net code
' Flexible script that creates a XML document 
' using the SSIS columns as nodes. 
Imports System
Imports System.Data
Imports System.Reflection       ' Added
Imports System.Xml              ' Added
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<microsoft.sqlserver.dts.pipeline.ssisscriptcomponententrypointattribute> _
<clscompliant(false)> _
Public Class ScriptMain
    Inherits UserComponent

    Dim textWriter As XmlTextWriter

    Public Overrides Sub PreExecute()
        MyBase.PreExecute()
        ' Create a new XML document and use the filepath in the connection as XML-file 
        textWriter = New XmlTextWriter(Me.Connections.xmldocument.ConnectionString.ToString(), System.Text.Encoding.Default)

        'Start writing the XML document:
        textWriter.WriteStartDocument()

        'Create root element 
        textWriter.WriteStartElement("ROOT")
    End Sub

    Public Overrides Sub PostExecute()
        MyBase.PostExecute()

        'Close root element: 
        textWriter.WriteEndElement()

        'Stop writing the XML document
        textWriter.WriteEndDocument()

        'Close document
        textWriter.Close()
    End Sub

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Row type to get the value of a column 
        Dim rowType As Type = Row.GetType()
        Dim columnValue As String = ""

        ' Create row element:  
        textWriter.WriteStartElement("ROW")

        ' Loop through all columns and create a column element: valuevalue 
        For Each column As IDTSInputColumn100 In Me.ComponentMetaData.InputCollection(0).InputColumnCollection


            ' Use the SSIS column name as element name:  
            textWriter.WriteStartElement(column.Name)

            ' Get column value, will fail if null 
            Try
                columnValue = rowType.GetProperty(column.Name).GetValue(Row, Nothing).ToString()
            Catch
                ' Default value for null values: "null", "" or null 
                columnValue = "null"
            Finally
                textWriter.WriteString(removeForbiddenXmlChars(columnValue))
            End Try

            ' Close column element:  
            textWriter.WriteEndElement()
        Next
        ' Close row element:  
        textWriter.WriteEndElement()
        ' Output the number of processed rows. 103 = RowsWritten 
        Me.ComponentMetaData.IncrementPipelinePerfCounter(103, 1)

    End Sub


    'Remove forbidden chars that could damage your XML document
    Private Function removeForbiddenXmlChars(ByVal columnValue As String) As String
        removeForbiddenXmlChars = columnValue.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;")
    End Function

End Class

Note: there are probably some more forbidden XML chars like ' (&apos;) and " (&quot;) that you need to replace.


UPDATE:
I have a newer version which doesn't use the ugly try catch for null values, but you have to change the .Net Framework version to 3.5.

// C# Code
// Flexible script that creates a XML document 
// using the SSIS columns as nodes. 
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Reflection;    // Added 
using System.Xml;           // Added 
using System.Linq;          // Added

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Variable that contains the XML document 
    XmlTextWriter textWriter;
    
    // Start of XML document 
    public override void PreExecute()
    {
        base.PreExecute();
        // Create a new XML document and use the filepath in the connection as XML-file 
        textWriter = new XmlTextWriter(this.Connections.xmldocument.ConnectionString.ToString(), null);
        // Start writing the XML document: 
        textWriter.WriteStartDocument();
        // Create root element  
        textWriter.WriteStartElement("ROOT");
    }

    // Close of XML document 
    public override void PostExecute()
    {
        base.PostExecute();
        // Close root element:  
        textWriter.WriteEndElement();
        // Stop writing the XML document 
        textWriter.WriteEndDocument();
        // Close document 
        textWriter.Close();
    }

    // Method that will be started for each record in you dataflow 
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Row type to get the value of a column 
        var properties = Row.GetType().GetProperties(BindingFlags.Instance | BindingFlags.Public);;

        String columnValue = "";
        // Create row element:  
        textWriter.WriteStartElement("ROW");
        // Loop through all columns and create a column element: valuevalue 

        foreach (var property in properties.Where(p => !p.Name.EndsWith("_IsNull", StringComparison.OrdinalIgnoreCase)))
        {
            // Use the SSIS column name as element name:  
            textWriter.WriteStartElement(property.Name);

            // Get column value, but that method will fail if empty
            // so first check if column value is null
            if (!CheckNull(Row, properties, property.Name))
            {
                // Get column value
                columnValue = property.GetValue(Row, null).ToString();
            }
            else
            {
                // Default value for null values: "null", "" or null 
                columnValue = "";
            }
            
            // write column value, but first remove forbidden chars
            textWriter.WriteString(removeForbiddenXmlChars(columnValue)); 

            // Close column element:  
            textWriter.WriteEndElement(); 
        }
        // Close row element:  
        textWriter.WriteEndElement();
        // Output the number of processed rows. 103 = RowsWritten 
        this.ComponentMetaData.IncrementPipelinePerfCounter(103, 1);
    }

    // Remove forbidden chars that could damage your XML document 
    private string removeForbiddenXmlChars(string columnValue)
    {
        return columnValue.Replace("&", "&").Replace("<", "<").Replace(">", ">");
    }

    // Method that checks wether a column value is empty.
    // It uses the standard boolean [columnName]_IsNull.
    private bool CheckNull(Input0Buffer Row, PropertyInfo[] properties, string propertyName)
    {
        var property = properties.Where(p => p.Name.Equals(propertyName + "_IsNull", StringComparison.OrdinalIgnoreCase)).FirstOrDefault();

        if (property != null)
        {
            if ((bool)property.GetValue(Row, null))
            {
                return true;
            }
        }
        return false;
    }
}

And the VB.net code
' VB.Net code
' Flexible script that creates a XML document 
' using the SSIS columns as nodes.
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.Reflection   ' Added 
Imports System.Xml          ' Added 
Imports System.Linq         ' Added
<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent

    ' Variable that contains the XML document 
    Dim textWriter As XmlTextWriter

    ' Start of XML document 
    Public Overrides Sub PreExecute()
        MyBase.PreExecute()
        ' Create a new XML document and use the filepath in the connection as XML-file 
        textWriter = New XmlTextWriter(Me.Connections.xmldocument.ConnectionString.ToString(), System.Text.Encoding.Default)
        'Start writing the XML document: 
        textWriter.WriteStartDocument()
        'Create root element  
        textWriter.WriteStartElement("ROOT")
    End Sub

    ' Close of XML document
    Public Overrides Sub PostExecute()
        MyBase.PostExecute()
        'Close root element:  
        textWriter.WriteEndElement()
        'Stop writing the XML document 
        textWriter.WriteEndDocument()
        'Close document 
        textWriter.Close()
    End Sub

    ' Method that will be started for each record in you dataflow
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Row type to get the value of a column 
        Dim properties = Row.[GetType]().GetProperties(BindingFlags.Instance Or BindingFlags.[Public])

        Dim columnValue As [String] = ""
        ' Create row element:  
        textWriter.WriteStartElement("ROW")
        ' Loop through all columns and create a column element: valuevalue 

        For Each [property] As Object In properties.Where(Function(p) Not p.Name.EndsWith("_IsNull", StringComparison.OrdinalIgnoreCase))
            ' Use the SSIS column name as element name:  
            textWriter.WriteStartElement([property].Name)

            ' Get column value, but that method will fail if empty
            ' so first check if column value is null
            If Not CheckNull(Row, properties, [property].Name) Then
                ' Get column value
                columnValue = [property].GetValue(Row, Nothing).ToString()
            Else
                ' Default value for null values: "null", "" or null 
                columnValue = ""
            End If

            ' write column value, but first remove forbidden chars
            textWriter.WriteString(removeForbiddenXmlChars(columnValue))

            ' Close column element:  
            textWriter.WriteEndElement()
        Next
        ' Close row element:  
        textWriter.WriteEndElement()
        ' Output the number of processed rows. 103 = RowsWritten 
        Me.ComponentMetaData.IncrementPipelinePerfCounter(103, 1)
    End Sub

    'Remove forbidden chars that could damage your XML document 
    Private Function removeForbiddenXmlChars(ByVal columnValue As String) As String
        removeForbiddenXmlChars = columnValue.Replace("&", "&").Replace("<", "<").Replace(">", ">")
    End Function

    ' Method that checks wether a column value is empty.
    ' It uses the standard boolean [columnName]_IsNull.
    Private Function CheckNull(ByVal Row As Input0Buffer, ByVal properties As PropertyInfo(), ByVal propertyName As String) As Boolean
        Dim [property] = properties.Where(Function(p) p.Name.Equals(propertyName & "_IsNull", StringComparison.OrdinalIgnoreCase)).FirstOrDefault()

        If [property] IsNot Nothing Then
            If CBool([property].GetValue(Row, Nothing)) Then
                Return True
            End If
        End If
        Return False
    End Function
End Class

Sunday 26 December 2010

Can't edit or create new SSIS 2008 solution after installing SQL Denali CTP 1

Case
Finally had some time to install Denali (on my Windows 7 machine with SQL 2008 already running). Unfortunately I found out that after the installation of SQL 2012, I couldn't edit my existing SSIS 2008 projects and I couldn't create a new one. Other project types didn't have any problems but SSIS did:
Aargh!













Solution
Max Trinidad found a solution to get SSIS 2008 working again. He had the same problem with SSIS 2008 R2.
Short version:
1) Remove SQL 2012
2) Remove BIDS 2008
3) Restart
4) Add BIDS 2008
5) Install SQL 2012 on a different PC or VPC!

For detailed information go to the blog of Max Trinidad found.

* Update: see release notes *

BIDS Is Not Supported in Side-By-Side and Upgrade Scenarios
Business Intelligence Development Studio (BIDS) is not supported in the following situations:
  • Microsoft SQL Server Code-Named “Denali” CTP1 BIDS is installed on the same server as an earlier release of MS SQL Server BIDS.
  • An earlier release of Microsoft SQL Server with BIDS is upgraded to Microsoft SQL Server Code-Named “Denali” CTP1.

Uploading all packages from your Visual Studio Project

Case
I want to upload all SSIS packages in my local Visual Studio Project folder to the SQL development server at once.

Solution
You can use the SSIS api for that!

1) Foreach Loop
At a Foreach Loop Container and give it a suitable name:
Foreach Loop Container














2) File Enumerator
Use a Foreach File Enumerator to loop through the (bin) folder of your project. (Unfortunately you can't get the folder name of a package, so you will have enter the path in the foreach loop manually.) Select only the *.dtsx files and use Fully qualified to retrieve the filename.
All packages from your VS project



















3) Variable
Go to the Variable Mapping tab and select <New Variable...>. Create a new String variable with the name localPackageFile. After creating the new variable the Index is set to the default 0.
New String variable to store the path



















4) Script Task
Drag a Script task into the Foreach loop and give it a suitable name.
Script task













5) Pass through variable
Select the new create String variable localPackageFile as a ReadOnlyVariable on the first tab of the Script task.
Select the variable



















6) The script
We will use C# in this example. Hit the Edit Script button and edit the Main method. This script will upload (and overwrite) all packages.
public void Main()
{
    // Set application context
    Microsoft.SqlServer.Dts.Runtime.Application app = new Microsoft.SqlServer.Dts.Runtime.Application();

    // Get local Package with path in variable
    Package localPackage = new Package();
    localPackage = app.LoadPackage(Dts.Variables["User::localPackageFile"].Value.ToString(), null);

    // Upload if the server version isn't newer
    app.SaveToSqlServer(localPackage, null, "Name_Of_Your_Server", null, null);

    // Log information about uploading
    bool fireAgain = true;
    Dts.Events.FireInformation(0, "UploadScript", "Package '" + localPackage.Name.ToString() + "' has been uploaded.", string.Empty, 0, ref fireAgain);

    Dts.TaskResult = (int)ScriptResults.Success;
}


7) More advanced script
Or use a little more advanced script that checks the Package version before uploading, so you don't overwrite newer packages with older versions.
public void Main()
{
    // Set application context
    Microsoft.SqlServer.Dts.Runtime.Application app = new Microsoft.SqlServer.Dts.Runtime.Application();

    // Get local Package with path in variable
    Package localPackage = new Package();
    localPackage = app.LoadPackage(Dts.Variables["User::localPackageFile"].Value.ToString(), null);

    // Create server Package to compare versions before upload
    Package serverPackage = new Package();
    try
    {
        // Try to find a package with the same name on the server
        serverPackage = app.LoadFromSqlServer("\\" + localPackage.Name, "Name_Of_Your_Server", null, null, null);
    }
    catch
    {
        // Ignore error that package doesn't exist on server
    }

    // Compare versions
    if ((localPackage.VersionMajor >= serverPackage.VersionMajor) && (localPackage.VersionMinor >= serverPackage.VersionMinor) && (localPackage.VersionBuild >= serverPackage.VersionBuild))
    {
        // Upload if the server version isn't newer
        app.SaveToSqlServer(localPackage, null, "Name_Of_Your_Server", null, null);

        // Log information about uploading
        bool fireAgain = true;
        Dts.Events.FireInformation(0, "UploadScript", "Package '" + localPackage.Name.ToString() + "' has been uploaded.", string.Empty, 0, ref fireAgain);
    }
    else
    {
        // Log warning that server version was newer
        Dts.Events.FireWarning(0, "UploadScript", "Package '" + localPackage.Name.ToString() + "' has not been uploaded.", String.Empty, 0);
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}

You can add some checks/messages yourself. Like, don't upload the upload package itself.
The result on the progress tab















Let me know what your solution is or what kind of extra checks you built in.

Saturday 25 December 2010

Continuously watching files with WMI Event Watcher Task - Stopping the infinite loop

Case
Recently I explained how create an infinite loop with a WMI Event Watcher Task in it to watch for new files throughout the day. That solution had one small disadvantage: Not only the loop runs infinite, but the package does too... and some people don't want that. They want to end the package at the end of the day and restart it the next day.
Previous solution





















Solution
We will stop the package just before midnight and restart it again at midnight.

1) Add timeout to File Watcher
You can't interrupt the loop and end the package because the WMI Event Watcher Task is still watching out for new files. It will only stop if you drop a file in the folder, but you can also add a time-out to continue the control flow. Let's add a 14 minute time-out (840seconds) and let the control flow continue without errors:
  • Set ActionAtTimeout to Log the time-out
  • Set AfterTimeout to Return with success
  • Set Timeout to 840 seconds
Timeout and continue




















2) Change loop EvalExpression
After the timeout from the WMI Event Watcher task, the Foreach loop won't find any files. So the loop restarts.
To stop looping just before midnight, you have to change the EvelExpression of the loop to:
GETDATE() < DATEADD("Mi",45,DATEADD("Hh",23,(DT_DBDATE)GETDATE()))
Continue while GetDate is smaller than today at 23:45


















Note: You can finetune the timeout and the endtime of the loop to create a smaller window.

3) Schedule package
Schedule the package to start each day at midnight.


All roads lead to Rome... So let me know what your solution is.

Friday 24 December 2010

Development Best Practices

Case
As an external employee I see a lot of SSIS packages at various companies made by a whole bunch of different people. Unfortunately some of those people made Quick & Dirty as a motto in life resulting in hard to read packages. And that's a waste of time for the companies.

Solution
Companies should require both well performing and well documented packages. Here is a list of some basic development Best Practices to achieve clear and manageable packages.


1) No default names and descriptions
Rename all default component names and give them explaining descriptions. This will help other developers that edit your packages. It is also very useful when debugging.
No default names and descriptions


















2) Annotations
Use annotations. This is very useful if the Control Flow or Data Flow isn't self describing (for others).
Use annotations
















3 Group logical work
Use Sequence containers to organize package structures into logical units of work. This makes it easier to identify what the package does. It also helps to control transactions if they are being implemented. * Update: SSIS 2012 has a grouping feature *
Use Sequence Containers

















4 Flow directions
Flows should basically go top-down. This will make your packages more readable.
Design your package Top down















You can use the Auto-format option from SSIS to format your packages
Auto Layout is a good start













5) Disabled Control Flow tasks
Do not use disabled Control Flow tasks in the Quality assurance or Production environment. If you want to conditionally execute a task at runtime use expressions on your precedence constraints. Do not use an expression on the “Disable” property of the task.
Disabled Control Flow Task



















6) Spread large number of packages over serveral Visual Studio Project
You can add more than one projects to your Visual Studio Solution to spread large number of packages. Think about a proper layout. For example a datastaging project and a datawarehouse project.


7) Queries in source and look up components
Don't use too complex queries. Use a readable lay-out and add comments to explain parts of the query. For example:
-- This query does something 
SELECT    a.field1
,         a.field2
,         b.field3
,         b.field4
FROM      table1 as a
LEFT JOIN table2 as b
          on a.field5 = b.field6
WHERE     a.field2 = 'x' -- Comment about x
ORDER BY  a.field1

8) Script Coding Conventions
Use condings conventions when scripting a script task or component. C# and VB.Net both have their own conventions which are widely available on the net.

9) Use naming conventions
Give tasks and transformations a prefix. This makes it easier to read the logging.

10) Use templates
You can create templates for SSIS. Things like logging, configurations and connection managers can be added to these templates.

Let me known if you have items that should be in the list of Development Best Practices!

Thursday 23 December 2010

Sorting packages

Case
Microsoft finally added sorting in Managment Studio 2008 (SSMS) and Visual Studio 2008, but for those poor developers who are still working with 2005 (me, a couple of weeks ago) where it wasn't a standard feature, there are couple of solutions to accomplish an alphabetical sorted list of packages.
Sorting in Visual Studio 2008















Solution
Sorting in Managment Studio 2005
The sorting is done by a store procedure named sp_dts_listpackages. You can alter this procedure to accomplish a sorted list. The bottommost row has been added. Your can find the stored procedure in de MSDB database under System Stored Procedures.
USE [msdb]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER OFF
GO
ALTER PROCEDURE [dbo].[sp_dts_listpackages]
  @folderid uniqueidentifier
AS
  SELECT
      name,
      id,
      description,
      createdate,
      folderid,
      datalength(packagedata),
      vermajor,
      verminor,
      verbuild,
      vercomments,
      verid
  FROM
      sysdtspackages90
  WHERE
      [folderid] = @folderid
  ORDER BY NAME -- Added this row

That's better!
 









Sorting in Visual Studio 2005
You can do that manually by editing the .dtproj file and sort the xml nodes, but someone made a tool for it which adds some handy buttons to Visual Studio:




The tool can be found at:
http://www.sqldbatips.com/showarticle.asp?ID=78

Wednesday 22 December 2010

Performance Best Practices

Case
A client of mine had some performance issues with couple of SSIS packages and because they lack basic SSIS knowledge, they just upgraded there server with more memory. Finally, after 32GB of memory, they stopped upgrading and start reviewing there packages.

Solution
There are a lot of blogs about SSIS Best Practices (for instance: SSIS junkie). Here is the top 10 of the easy to implement but very effective ones I showed them to 'upgrade' their packages instead of the memory.

1) Unnecessary columns
Select only the columns that you need in the pipeline to reduce buffer size and reduce OnWarning events at execution time. SSIS even helps you by showing the unnecessary ones in the Progress/Execution Results Tab: [DTS.Pipeline] Warning: The output column "Address1" (16161) on output "Output0" (16155) and component "CRM clients" (16139) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.
Unnecessary columns from a flat file




















2) Use queries instead of tables
Following on the unnecessary columns, always use a SQL statement in an OLE DB Source component or (Fuzzy) Lookup component rather than just selecting a table. Selecting a table is akin to "SELECT *..." which is universally recognised as bad practice.
OLE DB Source, use SQL Command instead of Table














Lookup, use SQL Command instead of Table














3) Use caching in your LOOKUP
Make sure that the result of your lookup is unique, otherwise SSIS cannot cache the query and executes it for each record passing the lookup component. SSIS will warn you for this in the Progress/Execution Results Tab: [Lookup Time Dimension [605]] Warning: The component "Lookup Time Dimension" (605) encountered duplicate reference key values when caching reference data. This error occurs in Full Cache mode only. Either remove the duplicate key values, or change the cache mode to PARTIAL or NO_CACHE.

Watch out that you are not grabbing too many resources in the lookup. A couple of million records is probably not a good idea. And new is SSIS 2008 is that you can reuse your lookup cache in an other lookup.
SSIS 2008: Cache



















4) Filter in source
Where possible filter your data in the Source Adapter rather than filter the data using a Conditional Split transform component. This will make your data flow perform quicker because the unnecessary records don't go through the pipeline.
Filter in OLE DB Source, filter data in source














5) Sort in source
A sort with SQL Server is faster than the sort in SSIS, partly because SSIS does the sort in memory. So it pays to move the sort to a source component (where possible). Note you have to set IsSorted=TRUE on the source adapter output, but setting this value does not perform a sort operation; it only indicates that the data it sorted. After that change the SortKeyPosition of all output columns that are sorted.
Advanced Editor for Source, sort data in source















6) Join in source
Where possible, join data in the Source Adapter rather than using the Merge Join component. SQL Server does it faster than SSIS. But watch out that you are not making to complex queries because that will worsen the readability.

Unnecessary Join and Sorts















7) Group in source
Where possible, aggregate your data in the Source Adapter rather than using the Aggregate component. SQL Server does it faster than SSIS.

Unnecessary Sorts, Join and Aggregate














8) Beware of Non-blocking, Semi-blocking and Fully-blocking components in general
The dataflow consists of three types of transformations: Non-blocking, Semi-blocking and Fully-blocking. And as the names suggests, use Semi-blocking and Fully-blocking components rarly to optimize your packages. Jorg Klein has written a interesting article about it with a list of which component is non-, semi- or fully blocking.

A summary of how to recognize these three types:

Non-blocking
Semi-blocking
Fully-blocking
Synchronous/asynchronous
Synchronous
Asynchronous
Asynchronous
Number of rows in equal to rows out
True
Usually False
Usually False
Collect all input before the can output
False
False
True
New buffer created?
False
True
True
New thread created?
False
Usually True
True
Find more information about (a)synchronous at Microsoft.


9) High Volumes of Data and indexes
Loading high volumes of data on a table with clustered and non-clustered indexes could take a lot of time.
The most important thing to verify is if all indexes are really used. SQL Server 2005 and 2008 provide information about index usage with to views: sys.dm_db_index_operational_stats and sys.dm_db_index_usage_stats. Drop all rarely used and unused indexes first. Experience teaches that there are often a lot of unnecessary indexes. If you are absolute sure that all remaining indexes are necessary you can drop all indexes before loading the data and to recreate them afterwards. The performance profit of that depends on the number of records. The higher the number of records the more profit you gain.
Drop and recreate indexes






















10) SQL Server Destination Adapter vs OLE DB Destination Adapter
If your target database is a local SQL server database, the SQL Server Destination Adapter will perform much better than the OLE DB Destination Adapter. However the SQL Server Destination Adapter works only on a local machine and via Windows security. You have to be absolute sure that your database stays local in the future otherwise you mapping will not work when moving the database.


Note: this is not a complete list, but just a top 10 of easy to implement but very effective ones. Tell me if you have items that should be in the top 10 of Performance Best Practices!

Note: Besides the Performance Best Practice there also is a Development Best Practice.
Related Posts Plugin for WordPress, Blogger...