Tuesday, 29 August 2017

Azure Data Lake Store in SSIS

Case
Microsoft just released a new Azure Feature Pack for SSIS with ADLS connectivity. What's new?
[Image: Azure Data Lake Store File System Task]

Solution
It contains four new items and a new connection manager:
- Azure Data Lake Store File System Task
- Foreach ADLS File Enumerator
- Azure Data Lake Store Source
- Azure Data Lake Store Destination

Azure Data Lake Store File System Task
At the moment this task only allows you to upload files to, or download files from, Azure Data Lake Store. This is similar to the Azure Blob Upload / Download Task. In the near future new operations will be added; a delete file or delete folder operation would be a handy addition.

1) Start
First download and install the new Azure Feature Pack. Then check the Azure Folder in the SSIS Toolbox and drag the Azure Data Lake Store File System Task to the surface. Give it a suitable name.
[Image: Azure Data Lake Store File System Task]

2) Operation and source
Edit the new task and select an Operation. For this example I will use the CopyToADLS operation. Then we first need to specify where the files are located on the local machine. This is a hardcoded path, but it can be overridden with an expression. The FileNamePattern is a wildcard filter with ? or *; I use *.csv to upload all CSV files in that folder. SearchRecursively allows you to also find files in subfolders.
[Image: Specify local source]

3) Destination - Connection manager
Next we need to create a new ADLS connection manager or select an existing one.
[Image: ADLS Connection Manager]

As host you can use the URL property from the ADLS Overview page. Go to the Azure Portal and copy that URL.
[Image: URL = ADLS Host]

For this example I will use the easier authentication option: Azure AD User Identity. It uses your email address and password from Azure. The Azure AD Service Identity option will be handled in a later post.
[Image: ADLS Connection Manager]

When hitting OK or Test Connection, an Azure login page will open where you need to log in and confirm that SSIS can connect to that ADLS.

4) Destination - ADLS folder
Next we need to specify a folder name or path. You can either specify the name of an existing folder or a new folder name that will be created at execution time. To find out which folders already exist, you can use the Data Explorer page in ADLS.
[Image: Data Explorer]

[Image: Specify Folder]

The FileExpiry option lets you specify the date that will be used to expire the files in ADLS. You can leave it empty to never expire them.

5) The result
Now run the task/package and use the Data Explorer in ADLS to see the actual result.
[Image: Data Explorer]

Foreach ADLS File Enumerator
The Foreach ADLS File Enumerator is a new enumerator for the Foreach Loop Container. It allows you to loop through an ADLS folder and return the paths of the files. It is very similar to the Azure Blob Enumerator. You can use this enumerator with the Azure Data Lake Store Source in the Data Flow Task.

1) Select Enumerator
When you select the ADLS File Enumerator, you need to specify the Connection Manager (see step 3 of the task above), the remote folder (use the Data Explorer to find an existing folder), and then the wildcard and the Search recursive option.
[Image: Collection]

2) Variable Mappings
In the Variable Mappings pane you need to map the first item of the collection (zero based) to an SSIS string variable.
[Image: Variable Mappings]

3) The Result
To show the content of the variable during execution, I added a simple Script Task and a little C# code: MessageBox.Show(Dts.Variables["User::filepath"].Value.ToString());
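
For completeness, the Main method of that Script Task could look like this minimal sketch (it assumes a string variable named User::filepath is listed under ReadOnlyVariables):
public void Main()
{
    // Show the current file path mapped by the Foreach ADLS File Enumerator
    MessageBox.Show(Dts.Variables["User::filepath"].Value.ToString());

    Dts.TaskResult = (int)ScriptResults.Success;
}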
[Image: MessageBox.Show]

Azure Data Lake Store Source
This allows you to use files from the Azure Data Lake Store as a source in SSIS. Again very similar to the Azure Blob Source.

1) Edit Source
Drag the Azure Data Lake Store Source to the surface and give it a suitable name. Then edit the source and specify the connection manager, file path and format. You cannot specify the data type or size; in this first test everything became (DT_WSTR,100).
[Image: Azure Data Lake Store Source]

2) The Result
To test the result (with a very small file) I added a dummy Derived Column and a Data Viewer.
[Image: Azure Data Lake Store Source]

Azure Data Lake Store Destination
This allows you to stream your Data Flow Task data to Azure Data Lake Store. Again very similar to the Azure Blob Destination.

1) Edit Destination
Add an Azure Data Lake Store Destination after your source or transformation and give it a suitable name. Then you can specify the connection manager, file path and file format option.
[Image: ADLS destination - test3.csv]

2) The Result
To test it, run the package and open the Data Explorer in ADLS to see the result.
[Image: Data Explorer]

Conclusion
A much needed ADLS extension of the Azure Feature Pack, but nothing spectacular compared to the Blob Storage items already in this Feature Pack. Hopefully the Azure Data Lake Store File System Task will soon be extended with new actions, and perhaps an Azure Blob Storage File System Task could be introduced as well.

Wednesday, 23 August 2017

Where is the new SSIS project type?

Case
I just downloaded the new SSDT for Visual Studio 2017 (15.3.0 preview) and although I selected Integration Services during installation it isn't showing in the New Project window.
[Image: SSDT for Visual Studio 2017 (15.3.0 preview)]

Solution
You probably already had Visual Studio 2017 installed and added the Analysis Services and Reporting Services project types via Extensions and Updates.


1) Remove projects
Go to Extensions and Updates and remove both the Analysis Services and Reporting Services projects.
[Image: Remove SSAS and SSRS projects]

After that, close Visual Studio and let the VSIX installer uninstall both projects.
[Image: VSIX (un)installer]

2) Uninstall SSDT and Install SSDT
First uninstall SSDT completely and then install it again (a Repair didn't work).
[Image: Uninstall and install SSDT]

3) Create new SSIS Project
Now open Visual Studio and create a new SSIS project.
[Image: New SSIS Project]

Wednesday, 26 July 2017

Active Directory as Source

Case
I want to use the users from my Active Directory in my Data Warehouse for an Employee Dimension. How can I accomplish that in SSIS?

Solution
With a little .NET scripting in a Script Component you can create an Active Directory source in your Data Flow Task. Before you start, you need to figure out which fields are available in your Active Directory. If you don't know the field names, you could set a breakpoint on the foreach loop and add a watch on the result variable. Then you can browse through the properties of that variable to find all available fields.
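
Alternatively, you can list the field names in code. A minimal debugging sketch (it assumes the foreach loop from step 6 and, for the MessageBox, a reference to System.Windows.Forms):
// Collect all available Active Directory field names of one search result
string fieldNames = "";
foreach (string propertyName in result.Properties.PropertyNames)
{
    fieldNames += propertyName + Environment.NewLine;
}
// Show the collected field names while debugging
MessageBox.Show(fieldNames);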

1) Script Component Source
Create a Data Flow Task and add a Script Component, choosing "Source" as the type. After that, give your source a suitable name like "SRC_SCR - Active Directory Users".
[Image: Script Component - Source]

2) Output columns
Next, edit the Script Component and go to the Inputs and Outputs pane. This is where we need to specify all the new output columns and data types. For my Active Directory example I have added 5 string (DT_STR) columns: Fullname, Firstname, Surename, Department and Manager. You have to figure out the required length. You could start with the default 50 or change it to a safer 255.
[Image: Add new output columns]

3) Edit script
Go back to the first page, choose C# as your scripting language and then hit the Edit Script button to open the VSTA environment.
[Image: Edit script]

4) Reference
To tell our script about Active Directory, we first need to add a reference to the .NET assembly System.DirectoryServices. In the Solution Explorer, right-click References and choose Add Reference... Then scroll down, select System.DirectoryServices and click OK to confirm.
[Image: Add reference]

5) The Script - Namespaces
Under the namespaces region, add a new using for our newly referenced assembly: using System.DirectoryServices;. This allows you to shorten the code below.
[Image: Add using]

6) The Script - CreateNewOutputRows
Now replace your CreateNewOutputRows method with the one below and add the GetPropertyValue method inside your class, under the existing methods. You could remove the PostExecute and PreExecute methods since we don't need them in this example.

The CreateNewOutputRows method adds new rows to the buffer by calling Output0Buffer.AddRow(); within a loop construction and then fills all columns with data from the Active Directory. Before the foreach loop you find the code that connects to the Active Directory with an optional filter.
public override void CreateNewOutputRows()
{
    // Specify the connection string of your domain
    // @mycompany.com => LDAP://DC=mycompany,dc=com
    // Consider using a variable or parameter instead
    // of this hardcoded value. On the other hand,
    // how often does your domain change?
    string domainConnectionString = "LDAP://DC=ilionx,dc=com";

    using (DirectorySearcher ds = new DirectorySearcher(new DirectoryEntry(domainConnectionString)))
    {
        ds.Filter = "(&" +
                    "(objectClass=user)" +  // Only users and not groups
                    "(department=*)" +      // All departments
                    "(givenname=j*)" +      // Firstname starts with j
                    ")";
        // See 'ds.' in IntelliSense for more options, like PageSize.
        //ds.PageSize = 1000;

        // Find all persons matching your filter
        using (SearchResultCollection results = ds.FindAll())
        {
            // Loop through all rows of the search results
            foreach (SearchResult result in results)
            {
                // Add a new row to the buffer
                Output0Buffer.AddRow();
                // Fill all columns with the value from the Active Directory
                Output0Buffer.Fullname = GetPropertyValue(result, "cn");
                Output0Buffer.Surename = GetPropertyValue(result, "sn");
                Output0Buffer.Department = GetPropertyValue(result, "department");
                Output0Buffer.Manager = GetPropertyValue(result, "manager");
                Output0Buffer.Firstname = GetPropertyValue(result, "givenname");
            }
        }
    }
}

// Extra method to avoid having an if construction around each column
// It checks whether it can find a value. If not it returns an empty string.
private static string GetPropertyValue(SearchResult Results, string Property)
{
    // Null value results in count zero
    if (Results.Properties[Property].Count > 0)
    {
        return Results.Properties[Property][0].ToString();
    }
    else
    {
        return "";
    }
}
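
The filter uses standard LDAP syntax, so it is easy to adjust. A couple of hedged examples (the group name is made up):
// Only users with a non-empty email address
ds.Filter = "(&(objectClass=user)(mail=*))";

// Only direct members of a specific (hypothetical) AD group
ds.Filter = "(&(objectClass=user)(memberOf=CN=BI-Team,OU=Groups,DC=ilionx,DC=com))";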


7) The result
To test the result add a dummy Derived Column and a Data Viewer to see the values. If it works you can add the rest of the components to the Data Flow Task to fill your dimension table.
[Image: The result of my script]

Summary
A very simple and short script to get data from your Active Directory. For more filter examples visit
MSDN. For very large Active Directories you have to play with the PageSize property.
Note that there are also scriptless options, such as an ADO.NET source or third-party components. Each solution has its own pros and cons.




Sunday, 25 June 2017

PowerShell Virtual Group - PowerShell ❤ SSIS

Last week I had the honor to speak for the PowerShell Virtual Group about the combination of SSIS and PowerShell. As promised, here is my PowerPoint of that session and a list of all my PowerShell scripts for SSIS:
    And on my other blog I have a couple of PowerShell scripts available for Azure which can be executed as runbooks in Azure Automation. And if you have an hour available, you can watch the entire session on YouTube.

    Thursday, 1 June 2017

    Read content of Object variable

    Case
    I am filling an Object variable with an Execute SQL Task and I want to use it in a Foreach Loop Container (Foreach ADO Enumerator), but the Foreach Loop stays empty. So I want to check the value of my Object variable. However debugging the package does not show me the value of Object variables. How can I see the content of my Object variable?

    [Image: No (readable) value for Object variables]

    Solution
    A solution could be to use a Script Task after the Execute SQL Task to show the content of the Object variable. The script below shows the top (x) records in a MessageBox. The code doesn't need any changes; the only thing you could consider changing is the number of records to show in the MessageBox (see the C# variable maxRows).
    [Image: Getting content of Object variable]

    1) Add a Script Task
    Add a new Script Task to the surface of your Control Flow and connect it to your Execute SQL Task. Then edit the Script Task to provide one Object variable in the property ReadOnlyVariables or ReadWriteVariables. This should of course be the same Object variable as in your Execute SQL Task.
    [Image: Provide one Object variable]

    2) Edit Script
    Make sure to select Microsoft Visual C# as Script Language and then hit the Edit Script button to open the VSTA environment. Then first locate the namespaces region and add a using for System.Data.OleDb.
    #region Namespaces
    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    using System.Data.OleDb;    // Added
    #endregion
    

    Then scroll down, locate the Main method "public void Main()" and replace it with the code below.
    public void Main()
    {
        // Show max number of data rows in a single MessageBox
        int maxRows = 3;
    
        /////////////////////////////////////////////////////////////////////
        // No need to change lines below
        /////////////////////////////////////////////////////////////////////
    
        // Create a table object to store the content of the object variable
        OleDbDataAdapter dataAdapter = new OleDbDataAdapter();
        DataTable myTable = new DataTable();
    
        // Create message string to show the content of the object variable
        string message = "";
        string header = "Error";
    
        // Five checks before looping through the records in the object variable
        ////////////////////////////////////
        // 1) Is a variable provided?
        ////////////////////////////////////
        if (Dts.Variables.Count.Equals(0))
        {
            message = "No read-only or read-write variables found";
        }
        ////////////////////////////////////
        // 2) Multiple variables provided
        ////////////////////////////////////
        else if(Dts.Variables.Count > 1)
        {
            message = "Please provide only 1 read-only or read-write variable";
        }
        ////////////////////////////////////
        // 3) Is it an object variable?
        ////////////////////////////////////
        else if (!Dts.Variables[0].DataType.ToString().Equals("Object"))
        {
            message = Dts.Variables[0].Name + " is not an Object variable";
        }
        ////////////////////////////////////
        // 4) Is it null or not a table?
        ////////////////////////////////////
        else
        {
            try
            {
                // Try to fill the datatable with the content of the object variable
                // It will fail when it is null or not containing a table object.
                dataAdapter.Fill(myTable, Dts.Variables[0].Value);
            }
            catch
            {
            // Failing the fourth check
                message = Dts.Variables[0].Name + " doesn't contain a usable value";
            }
        }
    
        ////////////////////////////////////
        // 5) Does it contain records?
        ////////////////////////////////////
        if (myTable.Rows.Count > 0)
        {
            int j = 0;
            // Loop through all rows in the dataset but don't exceed the maxRows
            for (j = 0; j < myTable.Rows.Count && j < maxRows; j++)
            {
                // Get all values from a single row into an array
                object[] valuesArray = myTable.Rows[j].ItemArray;
    
                // Loop through value array and columnnames collection
                for (int i = 0; i < valuesArray.Length; i++)
                {
                    message += myTable.Rows[j].Table.Columns[i].ColumnName + " : " + valuesArray[i].ToString() + Environment.NewLine;
                }
                // Add an empty row between each data row
                message += Environment.NewLine;
            }
    
            // Create header
            header = "Showing " + j.ToString() + " rows out of " + myTable.Rows.Count.ToString();
        }
        else if (!message.Equals(""))
        {
            // Don't do anything
            // Record count is 0, but another validation already failed
        }
        else
        {
            // Record count is 0
            message = Dts.Variables[0].Name + " doesn't contain any rows";
        }
    
        // Show message with custom header
        MessageBox.Show(message, header);
    
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    
    Now close the VSTA environment and click OK in the Script Task editor to finish it.


    3) The result
    Now run the package to see the result. I tried to make it a bit monkey-proof by adding some checks in the code. If you provide a good and filled variable, it will show the data. Otherwise it will show an error telling you what's wrong.
    [Image: The result]

    Sunday, 14 May 2017

    Import and export SSIS Catalog Environments with JSON

    Case
    I want to import and export Environments to/from my SSIS Catalog. Doing it manually in SSMS takes ages. How can you do that more quickly?
    [Image: I want to export this environment]

    Solution
    A while ago I created a couple of PowerShell scripts to deploy environments to your SSIS Catalog with a CSV file, database table or an array as source. The script below is a follow-up that allows you to export one or more environments as JSON files, but it also has an import method to deploy those exported environments to a Catalog: Get-CatalogEnvironment and Set-CatalogEnvironment.

    This is an example of how you execute the two methods. It first imports a separate script file with the various methods and then you can execute either the Get or the Set method:
    # PowerShell code
    # If you have trouble executing a PowerShell script due to the Execution Policy
    # then run the following script. Note that you need to run PowerShell as administrator.
    # More information: https://technet.microsoft.com/nl-nl/library/ee176961.aspx
    # Set-ExecutionPolicy Unrestricted 
    
    # Include functions from a secondary file
    . "$PSScriptRoot\Ssisfunctions.ps1"
    
    # Download example
    Get-CatalogEnvironment -SsisServer "mySqlServer\myInstance" -ExportPath "c:\backup\" -FolderName MyEnvFolder -EnvironmentName MyEnvName -Verbose
    
    # Upload example
    Set-CatalogEnvironment -SsisServer "mySqlServer\myInstance" -FolderName MyEnvFolder -EnvironmentName MyEnvName -ImportFilePath "C:\temp\employees.json" -DeleteExistingEnvironment $true -Verbose
    
    [Image: Example of execution]

    The environment JSON files look like this:
    [
        {
            "Name":"FolderStageFiles",
            "Description":"Location of stage files",
            "Type":"String",
            "Sensitive":false,
            "Value":"d:\\sources\\"
        },
        {
            "Name":"FtpPassword",
            "Description":"Secret FTP password",
            "Type":"String",
            "Sensitive":true,
            "Value":"$3cr3t"
        },
        {
            "Name":"MIS_STG_Connectionstring",
            "Description":"Connectionstring to stage database",
            "Type":"String",
            "Sensitive":false,
            "Value":"Data Source=.\\sql2016;Initial Catalog=MIS_STG;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
        },
        {
            "Name":"NumberOfRetries",
            "Description":"Number of retries for Webservice Task",
            "Type":"Int16",
            "Sensitive":false,
            "Value":3
        }
    ]
    

    You can also get detailed help information and instructions with the standard PowerShell Get-Help cmdlet. It allows you to see examples or to see which parameters are mandatory or optional.
    # PowerShell code
    # Getting help about the commands
    Get-Help Set-CatalogEnvironment -detailed
    Get-Help Get-CatalogEnvironment -example
    

    And this is the content of the Ssisfunctions.ps1 file containing the various methods. Take a look and let me know if you have any improvements:
    # PowerShell code: Ssisfunctions.ps1 (v0.1)
    <#
    .Synopsis
        Download one or more environments from an SSIS Catalog as JSON files
    
    .DESCRIPTION
        This function allows you to download an Environment from the SSIS Catalog. By leaving out the folder name or environment name you can also download
        multiple files. All files are downloaded as JSON files in the format [FolderName].[EnvironmentName].json
        Example file of export:
    
        [
           {
              "Name":"FolderStageFiles",
              "Description":"Location of stage files",
              "Type":"String",
              "Sensitive":false,
              "Value":"d:\\sources\\"
           },
           {
              "Name":"FtpPassword",
              "Description":"Secret FTP password",
              "Type":"String",
              "Sensitive":true,
              "Value":null
           },
           {
              "Name":"MIS_STG_Connectionstring",
              "Description":"Connectionstring to stage database",
              "Type":"String",
              "Sensitive":false,
              "Value":"Data Source=.\\sql2016;Initial Catalog=MIS_STG;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
           },
           {
              "Name":"NumberOfRetries",
              "Description":"Number of retries for Webservice Task",
              "Type":"Int16",
              "Sensitive":false,
              "Value":3
           }
        ]
    
    .PARAMETER SsisServer
        Mandatory: The name of the SQL Server instance that runs the SSIS Catalog
    
    .PARAMETER FolderName
        Optional: The name of the Catalog folder that contains the Environment
    
    .PARAMETER EnvironmentName
        Optional: The name of the Environment
    
    .PARAMETER ExportPath
        Optional: The fully qualified path where the json files will be saved. Default value: c:\temp\
    
    .PARAMETER Verbose
        Optional: Get more logging information on the screen
    
    .EXAMPLE
        Get-CatalogEnvironment -SsisServer "myServer\myInstance" -ExportPath "c:\backup\" -FolderName myCatalogFolder -EnvironmentName myEnvironmentName
    
    .EXAMPLE
        Get-CatalogEnvironment -SsisServer "myServer\myInstance" -ExportPath "c:\backup\" -Verbose
    
    .NOTES
        You cannot get the value of sensitive variables. The value will be NULL in the export file. 
        The current script works for SSIS 2016. Change the version number in the code to use another version of SSIS.
    
    .LINK 
        https://microsoft-ssis.blogspot.com/
    #>
    Function Get-CatalogEnvironment
    {
        [CmdletBinding()]
        Param
        (
            [Parameter(Mandatory=$true, Position=0)]
            [ValidateLength(1,50)] # String must be between 1 and 50 chars long
            [string]$SsisServer,
    
            [Parameter(Mandatory=$false, Position=1)]
            [ValidateLength(1,128)] # String must be between 1 and 128 chars long 
            [string]$FolderName,
    
            [Parameter(Mandatory=$false, Position=2)]
            [ValidateLength(1,128)] # String must be between 1 and 128 chars long 
            [string]$EnvironmentName,
    
            [Parameter(Mandatory=$false, Position=3)]
            [string]$ExportPath = "C:\temp\" 
        )
    
        # Don't continue after error
        $ErrorActionPreference = "Stop"
    
        #################################################
        ############## SHOW ALL PARAMETERS ##############
        #################################################
        Write-Verbose "========================================================="
        Write-Verbose "==      Used parameters - Get-CatalogEnvironment       =="
        Write-Verbose "========================================================="
        Write-Verbose "SSISServer              : $($SsisServer)"
        Write-Verbose "FolderName              : $($FolderName)"
        Write-Verbose "EnvironmentName         : $($EnvironmentName)"
        Write-Verbose "ExportPath              : $($ExportPath)"
        Write-Verbose "========================================================="
    
     
        #################################################
        ############### ADD SSIS ASSEMBLY ###############
        #################################################
        # Change assembly version number to use an other SSIS version
        # 13.0.0.0 = SSIS 2016
        # 12.0.0.0 = SSIS 2014
        # 11.0.0.0 = SSIS 2012
        $SsisNamespace = "Microsoft.SqlServer.Management.IntegrationServices"
        Add-Type -AssemblyName "$($SsisNamespace), Version=13.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"
     
     
        #################################################
        ############ CONNECT TO SSIS SERVER #############
        #################################################
        # First create a connection to SQL Server
        $SqlConnectionstring = "Data Source=$($SsisServer);Initial Catalog=master;Integrated Security=SSPI;"
        $SqlConnection = New-Object System.Data.SqlClient.SqlConnection $SqlConnectionstring
     
        # Then use that SQL connection to create an
        # Integration Services object.
        $IntegrationServices = New-Object $SsisNamespace".IntegrationServices" $SqlConnection
    
        # Check if connection succeeded
        If (!$IntegrationServices)
        {
            Throw [System.Exception] "Failed to connect to server $($SsisServer)"
        }
        Else
        {
            Write-Verbose "Connected to server $($SsisServer)"
        }
     
     
        #################################################
        ########### CONNECT TO SSIS CATALOG #############
        #################################################
        # Create object for SSISDB Catalog
        $Catalog = $IntegrationServices.Catalogs["SSISDB"]
     
        # Check if the SSISDB Catalog exists
        If (!$Catalog)
        {
            # Catalog does not exist.
            Throw [System.Exception] "SSISDB catalog does not exist"
        }
        Else
        {
            Write-Verbose "SSISDB catalog found"
        }
    
    
        #################################################
        ############## CHECK EXPORT FOLDER ##############
        #################################################
        # Check if folder exists
        If (-Not (Test-Path $ExportPath))
        {
            # Create new folder
            New-Item -ItemType directory -Path $ExportPath | Out-Null
            Write-Host "Folder created: " $ExportPath
        }
        Else
        {
            Write-Verbose "Folder $($ExportPath) found"
        }
    
    
        #################################################
        ############# LOOP THROUGH FOLDERS ##############
        #################################################
        # Loop through all folders or filter on a folder name
        Foreach ($Folder in $Catalog.Folders | WHERE {$_.Name -eq $FolderName -or (!$FolderName)})
        {
            # Loop through all environments or filter on an environment name
            Foreach ($Environment in $Folder.Environments | WHERE {$_.Name -eq $EnvironmentName -or (!$EnvironmentName)})
            {
                Write-Host "Exporting $($ExportPath)$($Folder.Name).$($Environment.Name).json"
                $Environment.Variables | Select-Object -Property Name,Description,@{Name='Type';Expression={"$($_.Type)"}},Sensitive,Value | ConvertTo-Json -Compress | Out-File "$($ExportPath)$($Environment.Parent.Name).$($Environment.Name).json"
    
                # Show warnings if the environment contains sensitive variables
                $Environment.Variables | Select-Object -Property Name,Sensitive | Where {$_.Sensitive -eq $True} | ForEach-Object {
                    Write-Warning "Variable $($_.Name) is sensitive. Cannot retrieve its value"
                }
            }
        }
    
    }
    
    <#
    .Synopsis
        Upload a json environment file to an SSIS Catalog
    
    .DESCRIPTION
        This function allows you to upload an Environment to the SSIS Catalog. It can update (no deletes) or replace an existing environment.
        Example file which can be imported:
    
        [
           {
              "Name":"FolderStageFiles",
              "Description":"Location of stage files",
              "Type":"String",
              "Sensitive":false,
              "Value":"d:\\sources\\"
           },
           {
              "Name":"FtpPassword",
              "Description":"Secret FTP password",
              "Type":"String",
              "Sensitive":true,
              "Value":"$3cr3t"
           },
           {
              "Name":"MIS_STG_Connectionstring",
              "Description":"Connectionstring to stage database",
              "Type":"String",
              "Sensitive":false,
              "Value":"Data Source=.\\sql2016;Initial Catalog=MIS_STG;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
           },
           {
              "Name":"NumberOfRetries",
              "Description":"Number of retries for Webservice Task",
              "Type":"Int16",
              "Sensitive":false,
              "Value":3
           }
        ]
    
    .PARAMETER SsisServer
        Mandatory: The name of the SQL Server instance that runs the SSIS Catalog
    
    .PARAMETER FolderName
        Mandatory: The name of the Catalog folder where the Environment will be stored
    
    .PARAMETER EnvironmentName
        Mandatory: The name of the Environment
    
    .PARAMETER ImportFilePath
        Mandatory: The fully qualified path of the json file that needs to be imported
    
    .PARAMETER DeleteExistingEnvironment
        Optional: Setting to $true first deletes an existing environment. Default value: $false 
    
    .PARAMETER Verbose
        Optional: Get more logging information on the screen
    
    .EXAMPLE
        Set-CatalogEnvironment -SsisServer "MYSERVER\myInstance" -FolderName MyEnvFolder -EnvironmentName MyEnvName -ImportFilePath "C:\backup\Environments.Generic.json" -DeleteExistingEnvironment $true
    
    .EXAMPLE
        Set-CatalogEnvironment -SsisServer "MYSERVER\myInstance" -FolderName MyEnvFolder -EnvironmentName MyEnvName -ImportFilePath "C:\backup\Environments.Generic.json" -Verbose
    
    .NOTES
        You cannot insert null values. They will be skipped with a warning.
        The current script works for SSIS 2016. Change the version number in the code
        to use another version of SSIS.
    
    .LINK 
        https://microsoft-ssis.blogspot.com/
    #>
    Function Set-CatalogEnvironment
    {
        [CmdletBinding()]
        Param
        (
            [Parameter(Mandatory=$true, Position=0)]
            [ValidateLength(1,50)] # String must be between 1 and 50 chars long
            [string]$SsisServer,
    
            [Parameter(Mandatory=$true, Position=1)]
            [ValidateLength(1,128)] # String must be between 1 and 128 chars long 
            [string]$FolderName,
    
            [Parameter(Mandatory=$true, Position=2)]
            [ValidateLength(1,128)] # String must be between 1 and 128 chars long 
            [string]$EnvironmentName,
    
            [Parameter(Mandatory=$true, Position=3)]
            [ValidateScript({Test-Path -Path $_ -PathType Leaf})] # File must exist
            [ValidatePattern('.json$')] # Extension must be .json
            [string]$ImportFilePath,
    
            [Parameter(Mandatory=$false, Position=4)]
            [bool]$DeleteExistingEnvironment = $false
        )
    
        # Don't continue after error
        $ErrorActionPreference = "Stop"
    
        #################################################
        ############## SHOW ALL PARAMETERS ##############
        #################################################
        Write-Verbose "========================================================="
        Write-Verbose "==      Used parameters - Set-CatalogEnvironment       =="
        Write-Verbose "========================================================="
        Write-Verbose "SSISServer                : $($SsisServer)"
        Write-Verbose "FolderName                : $($FolderName)"
        Write-Verbose "EnvironmentName           : $($EnvironmentName)"
        Write-Verbose "ImportFilePath            : $($ImportFilePath)"
        Write-Verbose "DeleteExistingEnvironment : $($DeleteExistingEnvironment)"
        Write-Verbose "========================================================="
    
    
        #################################################
        ############### ADD SSIS ASSEMBLY ###############
        #################################################
        # Change assembly version number to use an other SSIS version
        # 13.0.0.0 = SSIS 2016
        # 12.0.0.0 = SSIS 2014
        # 11.0.0.0 = SSIS 2012
        $SsisNamespace = "Microsoft.SqlServer.Management.IntegrationServices"
        Add-Type -AssemblyName "$($SsisNamespace), Version=13.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"
    
     
        #################################################
        ############ CONNECT TO SSIS SERVER #############
        #################################################
        # First create a connection to SQL Server
        $SqlConnectionstring = "Data Source=$($SsisServer);Initial Catalog=master;Integrated Security=SSPI;"
        $SqlConnection = New-Object System.Data.SqlClient.SqlConnection $SqlConnectionstring
     
        # Then use that SQL connection to create an
        # Integration Services object.
        $IntegrationServices = New-Object $SsisNamespace".IntegrationServices" $SqlConnection
    
        # Check if connection succeeded
        If (!$IntegrationServices)
        {
            Throw [System.Exception] "Failed to connect to server $($SsisServer)"
        }
        Else
        {
            Write-Verbose "Connected to server $($SsisServer)"
        }
     
     
        #################################################
        ########### CONNECT TO SSIS CATALOG #############
        #################################################
        # Create object for SSISDB Catalog
        $Catalog = $IntegrationServices.Catalogs["SSISDB"]
     
        # Check if the SSISDB Catalog exists
        If (!$Catalog)
        {
            # Catalog does not exist. Different name used?
            Throw [System.Exception] "SSISDB catalog does not exist"
        }
        Else
        {
            Write-Verbose "SSISDB catalog found"
        }
    
    
        #################################################
        ################## CHECK FOLDER #################
        #################################################
        # Create object to the (new) folder
        $Folder = $Catalog.Folders[$FolderName]
     
        # Check if folder exists
        If (!$Folder)
        {
            # Folder doesn't exist, so create a new folder.
            Write-Host "Creating new folder $($FolderName)"
            $Folder = New-Object $SsisNamespace".CatalogFolder" ($Catalog, $FolderName, $FolderName)
            $Folder.Create()
        }
        Else
        {
            Write-Verbose "Folder $($FolderName) found"
        }
    
    
        #################################################
        ################## ENVIRONMENT ##################
        #################################################
        # Create object for the (new) environment
        $Environment = $Folder.Environments[$EnvironmentName]
    
        # Check if the environment already exists
        If (-not $Environment)
        {
            Write-Host "Creating new environment $($EnvironmentName) in $($FolderName)"
    
            $Environment = New-Object $SsisNamespace".EnvironmentInfo" ($Folder, $EnvironmentName, $EnvironmentName)
            $Environment.Create()
        }
        ElseIf($DeleteExistingEnvironment -and $Environment)
        {
            Write-Verbose "Environment $($EnvironmentName) found with $($Environment.Variables.Count) existing variables"
            Write-Host "Dropping and recreating environment $($EnvironmentName) in $($FolderName)"
            $Environment.Drop()
            $Environment = New-Object $SsisNamespace".EnvironmentInfo" ($folder, $EnvironmentName, $EnvironmentName)
            $Environment.Create()
        }
        Else
        {
            Write-Verbose "Environment $($EnvironmentName) found with $($Environment.Variables.Count) existing variables"
        }
        
    
        #################################################
        ############### GET FILE CONTENT ################
        #################################################
        Write-Verbose "Reading $($ImportFilePath)"
        $EnvironmentInput = Get-Content -Raw -Path $ImportFilePath | ConvertFrom-Json
    
    
        #################################################
        ################### VARIABLES ###################
        #################################################
        # Keep track of number of updates and inserts
        $InsertCount = 0
        $UpdateCount = 0
    
        # Loop through file content
        $EnvironmentInput | Select-Object -Property Name,Description,Type,Sensitive,Value | ForEach-Object {
    
            # Get variablename from json and try to find it in the environment
            $Variable = $Environment.Variables[$_.Name]
    
            # Make sure each variable has a value
            # (null-safe check: calling .ToString() on a null value would throw an error)
            If ([string]::IsNullOrEmpty($_.Value))
            {
                Write-Warning "Variable $($_.Name) skipped because it has no value"
            }
            else
            {
                # Check if the variable exists
                If (-not $Variable)
                {
                    # Insert new variable
                    Write-Verbose "Variable $($_.Name) added"
                    $Environment.Variables.Add($_.Name, $_.Type, $_.Value, $_.Sensitive, $_.Description)
    
                    $InsertCount = $InsertCount + 1
                }
                else
                {
                    # Update existing variable
                    Write-Verbose "Variable $($_.Name) updated"
                    $Variable.Type = $_.Type
                    $Variable.Value = $_.Value
                    $Variable.Description = $_.Description
                    $Variable.Sensitive = $_.Sensitive
    
                    $UpdateCount = $UpdateCount + 1
                }
            }
        }
        $Environment.Alter()
    
        Write-Host "Finished, total inserts $($InsertCount)  and total updates $($UpdateCount)"
    }
    

    Later on I will add various extra methods, for example to test the existence of an environment, to delete, move, copy or rename an environment, or to connect an environment to a project. Please let me know if you have any suggestions for extra functionality or improvements!

    Thursday, 4 May 2017

    SQL Nexus - PowerShell ❤ SSIS

    Nordic SQL Nexus 2017
    Last week I had the honor to speak at SQL Nexus in Copenhagen, a marvelous three-day Microsoft Data Platform event with over 70 sessions to choose from. You can download my PowerPoint presentation here. Scripts or links to scripts are available in the notes of the PowerPoint. Contact me if you have any questions related to this presentation. More PowerShell scripts are available here.
    [Image: Presenting on an IMAX screen. No zoom-it needed]

    Wednesday, 5 April 2017

    Dynamically unpivot data

    Case
    For a client I need to read hundreds of bus route matrices and they all vary in size. This makes it hard to read them dynamically with a Foreach Loop Container because the number of columns differs per file. And I don't want to create hundreds of Data Flow Tasks by hand. Even BIML won't help this time, because the routes change regularly and I don't want to generate and deploy packages every day.
    I need to dynamically unpivot data within the Data Flow Task. How do I solve this within SSIS?
    [Image: Dynamically unpivot data]

    Solution
    The trick for this case is to read everything as one big column and then dynamically split and unpivot that column in a Script Component Transformation. The unpivot output will always have three columns: Start Station, End Station and Distance. And the good news is that it takes only a few lines of relatively easy code.
    [Image: The solution]

    1) Source with one big column
    Change your Flat File Connection Manager so that it will read everything as one big column. Make sure the column is big enough to fit all data. For this example I called the column 'ColumnOne'.
    [Image: Flat File with one column only]

    2) Script Component Transformation Input
    Drag a Script Component on the surface and choose Transformation. Connect it to your source. Then edit the Script Component and go to the 'Input Columns' page. On that page select the column with all the matrix data as ReadOnly.
    [Image: Input Columns]

    3) Script Component Transformation Output
    On the 'Inputs and Outputs' page we need to add the new output columns. For this example I need a StartStation (string), EndStation (string) and the Distance (int).
    Another important step is setting the SynchronousInputID property (of Output 0) to 'None'. This makes the transformation asynchronous, which means the number of rows in can differ from the number of rows out. It also means the input buffer isn't reused in this component; a new output buffer is created instead.
    [Image: Inputs and Outputs]

    4) The script
    Go to the script page, choose C# as scripting language and hit the Edit Script button. Now copy the contents of my Input0_ProcessInputRow method to your Input0_ProcessInputRow method. There are also two variables called Stations and Distances, declared above this method; copy those to your code and put them in the same place.
    I also removed the unused methods PreExecute, PostExecute and CreateNewOutputRows to keep the code clean and mean.
    // C# code
    #region Namespaces
    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
    using Microsoft.SqlServer.Dts.Runtime.Wrapper;
    #endregion
    
    /// <summary>
    /// Split and unpivot data
    /// </summary>
    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
    public class ScriptMain : UserComponent
    {
        // Define two arrays for distances and stations
        // The Stations array will be filled only once
        // The Distances array will change for each row
        string[] Stations;
        string[] Distances;
    
        /// <summary>
        /// This method is called once for every row that passes through the component from Input0.
        /// </summary>
        /// <param name="Row">The row that is currently passing through the component</param>
        public override void Input0_ProcessInputRow(Input0Buffer Row)
        {
            // The first time this method executes the Stations array
            // is still empty (null). In the true clause of the if-
            // statement we will fill the Stations array.
            // Therefore the second, third, etc. time this method
            // executes we will go to the false clause of the if-
            // statement.
            if (Stations == null)
            {
                // We know that the first row contains the stations.
                // We will add those to the stations array and use
                // it to determine the end station later on.
    
                // Split the string from ColumnOne on ; (or your own
                // column separator). The Split returns an array.
                Stations = Row.ColumnOne.Split(';');
            }
            else
            {
                // Now the rows will contain distances (and the StartStation)
                // Split the distances on ; (or your own column separator)
                Distances = Row.ColumnOne.Split(';');
    
                // Now loop through distances array, but start on 1 (not on 0)
                // because 0 contains the StartStation in the distances array
                for (int counter = 1; counter < Distances.Length; counter++)
                {
                    // Add new Row and then fill the columns
                    Output0Buffer.AddRow();
                    // Get the Distance from the Distance array and convert it to int
                    Output0Buffer.Distance = Convert.ToInt32(Distances[counter]);
                    // Get the Start station from the distance array (the first item)
                    Output0Buffer.StartStation = Distances[0];
                    // Get the End station from stations array
                    Output0Buffer.EndStation = Stations[counter];
                }
            }
        }
    }
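
    A tiny worked example (assuming ';' as the column separator) shows what the script does. Given this input file:

    ;StationA;StationB
    StationA;0;7
    StationB;7;0

    The first row fills the Stations array. Each following data row then produces one output row per distance: (StationA, StationA, 0), (StationA, StationB, 7), (StationB, StationA, 7) and (StationB, StationB, 0).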
    

    5) The result
    Now close the Script Component, add more transformations or a destination, and see what the Script Component does with your data. I added a dummy Derived Column and a Data Viewer to see the data before and after the Script Component. For this file I had 27 rows and columns as input and 676 rows as output (26 * 26).

    Saturday, 25 February 2017

    Change Protection Level for all packages at once

    Case
    I created dozens of packages in my project but I forgot to change the default Protection Level in the project properties from "EncryptSensitiveWithUserKey" to "DontSaveSensitive". Now I have to change all packages one by one. Is there an alternative? I tried search and replace in the XML, but I can't find the Protection Level property.

    Solution
    Of course the best option is to prevent this from happening by setting the default before you start. You can do this in the properties of the project. All new packages will then inherit the Protection Level from the project.
    [Image: Setting Protection Level on project]

    First, when trying to search and replace in the XML code of the packages, you will notice that you cannot find the default 'EncryptSensitiveWithUserKey', which makes it hard to replace.
    [Image: Default Protection Level is not in package]

    Secondly, the Protection Level is also stored in the Visual Studio project file (*.dtproj). When you open a package in design mode and press the save button, the metadata in the project file is updated as well.
    [Image: Protection Level in project file as well]

    Solution A
    Good old Command Prompt to the rescue! The dtutil utility can do the package conversion for you. If you are afraid of the Command Prompt, or have never even heard of it, then don't use this solution.

    1) Command Prompt
    Open a Command Prompt and use the CD (Change Directory) command to navigate to your folder with packages.
    [Image: Navigate to your project folder with packages]

    2) Foreach Loop Container in DOS
    Now you can call the dtutil utility for each package in that folder with something similar to a Foreach Loop Container:
    FOR %p IN (*.dtsx) DO dtutil.exe /file %p /encrypt file;%p;0 /quiet
    In short: FOR %p IN (*.dtsx) loops through all packages in the current folder, /file %p tells dtutil which package to load, /encrypt file;%p;0 saves the package back to the same file with protection level 0 (DontSaveSensitive), and /quiet suppresses the overwrite confirmation prompts.

    3) Execute
    When you execute the command, dtutil Utility will quickly change the Protection Level of all your packages.
    [Image: 101 packages changed within 5 seconds. Try that in Visual Studio!]

    4) Project Protection Level
    If you haven't already done so, change the Protection Level in the Project Properties. See the second screenshot of this blog post.

    5) dtproj file
    Now the project and all its packages have the same Protection Level, but the project doesn't know that yet. If you try to execute a package, it will complain about the Protection Level inconsistencies.
    [Image: Failed to execute the package or element. Build errors were encountered.]

    Error : Project consistency check failed. The following inconsistencies were detected:
     MyPackage000.dtsx has a different ProtectionLevel than the project.
     MyPackage001.dtsx has a different ProtectionLevel than the project.

    To update the dtproj file you have to open all packages and then Rebuild the project. This will update the project file. Now you can execute the packages without the consistency error.
    [Image: Open all packages and rebuild the project]

    Solution B
    Good old PowerShell to the rescue! This PowerShell script does the same as above, but it also changes the project file, so no manual labour at all. Because the dtutil utility was so fast, I didn't edit the packages with .NET libraries; the script just executes dtutil in a hidden window.

    The script is thoroughly tested for SSIS 2012-2016 from 'EncryptSensitiveWithUserKey' to 'DontSaveSensitive'. Other situations require more testing. Make sure to keep a copy of your project before using this script and let me know which situations require some more attention.
    [Image: Change the protection level of the entire project in seconds]

    #PowerShell script
    ################################
    ########## PARAMETERS ##########
    ################################ 
    $projectFolder = "C:\SSIS\myProject\myProject"
    $dtutilPath = "C:\Program Files\Microsoft SQL Server\130\DTS\Binn\dtutil.exe"
    # The number changes per SQL Server version
    # 130=2016, 120=2014, 110=2012
    # Also check the drive where SQL Server is
    # installed
    
    
    #################################################
    ########## DO NOT EDIT BELOW THIS LINE ##########
    #################################################
    clear
    Write-Host "========================================================================================="
    Write-Host "==                                 Used parameters                                     =="
    Write-Host "========================================================================================="
    Write-Host "Project Folder          :" $projectFolder                                                 
    Write-Host "dtutil Path             :" $dtutilPath                                                    
    Write-Host "========================================================================================="
     
     
    ######################################
    ########## Check parameters ##########
    ######################################
    # Test whether the paths are filled
    # and exist.
    if ($projectFolder -eq "")
    {
        Throw [System.Exception] "Project path parameter is mandatory"
    }
    elseif (-Not (Test-Path $projectFolder))
    {
        Throw [System.IO.FileNotFoundException] "Project path $($projectFolder) doesn't exist!"
    }
    elseif (-Not $projectFolder.EndsWith("\"))
    {
        # Make sure path ends with \ for command
        $projectFolder = $projectFolder + "\"
    }
    if ($dtutilPath -eq "")
    {
        Throw [System.Exception] "dtutil parameter is mandatory"
    }
    elseif (-Not (Test-Path $dtutilPath))
    {
        Throw  [System.IO.FileNotFoundException] "dtutil not found at $($dtutilPath)"
    }
    
     
    #############################################
    ########## dtutil for loop command ##########
    #############################################
    # In this script we are executing dtutil.exe
    # Perhaps a bit quick & dirty, but more quick
    # than dirty. It changes 100 packages within
    # seconds.
    $command = "/C FOR %p IN ($($projectFolder)*.dtsx) DO dtutil.exe /file %p /encrypt file;%p;0 /quiet"
    Write-Host "Editing packages in $($projectFolder)... " -NoNewline
    
    # Open the command prompt (hidden) and execute
    # dtutil.exe with the parameters from above.
    Start-Process "C:\Windows\System32\cmd.exe" -ArgumentList $command -WindowStyle Hidden -Wait
    Write-Host "Done."
    
    
    ##########################################
    ########## Editing project file ##########
    ##########################################
    # Find the project file. There should be
    # only one dtproj file.
    $projectFile = get-childitem $projectFolder -name -filter *.dtproj
    Write-Host "Editing project file $($projectFile)... "  -NoNewline                                                  
    
    # Edit the project file and replace the
    # protection level. The first replace is for
    # all the packages and the second replace
    # is for the project itself. Both replaces
    # use a regular expression.
    $projectFilePath = Join-Path -Path $projectFolder -ChildPath $projectFile
    (Get-Content $projectFilePath) -replace 'ProtectionLevel">[0-9]', 'ProtectionLevel">0' -replace 'ProtectionLevel="[A-Za-z]*"', 'ProtectionLevel="DontSaveSensitive"' | Set-Content $projectFilePath
    Write-Host "Done."
    
    ##############################
    ########## Finished ##########
    ##############################
    # Finished editing packages and project file
    Write-Host "Finished editing $($projectFile) and $((get-childitem $projectFolder -name -filter *.dtsx).Count) packages" -ForegroundColor Magenta
    


