World of Whatever

A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.


Wednesday, March 22, 2017

Variable scoping in TSQL isn't a thing

It's a pop quiz kind of day: run the code through your mental parser.

    BEGIN TRY
        DECLARE @foo varchar(30) = 'Created in try block';
        DECLARE @i int = 1 / 0;
    END TRY
    BEGIN CATCH
        PRINT @foo;
        SET @foo = 'Catch found';
    END CATCH

PRINT @foo;
1. It won't compile since @foo goes out of scope for both the catch and the final line
2. It won't compile since @foo goes out of scope for the final line
3. It prints "Created in try block" and then "Catch found"
4. I am too fixated on your form not having a submit button

Crazily enough, the last two are correct. It seems that, unlike every other language I've worked with, all variables are scoped to the same local scope regardless of where in the script they are defined. Demo the first

Wanna see something even more crazy? Check this version out

    BEGIN TRY
        DECLARE @i int = 1 / 0;
        DECLARE @foo varchar(30) = 'Created in try block';
    END TRY
    BEGIN CATCH
        PRINT @foo;
        SET @foo = 'Catch found';
    END CATCH

PRINT @foo;

As above, the scoping of variables remains the same, but the forced divide-by-zero error now occurs before the declaration and initialization of our variable @foo. The result? The DECLARE was still parsed, so @foo exists, but the value assignment never ran, so @foo remains uninitialized (NULL), as evidenced by the first PRINT in the catch block. Second demo
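For contrast, here is a minimal Python sketch (my own illustration, not from the post) of how a block-structured language treats a name whose assignment never ran. Unlike T-SQL, where every DECLARE in the batch is parsed up front, a name that was never assigned simply does not exist:

```python
def tsql_like_attempt():
    """Mimic the T-SQL demo: an error fires partway through a try block."""
    try:
        foo = "Created in try block"
        i = 1 / 0               # raises ZeroDivisionError before the next line
        bar = "never assigned"  # never reached
    except ZeroDivisionError:
        pass
    results = {"foo": foo}      # foo survived; its assignment completed
    try:
        results["bar"] = bar    # bar was never assigned, so this raises,
    except UnboundLocalError:   # unlike T-SQL where a parsed DECLARE yields NULL
        results["bar"] = "<undefined>"
    return results
```

Run it and foo comes back intact while bar is reported undefined; T-SQL's second demo would instead print an uninitialized (NULL) @foo.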

What's all this mean? SQL's weird.

Wednesday, March 8, 2017

Getting Windows share via python



On a daily basis, we receive data extracts from a mainframe. They provide a header and data file for whatever the business users want to explore. This client has lots of old data ferreted away and they need to figure out if there's value in it. Our job is to consume the header files to drop and create tables in SQL Server and then populate with actual data. The SQL is trivial -

CREATE TABLE Foo (Col1 varchar(255), ColN varchar(255)); 

Let's make this harder than it should be

Due to ... curious permissions and corporate politics, the SQL Server service account could only read files via a network share (\\Server\Share\Input\File.csv), never mind the fact that the path was really just D:\Share\Input. A local drive, but permissions were such that we couldn't allow the service account to read from the drive directly. Opening a network share up and letting the account read from that - no problem.

What are the shares?

That's an easy question to answer, because I knew the answer: net share. I coded up a simple parser and all was well and good until I ran it on the server, which had some really long share names and/or long Resource values. Like this

Share name   Resource                        Remark

C$           C:\                             Default share
IPC$                                         Remote IPC
ADMIN$       C:\WINDOWS                      Remote Admin
DEV2016      \\?\GLOBALROOT\Device\RsFx0410\\DEV2016
                                             SQL Server FILESTREAM share
The command completed successfully.
Super. The output of net share is quasi-fixed-width, and it just wraps whatever it needs to onto the next line/column.

What are the shares, v2

Windows Management Instrumentation to the rescue!

    WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET Name, Path

That's way better, sort of

Name                       Path
ADMIN$                     C:\WINDOWS
C$                         C:\
DEV2016                    \\?\GLOBALROOT\Device\RsFx0410\\DEV2016
RidiculouslyLongShareName  C:\users\bfellows\Downloads
Originally, that command ended with GET *, which resulted in a lot more information being returned than I needed. The devil, though, is that the output width is dependent upon the source data. If I remove the network share for my RidiculouslyLongShareName and rerun the command, I get this output
Name     Path
C$       C:\
DEV2016  \\?\GLOBALROOT\Device\RsFx0410\\DEV2016
Users    C:\Users
It appears to be the longest element plus 2 spaces for this data, but who knows what the real encoding rule is. The good thing is that, while variable, the header row gives me enough information to slice up the data as needed.
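The slicing trick boils down to a few lines. Here's a simplified sketch of the approach (the header text and row below are made up for illustration; the full script later in this post handles the general case):

```python
# Fixed-width output: each column starts where its header field name starts.
header = "Name     Path"
row    = "C$       C:\\"

# Starting offsets come straight from the header row.
starts = [header.index(field) for field in header.split()]
# Pair each start with the next one; None means "to end of line".
bounds = list(zip(starts, starts[1:] + [None]))

# Slice any data row by those bounds and strip the padding.
record = {header[b:e].strip(): row[b:e].strip() for b, e in bounds}
```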

This needs to run anywhere

The next problem is that this process in Dev runs on D:\Share, but in QA it is on I:\datafiles\instance1, and, oh by the way, there are two shares for the I drive: \\qa\Instance1 (I:\datafiles\instance1) and \\qa\datafiles (I:\datafiles). In the case where there are multiple shares, if there's one for the folder where the script is running, that's the one we want. Otherwise, it's probably the "nearest" path, which I interpreted as the share with the longest path.

Code good

Here's my beautiful, hacky python. Wherever this script runs, it will then attempt to render the best share path to the same location.

import os
import subprocess

def _generate_share_dictionary(headerRow):
    """Accepts a variable width, white space delimited string that we attempt
        to divine column delimiters from. Returns a dictionary of field names
        and a tuple with start/stop slice positions"""

    # This used to be a more complex problem before I realized I didn't have
    # to do GET * in my source. GET Name, Path greatly simplifies
    # but this code is generic so I keep it as is

    header = headerRow
    fields = header.split()
    tempOrds = {}
    ords = {}
    # Populate the temporary ordinals dictionary with field name and the
    # starting, zero based, ordinal for it.
    # i.e. given
    #Name     Path
    # we would expect Name:0, Path:9
    for field in fields:
        tempOrds[field] = headerRow.index(field)

    # Knowing our starting ordinal positions, we will build a dictionary of tuples
    # that contain starting and ending positions of our fields
    for iter in range(0, len(fields) -1):
        ords[fields[iter]] = (tempOrds[fields[iter]], tempOrds[fields[iter+1]])
    # handle the last element
    ords[fields[-1]] = (tempOrds[fields[-1]], len(headerRow))

    return ords

def get_network_shares():
    """Use WMIC to get the full share list. Needed because "net share" isn't parseable"""
    _command = r"C:\Windows\System32\wbem\WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET Name, Path"
    #_command = r"C:\Windows\System32\wbem\WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET *"
    _results = subprocess.check_output(_command, shell=True).decode('UTF-8')

    _headerRow = _results.splitlines()[0]
    headerOrdinals = _generate_share_dictionary(_headerRow)

    _shares = parse_network_shares_name_path(headerOrdinals, _results)
    return _shares

def parse_network_shares_name_path(header, results):
    """Rip apart the results using our header dictionary"""
    _shares = {}
    #use the header dictionary to slice into our results,
    #skipping the first line since it is the header
    for _line in results.splitlines()[1:]:
        if _line:
            _shares[_line[header["Name"][0]: header["Name"][1]].rstrip()] = _line[header["Path"][0]: header["Path"][1]].rstrip()
    return _shares

def translate_local_path_to_share(currentPath):
    """Convert the supplied path to the best match in the shares list"""
    shareName = ""
    defaultShare = ""
    shares = get_network_shares()

    # find an exact share match first
    if currentPath in shares.values():
        shareName = [key for key, value in shares.items() if value == currentPath][0]
    else:
        # see if we can find a partial match, favoring the longest path
        pathLength = 0
        for share, path in shares.items():
            # path can be empty due to IPC$ share
            if path:
                # Is the share even applicable?
                if path in currentPath:
                    # Favor the non default/admin share (DriveLetter$)
                    if share.endswith('$'):
                        defaultShare = currentPath.replace(path[:-1], share)
                    elif len(path) > pathLength:
                        shareName = currentPath.replace(path[:-1], share)
                        pathLength = len(path)

        # No other share was found
        if (defaultShare and not shareName):
            shareName = defaultShare
    x = os.path.join(r"\\" + os.environ['COMPUTERNAME'], shareName)
    print("Current folder {} maps to {}".format(currentPath, x))
    return os.path.join(r"\\" + os.environ['COMPUTERNAME'], shareName)

def main():
    current = os.getcwd()
    #current = "C:\WINDOWS"
    share = translate_local_path_to_share(current)
    print("{} aka {}".format(current, share))

if __name__ == "__main__":
    main()

You probably won't ever need all of the above code to swap out a local path for a network share using Python, but by golly, if you do, have fun. Also, Python is still my most favorite language, 14 years running.

Thursday, February 23, 2017

Biml Database Inspection


Importing tables via Biml

I've mentioned how using Biml to reverse engineer a very large database was the only option, and there is plenty of great material in the community about how to do this, but one thing I kept stumbling over was that using the import methods to build the above Biml always seemed to fail somewhere along the way. I assumed it was just me not understanding how it works. But today someone else got bit by the same stumbling block, so I wanted to talk through the basics of how the modeling works within Biml; subsequent posts will show the source of the issue and a workaround.


Biml allows you to define the tables, views, and constraints in your database. Let's look at a minimal viable table definition for dbo.AWBuildVersion from AdventureWorks2014. Ready?

<Biml xmlns="">
        <OleDbConnection Name="Adventureworks" ConnectionString="Data Source=localhost\dev2014;Initial Catalog=AdventureWorks2014;Provider=SQLNCLI11;Integrated Security=SSPI;" />
        <Database ConnectionName="Adventureworks" Name="AdventureWorks2014" />
        <Schema Name="dbo" DatabaseName="AdventureWorks2014" />
        <Table Name="AWBuildVersion" SchemaName="AdventureWorks2014.dbo">
                <Column Name="SystemInformationID" DataType="Byte" IdentityIncrement="1" />
                <Column Name="Database Version" DataType="String" Length="25" />
                <Column Name="VersionDate" DataType="DateTime" />
                <Column Name="ModifiedDate" DataType="DateTime" />
                <PrimaryKey Name="PK_AWBuildVersion_SystemInformationID" Clustered="false">
                        <Column ColumnName="SystemInformationID" />

Wow, that's a lot! Let's break it down.


Our Connections collection has a single entity in it, an OLE DB Connection named Adventureworks (remember, all of this is case sensitive, so this Adventureworks is a different beast from AdventureWorks, ADVENTUREWORKS, etc). This provides enough information to make a database connection. Of note, we have the server and catalog/database name defined in there. The type of connection used will determine the specific property names, i.e. Initial Catalog & Data Source vs Server & Database, etc. Look at a connection string reference if you really want to see how rich (horrible) this becomes.


A Database (AstDatabaseNode) requires a Name and a ConnectionName. We certainly know the connection since we just defined it in the previous section, and so here I'm naming the Database AdventureWorks2014. This just happens to align with the value specified in Initial Catalog, but use whatever is natural. Do not name it after the environment though, please. There is nothing worse than talking about an entity named "DevSalesDb" which references the production database but is named after the environment it was built from.


A Schema (AstSchemaNode) requires a Name and a DatabaseName (see above). Since I'm after a table in the dbo schema, I just specify it as the name.


Finally, the Table (AstTableNode) requires a Name and a SchemaName. Where have we seen this pattern before? However, look at the value of the SchemaName. We have to qualify the schema with the database because we could have two Schema entities named dbo that point to different Database entities.

Once inside the Table entity, we can define our columns, keys, indices as our heart desires.

Wrap up

An amusing side note: if you're using Mist/BimlStudio to import the Schema and Table, the wizard renders all of this correctly; there only seems to be a defect in how I'm scripting the above entities.

Wednesday, November 9, 2016

Playing audio via Biml


How often do you need to play audio while you're compiling your Biml packages? Never? Really? Huh, just me then. Very well, chalk this blog post as one to show you that you really can do *anything* in Biml that you can do in C#.

When I first learned how I could play audio in .NET, I would hook the Windows Media Player dll and use that. The first thing I then did was create an SSIS package that had a script task which played the A-Team theme song while it ran. That was useless, but a fun demo. Fast forward to using Biml, and I could not for the life of me get Windows Media Player to correctly embed in a Biml Script Task. I suspect it's something to do with the COM bindings that Biml doesn't yet support. Does this mean you shouldn't use Biml? Hell no. It just means I've wandered far into a corner case that doesn't yet have support.

Hours before going on the stage for my Summit 2016 presentation, I took another crack at finding a way to play music via .NET and discovered the System.Media.SoundPlayer class and I was ecstatic.


You understand this code, it's not hard. I create a string variable to hold the path to my sound file. I picked a sound file in a well known location. I prefaced my string with the @ symbol to avoid having to escape the default windows path separator.

<Biml xmlns="">
string sourceFile = string.Empty;
sourceFile = @"C:\Windows\Media\tada.wav";
System.Media.SoundPlayer player = new System.Media.SoundPlayer(sourceFile);

SSIS package that plays music

Using the above knowledge, we can also have an SSIS package with a script task to play an arbitrary media file

<Biml xmlns="">
        <Package Name="SoundPlayer" >
                <Variable Name="AudioPath" DataType="String"></Variable>
                <Script ProjectCoreName="ST_PlayAudio" Name="SCR Echo Back">
                    <ScriptTaskProjectReference ScriptTaskProjectName="ST_PlayAudio" />
    <ScriptTaskProject ProjectCoreName="ST_PlayAudio" Name="ST_PlayAudio" VstaMajorVersion="0">
                <!-- List all the variables you are interested in tracking -->
                <Variable Namespace="User" VariableName="AudioPath" DataType="String" />
                <File Path="ScriptMain.cs" BuildAction="Compile">using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_PlayAudio
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        public void Main()
            string sourceFile = string.Empty;
            sourceFile = Dts.Variables[0].Value.ToString();
            System.Media.SoundPlayer player = new System.Media.SoundPlayer(sourceFile);
            Dts.TaskResult = (int)ScriptResults.Success;

        enum ScriptResults
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
}                </File>
                <File Path="Properties\AssemblyInfo.cs" BuildAction="Compile">
using System.Reflection;
using System.Runtime.CompilerServices;

[assembly: AssemblyVersion("1.0.*")]
                <AssemblyReference AssemblyPath="System" />
                <AssemblyReference AssemblyPath="System.Data" />
                <AssemblyReference AssemblyPath="System.Windows.Forms" />
                <AssemblyReference AssemblyPath="System.Xml" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.ManagedDTS.dll" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.ScriptTask.dll" />

Now, you could marry the two Biml snippets together so that you get audio playing while you build an SSIS package that plays audio, Dawg.

Friday, October 28, 2016

What packages still use Configuration?


I'm sitting in Tim Mitchell's excellent "Deep Dive into the SSISDB" session, and someone asked how they can figure out which packages use the classic deployment model's Configuration option.

Create an SSIS package. Add a Variable to your package called FolderSource and assign it the path to your SSIS packages. Add a Script Task to the package and then add @[User::FolderSource] to the ReadOnlyVariables list.

Double click the script, assuming C#, and when it opens up, use the following script as your Main

public void Main()
{
    // Assign the SSIS Variable's value to our local variable
    string sourceFolder = Dts.Variables["FolderSource"].Value.ToString();
    Microsoft.SqlServer.Dts.Runtime.Application app = new Microsoft.SqlServer.Dts.Runtime.Application();
    string message = "Package {0} uses configuration {1}";
    bool fireAgain = false;
    Package pkg = null;
    foreach (string packagePath in System.IO.Directory.GetFiles(sourceFolder, "*.dtsx", System.IO.SearchOption.AllDirectories))
    {
        try
        {
            pkg = app.LoadPackage(packagePath, null);
            // EnableConfigurations is a boolean specifying whether you have checked the first button
            if (pkg.EnableConfigurations)
            {
                Dts.Events.FireInformation(0, "Configuration Finder", string.Format(message, packagePath, string.Empty), string.Empty, 0, ref fireAgain);

                // This will expose all the configurations that are being used
                // because you could have specified different configuration mechanisms
                foreach (Configuration config in pkg.Configurations)
                {
                    Dts.Events.FireInformation(0, "Configuration Details", string.Format(message, packagePath, config.ConfigurationType), string.Empty, 0, ref fireAgain);
                }
            }
        }
        catch (Exception ex)
        {
            Dts.Events.FireWarning(0, "Config finder", packagePath, string.Empty, 0);
            Dts.Events.FireWarning(0, "Config finder", ex.ToString(), string.Empty, 0);
        }
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}

Save and close the package and hit F5.

How cool is that? We're using an SSIS package to inspect the rest of our packages. Now, if you store your packages in MSDB, the above changes ever so slightly. We'd need to provide a connection string to the database and then change our first foreach loop to enumerate all the packages in MSDB. Perhaps that'll be a follow-up post.

Tuesday, October 18, 2016

Debugging Biml


At this point, I don't even know who to give credit for on this tip/trick as I've seen it from so many luminaries in the field. This mostly applies to BimlScript debugging within the context of BIDS Helper/BimlExpress.

Using tooling is always a trade-off between time/frustration and monetary cost. BIDS Helper/BimlExpress are free, so you're prioritizing monetary cost over all else. And that's ok, there's no judgment here. I know what it's like to be in places where you can't buy the tools you really need. One of the hard parts about debugging the expanded Biml from BimlScript is that you can't see the intermediate or flat Biml. You've got your metadata, Biml, and BimlScript and a lot of imagination to think through how the code is being generated and where it might be going wrong. That's tough. Even at this point, where I've been working with it for four years, I can still spend hours trying to track down just where the heck things went wrong. SPOILER ALERT: It's the metadata, it's always the metadata (except when it's not). I end up with NULLs where I don't expect them or some goofball put the wrong values in a field. But how can you get to a place where you can see the result? That's what this post is about.

It's a trivial bit of code but it's important. You need to add a single Biml file to your project and whenever you want to see the expanded Biml, prior to it being translated into SSIS packages, right click on the file and you'll get all that Biml dumped to a file. This recipe calls for N steps.


Right click on your project and add a Biml file called WriteAll.biml. Or whatever makes sense to you. I like WriteAll because it will generally sort to the bottom of my list of files alphabetically and that's about as often as I hope to use it.


The first thing we need to do is ensure that the tier of this BimlScript file is greater than that of any other asset in the project. We will do that through the template directive tier="N", where N is a sufficiently large number to ensure we don't have any natural tiers greater than it.

I'll also take this as an opportunity to impart a lesson learned from writing Apple Basic many, many years ago. Do not use one as the step value for line numbers, or tiers in this case. Instead, give yourself some breathing room and count by 10s because, sure as you're breathing, you'll discover you need to insert something between 2 and 3 and you can't add 2.5, much less 2.25. The same lesson goes for tiers. Tier 0 is flat Biml. Tier 1 is BimlScript that doesn't specify its tier. After that, you're in control of your destiny.


The .NET library offers a method called WriteAllText. This is the easiest way to write all the text to a file. It takes two arguments: the path of the file and the contents to write. If the file exists, it's going to overwrite it. If it doesn't exist, it will create it. Piece of pie!


WriteAllText needs a path - where should we put it? I'm lazy and want to put our debugging file into a location everyone has on their computer. I can't tell you what that location will be because it's going to be different for everyone, but it's guaranteed to exist: the %userprofile% location. On my work laptop, it's C:\Users\BillFellows. On my home computer, it's C:\Users\bfellows. At the governmental agency, my home directory was actually on a network somewhere, so it was just H:\. All you have to do is open up Windows Explorer and type %userprofile%, and that's where we'll write this file.

If you are ever putting paths together through string building, please stop. It's a pain to deal with escaping the path separators, \, and it can be difficult to be consistent as some will build a path with a trailing slash and others won't. Stop trying to figure out that logic and use Path.Combine
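The same advice holds outside .NET. A quick Python illustration of my own, using ntpath explicitly so the Windows-style behavior is visible on any platform:

```python
import ntpath  # os.path on Windows; used directly here for a portable demo

# Whether or not the base ends in a separator, join produces the same path,
# so there's no trailing-slash bookkeeping to get wrong.
with_slash    = ntpath.join("C:\\Users\\bfellows\\", "Debug.biml")
without_slash = ntpath.join("C:\\Users\\bfellows", "Debug.biml")
```

Both spellings yield C:\Users\bfellows\Debug.biml; Path.Combine gives you the same guarantee in .NET.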

We'll combine the special path location with a file name, Debug.biml, and get a perfectly valid path for our output file. If you think that's overkill, just use a hardcoded path.


Every object in the Biml universe supports the GetBiml method. What's amazingly powerful about this function is that it has the ability to call the GetBiml method on all the items under it. You don't have to worry about how many packages exist and how many Tasks and Variables and Events exist under them. Just call the appropriate parent level GetBiml method and object inheritance takes care of the rest.
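That recursive behavior is the whole trick, and it can be sketched with a toy tree (a Python analogy of my own, not the actual Biml API):

```python
# Toy analogy of GetBiml: a node renders itself and recursively renders
# its children, so one call on the root emits the entire tree.
class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def get_biml(self, depth=0):
        pad = "    " * depth
        if not self.children:
            return pad + "<" + self.name + " />"
        inner = "\n".join(c.get_biml(depth + 1) for c in self.children)
        return pad + "<" + self.name + ">\n" + inner + "\n" + pad + "</" + self.name + ">"

root = Node("Biml", [Node("Packages", [Node("Package")])])
flattened = root.get_biml()
```

No matter how deep the tree gets, the caller only ever asks the root.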


The RootNode is the base of everything in Biml, so by calling its GetBiml method, you'll get the Biml for all the derived objects within the project. Eureka! That's what we wanted! And since we won't call this until everything else has completed, via the tier property, we will get our flattened Biml.


Putting all that together, we get a file that looks like this

<#@ template tier="999" #>
<#
System.IO.File.WriteAllText(System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.UserProfile), "Debug.biml"), RootNode.GetBiml());
#>
If I want to see what's being built in ComplexBimlScript, I simply multiselect it and WriteAll.biml, generate, and I'll get a Debug.biml file. From there, I generally open Debug.biml in a separate SSIS project and Check Biml For Errors, and it's much easier to zip to the error. Then it's a matter of tracing where that bad code was generated back to the correct bit of Biml.

Closing thoughts

If you get some really weird error going on inside your BimlScript, this debug file will appear to be an empty Biml tag. In that case, it's probably your metadata so start breaking your solution down until it's working and then gradually add complexity back into it.


An alternative to tearing your code apart until it works would be to use this WriteAllText approach, but do it per tier. That would allow you to inspect the compilation at every step in the process to discern where things went wrong.

Thursday, October 6, 2016

UNION removes duplicates


When you need to combine two sets of data, you use the UNION operator. It comes in two flavors: UNION and UNION ALL. The default is to remove duplicates between the two sets, whereas UNION ALL does no filtering.

Pop quiz! Given the following sets A and B

What's the result of SELECT * FROM A UNION SELECT * FROM B;

Piece of cake, we start with everything in A and get the values in B that aren't in A.

So we're looking at 1, 5, 9, 7, 3, 3, 2, 3

Except, of course, that's not what actually happens. UNION smashes both sets of data together and then takes the distinct results. Or it does a distinct within each result set, smashes them together, and takes one last pass to remove duplicates. I don't know or care about the actual mechanics; what I care about is the final outcome.

We actually end up with a result of 1, 5, 9, 7, 3, 2. In the fifteen years I've been writing SQL statements, I don't think I ever realized that the final result set is made distinct. I thought it was purely an intra-set dedupe process.
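The two mental models are easy to sketch in Python (the values of A and B are my assumption, reconstructed from the results quoted above):

```python
# Assumed sample data, reconstructed from the quoted results.
A = [1, 5, 9]
B = [7, 3, 3, 2, 3]

# Naive model: keep A, append B's values that aren't in A.
naive = A + [b for b in B if b not in A]   # still has duplicate 3s

# What UNION actually does: combine everything, then dedupe the whole result.
# (Real SQL guarantees no particular order without an ORDER BY;
# first-appearance order is used here just to keep the comparison readable.)
def sql_union(left, right):
    seen, result = set(), []
    for value in left + right:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result

actual = sql_union(A, B)
```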

I thought wrong.