A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Tuesday, August 15, 2017

Biml build Database collection nodes

Biml build Database collection nodes aka what good are Annotations

In many of the walkthroughs on creating relational objects via Biml, it seems like people skim over the Databases collection. There's nothing built into the language to really support the creation of database nodes. The import database operations are focused on tables and schemas and assume the database node(s) have been created. I hate assumptions.

Connections.biml

Add a biml file to your project to house your connection information. I'm going to create two connections to the same Adventureworks 2014 database.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="Aw2014" ConnectionString="Provider=SQLNCLI11;Server=localhost\dev2016;Initial Catalog=AdventureWorks2014;Integrated Security=SSPI;" />
        <OleDbConnection Name="Aw2014Annotated" ConnectionString="Provider=SQLNCLI11;Server=localhost\dev2016;Initial Catalog=AdventureWorks2014;Integrated Security=SSPI;">
            <Annotations>
                <Annotation Tag="DatabaseName">AdventureWorks2014</Annotation>
                <Annotation Tag="ServerName">localhost\dev2016</Annotation>
                <Annotation Tag="Provider">SQLNCLI11</Annotation>
            </Annotations>
        </OleDbConnection>
    </Connections>
</Biml>

The only material difference between the first and second OleDbConnection node is the declaration of Annotations on the second instance. Annotations are free form entities that allow you to enrich your Biml with more metadata. Here I've duplicated the DatabaseName, ServerName and Provider properties from my connection string. As you'll see in the upcoming BimlScript, prior proper planning prevents poor performance.

Databases.biml

The Databases node is a collection of AstDatabaseNode. They require a Name and a ConnectionName to be valid. Let's look at how we can construct these nodes based on information in our project. The first question I'd ask is what metadata do I readily have available? My Connections collection - that is already built out so I can enumerate through the items in there to populate the value of the ConnectionName. The only remaining item then is the Name for our database. I can see three ways of populating it: parsing the connection string for the database name, instantiating the connection manager and querying the database name from the RDBMS, or pulling the database name from our Annotations collection.

The basic approach would take the form

<Databases>
<#
    string databaseName = string.Empty;

    foreach(AstOleDbConnectionNode conn in this.RootNode.OleDbConnections)
    {
        databaseName = "unknown";
        // Logic goes here!
#>
        <Database Name="<#= conn.Name #>.<#= databaseName#>" ConnectionName="<#= conn.Name #>" />
<#
    }
#>
</Databases>

The logic we stuff in there can be as simple or complex as needed but the end result would be a well formed Database node.

Parsing

Connection strings are delimited (semicolon) key value pairs unique to the connection type. In this approach we'll split the connection string by semicolon and then split each resulting entity by the equals sign.

         // use string parsing and prayer
         try
         {
            string [] kVpish;
            KeyValuePair<string, string> kvp ;
            // This is probably the most fragile as it needs to take into consideration all the
            // possible connection styles. Get well acquainted with ConnectionStrings.com
            // Split our connnection string based on the delimiter of a semicolon
            foreach (var element in conn.ConnectionString.Split(new char [] {';'}))
            {
                kVpish = element.Split(new char[]{'='});
                if(kVpish.Count() > 1)
                {
                    kvp = new KeyValuePair<string, string>(kVpish[0], kVpish[1]);
                    if (String.Compare("Initial Catalog", kvp.Key, StringComparison.InvariantCultureIgnoreCase) == 0)
                    {
                        databaseName = kvp.Value;
                    }
                }
            }
         }
         catch (Exception ex)
         {
             databaseName = string.Format("{0}_{1}", "Error", ex.ToString());
         }

The challenge with this approach is that it's a fragile approach. As ConnectionStrings.com can attest, there are a lot of ways of constructing a connection string and what it denotes as the database name property.

Query

Another approach would be to instantiate the connection manager and then query the information schema equivalent and ask the database what the database name is. SQL Server makes this easy

            // Or we can use database connectivity
            // This query would need to have intelligence built into it to issue correct statement
            // per database. This works for SQL Server only
            string queryDatabaseName = @"SELECT db_name(db_id()) AS CurrentDB;";
            System.Data.DataTable dt = null;
            dt = ExternalDataAccess.GetDataTable(conn, queryDatabaseName);
            foreach (System.Data.DataRow row in dt.Rows)
            {
                databaseName = row[0].ToString();
            }

The downside to this is that, much like parsing the connection string it's going to be provider specific. Plus, this will be the slowest due to the cost of instantiating connections. Not to mention ensuring the build server has all the correct drivers installed.

Annotations

"Prior proper planning prevents poor performance" so let's spend a moment up front to define our metadata and then use Linq to extract that information. Since we can't guarantee that a connection node has an Annotation tag named DatabaseName, we need to test for the existence (Any) and if we find one, we'll extract the value.

            if (conn.Annotations.Any(an => an.Tag=="DatabaseName"))
            {
                // We use the Select method to pull out only the thing, Text, that we are interested in
                databaseName = conn.Annotations.Where(an => an.Tag=="DatabaseName").Select(t => t.Text).First().ToString();
            }

The downside to this approach is that it requires planning and a bit of double entry as you need to keep the metadata (annotations) synchronized with the actual connection string. But since we're automating kind of people, that shouldn't be a problem...

Databases.biml

Putting it all together, our Databases.biml file becomes

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<#@ template tier="10" #>
    <Databases>
    <#
    foreach(AstOleDbConnectionNode conn in this.RootNode.OleDbConnections)
    {
        string databaseName = "unknown";
        
        // Test whether the annotation collection contains a tag named DatabaseName
        if (conn.Annotations.Any(an => an.Tag=="DatabaseName"))
        {
            // We use the Select method to pull out only the thing, Text, that we are interested in
            databaseName = conn.Annotations.Where(an => an.Tag=="DatabaseName").Select(t => t.Text).First().ToString();
        }
        else
        {
            // No annotations found
            bool useStringParsing = true;
            if (useStringParsing)
            {
         // use string parsing and prayer
         try
         {
            string [] kVpish;
            KeyValuePair<string, string> kvp ;
            // This is probably the most fragile as it needs to take into consideration all the
            // possible connection styles. Get well acquainted with ConnectionStrings.com
            // Split our connnection string based on the delimiter of a semicolon
            foreach (var element in conn.ConnectionString.Split(new char [] {';'}))
            {
                kVpish = element.Split(new char[]{'='});
                if(kVpish.Count() > 1)
                {
                    kvp = new KeyValuePair<string, string>(kVpish[0], kVpish[1]);
                    if (String.Compare("Initial Catalog", kvp.Key, StringComparison.InvariantCultureIgnoreCase) == 0)
                    {
                        databaseName = kvp.Value;
                    }
                }
            }
         }
         catch (Exception ex)
         {
             databaseName = string.Format("{0}_{1}", "Error", ex.ToString());
         }
            }
            else
            {
                // Or we can use database connectivity
                // This query would need to have intelligence built into it to issue correct statement
                // per database. This works for SQL Server only
                string queryDatabaseName = @"";
                System.Data.DataTable dt = null;
                dt = ExternalDataAccess.GetDataTable(conn, queryDatabaseName);
                foreach (System.Data.DataRow row in dt.Rows)
                {
                    databaseName = row[0].ToString();
                }
            }
            
        }
        
    #>
        <Database Name="<#= conn.Name #>.<#= databaseName#>" ConnectionName="<#= conn.Name #>" />
        <#
        }
        #>
    </Databases>
</Biml>

And that's what it takes to use BimlScript to build out the Database collection nodes based on Annotation, string paring or database querying. Use Annotations to enrich your Biml objects with good metadata and then you can use it to simplify future operations.

No comments: