Generating complex text files from Dynamics 365 with KingswaySoft and XSLT

Recently I needed to build an SSIS package that would generate a delimited text file with multiple sets of related (but different entity) Dynamics 365 records combined in a single grouping. Each record would be on a separate row (with different formats for each record type), and each grouping of related records would have its own header and trailer rows. To give you a better idea of what that would look like, consider this example format for listing accounts and related contacts:

HEADER  {accountid}~
ACCOUNT {account name}|{account city}|{account state}~
CONTACT {contact id}|{contact name}|{contact phone number}~
TRAILER {accountid}|{number of contacts}~

A data export from Dynamics 365's sample data would look like this:

HEADER  2023a3fa-5d74-e611-944a-08002751dfcb~
ACCOUNT A. Datum Corporation (sample)|Redmond|WA~
CONTACT 8623a3fa-5d74-e611-944a-08002751dfcb|Rene Valdes (sample)|555-0108~
CONTACT 8c23a3fa-5d74-e611-944a-08002751dfcb|Susan Burk (sample)|555-0111~
TRAILER 2023a3fa-5d74-e611-944a-08002751dfcb|2~
HEADER  1423a3fa-5d74-e611-944a-08002751dfcb~
ACCOUNT Adventure Works (sample)|Santa Cruz|TX~
CONTACT 7a23a3fa-5d74-e611-944a-08002751dfcb|Nancy Anderson (sample)|555-0102~
TRAILER 1423a3fa-5d74-e611-944a-08002751dfcb|1~

I considered using SSIS script components to construct the text output using .Net's normal string manipulation capabilities, but that seemed like it would be a lot of work, and if the output format ever needed to change, I'd need to make changes to my code to handle it. After some pondering, I realized that if I could generate an XML document from the Dynamics 365 records, I would be able to use XSLT to transform it to any kind of text output. Additionally, if the output format ever needed to change, I would only have to update the XSLT stylesheet.

This approach requires three steps:

  1. Query the different data sets.
  2. Merge them into a single XML document.
  3. Convert XML into output format with XSLT.

Querying the data

I'm using the KingswaySoft SSIS integration toolkit to query the Dynamics 365 data. I have a query for active accounts and a separate query for active contacts. Here's the FetchXML for each query:

<fetch version="1.0" output-format="xml-platform" mapping="logical" distinct="false">
  <entity name="account">
    <attribute name="name" />
    <attribute name="address1_city" />
    <attribute name="primarycontactid" />
    <attribute name="telephone1" />
    <attribute name="accountid" />
    <attribute name="address1_stateorprovince" />
    <order attribute="name" descending="false" />
    <filter type="and">
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
    <link-entity name="contact" from="contactid" to="primarycontactid" visible="false" link-type="outer" alias="accountprimarycontactidcontactcontactid">
      <attribute name="emailaddress1" />
    </link-entity>
  </entity>
</fetch>

<fetch version="1.0" output-format="xml-platform" mapping="logical" distinct="false">
  <entity name="contact">
    <attribute name="fullname" />
    <attribute name="parentcustomerid" />
    <attribute name="telephone1" />
    <attribute name="emailaddress1" />
    <attribute name="contactid" />
    <order attribute="fullname" descending="false" />
    <filter type="and">
      <condition attribute="statecode" operator="eq" value="0" />
      <condition attribute="parentcustomerid" operator="not-null" />
    </filter>
  </entity>
</fetch>

Generating the XML

You could generate the XML document using a script component in your data flow, but I'm using the SSIS XML merge component from the KingswaySoft SSIS productivity pack. Note, this component is not included with the Dynamics 365 SSIS integration toolkit, so you will need an additional license (or the KingswaySoft ultimate edition license) to use it.

The XML merge component lets you specify the relationships between the different data sets.

And then you can specify the output format.

The output is a DT_NTEXT column called "MergedXml," but you can change the name.

Converting the XML

SSIS has an XML task component you can use in the package control flow to transform XML with XSLT, but I had some problems getting that to work just the way I wanted. Instead I decided to perform the transformation in a script component inside my data flow. Here's the code from the script component to do that:

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
	//build the xsl
	XslCompiledTransform xslt = new XslCompiledTransform();
	string xsl = "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
		"<xsl:template match=\"/\">" +
		"<xsl:for-each select=\"accounts/account\">" +
		"\n" +
		"HEADER  <xsl:value-of select=\"accountid\"/>~\n" +
		"ACCOUNT <xsl:value-of select=\"name\"/>|<xsl:value-of select=\"address1_city\"/>|<xsl:value-of select=\"address1_stateorprovince\"/>~\n" +
		"<xsl:for-each select=\"contacts/contact\">" +
		"CONTACT <xsl:value-of select=\"contactid\"/>|<xsl:value-of select=\"fullname\"/>|<xsl:value-of select=\"telephone1\"/>~\n" +
		"</xsl:for-each>" +
		"TRAILER <xsl:value-of select=\"accountid\"/>|<xsl:value-of select=\"count(contacts/contact)\" />~" +
		"</xsl:for-each>" +
		"</xsl:template>" +
		"</xsl:stylesheet>";
	System.Text.UnicodeEncoding myEncoder = new System.Text.UnicodeEncoding();
	Byte[] bytes = myEncoder.GetBytes(xsl);
	MemoryStream xslms = new MemoryStream(bytes);

	//load it
	xslt.Load(XmlReader.Create(xslms));

	//transform the xml using the xslt
	string outputtext = "";
	string rawxml = BlobColumnToString(Row.MergedXml);
	using (StringReader sr = new StringReader(rawxml))
	{
		using (XmlReader xr = XmlReader.Create(sr))
		{
			using (StringWriter sw = new StringWriter())
			{
				xslt.Transform(xr, null, sw);
				outputtext = sw.ToString();
			}
		}
	}

	//to remove the xml document declaration in the first line, we need to split the output into an array of lines first
	string[] lines = outputtext.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);

	//join all the lines except the first one back together
	string outputlines = string.Join("\n", lines, 1, lines.Length - 1); //leave out the first line (xml declaration)

	//add an output row
	Output0Buffer.AddRow();

	//set the value of the output row column to the rejoined output lines value
	Output0Buffer.TransformOutput.AddBlobData(System.Text.Encoding.UTF8.GetBytes(outputlines));
}

/// <summary>
/// converts blob column data to a string - from https://stackoverflow.com/a/14253887
/// </summary>
string BlobColumnToString(Microsoft.SqlServer.Dts.Pipeline.BlobColumn blobColumn)
{
	if (blobColumn.IsNull)
		return string.Empty;

	var blobLength = Convert.ToInt32(blobColumn.Length);
	var blobData = blobColumn.GetBlobData(0, blobLength);
	var stringData = System.Text.Encoding.Unicode.GetString(blobData);

	return stringData;
}

A few notes:

  1. You will need to add using directives for these namespaces to your script: System.Xml, System.Xml.Xsl, System.IO.
  2. The BlobColumnToString helper method is necessary to read the contents of the MergedXml column into a string variable.
  3. Ideally you would store the XSL outside the script code so that it's easier to manage.
  4. When you transform the XML, the first line of the output will be an XML document declaration. I am removing that first line of the output with the code starting at line 44 in my sample above.