XSLTC — Compile XSLT to .NET Assembly
In my two previous posts I described a potential performance hit caused by XSLT-to-MSIL compilation and JIT-compilation when you load and run an XSLT stylesheet with the XslCompiledTransform engine for the first time. Since the .NET Framework 2.0 did not allow you to save compiled stylesheets, you had to pay the compilation price on each application run.
XSLT Compiler Utility
The good news is that we are providing the XSLT Compiler command-line utility, xsltc.exe (announced here), which can be used to compile multiple stylesheets into one assembly. The changes to the System.Xml assembly required for this utility to work are shipped with .NET Framework 2.0 Service Pack 1, and the utility itself is shipped with Windows SDK 6.0, which absorbs the .NET Framework SDK. Both of these components will be installed by Visual Studio 2008. Below is the usage screen of xsltc.exe:
C:\>xsltc.exe /?
Microsoft (R) XSLT Compiler version 3.5
[Microsoft (R) .NET Framework version 2.0.50727]
Copyright (C) Microsoft Corporation. All rights reserved.
xsltc [options] [/class:<name>] <source file> [[/class:<name>] <source file>...]
XSLT Compiler Options
- OUTPUT FILES -
/out:<file> Specify name of binary output file (default: name of the first file)
/platform:<string> Limit which platforms this code can run on: x86, Itanium, x64, or anycpu,
which is the default
- CODE GENERATION -
/class:<name> Specify name of the class for compiled stylesheet (short form: /c)
/debug[+|-] Emit debugging information
/settings:<list> Specify security settings in the format (dtd|document|script)[+|-],...
Dtd enables DTDs in stylesheets, document enables document() function,
script enables <msxsl:script> element
- MISCELLANEOUS -
@<file> Insert command-line settings from a text file
/help Display this usage message (short form: /?)
/nologo Suppress compiler copyright message
The most useful options are /class and /out. If you have not specified the class name for a stylesheet, it defaults to the name of the file containing that stylesheet, without the extension. The /debug option disables practically all optimizations (beware of performance degradation!) and creates a PDB file for the output assembly, which allows you to debug stylesheets with a debugger. For security reasons, DTDs in stylesheets, the document() XSLT function, and msxsl:script elements are disabled by default; you have to enable them explicitly with the /settings option if required. Each stylesheet is compiled into an abstract class, which can be loaded later by a new XslCompiledTransform.Load overload:
public void Load(Type compiledStylesheet);
Compiling stylesheets into an assembly both simplifies the deployment (you don't have to deploy multiple stylesheet files) and eliminates XSLT-to-MSIL compilation time. Moreover, you may also eliminate JIT-compilation time by installing the resulting assembly in the native image cache.
How to Use It
Let us take, for example, a couple of the DocBook stylesheets, which had the worst JIT-compilation time in my previous experiment, and compile them:
C:\docbook-xsl-1.72.0>xsltc /settings:dtd+,document+ /class:DocBookToHtml html\docbook.xsl /class:DocBookToFO fo\docbook.xsl
If you run the ILDASM tool on the resulting docbook.dll assembly, you will see two classes, DocBookToFO and DocBookToHtml, generated for the stylesheets specified on the command line, along with two helper $ArrayType$... classes used internally to initialize XSLT engine runtime tables:
[Figure: Assembly with compiled DocBook stylesheets]
To use compiled stylesheets from your favorite .NET language, you need to add a reference to docbook.dll to your project and pass the desired class to the XslCompiledTransform.Load method. After that you may call the Transform methods on the loaded XslCompiledTransform object in the usual way:
XslCompiledTransform stylesheet = new XslCompiledTransform();
stylesheet.Load(typeof(DocBookToHtml));
stylesheet.Transform("input.xml", "output.html");
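If the stylesheet class is only known at run time (for example, its name comes from configuration), the same Load(Type) overload can be fed a type obtained through reflection instead of typeof. The following is a minimal sketch under that assumption; the CompiledStylesheetLoader helper and its parameter names are hypothetical and not part of System.Xml.

```csharp
using System;
using System.Reflection;
using System.Xml.Xsl;

// Hypothetical helper for illustration; not part of System.Xml or xsltc.exe.
static class CompiledStylesheetLoader
{
    // Loads the assembly produced by xsltc.exe and returns a ready-to-use transform.
    // className is the name given with /class (or the default derived from the file name).
    public static XslCompiledTransform LoadFrom(string assemblyPath, string className)
    {
        Assembly assembly = Assembly.LoadFrom(assemblyPath);

        // Second argument: throw an exception if the type cannot be found.
        Type compiledStylesheet = assembly.GetType(className, true);

        XslCompiledTransform transform = new XslCompiledTransform();
        transform.Load(compiledStylesheet);
        return transform;
    }
}
```

With the assembly built above, a call such as CompiledStylesheetLoader.LoadFrom("docbook.dll", "DocBookToHtml").Transform("input.xml", "output.html") should behave like the typeof-based version, at the cost of losing compile-time checking of the class name.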
To improve startup time you may choose to "pre-JIT" the assembly by installing a native image for it in the native image cache. Before that, however, you probably want to change the preferred base address of the assembly to avoid rebasing (I recommend reading the Improving Application Startup Time and NGen Revs Up Your Performance with Powerful New Features articles). The xsltc.exe utility does not support the /baseaddress option, but you may use either the rebase.exe or the editbin.exe tool, both of which come with Visual Studio®:
C:\docbook-xsl-1.72.0>editbin.exe /rebase:base=0x60000000 docbook.dll /nologo
C:\docbook-xsl-1.72.0>ngen install docbook.dll /nologo
Installing assembly C:\docbook-xsl-1.72.0\docbook.dll
Compiling 1 assembly:
Compiling assembly C:\docbook-xsl-1.72.0\docbook.dll ...
docbook, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null
You may ask why we decided to compile stylesheets to abstract classes instead of implementing some common interface similar to IXmlTransform from the Mvp.Xml project. There were two main reasons. First, System.Xml is a "red" assembly, and changes in the red bits have been greatly limited in Orcas. We tried to keep public API changes as minimal as possible. Second, implementing XSLT 2.0 in the next release of the .NET Framework will probably require us to change the interface anyway.
Script Assemblies
If the stylesheet contains msxsl:script elements, their content is compiled to one or more separate assemblies using the CodeDOM technology. Since the CodeDOM does not allow code snippets in different languages in a single assembly, one script assembly per script language is created. Suppose, for example, that the stylesheet MyTransform.xsl contains C# and Visual Basic .NET script blocks. When you compile it, three assemblies will be created: MyTransform.dll, containing compiled XSLT code; MyTransform.Script.cs.dll, containing compiled C# script blocks; and MyTransform.Script.vb.dll, containing compiled Visual Basic .NET script blocks. You may merge script assemblies with the XSLT assembly using the ILMerge utility:
C:\MyTransform>ILMerge /out:MyTransform.dll MyTransform.dll MyTransform.Script.cs.dll MyTransform.Script.vb.dll
Limitations
Currently xsltc.exe does not allow embedding XML files as resources. Why might you need that? Suppose that the stylesheet C:\MyTransform\MyTransform.xsl contains the relative document references document('') and document('config.xml'). If you compile it and deploy it to another machine, it will try to read the C:\MyTransform\MyTransform.xsl and C:\MyTransform\config.xml files, respectively, which will result in an error unless you deploy MyTransform.xsl and config.xml in the same folder as on the build machine. You may think that relative document references should be resolved relative to the location of the compiled XSLT assembly, or that all documents referenced with relative URIs should be embedded in the assembly, but there are always cases when you need a different behavior. Fortunately, this problem may be resolved by modifying xsltc.exe to use a custom XmlResolver; I may write on this later.
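Short of modifying xsltc.exe, one possible workaround is to remap the build-machine URIs at run time by passing a custom resolver to Transform. The sketch below is only an illustration, not the fix alluded to above: the DeploymentFolderResolver class is hypothetical, and it assumes that the XSLT runtime routes document() references through XmlResolver.ResolveUri before fetching them.

```csharp
using System;
using System.Xml;

// Hypothetical resolver for illustration; not part of System.Xml or xsltc.exe.
// It redirects any URI located under the build-machine folder to the
// corresponding location under the deployment folder.
public sealed class DeploymentFolderResolver : XmlUrlResolver
{
    private readonly Uri buildRoot;   // stylesheet folder on the build machine
    private readonly Uri deployRoot;  // folder holding the same files after deployment

    // Both URIs are expected to end with a slash so that they denote folders.
    public DeploymentFolderResolver(Uri buildRoot, Uri deployRoot)
    {
        this.buildRoot = buildRoot;
        this.deployRoot = deployRoot;
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        // Let the default implementation combine base and relative parts first.
        Uri resolved = base.ResolveUri(baseUri, relativeUri);

        if (resolved.IsAbsoluteUri && buildRoot.IsBaseOf(resolved))
        {
            // Keep the path below the build folder and re-root it.
            Uri relative = buildRoot.MakeRelativeUri(resolved);
            return new Uri(deployRoot, relative);
        }
        return resolved;
    }
}
```

The resolver would then be supplied through the Transform(XmlReader, XsltArgumentList, XmlWriter, XmlResolver) overload, with the two folder URIs chosen by the application (for instance, the deployment folder could be derived from the location of the compiled assembly).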
Another limitation is that while XslCompiledTransform compiles a stylesheet to a set of DynamicMethods that can be unloaded, an assembly generated by xsltc.exe cannot be unloaded until you shut down all AppDomains that used it (an infamous CLR limitation). This should not be a problem if you have a small set of fixed stylesheets, but it becomes a real issue in server scenarios where thousands of stylesheets are generated dynamically based on user settings and customizations. We are actively investigating possible solutions for server scenarios that do not require complicated AppDomain manipulations.
Under the Hood
Under the hood, xsltc.exe is a wrapper around the new XslCompiledTransform.CompileToType static method. You don't need to know about it unless you are developing your own version of the XSLT compiler. We expect that very few people will ever need to call this low-level method directly, as most will use xsltc.exe and optionally do some post-processing with other command-line utilities. However, for the sake of completeness, here is a brief description. (WARNING: The signature of the CompileToType method in beta releases of .NET Framework 2.0 SP1 may differ from the one given below.)
// Compiles an XSLT stylesheet to a System.Type
public static CompilerErrorCollection CompileToType(
XmlReader stylesheet,
XsltSettings settings,
XmlResolver stylesheetResolver,
bool debug,
TypeBuilder typeBuilder,
string scriptAssemblyPath);
stylesheet
The XmlReader positioned on the beginning of the stylesheet.
settings
The XsltSettings to apply to the stylesheet. If this is null, the XsltSettings.Default settings are applied.
stylesheetResolver
The XmlResolver used to resolve any stylesheet modules referenced in xsl:import and xsl:include elements. If this is null, external resources are not resolved.
debug
true to compile in debug mode; otherwise false. Setting this to true enables debugging the stylesheet with a debugger.
typeBuilder
The TypeBuilder to use for the stylesheet compilation.
scriptAssemblyPath
The base path for the assemblies generated for msxsl:script elements. If only one script assembly is generated, this parameter specifies the path for that assembly. In case of multiple script assemblies, a distinctive suffix will be appended to the file name to ensure uniqueness of assembly names.
Return Value
A CompilerErrorCollection object containing the compiler errors and warnings that indicate the results of the compilation.
Note that the first three parameters are the same as in the XslCompiledTransform.Load method. The xsltc.exe utility creates an AssemblyBuilder and a ModuleBuilder, then for each stylesheet specified on the command line creates a TypeBuilder and compiles the stylesheet into it using the CompileToType method. Compiler errors and warnings returned from the CompileToType method are output to the console. If all stylesheets have been compiled successfully, the dynamic assembly is saved to disk. If you are new to Reflection.Emit, you may find this dynamic assembly sample code useful.
Conclusion
The xsltc.exe utility allows you to precompile XSLT stylesheets so that your application will not incur the performance penalty of XSLT-to-MSIL and JIT-compilation on the first stylesheet execution. It also makes deployment of complex XSLT solutions, consisting of dozens of files, less cumbersome and protects your source XSLT code. Multiple stylesheets may be compiled into a single assembly, and the resulting assembly may be merged with the main DLL or EXE file of your application using the ILMerge utility.
Comments
- Anonymous
July 01, 2007
I have .NET 2.0 installed on my machine, but I couldn't find xsltc.exe. Where will xsltc.exe be available? Will it be available with .NET 3.0?