Last Week on DirectX Shader Compiler (2017-09-09)
This past week was relatively quiet: it was a short week, and the team was busy with a bunch of planning. I'm really hoping we can share some of those plans soon.
To make up for the slim news, I'll be talking a bit about specific passes. This first time, however, I'll give a brief overview of how to get started.
First, the tool for playing with these passes is not available in the Windows SDK, only in the repo. Clone it and build it (it should be pretty straightforward), and then in your binaries folder you'll find a dndxc tool, which provides a bit of UI over various compiler objects.
I'm going to start with this empty compute shader.
// -*- mode: hlsl; hlsl-target: cs_6_0; hlsl-args: /Od; -*-
[numthreads(1,1,1)]
void main() {
}
When you click on the Optimizer tab, all known passes will be populated, as well as the specific configuration for /Od, which is what the shader specifies on its first line. Because we've selected the debug configuration, there will be only a few passes. Here is the list of selected passes I get on my current build; we'll use this as a starting point in the future when we want to dig into some other pass.
- opt-fn-passes
- tti
- verify
- targetlibinfo
- opt-mod-passes
- tti
- hlsl-hlensure
- always-inline,InlineThreshold=2294967296,InsertLifetime=f
- barrier
- scalarrepl-param-hlsl
- scalarreplhlsl
- hlmatrixlower
- resource-handle
- dce
- globaldce
- hlsl-dxil-legalize-eval-operations
- dynamic-vector-to-array,ReplaceAllVectors=1
- simplify-inst
- simplifycfg
- hlsl-dxil-legalize-resource-use
- hlsl-dxil-legalize-static-resource-use
- dxilgen
- hlsl-dxilload
- simplify-inst
- hlsl-dxil-precise
- scalarizer
- simplify-inst
- simplifycfg
- dce
- multi-dim-one-dim
- hlsl-dxil-condense
- dxil-legalize-sample-offset
- hlsl-dxilfinalize
- viewid-state
- dxil-dfe
- hlsl-dxilemit
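If you want to process a listing like this outside the tool, here is a minimal Python sketch that splits each entry into a pass name and its options. The "name,Key=Value,Key=Value" shape is only inferred from entries such as always-inline above, not from a documented grammar, and parse_pass is a hypothetical helper, not part of dndxc.

```python
def parse_pass(line):
    """Split one listing entry into (pass_name, options).

    Assumes the "- name,Key=Value,..." shape seen in the listing above;
    this is an inferred format, not a documented one.
    """
    text = line.strip()
    if text.startswith("-"):
        # Drop the leading bullet marker, keep hyphens inside the name.
        text = text[1:].strip()
    name, *raw_options = text.split(",")
    options = {}
    for item in raw_options:
        key, _, value = item.partition("=")
        options[key] = value
    return name, options


listing = """
- dce
- always-inline,InlineThreshold=2294967296,InsertLifetime=f
- dynamic-vector-to-array,ReplaceAllVectors=1
"""

for entry in listing.splitlines():
    if entry.strip():
        print(parse_pass(entry))
```

Option values stay as strings on purpose, since the listing alone doesn't say how each pass interprets them.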
Next, select 'Print all passes' in the Optimizer tab of dndxc, then click Run Passes. This will bring up a window with the passes that ran as well as a diff of the shader at various stages.
The 'start' state is the LLVM IR as it was generated by the compiler.
target datalayout = "e-m:e-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "dxil-ms-dx"
%ConstantBuffer = type opaque
@"$Globals" = external constant %ConstantBuffer
; Function Attrs: nounwind
define void @main() #0 {
entry:
ret void
}
attributes #0 = { nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-realign-stack" "stack-protector-buffer-size"="0" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!dx.version = !{!1}
!dx.valver = !{!2}
!dx.shaderModel = !{!3}
!dx.typeAnnotations = !{!4}
!dx.entryPoints = !{!8}
!dx.fnprops = !{!12}
!dx.options = !{!13}
!dx.resource.type.annotation = !{!7}
!0 = !{!"dxcoob 2017.6"}
!1 = !{i32 1, i32 0}
!2 = !{i32 0, i32 0}
!3 = !{!"cs", i32 6, i32 0}
!4 = !{i32 1, void ()* @main, !5}
!5 = !{!6}
!6 = !{i32 1, !7, !7}
!7 = !{}
!8 = !{void ()* @main, !"main", null, !9, null}
!9 = !{null, null, !10, null}
!10 = !{!11}
!11 = !{i32 0, %ConstantBuffer* @"$Globals", !"$Globals", i32 0, i32 -1, i32 1, i32 0, null}
!12 = !{void ()* @main, i32 5, i32 1, i32 1, i32 1}
!13 = !{i32 144}
Because the shader is pretty much empty, there are really no changes until globaldce runs. globaldce performs dead code elimination on the whole program, and in this case it discovers that the @"$Globals" variable isn't used. The front-end always generates it to hold constant buffer globals that don't fall under any particular cbuffer, but in this case we don't need it, so out it goes. Its type is also removed, and the !11 metadata loses its reference to the variable as well.
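To make the idea concrete, here is a toy Python sketch of the reachability check behind whole-program dead code elimination: start from the entry point, mark everything it references, and treat the rest as dead. This is not the LLVM implementation. The module is modeled as a plain dictionary of symbol references, find_live is a hypothetical name, and types and metadata are not modeled at all.

```python
def find_live(roots, uses):
    """Return the set of symbols reachable from the root symbols.

    `uses` maps each symbol to the symbols it references. Anything
    not reachable from a root is a candidate for removal.
    """
    live = set()
    worklist = list(roots)
    while worklist:
        symbol = worklist.pop()
        if symbol in live:
            continue
        live.add(symbol)
        worklist.extend(uses.get(symbol, ()))
    return live


# Toy model of the empty shader: main references nothing.
module = {"@main": [], '@"$Globals"': []}
live = find_live(["@main"], module)
dead = sorted(set(module) - live)
print(dead)  # ['@"$Globals"']
```

If main did reference the globals, for example with module = {"@main": ['@"$Globals"'], '@"$Globals"': []}, the dead list would come back empty and the variable would stay.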
There are no instructions to simplify, and resource legalization doesn't do anything, so again nothing happens until dxilgen runs. At that point the high-level metadata representations go away and we introduce a createHandle function, which adds these:
- A type for handles: %dx.types.Handle = type { i8* }
- A function declaration: declare %dx.types.Handle @dx.op.createHandle(i32, i8, i32, i32, i1) #1
- A particular combination of function attributes: attributes #1 = { nounwind readonly }
None of the other passes does anything until hlsl-dxil-condense (which tries to remove unused resources) eventually chucks out the createHandle function.
Finally, hlsl-dxilfinalize cleans up the attributes on the main function, and at last hlsl-dxilemit reintroduces metadata into the module, this time conforming to the DXIL specification.
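If you save the module text at two stages and want to compare them outside the tool, a unified diff from Python's standard difflib module is enough. The sketch below is only an illustration: the two snapshots are hand-written from the description above (the lines dxilgen adds, then the same lines without the createHandle declaration), not captured from dndxc, and whether the handle type and attribute group also disappear at that stage is not modeled. diff_stages is a hypothetical helper name.

```python
import difflib


def diff_stages(before, after, before_name, after_name):
    """Return a unified diff between two module snapshots as one string."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=before_name,
        tofile=after_name,
        lineterm="",
    )
    return "\n".join(lines)


# Hand-written snapshots, for illustration only.
after_dxilgen = """%dx.types.Handle = type { i8* }
declare %dx.types.Handle @dx.op.createHandle(i32, i8, i32, i32, i1) #1
attributes #1 = { nounwind readonly }"""

after_condense = """%dx.types.Handle = type { i8* }
attributes #1 = { nounwind readonly }"""

print(diff_stages(after_dxilgen, after_condense, "dxilgen", "hlsl-dxil-condense"))
```

Removed lines show up with a leading minus sign, so the dropped declaration is easy to spot.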
Feel free to play with the tool, and let me know if you run into trouble.
Enjoy!
Marcelo