Designing a GPU-oriented geometry abstraction – Part Two.
My last post described the problem of crafting an appropriate geometry abstraction for Bling. Bling previously solved the code problem for vertex and pixel shading, but lacked decent geometry input abstractions as well as an abstraction that supported geometry shading. The last post proposed giving geometry its own abstraction, but I was a bit hesitant about including properties within the geometry, which leads to composition and addressing problems. I think it’s time to back up a bit and describe the abstractions that are relevant to a traditional programmable rendering pipeline:
- Pixels as children of geometric primitives. The position of a pixel is fixed; a pixel shader only specifies each pixel’s color and, optionally, its depth. It often makes sense to have other properties beyond color and depth at the pixel level for better shading computations; e.g., a normal computed on a per-pixel basis via a parametric equation or via normal mapping. Other properties can be accessed from the preceding vertex and geometry shading phases, where a property’s per-pixel value is interpolated from its values at the enclosing primitive’s vertices. Quality improves when properties are computed on a per-pixel basis rather than interpolated; e.g., consider Phong vs. Gouraud shading.
- Vertices as the points that define primitives. A vertex shader specifies the position of each vertex and computes per-vertex properties that are used in later geometry and pixel shading.
- Primitive geometries that are used to form complete geometries. A primitive has a point, line, or triangle topology, with triangles being the primitives that most commonly reach the rendering stage. A primitive is defined by multiple vertices and encloses multiple pixels. Geometry shading works as a primitive translator, where one primitive can be translated into zero or more primitives of a possibly different topology. During geometry shading, primitives are defined by their vertices.
- A geometry as a collection of primitives. Its initial definition is a set of vertices, a topology for those vertices, and optionally either an index buffer that allows arbitrary vertex sharing between primitives, or an implicit adjacency relationship with explicit breaks (also used in geometry shading). A sketch of these relationships follows this list.
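To make these relationships concrete, here is a minimal sketch of the pipeline data in plain Python rather than Bling’s C#; the names Vertex, Geometry, and Topology are hypothetical stand-ins, not Bling’s actual types:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class Topology(Enum):
    POINT = 1      # value doubles as vertices-per-primitive
    LINE = 2
    TRIANGLE = 3

@dataclass
class Vertex:
    position: tuple                                  # set by the vertex shader
    properties: dict = field(default_factory=dict)   # per-vertex data (normal, uv, ...)

@dataclass
class Geometry:
    vertices: list                  # the vertex set
    topology: Topology              # how vertices group into primitives
    indices: Optional[list] = None  # optional index buffer for vertex sharing

    def primitives(self):
        """Yield each primitive as a tuple of its vertices."""
        order = self.indices if self.indices is not None else list(range(len(self.vertices)))
        n = self.topology.value
        for i in range(0, len(order) - n + 1, n):
            yield tuple(self.vertices[j] for j in order[i:i + n])
```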
As described in the last post, we can define an abstraction for geometry that supports composition, duplication, and transformation. Ideally, rendering could then involve forming a geometry in a clean functional way through multiple compositions and transformations, and passing the resulting geometry into a render command. Transformations not only include modifying the layout of the geometry by rotating, scaling, and translating it, but also applying color and lighting to the geometry and the pixels contained within it, or whatever else is required for a complete rendering specification. Lighting could even be applied to the constituent parts of a geometry before they are composed. The various properties needed to make this possible include things like diffuse, specular, glass, and refractive materials, as well as additional non-geometry constituents such as directional, ambient, and spot lights. Essentially, the geometry would then become a mini-scene graph.
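Continuing the Python sketch above, here is roughly what composition and a layout transformation could look like; compose and translate are hypothetical operators for illustration, not Bling’s API:

```python
def compose(*geometries):
    """Merge geometries of the same topology into one geometry, rewriting indices."""
    topology = geometries[0].topology
    assert all(g.topology == topology for g in geometries)
    vertices, indices = [], []
    for g in geometries:
        base = len(vertices)
        source = g.indices if g.indices is not None else range(len(g.vertices))
        indices.extend(base + i for i in source)
        vertices.extend(g.vertices)
    return Geometry(vertices, topology, indices)

def translate(geometry, offset):
    """Layout transformation: return a copy of the geometry moved by a fixed offset."""
    moved = [Vertex(tuple(p + o for p, o in zip(v.position, offset)), dict(v.properties))
             for v in geometry.vertices]
    return Geometry(moved, geometry.topology, geometry.indices)

# e.g., scene = compose(translate(wall, (0.0, 0.0, 1.0)), floor), then pass scene to a render call
```

Lighting would fit the same shape: a transformation that attaches shading properties to the geometry it wraps rather than moving its vertices.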
Scene graphs are common in retained graphics APIs such as WPF 3D and Java 3D. Basically, a scene graph is a graph of the elements that affect scene rendering, and it becomes the basis for what is shown on the screen. Since we are only interested in what can be efficiently rendered in a shader pipeline, we have to keep the graph nodes mostly homogeneous: duplicating and transforming a geometry in the graph is fine, but composing completely different geometries with different lighting schemes is probably not going to work within one rendering call (instead, render the geometries in separate rendering calls).
The primary difference from my previous geometry abstraction, then, is that this new geometry embeds properties and transformations at every level of composition. A level in a geometric composition can omit or duplicate properties from a higher level, and a transformation (e.g., lighting or layout) should read a property from the lowest level at which it exists in the geometry the transformation is attached to; e.g., a normal property at the pixel level is preferred over a normal property at the vertex level because it is more accurate for lighting computations. The problem with this approach is that we can lose static typing: what if a property does not exist at any level? Right now I’m willing to live with dynamic checking since it will occur relatively early in Bling, when code is generated at application startup.
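A minimal sketch of that resolution rule, with hypothetical names and continuing the Python sketch (Bling itself generates shader code rather than running lookups like this at render time): the lookup walks from the finest level (pixel) to the coarsest, and a missing property only fails dynamically, at the code-generation step that runs at application startup:

```python
class MissingPropertyError(Exception):
    """Raised at code-generation time when no level defines the requested property."""

def resolve(name, levels):
    """levels is an ordered list of (level_name, properties) pairs, finest level first."""
    for level_name, properties in levels:
        if name in properties:
            return level_name, properties[name]
    raise MissingPropertyError(f"no level defines '{name}'")

levels = [("pixel",    {"normal": "per-pixel, e.g. from a normal map"}),
          ("vertex",   {"normal": "per-vertex, interpolated"}),
          ("geometry", {})]

resolve("normal", levels)    # -> ('pixel', ...): the per-pixel normal wins
resolve("specular", levels)  # raises MissingPropertyError at startup, not at compile time
```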