Nicholas Woodfield's Portfolio

"The best way to predict the future is to implement it."




Material Parser (Tesla Engine)


This code sample showcases a feature of the "Tesla Engine" graphics engine, which I have been working on since June 2010. It describes the problem and the code solution, with an overview of the provided source files. Since this is not a stand-alone project, only the source files of the pertinent classes (namely the tokenizer and parser classes) are downloadable. The showcase is meant as an example of how I write code and how I approach solving a problem. Links to the source files of other pertinent objects are also provided; these are located in the trunk of the SVN code repository for the Tesla Engine project.

Material parsing is part of Tesla's Content Pipeline, which includes functionality to serialize/deserialize engine objects (or even user-defined objects) to a binary format (TEBO files). This means models and entire scenes can be saved in a compact format that is very close to the run-time representation, which allows for faster loading of content. This is typically known as "off-line" content processing, where the content is saved in a format that requires no pre-processing at runtime. The engine's content classes also provide resource loaders to import different texture, shader, and model formats, as well as content caching to maximize performance (e.g. to avoid duplicated textures in memory, which waste space, since those objects should be re-used).


You can download the material parser source files and accompanying PDF documentation from the link below.



Problem Description


The term Material is used to describe how geometry is rendered, e.g. what color or image is used, lighting properties, and other render states. So every renderable object in the engine has at least one material. Since Tesla is a shader-driven graphics engine, a material uses a set of shaders in order to transform geometry data and control shading. A shader is a small program that runs on the GPU hardware, allowing complete control over the rendering pipeline.

One of the early goals of the engine was to create a scripting language in order to easily control how geometry is rendered. This is important to artists, as an artist may not necessarily be a programmer. Thus a way to describe how an object is rendered without writing any C# code would make the engine very friendly for artists or even GUI-based authoring tools. Additionally, it allows for content to be modified without requiring recompilation of source code.

So to recap:

Problem: Create an easy, flexible method to control how geometry is rendered.

Specifications:

  1. Human readable format.
  2. Requires no C# code or any programming expertise.
  3. Does not require application to be recompiled in order to modify content.
  4. Script can be read from and written to a content file.

Solution Overview


I prototyped several implementations of a solution (including an XML-based format) until I arrived at what is now known as the Tesla Engine Material (TEM) script. The syntax is similar to that of curly-braced programming languages, where statements are declared inside a curly brace block. Statements are declared one per line, or several can share a line when separated by semicolons. Single-line “//” comments are also allowed.

The script has all the features of its run-time equivalent: the script writer can set any data type available in the engine, such as floats, vectors, matrices, textures, render states, and shaders. It also supports “Engine Values”, which are engine-defined values computed at runtime and bound to shader variables. The engine value system allows data to be sent to the device automatically every frame, which further automates rendering. A good example is the standard “World-View-Projection” matrix that is required to transform vertices to the screen. In a simple graphics application, the programmer would have to write code to compute this value and set it on the device themselves. With Tesla's material system, this is done for them. There are quite a few engine values, giving a shader access to variations of the World, View, and Projection matrices, camera and viewport values, and random and timing values.
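To illustrate the idea of engine values, here is a minimal sketch, not the engine's actual code. All type names (EngineValueSketch, FrameStateSketch, EngineValueBinderSketch) are hypothetical, and System.Numerics.Matrix4x4 stands in for the engine's own matrix type. It only shows how a binding such as "WorldViewProjection : WVP" could be evaluated once per frame without any code written by the application programmer.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical subset of engine values, named after entries of an EngineParameters block.
public enum EngineValueSketch { WorldMatrix, WorldViewProjection }

// Hypothetical per-frame state that engine values are computed from.
public sealed class FrameStateSketch
{
    public Matrix4x4 World = Matrix4x4.Identity;
    public Matrix4x4 View = Matrix4x4.Identity;
    public Matrix4x4 Projection = Matrix4x4.Identity;
}

public sealed class EngineValueBinderSketch
{
    // Shader variable name -> engine value that feeds it.
    private readonly Dictionary<string, EngineValueSketch> _bindings =
        new Dictionary<string, EngineValueSketch>();

    public void Bind(EngineValueSketch value, string shaderVariable)
    {
        _bindings[shaderVariable] = value;
    }

    // Intended to be called once per frame: computes every bound value so the
    // caller can upload the results to the matching shader variables.
    public Dictionary<string, Matrix4x4> Evaluate(FrameStateSketch frame)
    {
        var result = new Dictionary<string, Matrix4x4>();
        foreach (KeyValuePair<string, EngineValueSketch> binding in _bindings)
        {
            result[binding.Key] = Compute(binding.Value, frame);
        }
        return result;
    }

    private static Matrix4x4 Compute(EngineValueSketch value, FrameStateSketch frame)
    {
        switch (value)
        {
            case EngineValueSketch.WorldMatrix:
                return frame.World;
            case EngineValueSketch.WorldViewProjection:
                // System.Numerics uses row vectors, so the product reads left to right.
                return frame.World * frame.View * frame.Projection;
            default:
                throw new ArgumentOutOfRangeException(nameof(value));
        }
    }
}
```

In this sketch the material only stores the binding; the values themselves are recomputed from the current frame state each time Evaluate is called.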

Another key feature of a material script is that it can inherit from another. This means you can write a template script that sets render states and shaders, and then write another script that inherits from the template and only sets specific parameters, such as colors or textures. This allows you to write a large, complicated TEM once, and then write many smaller and simpler scripts for specific objects. If need be, the parent can be changed (such as the shader used), and the change is then reflected in all the scripts that inherit from it. The engine provides a multitude of template TEM scripts that use the built-in shader library, which can be used to easily create new materials.
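The inheritance behavior can be pictured with a small sketch. This is an illustration under simplifying assumptions, not the engine's Material class: MaterialSketch is a hypothetical type, and parameter values are kept as plain strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a parsed material script.
public sealed class MaterialSketch
{
    public string Name { get; }
    public MaterialSketch Parent { get; }

    // Parameters set directly by this script, keyed by shader variable name.
    public Dictionary<string, string> Parameters { get; } =
        new Dictionary<string, string>();

    public MaterialSketch(string name, MaterialSketch parent = null)
    {
        Name = name;
        Parent = parent;
    }

    // Starts from the parent's resolved values and overwrites them with the
    // values this script sets, so the child only has to state what differs.
    public Dictionary<string, string> Resolve()
    {
        Dictionary<string, string> resolved =
            Parent != null ? Parent.Resolve() : new Dictionary<string, string>();
        foreach (KeyValuePair<string, string> entry in Parameters)
        {
            resolved[entry.Key] = entry.Value;
        }
        return resolved;
    }
}
```

Because the sketch resolves values at lookup time, a change made to the parent shows up in every child the next time Resolve is called.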

All of these combine into a powerful feature that allows an artist to easily create material content without having to write additional rendering code. Although it is entirely possible to create a material in C# code, that method may not be favorable to an artist. The only C# code that would actually have to be written is the application itself, and in some cases this may be provided in the form of a model viewer. Tesla Engine also provides mesh data structures and other scene management support to make creating applications much easier.

Examples of a complex and a simple TEM script follow:

//This is a comment
Material TexturedMaterial {

  Effect {
  
    File : Shaders/LitBasicEffect.tebo
    Technique : LitBasicTexture
  }
  
  MaterialParameters {
  
    Texture2D DiffuseMap : Textures/Rock_diffuse.dds
    Vector3 MatDiffuse : .5f .5f .5f
    float Alpha : .5f
  }
  
  EngineParameters {
  
    WorldViewProjection : WVP //WVP is the name of a shader variable
    WorldMatrix : World 
  }
  
  RenderStates {
  
    //AdditiveBlend is a predefined render state
    BlendState blend : AdditiveBlend
  }
  
  Technique LitBasicTexture {
  
    Pass pass0 {
    
      //Here we're telling the device that we want to set this render state before
      //we begin drawing
      BlendState : blend
    }
    
  }
  
}

//This material inherits from a TEM script that the engine provides in its shader library,
//so the only thing we need to do is set a specific image to be used
Material ChildTextured : BasicTexture.tem {

  MaterialParameters {
  
    Texture2D DiffuseMap : Textures/bricks.dds
  }
}

A complete description of the syntax and script features is available as a PDF on the Google Code site for Tesla Engine. The PDF is also included with the source code provided in this showcase.


Source Code

StringTokenizer.cs

This class handles the low-level parsing of the text input, splitting the text into a stream of tokens. Tokens are separated by white space, semicolons, or control characters (such as tabs, returns, and new lines). The text input is a single string that is treated as a large character buffer. Each character is read from the buffer in turn, which makes it easy to identify characters to skip, for example when no white space or control character separates a token from a comment or a semicolon.

Comments are identified by two consecutive forward slashes (“//”) and are skipped until a new line is reached. Each line of a multi-line comment must therefore begin with its own pair of slashes. The “/* */” style of comment is not allowed.

When the start of a token is reached, a StringBuilder is used to accumulate the characters. By working with characters and a single string builder, the tokenizer attempts to reduce the number of strings that are created when parsing the text. This was a major change from a previous implementation, which simply split the input text using the Split() string method. That was a very naïve method and could be error prone if comments or semicolons were not properly spaced from tokens. However, that implementation was a prototype from a development iteration that focused more on the creation of the TEM script itself.
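The character-by-character approach can be sketched roughly as follows. This is a simplified illustration, not the contents of StringTokenizer.cs: TokenizerSketch and its members are hypothetical names, and it returns all tokens at once instead of streaming them.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TokenizerSketch
{
    // Splits TEM-like text into tokens, treating white space, control
    // characters, and semicolons as separators and skipping "//" comments.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder(); // one builder reused for every token
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            bool startsComment = c == '/' && i + 1 < text.Length && text[i + 1] == '/';
            if (startsComment)
            {
                // A comment may directly follow a token, so end the token first.
                Flush(current, tokens);
                while (i < text.Length && text[i] != '\n')
                {
                    i++;
                }
                continue;
            }
            if (char.IsWhiteSpace(c) || char.IsControl(c) || c == ';')
            {
                Flush(current, tokens);
                i++;
                continue;
            }
            current.Append(c);
            i++;
        }
        Flush(current, tokens);
        return tokens;
    }

    // Emits the accumulated characters as a token, if there are any.
    private static void Flush(StringBuilder current, List<string> tokens)
    {
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
            current.Clear();
        }
    }
}
```

Unlike a Split()-based approach, this handles input such as "float Alpha : .5f;Vector3 X : 1 2 3//comment", where neither the semicolon nor the comment is spaced away from the neighboring token.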

The tokenizer also has other useful methods, such as peeking ahead to the next token, parsing the next token as an integer (including hexadecimal values), float, or boolean value, and performing string comparisons. Additionally, the class tracks the line and column of the current token, which are used in exceptions to tell the user where in the script an error or malformed text is located.
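As a rough sketch of what such helpers might look like (hypothetical names throughout, and not taken from StringTokenizer.cs), the code below parses integers and floats from token strings and maps a character offset to a line and column. It assumes hexadecimal values are written with a "0x" prefix and that floats may carry a trailing "f", as in the ".5f" of the script example; the source does not spell out either rule.

```csharp
using System;
using System.Globalization;

public static class TokenValueSketch
{
    // Parses decimal integers and (by assumption) "0x"-prefixed hexadecimal values.
    public static bool TryParseInt(string token, out int value)
    {
        if (token.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
        {
            // AllowHexSpecifier does not accept the prefix, so strip it first.
            return int.TryParse(token.Substring(2), NumberStyles.AllowHexSpecifier,
                CultureInfo.InvariantCulture, out value);
        }
        return int.TryParse(token, NumberStyles.Integer,
            CultureInfo.InvariantCulture, out value);
    }

    // Parses floats with an optional trailing 'f' (an assumption based on ".5f").
    public static bool TryParseFloat(string token, out float value)
    {
        string digits = token.EndsWith("f", StringComparison.OrdinalIgnoreCase)
            ? token.Substring(0, token.Length - 1)
            : token;
        return float.TryParse(digits, NumberStyles.Float,
            CultureInfo.InvariantCulture, out value);
    }

    // Returns the 1-based line and column of the character at the given offset.
    public static (int Line, int Column) Locate(string text, int offset)
    {
        int line = 1;
        int column = 1;
        for (int i = 0; i < offset && i < text.Length; i++)
        {
            if (text[i] == '\n')
            {
                line++;
                column = 1;
            }
            else
            {
                column++;
            }
        }
        return (line, column);
    }
}
```

A real tokenizer would more likely update the line and column counters while it reads characters instead of rescanning the text; Locate is written this way only to keep the sketch short.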

MaterialParser.cs

The material parser contains all the logic for identifying proper syntax, such as keywords, statement parsing, and curly brace block parsing. Its purpose and how it does its job are fairly straightforward: the parser loops until there are no more tokens remaining (or an exception is thrown). There is no defined ordering of curly brace blocks in a TEM file, so a statement declaring which shader to use may actually come after a statement that assigns a value to a shader variable. To account for this, the parser maintains several per-instance dictionaries that track declared objects and data values. Only after the parser has finished parsing a complete material script does it begin assembling the actual run-time material.
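The record-first, assemble-later idea can be sketched like this. It is a deliberately reduced illustration, not MaterialParser.cs: MaterialBuilderSketch is a hypothetical type, only two block kinds are handled, and the "material" it builds is just a descriptive string.

```csharp
using System;
using System.Collections.Generic;

public sealed class MaterialBuilderSketch
{
    // Per-instance dictionaries holding what has been declared so far.
    private readonly Dictionary<string, string> _effect =
        new Dictionary<string, string>();
    private readonly Dictionary<string, string> _parameters =
        new Dictionary<string, string>();

    // Called while parsing: stores a "Name : Value" statement found in a block,
    // without requiring the blocks to arrive in any particular order.
    public void Record(string block, string name, string value)
    {
        switch (block)
        {
            case "Effect":
                _effect[name] = value;
                break;
            case "MaterialParameters":
                _parameters[name] = value;
                break;
            default:
                throw new FormatException("Unknown block: " + block);
        }
    }

    // Called once after the last token: only now is the shader file required,
    // so parameters declared before the Effect block are still valid.
    public string Build()
    {
        if (!_effect.TryGetValue("File", out string file))
        {
            throw new FormatException("No Effect block declared a File.");
        }
        var names = new List<string>(_parameters.Keys);
        names.Sort(StringComparer.Ordinal);
        return file + " [" + string.Join(", ", names) + "]";
    }
}
```

Deferring validation to Build is what lets a parameter assignment appear earlier in the script than the shader it belongs to.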

In addition to this transient data, the parser maintains a longer-lived dictionary for render states. In Tesla Engine, a render state becomes an immutable graphics object once it is bound to the rendering pipeline. It is good practice to declare render states up-front when the application (or content) is first loaded, and to re-use the states throughout the rest of the application's lifetime. In fact, the engine provides some predefined states for exactly this purpose. The material parser therefore keeps a cache of render states that are shared between material instances. This ensures that if multiple materials declare a render state with the same configuration, only one instance of that state is created and shared.
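A cache of this kind could look roughly like the sketch below. The types are hypothetical (BlendStateSketch is not the engine's BlendState), and a state's configuration is reduced to two string fields so that it can serve as a dictionary key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, immutable stand-in for a render state object.
public sealed class BlendStateSketch
{
    public string SourceBlend { get; }
    public string DestinationBlend { get; }

    public BlendStateSketch(string sourceBlend, string destinationBlend)
    {
        SourceBlend = sourceBlend;
        DestinationBlend = destinationBlend;
    }
}

public sealed class RenderStateCacheSketch
{
    // Configuration key -> the single shared instance for that configuration.
    private readonly Dictionary<string, BlendStateSketch> _states =
        new Dictionary<string, BlendStateSketch>();

    public int Count
    {
        get { return _states.Count; }
    }

    // Returns the cached state for this configuration, creating it on first use.
    public BlendStateSketch GetOrCreate(string sourceBlend, string destinationBlend)
    {
        string key = sourceBlend + "|" + destinationBlend;
        if (!_states.TryGetValue(key, out BlendStateSketch state))
        {
            state = new BlendStateSketch(sourceBlend, destinationBlend);
            _states.Add(key, state);
        }
        return state;
    }
}
```

Sharing one cache instance across all parsed materials is what gives the de-duplication; two materials asking for the same configuration receive the same object reference.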

The parser can throw an exception when it encounters malformed text or an error loading/creating a run-time object. Error reporting is handled uniformly with constant string error messages, error codes, the file name, and other information such as the line and column number where the error occurred in the file. The goal is to tell the user as precisely as possible where and how the error occurred, and where possible how to fix it.
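One way to carry that information is a dedicated exception type, sketched below. The class name, the "TEM" code prefix, and the message layout are all assumptions made for illustration; the engine's actual exception classes and error codes may differ.

```csharp
using System;
using System.Globalization;

// Hypothetical exception carrying the location of a script error.
public sealed class MaterialParseExceptionSketch : Exception
{
    public int ErrorCode { get; }
    public string FileName { get; }
    public int Line { get; }
    public int Column { get; }

    public MaterialParseExceptionSketch(int errorCode, string message,
        string fileName, int line, int column)
        : base(Format(errorCode, message, fileName, line, column))
    {
        ErrorCode = errorCode;
        FileName = fileName;
        Line = line;
        Column = column;
    }

    // Builds a compiler-style message: file(line,column): error CODE: text
    private static string Format(int errorCode, string message,
        string fileName, int line, int column)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0}({1},{2}): error TEM{3:D4}: {4}",
            fileName, line, column, errorCode, message);
    }
}
```

Keeping the location in separate properties as well as in the message lets a tool jump to the offending line without having to parse the text of the message.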

A TEM script may actually contain multiple material declarations. This is supported by two methods in the parser:

  • LoadFirstMaterial()
  • LoadAllMaterials()

The content loading paths for TEM scripts are therefore split between two resource loaders and two object types. When loading a single Material, the first material in the script is always the one loaded. When loading multiple materials, a MaterialCollection containing all of them is returned. For scripts that contain a single material, both paths ultimately load exactly the same content.

Example C# loading code using the engine's built-in ContentManager:

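// "myscript.tem" contains exactly one material

// Loads the first material declared in the script
Material mat = ContentManager.Load<Material>("Materials/myscript.tem");

// Loads every material declared in the script
MaterialCollection matColl = ContentManager.Load<MaterialCollection>("Materials/myscript.tem");

// For a single-material script, this is the same content as "mat"
Material sameAsFirstMat = matColl[0];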

You can download the source files and accompanying PDF documentation from the link below.




