Shaders give your programs the ability to take full advantage of the GPU for graphical effects and more. Common use cases are creating materials applied to 3D objects, and other general filters or effects. The specifics of shader programming are not unique to the application programming framework you are using. While this post will focus mostly on creating materials in 3D with the Delphi FireMonkey framework, you can easily apply what you learn in your programming language and framework of choice. We use Delphi FireMonkey to get all six samples working on Windows, macOS, iOS, Android, and Linux.
You may assume that 3D forms and applications are only useful for specific use cases where you need to work with 3D models or other types of 3D content. However, 3D forms can also be used to create a purely 2D user interface, by using plane-like objects without a camera so they look like regular 2D rectangles. A reason you may want to do this is so you can utilize GPU shaders to render those rectangles. For example, our Lumicademy product looks like a 2D application, but the main form is actually a 3D form built with FireMonkey. This enables us to very efficiently render multiple live videos and other content to the screen using specialized GPU shaders, thereby freeing the CPU to perform other tasks (such as decoding those videos).
The GPU or video card is responsible for rendering 2D and 3D content to the screen. When it comes to 3D content, the GPU only works with 3D triangles (and sometimes points and lines). It converts the coordinates of these triangles to screen coordinates, clips the triangles to the screen, and renders them using colors, textures, or custom effects.
In the early days, video cards and APIs used a fixed graphics pipeline. This means that the GPU would take care of most of the hard work for you: You would give the API a list of triangles, colors, texture coordinates, and other data, and the GPU would figure out how to render it all. If the GPU wasn't capable of certain functionality, the API (e.g. DirectX) would use software emulation if configured to do so.
Nowadays, virtually all GPUs and APIs use a programmable graphics pipeline. It could be said that in some ways this is a step back since the GPU and API do less work for you. Instead, you need to tell the GPU how triangles (vertices) must be converted to screen coordinates, and how each pixel must be rendered. This is a lot more work for the developer, but also offers much more flexibility and enables functionality that is simply not possible with a fixed pipeline.
The newest graphics APIs (DirectX 12, Metal, and Vulkan) shed even more abstraction layers to give developers even more control over the GPU hardware. This is achieved at the cost of increasingly complicated APIs but allows for even more efficient rendering.
Depending on the GPU and API, the graphics pipeline can have multiple programmable stages. Every programmable GPU has at least two programmable stages: a vertex transformation stage and a pixel shader stage. Newer GPUs and APIs may add additional stages, such as a tessellation stage. But unless you are creating a sophisticated game engine, you only need the first two stages. These are also the only stages you can customize in FireMonkey.
The following image shows a possible graphics pipeline with 4 stages, of which the first and last ones are programmable and the two middle ones are fixed.
In the vertex transformation stage, individual triangle vertices are transformed from 3D world space to 2D screen space. You program this stage by creating a vertex shader. This shader is also used to pass data that is needed for rendering to the pixel shader stage. The vertex shader is called once for each vertex.
The next two stages are fixed. The shape assembly stage converts the transformed vertices to shapes (triangles). The rasterization stage then rasterizes these triangles into a set of pixels.
Finally, the pixel shader stage determines the color of each pixel. This is the second programmable stage, for which you write a pixel shader (aka fragment shader in OpenGL and Metal). This shader calculates the final pixel color based on some algorithm and/or textures supplied to the shader. It can receive input data from the vertex shader if needed. The pixel shader is called once for each pixel.
Although most concepts of vertex and pixel shaders are the same across all graphics APIs, each API has its own shader language that is used to write these shaders:
- HLSL (High-Level Shading Language) for DirectX
- GLSL (OpenGL Shading Language) for OpenGL
- MSL (Metal Shading Language) for Metal
There is also a new cross-platform API called Vulkan, which is the official successor of OpenGL. Unfortunately, since this API is not supported by Apple, it hasn’t gained much traction yet. If it becomes popular in the future, then Delphi will add support for it. This would also mean another shader language to learn (although it is very similar to GLSL).
All APIs provide the option to compile the shader source code on the fly in your app. Some APIs also provide the option to compile the shader off-line. In that case, you pass the compiled bytecode to the API.
Although most developers probably use FireMonkey as a 2D application framework (much like the Visual Component Library or VCL), FireMonkey had support for 3D applications from the beginning. This post will not go into the details of creating 3D applications with FireMonkey. That would leave little room to talk about the actual topic of shaders (and it is long enough already). If you want to learn more about 3D programming in FireMonkey you can consult the Delphi documentation, Delphi sample projects, blog posts like Bruce McGee’s 3D article, or Andrea Magni’s excellent FireMonkey book.
FireMonkey uses compiled bytecode for DirectX shaders and source code for OpenGL and Metal shaders. To compile DirectX shaders, you need to use the Direct3D Shader Compiler tool (fxc.exe), which ships with the DirectX SDK. I also included this tool in the GitHub repository that accompanies this post.
Here are two ways you can add 3D content to your Delphi FireMonkey application:
- create a 3D form (a form derived from TForm3D) and place your 3D controls directly on it; or
- use a regular 2D form and place the 3D content inside a TViewport3D control.
Which option you should use depends on the situation. In general, if most of your form contains 3D content, you should use the first option. Otherwise, the second option is more efficient.
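A minimal sketch of the second option; the names FViewport and FPlane are fields invented for this example:

uses
  FMX.Types, FMX.Viewport3D, FMX.Objects3D;

procedure TFormMain.FormCreate(Sender: TObject);
begin
  // Option 2: embed 3D content in a regular 2D form.
  FViewport := TViewport3D.Create(Self);
  FViewport.Parent := Self;
  FViewport.Align := TAlignLayout.Client;

  // Any 3D control can now be parented to the viewport.
  FPlane := TPlane.Create(Self);
  FPlane.Parent := FViewport;
end;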
In FireMonkey, the vertex shader and pixel shader are encapsulated in a material (derived from TCustomMaterial). However, you cannot apply a material directly to a 3D object. Instead, FireMonkey uses the concept of a material source (derived from TMaterialSource), which is used to link a material to a 3D object. You usually add a component derived from TMaterialSource to your form and set the MaterialSource property of a 3D control to this component. The material source will then create the corresponding TMaterial descendant when it is needed to render the control.

So materials usually come in pairs: a material source and a material. For example, Delphi's TColorMaterialSource is used at design time. It creates a TColorMaterial at runtime for rendering.

The remainder of this post focuses mostly on creating some custom material sources and materials for rendering various kinds of effects.
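As a quick taste of the stock pair in code, here is a minimal sketch, assuming a 3D form with a TCube named Cube and a field FColorSource of type TColorMaterialSource:

uses
  System.UITypes, FMX.MaterialSources;

procedure TFormMain.Form3DCreate(Sender: TObject);
begin
  // The material source is the component side of the pair...
  FColorSource := TColorMaterialSource.Create(Self);
  FColorSource.Color := TAlphaColors.Red;
  // ...and FireMonkey creates the matching TColorMaterial
  // internally when the cube is rendered with it.
  Cube.MaterialSource := FColorSource;
end;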
To demonstrate the use of custom materials and shaders, I have added a GpuProgramming folder to our JustAddCode GitHub repository with six sample projects. We start simple, with a shader that renders just blue pixels, and then work our way up to (slightly) more advanced scenarios that use textures and finally a simple plasma effect (which is also used for the title image of this post).
Most sample applications just show a spinning TPlane control with our custom material (source) applied to it. Since GPUs work exclusively with points, lines, and triangles, a rectangular plane is represented with two triangles. FireMonkey takes care of creating these triangles for you and passing them to the GPU.

You can install your material sources into a design-time package so you can place them on a form and link them to controls visually. However, to keep the samples simple and avoid dependencies on packages, the material sources are created and linked in code. This is done in the OnCreate event of the form. For example, for the first demo app, this method looks like this:

procedure TFormMain.Form3DCreate(Sender: TObject);
begin
  FMaterialSource := TBlueMaterialSource.Create(Self);
  Plane.MaterialSource := FMaterialSource;
end;
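Here FMaterialSource is a field of the form, presumably declared along these lines:

private
  // Keep a reference so the material source lives as long as the form.
  FMaterialSource: TBlueMaterialSource;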
The code simply creates a material source (for a solid blue material in this example) and links it to the plane. We will look at how to create this material next.
By default, the sample applications use the default graphics backend for the platform. This is DirectX 11 on Windows and OpenGL on all other platforms.
However, you can also build each sample with an alternative backend by choosing the “Debug_AlternativeBackend” configuration. This configuration uses DirectX 9 on Windows and Metal on macOS and iOS.
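If you want to check in code which backend ended up active, the default context class tells you. A minimal sketch (UpdateHeader and LabelHeader are made-up names; TContextManager is declared in FMX.Types3D):

uses
  FMX.Types3D;

procedure TFormMain.UpdateHeader;
begin
  // The class name reveals the backend, e.g. TDX11Context or TDX9Context.
  LabelHeader.Text := TContextManager.DefaultContextClass.ClassName;
end;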
Each sample app has a header that shows the name of the context class (and thus the graphics backend) currently in use (for example, TDX11Context when DirectX 11 is used).

This is the easiest example: it just renders every pixel using a blue color. Even so, this example takes up the most space in this article because there is a lot of scaffolding to put up. Before we look at the Delphi side, let's take a look at what shader source code looks like.
Let’s start with the HLSL (DirectX) code of the vertex shader (in the file VertexShader.DX.txt):
float4x4 MVPMatrix;
float4 main(float4 position: POSITION0): SV_Position0
{
return mul(MVPMatrix, position);
}
All shader languages are C-like languages, and HLSL is no exception. The sole purpose of this vertex shader is to convert a vertex coordinate (a 4D position vector) from world space to screen space. This is done by multiplying the position with a model-view-projection matrix (MVPMatrix). This 4×4 matrix (of type float4x4) is calculated for you by FireMonkey based on the camera and viewport. It is a combination of three matrices:
- the model matrix, which transforms the object from its local (model) space to world space
- the view matrix, which transforms world space to camera (view) space
- the projection matrix, which projects camera space onto the 2D screen

The MVPMatrix variable is a so-called uniform input, which means that its value is constant for multiple invocations of the shader (that is, its value is the same for each vertex that is transformed). FireMonkey calculates this matrix in Delphi code and passes it to the shader.

The shader must have a function called main that returns the transformed position (a 4D vector of type float4). Its input is the source position in local space. This type of variable is called a varying input in HLSL (or attribute in GLSL), meaning that its value is unique to each invocation of the shader. These inputs must be marked with a semantic, which is a name used to link the inputs and outputs of parts of the graphics pipeline. The POSITION0 semantic used here means that this parameter represents position data.

The entire function is also marked with a (system value) semantic called SV_Position0, meaning that the function result represents the transformed position.

FireMonkey requires that HLSL shaders are compiled to bytecode. This is done with the following command lines:
fxc /T vs_3_0 /E main /O3 /FoVertexShader.DX9 VertexShader.DX.txt
fxc /T vs_4_0 /E main /O3 /FoVertexShader.DX11 VertexShader.DX.txt
The parameters of interest are:
- /T specifies the target profile. For our purposes, the following values are used:
  - vs_3_0: a version 3.0 vertex shader (used by DirectX 9)
  - vs_4_0: a version 4.0 vertex shader (used by DirectX 11)
  - ps_3_0: a version 3.0 pixel shader (used by DirectX 9)
  - ps_4_0: a version 4.0 pixel shader (used by DirectX 11)
- /E specifies the name of the entry point, which is main in our case.
- /Fo specifies the name of the output file.

This is followed by the name of the input file. Each sample project has a Shaders directory with the source code of all shaders, as well as a batch file (Build.bat) that compiles the shaders and generates a resource file.

The GLSL (OpenGL) version looks a bit different:
uniform vec4 _MVPMatrix[4];
attribute vec4 a_Position;
void main()
{
gl_Position.x = dot(_MVPMatrix[0], a_Position);
gl_Position.y = dot(_MVPMatrix[1], a_Position);
gl_Position.z = dot(_MVPMatrix[2], a_Position);
gl_Position.w = dot(_MVPMatrix[3], a_Position);
}
The main differences compared to the HLSL version are:
- GLSL uses different type names (vec4 instead of float4 and mat4 instead of float4x4).
- Uniform inputs are declared with the uniform keyword. These variables must start with an underscore (as in _MVPMatrix). This is not a GLSL rule but a FireMonkey requirement.
- Varying inputs are not passed as parameters to the main function, but must be declared with the attribute keyword instead. Attributes can have any name, but FireMonkey requires fixed names so it knows what these attributes represent (in HLSL, this is not required since the semantic is used for this purpose). So position attributes must be named a_Position (the a_ prefix is also a FireMonkey requirement).
- GLSL uses predefined variables (with a gl_ prefix) for common variables in the pipeline. The output vertex coordinate is always stored in the predefined variable gl_Position.
- The _MVPMatrix variable could be of type mat4. However, FireMonkey requires that matrices are passed as arrays of vectors instead. This also means that the matrix multiplication has to be split into four separate vector dot operations. Fortunately, this part of the vertex shader is boilerplate, and you will use the same code in most GLSL vertex shaders you write.

There is no need to compile this shader off-line because FireMonkey will compile it for you when needed.
Finally, we have the MSL (Metal) version:
using namespace metal;
struct Vertex
{
<#VertexDeclaration#>
};
struct ProjectedVertex
{
float4 position [[position]];
};
vertex ProjectedVertex vertexShader(
constant Vertex *vertexArray [[buffer(0)]],
const unsigned int vertexId [[vertex_id]],
constant float4x4 &MVPMatrix [[buffer(1)]])
{
Vertex in = vertexArray[vertexId];
ProjectedVertex out;
out.position = float4(in.position[0], in.position[1],
in.position[2], 1) * MVPMatrix;
return out;
}
This looks a bit more complicated:
- The using namespace metal part means that the code can use standard types and functions from the metal namespace (like a uses clause in Delphi).
- Next, it declares two structs (records):
  - The Vertex struct represents the type of input vertex. FireMonkey will fill this in for you by replacing <#VertexDeclaration#> with the source code needed to represent the vertex (this <#...#> tag is not a Metal feature).
  - The ProjectedVertex struct represents the output vertex. In this case, it only contains an output position, but we will add more in later examples. The [[position]] attribute is like a semantic in HLSL and indicates that this field represents a vertex position.
- Finally, we have the main function, which starts with the keyword vertex to indicate this is a vertex shader (not to be confused with Vertex with an uppercase V, which is the type of the input vertices). It returns the transformed vertex of type ProjectedVertex.

This shader has three parameters:
- The vertexArray parameter is an array of input vertices (of type Vertex). It has a constant qualifier, meaning that the vertex array is stored in the constant (read-only) address space (not to be confused with the const qualifier). Parameters in the constant address space must have a [[buffer(index)]] attribute, where the index is the location of the buffer. This index is later used in Delphi code to link this parameter to the vertices supplied by FireMonkey.
- The vertexId parameter contains the index of the vertex in the vertexArray parameter that is currently being processed. This parameter has a const (not constant) qualifier, meaning it is read-only (much like the const qualifier for parameters in Delphi). It must have a [[vertex_id]] attribute to tell Metal what the parameter is used for.
- The MVPMatrix parameter contains the 4×4 model-view-projection matrix, also stored in the constant address space. It uses a different buffer index than the vertexArray parameter.

The body of the function just extracts the vertex with the given Id from the array and multiplies it with the transformation matrix.
Again, there is no need to compile this shader off-line.
The pixel shader is pretty simple since it always returns just a blue color:
float4 main(): SV_Target0
{
return float4(0.0, 0.0, 1.0, 1.0);
}
The function returns a 4D vector, which is not only used for positions (X, Y, Z, W), but also for colors (R, G, B, A). Color components range from 0.0 (fully off) to 1.0 (fully on). Alpha components also range from 0.0 (fully transparent) to 1.0 (fully opaque). If you use values outside of this range, the GPU will automatically clip (or saturate) them.
The body of the function sets the Blue and Alpha components of the color to 1.0. Remember to always set the Alpha value as well.
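As an aside, the same 0.0 to 1.0 convention applies when you later pass Delphi colors to a shader. This helper is purely illustrative and not part of the samples:

uses
  System.SysUtils, System.UITypes;

// Convert Delphi's 8-bit-per-channel TAlphaColor to the 0.0..1.0
// floats a pixel shader expects (in R, G, B, A order).
function ColorToShaderColor(const AColor: TAlphaColor): TArray<Single>;
begin
  var C := TAlphaColorRec(AColor);
  Result := [C.R / 255, C.G / 255, C.B / 255, C.A / 255];
end;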
Note that the function is marked with an SV_Target0 semantic, meaning it returns the target pixel color.

The GLSL version of the pixel shader is similar:

void main()
{
  gl_FragColor = vec4(0.0, 0.0, 1.0, 1.0);
}
Here, the predefined gl_FragColor variable must be set to the output color.

Finally, the Metal version looks like this:
fragment float4 fragmentShader()
{
return float4(0.0, 0.0, 1.0, 1.0);
}
Here, the function must start with the fragment keyword to indicate that this is a pixel (or fragment) shader.

Now we are finally ready to use these shaders to create a material source and material (remember, these always come in pairs). The material code is specific to Delphi's FireMonkey and will be different if you choose to use a different framework. Since these materials have no configurable properties, the implementation is pretty simple. Let's start with the material source:
type
TBlueMaterialSource = class(TMaterialSource)
protected
function CreateMaterial: TMaterial; override;
end;
function TBlueMaterialSource.CreateMaterial: TMaterial;
begin
Result := TBlueMaterial.Create;
end;
You must always override the CreateMaterial method and create the actual material instance in its implementation.

If you look at the source code of the materials that ship with Delphi (in the FMX.Materials unit), you will see that an offline tool is used to convert the (byte) code of HLSL and GLSL shaders to Delphi byte arrays. The MSL shaders are included as Delphi strings of MSL source code.
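A sketch of that string-based approach, using the blue Metal pixel shader from above (the constant name here is made up):

const
  // MSL source embedded directly in the Delphi unit.
  BLUE_PIXEL_SHADER_MSL: String =
    'fragment float4 fragmentShader()' + sLineBreak +
    '{' + sLineBreak +
    '  return float4(0.0, 0.0, 1.0, 1.0);' + sLineBreak +
    '}';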
In the sample projects, we use a different approach that does not require an external tool: we link the shaders into the executable as resources and use a TResourceStream to load them. The interface of the TBlueMaterial class looks like this:

type
TBlueMaterial = class(TCustomMaterial)
private class var
FShaderArch: TContextShaderArch;
FVertexShaderData: TBytes;
FPixelShaderData: TBytes;
FMatrixIndex: Integer;
FMatrixSize: Integer;
private
class procedure LoadShaders; static;
class function LoadShader(
const AResourceName: String): TBytes; static;
protected
procedure DoInitialize; override;
end;
The class defines five static class variables (because we do not need different values of these variables for each instance):
- FShaderArch contains the shader architecture that is currently used, based on the graphics backend. It will have the value DX9, DX11, GLSL, or Metal.
- FVertexShaderData and FPixelShaderData contain the byte code or source code of the shaders, as read from the resource.
- FMatrixIndex contains the index of the MVPMatrix variable in the shader, which is needed to link the Delphi matrix to the corresponding matrix in the shader. This value can be different depending on the shader architecture.
- FMatrixSize contains the size of a matrix. This also depends on the shader architecture. For DX11, this is the size of a matrix in bytes (which is 4 x 4 x 4 = 64). For other architectures, it is the number of 4D vectors the matrix contains (which is 4).

You must always override the DoInitialize method to register the shaders:

procedure TBlueMaterial.DoInitialize;
begin
inherited;
if (FShaderArch = TContextShaderArch.Undefined) then
LoadShaders;
FVertexShader :=
TShaderManager.RegisterShaderFromData('blue.fvs',
TContextShaderKind.VertexShader, '', [
TContextShaderSource.Create(FShaderArch, FVertexShaderData,
[TContextShaderVariable.Create('MVPMatrix',
TContextShaderVariableKind.Matrix, FMatrixIndex,
FMatrixSize)])
]);
FPixelShader :=
TShaderManager.RegisterShaderFromData('blue.fps',
TContextShaderKind.PixelShader, '', [
TContextShaderSource.Create(FShaderArch, FPixelShaderData,
[])
]);
end;
If this is the first time this material is used, then the shaders are loaded from the resource using the static class method LoadShaders. Next, it registers the vertex shader and pixel shader. Note that FVertexShader and FPixelShader are fields of the TCustomMaterial class, from which TBlueMaterial derives.
The TShaderManager.RegisterShaderFromData method registers a shader under a unique name (such as 'blue.fvs'), together with the kind of shader (vertex or pixel) and an array of TContextShaderSource records, one for each supported shader architecture. Each source contains the shader data and an array of TContextShaderVariable records that describe the shader's variables. Each variable is created with these parameters:
- the name of the variable (MVPMatrix in this case)
- the kind of the variable (TContextShaderVariableKind.Matrix here)
- the index of the variable. For Metal, this is the index used in the [[buffer(index)]] attribute discussed above. In our sample, we pass the value of FMatrixIndex here, which is set in the LoadShaders method.
- the size of the variable, FMatrixSize, which is also set in the LoadShaders method.

The LoadShaders method detects the current shader architecture, sets the FMatrixIndex and FMatrixSize fields accordingly, and loads the corresponding vertex and pixel shaders from the resource:

class procedure TBlueMaterial.LoadShaders;
begin
var Suffix := '';
var ContextClass := TContextManager.DefaultContextClass;
{$IF Defined(MSWINDOWS)}
if (ContextClass.InheritsFrom(TCustomDX9Context)) then
begin
FShaderArch := TContextShaderArch.DX9;
FMatrixIndex := 0;
FMatrixSize := 4;
Suffix := 'DX9';
end
else if (ContextClass.InheritsFrom(TCustomDX11Context)) then
begin
FShaderArch := TContextShaderArch.DX11;
FMatrixIndex := 0;
FMatrixSize := 64;
Suffix := 'DX11';
end;
{$ELSE}
if (ContextClass.InheritsFrom(TCustomContextOpenGL)) then
begin
FShaderArch := TContextShaderArch.GLSL;
FMatrixIndex := 0;
FMatrixSize := 4;
Suffix := 'GL';
end;
{$ENDIF}
{$IF Defined(MACOS)}
if (ContextClass.InheritsFrom(TCustomContextMetal)) then
begin
FShaderArch := TContextShaderArch.Metal;
FMatrixIndex := 1;
FMatrixSize := 4;
Suffix := 'MTL';
end;
{$ENDIF}
if (FShaderArch = TContextShaderArch.Undefined) then
raise EContext3DException.Create(
'Unknown or unsupported 3D context class');
FVertexShaderData := LoadShader('VERTEX_SHADER_' + Suffix);
FPixelShaderData := LoadShader('PIXEL_SHADER_' + Suffix);
end;
class function TBlueMaterial.LoadShader(
const AResourceName: String): TBytes;
begin
var Stream := TResourceStream.Create(HInstance,
AResourceName, RT_RCDATA);
try
SetLength(Result, Stream.Size);
Stream.ReadBuffer(Result, Length(Result));
finally
Stream.Free;
end;
end;
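One more piece of plumbing: the resource file that Build.bat generates must be linked into the executable, or the TResourceStream lookups above will fail. In a sample project this is a one-liner; the file name here is illustrative, so check the repository for the actual one:

// Link the compiled shader resources (VERTEX_SHADER_*, PIXEL_SHADER_*)
// into the executable so TResourceStream can find them.
{$R 'Shaders.res'}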
Now we can finally apply this material to a 3D control and run the application.
I know, pretty boring for that much work. But we have to take baby steps before we can run. Fortunately, with most of the scaffolding now set up, building on it in the next examples will go faster.
As promised, I have six samples ready for you, but to keep this post short I won't paste everything here. You can jump over to the original blog post or my GitHub repository for all six samples, or I'll be back next week with part 2.