One of the hard problems in computer graphics, especially for games, is how to render an effectively unlimited number of polygons/triangles efficiently in real time. Artists have long dreamt of a world where they don’t have to worry about polycounts. Nanite is the UE5 technology that tries to achieve this by implementing very efficient yet dynamic LOD (level of detail) algorithms. This lets artists focus on creating, or importing, film-quality meshes without worrying about FPS. Unreal manages that for you with little to no loss of quality.
To understand how Nanite works, we first need a basic understanding of how Unreal Engine renders scenes, and before that, a distinction between retained mode and immediate mode rendering.
Retained mode roughly means preparing a scene’s draw data in advance; this is where the newer APIs (Metal, Vulkan) are headed as well. You have to front-load everything, including every state of the pipeline. At runtime, you simply dispatch it to the GPU as fast as possible. Before 4.22, UE was closer to immediate mode: each frame it walked the scene, resolved the state for every mesh, and submitted draws as it went.
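As a rough illustration of that immediate-mode style, here is a minimal C++ sketch. Every name in it (`MeshDraw`, `FakeDevice`, `DrawSceneImmediate`) is hypothetical and made up for this example; none of them are Unreal Engine types, and the real pre-4.22 code path was considerably more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not Unreal Engine types.
struct MeshDraw {
    std::uint32_t shaderId;
    std::uint32_t materialId;
    std::uint32_t indexCount;
    bool visible;
};

// Records the calls a real graphics API would receive.
struct FakeDevice {
    std::vector<std::string> log;
    void BindShader(std::uint32_t id)   { log.push_back("shader " + std::to_string(id)); }
    void BindMaterial(std::uint32_t id) { log.push_back("material " + std::to_string(id)); }
    void DrawIndexed(std::uint32_t n)   { log.push_back("draw " + std::to_string(n)); }
};

// Immediate mode: every frame, walk the scene, resolve the state for each
// visible mesh, and submit it on the spot. Nothing is cached between frames.
void DrawSceneImmediate(FakeDevice& device, const std::vector<MeshDraw>& scene) {
    for (const MeshDraw& mesh : scene) {
        if (!mesh.visible) {
            continue;
        }
        // This sketch assumes plain triangle lists.
        assert(mesh.indexCount % 3 == 0);
        device.BindShader(mesh.shaderId);
        device.BindMaterial(mesh.materialId);
        device.DrawIndexed(mesh.indexCount);
    }
}
```

The point to notice is that state is rediscovered and rebound per mesh, per frame, which is exactly the work a retained-mode design tries to do once, up front.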
Since the release of 4.22, UE has moved towards draw commands, which follow a more data-driven, stateless design: a draw command carries no context (e.g. no state information about where it came from). This helped UE move to retained mode, i.e. UE can work out the full pipeline state object, shader bindings, etc. at load time. It also helps UE parallelize draw commands.
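To make the idea of stateless draw commands more concrete, here is a small C++ sketch. It is a simplified model under my own assumptions, not UE's actual implementation: `DrawCommand`, `BuildCachedCommands`, `CountStateChanges` and `SplitForWorkers` are hypothetical names, and real draw commands carry far more data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified draw command: everything needed to submit the
// draw is resolved ahead of time and stored in the command itself.
struct DrawCommand {
    std::uint32_t pipelineStateId;   // stands in for a full pipeline state object
    std::uint32_t shaderBindingsId;  // stands in for pre-built shader bindings
    std::uint32_t indexCount;
};

// Built once (for example at load time), not every frame. Sorting by
// pipeline state puts commands that share state next to each other.
std::vector<DrawCommand> BuildCachedCommands(const std::vector<DrawCommand>& unsorted) {
    std::vector<DrawCommand> cached = unsorted;
    std::stable_sort(cached.begin(), cached.end(),
                     [](const DrawCommand& a, const DrawCommand& b) {
                         return a.pipelineStateId < b.pipelineStateId;
                     });
    return cached;
}

// Counts how many pipeline state switches an in-order submission needs.
std::size_t CountStateChanges(const std::vector<DrawCommand>& commands) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < commands.size(); ++i) {
        if (i == 0 || commands[i].pipelineStateId != commands[i - 1].pipelineStateId) {
            ++changes;
        }
    }
    return changes;
}

// Because each command is self-contained, the list can be cut into
// independent [begin, end) ranges and handed to worker threads.
std::vector<std::pair<std::size_t, std::size_t>> SplitForWorkers(std::size_t commandCount,
                                                                 std::size_t workerCount) {
    assert(workerCount > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    const std::size_t chunk = (commandCount + workerCount - 1) / workerCount;
    for (std::size_t begin = 0; begin < commandCount; begin += chunk) {
        ranges.emplace_back(begin, std::min(begin + chunk, commandCount));
    }
    return ranges;
}
```

Since no command depends on state left behind by a previous one, both the sorting and the splitting are safe, which is the property that makes parallel submission practical.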
Now, let's dive deeper into Nanite's architecture, which ties closely into Unreal Engine's rendering framework. You can explore a full-resolution image by clicking here, or open this UML diagram file in PlantUML to visualize the architecture in detail.
A Nanite pass produces cluster ID, triangle ID and depth data (as shown in the following image).
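One way to picture this output is as a single packed value per pixel. The C++ sketch below is only an illustration under assumptions: the 30/27/7 bit split, the "larger depth value is closer" convention, and the names `VisSample`, `PackVisibility`, `UnpackVisibility` and `WritePixel` are all mine, not taken from the engine source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed bit layout, for illustration only.
constexpr unsigned kTriangleBits = 7;   // up to 128 triangles per cluster
constexpr unsigned kClusterBits  = 27;
constexpr unsigned kDepthBits    = 30;
static_assert(kTriangleBits + kClusterBits + kDepthBits == 64, "must fill 64 bits");

struct VisSample {
    std::uint32_t depth;       // quantized depth, larger means closer (assumed)
    std::uint32_t clusterId;
    std::uint32_t triangleId;  // index of the triangle inside its cluster
};

// Depth goes into the top bits so that comparing packed values compares depth first.
inline std::uint64_t PackVisibility(const VisSample& s) {
    assert(s.depth < (1u << kDepthBits));
    assert(s.clusterId < (1u << kClusterBits));
    assert(s.triangleId < (1u << kTriangleBits));
    return (static_cast<std::uint64_t>(s.depth) << (kClusterBits + kTriangleBits)) |
           (static_cast<std::uint64_t>(s.clusterId) << kTriangleBits) |
           static_cast<std::uint64_t>(s.triangleId);
}

inline VisSample UnpackVisibility(std::uint64_t packed) {
    VisSample s;
    s.triangleId = static_cast<std::uint32_t>(packed & ((1u << kTriangleBits) - 1u));
    s.clusterId  = static_cast<std::uint32_t>((packed >> kTriangleBits) & ((1u << kClusterBits) - 1u));
    s.depth      = static_cast<std::uint32_t>(packed >> (kClusterBits + kTriangleBits));
    return s;
}

// Depth test and write in one step: keep whichever packed value is larger.
// On a GPU this would typically be a 64-bit atomic max; here it is a plain max.
inline void WritePixel(std::uint64_t& pixel, const VisSample& s) {
    pixel = std::max(pixel, PackVisibility(s));
}
```

Packing depth into the high bits means a single max operation both resolves visibility and stores which cluster and triangle won the pixel, so later passes can look the geometry up again from those IDs.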
1.) Nanite does a lot of pre-processing before rendering: data for every level of clipping granularity, data for rasterization, and data for the base pass (shown above) and the lighting stage.
2.) Nanite uses clusters and a page-based LOD representation. Because these data structures are compute-intensive to build, Nanite has to build them in advance. This helps rendering, but it also means Nanite only supports static meshes.
3.) Nanite is very much a custom, GPU-driven software rasterizer; some of its ideas seem to come from the presentation linked here.
4.) Nanite organizes its data so that it can be discarded per cluster, per page, or per triangle.
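The points above can be sketched with a toy data layout in C++. This is a deliberately simplified model, not Nanite's actual format: it ignores the LOD hierarchy entirely, the cluster and page sizes are assumptions, and `Cluster`, `BuildClusters` and `CountTrianglesToRasterize` are hypothetical names. It only shows the split between an offline build step and a runtime step that can skip work at page and cluster granularity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kMaxTrianglesPerCluster = 128;  // assumed cluster size
constexpr std::uint32_t kClustersPerPage = 4;           // tiny on purpose, hypothetical

struct Cluster {
    std::uint32_t firstTriangle;
    std::uint32_t triangleCount;
    std::uint32_t pageIndex;  // which page this cluster is stored in
};

// Offline step: chop a mesh's triangle range into fixed-size clusters and
// assign consecutive clusters to pages. Done once, ahead of rendering.
std::vector<Cluster> BuildClusters(std::uint32_t triangleCount) {
    std::vector<Cluster> clusters;
    for (std::uint32_t first = 0; first < triangleCount; first += kMaxTrianglesPerCluster) {
        Cluster c;
        c.firstTriangle = first;
        c.triangleCount = std::min(kMaxTrianglesPerCluster, triangleCount - first);
        c.pageIndex = static_cast<std::uint32_t>(clusters.size()) / kClustersPerPage;
        clusters.push_back(c);
    }
    return clusters;
}

// Runtime step: skip whole pages that are not resident, then whole clusters
// that were culled. Per-triangle rejection would happen later, in the rasterizer.
std::uint32_t CountTrianglesToRasterize(const std::vector<Cluster>& clusters,
                                        const std::vector<bool>& pageResident,
                                        const std::vector<bool>& clusterVisible) {
    assert(clusterVisible.size() == clusters.size());
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        const Cluster& c = clusters[i];
        assert(c.pageIndex < pageResident.size());
        if (!pageResident[c.pageIndex]) {
            continue;  // discarded at page granularity
        }
        if (!clusterVisible[i]) {
            continue;  // discarded at cluster granularity
        }
        total += c.triangleCount;
    }
    return total;
}
```

Because the clusters and pages are fixed when the mesh is built, the runtime loop never has to touch individual triangles to decide what to skip, which hints at why the structure is built in advance and why it suits static geometry.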