[h1]Hello Riftbreakers![/h1]

When we started developing our games, EXOR Studios had few options when it came to picking the right game engine. In early 2008, the commercial solutions were out of our reach because of their prohibitive costs, while free engines offered only some of the features we needed. Thus, we decided to develop our own technology based on multiple open-source solutions, including the [url=https://www.ogre3d.org/]OGRE 3D[/url] render system. This allowed us to build games precisely how we wanted to. As the years progressed and we moved from one project to another, the Schmetterling Engine slowly began to take shape and eventually evolved into the complete framework we use today.

[previewyoutube=a-wNhKdML38;full][/previewyoutube]
[b][i]A making-of video of our first game, Zombie Driver. You can see how it all started for us 15 years ago![/i][/b]

Developing your own engine comes with many difficulties. You can’t refer to any documentation or expect bugs to be fixed by someone else - you’re on your own. On the other hand, you are free to implement solutions that your game will benefit from. We have already told you about some of the new additions to the Schmetterling Engine we implemented for The Riftbreaker. Raytraced Shadows and Ambient Occlusion, AMD FSR, and Intel XeSS are just a few of them. Today we would like to discuss a significant improvement coming to the engine - Tiled Deferred Shading. This cryptic-sounding technique opens the door for improvements and new features in The Riftbreaker. We’d like to talk about how this new technique works, but first, we need to discuss some basics about shading in general.

[h2]Deferred Shading[/h2]

There are several ways you can approach computer graphics rendering. The available ideas and techniques vary considerably, resulting in unique requirements, advantages, and disadvantages for each one. Our earliest project, Zombie Driver, used forward rendering.
In this technique, the scene is rendered by drawing each object one by one. Each object is rendered with its own set of material properties, such as color, transparency, and reflectivity. Lighting is applied to each object as it is rendered, calculating the contribution of each light source to the final color of each pixel on the surface of that object.

[img]https://clan.cloudflare.steamstatic.com/images//34659267/5b7daffd82028d8577bb0a0a7b23c0d473a643b0.png[/img]
[b][i]In Zombie Driver we used forward rendering. We were limited to using only one light per scene. None of the particle effects you can see in this picture emit any light. We faked the highlights with additive alpha particles - smoke and mirrors ;)[/i][/b]

It sounds simple enough, but it has a major disadvantage - to ensure correct results, contributions from all lights need to be taken into account for each pixel on the screen. If the scene is lit by only one light source, it can be quite quick. However, when you add more lights, things can get complicated. Even if a pixel is fully occluded and is not lit by any of the light sources on the screen, the forward rendering method will calculate the lighting for that pixel anyway. We will learn that it is occluded only after computing the lighting for that pixel, and all that computational time will be wasted.

This is why we have been using a more efficient technique called [url=https://en.wikipedia.org/wiki/Deferred_shading]Deferred Shading[/url] in our later projects, X-Morph: Defense and The Riftbreaker. It excels in handling multiple light sources, which allows us to create scenes with many dynamic lights without sacrificing performance. The primary advantage of deferred shading is treating the scene geometry and its lighting separately. In the first stage of the shading process, only the 3D geometry of the scene is rendered.
The resulting information about each pixel's position, surface normal, and material properties is stored in the geometry buffer - or G-buffer, for short. In the shading stage, the lighting and shading calculations are performed using the information stored in the G-buffer. With all the geometry data available at this point, lights are computed only for those pixels which they actually affect. We can also calculate complex post-processing effects such as ambient occlusion, temporal anti-aliasing, and upscaling thanks to all the information stored in the G-buffer.

[img]https://clan.cloudflare.steamstatic.com/images//34659267/7425a14b531e6885c0b00c4f6cb7904dc84f2c11.jpg[/img]
[b][i]The use of deferred lighting in X-Morph: Defense allowed us to use multiple light sources. In night scenarios, scenes are only lit by additional lights - there is no directional lighting present in this scene.[/i][/b]

The current G-buffer layout in The Riftbreaker looks like this:

float3 GBuffer0 // Albedo (xyz)
float3 GBuffer1 // World Space Normal (xyz)
float3 GBuffer2 // Occlusion (x), Roughness (y), Metalness (z)
float3 GBuffer3 // Emissive (xyz)
float3 GBuffer4 // SubsurfaceScattering (xyz)
float2 GBuffer5 // Velocity (xy)
float GBuffer6 // Depth

However, deferred shading also has some limitations. Reading and writing to the G-buffer takes some time, which can be a major downside when it comes to lower-spec hardware. It also does not scale well with high numbers of lights in the scene. Still, the most glaring issue for us regarding The Riftbreaker is transparency. Deferred shading can’t handle transparent objects. Because the technique relies on the information in the G-buffer, and the buffer doesn’t store any information about non-opaque geometry, nothing that is not opaque can take part in the lighting calculations.
This is one of the reasons why we decided to upgrade our renderer to use Tiled Deferred Shading.

[h2]Tiled Deferred Shading[/h2]

[url=https://leifnode.com/2015/05/tiled-deferred-shading/]Tiled Deferred Shading[/url] is an evolution of the Deferred Shading algorithm. The screen is divided into a grid of tiles, in our case, 16x16 pixels each. Each tile stores the indices of the lights that affect it. This data is later used for light computation for both opaque and non-opaque objects.

[img]https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExZTgzZDQ0ZmVlODI5Y2NiYjQ2NjZiMTIyMjViZDFmZmRkNDU2ODAzOCZjdD1n/ZOXJCmjo9u3QwsMkve/giphy.gif[/img]
[b][i]A debug view of a scene from The Riftbreaker with Tiled Deferred Shading enabled. The tile grid is visible. In this view, the color of the tile tells us how many lights affect the tile. The closer it is to red, the more lights affect a tile. It's essentially a heatmap.[/i][/b]

[img]https://clan.cloudflare.steamstatic.com/images//34659267/a713f867696df534f9cbe8664a7221fd332c70b9.jpg[/img]
[b][i]We weren't joking about the heatmap thing.[/i][/b]

[img]https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExMmQyZTkzYWJlZTA5ZDgwZTdjNzgzZDdiM2MyZjk3ZjgyY2Q2ZWMwYSZjdD1n/fX6ZxK7faTGYaHwmXF/giphy.gif[/img]
[b][i]The benchmark scene we used to test this technique. You can see how lights change in real time.[/i][/b]

The algorithm for this process is:
[olist]
[*] G-buffer pass - draw all opaque geometry.
[*] Compute tile frustums - prepare a frustum for each tile based on the minimum and maximum depth found in the depth buffer for that tile.
[*] Light culling pass - use the frustums to check whether each light overlaps the tile and, if it does, insert its index into the tile's light index list.
[*] Compute light shading for opaque objects - use the G-buffer and the prepared light lists for a single-pass, screen-space light computation.
[*] Compute light shading for non-opaque objects - draw each non-opaque object and compute light shading per pixel only for the lights that belong to the tile.
[/olist]

[img]https://clan.cloudflare.steamstatic.com/images//34659267/c6b7f17b068bda633ab86a66c7eea78e5c0ef5da.jpg[/img]
[b][i]In higher resolutions the tile grid becomes more dense, allowing more precise calculations and much finer granularity.[/i][/b]

The main advantage of this approach over the Deferred Shading we used up to this point is much higher efficiency. Traditional Deferred Shading has to access the G-buffer individually for each light source during the lighting pass. The Tiled Deferred algorithm allows us to read the G-buffer only once for all light sources.

[img]https://clan.cloudflare.steamstatic.com/images//34659267/3612b5b4ac628c513e4d4c1583f3c235d6080eb5.jpg[/img]
[b][i]For contrast, in lower resolutions, the grid loses its precision.[/i][/b]

The additional processing required for culling lights can add some overhead to the rendering pipeline in scenes with just a few light sources. In The Riftbreaker’s case, this overhead is typically outweighed in situations with more lights, such as bases with large numbers of defensive towers. The increased speed of completing the lighting pass becomes clearly apparent in those instances.

[h2]Results[/h2]

In recent tests we conducted on The Riftbreaker, we observed that implementing tiled deferred shading resulted in a significant increase in performance. Specifically, tests we ran on various hardware revealed considerable improvements in scenes with many additional dynamic lights. On average, The Riftbreaker ran 20% faster in scenarios with many towers, explosions, enemy units, and building lights on the screen compared to the traditional deferred shading approach.
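Going back to the light culling step of the algorithm described above, here is a minimal C++ sketch of how a per-tile light index list could be built. It is an illustration under simplifying assumptions, not the Schmetterling Engine's actual code: every type and function name ([i]Light[/i], [i]TileDepth[/i], [i]tilesAlong[/i], [i]lightAffectsTile[/i], [i]cullLights[/i]) is hypothetical, each light is reduced to a screen-space circle with a depth range instead of being tested against a real tile frustum, and the loop runs on the CPU, whereas this kind of pass is usually done in a GPU compute shader. Only the 16x16 tile size comes from the article.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kTileSize = 16;  // tile edge in pixels, as stated in the article

// Hypothetical, simplified light: a circle on screen plus a depth range.
struct Light {
    float x, y;      // centre projected to screen space, in pixels
    float radius;    // radius of influence on screen, in pixels
    float minDepth;  // nearest depth the light reaches
    float maxDepth;  // farthest depth the light reaches
};

// Hypothetical per-tile depth bounds, gathered from the G-buffer depth.
struct TileDepth {
    float minDepth;
    float maxDepth;
};

// Number of tiles needed to cover `pixels` along one axis (rounds up).
constexpr int tilesAlong(int pixels) {
    return (pixels + kTileSize - 1) / kTileSize;
}

// True when the light's circle overlaps the tile rectangle and the light's
// depth range overlaps the depth range of the geometry inside the tile.
bool lightAffectsTile(const Light& l, int tileX, int tileY, const TileDepth& d) {
    const float left = static_cast<float>(tileX * kTileSize);
    const float top = static_cast<float>(tileY * kTileSize);
    const float right = left + kTileSize;
    const float bottom = top + kTileSize;
    // Closest point of the tile rectangle to the light centre.
    const float cx = std::max(left, std::min(l.x, right));
    const float cy = std::max(top, std::min(l.y, bottom));
    const float dx = l.x - cx;
    const float dy = l.y - cy;
    const bool overlapsOnScreen = dx * dx + dy * dy <= l.radius * l.radius;
    const bool overlapsInDepth = l.minDepth <= d.maxDepth && l.maxDepth >= d.minDepth;
    return overlapsOnScreen && overlapsInDepth;
}

// Builds one light index list per tile, stored row by row.
std::vector<std::vector<std::uint32_t>> cullLights(
    int width, int height,
    const std::vector<Light>& lights,
    const std::vector<TileDepth>& depths) {
    const int tilesX = tilesAlong(width);
    const int tilesY = tilesAlong(height);
    assert(depths.size() == static_cast<std::size_t>(tilesX) * tilesY);

    std::vector<std::vector<std::uint32_t>> lists(depths.size());
    for (int ty = 0; ty < tilesY; ++ty) {
        for (int tx = 0; tx < tilesX; ++tx) {
            const std::size_t tile = static_cast<std::size_t>(ty) * tilesX + tx;
            for (std::uint32_t i = 0; i < lights.size(); ++i) {
                if (lightAffectsTile(lights[i], tx, ty, depths[tile])) {
                    lists[tile].push_back(i);
                }
            }
        }
    }
    return lists;
}
```

With these assumptions, a 3840x2160 frame is covered by 240 x 135 tiles, while a 1920x1080 frame needs 120 x 68, because the last row of tiles is only partially covered. The shading passes would then loop only over the indices stored for the tile a given pixel falls into, which is what lets the G-buffer be read once per pixel rather than once per light.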
[h3]Test case 1: static scene, adding lights manually one by one.[/h3]

[b]Test platform: Ryzen 5600X, 32GB DDR4, Nvidia RTX 3080, Win11, 4K[/b]

[img]https://clan.cloudflare.steamstatic.com/images//34659267/0435755dba29319eba6cd6b384ab35da642359bd.png[/img]
[b][i]This is what the synthetic multi-light test scenario looked like.[/i][/b]

[img]https://clan.cloudflare.steamstatic.com/images//34659267/0c64dc10d32b16335f36fcaa97700eaba5943ba6.png[/img]
[img]https://clan.cloudflare.steamstatic.com/images//34659267/ddb9d966ac40d1a79009a681c7348b68386ff755.png[/img]
[img]https://clan.cloudflare.steamstatic.com/images//34659267/e40aaa2c6a467f31b045180e149a4d6df745a460.png[/img]
[b][i]Synthetic multi-light benchmark scene results. In scenarios with more than 100 light sources, Tiled Deferred Rendering provides an almost 100% FPS boost![/i][/b]

[img]https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExYzE5YzU4ZTIzNThjODJmMTI5MzhjODdiZGQ2NjdjMDJhNDRlNmExNyZjdD1n/r5IQIc0562G7Xc4jln/giphy.gif[/img]
[b][i]You can see the tiles changing color as we turn sets of lights off.[/i][/b]

[h3]Test case 2: In-game CPU benchmark mode.[/h3]

[b]Results presented are the average of 2 runs.[/b]

[img]https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExZGViZWJmNTNmNmFjMTUyNDU5ZWRmOTMyOWI4NjRlYjMxODliNDJmYSZjdD1n/q0FU32FFakUobI9n2p/giphy.gif[/img]
[b]The test scene without the debug overlay.[/b]

[img]https://clan.cloudflare.steamstatic.com/images//34659267/d9aa3031bcdf0f1dc449948f9b7851d32c257781.png[/img]
[img]https://clan.cloudflare.steamstatic.com/images//34659267/2a163fe8020c204d588d776385aaa6b7cc856105.png[/img]
[img]https://clan.cloudflare.steamstatic.com/images//34659267/2c9026bfd62b60f977e4de9f771a123db46d804e.png[/img]
[b][i]"Benchmark CPU" testing results on three sample configurations. This benchmark is not perfect for testing this rendering method because it focuses primarily on CPU overhead.
However, even in this case, we can report visibly better performance results.[/i][/b]

[h2]Conclusions and future work[/h2]

Tiled Deferred Rendering has allowed us to save a lot of GPU processing power in The Riftbreaker. This technique will make it easier to maintain high, stable frame rates, or, if you’re already running the game above your monitor’s refresh rate, it will reduce the strain on your GPU and lower your electricity bill - everybody wins! You will be able to start enjoying its benefits as soon as the World Expansion 2 free update reaches your hard drives.

Thanks to the changes we have introduced to the Schmetterling Engine, exciting new possibilities have become available for use in The Riftbreaker and our future projects. With tiled deferred shading, we can now add support for transparency and particle light shading or even go for something much bigger like volumetric fog support. We believe that these improvements will positively affect the player's reception of the game and give our artists more creative freedom when it comes to designing levels and effects.

We hope you enjoyed this deeper dive under the hood of the Schmetterling Engine and The Riftbreaker. We are really excited about the possibilities unfolding before us, and we hope to start using the new tech as soon as possible.

Join our Discord at www.discord.gg/exorstudios to keep up with all the development news. Remember to post your ideas and suggestions at riftbreaker.featureupvote.com!

See you next time!
EXOR Studios

P.S. During the last developer stream on Twitch we managed to finish a full Survival run in co-op, running a dedicated server in headless mode. We're making progress! Join us on Tuesday at 3 PM CEST at www.twitch.tv/exorstudios for the next one!