Shader Optimisation Tips & Dealing with Platform Fragmentation

Key Things to Optimise

  • Memory: reduce the amount of memory used by shader variables.
  • Textures: reduce the number of textures the shader samples.
  • Efficiency: try to produce the same effect with less code or data.
  • Avoid using Unity’s internal shaders and Surface Shaders: they are designed to cover a broad range of general cases, whereas hand-written vertex and fragment shaders can be optimised further.
  • Reduce real-time shadows and lighting: real-time lighting is extremely expensive on mobile platforms. Bake lighting whenever possible.

The Problem with Platform Fragmentation    

  • Many different platforms must be supported, and each may only support a subset of features: use cheaper approximations and Shader LOD to keep costs down.
  • Wide differences between platforms: Unity supports many versions of Direct3D, OpenGL, OpenGL ES and Vulkan.
  • Platform-specific bugs need workarounds, especially shader precision issues. For example, the shader compiler on some OpenGL ES 2.0 platforms (iOS) needs viewDir and h in Unity’s Lighting.cginc to be medium precision rather than low precision; otherwise the specular highlight becomes too bright.
  • Check hardware support and study the GPU architectures of different devices. For example, PowerVR chipsets use tile-based deferred rendering (TBDR), which splits the geometry data into small rectangular “tiles”, each of which is processed as a small image of its own. The fragment shader is executed only on visible fragments, which are determined by the hardware. Other tile-based GPUs, such as Qualcomm Adreno and ARM Mali, use early-Z rejection: a low-precision Z-test that discards occluded pixels. Tegra chips use a more traditional architecture, which makes overdraw more likely to become a bottleneck.
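
To make the tiling idea above more concrete, here is a minimal Python sketch of the binning step only. It is a toy model, not how any particular GPU is implemented: the function name bin_triangles, the 32-pixel tile size and the bounding-box test are all assumptions made for illustration.

```python
import math

TILE_SIZE = 32  # hypothetical tile edge in pixels; real hardware tile sizes vary


def bin_triangles(triangles, width, height, tile_size=TILE_SIZE):
    """Map each tile (tx, ty) to the indices of triangles that may touch it.

    triangles: list of three (x, y) screen-space points each.
    The test is conservative: it uses each triangle's bounding box.
    """
    tiles_x = math.ceil(width / tile_size)
    tiles_y = math.ceil(height / tile_size)
    bins = {}
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Convert the bounding box to tile coordinates and clamp to the screen.
        min_tx = max(0, math.floor(min(xs)) // tile_size)
        max_tx = min(tiles_x - 1, math.floor(max(xs)) // tile_size)
        min_ty = max(0, math.floor(min(ys)) // tile_size)
        max_ty = min(tiles_y - 1, math.floor(max(ys)) // tile_size)
        # Off-screen triangles produce an empty range and land in no tile.
        for ty in range(min_ty, max_ty + 1):
            for tx in range(min_tx, max_tx + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins
```

Once every tile has its own short triangle list, visibility for that tile can be resolved before any fragment is shaded, which is where the savings described above come from.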

How to avoid overdraw?  

  • Unity attempts to take care of this problem automatically in its default shader code, using the Z-buffer to test and sort each object’s pixels, so that only the visible pixels are rendered.
  • However, transparency undermines the usefulness of the Z-buffer, because transparent pixels have to be blended with whatever lies behind them.
  • Be wary of transparent objects. The more transparent objects there are, the higher the likelihood of overdraw!
  • Reduce the total number of unique materials. Be aware that modifying properties of Renderer.material at runtime creates a new copy of the material if it is shared by multiple objects.
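
The effect of draw order and transparency on overdraw can be illustrated with a toy model of a single pixel. This is only a sketch under stated assumptions: the depth test runs before shading, opaque draws write depth, and transparent draws are tested against depth but never write it. The function name shaded_fragments is made up for this example.

```python
def shaded_fragments(draws):
    """Count fragment-shader runs for one pixel.

    draws: list of (depth, is_transparent) in submission order,
    where a smaller depth means closer to the camera.
    """
    nearest_opaque = float("inf")  # cleared depth buffer
    shaded = 0
    for depth, is_transparent in draws:
        if depth >= nearest_opaque:
            continue  # hidden behind an opaque surface: rejected before shading
        shaded += 1
        if not is_transparent:
            nearest_opaque = depth  # only opaque draws write depth
    return shaded


front_to_back = [(1.0, False), (2.0, False), (3.0, False)]
back_to_front = [(3.0, False), (2.0, False), (1.0, False)]
all_transparent = [(3.0, True), (2.0, True), (1.0, True)]
```

In this model the three opaque layers cost one shaded fragment when drawn front to back and three when drawn back to front, while the three transparent layers cost three in any order, since none of them can hide the others.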

Shader CPU optimisation 

Problem: Too many draw calls

Potential Solutions:

  • Use static and dynamic batching
  • Combine static scene elements into a single mesh (if possible)
  • Use GPU skinning for animation, rather than CPU skinning
  • Reduce the number of API calls and unique rendering components required to render the scene
  • Use texture atlases for objects that share a common rendering path, to reduce the number of state changes
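
As a rough illustration of why shared materials and atlases help, the Python sketch below counts how often the bound material changes across a submission order, and shows how a mesh's UVs could be remapped into one cell of a square atlas. The scene contents, the function names and the uniform grid layout are assumptions for the example; a real engine's batching rules are more involved.

```python
from itertools import groupby


def count_material_switches(draw_order):
    """Number of material binds needed for a submission order.

    Consecutive draws that use the same material share one bind.
    """
    return sum(1 for _ in groupby(draw_order))


def atlas_uv(u, v, cell_x, cell_y, cells_per_side):
    """Remap a mesh UV in [0, 1] into one cell of a square grid atlas."""
    scale = 1.0 / cells_per_side
    return ((cell_x + u) * scale, (cell_y + v) * scale)


# Hypothetical scene: one material name per draw, in submission order.
scene = ["rock", "tree", "rock", "tree", "rock", "grass"]
unsorted_binds = count_material_switches(scene)
sorted_binds = count_material_switches(sorted(scene))
```

Here the interleaved order needs six binds and the sorted order three. If all three textures were packed into one atlas and the objects shared a single material, the count would drop to one, with atlas_uv selecting each object's region.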

Shader GPU optimisation 

Problem: Too many vertices and/or fragments to process

Potential Solutions:

  • Optimise geometry: reduce the number of vertices to process 
  • Use model LOD 
  • Use occlusion culling
  • Control rendering order
  • Be wary of transparent objects
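
The model LOD idea above can be sketched as a simple distance lookup. This is a simplification for illustration only: the thresholds and vertex counts are invented, and selection here is by camera distance, whereas Unity's LOD Group component switches on screen-relative size.

```python
from bisect import bisect_right

# Hypothetical switch distances (world units) and per-level vertex counts.
LOD_DISTANCES = [10.0, 30.0, 80.0]
LOD_VERTEX_COUNTS = [12000, 4000, 900, 0]  # last entry: culled entirely


def select_lod(distance, thresholds=LOD_DISTANCES):
    """Return the LOD index for a camera distance (0 is the most detailed).

    An index equal to len(thresholds) means the object is past the last level.
    """
    return bisect_right(thresholds, distance)


def vertices_to_process(distances):
    """Total vertices submitted for objects at the given camera distances."""
    return sum(LOD_VERTEX_COUNTS[select_lod(d)] for d in distances)
```

For objects at distances 5, 20, 50 and 100, this model submits 12000 + 4000 + 900 + 0 vertices instead of 48000 at full detail.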

Always Use Rendering Analysis Tools First

Before optimising, use rendering analysis tools to find out which step causes the bottleneck. Identify which components of the rendering are the most expensive, and how content changes affect this overhead.

  1. Stats 
  2. Profiler 
  3. Frame Debugger