[Batch 4] Depth pyramid generation shader #383

@MichaelFisher1997

Summary

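Implement a GPU depth pyramid (hierarchical Z-buffer) for use in occlusion culling. The compute shader reads the previous frame's depth buffer and generates a mip chain in which each texel stores the min or max depth of the corresponding 2×2 block in the previous level (whichever is farthest under the depth convention in use; the shader below stores max). This enables occlusion queries that cost at most O(log n) in the screen dimension, because a query indexes into the mip chain instead of scanning every covered pixel.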
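
To illustrate how such a pyramid might be queried, here is a minimal CPU-side sketch in Python. It is not the planned GPU implementation: the function name `is_occluded`, the list-of-2D-lists pyramid layout, and the inclusive pixel rectangle are all hypothetical, and it assumes a max-depth pyramid with standard depth (0 = near, 1 = far) and power-of-two dimensions.

```python
def is_occluded(levels, x0, y0, x1, y1, nearest_depth):
    """Conservative occlusion test against a max-depth pyramid.

    levels[k][y][x] is the farthest depth in the 2^k x 2^k block at mip k
    (hypothetical layout). (x0, y0)-(x1, y1) is an inclusive rectangle in
    level-0 pixels; nearest_depth is the closest depth of the tested object.
    """
    extent = max(x1 - x0 + 1, y1 - y0 + 1)
    # Smallest level whose texels are at least as large as the rectangle,
    # i.e. ceil(log2(extent)); the rectangle then spans at most 2x2 texels.
    level = min((extent - 1).bit_length(), len(levels) - 1)
    mip = levels[level]
    farthest = max(
        mip[y][x]
        for y in range(y0 >> level, (y1 >> level) + 1)
        for x in range(x0 >> level, (x1 >> level) + 1)
    )
    # Hidden only if the object is behind the farthest stored occluder depth.
    return nearest_depth > farthest
```

Because each texel stores the farthest depth of its block, the test can only err towards "visible", which is the safe direction for culling.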

Depends on: #379 (GPU culling must be integrated before occlusion can be added)

Current State

No occlusion culling exists. Every chunk inside the frustum is rendered, even if it is completely hidden behind a mountain. In hilly terrain, 30-50% of in-frustum chunks may be fully occluded.

Depth Pyramid Specification

Input

  • Depth buffer from frame N-1 (previous frame's depth attachment)
  • Format: VK_FORMAT_D32_SFLOAT or VK_FORMAT_D16_UNORM

Output

  • Mip chain texture: same width/height as depth buffer, floor(log2(max_dimension)) + 1 levels
  • Each level: 2×2 min or max of previous level (not an average, which would not be conservative)
  • Level 0 = original depth
  • Level k = min/max depth for 2^k × 2^k blocks
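
As a quick check of the level count and per-level dimensions, the following Python sketch lists the size of every mip. It assumes the usual Vulkan rule that each level is half the previous one, rounded down, with a minimum of 1; the helper name `mip_chain_sizes` is hypothetical.

```python
def mip_chain_sizes(width, height):
    """Return [(w, h), ...] for every mip level, level 0 first."""
    sizes = [(width, height)]
    while sizes[-1] != (1, 1):
        w, h = sizes[-1]
        # Halve each dimension, rounding down, but never below 1.
        sizes.append((max(1, w // 2), max(1, h // 2)))
    return sizes


sizes = mip_chain_sizes(1920, 1080)
print(len(sizes))  # 11 levels, i.e. floor(log2(1920)) + 1
print(sizes[4])    # (120, 67)
```

Note that odd dimensions appear as early as level 3 (240×135) at 1080p, so the downsampling shader cannot assume every source level is exactly twice the destination size.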

Compute Shader

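// depth_pyramid.comp
#version 450
layout(local_size_x = 8, local_size_y = 8) in;
layout(binding = 0) uniform sampler2D src_depth;               // previous level
layout(binding = 1, r32f) uniform writeonly image2D dst_depth; // current level

void main() {
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    // Dispatches are rounded up to whole 8x8 groups, so skip out-of-range threads.
    if (any(greaterThanEqual(coord, imageSize(dst_depth)))) return;

    // Query the real source size: it is not always 2x the destination
    // (e.g. 135 -> 67). Clamp so the 2x2 block never reads past the edge.
    ivec2 src_max = textureSize(src_depth, 0) - 1;
    ivec2 base = coord * 2;

    // texelFetch reads exact texels, avoiding filtering and UV rounding issues.
    float d00 = texelFetch(src_depth, min(base, src_max), 0).r;
    float d10 = texelFetch(src_depth, min(base + ivec2(1, 0), src_max), 0).r;
    float d01 = texelFetch(src_depth, min(base + ivec2(0, 1), src_max), 0).r;
    float d11 = texelFetch(src_depth, min(base + ivec2(1, 1), src_max), 0).r;

    // Store maximum depth (furthest point) for conservative occlusion test.
    // Note: when the source size is odd, its last row/column is not covered
    // by any 2x2 block and still needs handling to stay fully conservative.
    float max_depth = max(max(d00, d10), max(d01, d11));
    imageStore(dst_depth, coord, vec4(max_depth, 0, 0, 0));
}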
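
For validating GPU output, the same 2×2 max reduction can be mirrored on the CPU. This Python sketch is one possible reference, not part of the planned implementation: the name `downsample_max` is hypothetical, depths are plain 2D lists, and clamping reads to the source bounds is an assumption of this sketch.

```python
def downsample_max(src):
    """Reduce a 2D list of depths by taking the max of each 2x2 block."""
    src_h, src_w = len(src), len(src[0])
    dst_w, dst_h = max(1, src_w // 2), max(1, src_h // 2)
    dst = []
    for y in range(dst_h):
        row = []
        for x in range(dst_w):
            # Clamp indices so a 1-texel-wide or 1-texel-tall source is
            # still read safely (assumption of this sketch).
            xs = (min(2 * x, src_w - 1), min(2 * x + 1, src_w - 1))
            ys = (min(2 * y, src_h - 1), min(2 * y + 1, src_h - 1))
            row.append(max(src[j][i] for j in ys for i in xs))
        dst.append(row)
    return dst


level0 = [[0.1, 0.2, 0.3, 0.4],
          [0.5, 0.6, 0.7, 0.8]]
print(downsample_max(level0))  # [[0.6, 0.8]]
```

Reading back one mip from the GPU and comparing it with `downsample_max` applied to the previous mip would catch off-by-one sampling errors that a visual heatmap can hide.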

Generation

  • Dispatch one compute pass per mip level
  • Level 0: read from depth attachment, write to mip 1
  • Level k: read from mip k, write to mip k+1
  • Pipeline barriers between levels (compute → compute)
  • Total dispatches: 10 for a 1080p depth buffer (floor(log2(1920)) + 1 = 11 mip levels; mips 1-10 each take one dispatch)
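
The per-pass dispatch sizes can be worked out ahead of time. The sketch below assumes the 8×8 workgroup size from the shader and one pass per destination mip; `dispatch_plan` and its tuple layout are hypothetical names for illustration.

```python
def dispatch_plan(width, height, local_size=8):
    """Return one (dst_mip, dst_w, dst_h, groups_x, groups_y) tuple per pass."""
    passes = []
    w, h, mip = width, height, 0
    while (w, h) != (1, 1):
        # Destination mip is half the source size, rounded down, minimum 1.
        w, h, mip = max(1, w // 2), max(1, h // 2), mip + 1
        # Round up so partially filled workgroups still cover the edge texels.
        groups_x = (w + local_size - 1) // local_size
        groups_y = (h + local_size - 1) // local_size
        passes.append((mip, w, h, groups_x, groups_y))
    return passes


plan = dispatch_plan(1920, 1080)
print(len(plan))  # 10
print(plan[0])    # (1, 960, 540, 120, 68)
```

Because the group counts are rounded up (540 / 8 = 67.5 becomes 68), some invocations fall outside the destination mip, so the shader needs a bounds check before writing.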

Implementation Plan

Step 1: Depth pyramid texture

  • Create a VK_IMAGE_TYPE_2D image with mip levels = floor(log2(max(width, height))) + 1 (ceil(log2(...)) comes up one level short for power-of-two sizes)
  • Format: VK_FORMAT_R32_SFLOAT (single float per texel)
  • Usage: SAMPLED_BIT | STORAGE_BIT | TRANSFER_SRC_BIT
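  • Image views: one covering the full mip chain for sampling, plus one single-mip view per level, since a storage image descriptor writes only the base mip level of its view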

Step 2: Compute pipeline

  • Pipeline with the depth pyramid compute shader
  • Descriptor: input texture (sampled) + output image (storage)
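  • Push constants: mip level being generated (for source mip selection); not needed if each pass binds a single-mip view of the source, which is what the shader above assumes by reading LOD 0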

Step 3: Integration into frame

  • After the opaque pass (or after the main pass completes): generate the pyramid from the depth buffer just rendered
  • The next frame uses this pyramid for occlusion, so from the consumer's side the depth data is from frame N-1
  • Pipeline barrier: COMPUTE_SHADER → COMPUTE_SHADER between mip levels
  • Pipeline barrier: LATE_FRAGMENT_TESTS → COMPUTE_SHADER at start

Step 4: Depth buffer copy

  • If depth attachment isn't directly readable as a texture:
    • Copy depth to a separate R32_SFLOAT texture first (or create the depth attachment with SAMPLED_BIT usage so it can be sampled directly)
    • Or render depth to a separate texture in the G-pass (if G-pass depth matches)

Files to Create

  • src/engine/graphics/vulkan/depth_pyramid.zig — texture management, dispatch, lifecycle
  • assets/shaders/vulkan/depth_pyramid.comp — compute shader
  • assets/shaders/vulkan/depth_pyramid.comp.spv — compiled SPIR-V

Files to Modify

  • src/engine/graphics/vulkan/pipeline_manager.zig — register compute pipeline
  • src/engine/graphics/render_graph.zig — add pyramid generation pass after opaque
  • build.zig — add glslangValidator check

Testing

  • Depth pyramid generates without validation errors
  • Each mip level has correct dimensions (half the previous level, rounded down, minimum 1)
  • Visual debug: overlay depth pyramid as grayscale heatmap
  • Performance: pyramid generation < 0.1ms on RTX 3060
  • Works at all tested resolutions

Roadmap: docs/PERFORMANCE_ROADMAP.md — Batch 4, Issue 3A-1

Labels: batch-4, bug, documentation, enhancement, perf/gpu-compute, shaders
