[Batch 4] Depth pyramid generation shader #383
Open
Labels
batch-4, bug, documentation, enhancement, perf/gpu-compute, shaders
Description
Summary
Implement a GPU depth pyramid (hierarchical Z-buffer) for use in occlusion culling. The compute shader reads the depth buffer from the previous frame and generates a mip chain where each level stores the min (or max) depth for a 2×2 block of the previous level. This enables O(log n) occlusion queries.
Depends on: #379 (GPU culling must be integrated before occlusion can be added)
Current State
No occlusion culling exists. All chunks within the frustum are rendered, even if they're completely hidden behind a mountain. In hilly terrain, 30-50% of in-frustum chunks may be fully occluded.
Depth Pyramid Specification
Input
- Depth buffer from frame N-1 (previous frame's depth attachment)
- Format: VK_FORMAT_D32_SFLOAT or VK_FORMAT_D16_UNORM
Output
- Mip chain texture: same width/height as depth buffer, ~log2(max_dimension) levels
- Each level: 2×2 min or max of previous level (not an average, which would not be conservative for occlusion tests)
- Level 0 = original depth
- Level k = min/max depth for 2^k × 2^k blocks
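The level layout above can be checked with a few lines of arithmetic. The following is a minimal CPU-side sketch in Python, used here purely for illustration; `mip_chain_sizes` is a hypothetical helper name, and it assumes the pyramid starts at the depth buffer resolution and follows the usual Vulkan rule that each mip dimension is the previous one halved, rounded down, and never smaller than 1.

```python
def mip_chain_sizes(width, height):
    """Return [(w, h), ...] for a full mip chain, level 0 first.

    Assumes each level is max(1, previous // 2) per axis, which matches
    the standard Vulkan mip size rule.
    """
    sizes = [(width, height)]
    while sizes[-1] != (1, 1):
        w, h = sizes[-1]
        sizes.append((max(1, w // 2), max(1, h // 2)))
    return sizes


def mip_level_count(width, height):
    """Number of levels in the full chain, including level 0."""
    return len(mip_chain_sizes(width, height))
```

For a 1920×1080 depth buffer this yields 11 levels (1920×1080 down to 1×1), so odd intermediate sizes such as 135 → 67 do occur and the shader has to tolerate them.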
Compute Shader
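// depth_pyramid.comp
#version 450
layout(local_size_x = 8, local_size_y = 8) in;
layout(binding = 0) uniform sampler2D src_depth; // previous level
layout(binding = 1, r32f) uniform writeonly image2D dst_depth; // current level
void main() {
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    // Dispatches are rounded up to whole 8x8 groups, so skip out-of-range threads
    if (any(greaterThanEqual(coord, imageSize(dst_depth)))) return;
    // Fetch the exact 2x2 source block. texelFetch avoids filtering and the
    // ambiguity of sampling exactly on a texel corner. LOD 0 assumes the bound
    // view exposes only the source level; the clamp covers 1-texel-wide levels.
    ivec2 base = coord * 2;
    ivec2 src_max = textureSize(src_depth, 0) - 1;
    float d00 = texelFetch(src_depth, min(base, src_max), 0).r;
    float d10 = texelFetch(src_depth, min(base + ivec2(1, 0), src_max), 0).r;
    float d01 = texelFetch(src_depth, min(base + ivec2(0, 1), src_max), 0).r;
    float d11 = texelFetch(src_depth, min(base + ivec2(1, 1), src_max), 0).r;
    // Store maximum depth (furthest point) for conservative occlusion test
    float max_depth = max(max(d00, d10), max(d01, d11));
    imageStore(dst_depth, coord, vec4(max_depth, 0, 0, 0));
}
Generation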
- Dispatch one compute pass per mip level
- Level 0: read from depth attachment, write to mip 1
- Level k: read from mip k, write to mip k+1
- Pipeline barriers between levels (compute → compute)
- Total dispatches: 10 for a 1080p depth buffer (floor(log2(1920)) = 10 reduction passes, producing mips 1 through 10)
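The per-level dispatch sizes follow directly from the mip dimensions and the 8×8 workgroup size. The sketch below is a hedged illustration in Python rather than engine code; `dispatch_plan` is a hypothetical helper, and it assumes one pass per destination mip with group counts rounded up so that edge texels are covered.

```python
def dispatch_plan(width, height, local_size=8):
    """List (dst_level, groups_x, groups_y) for each reduction pass.

    Assumes the chain starts at the depth buffer resolution and that each
    pass writes one destination mip using local_size x local_size groups.
    """
    plan = []
    w, h, level = width, height, 0
    while (w, h) != (1, 1):
        w, h, level = max(1, w // 2), max(1, h // 2), level + 1
        # Round up so partially filled workgroups still cover edge texels.
        groups_x = (w + local_size - 1) // local_size
        groups_y = (h + local_size - 1) // local_size
        plan.append((level, groups_x, groups_y))
    return plan
```

At 1920×1080 this gives 10 passes, starting with 120×68 groups for mip 1 and ending with a single group for mip 10. Because the counts are rounded up, a shader dispatched this way needs to guard against invocations that fall outside the destination level.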
Implementation Plan
Step 1: Depth pyramid texture
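- Create a VK_IMAGE_TYPE_2D image with mip levels = floor(log2(max(width, height))) + 1 (a full chain down to 1×1)
- Format: VK_FORMAT_R32_SFLOAT (single float per texel)
- Usage: SAMPLED_BIT | STORAGE_BIT | TRANSFER_SRC_BIT
- Image view with mip chain for sampling, plus a single-level view per mip for storage writes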
Step 2: Compute pipeline
- Pipeline with the depth pyramid compute shader
- Descriptor: input texture (sampled) + output image (storage)
- Push constants: mip level being generated (for source mip selection)
Step 3: Integration into frame
- After opaque pass (or after main pass completes): generate pyramid from frame N-1 depth
- The pyramid from the previous frame is used for occlusion in the current frame
- Pipeline barrier: COMPUTE_SHADER → COMPUTE_SHADER between mip levels
- Pipeline barrier: LATE_FRAGMENT_TESTS → COMPUTE_SHADER at start
Step 4: Depth buffer copy
- If depth attachment isn't directly readable as a texture:
  - Copy depth to a separate R32_SFLOAT texture first (or use depth-as-sampleable format)
  - Or render depth to a separate texture in the G-pass (if G-pass depth matches)
Files to Create
- src/engine/graphics/vulkan/depth_pyramid.zig: texture management, dispatch, lifecycle
- assets/shaders/vulkan/depth_pyramid.comp: compute shader
- assets/shaders/vulkan/depth_pyramid.comp.spv: compiled SPIR-V
Files to Modify
- src/engine/graphics/vulkan/pipeline_manager.zig: register compute pipeline
- src/engine/graphics/render_graph.zig: add pyramid generation pass after opaque
- build.zig: add glslangValidator check
Testing
- Depth pyramid generates without validation errors
- Each mip level has correct dimensions (half previous)
- Visual debug: overlay depth pyramid as grayscale heatmap
- Performance: pyramid generation < 0.1ms on RTX 3060
- Works at all tested resolutions
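One way to make the correctness checks above concrete is to compare a read-back mip level against a CPU reference. The following is a minimal sketch in Python, assuming a max reduction over exact 2×2 blocks as the shader comment describes; `reduce_max_2x2` and `build_pyramid` are hypothetical names, and the depth data is represented as a plain list of rows.

```python
def reduce_max_2x2(src):
    """Downsample a 2D list of depths by taking the max of each 2x2 block."""
    src_h, src_w = len(src), len(src[0])
    dst_w, dst_h = max(1, src_w // 2), max(1, src_h // 2)
    dst = []
    for y in range(dst_h):
        row = []
        for x in range(dst_w):
            # Clamp so 1-texel-wide levels read the same texel twice.
            xs = (min(2 * x, src_w - 1), min(2 * x + 1, src_w - 1))
            ys = (min(2 * y, src_h - 1), min(2 * y + 1, src_h - 1))
            row.append(max(src[sy][sx] for sy in ys for sx in xs))
        dst.append(row)
    return dst


def build_pyramid(depth):
    """Return [level0, level1, ...] down to a 1x1 level."""
    levels = [depth]
    while len(levels[-1]) > 1 or len(levels[-1][0]) > 1:
        levels.append(reduce_max_2x2(levels[-1]))
    return levels
```

A test could fill the depth buffer with a known pattern, generate the pyramid on the GPU, and compare each level against `build_pyramid` within a small tolerance. Note that with floor-halved sizes, an odd source dimension leaves its last row or column unsampled in this simple scheme, so a fully conservative pyramid would need to handle that case separately.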
Roadmap: docs/PERFORMANCE_ROADMAP.md — Batch 4, Issue 3A-1