Problem
Each build-agents matrix job in build-images.yml recompiles the entire Rust binary from scratch, despite the cache-from + BUILDER_IMAGE mechanism designed to reuse the build-core output.
Evidence
From run #28303304486:
- build-core (arm64) completes and pushes the builder image to the registry
- build-agents jobs start after build-core completes (correct needs dependency)
- But each variant job still shows a full cargo build compilation in the "Build and push by digest" step
Example: hermes arm64 job — log shows compiling 2000+ crates from scratch (fdeflate, webpki-roots, image-webp, etc.)
Timing (arm64 jobs)
| Variant | Duration |
|---------|----------|
| grok    | 6 min    |
| cursor  | 10 min   |
| copilot | 11 min   |
| claude  | 11 min   |
| codex   | 11 min   |
| kiro    | 12 min   |
| hermes  | 14+ min  |
All are spending the majority of time on Rust compilation.
Root Cause
In Dockerfile.unified:
ARG BUILDER_IMAGE=rust:1-bookworm
FROM ${BUILDER_IMAGE} AS builder
WORKDIR /build
COPY Cargo.toml Cargo.lock ./
COPY crates/openab-core/Cargo.toml crates/openab-core/Cargo.toml
COPY crates/openab-gateway/Cargo.toml crates/openab-gateway/Cargo.toml
RUN ... cargo build --release --features unified ...
COPY crates/ crates/
COPY src/ src/
RUN ... cargo build --release --features unified
Even though BUILDER_IMAGE is set to the pre-built builder from registry, Docker buildx still evaluates the COPY + RUN layers. If the build context (file hashes) does not exactly match the cached layers, all subsequent layers are invalidated and recompiled.
The cache-from: type=registry helps with layer matching, but in practice the full recompile still happens — likely because the context sent to buildx differs slightly between build-core and build-agents jobs (same checkout, but timing/metadata differences can affect layer hashes).
Impact
- N variants × 2 architectures = ~28 redundant Rust compilations
- Each takes 6-14 min → total CI time much higher than necessary
- Wastes GitHub Actions minutes
Suggested Fix
Instead of relying on Docker layer cache for the builder stage, have build-agents directly copy the pre-built binary:
# Option A: Multi-stage with explicit image reference
ARG BUILDER_IMAGE
FROM ${BUILDER_IMAGE} AS prebuilt-builder
FROM debian:bookworm-slim AS hermes
COPY --from=prebuilt-builder /build/target/release/openab /usr/local/bin/openab
# ... install runtime deps ...
Or separate the Dockerfile so variant targets do NOT include the builder stage at all — they just COPY --from=<registry>/builder:<tag>-<arch>.
This should reduce each variant job from 10+ min to roughly the time needed to pull the builder image and add a thin layer, likely under 1 min.
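For the "separate Dockerfile" variant, a minimal sketch could look like the following. The file name Dockerfile.variant, the stage names, the ca-certificates runtime dependency, and the ENTRYPOINT are assumptions for illustration; the binary path is taken from Option A and assumes the pushed builder image really contains /build/target (that is, the target directory was not kept in a cache mount). As far as I know, COPY --from does not expand build args, so the image is aliased through a FROM stage rather than referenced directly in COPY.

```dockerfile
# Dockerfile.variant (hypothetical file name): contains no cargo build stage at all.

# Declared before the first FROM so it can be used in a FROM line.
# Deliberately has no default: the build fails early if the caller
# forgets to pass --build-arg BUILDER_IMAGE=<registry>/builder:<tag>-<arch>.
ARG BUILDER_IMAGE

# Alias the pre-built builder image as a stage. No instructions run here,
# so nothing is compiled; the stage only exists as a COPY source.
FROM ${BUILDER_IMAGE} AS prebuilt-builder

# Runtime image for one variant (stage name follows Option A).
FROM debian:bookworm-slim AS hermes

# Assumed runtime dependency; replace with whatever the variant actually needs.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Copy the already-compiled binary out of the builder image.
COPY --from=prebuilt-builder /build/target/release/openab /usr/local/bin/openab

# Assumed entrypoint, for illustration only.
ENTRYPOINT ["/usr/local/bin/openab"]
```

Because this file never references the Rust sources, a change in build context between build-core and build-agents cannot trigger a recompile; the only input that matters is the BUILDER_IMAGE reference passed by the workflow.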