This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This repository contains a tool to fix ZIP files >4GB that were created with a bug where all offsets in the directory structure are written using 32-bit fields instead of ZIP64 format. The files are not damaged but cannot be extracted with standard tools because offsets wrap around at the 4GB boundary.
python3 zip64_fix.py <input.zip> [output.zip]

If no output path is given, writes <input>_fixed.zip.
zip64_fix.py fixes the broken ZIP in three phases:
- Locate central directory: Finds the EOCD at the end of the file, then reads the central directory at eocd_offset - cd_size (since the stored offset is truncated to 32 bits).
- Calculate actual offsets: Walks entries in order, tracking when stored_offset + adjustment wraps below the previous entry's end, adding 4GB to the adjustment at each boundary crossing. This handles files where the first N entries have correct offsets (below 4GB) and subsequent entries are off by multiples of 4GB.
- Write fixed output: Writes new local file headers (with proper sizes and ZIP64 extra fields only when needed), copies compressed data from the original, then writes a new central directory and end records. Uses ZIP64 extensions only for fields that actually exceed 32-bit limits.
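
The offset calculation in the second phase can be sketched roughly as follows. This is a minimal illustration, not the code in zip64_fix.py: the function name `unwrap_offsets` and the `(stored_offset, span)` input shape are assumptions made for the example, where `span` stands for the number of bytes an entry occupies in the file (local header plus compressed data).

```python
FOUR_GB = 1 << 32  # the point at which a 32-bit offset wraps around


def unwrap_offsets(entries):
    """Recover real offsets from 32-bit-truncated ones.

    `entries` is a list of (stored_offset, span) tuples in central
    directory order. Both the name and the input shape are hypothetical.
    """
    actual = []
    adjustment = 0  # always a multiple of 4GB
    prev_end = 0    # end of the previous entry, in real file offsets
    for stored_offset, span in entries:
        candidate = stored_offset + adjustment
        # A candidate that lands before the previous entry's end means the
        # stored offset wrapped; add 4GB until it no longer does. The loop
        # also covers a single entry that spans more than one boundary.
        while candidate < prev_end:
            adjustment += FOUR_GB
            candidate = stored_offset + adjustment
        actual.append(candidate)
        prev_end = candidate + span
    return actual
```

The sketch relies on entries being stored in ascending offset order, which is what lets a backwards jump be read as a wrap rather than as a genuinely earlier entry.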
- Sizes and CRC32 are taken as-is from the original central directory (no decompression/recalculation).
- The data descriptor flag (bit 3) is cleared in output headers since sizes are written directly.
- ZIP64 extra fields are built conditionally: only fields that overflow 32 bits are included.
- Local headers are rewritten (not copied) because the originals may have wrong extra field layouts.
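
As an illustration of the conditional ZIP64 extra field, here is a minimal sketch for a central directory record. The function name `build_zip64_extra` and its return shape are assumptions for this example and may not match zip64_fix.py. The field layout (header ID 0x0001, then 8-byte values in the order uncompressed size, compressed size, header offset) follows the ZIP specification; the disk start number is left out for brevity.

```python
import struct

LIMIT32 = 0xFFFFFFFF  # sentinel meaning "see the ZIP64 extra field"


def build_zip64_extra(uncompressed_size, compressed_size, header_offset):
    """Return (extra_bytes, stored_values) for a central directory record.

    Hypothetical helper: only values that do not fit in 32 bits are moved
    into the extra field; the rest stay in the regular 32-bit fields.
    """
    fields = b""
    stored = []
    for value in (uncompressed_size, compressed_size, header_offset):
        if value >= LIMIT32:
            # Too large (or equal to the sentinel): write 8 bytes in the
            # extra field and the sentinel in the 32-bit field.
            fields += struct.pack("<Q", value)
            stored.append(LIMIT32)
        else:
            stored.append(value)
    if not fields:
        return b"", tuple(stored)  # no ZIP64 extra field needed
    # Header ID 0x0001, then the payload length, then the payload.
    extra = struct.pack("<HH", 0x0001, len(fields)) + fields
    return extra, tuple(stored)
```

For example, an entry with small sizes but a header offset of 5,000,000,000 would get a 12-byte extra field holding only the offset. Note that the ZIP specification treats local headers differently: when a local header carries a ZIP64 extra field, it must include both size fields.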