feat(backup): gzip-compressed backups (~5x smaller)#8
Merged
Conversation
Backup files in ~/.sedx/backups/<id>/ are now gzipped with a .gz suffix. On a real-world Rust source sample (86 KB) the backup shrinks to 15.6 KB — a 5.5x reduction. Savings scale with file compressibility; on logs and structured text the ratio is typically even better. Streaming I/O through flate2's GzEncoder keeps peak memory flat for large backups (same story as the main streaming file processor). Backwards compatibility: restore_backup auto-detects legacy uncompressed backups by checking for the .gz suffix on the stored backup_path, so pre-v1.1 backups remain restorable. Covered by the new restore_accepts_legacy_uncompressed_backup unit test. Other backup-related tests updated to assert the .gz filename and to verify content round-trips through decompression rather than reading the backup as raw text.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
~/.sedx/backups/<id>/filename.extis now~/.sedx/backups/<id>/filename.ext.gz. Streaming gzip viaflate2::GzEncoderkeeps memory flat on large files.Real-world measurement (2000-line Rust source, 86,210 bytes):
Logs / structured text will compress even harder; tiny configs will see more modest ratios (still a win — overhead of the gzip header is ~20 bytes).
Compatibility
restore_backupauto-detects legacy uncompressed backups by checking for the.gzsuffix onFileBackup::backup_path— pre-v1.1 backup directories remain fully restorable after upgrade. New regression test:restore_accepts_legacy_uncompressed_backup.Why gzip (and not diff)
Full-file gzip keeps the same "always-restorable" guarantee the current uncompressed backups have. Diff-based backups would save more space but trade reliability (manual edits between backup and rollback can invalidate the patch context). sedx's whole pitch is safety-by-default, so this PR takes the conservative-savings option.
Test plan
cargo clippy --all-targets -- -D warningsclean on both Linux and Windows cross-targetcargo fmt --checkcleanFollow-ups not in scope
Compression::default()(level 6). Could expose a config knob if users wantCompression::fast()orCompression::best(), but no evidence it's needed yet.