Commit 999aec6

refactor: optimize parquet library read/write performance
- Batch all binary encode/decode functions to operate on arrays instead of single values, reducing pack/unpack call overhead
- Replace BinaryBufferReader's DataSize objects and Bytes allocations with raw int tracking and string returns, eliminating object creation on hot paths
- Skip Dremel shredding/assembly for flat non-nested columns, bypassing unnecessary FlatValue object creation and definition level conversion
- Replace array_merge with array_push in ColumnChunkBuilders and batch statistics tracking to avoid O(n²) array copying
- Memoize maxDefinitionsLevel/maxRepetitionsLevel on schema columns and cache flatPath in local variables to cut repeated method calls

refactor: parquet writer from row-by-row to columnar shredding
refactor: replace serialize with pack on primitive values
refactor: optimize flat columns writing
refactor: make dremel shredder precache shredding plans
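
The first and fourth points above (batched pack/unpack, and array_push in place of array_merge) can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the library; the sketch assumes unsigned 32-bit little-endian values and only shows the general pattern, not the actual code in this commit.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers for illustration only; not the library's API.

/** Per-value pattern: one pack() call for every element. */
function encodeInt32PerValue(array $values): string
{
    $out = '';
    foreach ($values as $value) {
        $out .= pack('V', $value); // 'V' = unsigned 32-bit, little-endian
    }

    return $out;
}

/** Batched pattern: a single pack() call for the whole array. */
function encodeInt32Batch(array $values): string
{
    return $values === [] ? '' : pack('V*', ...$values);
}

/** Batched decode: a single unpack() call, re-indexed from 0. */
function decodeInt32Batch(string $bytes): array
{
    // unpack() with 'V*' returns an array keyed from 1, hence array_values().
    return $bytes === '' ? [] : array_values(unpack('V*', $bytes));
}

/**
 * Append in place. Unlike `$target = array_merge($target, $values)`,
 * this does not build a new copy of $target on every call.
 */
function appendAll(array &$target, array $values): void
{
    if ($values !== []) {
        array_push($target, ...$values);
    }
}
```

Both encoders produce the same bytes; the batched variant simply makes one function call per array instead of one per value. Whether spreading very large arrays into a single call is appropriate depends on the data sizes involved, which this sketch does not address.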
1 parent a917ec6 commit 999aec6

69 files changed

Lines changed: 2903 additions & 2722 deletions

shell.nix

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ pkgs.mkShell {
     pkgs.figlet
     pkgs.symfony-cli
     pkgs.act
+    pkgs.hyperfine
   ]
   ++ pkgs.lib.optional with-blackfire pkgs.blackfire
   ++ pkgs.lib.optionals with-wasm [

src/lib/parquet/src/Flow/Parquet/Binary/Bytes.php

Lines changed: 0 additions & 110 deletions
This file was deleted.
