Releases: datafusion-contrib/datafusion-distributed
v1.0.0
What's Changed
- Bring https://github.com/gabotechs/datafusion-distributed-experiment code by @gabotechs in #68
- Adds error serialization-deserialization by @gabotechs in #69
- Remove stage delegation in favor of planning-time stage assignation by @gabotechs in #71
- Fix rust toolchain to 1.83.0 by @gabotechs in #72
- Completed execution path + failing test by @robtandy in #74
- Fix serialization error by @LiaCastaneda in #76
- Small cleanup after #74 by @gabotechs in #75
- Fix ArrowFlightReadExec result streaming by @gabotechs in #77
- Add stage planner tests by @gabotechs in #78
- Split ArrowFlightReadExec node placement for distributed planning by @gabotechs in #79
- Update DataFusion version from 48.0.0 to 49.0.0 by @gabotechs in #82
- add doc comment for execution stage struct by @robtandy in #80
- Support user provided codecs by @gabotechs in #81
- Move all test utils to src/ and hide them behind an "integration" feature by @gabotechs in #84
- Add test comparing distributed + single node execution on TPCH data by @jayshrivastava in #83
- Execution working on all 22 TPCH queries by @robtandy in #89
- Add delta report for benchmarks by @gabotechs in #91
- Removes an extra line jump in distributed explains by @gabotechs in #95
- Create TTL map with time wheel architecture by @jayshrivastava in #96
- Fix compilation errors and warnings by @gabotechs in #102
- Introduce `ConfigExtensionExt`, allowing the propagation of arbitrary `ConfigExtension`s across network boundaries by @gabotechs in #100
- Nested Loop Joins (fixes TPCH query 22) by @robtandy in #104
- Improve `SessionBuilder` ergonomy and fix clippy errors by @gabotechs in #103
- Collect Left Hash Joins by @robtandy in #105
- Introduce `DistributedExt` trait that extends the capabilities of DataFusion's session building tools by @gabotechs in #106
- Add plan validations to TPCH tests by @gabotechs in #107
- do_get: use TTL map to store task state by @jayshrivastava in #108
- Refactor arrow_flight_read.rs and friends by @gabotechs in #109
- Add `localhost_run.rs` and `localhost_worker.rs` examples by @gabotechs in #111
- Add README.md and LICENSE.txt by @gabotechs in #114
- Fix panics in tests and un-ignore working tests by @gabotechs in #120
- Improve EXPLAIN render by @gabotechs in #121
- Bigger TPCH tests by @gabotechs in #122
- File name and folder restructure by @gabotechs in #124
- Refactor do_get.rs and adjacent files by @gabotechs in #125
- Adds in-memory example by @gabotechs in #132
- Add support for in-memory TPCH tests by @gabotechs in #129
- Comment flaky test by @gabotechs in #133
- Support `--threads` and `--workers` on TPCH benchmarks by @gabotechs in #130
- Report host stats on TPCH benchmarks by @gabotechs in #131
- Robtandy/better graphviz plans by @robtandy in #135
- changes to allow nice graphviz of single node plans too by @robtandy in #136
- metrics: add metrics module and protos by @jayshrivastava in #141
- fix bug in graphviz for determining output partitions by @robtandy in #142
- move chrono out of optional deps so project can compile by @robtandy in #143
- execution_plans: add metrics collector and re-writer by @jayshrivastava in #144
- Distributed planning overhaul by @gabotechs in #145
- Update README.md with new diagrams based on NetworkShuffleExec and NetworkCoalesceExec by @gabotechs in #153
- Do not require default datafusion features by @gabotechs in #154
- set msrv via Cargo.toml, use 2024 edition by @adriangb in #152
- remove feature flags around chrono::DateTime by @adriangb in #155
- fix: Move `error.rs` to protobuf by @jonathanc-n in #156
- execution_plans: add MetricsCollectingStream by @jayshrivastava in #150
- flight_service: add TrailingFlightDataStream by @jayshrivastava in #157
- fix: Incorrect weather parquet path in examples by @zuston in #165
- Generalize functions for NetworkCoalesceExec creation by @jonathanc-n in #162
- fix: Enable distributed plan for localhost_run by @zuston in #166
- Address public api weak points by @gabotechs in #158
- Add partition coalescing at the head of the plan by @gabotechs in #164
- Fix early drop stateful nodes by @gabotechs in #159
- Remove unnecessary StageExec proto serde overhead by @gabotechs in #163
- flight_service: emit metrics from ArrowFlightEndpoint by @jayshrivastava in #160
- Evolve `ChannelResolver` trait for requiring a `FlightServiceClient` instead of a `tonic::BoxSyncCloneChannel` by @gabotechs in #172
- update to DataFusion 50 by @adriangb in #146
- Use upstream composed extension codec by @gabotechs in #176
- Fix Dictionary Encoded Values by @cetra3 in #174
- Rework execution plan hierarchy for better interoperability by @gabotechs in #178
- Fix in-memory example by @gabotechs in #183
- Misc improvements to public API by @gabotechs in #181
- Add DistributedPlanError::NonDistributable rule and do not distribute SHOW COLUMNS by @gabotechs in #195
- implement distributed EXPLAIN ANALYZE by @jayshrivastava in #182
- Refactor distributed planner into its own folder by @gabotechs in #196
- Fix user provided UDFs encoding by @gabotechs in #200
- Add dynamic task config based on DataFusion extension options by @gabotechs in htt...