
Add RFC to introduce Bolt backend for native engine #59

Open
Weixin-Xu wants to merge 8 commits into prestodb:main from Weixin-Xu:introduce_bolt

Conversation

@Weixin-Xu

@Weixin-Xu Weixin-Xu commented Apr 14, 2026

Summary

Introduce Bolt as an additional backend for the Presto native execution engine.

The initial implementation provides a Bolt-based native worker that implements the Presto worker protocol and integrates with the existing Presto coordinator.

To support the Bolt backend build and dependency requirements, a Conan-based dependency flow is introduced for this worker module. Standardizing dependency management across all native backends is out of scope for this RFC.

@frankobe @ZacBlanco

@beinan
Member

beinan commented Apr 16, 2026

Looking forward to having bolt in native presto workers!

Improve RFC with more implementation specifics
Comment thread RFC-0024-bolt-backend.md

The current code includes dedicated Bolt converters such as:

* `PrestoToBoltQueryPlan`
Member

> 1. The Bolt worker deserializes those fragments using the existing Presto protocol model

How will future divergence in the protocol with Velox be handled? What happens if Velox requires a protocol change that isn’t compatible with Bolt, and vice versa?

Author

It depends on where the change originates.

If it’s a change to the Presto protocol, then it becomes a contract change, and we need to ensure both Bolt and Velox can handle it correctly (ideally in a backward-compatible way).
If it’s a Bolt- or Velox-specific interface change, then it should be handled within their respective translation/converter layers.

@yingsu00 yingsu00 May 12, 2026

> 1. The Bolt worker deserializes those fragments using the existing Presto protocol model
>
> How will future divergence in the protocol with Velox be handled? What happens if Velox requires a protocol change that isn’t compatible with Bolt, and vice versa?

I have the same question. It's not just about the Presto protocol but a general concern. I think the right way to handle this kind of concern is to support versioning on common interfaces such as the Presto SPI, the Presto communication protocol, and the Velox interfaces. In the past, people have been very cautious about changing the Presto SPI, but changes were made frequently and freely on the Velox side, for example to the connector interfaces, DWIO interfaces, and Presto protocol. This caused many rebase conflicts in our internal repo. I hope ByteDance Bolt can do a better job on this in the future.

So I see this as an opportunity to start cleaning things up, and maybe we can start by working on protocol versioning support in the Presto and Bolt repos first.
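
To make the versioning idea concrete, here is a minimal sketch of how a coordinator and a worker might agree on a protocol version. Everything in it is an assumption for illustration: `VersionRange`, `negotiate`, and the version numbers are hypothetical and do not correspond to any existing Presto, Velox, or Bolt API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionRange:
    """Inclusive range of protocol versions one party can speak (hypothetical)."""
    min_version: int
    max_version: int


def negotiate(coordinator: VersionRange, worker: VersionRange) -> Optional[int]:
    """Return the highest version both sides support, or None if the ranges are disjoint."""
    low = max(coordinator.min_version, worker.min_version)
    high = min(coordinator.max_version, worker.max_version)
    return high if low <= high else None


# Made-up ranges: each backend may lag or lead the coordinator independently.
coordinator = VersionRange(3, 5)
velox_worker = VersionRange(4, 5)
bolt_worker = VersionRange(2, 4)

print(negotiate(coordinator, velox_worker))
print(negotiate(coordinator, bolt_worker))
```

With a scheme along these lines, a protocol change needed by one backend could land as a new version without forcing the other backend to adopt it in the same release, as long as the ranges still overlap.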

Comment thread RFC-0024-bolt-backend.md
* `PrestoToBoltExpr`
* `PrestoToBoltConnector`
* `PrestoToBoltSplit`
* `BoltPlanConversion` and `BoltPlanValidator`
Contributor

Author

Bolt execution aims to cover all side-car callbacks and extend them where needed.

Comment thread RFC-0024-bolt-backend.md

### 5. CI Plan

CI for Bolt should be split into a few clear lanes:
Contributor

We have a fairly exhaustive set of tests in the presto-native-tests module: https://github.com/prestodb/presto/tree/master/presto-native-tests. Please ensure these are covered as well.

Author

End-to-end tests will use the same test module wherever possible.

@jja725 jja725 self-requested a review May 8, 2026 23:16
Comment thread RFC-0024-bolt-backend.md
* `PrestoToBoltSplit`
* `BoltPlanConversion` and `BoltPlanValidator`

This is intentionally backend-local. The initial implementation does not try to share plan conversion logic with the Velox backend.

Can we have the function-coverage delta: which Velox functions are not yet in Bolt, and which Bolt functions don't match Velox semantics?

Author

Function coverage will be reflected in the bolt/bolt-execution unit tests and will match Presto semantics.
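
As an illustration of the delta asked about above, the following sketch compares two function registries and reports functions missing from one side and functions whose results differ on shared probe inputs. It is a toy under stated assumptions: `coverage_delta`, the `toy_*` function names, and the probe inputs are all hypothetical, and the registries stand in for the real Velox and Bolt registries rather than describing them.

```python
def coverage_delta(reference, candidate, probes):
    """Compare two name -> callable registries (both hypothetical).

    Returns (missing, mismatched): names absent from `candidate`, and shared
    names whose results differ on at least one probe input.
    """
    missing = sorted(set(reference) - set(candidate))
    mismatched = sorted(
        name
        for name in set(reference) & set(candidate)
        if any(reference[name](*args) != candidate[name](*args)
               for args in probes.get(name, []))
    )
    return missing, mismatched


# Toy registries: the two sides round halves differently on purpose.
reference_like = {
    "toy_round": lambda x: int(x + 0.5),  # rounds halves up (non-negative inputs)
    "toy_abs": abs,
    "toy_neg": lambda x: -x,
}
candidate_like = {
    "toy_round": round,  # Python's round() sends halves to the nearest even integer
    "toy_abs": abs,
}
probes = {"toy_round": [(0.5,), (1.5,)], "toy_abs": [(-1,)]}

missing, mismatched = coverage_delta(reference_like, candidate_like, probes)
print("missing:", missing)
print("mismatched:", mismatched)
```

A report in this shape would answer both halves of the question directly, although a real comparison would need probe inputs that cover nulls, overflow, and other edge cases.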

Comment thread RFC-0024-bolt-backend.md Outdated
@@ -0,0 +1,358 @@
# RFC-0023: Introduce Bolt Backend for Presto Native Execution

We already have RFC-0023-vector-search.md; please change the number.

Comment thread RFC-0024-bolt-backend.md
* worker server implementation
* task execution logic
* operators
* plan, expression, connector, and split conversion

The coordinator's planner produces one plan, and the RFC names BoltPlanValidator as the Bolt-side analog of getVeloxPlanValidator(). But validation is downstream of plan emission: by the time the worker rejects a plan, the query has already been sent. It might be better to have a coordinator-side capability description so the planner can avoid emitting plans the deployed backend can't run. The RFC should at least call out that this gap exists and how it's bridged for the homogeneous-pool case.
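
One way to picture a coordinator-side capability description is a set of plan node kinds that the deployed backend advertises, which the planner checks before sending anything to a worker. The sketch below is only an illustration under assumptions: `PlanNode`, `unsupported_nodes`, and the capability set are hypothetical and say nothing about what Bolt actually supports.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PlanNode:
    """Simplified stand-in for a plan node: a kind plus child nodes (hypothetical)."""
    kind: str
    children: List["PlanNode"] = field(default_factory=list)


def unsupported_nodes(plan: PlanNode, capabilities: Set[str]) -> Set[str]:
    """Walk the plan tree and collect node kinds the backend does not advertise."""
    missing = set()
    stack = [plan]
    while stack:
        node = stack.pop()
        if node.kind not in capabilities:
            missing.add(node.kind)
        stack.extend(node.children)
    return missing


# Made-up capability set for a homogeneous worker pool.
backend_capabilities = {"TableScan", "Filter", "Project"}
plan = PlanNode("Project", [PlanNode("Window", [PlanNode("TableScan")])])

# A non-empty result means the planner should fail fast or choose another plan.
print(unsupported_nodes(plan, backend_capabilities))
```

The worker-side validator would still be needed as a safety net; a check like this only moves the common rejections ahead of plan emission.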

Comment thread RFC-0024-bolt-backend.md

The initial implementation keeps the existing Velox-based worker unchanged and adds a sibling module, `presto-bolt-execution`, that implements the same Presto worker protocol against Bolt. The coordinator, query protocol, and external worker model remain unchanged.

The current code does not turn `presto-native-execution` into a generic shared framework. Instead, it adds a Bolt-specific worker tree and extracts only a small set of reusable helpers from `presto-native-execution`. Build enablement is also separate in the initial implementation: Velox and Bolt are built from different module directories and produce different worker binaries from different build roots.

Where will the small set of reusable helpers reside?

Author

We could add a top-level presto-native-common-helper module to provide a unified abstraction layer for reusable native integration helpers.

Comment thread RFC-0024-bolt-backend.md
## Summary

This RFC introduces Bolt as an additional backend for Presto's native worker implementation.


Presto currently has other native modules at root level:

  • presto-native-sidecar-plugin
  • presto-native-tests

What's the plan for presto-bolt-execution to work with them? Would presto-native-tests be used to cover both Velox and Bolt?

Author

Before the PR is finalized, presto-bolt-execution will be validated through presto-native-tests. This should help ensure Bolt stays aligned with existing Presto behaviors and test coverage.

For presto-native-sidecar-plugin, we have not tried integrating with it yet. This will be part of our next-step investigation and integration plan.

Comment thread RFC-0024-bolt-backend.md

The initial implementation keeps the existing Velox-based worker unchanged and adds a sibling module, `presto-bolt-execution`, that implements the same Presto worker protocol against Bolt. The coordinator, query protocol, and external worker model remain unchanged.

The current code does not turn `presto-native-execution` into a generic shared framework. Instead, it adds a Bolt-specific worker tree and extracts only a small set of reusable helpers from `presto-native-execution`. Build enablement is also separate in the initial implementation: Velox and Bolt are built from different module directories and produce different worker binaries from different build roots.

Given that Presto already has multiple top-level native modules, and it's unclear whether the native sidecar and native tests can work with Bolt, maybe we can consider adding a common top-level folder to host the common helpers?

@jaystarshot
Member

jaystarshot commented May 13, 2026

LGTM (with protocol versioning etc)

Contributor

@rschlussel rschlussel left a comment

I have some concerns about the overall idea. Mainly, it boils down to this adding significant complexity/overhead, and the benefit isn't clear to me. What does Presto get out of having Bolt as a backend?
Some of the challenges are:

  1. As mentioned by some other reviewers, it makes any protocol change much harder if we need to worry about two backends.
  2. There's a huge risk of correctness issues and behavior differences between the two backends. The Java -> Prestissimo migration had to contend with many correctness issues, and that was meant to be a one-way migration. Even with a test suite, there will definitely be corner cases with correctness differences between the two, and we will be dealing with that risk indefinitely.
  3. Maintenance. Once we add this support we will need to maintain it, even if the people originating it move on to other things. Is that something we want to take on as a project? Again, it comes down to my not being clear on what the advantages of having this backend are.

@amitkdutta

amitkdutta commented May 15, 2026

I share the concern that Rebecca raised. While I appreciate the potential value this could bring, I'd like to highlight the maintenance burden that comes with adding a nearly identical native execution path.

Today, whenever we advance Velox as a submodule, we frequently need to make coordinated changes across both Velox and Presto to keep things working smoothly (e.g., #27390, #27271, #27716). Adding a third execution backend would compound this: a change in Velox that requires adaptation in Presto could just as easily break Bolt-based execution, and vice versa. This can quickly lead to a situation where even a straightforward change requires juggling three separate repositories.

This is manageable today because Velox operates as a single leaf node. However, with multiple leaf nodes as this proposal envisions, the complexity grows significantly. The protocols, refactorings, and code contracts between Presto and Velox are continuously evolving, and maintaining a third integration point without a clear, distinct benefit to the broader community is something I think we should consider carefully before committing to.


9 participants