Draft: BATCH message type #26

Open
kgpayne wants to merge 2 commits into MeltanoLabs:main from kgpayne:kp/batch_messages

@kgpayne kgpayne commented Oct 26, 2021

Draft proposal for BATCH message type support as raised in #2

Rendered SIP Draft: https://github.com/kgpayne/Singer-Working-Group/blob/kp/batch_messages/proposals/draft/SIPXX%20-%20BATCH%20message%20type.md

Note: In order to support discoverability of capabilities like "BATCH" record types, we may require something like #8


This offers several new avenues for improving performance:

- For taps and targets that _both_ support a common BATCH format (e.g. RDBMS to Warehouse) we have the opportunity to implement 'pass through', whereby records are _never deserialised_ between source and destination, greatly reducing overhead and improving throughput. This would drastically accelerate this common use case in the Singer ecosystem, inspired by the `fast_sync` feature of [Pipelinwise](https://transferwise.github.io/pipelinewise/concept/fastsync.html).
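
To make the 'pass through' idea concrete, here is a minimal sketch of how a target might dispatch messages. The BATCH field names used here (`format`, `manifest`) and the `bulk_load` / `load_record` callbacks are assumptions for illustration only; they are not taken from the SIP draft.

```python
import json

# Hypothetical: the batch formats this target can bulk-load directly.
SUPPORTED_FORMATS = {"csv"}


def handle_message(line, bulk_load, load_record):
    """Dispatch one Singer message line.

    A BATCH message in a supported format is handed to bulk_load as file
    paths, so the records inside those files are never deserialised here.
    The "format" and "manifest" keys are assumed, not part of any spec.
    """
    message = json.loads(line)
    if message["type"] == "BATCH" and message.get("format") in SUPPORTED_FORMATS:
        for path in message["manifest"]:
            bulk_load(message["stream"], path)
        return "passed_through"
    if message["type"] == "RECORD":
        # Fallback: the usual record-at-a-time path.
        load_record(message["stream"], message["record"])
        return "record"
    return "ignored"
```

Only the small envelope message is parsed; the batch files themselves go straight to whatever bulk loader the target provides.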

@louis-pie louis-pie Dec 1, 2021

Perhaps worth mentioning that, especially when dealing with an RDBMS, the COPY command is usually the best way to implement this "pass-through", and the COPY command usually requires CSV.

Also, and unrelated: Pipelinwise -> PipelineWise

@judahrand

Really excited for this - it is one of the things which has forced us to go with pure PipelineWise at the moment!
