Summary
Currently, each reducer is responsible for calling SaveChangesAsync() at the end of RollForwardAsync. This creates a 1:1 relationship between blocks processed and database writes, which is inefficient for high-throughput sync scenarios.
Current Behavior
```csharp
// Each reducer does this:
public async Task RollForwardAsync(Block block)
{
    // Process block...
    dbContext.Add(entity);
    await dbContext.SaveChangesAsync(); // ← Every block = 1 DB write
}
```
At ~10 blocks/second during sync, this means ~10 database round-trips per second per reducer, and the total grows linearly with the number of reducers.
Proposed Behavior
The framework (CardanoIndexWorker) should control when persistence happens, not the individual reducers.
```csharp
// Reducer just stages changes:
public async Task RollForwardAsync(Block block, DbContext dbContext)
{
    // Process block...
    dbContext.Add(entity);
    // No SaveChangesAsync - framework handles this
}

// Framework controls persistence:
foreach (var block in blocks)
{
    await reducer.RollForwardAsync(block, dbContext);
    if (ShouldFlush(blockCount, timeSinceLastFlush, isAtTip))
    {
        await dbContext.SaveChangesAsync();
    }
}
```
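The ShouldFlush call in the loop above is left undefined. A minimal sketch of one possible policy follows, assuming the three triggers named in this proposal (block count, elapsed time, reaching tip). FlushPolicy and its parameter names are hypothetical and are not existing framework types.

```csharp
using System;

// Hypothetical flush policy; names and thresholds are assumptions for illustration.
public sealed record FlushPolicy(int BatchSize, TimeSpan FlushInterval, bool FlushOnTip)
{
    // Returns true when any configured trigger fires and there is something to persist.
    public bool ShouldFlush(int blockCount, TimeSpan timeSinceLastFlush, bool isAtTip)
    {
        if (blockCount <= 0) return false;                     // Nothing staged since the last flush
        if (blockCount >= BatchSize) return true;              // Batch is full
        if (timeSinceLastFlush >= FlushInterval) return true;  // Too long since the last write
        return FlushOnTip && isAtTip;                          // Caught up with the chain tip
    }
}
```

With a batch size of 100, `policy.ShouldFlush(100, TimeSpan.Zero, false)` returns true because the batch is full. The caller would still need to reset its block counter and timer after each flush.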
Benefits
- Batching - Save every N blocks (e.g., 100 blocks = 100x fewer DB round-trips)
- Transaction boundaries - Framework can wrap batches in transactions for atomic rollback
- Cleaner separation of concerns - Reducers handle "what changes", framework handles "when to persist"
- Shared DbContext - Dependent reducers in same chain could share one context per batch
- Configurable flush strategy - Flush on block count, time interval, reaching tip, or before rollback
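The batching benefit can be made concrete with a small runner that counts saves. This is a sketch only: the `stage` and `save` delegates stand in for `reducer.RollForwardAsync(block, dbContext)` and `dbContext.SaveChangesAsync()`, and BatchRunner is a hypothetical name, not part of the framework. It also saves the partial final batch so staged changes are not left unsaved when the input ends.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical batch runner; not the framework's actual worker.
public static class BatchRunner
{
    // Processes blocks in order, saving once per full batch, and returns the number of saves.
    public static async Task<int> RunAsync<TBlock>(
        IEnumerable<TBlock> blocks,
        int batchSize,
        Func<TBlock, Task> stage,  // Stand-in for reducer.RollForwardAsync(block, dbContext)
        Func<Task> save)           // Stand-in for dbContext.SaveChangesAsync()
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        int pending = 0; // Blocks staged since the last save
        int saves = 0;

        foreach (var block in blocks)
        {
            await stage(block);
            pending++;

            if (pending >= batchSize)
            {
                await save();
                saves++;
                pending = 0;
            }
        }

        // Persist the partial final batch so no staged changes are left behind.
        if (pending > 0)
        {
            await save();
            saves++;
        }

        return saves;
    }
}
```

For 250 blocks and a batch size of 100, this performs 3 saves (100 + 100 + 50) where the per-block approach would perform 250.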
Implementation Plan
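Phase 1: Core Infrastructure
- Add BatchSize configuration option (default: 100)
- Add FlushIntervalMs configuration option (default: 5000)
- Update IReducer<T> interface to optionally not require SaveChanges
- Update CardanoIndexWorker.ProcessRollforwardAsync to batch saves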
Phase 2: DbContext Management
Phase 3: Dependent Reducer Optimization
Configuration Example
```json
{
  "Sync": {
    "Batch": {
      "Size": 100,
      "FlushIntervalMs": 5000,
      "FlushOnTip": true
    }
  }
}
```
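One way the configuration above could be read into typed options is sketched below using System.Text.Json. The type names (BatchOptions, SyncOptions, RootOptions, BatchConfig) are hypothetical, and the default for FlushOnTip is an assumption; only the 100 and 5000 defaults come from this proposal.

```csharp
using System;
using System.Text.Json;

// Hypothetical option types mirroring the "Sync" / "Batch" JSON shape.
public sealed class BatchOptions
{
    public int Size { get; init; } = 100;              // Proposed default
    public int FlushIntervalMs { get; init; } = 5000;  // Proposed default
    public bool FlushOnTip { get; init; } = true;      // Assumed default
}

public sealed class SyncOptions
{
    public BatchOptions Batch { get; init; } = new();
}

public sealed class RootOptions
{
    public SyncOptions Sync { get; init; } = new();
}

public static class BatchConfig
{
    // Parses the JSON document and rejects values that would make batching meaningless.
    public static BatchOptions Parse(string json)
    {
        var root = JsonSerializer.Deserialize<RootOptions>(json) ?? new RootOptions();
        var batch = root.Sync.Batch;

        if (batch.Size <= 0)
            throw new ArgumentOutOfRangeException(nameof(json), "Sync:Batch:Size must be positive.");
        if (batch.FlushIntervalMs <= 0)
            throw new ArgumentOutOfRangeException(nameof(json), "Sync:Batch:FlushIntervalMs must be positive.");

        return batch;
    }
}
```

In a host that already uses Microsoft.Extensions.Configuration, binding the `Sync:Batch` section to the same options type would likely be the more usual route. The direct JSON parse here only keeps the sketch self-contained.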
Backward Compatibility
- Existing reducers that call SaveChangesAsync() should continue to work
- Framework-level batching can be opt-in initially via configuration
- Emit a deprecation warning for reducers that call SaveChangesAsync() when batching is enabled
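The deprecation warning requires telling a framework-initiated save apart from one made inside a reducer. A rough sketch of that distinction follows. SaveGuard is a hypothetical helper, and wiring OnSaving to a real save hook (with EF Core, the DbContext.SavingChanges event is one candidate) is an assumption that is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper that distinguishes framework saves from saves made by a reducer.
public sealed class SaveGuard
{
    private readonly bool _batchingEnabled;
    private bool _frameworkSaving;

    public SaveGuard(bool batchingEnabled) => _batchingEnabled = batchingEnabled;

    // Warnings are collected here; a real worker would more likely write to its logger.
    public List<string> Warnings { get; } = new();

    // The framework wraps its own flush in this call so that save is not flagged.
    public async Task RunFrameworkSaveAsync(Func<Task> save)
    {
        _frameworkSaving = true;
        try { await save(); }
        finally { _frameworkSaving = false; }
    }

    // Intended to be invoked from a save hook whenever any save starts.
    public void OnSaving(string reducerName)
    {
        if (_batchingEnabled && !_frameworkSaving)
        {
            Warnings.Add($"{reducerName} called SaveChangesAsync while batching is enabled; " +
                         "the call still works but bypasses batching and is deprecated.");
        }
    }
}
```

The reducer's own save is never blocked, which matches the requirement that existing reducers continue to work. With batching disabled, no warning is recorded.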
Performance Impact
| Metric | Current | With Batching (100) |
|---|---|---|
| DB writes/sec (syncing) | ~10/reducer | ~0.1/reducer |
| Latency per block | ~50-100ms | ~1-5ms |
| Throughput | ~10 blocks/sec | ~100+ blocks/sec |
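The DB writes/sec row follows from dividing the block rate by the batch size. The helper below is only a worked check of that arithmetic; BatchMath is a hypothetical name.

```csharp
using System;

public static class BatchMath
{
    // Database writes per second for one reducer that saves once per full batch.
    public static double WritesPerSecond(double blocksPerSecond, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return blocksPerSecond / batchSize;
    }
}
```

`BatchMath.WritesPerSecond(10, 100)` gives 0.1, so the ~0.1/reducer figure corresponds to the current ~10 blocks/sec rate. At the ~100 blocks/sec throughput in the last row, the same batch size gives about 1 write/sec per reducer, still roughly 10x fewer than today.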
Related
- Reducers will need to be updated to remove SaveChangesAsync() calls
- Rollback handling needs review to ensure batch boundaries are respected