.. Commit 0a46d29: Add ADR-0010: CQRS Read/Write Distributor Separation

ADR-0010: CQRS Read/Write Distributor Separation
=================================================

.. index:: ADR; CQRS distributor, distributor, WriteDistributor, shared store, Pipe, DistributedProvider

Status
------

Accepted

Context
-------

The faker module uses **distributors** to control value selection with
statistical distributions (weighted, Zipf/skew, uniform). A distributor
stores a pool of created values and selects from them according to a
distribution strategy.

Monolithic distributor
^^^^^^^^^^^^^^^^^^^^^^

In the initial design each distributor owned both the storage and the
selection strategy. This created a problem when the same pool of values
needed to be accessed with **different distribution strategies**.

The motivating scenario is ``Pipe`` with ``DistributedProvider``.

Pipe topology
^^^^^^^^^^^^^

``Pipe`` orchestrates sequential steps of aggregate generation. Each step
can be wrapped in a ``DistributedProvider`` that adds distributor-based
value selection::

    Pipe(
        PipeStep('first_model', DistributedProvider(
            first_model_faker,
            distributor=make_distributor(
                weights=[0.9, 0.5, 0.1, 0.01], mean=10),
        )),
        PipeStep('second_model', DistributedProvider(
            second_model_faker,
            distributor=make_distributor(
                weights=[0.3, 0.2], mean=20),
        ), require_fn=...),
    )

``DistributedProvider.populate()`` works as follows:

1. ``distributor.next()`` — tries to **read** an existing value from the pool
   using the configured distribution strategy.
2. If ``ICursor`` is raised (pool exhausted or probabilistic creation signal) —
   delegates to ``inner.populate()`` to **create** a new value, then
   ``cursor.append()`` **writes** it back to the pool.

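The two steps above can be sketched in a few lines. Everything here
(``CreateSignal``, ``SketchDistributor``, ``populate``) is an illustrative
stand-in, not the module's actual API; in particular the real signal is the
``ICursor`` mentioned above.

.. code-block:: python

    import random

    class CreateSignal(Exception):
        """Illustrative cursor signal: the pool cannot serve this read."""
        def __init__(self, pool):
            self.pool = pool

        def append(self, value):
            # Write the freshly created value back to the pool.
            self.pool.append(value)

    class SketchDistributor:
        """Toy distributor: uniform reads plus a probabilistic create roll."""
        def __init__(self, create_chance=0.3):
            self.pool = []
            self.create_chance = create_chance

        def next(self):
            if not self.pool or random.random() < self.create_chance:
                raise CreateSignal(self.pool)  # exhausted or "create" roll
            return random.choice(self.pool)    # read with the strategy

    def populate(distributor, create_value):
        """The read-or-create flow of the two steps above."""
        try:
            return distributor.next()          # 1. try to read a value
        except CreateSignal as cursor:
            value = create_value()             # 2. create a new value ...
            cursor.append(value)               # ... and write it back
            return value

The first call always creates (the pool is empty); later calls mostly reuse
existing values, at a rate governed by ``create_chance``.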
The problem
^^^^^^^^^^^

Multiple ``DistributedProvider`` instances may target the same aggregate type
(e.g. ``FirstModel``) but with different distribution strategies:

- Pipe A selects FirstModels with ``weights=[0.9, 0.5]`` (heavy skew)
- Pipe B selects FirstModels with ``weights=[0.3, 0.2]`` (more uniform)

With a monolithic distributor each instance maintains its own pool. A
``FirstModel`` created via Pipe A's distributor is **invisible** to Pipe B's
distributor. This leads to:

- **Data duplication** — the same aggregate type stored in multiple pools.
- **Divergent pools** — each distributor sees only the values it created
  itself, distorting the intended distribution.
- **Wasted round-trips** — synchronizing pools requires explicit observer
  plumbing.

Decision
--------

Separate distributors into a **Write store** and **Read strategies** (CQRS
within the distributor):

- ``WriteDistributor`` / ``PgWriteDistributor`` — owns the data (indexes,
  PG table). Single point of mutation (``append``). Always raises ``ICursor``
  on ``next()`` to signal the caller to create a new value.
- ``WeightedDistributor`` / ``SkewDistributor`` / ``PgWeightedDistributor`` /
  ``PgSkewDistributor`` — **stateless read decorators** over a shared store.
  Each implements ``_distribute(n) -> int`` — a pure function that selects an
  index given the pool size. All reads delegate to the store's data.
- ``NullableDistributor`` — decorator that probabilistically returns
  ``Nothing`` before delegating to the inner distributor.

The ``distributor_factory`` / ``pg_distributor_factory`` accept an optional
``store`` parameter to share a single write store across multiple read
strategies:

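A minimal in-memory sketch of this split, with hypothetical names
(``WriteStore``, ``ReadStrategy``, ``UniformRead``, ``FirstHeavyRead``)
standing in for the real classes:

.. code-block:: python

    import random

    class WriteStore:
        """Single point of mutation over the shared pool."""
        def __init__(self):
            self.values = []

        def append(self, value):
            self.values.append(value)

        def __len__(self):
            return len(self.values)

    class ReadStrategy:
        """Stateless read decorator over a shared store."""
        def __init__(self, store):
            self.store = store

        def _distribute(self, n):
            """Pure function: pool size -> selected index."""
            raise NotImplementedError

        def next(self):
            return self.store.values[self._distribute(len(self.store))]

    class UniformRead(ReadStrategy):
        def _distribute(self, n):
            return random.randrange(n)

    class FirstHeavyRead(ReadStrategy):
        """Crude skew: the minimum of two draws prefers early values."""
        def _distribute(self, n):
            return min(random.randrange(n), random.randrange(n))

    # One shared pool, two read strategies over it.
    store = WriteStore()
    for v in ("a", "b", "c"):
        store.append(v)

    uniform = UniformRead(store)
    skewed = FirstHeavyRead(store)

Adding a new strategy is just another ``_distribute`` override; the store
itself never changes.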
.. code-block:: python

    store = PgWriteDistributor()
    dist_a = pg_distributor_factory(weights=[0.9, 0.5], mean=5, store=store)
    dist_b = pg_distributor_factory(weights=[0.3, 0.2], mean=20, store=store)
    # Both read from the same PG table with different strategies

When ``store`` is not provided, the factory creates one internally — the
single-distributor case works without any extra configuration.

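The optional-``store`` contract can be sketched with a toy factory
(``distributor_sketch_factory`` and ``InMemoryStore`` are hypothetical
stand-ins; only the sharing behaviour matters):

.. code-block:: python

    class InMemoryStore:
        """Hypothetical stand-in for a table-backed write store."""
        def __init__(self):
            self.values = []

    def distributor_sketch_factory(weights, mean, store=None):
        """Share `store` when given; otherwise create a dedicated one."""
        if store is None:
            store = InMemoryStore()  # single-distributor case: no extra config
        return {"weights": weights, "mean": mean, "store": store}

    shared = InMemoryStore()
    dist_a = distributor_sketch_factory([0.9, 0.5], mean=5, store=shared)
    dist_b = distributor_sketch_factory([0.3, 0.2], mean=20, store=shared)
    assert dist_a["store"] is dist_b["store"]  # one pool, two strategies

    dist_c = distributor_sketch_factory([0.5], mean=1)
    assert dist_c["store"] is not shared       # omitted store -> dedicated pool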
Functional decomposition
^^^^^^^^^^^^^^^^^^^^^^^^

The separation mirrors the natural decomposition in functional languages:

============================  =====================================
OOP (Python)                  FP (Gleam / Elixir)
============================  =====================================
``WriteDistributor`` (state)  Actor / Process (holds mutable state)
``_distribute(n) -> int``     ``Strategy = fn(Int) -> Int`` (pure)
``NullableDistributor``       Higher-order function wrapper
``distributor_factory``       Config construction + process spawn
============================  =====================================

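The right-hand column's pure ``Strategy`` type and higher-order wrapper can be
expressed directly in Python as well (a sketch; ``uniform`` and ``nullable``
are illustrative, and ``None`` stands in for ``Nothing``):

.. code-block:: python

    import random
    from typing import Callable, Optional

    Strategy = Callable[[int], int]  # pure: pool size -> selected index

    def uniform(n: int) -> int:
        return random.randrange(n)

    def nullable(p: float, inner: Strategy) -> Callable[[int], Optional[int]]:
        """Higher-order wrapper: with probability p yield None, else read."""
        def wrapped(n: int) -> Optional[int]:
            return None if random.random() < p else inner(n)
        return wrapped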
Consequences
------------

- Multiple ``DistributedProvider`` / ``ReferenceProvider`` instances for the
  same aggregate type can share a single pool via the ``store`` parameter,
  eliminating data duplication and pool divergence.
- The distribution strategy (``_distribute``) is a pure function with no
  state, making it trivial to test and reason about in isolation.
- The factory API remains backwards-compatible: omitting ``store`` creates a
  dedicated store, preserving the simple single-distributor use case.
- Adding a new distribution strategy requires only a new class with
  ``_distribute(n) -> int`` — no changes to the store or the factory
  protocol.
- ``PgWriteDistributor`` handles the diamond problem for shared stores:
  ``setup()`` uses ``IF NOT EXISTS`` and an ``_initialized`` flag to ensure
  idempotent table creation even when multiple read distributors trigger
  setup concurrently.

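A sketch of that idempotent setup, assuming a DB-API-style connection object
(``FakeConn``, ``SketchPgWriteStore``, and the table DDL are illustrative, not
the actual implementation):

.. code-block:: python

    class FakeConn:
        """Records executed SQL; stands in for a DB-API connection."""
        def __init__(self):
            self.statements = []

        def execute(self, sql):
            self.statements.append(sql)

    class SketchPgWriteStore:
        """Setup is idempotent twice over: flag per instance, DDL per table."""
        def __init__(self, conn):
            self.conn = conn
            self._initialized = False

        def setup(self):
            if self._initialized:  # fast path for repeated triggers
                return
            # IF NOT EXISTS keeps the DDL safe across *different* instances.
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS distributor_pool ("
                "id BIGSERIAL PRIMARY KEY, value TEXT NOT NULL)"
            )
            self._initialized = True

    conn = FakeConn()
    pg_store = SketchPgWriteStore(conn)
    pg_store.setup()
    pg_store.setup()  # no-op: the DDL ran exactly once
    assert len(conn.statements) == 1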
.. The commit also adds ``0010-cqrs-read-write-distributor-separation`` to the
   ADR list in ``docs/adr/index.rst``.
