Replicating the services database

Currently, `sable_services` writes its database as a single JSON file on disk. This is similar to what Atheme does, so we know it works at least at Libera.Chat's scale.

While this file can easily be replicated to other services, `sable_services` going down still causes an outage during which people cannot log in, channel ops cannot be opped, etc. This happens on Libera.Chat from time to time.

Given Sable's distributed architecture, we can do better here. @spb's idea is to have multiple `sable_services` nodes, one of which would be the leader and would stream its database to the others.
The database could remain a single JSON file, but it might become a scaling concern to copy this file over and over. We see a few options to solve this:

1. use a database that supports streaming replication, [like PostgreSQL](https://wiki.postgresql.org/wiki/Streaming_Replication).
2. make `sable_services` nodes coordinate over the Sable network, and each have their own independent database
3. make `sable_services` nodes share a single replicated database (Cassandra, something on top of Ceph, CockroachDB, ...)
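
To illustrate what "streaming" could mean instead of copying the whole JSON file each time, here is a minimal sketch of a leader shipping numbered changes to a follower. Everything in it (`Change`, `Replica`, `Leader`, `changes_since`) is hypothetical and not taken from Sable's code. It ignores networking, persistence and log truncation, and it assumes the database can be modelled as string keys and values.

```rust
use std::collections::BTreeMap;

/// One change to the services database (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
enum Change {
    Set { key: String, value: String },
    Delete { key: String },
}

/// A copy of the database plus the sequence number of the last change applied.
#[derive(Default, Debug, PartialEq)]
struct Replica {
    data: BTreeMap<String, String>,
    last_seq: u64,
}

impl Replica {
    /// Apply a change only if it is the next one expected; reject gaps and replays.
    fn apply(&mut self, seq: u64, change: &Change) -> Result<(), String> {
        if seq != self.last_seq + 1 {
            return Err(format!("expected seq {}, got {}", self.last_seq + 1, seq));
        }
        match change {
            Change::Set { key, value } => {
                self.data.insert(key.clone(), value.clone());
            }
            Change::Delete { key } => {
                self.data.remove(key);
            }
        }
        self.last_seq = seq;
        Ok(())
    }
}

/// The leader applies changes locally and keeps a log followers can catch up from.
#[derive(Default)]
struct Leader {
    replica: Replica,
    log: Vec<Change>, // log[i] carries sequence number i + 1
}

impl Leader {
    /// Record a write and return the sequence number assigned to it.
    fn write(&mut self, change: Change) -> u64 {
        let seq = self.replica.last_seq + 1;
        self.replica
            .apply(seq, &change)
            .expect("leader assigns consecutive sequence numbers");
        self.log.push(change);
        seq
    }

    /// Changes a follower is missing, given the last sequence number it applied.
    fn changes_since(&self, last_seq: u64) -> Vec<(u64, Change)> {
        self.log
            .iter()
            .cloned()
            .enumerate()
            .skip(last_seq as usize)
            .map(|(i, c)| (i as u64 + 1, c))
            .collect()
    }
}
```

A follower would repeatedly call `changes_since(follower.last_seq)` and `apply` each entry in order, so only the delta crosses the network. The sequence check is what lets a follower detect that it missed something and needs a resync.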

With options 1 and 2, if we want high availability, `sable_services` needs some form of leader election, because we can't allow writes to the same objects from multiple nodes at the same time. PostgreSQL does not provide a solution to this, and [expects users to tell it when to switch between follower/leader state](https://www.postgresql.org/docs/current/warm-standby-failover.html).
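
For discussion, one common shape for this is a time-limited lease with a term number. The sketch below shows only the state transitions. It assumes there is some place to store the lease where updates are serialized, and a clock the nodes roughly agree on. Neither is identified in this issue, and that storage is the hard part. `Lease`, `try_acquire`, `accept_write` and `LEASE_SECS` are hypothetical names.

```rust
/// Which node currently holds leadership, and until when.
#[derive(Clone, Debug, PartialEq)]
struct Lease {
    holder: String,
    expires_at: u64, // seconds on a clock the nodes agree on (assumption)
    term: u64,       // bumped every time the lease is taken after expiry
}

/// How long a lease lasts without renewal (arbitrary value for the sketch).
const LEASE_SECS: u64 = 10;

/// Try to acquire or renew the lease for `node` at time `now`.
/// Returns the lease that is in effect afterwards.
fn try_acquire(current: Option<&Lease>, node: &str, now: u64) -> Lease {
    match current {
        // The current holder renews in time: same term, later expiry.
        Some(l) if l.holder == node && now < l.expires_at => Lease {
            expires_at: now + LEASE_SECS,
            ..l.clone()
        },
        // Another node holds an unexpired lease: nothing changes.
        Some(l) if now < l.expires_at => l.clone(),
        // The lease expired: take it over under a higher term.
        Some(l) => Lease {
            holder: node.to_string(),
            expires_at: now + LEASE_SECS,
            term: l.term + 1,
        },
        // No lease has ever been granted.
        None => Lease {
            holder: node.to_string(),
            expires_at: now + LEASE_SECS,
            term: 1,
        },
    }
}

/// Accept a write only from the current holder, in the current term,
/// while the lease is still valid. This fences off a stale leader.
fn accept_write(lease: &Lease, writer: &str, writer_term: u64, now: u64) -> bool {
    lease.holder == writer && lease.term == writer_term && now < lease.expires_at
}
```

The term check matters because a former leader that was merely slow, not dead, may still try to write after losing the lease. Rejecting writes stamped with an old term keeps two nodes from modifying the same objects.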

Option 3 may be unsustainable for Libera, as all solutions I'm aware of in this space require extensive specialized knowledge of that solution (maybe not CockroachDB though? I've never tried it). In particular, Cassandra and Ceph are designed to work with petabyte-scale data, which is far beyond what we need here. Additionally, they often come with constraints and caveats on what software developers can do with the database.
