Create valkey-k8s-operator #28

Open
andrey-glazkov wants to merge 6 commits into valkey-io:main from andrey-glazkov:patch-1

andrey-glazkov commented Sep 11, 2025

This RFC proposes an open-source Kubernetes Operator for Valkey. It supports Cluster and Sentinel-based HA deployments and standalone nodes, with optional TLS, persistence, and Prometheus metrics integration.

Includes:
- CRD schema and operator behavior
- ConfigMap and Secret handling
- Sentinel-managed shard support
- AZ-aware scheduling design
- Optional Failover CRD
- Learnings from Helm-based deployments and internal prototypes

This RFC proposes an open-source Kubernetes Operator for Valkey.
It supports standalone and Sentinel-based HA deployments, with optional TLS, persistence, and Prometheus metrics integration.

Includes:
- CRD schema and operator behavior
- ConfigMap and Secret handling
- Sentinel-managed shard support
- AZ-aware scheduling design
- Optional Failover CRD
- Learnings from Helm-based deployments and internal prototypes

Signed-off-by: Andrey G <98027999+andrey-glazkov@users.noreply.github.com>

rlunar commented Sep 12, 2025

Member

I would leave Sentinel as nice to have and Cluster Mode as a must have for version 0.1.0. Besides that, Prometheus integration with Coroot would be ideal for observability. Failover and node replacement have to be part of the functionality as well, for reliability.

@zuiderkwast

Contributor

> I would leave Sentinel as nice to have and Cluster Mode as a must have for version 0.1.0

I think we don't need to define exactly what's must-have for a particular version. We can outline the long-term goals, implement things incrementally, and just make sure we don't shut the door on any of these long-term goals.

It's great to have something to start with. We should review this RFC and make sure it covers what people want.

- Updated module reference to libvalkey_bloom.so
- Clarified Cluster Mode as primary HA/scalable deployment
- Updated Design Considerations to remove Bitnami Helm chart dependency
- Retained Sentinel HA and standalone mode for smaller deployments

Signed-off-by: Andrey G <98027999+andrey-glazkov@users.noreply.github.com>
Made sure we reference Valkey in all examples

Signed-off-by: Andrey G <98027999+andrey-glazkov@users.noreply.github.com>
zuiderkwast left a comment

Contributor

Added some comments and some of our requirements as suggestions in the comments.

andrey-glazkov left a comment

Author

Adding some simple changes here regarding Cluster mode deployments and the Requirements section.

@Preisschild

Would it make sense to fork Hyperspike's operator (https://github.com/hyperspike/valkey-operator) into this org and then continue development from there?

It seems that the basics (cluster mode) are already supported there.

Signed-off-by: Andrey G <98027999+andrey-glazkov@users.noreply.github.com>
Update Valkey Operator RFC with clarified mode definitions (cluster/sentinel/standalone), reliability requirements, TLS/mTLS support, Prometheus exporter wording, and CRD strategy. Added requirements section and design overviews for all modes.

Signed-off-by: Andrey G <98027999+andrey-glazkov@users.noreply.github.com>
added .md

Signed-off-by: Andrey G <98027999+andrey-glazkov@users.noreply.github.com>
Comment thread VALKEY-K8S-OPERATOR.md
@@ -0,0 +1,219 @@
RFC: 21

Contributor

The RFC number is the PR number, which is #28

Suggested change
- RFC: 21
+ RFC: 28


8 participants