Automation Controller E2E Test Framework

Pytest-based end-to-end tests for the automation controller, run against a self-managed local Kind cluster.

Prerequisites

pip install pytest pytest-timeout kubernetes

Host tools required on the machine that runs the suite:

kind
kubectl
helm
docker
make

The framework assumes it can create and manage its own Kind cluster for a test run. By default it bootstraps:

a Kind cluster matching --kube-context
metrics-server for HPA coverage
KEDA for KEDA-managed HPA coverage
VPA for VPA-backed example coverage

The CI matrix runs two variants: v1.35.0 with the full stack and v1.32.0 with metrics-server only (WITH_KEDA=false WITH_VPA=false).

Controller installation is handled by the Python bootstrap module. It installs the Helm charts with their default values and generates image override values only when you pass --controller-image-repository and --controller-image-tag.
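The override rule above can be sketched as follows. This is a minimal illustration, not the code in bootstrap.py: the function name `build_image_overrides` and the `image.repository` / `image.tag` value keys are assumptions, and the real chart may lay out its values differently. The sketch also assumes that both options must be set before any override is generated.

```python
from typing import Optional


def build_image_overrides(
    repository: Optional[str], tag: Optional[str]
) -> Optional[dict]:
    """Return Helm value overrides, or None to keep the chart defaults.

    Hypothetical helper: the value keys below are assumed, not taken
    from the real chart.
    """
    if not repository or not tag:
        # Without both options the chart defaults are left untouched.
        return None
    return {"image": {"repository": repository, "tag": tag}}
```

Returning `None` rather than an empty mapping makes it easy for a caller to skip writing a values file altogether when no override was requested.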

By default the runners also load examples/recommendations.json into a recommendations ConfigMap and enable the chart's localRecommendations mode so recommendation-dependent tests exercise real data instead of just status fields.

VPA is installed by default so the full suite can cover VPA-backed examples and filters. Set WITH_VPA=false when you explicitly want a leaner cluster bootstrap.

If you already have a cluster and controller running, pass --skip-kind-bootstrap to disable framework-managed bootstrap.

The framework can also target specific Kubernetes versions by selecting the Kind node image with --kind-node-image. This matters for resize behavior because Kubernetes 1.35+ supports in-place resize directly, while pre-1.35 clusters may fall back to eviction-driven behavior.
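A coarse version gate of this kind can be derived from the node image tag. The sketch below is illustrative only: `parse_kind_node_version` and `version_supports_in_place_resize` are hypothetical names, and it assumes tags of the form `kindest/node:vMAJOR.MINOR.PATCH`.

```python
import re
from typing import Tuple

# Minimum (major, minor) version treated as supporting in-place resize,
# following the 1.35+ threshold described above.
IN_PLACE_RESIZE_MIN_VERSION = (1, 35)


def parse_kind_node_version(image: str) -> Tuple[int, int]:
    """Extract (major, minor) from an image such as kindest/node:v1.35.0."""
    match = re.search(r":v(\d+)\.(\d+)", image)
    if match is None:
        raise ValueError(f"cannot parse Kubernetes version from {image!r}")
    return int(match.group(1)), int(match.group(2))


def version_supports_in_place_resize(image: str) -> bool:
    """Coarse check based only on the version encoded in the image tag."""
    return parse_kind_node_version(image) >= IN_PLACE_RESIZE_MIN_VERSION
```

A check like this only describes what the cluster version should support; it says nothing about how a live cluster actually behaves.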

After bootstrap, the suite expects these controller-managed resources to exist:

Resource	Name	Scope
`GlobalConfiguration`	`global-config`	cluster
`PolicyEvaluation`	`policy-evaluation`	cluster

Usage

# Basic run against the default test cluster
./scripts/run-full-suite.sh

# Explicit environment overrides
WITH_METRICS_SERVER=true \
WITH_KEDA=true \
WITH_VPA=true \
./scripts/run-full-suite.sh

# Keep the cluster for inspection
pytest tests/ -v \
  --keep-kind-cluster

# Run the full suite against Kubernetes v1.35.0 (full stack) and
# v1.32.0 (metrics-server only)
./scripts/run-full-matrix-local.sh

# The local suite uses an in-cluster mock Kubex upstream by default.
# Disable it if you want the older local recommendations file flow.
DEPLOY_KUBEX_STUB=false ./scripts/run-full-matrix-local.sh

# Run a subset of tests through the matrix bootstrap
./scripts/run-full-matrix-local.sh tests/test_automation_strategy.py
./scripts/run-full-matrix-local.sh tests/test_policies.py::TestProactivePolicy::test_create_proactive_policy
./scripts/run-full-matrix-local.sh tests/test_automation_strategy.py tests/test_policies.py

# Keep the cluster alive between subset reruns while debugging
KEEP_KIND_CLUSTER=true ./scripts/run-full-matrix-local.sh tests/test_policies.py::TestProactivePolicy::test_create_proactive_policy


# Pin a single run to a specific Kind node image
NODE_IMAGE=kindest/node:v1.35.0 \
./scripts/run-full-suite.sh

# Use an existing cluster without bootstrapping a new Kind environment
pytest tests/ -v \
  --skip-kind-bootstrap \
  --kube-context kind-e2e

# Override the controller image instead of using chart defaults
pytest tests/ -v \
  --controller-image-repository <your-image-repo> \
  --controller-image-tag <your-image-tag>

# Validate the vendored example bundles against the bootstrapped cluster
pytest tests/test_examples.py -v

# Exercise the valid examples against a live cluster and assert workload health
pytest tests/test_example_behavior.py -v

# Run a single test module
pytest tests/test_automation_strategy.py -v

# Run a single test class
pytest tests/test_policies.py::TestStaticPolicy -v

# Run with a timeout (seconds) per test
pytest tests/ -v --timeout=120

CLI Options

Option	Default	Description
`--kube-context`	`kind-e2e`	kubectl context to target
`--kind-cluster-name`	derived from context	Kind cluster name to create/delete
`--kind-node-image`	`kindest/node:v1.35.0`	Kind node image used when creating the cluster
`--namespace`	`kubex`	Namespace where the controller is deployed
`--test-namespace`	`e2e-test`	Namespace for test workloads (created/deleted per session)
`--recommendations-file`	(none)	Path to a JSON recommendations fixture to load
`--controller-image-repository`	chart default	Controller image repository override for Helm installation
`--controller-image-tag`	chart default	Controller image tag override for Helm installation
`--controller-image-pull-policy`	`IfNotPresent`	Controller image pull policy override for Helm installation
`--keep-kind-cluster`	`false`	Keep the cluster after the test session
`--skip-kind-bootstrap`	`false`	Use the current kube context without creating a cluster
`--without-vpa`	`false`	Skip VPA installation
`--without-keda`	`false`	Skip KEDA installation
`--without-metrics-server`	`false`	Skip metrics-server installation
`PYTEST_WORKERS` (environment variable)	unset	Optional `pytest-xdist` worker count; leave unset for the default serial run
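Options like these are typically registered through pytest's `pytest_addoption` hook in conftest.py. The sketch below shows a few of them under that assumption; it is not a copy of the project's conftest.py, and the help strings are paraphrased from the table.

```python
def pytest_addoption(parser):
    """Register a subset of the options from the table above."""
    parser.addoption(
        "--kube-context",
        action="store",
        default="kind-e2e",
        help="kubectl context to target",
    )
    parser.addoption(
        "--test-namespace",
        action="store",
        default="e2e-test",
        help="Namespace for test workloads",
    )
    parser.addoption(
        "--keep-kind-cluster",
        action="store_true",
        default=False,
        help="Keep the cluster after the test session",
    )
    parser.addoption(
        "--without-vpa",
        action="store_true",
        default=False,
        help="Skip VPA installation",
    )
```

A fixture can then read a value with `request.config.getoption("--kube-context")`.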

`run-full-suite.sh` environment variables

Variable	Default	Description
`CLUSTER_NAME`	`e2e`	Kind cluster name
`NODE_IMAGE`	`kindest/node:v1.35.0`	Kind node image (override to test another version)
`KEEP_KIND_CLUSTER`	unset	Set to `true` to pass `--keep-kind-cluster` to pytest and skip the uninstall/teardown step
`WITH_METRICS_SERVER`	`true`	Set to `false` to skip metrics-server installation
`WITH_KEDA`	`true`	Set to `false` to skip KEDA installation
`WITH_VPA`	`true`	Set to `false` to skip VPA installation
`HELM_CRDS_CHART`	`kubex/kubex-crds`	Override the kubex-crds chart reference
`HELM_CONTROLLER_CHART`	`kubex/kubex-automation-engine`	Override the controller chart reference
`HELM_CRDS_CHART_VERSION`	unset	Override the kubex-crds chart version
`HELM_CONTROLLER_CHART_VERSION`	unset	Override the controller chart version
`HELM_REPO_URL`	chart default	Override the Helm chart repository URL
`CONTROLLER_IMAGE_REPOSITORY`	chart default	Controller image repository override
`CONTROLLER_IMAGE_TAG`	chart default	Controller image tag override
`PYTEST_WORKERS`	unset	Optional `pytest-xdist` worker count
`DEPLOY_KUBEX_STUB`	`true`	Deploy an in-cluster Kubex upstream server and point the gateway sidecar at it
`KUBEX_URL_HOST`	unset	Override the upstream host used by the gateway sidecar when not using the in-cluster stub
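`KUBEX_URL_SCHEME`	unset	Override the upstream scheme used by the gateway sidecar when not using the in-cluster stub
`RECOMMENDATIONS_FILE`	`examples/recommendations.json`	Path to the recommendations JSON to load; set to an empty string to disable local recommendation injection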
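One way to picture how the `WITH_*` variables relate to the pytest `--without-*` options is the mapping below. It is a sketch under assumptions: the helper name `addon_flags` is hypothetical, and treating only the literal string `false` as disabling an addon is a guess about how the runner parses these values.

```python
import os
from typing import List, Mapping

# Environment variable -> pytest flag passed when the addon is disabled.
ADDON_FLAGS = {
    "WITH_METRICS_SERVER": "--without-metrics-server",
    "WITH_KEDA": "--without-keda",
    "WITH_VPA": "--without-vpa",
}


def addon_flags(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the --without-* flags implied by the WITH_* variables."""
    flags = []
    for variable, flag in ADDON_FLAGS.items():
        # Every addon defaults to enabled, matching the table defaults.
        if env.get(variable, "true").strip().lower() == "false":
            flags.append(flag)
    return flags
```

For example, `addon_flags({"WITH_KEDA": "false", "WITH_VPA": "false"})` yields `["--without-keda", "--without-vpa"]`, which corresponds to the metrics-server-only variant of the CI matrix.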

Layout

e2e-testing/
├── bootstrap.py                     # Kind bootstrap and Helm installation helpers
├── conftest.py                      # CLI options, fixtures, K8sClients dataclass
├── examples/                        # Vendored example manifests used by test_examples.py
│   └── invalid/                     # Intentionally invalid examples that should be rejected
├── helpers.py                       # Constants, k8s utilities, manifest builders
├── scripts/
│   ├── run-full-suite.sh            # Bootstrap, run the functional suite, then verify uninstall
│   └── run-full-matrix-local.sh     # Build local images and run the full Kind version matrix
└── tests/
    ├── test_health.py               # Controller pod, webhooks, metrics smoke tests
    ├── test_crd_validation.py       # Admission webhook schema enforcement
    ├── test_automation_strategy.py  # AutomationStrategy CRUD
    ├── test_policies.py             # StaticPolicy, EnablementGates, ClusterStaticPolicy, ProactivePolicy
    ├── test_global_config.py        # GlobalConfiguration + recommendation reload status
    ├── test_recommendation_behavior.py # Recommendation-content behavior using local fixture data
    ├── test_metrics.py              # Prometheus metrics endpoint
    ├── test_examples.py             # Valid example apply/delete coverage + invalid example rejection
    ├── test_example_behavior.py     # Live-cluster behavior coverage for vendored examples
    ├── test_resize_behavior.py      # Real workload in-place resize vs eviction fallback by Kubernetes version
    ├── test_webhook.py              # Mutating webhook annotation injection
    └── test_safety.py               # HPA filter, protected namespace

Test Classes

Class	Module	Area	Notes
`TestControllerHealth`	`test_health.py`	Pod readiness, webhook certificate, metrics	Smoke tests — run first
`TestCRDValidation`	`test_crd_validation.py`	CRD schema enforcement	Verifies required fields, rejects bad specs
`TestAutomationStrategy`	`test_automation_strategy.py`	`AutomationStrategy` CRUD	Namespaced; tests all enablement flag combinations
`TestStaticPolicy`	`test_policies.py`	`StaticPolicy` CRUD + resource mutation	Creates a Deployment and verifies CPU/mem are updated
`TestEnablementGates`	`test_policies.py`	Per-direction enable/disable flags	Verifies downsize-only / upsize-only gate behaviour
`TestClusterStaticPolicy`	`test_policies.py`	`ClusterStaticPolicy` namespace selector	`In` applies, `NotIn` excludes — cluster-scoped
`TestProactivePolicy`	`test_policies.py`	`ProactivePolicy` CRUD + staleness gate	`maxAnalysisAgeDays=0` edge case
`TestGlobalConfiguration`	`test_global_config.py`	`GlobalConfiguration` singleton	Update + revert; verifies persistence via reconciler
`TestRecommendations`	`test_global_config.py`	Recommendation load status	Checks `recommendationReload` status fields
`TestRecommendationBehavior`	`test_recommendation_behavior.py`	Recommendation-content behavior	Verifies local recommendations mutate matching workloads and respect `KubexAutomation` per container
`TestMetrics`	`test_metrics.py`	Prometheus metrics endpoint	Verifies `controller_runtime_reconcile_total` is exposed
`TestExampleBehavior`	`test_example_behavior.py`	Live example coverage	Applies every valid vendored example and asserts declared resources exist and workloads become ready
`TestHPAExampleBehavior`	`test_example_behavior.py`	Example-backed HPA safety	Applies HPA examples and verifies the controller preserves workload requests
`TestResizeBehavior`	`test_resize_behavior.py`	Real workload resize behavior	Verifies pod identity stays stable only when the live cluster actually supports in-place resize, and changes otherwise
`TestWebhookAnnotations`	`test_webhook.py`	Mutating webhook pod annotation	Checks `automation-webhook.kubex.ai/pod-rightsizing-info`; verifies `PodAdmissionWebhookHealthy` condition
`TestHPAFilter`	`test_safety.py`	Safety check: HPA protection	Resize must be blocked when an HPA targets the workload
`TestProtectedNamespace`	`test_safety.py`	Safety check: protected namespace patterns	`kube-*` default; custom pattern round-trip
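The protected-namespace check in the last row can be illustrated with shell-style glob matching. This sketch assumes the patterns behave like `fnmatch` globs, which the controller may or may not implement exactly; `is_protected_namespace` is a hypothetical helper, and `DEFAULT_PATTERNS` reflects only the `kube-*` default named in the table.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Only the default pattern mentioned in the table; custom patterns are
# supplied by the caller.
DEFAULT_PATTERNS = ("kube-*",)


def is_protected_namespace(
    namespace: str, patterns: Iterable[str] = DEFAULT_PATTERNS
) -> bool:
    """Return True when the namespace matches any protected pattern."""
    return any(fnmatchcase(namespace, pattern) for pattern in patterns)
```

Under these assumptions `kube-system` is protected by default, while `e2e-test` is not unless a custom pattern such as `e2e-*` is supplied.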

Notes

Kind bootstrap is handled by bootstrap.py.
The main local entry point is scripts/run-full-suite.sh.
scripts/run-full-matrix-local.sh builds the local controller images, then runs the full-suite flow twice: once for v1.35.0 with the full stack (metrics-server, KEDA, VPA) and once for v1.32.0 with metrics-server only (KEDA and VPA skipped). Pass one or more pytest nodeids/paths to run only that subset through the matrix bootstrap.
The local suite can deploy an in-cluster Python mock Kubex service, feed recommendations from examples/recommendations.json, and assert heartbeat/policy/mutation uploads through the real gateway sidecar path.
The full-suite runner verifies install through the functional tests, then uninstalls the controller Helm release and kubex-crds and verifies their removal.
The bootstrap flow installs metrics-server, KEDA, and VPA by default. Set WITH_KEDA=false, WITH_VPA=false, or WITH_METRICS_SERVER=false to skip individual addons. The CI matrix uses the full stack on v1.35.0 and metrics-server only on v1.32.0 (WITH_KEDA=false WITH_VPA=false).
The default full-suite runner is serial because many tests mutate shared cluster state and vendored example resources; set PYTEST_WORKERS only after isolating those tests.
Tests can use supports_in_place_resize as a coarse version check, but behavior-sensitive tests should gate on the live actual_in_place_resize_support probe fixture.
Test workloads are created in --test-namespace and cleaned up after each test class via autouse fixtures.
Recommendation-dependent tests should run with recommendations available, either by passing --recommendations-file or by generating recommendation input as part of bootstrap.
run-full-suite.sh defaults RECOMMENDATIONS_FILE to examples/recommendations.json; set it to another path or to an empty string if you want to disable local recommendation injection.
The TestWebhookAnnotations.test_webhook_probe_annotation_handled test polls GlobalConfiguration.status.conditions and may take up to 120 s on a cold cluster.
ClusterAutomationStrategy and ClusterStaticPolicy resources created during tests are deleted in teardown. If a test is interrupted, clean up manually with kubectl delete clusterautomationstrategies,clusterstaticpolicies -l app.kubernetes.io/managed-by=e2e.
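The condition polling mentioned in the notes above follows a common wait-until pattern. The helper below is a generic sketch, not the suite's implementation: `wait_for` and its parameters are hypothetical, and the 120 s default is the only value taken from this document.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    probe: Callable[[], Optional[T]],
    timeout: float = 120.0,
    interval: float = 2.0,
) -> T:
    """Call probe() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

A test would pass a probe that reads the resource status and returns the matching condition, or `None` while it is still absent. Using `time.monotonic()` keeps the deadline unaffected by wall-clock adjustments.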
