
Xgerman/k3s azure fleet playground #233

Open
xgerman wants to merge 7 commits into documentdb:main from xgerman:xgerman/k3s-azure-fleet-playground

xgerman (Collaborator) commented Feb 9, 2026

No description provided.

German added 2 commits February 9, 2026 09:30
Add a new playground demonstrating DocumentDB on k3s clusters running on
Azure VMs, integrated with KubeFleet for cluster membership and Istio for
cross-cluster networking.

Key features:
- k3s on Azure VMs (lightweight Kubernetes for edge scenarios)
- AKS hub cluster with KubeFleet for fleet management
- Istio service mesh for cross-cluster replication
- Azure VM Run Command for all VM operations (no SSH required)
- Multi-region deployment across 3 Azure regions
- Comprehensive troubleshooting and lessons learned docs

Files: Bicep infrastructure, 8 deployment scripts, CRP manifests, README
This deploys k3s in Azure and adds scripts to install
documentdb-operator, etc.
Copilot AI review requested due to automatic review settings February 9, 2026 17:39
Contributor

Copilot AI left a comment


Pull request overview

Adds a new “k3s on Azure VMs + AKS hub” playground under documentdb-playground/k3s-azure-fleet, including IaC, multi-cluster (Fleet) setup, Istio multi-primary networking, and scripts to install/deploy DocumentDB across clusters.

Changes:

  • Introduces Azure Bicep/ARM templates and parameterization for AKS hub + per-region k3s VMs.
  • Adds end-to-end automation scripts (deploy infra, install Istio, setup Fleet, install cert-manager/operator, deploy DocumentDB, test, cleanup).
  • Adds a public doc on reserving nodes for DocumentDB workloads.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| documentdb-playground/k3s-azure-fleet/test-connection.sh | Cluster-by-cluster health/ready checks for namespaces, DocumentDB, services, secrets, operator |
| documentdb-playground/k3s-azure-fleet/setup-fleet.sh | Installs KubeFleet hub-agent and joins member clusters; installs fleet-networking |
| documentdb-playground/k3s-azure-fleet/parameters.bicepparam | Bicep parameter file intended to drive infra deployment |
| documentdb-playground/k3s-azure-fleet/main.json | Generated ARM template for the infra (from Bicep) |
| documentdb-playground/k3s-azure-fleet/main.bicep | Infra definition: AKS hub + per-region k3s VM, VNets, NSGs, public IPs |
| documentdb-playground/k3s-azure-fleet/install-istio.sh | Installs Istio multi-cluster (shared CA, east-west gateway, remote secrets) |
| documentdb-playground/k3s-azure-fleet/install-documentdb-operator.sh | Installs DocumentDB operator on hub via Helm and on k3s via Azure Run Command |
| documentdb-playground/k3s-azure-fleet/install-cert-manager.sh | Installs cert-manager via Helm across clusters; applies CRP |
| documentdb-playground/k3s-azure-fleet/documentdb-resource-crp.yaml | Namespace/Secret/DocumentDB plus Fleet placement resources for propagation |
| documentdb-playground/k3s-azure-fleet/documentdb-operator-crp.yaml | Reference CRP for operator propagation (documented as not applied) |
| documentdb-playground/k3s-azure-fleet/deploy-infrastructure.sh | Provisions infra + fetches kubeconfigs/contexts (AKS + k3s) |
| documentdb-playground/k3s-azure-fleet/deploy-documentdb.sh | Generates and applies DocumentDB + Fleet placement config and verifies rollout |
| documentdb-playground/k3s-azure-fleet/delete-resources.sh | Cleanup for Kubernetes resources and Azure resource group |
| documentdb-playground/k3s-azure-fleet/cert-manager-crp.yaml | Fleet ClusterResourcePlacement for cert-manager resources |
| documentdb-playground/k3s-azure-fleet/README.md | Full walkthrough, architecture, troubleshooting, and operational notes |
| documentdb-playground/k3s-azure-fleet/.gitignore | Ignores generated deployment info, certs, chart packages, SSH key |
| docs/operator-public-documentation/reserving-nodes-for-documentdb.md | Guidance on labeling/tainting nodes for DocumentDB workloads |

Comment on lines 28 to 32
for cmd in kubectl helm git jq; do
  if ! command -v "$cmd" &>/dev/null; then
    echo "Error: Required command '$cmd' not found."
    exit 1
  fi
Copilot AI Feb 9, 2026

This script uses curl later (for GitHub tag discovery) but curl isn’t included in the prerequisites check, so it can fail with a confusing error. Add curl (and any other required tools used below) to the prerequisites list.
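A minimal sketch of what a fuller prerequisites check could look like, assuming it is acceptable to report every missing tool in one pass instead of stopping at the first. `require_commands` is a hypothetical helper name, not something in the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks every listed command and reports all that are
# missing in a single message, so the user can install them in one go.
require_commands() {
  local missing="" cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="${missing:+$missing }$cmd"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Error: required command(s) not found: $missing" >&2
    return 1
  fi
}

# Possible call near the top of setup-fleet.sh, with curl added to the list:
#   require_commands kubectl helm git jq curl || exit 1
```

The same helper could be reused by the other scripts with their own tool lists, which keeps the checks consistent across the playground.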

@@ -0,0 +1,28 @@
using './main.bicep'

param aksRegions = [
Copilot AI Feb 9, 2026

parameters.bicepparam sets aksRegions, but main.bicep does not declare an aksRegions parameter (it only has hubLocation). Deployments using this .bicepparam file will fail until the parameter names match.

Suggested change
- param aksRegions = [
+ param hubLocation = [

Comment on lines +58 to +62
pushd "$CERT_DIR" > /dev/null
if [ ! -d "istio-${ISTIO_VERSION}" ]; then
  curl -sL "https://github.com/istio/istio/archive/refs/tags/${ISTIO_VERSION}.tar.gz" | tar xz
fi
make -f "istio-${ISTIO_VERSION}/tools/certs/Makefile.selfsigned.mk" root-ca
Copilot AI Feb 9, 2026

Root CA generation relies on make (and typically openssl) being present locally, but the script doesn’t check for them. Add a prerequisites check near the top so the failure mode is clearer.

Comment on lines 19 to 22
param aksNodeCount = 2

param k3sVmSize = 'Standard_D2s_v3'

Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aksNodeCount and k3sVmSize are set here but main.bicep hardcodes the AKS node count and uses vmSize (not k3sVmSize) for k3s VMs. Either wire these parameters into the template or remove them from the param file to avoid failed/ineffective configuration.

Suggested change
- param aksNodeCount = 2
- param k3sVmSize = 'Standard_D2s_v3'

STATUS=$(kubectl --context "$cluster" get documentdb documentdb-preview -n documentdb-preview-ns -o jsonpath='{.status.phase}' 2>/dev/null || echo "Unknown")
echo "✓ (Status: $STATUS)"
else
echo "✗ Not found"
Copilot AI Feb 9, 2026

The DocumentDB existence check prints "✗ Not found" but does not increment FAILED, so the script can report success even when the DocumentDB resource is missing on a cluster. Treat this as a failed cluster check (and consider skipping subsequent checks that depend on it).

Suggested change
- echo "✗ Not found"
+ echo "✗ Not found"
+ ((FAILED++))
+ continue
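One caveat worth checking when applying this: in bash, `((FAILED++))` returns a non-zero status when the old value is 0, so under `set -e` the first failure would abort the script. Whether test-connection.sh enables `set -e` is not visible in this thread, so the sketch below is only an illustration of a counter that is safe either way. `record_failure` and the cluster names are hypothetical.

```shell
#!/usr/bin/env bash
set -e

FAILED=0

# Hypothetical helper. FAILED=$((FAILED + 1)) is a plain assignment, so it
# returns status 0 even when the previous value was 0.
record_failure() {
  echo "✗ $1"
  FAILED=$((FAILED + 1))
}

# Stand-in data for per-cluster results; a real script would query kubectl.
for result in "member-a:found" "member-b:missing" "member-c:missing"; do
  cluster=${result%%:*}
  state=${result##*:}
  if [ "$state" = "found" ]; then
    echo "✓ $cluster"
  else
    record_failure "$cluster: DocumentDB not found"
    continue  # skip checks that depend on the resource
  fi
done

echo "Failed checks: $FAILED"
```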

Comment on lines 95 to 98
echo "Joining member clusters to fleet..."
chmod +x ./hack/membership/joinMC.sh
./hack/membership/joinMC.sh "$TAG" "$HUB_CLUSTER_NAME" $ALL_MEMBERS

Copilot AI Feb 9, 2026

./hack/membership/joinMC.sh ... $ALL_MEMBERS relies on word-splitting and can be affected by glob expansion; it also makes it easy to accidentally pass a single concatenated string if ALL_MEMBERS formatting changes. Prefer building an array of member contexts and passing it as "${members[@]}".
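A small sketch of the array approach, assuming `ALL_MEMBERS` remains a space-separated string of context names. The member names, `TAG`, and `HUB_CLUSTER_NAME` values are placeholders, and `join_stub` stands in for `joinMC.sh` so the example runs without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical member context names, space-separated as in $ALL_MEMBERS.
ALL_MEMBERS="k3s-westus3 k3s-eastus2 k3s-uksouth"

# read -a splits on whitespace without performing glob expansion.
read -r -a members <<< "$ALL_MEMBERS"

# Stand-in for ./hack/membership/joinMC.sh: reports the arguments it received.
join_stub() {
  echo "received $# arguments"
  printf '  %s\n' "$@"
}

TAG="v0.0.0"             # placeholder, not a real release tag
HUB_CLUSTER_NAME="hub"   # placeholder context name

# Each array element is passed as exactly one argument.
join_stub "$TAG" "$HUB_CLUSTER_NAME" "${members[@]}"
```

Quoting `"${members[@]}"` keeps each context name intact even if a name ever contains characters such as `*` or `?`.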

Comment on lines 13 to 16
param hubRegion = 'westus3'

param kubernetesVersion = ''

Copilot AI Feb 9, 2026

hubRegion/kubernetesVersion are defined here, but main.bicep expects hubLocation and does not currently expose a kubernetesVersion parameter. This .bicepparam file won’t work as-is unless the parameter names match the template.

Suggested change
- param hubRegion = 'westus3'
- param kubernetesVersion = ''
+ param hubLocation = 'westus3'

Comment on lines 131 to 135
protocol: 'Tcp'
sourceAddressPrefix: '*'
sourcePortRange: '*'
destinationAddressPrefix: '*'
destinationPortRange: '6443'
Copilot AI Feb 9, 2026

The AllowKubeAPI NSG rule allows inbound 6443/TCP from * (public Internet) to the VM. This is a high-risk default for a Kubernetes API server; restrict sourceAddressPrefix to a known CIDR/IP, or make the API private and require VPN/Bastion/Jumpbox access.
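If the source prefix becomes configurable, the deployment script could also refuse wide-open values before calling Azure. The sketch below is one possible guard, not part of the PR. `is_restricted_source` is a hypothetical function that accepts only an IPv4 address or CIDR with a prefix length of 1 to 32, so `*` and `0.0.0.0/0` are rejected.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeeds only for an IPv4 address or CIDR that is not
# "any" (rejects "*", malformed input, and a /0 prefix).
is_restricted_source() {
  local value=$1
  local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})(/([0-9]{1,2}))?$'
  [[ $value =~ $re ]] || return 1

  local i
  for i in 1 2 3 4; do
    # 10# forces base 10 so octets with leading zeros are not read as octal.
    (( 10#${BASH_REMATCH[i]} <= 255 )) || return 1
  done

  # A bare address is treated as a single host (/32).
  local prefix=${BASH_REMATCH[6]:-32}
  (( 10#$prefix >= 1 && 10#$prefix <= 32 )) || return 1
}

# Possible use before deploying (the variable name is an assumption):
#   is_restricted_source "$ALLOWED_SOURCE_IP" || { echo "Refusing open Kube API rule" >&2; exit 1; }
```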

# Generate manifest with substitutions
TEMP_YAML=$(mktemp)

sed -e "s/{{DOCUMENTDB_PASSWORD}}/$DOCUMENTDB_PASSWORD/g" \
Copilot AI Feb 9, 2026

The sed substitution injects the password directly into YAML without escaping. If the user supplies a password containing /, &, or newlines, this will produce an invalid manifest or the wrong secret value. Escape replacement strings (or use a safer templating approach like envsubst with properly quoted YAML) before generating the final YAML.

Suggested change
- sed -e "s/{{DOCUMENTDB_PASSWORD}}/$DOCUMENTDB_PASSWORD/g" \
+ # Escape password for safe use in sed replacement
+ ESCAPED_DOCUMENTDB_PASSWORD=${DOCUMENTDB_PASSWORD//\\/\\\\}
+ ESCAPED_DOCUMENTDB_PASSWORD=${ESCAPED_DOCUMENTDB_PASSWORD//&/\\&}
+ ESCAPED_DOCUMENTDB_PASSWORD=${ESCAPED_DOCUMENTDB_PASSWORD//\//\\/}
+ sed -e "s/{{DOCUMENTDB_PASSWORD}}/$ESCAPED_DOCUMENTDB_PASSWORD/g" \
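To illustrate the escaping end to end, here is a runnable sketch that wraps it in a helper and round-trips a sample value. It is a variation on the suggestion: it uses `sed` itself to escape `\`, `/`, and `&` rather than the parameter expansions. `escape_sed_replacement` and the sample password are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prefixes \, / and & with a backslash so the value can be
# used as the replacement part of a sed s/// command.
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'
}

password='p@ss/w&rd\1'   # sample value only
escaped=$(escape_sed_replacement "$password")

# The placeholder is replaced by the original password, byte for byte.
rendered=$(printf '%s\n' 'password: {{DOCUMENTDB_PASSWORD}}' \
  | sed -e "s/{{DOCUMENTDB_PASSWORD}}/$escaped/g")
echo "$rendered"
```

Newlines in the password are still not handled by this sketch, and the value would also need to be valid inside the surrounding YAML.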

Comment on lines 64 to 67
properties: {
dnsPrefix: aksClusterName
kubernetesVersion: '1.32'
enableRBAC: true
Copilot AI Feb 9, 2026

AKS kubernetesVersion is hardcoded to 1.32, but this repo’s other Bicep templates accept an optional kubernetesVersion parameter and omit it when empty to use the region default GA version (see documentdb-playground/aks-fleet-deployment/main.bicep). Consider following the same pattern here (and/or wiring up the kubernetesVersion parameter in parameters.bicepparam) to avoid deployments failing in regions where that version isn’t available.

- Fix parameters.bicepparam: align param names with main.bicep, remove unused params
- Parameterize kubernetesVersion in main.bicep (was hardcoded to 1.32)
- Add allowedSourceIP param to NSG rule for Kube API (was open to *)
- Add missing prerequisite checks: curl in setup-fleet.sh, make/openssl in
  install-istio.sh, az/base64/awk/curl in install-documentdb-operator.sh
- Fix test-connection.sh: increment FAILED counter for missing DocumentDB
  resource, service, and credentials secret
- Escape password for sed substitution in deploy-documentdb.sh
- Document intentional word-splitting in setup-fleet.sh joinMC.sh call
Comment on lines 5 to 7
# - AKS hub: installed via Helm from local chart package
# - k3s VMs: installed via Azure VM Run Command (CNPG from upstream, operator manifests via base64)

Collaborator


Can't we install from our officially published Helm chart instead of building it locally?

German added 4 commits February 10, 2026 14:56
…EADME

- Pre-generate Istio CA certificates locally (openssl) and inject via cloud-init
- Auto-generate Istio remote secrets on k3s VMs via cloud-init runcmd
- Add NSGs to Bicep for AKS and k3s subnets (prevents NRMS auto-creation)
- Open all required Istio ports (15010/15012/15017/15021/15443)
- Use all-Helm approach for k3s Istio install with --skip-schema-validation
- Use istio-remote-reader SA (avoids conflict with Helm istio-base chart)
- Remove main.json (Bicep is the source of truth)
- Update README with deployment architecture details and lessons learned
- Make kubernetesVersion optional in main.bicep (empty = region default)
- Add security warning for allowedSourceIP NSG default
- Support official OCI Helm chart install via BUILD_CHART=false