A practical Azure cloud operations lab demonstrating infrastructure-as-code, RBAC governance, resource tagging, and observability. This project provisions a sample workload (App Service + Storage) with proper identity management, monitoring, and cost controls suitable for a spin-up/spin-down lab environment.
Key capabilities:
- Infrastructure as Code using Azure Bicep
- Entra ID RBAC configuration with role assignments
- Resource tagging for governance and cost tracking
- Azure Monitor and Application Insights integration
- Automated CI/CD with GitHub Actions
- Python-based compliance auditing tools
Resource Group: azure-ops-lab-rg-westus
Compute & Web:
- App Service Plan (F1 Free Tier)
- Web App (Python 3.11, Linux)
- System-assigned managed identity
Storage:
- Storage Account (Standard_LRS, StorageV2)
Observability:
- Application Insights (connected to all resources)
- Log Analytics Workspace (centralized logging)
Governance:
- RBAC: Contributor role assignment (Web App identity -> Resource Group)
- Tags: environment=lab, owner=brandon-metcalf, project=azure-ops-lab
- IaC Templates: Complete Bicep definitions for all Azure resources
- CI Validation: GitHub Actions workflow validates syntax without requiring Azure credentials
- Python Tooling: Tag compliance auditor using Azure SDK
- Governance Strategy: Consistent tagging and RBAC patterns
- Teardown Automation: Scripts for cost-effective resource cleanup
- Proof Cycle Execution: Completed end-to-end deploy -> verify -> tag-audit -> teardown run with captured evidence (docs/proof/runs/2026-02-07T140703/), including F1 SKU verification and $0 cost controls
- Enable and validate the GitHub Actions OIDC deployment path end-to-end in this subscription (federated credentials + environment approvals)
- Configure additional Application Insights alerts for performance/availability monitoring
- Implement automated cost tracking and reporting
- Add integration tests for deployed resources
- Azure subscription (free tier compatible)
- Azure CLI (`az`) installed
- Bicep CLI installed
- Python 3.9+ with pip
- GitHub account (for Actions workflows)
Azure requires resource providers to be registered before you can deploy resources of that type. Run this once per subscription:
az provider register --namespace Microsoft.Web
az provider register --namespace Microsoft.Storage
az provider register --namespace Microsoft.Insights
az provider register --namespace Microsoft.OperationalInsights
az provider register --namespace Microsoft.Authorization

Registration is idempotent (safe to run multiple times) and typically completes in under a minute. The deploy scripts (scripts/deploy.sh, scripts/deploy-fast.ps1) also check and register these automatically.
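
The check-then-register step can be sketched roughly as follows. This is not the code from the deploy scripts, which is not shown here; the function names and the input shape (a mapping of namespace to registration state) are assumptions for illustration. The state strings follow what `az provider show --namespace <name> --query registrationState` reports, such as `Registered` or `NotRegistered`.

```python
# Namespaces listed in the registration step above.
REQUIRED_PROVIDERS = [
    "Microsoft.Web",
    "Microsoft.Storage",
    "Microsoft.Insights",
    "Microsoft.OperationalInsights",
    "Microsoft.Authorization",
]


def providers_to_register(states):
    """Return required namespaces whose state is not 'Registered'.

    `states` maps namespace -> registrationState string (hypothetical input
    shape); a namespace missing from the mapping counts as not registered.
    """
    return [ns for ns in REQUIRED_PROVIDERS if states.get(ns) != "Registered"]


def register_commands(namespaces):
    """Build az CLI argument lists for each namespace (nothing is executed)."""
    return [["az", "provider", "register", "--namespace", ns] for ns in namespaces]
```

The argument lists could be passed to `subprocess.run` one at a time; because registration is idempotent, re-running a command for an already registered namespace is harmless.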
Create a resource group:
az group create --name azure-ops-lab-rg-westus --location westus

Deploy using Bicep:
az deployment group create \
--resource-group azure-ops-lab-rg-westus \
--template-file infra/main.bicep \
--parameters infra/parameters.json

List deployed resources:
az resource list --resource-group azure-ops-lab-rg-westus --output table

Install Python dependencies:
pip install azure-identity azure-mgmt-resource

Run the audit script:
python src/tag_audit.py \
--subscription-id YOUR_SUBSCRIPTION_ID \
--resource-group azure-ops-lab-rg-westus \
--output-format json

To delete all resources and avoid ongoing charges:
./scripts/teardown.sh azure-ops-lab-rg-westus

Or manually:
az group delete --name azure-ops-lab-rg-westus --yes --no-wait

- Principle of Least Privilege: Role assignments scoped to resource group level
- Entra ID Integration: Uses Azure RBAC for identity-based access control
- Role Used: Contributor role for deployment automation (read/write, no permission management)
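
To make the resource-group scoping concrete, the sketch below builds the ARM scope and role definition IDs that a Contributor assignment at resource group level would reference, and checks whether a resource ID falls under a given scope. The helper names are hypothetical and this is not taken from infra/main.bicep; the GUID is the built-in Contributor role definition ID.

```python
# Built-in Contributor role definition GUID (the same in every tenant).
CONTRIBUTOR_ROLE_ID = "b24988ac-6180-42a0-ab88-20f7382dd24c"


def resource_group_scope(subscription_id, resource_group):
    """ARM scope string for a resource group."""
    return f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"


def role_definition_id(subscription_id, role_guid=CONTRIBUTOR_ROLE_ID):
    """Fully qualified role definition ID within a subscription."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleDefinitions/{role_guid}"
    )


def is_within_scope(assignment_scope, resource_id):
    """True if resource_id is the scope itself or sits beneath it.

    Comparison is case-insensitive, as ARM resource IDs are.
    """
    scope = assignment_scope.rstrip("/").lower()
    rid = resource_id.rstrip("/").lower()
    return rid == scope or rid.startswith(scope + "/")
```

An assignment scoped this way covers resources inside azure-ops-lab-rg-westus but nothing in other resource groups of the same subscription, which is the containment the least-privilege bullet describes.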
All resources include mandatory tags for governance:
| Tag | Purpose | Example Value |
|---|---|---|
| environment | Deployment stage | lab, dev, prod |
| owner | Responsible party | brandon-metcalf |
| project | Cost allocation | azure-ops-lab |
Tags enable:
- Cost tracking and allocation
- Automated compliance auditing
- Resource lifecycle management
- Environment isolation
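
The core of a tag compliance check can be sketched as a pure function over resource records. This is an illustrative sketch, not the contents of src/tag_audit.py: the function name, the dict shape (`name` and `tags` keys), and the choice to treat empty tag values as missing are all assumptions.

```python
# Mandatory tag keys from the table above.
REQUIRED_TAGS = ("environment", "owner", "project")


def audit_tags(resources, required=REQUIRED_TAGS):
    """Return {resource name: [missing tag keys]} for non-compliant resources.

    `resources` is a list of dicts with 'name' and 'tags' keys (assumed shape).
    """
    findings = {}
    for res in resources:
        tags = res.get("tags") or {}  # untagged resources may carry None
        # Azure treats tag names case-insensitively, so compare lowercased.
        # Assumption: a tag with an empty value counts as missing.
        present = {key.lower() for key, value in tags.items() if value}
        missing = [tag for tag in required if tag not in present]
        if missing:
            findings[res["name"]] = missing
    return findings
```

An empty result means every resource carries all three mandatory tags; a non-empty one could be serialized to JSON or used to fail a CI step.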
- Application Insights: Application performance monitoring (APM)
  - Request rates and response times
  - Dependency tracking (Storage Account calls)
  - Exception logging
  - Custom metrics and events
- Log Analytics Workspace: Centralized log aggregation
  - Platform logs from all resources
  - Query using KQL (Kusto Query Language)
  - Retention configured for cost optimization
- Budget threshold warning configured at $1 spend (see docs/proof/runs/2026-02-07T140703/budget-config.png)
- App Service response time > 2s
- Storage Account throttling events
- Failed authentication attempts
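
As a rough illustration of the response-time condition, the sketch below evaluates one window of response-time samples against the 2-second threshold. The aggregation (mean over the window) and the function name are assumptions; the README does not specify how the alert rule aggregates samples.

```python
from statistics import mean

# Threshold from the alert list above, in seconds.
RESPONSE_TIME_THRESHOLD_S = 2.0


def should_alert(samples_s, threshold_s=RESPONSE_TIME_THRESHOLD_S):
    """True when the mean of the window's samples exceeds the threshold.

    An empty window never fires (assumed behavior).
    """
    return bool(samples_s) and mean(samples_s) > threshold_s
```

Averaging smooths out a single slow request; a rule built on the maximum or a percentile would fire more readily on outliers.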
Designed for minimal cost:
- App Service: F1 Free tier (60 CPU minutes/day)
- Storage Account: Pay-as-you-go with minimal usage
- Application Insights: First 5GB/month free
- Log Analytics: First 5GB/month free
Spin-up/Spin-down approach:
- Use `az group delete` to remove all resources when not in use
- Redeploy from IaC templates when needed
- No persistent data in lab environment
Regional quota constraints: As of Feb 10, 2026, the operational default region is westus (F1 quota approved), and eastus2 is the documented fallback. See the incident report in docs/incidents/ for details.
Estimated monthly cost: ~$0 with free tiers (if kept within limits)
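
The "if kept within limits" condition can be checked with simple arithmetic. The sketch below compares observed usage against the free allowances quoted above; the metric names and the usage dict are hypothetical, and the limits should be verified against current Azure pricing before relying on them.

```python
# Free allowances as quoted in this README (assumed still current).
FREE_LIMITS = {
    "app_service_cpu_minutes_per_day": 60,  # F1 tier
    "app_insights_gb_per_month": 5,
    "log_analytics_gb_per_month": 5,
}


def over_limit(usage, limits=FREE_LIMITS):
    """Return {metric: amount over the free allowance} for exceeded metrics.

    Metrics absent from `usage` are treated as zero.
    """
    return {
        name: usage[name] - limit
        for name, limit in limits.items()
        if usage.get(name, 0) > limit
    }
```

For example, a day with 75 CPU minutes on the F1 plan would be reported as 15 minutes over the allowance, while 1.2 GB of Application Insights ingestion would not appear in the result.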
- Validates Bicep syntax
- Checks Python code for errors
- Runs on every push to main
- No Azure credentials required
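
A credential-free Python check of the kind listed above can be as small as parsing each file. The sketch below is an assumption about how such a step might work, not the contents of the build workflow; the function name and the in-memory input (file name mapped to source text) are chosen for illustration.

```python
import ast


def syntax_errors(sources):
    """Map each file name to its SyntaxError message; valid files are omitted.

    `sources` maps a file name to its source text (hypothetical input shape).
    """
    errors = {}
    for name, text in sources.items():
        try:
            # Parsing only checks syntax; no code is executed.
            ast.parse(text, filename=name)
        except SyntaxError as exc:
            errors[name] = f"line {exc.lineno}: {exc.msg}"
    return errors
```

A CI step could read the files under src/, call this function, and exit non-zero when the result is non-empty.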
- Uses OIDC federation (no secrets in repo)
- Requires Azure federated credentials setup
- Manual workflow_dispatch trigger
- Environment-based approvals
azure-ops-lab/
+-- README.md # This file
+-- capture-evidence.sh # Full proof cycle automation
+-- docs/
| +-- architecture/
| | +-- high-availability-design.md # Multi-region HA design (design-only)
| +-- incidents/ # Incident reports and RCAs
+-- infra/
| +-- main.bicep # Infrastructure as Code definitions
| +-- parameters.json # Deployment parameters
+-- src/
| +-- tag_audit.py # Compliance auditing tool (Python)
+-- scripts/
| +-- deploy.sh # Deploy with what-if gate (Bash)
| +-- deploy-fast.ps1 # Fast deploy for demos (PowerShell)
| +-- teardown.sh # Resource cleanup script (Bash)
+-- .github/
+-- workflows/
+-- build.yml # Syntax validation (no Azure creds)
+-- deploy.yml # Deployment workflow (manual)
This project is maintained as a production-like environment. Real operational incidents are documented here to practice incident response and root cause analysis.
| Date | Incident | Impact | Status |
|---|---|---|---|
| 2026-02-02 | GitHub Actions Platform Outage | CI delayed ~6hrs (upstream Azure issue) | Resolved |
| 2026-02-07 | F1 Quota Regional Constraints | Deployment region changed; westus now primary | Resolved |
See docs/incidents/ for detailed incident reports and post-mortems.
Completed proof cycles are captured in docs/proof/runs/. Each run includes what-if output, deployment logs, SKU verification, tag audit, and teardown status checks.
Latest run: 2026-02-10T101157 — westus, F1 Free tier, $0 cost.
This is a personal lab project for learning Azure operations. Feel free to fork and adapt for your own learning purposes.
MIT License - Free to use and modify.
Author: Brandon Metcalf
GitHub: @bmetcalf21
Project: azure-ops-lab