Workload Identity Project

Welcome to the Workload Identity project! This repository contains documentation and guides for understanding, implementing, and managing workload identity in cloud-native environments.

Documentation Overview

Below is a list of the key README files and guides available in this repository:

Architecture Guide: A detailed overview of the system architecture, components, and their interactions. This guide is essential for understanding the design principles and implementation details of the workload identity system.
Developer Guide: Instructions and best practices for developers looking to integrate or extend the workload identity system. This guide includes API references, implementation tips, and code examples.
Compliance Guide: Guidelines and requirements for ensuring compliance with security standards and regulations. This guide covers metrics, thresholds, and automated compliance checks.
Deployment Guide: Step-by-step instructions for deploying the workload identity system in various environments, including Kubernetes and cloud providers.
Monitoring Guide: Information on how to monitor the workload identity system, including metrics collection, alerting, and performance optimization.
Security Best Practices: Best practices for securing the workload identity system, including encryption, access control, and audit logging.
API Reference Guide: Detailed documentation of the APIs provided by the workload identity system, including endpoints, parameters, and response formats.
Migration Guide: Instructions for migrating from existing identity systems to the workload identity system, including compatibility considerations and migration strategies.
Troubleshooting Guide: Common issues and solutions for troubleshooting the workload identity system, including diagnostic tools and debugging tips.
Integration Guide: Guidelines for integrating the workload identity system with other services and platforms, including cloud providers and service meshes.
Agentic Identity Guide: Workload identity for AI agents and autonomous systems. SPIFFE ID patterns for agent roles, JIT access provisioning, MCP gateway enforcement, and migration from API key authentication.
Post-Quantum Cryptography Migration Guide: Migration planning for NIST's 2024 PQC standards. Hybrid deployment architecture and compliance implications under CRA, EU AI Act, NIS2, and DORA.
CI/CD OIDC Federation Guide: Secretless CI/CD with GitHub Actions and GitLab. Working configurations for AWS, GCP, and Azure.
TPM 2.0 Node Attestation: Hardware-rooted attestation for physical infrastructure, edge, and air-gapped environments.
Resources: Curated references organized by use case.

Getting Started

To get started with the Workload Identity project, please refer to the Architecture Guide for an overview of the system design and the Developer Guide for implementation details.

Contributing

We welcome contributions! Please see the Contributing Guide for more information on how to contribute to this project.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contact

For any questions or feedback, please contact the maintainer at salkimmich.

Workload Identity System

Introduction

This repository implements a workload identity system for cloud-native environments. It provides a foundation for secure service-to-service communication, identity management, and access control in distributed systems. The system follows zero-trust principles: every workload must be authenticated and authorized before it can communicate, wherever it runs in the infrastructure.

Why Workload Identity?

In today's cloud-native architectures, identity is no longer a human-first problem. By 2025, enterprise environments averaged 45 non-human identities for every human identity - a ratio growing fast as AI agents, CI/CD pipelines, serverless functions, and automated workflows take on more operational work.

Workloads must authenticate and authorize dynamically, with provable integrity and no static secrets. Workload identity assigns each non-human actor a unique, verifiable identifier (a SPIFFE ID), enabling secure, policy-driven communication at scale.

For AI agents specifically, see the Agentic Identity Guide.
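
To make the identifier format concrete, here is a minimal Python sketch of SPIFFE ID validation. It is an illustration, not code from this repository: the names `SpiffeId` and `parse_spiffe_id` are hypothetical, the character and length rules follow the SPIFFE ID specification as it is commonly implemented, and the sketch is deliberately strict (lowercase scheme and trust domain only). Production code would normally use an official SPIFFE library instead.

```python
import re
from typing import NamedTuple

# Trust domain: lowercase letters, digits, '.', '-', '_'.
# Path segment: letters, digits, '.', '-', '_'.
_TRUST_DOMAIN_RE = re.compile(r"[a-z0-9._-]+")
_SEGMENT_RE = re.compile(r"[A-Za-z0-9._-]+")


class SpiffeId(NamedTuple):
    """Hypothetical container for a parsed SPIFFE ID."""
    trust_domain: str
    path: str  # "" or "/segment/segment..."


def parse_spiffe_id(value: str) -> SpiffeId:
    """Parse and validate a SPIFFE ID, raising ValueError on malformed input."""
    prefix = "spiffe://"
    if len(value.encode("utf-8")) > 2048:
        raise ValueError("SPIFFE ID exceeds 2048 bytes")
    if not value.startswith(prefix):
        raise ValueError("scheme must be 'spiffe'")
    trust_domain, slash, path = value[len(prefix):].partition("/")
    # Ports, userinfo, queries, and fragments all fail this character check.
    if len(trust_domain) > 255 or not _TRUST_DOMAIN_RE.fullmatch(trust_domain):
        raise ValueError(f"invalid trust domain: {trust_domain!r}")
    if not slash:
        return SpiffeId(trust_domain, "")
    for segment in path.split("/"):
        # Empty segments (including a trailing slash) and dot segments are rejected.
        if segment in (".", "..") or not _SEGMENT_RE.fullmatch(segment):
            raise ValueError(f"invalid path segment: {segment!r}")
    return SpiffeId(trust_domain, "/" + path)
```

For example, `parse_spiffe_id("spiffe://example.org/ns/prod/sa/api")` yields the trust domain `example.org` and the path `/ns/prod/sa/api`, while an input with a trailing slash or a query string raises `ValueError`.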

SPIRE Implementation

The project uses SPIRE (SPIFFE Runtime Environment) as its core workload identity provider. The implementation includes:

Core Components

SPIRE Server
- Deployed as a Kubernetes Deployment
- High availability configuration with rolling updates
- Comprehensive health checks and monitoring
- Secure secret management
- Resource limits and requests configured
- Security context with non-root user execution
SPIRE Agent
- Deployed as a Kubernetes DaemonSet
- Rolling update strategy for controlled updates
- Node attestation and workload registration
- Health checks and monitoring
- Resource limits and requests configured
- Security context with non-root user execution
Certificate Management
- Automated certificate generation and rotation
- CRL (Certificate Revocation List) support
- Backup and disaster recovery procedures
- Secure storage of certificates and keys

Security Features

Certificate Security
- Strong key usage constraints
- Extended key usage validation
- Certificate revocation support
- Automated backup of certificates
Runtime Security
- Non-root user execution
- Read-only root filesystem
- Dropped capabilities
- Prevented privilege escalation
Operational Security
- Comprehensive health checks
- Resource limits and requests
- Graceful termination
- Secure volume mounts

Monitoring and Operations

Health Checks
- Startup probes for slow-starting containers
- Liveness probes for container health
- Readiness probes for service availability
- Configurable timeouts and thresholds
Resource Management
- CPU and memory limits
- Resource requests for scheduling
- Graceful termination periods
- Update strategies for zero-downtime
Backup and Recovery
- Automated certificate backups
- Timestamped backup directories
- Secure backup storage
- Cleanup procedures
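
The backup items above (timestamped directories plus cleanup) can be sketched in a few lines of Python. This is a hedged illustration only: the function names, the timestamp format, and the keep-the-newest-N retention rule are assumptions, not the procedure implemented by this repository's scripts.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import List

# UTC timestamp format chosen so that lexicographic order equals chronological order.
STAMP_FORMAT = "%Y%m%dT%H%M%SZ"


def create_backup(source: Path, backup_root: Path, now: datetime) -> Path:
    """Copy `source` into a new timestamped directory under `backup_root`."""
    stamp = now.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    destination = backup_root / stamp
    shutil.copytree(source, destination)
    return destination


def prune_backups(backup_root: Path, keep: int) -> List[Path]:
    """Delete all but the newest `keep` backup directories; return what was removed."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backups = sorted(p for p in backup_root.iterdir() if p.is_dir())
    stale = backups[:-keep]
    for directory in stale:
        shutil.rmtree(directory)
    return stale
```

Passing `now` explicitly keeps the functions deterministic and easy to test. A real procedure would also restrict permissions on the backup location, since the directories hold certificates and keys.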

Documentation

Core Documentation

Architecture Guide - System architecture and component interactions
Security Best Practices - Security guidelines and implementation details
Developer Guide - Development workflows and best practices
Deployment Guide - Deployment procedures and configurations
Monitoring Guide - Monitoring setup and observability practices
Disaster Recovery Guide - Recovery procedures and backup strategies
API Reference Guide - API documentation and usage examples

Migration and Compliance

Migration Guide - Migration procedures from existing systems
Compliance Guide - Compliance requirements and implementation details
Troubleshooting Guide - Common issues and solutions

Repository Structure

.
├── docs/                      # Documentation
│   ├── architecture_guide.md  # System architecture and design
│   ├── security_best_practices.md # Security guidelines
│   ├── developer_guide.md    # Development workflows
│   ├── deployment_guide.md   # Deployment procedures
│   ├── monitoring_guide.md   # Monitoring and observability
│   ├── disaster_recovery_guide.md # Recovery procedures
│   ├── api_reference.md      # API documentation
│   ├── migration_guide.md    # Migration procedures
│   ├── compliance_guide.md   # Compliance requirements
│   ├── troubleshooting_guide.md # Common issues
│   ├── pki_guide.md         # PKI concepts and usage
│   └── pki_concepts_detailed.md # Detailed PKI documentation
├── tests/                    # Test infrastructure
│   ├── unit/                # Unit tests
│   ├── integration/         # Integration tests
│   ├── e2e/                # End-to-end tests
│   ├── security/           # Security tests
│   └── fixtures/           # Test fixtures and mocks
├── infrastructure/           # Infrastructure as Code
│   └── kubernetes/          # Kubernetes manifests
│       ├── spire/          # SPIRE server and agent configs
│       ├── monitoring/     # Monitoring stack configs
│       └── networking/     # Network policies and configs
├── core/                   # Core libraries and utilities
│   ├── identity/          # Identity management
│   ├── security/          # Security utilities
│   └── integration/       # Integration components
├── services/              # Service implementations
│   ├── identity-provider/ # Identity provider service
│   ├── policy-engine/     # Policy evaluation service
│   └── federation/        # Federation service
└── examples/              # Example applications
    ├── kubernetes/        # Kubernetes examples
    ├── cloud/            # Cloud provider examples
    └── service-mesh/     # Service mesh examples

Key Directories Explained

Documentation (`docs/`)

The documentation directory contains comprehensive guides covering all aspects of the system:

Architecture and design decisions
Security best practices and guidelines
Development workflows and standards
Deployment procedures and configurations
Monitoring and observability setup
Disaster recovery procedures
API reference and usage examples
Migration procedures from existing systems
Compliance requirements and implementation
Troubleshooting guides
PKI concepts and detailed documentation

Testing (`tests/`)

The testing directory contains a comprehensive test suite:

Unit tests for individual components
Integration tests for component interactions
End-to-end tests for complete workflows
Security tests including penetration and fuzzing tests
Test fixtures and mocks for various components

Key features:

Cloud provider test fixtures (AWS, GCP, Azure)
Kubernetes test configurations
Certificate generation and management
Mock implementations for external services

Infrastructure (`infrastructure/`)

The infrastructure directory contains all infrastructure-as-code configurations:

Kubernetes manifests for SPIRE deployment
Monitoring stack configurations
Network policies and security configurations
(Planned) Terraform configurations for cloud resources
(Planned) Ansible playbooks for automation

Core Libraries (`core/`)

The core directory contains the fundamental libraries and utilities:

Identity management components
Security utilities and helpers
Integration components for various platforms
Common utilities and shared code

Services (`services/`)

The services directory contains the main service implementations:

Identity provider service for workload identity
Policy engine for access control
Federation service for cross-domain trust
Additional supporting services

Examples (`examples/`)

The examples directory contains practical examples and templates:

Kubernetes deployment examples
Cloud provider integration examples
Service mesh integration examples
Common use case implementations

Key Features

Identity Management

The system provides robust identity management capabilities through a distributed architecture that ensures secure and scalable workload identity provisioning. Each workload receives a unique SPIFFE ID that is cryptographically verifiable and tied to its runtime environment.

Special Considerations:

Ephemeral workloads require identity at runtime, not via pre-provisioned secrets
Trust is anchored in secure enclaves, TPMs, or cloud-native CAs
The system is designed for rapid scaling and horizontal expansion
Supports both X.509 certificates and JWTs
Enables decentralized issuance with centralized governance
Federation-ready: supports trust bundles and multi-domain identity

graph TD
    A[Workload] -->|Attestation| B[SPIRE Agent]
    B -->|Identity Request| C[SPIRE Server]
    C -->|Validate| D[Node Attestation]
    C -->|Issue| E[SPIFFE ID]
    E -->|Bind| F[Workload Identity]
    F -->|Use| G[Service Communication]

Key components:

Automatic identity provisioning based on workload attributes
Certificate-based authentication using X.509 certificates
Role-based access control with fine-grained permissions
Support for multiple identity providers through federation
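
Provisioning "based on workload attributes" can be pictured as selector matching: a registration entry maps a set of selectors to a SPIFFE ID, and a workload is entitled to that ID only if its attested attributes satisfy every selector in the entry. The Python sketch below models that rule. The class and function names are hypothetical, the selector strings imitate the `k8s:ns:` / `k8s:sa:` style used by SPIRE's Kubernetes workload attestor, and the code is a simplified model rather than SPIRE's implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class RegistrationEntry:
    """Hypothetical, simplified registration entry."""
    spiffe_id: str
    selectors: FrozenSet[str]


def entitled_ids(entries: Iterable[RegistrationEntry],
                 attested: FrozenSet[str]) -> List[str]:
    """Return the SPIFFE IDs whose selectors are all present in `attested`."""
    return [
        entry.spiffe_id
        for entry in entries
        # An entry with no selectors would match everything, so it is skipped.
        if entry.selectors and entry.selectors <= attested
    ]


entries = [
    RegistrationEntry("spiffe://example.org/payments/api",
                      frozenset({"k8s:ns:payments", "k8s:sa:api"})),
    RegistrationEntry("spiffe://example.org/billing/worker",
                      frozenset({"k8s:ns:billing", "k8s:sa:worker"})),
]
# Attributes the agent discovered about the calling workload (illustrative values).
attested = frozenset({"k8s:ns:payments", "k8s:sa:api", "k8s:pod-label:app:api"})
matched = entitled_ids(entries, attested)
```

Extra attested attributes do not prevent a match; a missing one does. This is why narrowly scoped selectors matter for least privilege.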

Security

The security architecture implements a zero-trust model in which every service interaction requires mutual authentication and authorization. Services communicate over mTLS, and keys are rotated automatically.

Security Features:

End-to-end mTLS with SPIFFE-based identities
Continuous credential rotation (5–15 minute TTLs)
Just-in-time identity provisioning and revocation
Optional confidential computing integration (e.g. Intel SGX, AMD SEV)
Real-time attestation: validate integrity of code and runtime environment

Temporal Governance:

Trust depends not only on who a workload is but also on when its credentials are valid
Short-lived, time-bound credentials expire quickly, reducing lateral movement risk
Policies can enforce access windows and runtime conditions via engines like OPA
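
With TTLs in the 5–15 minute range, the practical question is when to renew. The Python sketch below assumes a renew-at-a-fixed-fraction-of-lifetime policy, with one half as the default. That fraction is an assumption for illustration, not a setting documented in this repository, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def rotation_time(issued_at: datetime, expires_at: datetime,
                  fraction: float = 0.5) -> datetime:
    """Return the moment at which a credential should be renewed."""
    if expires_at <= issued_at:
        raise ValueError("expires_at must be after issued_at")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return issued_at + (expires_at - issued_at) * fraction


def should_rotate(issued_at: datetime, expires_at: datetime, now: datetime,
                  fraction: float = 0.5) -> bool:
    """True once `now` has reached the renewal point for the credential."""
    return now >= rotation_time(issued_at, expires_at, fraction)


issued = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
expires = issued + timedelta(minutes=10)  # a 10-minute TTL
renew_at = rotation_time(issued, expires)  # 12:05 UTC
```

Renewing well before expiry leaves room for retries if the issuer is briefly unavailable, so the old credential remains valid while the new one is fetched.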

graph LR
    A[Service A] -->|mTLS| B[Service B]
    A -->|Verify| C[Certificate]
    B -->|Verify| D[Certificate]
    C -->|Validate| E[Trust Chain]
    D -->|Validate| E
    E -->|Check| F[Policy Engine]

Additional security features:

Mutual TLS (mTLS) for all service-to-service communication
Zero-trust model implementation with continuous verification
Secure key management with automatic rotation
Hardware security module (HSM) support for key storage
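
As a rough illustration of the mTLS flow in the diagram above, the following Python sketch configures a server-side TLS context that requires a client certificate, then reads the peer's SPIFFE ID from the URI subject alternative name. It uses only the standard `ssl` module. The function names and file-path parameters are hypothetical, and the paths are optional only so that the sketch can be exercised without key material. In a SPIRE deployment the certificates would normally be delivered and rotated through the Workload API rather than loaded from static files.

```python
import ssl
from typing import Any, Mapping, Optional


def build_mtls_server_context(cert_file: Optional[str] = None,
                              key_file: Optional[str] = None,
                              ca_bundle_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a server context that rejects clients without a valid certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.verify_mode = ssl.CERT_REQUIRED  # this is what makes the TLS mutual
    if cert_file and key_file:
        context.load_cert_chain(cert_file, key_file)  # the server's own identity
    if ca_bundle_file:
        context.load_verify_locations(cafile=ca_bundle_file)  # trust bundle for peers
    return context


def spiffe_id_from_peercert(peercert: Mapping[str, Any]) -> str:
    """Extract the SPIFFE ID from the dict returned by SSLSocket.getpeercert()."""
    uris = [value for kind, value in peercert.get("subjectAltName", ())
            if kind == "URI"]
    # An X.509-SVID carries exactly one URI SAN, and it must be a SPIFFE ID.
    if len(uris) != 1 or not uris[0].startswith("spiffe://"):
        raise ValueError("peer certificate does not carry exactly one SPIFFE ID")
    return uris[0]
```

After the handshake, the extracted ID is what an authorization layer such as the policy engine would evaluate. Chain validation alone only proves that the peer belongs to a trusted domain, not that it is allowed to call this service.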

Integration

The system integrates with common cloud-native platforms and tools, so that workload identity can be applied across the entire infrastructure stack.

Integration capabilities:

Kubernetes: Uses native ServiceAccount tokens for OIDC federation
Cloud IAM: Compatible with AWS IRSA, GCP Workload Identity, Azure Federated Identity Credentials
Service Mesh: Works with Istio, Linkerd, and custom mTLS setups
CI/CD: Secure ephemeral identity for GitHub Actions, GitLab, Jenkins, and more
Secrets Managers: Authenticate to systems like Vault using ambient SPIFFE identity
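
OIDC federation for CI/CD and cloud IAM usually reduces to a trust condition over token claims: the issuer, the audience, and a subject pattern. The Python sketch below evaluates such a condition against claims that have already been extracted from a token. It performs no signature verification, which must happen first against the issuer's published keys. The function name and the repository path in the usage lines are hypothetical. The issuer URL and `sub` layout follow GitHub Actions' documented OIDC token format, and the audience shown is the one AWS STS expects.

```python
import time
from fnmatch import fnmatchcase
from typing import Any, Mapping, Optional


def claims_allowed(claims: Mapping[str, Any], *, issuer: str, audience: str,
                   subject_pattern: str, now: Optional[float] = None) -> bool:
    """Check already-verified token claims against a simple trust condition."""
    current = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    subject = claims.get("sub")
    expiry = claims.get("exp")
    return (
        claims.get("iss") == issuer
        and audience in audiences
        and isinstance(subject, str)
        # Note: '*' in the pattern also matches '/', so keep patterns narrow.
        and fnmatchcase(subject, subject_pattern)
        and isinstance(expiry, (int, float))
        and current < expiry
    )


GITHUB_ISSUER = "https://token.actions.githubusercontent.com"
example_claims = {
    "iss": GITHUB_ISSUER,
    "aud": "sts.amazonaws.com",
    "sub": "repo:your-org/workload-identity:ref:refs/heads/main",
    "exp": 2000,
}
allowed = claims_allowed(
    example_claims,
    issuer=GITHUB_ISSUER,
    audience="sts.amazonaws.com",
    subject_pattern="repo:your-org/workload-identity:ref:refs/heads/*",
    now=1000,
)
```

Pinning the subject to a repository and branch is what stops any other workflow on the same issuer from assuming the role.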

graph TD
    A[Workload Identity] -->|Integrate| B[Kubernetes]
    A -->|Connect| C[Service Mesh]
    A -->|Federate| D[Cloud Providers]
    A -->|Automate| E[CI/CD]
    B -->|Use| F[Service Accounts]
    C -->|Use| G[mTLS]
    D -->|Use| H[Cloud IAM]
    E -->|Use| I[Pipeline Security]

Getting Started

Prerequisites

Kubernetes cluster (v1.24+)
Helm v3.7+
kubectl
SPIRE v1.13.3+ (see SPIRE releases)
Access to container registry

Quick Start

Review the Architecture Guide
Follow the Deployment Guide
Consult the Developer Guide
Refer to the Security Best Practices

Development Workflow

1. Local Development Setup

# Clone the repository
git clone https://github.com/your-org/workload-identity.git
cd workload-identity

# Set up development environment
make setup-dev

# Start local development cluster
make start-local-cluster

# Run tests
make test

2. Component Development

The system is organized into several key components:

Identity Provider
- Located in services/identity-provider/
- Handles workload identity issuance and validation
- Implements SPIFFE/SPIRE integration
- Manages trust relationships
Policy Engine
- Located in services/policy-engine/
- Evaluates access control policies
- Integrates with OPA for policy decisions
- Manages policy lifecycle
Federation Service
- Located in services/federation/
- Handles cross-domain trust
- Manages trust bundle exchange
- Implements federation protocols
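
One detail of cross-domain trust is worth spelling out: a peer's certificate should be validated only against the trust bundle of the trust domain named in its own SPIFFE ID, never against the union of all known bundles. The Python sketch below shows that lookup. The function names and the shape of the `bundles` mapping are assumptions for illustration and are not taken from `services/federation/`.

```python
from typing import Mapping, Sequence


def trust_domain_of(spiffe_id: str) -> str:
    """Return the trust domain component of a SPIFFE ID."""
    prefix = "spiffe://"
    if not spiffe_id.startswith(prefix):
        raise ValueError("not a SPIFFE ID")
    trust_domain = spiffe_id[len(prefix):].split("/", 1)[0]
    if not trust_domain:
        raise ValueError("SPIFFE ID has an empty trust domain")
    return trust_domain


def roots_for_peer(spiffe_id: str,
                   bundles: Mapping[str, Sequence[str]]) -> Sequence[str]:
    """Select the root certificates that may vouch for this peer."""
    trust_domain = trust_domain_of(spiffe_id)
    if trust_domain not in bundles:
        # Unknown domains are rejected rather than falling back to another bundle.
        raise LookupError(f"no trust bundle for {trust_domain!r}")
    return bundles[trust_domain]


# Placeholder bundle contents; real bundles hold CA certificates.
bundles = {
    "example.org": ["<example.org root CA>"],
    "partner.example": ["<partner.example root CA>"],
}
roots = roots_for_peer("spiffe://partner.example/orders/api", bundles)
```

Keeping bundles separate per domain means that a compromised partner CA cannot mint identities inside the local trust domain.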

3. Testing Strategy

Unit tests for core components
Integration tests for service interactions
End-to-end tests for complete workflows
Security tests for trust relationships
Performance tests for scalability

4. Deployment Process

Build components:
```
make build
```
Run tests:
```
make test
```
Build containers:
```
make docker-build
```
Deploy to cluster:
```
make deploy
```

5. Monitoring and Debugging

Use the monitoring stack for observability
Check logs using the logging system
Use tracing for request flows
Monitor metrics for performance

6. Contributing

Create a feature branch
Make your changes
Run tests and linting
Submit a pull request
Address review comments
Merge after approval

For more details, see the Developer Guide.

Basic Usage Example

apiVersion: workload-identity/v1
kind: WorkloadIdentity
metadata:
  name: my-service
spec:
  serviceAccount: my-service-account
  identityProvider: kubernetes
  policies:
    - name: service-access
      rules:
        - apiGroups: ["*"]
          resources: ["*"]
          verbs: ["get", "list"]

Migration

The system supports migration from various identity systems:

OIDC to Workload Identity
SAML to Workload Identity
Custom identity system migration

See the Migration Guide for detailed procedures.

Compliance

Supports major compliance frameworks:

ISO 27001
SOC 2
GDPR
HIPAA

See the Compliance Guide for implementation details.

Monitoring and Observability

The system exposes metrics, logs, and traces through the following observability stack:

Monitoring Stack

monitoring:
  metrics:
    prometheus:
      enabled: true
      retention: 15d
      scrape_interval: 15s
  logging:
    loki:
      enabled: true
      retention: 30d
  tracing:
    jaeger:
      enabled: true
      sampling_rate: 0.1

Key Metrics

Identity Metrics
- Identity issuance rate
- Identity validation success/failure
- Certificate rotation events
- Trust bundle updates
Security Metrics
- Authentication attempts
- Authorization decisions
- Policy evaluation latency
- Security violations
Performance Metrics
- Request latency
- Error rates
- Resource utilization
- Cache hit rates
Federation Metrics
- Cross-domain trust operations
- Trust bundle exchange events
- Federation health status
- Federation latency

Logging

Structured logging in JSON format
Log levels: DEBUG, INFO, WARN, ERROR
Log rotation and retention
Log aggregation and analysis
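
As a small illustration of structured JSON logging with the levels listed above, the following Python sketch uses the standard `logging` module. The field names (`ts`, `level`, `logger`, `message`, `spiffe_id`) and the helper names are assumptions rather than the log schema this project actually emits.

```python
import json
import logging
from datetime import datetime, timezone
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            # Python names this level WARNING; the document's level list uses WARN.
            "level": "WARN" if record.levelname == "WARNING" else record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional context passed via `extra=`; the field name is hypothetical.
        spiffe_id = getattr(record, "spiffe_id", None)
        if spiffe_id is not None:
            entry["spiffe_id"] = spiffe_id
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name: str, stream: TextIO) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger
```

A call such as `build_logger("identity", sys.stderr).warning("certificate expires soon", extra={"spiffe_id": "spiffe://example.org/api"})` then produces one JSON object per line, which aggregation systems like Loki can parse without custom patterns.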

Tracing

Distributed tracing with OpenTelemetry
Request flow visualization
Latency analysis
Error tracking

Dashboards

System Overview
- System health
- Resource utilization
- Error rates
- Performance metrics
Security Dashboard
- Authentication metrics
- Authorization decisions
- Security events
- Policy evaluations
Federation Dashboard
- Trust relationships
- Federation health
- Cross-domain operations
- Trust bundle status

Alerting

Critical Alerts
- System failures
- Security violations
- Trust chain issues
- Federation failures
Warning Alerts
- High latency
- Error rate spikes
- Resource constraints
- Certificate expiration
Info Alerts
- Configuration changes
- Policy updates
- Trust bundle updates
- Federation events

For detailed setup and configuration, see the Monitoring Guide.

Acknowledgments

SPIFFE/SPIRE community
Open source contributors
Security researchers
Early adopters and testers

Future Considerations

Workload identity is the foundation of runtime trust for autonomous systems. As software shifts from static services to AI agents that reason, delegate, and act across trust domains, the primitives in this repository become more important, not less.

The Agentic Identity Guide covers how to extend this architecture to cover AI agents today. The PQC Migration Guide covers preparing this infrastructure for the post-quantum transition NIST formalized in 2024.

These are not future problems. They are current engineering requirements.

Senior Engineer Review Guide

Critical Files to Review

1. Core Implementation Files

core/identity/manager.go - Core identity management logic
core/security/crypto.go - Cryptographic operations and key management
core/integration/kubernetes.go - Kubernetes integration implementation
services/identity-provider/main.go - Identity provider service implementation
services/policy-engine/evaluator.go - Policy evaluation engine
infrastructure/kubernetes/spire/server.yaml - SPIRE server configuration
infrastructure/kubernetes/spire/agent.yaml - SPIRE agent configuration

2. Security-Critical Files

core/security/tls.go - TLS configuration and certificate management
core/security/auth.go - Authentication and authorization logic
core/security/audit.go - Audit logging implementation
infrastructure/kubernetes/spire/secrets.yaml - Secret management
infrastructure/kubernetes/spire/network-policy.yaml - Network security policies

3. Configuration Files

infrastructure/kubernetes/spire/configmap.yaml - Core configuration
infrastructure/kubernetes/spire/trust-bundle.yaml - Trust domain configuration
infrastructure/kubernetes/spire/federation.yaml - Federation settings
infrastructure/kubernetes/monitoring/prometheus.yaml - Monitoring configuration

Essential Documentation Review

1. Architecture and Design

Architecture Guide - System design and component interactions
Security Best Practices - Security implementation details
PKI Guide - Certificate management and PKI concepts
API Reference - API design and implementation

2. Operational Documentation

Deployment Guide - Deployment architecture and procedures
Monitoring Guide - Observability and monitoring
Disaster Recovery Guide - Recovery procedures
Compliance Guide - Compliance implementation

Review Checklist

1. Security Architecture

Zero Trust principles implementation
Cryptographic implementation review
Authentication and authorization flows
Secret management approach
Network security controls
Audit logging implementation

2. Scalability and Performance

3. Reliability and Resilience

4. Integration Points

5. Operational Considerations

Key Areas of Focus

1. Security Implementation

Review cryptographic implementations for best practices
Verify secure key management and rotation
Check authentication and authorization flows
Validate audit logging and monitoring
Assess network security controls

2. Architecture Design

Evaluate component interactions
Review scalability considerations
Check fault tolerance mechanisms
Verify high availability design
Assess integration patterns

3. Operational Excellence

Review deployment procedures
Check monitoring implementation
Verify backup and recovery
Assess maintenance procedures
Validate configuration management

4. Compliance and Standards

Review security standards compliance
Check audit requirements
Verify documentation completeness
Assess policy implementation
Validate control effectiveness

Post-Review Actions

1. Documentation Updates

Update architecture documentation
Enhance security guidelines
Improve operational procedures
Add troubleshooting guides
Update compliance documentation

2. Implementation Improvements

Address security findings
Enhance scalability features
Improve monitoring capabilities
Strengthen disaster recovery
Optimize performance

3. Operational Enhancements

Improve deployment procedures
Enhance monitoring setup
Strengthen backup procedures
Optimize maintenance tasks
Update security controls
