[Workload]: object-storage #163

### Workload Name

object-storage

### Workload Description

S3-compatible object storage workload that performs sustained PUT, GET, DELETE, and LIST operations against an S3-compatible endpoint. It produces continuous object lifecycle traffic (uploading objects of configurable sizes, reading them back, listing bucket contents, and deleting old objects) to stress object storage backends from inside a KubeVirt VM.

This fills a fundamentally different storage niche than the existing `disk` workload. The `disk` workload tests *block storage* via the CSI driver — fio issues read/write syscalls against a mounted filesystem backed by a PersistentVolume. `object-storage` tests the *S3 API path* — HTTP PUT/GET requests over the network to an object storage service. These are completely different data planes: different protocols (POSIX vs HTTP), different access patterns (random block I/O vs whole-object upload/download), different backend implementations, and different partner products.

The primary targets are OpenShift Data Foundation (ODF) with its Ceph-backed S3 (via RADOS Gateway), MinIO, and cloud storage gateway partners. ODF is a core Red Hat product and validating its S3 performance from VMs is a gap today.

### Tooling and Packages

- Tool: `warp` (MinIO's S3 performance benchmark) or `s5cmd` (fast S3 client) or AWS CLI (`aws s3`)
- RPM packages: none — `warp` and `s5cmd` are single Go binaries; AWS CLI available via `pip3 install awscli`
- systemd service command: `warp mixed --host=<endpoint> --access-key=<key> --secret-key=<secret> --bucket=virtwork-bench --duration=0 --obj.size=1MiB --concurrent=16`
  - `--duration=0`: intended to run indefinitely (assumption: confirm that the pinned warp release treats a zero duration as unbounded)
  - `--obj.size=1MiB`: 1 MiB object size (configurable)
  - `--concurrent=16`: 16 concurrent operations
  - `mixed` mode: 45% GET, 30% STAT, 15% PUT, 10% DELETE (warp's default distribution; `mixed` does not issue LIST, which is covered by the separate `warp list` benchmark)
- Configurable parameters:
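  - `s3-endpoint`: S3-compatible endpoint (required; no default; must point to ODF, MinIO, or an external service). warp's `--host` expects `host:port` without a scheme, and HTTPS is selected with `--tls`, so a URL value has to be split before it reaches warp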
  - `s3-access-key` / `s3-secret-key`: credentials (via Secret)
  - `s3-bucket`: target bucket name (default: `virtwork-bench`)
  - `s3-object-size`: object size (default: `1MiB`)
  - `s3-concurrency`: concurrent operations (default: 16)
  - `s3-op-mix`: operation distribution (default: warp's `mixed` distribution of 45% GET, 30% STAT, 15% PUT, 10% DELETE)
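
One way `s3-op-mix` could reach warp is through the `mixed` subcommand's `--get-distrib`, `--stat-distrib`, `--put-distrib`, and `--delete-distrib` flags. The sketch below assumes a hypothetical comma-separated `op=percent` format for `s3-op-mix`; the function name and that format are illustrative and not part of virtwork or warp. It rejects any other operation name, including `list`, because `warp mixed` only distributes GET, STAT, PUT, and DELETE.

```shell
# Hypothetical helper: translate an op-mix string such as
# "get=45,stat=30,put=15,delete=10" into warp mixed distribution flags.
op_mix_to_warp_flags() {
  local rest="$1" pair op pct flags=""
  if [ -z "$rest" ]; then
    echo "op mix is empty" >&2
    return 1
  fi
  while [ -n "$rest" ]; do
    pair="${rest%%,*}"                 # text before the first comma
    case "$rest" in
      *,*) rest="${rest#*,}" ;;        # keep what follows the first comma
      *)   rest="" ;;                  # last pair consumed
    esac
    case "$pair" in
      *=*) ;;
      *) echo "expected op=percent, got: $pair" >&2; return 1 ;;
    esac
    op="${pair%%=*}"
    pct="${pair#*=}"
    case "$op" in
      get|stat|put|delete) ;;
      *) echo "unsupported operation in op mix: $op" >&2; return 1 ;;
    esac
    case "$pct" in
      ''|*[!0-9]*) echo "invalid percentage for $op: $pct" >&2; return 1 ;;
    esac
    flags="${flags:+$flags }--${op}-distrib=${pct}"
  done
  printf '%s\n' "$flags"
}
```

For example, `op_mix_to_warp_flags 'get=45,stat=30,put=15,delete=10'` prints `--get-distrib=45 --stat-distrib=30 --put-distrib=15 --delete-distrib=10`, which can be appended to the `warp mixed` command line.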

### VM Count Model

Single VM (like cpu, memory, disk). Each VM runs independently, and `--vm-count` replicates it to add load.

### Required Resources

- [ ] Persistent storage (DataVolume)
- [ ] Kubernetes Service (for inter-VM communication)
- [x] Kubernetes Secret (for credentials or config)
- [ ] Additional CPU/memory beyond defaults
- [ ] GPU or special device passthrough

The Secret holds S3 access key and secret key credentials. The S3 endpoint is typically a Kubernetes Service (e.g., `rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc.cluster.local`) or an external URL.
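
Because the endpoint may arrive either as a bare Service hostname or as a full URL, it may help to normalize it before it is handed to warp. The sketch below is one possible approach, assuming that warp's `--host` takes `host:port` without a scheme and that `--tls` switches the connection to HTTPS. The function name `endpoint_to_warp_args` is hypothetical.

```shell
# Hypothetical helper: turn an endpoint (URL or bare host[:port]) into
# warp connection flags. An https URL adds --tls; any path is dropped.
endpoint_to_warp_args() {
  local url="$1" rest tls=""
  case "$url" in
    https://*) rest="${url#https://}"; tls=" --tls" ;;
    http://*)  rest="${url#http://}" ;;
    *://*)     echo "unsupported endpoint scheme: $url" >&2; return 1 ;;
    *)         rest="$url" ;;          # already host[:port]
  esac
  rest="${rest%%/*}"                   # strip a trailing path, if any
  if [ -z "$rest" ]; then
    echo "endpoint has no host: $url" >&2
    return 1
  fi
  printf '%s%s\n' "--host=$rest" "$tls"
}
```

An in-cluster RGW Service name passes through unchanged apart from the `--host=` prefix, while an external `https://` URL also yields `--tls`.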

### Cloud-Init Details

```yaml
write_files:
  - path: /usr/local/bin/virtwork-object-storage.sh
    permissions: '0755'
    content: |
      #!/bin/bash
      set -euo pipefail
      S3_ENDPOINT="${S3_ENDPOINT:?S3_ENDPOINT is required}"
      S3_ACCESS_KEY="${S3_ACCESS_KEY:?S3_ACCESS_KEY is required}"
      S3_SECRET_KEY="${S3_SECRET_KEY:?S3_SECRET_KEY is required}"
      S3_BUCKET="${S3_BUCKET:-virtwork-bench}"
      S3_OBJ_SIZE="${S3_OBJ_SIZE:-1MiB}"
      S3_CONCURRENCY="${S3_CONCURRENCY:-16}"

      exec /usr/local/bin/warp mixed \
        --host="$S3_ENDPOINT" \
        --access-key="$S3_ACCESS_KEY" \
        --secret-key="$S3_SECRET_KEY" \
        --bucket="$S3_BUCKET" \
        --obj.size="$S3_OBJ_SIZE" \
        --concurrent="$S3_CONCURRENCY" \
        --duration=0 \
        --autoterm \
        --noclear
  - path: /etc/systemd/system/virtwork-object-storage.service
    content: |
      [Unit]
      Description=Virtwork S3 object storage workload
      After=network-online.target
      Wants=network-online.target
      [Service]
      Type=simple
      EnvironmentFile=/etc/virtwork/s3-credentials
      ExecStart=/usr/local/bin/virtwork-object-storage.sh
      Restart=always
      RestartSec=10
      [Install]
      WantedBy=multi-user.target
  - path: /etc/virtwork/s3-credentials
    permissions: '0600'
    content: |
      S3_ENDPOINT=<from-config>
      S3_ACCESS_KEY=<from-secret>
      S3_SECRET_KEY=<from-secret>
      S3_BUCKET=virtwork-bench
      S3_OBJ_SIZE=1MiB
      S3_CONCURRENCY=16
runcmd:
  # -f makes curl fail on an HTTP error instead of saving the error page as the binary
  - curl -fsSL -o /usr/local/bin/warp https://github.com/minio/warp/releases/download/v1.1.4/warp_Linux_x86_64
  - chmod +x /usr/local/bin/warp
  - systemctl enable --now virtwork-object-storage.service
```

### Use Case

- **ODF (OpenShift Data Foundation) validation:** ODF provides S3-compatible object storage via Ceph RADOS Gateway. Validating S3 throughput and latency from VMs is a gap — most ODF testing comes from pods. VMs accessing ODF's S3 endpoint exercise a different network path (VM pod network → Service → RGW pod) and are the access pattern that enterprise customers migrating from VMware will use.
- **MinIO partners:** MinIO is widely deployed on OpenShift as an alternative object store. Partners need sustained S3 traffic to validate throughput, multi-part upload handling, and consistency guarantees under load from VMs.
- **Cloud storage gateway partners (NetApp StorageGRID, Scality, Cloudian):** These products provide S3-compatible interfaces to enterprise storage. Partners need to validate that their gateway handles sustained object operations from VM-based applications — the typical deployment pattern when migrating legacy applications that use S3 libraries.
- **Backup/DR partners:** Many backup products (Velero, Cohesity, Commvault) write backup data to S3-compatible storage. This workload simulates the I/O pattern of a backup job — large sequential PUTs followed by periodic GETs for restore validation and DELETEs for retention policy enforcement.
- **Data pipeline partners:** Applications running in VMs that produce data (logs, telemetry, ETL output) often write to object storage. Partners building data pipeline products need to validate that their S3 ingestion handles sustained PUT traffic from VM-based producers.

### Additional Context

- **Endpoint configuration is required:** Unlike other workloads that are self-contained, `object-storage` requires an external S3 endpoint. The implementation should fail fast with a clear error if `s3-endpoint` is not configured. Consider detecting ODF's default RGW endpoint automatically if ODF is installed on the cluster.
- **warp vs alternatives:**
  - `warp` (MinIO): purpose-built S3 benchmark, single binary, excellent statistics output, supports mixed workloads. Preferred for initial implementation.
  - `s5cmd`: fast S3 client, good for simple PUT/GET loops, less configurable workload profiles.
  - AWS CLI: universally available but slower and less suitable for sustained benchmarking.
  - `cosbench` (Intel): comprehensive but heavyweight (Java, requires orchestrator). Overkill for a VM-based workload.
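- **Bucket lifecycle:** The `--noclear` flag tells warp not to clear the bucket before or after a run. Combined with `Restart=always`, this means objects accumulate across service restarts. The `--autoterm` flag lets warp end a run early once it considers the results stable, which works against indefinite operation: when it triggers, warp exits and systemd restarts the service after `RestartSec`. Drop `--autoterm` if uninterrupted traffic is required. During a run, the mixed mode's DELETE operations reclaim some space, but bucket size should still be monitored on very long runs.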
- **Object size spectrum:** The default 1MiB is a balanced choice. For backup/DR validation, use larger objects (64MiB-1GiB). For metadata-heavy workloads (many small files), use smaller objects (4KiB-64KiB). Configurable via `s3-object-size`.
- **Multi-VM scaling:** At `--vm-count 5` with 16 concurrent ops each, this produces 80 concurrent S3 operations — enough to stress most object storage deployments. The operations are naturally distributed across the object namespace (warp uses random object keys), so multiple VMs don't create hot-spot contention.
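- **Credential management:** S3 credentials should be injected via a Kubernetes Secret, written to `/etc/virtwork/s3-credentials` as an `EnvironmentFile`, and never logged. This follows the same pattern as SSH credentials in the existing workloads. Passing `--secret-key` on the command line exposes the key in the VM's process list; warp also reads `WARP_ACCESS_KEY` and `WARP_SECRET_KEY` from the environment, which avoids that exposure.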
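
The fail-fast and credential-handling points above could be combined in a single step that validates the required settings and then writes the `EnvironmentFile`. This is a minimal sketch, not the virtwork implementation: the function names `require_value` and `write_s3_env_file` are hypothetical, and it assumes the values contain no newlines or characters that systemd's `EnvironmentFile` parser would treat specially.

```shell
# Hypothetical helper: reject a missing value or an unsubstituted
# placeholder such as "<from-secret>".
require_value() {
  case "$2" in
    ''|'<'*'>')
      echo "error: $1 is not configured" >&2
      return 1 ;;
  esac
}

# Validate the required settings, then write the EnvironmentFile so that only
# its owner can read it. Values go to the file, never to stdout or stderr.
write_s3_env_file() {
  local path="$1"
  require_value S3_ENDPOINT   "${S3_ENDPOINT:-}"   || return 1
  require_value S3_ACCESS_KEY "${S3_ACCESS_KEY:-}" || return 1
  require_value S3_SECRET_KEY "${S3_SECRET_KEY:-}" || return 1
  (
    umask 077            # new files are created as 0600
    rm -f "$path"        # drop any older file with looser permissions
    {
      printf 'S3_ENDPOINT=%s\n'    "$S3_ENDPOINT"
      printf 'S3_ACCESS_KEY=%s\n'  "$S3_ACCESS_KEY"
      printf 'S3_SECRET_KEY=%s\n'  "$S3_SECRET_KEY"
      printf 'S3_BUCKET=%s\n'      "${S3_BUCKET:-virtwork-bench}"
      printf 'S3_OBJ_SIZE=%s\n'    "${S3_OBJ_SIZE:-1MiB}"
      printf 'S3_CONCURRENCY=%s\n' "${S3_CONCURRENCY:-16}"
    } > "$path"
  )
}
```

Error messages name the missing variable but never print its value, so a misconfigured Secret does not leak partial credentials into logs.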
