Skip to content

[Workload]: dns-stress #160

Description

@mrhillsman

Workload Name

dns-stress

Workload Description

High-rate DNS resolution workload that generates sustained DNS query volume from inside a VM. Produces configurable query patterns — A, AAAA, SRV, PTR record lookups — against both cluster-internal names (Kubernetes Service DNS) and external domains, exercising the full DNS resolution path from a KubeVirt VM through the pod network to CoreDNS.

DNS from VMs follows a different path than DNS from containers. A container's /etc/resolv.conf points directly to the CoreDNS ClusterIP. A KubeVirt VM's DNS resolution goes through the VM's internal resolver, out through the pod network interface, to CoreDNS — an additional network hop that is often the first thing to break under load or misconfiguration. Network partners, DNS partners, and anyone validating OVN-Kubernetes or SDN behavior need this signal.

Tooling and Packages

  • Tool: dnsperf (DNS performance testing tool from DNS-OARC) or queryperf (BIND utility)
  • RPM packages: dnsperf (available in EPEL) or bind-utils (for queryperf and dig)
  • systemd service command: dnsperf -s <dns-server> -d /etc/virtwork/dns-queries.txt -l 0 -Q 100
    • -s <dns-server>: target DNS server (CoreDNS ClusterIP or VM's configured nameserver)
    • -d <queryfile>: file containing query names and types
    • -l 0: run indefinitely
    • -Q 100: target 100 queries per second
  • Configurable parameters:
    • dns-qps: target queries per second (default: 100)
    • dns-server: target nameserver (default: auto-detect from /etc/resolv.conf)
    • dns-query-mix: proportion of internal vs external queries (default: 70% internal, 30% external)
    • dns-record-types: query types to generate (default: A, AAAA, SRV)

VM Count Model

Single VM (like cpu, memory, disk)

Required Resources

  • Persistent storage (DataVolume)
  • Kubernetes Service (for inter-VM communication)
  • Kubernetes Secret (for credentials or config)
  • Additional CPU/memory beyond defaults
  • GPU or special device passthrough

Cloud-Init Details

packages:
  - dnsperf
  - bind-utils
write_files:
  - path: /etc/virtwork/dns-queries.txt
    content: |
      kubernetes.default.svc.cluster.local A
      kubernetes.default.svc.cluster.local AAAA
      kube-dns.kube-system.svc.cluster.local A
      kube-dns.kube-system.svc.cluster.local SRV
      openshift-apiserver.openshift-apiserver.svc.cluster.local A
      console.openshift-console.svc.cluster.local A
      example.com A
      example.com AAAA
      www.google.com A
      github.com A
      dns.google A
      _http._tcp.example.com SRV
      1.0.0.127.in-addr.arpa PTR
  - path: /usr/local/bin/virtwork-dns-stress.sh
    permissions: '0755'
    content: |
      #!/bin/bash
      set -euo pipefail
      DNS_SERVER=$(awk '/^nameserver/{print $2; exit}' /etc/resolv.conf)
      QPS="${DNS_QPS:-100}"
      QUERY_FILE="${DNS_QUERY_FILE:-/etc/virtwork/dns-queries.txt}"

      # dnsperf with -l 0 runs indefinitely
      exec dnsperf \
        -s "$DNS_SERVER" \
        -d "$QUERY_FILE" \
        -l 0 \
        -Q "$QPS" \
        -c 10 \
        -S 30
  - path: /etc/systemd/system/virtwork-dns-stress.service
    content: |
      [Unit]
      Description=Virtwork DNS stress workload
      After=network-online.target
      Wants=network-online.target
      [Service]
      Type=simple
      Environment=DNS_QPS=100
      ExecStart=/usr/local/bin/virtwork-dns-stress.sh
      Restart=always
      RestartSec=5
      [Install]
      WantedBy=multi-user.target
runcmd:
  - systemctl enable --now virtwork-dns-stress.service

Use Case

  • Network partners (CNI, SDN, OVN-Kubernetes): DNS is the first network service to degrade under load or misconfiguration. Sustained DNS query volume from VMs validates that the CNI correctly routes DNS traffic from the VM's network namespace through the pod network to CoreDNS. Partners need to prove their network product doesn't introduce DNS latency or packet loss on the VM-to-CoreDNS path.
  • DNS/Service discovery partners (external-dns, CoreDNS plugin vendors): Need sustained query volume to validate custom CoreDNS plugins, response caching, and query logging under VM-sourced load. The mix of internal (cluster.local) and external queries exercises both the Kubernetes plugin and upstream forwarding.
  • Security partners (DNS-based threat detection): Products that inspect DNS queries for threat indicators (DGA detection, C2 domain detection, DNS tunneling) need sustained DNS traffic to validate their detection pipelines. The configurable query file can include suspicious-looking domains for controlled testing.
  • Observability partners: DNS latency is a critical SLI. Partners building DNS dashboards or alerting need a consistent query source to validate p50/p95/p99 DNS latency measurement from VMs. dnsperf outputs per-interval statistics (queries sent, responses received, latency percentiles) that pair directly with partner monitoring.
  • Platform validation: Validates that CoreDNS scales correctly when VMs — not just pods — are generating DNS load. VMs often have different DNS caching behavior than containers (systemd-resolved, NetworkManager DNS caching), which can produce different query patterns.

Additional Context

  • dnsperf is the industry standard for DNS performance testing, maintained by DNS-OARC (DNS Operations, Analysis, and Research Center). It provides precise QPS control, latency statistics, and sustained operation — exactly what this workload needs.
  • The query file (dns-queries.txt) is the primary tuning knob. The default mix of cluster-internal and external domains is a starting point; partners can customize it to include their specific service names, longer query chains, or domains that exercise specific DNS features (CNAME chains, wildcards, NXDOMAIN).
  • The 70/30 internal/external split in the default query file reflects a realistic VM workload: most DNS queries resolve Kubernetes services (databases, APIs, other VMs), with a minority resolving external services (package repos, API endpoints, telemetry).
  • At 100 QPS from a single VM, this is a moderate DNS load. Scale by deploying multiple VMs (--vm-count 5 at 100 QPS each = 500 QPS total) or increasing dns-qps to 1000+ for stress testing. CoreDNS on a typical OpenShift cluster handles 5,000-10,000 QPS comfortably; this workload can scale to find the ceiling.
  • dnsperf's -S 30 flag prints statistics every 30 seconds — these are the latency and throughput numbers that appear in logs and can be scraped by monitoring agents.
  • Consider also providing a fallback mode using a shell loop with dig for environments where dnsperf is not available in EPEL: while true; do dig @$DNS_SERVER +short $(shuf -n1 /etc/virtwork/dns-queries.txt | cut -d' ' -f1); sleep 0.01; done. Lower precision but zero package dependencies beyond bind-utils.

Metadata

Metadata

Assignees

No one assigned

    Labels

    needs-triageIndicates an issue or PR lacks a `triage/foo` label and requires one.size/LDenotes a PR that changes 100-499 lines, ignoring generated files.workload-requestRequest for a new workload typeworkload/tier-2High impact, introduces new patterns or requires domain knowledge.

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions