@KubedAI

Kube-dAI

Building Smarter, Faster Data and AI Platforms on Kubernetes

Kube-dAI 🚀

Kubernetes + Data + AI = Cubed AI

Open-source blueprints for running high-performance data and AI workloads on Kubernetes.

What is this?

Kube-dAI is where I experiment with emerging tech, build benchmark tools, and create production-ready patterns for data and AI on Kubernetes. Think of it as a lab for scalable data infrastructure—mostly on AWS, always evolving.

What you'll find here:

  • Benchmark tools for Spark, GPU acceleration, and distributed compute
  • Infrastructure patterns using Terraform, Helm, and GitOps
  • Performance analysis for real-world workloads (TPC-DS, RAPIDS, shuffle services)
  • Operator utilities like Spark History Server integrations
  • Agentic AI tools for data platforms—troubleshooting agents, upgrade agents, and autonomous optimization for Spark, Kubernetes, and distributed systems

If you're running Apache Spark at scale, training models on Kubernetes, or just curious about what's next in cloud-native data—this is the place.

Projects

| Repository | Description | Status |
| --- | --- | --- |
| spark-rapids-on-kubernetes | GPU-accelerated Spark with RAPIDS on EKS | ✅ Live |
| spark-k8s-benchmarks | TPC-DS benchmark suite for Spark on K8s | ✅ Live |
| spark-history-server | Production-grade Helm chart for Spark History Server | ✅ Live |

More coming soon: Celeborn benchmarks, DRA experiments, agent orchestrators.

Tech Stack

Orchestration: Kubernetes (EKS), Karpenter, Argo CD
Data Processing: Apache Spark, RAPIDS, Velox, Celeborn
AI/ML: KServe, Ray, Triton Inference Server
IaC: Terraform, Crossplane, Helm
Observability: Prometheus, Fluent Bit, Spark UI

Get Started

Explore the projects above—each repository has detailed setup instructions, architecture diagrams, and deployment guides.

Contributing

Got ideas? Found a bug? Want to add a new benchmark?

  1. Fork the repo
  2. Create a feature branch
  3. Submit a PR

No bureaucracy—just useful contributions. Check individual repos for specific guidelines.

Learn More

📝 Blog posts on Medium
💬 Open an issue for questions or advanced use cases

License

Apache 2.0 — use it, modify it, ship it.


Disclaimer: Independent project. Not affiliated with AWS, Apache, or NVIDIA. All trademarks belong to their respective owners.

Popular repositories

  1. spark-history-server Public

    Helm Chart for deploying Spark history server in Amazon EKS for S3 Spark Event Logs

    Shell 28 12

  2. spark-rapids-on-kubernetes Public template

    Accelerating Data processing workloads on GPUs with Spark-RAPIDS

    HCL 6 1

  3. .github Public

  4. airflow-dags Public

    Sample DAGs repo to use with Apache Airflow GitSync feature.

    Python 7

  5. spark-k8s-benchmarks Public

    Scala

  6. eks-hybrid-azure-stack Public

    A reference implementation for running Amazon EKS in a hybrid multi-cloud configuration with Azure-based worker nodes. This project demonstrates how to extend EKS control plane capabilities across …

    Shell

Repositories

  • terraform-aws-emr-containers Public

    Terraform module for EMR on EKS virtual clusters with multi-tenancy, Pod Identity, scoped IAM, and CloudWatch logging

    HCL 0 Apache-2.0 0 0 0 Updated Feb 16, 2026
  • eks-hybrid-azure-stack Public

    A reference implementation for running Amazon EKS in a hybrid multi-cloud configuration with Azure-based worker nodes. This project demonstrates how to extend EKS control plane capabilities across cloud providers, enabling workloads to run on Azure compute while maintaining AWS EKS orchestration.

    Shell 0 Apache-2.0 0 0 0 Updated Feb 10, 2026
  • .github Public
    0 0 0 0 Updated Feb 9, 2026
  • spark-history-server Public

    Helm Chart for deploying Spark history server in Amazon EKS for S3 Spark Event Logs

    Shell 28 Apache-2.0 12 9 0 Updated Feb 9, 2026
  • spark-k8s-benchmarks Public
    Scala 0 Apache-2.0 0 0 0 Updated Jan 15, 2026
  • spark-rapids-on-kubernetes Public template

    Accelerating Data processing workloads on GPUs with Spark-RAPIDS

    HCL 6 Apache-2.0 1 9 0 Updated Oct 16, 2024
  • airflow-dags Public

    Sample DAGs repo to use with Apache Airflow GitSync feature.

    Python 0 7 0 0 Updated Jun 28, 2023
