SageSched: Intelligent LLM Request Scheduler with Workload Prediction — QoS-aware dual-queue scheduling for black-box LLM APIs (OpenAI/Azure/Doubao/Gemini)
Updated May 18, 2026 - Python
Research paper and technical notes on the microservice ecosystem
Self-adaptive data layout for distributed joins
Experimental pipeline for a simulation-based comparison of ML-driven auto-scaling policies (LSTM and Random Forest) against Kubernetes HPA, using the Alibaba Cluster Trace 2018 dataset. Published at UTP Student and Doctoral Scientific Session 2026.