docs(readme): land ModelRouter prominently for the 0.7.8 release #464
Merged
…ffic wave

That DevOps Guy (91K subs) published a video featuring LLMKube on 2026-05-13, and a measurable star bump tracks the publish time. The README didn't mention ModelRouter, fail-closed semantics, or per-rule budgets at all, so any viewer landing on the repo was missing the headline new capability that ships in v0.7.8 (about to release).

Updates:
- New "0.7.8 just shipped" callout below The Problem so the version number a viewer sees matches the helm chart they're about to install.
- New top-level "Composition: ModelRouter" section between The Metal Agent and How Is This Different. Includes a worked example (strict pii rule + complex fallback + cloud-tier credentialsRef) drawn from the demo we use day-to-day. Three pull-out properties the audience cares about (fail-closed, per-rule timeouts, OpenAI-compatible streaming) called out in plain language.
- Comparison table grows three rows: hybrid local+cloud routing, fail-closed for regulated data, per-rule timeout budgets. These are the columns where LLMKube is alone vs vLLM / Ollama / KServe / LocalAI.
- "Versus newer adjacent projects" prose gains a LiteLLM entry explaining that ModelRouter composes with LiteLLM rather than replacing it.
- Features list grows a "Routing & policy" block parallel to the existing Inference / GPU / Operations blocks.
- TOC swaps "Architecture" for "ModelRouter" since the architecture mermaid is already inline.

Nothing removed. The Problem statement gains one sentence about mixed local + cloud routing being its own platform problem.

Signed-off-by: Christopher Maher <chris@mahercode.io>
Codecov Report: ✅ All modified and coverable lines are covered by tests.
What
README audit that brings the `ModelRouter` CRD, fail-closed semantics, and per-rule timeout budgets to the front of the page. Right now the README doesn't mention any of the major capabilities shipping in v0.7.8 (about to release via PR #443), so a viewer arriving from external coverage sees an outdated story.
Why
Marcel Dempers ("That DevOps Guy", 91.2K YouTube subscribers) published a video titled "Run LLMs on Kubernetes with LLMKube" on 2026-05-13 at 12:07 UTC. GitHub traffic data attributes today's six new stars to direct viewer follow-through (all six landed in the hours after the video published). Naveen (Kubernetes with Naveen, 10.8K X followers) amplified the video at 17:31 UTC.
A first-time viewer landing on the README needs to see what `ModelRouter` is and why `InferenceService` is now only half the story. Today's README doesn't surface either.
No issue link — this is README polish, not a bug or a feature.
How
Six surgical edits:
Nothing removed. The Problem statement gains one sentence about mixed local+cloud routing being its own platform problem.
All three internal links verified locally:
Checklist
Suggested merge order
Merge PR #443 (release-please `chore: release 0.7.8`) first so the chart version matches the README's 0.7.8 references. Then merge this PR. Both can land within a few minutes of each other.