More robust bench execution for Dynamo over k8s#793
Conversation
📝 WalkthroughWalkthroughThe change replaces in-cluster Kubernetes API streaming with a subprocess.run call executing kubectl exec directly. Command output is written to genai_perf.log with enhanced error handling, and exception handling is broadened from specific API exceptions to general exceptions with debug-level logging. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip 🧪 Unit Test Generation v2 is now available!We have significantly improved our unit test generation capabilities. To enable: Add this to your reviews:
finishing_touches:
unit_tests:
enabled: trueTry it out by using the Have feedback? Share your thoughts on our Discord thread! Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile OverviewGreptile SummaryReplaced Kubernetes Python Stream API with Key Changes:
Issues Found:
Confidence Score: 3/5
Important Files Changed
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/cloudai/systems/kubernetes/kubernetes_system.py`:
- Around line 316-324: The current subprocess.run call in kubernetes_system.py
(the try block around kubectl_exec_cmd) only logs result.returncode and swallows
failures; change it to either pass check=True to subprocess.run or explicitly
check result.returncode and raise a subprocess.CalledProcessError when non-zero,
and re-raise TimeoutExpired and other exceptions instead of only logging them so
failures propagate; update the except block to log and then raise the caught
exception (so _genai_perf_completed is only set when the command actually
succeeds).
- Around line 313-316: The kubectl subprocess command built in kubectl_exec_cmd
does not pass the resolved kubeconfig, so kubectl may target the wrong cluster
when self.kube_config_path differs from the default; update the construction of
kubectl_exec_cmd (the list used in subprocess.run for running genai_perf_cmd) to
include the flag "--kubeconfig" followed by self.kube_config_path (or its
resolved path used by load_kube_config()) before the "exec" invocation so the
subprocess uses the same kubeconfig as the API client.
Summary
Stream API can be sensitive to network issues, delays, broken ping, etc. Using
kubectlis more robust for this case.Test Plan
Additional Notes
—