Agent params #792

Merged
srivatsankrishnan merged 8 commits into NVIDIA:agent-dev from alexmanle:agent-params
Feb 3, 2026
Conversation

@alexmanle

Summary

Adds a simple interface for overriding agent hyperparameters.

Test Plan

Tested using AIConfigurator workload and GA agent.
Test scenario

Example TOML:

[agent_config]
population_size = 0
n_offsprings = 0
crossover_prob = 0.0
mutation_prob = 0.0
random_seed = 0

New terminal output:

2026-01-27 13:05:11,633 - INFO - Applying agent config overrides for 'ga': {'n_offsprings': 0, 'crossover_prob': 0.0, 'mutation_prob': 0.0, 'random_seed': 0}

@coderabbitai

coderabbitai bot commented Jan 27, 2026

Walkthrough

Added Pydantic agent configuration models, extended TestDefinition with an optional agent_config field, and implemented runtime validation of agent overrides in CLI handlers that validates per-agent config types and conditionally supplies validated kwargs to agent constructors.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Agent configuration models: `src/cloudai/models/agent_config.py` | New Pydantic module introducing AgentConfig (base, extra="forbid") and specialized models: GeneticAlgorithmConfig, BayesianOptimizationConfig, MultiArmedBanditConfig with typed, constrained optional fields and metadata. |
| Workload model update: `src/cloudai/models/workload.py` | Added agent_config: Optional[dict[str, Any]] = None to TestDefinition to carry agent override data. |
| Runtime validation & handler changes: `src/cloudai/cli/handlers.py` | Added validate_agent_overrides(agent_type, agent_config) -> dict[str, Any], imported ValidationError and agent config models, and updated agent construction flow to validate overrides per-agent type, log field-level ValidationError details, skip invalid overrides, and pass validated kwargs when present. |
| Miscellaneous: manifest_file, requirements.txt | Updated manifest/requirements to reflect new module/dependencies (lines added). |
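
As a rough illustration of the model layout summarized above, here is a sketch assuming Pydantic v2. The class names `AgentConfig` and `GeneticAlgorithmConfig` and the `extra="forbid"` setting come from the walkthrough; the field bounds and descriptions are illustrative assumptions, not the values used in `agent_config.py`.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class AgentConfig(BaseModel):
    """Base model: unknown keys are rejected instead of silently ignored."""

    model_config = ConfigDict(extra="forbid")

    random_seed: Optional[int] = Field(default=None, description="Seed for reproducibility")


class GeneticAlgorithmConfig(AgentConfig):
    """GA overrides; the bounds below are assumptions made for this sketch."""

    population_size: Optional[int] = Field(default=None, ge=0)
    n_offsprings: Optional[int] = Field(default=None, ge=0)
    crossover_prob: Optional[float] = Field(default=None, ge=0.0, le=1.0)
    mutation_prob: Optional[float] = Field(default=None, ge=0.0, le=1.0)


# Only the fields the user actually set become constructor kwargs.
cfg = GeneticAlgorithmConfig.model_validate({"population_size": 20, "crossover_prob": 0.9})
overrides = cfg.model_dump(exclude_none=True)

# A misspelled key is rejected because of extra="forbid".
try:
    GeneticAlgorithmConfig.model_validate({"populaton_size": 20})
    typo_rejected = False
except ValidationError:
    typo_rejected = True
```

Making every field optional with a `None` default and dumping with `exclude_none=True` means an agent's own defaults stay in force for anything the TOML does not mention.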

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐇 I nibble on configs, tidy and neat,
Seeds for GA, BO, and arms that compete,
Pydantic keeps rules, no surprises today,
Validated overrides guide each agent's play,
Hop, ship, and repeat — carrots all the way! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 1
❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Agent params' is vague and generic, using non-descriptive language that doesn't convey the specific nature of the changeset. | Use a more specific title that describes the main feature, such as 'Add agent hyperparameter override configuration' or 'Support agent config overrides in DSE jobs'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description clearly explains the purpose, implementation, and testing of the agent hyperparameter override feature with concrete examples. |


@greptile-apps

greptile-apps bot commented Jan 27, 2026

Greptile Overview

Greptile Summary

This PR adds a mechanism to override agent hyperparameters via TOML configuration. The implementation introduces a base AgentConfig class and validation logic to ensure type-safe parameter passing.

Key Changes:

  • Added AgentConfig base class with pydantic validation and extra="forbid" to prevent invalid keys
  • Added optional agent_config field to TestDefinition to store configuration overrides from TOML
  • Implemented validate_agent_overrides() function that validates config against agent's schema and returns kwargs for agent initialization
  • Modified agent instantiation to conditionally pass validated kwargs when config is provided
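
The last bullet, conditionally passing validated kwargs, can be sketched as follows. `DummyAgent` and `build_agent` are hypothetical names invented for this example; the real agent classes and the construction code in `handlers.py` are not reproduced here.

```python
from typing import Any, Optional


class DummyAgent:
    """Hypothetical stand-in for a registered agent class."""

    def __init__(self, env: str, population_size: int = 100, random_seed: int = 42) -> None:
        self.env = env
        self.population_size = population_size
        self.random_seed = random_seed


def build_agent(agent_class: type, env: str, agent_kwargs: Optional[dict[str, Any]]) -> Any:
    """Pass overrides only when validation produced some; otherwise keep defaults."""
    if agent_kwargs:
        return agent_class(env, **agent_kwargs)
    return agent_class(env)


default_agent = build_agent(DummyAgent, "env", None)
tuned_agent = build_agent(DummyAgent, "env", {"population_size": 8})
```

Keeping the no-override path as a plain `agent_class(env)` call means agents that accept no extra constructor arguments continue to work unchanged.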

Issues Found:

  • Error handling on lines 156-162 only catches ValidationError but not ValueError. When an agent doesn't support configuration and user provides agent_config, line 152 raises ValueError which won't be caught, causing an unhandled exception. Additionally, the error handler assumes the exception is ValidationError and calls e.errors(), which would fail with AttributeError if ValueError were caught without separate handling.
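
One possible shape for the fix is sketched below, assuming Pydantic v2. The registry `CONFIG_MODELS`, the wrapper `safe_overrides`, and the body of `validate_agent_overrides` are assumptions made for this example; only the function name and signature come from the PR. Note that Pydantic's `ValidationError` is itself a subclass of `ValueError`, so its handler has to come first if field-level details are wanted.

```python
import logging
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class AgentConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")
    random_seed: Optional[int] = None


# Hypothetical registry: "ga" has a config model, "legacy" does not.
CONFIG_MODELS: dict[str, Optional[type[AgentConfig]]] = {"ga": AgentConfig, "legacy": None}


def validate_agent_overrides(agent_type: str, agent_config: dict[str, Any]) -> dict[str, Any]:
    config_model = CONFIG_MODELS.get(agent_type)
    if config_model is None:
        raise ValueError(f"Agent '{agent_type}' does not support agent_config overrides")
    return config_model.model_validate(agent_config).model_dump(exclude_none=True)


def safe_overrides(agent_type: str, agent_config: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return validated kwargs, or None after logging why validation failed."""
    try:
        return validate_agent_overrides(agent_type, agent_config)
    except ValidationError as e:
        # Must come before ValueError: ValidationError subclasses it.
        for err in e.errors():
            field = ".".join(str(part) for part in err["loc"])
            logging.error("Invalid agent_config field '%s': %s", field, err["msg"])
    except ValueError as e:
        # Plain ValueError has no .errors(); log the message itself.
        logging.error("%s", e)
    return None
```

With the two handlers separated, an unsupported agent produces a logged error instead of an unhandled exception, and `e.errors()` is only called on exceptions that actually provide it.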

Confidence Score: 4/5

  • Safe to merge after fixing the exception handling bug in handlers.py
  • The PR implements a clean mechanism for agent parameter overrides with proper validation. However, there's a critical error handling bug where ValueError isn't caught, which could cause unhandled exceptions when users provide config for agents that don't support it. Once fixed, the implementation is solid.
  • Pay close attention to src/cloudai/cli/handlers.py - the error handling needs to be fixed before merge

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/cloudai/cli/handlers.py | Adds agent config validation and kwargs passing logic. Error handling on line 161 could raise uncaught ValueError if agent doesn't support config. |
| src/cloudai/configurator/base_agent.py | Adds optional AgentConfig class attribute to BaseAgent. Clean implementation. |
| src/cloudai/models/agent_config.py | New base configuration model for agent overrides with random_seed field. |
| src/cloudai/models/workload.py | Adds agent_config field to TestDefinition to store configuration overrides. |

Sequence Diagram

sequenceDiagram
    participant User
    participant handle_dse_job
    participant validate_agent_overrides
    participant Registry
    participant AgentClass
    participant Agent

    User->>handle_dse_job: Run DSE job with agent_config
    handle_dse_job->>Registry: Get agent_class for agent_type
    
    alt agent_config provided
        handle_dse_job->>validate_agent_overrides: validate_agent_overrides(agent_type, agent_config)
        validate_agent_overrides->>Registry: Get agents with config support
        validate_agent_overrides->>AgentClass: Check if agent.config exists
        
        alt Agent doesn't support config
            validate_agent_overrides-->>handle_dse_job: ValueError
            handle_dse_job-->>User: Error: agent doesn't support config
        else Agent supports config
            validate_agent_overrides->>AgentClass: config.model_validate(agent_config)
            
            alt Invalid config values
                AgentClass-->>validate_agent_overrides: ValidationError
                validate_agent_overrides-->>handle_dse_job: ValidationError
                handle_dse_job->>validate_agent_overrides: Get valid field descriptions
                handle_dse_job-->>User: Error with valid field list
            else Valid config
                validate_agent_overrides-->>handle_dse_job: agent_kwargs (validated)
                handle_dse_job->>AgentClass: __init__(env, **agent_kwargs)
                AgentClass-->>Agent: Instance with overrides
            end
        end
    else No agent_config
        handle_dse_job->>AgentClass: __init__(env)
        AgentClass-->>Agent: Instance with defaults
    end
    
    Agent->>User: Execute DSE with configured agent

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 3 comments

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/cloudai/cli/handlers.py`:
- Around line 198-201: The code silently drops a user-supplied agent_config when
no validator exists; update the branch that checks config_class =
config_class_map.get(agent_type) to emit a logging.warning (not debug) when an
agent_config was provided and will be ignored, referencing agent_type and
agent_config in the message so callers understand overrides were ignored; keep
returning {} but ensure the warning clearly states the agent_type and that
agent_config will be ignored (optionally include a truncated/summary of
agent_config) to aid user visibility.
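
The suggestion above might look roughly like this. It is a sketch only: `config_class_map` is named in the comment, but its contents here, the `summarize` helper, the logger name, and the function name `overrides_or_warn` are all assumptions for illustration.

```python
import logging
from typing import Any

logger = logging.getLogger("agent_overrides_sketch")

# Hypothetical map; in this sketch only "ga" has a validator.
config_class_map: dict[str, type] = {"ga": dict}


def summarize(agent_config: dict[str, Any], limit: int = 80) -> str:
    """Truncate the repr so a large config does not flood the log."""
    text = repr(agent_config)
    return text if len(text) <= limit else text[: limit - 3] + "..."


def overrides_or_warn(agent_type: str, agent_config: dict[str, Any]) -> dict[str, Any]:
    config_class = config_class_map.get(agent_type)
    if config_class is None:
        if agent_config:
            # Warning (not debug) so the user sees that overrides were dropped.
            logger.warning(
                "Agent '%s' has no config validator; ignoring agent_config: %s",
                agent_type,
                summarize(agent_config),
            )
        return {}
    return dict(agent_config)  # real code would validate against config_class here
```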

In `@src/cloudai/models/agent_config.py`:
- Around line 51-54: The algorithm Field currently accepts any string which
risks invalid runtime values; restrict and validate it by replacing the loose
Optional[str] with a strict set of allowed values (use a Python Enum or
typing.Literal for "ucb1", "ts" (thompson_sampling), "epsilon_greedy",
"softmax", "random") or add a Pydantic validator on the algorithm field in the
AgentConfig class to raise a clear validation error when an unsupported
algorithm is provided, and update the Field description to match the enforced
choices.
- Around line 42-45: The botorch_num_trials field currently allows any integer
but should only accept -1 or integers >= 1 (matching the semantic described and
the sibling sobol_num_trials). Add a Pydantic validator for botorch_num_trials
on the AgentConfig model (e.g., a method named validate_botorch_num_trials
decorated with `@validator`("botorch_num_trials")) that returns the value if it's
None or equals -1 or is >= 1, and raises a ValueError for other values (e.g., 0
or < -1).
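
Both suggestions for `agent_config.py` can be sketched as below, assuming Pydantic v2. The comment mentions `@validator`, which is the Pydantic v1 decorator; this sketch uses the v2 equivalent, `field_validator`. Which model each field belongs to is an assumption made here, and the allowed algorithm values are copied from the comment above.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError, field_validator


class MultiArmedBanditConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")
    # Literal restricts the field to the listed strings at validation time.
    algorithm: Optional[Literal["ucb1", "ts", "epsilon_greedy", "softmax", "random"]] = None


class BayesianOptimizationConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")
    botorch_num_trials: Optional[int] = None

    @field_validator("botorch_num_trials")
    @classmethod
    def validate_botorch_num_trials(cls, value: Optional[int]) -> Optional[int]:
        # Accept None, -1, or any integer >= 1; reject 0 and values below -1.
        if value is None or value == -1 or value >= 1:
            return value
        raise ValueError("botorch_num_trials must be -1 or >= 1")


def is_valid(model: type[BaseModel], data: dict) -> bool:
    """Small helper for the usage checks below."""
    try:
        model.model_validate(data)
        return True
    except ValidationError:
        return False
```

For example, `is_valid(BayesianOptimizationConfig, {"botorch_num_trials": 0})` is `False` while `-1` and `5` both pass, and an unlisted algorithm name such as `"greedy"` is rejected by the `Literal` type without any custom validator.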

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 2 comments

@alexmanle alexmanle marked this pull request as draft January 28, 2026 20:31
@alexmanle alexmanle changed the base branch from main to agent-dev January 29, 2026 17:51
@alexmanle alexmanle marked this pull request as ready for review January 31, 2026 00:34
@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 3 comments

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 2 comments

@srivatsankrishnan srivatsankrishnan left a comment

Let's also test it with BO using AIConfigurator. We will use BO a lot for most problems.

@srivatsankrishnan srivatsankrishnan merged commit 5afe35c into NVIDIA:agent-dev Feb 3, 2026
4 checks passed