Issue Description
The API key validation in the Python bindings (python/src/validation.rs) uses simplistic length-based checks that are insufficient for detecting invalid or malformed API keys. Different providers have different key formats and requirements, but the current validation only checks minimum length without validating format, structure, or provider-specific patterns.
Affected Components
- Python Layer: python/src/validation.rs - Validation logic
- LLM Configuration: python/src/llm/config.rs - Configuration creation
- Integration: All LLM provider configurations
Related Files
- python/src/validation.rs (lines 6-29) - Validation function
- python/src/llm/config.rs (lines 18-260) - All provider configuration methods that call validate_api_key
- core/src/llm/openai.rs (lines 23-43) - OpenAI provider implementation (no validation)
- core/src/llm/anthropic.rs (lines 22-32) - Anthropic provider implementation (no validation)
- core/src/llm/azure_openai.rs (lines 26-53) - Azure OpenAI provider implementation (no validation)
- tests/python_integration_tests/tests_validation.py (lines 14-100) - Tests expecting format validation
Code Snippets
Current Validation (Length-Only, No Format Checking)
// From python/src/validation.rs (lines 6-29)
pub(crate) fn validate_api_key(api_key: &str, provider: &str) -> PyResult<()> {
    if api_key.is_empty() {
        return Err(PyErr::new::<pyo3::exceptions::PyValueError, _>(format!(
            "{} API key cannot be empty",
            provider
        )));
    }
    let min_length = match provider.to_lowercase().as_str() {
        "openai" => 20,
        "anthropic" => 15,
        "huggingface" => 10,
        _ => 8,
    };
    if api_key.len() < min_length {
        return Err(PyErr::new::<pyo3::exceptions::PyValueError, _>(format!(
            "{} API key too short",
            provider
        )));
    }
    Ok(())
}
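To make the gap concrete, the length rule can be replicated without pyo3 and run against a few malformed inputs. This is only a sketch: `length_only_check` and `malformed_but_accepted` are hypothetical names, errors are plain `String`s rather than `PyErr`, and the sample keys are made up for illustration.

```rust
/// Pyo3-free replica of the length rule, for experimentation only.
/// Returns the same messages as plain `String`s instead of `PyErr`.
fn length_only_check(api_key: &str, provider: &str) -> Result<(), String> {
    if api_key.is_empty() {
        return Err(format!("{provider} API key cannot be empty"));
    }
    let min_length = match provider.to_lowercase().as_str() {
        "openai" => 20,
        "anthropic" => 15,
        "huggingface" => 10,
        _ => 8,
    };
    if api_key.len() < min_length {
        return Err(format!("{provider} API key too short"));
    }
    Ok(())
}

/// Made-up inputs that are clearly not OpenAI keys but are at least
/// 20 bytes long. Returns the ones the length rule accepts for "OpenAI".
fn malformed_but_accepted() -> Vec<&'static str> {
    vec![
        "this is not an api key at all",
        "hf_looks_like_another_provider",
        "sk-ant-wrong-provider-key-0000",
    ]
    .into_iter()
    .filter(|key| length_only_check(key, "OpenAI").is_ok())
    .collect()
}
```

All three sample strings are accepted for "OpenAI", including one containing spaces and two that carry another provider's prefix.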
How Validation is Called (All Providers)
// From python/src/llm/config.rs (examples from lines 18-19, 31-32, 50, etc.)
#[staticmethod]
fn openai(api_key: String, model: Option<String>) -> PyResult<Self> {
    validate_api_key(&api_key, "OpenAI")?; // ← Only length check, no format validation
    // ... rest of config creation
}

#[staticmethod]
fn anthropic(api_key: String, model: Option<String>) -> PyResult<Self> {
    validate_api_key(&api_key, "Anthropic")?; // ← Only length check, no format validation
    // ... rest of config creation
}
Core Layer Has No Validation
// From core/src/llm/openai.rs (lines 23-43)
impl OpenAiProvider {
    pub fn new(api_key: String, model: String) -> GraphBitResult<Self> {
        // ← No validation of api_key format or content
        let client = Client::builder()
            .timeout(std::time::Duration::from_secs(60))
            .build()
            .map_err(|e| {
                GraphBitError::llm_provider("openai", format!("Failed to create HTTP client: {e}"))
            })?;
        Ok(Self {
            client,
            api_key, // ← Accepted as-is, no validation
            model,
            base_url: "https://api.openai.com/v1".to_string(),
            organization: None,
        })
    }
}
Impact Assessment
Affected Areas:
- User Experience: Invalid keys are accepted during configuration, only failing at runtime when the first API call is made. Users might not discover configuration errors until they try to use the system.
- Error Detection: Errors occur late in the workflow execution (at first LLM call) rather than at configuration time. This delays error discovery and makes debugging harder.
- Debugging: Users receive cryptic authentication errors from the LLM provider (e.g., "Unauthorized", "Invalid API key") instead of clear validation messages that explain what format is expected.
- Developer Experience: Developers cannot quickly identify common mistakes like:
  - Using the wrong provider's API key (e.g., OpenAI key for Anthropic)
  - Typos in API keys
  - Incomplete or truncated keys
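- Test Coverage Gap: Integration tests expect format validation (e.g., test_invalid_api_key_error expects "invalid-format-key" to fail), but the current implementation does not validate formats. That 18-character string is rejected only where the provider's minimum length exceeds 18 (OpenAI's is 20); under the Anthropic, HuggingFace, or generic thresholds it is accepted.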
Real-World Example:
# This currently succeeds at config time but fails at runtime
# (the key is 25 characters long and has no "sk-" prefix)
config = LlmConfig.openai("invalid-format-key-123456", "gpt-4")  # ✅ Passes validation (length >= 20)
client = LlmClient(config)
response = client.complete("Hello")  # ❌ Fails with cryptic "Unauthorized" error
Expected Behavior:
# Should fail at config time with clear message
config = LlmConfig.openai("invalid-format-key-123456", "gpt-4")  # ❌ Should fail: "OpenAI API keys must start with 'sk-'"
Potential Solutions
Recommendation 1: Implement Provider-Specific Format Validation
Enhance the validate_api_key() function to check format patterns for each provider:
pub(crate) fn validate_api_key(api_key: &str, provider: &str) -> PyResult<()> {
    if api_key.is_empty() {
        return Err(PyErr::new::<pyo3::exceptions::PyValueError, _>(format!(
            "{} API key cannot be empty",
            provider
        )));
    }
    // (minimum length, required prefix, human-readable description) per provider
    let (min_length, expected_prefix, pattern_desc) = match provider.to_lowercase().as_str() {
        "openai" => (
            20,
            Some("sk-"),
            "OpenAI keys start with 'sk-' followed by alphanumeric characters",
        ),
        "anthropic" => (
            15,
            Some("sk-ant-"),
            "Anthropic keys start with 'sk-ant-' followed by alphanumeric characters",
        ),
        "huggingface" => (
            10,
            Some("hf_"),
            "HuggingFace keys start with 'hf_' followed by alphanumeric characters",
        ),
        "azure openai" => (8, None, "Azure OpenAI keys are typically hex strings or UUIDs"),
        // Note: the empty-key check above still rejects "", so this arm only
        // takes effect when a non-empty placeholder key is passed for Ollama.
        "ollama" => (0, None, "Ollama does not require an API key"),
        _ => (8, None, "Generic API key format"),
    };
    if api_key.len() < min_length {
        return Err(PyErr::new::<pyo3::exceptions::PyValueError, _>(format!(
            "{} API key too short (minimum {} characters). {}",
            provider, min_length, pattern_desc
        )));
    }
    if let Some(prefix) = expected_prefix {
        if !api_key.starts_with(prefix) {
            return Err(PyErr::new::<pyo3::exceptions::PyValueError, _>(format!(
                "{} API key has invalid format. {}",
                provider, pattern_desc
            )));
        }
    }
    Ok(())
}
Recommendation 2: Add Character Set Validation
Validate that API keys contain only expected characters (alphanumeric, hyphens, underscores):
- Reject keys with spaces, special characters, or control characters
- Catch common copy-paste errors (e.g., extra whitespace)
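A character set check along these lines might look like the sketch below. The function name `validate_key_charset` is hypothetical, errors are plain `String`s so the snippet compiles without pyo3, and the allowed set (ASCII letters, digits, '-', '_') is taken from the description above; it assumes no supported provider issues keys with other characters.

```rust
/// Reject keys containing anything other than ASCII letters, digits, '-' or '_'.
/// Hypothetical helper; the real function would return `PyResult<()>`.
fn validate_key_charset(api_key: &str, provider: &str) -> Result<(), String> {
    // Surrounding whitespace is the classic copy-paste mistake, so report it separately.
    if api_key.trim() != api_key {
        return Err(format!(
            "{provider} API key has leading or trailing whitespace"
        ));
    }
    // Find the first offending character so the message can point at its position.
    let bad = api_key
        .char_indices()
        .find(|(_, c)| !(c.is_ascii_alphanumeric() || *c == '-' || *c == '_'));
    match bad {
        Some((pos, c)) => Err(format!(
            "{provider} API key contains unexpected character {c:?} at byte offset {pos}"
        )),
        None => Ok(()),
    }
}
```

The error names only the single offending character and its offset, so the key itself never ends up in logs or tracebacks.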
Recommendation 3: Provide Detailed Error Messages
Include specific guidance in error messages:
- Show the expected format for the provider
- Suggest common mistakes (e.g., "Did you use the wrong provider's key?")
- Link to provider documentation
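One way to assemble such a message is sketched below. `guess_provider` and `format_error` are hypothetical helpers, the prefixes are the ones proposed in Recommendation 1, and `docs_url` is supplied by the caller because no particular documentation URL is assumed here.

```rust
/// Guess which provider a key was issued for, using the prefixes proposed in
/// Recommendation 1. Longer prefixes are listed first because "sk-ant-" also
/// starts with "sk-".
fn guess_provider(api_key: &str) -> Option<&'static str> {
    const KNOWN_PREFIXES: [(&str, &str); 3] = [
        ("sk-ant-", "Anthropic"),
        ("hf_", "HuggingFace"),
        ("sk-", "OpenAI"),
    ];
    KNOWN_PREFIXES
        .iter()
        .find(|(prefix, _)| api_key.starts_with(*prefix))
        .map(|(_, name)| *name)
}

/// Build a format error that states the expected pattern, adds a
/// wrong-provider hint when the prefix matches another provider, and ends
/// with a documentation pointer. The key itself is never echoed.
fn format_error(provider: &str, api_key: &str, expected: &str, docs_url: &str) -> String {
    let mut msg = format!("{provider} API key has invalid format. Expected: {expected}.");
    if let Some(other) = guess_provider(api_key) {
        if !other.eq_ignore_ascii_case(provider) {
            msg.push_str(&format!(
                " This key looks like it was issued by {other}; did you use the wrong provider's key?"
            ));
        }
    }
    msg.push_str(&format!(" See {docs_url} for details."));
    msg
}
```

Because an Anthropic-style key also satisfies the "sk-" prefix, the same guess could be used to warn when such a key is passed to the OpenAI configuration, a case the prefix check alone lets through.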
Recommendation 4: Add Optional Strict Mode
Provide a configuration option to enable/disable strict validation:
- Default: Strict validation enabled (fail fast at config time)
- Optional: Lazy validation (defer to first API call) for advanced use cases
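A minimal sketch of such a switch follows. `ValidationMode` and `validate_with_mode` are hypothetical names, and the choice to reject empty keys even in lazy mode is an assumption rather than something the issue specifies.

```rust
/// Hypothetical switch between fail-fast and deferred validation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ValidationMode {
    /// Check the key format at configuration time (the proposed default).
    #[default]
    Strict,
    /// Only reject empty keys; leave everything else to the provider.
    Lazy,
}

/// Validate `api_key` against an optional required prefix, honoring `mode`.
fn validate_with_mode(
    api_key: &str,
    expected_prefix: Option<&str>,
    mode: ValidationMode,
) -> Result<(), String> {
    // Assumption: an empty key is never useful, so it fails in both modes.
    if api_key.is_empty() {
        return Err("API key cannot be empty".to_string());
    }
    if mode == ValidationMode::Lazy {
        return Ok(());
    }
    match expected_prefix {
        Some(prefix) if !api_key.starts_with(prefix) => {
            Err(format!("API key must start with '{prefix}'"))
        }
        _ => Ok(()),
    }
}
```

Deriving `Default` with `Strict` as the default variant keeps fail-fast behavior for callers that never mention the mode, while proxies or custom gateways with non-standard keys can opt into `Lazy`.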
Recommendation 5: Add Validation Tests
Ensure comprehensive test coverage for format validation:
- Valid formats for each provider
- Invalid formats (wrong prefix, too short, invalid characters)
- Edge cases (empty, whitespace, special characters)
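The categories above lend themselves to a table-driven test. The sketch below uses a stand-in validator (`check_key`, hypothetical) with an OpenAI-style rule of prefix "sk-" and at least 20 bytes; the keys are made up, and the same table could be mirrored in the Python integration tests.

```rust
/// Stand-in for the validator under test: minimum length, prefix, and
/// an allowed character set of ASCII letters, digits, '-' and '_'.
fn check_key(api_key: &str, prefix: &str, min_len: usize) -> bool {
    api_key.len() >= min_len
        && api_key.starts_with(prefix)
        && api_key
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// (description, key, expected result) for prefix "sk-" and 20+ bytes.
const CASES: [(&str, &str, bool); 6] = [
    ("valid format", "sk-abcdefghijklmnopqrstuvwx", true),
    ("wrong prefix", "hf_abcdefghijklmnopqrstuvwx", false),
    ("too short", "sk-abc", false),
    ("empty", "", false),
    ("inner whitespace", "sk-abcdefghij klmnopqrstuvwx", false),
    ("special character", "sk-abcdefghijklmnopqrstuv$x", false),
];

/// Returns the descriptions of all cases whose outcome differs from the expectation.
fn failing_cases() -> Vec<&'static str> {
    CASES
        .iter()
        .filter(|(_, key, expected)| check_key(key, "sk-", 20) != *expected)
        .map(|(name, _, _)| *name)
        .collect()
}
```

Inside a `#[cfg(test)]` module, a single `assert!(failing_cases().is_empty())` would then report every mismatching case by name instead of stopping at the first failure.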
Current Strengths
- ✅ Empty key detection works correctly
- ✅ Length validation is provider-aware
- ✅ Validation happens at configuration time (not at runtime)
- ✅ Clear error messages for empty/short keys
Key Finding: The core layer (Rust) has NO validation - all validation happens in the Python bindings layer. This is appropriate since the Python layer is the user-facing API.
Notes
This validation enhancement should be non-breaking and should provide clear, actionable error messages to help users quickly identify and resolve configuration issues. The fix is localized to python/src/validation.rs and requires no changes to the core layer.