Skip to content

Unify environment handling in Python#55

Open
xCatG wants to merge 1 commit intomainfrom
fix/unify-env-handling
Open

Unify environment handling in Python#55
xCatG wants to merge 1 commit intomainfrom
fix/unify-env-handling

Conversation

@xCatG
Copy link
Owner

@xCatG xCatG commented Sep 3, 2025

This PR consolidates environment detection and usage across the Python backend.\n\nKey changes:\n- Add role_play.common.environment with a single resolver (CONFIG_FILE > ENV > ENVIRONMENT > dev) and synonym mapping (development->dev, production->prod).\n- Use common.models.Environment everywhere; remove duplicate enum in server/config_loader.py.\n- Delegate server.config.get_config() to server.config_loader.get_config().\n- Update server/dependencies.py, common/storage_factory.py to use the unified resolver.\n- Align logging and health endpoint to report consistent environment names.\n- Guard PCM audio logging strictly by Environment.PROD.\n\nWhy:\n- Avoids mismatches between storage selection and app config.\n- Eliminates duplicate enums/naming drift.\n- Provides predictable precedence for env sources.\n\nTesting plan:\n- make run-local-docker DATA_DIR=...; curl localhost:8080/health shows environment dev|beta|prod per ENV/ENVIRONMENT/CONFIG_FILE.\n- Verify logging format switches to structured JSON in non-dev environments.\n- Ensure file storage is rejected in beta/prod but allowed in dev per storage_factory restrictions.\n\nDocs follow-up suggested to reference the resolver for Python code.

…ver role_play.common.environment with canonical enum mapping\n- Switch config_loader, dependencies, storage_factory to unified resolver\n- Guard PCM logs via Environment.PROD only\n- BaseServer health/env reporting via resolver\n- Delegate server.config.get_config to config_loader\n- Align logging_config to use environment_name/resolve_environment\n\nThis removes duplicate Environment enums and inconsistent ENV/ENVIRONMENT parsing.
@claude
Copy link

claude bot commented Sep 3, 2025

Claude finished @xCatG's task —— View job


PR Review: Unify Environment Handling

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Analyze new environment.py module
  • Review changes to existing modules
  • Check for potential bugs and security issues
  • Assess performance implications
  • Evaluate test coverage needs
  • Provide comprehensive feedback

Summary

This PR successfully consolidates environment detection logic by introducing a unified resolver with clear precedence rules. The changes eliminate duplicate Environment enums and inconsistent parsing logic across the codebase.

Code Quality & Best Practices ✅

Strengths:

  • Clear Architecture: New common.environment module follows single responsibility principle
  • Consistent API: resolve_environment(), environment_name(), get_environment_info() provide intuitive interfaces
  • Good Documentation: Comprehensive docstrings and module-level documentation explain precedence rules
  • Backwards Compatibility: Existing code patterns maintained while centralizing logic

Minor Improvements Suggested:

  • Consider making parse_environment_str() more explicit about fallback behavior in logs/warnings

Potential Bugs & Issues ⚠️

Critical Issue: Regex Pattern Bug

File: src/python/role_play/common/environment.py:65

m = re.search(r"/(\\w+)\\.yaml$", config_file)

Problem: Double-escaped backslashes in regex pattern will not match correctly. Should be:

m = re.search(r"/(\w+)\.yaml$", config_file)

Minor Issues:

  1. Silent Fallback: parse_environment_str() silently falls back to DEV on invalid input, which could mask configuration errors
  2. Import Safety: Some modules have inline imports to avoid circular dependencies - this is handled correctly

Security Considerations ✅

Good Security Practices:

  • Production Guards: PCM audio logging properly restricted to non-production environments using unified resolver
  • Environment Validation: Consistent environment detection prevents mismatches between storage and application configuration
  • Fail-Safe Defaults: Defaults to DEV environment when detection fails, which is safer than assuming production

Performance Implications ✅

Minimal Performance Impact:

  • Functions are lightweight with simple string operations
  • Environment resolution happens once at startup in most cases
  • No caching needed since environment is static during runtime
  • Regex compilation only occurs when CONFIG_FILE is used

Test Coverage - Action Required ❗

Missing Test Coverage:

  • No unit tests for the new environment.py module
  • Functions like resolve_environment(), parse_environment_str(), get_environment_info() are untested
  • Environment precedence logic needs comprehensive test coverage

Recommended Test Cases:

# test_environment.py
def test_parse_environment_str_canonical():
    assert parse_environment_str("dev") == Environment.DEV
    assert parse_environment_str("prod") == Environment.PROD

def test_parse_environment_str_synonyms():
    assert parse_environment_str("development") == Environment.DEV
    assert parse_environment_str("production") == Environment.PROD

def test_resolve_environment_precedence():
    # Test CONFIG_FILE > ENV > ENVIRONMENT > default
    pass

def test_config_file_regex_extraction():
    # Test the regex pattern after fixing the bug
    pass

Architecture Alignment ✅

Follows Repository Patterns:

  • Aligns with layered architecture (common → server)
  • Uses standard Pydantic models from common.models
  • Maintains separation of concerns
  • Consistent with existing error handling patterns

Integration Impact ✅

Well-Integrated Changes:

  • All consuming modules updated consistently
  • Health endpoint reports unified environment names
  • Logging configuration uses unified resolver
  • Storage factory properly integrates with new resolver

Recommendations

High Priority (Fix Before Merge):

  1. Fix regex pattern in environment.py:65 - critical bug that breaks CONFIG_FILE parsing
  2. Add unit tests for environment.py module - this is core infrastructure that needs test coverage

Medium Priority (Consider for Future):

  1. Add warning logs when falling back to DEV environment on invalid input
  2. Add integration tests for end-to-end environment detection scenarios

Low Priority:

  1. Consider adding environment validation decorator for functions that should only run in specific environments

Verdict: ✅ Approve with Required Changes

This is a well-architected solution that successfully unifies environment handling. The regex bug needs immediate fixing, and test coverage is essential for this core functionality. Once these issues are addressed, this will significantly improve the codebase's maintainability and consistency.


@xCatG
Copy link
Owner Author

xCatG commented Sep 6, 2025

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request does a great job of unifying environment detection and handling across the Python backend. By centralizing the logic in the new role_play.common.environment module, you've eliminated duplicate code and made the configuration more predictable and less error-prone. The updates to logging, storage, and server components to use this new resolver are well-executed.

I've found a couple of issues in the new implementation that could lead to bugs in environment detection. My review includes suggestions to fix these to make the new module more robust. Overall, this is a solid and valuable refactoring.

if config_file:
# Try to extract the environment from the filename
# e.g., /app/config/beta.yaml -> beta
m = re.search(r"/(\\w+)\\.yaml$", config_file)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The regular expression used to extract the environment from the config filename is incorrect and not robust. The double backslashes \\ in the raw string will be treated as literal backslashes, which will cause the pattern to fail to match. Additionally, using a regex that matches on the full path is brittle as it assumes a specific directory structure.

A more robust approach is to operate on the base name of the file. An even better, more Pythonic solution would be to use pathlib.Path(config_file).stem to avoid regex altogether.

Suggested change
m = re.search(r"/(\\w+)\\.yaml$", config_file)
m = re.search(r"(\w+)\.yaml$", os.path.basename(config_file))

try:
env_enum = resolve_environment()
else:
env_enum = Environment(environment)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

When an environment string is passed to get_config, it's used directly to initialize the Environment enum. This will raise a ValueError if a synonym like "development" or "production" is passed, as the enum constructor only accepts the canonical values ("dev", "beta", "prod").

You should use the parse_environment_str function from the new environment module to correctly handle synonyms and invalid values. You'll also need to update the import on line 19 to bring in this function:
from ..common.environment import resolve_environment, parse_environment_str

Suggested change
env_enum = Environment(environment)
env_enum = parse_environment_str(environment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant