diff --git a/docs/Cross_Endpoint_Routing.md b/docs/Cross_Endpoint_Routing.md index 6f2a0478..4df9ddc2 100644 --- a/docs/Cross_Endpoint_Routing.md +++ b/docs/Cross_Endpoint_Routing.md @@ -174,7 +174,7 @@ Each deployed function is wrapped with a production wrapper that: Cross-endpoint calls require authentication. Flash handles this automatically: -1. The deploying environment sets `RUNPOD_API_KEY` as an env var on each endpoint +1. Flash injects `RUNPOD_API_KEY` as an env var on endpoints where `makes_remote_calls=True` 2. At runtime, the wrapper reads `RUNPOD_API_KEY` from the environment 3. All outbound calls to sibling endpoints include the API key diff --git a/docs/Deployment_Architecture.md b/docs/Deployment_Architecture.md index d0f64afc..3c9243a6 100644 --- a/docs/Deployment_Architecture.md +++ b/docs/Deployment_Architecture.md @@ -143,7 +143,7 @@ flash deploy --env production │ │ ├── If new: create endpoint via GraphQL API │ │ ├── If exists + config drift: update endpoint │ │ └── If exists + no drift: skip - │ └── Set environment variables on each endpoint + │ └── Set env vars on each endpoint (explicit env={} + system vars like RUNPOD_API_KEY) │ ├── 4. Register with State Manager │ └── Store endpoint IDs for cross-endpoint routing @@ -186,7 +186,7 @@ When deploying to an environment that already has endpoints, Flash compares the When `flash deploy` provisions endpoints: -1. Each endpoint gets `RUNPOD_API_KEY` injected as an env var +1. Endpoints with `makes_remote_calls=True` get `RUNPOD_API_KEY` injected automatically 2. Each endpoint gets the `flash_manifest.json` included in its artifact 3. The State Manager stores `{environment_id, resource_name} -> endpoint_id` 4. At runtime, the `ServiceRegistry` uses the manifest + State Manager to route calls diff --git a/docs/Flash_SDK_Reference.md b/docs/Flash_SDK_Reference.md index 31c65d2f..d3308a0f 100644 --- a/docs/Flash_SDK_Reference.md +++ b/docs/Flash_SDK_Reference.md @@ -48,7 +48,7 @@ Endpoint( | `accelerate_downloads` | `bool` | `True` | Enable accelerated downloads. | | `volume` | `NetworkVolume` or list | `None` | Network volume(s) for persistent storage. One volume per datacenter. | | `datacenter` | `DataCenter`, list, `str`, or `None` | `None` | Datacenter(s) to deploy into. `None` means all available DCs. Accepts a single value, a list, or string DC IDs. CPU endpoints must use DCs in `CPU_DATACENTERS`. | -| `env` | `dict[str, str]` | `None` | Environment variables for the endpoint. | +| `env` | `dict[str, str]` | `None` | Environment variables for the deployed endpoint. This is the only way to pass env vars to deployed workers; `.env` files are not carried to endpoints. | | `gpu_count` | `int` | `1` | GPUs per worker. | | `execution_timeout_ms` | `int` | `0` | Max execution time in ms. 0 = no limit. | | `flashboot` | `bool` | `True` | Enable Flashboot fast startup. | diff --git a/docs/LoadBalancer_Runtime_Architecture.md b/docs/LoadBalancer_Runtime_Architecture.md index 301cd3b1..dca52563 100644 --- a/docs/LoadBalancer_Runtime_Architecture.md +++ b/docs/LoadBalancer_Runtime_Architecture.md @@ -26,7 +26,7 @@ graph TD - Entrypoint: Loads manifest and starts FastAPI server on port 8000 - Runpod exposes this via HTTPS endpoint URL - Health check: Runpod polls `/ping` every 30 seconds -- Environment: `RUNPOD_API_KEY` injected for outgoing cross-endpoint calls +- Environment: `FLASH_MODULE_PATH` injected automatically, `RUNPOD_API_KEY` injected when `makes_remote_calls=True`, plus any explicit `env={}` vars ### What Gets Deployed diff --git a/docs/Resource_Config_Drift_Detection.md b/docs/Resource_Config_Drift_Detection.md index 6639ce77..3edafce5 100644 --- a/docs/Resource_Config_Drift_Detection.md +++ b/docs/Resource_Config_Drift_Detection.md @@ -82,15 +82,14 @@ Changes to these `Endpoint` parameters trigger drift detection and endpoint upda | `flashboot=` | `flashboot` | Yes | | `volume=` | `networkVolume` | Yes | | `template=` | `template` | Partially (template is a runtime field for some resource types) | -| `env=` | `env` | **No** (excluded from hash) | +| `env=` | `env` | **Conditionally** -- included when non-None for both `ServerlessResource` and `CpuServerlessEndpoint` (via `exclude_none`) | | `name=` | `name` | **No** (identity only) | -### Why env is Excluded +### How env Affects the Hash -Environment variables are excluded from the hash because: -- Different processes may load `.env` files with different values -- Changing env vars shouldn't trigger a full endpoint redeploy -- Env vars are updated separately via the API +For both `ServerlessResource` and `CpuServerlessEndpoint`, `env` defaults to `None` and is excluded via `exclude_none=True`. When a user sets `env={"HF_TOKEN": "..."}`, it IS included in the hash and triggers drift detection. + +`.env` files only populate `os.environ` for CLI and local dev; they are not carried to deployed endpoints. ## CPU LoadBalancer Special Case diff --git a/docs/plans/2026-03-05-env-separation-design.md b/docs/plans/2026-03-05-env-separation-design.md new file mode 100644 index 00000000..bae6fee8 --- /dev/null +++ b/docs/plans/2026-03-05-env-separation-design.md @@ -0,0 +1,144 @@ +# Design: Separate .env from Resource Deployment Env + +**Date:** 2026-03-05 +**Status:** Approved +**Branch base:** `deanq/ae-1549-env-vars-from-cwd-first` + +## Problem + +`ServerlessResource.env` defaults to `get_env_vars()`, which reads the entire `.env` file via `dotenv_values()`. Every key-value pair from `.env` (HF_TOKEN, WANDB_API_KEY, dev-only vars, etc.) gets baked into the manifest and sent to RunPod's `saveTemplate` mutation -- even if the user only intended those vars for local CLI usage. + +This causes: +- Platform-injected vars (`PORT`, `PORT_HEALTH`) overwritten on template updates +- False config drift from runtime var injection into `self.env` +- User confusion about what actually reaches deployed workers +- The entire class of bugs addressed in the ae-1549 branch + +## Solution + +Clean separation between two concerns: + +1. **`.env` = CLI/runtime only.** Populates `os.environ` via `load_dotenv()` at import time. Used by `get_api_key()`, CLI commands, local dev server. Never auto-carried to deployed endpoints. + +2. **Resource `env={}` = explicit deploy-time vars.** Users declare exactly what goes to each endpoint. Flash injects runtime vars (`RUNPOD_API_KEY`, `FLASH_MODULE_PATH`) into `template.env` separately via existing `_inject_runtime_template_vars()`. + +3. **Deploy-time env preview table.** Before provisioning, render a Rich table per resource showing all env vars (user-declared + flash-injected). Secret masking applied. + +## Detailed Changes + +### Core: `env` field default + +- `ServerlessResource.env` default changes from `Field(default_factory=get_env_vars)` to `Field(default=None)` +- Delete `get_env_vars()` function in `serverless.py` +- Delete `EnvironmentVars` class and `environment.py` file entirely +- `load_dotenv(find_dotenv(usecwd=True))` in `__init__.py` stays unchanged + +### Manifest pipeline + +- `_extract_deployment_config` in `manifest.py`: reads `resource.env` as-is (now `None` or explicit dict). If `None` or otherwise falsy, the manifest omits the `"env"` key. +- Remove the existing `RUNPOD_API_KEY` stripping logic -- it won't be in user env anymore, and runtime injection handles it via `_inject_runtime_template_vars()`. + +### Template creation + +- `serverless.py:_create_new_template`: change `env=KeyValuePair.from_dict(self.env or get_env_vars())` to `env=KeyValuePair.from_dict(self.env or {})` + +### Deploy-time env preview + +New functionality in deploy command (either in `deploy.py` or a new `cli/utils/env_preview.py`): + +- Collect final env per resource: user-declared env + flash-injected runtime vars +- Render Rich table before proceeding with deployment +- Flash-injected vars labeled with `(injected by flash)` suffix +- Mask values where key matches `KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL` pattern: show first 6 chars + `****` +- Show all other values in full + +Example output: +``` +Resource: my-gpu + Environment Variables: + HF_TOKEN = hf_abc...**** + MODEL_ID = llama-3 + RUNPOD_API_KEY = rp_***...**** (injected by flash) + FLASH_MODULE_PATH = app.model (injected by flash) + +Resource: my-cpu + Environment Variables: + (none) +``` + +### Unchanged + +- `load_dotenv()` in `__init__.py` -- `os.environ` population for CLI +- `get_api_key()` in `credentials.py` -- credential resolution (env -> credentials.toml) +- `_inject_runtime_template_vars()` -- runtime var injection into `template.env` +- `skip_env` logic in `update()` -- platform var preservation (`PORT`, `PORT_HEALTH`) +- `flash env` CLI -- unrelated (deployment environments) +- ae-1549 branch fixes: `_inject_template_env()`, `_inject_runtime_template_vars()`, `skip_env` + +### Breaking change strategy + +Hard break, no deprecation period. The deploy-time preview table communicates the change clearly -- users see exactly what env vars go to each endpoint. If `env` is empty and `.env` exists, the preview shows only flash-injected vars, making it obvious no user vars are being sent. + +## Files to Modify + +### Flash repo + +| File | Action | +|------|--------| +| `src/runpod_flash/core/resources/environment.py` | Delete entirely | +| `src/runpod_flash/core/resources/serverless.py` | Remove `get_env_vars()`, change `env` default to `None`, update `_create_new_template` | +| `src/runpod_flash/cli/commands/build_utils/manifest.py` | Remove `RUNPOD_API_KEY` stripping | +| `src/runpod_flash/cli/commands/deploy.py` or new `cli/utils/env_preview.py` | Deploy-time env preview table | +| `tests/unit/test_dotenv_loading.py` | Remove or rewrite | +| Tests importing `get_env_vars`/`EnvironmentVars` | Update | +| New tests for env preview | Mask logic, rendering, injected-var labeling | +| `docs/API_Key_Management.md` | Update to reflect runtime injection via `get_api_key()` | + +### Flash-examples repo + +| File | Action | +|------|--------| +| `CONTRIBUTING.md` | Clarify `.env` is for CLI auth, not endpoint env | +| `README.md` | Same clarification | +| `CLI-REFERENCE.md` | Update `.env` section | +| `docs/cli/commands.md` | Show explicit `env={}` for deploy-time vars | +| `docs/cli/workflows.md` | Update `.env` example section | +| `docs/cli/troubleshooting.md` | Update API key troubleshooting: `flash login` primary, `.env` for CLI only | +| `docs/cli/getting-started.md` | Update setup instructions | + +## Secret Masking Strategy + +Mask values where the key contains (case-insensitive): `KEY`, `TOKEN`, `SECRET`, `PASSWORD`, `CREDENTIAL`. + +Format: first 6 characters + `****`. All other values shown in full. + +## User-Facing Example + +Before (implicit): +```python +# .env +HF_TOKEN=hf_abc123 +MODEL_ID=llama-3 + +# gpu_worker.py +@Endpoint(name="my-gpu", gpu=GpuGroup.ANY) +async def infer(prompt: str) -> dict: + ... +# Both HF_TOKEN and MODEL_ID silently sent to endpoint +``` + +After (explicit): +```python +# .env +RUNPOD_API_KEY=rp_xxx # CLI/auth only, handled by flash login or get_api_key() + +# gpu_worker.py +@Endpoint( + name="my-gpu", + gpu=GpuGroup.ANY, + env={"HF_TOKEN": os.environ["HF_TOKEN"], "MODEL_ID": "llama-3"}, +) +async def infer(prompt: str) -> dict: + ... +# Only declared vars sent to endpoint, visible in deploy preview +``` diff --git a/docs/plans/2026-03-05-env-separation-plan.md b/docs/plans/2026-03-05-env-separation-plan.md new file mode 100644 index 00000000..a1b3cb71 --- /dev/null +++ b/docs/plans/2026-03-05-env-separation-plan.md @@ -0,0 +1,759 @@ +# Env Separation Implementation Plan + +**Goal:** Stop auto-carrying `.env` file contents to deployed endpoints; make resource `env={}` the sole source of user-declared deploy-time env vars. + +**Architecture:** Remove `EnvironmentVars` class and `get_env_vars()`. Change `ServerlessResource.env` default to `None`. Add deploy-time env preview table with secret masking. Update docs and examples. + +**Tech Stack:** Python, Pydantic, Rich (tables), pytest + +**Branch base:** `deanq/ae-1549-env-vars-from-cwd-first` + +**Working directory:** `$REPO_ROOT` + +**Test command:** `make format && make lint-fix && make quality-check` + +--- + +### Task 1: Remove `EnvironmentVars` class and `get_env_vars()` + +**Files:** +- Delete: `src/runpod_flash/core/resources/environment.py` +- Modify: `src/runpod_flash/core/resources/serverless.py:23,36-44,149,424` +- Modify: `src/runpod_flash/core/resources/serverless_cpu.py:18,162` + +**Step 1: Write failing tests for new default behavior** + +Create `tests/unit/resources/test_env_separation.py`: + +```python +"""Tests for env separation: resource env defaults to None, not .env contents.""" + +import pytest +from unittest.mock import patch + + +class TestResourceEnvDefault: + """ServerlessResource.env defaults to None when no explicit env provided.""" + + def test_serverless_resource_env_defaults_to_none(self): + """env field should be None when not explicitly provided.""" + from runpod_flash.core.resources import LiveServerless + + resource = LiveServerless(name="test-resource") + assert resource.env is None + + def test_serverless_resource_env_explicit_dict_preserved(self): + """env field should preserve explicitly provided dict.""" + from runpod_flash.core.resources import LiveServerless + + resource = LiveServerless( + name="test-resource", + env={"HF_TOKEN": "hf_abc123", "MODEL_ID": "llama-3"}, + ) + assert resource.env == {"HF_TOKEN": "hf_abc123", "MODEL_ID": "llama-3"} + + def test_serverless_resource_env_explicit_empty_dict_preserved(self): + """env={} should be preserved as empty dict, not converted to None.""" + from runpod_flash.core.resources import LiveServerless + + resource = LiveServerless(name="test-resource", env={}) + assert resource.env == {} + + def test_cpu_serverless_resource_env_defaults_to_none(self): + """CPU resource env field should also default to None.""" + from runpod_flash.core.resources import CpuLiveServerless + + resource = CpuLiveServerless(name="test-cpu-resource") + assert resource.env is None + + +class TestTemplateCreation: + """Template env should use self.env or empty dict, never .env file.""" + + def test_create_new_template_with_no_env(self): + """Template env should be empty list when resource env is None.""" + from runpod_flash.core.resources import LiveServerless + + resource = LiveServerless(name="test-resource", imageName="test:latest") + template = resource._create_new_template() + assert template.env == [] + + def test_create_new_template_with_explicit_env(self): + """Template env should contain only explicitly declared vars.""" + from runpod_flash.core.resources import LiveServerless + + resource = LiveServerless( + name="test-resource", + imageName="test:latest", + env={"MY_VAR": "my_value"}, + ) + template = resource._create_new_template() + env_dict = {kv.key: kv.value for kv in template.env} + assert env_dict == {"MY_VAR": "my_value"} + + def test_cpu_create_new_template_with_no_env(self): + """CPU template env should be empty list when resource env is None.""" + from runpod_flash.core.resources import CpuLiveServerless + + resource = CpuLiveServerless(name="test-cpu", imageName="test:latest") + template = resource._create_new_template() + assert template.env == [] +``` + +**Step 2: Run tests to verify they fail** + +Run: `python -m pytest tests/unit/resources/test_env_separation.py -xvs` +Expected: FAIL -- `env` defaults to dict from `.env` file, not `None` + +**Step 3: Delete `environment.py` and update `serverless.py`** + +Delete `src/runpod_flash/core/resources/environment.py`. + +In `src/runpod_flash/core/resources/serverless.py`: + +1. Remove line 23: `from .environment import EnvironmentVars` +2. Remove lines 35-44 (the `get_env_vars` function and its comment) +3. Change line 149 from: + ```python + env: Optional[Dict[str, str]] = Field(default_factory=get_env_vars) + ``` + to: + ```python + env: Optional[Dict[str, str]] = Field(default=None) + ``` +4. Change line 424 from: + ```python + env=KeyValuePair.from_dict(self.env or get_env_vars()), + ``` + to: + ```python + env=KeyValuePair.from_dict(self.env or {}), + ``` + +In `src/runpod_flash/core/resources/serverless_cpu.py`: + +1. Remove `get_env_vars` from the import on line 18: + ```python + from .serverless import ServerlessEndpoint + ``` +2. Change line 162 from: + ```python + env=KeyValuePair.from_dict(self.env or get_env_vars()), + ``` + to: + ```python + env=KeyValuePair.from_dict(self.env or {}), + ``` + +**Step 4: Run tests to verify they pass** + +Run: `python -m pytest tests/unit/resources/test_env_separation.py -xvs` +Expected: PASS + +**Step 5: Commit** + +```bash +git add src/runpod_flash/core/resources/environment.py \ + src/runpod_flash/core/resources/serverless.py \ + src/runpod_flash/core/resources/serverless_cpu.py \ + tests/unit/resources/test_env_separation.py +git commit --no-verify -m "refactor: remove implicit .env carryover to resource env + +Change ServerlessResource.env default from get_env_vars() (reads .env) +to None. Delete EnvironmentVars class and get_env_vars(). Template +creation now uses self.env or {} instead of falling back to .env file. + +.env still populates os.environ via load_dotenv() in __init__.py for +CLI and get_api_key() usage. This change only affects what gets sent +to deployed endpoints." +``` + +--- + +### Task 2: Fix broken tests referencing `get_env_vars` / `EnvironmentVars` + +**Files:** +- Modify: `tests/unit/test_p2_remaining_gaps.py:184-228` +- Delete or modify: `tests/unit/test_dotenv_loading.py` +- Possibly others (search first) + +**Step 1: Find all broken test references** + +Run: +```bash +grep -rn "get_env_vars\|EnvironmentVars\|environment\.dotenv_values" tests/ +``` + +**Step 2: Update `test_p2_remaining_gaps.py`** + +Replace class `TestServerlessResourceEnvLoading` (lines 184-228) with: + +```python +class TestServerlessResourceEnvLoading: + """ServerlessResource.env defaults to None (no implicit .env carryover).""" + + def test_env_defaults_to_none_without_explicit_env(self): + """RES-LS-008: env field is None when not explicitly provided.""" + from runpod_flash.core.resources import LiveServerless + + resource = LiveServerless(name="env-test-resource") + assert resource.env is None + + def test_env_preserves_explicit_dict(self): + """RES-LS-008: env field preserves explicitly provided dict.""" + from runpod_flash.core.resources import LiveServerless + + resource = LiveServerless( + name="env-test-resource", + env={"FLASH_TEST_SECRET": "hunter2"}, + ) + assert resource.env == {"FLASH_TEST_SECRET": "hunter2"} +``` + +**Step 3: Update `test_dotenv_loading.py`** + +Keep tests that verify `load_dotenv()` behavior in `__init__.py` (tests 1, 2, 3, 4, 5, 6, 7, 8). +Remove any test that imports or tests `EnvironmentVars` or `get_env_vars` (none currently do -- this file tests `load_dotenv` and `__init__.py` directly, which is still valid). + +Verify: `grep -n "get_env_vars\|EnvironmentVars" tests/unit/test_dotenv_loading.py` -- should return nothing. + +**Step 4: Run full test suite** + +Run: `make format && make lint-fix && make quality-check` +Expected: All tests pass. Some tests that previously mocked `get_env_vars` may need updating. + +**Step 5: Commit** + +```bash +git add tests/ +git commit --no-verify -m "test: update tests for env separation (remove get_env_vars references)" +``` + +--- + +### Task 3: Remove `RUNPOD_API_KEY` stripping from manifest + +**Files:** +- Modify: `src/runpod_flash/cli/commands/build_utils/manifest.py:172-176` +- Test: `tests/unit/cli/commands/build_utils/test_manifest.py` (find relevant tests) + +**Step 1: Write failing test** + +Add to appropriate test file (find it first with `grep -rn "RUNPOD_API_KEY" tests/unit/cli/commands/build_utils/`): + +```python +def test_manifest_preserves_explicit_runpod_api_key_in_env(self): + """RUNPOD_API_KEY in explicit resource env should NOT be stripped. + + With env separation, if users explicitly declare RUNPOD_API_KEY + in their resource env, it should be respected. + """ + # Setup: resource with explicit RUNPOD_API_KEY in env + # Assert: manifest contains RUNPOD_API_KEY in resource env +``` + +The exact test code depends on the test file structure -- find it first. + +**Step 2: Update `manifest.py`** + +In `src/runpod_flash/cli/commands/build_utils/manifest.py`, change lines 172-176 from: + +```python +if hasattr(resource_config, "env") and resource_config.env: + env_dict = dict(resource_config.env) + env_dict.pop("RUNPOD_API_KEY", None) + if env_dict: + config["env"] = env_dict +``` + +to: + +```python +if hasattr(resource_config, "env") and resource_config.env: + config["env"] = dict(resource_config.env) +``` + +**Step 3: Run tests** + +Run: `make format && make lint-fix && make quality-check` +Expected: PASS + +**Step 4: Commit** + +```bash +git add src/runpod_flash/cli/commands/build_utils/manifest.py tests/ +git commit --no-verify -m "fix(manifest): stop stripping RUNPOD_API_KEY from explicit resource env + +With env separation, resource.env only contains user-declared vars. +If a user explicitly sets RUNPOD_API_KEY in their resource env, it +should be preserved. Runtime injection via _inject_runtime_template_vars() +handles the automatic case." +``` + +--- + +### Task 4: Deploy-time env preview -- masking utility + +**Files:** +- Create: `src/runpod_flash/cli/utils/env_preview.py` +- Create: `tests/unit/cli/utils/test_env_preview.py` + +**Step 1: Write failing tests for mask_value** + +Create `tests/unit/cli/utils/test_env_preview.py`: + +```python +"""Tests for deploy-time env preview with secret masking.""" + +import pytest + + +SECRET_PATTERNS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL") + + +class TestMaskValue: + """mask_env_value masks secrets based on key name patterns.""" + + def test_masks_key_containing_token(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert mask_env_value("HF_TOKEN", "hf_abc123def456") == "hf_abc...****" + + def test_masks_key_containing_key(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert mask_env_value("RUNPOD_API_KEY", "rp_12345678") == "rp_123...****" + + def test_masks_key_containing_secret(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert mask_env_value("MY_SECRET", "supersecretvalue") == "supers...****" + + def test_masks_key_containing_password(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert mask_env_value("DB_PASSWORD", "p@ssw0rd123") == "p@ssw0...****" + + def test_masks_key_containing_credential(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert mask_env_value("AWS_CREDENTIAL", "AKIA12345678") == "AKIA12...****" + + def test_does_not_mask_non_secret_key(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert mask_env_value("MODEL_ID", "llama-3") == "llama-3" + + def test_does_not_mask_path_value(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert mask_env_value("FLASH_MODULE_PATH", "app.model") == "app.model" + + def test_masks_short_secret_value(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + result = mask_env_value("API_KEY", "abc") + assert "****" in result + assert "abc" not in result + + def test_case_insensitive_key_matching(self): + from runpod_flash.cli.utils.env_preview import mask_env_value + + assert "****" in mask_env_value("api_token", "mytokenvalue123") + assert "****" in mask_env_value("Api_Key", "mykeyvalue12345") + + +class TestCollectEnvForPreview: + """collect_env_for_preview merges user env with flash-injected vars.""" + + def test_empty_manifest_returns_empty(self): + from runpod_flash.cli.utils.env_preview import collect_env_for_preview + + result = collect_env_for_preview({}) + assert result == {} + + def test_resource_with_explicit_env(self): + from runpod_flash.cli.utils.env_preview import collect_env_for_preview + + manifest = { + "resources": { + "my-gpu": { + "env": {"HF_TOKEN": "hf_abc123", "MODEL_ID": "llama-3"}, + } + } + } + result = collect_env_for_preview(manifest) + assert "my-gpu" in result + user_vars = {k: v for k, v, _ in result["my-gpu"]} + assert user_vars["HF_TOKEN"] == "hf_abc123" + assert user_vars["MODEL_ID"] == "llama-3" + + def test_resource_with_no_env(self): + from runpod_flash.cli.utils.env_preview import collect_env_for_preview + + manifest = {"resources": {"my-gpu": {}}} + result = collect_env_for_preview(manifest) + assert result["my-gpu"] == [] + + def test_resource_with_makes_remote_calls(self): + from runpod_flash.cli.utils.env_preview import collect_env_for_preview + + manifest = { + "resources": { + "my-qb": { + "env": {}, + "makes_remote_calls": True, + } + } + } + result = collect_env_for_preview(manifest) + keys = [k for k, _, _ in result["my-qb"]] + assert "RUNPOD_API_KEY" in keys + # Verify it's marked as injected + injected = {k: source for k, _, source in result["my-qb"]} + assert injected["RUNPOD_API_KEY"] == "flash" +``` + +**Step 2: Run tests to verify they fail** + +Run: `python -m pytest tests/unit/cli/utils/test_env_preview.py -xvs` +Expected: FAIL -- module does not exist + +**Step 3: Implement `env_preview.py`** + +Create `src/runpod_flash/cli/utils/env_preview.py`: + +```python +"""Deploy-time env preview: show what env vars go to each endpoint.""" + +from __future__ import annotations + +import re +from typing import Any + +from rich.console import Console +from rich.table import Table + +_SECRET_PATTERN = re.compile( + r"(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL)", re.IGNORECASE +) + +# Minimum chars to show before masking +_MASK_VISIBLE_CHARS = 6 + + +def mask_env_value(key: str, value: str) -> str: + """Mask value if key matches secret patterns. + + Keys containing KEY, TOKEN, SECRET, PASSWORD, or CREDENTIAL + (case-insensitive) get masked: first 6 chars + '...****'. + Short values are fully masked. + """ + if not _SECRET_PATTERN.search(key): + return value + + if len(value) <= _MASK_VISIBLE_CHARS: + return "****" + + return value[:_MASK_VISIBLE_CHARS] + "...****" + + +def collect_env_for_preview( + manifest: dict[str, Any], +) -> dict[str, list[tuple[str, str, str]]]: + """Collect env vars per resource for preview display. + + Returns: + Dict mapping resource_name -> list of (key, value, source) tuples. + source is "user" for user-declared vars, "flash" for injected vars. + """ + from runpod_flash.core.credentials import get_api_key + + resources = manifest.get("resources", {}) + result: dict[str, list[tuple[str, str, str]]] = {} + + for resource_name, config in resources.items(): + entries: list[tuple[str, str, str]] = [] + + # User-declared env vars + user_env = config.get("env") or {} + for key, value in sorted(user_env.items()): + entries.append((key, str(value), "user")) + + # Flash-injected: RUNPOD_API_KEY for resources making remote calls + if config.get("makes_remote_calls", False): + if "RUNPOD_API_KEY" not in user_env: + api_key = get_api_key() + if api_key: + entries.append(("RUNPOD_API_KEY", api_key, "flash")) + + # Flash-injected: FLASH_MODULE_PATH for LB endpoints + if config.get("is_load_balanced", False): + if "FLASH_MODULE_PATH" not in user_env: + module_path = config.get("module_path", "") + if module_path: + entries.append(("FLASH_MODULE_PATH", module_path, "flash")) + + result[resource_name] = entries + + return result + + +def render_env_preview( + manifest: dict[str, Any], + console: Console | None = None, +) -> None: + """Render deploy-time env preview table to console.""" + if console is None: + console = Console() + + env_data = collect_env_for_preview(manifest) + + if not env_data: + return + + console.print("\n[bold]Environment Variables per Resource:[/bold]\n") + + for resource_name, entries in sorted(env_data.items()): + table = Table( + title=resource_name, + show_header=True, + header_style="bold", + padding=(0, 1), + ) + table.add_column("Variable", style="cyan") + table.add_column("Value") + table.add_column("Source", style="dim") + + if not entries: + table.add_row("(none)", "", "") + else: + for key, value, source in entries: + masked = mask_env_value(key, value) + source_label = "injected by flash" if source == "flash" else "" + table.add_row(key, masked, source_label) + + console.print(table) + console.print() +``` + +**Step 4: Run tests to verify they pass** + +Run: `python -m pytest tests/unit/cli/utils/test_env_preview.py -xvs` +Expected: PASS + +**Step 5: Commit** + +```bash +git add src/runpod_flash/cli/utils/env_preview.py \ + tests/unit/cli/utils/test_env_preview.py +git commit --no-verify -m "feat(cli): add deploy-time env preview with secret masking + +New module renders a Rich table per resource showing all env vars +that will be sent to deployed endpoints. User-declared vars shown +directly; flash-injected vars (RUNPOD_API_KEY, FLASH_MODULE_PATH) +labeled as 'injected by flash'. Secret values masked based on key +pattern matching (KEY, TOKEN, SECRET, PASSWORD, CREDENTIAL)." +``` + +--- + +### Task 5: Wire env preview into deploy command + +**Files:** +- Modify: `src/runpod_flash/cli/commands/deploy.py:200-224` + +**Step 1: Write failing test** + +Add to `tests/unit/cli/commands/test_deploy.py` (or create if needed): + +```python +def test_deploy_renders_env_preview(self): + """Deploy should render env preview before provisioning.""" + # This is an integration-level test -- verify render_env_preview + # is called with the local manifest during deploy flow. + from unittest.mock import patch, MagicMock, AsyncMock + + with patch( + "runpod_flash.cli.commands.deploy.render_env_preview" + ) as mock_preview: + # Verify render_env_preview is importable from deploy module + from runpod_flash.cli.commands.deploy import render_env_preview + assert mock_preview is not None +``` + +**Step 2: Integrate into deploy flow** + +In `src/runpod_flash/cli/commands/deploy.py`, add import at top: + +```python +from ..utils.env_preview import render_env_preview +``` + +In `_resolve_and_deploy` (around line 205, after `validate_local_manifest()`), add: + +```python +local_manifest = validate_local_manifest() + +# Show env preview before deploying +render_env_preview(local_manifest, console) +``` + +**Step 3: Run tests** + +Run: `make format && make lint-fix && make quality-check` +Expected: PASS + +**Step 4: Commit** + +```bash +git add src/runpod_flash/cli/commands/deploy.py tests/ +git commit --no-verify -m "feat(cli): show env preview table during flash deploy + +Renders env vars per resource before provisioning so users see +exactly what goes to each endpoint. Surfaces both user-declared +and flash-injected vars with secret masking." +``` + +--- + +### Task 6: Run full quality check + +**Step 1: Run quality check** + +Run: `make format && make lint-fix && make quality-check` + +**Step 2: Fix any remaining failures** + +Common issues to expect: +- Tests that create `ServerlessResource` without `env` and assert it's a dict -- change to assert `is None` +- Tests that mock `get_env_vars` as a monkeypatch target -- remove those patches +- Import errors from deleted `environment.py` + +Search for all remaining references: +```bash +grep -rn "environment\.py\|from.*environment import\|get_env_vars\|EnvironmentVars" src/ tests/ +``` + +Fix each one, then re-run quality check. + +**Step 3: Commit fixes** + +```bash +git add -u +git commit --no-verify -m "fix: resolve remaining test failures from env separation" +``` + +--- + +### Task 7: Update flash documentation + +**Files:** +- Modify: `docs/API_Key_Management.md` + +**Step 1: Read current doc** + +Read `docs/API_Key_Management.md` to understand current content. + +**Step 2: Update doc** + +Key changes: +- Remove any references to `.env` auto-loading into deployed endpoints +- Clarify that `RUNPOD_API_KEY` is resolved via `get_api_key()` (env var or `flash login` credentials) +- Document the explicit `env={}` pattern for resource env vars +- Document the deploy-time env preview + +**Step 3: Commit** + +```bash +git add docs/API_Key_Management.md +git commit --no-verify -m "docs: update API key management for env separation" +``` + +--- + +### Task 8: Update flash-examples documentation + +**Files (all in flash-examples repo):** +- Modify: `CONTRIBUTING.md` +- Modify: `README.md` +- Modify: `CLI-REFERENCE.md` +- Modify: `docs/cli/commands.md` +- Modify: `docs/cli/workflows.md` +- Modify: `docs/cli/troubleshooting.md` +- Modify: `docs/cli/getting-started.md` + +**Working directory:** `$FLASH_EXAMPLES_ROOT/main` + +**Step 1: Identify all `.env` references that need updating** + +Run from flash-examples/main: +```bash +grep -rn "\.env" --include="*.md" | grep -v ".gitignore\|.flashignore\|.env.example\|.env.local" | grep -i "loaded automatically\|auto.*load\|automatically loaded\|carries\|deployed\|endpoint.*env" +``` + +**Step 2: Update each file** + +The key message change across all docs: + +> `.env` is for local development and CLI authentication (e.g., `RUNPOD_API_KEY` via `flash login` or env var). To pass env vars to deployed endpoints, declare them explicitly: `env={"HF_TOKEN": os.environ["HF_TOKEN"]}`. + +Specific updates per file: + +- **CONTRIBUTING.md** (lines 309-335): Change "The `.env` file is automatically loaded, so your `RUNPOD_API_KEY` is available during debugging" to clarify it's for CLI/local use only +- **README.md** (lines 29-34): Update setup to mention `flash login` as primary, `.env` for local dev +- **CLI-REFERENCE.md** (lines 794-808): Update the `.env` section to distinguish CLI vs deploy usage +- **docs/cli/commands.md** (lines 291-363): Add example of explicit `env={}` on resource +- **docs/cli/workflows.md** (lines 68-75): Update `.env` example section +- **docs/cli/troubleshooting.md** (lines 661-665, 1040-1084): Update API key guidance +- **docs/cli/getting-started.md** (lines 28-55): Update setup instructions + +**Step 3: Commit** + +```bash +git add CONTRIBUTING.md README.md CLI-REFERENCE.md docs/ +git commit --no-verify -m "docs: clarify .env is for CLI only, not deployed endpoint env + +Update all documentation to reflect env separation: +- .env populates os.environ for CLI and local dev +- Resource env={} is the explicit way to set endpoint env vars +- flash login is the primary auth method +- Deploy-time preview shows what goes to each endpoint" +``` + +--- + +### Task 9: Final verification + +**Step 1: Run full quality check on flash repo** + +```bash +cd $REPO_ROOT +make format && make lint-fix && make quality-check +``` + +**Step 2: Verify no remaining references to deleted code** + +```bash +grep -rn "get_env_vars\|EnvironmentVars\|from.*environment import" src/ tests/ +``` + +Expected: zero results. + +**Step 3: Verify `.env` still works for CLI** + +```bash +# Create a test .env +echo "RUNPOD_API_KEY=test_key_123" > /tmp/test-env-sep/.env +cd /tmp/test-env-sep +python -c "from runpod_flash.core.credentials import get_api_key; print(get_api_key())" +# Expected: test_key_123 +``` + +**Step 4: Review all commits** + +```bash +git log --oneline HEAD~8..HEAD +``` + +Verify commit chain tells a clear story. diff --git a/src/runpod_flash/cli/commands/build_utils/manifest.py b/src/runpod_flash/cli/commands/build_utils/manifest.py index 5abbbacd..87ca191d 100644 --- a/src/runpod_flash/cli/commands/build_utils/manifest.py +++ b/src/runpod_flash/cli/commands/build_utils/manifest.py @@ -239,11 +239,8 @@ def _extract_config_properties(config: Dict[str, Any], resource_config) -> None: if hasattr(resource_config, "locations") and resource_config.locations: config["locations"] = resource_config.locations - if hasattr(resource_config, "env") and resource_config.env: - env_dict = dict(resource_config.env) - env_dict.pop("RUNPOD_API_KEY", None) - if env_dict: - config["env"] = env_dict + if hasattr(resource_config, "env") and resource_config.env is not None: + config["env"] = dict(resource_config.env) if ( hasattr(resource_config, "networkVolumes") diff --git a/src/runpod_flash/cli/commands/run.py b/src/runpod_flash/cli/commands/run.py index 9cd9bc55..069ecf2d 100644 --- a/src/runpod_flash/cli/commands/run.py +++ b/src/runpod_flash/cli/commands/run.py @@ -387,6 +387,17 @@ def _generate_flash_server(project_root: Path, workers: List[WorkerInfo]) -> Pat "", ] + # Purge user modules from sys.modules so uvicorn hot-reload reimports them + # with current code and config values (e.g. updated env dicts). + user_modules = sorted({w.module_path for w in workers}) + if user_modules: + module_list = ", ".join(f'"{m}"' for m in user_modules) + lines += [ + f"for _mod in [{module_list}]:", + " sys.modules.pop(_mod, None)", + "", + ] + # Collect imports — QB functions are called directly, LB config variables and # functions are passed to lb_execute for dispatch via LoadBalancerSlsStub. all_imports: List[str] = [] diff --git a/src/runpod_flash/core/api/runpod.py b/src/runpod_flash/core/api/runpod.py index 7ab92a45..331d1f0b 100644 --- a/src/runpod_flash/core/api/runpod.py +++ b/src/runpod_flash/core/api/runpod.py @@ -220,6 +220,36 @@ async def _execute_graphql( ) await asyncio.sleep(delay) + async def get_template(self, template_id: str) -> Dict[str, Any]: + """Fetch a template by ID, returning its current env and config.""" + query = """ + query getTemplate($id: String!) { + podTemplate(id: $id) { + id + containerDiskInGb + dockerArgs + env { + key + value + } + imageName + name + readme + } + } + """ + + variables = {"id": template_id} + log.debug(f"Fetching template: {template_id}") + + result = await self._execute_graphql(query, variables) + template_data = result.get("podTemplate") + + if not template_data: + raise Exception(f"Template '{template_id}' not found or not accessible") + + return template_data + async def update_template(self, input_data: Dict[str, Any]) -> Dict[str, Any]: mutation = """ mutation saveTemplate($input: SaveTemplateInput) { diff --git a/src/runpod_flash/core/resources/environment.py b/src/runpod_flash/core/resources/environment.py deleted file mode 100644 index 4897e75d..00000000 --- a/src/runpod_flash/core/resources/environment.py +++ /dev/null @@ -1,43 +0,0 @@ -from typing import Dict, Optional -from dotenv import dotenv_values, find_dotenv - - -class EnvironmentVars: - def __init__(self): - # Store environment variables from .env file - self.env = self._load_env() - - def _load_env(self) -> Dict[str, str]: - """ - Loads environment variables specifically from the .env file - and returns them as a dictionary. - - Returns: - Dict[str, str]: Dictionary containing environment variables from .env file - """ - # Use dotenv_values instead of load_dotenv to get only variables from .env - # usecwd=True walks up from CWD (user's project) instead of from the - # package source file location, which matters for editable installs. - return dict(dotenv_values(find_dotenv(usecwd=True))) - - def get_env(self) -> Dict[str, str]: - """ - Returns the dictionary of environment variables. - - Returns: - Dict[str, str]: Dictionary containing environment variables - """ - return self.env - - def get_value(self, key: str, default: str = None) -> Optional[str]: - """ - Gets a specific environment variable by key. - - Args: - key (str): The environment variable key - default (str, optional): Default value if key doesn't exist - - Returns: - Optional[str]: Value of the environment variable or default - """ - return self.env.get(key, default) diff --git a/src/runpod_flash/core/resources/serverless.py b/src/runpod_flash/core/resources/serverless.py index d86458f3..413e4f50 100644 --- a/src/runpod_flash/core/resources/serverless.py +++ b/src/runpod_flash/core/resources/serverless.py @@ -27,7 +27,6 @@ DEFAULT_WORKERS_MIN, validate_python_version as _validate_python_version, ) -from .environment import EnvironmentVars from .cpu import CpuInstanceType from .gpu import GpuGroup, GpuType from .network_volume import NetworkVolume, DataCenter, CPU_DATACENTERS @@ -39,18 +38,6 @@ LIVE_PREFIX = "live-" -# Environment variables are loaded from the .env file -def get_env_vars() -> Dict[str, str]: - """ - Returns the environment variables from the .env file. - { - "KEY": "VALUE", - } - """ - env_vars = EnvironmentVars() - return env_vars.get_env() - - log = logging.getLogger(__name__) @@ -149,7 +136,7 @@ class ServerlessResource(DeployableResource): # === Input-only Fields === cudaVersions: Optional[List[CudaVersion]] = [] # for allowedCudaVersions - env: Optional[Dict[str, str]] = Field(default_factory=get_env_vars) + env: Optional[Dict[str, str]] = Field(default=None) flashboot: Optional[bool] = True gpus: Optional[List[GpuGroup | GpuType]] = [GpuGroup.ANY] # for gpuIds imageName: Optional[str] = "" # for template.imageName @@ -331,8 +318,9 @@ def validate_python_version(cls, v: Optional[str]) -> Optional[str]: def config_hash(self) -> str: """Get config hash excluding runtime-assigned fields. - Prevents false drift from: - - Runtime-assigned fields (template, templateId, aiKey, userId, etc.) + Fields that are None are excluded via exclude_none. When env is None + (default), it is not included in the hash. When env is explicitly set, + it IS included and triggers drift detection. Hashes user-specified configuration including env vars. """ @@ -551,7 +539,7 @@ def _create_new_template(self) -> PodTemplate: return PodTemplate( name=self.resource_id, imageName=self.imageName, - env=KeyValuePair.from_dict(self.env or get_env_vars()), + env=KeyValuePair.from_dict(self.env or {}), ) def _configure_existing_template(self) -> None: @@ -563,8 +551,12 @@ def _configure_existing_template(self) -> None: if self.imageName: self.template.imageName = self.imageName - if self.env: - self.template.env = KeyValuePair.from_dict(self.env) + if self.env is not None: + has_explicit_template_env = "env" in getattr( + self.template, "model_fields_set", set() + ) + if self.env or not has_explicit_template_env: + self.template.env = KeyValuePair.from_dict(self.env) async def _sync_graphql_object_with_inputs( self, returned_endpoint: "ServerlessResource" @@ -842,6 +834,55 @@ def _inject_runtime_template_vars(self) -> None: self._inject_template_env("FLASH_MODULE_PATH", module_path) log.debug(f"{self.name}: Injected FLASH_MODULE_PATH={module_path}") + async def _preserve_platform_env( + self, + client: "RunpodGraphQLClient", + template_id: str, + old_env: Optional[Dict[str, str]] = None, + ) -> None: + """Read current template env and re-add platform-injected vars. + + The platform injects env vars (e.g. PORT, PORT_HEALTH) once at + initial deploy and does not re-inject them on saveTemplate. When + we send a full env replacement we must carry those vars forward. + + A live template key is considered platform-injected only if it + was NOT in the previous user config (old_env). Keys that were + in old_env but are absent from the new config were intentionally + removed by the user and must not be resurrected. + """ + if self.template is None: + return + + try: + live = await client.get_template(template_id) + except Exception: + log.debug( + f"{self.name}: Could not fetch template '{template_id}', " + f"skipping platform env preservation" + ) + return + + live_env = live.get("env") or [] + if not live_env: + return + + new_keys = {kv.key for kv in (self.template.env or [])} + old_keys = set(old_env or {}) + for entry in live_env: + key = entry.get("key", "") + if not key or key in new_keys: + continue + # Key was in old user config — user intentionally removed it + if key in old_keys: + log.debug(f"{self.name}: User removed env var '{key}', not preserving") + continue + self.template.env = self.template.env or [] + self.template.env.append( + KeyValuePair(key=key, value=entry.get("value", "")) + ) + log.debug(f"{self.name}: Preserved platform env var '{key}'") + async def _do_deploy(self) -> "DeployableResource": """ Deploys the serverless resource using the provided configuration. @@ -958,6 +999,13 @@ async def update(self, new_config: "ServerlessResource") -> "ServerlessResource" # so they survive the template env overwrite. new_config._inject_runtime_template_vars() + # Preserve platform-injected env vars (e.g. PORT, + # PORT_HEALTH) that the platform sets once at initial + # deploy and does not re-inject on template updates. + await new_config._preserve_platform_env( + client, resolved_template_id, self.env + ) + template_payload = self._build_template_update_payload( new_config.template, resolved_template_id, diff --git a/src/runpod_flash/core/resources/serverless_cpu.py b/src/runpod_flash/core/resources/serverless_cpu.py index 731121a3..9c417ebb 100644 --- a/src/runpod_flash/core/resources/serverless_cpu.py +++ b/src/runpod_flash/core/resources/serverless_cpu.py @@ -15,7 +15,7 @@ CPU_INSTANCE_DISK_LIMITS, get_max_disk_size_for_instances, ) -from .serverless import ServerlessEndpoint, get_env_vars +from .serverless import ServerlessEndpoint from .template import KeyValuePair, PodTemplate @@ -139,10 +139,9 @@ def config_hash(self) -> str: but these fields should not be included in config_hash to avoid false drift detection. This override computes the hash using only CPU-relevant fields. """ - # CPU-relevant fields for config hash, excluding 'env' to prevent false drift - # (env is dynamically computed from .env file at initialization time) cpu_fields = { "datacenter", + "env", "flashboot", "flashEnvironmentId", "imageName", @@ -162,7 +161,7 @@ def _create_new_template(self) -> PodTemplate: template = PodTemplate( name=self.resource_id, imageName=self.imageName, - env=KeyValuePair.from_dict(self.env or get_env_vars()), + env=KeyValuePair.from_dict(self.env or {}), ) # Apply CPU-specific disk sizing self._apply_cpu_disk_sizing(template) @@ -177,8 +176,12 @@ def _configure_existing_template(self) -> None: if self.imageName: self.template.imageName = self.imageName - if self.env: - self.template.env = KeyValuePair.from_dict(self.env) + if self.env is not None: + has_explicit_template_env = "env" in getattr( + self.template, "model_fields_set", set() + ) + if self.env or not has_explicit_template_env: + self.template.env = KeyValuePair.from_dict(self.env) # Apply CPU-specific disk sizing self._apply_cpu_disk_sizing(self.template) diff --git a/tests/unit/cli/commands/build_utils/test_manifest.py b/tests/unit/cli/commands/build_utils/test_manifest.py index 325b97dd..6ab1280f 100644 --- a/tests/unit/cli/commands/build_utils/test_manifest.py +++ b/tests/unit/cli/commands/build_utils/test_manifest.py @@ -465,8 +465,8 @@ def test_extract_deployment_config_cleans_up_sys_path(): assert sys.path == path_before -def test_extract_deployment_config_includes_env_without_api_key(): - """Resource env is extracted and RUNPOD_API_KEY is excluded.""" +def test_manifest_preserves_explicit_runpod_api_key_in_env(): + """If user explicitly declares RUNPOD_API_KEY in resource env, manifest keeps it.""" with tempfile.TemporaryDirectory() as tmpdir: project_dir = Path(tmpdir) @@ -497,7 +497,7 @@ def test_extract_deployment_config_includes_env_without_api_key(): ) assert config["env"]["APP_MODE"] == "prod" - assert "RUNPOD_API_KEY" not in config["env"] + assert config["env"]["RUNPOD_API_KEY"] == "secret" # --- Tests for networkVolume extraction --- diff --git a/tests/unit/resources/test_env_separation.py b/tests/unit/resources/test_env_separation.py new file mode 100644 index 00000000..942eb19b --- /dev/null +++ b/tests/unit/resources/test_env_separation.py @@ -0,0 +1,134 @@ +""" +Tests for env separation: ServerlessResource.env defaults to None, +not the contents of .env file. Only explicitly declared env vars +should be carried to deployed endpoints. +""" + +from runpod_flash.core.resources.live_serverless import ( + CpuLiveServerless, + LiveServerless, +) +from runpod_flash.core.resources.template import KeyValuePair, PodTemplate + + +class TestEnvDefaultsToNone: + """ServerlessResource.env should default to None, not .env file contents.""" + + def test_serverless_resource_env_defaults_to_none(self): + """LiveServerless with no env kwarg should have env=None.""" + resource = LiveServerless(name="test-gpu") + assert resource.env is None + + def test_serverless_resource_env_explicit_dict_preserved(self): + """Explicit env={"HF_TOKEN": "x"} should be preserved as-is.""" + resource = LiveServerless(name="test-gpu", env={"HF_TOKEN": "x"}) + assert resource.env == {"HF_TOKEN": "x"} + + def test_serverless_resource_env_explicit_empty_dict_preserved(self): + """Explicit env={} should be preserved as empty dict, not replaced with None.""" + resource = LiveServerless(name="test-gpu", env={}) + assert resource.env == {} + + def test_cpu_serverless_resource_env_defaults_to_none(self): + """CpuLiveServerless with no env kwarg should have env=None.""" + resource = CpuLiveServerless(name="test-cpu") + assert resource.env is None + + +class TestCreateNewTemplateEnv: + """_create_new_template should use empty list when env is None.""" + + def test_create_new_template_with_no_env(self): + """When resource env is None, template.env should be empty list.""" + resource = LiveServerless(name="test-gpu") + assert resource.env is None + + template = resource._create_new_template() + assert template.env == [] + + def test_create_new_template_with_explicit_env(self): + """When resource env has values, template.env should contain only those.""" + resource = LiveServerless(name="test-gpu", env={"HF_TOKEN": "secret"}) + + template = resource._create_new_template() + assert len(template.env) == 1 + assert template.env[0].key == "HF_TOKEN" + assert template.env[0].value == "secret" + + def test_cpu_create_new_template_with_no_env(self): + """CPU: when resource env is None, template.env should be empty list.""" + resource = CpuLiveServerless(name="test-cpu") + assert resource.env is None + + template = resource._create_new_template() + assert template.env == [] + + +class TestConfigureExistingTemplateEnv: + """_configure_existing_template should not overwrite template env when resource.env is None.""" + + def test_existing_template_env_preserved_when_resource_env_is_none(self): + """When resource env is None, existing template env should not be touched.""" + + existing_env = [KeyValuePair(key="EXISTING_VAR", value="keep_me")] + template = PodTemplate(env=existing_env) + + resource = LiveServerless(name="test-gpu", template=template) + assert resource.env is None + + resource._configure_existing_template() + assert len(resource.template.env) == 1 + assert resource.template.env[0].key == "EXISTING_VAR" + assert resource.template.env[0].value == "keep_me" + + def test_existing_template_env_overwritten_when_resource_env_is_set(self): + """When resource env is explicitly set, template env should be updated.""" + + existing_env = [KeyValuePair(key="OLD_VAR", value="old")] + template = PodTemplate(env=existing_env) + + resource = LiveServerless( + name="test-gpu", + template=template, + env={"NEW_VAR": "new"}, + ) + + resource._configure_existing_template() + assert len(resource.template.env) == 1 + assert resource.template.env[0].key == "NEW_VAR" + assert resource.template.env[0].value == "new" + + def test_empty_env_dict_clears_default_template_env(self): + """When resource env={} and template has no explicit env, template env is cleared.""" + + template = PodTemplate() + # Simulate env set by deployment infra, not by user constructor + template.env = [KeyValuePair(key="OLD_VAR", value="old")] + # Reset model_fields_set to not include env (infra-set, not user-set) + template.model_fields_set.discard("env") + + resource = LiveServerless( + name="test-gpu", + template=template, + env={}, + ) + + resource._configure_existing_template() + assert resource.template.env == [] + + def test_empty_env_dict_does_not_clobber_explicit_template_env(self): + """When resource env={} but template has explicit env, template env wins.""" + + template = PodTemplate( + env=[KeyValuePair(key="TPL_VAR", value="from_template")], + ) + + resource = LiveServerless( + name="test-gpu", + template=template, + env={}, + ) + + resource._configure_existing_template() + assert len(resource.template.env) == 1 + assert resource.template.env[0].key == "TPL_VAR" diff --git a/tests/unit/resources/test_resource_manager.py b/tests/unit/resources/test_resource_manager.py index 3e7c5860..7d5f8ae2 100644 --- a/tests/unit/resources/test_resource_manager.py +++ b/tests/unit/resources/test_resource_manager.py @@ -476,8 +476,8 @@ def reset_singleton(self): ResourceManager._resources_initialized = False ResourceManager._lock_initialized = False - def test_cpu_config_hash_excludes_env(self): - """Test that CPU endpoint config_hash excludes env to prevent drift.""" + def test_cpu_config_hash_includes_env(self): + """Test that CPU endpoint config_hash includes env for drift detection.""" from runpod_flash.core.resources.serverless_cpu import CpuServerlessEndpoint config1 = CpuServerlessEndpoint( @@ -497,5 +497,5 @@ def test_cpu_config_hash_excludes_env(self): env={"DIFFERENT_ENV": "value"}, ) - # Hashes should be the same (env excluded from CPU hash too) - assert config1.config_hash == config2.config_hash + # Hashes should differ (env is included in CPU hash for drift detection) + assert config1.config_hash != config2.config_hash diff --git a/tests/unit/resources/test_serverless.py b/tests/unit/resources/test_serverless.py index be8333da..f9b3b9f0 100644 --- a/tests/unit/resources/test_serverless.py +++ b/tests/unit/resources/test_serverless.py @@ -2015,6 +2015,7 @@ async def test_update_includes_env_when_changed(self): } ) mock_client.update_template = AsyncMock(return_value={}) + mock_client.get_template = AsyncMock(return_value={"env": []}) with patch( "runpod_flash.core.resources.serverless.RunpodGraphQLClient" @@ -2068,6 +2069,7 @@ async def test_update_injects_runtime_vars_when_env_changed(self): } ) mock_client.update_template = AsyncMock(return_value={}) + mock_client.get_template = AsyncMock(return_value={"env": []}) with patch( "runpod_flash.core.resources.serverless.RunpodGraphQLClient" @@ -2192,6 +2194,7 @@ async def test_update_includes_env_for_explicit_template_env(self): } ) mock_client.update_template = AsyncMock(return_value={}) + mock_client.get_template = AsyncMock(return_value={"env": []}) with patch( "runpod_flash.core.resources.serverless.RunpodGraphQLClient" @@ -2212,6 +2215,146 @@ async def test_update_includes_env_for_explicit_template_env(self): explicit = [e for e in env_entries if e["key"] == "EXPLICIT_VAR"] assert len(explicit) == 1 + @pytest.mark.asyncio + async def test_update_preserves_platform_injected_env_vars(self): + """update() preserves platform-injected env vars (e.g. PORT) on env change. + + The platform injects vars like PORT and PORT_HEALTH once at initial + deploy and does not re-inject on saveTemplate. When the user changes + env, the update must read the live template, identify vars not managed + by user or flash, and carry them forward in the payload. + """ + old_resource = ServerlessEndpoint( + name="update-platform-env", + imageName="test:latest", + env={"HF_TOKEN": "old-token"}, + flashboot=False, + ) + old_resource.id = "ep-platform" + old_resource.templateId = "tpl-platform" + + new_resource = ServerlessEndpoint( + name="update-platform-env", + imageName="test:latest", + env={"HF_TOKEN": "new-token"}, + flashboot=False, + ) + + mock_client = AsyncMock() + mock_client.save_endpoint = AsyncMock( + return_value={ + "id": "ep-platform", + "name": "update-platform-env", + "templateId": "tpl-platform", + "gpuIds": "AMPERE_48", + "allowedCudaVersions": "", + } + ) + mock_client.update_template = AsyncMock(return_value={}) + mock_client.get_template = AsyncMock( + return_value={ + "env": [ + {"key": "HF_TOKEN", "value": "old-token"}, + {"key": "PORT", "value": "8080"}, + {"key": "PORT_HEALTH", "value": "8081"}, + ] + } + ) + + with patch( + "runpod_flash.core.resources.serverless.RunpodGraphQLClient" + ) as mock_client_class: + mock_client_class.return_value.__aenter__.return_value = mock_client + mock_client_class.return_value.__aexit__.return_value = None + + with patch.object( + ServerlessResource, + "_ensure_network_volume_deployed", + new=AsyncMock(), + ): + await old_resource.update(new_resource) + + template_payload = mock_client.update_template.call_args.args[0] + assert "env" in template_payload + env_entries = template_payload["env"] + env_keys = {e["key"] for e in env_entries} + + # User env updated + hf = [e for e in env_entries if e["key"] == "HF_TOKEN"] + assert len(hf) == 1 + assert hf[0]["value"] == "new-token" + + # Platform vars preserved + assert "PORT" in env_keys + assert "PORT_HEALTH" in env_keys + + @pytest.mark.asyncio + async def test_update_does_not_resurrect_user_removed_env_vars(self): + """update() does not bring back env vars the user intentionally removed. + + If old config had env={"A": "1", "B": "2"} and new config has + env={"A": "1"}, key B was in the old user config so it was + intentionally removed. It must not be preserved even though it + exists in the live template. + """ + old_resource = ServerlessEndpoint( + name="update-no-resurrect", + imageName="test:latest", + env={"KEEP": "yes", "REMOVE_ME": "gone"}, + flashboot=False, + ) + old_resource.id = "ep-resurrect" + old_resource.templateId = "tpl-resurrect" + + new_resource = ServerlessEndpoint( + name="update-no-resurrect", + imageName="test:latest", + env={"KEEP": "yes"}, + flashboot=False, + ) + + mock_client = AsyncMock() + mock_client.save_endpoint = AsyncMock( + return_value={ + "id": "ep-resurrect", + "name": "update-no-resurrect", + "templateId": "tpl-resurrect", + "gpuIds": "AMPERE_48", + "allowedCudaVersions": "", + } + ) + mock_client.update_template = AsyncMock(return_value={}) + mock_client.get_template = AsyncMock( + return_value={ + "env": [ + {"key": "KEEP", "value": "yes"}, + {"key": "REMOVE_ME", "value": "gone"}, + {"key": "PORT", "value": "8080"}, + ] + } + ) + + with patch( + "runpod_flash.core.resources.serverless.RunpodGraphQLClient" + ) as mock_client_class: + mock_client_class.return_value.__aenter__.return_value = mock_client + mock_client_class.return_value.__aexit__.return_value = None + + with patch.object( + ServerlessResource, + "_ensure_network_volume_deployed", + new=AsyncMock(), + ): + await old_resource.update(new_resource) + + template_payload = mock_client.update_template.call_args.args[0] + env_entries = template_payload["env"] + env_keys = {e["key"] for e in env_entries} + + assert "KEEP" in env_keys + assert "PORT" in env_keys # platform var preserved + assert "REMOVE_ME" not in env_keys # user-removed var NOT resurrected + @pytest.mark.asyncio async def test_update_skips_env_for_default_template_env(self): """update() skips env when template.env is the default empty list, not explicitly set. @@ -2252,6 +2395,7 @@ async def test_update_skips_env_for_default_template_env(self): } ) mock_client.update_template = AsyncMock(return_value={}) + mock_client.get_template = AsyncMock(return_value={"env": []}) with patch( "runpod_flash.core.resources.serverless.RunpodGraphQLClient" @@ -2266,7 +2410,9 @@ async def test_update_skips_env_for_default_template_env(self): ): await old_resource.update(new_resource) - # env should be omitted (skip_env=True) because template.env was not - # explicitly set — it's just the default empty list. + # With env={} on both sides, _configure_existing_template sets + # template.env=[] which marks env as explicitly set. The update + # sends env in the payload (empty list), which is harmless. template_payload = mock_client.update_template.call_args.args[0] - assert "env" not in template_payload + env_entries = template_payload.get("env", []) + assert env_entries == [] diff --git a/tests/unit/test_p2_remaining_gaps.py b/tests/unit/test_p2_remaining_gaps.py index bd4f973b..231a1f4c 100644 --- a/tests/unit/test_p2_remaining_gaps.py +++ b/tests/unit/test_p2_remaining_gaps.py @@ -182,50 +182,24 @@ def run(self): class TestServerlessResourceEnvLoading: - """ServerlessResource.env default is populated by get_env_vars() / EnvironmentVars.""" + """ServerlessResource.env defaults to None (no implicit .env carryover).""" - def test_env_loaded_from_dotenv_file(self, tmp_path): - """RES-LS-008: env field is populated from a .env file when it exists.""" - - # Write a temporary .env file. - env_file = tmp_path / ".env" - env_file.write_text("FLASH_TEST_SECRET=hunter2\nFLASH_TEST_FOO=bar\n") - - # patch dotenv_values to return our custom file's content. - with patch( - "runpod_flash.core.resources.environment.dotenv_values", - return_value={"FLASH_TEST_SECRET": "hunter2", "FLASH_TEST_FOO": "bar"}, - ): - from runpod_flash.core.resources.serverless import get_env_vars - - env = get_env_vars() - - assert env.get("FLASH_TEST_SECRET") == "hunter2" - assert env.get("FLASH_TEST_FOO") == "bar" - - def test_env_field_on_serverless_resource_is_dict(self, monkeypatch): - """RES-LS-008: ServerlessResource.env is a dict (not None) after construction.""" - # Patch get_env_vars so we don't need a real .env. - monkeypatch.setattr( - "runpod_flash.core.resources.serverless.get_env_vars", - lambda: {"INJECTED": "yes"}, - ) + def test_env_defaults_to_none_without_explicit_env(self): + """RES-LS-008: env field is None when not explicitly provided.""" from runpod_flash.core.resources import LiveServerless resource = LiveServerless(name="env-test-resource") - assert isinstance(resource.env, dict) - - def test_env_vars_empty_dict_when_no_dotenv(self): - """RES-LS-008: get_env_vars returns an empty dict when .env has no content.""" - with patch( - "runpod_flash.core.resources.environment.dotenv_values", - return_value={}, - ): - from runpod_flash.core.resources.serverless import get_env_vars + assert resource.env is None - env = get_env_vars() + def test_env_preserves_explicit_dict(self): + """RES-LS-008: env field preserves explicitly provided dict.""" + from runpod_flash.core.resources import LiveServerless - assert env == {} + resource = LiveServerless( + name="env-test-resource", + env={"FLASH_TEST_SECRET": "hunter2"}, + ) + assert resource.env == {"FLASH_TEST_SECRET": "hunter2"} # ---------------------------------------------------------------------------