Skip to content

Fix #168 #140: Improve generation error handling and file operations#178

Open
Vaibhavee89 wants to merge 1 commit intojamiepine:mainfrom
Vaibhavee89:fix/generation-error-handling-168-140
Open

Fix #168 #140: Improve generation error handling and file operations#178
Vaibhavee89 wants to merge 1 commit intojamiepine:mainfrom
Vaibhavee89:fix/generation-error-handling-168-140

Conversation

@Vaibhavee89
Copy link

@Vaibhavee89 Vaibhavee89 commented Feb 24, 2026

Summary

This PR fixes issues #168 and #140 related to generation failures with file system and connection errors.

Issues Fixed

Issue #168: Generation failed [Errno 2] No such file or directory

  • Platform: macOS Apple Silicon (and others)
  • Problem: Users couldn't save generated audio due to file system errors
  • Root Cause: Poor error handling, no directory verification, no atomic writes

Issue #140: Generation failed [Errno 32] Broken pipe

  • Platform: Multiple platforms
  • Problem: Generation fails with broken pipe error when clients disconnect
  • Root Cause: Unhandled BrokenPipeError, no cleanup on connection failures

Changes Made

1. Improved File Writing (backend/utils/audio.py)

  • Atomic file writes: Write to .tmp file first, then rename atomically
  • Better error handling: Catch and re-raise with contextual information
  • Directory creation: Automatically create parent directories
  • Cleanup on failure: Remove temporary files if write fails

Before:

def save_audio(audio, path, sample_rate):
    sf.write(path, audio, sample_rate)

After:

def save_audio(audio, path, sample_rate):
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    temp_path = f"{path}.tmp"
    sf.write(temp_path, audio, sample_rate)
    os.replace(temp_path, path)  # Atomic

2. Enhanced Error Handling (backend/main.py)

  • Specific error messages for different errno codes:
    • errno 2 (ENOENT): "Directory error - shows exact path"
    • errno 13 (EACCES): "Permission denied - check write permissions"
    • errno 28 (ENOSPC): "No space left on device"
    • errno 32 (EPIPE): "Connection interrupted - please retry"
  • BrokenPipeError handling: Graceful handling when clients disconnect
  • Directory verification: Check exists and writable before saving
  • Task cleanup: Properly cleanup task state on all error paths

3. Added Filesystem Health Check (backend/main.py)

  • New endpoint: GET /health/filesystem
  • Diagnostics: Check directory existence, permissions, disk space
  • Response includes:
    • Directory paths
    • Write permission status
    • Free disk space (GB)
    • Total disk space (GB)

Example response:

{
  "status": "ok",
  "data_dir": "/Users/user/data",
  "generations_dir": "/Users/user/data/generations",
  "writable": true,
  "free_space_gb": 123.45,
  "total_space_gb": 500.0
}

4. Comprehensive Test Suite (backend/tests/test_generation_error_handling.py)

  • ✅ Tests for normal save operation
  • ✅ Tests for parent directory creation
  • ✅ Tests for atomic write behavior
  • ✅ Tests for error cleanup
  • ✅ Tests for permission errors
  • ✅ Tests for no space errors
  • ✅ Tests for broken pipe detection
  • ✅ Tests for health check endpoint

User Experience Improvements

Before This Fix

Issue #168:

  • ❌ Error: Generation failed [Errno 2]
  • ❌ No indication of what's wrong
  • ❌ No guidance on how to fix

Issue #140:

  • ❌ Error: Generation failed [Errno 32]
  • ❌ Must restart server
  • ❌ No explanation of issue

After This Fix

Issue #168:

  • ✅ Error: Directory error: [Errno 2] No such file or directory. Generations directory: /path/to/dir
  • ✅ OR: Permission denied when saving audio. Check write permissions for: /path/to/dir
  • ✅ OR: No space left on device. Please free up disk space.
  • ✅ Clear indication of the exact problem
  • ✅ Path information for debugging
  • ✅ Actionable guidance

Issue #140:

  • ✅ Error: Connection interrupted during file save. Please retry.
  • ✅ OR: Client disconnected during generation. Please retry.
  • ✅ Automatic cleanup of task state
  • ✅ No server restart needed (in most cases)
  • ✅ Clear retry instructions

Testing

Manual Testing

  1. Test with read-only directory
  2. Test with missing directory (auto-created)
  3. Test with full disk (clear error message)
  4. Test client disconnect (graceful handling)
  5. Test health endpoint

Automated Tests

cd backend
pip install pytest pytest-asyncio
pytest tests/test_generation_error_handling.py -v

Benefits

  1. No more corrupted audio files (atomic writes)
  2. Clear error messages (specific errno handling)
  3. Better diagnostics (health check endpoint)
  4. Graceful degradation (handle client disconnects)
  5. Prevents data loss (proper cleanup on errors)

Checklist

  • Code follows style guidelines (PEP 8 for Python)
  • Documentation updated (CHANGELOG.md)
  • Changes tested manually
  • Automated tests added
  • No breaking changes
  • User-friendly error messages implemented
  • Comprehensive test coverage

Related Issues

Closes #168
Closes #140

Future Enhancements (Suggested)

  • Add configurable save location (environment variable)
  • Add retry logic for transient errors
  • Add disk space monitoring/alerts

Summary by CodeRabbit

  • New Features

    • Added a filesystem health check endpoint reporting directory status, write permissions, and available disk space.
  • Bug Fixes

    • Improved generation error handling with specific, actionable error messages.
    • Implemented atomic file writes for audio generation.
    • Enhanced graceful handling of network disconnections during file operations.

…d file operations

- Implement atomic file writes in save_audio() to prevent corrupted files
- Add specific error messages for different errno codes (ENOENT, EACCES, ENOSPC, EPIPE)
- Handle BrokenPipeError gracefully when clients disconnect during generation
- Verify directory exists and is writable before saving audio files
- Add /health/filesystem endpoint to diagnose file system issues
- Add comprehensive test suite for error handling scenarios
- Update CHANGELOG.md with fix details

This fix addresses:
- Issue jamiepine#168: Generation failed [Errno 2] No such file or directory
  - Better error messages showing exact directory path and permission issues
  - Atomic writes prevent partial file corruption
  - Disk space checks and clear error messages

- Issue jamiepine#140: Generation failed [Errno 32] Broken pipe
  - Graceful handling of client disconnects
  - Proper task cleanup on errors
  - Clear retry instructions for users
@coderabbitai
Copy link

coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

Walkthrough

Enhances error resilience by introducing a new filesystem health endpoint, implementing atomic file writes with cleanup, and adding granular errno-based error handling (ENOENT, EACCES, ENOSPC, EPIPE) throughout the generation flow, with comprehensive test coverage.

Changes

Cohort / File(s) Summary
Documentation
CHANGELOG.md
Added Unreleased section documenting improved generation error handling, atomic writes, explicit errno-based error messages, broken pipe handling, and new filesystem health check endpoint.
Filesystem Health & Audio Writing
backend/utils/audio.py
Modified save_audio to write atomically using temporary files with automatic directory creation and error cleanup; preserves function signature and documentation.
Generation Endpoint & Health Checks
backend/main.py
Added new GET /health/filesystem endpoint that validates directory existence, write permissions, and disk space; enhanced generation flow with directory existence checks, atomic write verification, specific errno-based error messages, and graceful BrokenPipeError handling returning HTTP 499.
Error Handling Test Suite
backend/tests/test_generation_error_handling.py
Comprehensive new test file covering save_audio behavior (success, directory creation, atomic writes, cleanup, permission/ENOSPC errors), health endpoint responses, broken pipe detection, and generation flow error scenarios with patching and temporary filesystem setups.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as Generation Endpoint
    participant Filesystem
    participant Storage as Audio File
    participant Metadata as Metadata DB

    Client->>API: POST /generate
    activate API
    
    rect rgba(100, 150, 255, 0.5)
    Note over API,Filesystem: Pre-Generation Checks
    API->>Filesystem: Verify generations dir exists
    alt Dir missing
        Filesystem-->>API: ENOENT
        API->>API: Create directory
    end
    API->>Filesystem: Test write permissions
    alt Not writable
        Filesystem-->>API: EACCES
        API-->>Client: HTTP 500 (Permission Denied)
        API->>API: Mark completion, log warning
    end
    end
    
    rect rgba(150, 200, 100, 0.5)
    Note over API,Storage: Audio Generation & Save
    API->>API: Generate audio
    API->>Storage: save_audio (atomic write)
    activate Storage
    Storage->>Filesystem: Create temp file
    Storage->>Filesystem: Write audio data
    alt Write fails (ENOSPC/EPIPE)
        Filesystem-->>Storage: OSError
        Storage->>Filesystem: Cleanup temp file
        Storage-->>API: OSError
        API-->>Client: HTTP 500 (Disk/Pipe error)
    else Write succeeds
        Storage->>Filesystem: Atomic rename to target
        Storage-->>API: Success
    end
    deactivate Storage
    end
    
    rect rgba(200, 150, 100, 0.5)
    Note over API,Metadata: Completion
    alt BrokenPipeError during generation
        API->>API: Mark completion
        API->>API: Log warning
        API-->>Client: HTTP 499 (Retry)
    else Normal completion
        API->>Metadata: Save generation metadata
        API-->>Client: HTTP 200 (Success)
    end
    end
    deactivate API
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A filesystem spring in the warren—
No more lost files through the door,
With atomic writes we've borrowed,
And health checks to ensure
The pipes stay unbroken, not marred. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly summarizes the main objectives: fixing error handling and file operations for issues #168 and #140.
Linked Issues check ✅ Passed All key requirements from issues #168 and #140 are met: atomic file writes [#168], directory existence/writability checks [#168], errno-based error messages [#168, #140], graceful BrokenPipeError handling [#140], task cleanup on connection failure [#140], and filesystem health endpoint [#168].
Out of Scope Changes check ✅ Passed All changes are directly aligned with fixing #168 and #140: filesystem error handling, atomic writes, health monitoring, and error recovery. No extraneous changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/main.py (1)

747-804: ⚠️ Potential issue | 🔴 Critical

HTTPException raised in the inner except OSError is re-caught by the outer except Exception, causing double-wrapping and double task completion.

The inner except OSError block (line 747) is nested inside the outer try (line 659). When it raises HTTPException at line 763, that exception propagates through the outer try-except chain. None of except BrokenPipeError, except ValueError, or except OSError match HTTPException, so except Exception at line 802 catches it.

Consequences:

  1. task_manager.complete_generation(generation_id) is called twice — once at line 749 inside the inner except, and again at line 803 in the outer except.
  2. raise HTTPException(status_code=500, detail=str(e)) at line 804 wraps the original HTTPException. Since Starlette's HTTPException doesn't populate self.args, str(e) is effectively empty, so the carefully constructed error_msg (the whole point of this PR's errno-based messaging) is silently discarded and replaced with an empty detail.

The fix is to add an explicit except HTTPException: raise pass-through guard before except Exception:

🐛 Proposed fix
+    except BrokenPipeError as e:
+        task_manager.complete_generation(generation_id)
+        print(f"[WARNING] Client disconnected during generation {generation_id}: {e}")
+        raise HTTPException(
+            status_code=499,
+            detail="Client disconnected during generation. Please retry."
+        )
     except BrokenPipeError as e:
-        # Client disconnected during generation
         task_manager.complete_generation(generation_id)
         ...
+    from fastapi import HTTPException as _HTTPException
+    except _HTTPException:
+        raise  # Let FastAPI handle it as-is; task was already marked complete above
     except ValueError as e:
         ...

Cleaner approach — add it directly:

     except OSError as e:
         # OSError already handled above in save_audio block, but catch any others
         task_manager.complete_generation(generation_id)
         ...
         raise HTTPException(status_code=500, detail=f"System error: {str(e)}")
+    except HTTPException:
+        raise  # Already handled with correct status/detail; don't re-wrap
     except Exception as e:
         task_manager.complete_generation(generation_id)
         raise HTTPException(status_code=500, detail=str(e))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/main.py` around lines 747 - 804, The inner OSError handler raises an
HTTPException that is then caught by the outer generic except Exception, causing
task_manager.complete_generation(generation_id) to run twice and the
HTTPException detail to be lost; fix by adding an explicit "except
HTTPException: raise" clause in the outer try/except (placed before the final
generic "except Exception" block) so HTTPException propagates unchanged and is
not re-handled or double-completed (references: inner OSError handler that
raises HTTPException, outer except Exception and
task_manager.complete_generation(generation_id)).
🧹 Nitpick comments (3)
backend/utils/audio.py (1)

83-84: Move from pathlib import Path and import os to the module top level.

Both pathlib and os are standard-library modules already used (or expected to be used) across the file. Importing them inside the function body adds a small lookup cost on every call and obscures the module's actual dependencies.

♻️ Proposed change
+import os
 import numpy as np
 import soundfile as sf
 import librosa
 from typing import Tuple, Optional
+from pathlib import Path

Then remove lines 83–84 from inside save_audio.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/utils/audio.py` around lines 83 - 84, The imports for pathlib.Path
and os should be moved to the module top level and removed from inside
save_audio to avoid repeated lookups and clarify dependencies; update the file
by adding "from pathlib import Path" and "import os" at the top of the module
and delete the in-function import lines in the save_audio function, keeping all
existing references to Path and os unchanged.
backend/main.py (2)

763-801: Use raise ... from e when re-raising HTTPException inside except blocks.

Lines 763, 787, 792, 797–800, and 801 all raise HTTPException inside except clauses without chaining (raise HTTPException(...) from e). This drops the original traceback from the exception context, making it harder to diagnose the root cause in logs. Ruff flags these as B904.

♻️ Example fix (apply to all five sites)
-            raise HTTPException(status_code=500, detail=error_msg)
+            raise HTTPException(status_code=500, detail=error_msg) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/main.py` around lines 763 - 801, The HTTPException raises in the
except handlers for BrokenPipeError, ValueError, and OSError drop the original
exception context; update each raise to chain the original exception using
"raise HTTPException(... ) from e" so tracebacks are preserved — specifically
modify the raises in the BrokenPipeError block (the raise returning
status_code=499), the ValueError block (raise HTTPException(status_code=400,
...)), and the OSError block (both the EPIPE-specific raise and the generic
raise) adjacent to task_manager.complete_generation/generation creation so each
uses "from e".

212-278: Add a response_model (or at least a typed TypedDict/Pydantic model) to /health/filesystem.

The endpoint currently returns raw dict with varying shapes depending on which error branch is hit — some include data_dir, some only generations_dir, the success path includes both plus free_space_gb / total_space_gb. Without a response_model, the OpenAPI docs show no schema, and callers can't rely on a stable contract. A Pydantic response model also makes it easier to catch field renames early.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/main.py` around lines 212 - 278, Create explicit Pydantic response
models and attach them to the route decorator for check_filesystem: define a
base model (e.g., FileSystemBase with status and message), an error model
(FileSystemError extends FileSystemBase adding optional data_dir and
generations_dir and writable: Optional[bool]), and a success model
(FileSystemSuccess extends FileSystemBase adding data_dir, generations_dir,
writable: bool, free_space_gb: float, total_space_gb: float); then change the
decorator to `@app.get`("/health/filesystem",
response_model=Union[FileSystemSuccess, FileSystemError]) (or a single model
with optional fields) and update check_filesystem to return dicts that match
those models' field names so OpenAPI has a stable schema for the endpoint.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/main.py`:
- Around line 782-789: The except BrokenPipeError handler currently raises
HTTPException with non-standard status 499; change it to use a standard status
(e.g., 503 Service Unavailable or 408 Request Timeout) and include appropriate
headers (for 503 add a Retry-After header) while still calling
task_manager.complete_generation(generation_id) and logging the error; update
the raise HTTPException call to use the chosen standard status_code and include
headers={"Retry-After": "<seconds>"} when using 503 and keep the same detail
message so clients and intermediaries correctly interpret the client-closed
condition.

In `@backend/tests/test_generation_error_handling.py`:
- Around line 129-130: There's a redundant local "import tempfile" inside the
test block that re-imports a module already imported at the top of the file
(causing Ruff F811); remove the inner "import tempfile" so the code uses the
module-level import and keep the with tempfile.TemporaryDirectory() as tmpdir:
block unchanged; look for and delete the duplicate import near the
with-statement in test_generation_error_handling.py.
- Around line 51-64: The test fails because save_audio calls os.replace on a
non-existent temp file when soundfile.write is patched to do nothing; update
test_save_audio_atomic_write to also patch os.replace (or patch 'os.replace'
alongside 'soundfile.write') so no FileNotFoundError is raised, then assert that
the mocked os.replace was called and that its source argument (the first arg)
endswith('.tmp'); reference the test function test_save_audio_atomic_write and
the implementation save_audio, and assert calls on the mocked os.replace and/or
soundfile.write to verify the atomic-write temp file behavior.

In `@backend/utils/audio.py`:
- Around line 98-108: The current except block discards errno by always raising
a brand-new OSError; after cleaning up the temp file (temp_path /
Path(...).unlink()), preserve original errno by re-raising the original OSError
when e is an OSError (or by constructing the new OSError with the same errno
from e.errno) instead of throwing a fresh OSError without errno; if e is not an
OSError, wrap it with a generic OSError including the original message for
context. Use the existing exception variable e and the path/temp_path names to
locate where to change the re-raise logic in backend/utils/audio.py.

In `@CHANGELOG.md`:
- Around line 8-21: There are duplicate "## [Unreleased]" sections; merge the
second section's entries (e.g., the audio-export fix and Makefile entries) into
the first "## [Unreleased]" block so all changes appear under a single header,
remove the extra "## [Unreleased]" header, and ensure the combined subsections
(Fixed, Added, etc.) are properly merged and deduplicated while preserving
Markdown list formatting and link references.

---

Outside diff comments:
In `@backend/main.py`:
- Around line 747-804: The inner OSError handler raises an HTTPException that is
then caught by the outer generic except Exception, causing
task_manager.complete_generation(generation_id) to run twice and the
HTTPException detail to be lost; fix by adding an explicit "except
HTTPException: raise" clause in the outer try/except (placed before the final
generic "except Exception" block) so HTTPException propagates unchanged and is
not re-handled or double-completed (references: inner OSError handler that
raises HTTPException, outer except Exception and
task_manager.complete_generation(generation_id)).

---

Nitpick comments:
In `@backend/main.py`:
- Around line 763-801: The HTTPException raises in the except handlers for
BrokenPipeError, ValueError, and OSError drop the original exception context;
update each raise to chain the original exception using "raise HTTPException(...
) from e" so tracebacks are preserved — specifically modify the raises in the
BrokenPipeError block (the raise returning status_code=499), the ValueError
block (raise HTTPException(status_code=400, ...)), and the OSError block (both
the EPIPE-specific raise and the generic raise) adjacent to
task_manager.complete_generation/generation creation so each uses "from e".
- Around line 212-278: Create explicit Pydantic response models and attach them
to the route decorator for check_filesystem: define a base model (e.g.,
FileSystemBase with status and message), an error model (FileSystemError extends
FileSystemBase adding optional data_dir and generations_dir and writable:
Optional[bool]), and a success model (FileSystemSuccess extends FileSystemBase
adding data_dir, generations_dir, writable: bool, free_space_gb: float,
total_space_gb: float); then change the decorator to
`@app.get`("/health/filesystem", response_model=Union[FileSystemSuccess,
FileSystemError]) (or a single model with optional fields) and update
check_filesystem to return dicts that match those models' field names so OpenAPI
has a stable schema for the endpoint.

In `@backend/utils/audio.py`:
- Around line 83-84: The imports for pathlib.Path and os should be moved to the
module top level and removed from inside save_audio to avoid repeated lookups
and clarify dependencies; update the file by adding "from pathlib import Path"
and "import os" at the top of the module and delete the in-function import lines
in the save_audio function, keeping all existing references to Path and os
unchanged.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38bf96f and 4ee5b5a.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • backend/main.py
  • backend/tests/test_generation_error_handling.py
  • backend/utils/audio.py

Comment on lines +782 to +789
except BrokenPipeError as e:
# Client disconnected during generation
task_manager.complete_generation(generation_id)
print(f"[WARNING] Client disconnected during generation {generation_id}: {e}")
raise HTTPException(
status_code=499, # Client Closed Request (non-standard but widely used)
detail="Client disconnected during generation. Please retry."
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

HTTP 499 is a non-standard, nginx-specific status code.

Some reverse proxies, load balancers, and HTTP clients may not recognise 499 and could map it to a generic 5xx error, obscuring the intended "client closed request" semantics. Consider using 503 Service Unavailable with a Retry-After header, or 408 Request Timeout, both of which are standardised and widely understood.

🧰 Tools
🪛 Ruff (0.15.2)

[warning] 786-789: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/main.py` around lines 782 - 789, The except BrokenPipeError handler
currently raises HTTPException with non-standard status 499; change it to use a
standard status (e.g., 503 Service Unavailable or 408 Request Timeout) and
include appropriate headers (for 503 add a Retry-After header) while still
calling task_manager.complete_generation(generation_id) and logging the error;
update the raise HTTPException call to use the chosen standard status_code and
include headers={"Retry-After": "<seconds>"} when using 503 and keep the same
detail message so clients and intermediaries correctly interpret the
client-closed condition.

Comment on lines +51 to +64
def test_save_audio_atomic_write(self, tmp_path):
"""Test that save_audio uses atomic write (temp file then rename)."""
audio = np.random.rand(24000).astype(np.float32)
output_path = tmp_path / "test.wav"

with patch('soundfile.write') as mock_write:
# Mock sf.write to not actually write
mock_write.return_value = None

save_audio(audio, str(output_path), sample_rate=24000)

# Should have been called with temp file path
args = mock_write.call_args[0]
assert args[0].endswith('.tmp')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

test_save_audio_atomic_write always fails — os.replace is not mocked.

When soundfile.write is patched to return None, no .tmp file is actually created on disk. The subsequent os.replace(temp_path, path) in save_audio therefore raises FileNotFoundError, which propagates as OSError from save_audio. Since the test has no pytest.raises guard, the test exits with an unexpected exception before reaching the assertions on lines 63–64.

🐛 Proposed fix — also mock `os.replace`, or capture the exception and assert after

Option A: mock both sf.write and os.replace

-        with patch('soundfile.write') as mock_write:
-            # Mock sf.write to not actually write
-            mock_write.return_value = None
-
-            save_audio(audio, str(output_path), sample_rate=24000)
-
-            # Should have been called with temp file path
-            args = mock_write.call_args[0]
-            assert args[0].endswith('.tmp')
+        with patch('utils.audio.sf.write') as mock_write, \
+             patch('utils.audio.os.replace'):
+            mock_write.return_value = None
+
+            save_audio(audio, str(output_path), sample_rate=24000)
+
+            # Should have been called with temp file path
+            args = mock_write.call_args[0]
+            assert args[0].endswith('.tmp')

Option B: allow the exception and assert on call_args afterward

-        with patch('soundfile.write') as mock_write:
-            mock_write.return_value = None
-            save_audio(audio, str(output_path), sample_rate=24000)
-            args = mock_write.call_args[0]
-            assert args[0].endswith('.tmp')
+        with patch('utils.audio.sf.write') as mock_write:
+            mock_write.return_value = None
+            with pytest.raises(OSError):
+                save_audio(audio, str(output_path), sample_rate=24000)
+            # mock_write.call_args is still accessible after the exception
+            args = mock_write.call_args[0]
+            assert args[0].endswith('.tmp')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_generation_error_handling.py` around lines 51 - 64, The
test fails because save_audio calls os.replace on a non-existent temp file when
soundfile.write is patched to do nothing; update test_save_audio_atomic_write to
also patch os.replace (or patch 'os.replace' alongside 'soundfile.write') so no
FileNotFoundError is raised, then assert that the mocked os.replace was called
and that its source argument (the first arg) endswith('.tmp'); reference the
test function test_save_audio_atomic_write and the implementation save_audio,
and assert calls on the mocked os.replace and/or soundfile.write to verify the
atomic-write temp file behavior.

Comment on lines +129 to +130
import tempfile
with tempfile.TemporaryDirectory() as tmpdir:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove redundant import tempfile — it was already imported at line 11.

The module-level import tempfile at line 11 is re-imported here, triggering Ruff F811 (redefinition of unused import).

♻️ Proposed fix
-        import tempfile
-        with tempfile.TemporaryDirectory() as tmpdir:
+        with tempfile.TemporaryDirectory() as tmpdir:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import tempfile
with tempfile.TemporaryDirectory() as tmpdir:
with tempfile.TemporaryDirectory() as tmpdir:
🧰 Tools
🪛 Ruff (0.15.2)

[error] 129-129: Redefinition of unused tempfile from line 11: tempfile redefined here

Remove definition: tempfile

(F811)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_generation_error_handling.py` around lines 129 - 130,
There's a redundant local "import tempfile" inside the test block that
re-imports a module already imported at the top of the file (causing Ruff F811);
remove the inner "import tempfile" so the code uses the module-level import and
keep the with tempfile.TemporaryDirectory() as tmpdir: block unchanged; look for
and delete the duplicate import near the with-statement in
test_generation_error_handling.py.

Comment on lines +98 to +108
except Exception as e:
# Clean up temp file if it exists
temp_path = f"{path}.tmp"
if Path(temp_path).exists():
try:
Path(temp_path).unlink()
except Exception:
pass # Best effort cleanup

# Re-raise with more context
raise OSError(f"Failed to save audio to {path}: {str(e)}") from e
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

errno is discarded in the re-raise, breaking all errno-based error messages in main.py.

raise OSError(f"...") creates a brand-new exception with no errno attribute. The getattr(e, 'errno', None) check in main.py (line 752) will therefore always return None, so every if errno_num == 2/13/28/32 branch in the generation handler is unreachable dead code. The ENOSPC, EPIPE, and EACCES specific messages will never be shown to users.

Note: test_save_audio_no_space_error (test line 115) only asserts "Failed to save audio" in str(exc_info.value) — it doesn't verify errno is preserved, so it passes without catching this regression.

🐛 Proposed fix — preserve errno in the re-raise
-    except Exception as e:
+    except OSError as e:
         # Clean up temp file if it exists
         temp_path = f"{path}.tmp"
         if Path(temp_path).exists():
             try:
                 Path(temp_path).unlink()
-            except Exception:
+            except OSError:
                 pass  # Best effort cleanup
-        raise OSError(f"Failed to save audio to {path}: {str(e)}") from e
+        raise OSError(e.errno, f"Failed to save audio to {path}: {e.strerror or str(e)}") from e
+    except Exception as e:
+        temp_path = f"{path}.tmp"
+        if Path(temp_path).exists():
+            try:
+                Path(temp_path).unlink()
+            except OSError:
+                pass
+        raise OSError(f"Failed to save audio to {path}: {str(e)}") from e

This ensures OSError originating from sf.write or os.replace retains its errno, while unexpected non-OSError exceptions are still wrapped generically.

🧰 Tools
🪛 Ruff (0.15.2)

[error] 104-105: try-except-pass detected, consider logging the exception

(S110)


[warning] 104-104: Do not catch blind exception: Exception

(BLE001)


[warning] 108-108: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 108-108: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/utils/audio.py` around lines 98 - 108, The current except block
discards errno by always raising a brand-new OSError; after cleaning up the temp
file (temp_path / Path(...).unlink()), preserve original errno by re-raising the
original OSError when e is an OSError (or by constructing the new OSError with
the same errno from e.errno) instead of throwing a fresh OSError without errno;
if e is not an OSError, wrap it with a generic OSError including the original
message for context. Use the existing exception variable e and the
path/temp_path names to locate where to change the re-raise logic in
backend/utils/audio.py.

Comment on lines +8 to +21
## [Unreleased]

### Fixed
- **Generation Error Handling** - Improved error handling for file system and connection errors ([#168](https://github.com/jamiepine/voicebox/issues/168), [#140](https://github.com/jamiepine/voicebox/issues/140))
- Implemented atomic file writes to prevent corrupted audio files
- Added specific error messages for different errno codes (ENOENT, EACCES, ENOSPC, EPIPE)
- Handle broken pipe errors gracefully when clients disconnect
- Verify directory writability before saving files

### Added
- **Filesystem Health Check** - New `/health/filesystem` endpoint to diagnose file system issues
- Reports directory status and write permissions
- Shows available disk space
- Helps troubleshoot generation failures
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Merge the duplicate ## [Unreleased] sections.

There are now two ## [Unreleased] sections (lines 8–21 and line 69). Keep a Changelog allows only one; automated tools (e.g. release-please, standard-version) will parse only the first one and ignore the second, so the audio-export fix and Makefile entries at line 69 would be silently dropped at release time. Merge both sets of entries into a single ## [Unreleased] block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` around lines 8 - 21, There are duplicate "## [Unreleased]"
sections; merge the second section's entries (e.g., the audio-export fix and
Makefile entries) into the first "## [Unreleased]" block so all changes appear
under a single header, remove the extra "## [Unreleased]" header, and ensure the
combined subsections (Fixed, Added, etc.) are properly merged and deduplicated
while preserving Markdown list formatting and link references.

@wbrunsong
Copy link

Thanks for the fix, looking forward to Jamie implementing and releasing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Generation failed [Errno 2] No such file director Generation failed [Errno 32] Broken pipe

2 participants