feat: self-hosted transcription + speaker diarization pipeline#30

Open
Copilot wants to merge 6 commits into main from
copilot/feature-transcription-speaker-diarization

Copilot AI commented Mar 29, 2026

  • Explore codebase and understand structure
  • Add faster-whisper, pyannote.audio, ffmpeg-python dependencies to pyproject.toml
  • Add transcription/diarization config env keys to app/core/config.py and env.example
  • Create app/detection/transcribe.py with core pipeline functions
  • Add TranscribeParams, TranscribeResultsModel and related sub-models
  • Add transcribe to JobType enum and update CreateJobRequest param routing
  • Add process_transcribe_task Celery task in app/core/tasks.py
  • Register new task in app/core/celery_queue.py
  • Add unit tests for merge/alignment logic (21 tests, all passing)
  • Replace placeholder http://example.com/video.mp4 with real test video URL
  • Add _transcribe_payload helper and transcribe job type tests to test_api.py
  • Fix CI failures: test_status_returns_correct_shape and test_status_duplicate_project_returns_existing_job by adding a lightweight Redis job registry
  • Regenerate openapi.json with all new transcribe/diarization schemas (TranscribeParams, TranscribeResultsModel, DiarizationSegmentModel, etc.)
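
The merge/alignment step mentioned in the task list is not shown in this conversation. As a rough illustration only, one common way to combine a transcript with diarization output is to give each transcript segment the speaker whose turn overlaps it the most in time. The class and function names below (`DiarizationTurn`, `TranscriptSegment`, `assign_speakers`) are hypothetical and are not taken from `app/detection/transcribe.py`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DiarizationTurn:
    """One speaker turn (hypothetical shape, times in seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class TranscriptSegment:
    """One transcript segment (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals, 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[TranscriptSegment], turns: List[DiarizationTurn]
) -> List[TranscriptSegment]:
    """Return copies of the segments labelled with the max-overlap speaker.

    A segment that overlaps no turn keeps speaker=None.
    """
    labelled = []
    for seg in segments:
        best_speaker = None
        best_overlap = 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_overlap = shared
                best_speaker = turn.speaker
        labelled.append(
            TranscriptSegment(seg.start, seg.end, seg.text, best_speaker)
        )
    return labelled


segments = [
    TranscriptSegment(0.0, 2.0, "hi"),
    TranscriptSegment(2.0, 5.0, "there"),
    TranscriptSegment(10.0, 11.0, "late"),
]
turns = [
    DiarizationTurn(0.0, 2.5, "A"),
    DiarizationTurn(2.5, 6.0, "B"),
]
print([s.speaker for s in assign_speakers(segments, turns)])  # ['A', 'B', None]
```

Returning new objects instead of mutating the inputs keeps a function like this easy to unit test, which fits the kind of merge/alignment tests the task list refers to.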

Copilot AI linked an issue Mar 29, 2026 that may be closed by this pull request
11 tasks
Copilot AI changed the title from "[WIP] Implement self-hosted transcription with speaker diarization" to "feat: self-hosted transcription + speaker diarization pipeline" Mar 29, 2026
Copilot AI requested a review from younes200 March 29, 2026 15:45
@younes200

Copilot AI commented Mar 29, 2026

@copilot use this video url for testing instead of https://example.com/video.mp4 :

https://pt-storage.celluloid.cloud/web-videos/a53d2ce5-0db1-49bf-9daa-be2dafd46ee9-144.mp4

Done in commit 7eed24a. Replaced the placeholder URL with https://pt-storage.celluloid.cloud/web-videos/a53d2ce5-0db1-49bf-9daa-be2dafd46ee9-144.mp4 in all three payload helpers (_object_detect_payload, _scene_detect_payload, _transcribe_payload) and the inline test bodies. Also added a _transcribe_payload helper and transcribe job type tests to test_api.py.

@younes200

@copilot fix the failing github workflow action API Tests / api-tests (pull_request)

Copilot AI commented Mar 29, 2026

Fixed in commit 68a1b35.

The two failing tests were caused by a race condition. With worker_prefetch_multiplier=1 and the real video URL (which takes ~60s to process), the single Celery worker is busy executing an earlier job, so new jobs sit in the Celery queue in PENDING state. Queued jobs are invisible to inspect.active()/reserved(), so get_job_from_celery() returned None and the request ended in a 404.

The fix adds a lightweight Redis job registry (cvision:job:{job_id} hashes, 24h TTL) that stores minimal job metadata at enqueue time. When Celery reports PENDING and the inspect API finds nothing, we fall back to this registry. get_all_jobs() also merges registry entries so the deduplication check works correctly for all queued jobs.
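
A minimal sketch of the registry fallback described above, not the code from commit 68a1b35. The key pattern `cvision:job:{job_id}` and the 24h TTL come from the comment; the function names (`register_job`, `resolve_job`), the stored fields, and the `"queued"` status label are assumptions. The functions only call `hset`, `expire`, and `hgetall`, which exist on a redis-py client, and an in-memory stand-in is used here so the example runs without a Redis server.

```python
JOB_KEY = "cvision:job:{job_id}"   # key pattern from the comment above
JOB_TTL_SECONDS = 24 * 60 * 60     # 24h TTL from the comment above


class InMemoryHashStore:
    """Stand-in for a Redis client, for illustration only.

    It records the TTL passed to expire() but does not enforce it.
    """

    def __init__(self):
        self.hashes = {}
        self.ttls = {}

    def hset(self, name, mapping):
        self.hashes.setdefault(name, {}).update(mapping)
        return len(mapping)

    def expire(self, name, seconds):
        self.ttls[name] = seconds
        return True

    def hgetall(self, name):
        return dict(self.hashes.get(name, {}))


def register_job(store, job_id, job_type, project_id):
    """Store minimal job metadata at enqueue time (field names assumed)."""
    key = JOB_KEY.format(job_id=job_id)
    store.hset(key, mapping={"job_type": job_type, "project_id": project_id})
    store.expire(key, JOB_TTL_SECONDS)


def resolve_job(store, job_id, celery_state, inspected):
    """Prefer what Celery's inspect API found, else fall back to the registry.

    `inspected` is whatever the inspect lookup returned (None when the task
    is neither active nor reserved). Celery reports PENDING both for queued
    tasks and for unknown ids, so the registry is what tells them apart.
    """
    if inspected is not None:
        return inspected
    if celery_state == "PENDING":
        entry = store.hgetall(JOB_KEY.format(job_id=job_id))
        if entry:
            return {"job_id": job_id, "status": "queued", **entry}
    return None  # genuinely unknown job: the caller can answer 404


store = InMemoryHashStore()
register_job(store, "job-1", "transcribe", "project-42")
print(resolve_job(store, "job-1", "PENDING", None))
print(resolve_job(store, "job-unknown", "PENDING", None))  # None
```

With a real redis-py client, `hgetall` returns bytes unless the client is created with `decode_responses=True`, so a production version would need to decode the fields before merging them into the response.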

@younes200

@copilot generate and commit also the new openapi.json

Copilot AI commented Mar 29, 2026

Done in commit 2a50283. The openapi.json is regenerated and now includes all new schemas: TranscribeParams, TranscribeResultsModel, DiarizationSegmentModel, TranscribeWordModel, and TranscriptSegmentModel, along with the transcribe job type in the CreateJobRequest discriminated union.

@younes200 younes200 marked this pull request as ready for review March 29, 2026 16:43
Add robust diarization handling, model persistence configuration, and deployment/hook updates while removing hardcoded secrets from deployment scripts.

Made-with: Cursor

Development

Successfully merging this pull request may close these issues.

Feature: transcription + speaker diarization

2 participants