refactor: centralize LLM prompt templates and transcript utilities #226
vegerot wants to merge 1 commit into JerryZLiu:main
Pull request overview
This PR centralizes prompt templates and transcript utilities used by LLM providers in Dayflow. Previously, GeminiDirectProvider contained large inline prompt strings and private timestamp/validation helpers; these are now extracted into shared types (LLMPromptTemplates, LLMTranscriptUtilities, LLMVideoTimestampUtilities, LLMTimelineCardValidation) for reuse across providers. Additionally, the Gemini-specific prompt preference types are renamed to provider-agnostic names (VideoPromptOverrides, VideoPromptPreferences, VideoPromptSections), TestConnectionView is made provider-aware, and the onboarding card-width calculation is made dynamic.
Changes:
- Extract inline prompt strings from GeminiDirectProvider into LLMPromptTemplates and centralize transcript/timestamp utilities into LLMTranscriptUtilities/LLMVideoTimestampUtilities/LLMTimelineCardValidation in TimeParsing.swift
- Rename GeminiPromptPreferences/Overrides/Sections to VideoPromptPreferences/Overrides/Sections, and refactor TestConnectionView to accept an LLMProviderID provider parameter with shared finishFailure/finishSuccess helpers
- Add local OSLog logging to AnalyticsService alongside PostHog capture via a new captureToPostHogAndLocal helper; fix dynamic card-width calculation in OnboardingLLMSelectionView
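The dynamic card-width change can be pictured with a minimal sketch. The function name and the `availableWidth` and `spacing` parameters below are hypothetical and do not come from the PR; the only idea taken from it is dividing by the number of cards instead of a hard-coded 3.

```swift
import Foundation

/// Hypothetical helper: width of one card when `cardCount` cards share a row.
/// `availableWidth` and `spacing` are illustrative names, not taken from the PR.
func cardWidth(availableWidth: Double, cardCount: Int, spacing: Double) -> Double {
    // Treat an empty list as one card so we never divide by zero.
    let count = max(cardCount, 1)
    // There is one gap between each pair of neighboring cards.
    let totalSpacing = spacing * Double(count - 1)
    // Clamp at zero in case the spacing alone exceeds the available width.
    return max((availableWidth - totalSpacing) / Double(count), 0)
}
```

With a divisor that follows the card count, adding a fourth provider card would shrink every card instead of overflowing the row.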
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| GeminiPromptPreferences.swift | Renames Gemini-specific prompt types to Video* prefix; adds LLMPromptTemplates with shared prompt-generating functions |
| TimeParsing.swift | Adds LLMVideoTimestampUtilities, LLMTimelineCardValidation, and LLMTranscriptUtilities enums centralizing timestamp parsing, validation, and transcript decoding |
| GeminiDirectProvider.swift | Replaces inline prompt strings and private helpers with calls to the new shared utilities |
| TestConnectionView.swift | Adds provider: LLMProviderID parameter and shared finishFailure/finishSuccess helpers for multi-provider support |
| AnalyticsService.swift | Adds captureToPostHogAndLocal helper with OSLog local logging; consolidates all capture calls through it |
| ProvidersSettingsViewModel.swift | Updates to use VideoPromptPreferences/VideoPromptOverrides |
| SettingsProvidersTabView.swift | Passes explicit .gemini provider to updated TestConnectionView |
| OnboardingLLMSelectionView.swift | Makes card-width calculation dynamic based on providerCards.count |
    static func formatTimestampForPrompt(_ unixTime: Int) -> String {
        let date = Date(timeIntervalSince1970: TimeInterval(unixTime))
        let formatter = DateFormatter()
        formatter.dateFormat = "h:mm a"
        formatter.locale = Locale(identifier: "en_US_POSIX")
        formatter.timeZone = TimeZone.current
        return formatter.string(from: date)
The new LLMTimelineCardValidation.formatTimestampForPrompt method at line 83 creates a new DateFormatter on every call. The file already caches DateFormatter instances (see cachedHMMAFormatters at lines 4-17), with a comment explaining that "creating DateFormatters is expensive (ICU initialization)". For consistency, this method should either use a cached DateFormatter or delegate to the LLMVideoTimestampUtilities helpers. Note that DateFormatter is not thread-safe, so a shared cached instance would need synchronization if it is called from multiple threads.
-    static func formatTimestampForPrompt(_ unixTime: Int) -> String {
-        let date = Date(timeIntervalSince1970: TimeInterval(unixTime))
-        let formatter = DateFormatter()
-        formatter.dateFormat = "h:mm a"
-        formatter.locale = Locale(identifier: "en_US_POSIX")
-        formatter.timeZone = TimeZone.current
-        return formatter.string(from: date)
+    /// Cached formatter for human-readable prompt timestamps.
+    /// Access to this formatter is synchronized because DateFormatter is not thread-safe.
+    private static let promptTimeFormatter: DateFormatter = {
+        let formatter = DateFormatter()
+        formatter.dateFormat = "h:mm a"
+        formatter.locale = Locale(identifier: "en_US_POSIX")
+        formatter.timeZone = TimeZone.current
+        return formatter
+    }()
+    private static let promptTimeFormatterQueue = DispatchQueue(label: "LLMTimelineCardValidation.promptTimeFormatter")
+    static func formatTimestampForPrompt(_ unixTime: Int) -> String {
+        let date = Date(timeIntervalSince1970: TimeInterval(unixTime))
+        return promptTimeFormatterQueue.sync {
+            promptTimeFormatter.string(from: date)
+        }
Good point, but this code isn't hot enough to make a big difference
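For anyone weighing the trade-off discussed in this thread, here is a minimal sketch of the same caching idea guarded by an NSLock instead of a serial queue. The enum and method names are hypothetical, and the time zone is pinned to UTC only to make the example deterministic (the PR uses TimeZone.current). Apple's documentation describes DateFormatter as thread-safe on current Apple platforms, so the lock should be read as a conservative choice, not a requirement.

```swift
import Foundation

/// Hypothetical stand-in for the formatter discussed above; names are illustrative.
enum PromptTimeFormatting {
    private static let lock = NSLock()

    /// Built once and reused, avoiding repeated DateFormatter setup.
    private static let formatter: DateFormatter = {
        let f = DateFormatter()
        f.dateFormat = "h:mm a"
        f.locale = Locale(identifier: "en_US_POSIX")
        // Assumption: fixed UTC for a deterministic example; the PR uses TimeZone.current.
        f.timeZone = TimeZone(identifier: "UTC")
        return f
    }()

    static func format(unixTime: Int) -> String {
        let date = Date(timeIntervalSince1970: TimeInterval(unixTime))
        // Serialize access to the shared formatter.
        lock.lock()
        defer { lock.unlock() }
        return formatter.string(from: date)
    }
}
```

One trade-off applies to any cached variant: a formatter configured with TimeZone.current captures the time zone at creation, so a later system time zone change would not be picked up, whereas the per-call version in the PR reads it fresh each time.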
@@ -14,21 +14,21 @@ struct GeminiPromptOverrides: Codable, Equatable {
     }
 }

-enum GeminiPromptPreferences {
+enum VideoPromptPreferences {
     private static let overridesKey = "geminiPromptOverrides"
     private static let store = UserDefaults.standard

-    static func load() -> GeminiPromptOverrides {
+    static func load() -> VideoPromptOverrides {
         guard let data = store.data(forKey: overridesKey) else {
-            return GeminiPromptOverrides()
+            return VideoPromptOverrides()
         }
-        guard let overrides = try? JSONDecoder().decode(GeminiPromptOverrides.self, from: data) else {
-            return GeminiPromptOverrides()
+        guard let overrides = try? JSONDecoder().decode(VideoPromptOverrides.self, from: data) else {
+            return VideoPromptOverrides()
         }
         return overrides
     }

-    static func save(_ overrides: GeminiPromptOverrides) {
+    static func save(_ overrides: VideoPromptOverrides) {
         guard let data = try? JSONEncoder().encode(overrides) else { return }
         store.set(data, forKey: overridesKey)
     }
@@ -145,19 +145,182 @@ The goal: someone could reconstruct exactly what you did just from the detailed
 """
 }

-struct GeminiPromptSections {
+struct VideoPromptSections {
     let title: String
     let summary: String
     let detailedSummary: String

-    init(overrides: GeminiPromptOverrides) {
-        self.title = GeminiPromptSections.compose(defaultBlock: GeminiPromptDefaults.titleBlock, custom: overrides.titleBlock)
-        self.summary = GeminiPromptSections.compose(defaultBlock: GeminiPromptDefaults.summaryBlock, custom: overrides.summaryBlock)
-        self.detailedSummary = GeminiPromptSections.compose(defaultBlock: GeminiPromptDefaults.detailedSummaryBlock, custom: overrides.detailedBlock)
+    init(overrides: VideoPromptOverrides) {
+        self.title = VideoPromptSections.compose(defaultBlock: GeminiPromptDefaults.titleBlock, custom: overrides.titleBlock)
+        self.summary = VideoPromptSections.compose(defaultBlock: GeminiPromptDefaults.summaryBlock, custom: overrides.summaryBlock)
+        self.detailedSummary = VideoPromptSections.compose(defaultBlock: GeminiPromptDefaults.detailedSummaryBlock, custom: overrides.detailedBlock)
     }

     private static func compose(defaultBlock: String, custom: String?) -> String {
         let trimmed = custom?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
         return trimmed.isEmpty ? defaultBlock : trimmed
     }
 }
+
+/// Shared prompt templates used by multiple LLM providers.
+///
+/// When prompts must remain *exactly* identical between providers, keep them here and call these helpers.
+enum LLMPromptTemplates {
+    static func screenRecordingTranscriptionPrompt(durationString: String) -> String {
+        """
+        Screen Recording Transcription (Reconstruct Mode)
+        Watch this screen recording and create an activity log detailed enough that someone could reconstruct the session.
+        CRITICAL: This video is exactly \(durationString) long. ALL timestamps must be within 00:00 to \(durationString). No gaps.
+        Identifying the active app: On macOS, the app name is always shown in the top-left corner of the screen, right next to the Apple () menu. Check this FIRST to identify which app is being used. Do NOT guess — read the actual name from the menu bar. If you can't read it clearly, describe it generically (e.g., "code editor," "browser," "messaging app") rather than guessing a specific product name. Common code editors like Cursor, VS Code, Xcode, and Zed all look similar but have different names in the menu bar.
+        For each segment, ask yourself:
+        "What EXACTLY did they do? What SPECIFIC things can I see?"
+        Capture:
+        - Exact app/site names visible (check menu bar for app name)
+        - Exact file names, URLs, page titles
+        - Exact usernames, search queries, messages
+        - Exact numbers, stats, prices shown
+        Bad: "Checked email"
+        Good: "Gmail: Read email from boss@company.com 'RE: Budget approval' - replied 'Looks good'"
+        Bad: "Browsing Twitter"
+        Good: "Twitter/X: Scrolled feed - viewed posts by @pmarca about AI, @sama thread on GPT-5 (12 tweets)"
+        Bad: "Working on code"
+        Good: "Editing StorageManager.swift in [exact app name from menu bar] - fixed type error on line 47, changed String to String?"
+        Segments:
+        - 3-8 segments total
+        - You may use 1 segment only if the user appears idle for most of the recording
+        - Group by GOAL not app (IDE + Terminal + Browser for the same task = 1 segment)
+        - Do not create gaps; cover the full timeline
+        Return ONLY JSON in this format:
+        [
+          {
+            "startTimestamp": "MM:SS",
+            "endTimestamp": "MM:SS",
+            "description": "1-3 sentences with specific details"
+          }
+        ]
+        """
+    }
+
+    static func activityCardsPrompt(
+        existingCardsString: String,
+        transcriptText: String,
+        categoriesSection: String,
+        promptSections: VideoPromptSections,
+        languageBlock: String
+    ) -> String {
+        """
+        # Timeline Card Generation
+
+        You're writing someone's personal work journal. You'll get raw activity logs — screenshots, app switches, URLs — and your job is to turn them into timeline cards that help this person remember what they actually did.
+
+        The test: when they scan their timeline tomorrow morning, each card should make them go "oh right, that."
+
+        Write as if you ARE the person jotting down notes about their day. Not an analyst writing a report. Not a manager filing a status update.
+
+        ---
+
+        ## Card Structure
+
+        Each card covers one cohesive chunk of activity, roughly 15–60 minutes.
+
+        - Minimum 10 minutes per card. If something would be shorter, fold it into the neighboring card that makes the most sense.
+        - Maximum 60 minutes. If a card runs longer, split it where the focus naturally shifts.
+        - No gaps or overlaps between cards. If there's a real gap in the source data, preserve it. Otherwise, cards should meet cleanly.
+
+        **When to start a new card:**
+        1. What's the main thing happening right now?
+        2. Does the next chunk of activity continue that same thing? → Keep extending.
+        3. Is there a brief unrelated detour (<5 min)? → Log it as a distraction, keep the card going.
+        4. Has the focus genuinely shifted for 10+ minutes? → New card.
+
+        ---
+
+        \(promptSections.title)
+
+        ---
+
+        \(promptSections.summary)
+
+        ---
+
+        \(promptSections.detailedSummary)
+
+        \(languageBlock)
+
+        ---
+
+        ## Category
+
+        \(categoriesSection)
+
+        ---
+
+        ## Distractions
+
+        A distraction is a brief (<5 min) unrelated interruption inside a card. Checking X for 2 minutes while debugging is a distraction. Spending 15 minutes on X is not a distraction — it's either part of the card's theme or it's a new card.
+
+        Don't label related sub-tasks as distractions. Googling an error message while debugging isn't a distraction, it's part of debugging.
+
+        ---
+
+        ## App Sites
+
+        Identify the main app or website for each card.
+
+        - primary: the main app used in the card (canonical domain, lowercase, no protocol).
+        - secondary: another meaningful app used, or the enclosing app (e.g., browser). Omit if there isn't a clear one.
+
+        Be specific: docs.google.com not google.com, mail.google.com not google.com.
+
+        Common mappings:
+        - Figma → figma.com
+        - Notion → notion.so
+        - Google Docs → docs.google.com
+        - Gmail → mail.google.com
+        - VS Code → code.visualstudio.com
+        - Xcode → developer.apple.com/xcode
+        - Twitter/X → x.com
+        - Zoom → zoom.us
+        - ChatGPT → chatgpt.com
+
+        ---
+
+        ## Continuity Rules
+
+        Your output cards must cover the same total time range as the previous cards plus any new observations. Think of previous cards as a draft you're revising and extending, not locked history.
+
+        - Don't drop time segments that were previously covered.
+        - If new observations extend beyond the previous range, add cards to cover the new time.
+        - Preserve genuine gaps in the source data.
+
+        INPUTS:
+        Previous cards: \(existingCardsString)
+        New observations: \(transcriptText)
+        Return ONLY a JSON array with this EXACT structure:
+
+        [
+          {
+            "startTime": "1:12 AM",
+            "endTime": "1:30 AM",
+            "category": "",
+            "subcategory": "",
+            "title": "",
+            "summary": "",
+            "detailedSummary": "",
+            "distractions": [
+              {
+                "startTime": "1:15 AM",
+                "endTime": "1:18 AM",
+                "title": "",
+                "summary": ""
+              }
+            ],
+            "appSites": {
+              "primary": "",
+              "secondary": ""
+            }
+          }
+        ]
+        """
+    }
+}
The file GeminiPromptPreferences.swift now contains VideoPromptOverrides, VideoPromptPreferences, VideoPromptSections, and LLMPromptTemplates — types that are no longer Gemini-specific. The filename no longer reflects the contents of the file. Consider renaming it to something like LLMPromptPreferences.swift or VideoPromptPreferences.swift to match the types it now defines.
I agree with this, and initially did name it something else, but it made keeping track of upstream too hard. Once this patch lands I'll open another one to do the rename
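The override fallback that VideoPromptSections.compose applies in the diff can be exercised in isolation. The sketch below copies the two-line rule from the diff into a free function; the name `composeBlock` is hypothetical, since the original is a private static method.

```swift
import Foundation

/// Standalone copy of the fallback rule shown in the diff.
/// `composeBlock` is an illustrative name; the PR keeps this logic private.
func composeBlock(defaultBlock: String, custom: String?) -> String {
    // A nil override is treated the same as an empty one.
    let trimmed = custom?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
    // Whitespace-only overrides fall back to the default block.
    return trimmed.isEmpty ? defaultBlock : trimmed
}
```

Because the rule trims before checking emptiness, a user who clears a custom prompt field but leaves a stray newline still gets the default block.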
@JerryZLiu Would you please review this PR? This doesn't change any behavior, but it makes adding new video providers in my fork easier.
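One reason the rename is plausibly behavior-preserving for stored settings: the diff keeps the UserDefaults key "geminiPromptOverrides", and synthesized Codable conformance encodes property names, not the type name. The sketch below illustrates that with two stand-in structs. Their names are hypothetical, and the three optional fields are only the ones the diff reads (titleBlock, summaryBlock, detailedBlock); the real struct may hold more.

```swift
import Foundation

// Stand-ins for the type before and after the rename (hypothetical names).
struct OldOverrides: Codable, Equatable {
    var titleBlock: String?
    var summaryBlock: String?
    var detailedBlock: String?
}

struct NewOverrides: Codable, Equatable {
    var titleBlock: String?
    var summaryBlock: String?
    var detailedBlock: String?
}

/// Encodes with the old type and decodes with the new one, mimicking data
/// written before the rename and read after it.
func migrateByDecoding(_ old: OldOverrides) throws -> NewOverrides {
    let data = try JSONEncoder().encode(old)
    return try JSONDecoder().decode(NewOverrides.self, from: data)
}
```

As long as the property names and the storage key stay the same, previously saved overrides should decode into the renamed type without a migration step.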
- Extract shared prompt templates into LLMPromptTemplates (GeminiPromptPreferences.swift)
- Add VideoPromptPreferences/VideoPromptOverrides/VideoPromptSections types, replacing GeminiPromptPreferences/GeminiPromptOverrides/GeminiPromptSections
- Centralize transcript JSON decoding and observation conversion in LLMTranscriptUtilities (TimeParsing.swift) for reuse across providers
- Refactor GeminiDirectProvider to use LLMPromptTemplates and LLMTranscriptUtilities
- Refactor TestConnectionView to accept a provider parameter with finishFailure/finishSuccess helpers for clean multi-provider support
- Fix OnboardingLLMSelectionView card-width calculation to be dynamic based on card count rather than hard-coded divisor of 3
- Update SettingsProvidersTabView and ProvidersSettingsViewModel to use new VideoPrompt* types

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Stack created with Sapling. Best reviewed with ReviewStack.