Risk control group enhance by andrenoah307 · Pull Request #1 · Calderic/new-api-xifeng

andrenoah307 · 2026-04-27T10:16:44Z

⚠️ PR Notice

Important

Please provide a concise, human-written summary; avoid pasting unedited AI output directly.

📝 Description

Enhance the risk control center and add content moderation
Attempt to fix the billing issue on abnormal streams

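🚀 Type of change

🐛 Bug fix - please link the corresponding Issue; do not classify design trade-offs, misunderstandings, or mismatched expectations as bugs
✨ New feature - for major features, please discuss in an Issue first
⚡ Performance optimization / Refactor
📝 Documentation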

🔗 Related Issue

Closes # (if any)

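✅ Checklist

Human confirmation: I have personally organized and written this description and have not pasted unprocessed AI output directly.
Not a duplicate: I have searched existing Issues and PRs and confirmed this is not a duplicate submission.
Bug fix statement: If this PR is marked as a Bug fix, I have filed or linked the corresponding Issue, and I will not classify design trade-offs, mismatched expectations, or misunderstandings as bugs.
Understanding of changes: I understand how these changes work and what they may affect.
[x] Focused scope: This PR contains no code changes unrelated to the current task.
Local verification: I have run and passed tests or manual verification locally, and maintainers can use this to reproduce the result.
Security and compliance: The code contains no sensitive credentials and follows the project's coding standards.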

📸 Proof of Work

(Paste screenshots, key logs, or test reports here to prove the change works)

- introduce per-group whitelist (EnabledGroups) plus per-group mode override (GroupModes) on RiskControlSetting; default empty so the engine has zero impact on upgrade until admins flip groups in
- key every metric/inflight/block/rule-hit Redis key and memory map by (scope, subjectID, group); the same token tracked across groups now has independent counters and block state
- snapshot UsingGroup into RelayInfo.RiskGroup at BeforeRelay so the finish/start pair always lands on the same bucket even when auto cross-group retry rewrites UsingGroup mid-request
- rebuild risk_subject_snapshot unique index to (subject_type, subject_id, group) with idempotent cross-db drop+recreate; add group columns and indexes to risk_rules and risk_incident
- gate every BeforeRelay/AfterRelay/enqueue call on isRiskControlEnabledForGroup so unlisted groups bypass risk control entirely; reject auto from EnabledGroups during normalize
- ship GET /api/risk/groups returning schema_version=1 matrix used by the new admin "分组启用矩阵" (group enablement matrix) widget; require ?group= on unblock, document why non-whitelisted unblocks are still allowed
- enforce "rule must bind groups before enable" in validateRiskRule and the rule editor; surface unconfigured + unlisted rule counts on the overview cards
- TDD coverage: triple-key isolation, group-aware evaluate, effective mode truth table, normalize filter for auto/invalid mode, controller unblock contract, sortRiskGroups order
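The triple-key isolation described above can be pictured with a minimal Go sketch. The `SubjectKey` type, the `RedisKey` method, and the `rc:` key layout are all hypothetical names chosen for illustration; the PR's actual key format is not shown in this description.

```go
package main

import (
	"fmt"
	"strings"
)

// SubjectKey is a hypothetical composite key: one bucket per
// (scope, subjectID, group) triple.
type SubjectKey struct {
	Scope     string // e.g. "token" or "user"
	SubjectID int
	Group     string
}

// RedisKey renders the triple as a namespaced key. The "rc" prefix and the
// field order are assumptions, not the PR's real layout.
func (k SubjectKey) RedisKey(kind string) string {
	return strings.Join([]string{"rc", kind, k.Scope, fmt.Sprint(k.SubjectID), k.Group}, ":")
}

func main() {
	// In-memory counters keyed by the same triple: the struct is comparable,
	// so it can be used directly as a map key.
	counters := map[SubjectKey]int{}
	vip := SubjectKey{Scope: "token", SubjectID: 42, Group: "vip"}
	free := SubjectKey{Scope: "token", SubjectID: 42, Group: "free"}

	counters[vip] += 3
	counters[free]++

	// The same token has independent counters per group.
	fmt.Println(counters[vip], counters[free])
	fmt.Println(vip.RedisKey("metric"))
}
```

Using a comparable struct as the map key keeps the memory map and the Redis key derived from one source of truth, so the two cannot drift apart on which fields define a bucket.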

- triggers on dev/** and dev-* branches so admins can validate staging builds before merging back to main
- runs go vet and go test on the focused risk control packages so the TDD suite stays green
- builds and pushes a linux/amd64 image to ghcr.io with two tags: dev-<branch-slug> (floating) and dev-<branch-slug>-<sha7> (immutable per commit)
- self-hosted runner references and goreleaser/helm/kustomize blocks from the reference workflow are intentionally omitted; this repo only needs github-hosted runners and a single ghcr push

The metadata-action emits multiple tags newline-separated. Inlining ${{ steps.meta.outputs.tags }} into "for tag in ..." injects literal newlines into the shell script, which the runner parses as "syntax error near unexpected token". Pass the value through env and iterate with `while read` instead.

Two independent features sharing the risk console.

Login warning (enforce path):
- User.RiskWarningPendingAt timestamp refreshed by the engine whenever an enforce-mode user-scope decision is non-allow; ack handler zeroes it
- GET /api/user/self exposes a boolean risk_warning_pending without leaking timestamps, scopes, or rule names so users cannot reverse the thresholds
- POST /api/user/self/risk_warning/ack clears the flag without lifting the actual block
- Dashboard shell shows AccountRiskWarningModal once per fresh decision, closable only via the explicit acknowledge button

Async OpenAI omni-moderation:
- New independent ModerationSetting with EnabledGroups/GroupModes (mirroring the risk-control gate pattern), multi-key list, sampling rate (integer percent), threshold, two-tier retention (flagged rows kept for 30 days by default for downstream client handling, benign rows for 72h)
- ModerationKeyRing rotates keys round-robin with per-key cooldowns; 429 honours Retry-After then falls back to exponential backoff
- moderationCenter copies relay payloads off the gin context and enqueues via gopool.Go so the relay path never blocks; debug card uses the same async pipeline with a polled debugStore for results
- Engine implements buildModerationRequest / parseModerationResponse per the official OpenAI API schema (multi-modal input array, results with category_scores and category_applied_input_types)
- New endpoints under /api/risk/moderation/{config,overview,debug,debug/:id,incidents}; admin tab in the risk page wraps the existing distribution-detection workflow in a top-level Tabs strip
- PreflightModerationHook stubbed for future enforce-mode work; the signature is locked so callers do not change later

Tests (all green via go test ./service/ ./model/ ./controller/):
- KeyRing round-robin / cooldown skip / all-cooldown / reset / empty
- buildModerationRequest emits multi-modal array and rejects empty
- parseModerationResponse folds max score and applied input types
- parseRetryAfter handles plain seconds and invalid values
- ModerationSetting normalize filters auto and clamps sampling/threshold
- IsModerationEnabledForGroup truth table
- PreflightModerationHook stub allow-all

…gories

Replace the single-threshold gate with a full rule system mirroring the distribution-detection workflow:
- ModerationRule model + CRUD with name/match_mode/action/priority/score_weight/conditions/groups, indexed by name and enabled. Reload drops rules with empty groups so admins cannot accidentally enable a rule that silently never fires.
- ModerationCondition is the unit predicate: { category, op, value, apply_input_type, applied_input_type }. ApplyInputType is a per-row toggle so admins choose whether the rule must match the OpenAI category_applied_input_types list (text/image) or the raw category score regardless of modality.
- ValidateModerationRule pins category to the official OpenAI 13-item list, value to [0,1], op to the shared comparator set, and rejects image-only filters on text-only categories (e.g. sexual/minors).
- EvaluateModerationRules runs every group-applicable rule with the rule's own match_mode (all → AND, any → OR; default all). Conditions whose category is missing from the response are short-circuit failures under AND, ignored under OR.
- BuildModerationDecision picks the most severe action (block > flag > observe). Block is recorded in incidents but does NOT short-circuit the relay path in v3; the existing PreflightModerationHook stub is the future home for that behavior.
- moderation_center now records incidents only when a rule fires; the legacy FlagScoreThreshold fallback has been removed per the v3 design ("不保留兜底", i.e. no fallback is kept). Debug events still record so admins can audit threshold-tuning sessions; previewModerationDecision in debug mode evaluates against every enabled rule (group-agnostic) so the editor shows what would fire.
- Five seeded default rules (sexual/minors block, violent illicit flag, text-only sexual flag, image-only violence flag, hate/harassment combo observe), all Enabled=false until an operator binds a group.
- Frontend: the 内容审核 (content moderation) tab gains an "审核规则" (moderation rules) card with table, switch, edit/delete actions, and a 6-field-per-row editor modal. Each condition row exposes the apply_input_type toggle plus a category picker that disables the image option on text-only categories.
- ModerationIncident gains decision/primary_rule/matched_rules columns so downstream dashboards can pivot on rule names.

Tests:
- AND mode requires every condition; OR mode needs one
- ApplyInputType toggle distinguishes text-only vs image-only matches
- Group filtering, decision severity ordering, allow on empty match
- Validate rejects unknown categories, out-of-range scores, image filters on text-only categories, enabled rules without groups
- previewModerationDecision picks across all enabled rules regardless of group
- Category list exposes image_scored flag for the UI dropdown
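The match_mode semantics and the severity ordering described above can be sketched as follows. This is a simplified illustration, not the PR's code: the `Condition`, `Rule`, `matches`, and `decide` names are hypothetical, only two comparators are supported, and group filtering and the apply_input_type toggle are omitted.

```go
package main

import "fmt"

// Condition is a simplified unit predicate over one category score.
type Condition struct {
	Category string
	Op       string // this sketch supports only ">=" and "<"
	Value    float64
}

// Rule groups conditions under a match mode and carries an action.
type Rule struct {
	Name       string
	MatchMode  string // "all" (default, AND) or "any" (OR)
	Action     string // "observe", "flag", or "block"
	Conditions []Condition
}

// holds reports whether the condition is satisfied. A category missing from
// the scores never holds, which makes it a failure under AND and a
// non-contributor under OR.
func (c Condition) holds(scores map[string]float64) bool {
	s, ok := scores[c.Category]
	if !ok {
		return false
	}
	switch c.Op {
	case ">=":
		return s >= c.Value
	case "<":
		return s < c.Value
	}
	return false
}

// matches applies the rule's own match mode to its conditions.
func matches(r Rule, scores map[string]float64) bool {
	if len(r.Conditions) == 0 {
		return false
	}
	anyMode := r.MatchMode == "any"
	for _, c := range r.Conditions {
		ok := c.holds(scores)
		if anyMode && ok {
			return true // OR: one satisfied condition is enough
		}
		if !anyMode && !ok {
			return false // AND: one failed condition is enough
		}
	}
	return !anyMode
}

// severity orders actions: block > flag > observe > allow.
var severity = map[string]int{"allow": 0, "observe": 1, "flag": 2, "block": 3}

// decide returns the most severe action among matching rules plus their names.
func decide(rules []Rule, scores map[string]float64) (string, []string) {
	action := "allow"
	var matched []string
	for _, r := range rules {
		if !matches(r, scores) {
			continue
		}
		matched = append(matched, r.Name)
		if severity[r.Action] > severity[action] {
			action = r.Action
		}
	}
	return action, matched
}

func main() {
	scores := map[string]float64{"violence": 0.9, "hate": 0.2}
	rules := []Rule{
		{Name: "violence-flag", Action: "flag",
			Conditions: []Condition{{"violence", ">=", 0.8}}},
		{Name: "strict-block", Action: "block", // fails: sexual/minors is missing
			Conditions: []Condition{{"violence", ">=", 0.8}, {"sexual/minors", ">=", 0.1}}},
	}
	fmt.Println(decide(rules, scores))
}
```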

Production debug runs were silently routed through a synthetic "__debug__" group that no rule binds to, so the rule engine returned allow and the recorded incident showed 未命中 (no hit), even when OpenAI itself had flagged the input. The debug card meanwhile rendered the upstream raw flagged field, which was sourced from a different field than the persisted decision, so admins saw 命中 (hit) on the result Tag and 未命中 in the incident table at the same time.
- SubmitModerationDebug now accepts a group parameter; non-empty evaluates the request against that group's bound rules (mirroring production traffic in that group), empty falls back to the legacy preview that scans every enabled rule. The whitelist gate is bypassed in both cases so admins can rehearse before flipping on.
- The frontend debug card adds a group selector (default = preview; groups list mirrors the moderation enablement matrix and excludes auto). The submit payload includes the chosen group.
- Result rendering distinguishes the OpenAI raw flagged tag from the rule-engine decision tag and lists matched rule names so the card no longer disagrees with the persisted incident row.

Both engines (distribution detection and moderation) now hand off to a single enforcement service that owns the user-facing email policy, per-user hit counters, and the auto-ban decision. Engines stay decoupled: neither one knows anything about email plumbing.
- EnforcementSetting registers under the "enforcement" option key with defaults locked off (Enabled=false, EmailOn*=false, BanThreshold=0) so upgrade is zero-effect until an operator opts in. Per-source ban thresholds let admins weight moderation hits differently from distribution hits, and the email rate limiter is "max N emails per M minutes per user" with a 3-per-10min default.
- service/enforcement.go.EnforcementHit is the engine-facing entry point, gopool-spawned and side-effect free when disabled. Hit processing follows decision points 1-9: fixed-window counter (resets at expiry), atomic per-user UPDATE, audit row with merged email status, optional auto-ban that flips User.Status to Disabled exactly like the existing admin disable flow. Already-banned users skip silently to avoid duplicate emails. Vague email templates intentionally omit rule names, only carrying time/group/source/count/threshold.
- Counters live on the User row (HitCountRisk, HitCountModeration, WindowStartAt, LastHitAt, EmailWindowStartAt, EmailCountInWindow, AutoBannedAt) so increments are a single UPDATE without joins. Manual unban / reset zeroes everything per decision point 6.
- The two engines call EnforcementHit at the same place they fire user-level vague warnings (risk_control on enforce-mode user decisions; moderation_center on rule-engine non-allow decisions for relay-source events only; debug runs stay local).
- 8 admin endpoints under /api/risk/enforcement/* (config, overview, incidents, counters, reset_counter, unban, test_email). The test email endpoint is hard-wired to the calling admin's mailbox so it cannot be repurposed as a relay (decision point 7).
- Frontend ships a third top-level tab "处置操作" (enforcement actions) with overview cards, full strategy editor (sources / window / thresholds / per-source thresholds / email rate limit / templates), per-user counter table, and audit incident list with source/action filters.
- Tests cover Normalize source filtering and rate-limit defaults, the source gate truth table, per-source threshold fallback, email body rendering with the "no rule names" red line, and a no-op assertion that confirms EnforcementHit is safe to call when disabled.

The 内容审核 (content moderation) tab derived its enabledGroupSet from the riskGroups prop. That response is sourced from GET /api/risk/groups, which reads the distribution-detection whitelist. Because the two engines are decoupled, an admin who enabled the default group only for content moderation would see "default 已启用" (default enabled) in the per-tab matrix card but still find debug card dropdowns and rule editor labels marking default as "分组未启用内容审核" (group not enabled for content moderation). Production capture confirmed the divergence:
moderation.enabled_groups = ["default"]
risk_control.enabled_groups = ["svip"]
Switch the in-tab badge source to config.enabled_groups (the live moderation setting). The riskGroups prop is still passed in but only used to enumerate the available group names, never to decide whether a group is enabled for moderation.

… batch audit writes

Production captured a real bug: ban-notification emails were being dropped because the single hit-email rate-limit budget was exhausted by an earlier flurry of hit emails. SSH-validated database snapshot confirmed the dropped ban email and showed channel-500/404 upstream failures still entering the moderation queue, contaminating hit counters with content the user never actually saw.

Changes:
- Split email rate limiting into two independent buckets. Hit emails keep the existing 10min/3 default. Ban emails get their own bucket (default 60min/3) backed by new User columns enforcement_ban_email_window_start_at + count. Hit-bucket exhaustion cannot starve ban notifications. JSON migration preserves the v2 email_rate_limit_* keys so saved configs upgrade in place.
- Skip moderation entirely when relayErr != nil and the response delivered no chunks (RelayInfo.SendResponseCount == 0). Failed upstream requests no longer waste OpenAI tokens or pollute hit counters. Streaming responses that delivered at least one SSE chunk before failing still moderate, since the user did receive content.
- Switch the in-memory queue to ring-buffer semantics: when full we drop the OLDEST event, preserving freshness. Add an optional Redis LIST persistence layer (LPUSH/LTRIM/RPOPLPUSH/LREM with startup recovery from rc:mod:processing:* lists) so events survive container restarts. Falls back to memory-only when Redis is unreachable.
- Add a moderation_incidents batcher that aggregates rows for a configurable interval (default 500ms) or batch size (default 100) and uses CreateInBatches. PG write latency is now decoupled from OpenAI worker throughput. Synchronous fallback on submit-channel saturation guarantees no dropped audit rows under sustained load.
- Tune defaults for the target sizing scenario captured in DEV_GUIDE §14: WorkerCount 16, HTTPTimeoutMS 3000, EventQueueSize 32768. Beefier http.Transport keeps OpenAI keep-alives healthy.
- New /api/risk/moderation/queue_stats endpoint plus a moderation tab status card that polls every 15 seconds, so admins can watch queue depth, per-worker idle/processing state, drop count, and the incident batcher backlog without leaving the page.
- DEV_GUIDE §14 records the choice to ship Redis LIST instead of asynq, the capacity math, and the migration trigger if throughput ever outgrows the simple queue.
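The drop-oldest ring-buffer semantics described above can be illustrated with a small generic queue. The `Ring` type and its methods are hypothetical names for this sketch; it is single-goroutine only and leaves out the Redis persistence layer entirely.

```go
package main

import "fmt"

// Ring is a fixed-capacity FIFO that drops the OLDEST element when full,
// so the newest events are always retained.
type Ring[T any] struct {
	buf     []T
	head    int // index of the oldest element
	size    int // number of stored elements
	Dropped int // how many elements were evicted to make room
}

// NewRing creates a ring with the given capacity, which must be positive.
func NewRing[T any](capacity int) *Ring[T] {
	return &Ring[T]{buf: make([]T, capacity)}
}

// Push appends v, evicting the oldest element first if the ring is full.
func (r *Ring[T]) Push(v T) {
	if r.size == len(r.buf) {
		r.head = (r.head + 1) % len(r.buf)
		r.size--
		r.Dropped++
	}
	r.buf[(r.head+r.size)%len(r.buf)] = v
	r.size++
}

// Pop removes and returns the oldest element, or false if the ring is empty.
func (r *Ring[T]) Pop() (T, bool) {
	var zero T
	if r.size == 0 {
		return zero, false
	}
	v := r.buf[r.head]
	r.buf[r.head] = zero // release the reference for the garbage collector
	r.head = (r.head + 1) % len(r.buf)
	r.size--
	return v, true
}

func main() {
	q := NewRing[string](2)
	q.Push("event-1")
	q.Push("event-2")
	q.Push("event-3") // ring is full, so "event-1" is dropped

	for {
		ev, ok := q.Pop()
		if !ok {
			break
		}
		fmt.Println(ev)
	}
	fmt.Println("dropped:", q.Dropped)
}
```

Counting evictions in the queue itself makes it cheap to expose a drop count on a stats endpoint; a concurrent version would wrap `Push` and `Pop` in a mutex.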

…ts still moderate

Production captured the symptom: usage logs were being written for every relay 200 response but moderation_incidents stayed empty no matter how many requests went through. SSH into micu-us-1 confirmed two consecutive claude-opus-4-7 /v1/messages successes followed by zero new rows in moderation_incidents while the corresponding consume-log rows landed normally.

Root cause: when ShouldCheckPromptSensitive() and CountToken are both false (the production default), controller/relay.go takes the fastTokenCountMetaForPricing fast path. That helper only fills MaxTokens for ClaudeRequest / OpenAIResponsesRequest / GeneralOpenAIRequest and deliberately leaves CombineText + Files empty to avoid the strings.Join allocation hot path. The moderation hook then saw text=="" and len(images)==0 and bailed out before ever enqueuing the event. Usage logs are an independent code path, so they continued to land.

Two changes restore moderation:
- EnqueueModerationFromRelay now extracts text/images via a small helper that returns ("", nil) for nil/empty meta, then defers the real check until inside the gopool callback. If the initial extraction is empty we lazily call info.Request.GetTokenCountMeta() to build the full meta on-demand. The strings.Join cost is paid only when moderation is actually configured for the request's group, and only after the relay client has already received its response.
- New regression test TestExtractModerationPayloadHandlesNilAndEmpty pins the helper's nil/empty contract so a future refactor cannot re-introduce the silent-drop behaviour.

… complete

Production deployment of the lazy-meta fix correctly enqueued events and the worker pool actually processed them (the queue stats card showed workers cycling between idle and processing), but the moderation_incidents table stayed empty because the v3 design short-circuited persistence whenever the rule decision was "allow". With six successful relay 200 responses in the last hour and zero incident rows, admins had no way to distinguish "moderation ran and nothing matched" from "moderation is silently broken".

Change recordResult to persist every successfully scored event, regardless of whether a rule fired. The flagged column distinguishes the two cases (true == rule hit, false == benign), and the existing two-tier retention (BenignRetentionHours defaults to 72h, kept short on purpose) prevents the table from growing unbounded. Failures (result.Error != "") still skip persistence; those rows would only record "OpenAI couldn't be reached", which is more useful as a SysLog line than a database row. A unit test pins the flagged-vs-decision mapping so a future refactor cannot reintroduce the silent-drop behaviour.

CombineText aggregates system prompts, all conversation history, tool definitions and role labels for token counting — sending all of that to the moderation API pollutes the signal. Switch relay path to extractLastMessagePayload which type-asserts the request and pulls text and images from messages[-1] only.
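A rough sketch of pulling text and images from only the last message follows. The `Message` and `ContentPart` types are hypothetical stand-ins; the real extractLastMessagePayload type-asserts the project's own request structs, which are not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// ContentPart is a hypothetical multi-modal content element.
type ContentPart struct {
	Type     string // "text" or "image_url" in this sketch
	Text     string
	ImageURL string
}

// Message is a hypothetical chat message with multi-modal parts.
type Message struct {
	Role  string
	Parts []ContentPart
}

// extractLastMessage returns the text and image URLs of the final message
// only, ignoring system prompts and earlier conversation history.
func extractLastMessage(msgs []Message) (string, []string) {
	if len(msgs) == 0 {
		return "", nil
	}
	last := msgs[len(msgs)-1]
	var sb strings.Builder
	var images []string
	for _, p := range last.Parts {
		switch p.Type {
		case "text":
			if sb.Len() > 0 {
				sb.WriteByte('\n')
			}
			sb.WriteString(p.Text)
		case "image_url":
			images = append(images, p.ImageURL)
		}
	}
	return sb.String(), images
}

func main() {
	msgs := []Message{
		{Role: "system", Parts: []ContentPart{{Type: "text", Text: "You are a helpful assistant."}}},
		{Role: "user", Parts: []ContentPart{
			{Type: "text", Text: "What is in this picture?"},
			{Type: "image_url", ImageURL: "https://example.com/cat.png"},
		}},
	}
	text, images := extractLastMessage(msgs)
	fmt.Println(text)
	fmt.Println(images)
}
```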

…, detail modal

- New setting record_unmatched_inputs (default false): when off, only flagged incidents are persisted, reducing DB pressure significantly.
- Flagged incidents now store the full input text without truncation; list API truncates to 200 chars in Go for transport efficiency.
- New GET /api/risk/moderation/incidents/:id returns the full record for the detail modal.
- Input summary column: tooltip on header explains protocol tags are normal; click opens a modal with complete content and metadata.
- Global config card gains a Switch for the new toggle.
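For the list-side truncation, a rune-aware helper avoids cutting a multi-byte character (such as Chinese text) in half. This is a sketch under the assumption that "200 chars" means 200 Unicode code points; the PR may count differently, and `truncateRunes` is a hypothetical name.

```go
package main

import "fmt"

// truncateRunes returns at most max runes of s, never splitting a
// multi-byte UTF-8 sequence.
func truncateRunes(s string, max int) string {
	if max <= 0 {
		return ""
	}
	n := 0
	for i := range s { // i is the byte offset of each rune start
		if n == max {
			return s[:i]
		}
		n++
	}
	return s // s has max runes or fewer
}

func main() {
	fmt.Println(truncateRunes("内容审核规则", 2))
	fmt.Println(truncateRunes("short input", 200))
}
```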

The previous stream billing fix only checked `usage == nil`, which never triggers for mainstream stream handlers (OaiStreamHandler / ClaudeStreamHandler always return non-nil usage). This left users charged for incomplete or empty output when streams failed mid-way.

Two-layer fix:
- Layer 1 (billing): calculateTextQuotaSummary forces zero tokens on any server-side stream error (timeout, scanner error, panic, ping failure), regardless of whether usage was reported. client_gone is excluded since the user initiated the disconnect.
- Layer 2 (retry): StreamAbortRetryError returns a 503 when the stream failed before any data was sent to the client, enabling the retry loop to try another channel transparently. The check is inserted in all three major helpers (TextHelper, ClaudeHelper, GeminiHelper) including their chatCompletionsViaResponses code paths.
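The Layer 1 billing rule can be expressed as a small decision function. The sketch below is an assumption-laden illustration: the `EndReason` constants, the `Usage` struct, and `billableUsage` are hypothetical names, not the signatures used by calculateTextQuotaSummary.

```go
package main

import "fmt"

// EndReason is a hypothetical label for how a stream ended.
type EndReason string

const (
	EndDone       EndReason = "done"
	EndClientGone EndReason = "client_gone"
	EndTimeout    EndReason = "timeout"
	EndScannerErr EndReason = "scanner_error"
	EndPingFail   EndReason = "ping_fail"
	EndPanic      EndReason = "panic"
)

// Usage is a simplified token usage record.
type Usage struct {
	PromptTokens     int
	CompletionTokens int
}

// isServerSideError reports whether the stream ended because of a
// server-side failure. client_gone is deliberately not included.
func isServerSideError(r EndReason) bool {
	switch r {
	case EndTimeout, EndScannerErr, EndPingFail, EndPanic:
		return true
	}
	return false
}

// billableUsage zeroes the usage on any server-side stream error, even when
// the handler reported non-nil usage; otherwise it bills what was reported.
func billableUsage(u *Usage, r EndReason) Usage {
	if u == nil || isServerSideError(r) {
		return Usage{}
	}
	return *u
}

func main() {
	reported := &Usage{PromptTokens: 120, CompletionTokens: 30}
	fmt.Println(billableUsage(reported, EndDone))       // billed as reported
	fmt.Println(billableUsage(reported, EndTimeout))    // zeroed
	fmt.Println(billableUsage(reported, EndClientGone)) // user disconnected: still billed
}
```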

Backend:
- Redis pipeline counters (HINCRBY) on log write path for real-time metric collection
- Background aggregation loop (master-node only) reads Redis buckets and writes DB snapshots
- 3 DB tables: ChannelMonitoringStat, GroupMonitoringStat, MonitoringHistory
- DB fallback aggregation when Redis unavailable (simplified LOG_DB query)
- 7 API endpoints: admin CRUD + public read-only, rate-limited refresh trigger
- Hook mechanism (common.GroupMonitoringHook) avoids model→service circular dependency

Frontend:
- GroupMonitoringDashboard with responsive card grid and 60s auto-refresh
- GroupStatusCard with availability/cache progress bars and mini VChart history
- GroupDetailPanel SideSheet with full chart and admin channel detail table
- Settings page for monitoring groups, periods, exclude rules
- Sidebar and route integration

Tests:
- ParseMonitoringKey/ParseBucketValues table-driven tests
- IsGroupMonitored cache correctness tests
- RecordMonitoringMetric auto/empty group skip tests
- TriggerAggregationRefresh CAS guard test

…group monitoring

Frontend used `group_monitoring.*` prefix while backend registered as `group_monitoring_setting.*`, causing settings to never reach the Go config struct. Also renamed `groups` to `monitoring_groups` and added `group_display_order` sync on save.

…ader nav

- parseArrayField: coerce all elements to strings, filter out numeric indices and invalid values; short-circuit on "[]"/"null"
- selectedGroups: filter against availableGroups to drop stale entries
- HeaderNavModules: add `monitoring` toggle (default true, backward compatible); add monitoring link to header navigation bar
- useHeaderBar/useNavigation: handle missing `monitoring` field in old configs

… filters

Backend:
- Expand GroupMonitoringHook signature to carry modelName, statusCode, content, which enables filtering at recording time
- RecordMonitoringMetric now checks AvailabilityExcludeModels, AvailabilityExcludeKeywords, AvailabilityExcludeStatusCodes, and CacheHitExcludeModels before incrementing Redis counters
- Excluded errors skip t/s/e counters; excluded cache models skip ct/pt
- Add AvailabilityExcludeStatusCodes []int field to config struct
- Pass statusCode from other["status_code"] in RecordErrorLog

Frontend:
- Add "可用率排除状态码" (availability exclude status codes) TagInput to group monitoring settings card
- Add i18n keys for the new field

…ings

When monitoring_groups stored numeric indices (0, 1, 2) instead of group names, map them to the corresponding entries from /api/group/ and auto-correct the state so the resolved names get persisted on save.

The /api/group/ endpoint returns a flat array of group names, not an object. Object.keys on an array returns numeric indices ["0","1","2"] instead of the actual element values, which was the root cause of monitoring groups displaying indices instead of group names.

When a stream ends abnormally due to server-side causes (timeout, scanner error, ping fail, panic) before any chunk is delivered to the client, return a 503 so the relay loop can transparently retry through another channel. Client-initiated disconnects are excluded: the user chose to stop, so no retry is needed.

Cherry-picked from PR #1, scoped to retry behavior only. The PR's text_quota.go change is intentionally NOT taken; we keep the upstream billing semantics (zero-charge only when usage is nil and no chunks sent), which trusts upstream-provided usage data even on partial streams.
- relay/common/stream_status.go: add IsServerSideError() helper
- service/stream_abort.go: 503 retry shim used by handlers
- relay/{claude,compatible,gemini}_handler.go: 4-line hooks at post-DoResponse points
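The retry condition above (server-side abort and nothing delivered yet) might look roughly like this. The `StreamStatus` fields, the `RetryError` type, and `streamAbortRetryError` are hypothetical names for illustration; only the IsServerSideError helper name and the 503 status come from the description.

```go
package main

import (
	"fmt"
	"net/http"
)

// StreamStatus is a hypothetical summary of how a stream ended.
type StreamStatus struct {
	EndReason  string // e.g. "timeout", "scanner_error", "ping_fail", "panic", "client_gone"
	ChunksSent int    // chunks already delivered to the client
}

// IsServerSideError reports whether the stream ended for a server-side cause.
// A client-initiated disconnect is not a server-side error.
func (s StreamStatus) IsServerSideError() bool {
	switch s.EndReason {
	case "timeout", "scanner_error", "ping_fail", "panic":
		return true
	}
	return false
}

// RetryError signals to the relay loop that another channel may be tried.
type RetryError struct {
	StatusCode int
	Reason     string
}

func (e *RetryError) Error() string {
	return fmt.Sprintf("stream aborted before first chunk (%s): status %d", e.Reason, e.StatusCode)
}

// streamAbortRetryError returns a 503 retry error only when the stream failed
// for a server-side reason before any chunk reached the client. Once content
// has been delivered, a transparent retry would duplicate output, so it
// returns nil.
func streamAbortRetryError(s StreamStatus) *RetryError {
	if s.ChunksSent > 0 || !s.IsServerSideError() {
		return nil
	}
	return &RetryError{StatusCode: http.StatusServiceUnavailable, Reason: s.EndReason}
}

func main() {
	cases := []StreamStatus{
		{EndReason: "timeout", ChunksSent: 0},     // retry on another channel
		{EndReason: "timeout", ChunksSent: 5},     // content already sent: no retry
		{EndReason: "client_gone", ChunksSent: 0}, // user chose to stop: no retry
	}
	for _, c := range cases {
		if err := streamAbortRetryError(c); err != nil {
			fmt.Println("retry:", err)
		} else {
			fmt.Println("no retry for", c.EndReason, c.ChunksSent)
		}
	}
}
```

Returning the concrete `*RetryError` rather than the `error` interface keeps the nil check straightforward for callers in this sketch.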

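Merges two tightly coupled features from PR #1:

1. Per-group risk control isolation
All risk control metrics, decisions, and snapshots are stored along the (scope, subject, group) dimension, so the same user/token has independent risk state in different groups (for example vip / free). Adds a one-off migration: before AutoMigrate creates the new three-column unique index v2, DROP the old two-column unique index on risk_subject_snapshot. All three databases (SQLite/MySQL/PG) are handled idempotently.

2. Unified hit enforcement layer (enforcement)
Decouples email rate limiting (independent hit / ban buckets), audit writes to enforcement_incident, and threshold-based auto-ban; the moderation engine will also reuse this layer later.

Performance improvement relative to the PR: the "read-modify-write" counter update is replaced with a FOR UPDATE row lock inside a single transaction (IncrementEnforcementHit), so concurrent hits on the same user do not lose increments. SQLite serializes writes by default; MySQL/PG use row locks.

Accompanying changes:
- model/user.go: risk_warning_pending_at + 9 enforcement counter fields
- controller/user.go: GetSelf exposes risk_warning_pending (boolean only); adds POST /api/user/risk_warning/ack so users can dismiss the login modal
- relay/common/relay_info.go: adds the RiskGroup snapshot field so the defer still accounts to the correct group on cross-group retry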

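Brings in the asynchronous content moderation engine from PR #1, simplified for a load on the order of 1000 RPM:

Trimmed:
- Removed the Redis persistent queue: 1000 RPM × 100% sampling ≈ 17 RPS, so an in-memory channel + ring buffer is enough, and losing unprocessed events on restart is acceptable
- Removed the batcher: at most single-digit INSERTs per second, so synchronous direct writes are simpler
- Removed the stopCh / stopOnce dead code: it was never read, and the goroutines can simply exit with the process
- WorkerCount lowered from 16 to 8, EventQueueSize lowered from 32768 to 4096

Kept and wired in:
- OpenAI omni-moderation multi-key rotation + cooldown
- Rule engine (AND/OR over OpenAI categories) + default rule seeds
- Async sampling, debug dry runs, retention cleanup (bucketed by flagged/benign)
- Automatic call to EnforcementHit on a hit to reach the unified enforcement layer
- Async hook on the relay path: failed requests (SendResponseCount=0) are not counted

Files involved:
- service/moderation_center.go: simplified core engine
- service/moderation_keyring.go: API key rotation + cooldown
- service/moderation_rules.go: rule engine
- model/moderation_{incident,rule}.go: audit + rule models
- controller/moderation.go: admin CRUD + overview + debug
- setting/operation_setting/moderation_setting.go: configuration
- types/moderation.go: shared types
- controller/relay.go: attaches the async scoring hook inside the defer
- main.go: startup injection
- model/main.go: AutoMigrate registration
- router/api-router.go: admin routes (already wrapped in AdminAuth)

Not brought in:
- service/moderation_redis_queue.go (159 lines in the PR)
- service/moderation_incident_batcher.go (160 lines in the PR)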

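Brings in group monitoring from PR #1: real-time aggregation, based on Redis counters, of each group's request volume, token usage, time-to-first-byte latency, and status code distribution, with a DB fallback path for when Redis is unavailable.

Files involved:
- common/monitoring_hook.go: global hook function pointer, avoiding a reverse dependency from model to service
- model/log.go: calls the monitoring hook at the end of RecordErrorLog and RecordConsumeLog (zero overhead when the hook is nil)
- model/group_monitoring.go: the three monitoring tables ChannelMonitoringStat / GroupMonitoringStat / MonitoringHistory
- service/group_monitoring{,_metric}.go: main aggregation loop + dual Redis/DB implementation + history maintenance
- setting/operation_setting/group_monitoring_setting.go: configuration (sampling, excluded status codes, availability window, etc.)
- controller/group_monitoring.go: admin + public API sets
- router/api-router.go: /monitoring/admin (AdminAuth) + /monitoring/public (TryUserAuth)
- main.go: injects StartGroupMonitoringAggregation
- model/main.go: AutoMigrate registration of the three monitoring tables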

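Merges all frontend changes from PR #1:

New pages and components:
- web/src/pages/Risk/index.jsx: risk control center enhancements (multiple tabs: risk control / moderation / enforcement / subscription), rule editing, incident detail modal
- web/src/pages/GroupMonitoring/index.jsx: group monitoring page entry
- web/src/components/monitoring/*: group monitoring dashboard, cards, availability line chart, history trend chart
- web/src/components/common/modals/AccountRiskWarningModal.jsx: risk warning modal for logged-in users (shows only a vague notice and does not expose rules)
- web/src/components/settings/GroupMonitoringSetting.jsx + pages/Setting/Operation/SettingsGroupMonitoring.jsx: monitoring settings

Existing files modified:
- web/src/App.jsx: routes + risk warning modal mount
- web/src/components/dashboard/index.jsx: monitoring entry
- web/src/components/layout/SiderBar.jsx: monitoring navigation item
- web/src/components/settings/OperationSetting.jsx: adds the monitoring tab
- web/src/helpers/render.jsx: status display helpers
- web/src/hooks/common/{useHeaderBar,useNavigation}.js: adds monitoring entries to the header/sidebar
- web/src/hooks/dashboard/useDashboardData.js: fetches the monitoring overview
- web/src/pages/Setting/Operation/SettingsHeaderNavModules.jsx + SettingsSidebarModulesAdmin.jsx: module toggles
- web/src/i18n/locales/{en,zh-CN,zh-TW,fr,ja,ru,vi}.json: new entries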

GROUPS is a reserved keyword in MySQL 8.0+, causing Error 1064 in CountEnabledRiskRulesWithoutGroups. Use commonGroupsCol variable (backtick-quoted for MySQL/SQLite, double-quoted for PostgreSQL).
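A dialect-aware quoting helper along the lines described could be sketched as follows. The `quoteColumn` function and the dialect strings are hypothetical; the description only names a commonGroupsCol variable, and the surrounding query is not shown. The column name is assumed to be a trusted constant, so no escaping of embedded quote characters is attempted.

```go
package main

import "fmt"

// quoteColumn quotes an identifier for the given dialect: double quotes for
// PostgreSQL, backticks for MySQL and SQLite. The input must be a trusted
// constant, since embedded quote characters are not escaped here.
func quoteColumn(dialect, col string) string {
	if dialect == "postgres" {
		return `"` + col + `"`
	}
	return "`" + col + "`"
}

func main() {
	for _, d := range []string{"mysql", "sqlite", "postgres"} {
		// Illustrative query only; the table name and predicate are assumptions.
		q := fmt.Sprintf("SELECT COUNT(*) FROM risk_rules WHERE %s = ''", quoteColumn(d, "groups"))
		fmt.Println(d+":", q)
	}
}
```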

Use GORM's map-based Where/Or conditions so the ORM handles column quoting automatically, eliminating reserved-word issues across all database backends.

The backend returns history records with recorded_at (unix seconds), but the chart read a non-existent timestamp field, producing NaN and rendering an empty chart. Also fix aggregation_interval_minutes being read from the wrong response level in GroupDetailPanel.

…implify drawer

- GroupStatusCard: guard null availRate/cacheRate (show N/A), fix is_online for admin format
- MiniHistoryChart: use recorded_at (unix seconds) instead of non-existent timestamp field
- GroupMonitoringDashboard: fix history response parsing level, admin-only drawer and card click
- GroupDetailPanel: remove history chart (now in card), keep only channel details for admin

…across polls

- alignAndFillHistory: skip availability_rate/cache_hit_rate when < 0 (backend returns -1 for no-data), preventing chart y-axis from stretching to -1
- Dashboard poll (fetchGroups without history): use functional state update to preserve existing history data instead of overwriting with empty group stats

…ards

MiniHistoryChart never rendered because it lacked initVChartSemiTheme initialization and used invalid width:'auto' in the VChart spec. Instead of patching it, reuse the proven AvailabilityCacheChart with a new compact prop (120px, no legends, no y-axis labels, smaller fonts).

Claude's input_tokens EXCLUDES cache_read_input_tokens, while OpenAI's prompt_tokens INCLUDES cached_tokens. The monitoring aggregation formula ct/pt*100 assumed pt always includes ct, producing ~8000% cache hit rates for Claude channels. Fix: at recording time, detect usage_semantic=anthropic and add cache tokens to prompt tokens before HINCRBY, so pt in Redis always means total prompt including cache. Remove the CacheTokensSeparateGroups branching in aggregation since the data is now normalized at source.
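The arithmetic behind the ~8000% figure and the normalization at recording time can be worked through in a short sketch. The `Usage` struct, its `Semantic` field, and the function names are hypothetical; only the usage_semantic=anthropic condition and the ct/pt*100 formula come from the description.

```go
package main

import "fmt"

// Usage is a simplified usage record. Semantic mirrors the usage_semantic
// marker: "anthropic" means PromptTokens excludes cached tokens.
type Usage struct {
	PromptTokens int
	CacheTokens  int
	Semantic     string
}

// normalizedPromptTokens returns the total prompt size including cache
// tokens, regardless of which provider semantic produced the record.
func normalizedPromptTokens(u Usage) int {
	if u.Semantic == "anthropic" {
		return u.PromptTokens + u.CacheTokens
	}
	return u.PromptTokens
}

// cacheHitRate computes ct/pt*100, guarding against a zero denominator.
func cacheHitRate(ct, pt int) float64 {
	if pt == 0 {
		return 0
	}
	return float64(ct) / float64(pt) * 100
}

func main() {
	// The same request seen through both semantics: 8000 cached tokens plus
	// 100 uncached tokens.
	claude := Usage{PromptTokens: 100, CacheTokens: 8000, Semantic: "anthropic"}
	openai := Usage{PromptTokens: 8100, CacheTokens: 8000, Semantic: "openai"}

	// Without normalization the Claude record yields 8000/100*100 = 8000%.
	fmt.Printf("naive claude rate: %.0f%%\n", cacheHitRate(claude.CacheTokens, claude.PromptTokens))

	// After normalization both providers agree on 8000/8100.
	fmt.Printf("claude: %.2f%%\n", cacheHitRate(claude.CacheTokens, normalizedPromptTokens(claude)))
	fmt.Printf("openai: %.2f%%\n", cacheHitRate(openai.CacheTokens, normalizedPromptTokens(openai)))
}
```

Normalizing once at write time means the aggregation step can apply a single formula to every channel, which is why the per-group branching could be removed.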

Calderic · 2026-04-28T03:37:56Z

9de402b feat: cross-channel retry on server-side stream abort
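73db1a5 feat: per-group risk control isolation + unified enforcement layer
9f0a748 feat: OpenAI omni-moderation content moderation (trimmed version)
de2584d feat: wire in group monitoring
5fadd3b feat: risk control / moderation / monitoring frontend + i18n
a8659c8 perf: tune parameters for 3000 RPM + multi-machine deployment
21517af / 0efbe83 / 954df6c / 4d63529 / 872fb01 / 66c76b4 / c8065a6 / 6d557c4 (8 follow-up fixes)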

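The following adjustments were made during integration:

Kept the abnormal-stream billing fix already merged upstream; the PR's version was not adopted
Enforcement counters now use a single-transaction FOR UPDATE, fixing the TOCTOU in the original PR
Moderation drops the batcher (direct writes are sufficient at 17~50 RPS)
Moderation Redis queue rewritten as a per-instance WAL (fixes two bugs in the original PR: cross-instance key collisions and recovery dead letters)
.github/workflows/dev.yml was not adopted (it does not fit this repository's CI strategy)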

andrenoah307 added 20 commits April 26, 2026 13:59

andrenoah307 added 5 commits April 28, 2026 02:22

fix: quote groups column in raw SQL for MySQL 8.0 compatibility

264fbf0

GROUPS is a reserved keyword in MySQL 8.0+, causing Error 1064 in CountEnabledRiskRulesWithoutGroups. Use commonGroupsCol variable (backtick-quoted for MySQL/SQLite, double-quoted for PostgreSQL).

refactor: replace raw SQL with GORM map conditions for groups column

383c242

Use GORM's map-based Where/Or conditions so the ORM handles column quoting automatically, eliminating reserved-word issues across all database backends.

andrenoah307 added 2 commits April 28, 2026 04:11

Calderic closed this Apr 28, 2026

andrenoah307 wants to merge 27 commits into Calderic:main from andrenoah307:dev/risk-control-group-scoping-20260426
