
Foss sandbox #28

Merged
UsamaSadiq merged 1193 commits into foss-main from foss-sandbox
Jun 2, 2026


@UsamaSadiq

Collaborator

CREDO23 and others added 30 commits May 9, 2026 18:36
…ickable-cards

feat(a11y): add aria-label to clickable media cards
[Feature] Multi-agent chat: hierarchical timeline, live subagent streaming, and inline HITL approvals
…fixes MODSetter#1376)

- Add formatThreadTimestamp() to surfsense_web/lib/format-date.ts
- Use shared helper in AllPrivateChatsSidebar and AllSharedChatsSidebar
- Remove unused date-fns format import from both sidebar files
- Centralises timestamp formatting policy for future i18n/relative-time changes
AnishSarkar22 and others added 24 commits May 19, 2026 18:57
…ile-hook

fix: use shared useIsMobile (768px) in SidebarSlideOutPanel (MODSetter#1359)
Release v0.0.24: UI revamp, multi-agent parallelization, citations & HITL improvements
Resolved conflicts preserving SSO/proxy-auth customisations:

- config/__init__.py: take all new upstream quota/credit/anon config
  fields; keep AUTH_TYPE default "SSO" (not None)
- app.py: keep ProxyAuthMiddleware import + custom /users/me routes;
  gate local/email auth routers on AUTH_TYPE not in (GOOGLE, SSO);
  take upstream Google OAuth router block
- page.tsx (home): keep SSO cookie-handoff splash (not marketing page)
- auth-utils.ts: keep oauth2-proxy bounce on 401 (not /login redirect)
- TokenHandler.tsx: keep commented-out (SSO uses cookie handoff)
- auth/callback/page.tsx: keep deleted (not used in SSO flow)
- sign-in-button / hero-section: keep isProxyLogin / handleProxyLogin
- navbar: keep minimal nav (Docs only)
- layout.tsx: keep isSplashPage + add upstream isFreeModelChat
- GoogleLoginButton: keep useEffect auto-redirect + take useState
- login/page.tsx: keep getBearerToken + isSSOAuth imports
- new-chat/page.tsx: take upstream fetchWithTurnCancellingRetry +
  filesystem selection for all three request paths (new, resume, regen)
- base-api.service.ts: take upstream X-SurfSense-Client-Platform header
- user-query.atoms.ts: take upstream staleTime Infinity + getBearerToken
- zero/query/route.ts: keep internal URL comment
- footer-new.tsx: keep minimal footer (drop dead-code nav arrays)
- .gitignore: merge both sets of ignores

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…0.0.24 merge

The upstream merge dropped `auth_backend` and `UserCreate` from the
import list in app.py while adding code that references both. This caused
a NameError at module load time when AUTH_TYPE is not GOOGLE/SSO.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…st SSO login

Gated by AUTH_TYPE=SSO and AUTO_PROVISION_LITELLM_KEY=true. On first
login, calls Askii's platform/provision-key endpoint with the user's
mPass JWT scoped to the agent / doc-summary / image-gen / vision models,
then inserts the matching NewLLMConfig / ImageGenerationConfig /
VisionLLMConfig rows and wires up all four SearchSpace FKs so chat,
summaries, image gen and vision work out of the box.

If provisioning fails at login (network blip, transient 5xx), the GET
/searchspaces/{id} handler retries once — the AC4 "one more attempt
when the user lands on My Space" guarantee — and remains a single
SELECT on the steady-state path once the marker row exists.

Pins the askii v0.1.0 SDK from github.com/Pressingly/askii-python and
adds 8 env vars to AskiiConfig.from_env (AUTO_PROVISION_LITELLM_KEY,
ASKII_BASE_URL, ASKII_LITELLM_BASE_URL, ASKII_AGENT_MODEL,
ASKII_DOCUMENT_SUMMARY_MODEL, ASKII_IMAGE_GEN_MODEL, ASKII_VISION_MODEL,
ASKII_LITELLM_KEY_DURATION_DAYS).

19 new unit tests cover: gate variants (AUTH_TYPE / flag / each model
env), missing access-token header, marker-row idempotency, the 4-row
happy path with FK assertions, doc-summary fallback to agent model,
LITELLM_BASE_URL fallback to ASKII_BASE_URL, and the SDK error matrix
(401 / 422 / 500). All 24 related tests pass; ruff + format clean.
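
The gate and the two fallbacks described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `ProvisionSettings`, `resolve_models` and `resolve_litellm_base_url` are hypothetical names, and treating the agent, image-gen and vision models as the required ones is an assumption drawn from the doc-summary fallback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvisionSettings:
    """Hypothetical settings holder; fields mirror the env vars named above."""

    auth_type: str = ""
    auto_provision_litellm_key: bool = False
    askii_base_url: str = ""
    askii_litellm_base_url: str = ""
    askii_agent_model: str = ""
    askii_document_summary_model: str = ""
    askii_image_gen_model: str = ""
    askii_vision_model: str = ""


def should_auto_provision(cfg: ProvisionSettings) -> bool:
    """Return True only when SSO, the feature flag, and the models are all set."""
    # Assumption: these three models are the required ones; the
    # doc-summary model is optional because it falls back to the agent model.
    required = (
        cfg.askii_agent_model,
        cfg.askii_image_gen_model,
        cfg.askii_vision_model,
    )
    return bool(
        cfg.auth_type == "SSO"
        and cfg.auto_provision_litellm_key
        and all(required)
    )


def resolve_models(cfg: ProvisionSettings) -> dict[str, str]:
    """Apply the doc-summary fallback to the agent model."""
    return {
        "agent": cfg.askii_agent_model,
        "document_summary": cfg.askii_document_summary_model
        or cfg.askii_agent_model,
        "image_gen": cfg.askii_image_gen_model,
        "vision": cfg.askii_vision_model,
    }


def resolve_litellm_base_url(cfg: ProvisionSettings) -> str:
    """Fall back to ASKII_BASE_URL when ASKII_LITELLM_BASE_URL is empty."""
    return cfg.askii_litellm_base_url or cfg.askii_base_url
```

With defaults (flag off, models empty) the gate returns False, so a half-configured deploy does not attempt provisioning.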

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Five Copilot comments triaged and addressed in this commit: four code or
test fixes, plus a docstring-only "fail closed" wording inconsistency on
ASKII_BASE_URL (item 4 below).

1. _provision_via_askii now constructs an explicit
   AskiiConfig.from_env(base_url=config.ASKII_BASE_URL) and passes it
   to AsyncAskii. Production previously worked by coincidence (the SDK
   also reads ASKII_BASE_URL from os.environ via from_env), but the
   coupling was implicit; this makes the SurfSense config the single
   source of truth for the outbound Askii endpoint and survives any
   future refactor that moves SurfSense's config off env vars.

2. The four-row DB-write block (session.add x4 / flush / FK mutation /
   commit) is now wrapped in try/except SQLAlchemyError → rollback +
   log + return False, matching the documented best-effort contract.
   Previously a flush/commit failure would propagate up and leave the
   request's session in a failed-transaction state. Added a regression
   test that monkeypatches session.flush to raise SQLAlchemyError.

3. ensure_personal_litellm_keys now takes a SELECT … FOR UPDATE on the
   SearchSpace row before the existence check, so two concurrent
   provisioning attempts (e.g. the same user opening two tabs during
   their first My Space load) serialize. The second waiter sees the
   marker row committed by the first and short-circuits via the
   existing idempotency check, avoiding duplicate upstream Askii keys
   and duplicate config rows.

4. Module + should_auto_provision docstrings rewritten to drop the
   "ASKII_BASE_URL must be set so a half-configured deploy fails closed"
   framing — the base-URL clause is effectively always satisfied since
   the default is non-empty (prod). The real fail-closed gate is the
   feature flag (default FALSE) plus the three required model env vars
   (default empty); the wording now reflects that.

5. test_returns_false_when_auth_type_not_sso no longer leaks its
   httpx.AsyncClient — extracted to a `client = …; try: …; finally:
   await client.aclose()` block matching the surrounding pattern.
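
The best-effort write contract from item 2 (rollback, log, return False instead of raising) can be illustrated with a small sketch. It uses the standard-library `sqlite3` module as a stand-in for the SQLAlchemy session, so `sqlite3.Error` plays the role of `SQLAlchemyError`; the `config` table and both function names are hypothetical.

```python
import logging
import sqlite3

logger = logging.getLogger("provisioning_sketch")


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical config table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE config (kind TEXT PRIMARY KEY, model TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def write_config_rows(
    conn: sqlite3.Connection, rows: list[tuple[str, str]]
) -> bool:
    """Insert all rows or none; report failure instead of raising."""
    try:
        conn.executemany(
            "INSERT INTO config (kind, model) VALUES (?, ?)", rows
        )
        conn.commit()
        return True
    except sqlite3.Error:
        # Stand-in for `except SQLAlchemyError`: undo the partial write so
        # the connection is not left in a failed-transaction state.
        conn.rollback()
        logger.exception("config write failed; rolled back")
        return False
```

A failure partway through the batch leaves no partial rows behind, and the caller only sees a boolean.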

Deliberately out of scope (fast-follow if needed):
- Alembic migration adding UNIQUE (user_id, search_space_id, name) on
  the three config tables. The row lock closes the same-host race
  cleanly; the constraint is the stronger long-term backstop and is
  worth adding only if duplicates surface in prod despite the lock.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two follow-up Copilot comments on the prior fix commit:

1. ensure_personal_litellm_keys took a SELECT … FOR UPDATE on the
   SearchSpace row unconditionally — but the lazy guard runs on every
   owner GET /searchspaces/{id}, so the steady state (already
   provisioned) is the hot path and would acquire/release the lock for
   nothing on every page load. Restructured to double-checked locking:
   cheap SELECT first, return True immediately if the marker row
   exists; only when missing do we take the row lock and re-SELECT
   inside the lock to catch the race window. Provisioned users now
   incur zero lock contention.

2. The askii dependency in pyproject.toml was pinned to git tag
   `v0.1.0`. Git tags can be force-moved, so a uv lock re-resolve
   after a tag rewrite would silently change the installed code.
   Pinned to the underlying commit SHA
   eb7591d558d309f7d53161187d69df718b584751 (immutable). Tag name
   preserved in an adjacent comment for readers.
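
The double-checked locking shape from item 1 can be sketched in isolation. This is an in-memory analogy under stated assumptions, not the service code: a `threading.Lock` stands in for the `SELECT … FOR UPDATE` row lock, a boolean stands in for the marker row, and `ProvisioningState` is a hypothetical name.

```python
import threading


class ProvisioningState:
    """Hypothetical stand-in for the marker row plus its row lock."""

    def __init__(self) -> None:
        self.marker_exists = False
        self.lock_acquisitions = 0
        self.provision_calls = 0
        self._lock = threading.Lock()

    def ensure_provisioned(self) -> bool:
        # First check, no lock: the steady-state hot path returns here.
        if self.marker_exists:
            return True
        with self._lock:
            self.lock_acquisitions += 1
            # Second check inside the lock: another caller may have
            # finished provisioning while this one was waiting.
            if self.marker_exists:
                return True
            self.provision_calls += 1
            self.marker_exists = True
            return True
```

Once the marker exists, later calls never touch the lock, which is the "zero lock contention" property the commit describes for provisioned users.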

uv.lock refreshed to reflect the new rev string for askii. The
unrelated marker simplifications in the lock are uv resolver
normalization (equivalent dependency graph, simpler boolean conditions
on cuda-bindings / nvidia-* / contourpy markers).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…und 3)

Two functional changes and one new test class, addressing the three
new Copilot comments on the prior round-2 commit:

A. ensure_personal_litellm_keys now wraps its full body in a top-level
   try/except Exception. The cheap pre-lock SELECT, the SELECT ... FOR
   UPDATE itself, AskiiConfig.from_env(), and any future call that
   leaks an unclassified exception are all caught here; we rollback to
   release any lock, log the exception, and return False — restoring
   the "never raise" docstring guarantee. Both callers (users.py /
   search_spaces_routes.py) already wrap us with try/except Exception,
   but the function should hold up its own contract.

B. The row-level lock no longer wraps the entire flow. The previous
   shape took SELECT ... FOR UPDATE on the SearchSpace row before the
   token check and the outbound Askii call, holding the lock across a
   network request that can run 5–30s on a slow upstream and stalling
   every concurrent tab on the same row. The lock now wraps only the
   DB-write window:

       cheap marker SELECT          (no lock, hot path)
       token check                  (no lock)
       Askii provisioning call      (no lock)
       build the 4 row objects
       SELECT … FOR UPDATE
       re-SELECT marker inside lock
         race-loss → rollback + return True (Askii key orphaned)
         else → write rows → flush → set FKs → commit

   The trade-off is that on a rare race-loss (two tabs hit My Space at
   the exact same instant on first login), both workers call Askii
   and one upstream key ends up orphaned — auto-expires at the
   configured TTL (default 90 days). Lock contention exposure shrinks
   from "up to the Askii timeout" to "sub-ms DB write window".

C. (no change) surfsense_backend/.env.example was already cleaned up
   in an earlier commit — comments are on dedicated lines, no inline
   `# …` after `=`. The bot's anchor was against the original PR diff.
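
The narrowed lock window from B, combined with the top-level guard from A, can be simulated with two concurrent "tabs". This is a sketch under assumptions: an `asyncio.Lock` stands in for the row lock, `fake_provision_call` stands in for the outbound Askii request, and `Space`, `ensure_keys` and `two_tabs` are hypothetical names.

```python
import asyncio


class Space:
    """Hypothetical in-memory stand-in for one SearchSpace row."""

    def __init__(self) -> None:
        self.marker = False
        self.rows_written = 0
        self.upstream_calls = 0
        self.stored_key = ""
        self.lock = asyncio.Lock()


async def fake_provision_call(space: Space) -> str:
    """Pretend to call the upstream service; takes a little while."""
    space.upstream_calls += 1
    key = f"key-{space.upstream_calls}"
    await asyncio.sleep(0.01)
    return key


async def ensure_keys(space: Space) -> bool:
    try:
        if space.marker:  # cheap check, no lock
            return True
        key = await fake_provision_call(space)  # slow call, no lock held
        async with space.lock:  # lock covers only the write window
            if space.marker:
                # Race loss: another tab wrote first; this key is orphaned.
                return True
            space.stored_key = key
            space.rows_written += 4
            space.marker = True
            return True
    except Exception:
        # Never raise: callers get False on any unexpected failure.
        return False


async def two_tabs() -> Space:
    space = Space()
    await asyncio.gather(ensure_keys(space), ensure_keys(space))
    return space
```

Running `asyncio.run(two_tabs())` makes both tabs call the upstream once each, while only one set of rows is written; that is the orphaned-key trade-off the commit accepts in exchange for a short lock hold.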

Tests: 22/22 green. Two new regression tests:

- test_race_loss_inside_lock_returns_true_without_writes — drives the
  cheap SELECT to None and the re-check inside the lock to a found
  marker, asserts True is returned, no rows added, rollback called.
- test_unexpected_exception_caught_and_rolled_back — drives the cheap
  SELECT to raise RuntimeError, asserts outer except catches it,
  rollback called, returns False.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…(PR #27 round 4)

Replace per-(user, search_space) agent-row marker with a per-user
`litellm_auto_provisioned_at` timestamp column. NULL = eligible for
one-time provisioning; non-NULL = done forever, regardless of whether
config rows still exist or new search spaces are created.

Architecture fixes:
- SAVEPOINT-based lock isolation: race-loss and error paths roll back
  only the nested savepoint, leaving the caller's outer transaction
  intact. `_RaceLossError` sentinel triggers automatic rollback on
  context exit.
- Service flushes only; callers (on_after_login, on_after_register)
  own the commit boundary. Detach-safe persistence via explicit
  UPDATE...WHERE SQL for both FK wiring and the provisioned-at marker.
- Service decoupled from HTTP Request: accepts `access_token: str | None`
  and `cfg: Config` as explicit params. No more deferred imports or
  Request dependency inside service body.
- `_session_from_user_db` helper isolates fastapi-users' internal
  `.session` attribute behind a single call site.
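
The SAVEPOINT isolation idea can be shown with a small standalone sketch. It uses `sqlite3` from the standard library rather than the SQLAlchemy session used in the service, and the `audit` and `config` tables plus the `savepoint` and `login_with_provisioning` helpers are hypothetical; only the `_RaceLossError` name comes from the commit message.

```python
import sqlite3
from contextlib import contextmanager


class _RaceLossError(Exception):
    """Sentinel: another worker finished provisioning first."""


def make_db() -> sqlite3.Connection:
    # isolation_level=None lets us issue BEGIN/COMMIT/SAVEPOINT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE audit (event TEXT)")
    conn.execute("CREATE TABLE config (model TEXT)")
    return conn


@contextmanager
def savepoint(conn: sqlite3.Connection, name: str = "provision"):
    """Roll back only the nested savepoint if the body raises."""
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except BaseException:
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    conn.execute(f"RELEASE SAVEPOINT {name}")


def login_with_provisioning(conn: sqlite3.Connection, lose_race: bool) -> None:
    conn.execute("BEGIN")  # the caller's outer transaction
    conn.execute("INSERT INTO audit (event) VALUES ('login')")
    try:
        with savepoint(conn):
            conn.execute("INSERT INTO config (model) VALUES ('agent')")
            if lose_race:
                raise _RaceLossError
    except _RaceLossError:
        pass  # nested work is undone; outer work is untouched
    conn.execute("COMMIT")  # the caller owns the commit boundary
```

On a race loss the `config` insert disappears while the outer `audit` insert still commits, which mirrors the "roll back only the nested savepoint" behavior described above.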

Other fixes:
- Remove RuntimeError anti-pattern in proxy_auth vanished-user path
  (log + else-branch instead).
- Drop lazy guard in search_spaces_routes: on_after_login fires on
  every authenticated request via ProxyAuthMiddleware, making the
  per-route retry redundant.

New migration: moneta_001 — adds nullable
`litellm_auto_provisioned_at TIMESTAMP(timezone=True)` to `user` table.

Tests: 6 new unit tests for on_after_login is_active gating + commit
boundary; existing litellm_provisioning suite updated for new signatures
and SAVEPOINT assertions (39 tests, all passing).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
downgrade() unconditionally dropped litellm_auto_provisioned_at, which
fails on environments where the column was never created (or lives under
a different name/schema). Mirror upgrade()'s column-existence guard via
sa.inspect(conn).get_columns("user") so the migration is symmetric and
safe to run in either direction.
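
The symmetric guard can be sketched as follows. This is an analogy, not the migration itself: it uses `sqlite3` with `PRAGMA table_info` in place of `sa.inspect(conn).get_columns("user")`, and the `has_column` helper and the `TEXT` column type are assumptions made for the illustration.

```python
import sqlite3

COLUMN = "litellm_auto_provisioned_at"


def has_column(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` currently has a column named `column`."""
    rows = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return any(row[1] == column for row in rows)


def upgrade(conn: sqlite3.Connection) -> None:
    """Add the column only when it is missing."""
    if not has_column(conn, "user", COLUMN):
        conn.execute(f'ALTER TABLE "user" ADD COLUMN {COLUMN} TEXT')


def downgrade(conn: sqlite3.Connection) -> None:
    """Drop the column only when it exists, mirroring upgrade()."""
    if has_column(conn, "user", COLUMN):
        # Note: DROP COLUMN needs SQLite 3.35 or newer.
        conn.execute(f'ALTER TABLE "user" DROP COLUMN {COLUMN}')
```

With both directions guarded, running `downgrade` against a database that never had the column is a no-op, and running `upgrade` twice is harmless.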

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The throttle check read user.last_login and the failure logger read
user.id directly off the ORM object. A detached/expired user makes even
a plain attribute read raise (e.g. MissingGreenlet), which would break
the login flow despite on_after_login's best-effort contract.

Snapshot user.id / user.last_login once under a guard up front, then use
the captured values for both the throttle math and the except logger so
the unhappy path never re-touches the ORM object.
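
A minimal sketch of the snapshot-under-guard pattern follows. The names `snapshot_user`, `should_run_after_login`, `DetachedUser` and the five-minute `THROTTLE` window are all hypothetical; the real throttle rule is not stated in this commit message, so the comparison below is only an assumed example.

```python
import logging
from datetime import datetime, timedelta
from typing import Any, Optional

logger = logging.getLogger("login_sketch")
THROTTLE = timedelta(minutes=5)  # hypothetical window, not from the source


def snapshot_user(user: Any) -> tuple[Optional[Any], Optional[datetime]]:
    """Read id and last_login once; a broken ORM object yields (None, None)."""
    try:
        return user.id, user.last_login
    except Exception:
        return None, None


def should_run_after_login(user: Any, now: datetime) -> bool:
    user_id, last_login = snapshot_user(user)
    try:
        if user_id is None:
            return False
        # Throttle math uses the captured value, never the ORM object.
        if last_login is not None and now - last_login < THROTTLE:
            return False
        return True
    except Exception:
        # The logger also uses the snapshot, so it cannot raise again.
        logger.exception("after-login hook failed for user %s", user_id)
        return False


class DetachedUser:
    """Test double: every attribute read raises, like a detached instance."""

    def __getattr__(self, name: str) -> Any:
        raise RuntimeError(f"cannot load {name!r} on a detached instance")
```

Passing a `DetachedUser()` returns False instead of propagating the attribute error, which is the best-effort behavior the commit is restoring.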

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…llm-key

feat: auto-provision personal LiteLLM key via Askii on first SSO login
@jawad-khan jawad-khan self-requested a review June 2, 2026 10:08
@UsamaSadiq UsamaSadiq merged commit d0ce4f1 into foss-main Jun 2, 2026
5 of 10 checks passed
9 participants