Skip to content

fix: AJDA-2721 drop testConnection pre-flight from extract() path#38

Merged
MiroCillik merged 2 commits into
mainfrom
miro-AJDA-2721-skip-preflight
May 15, 2026
Merged

fix: AJDA-2721 drop testConnection pre-flight from extract() path#38
MiroCillik merged 2 commits into
mainfrom
miro-AJDA-2721-skip-preflight

Conversation

@MiroCillik

Copy link
Copy Markdown
Member

Summary

Follow-up to PR #37 (merged as v3.3.3) which raised the mongosh Process timeout from 30 s → 60 s and added explicit driver-level timeouts. That fix did not resolve the Heureka regression: job 78403737 on 2026-05-14 still timed out at exactly 60 s with connectTimeoutMS=10000&serverSelectionTimeoutMS=5000 already in the URI. mongosh emitted no error during the 60 s, meaning the driver-level timers never fired — the block is below the driver, in libuv DNS or socket setup over the VPN tunnel. No mongosh / Node-driver option can bound that phase.

As long as extract() includes a synchronous mongosh pre-flight check, every slow-VPN customer will hit this same wall. Bumping the wall-clock further just shifts the deadline.

Change

Drop $this->testConnection(); from Extractor::extract() (was line 195). mongoexport runs immediately and surfaces its own connection failures. testConnection() is retained for the UI sync action where the cleaned-up parseMongoshError() mapping is still valuable.

Removed tests/functional/test-wrong-uri/ — it was a pre-flight-specific assertion (URI parse error caught by mongosh during extract). With the pre-flight gone, the path it exercised no longer exists. Error-mapping logic itself is still covered by test-connection-action-fail-wrong-host and unit tests.

Test plan

  • composer ci — composer validate, phplint, phpcs, phpstan level 8, 111 unit tests, 78 functional tests (was 79 — test-wrong-uri removed). All green.
  • testConnection sync-action functional scenarios (test-connection-action-*) all still pass — the user-facing UI button is unaffected.
  • Customer verification (Heureka, project 672) once released — next nightly extraction job should run mongoexport directly and either succeed or surface a real connection error.
  • Datadog query for "exceeded the timeout of 60 seconds" — should drop to zero post-release.

Trade-off (worth flagging in release notes)

Connection failures during extraction now surface as mongoexport's raw error rather than our user-friendly mapped message. This is a less-polished error format, but:

  1. The customer's current experience is a cryptic ProcessTimedOutException, which is worse.
  2. mongoexport errors actually describe the real failure (DNS, auth, etc.) rather than always saying "connection test failed".
  3. The polished mapping is preserved where it matters most — the UI Test Connection button.

Release Notes

  • Justification

    • PR fix: AJDA-2721 raise mongosh testConnection timeout for slow VPN connects #37's bumped timeouts did not fix Heureka's nightly job failures: 60 s mongosh timeouts still occurred with explicit driver-level timeouts (connectTimeoutMS, serverSelectionTimeoutMS) in the URI. The blocker is below the MongoDB driver — DNS or socket setup over the VPN tunnel — which no mongosh option can bound.
    • Removing the redundant pre-flight testConnection() from the extract path lets mongoexport (the real workload) run directly. On VPN routes mongoexport's connection handling is more forgiving, and even when it fails the error is actionable. The UI "Test Connection" button remains unchanged.
  • Plans for Customer Communication

    • No proactive comms. Affected customers (Heureka, any VPN-routed setup) will silently see jobs start running again. Support team to notify Heureka contact (Olena → Hana Bedrichova) when released for verification.
  • Impact Analysis

    • Positive: nightly extractions for VPN-routed customers will no longer fail on the pre-flight check.
    • Negative: connection errors during extraction now show mongoexport's raw error rather than the cleaned-up mapped message. Less polished but more diagnostic. The UI Test Connection button still uses the mapped error.
    • One fewer network round-trip per job (mongosh → mongoexport). Negligible win, just a side effect.
    • Not behind a feature flag — applies to all customers.
  • Deployment Plan

    • Continuous deployment. Not labelled critical — narrow change to one method, well-covered by existing functional tests.
  • Rollback Plan

    • Single-commit revert. Re-adds the $this->testConnection(); line. Two-way door.
  • Post-Release Support Plan

    • Notify Olena (SUPPORT-16285) for Heureka verification.
    • Monitor Datadog for componentid:keboola.ex-mongodb "exceeded the timeout of 60 seconds" — should drop to zero.
    • If any customer reports unexpected connection error messages during extraction, point them at mongoexport's output as the source of truth.

🤖 Generated with Claude Code

PR #37's timeout bumps did not resolve the Heureka regression: job
78403737 still hit the (now-60s) mongosh wall on 2026-05-14 with the
explicit connectTimeoutMS=10000 and serverSelectionTimeoutMS=5000
already in the URI. mongosh did not emit any error during the 60s --
the driver-level timers never fired -- which means it was blocked
below the driver, in libuv DNS or socket setup on the VPN tunnel.

No mongosh / Node-driver option can bound that phase. As long as the
extract pipeline includes a synchronous mongosh pre-flight, every
slow-VPN customer will hit this same wall, and bumping timeouts only
shifts the failure deadline.

testConnection() runs at the start of every extract() but adds no
information that mongoexport would not surface on its own a few
seconds later. Drop the pre-flight call; mongoexport handles its own
connection. testConnection() stays in place as the UI sync action
where the cleaned-up error mapping is useful.

Functional scenario test-wrong-uri was a pre-flight-specific assertion
(URI parsing error caught by mongosh during extract). With the
pre-flight gone, that path no longer exists. Removed. The error
mapping logic itself remains and is still covered by
test-connection-action-fail-wrong-host and unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@linear

linear Bot commented May 14, 2026

Copy link
Copy Markdown

AJDA-2721

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Removes the synchronous testConnection() pre-flight from Extractor::extract() because on VPN-routed customers (Heureka) the mongosh-based check blocks at the OS DNS/socket layer below any driver timeout, causing all nightly jobs to hit the 60 s process wall and fail. mongoexport is now invoked directly and surfaces its own connection errors; testConnection() is retained for the UI sync action. The corresponding test-wrong-uri functional test (which asserted on the pre-flight URI parse error) is removed since that code path no longer exists during extract.

Changes:

  • Drop $this->testConnection(); call from Extractor::extract(), replaced with an explanatory comment referencing AJDA-2721.
  • Delete the tests/functional/test-wrong-uri/ fixture (config, expected stderr, expected exit code, gitkeeps).

Reviewed changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/Extractor.php Removes pre-flight testConnection() from extract(), replaces it with a rationale comment.
tests/functional/test-wrong-uri/source/data/config.json Deletes config that triggered the pre-flight URI parse error.
tests/functional/test-wrong-uri/expected-stderr Deletes expected mapped error message assertion.
tests/functional/test-wrong-uri/expected-code Deletes expected exit code 1 assertion.
tests/functional/test-wrong-uri/expected/data/out/{tables,files}/.gitkeep Deletes empty output dir placeholders for the removed test.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/Extractor.php Outdated
Restore the user-facing contract from the pre-flight era: malformed
config URIs surface as exit-1 UserException with a friendly message,
not exit-2 CRITICAL from an uncaught ProcessFailedException. The
previous commit dropped the pre-flight without preserving the mapping
at the mongoexport layer, breaking that contract.

Add an "error parsing URI"/"error parsing uri" pattern to
Export::handleMongoExportFails so mongoexport's URI parser surfaces
the same clean message that mongosh did via parseMongoshError.

Restore tests/functional/test-wrong-uri/ — same input, same expected
output. Now covers the mongoexport pathway instead of the mongosh
pathway.

Also drop the explanatory comment block from Extractor::extract() —
"why this line was removed" belongs in git history and the PR
description, not in the code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MiroCillik MiroCillik requested a review from ondrajodas May 14, 2026 15:37
@MiroCillik

Copy link
Copy Markdown
Member Author

@ondrajodas plz review, musel jsem nakonec odstranit to testConnection uplne, ale prijde mi, ze to tam bylo zbytecny

@ondrajodas ondrajodas left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jn, já si říkal že to tak nikde neděláme

@MiroCillik MiroCillik merged commit b8250df into main May 15, 2026
8 checks passed
@MiroCillik MiroCillik deleted the miro-AJDA-2721-skip-preflight branch May 15, 2026 08:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants