Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions .github/workflows/test-cicd.yml
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,15 @@ jobs:
set -o pipefail
uv run pytest -n logical tests --ignore=tests/integration/test_webvoyager_resolution.py --ignore=tests/integration/test_e2e.py --ignore=tests/examples/test_examples.py --ignore=tests/examples/test_readme.py --durations=10 --junitxml=pytest.xml --cov-report=term-missing:skip-covered --cov=packages | tee pytest-coverage.txt

- name: Upload signup email test debug logs
if: always()
uses: actions/upload-artifact@v4
with:
name: signup-debug-logs
path: /tmp/signup-debug/
retention-days: 7
if-no-files-found: ignore

- name: Pytest coverage comment
if: ${{ always() && github.ref != 'refs/heads/main' }}
uses: MishaKav/pytest-coverage-comment@main
Expand Down
64 changes: 53 additions & 11 deletions tests/browser/test_tools.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,10 @@
import logging
import os
import time
from pathlib import Path

import pytest
from loguru import logger
from notte_browser.errors import NoToolProvidedError
from notte_browser.tools.base import EmailReadAction, PersonaTool
from notte_sdk import NotteClient
Expand Down Expand Up @@ -52,16 +58,52 @@ def test_tool_execution_in_session(persona: NottePersona, action: EmailReadActio
assert len(out.data.structured.get().emails) > 0


@pytest.mark.timeout(120)
@pytest.mark.flaky(reruns=3, reruns_delay=5)
Comment on lines +61 to 62

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

@pytest.mark.flaky was supposed to be removed by this PR.

The stated objective of this PR is to make the test fail-fast by removing @pytest.mark.flaky(reruns=3, reruns_delay=5), but the decorator is still present on Line 62. With @pytest.mark.timeout(120) added, the test can now spend up to 3 × (120 s + 5 s delay) = 375 s before reporting a hard failure — the opposite of the fail-fast intent described in the PR title and description.

🐛 Proposed fix
 `@pytest.mark.timeout`(120)
-@pytest.mark.flaky(reruns=3, reruns_delay=5)
 def test_signup_email_extraction(persona: NottePersona):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/browser/test_tools.py` around lines 61 - 62, Remove the flaky reruns
decorator so the test fails fast: delete the line with
`@pytest.mark.flaky`(reruns=3, reruns_delay=5) leaving only
`@pytest.mark.timeout`(120) above the test function (i.e., remove the
`@pytest.mark.flaky` decorator attached to the same test so it no longer performs
reruns).

def test_signup_email_extraction(persona: NottePersona):
with notte.Session(headless=True) as session:
agent = notte.Agent(session=session, persona=persona, max_steps=15)
resp = agent.run(
task=(
"Go to console.notte.cc, login with the email signup email, verify the account. "
"Stop after the account is verified, i.e as soon as your are on the 'One more second' page."
"CRITICAL: do not fill the in the onboarding form, just stop after the account is verified"
),
url="https://console.notte.cc",
)
assert resp.success, f"Failed to run agent: {resp.answer}"
log_dir = Path(os.getenv("SIGNUP_DEBUG_LOG_DIR", "/tmp/signup-debug"))
log_dir.mkdir(parents=True, exist_ok=True)
log_path = log_dir / f"signup-{int(time.time() * 1000)}-{os.getpid()}.log"
Comment on lines +64 to +66

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Ruff S108: predictable /tmp path could allow a symlink attack in shared environments.

In a typical single-tenant CI runner the risk is low, and the path is already overridable via SIGNUP_DEBUG_LOG_DIR. Using tempfile.mkdtemp() or Path(tempfile.gettempdir()) / "signup-debug" would silence the lint and is marginally safer.

🧰 Tools
🪛 Ruff (0.15.12)

[error] 64-64: Probable insecure usage of temporary file or directory: "/tmp/signup-debug"

(S108)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/browser/test_tools.py` around lines 64 - 66, The current creation of
log_dir via Path(os.getenv("SIGNUP_DEBUG_LOG_DIR", "/tmp/signup-debug")) is
flagged for using a predictable /tmp path; replace the default with a safe
temp-backed directory: use tempfile.mkdtemp() or construct
Path(tempfile.gettempdir()) / "signup-debug" as the fallback when
SIGNUP_DEBUG_LOG_DIR is not set, then ensure the existing
log_dir.mkdir(parents=True, exist_ok=True) and log_path creation (the variables
log_dir and log_path) continue to work unchanged so tests still write to a valid
directory.

loguru_sink_id = logger.add(str(log_path), level="DEBUG", enqueue=False, backtrace=True, diagnose=True)
stdlib_handler = logging.FileHandler(str(log_path))
stdlib_handler.setLevel(logging.DEBUG)
stdlib_handler.setFormatter(logging.Formatter("%(asctime)s [stdlib %(name)s %(levelname)s] %(message)s"))
logging.getLogger().addHandler(stdlib_handler)
logger.info(f"=== signup_email_extraction start | log_path={log_path} | pid={os.getpid()} ===")
try:
with notte.Session(headless=True) as session:
agent = notte.Agent(session=session, persona=persona, max_steps=15)
resp = agent.run(
task=(
"Go to console.notte.cc and authenticate with the persona's email. "
"If the account does not exist yet, sign up; if it already exists, log in. Either path is fine. "
"ABSOLUTE RULE — NEVER CLICK any of the following buttons under ANY circumstances: "
"'Use Google', 'Continue with Google', 'Sign in with Google', 'Sign up with Google', "
"'Use GitHub', 'Continue with GitHub', 'Sign in with GitHub', 'Sign up with GitHub', "
"or any button whose visible label contains the words 'Google', 'GitHub', 'SSO', "
"'Microsoft', 'Apple', or 'social'. Before EVERY click, read the button's exact text "
"label and verify it is NOT one of the forbidden labels above. If you click one of "
"these by accident, the task has FAILED — you must immediately navigate back to "
"console.notte.cc and start over. "
"The ONLY acceptable authentication path is the plain email flow: enter the email "
"address into the email input field, then click a button labeled exactly 'Send magic link', "
"'Continue with email', 'Sign in with email', or 'Submit' (or a similar email-only button). "
"When a verification or magic-link email is required, check the persona's inbox and open "
"the link from that email to complete authentication. "
"Success = you are authenticated and have landed inside the console (any logged-in page is "
"acceptable, e.g. the 'One more second' interstitial, the personal/agent console, or the "
"dashboard). Stop as soon as you reach any logged-in page. "
"Do not fill in any onboarding form — stop immediately once authenticated."
),
url="https://console.notte.cc",
)
logger.info(f"=== agent.run finished | success={resp.success} | answer={resp.answer!r} ===")
assert resp.success, f"Failed to run agent: {resp.answer}"
except BaseException as exc:
logger.exception(f"=== signup_email_extraction failed: {type(exc).__name__}: {exc} ===")
raise
finally:
logger.info("=== signup_email_extraction end ===")
logger.remove(loguru_sink_id)
logging.getLogger().removeHandler(stdlib_handler)
stdlib_handler.close()
Comment on lines +107 to +109

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bug (P1): logger.remove(loguru_sink_id) raises ValueError on reruns because the sink ID from a previous attempt is no longer valid when the timeout fires mid-run and the finally block executes. Wrap the removal in a try/except to make teardown idempotent.

Suggested change
Suggested change
logger.remove(loguru_sink_id)
logging.getLogger().removeHandler(stdlib_handler)
stdlib_handler.close()
try:
logger.remove(loguru_sink_id)
except ValueError:
pass
logging.getLogger().removeHandler(stdlib_handler)
stdlib_handler.close()
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/browser/test_tools.py, line 107:

<issue>
`logger.remove(loguru_sink_id)` raises `ValueError` on reruns because the sink ID from a previous attempt is no longer valid when the timeout fires mid-run and the `finally` block executes. Wrap the removal in a try/except to make teardown idempotent.
</issue>

Loading