Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 30 additions & 0 deletions interaction-skills/connection.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,3 +45,33 @@ Prefer navigating an existing tab over `new_tab()`. Tabs created via CDP's `Targ
tab = ensure_real_tab()
goto("https://example.com")
```

## Clicking the "Allow remote debugging?" sheet without the user (macOS, Chrome 148+)

The CDP websocket handshake hangs until the user clicks **Allow** on a native Chrome sheet. The sheet only exists *while a connection attempt is pending*, and its buttons have **no AXTitle** — match by `description` instead. Start the connect in the background, then:

```bash
# detect: the sheet is named "Allow remote debugging?"
osascript -e 'tell application "System Events" to tell process "Google Chrome" to get name of sheets of front window'

# click Allow (buttons are unnamed; "Turn off in settings"/"Cancel"/"Allow" by description)
osascript -e 'tell application "System Events" to tell process "Google Chrome" to click (button 1 whose description is "Allow") of group 1 of group 2 of group 1 of group 1 of group "Allow remote debugging?" of sheet 1 of front window'
```

The first harness call after the click may still report the old handshake timeout — retry once; the Allow grant is sticky.

## Screenshot pixels ≠ click coordinates (Retina)

`click(x, y)` takes CSS/DOM pixels, but on Retina displays `screenshot()` output is downscaled (e.g. a 1800px-wide viewport produces a 900px-wide PNG). Clicking coordinates read off the screenshot lands at half-position and silently hits the wrong element. Get click targets from the DOM instead:

```python
import json
r = json.loads(js("""
(()=>{const el=[...document.querySelectorAll('button')].find(b=>/create/i.test(b.textContent));
const b=el.getBoundingClientRect();
return JSON.stringify({x:Math.round(b.x+b.width/2),y:Math.round(b.y+b.height/2)})})()
"""))
click(r["x"], r["y"])
```

Or scale screenshot coords by `page_info()["w"] / image_width`. Sites with sticky promo banners (Google Cloud console's free-trial bar) also shift everything ~25px between screenshots — re-resolve coordinates from the DOM right before each click.