From 431c89ea5f5d3417960e2f5e0d8a0408cc3e592b Mon Sep 17 00:00:00 2001 From: Barry Sevig Date: Tue, 9 Jun 2026 13:19:24 -0700 Subject: [PATCH] Document Allow-sheet AppleScript click and Retina screenshot/coord mismatch --- interaction-skills/connection.md | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/interaction-skills/connection.md b/interaction-skills/connection.md index b5cab90a..87f8723b 100644 --- a/interaction-skills/connection.md +++ b/interaction-skills/connection.md @@ -45,3 +45,33 @@ Prefer navigating an existing tab over `new_tab()`. Tabs created via CDP's `Targ tab = ensure_real_tab() goto("https://example.com") ``` + +## Clicking the "Allow remote debugging?" sheet without the user (macOS, Chrome 148+) + +The CDP websocket handshake hangs until the user clicks **Allow** on a native Chrome sheet. The sheet only exists *while a connection attempt is pending*, and its buttons have **no AXTitle** — match by `description` instead. Start the connect in the background, then: + +```bash +# detect: the sheet is named "Allow remote debugging?" +osascript -e 'tell application "System Events" to tell process "Google Chrome" to get name of sheets of front window' + +# click Allow (buttons are unnamed; "Turn off in settings"/"Cancel"/"Allow" by description) +osascript -e 'tell application "System Events" to tell process "Google Chrome" to click (button 1 whose description is "Allow") of group 1 of group 2 of group 1 of group 1 of group "Allow remote debugging?" of sheet 1 of front window' +``` + +The first harness call after the click may still report the old handshake timeout — retry once; the Allow grant is sticky. + +## Screenshot pixels ≠ click coordinates (Retina) + +`click(x, y)` takes CSS/DOM pixels, but on Retina displays `screenshot()` output is downscaled (e.g. a 1800px-wide viewport produces a 900px-wide PNG). Clicking coordinates read off the screenshot lands at half-position and silently hits the wrong element. Get click targets from the DOM instead: + +```python +import json +r = json.loads(js(""" +(()=>{const el=[...document.querySelectorAll('button')].find(b=>/create/i.test(b.textContent)); + const b=el.getBoundingClientRect(); + return JSON.stringify({x:Math.round(b.x+b.width/2),y:Math.round(b.y+b.height/2)})})() +""")) +click(r["x"], r["y"]) +``` + +Or scale screenshot coords by `page_info()["w"] / image_width`. Sites with sticky promo banners (Google Cloud console's free-trial bar) also shift everything ~25px between screenshots — re-resolve coordinates from the DOM right before each click.