Unify iframe + agent-browser into one swappable-renderer browser surface #156
Reframe the surface's far-left chip as a render-mode + sync indicator and turn its modal into the single "Display" control center for how a surface renders (docs/specs/dor-iframe.md -> Path 1; dor-agent-browser.md -> Headed Pop-Out). UI-only, driven by Storybook: the production panel does not wire the new actions yet, so the live modal is unchanged apart from the cosmetic chip glyphs.
- chip: FrameCorners = embed, LockSimple = screencast synced, LockSimpleOpen = scaled (SurfacePaneHeader)
- modal: new Render section (Screencast/Embed) gated on setRenderMode; viewport controls grey out in embed; Pop out button gated on canPopOut
- types: optional renderMode / setRenderMode / popOut / canPopOut on the screen controller, so existing constructions stay green
- stories: renderMode + canPopOut knobs; Embed / EmbedRender / NoPopOut
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n icons
Builds on the swappable-render storybook UI: make pop-out a third render
backend and restyle the Display modal around the three modes. Still
UI-only (driven by Storybook); production wiring lands later.
- RenderMode is now screencast | popout | embed; drop the separate
popOut() action — pop-out is just setRenderMode('popout'), gated by
canPopOut (hidden on web).
- Each render option uses its exact name (agent-browser screencast /
agent-browser popout / iframe embed) and lists its agent/URL/feel
trade-offs as green-check / red-x rows.
- Screencast's resolution nests under it: Resize with pane (link) vs
Fixed (lock, via Device/Custom); greys out for the other modes. Drop
the now-redundant "Currently"/"RENDER" chrome.
- Far-left chip icons reiterated: link / lock for screencast
resize / fixed, ArrowSquareOut for popout, FrameCorners for embed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tighten the screencast resolution UI in the Display modal:
- collapse the 8-button device grid into a single "Emulate" dropdown (none + the device registry); picking a device disables the dims.
- put W / H / DPI inline to the right of "Fixed", sized to their max digits (W/H 4, DPI 1) and borderless (underline only; the boxed inputs were too much framing).
- drop the "Resize with pane" detail text and the phantom icon-gap before "agent-browser screencast"; remove the now-dead Currently readout helpers (formatDpr, pane).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Quality cleanup (behavior-preserving):
- collapse the three near-identical render-option blocks into one
RenderOption helper driven by a features array; screencast passes its
nested resolution controls as children.
- fuse screenChipLabel + ScreenChipIcon into one screenChip() returning
{ icon, label } so the glyph and its label can't drift apart, and the
icon renders as a value rather than its own component fiber.
- drop an empty-string className branch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
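A minimal sketch of the fused chip helper described above, for illustration only. The `ScreenState` shape, the string stand-ins for the icon components, and the label wording are assumptions; only the mode names and the glyph chosen per mode come from the commit messages.

```typescript
// Hypothetical shapes: the real controller types are not shown in this PR text.
type RenderMode = "screencast" | "popout" | "embed";

interface ScreenState {
  renderMode: RenderMode;
  // Only meaningful for screencast: true means "Resize with pane".
  synced: boolean;
}

interface ScreenChip {
  icon: string; // stand-in for an icon element
  label: string;
}

// One function returns both glyph and label, so the two cannot drift apart.
function screenChip(state: ScreenState): ScreenChip {
  switch (state.renderMode) {
    case "embed":
      return { icon: "FrameCorners", label: "iframe embed" };
    case "popout":
      return { icon: "ArrowSquareOut", label: "agent-browser popout" };
    case "screencast":
      return state.synced
        ? { icon: "Link", label: "agent-browser screencast (resize with pane)" }
        : { icon: "LockSimple", label: "agent-browser screencast (fixed)" };
  }
}
```

Returning a plain value rather than rendering a separate icon component is what lets the caller destructure `{ icon, label }` in one place.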
Add the Wall-level in-place renderer swap and route the Display modal's
"iframe embed" choice through it: selecting embed on an agent-browser
surface reads the active tab URL, replaces the pane with an iframe of
that URL in the same dock slot, and closes the now-unneeded browser.
- Wall.replaceSurface(oldId, {component, params, title}) generalizes
createContentSurface's replace-untouched-terminal branch; the shared
closeAgentBrowserSession factors session teardown out of
killPaneImmediately.
- WallActions.onSwapRenderMode(id, mode): agent-browser→embed works now;
iframe→screencast/popout calls the host-gated agentBrowserOpen and is
inert until that capability lands (Stage 3).
- AgentBrowserPanel publishes renderMode 'screencast', gates canPopOut on
agentBrowserPopOut, and setRenderMode('embed') triggers the swap.
- PlatformAdapter gains agentBrowserOpen / agentBrowserPopOut / PopIn /
BringToFront interfaces (host impls next); the Wall caches the last
`dor ab` binaryPath to spawn with.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AgentBrowserPanel gains a popped-out state: setRenderMode('popout') calls
the host relaunch-headed capability; the body becomes a clean stub
("running in a separate window" + Bring to front / Pop back in), the
canvas stays mounted but hidden, and the stream stays connected to observe
tabs/status. setRenderMode('screencast') — or the headed window closing,
auto-reverted once the headed stream has connected — relaunches headless
and resumes. renderMode on the snapshot reflects popout; persisted via
params.poppedOut.
Inert until the host wires agentBrowserPopOut/PopIn/BringToFront; canPopOut
gates the modal option on agentBrowserPopOut.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the host side so embed→screencast and pop-out actually work on the VS Code host (standalone/Tauri lacks agent-browser today, so it degrades: the optional methods stay absent there).
- agentBrowserOpen(url): mint a managed gui-session, `open <url>`, read the stream port via `stream status --json`; this mirrors `dor ab`. Completes the iframe embed → live screencast swap.
- agentBrowserPopOut/PopIn: pop-out is a relaunch (headed/headless is fixed at launch), so close the session and reopen it `--headed`/headless at the active URL, returning the new stream port. v1 preserves the active tab URL only; window positioning over the pane is deferred (VS Code can't read screen coords → Chrome places the window). Needs verification against the real agent-browser CLI.
- Full plumbing: vscode-adapter methods (+ constructor bindings), message-types request/response, message-router dispatch.
- The panel passes the active URL into pop-out/in and hides "Bring to front" unless the host implements agentBrowserBringToFront (no-op for now).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…onal)
IframePanel now registers a screen controller, so an iframe embed surface shows the unified browser chrome (URL + far-left chip → Display modal) and can swap back to a live screencast. Gated on the host's agentBrowserOpen capability: without a way to spawn a screencast there's nothing to swap to, so the embed surface keeps its plain title (e.g. the web host).
- chromeActions: navigate updates the framed URL (re-resolves the proxy); reload bumps a nonce to re-resolve (a cross-origin frame can't reload via contentWindow); back/forward are no-ops (frame history is unreachable).
- setRenderMode(screencast) routes through the Wall's onSwapRenderMode, which spawns an agent-browser for the URL and replaces the pane in place. embed→popout is left out for now (the modal offers screencast only).
This completes the bidirectional screencast ↔ embed swap. The dor iframe surface also gains the URL header + swap chip on capable hosts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Update the specs to describe what shipped: the Display modal's three-cell Render section, Wall.replaceSurface, the asymmetric swap directions (the embed→screencast spawn gated on agentBrowserOpen), and pop-out as an in-panel render mode with its v1 limits (active-tab-URL only, no window positioning, VS-Code-only). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
agentBrowserOpen gains a `headed` option so embed→popout is a single spawn: the host launches the new agent-browser headed in one shot, and the Wall mounts the replacement surface with poppedOut: true — straight into the stub, no headless-screencast flash before an immediate relaunch. The iframe controller's canPopOut now gates on agentBrowserPopOut, so the embed Display modal offers Pop-out alongside Screencast. Drops the unused popOutOnReady plumbing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Avoids the interactive prompt / telemetry nag when launching Storybook. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two defects surfaced validating the swap/pop-out matrix on the VS Code host.
Phase 0 confirmed the agent-browser CLI behaves as the host assumes (headed
sessions stream, close kills the browser, relaunch-by-name works), so both were
lib-side, not host/CLI.
1. embed→screencast Apply was disabled. The Display modal gated Apply on the
viewport-drive capability (hostCapable), which an embed surface reports as
false. A render-mode swap only needs the spawn capability (already gating
whether the option shows), so gate the swap separately: Apply is enabled for
any mode switch and the "dor ab set" note is hidden during a swap.
2. Auto-revert resurrected torn-down sessions. Killing or swapping away from a
popped-out surface issues `close`, dropping the headed stream; the panel read
that as a user window-close and relaunched the session headless. A shared
teardown guard (agent-browser-sessions.ts) marks a session closed before
Dormouse closes it so auto-revert stands down; a freshly-mounted panel clears
the mark so a re-created managed name works again.
Also harden readStreamPort with a brief retry so a relaunch that hasn't yet
published its stream port doesn't pin the pane to a dead port ("ended while the
session is live"). TODO.md records the Phase 0 findings + corrected root causes
for the next validation run.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
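The teardown guard from defect 2 can be sketched roughly as below. The function names and the module-level set are assumptions for illustration; the actual agent-browser-sessions.ts may be shaped differently.

```typescript
// Hypothetical sketch of a shared teardown guard keyed by session name.
const tornDownSessions = new Set<string>();

// Call before the app itself issues `close` for a session.
function markSessionClosing(session: string): void {
  tornDownSessions.add(session);
}

// Call when a panel mounts, so a re-created managed name works again.
function clearSessionClosing(session: string): void {
  tornDownSessions.delete(session);
}

// Auto-revert should relaunch headless only when the headed stream had
// connected and the drop was not caused by our own teardown.
function shouldAutoRevert(session: string, headedStreamWasLive: boolean): boolean {
  return headedStreamWasLive && !tornDownSessions.has(session);
}
```

Marking before the `close` is issued matters: the stream drop can arrive before the close call returns, so the mark has to be visible by then.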
Resolve the contradiction the half-update left: Path 1 (Swappable Render Backend) shipped but still sat under "# Future Work / Designed, not yet built." Promote "Render Backends: Two Axes" to the implemented body and scope the roadmap framing to Path 2 (Plugin System), which gets its own not-yet-built status note. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The render-swap + headed pop-out work shipped past what the spec described (and a prior half-update left "implemented" callouts under "Future Expansions / not yet built"). Reconcile to as-built:
- Render Indicator: the far-left chip now encodes render mode (embed / screencast synced/scaled / popped-out) and opens the Display modal; point glyphs at the BrowserChromeHeader story.
- Display modal: replace the old Sync/Device/Custom mockup + tables with the Render + Resolution model, pointing at the AgentBrowserScreenModal story as the UI source of truth; keep the load-bearing behavior (sync, last-writer-wins, persistence, degradation).
- Browser-Chrome Header: iframe-embed surfaces also get the chrome (not just terminals=plain); rename the chip.
- Host capabilities: add agentBrowserOpen / PopOut / PopIn / BringToFront.
- Lifecycle: render-swap-away close + the auto-revert teardown guard.
- Session naming: gui-<hex> sessions and their not-`--key`-addressable limit.
- Implementation Touchpoints: replaceSurface/onSwapRenderMode, IframePanel controller, agent-browser-sessions.ts.
- Headed Pop-Out: rewrite as-built (modal radio not header arrow, no type-char confirm, active-tab-URL only, VS-Code-only, Bring-to-front unimplemented) and move out of "Future Expansions"; note the dropped confirm as a deviation.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The spec's chip table had a stale glyph mapping carried over from a superseded storybook pass: the shipped header uses LinkIcon for screencast SYNCED (resizes-with-pane) and LockSimpleIcon for SCALED (fixed) — no open-lock — which also mirrors the Display modal's Resize-with-pane / Fixed controls. Fix the spec to match `SurfacePaneHeader.tsx`. (The TODO glossary was already correct.) TODO: refresh the intro (623 tests; Phase 0 CLI now validated live, the VS Code webview matrix still pending) and the diagnosis guidance (Phase 0 confirmed, so a failure points at the lib reconnect/lifecycle or the host plumbing, not the CLI). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Quality-only cleanups on the render-backend-swap branch (no behavior change):
- AgentBrowserPanel popOut: hoist the duplicated "revert unless stream came
back live" block (repeated in the !ok and catch arms) into a single
revertUnlessLive helper.
- AgentBrowserPanel popIn: drop the duplicated updateParameters({poppedOut:false})
and the redundant early-return guard; the common state writes now run once.
- AgentBrowserPanel frame handler: replace the empty-body `if (poppedOut) {}`
comment anchor with a guard clause around the draw branch.
- IframePanel: read history.index through a ref so back()/forward() — and thus
chromeActions — stay stable, so the screen-controller registration no longer
disposes+re-registers (and re-renders the header) on every navigation.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… + pop-out)
The branch's swappable render backend, headed pop-out, and HiDPI screencast
surface only worked on the VS Code host, which implements the full
agent-browser capability set. The standalone (Tauri) host only had
createIframeProxyUrl, so on standalone the screencast surface, the
embed->screencast render swap, and pop-out were inert. This mirrors the VS Code
host (vscode-ext/src/agent-browser-host.ts) on standalone.
Rust (standalone/src-tauri/src/agent_browser.rs, new module wired into lib.rs):
- agent_browser_command: runs the binary against a session; validates args[0]
against the subcommand allowlist (kept in sync with
AGENT_BROWSER_ALLOWED_SUBCOMMANDS in lib) -- not a general exec channel.
- agent_browser_screenshot: captures one device-resolution frame and returns
the raw bytes as an ArrayBuffer (tauri::ipc::Response), no base64 round-trip.
- agent_browser_edit: host-owned eval for select-all/copy/cut; copy/cut land on
the OS clipboard (tauri-plugin-clipboard-manager, mirroring
vscode.env.clipboard.writeText).
- agent_browser_open: spawns a managed namespaced session (dormouse.1.gui-<hex>)
and opens a url (optionally headed), returning { session, wsPort, binaryPath }.
- agent_browser_pop_out / agent_browser_pop_in: resolve the live active url via
`get url`, close, then relaunch headed/headless, returning the new wsPort.
- agent_browser_stream_status: reads the current `stream status --json` port so
restored panels recover from a stale persisted wsPort.
All accept an optional binaryPath and fall back through
DORMOUSE_AGENT_BROWSER_BIN -> PATH (runWithBinaryFallback equivalent).
TS (standalone/src/tauri-adapter.ts): implemented agentBrowserCommand,
agentBrowserEdit, agentBrowserScreenshot, agentBrowserStreamStatus,
agentBrowserOpen, agentBrowserPopOut, agentBrowserPopIn, each invoking the
matching Rust command. getAgentBrowserStreamUrl is intentionally omitted: the
stream server accepts the tauri://localhost origin, so the panel's built-in
fallback connects directly to ws://127.0.0.1:<port> (no relay). agentBrowserBringToFront
stays unimplemented, consistent with VS Code.
CSP (tauri.conf.json): added ws://127.0.0.1:* and ws://localhost:* to
connect-src so the screencast stream WebSocket can connect.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- agent_browser.rs: add command_error(result, label) helper, collapsing four identical stderr-or-exit-code error blocks (edit/open/pop_out/pop_in).
- agent_browser.rs: factor pop_out/pop_in's shared close+relaunch body into a private relaunch(session, url, headed, binary_path) primitive; the two Tauri commands are now thin headed=true/false wrappers.
- agent_browser.rs: drop the dead `pid << 64` term in generate_gui_session: it was masked entirely away by `& 0xffff_ffff_ffff`, so the PID never affected the output (session id is unchanged: low 48 bits of nanos).
- tauri-adapter.ts: extract errMessage(err) helper, replacing 8 copies of the `err instanceof Error ? err.message : String(err)` idiom.
Quality-only, no behavior change. lib tsc, standalone tsc, and cargo check all pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A popped-out surface is a real headed OS window the user drives directly, but the viewport-sync effects (ResizeObserver, wsPort-change, DPR) gate only on `syncEngaged`, never on `poppedOut` — so a swap-to-popout (which seeds `syncEngaged: true`) kept issuing `agent-browser set viewport <pane>` on every Dormouse pane resize, fighting the native window even though popout mode disables the resolution controls. Guard the single `issueSyncToPane` chokepoint on `poppedOutRef` so no `set viewport` is issued while popped out. `syncEngaged` stays true, so sync resumes correctly when the surface pops back in (the wsPort-change effect re-issues against the fresh session). Found by codex review. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
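A rough sketch of guarding the single sync chokepoint, under stated assumptions: refs are modelled as plain `{ current }` objects, and `setViewport` stands in for whatever issues `agent-browser set viewport`. The factory wrapper is hypothetical and exists only to make the sketch self-contained.

```typescript
// Minimal ref shape, standing in for a React-style ref.
interface Ref<T> {
  current: T;
}

// Hypothetical factory; the real issueSyncToPane lives inside the panel.
function makeIssueSyncToPane(
  syncEngagedRef: Ref<boolean>,
  poppedOutRef: Ref<boolean>,
  setViewport: (width: number, height: number) => void,
): (width: number, height: number) => boolean {
  return function issueSyncToPane(width: number, height: number): boolean {
    if (!syncEngagedRef.current) return false;
    // Popped out: the native window owns its size, so issue nothing.
    if (poppedOutRef.current) return false;
    setViewport(width, height);
    return true;
  };
}
```

Because `syncEngaged` is left untouched, flipping `poppedOutRef.current` back to false is enough for the next resize to sync again.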
A popped-out surface relaunches its agent-browser session headed — a real
OS window detached from the extension host. Nothing closed it on shutdown
(`close` only ran for explicit kill / render-swap), so quitting the editor
left an orphaned Chrome window behind, contrary to the spec's pop-out
lifecycle ("Dormouse/editor quits → headed windows are cleaned up; no
orphans").
Track popped-out sessions in the host (set on pop-out, cleared on pop-in
or any explicit `close`) and close them from `deactivate()`. Headless
sessions are deliberately left alive to reattach across webview reloads.
deactivate() also fires on Reload Window; a popped-out surface then
auto-reverts to a headless screencast on reactivation, which is preferable
to a detached headed Chrome lingering. Found by codex review.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The specs described an agent-browser surface that was VS-Code-only, but the standalone (Tauri) host had since gained the full capability set (command, edit, screenshot, stream-status, open, pop-out, render-swap). Flip the stale "Tauri today is inert" / "VS-Code-only" claims to name the web host as the only one without agent-browser, add the standalone files to the touchpoints, and mark Path 1 implemented on both hosts. Notes the one remaining VS-Code-only gap: closing orphaned headed pop-out windows on shutdown. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The two hosts implemented the same capability set in two languages (Node/TS in the VS Code extension host, a hand-ported 541-line Rust module in standalone), so they drifted (notably: the standalone never closed orphaned headed pop-out windows on shutdown). Collapse both onto a single host-agnostic module, lib/src/host/agent-browser-host.ts, exactly as the iframe proxy is already shared:
- lib/src/host/agent-browser-host.ts: the single source of truth (binary resolution/spawn, allowlist, edit scripts, gui-session naming, URL resolution, screenshot, pop-out tracking, closePoppedOut). A factory injecting only the two genuinely host-specific bits: clipboard-write and logging.
- VS Code host: slimmed to instantiate the shared module + keep the VS-Code-only stream relay (the vscode-webview:// origin workaround).
- Standalone: the Node sidecar runs the bundled agent-browser-host.cjs behind thin Rust forwarders in lib.rs (mirroring iframe_create_proxy_url); deleted agent_browser.rs and the tauri-plugin-clipboard-manager dep. clipboard-ops.js gains writeClipboardText (pbcopy/clip/xclip/wl-copy), and the sidecar's shutdown() now calls closePoppedOut(), which is the orphan-window fix. Screenshot bytes ride the sidecar stdio as base64 and are decoded back to a raw tauri::ipc::Response, so the webview still receives an ArrayBuffer.
Also fixes a latent gap surfaced by unifying: a headed `open` (embed→popout, which spawns headed directly) is now tracked for shutdown cleanup; the old code only tracked popOut.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r surface
Replace the "three surface types" framing with a single surfaceType:'browser'
whose render axis is one canonical renderMode field: ab-screencast | ab-popout |
iframe (the ab- prefix names the engine, leaving room for a future engine; iframe
is engine-less). The URL and renderMode become single-homed, persisted panel
params, which is the fix for the buggy transitions — the URL had no canonical
home for agent-browser and was laundered from a possibly-stale live snapshot at
swap time.
- Canonical BrowserPaneState { url, renderMode, agentBrowser? } as the single
source of truth across every renderer.
- A BrowserPanel shell owns url+renderMode and remounts the matching renderer
child (not a fused component — input models still differ).
- Render-mode transition matrix: iframe -> ab trivial; ab <-> ab silently drops
non-active tabs (accepted, pending profile persistence); ab -> iframe warns +
typed-character confirm when >=2 tabs.
- dor iframe opens a full browser-chrome tab (chrome ungated from swap-capability).
- iframe new-tab attempts (target=_blank / window.open) intercepted by the shim
and prompted -> open as a new browser pane.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
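The canonical state and the transition matrix above might be sketched as follows. The fields inside `agentBrowser`, the `SwapEffect` names, and the single-tab cases (treated as trivial) are assumptions; the commit message only specifies the three mode names and the multi-tab behavior.

```typescript
type BrowserRenderMode = "ab-screencast" | "ab-popout" | "iframe";

interface BrowserPaneState {
  url: string;
  renderMode: BrowserRenderMode;
  // Assumed fields; present only while an agent-browser engine backs the pane.
  agentBrowser?: { session: string; wsPort: number };
}

// Hypothetical labels for the outcomes listed in the matrix.
type SwapEffect = "none" | "trivial" | "drops-inactive-tabs" | "confirm-required";

function usesAgentBrowser(mode: BrowserRenderMode): boolean {
  return mode !== "iframe";
}

function classifySwap(state: BrowserPaneState, to: BrowserRenderMode, tabCount: number): SwapEffect {
  const from = state.renderMode;
  if (from === to) return "none";
  // iframe -> ab: one URL is handed to a fresh engine.
  if (!usesAgentBrowser(from)) return "trivial";
  // ab -> iframe: an iframe shows one page, so extra tabs need a typed confirm.
  if (!usesAgentBrowser(to)) return tabCount >= 2 ? "confirm-required" : "trivial";
  // ab <-> ab: the relaunch keeps only the active tab.
  return tabCount >= 2 ? "drops-inactive-tabs" : "trivial";
}
```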
The agent-browser surface never persisted its URL — it lived only in the live session/stream — so render-mode swaps and pop-out/auto-revert laundered it from a chrome snapshot that can be empty mid-relaunch, yielding blank-page swaps and about:blank auto-reverts. Mirror the active tab's URL into params.url whenever the chrome snapshot changes, and read params.url first (falling back to the live snapshot) in popOut/popIn and in Wall.onSwapRenderMode's agent-browser->embed path. params.url is now the single source of truth and round-trips through the layout blob. First increment of the browser-surface unification (docs/specs/dor-iframe.md -> Path 1, "url is single-homed"). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The RenderMode enum was screencast | popout | embed. Rename to the engine-prefixed vocabulary (docs/specs/dor-iframe.md → "Render Backends"): ab-screencast, ab-popout, iframe. The ab- prefix names the engine (agent-browser), leaving room for a future engine beside it; iframe is the engine-less DOM embed. Pure rename across the screen controller, both panels, the Display modal, the header chip, stories, and tests — no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent-browser daemon re-broadcasts the current frame and tab list on a ~20Hz heartbeat even when nothing changes. The connection forwarded every message, so a *static* page drove ~20 device-resolution screenshots/sec — each a child-process spawn (`agent-browser screenshot`) — plus ~20 `setTabs` re-renders/sec. Worse, each forwarded frame's screenshot poked the daemon into emitting again, a self-perpetuating loop that never settled. Drop byte-identical frame and tab re-broadcasts (djb2 hash of the payload) before emitting `frame-pulse`/`tabs`, resetting the dedupe sentinels on reconnect so a fresh stream always re-primes. A genuine change (animation, navigation, new/closed/focused tab, title) alters the bytes and flows through untouched. On a settled static page this takes idle work from ~20/sec to 0 (the feedback loop breaks and the daemon goes quiet), matching the screenshot loop's contract: "a static page produces no pulses, so no shots and no cost." Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
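A minimal sketch of hash-based dedupe, assuming the payloads are available as strings. `StreamDeduper` and its method names are hypothetical; only the djb2 hash, the separate frame and tab sentinels, and the reset on reconnect come from the commit message.

```typescript
// Classic djb2 (hash * 33 + c), kept in the unsigned 32-bit range.
function djb2(payload: string): number {
  let hash = 5381;
  for (let i = 0; i < payload.length; i++) {
    hash = ((hash << 5) + hash + payload.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Hypothetical helper: remembers the last hash seen per message kind.
class StreamDeduper {
  private lastFrame: number | null = null;
  private lastTabs: number | null = null;

  // True when the frame differs from the previous one and should be emitted.
  shouldEmitFrame(payload: string): boolean {
    const hash = djb2(payload);
    if (hash === this.lastFrame) return false;
    this.lastFrame = hash;
    return true;
  }

  // Same check for the tab list broadcast.
  shouldEmitTabs(payload: string): boolean {
    const hash = djb2(payload);
    if (hash === this.lastTabs) return false;
    this.lastTabs = hash;
    return true;
  }

  // Call on reconnect so a fresh stream always re-primes.
  reset(): void {
    this.lastFrame = null;
    this.lastTabs = null;
  }
}
```

Comparing hashes alone means a rare collision could drop a genuinely changed payload; whether that trade-off is acceptable is a judgment for the real implementation.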
Update the skill from a real debugging session:
- Freshness: `close --all` is global (kills the outer session too); document the re-open + stray-about:blank-needs-a-second-open recovery.
- Driving: `keyboard` is type/inserttext only (use inserttext, since `type` reorders chars); submit via synthetic keydown Enter, not `\r`; wrap evals in an IIFE; click via mouse move/down/up with a dwell; map clicks through the 1:1 canvas→nested-page offset.
- Timing: prefer a `window.__M` global + shell polling over mirror lines; note the cmdStart-on-retry and mouse-round-trip skew caveats.
- What To Watch: record the static-page churn root cause + dedupe fix and a concrete idle regression check.
- Validation: add the `lib/src` tsc + vitest path and the Vite-alias hot-reload note.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The pane that `dor ab open` creates is not the selected pane — the terminal the command ran in is. The canvas mouse handlers gated on `interactive` (passthrough mode AND this pane selected), so the first click on the browser surface was swallowed: it only *selected* the pane (via the root `onClickPanel`), and `selectedId` updates a render later, so the click never reached the page. Clicking a link did nothing until a second click — felt like a "gigantic delay" before a tab opened. Gate mouse down/up on passthrough mode alone, so a direct click on the canvas both selects the pane and passes through (the down/up carry their own coordinates, so forwarding just those is enough). Keyboard and wheel still require full `interactive` so a background pane never steals them. A single click now opens the link's tab immediately. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
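The gating change can be illustrated with a small predicate. This is a sketch under assumptions: `PaneInputState`, `CanvasInput`, and `shouldForward` are hypothetical names, and the real handlers are wired per event rather than through one function.

```typescript
interface PaneInputState {
  passthrough: boolean; // passthrough mode is on
  selected: boolean; // this pane is the selected pane
}

type CanvasInput = "mousedown" | "mouseup" | "keyboard" | "wheel";

function shouldForward(kind: CanvasInput, state: PaneInputState): boolean {
  // Mouse down/up carry their own coordinates, so the first click on an
  // unselected pane can both select it and reach the page.
  if (kind === "mousedown" || kind === "mouseup") return state.passthrough;
  // Keyboard and wheel still need the full `interactive` condition so a
  // background pane never steals them.
  return state.passthrough && state.selected;
}
```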
Clicking a browser tab chip or its × did nothing: a mousedown on the chip bubbles to the pane's onMouseDown → onClickPanel, which selects the pane and makes dockview move this panel's DOM. With a real (non-instant) press, that re-layout lands between mousedown and mouseup, so the node the press started on is gone by release and the browser never synthesizes a `click` — selectTab/closeTab were never called (no command, no error). This only reproduces with a human-length press: synthetic instant down→up and `element.click()` both dodge it, which is why it slipped through earlier. Reproduced in the harness by holding the press ~300ms. Fix: act on mousedown (left button), like the canvas already does via onCanvasMouseDown — it fires synchronously before the re-layout. Tests fire mousedown (not click) so a revert to onClick is caught. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two cleanups from the /simplify review:
- IframePanel: commitUrl and observeFrameUrl differed by a single updateParameters line (and repeated the title formatting a third time in goToHistoryIndex). Extract applyFrameUrl(url, persist).
- The host's `tab list --json` parser re-implemented the connection's parseTabRecord: same tabId/id fallback, same url/active coercion, even the same "some CLI builds use id" rationale. Lift the record parse into a shared lib/src/lib/agent-browser-tab.ts that both the connection (live stream) and the host (CLI) import, so the tab shape has one home.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two fixes for "clicking a browser tab does nothing":
1. Click-eating, fixed at the source. Selecting a pane on mousedown makes dockview move the panel's DOM, and a real (non-instant) press then spans that relayout, so the browser never synthesizes the `click` and the chip/× silently did nothing. Give every browser surface dockview's `renderer:'always'` (rendererForParams), the same setting the iframe already used, so the node stays put and the click survives. The tab chip/× revert from the mousedown workaround (cc56e6e) back to onClick. (The canvas passthrough gate and the rAF focus stay: verified those address first-click selection timing and dockview focus ordering, not the DOM move.)
2. Stale canvas on switch. The daemon emits no screencast frame when the active tab changes, and the dedup'd stream (1388844) is otherwise silent on a static page, so the canvas never repainted onto the newly-selected tab. Force one device capture when the active tab id changes so the surface follows the switch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two URL-bar-lies in the new same-frame location plumbing:
- The proxy shim posted `location` for any anchor click, so Cmd/Ctrl/Shift/Alt+click (which open a new tab and leave the frame put) moved the parent URL bar + Back history to a page the iframe wasn't showing. Guard the same-frame post on modifier keys / primary button.
- After a shim-observed in-frame navigation, params.url stays at the source URL, so Back to that URL was a no-op write and the proxy effect never re-fired: the frame kept showing the navigated page while the chrome showed the Back target. Bump reloadNonce in goToHistoryIndex to force a re-resolution (fresh proxy port → real reload), matching the reload button.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
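The modifier/primary-button guard from the first fix could look roughly like this. `AnchorClick` is a hypothetical subset of the DOM `MouseEvent` fields, declared locally so the sketch does not depend on DOM typings, and `isSameFrameClick` is an assumed name.

```typescript
// Subset of MouseEvent used by the guard; button 0 is the primary button.
interface AnchorClick {
  button: number;
  metaKey: boolean;
  ctrlKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
}

// True only for a plain primary-button click, the case the guard treats as
// navigating the frame itself rather than opening a new tab or window.
function isSameFrameClick(click: AnchorClick): boolean {
  if (click.button !== 0) return false;
  return !(click.metaKey || click.ctrlKey || click.shiftKey || click.altKey);
}
```

The shim would post `location` only when this returns true, leaving modified clicks to the separate new-tab path.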
Deploying mouseterm with
| Latest commit: | 15733ef |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://51d2e0e3.mouseterm.pages.dev |
| Branch Preview URL: | https://ab-iframe-unify.mouseterm.pages.dev |
dormouse-bot
left a comment
Both Build & Test and Standalone Smoketest are red on the same tsc error at AgentBrowserPanel.tsx:546 (platform.agentBrowserCommand possibly undefined). The PR description calls this "pre-existing... present before this branch's latest fixes" — but it isn't on main: there connect() resolved the stream URL via getAgentBrowserStreamUrl?.(wsPort) (optional-chained). This branch rewrote connect() to call platform.agentBrowserCommand(...) directly, which introduces the error. The effect already guards with if (!platform.agentBrowserCommand) return; a few lines up, but TS doesn't carry that narrowing of an optional property into the nested connect closure — so the call still needs a local binding. Inline suggestion below makes the build green.
One lower-confidence observation on the new iframe proxy shim, for your call (no fix pushed since the right behavior is a judgment): the injected click handler in iframe-proxy-rewrite.ts posts location: a.href from the capture phase for same-frame primary-button anchor clicks. Capture runs before the page's own bubble-phase handlers, so a same-origin link the page intercepts and cancels (<a href="/logout" onClick={e => e.preventDefault()}>, or an <a> styled as a button that does a fetch instead of navigating) still reports /logout to the parent, which pushes it into Back history and the URL bar though the frame never navigated. SPA <Link> clicks self-correct via the patched pushState, and javascript:/cross-origin hrefs are already dropped by the parent's origin check — so the residual gap is genuinely-cancelled same-origin clicks. If that matters, deferring the post a tick (setTimeout(() => { if (!e.defaultPrevented) post('location', ...) }, 0)) or relying on the navigation hooks instead of the click guess would close it.
- AgentBrowserPanel's popped-out CDP-connect effect called the optional platform.agentBrowserCommand directly inside the nested connect() closure. TS doesn't carry an optional-property narrowing into the closure, so it failed tsc (TS2722/TS18048) and reddened CI. Bind it to a local const after the guard.
- The proxy shim's same-frame click handler runs in the capture phase and posted location before the page's own handlers, so a same-origin link the page cancels (preventDefault, or an <a> that fetches instead of navigating) still desynced the parent URL bar + Back history. Defer the post a tick and skip it when the click was cancelled; real navigations re-report via the next document's shim.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
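The local-binding fix for the tsc error can be illustrated in isolation. This is a sketch, not the panel code: the `Platform` shape, the `agentBrowserCommand` signature, and `startConnect` are assumptions made so the example stands alone.

```typescript
// Assumed shape: the capability is optional because not every host has it.
interface Platform {
  agentBrowserCommand?: (session: string, args: string[]) => Promise<string>;
}

function startConnect(platform: Platform, session: string): (() => Promise<string>) | undefined {
  if (!platform.agentBrowserCommand) return undefined;
  // Bind after the guard. `runCommand` is a const with a non-optional type,
  // so the nested closure can call it without re-checking.
  const runCommand = platform.agentBrowserCommand;
  return async function connect(): Promise<string> {
    // Calling platform.agentBrowserCommand(...) here would not type-check:
    // the narrowing from the guard above does not reach into this closure.
    return runCommand(session, ["stream", "status", "--json"]);
  };
}
```

Note that the local const is a detached reference, so this pattern relies on adapters tolerating calls without `this`.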
The two surfaces are now one `browser` surface with a swappable renderer (ab-screencast / ab-popout / iframe), so describe them in one spec: shared shell first (chrome, canonical pane state, render swap, lifecycle, host pattern), then each renderer. Removes the cross-file duplication (the 3x "full browser-chrome tab" line, the host-pattern echo, the transitions/tab-drop restatement) by giving each shared concept a single owner. Repoints dor-cli.md links and all code-comment doc-pointers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the two blocks that re-narrated the code — the implementation touchpoints table and the per-method host-capability descriptions — with a compact code map and one-line method notes (signatures already live in types.ts). Trim restated header-layout bullets, the dev-server mechanics, the two-axes render bullet, and the profile-persistence future-work item (each duplicated a fact stated elsewhere). Also drop stray </content>/</invoke> tags that leaked into the file tail. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rewrite the unified browser spec as a terse contract: Title-case sections, a per-section "Source of truth" file list, flat BrowserPanelParams, and declarative invariants/header-contract/swap-behavior tables. Delegates mechanics to the code instead of paraphrasing them (~440 lines).
The contract rewrite renamed sections, orphaning the "→ <section>" anchors in the code comments that point at dor-browser.md. Re-point all of them to the new headings (23 files) and clean a residual sub-anchor. Restore four traps the condensation dropped that aren't recoverable from the code: screencast is DIP/CSS-only so owning CDP wouldn't help; VK map must not be key.charCodeAt(0) (. = VK_DELETE); pop-out must not query the daemon in the close/reopen gap (spawns a blank daemon); sync is last-writer-wins, disengaged only after a frame confirms the issued size. Also fix the iframe-focus source-of-truth paths: use-window-focused.ts is under components/wall/, and registerSurfaceFocusHandle is defined in terminal-lifecycle.ts (terminal-registry only re-exports it). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
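The VK-map trap restored above is easy to reproduce: the character code of `.` is 46, which is the Windows virtual-key code `VK_DELETE`, not the period key. The sketch below assumes a US layout; the `virtualKeyCode` helper and the `PUNCTUATION_VK` table are hypothetical names for illustration and are not the project's actual map.

```typescript
// Windows virtual-key codes (US layout for the OEM keys).
const VK_DELETE = 0x2e;
const VK_OEM_PERIOD = 0xbe;

// Explicit table for punctuation; charCodeAt(0) is wrong for these keys.
const PUNCTUATION_VK: Record<string, number> = {
  ";": 0xba, // VK_OEM_1
  "=": 0xbb, // VK_OEM_PLUS
  ",": 0xbc, // VK_OEM_COMMA
  "-": 0xbd, // VK_OEM_MINUS
  ".": VK_OEM_PERIOD,
  "/": 0xbf, // VK_OEM_2
};

// Hypothetical helper: map a single printable key to a virtual-key code.
function virtualKeyCode(key: string): number | undefined {
  if (key.length !== 1) return undefined;
  // Letters share their VK code with the uppercase ASCII code (0x41-0x5A).
  if (/[a-z]/i.test(key)) return key.toUpperCase().charCodeAt(0);
  // Digits share their VK code with the ASCII code (0x30-0x39).
  if (/[0-9]/.test(key)) return key.charCodeAt(0);
  // Everything else must come from the explicit table.
  return PUNCTUATION_VK[key];
}
```

With the naive approach, `".".charCodeAt(0)` evaluates to `VK_DELETE`, so typing a period into the screencast would be sent as a Delete key press.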
The iframe panel calls the host capability through a detached reference (`const createProxy = getPlatform().createIframeProxyUrl; createProxy(url)`), which drops `this`. BrowserSidecarAdapter's methods reach `this.host.invoke`, so the detached call threw "Cannot read properties of undefined (reading 'host')" — caught and surfaced as a bogus "Couldn't reach the server" error page in every `dor iframe` pane under the standalone agent-browser dev harness. The Tauri adapter is `this`-free (module-level rawInvoke) and the VS Code adapter already binds for exactly this reason; BrowserSidecarAdapter was the lone adapter that didn't honor the detached-call convention. Bind its this-using methods in the constructor, mirroring vscode-adapter. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
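The failure mode and the fix can be reproduced in a few lines. This is a reduced sketch, not the real BrowserSidecarAdapter: the `Host` interface, the synchronous signatures, and the `UnboundAdapter` / `BoundAdapter` class names are assumptions made for the example.

```typescript
// Hypothetical host transport; the real adapter reaches this.host.invoke.
interface Host {
  invoke(method: string, arg: string): string;
}

// Before: the method depends on `this`, so a detached reference breaks.
class UnboundAdapter {
  constructor(private readonly host: Host) {}

  createIframeProxyUrl(url: string): string {
    return this.host.invoke("createIframeProxyUrl", url);
  }
}

// After: bind this-using methods in the constructor so callers may detach them.
class BoundAdapter {
  constructor(private readonly host: Host) {
    this.createIframeProxyUrl = this.createIframeProxyUrl.bind(this);
  }

  createIframeProxyUrl(url: string): string {
    return this.host.invoke("createIframeProxyUrl", url);
  }
}
```

A caller that does `const createProxy = adapter.createIframeProxyUrl; createProxy(url)` throws a `TypeError` with the unbound class, because class bodies run in strict mode and `this` is `undefined` in the detached call. The same call works with the bound class.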
#156 removed dor-iframe.md and dor-agent-browser.md, replacing both with dor-browser.md. Merge main and collapse the two index entries into one pointing at the unified browser surface spec. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What

Collapses `dor iframe` and `dor ab` (agent-browser) into a single `surfaceType: 'browser'` surface with three interchangeable renderers (`ab-screencast`, `ab-popout`, and `iframe`), all sharing one browser chrome (URL bar + Back/Forward + the far-left Display chip). `params.url` becomes the canonical URL across renderers, so swapping renderers or restoring a session keeps the page.

See `docs/specs/dor-iframe.md` and the agent-browser spec for the as-built design.

Highlights

- `browser` surface type; render modes renamed to `ab-screencast` / `ab-popout` / `iframe`. Every browser surface gets real chrome (URL, nav, Display modal) on every host.
- `window.open` attempts are offered as a new pane instead of vanishing in the single-frame renderer. CSP / X-Frame-Options framing refusals are diagnosed and shown as a served page.
- Standalone agent-browser dev harness (`dev:standalone:ab`) with a debugging skill.

Review follow-ups (ultrareview + dormouse-bot)

Three chrome-desync / build issues caught in review, all now fixed with coverage:

- The shim posted `location` for any anchor click, so Cmd/Ctrl/Shift/Alt+click (which open a new tab and leave the frame put) moved the parent URL bar + Back history to a page the iframe wasn't showing. Now guarded on modifier keys / primary button.
- A same-origin link the page cancels (`preventDefault`, or an `<a>` that fetches instead of navigating) still reported a navigation. The post is now deferred a tick and skipped when the click was cancelled; real navigations re-report via the next document's shim.
- `params.url` stays at the source URL, so Back to that URL was a no-op write and the proxy effect never re-fired: the frame kept showing the navigated page while the chrome showed the Back target. `goToHistoryIndex` now bumps `reloadNonce` to force a re-resolution (fresh proxy port → real reload), matching the reload button.
- This branch changed `AgentBrowserPanel`'s popped-out CDP-connect path to call the optional `platform.agentBrowserCommand` directly inside a nested closure; TS doesn't carry the optional-property narrowing into the closure, so it failed `tsc` and reddened CI. Bound to a local `const` after the guard. (An earlier revision of this description wrongly called this pre-existing; it is introduced by this branch, and `main` used `getAgentBrowserStreamUrl?.(…)`.)

Test

`tsc -b` clean; `pnpm -r run test` green. The iframe-proxy shim and `IframePanel` chrome/Back behavior have direct coverage.

🤖 Generated with Claude Code