The Password Problem in AI Browser Agents

I started building a personal assistant on a VPS and hit the authentication wall. Here is what I tried, what worked, and what I would do differently.

May 30, 2026

I started building a personal assistant for myself. The setup was straightforward on paper. A VPS running a Claude-powered agent that could read my email, draft replies, pull updates from various sources, and generally handle the small recurring tasks I would otherwise tab through manually. The vision was clear. The implementation hit a wall almost immediately, and the wall was authentication.

A lot of the things I wanted the assistant to do required logging into a website. And the assistant lives on a headless VPS. There is no screen, no mouse, no Chrome profile I can casually open and click around in. Every auth approach I considered had to work entirely on the VPS, with no GUI in the loop at any point.

The naive version was easy: Playwright MCP with its default config, the agent fetches the credentials, and then it logs in.

Agent → read creds from a file → fill login form → continue

It worked. It also meant my passwords were sitting in some secret manager, being read into the agent's context at the start of every session, and flowing through tool call payloads on the way to the form field. The architectural problem was that the agent's context window held the credential the moment it filled the form. Anything logged, anything recorded, any future debug trace captured it. The code worked. The design did not.
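To make the leak concrete, here is a minimal sketch of a tracer that records tool calls by name and arguments. The tool names (`navigate`, `type`, `click`), the `trace` list, and the example URL are all hypothetical; this is not Playwright MCP's actual payload format, only an illustration of how a secret passed as a tool argument ends up in whatever records the call.

```python
import json

# Hypothetical stand-in for any log, transcript, or debug trace.
trace: list[str] = []


def call_tool(name: str, **arguments: str) -> None:
    """Record a tool call as a generic tracer might: tool name plus arguments."""
    trace.append(json.dumps({"tool": name, "arguments": arguments}))


def naive_login(username: str, password: str) -> None:
    """The naive flow: the agent itself types the secret into the form."""
    call_tool("navigate", url="https://example.com/login")
    call_tool("type", field="username", text=username)
    call_tool("type", field="password", text=password)  # secret enters the payload
    call_tool("click", element="Sign in")


naive_login("me@example.com", "not-a-real-password")

# Every recorded entry that contains the secret is a leak.
leaked = [entry for entry in trace if "not-a-real-password" in entry]
print(f"{len(leaked)} of {len(trace)} trace entries contain the password")
```

Nothing in the sketch is careless about the password. The tracer is doing its job, and that is why the problem is one of design and not of code.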

The next thing I tried was an extension-based setup. I launched Chromium with the Bitwarden extension preloaded, exposed it over the Chrome DevTools Protocol on port 9222, and pointed Playwright MCP at that endpoint instead of letting it spin up its own browser.

Chromium daemon: --load-extension=bitwarden + persistent profile + CDP on :9222
Agent (via MCP): connects over CDP → opens the login page
Bitwarden:       detects form fields → injects credentials into the DOM
Agent:           continues, no credential ever entered its reasoning loop

On paper this felt like the right structure. Credentials lived in Bitwarden, which I already trusted. The injection happened entirely outside the agent's reasoning loop. The agent only ever observed a logged-in browser, never the act of logging in. Clean architecture, at least on the whiteboard.
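For reference, the daemon line in the diagram maps onto three standard Chromium command-line switches. The sketch below only builds the argument list; the binary name and both paths are assumptions for illustration, and the post does not say how the browser was given a display or run headless on the VPS.

```python
from pathlib import Path


def chromium_command(
    extension_dir: Path,
    profile_dir: Path,
    port: int = 9222,
    binary: str = "chromium",  # assumption: the binary name varies by distro
) -> list[str]:
    """Build the argv for a long-lived Chromium that exposes CDP."""
    return [
        binary,
        f"--remote-debugging-port={port}",    # CDP endpoint the agent connects to
        f"--user-data-dir={profile_dir}",     # persistent profile across restarts
        f"--load-extension={extension_dir}",  # path to an unpacked extension
    ]


# Hypothetical paths, for illustration only.
cmd = chromium_command(
    Path("/opt/bitwarden-extension"),
    Path("/var/lib/agent/chromium-profile"),
)
print(" ".join(cmd))
```

The list can be handed to `subprocess.Popen(cmd)` or, more realistically, to a process supervisor that restarts the browser when it dies.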

What the running version showed me was that the credential exposure problem had quietly been swapped for an operational one. The setup requires a persistent Chromium daemon. That daemon has to launch once with the right flags, stay alive across sessions, restart when it crashes, and survive whatever the agent decides to do inside it. Every new Claude session spawns a fresh Playwright MCP process that connects to that single Chromium. They share tabs. They share state. They sometimes step on each other. A couple of days of testing later, I had twenty open tabs in a browser nobody was watching, five zombie Chrome processes between them holding 159 minutes of accumulated CPU time, and a startup script with three workarounds for race conditions. The infrastructure keeping the browser alive did not look after itself.
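One way to notice tab buildup before it reaches twenty is to ask the browser directly. A Chromium started with a remote debugging port serves a JSON list of its targets at `/json/list`. The sketch below is a small watchdog built on that endpoint; the function names and the tab limit are made up, and the demo runs on sample data so it does not need a live browser.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url: str = "http://127.0.0.1:9222") -> list[dict]:
    """Ask the browser's DevTools HTTP endpoint for its current targets."""
    with urlopen(f"{base_url}/json/list", timeout=5) as response:
        return json.load(response)


def open_pages(targets: list[dict]) -> list[dict]:
    """Keep only real tabs; extensions show up as non-page targets."""
    return [t for t in targets if t.get("type") == "page"]


def too_many_tabs(targets: list[dict], limit: int = 5) -> bool:
    """Flag a browser whose tab count suggests sessions are not cleaning up."""
    return len(open_pages(targets)) > limit


# Offline demo with made-up targets in the general shape /json/list returns.
sample = [
    {"id": "A1", "type": "page", "url": "https://example.com/inbox"},
    {"id": "B2", "type": "page", "url": "about:blank"},
    {"id": "C3", "type": "service_worker", "url": "chrome-extension://example/background.js"},
]
print(len(open_pages(sample)), too_many_tabs(sample, limit=1))
```

Against the real daemon, `too_many_tabs(fetch_targets())` run from cron would be enough to raise an alert. It reports the symptom only; the shared-browser design that causes it is still there.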

The textbook answer to all of this is browser profiles. Log into each site once, save the resulting cookies and localStorage to a file, load that file at every session. No persistent browser daemon, no extension, no credentials anywhere in the pipeline.
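With Playwright's Python API, this pattern is usually expressed through storage state: `context.storage_state(path=...)` writes cookies and localStorage to a JSON file, and `browser.new_context(storage_state=...)` loads them back. The post does not name an API, so treat the following as one possible sketch. The file name and function names are assumptions, and the `playwright` package is imported lazily because it is a third-party dependency.

```python
from pathlib import Path

STATE_FILE = Path("session.json")  # hypothetical location for the saved state


def save_session(login_url: str, state_file: Path = STATE_FILE) -> None:
    """One-time step: log in by hand in a visible browser, then save the state."""
    from playwright.sync_api import sync_playwright  # third-party, imported lazily

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)  # needs a display somewhere
        context = browser.new_context()
        page = context.new_page()
        page.goto(login_url)
        input("Log in in the browser window, then press Enter here... ")
        context.storage_state(path=str(state_file))  # cookies + localStorage
        browser.close()


def page_title_with_session(url: str, state_file: Path = STATE_FILE) -> str:
    """Every later session: start headless with the saved state already loaded."""
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(storage_state=str(state_file))
        page = context.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

Only `save_session` needs a visible browser. Everything after it can run headless on the server.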

It is not the cleanest pattern on a VPS. To log in once and save the profile, you need a real browser session with a real GUI somewhere. On a headless server that means tunneling X11 over SSH, running a VNC server, or some equivalent contortion. All of those work technically and all of them add a setup ceremony for what should be a one-time action. Worse, every time a session token expires, you are back to standing up a tunnel just to refresh state. For sites with aggressive reauth policies this turns into constant friction.
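The refresh problem can at least be detected before a task fails halfway. The sketch below assumes the saved file uses the layout of Playwright's storage state, where each cookie carries an `expires` value in Unix seconds and session cookies use `-1`. The function names and sample cookies are invented for illustration.

```python
import json
import time
from pathlib import Path
from typing import Optional


def expired_cookies(state: dict, now: Optional[float] = None) -> list[str]:
    """Return names of cookies whose expiry time has passed.

    Session cookies (expires == -1) carry no timestamp and are skipped.
    """
    now = time.time() if now is None else now
    return [
        cookie["name"]
        for cookie in state.get("cookies", [])
        if cookie.get("expires", -1) != -1 and cookie["expires"] <= now
    ]


def load_state(path: Path) -> dict:
    """Read a saved state file from disk."""
    return json.loads(path.read_text())


# Offline demo with made-up cookies and a fixed clock.
sample_state = {
    "cookies": [
        {"name": "sid", "expires": 1_700_000_000},
        {"name": "prefs", "expires": 4_102_444_800},
        {"name": "csrf", "expires": -1},
    ],
    "origins": [],
}
print(expired_cookies(sample_state, now=1_800_000_000))
```

This is a lower bound, not a guarantee. A site can invalidate a session server-side long before the cookie's own expiry, so an empty result means "not obviously stale" and nothing stronger.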

The pattern I have not yet tested is pre-auth outside the agent. A separate service, running anywhere convenient, handles authentication independently. It fetches credentials, completes the login flow, and writes session state to a location the agent can read. The agent itself starts with cookies already loaded and only ever runs the task.

Pre-auth service (anywhere): fetch creds → log in → write session.json
Agent (on the VPS):          load session.json → run task → never touched auth

The appeal is the separation. Authentication becomes a service problem with well-understood patterns. The agent becomes a task executor with no awareness of how it became authorized. Credentials live in one place, the auth flow lives in another, and the agent reasoning loop lives in a third. None of them have to know much about the others.
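Since this design is untested, the following is only a sketch of the boundary it implies, not a working service. The login flow is passed in as a plain callable so that credentials stay inside it; `fake_login`, the function names, and the file layout are all hypothetical. The write goes through a temporary file and `os.replace` so the agent never reads a half-written session.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable


def write_session(state: dict, out_path: Path) -> None:
    """Write session state atomically via a temp file in the same directory."""
    fd, tmp_name = tempfile.mkstemp(dir=out_path.parent, prefix=".session-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
        os.replace(tmp_name, out_path)  # readers see the old or new file, never a partial one
    except BaseException:
        os.unlink(tmp_name)
        raise


def run_preauth(login: Callable[[], dict], out_path: Path) -> None:
    """Pre-auth side: the only code path that ever touches credentials."""
    write_session(login(), out_path)


def load_session(path: Path) -> dict:
    """Agent side: read ready-made session state, with no credentials involved."""
    return json.loads(path.read_text())


def fake_login() -> dict:
    # Placeholder for a real login flow that fetches credentials and signs in.
    return {"cookies": [{"name": "sid", "value": "example", "expires": -1}], "origins": []}


with tempfile.TemporaryDirectory() as workdir:
    session_path = Path(workdir) / "session.json"
    run_preauth(fake_login, session_path)
    cookie_names = [c["name"] for c in load_session(session_path)["cookies"]]
print(cookie_names)
```

The agent's half is `load_session` and nothing else. Swapping `fake_login` for a real flow changes nothing on that side, and that isolation is what the design is for.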

I have not built this yet. The Bitwarden setup is what I am running while I keep testing, and right now each thing that breaks teaches me something useful about where the real constraints live. At some point the operational debt will outweigh what the current setup is teaching me. The next zombie Chrome process I find will probably tip the decision.

The pattern across all of this is the same. Authentication should not share a surface with agent reasoning. The agent should act on behalf of an already-authenticated session, and where that session comes from is an architectural decision worth making before the first line of automation code, not after.

The Curiosity Ledger
