Towards an AI-Native Auth Framework

mfa aint gonna cut it this time

Nov 30, 2024

AI agents are almost ready to automate web tasks: everything from booking travel to deploying code. But we don’t have a good way to create a personal, secure environment in which they can run those tasks.

At the end of the day, this requires a deep rethinking of how auth happens on the web, because 99% of the internet doesn’t even do OAuth today. And the sites that do don’t support the level of granularity that you probably want for this kind of stuff.

tl;dr - the twitter version https://x.com/dexhorthy/status/1862709882975396272

what we got

Today's options for agent authentication are…underwhelming:

1. Share raw credentials - this is what most people are doing today with browsing agents, but it's a security nightmare

2. Use permanent API keys - better than passwords but still too broad in scope and hard to audit/revoke

3. OAuth - decent standard but very few sites support programmatic auth, and even fewer have the controls for short-lived or tightly scoped access

None of these approaches provide the portability needed for AI agents to safely interact across the web. Even OAuth, while providing a standardized protocol, wasn't designed with the fine-grained, single-operation permissions that AI agents require to safely operate across different services and platforms.
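To make the granularity gap concrete, here's a minimal sketch comparing a scope-style check with a single-operation check. The grant shapes, the `bookings:write` scope name, and both function names are made up for illustration, not taken from any real service, and the comparison assumes flat parameters with primitive values.

```javascript
// Hypothetical grant shapes, for illustration only.
const oauthStyleGrant = { scopes: ['bookings:write'] };
const singleOperationGrant = {
    action: 'bookFlight',
    params: { from: 'SFO', to: 'NYC', date: '2024-03-01' },
};

// Assumed mapping from an action to the scope it requires.
const REQUIRED_SCOPE = { bookFlight: 'bookings:write' };

// Scope-style check: any booking passes once the scope has been granted.
function allowedByScope(grant, request) {
    return grant.scopes.includes(REQUIRED_SCOPE[request.action]);
}

// Single-operation check: the request must match the approved action and
// every approved parameter exactly (flat, primitive values assumed).
function allowedBySingleOperation(grant, request) {
    if (grant.action !== request.action) return false;
    const grantEntries = Object.entries(grant.params);
    const requestKeys = Object.keys(request.params);
    return grantEntries.length === requestKeys.length &&
        grantEntries.every(([key, value]) => request.params[key] === value);
}

const sameRequest = { action: 'bookFlight', params: { from: 'SFO', to: 'NYC', date: '2024-03-01' } };
const differentRequest = { action: 'bookFlight', params: { from: 'SFO', to: 'LHR', date: '2024-03-01' } };

console.log(allowedByScope(oauthStyleGrant, differentRequest));                // true
console.log(allowedBySingleOperation(singleOperationGrant, sameRequest));      // true
console.log(allowedBySingleOperation(singleOperationGrant, differentRequest)); // false
```

The scope check happily lets the agent book a flight to London that nobody approved; the single-operation check doesn't.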

Image of but we don't have ai native auth...all we have is this

doing this today

Let's look at a common approach - using a browsing agent to fill out a form. A tool call like

const result = await handleToolCall('bookFlight', {
    from: 'SFO',
    to: 'NYC',
    date: '2024-03-01'
});

might generate incremental tool calls like this:

await page.fill('input[name="from"]', 'SFO');
await page.fill('input[name="to"]', 'NYC');
await page.fill('input[name="date"]', '2024-03-01');

const result = await requestUserApproval({
    action: 'bookFlight',
    params: {
        from: 'SFO',
        to: 'NYC',
        date: '2024-03-01'
    }
});

if (result.approved) {
    await page.click('button[type="submit"]');
} else {
    console.log('User did not approve');
}

The challenge here is that while you have programmed your browsing agent to, say, fill out a form and then wait for permission to hit submit, you're relying on:

1. the agent reliably relaying the currently filled form fields to the user when requesting approval (probably works most of the time)

2. the agent *NOT* accidentally hitting submit without permission (I wouldn't trust today's browsing agents for riskier things)

is that safe?

no…no it's not. Overall this is not at all airtight, and I would rather blow chili powder in my eyes than trust it for anything that matters.

could this get better?

How about: instead of permanent credentials, your AI assistant requests a short-lived, single-use token for each action. When booking a flight, you receive a secure notification with the exact details, approve with your passkey, and the agent gets a cryptographically-signed token valid only for that specific booking.

Here's how it works:

1. Your agent needs to book a flight:

const result = await handleToolCall('bookFlight', {
   from: 'SFO',
   to: 'NYC',
   date: '2024-03-01'
});

2. You receive a secure prompt and approve with your passkey:

async function handleApproval({ tool, params }) {
    const signedJWT = await requestUserSignature({
        action: tool,
        params,
        expiresIn: '30s'
    });

    return signedJWT;
}

3. The service verifies and executes only the approved action:

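app.post('/api/execute', verifyJWT, async (req, res) => {
    // verifyJWT has already checked the signature and attached the claims
    const { tool, params, exp } = req.jwt;
    // JWT `exp` is in seconds since the epoch; Date.now() is in milliseconds
    if (Date.now() >= exp * 1000) return res.status(401).json({ error: 'Expired' });
    res.json(await executeTool(tool, params));
});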
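For a rough idea of what `requestUserSignature` and `verifyJWT` might be doing under the hood, here's a minimal sketch using Node's built-in `crypto` module. Big assumptions: a throwaway Ed25519 key pair stands in for the user's passkey (a real passkey flow goes through WebAuthn and looks different), and `signApproval` / `verifyApproval` are hypothetical names, not part of any library.

```javascript
const crypto = require('node:crypto');

// Assumption: a throwaway Ed25519 key pair standing in for the user's passkey.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

const b64url = (value) => Buffer.from(value).toString('base64url');

// Hypothetical stand-in for requestUserSignature: a compact JWT over the action.
function signApproval({ action, params, ttlSeconds }, nowMs = Date.now()) {
    const header = b64url(JSON.stringify({ alg: 'EdDSA', typ: 'JWT' }));
    const exp = Math.floor(nowMs / 1000) + ttlSeconds; // seconds since the epoch
    const payload = b64url(JSON.stringify({ tool: action, params, exp }));
    // Ed25519 takes no separate digest algorithm, hence the null.
    const signature = crypto.sign(null, Buffer.from(`${header}.${payload}`), privateKey);
    return `${header}.${payload}.${b64url(signature)}`;
}

// Hypothetical stand-in for verifyJWT: check the signature, then the expiry.
function verifyApproval(token, nowMs = Date.now()) {
    const [header, payload, signature] = token.split('.');
    if (!header || !payload || !signature) throw new Error('Malformed token');
    const valid = crypto.verify(
        null,
        Buffer.from(`${header}.${payload}`),
        publicKey,
        Buffer.from(signature, 'base64url')
    );
    if (!valid) throw new Error('Bad signature');
    const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
    if (nowMs >= claims.exp * 1000) throw new Error('Expired');
    return claims;
}

const token = signApproval({
    action: 'bookFlight',
    params: { from: 'SFO', to: 'NYC', date: '2024-03-01' },
    ttlSeconds: 30,
});
console.log(verifyApproval(token).tool); // bookFlight
```

Because the signature covers the tool name and the parameters, an agent that swaps `NYC` for another destination after approval ends up with a token that no longer verifies.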

you probably want this

This might feel like a rehashing of the same auth conversations we’ve been having since Twitter implemented OAuth 1 back in the early 2010s.

Image of Oh, yeah. We're just improving on it.

But what you get now is:

Security: No permanent access tokens

Granularity: Approve specific actions, not broad access

Auditability: Clear record of what was approved and executed

User Control: Nothing happens without explicit approval

why apple/1pass/okta can't just solve this

A lot of people suggest Apple or 1Password could solve this authentication challenge. This fundamentally misunderstands the problem - no matter how secure their authentication layer is, the target websites need to implement support for granular, time-limited permissions.

You can't bolt security onto existing auth patterns. Sites need to build support for:

- Short-lived access tokens (30s or less)

- Action-specific permissions ("book this exact flight" vs "access travel account")

- Single-use credentials that expire after one operation

- Verification of exact parameters that were approved

Unless apps/sites implement these patterns directly, even the most secure auth provider can only provide the same broad access we have today. We need a new protocol that services themselves adopt, not just a better way to manage existing credentials.
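Here's a hedged sketch of what those four checks could look like on the service side, once a token's signature has already been verified. `jti`, `iat`, and `exp` are standard JWT claim names; everything else (`redeem`, `paramsDigest`, the 30-second cap, the in-memory `Set`) is an assumption for illustration only.

```javascript
const crypto = require('node:crypto');

const MAX_TTL_SECONDS = 30;
// In-memory only; a real service would need shared, expiring storage.
const redeemed = new Set();

// Hash the parameters with keys sorted so key order doesn't matter (flat objects only).
function paramsDigest(params) {
    const sorted = Object.keys(params).sort().map((key) => [key, params[key]]);
    return crypto.createHash('sha256').update(JSON.stringify(sorted)).digest('hex');
}

// `claims` are assumed to come from a token whose signature was already verified.
function redeem(claims, request, nowMs = Date.now()) {
    const nowSeconds = Math.floor(nowMs / 1000);
    if (claims.exp <= nowSeconds) return { ok: false, reason: 'expired' };
    if (claims.exp - claims.iat > MAX_TTL_SECONDS) return { ok: false, reason: 'ttl too long' };
    if (claims.tool !== request.tool) return { ok: false, reason: 'wrong action' };
    if (claims.paramsDigest !== paramsDigest(request.params)) return { ok: false, reason: 'params changed' };
    if (redeemed.has(claims.jti)) return { ok: false, reason: 'already used' };
    redeemed.add(claims.jti); // single use: burn the token ID on first success
    return { ok: true };
}

const params = { from: 'SFO', to: 'NYC', date: '2024-03-01' };
const iat = Math.floor(Date.now() / 1000);
const claims = {
    jti: crypto.randomUUID(),
    tool: 'bookFlight',
    paramsDigest: paramsDigest(params),
    iat,
    exp: iat + 30,
};

console.log(redeem(claims, { tool: 'bookFlight', params }).ok);     // true
console.log(redeem(claims, { tool: 'bookFlight', params }).reason); // already used
```

None of this is exotic, but it all has to live on the target site. A password manager or identity provider sitting in front of the site can't enforce any of it.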

this is a plaid-shaped problem

But it needs 1000x the breadth to be useful.

Plaid figured this out for banks, but they had a few things going for them:

1. Scale & Diversity: Financial institutions are a relatively small, well-defined set of organizations with similar security models. The broader web has millions of services with wildly different authentication approaches.

2. Regulatory Environment: Banks were pushed toward standardization by regulations like PSD2. Most web services have no similar pressure to adopt standardized authentication.

3. Business Model: Financial data aggregation has clear monetization through fintech companies willing to pay for access. There's a clear market for agents that browse the web safely, but it's still emerging.

4. Technical Complexity: Banking APIs, while varied, generally follow similar patterns in terms of the shape and structure of the data they return.

Image of i don't understand how financial auth infrastructure works let alone some sort of ai-native action framework

Instead of a single aggregator, we probably need an open protocol that services can implement directly - similar to how OAuth evolved, but designed specifically for granular, agent-based access control.

yeah but when dude

We probably won't see one tool that implements the generic access gateway. Someone will get the protocol right and then every site will implement it. Perhaps it could be built as an auth middleware on top of something like Anthropic’s MCP.

The incentive alignment to make this happen isn't clear though. One possibility is that a single major player in a category creates an AI-ready, airtight auth implementation, which then forces the hand of all other companies in their market.

Getting this right is critical for real agents that do real things - I can't see how AI agents can safely automate the web at scale unless we rethink how 99%+ of sites handle auth today. OAuth is a decent standard to frame this in, but very few sites support programmatic auth at all, and only a subset of those have the controls to create the kind of short-lived, tightly-scoped access needed to guarantee limited access in line with "human approved a single operation."

As we start getting deeper into what “human in the loop” looks like in practice and for production workloads, I’m excited to figure all this out. If you’re working on this, let’s chat.

acknowledgements

Shouts out to @oleg who kicked off the original Twitter thread on this topic:

Start: https://x.com/olzare/status/1862266264678539480

Followup: https://x.com/dexhorthy/status/1862709882975396272

Shouts out to Meji Abidoye, who wrote about similar problems for computer use here:

Notes

The Permission Portability Problem: Rethinking Auth for AI Agents

When Anthropic's computer use demo dropped, I wanted to try it, but my first thought was "not on my computer". So, I hacked a way for me to launch an EC2 instance, run Anthropic's container on the instance, then stream the results to my browser. It worked pretty well but I very quickly ran out of interesting use cases to test on a brand new EC2 instance…

9 months ago · Meji Abidoye

The Outer Loop
