
Adding WebMCP tools to speek.dev


I've been thinking a lot about how AI agents interact with the web. Right now, most of that interaction is screen-scraping. An agent loads a page, reads the DOM or takes a screenshot, and tries to figure out what's going on. It works, but it's clumsy. The agent doesn't really know what a page can do; it just sees what's rendered.

WebMCP changes that. It gives sites a way to expose tools directly to AI agents through the browser, using navigator.modelContext. Instead of an agent guessing at what actions are available, the page tells it. The agent gets a typed interface with parameters, descriptions, and return values.

I wanted to try it out on my own site, so I added five tools across a few different pages on speek.dev.

What I added

The /coffee page got a search_coffees tool that lets an agent search the roasts I've tried over the past few years. It supports filtering by roaster, origin, roast level, and a minimum rating. The /uses page got search_things, which does the same kind of thing for my gear, apps, and desk setup, filterable by category.

On the blog listing page, I added get_favorites, which fetches like counts for all posts and returns them sorted by most liked. Then on individual blog posts, there are two tools: favorite_post (which actually likes the post and updates the heart UI) and get_post_likes (which just returns the current count).

Nothing fancy, but five tools is enough to get a feel for the pattern.

How tool registration works

The core of it is navigator.modelContext.addTool(). You give it a name, a description, a parameter schema, and a callback. Here's what search_coffees looks like:

navigator.modelContext.addTool({
  name: "search_coffees",
  description:
    "Searches the coffee collection. Supports filtering by query, roaster, origin, roastLevel, and minRating (1-5). Returns up to 20 matches.",
  parameters: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search across all text fields" },
      roaster: { type: "string" },
      origin: { type: "string" },
      roastLevel: { type: "string" },
      minRating: { type: "number", description: "Minimum rating, 1-5" },
    },
  },
  execute: async ({ query, roaster, origin, roastLevel, minRating }) => {
    // filter the coffee data and return up to 20 matches
  },
});
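The execute body itself is just ordinary filtering logic. Here's one way it could look, as a rough sketch. The data shape (a local array of objects with roaster, origin, roastLevel, and rating fields) is my assumption, not the site's actual data model:

```javascript
// Hypothetical data shape: each coffee is an object with name, roaster,
// origin, roastLevel, rating, and notes fields.
function filterCoffees(coffees, { query, roaster, origin, roastLevel, minRating } = {}) {
  const q = query?.toLowerCase();
  return coffees
    .filter((c) => {
      // free-text search across every field value
      if (q && !Object.values(c).join(" ").toLowerCase().includes(q)) return false;
      if (roaster && c.roaster.toLowerCase() !== roaster.toLowerCase()) return false;
      if (origin && c.origin.toLowerCase() !== origin.toLowerCase()) return false;
      if (roastLevel && c.roastLevel !== roastLevel) return false;
      if (minRating && c.rating < minRating) return false;
      return true;
    })
    .slice(0, 20); // cap at 20 matches, as the tool description promises
}
```

Inside execute you'd call this with the page's data and return the result. Returning plain JSON-serializable objects keeps the agent's side simple.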

The search_things tool on /uses follows the same pattern, just with different parameters:

navigator.modelContext.addTool({
  name: "search_things",
  description: "Searches your tools/gear/apps. Filters by query and category.",
  parameters: {
    type: "object",
    properties: {
      query: { type: "string", description: "Matches title and description" },
      category: {
        type: "string",
        enum: [
          "DevTools",
          "Desktop",
          "Firefox",
          "Chromium",
          "Desk",
          "Carry",
          "Coffee",
        ],
      },
    },
  },
  execute: async ({ query, category }) => {
    // filter and return matching items
  },
});

And for blog posts, favorite_post is the most interesting one because it actually mutates state:

navigator.modelContext.addTool({
  name: "favorite_post",
  description:
    "Likes the current post, updates the heart UI, and returns the new count.",
  parameters: { type: "object", properties: {} },
  execute: async () => {
    // slug identifies the current post; it's in scope in the page's script
    const res = await fetch(`/api/likes/${slug}`, { method: "POST" });
    const { likes } = await res.json();
    // update the heart button UI
    return { likes };
  },
});
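Because this one talks to the network, it's worth guarding against a failed request so the agent gets an error instead of a confusing undefined. One way to structure that, sketched with an injectable fetch so the logic can be exercised outside a browser (the likePost helper and its error handling are mine, not the site's actual code):

```javascript
// Hypothetical helper: posts a like and returns the new count.
// fetchImpl is passed in so the logic can run without a real browser.
async function likePost(fetchImpl, slug) {
  const res = await fetchImpl(`/api/likes/${slug}`, { method: "POST" });
  if (!res.ok) throw new Error(`like failed with status ${res.status}`);
  const { likes } = await res.json();
  return { likes };
}
```

In the real tool, execute would call likePost(fetch, slug), update the heart UI, and return the result.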

Feature detection and view transitions

All five tools are wrapped in a feature check:

if ("modelContext" in navigator) {
  navigator.modelContext.addTool({ ... });
}

No modelContext support? The code silently does nothing. No errors or console warnings. A regular browser visitor won't notice anything different.
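That check is easy to centralize. A tiny wrapper (my own helper, not part of the spec) makes each call site a one-liner and keeps the no-op behavior in one place. It takes the navigator object as a parameter so it can be exercised with a stub:

```javascript
// Hypothetical wrapper: registers a tool if modelContext is supported,
// otherwise silently does nothing. Returns whether registration happened.
function safeAddTool(nav, tool) {
  if (nav && "modelContext" in nav) {
    nav.modelContext.addTool(tool);
    return true;
  }
  return false;
}
```

In page code you'd call safeAddTool(navigator, { ... }); unsupported browsers fall through silently, matching the behavior above.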

The other thing worth mentioning is view transitions. speek.dev uses Astro's view transitions, which means navigating between pages doesn't trigger a full page load. That's great for performance, but it means your tool registrations can get wiped out when the page swaps. To handle that, each tool re-registers itself on Astro's astro:after-swap event:

document.addEventListener("astro:after-swap", () => {
  if ("modelContext" in navigator) {
    navigator.modelContext.addTool({ ... });
  }
});

This way the tools stay available as you navigate around the site, even without full reloads.
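The same idea generalizes: put all the registrations in one function and run it both on initial load and after every swap. A sketch of that pattern, with the event target as a parameter so it can be shown outside a browser (the keepToolsRegistered name is mine; on the real site the target would be document):

```javascript
// Hypothetical pattern: run registerTools immediately, and again after
// every Astro page swap, so tools survive client-side navigation.
function keepToolsRegistered(target, registerTools) {
  registerTools(); // initial page load
  target.addEventListener("astro:after-swap", registerTools); // after each swap
}
```

Here registerTools would contain the feature check and the navigator.modelContext.addTool calls for every tool the page exposes.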

Why this matters for agentic browsing

Right now, AI agents browsing the web are basically doing what screen readers did before ARIA existed, interpreting structure that wasn't designed for them. They're clicking buttons by guessing at labels, extracting data by parsing HTML that could change any time, and hoping the page doesn't do something unexpected after a click.

WebMCP flips that. Instead of the agent reverse-engineering the page, the page tells the agent what it can do. The agent gets a structured API, with typed parameters and predictable responses, right in the browser context.

For my site, this is a simple example. An agent can search my coffee collection without scraping a table. It can like a blog post without finding and clicking a heart icon. But scale this up and the implications are pretty significant. Imagine e-commerce sites exposing product search, filtering, and cart management as WebMCP tools. Or documentation sites letting agents query across versions and APIs directly. Or dashboards exposing their data and actions so an agent can actually use them instead of just reading pixels off a screen.

The parallel I keep coming back to is structured data for search engines. Sites started adding schema.org markup because it made their content machine-readable, and that unlocked better search results, rich snippets, knowledge panels. WebMCP does something similar, but for agents. It makes your site agent-readable, not just human-readable.

We're early. Browser support is minimal, and the spec is still evolving. But the direction feels right, and I think there are real benefits for human users as well.