Experiments

  • Vibe coding platforms in the real world

We put Lovable through its paces across a variety of builds and came away impressed by the speed, skeptical of the depth, and clear on where it breaks down.

  • Approaches to automated name matching

    Reconciling duplicate names across disparate datasets is harder than it looks. Here’s what we learned testing algorithmic and AI-based approaches.

  • Crowdsourcing park accessibility info

    We tested whether AI tagging and semantic search could replace the periodic manual update cycle that causes government information services to slowly become unreliable.

  • AI-powered staff writing assistant

    We tested whether a language model could reliably check government web copy against an official style guide and give writers specific, actionable feedback without a human editor in the loop.

  • Web apps on embedded devices

    We figured out how to bundle an entire web application, assets and all, into a single file that runs reliably on a low-power embedded server.

  • Generative AI as an Accessibility Auditor

    We tested whether AI chat tools could replace or supplement traditional accessibility scanners, and learned that the quality of the output had less to do with the model and more to do with how precisely we asked the question.

  • Can you just ask your database a question?

    We built a natural language interface to a legacy analytics database to find out how far plain English can get you before SQL becomes unavoidable.

  • Your codebase is a graph

    We mapped a codebase as a knowledge graph and queried it for structural problems, then built 21 metrics and a dashboard to make the findings actually useful.

  • Cross-tool Semantic Search

    Most teams don’t have a documentation problem. They have a documentation sprawl problem. We built a semantic search layer across Jira, Confluence, and Figma to find out how far AI can go in solving it.