Experiments

  • Generative AI as an accessibility testing coach

    Automated accessibility tools tell you what failed. We wanted to know whether AI could tell you why and coach a non-expert through fixing it.

  • Design handoff for your AI coding agent

    We built a tool that converts a design screenshot or Figma file into two machine-readable files, giving AI coding agents the vocabulary and rules they need to generate UI that actually matches your brand.

  • Vibe coding platforms in the real world

    We put Lovable through its paces across a variety of builds and came away impressed by the speed, skeptical about the depth, and clear on where it breaks down.

  • Approaches to automated name matching

    Reconciling duplicate names across disparate datasets is harder than it looks. Here’s what we learned testing algorithmic and AI-based approaches.

  • AI-powered staff writing assistant

    We tested whether a language model could reliably check government web copy against an official style guide and give writers specific, actionable feedback without a human editor in the loop.

  • Generative AI as an accessibility auditor

    We tested whether AI chat tools could replace or supplement traditional accessibility scanners, and learned that the quality of the output had less to do with the model and more to do with how precisely we asked the question.

  • Can you just ask your database a question?

    We built a natural language interface to a legacy analytics database to find out how far plain English can get you before SQL becomes unavoidable.

  • Your codebase is a graph

    We mapped a codebase as a knowledge graph and queried it for structural problems, then built 21 metrics and a dashboard to make the findings actually useful.

  • Cross-tool semantic search

    Most teams don’t have a documentation problem. They have a documentation sprawl problem. We built a semantic search layer across Jira, Confluence, and Figma to find out how far AI can go in solving it.
