Experiments
-
Generative AI as an accessibility testing coach
Automated accessibility tools tell you what failed. We wanted to know if AI could tell you why, and coach a non-expert through fixing it.
-
Design handoff for your AI coding agent
We built a tool that converts a design screenshot or Figma file into two machine-readable files, giving AI coding agents the vocabulary and rules they need to generate UI that actually matches your brand.
-
Vibe coding platforms in the real world
We put Lovable through its paces across a variety of builds and came away impressed by the speed, skeptical about the depth, and clear on where it breaks down.
-
Approaches to automated name matching
Reconciling duplicate names across disparate datasets is harder than it looks. Here’s what we learned testing algorithmic and AI-based approaches.
-
AI-powered staff writing assistant
We tested whether a language model could reliably check government web copy against an official style guide and give writers specific, actionable feedback without a human editor in the loop.
-
Generative AI as an accessibility auditor
We tested whether AI chat tools could replace or supplement traditional accessibility scanners, and learned that the quality of the output had less to do with the model and more to do with how precisely we asked the question.
-
Can you just ask your database a question?
We built a natural language interface to a legacy analytics database to find out how far plain English can get you before SQL becomes unavoidable.
-
Your codebase is a graph
We mapped a codebase as a knowledge graph and queried it for structural problems, then built 21 metrics and a dashboard to make the findings actually useful.
-
Cross-tool Semantic Search
Most teams don’t have a documentation problem. They have a documentation sprawl problem. We built a semantic search layer across Jira, Confluence, and Figma to find out how far AI can go in solving it.