Experiments

An accountability trail for AI decisions

When an AI shapes a decision about a person, can you later show what happened and who was accountable? We built a thin governance layer that records exactly that.

Generative AI as an accessibility testing coach

Automated accessibility tools tell you what failed. We wanted to know if AI could tell you why, and coach a non-expert through fixing it.

Design handoff for your AI coding agent

We built a tool that converts a design screenshot or Figma file into two machine-readable files, giving AI coding agents the vocabulary and rules they need to generate UI that actually matches your brand.

Vibe coding platforms in the real world

We put Lovable through its paces across a range of builds and came away impressed by the speed, skeptical about the depth, and clear on where it breaks down.

Approaches to automated name matching

Reconciling duplicate names across disparate datasets is harder than it looks. Here’s what we learned testing algorithmic and AI-based approaches.

AI-powered staff writing assistant

We tested whether a language model could reliably check government web copy against an official style guide and give writers specific, actionable feedback without a human editor in the loop.

Generative AI as an accessibility auditor

We tested whether AI chat tools could replace or supplement traditional accessibility scanners, and learned that the quality of the output had less to do with the model and more to do with how precisely we asked the question.

Can you just ask your database a question?

We built a natural language interface to a legacy analytics database to find out how far plain English can get you before SQL becomes unavoidable.

Your codebase is a graph

We mapped a codebase as a knowledge graph and queried it for structural problems, then built 21 metrics and a dashboard to make the findings actually useful.

Cross-tool semantic search

Most teams don’t have a documentation problem. They have a documentation sprawl problem. We built a semantic search layer across Jira, Confluence, and Figma to find out how far AI can go in solving it.

