BACKGROUND
Documentation sprawl is a real problem for most teams. Specs live in Confluence, tickets in Jira, designs in Figma, and finding anything across all three requires knowing where to look, remembering what it was called, and hoping whoever wrote it used the same terminology you’re searching for. Keyword search doesn’t help much when you’re not sure how something was worded. And when context about a feature is split across all three tools, there’s no easy way to get the full picture in one place.
We wanted to see if we could fix this by building a single interface that searches across all three sources using semantic similarity, and that could answer questions by pulling together everything relevant and synthesizing it.
HYPOTHESIS
If we pull data from Jira, Confluence, and Figma into one place and run semantic search over it, teams will spend less time hunting for information across tools.
APPROACH
We built a Django backend with a Qdrant vector store for semantic search and Postgres for storing full item details. A scheduled script pulls from the Jira and Confluence APIs daily, generates a summary for each item using an LLM, and stores the full content in Postgres and the summary embeddings in Qdrant.
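A minimal sketch of that daily ingestion job, with the external services (the Jira/Confluence APIs, the LLM summarizer, and the embedding model) stubbed out as placeholder functions so the shape of the pipeline is clear. All names here are illustrative, not the actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Item:
    source: str   # e.g. "jira" or "confluence"
    item_id: str
    title: str
    body: str

def fetch_items() -> list[Item]:
    # Placeholder for the daily Jira/Confluence API pulls.
    return [Item("jira", "PROJ-1", "Login bug", "Users can't log in with SSO.")]

def summarize(item: Item) -> str:
    # Placeholder for the LLM call that produces a short per-item summary.
    return f"{item.title}: {item.body[:80]}"

def embed(text: str) -> list[float]:
    # Placeholder for the embedding model; real vectors would come from an
    # embedding API, not character codes.
    return [float(ord(c)) for c in text[:8]]

def ingest(full_store: dict, vector_store: dict) -> int:
    """Store full content (Postgres in the real system) and summary
    embeddings (Qdrant in the real system) for each pulled item."""
    count = 0
    for item in fetch_items():
        summary = summarize(item)
        full_store[item.item_id] = {"item": item, "summary": summary}
        vector_store[item.item_id] = embed(summary)
        count += 1
    return count
```

Keeping the full content and the embeddings keyed by the same item ID is what lets a later search step return IDs and a plain lookup retrieve the details.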
The interesting part was the “ask me anything” feature, where a user asks a question and gets a synthesized answer from across all sources, similar to Perplexity but for your project docs. We tried four different agentic approaches before landing on one that worked:
Single agent with tools. Give one agent all the tools and let it figure it out. Easy to build, mostly reliable, but it’d occasionally go off-script in ways that were hard to predict.
Orchestrator pattern. A coordinator agent that hands off to specialized sub-agents. More complex to implement and didn’t actually perform better, so we dropped it.
Linear multi-agent workflow. Three agents in a fixed sequence: search, fetch details, summarize. Forcing the order removed some of the unpredictability, but we were still using an agent for the middle step (fetching from the DB), where no real decision-making was happening.
Linear with query optimizer (final solution). Same three-step structure, but we replaced the middle agent with a plain function call, fetch_details(item_ids), since there’s no judgment involved there. The real insight was that the first step mattered most. User queries are often vague, weirdly specific, or just not phrased in a way that returns good vector search results. So we added a query_optimizer agent that either exits early (if the query is clearly outside the project’s scope) or rephrases the query until it gets results that are relevant enough to pass forward. We also ran k-means clustering on the vector store to map out how content was distributed across topics, and fed those cluster summaries to the optimizer for context.
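The final workflow above can be sketched end to end: an optimizer step that rewrites the query until vector search returns relevant hits (or exits early), a plain `fetch_details(item_ids)` function with no agent involved, and a synthesis step. The LLM calls and the similarity search are stubbed; the loop structure, the relevance threshold, and the rewrite limit are illustrative assumptions, not the actual values we used:

```python
MAX_REWRITES = 3   # illustrative cap on rephrasing attempts
MIN_SCORE = 0.7    # illustrative relevance threshold

def vector_search(query: str) -> list[tuple[str, float]]:
    # Placeholder for the Qdrant similarity search: (item_id, score) pairs.
    corpus = {"PROJ-1": "sso login failure", "DOC-9": "design tokens"}
    return [(i, 0.9 if query in text else 0.1) for i, text in corpus.items()]

def rephrase(query: str, cluster_summaries: list[str]) -> str:
    # Placeholder for the LLM rewrite, guided by the k-means cluster
    # summaries that describe how content is distributed across topics.
    return query.lower().replace("single sign-on", "sso")

def query_optimizer(query: str, cluster_summaries: list[str]):
    """Rewrite the query until search returns relevant results, or give up."""
    for _ in range(MAX_REWRITES):
        hits = [(i, s) for i, s in vector_search(query) if s >= MIN_SCORE]
        if hits:
            return query, [i for i, _ in hits]
        query = rephrase(query, cluster_summaries)
    return query, []  # out of scope or nothing relevant: exit early

def fetch_details(item_ids: list[str]) -> list[dict]:
    # Plain function, no agent: a straight DB lookup by ID.
    db = {"PROJ-1": {"title": "Login bug", "body": "SSO fails for new users."}}
    return [db[i] for i in item_ids if i in db]

def answer(query: str) -> str:
    refined, ids = query_optimizer(query, cluster_summaries=[])
    if not ids:
        return "Nothing relevant found."
    details = fetch_details(ids)
    # Placeholder for the synthesis LLM call over the fetched details.
    return " / ".join(d["body"] for d in details)
```

The structural point is that only `query_optimizer` and the final synthesis involve an LLM; `fetch_details` is deterministic code, which is exactly where the earlier versions were paying agent overhead for no benefit.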
TAKEAWAYS
01
Don’t start with AI, start with the question. The biggest lesson here wasn’t about which AI architecture to use. It was that the quality of the answer depends almost entirely on asking the right question of the right data. When someone types a vague query, a naive system just searches for that vague query and returns mediocre results. We added a step that rewrites and refines the query before touching the data, and that single change had more impact on quality than anything else we tried. The equivalent in a human context: a good researcher doesn’t just Google your exact words; they figure out what you’re actually trying to find.
02
Use AI where judgment is needed, not everywhere. Early versions used AI agents for every step of the process, including steps that were really just data lookups with no decision-making involved. That added complexity and introduced failure points for no reason. The final solution uses AI only where it genuinely helps: understanding and refining the question, and synthesizing the answer. Everything in between is just code. Knowing where AI adds value and where it doesn’t is a skill in itself.
03
The right architecture isn’t always the most sophisticated one. We tested four different approaches, including multi-agent orchestration patterns that are popular in the AI space right now. The one that worked best was also the simplest: a clear sequence of steps, AI where needed, plain functions everywhere else. Teams that chase complexity often end up with systems that are harder to maintain and no more accurate.