A pointer back to first principles before diving into the hard parts.
The core product, as described in the Foundation Document, is a workspace where VC teams run due diligence on startups — combining structured manual input with automated data enrichment, shaped around how each fund actually invests. The AI synthesizes that combined input into a diligence output specific to that fund's thesis.
That description is accurate. But it understates the engineering challenge and — crucially — it doesn't yet answer the question of what the product's defensible value is. Enrichment alone is not it. Summary generation is not it. The value is in what happens when you take multiple independent data sources about the same person or company and ask: does any of this contradict itself, and if so, why?
Not data retrieval. Not summarization. Trust calibration.
Most of what feels like "AI due diligence" is straightforward to build: pull a LinkedIn profile, run a web search, compile a summary. The hard part — the part that actually matters — is trust calibration: building a system that can assess the quality and reliability of a signal before acting on it.
Due diligence sources are noisy by nature. LinkedIn profiles are self-reported. Press coverage is marketing. GitHub activity can be gamed. Reference checks are politely filtered. When a product aggregates these sources and presents them as a unified picture, it is implicitly making a claim about their reliability — and that claim is usually wrong in ways that aren't visible to the user.
The specific challenge is conflict detection: the ability to identify when independent sources are inconsistent with each other, and to reason about why. There are at least three distinct failure modes this needs to handle:
A founder claims five years of experience in a domain, but there is no recoverable trace of it — no code, no writing, no affiliation, no contemporaneous mention by anyone in that field. The absence itself is the signal. A naive retrieval system will simply note "limited public information" and move on. A trust-calibrated system will flag the discrepancy between claimed background and verifiable history, and label it clearly.
Some signals are designed to look organic but aren't: coordinated LinkedIn endorsements, press placements purchased through pay-to-play outlets, GitHub repositories inflated with AI-generated commits, social proof that appears clustered in time or network origin. Detecting these requires comparing signal origin, timing, and distribution against baseline expectations — which in turn requires the system to have a model of what genuine activity looks like in a given context.
The subtler version: two sources each appear credible on their own, but they tell different stories about the same fact. The founder's bio says the company launched in 2021; a Wayback Machine snapshot shows a domain registered in 2019. The pitch deck cites a market study; the study, when retrieved, contains different numbers than were quoted. These inconsistencies are rarely fabrications — they're often innocent errors — but they are always important to surface, because they indicate areas where the investor needs to ask a direct question.
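The manufactured-signal case above rests on comparing timing against a baseline. A minimal sketch of one such check, assuming endorsement or mention timestamps are available as dates. The function names, the 7-day window, the 0.6 threshold, and the minimum event count are all hypothetical placeholders, not calibrated values; a real baseline would need to vary by context.

```python
from datetime import date, timedelta


def max_window_share(events: list[date], window_days: int = 7) -> float:
    """Largest fraction of events that fall inside any window of window_days."""
    if not events:
        return 0.0
    ordered = sorted(events)
    best, start = 0, 0
    for end in range(len(ordered)):
        # Advance the window start until the span fits inside window_days.
        while (ordered[end] - ordered[start]).days >= window_days:
            start += 1
        best = max(best, end - start + 1)
    return best / len(ordered)


def looks_clustered(events: list[date], window_days: int = 7,
                    threshold: float = 0.6, min_events: int = 5) -> bool:
    """True when most events land in one short window (hypothetical threshold)."""
    if len(events) < min_events:
        return False  # too few events to say anything about timing
    return max_window_share(events, window_days) >= threshold


# Illustrative data only: one signal spread over a year, one bunched into three days.
organic = [date(2023, month, 15) for month in range(1, 13)]
burst = [date(2024, 3, 1) + timedelta(days=i % 3) for i in range(10)]
burst += [date(2023, 6, 1), date(2023, 11, 20)]
```

Here `looks_clustered(burst)` is true and `looks_clustered(organic)` is false. A clustered result is a reason to look closer, not evidence of manipulation on its own: a product launch produces the same pattern.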
The product cannot simply summarize what it found. It must maintain an internal model of source reliability and surface conflicts explicitly, with a clear indication of why something was flagged and what the investor should do with that flag. Anything less collapses the value proposition back to "fancy web search."
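The cross-source case can be sketched as grouping claims by the fact they describe and flagging any fact where sources disagree. This is an illustration under a strong assumption: that claims have already been normalized to comparable subjects and attributes, which is the hard part and is not shown. `Claim`, `ConflictFlag`, and `find_conflicts` are hypothetical names, not part of any existing design.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str    # who or what the claim is about
    attribute: str  # the fact being asserted, already normalized (assumption)
    value: str
    source: str     # where the claim came from


@dataclass
class ConflictFlag:
    subject: str
    attribute: str
    values_by_source: dict[str, str]
    suggested_question: str


def find_conflicts(claims: list[Claim]) -> list[ConflictFlag]:
    """Flag every (subject, attribute) pair whose sources give different values."""
    grouped: dict[tuple[str, str], dict[str, str]] = defaultdict(dict)
    for claim in claims:
        grouped[(claim.subject, claim.attribute)][claim.source] = claim.value

    flags = []
    for (subject, attribute), by_source in grouped.items():
        if len(set(by_source.values())) > 1:
            detail = "; ".join(f"{s}: {v}" for s, v in sorted(by_source.items()))
            flags.append(ConflictFlag(
                subject=subject,
                attribute=attribute,
                values_by_source=dict(by_source),
                # The flag states the disagreement and stops short of a conclusion.
                suggested_question=(
                    f"Sources disagree on {attribute} for {subject} ({detail}). "
                    "Worth a direct question."
                ),
            ))
    return flags


# Illustrative claims mirroring the founding-date example in the text.
claims = [
    Claim("ExampleCo", "founding_year", "2021", "founder_bio"),
    Claim("ExampleCo", "founding_year", "2019", "wayback_snapshot"),
    Claim("ExampleCo", "hq_city", "Berlin", "founder_bio"),
]
flags = find_conflicts(claims)
```

With these inputs, `flags` holds a single entry for `founding_year`; the `hq_city` claim has only one source and is left alone. The sketch treats all sources as equal; weighting them by reliability would sit on top of this.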
Three capabilities. Each earns its keep independently. Together, they create a product that's difficult to leave.
These are not features that require evangelism. They map directly onto problems VCs already know they have and are already spending time — or money — trying to solve.
A first-pass diligence output — founder background, company context, market signals, open questions — in under twenty minutes. Not a summary of what the product found; a structured picture that is actually useful in a partner conversation.
The reason this converts to willingness-to-pay is that it changes when a fund can engage with a deal, not just how efficiently they process it. A two-person fund currently can't run a serious first pass on every inbound deal — there are too many of them. The bottleneck isn't judgment; it's the three hours of grunt work required before judgment can be applied. If that grunt work drops to twenty minutes, deals that would have been passed on for lack of time get a look. That is a structural improvement to the fund's sourcing process, and funds understand that.
Why they'll pay: They are already paying for this time, in analyst hours. They know exactly what it costs them.
The ability to surface, clearly and specifically, when something in a founder's profile or company narrative doesn't add up across independent sources. Not a vague "inconsistencies were noted" — a specific flag with the source, the conflicting claim, and a suggested line of questioning.
This is the product's technical differentiator, and it is also the capability most likely to create word-of-mouth among GPs. Every experienced VC has a story about a deal that looked right on the surface and went wrong after the check was written. The professional cost of a bad investment is significant — not just financially, but reputationally. A product that reliably surfaces the red flags a fast-moving diligence process is most likely to miss speaks directly to that fear.
The important design constraint: the product must flag, not conclude. "These two sources tell different stories about the founding date — worth a direct question" is useful. "This company may have a credibility problem" is not. The investor makes the judgment; the product makes sure they have the information.
Why they'll pay: One bad investment funded in part because diligence was too thin is a very expensive lesson. This is insurance with a measurable risk profile.
Every deal in the fund's active pipeline evaluated with the same depth, the same questions asked, the same signals checked. Not dependent on which analyst happened to handle it, how much time was available that week, or whether the partner meeting was the next morning or three weeks away.
This is less dramatic than conflict detection but arguably more important to the fund's long-term operations. Small funds, the core target, see evaluation quality vary widely with available capacity. A deal that comes in during a busy week gets a lighter look than one that arrives during a quiet stretch. That inconsistency is expensive in two directions: good deals get passed on because the diligence was shallow, and bad deals slip through because the team was stretched. Consistent depth removes that variable.
Why they'll pay: Partners can defend any investment decision if the process was rigorous. Inconsistency is a governance risk and a team management problem, not just an analytical one.
Features that sound useful but don't convert to willingness-to-pay. Build them only if they serve the three things above.
Aggregating and returning data about a company without synthesis or conflict detection. This is already commoditized — Crunchbase, PitchBook, and Harmonic do it at scale. If the product's value is "here is what the internet says about this company," it is competing on a dimension it cannot win.
Any feature that produces a "likelihood to succeed" score, investment rating, or automated ranking. VCs do not want this and will actively distrust any product that offers it. Their job is the judgment. A product that appears to automate it is threatening their identity, not assisting it.
Tracking portfolio companies post-investment. This is a different workflow, different cadence, different data needs, and a different buying decision. It's not a bad product to build eventually — but mixing it into the diligence product blurs the value proposition and lengthens the sales cycle.
Finding which partners know which founders. The "who can make the warm intro" problem is real, but it is solved well enough by LinkedIn and existing CRMs. Building this feature competes on relationship data, which requires network effects this product won't have at launch. It also shifts the product toward CRM territory — adjacent but not the core.
The common thread: these features either commoditize the product (enrichment), threaten the investor's professional identity (scoring), or belong to a different product category (monitoring, network tools). None of them make the three paying features stronger. Time spent building them is time not spent on trust calibration.
What to build first, and why the order matters as much as the features themselves.
The instinct is to lead with the differentiating feature — conflict detection is the technical moat, so build it first. This is wrong. Conflict detection only creates value if the user trusts the rest of the output. A flag raised by a product the analyst doesn't yet believe in will be ignored, or worse, will make the analyst distrust the product more. The credibility of the flag depends entirely on the credibility of everything around it.
| Phase | Focus | What Gets Built | Goal |
|---|---|---|---|
| 1 | Speed + Consistency (Trust) | Reliable, structured first-pass output that is accurate and fast. Every deal gets the same depth. The product does what it says it will do, every time, in under twenty minutes. | Establish baseline trust with the team. Become part of the workflow before adding complexity. |
| 2 | Conflict Detection (Differentiation) | Cross-source inconsistency detection. Implausible absence flags. Manufactured signal identification. Surfaced as specific, actionable flags, not warnings or scores. | Convert trust into dependency. The analyst starts finding things they wouldn't have found otherwise. The product earns its seat at the partner meeting. |
| 3 | Thesis Calibration (Moat) | Each fund's output shaped increasingly by its own historical decisions: the questions it always asks, the signals it has learned to weight, the red flags specific to its thesis. The product becomes harder to replicate for each individual fund. | Create switching cost. The product is no longer just a workflow tool; it contains the fund's institutional knowledge about how to evaluate deals. |
Phase 1 is what design partners evaluate. It needs to be good enough that a GP would be embarrassed not to use it before their next partner meeting. If Phase 1 isn't reliably impressive, Phase 2 never gets the audience it needs to demonstrate its value.
Phase 2 is what converts paying pilots into long-term subscriptions. Once an analyst has had the experience of a conflict flag leading to a question that changed the direction of a deal, the product has a defensible position in their workflow.
Phase 3 is what makes the product worth the pricing we want to charge at scale. It's the layer that creates the data flywheel: more deals in the system mean better calibration, and better calibration means more accurate flags. A fund that has used the product for two years cannot recreate that calibration from scratch with a competitor.
How the product actually works, step by step — and why each handoff is a design decision.
Everything in the product reduces to a single repeating cycle. Understanding it clearly is the design contract that all other decisions — data model, conflict detection, agent capability, UX — are built against. The loop has six stages, and each stage is a handoff between the system and the human.
Before any deal is evaluated, the fund encodes what they believe. What stage, what markets, what team profiles, what signals they weight — and crucially, what they have decided not to invest in and why. This is the foundation against which every deal is assessed. It is not a one-time setup; it is a living configuration that reflects the fund's evolving conviction.
Design constraint: Thesis capture must be expressive enough to hold real conviction, not just checklist criteria. A fund that only does climate infrastructure in emerging markets has a different thesis shape than one that does B2B SaaS with repeat founders. The system needs to hold both.
A deal enters the workspace. This may be a pitch deck, a founder name, a company URL, or a structured intake form — the product adapts to how the deal arrived, not the other way around. The system extracts structured information from whatever it receives and creates the deal's working file.
Design constraint: Intake must be low-friction. If the analyst has to re-enter information the deck already contains, the product has failed at the first step.
The agent runs enrichment — pulling background signals on the founder, company context, market data, and public signals — and evaluates the deal against the fund's thesis. The output is not a ranking of how good the startup is. It is an assessment of how well this startup fits this fund's specific investment logic, with clear identification of where it aligns and where it diverges.
Open question: To what degree does a fund's workflow definition influence which data sources the product queries? The current lean is toward product-controlled source selection — source reliability, coverage gaps, and conflict detection logic all depend on the product having a consistent picture of what each source provides. But whether the workflow layer can or should shape source selection is still being decided.
Design constraint: The output must be thesis-aware, not generically positive. A strong startup that doesn't fit the fund's thesis should be surfaced as a pass candidate, not presented as an opportunity.
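A thesis-aware assessment can be sketched as a comparison that reports where a deal aligns with the fund's thesis and where it diverges, including explicit exclusions and the reason for them. This is a deliberately small illustration: the `Thesis`, `Deal`, and `FitAssessment` types, their fields, and the rule that any divergence makes a deal a pass candidate are all assumptions, and a real thesis would carry far more than stage and market.

```python
from dataclasses import dataclass, field


@dataclass
class Thesis:
    stages: set[str]
    markets: set[str]
    # Markets the fund has decided not to invest in, with the reason why.
    exclusions: dict[str, str] = field(default_factory=dict)


@dataclass
class Deal:
    name: str
    stage: str
    market: str


@dataclass
class FitAssessment:
    aligns: list[str]
    diverges: list[str]

    @property
    def pass_candidate(self) -> bool:
        # Simplifying assumption: any divergence surfaces the deal as a pass candidate.
        return bool(self.diverges)


def assess_fit(thesis: Thesis, deal: Deal) -> FitAssessment:
    """Describe fit against this fund's thesis; says nothing about startup quality."""
    aligns: list[str] = []
    diverges: list[str] = []

    if deal.stage in thesis.stages:
        aligns.append(f"stage: {deal.stage}")
    else:
        diverges.append(f"stage: {deal.stage} (outside thesis)")

    if deal.market in thesis.exclusions:
        reason = thesis.exclusions[deal.market]
        diverges.append(f"market: {deal.market} (excluded: {reason})")
    elif deal.market in thesis.markets:
        aligns.append(f"market: {deal.market}")
    else:
        diverges.append(f"market: {deal.market} (outside thesis)")

    return FitAssessment(aligns, diverges)


# Illustrative thesis and deal; all values are made up.
thesis = Thesis(
    stages={"seed", "series_a"},
    markets={"climate_infrastructure"},
    exclusions={"consumer_social": "no distribution edge"},
)
fit = assess_fit(thesis, Deal("ExampleCo", "seed", "b2b_saas"))
```

Here the stage aligns and the market diverges, so `fit.pass_candidate` is true regardless of how strong the startup looks in isolation.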
The product surfaces conflicts and flags proactively — without being asked. When two independent sources tell different stories about the same fact, when a claimed background leaves no verifiable trace, or when a signal pattern looks manufactured rather than organic, the system flags it explicitly: what was found, which sources conflict, and what question the investor should ask directly.
The product flags; it does not conclude. "These two sources disagree about the company's founding date — worth a direct question" is the correct output. "This company may have a credibility problem" is not. The judgment belongs to the VC.
Design constraint: Flags must be specific, sourced, and actionable. A vague warning is worse than no warning — it creates anxiety without giving direction.
The VC reviews the output. They can accept what they have and move forward, pass on the deal, or direct the agent to do more. Direction is the key capability here — the VC doesn't re-run a search or adjust settings; they give the agent a task in plain language: "Find more about this gap in their background," "Check whether this market claim holds," "Draft a question for the founder about the founding date discrepancy."
This is the human-in-the-loop moment, and it is planned for the near term, not deferred to a future state. The VC remains the decision-maker at every stage. The agent's job is to reduce the cost of acting on a decision, not to replace the decision itself.
Design constraint: The direction interface must be natural, not mechanical. VCs should express what they want in plain language. Forcing them into a structured task builder defeats the purpose.
The agent executes the task — additional research, a deeper look at a specific source, or drafting a structured question for the founder — and returns with results. These are folded back into the deal's working file and surfaced to the VC. The loop begins again at stage 05: the VC reviews what came back and decides whether they now have enough, or whether another task is warranted.
Design constraint: The loop must feel like progress, not recursion. Each cycle should leave the VC with a clearer picture, not more open questions. If the agent's output consistently generates more uncertainty than it resolves, the product is failing at its core job.
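The six-stage loop described above can be sketched as a small state machine in which the only branch point is the VC's decision at stage 05. The stage and decision names are inferred from the descriptions and are assumptions, as is modeling "accept" and "pass" as ending the loop for that deal.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    THESIS_CAPTURE = 1
    DEAL_INTAKE = 2
    ENRICH_AND_EVALUATE = 3
    SURFACE_FLAGS = 4
    VC_REVIEW = 5
    AGENT_EXECUTES = 6


class Decision(Enum):
    ACCEPT = "accept"
    PASS = "pass"
    DIRECT = "direct"


def next_stage(stage: Stage, decision: Optional[Decision] = None) -> Optional[Stage]:
    """Return the stage that follows, or None when the loop ends for this deal."""
    if stage is Stage.VC_REVIEW:
        if decision is None:
            raise ValueError("VC review requires a decision")
        # Only a direction to the agent keeps the loop going.
        return Stage.AGENT_EXECUTES if decision is Decision.DIRECT else None
    if stage is Stage.AGENT_EXECUTES:
        # Results fold back into the working file and return to review.
        return Stage.VC_REVIEW
    # Stages 01 through 04 run in order.
    return Stage(stage.value + 1)
```

For example, `next_stage(Stage.VC_REVIEW, Decision.DIRECT)` yields `Stage.AGENT_EXECUTES`, and `next_stage(Stage.AGENT_EXECUTES)` returns to `Stage.VC_REVIEW`. Making the decision a required input at stage 05 keeps the human handoff explicit in the control flow.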
One specific agent task requires a design decision that cannot be deferred: when the VC directs the agent to contact a founder — to ask a specific question, request a document, or follow up on a discrepancy — who does the communication come from, and what does the founder experience?
Option A: The product sends the message from a product-branded address or channel. The founder knows they are interacting with a tool. This is fast, scalable, and transparent about the process. Risk: some founders react negatively to automated outreach during an active deal relationship. A VC whose deal dynamic gets damaged by this will attribute it to the product.
Option B: The agent drafts the outreach or question. The VC reviews and sends it from their own address. The founder interacts with the VC, not the product. The agent's role is invisible externally. This is slower, but it preserves the deal relationship and keeps the human in the loop on the highest-stakes external action.
Current direction: Option B. The near-term implementation has the agent draft founder communications and the VC send them. This keeps the human in the loop on every external action, protects the deal relationship, and avoids the trust and perception risks of fully automated outreach. Revisit once design partners show comfort with more direct agent-to-founder interaction.
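The draft-then-send direction can be sketched as an approval gate: the agent can create a draft, but nothing leaves the system until a VC approves it, and the message goes out under the VC's address. `OutreachDraft`, `approve`, `send`, and the `deliver` callback are hypothetical names for illustration; the actual mail integration is out of scope here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OutreachDraft:
    founder_email: str
    body: str
    approved_by: Optional[str] = None  # the VC's address once approved


def approve(draft: OutreachDraft, vc_email: str,
            edited_body: Optional[str] = None) -> None:
    """Record the VC's approval, optionally with their edits to the draft."""
    if edited_body is not None:
        draft.body = edited_body
    draft.approved_by = vc_email


def send(draft: OutreachDraft,
         deliver: Callable[[str, str, str], None]) -> None:
    """Hand an approved draft to deliver(sender, recipient, body)."""
    if draft.approved_by is None:
        raise PermissionError("draft has not been approved by a VC")
    # The sender is the approving VC, never a product-branded address.
    deliver(draft.approved_by, draft.founder_email, draft.body)
```

Calling `send` on an unapproved draft raises `PermissionError`, so the human review step cannot be skipped by accident. Moving to more direct agent-to-founder interaction later would mean relaxing this one check rather than restructuring the flow.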