Benchmark · April 2026

Purpose-built vs. general-purpose AI for investment diligence

A controlled head-to-head comparing HuxleyIQ against a frontier general-purpose model on identical source materials for The Wendy's Company (WEN), across competitive intelligence, Porter's Five Forces, and LBO modeling.

The result

A 23.7-point accuracy gap on the same materials.

Both systems received the same documents and the same tasks. Outputs were scored against a shared rubric; the full rubric and side-by-side outputs are in the white paper.

HuxleyIQ 94.2%

Purpose-built workflows with persistent deal context, evidence chains, and specialized agents.

General-purpose AI 70.5%

Strong document organization, but missed decision-critical findings implied across sources.

Accuracy lift 33.6%

HuxleyIQ scored 23.7 percentage points higher, a 33.6% relative accuracy lift over the general-purpose model.
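
For readers who want to check the arithmetic, the two headline figures can be reproduced from the reported scores. The snippet below is a minimal sketch, not part of the benchmark tooling: the function names are illustrative, and the only inputs are the 94.2% and 70.5% scores stated on this page.

```python
# Scores as reported on this page (percent accuracy against the shared rubric).
HUXLEYIQ_SCORE = 94.2
GENERAL_PURPOSE_SCORE = 70.5


def point_gap(score: float, baseline: float) -> float:
    """Absolute difference between two scores, in percentage points."""
    return round(score - baseline, 1)


def relative_lift(score: float, baseline: float) -> float:
    """Difference expressed as a percentage of the baseline score."""
    return round((score - baseline) / baseline * 100, 1)


if __name__ == "__main__":
    print(point_gap(HUXLEYIQ_SCORE, GENERAL_PURPOSE_SCORE))      # 23.7
    print(relative_lift(HUXLEYIQ_SCORE, GENERAL_PURPOSE_SCORE))  # 33.6
```

The point gap is an absolute difference in percentage points, while the lift divides that difference by the general-purpose score, which is why the two numbers differ.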

Decision-critical findings surfaced

$50M+ in aggregate deal-structure implications

  • Securitization covenant mechanics
  • Advertising fund cascade
  • Franchise distress signals
  • GLP-1-driven demand headwinds

Methodology

Same inputs. Same tasks. Scored side by side.

Identical source materials

Both systems worked from the same public source materials for The Wendy's Company, a publicly traded quick-service restaurant business.

Three diligence workstreams

Competitive intelligence, Porter's Five Forces, and LBO modeling: the analytical work that shapes whether a deal moves forward.

Shared scoring rubric

Outputs were scored against a common rubric. The full rubric, scoring detail, and side-by-side outputs are available in the white paper.

Why the gap exists

Summary quality wasn't the difference. Synthesis was.

The general-purpose model organized documents well. It fell behind on findings that were not stated in any single document but were implied across several, the kind of synthesis that changes deal structure, diligence priorities, and price.

What HuxleyIQ brings

Persistent deal context, evidence chains tied to source documents, specialized agents per analytical function, and firm memory that applies your criteria across the work.

Where general-purpose AI stops

One context, one pass, no deal persistence. Strong recall of what documents say; weaker on what they jointly imply for structure, risk, and valuation.

Go deeper

Get the full WEN teardown.

The white paper includes the complete scoring rubric, methodology notes, and the side-by-side analytical outputs behind every number on this page.

Request the white paper

We'll send the full benchmark memo and answer questions about how the comparison was run.

Request the white paper, or meet with a founder.