Purpose-built workflows with persistent deal context, evidence chains, and specialized agents.
Benchmark · April 2026
Purpose-built vs. general-purpose AI for investment diligence
A controlled head-to-head comparing HuxleyIQ against a frontier general-purpose model on identical source materials for The Wendy's Company (WEN), across competitive intelligence, Porter's Five Forces, and LBO modeling.
The result
A 23.7-point accuracy gap on the same materials.
Both systems received the same documents and the same tasks. Outputs were scored against a shared rubric; the full rubric and side-by-side outputs are in the white paper.
The general-purpose model organized documents well but missed decision-critical findings implied across sources.
HuxleyIQ scored 23.7 percentage points higher, a 33.6% relative accuracy lift over the general-purpose model.
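The two headline figures are linked by simple arithmetic. The sketch below assumes the relative lift is the percentage-point gap divided by the general-purpose model's score. This page does not state that definition or the underlying scores, so the values computed here are back-calculated approximations, not reported results. The function names are illustrative only; the white paper remains the source for the actual scores.

```python
def relative_lift(gap_points: float, baseline_score: float) -> float:
    """Relative lift in percent, given a gap in percentage points and a baseline score."""
    return 100.0 * gap_points / baseline_score


def implied_baseline(gap_points: float, relative_lift_pct: float) -> float:
    """Baseline score implied by a gap and a relative lift (assumed definition: gap / baseline)."""
    return 100.0 * gap_points / relative_lift_pct


# Figures stated on this page.
GAP_POINTS = 23.7
RELATIVE_LIFT_PCT = 33.6

# Back-calculated values: assumptions derived from the two figures above, not reported scores.
baseline = implied_baseline(GAP_POINTS, RELATIVE_LIFT_PCT)
higher = baseline + GAP_POINTS

print(f"Implied general-purpose score: {baseline:.1f}")
print(f"Implied HuxleyIQ score: {higher:.1f}")
print(f"Relative lift check: {relative_lift(GAP_POINTS, baseline):.1f}%")
```

Under that assumed definition, the stated gap and lift imply scores of roughly 70.5 and 94.2 on the rubric's scale.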
Decision-critical findings surfaced
$50M+ in aggregate deal-structure implications
- Securitization covenant mechanics
- Advertising fund cascade
- Franchise distress signals
- GLP-1-driven demand headwinds
Methodology
Same inputs. Same tasks. Scored side by side.
Identical source materials
Both systems worked from the same public source materials for The Wendy's Company, a publicly traded quick-service restaurant business.
Three diligence workstreams
Competitive intelligence, Porter's Five Forces, and LBO modeling — the analytical work that shapes whether a deal moves forward.
Shared scoring rubric
Outputs were scored against a common rubric. The full rubric, scoring detail, and side-by-side outputs are available in the white paper.
Why the gap exists
Summary quality wasn't the difference. Synthesis was.
The general-purpose model organized documents well. It fell behind on findings that were not stated in any single document but implied across several — the kind of synthesis that changes deal structure, diligence priorities, and price.
What HuxleyIQ brings
Persistent deal context, evidence chains tied to source documents, specialized agents per analytical function, and firm memory that applies your criteria across the work.
Where general-purpose AI stops
One context, one pass, no deal persistence. Strong recall of what documents say; weaker on what they jointly imply for structure, risk, and valuation.
Go deeper
Get the full WEN teardown.
The white paper includes the complete scoring rubric, methodology notes, and the side-by-side analytical outputs behind every number on this page.
Request the white paper
We'll send the full benchmark memo and answer questions about how the comparison was run.
Request the white paper, or meet with a founder.