Production System v1.0

Multi-Agent Technical Decision System

A structured reasoning engine that helps engineers make high-stakes technical decisions through transparent, multi-agent analysis with explicit disagreement handling and cost awareness.

Explicit Disagreement

Unlike single-LLM systems, this system surfaces conflicts between agents instead of hiding them. The Disagreement Detector identifies where specialists contradict each other.

Cost Transparency

Every decision iteration costs ~$0.02. We track tokens, cost per agent, and total spend in real time. No hidden API charges.

Critic & Gate

A dedicated Critic Agent challenges weak reasoning. The Gate Agent enforces confidence thresholds and can force deferral when confidence is too low.

Agent Execution Graph

Planner: Decomposes
Specialists: Systems, ML/AI, Cost, Product
Detector: Finds conflicts
Critic: Challenges
Synthesizer: Resolves
Gate: Validates

System Architecture

Designed to reduce single-LLM hallucination through structured multi-agent reasoning with explicit disagreement handling.

1. Planner

Decomposes into sub-questions

2. Specialists

Parallel independent evaluation

Systems
ML/AI
Cost
Product

3. Detector

Finds conflicts explicitly

4. Critic

Challenges assumptions

5. Synthesizer

Unifies recommendation

6. Gate

Final validation
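The six stages above can be sketched as a small asyncio pipeline. This is an illustration only, not the project's code: every function name, the canned recommendations, and the 0.7 confidence threshold are hypothetical, and the model calls are replaced by stubs. What it shows is the ordering, with the four specialists running concurrently and each later stage consuming the previous stage's output.

```python
import asyncio
from collections import Counter
from itertools import combinations

SPECIALISTS = ["systems", "ml_ai", "cost", "product"]

# Canned answers standing in for LLM output (hypothetical values).
CANNED = {"systems": "batch", "ml_ai": "batch", "cost": "batch", "product": "real_time"}


async def plan(prompt: str) -> dict[str, str]:
    """1. Planner: one sub-question per specialist (stub)."""
    return {name: f"[{name}] {prompt}" for name in SPECIALISTS}


async def run_specialist(name: str, question: str) -> dict:
    """2. Specialist: evaluates only its own sub-question (stub)."""
    await asyncio.sleep(0)  # placeholder for the model call
    return {"agent": name, "recommendation": CANNED[name], "confidence": 0.8}


def detect_conflicts(reports: list[dict]) -> list[tuple[str, str]]:
    """3. Detector: pairs of agents whose recommendations differ."""
    return [
        (a["agent"], b["agent"])
        for a, b in combinations(reports, 2)
        if a["recommendation"] != b["recommendation"]
    ]


def critique(reports: list[dict], conflicts: list[tuple[str, str]]) -> list[str]:
    """4. Critic: lists issues, never proposes solutions."""
    return [f"unresolved: {a} vs {b}" for a, b in conflicts]


def synthesize(reports: list[dict], conflicts: list, issues: list[str]) -> dict:
    """5. Synthesizer: majority recommendation plus mean confidence."""
    top, _ = Counter(r["recommendation"] for r in reports).most_common(1)[0]
    confidence = sum(r["confidence"] for r in reports) / len(reports)
    return {"recommendation": top, "confidence": confidence,
            "conflicts": conflicts, "issues": issues}


def gate(decision: dict, min_confidence: float = 0.7) -> bool:
    """6. Gate: rule-based check, no model call (threshold is assumed)."""
    return decision["confidence"] >= min_confidence and not decision["conflicts"]


async def run_pipeline(prompt: str) -> dict:
    questions = await plan(prompt)
    # Specialists run concurrently and never see each other's output.
    reports = await asyncio.gather(
        *(run_specialist(name, q) for name, q in questions.items())
    )
    conflicts = detect_conflicts(reports)
    issues = critique(reports, conflicts)
    decision = synthesize(reports, conflicts, issues)
    decision["approved"] = gate(decision)
    return decision
```

Calling `asyncio.run(run_pipeline("batch or real-time scoring?"))` with these stubs yields a majority recommendation that the gate still rejects, because the Product specialist's disagreement remains unresolved.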

Narrow Roles

Each agent has a strictly defined scope. The Systems Agent cannot evaluate ML feasibility. The Cost Agent cannot assess user experience. This constraint prevents single-LLM scope creep.

Explicit Disagreement

The Disagreement Detector forces conflicts into the open. When Systems recommends batch and Product demands real-time, this conflict is surfaced, categorized by severity, and must be resolved.
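One way to picture a surfaced conflict is as a small record carrying the agents involved, their positions, a severity, and a blocking flag. The sketch below is a rule-based stand-in for illustration: the real detector is described as an LLM agent, and the `Conflict` fields, the `rate` heuristic, and its cut-offs are all assumptions rather than the project's schema.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import combinations


class Severity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class Conflict:
    agents: tuple[str, str]
    positions: tuple[str, str]
    severity: Severity
    blocking: bool
    resolved: bool = False


def rate(conf_a: float, conf_b: float) -> Severity:
    # Assumed heuristic: the more confident both sides are, the worse the clash.
    weakest = min(conf_a, conf_b)
    if weakest >= 0.8:
        return Severity.HIGH
    if weakest >= 0.5:
        return Severity.MEDIUM
    return Severity.LOW


def detect(reports: dict[str, tuple[str, float]]) -> list[Conflict]:
    """reports maps agent name -> (recommendation, confidence)."""
    conflicts = []
    for (a, (rec_a, conf_a)), (b, (rec_b, conf_b)) in combinations(reports.items(), 2):
        if rec_a != rec_b:
            severity = rate(conf_a, conf_b)
            conflicts.append(
                Conflict((a, b), (rec_a, rec_b), severity,
                         blocking=severity is Severity.HIGH)
            )
    return conflicts


# The batch vs real-time example from the text, with made-up confidences.
reports = {
    "systems": ("batch", 0.85),
    "product": ("real_time", 0.9),
    "cost": ("batch", 0.6),
}
conflicts = detect(reports)
```

Here the Systems/Product clash is rated high and marked blocking, while the Product/Cost clash is rated medium and left non-blocking.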

Schema Validation

All agent outputs are validated against strict schemas. Confidence scores must be between 0 and 1. Recommendations must come from a fixed enum. Rationale must be an array of strings. No free-form text blobs.
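The three rules above could be enforced along these lines. This is a standard-library sketch under stated assumptions: the `AgentOutput` shape and the `Recommendation` members are hypothetical, since the source does not list the real enum values. A FastAPI project would often use Pydantic models for this, but whether this one does is not stated.

```python
from dataclasses import dataclass
from enum import Enum


class Recommendation(Enum):
    # Hypothetical members; the source does not list the real enum values.
    PROCEED = "proceed"
    DEFER = "defer"
    REJECT = "reject"


@dataclass(frozen=True)
class AgentOutput:
    agent: str
    recommendation: Recommendation
    confidence: float
    rationale: list[str]

    def __post_init__(self) -> None:
        if not isinstance(self.recommendation, Recommendation):
            raise ValueError("recommendation must be a Recommendation member")
        if isinstance(self.confidence, bool) or not isinstance(self.confidence, (int, float)):
            raise ValueError("confidence must be a number")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        if not isinstance(self.rationale, list) or not all(
            isinstance(item, str) for item in self.rationale
        ):
            raise ValueError("rationale must be a list of strings")


def parse_output(raw: dict) -> AgentOutput:
    """Convert a raw dict (e.g. decoded model JSON) into a validated AgentOutput."""
    return AgentOutput(
        agent=raw["agent"],
        # Enum lookup by value raises ValueError for anything outside the enum.
        recommendation=Recommendation(raw["recommendation"]),
        confidence=raw["confidence"],
        rationale=raw["rationale"],
    )
```

A rationale supplied as a single string, a confidence of 1.4, or a recommendation of "maybe" each raise `ValueError` instead of flowing downstream.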

Agent Specifications

Each agent is a specialized reasoning unit with constrained prompts and explicit outputs.

Planner Agent

GPT-5-mini

Decomposes ambiguous technical decisions into structured sub-questions for specialist agents. Forces clarity before reasoning begins.

Output: Sub-questions, assumptions, unknowns
Cost: ~$0.005/run

Systems Agent

GPT-5-mini

Evaluates infrastructure, scalability, latency, and operational complexity. Focuses on batch vs online, reliability, and ops overhead.

Focus: Infra, scaling, reliability
Cost: ~$0.0015/run

ML/AI Agent

GPT-5-mini

Assesses model complexity, training vs inference costs, data requirements, and MLOps overhead. Evaluates technical feasibility of ML approaches.

Focus: Model complexity, data, MLOps
Cost: ~$0.0015/run

Cost Agent

GPT-5-mini

Analyzes cloud costs, model inference expenses, and long-term scalability. Evaluates cost-performance trade-offs and budget impact.

Focus: Cloud costs, ROI, scaling costs
Cost: ~$0.0015/run

Product Agent

GPT-5-mini

Evaluates user experience, market fit, feature velocity, and business alignment. Considers latency tolerance and user expectations.

Focus: UX, velocity, market fit
Cost: ~$0.0015/run

Disagreement Detector

GPT-5-mini

Identifies conflicts between specialist recommendations. Surfaces disagreements explicitly with severity ratings and blocking status.

Output: Conflicts, severity, agents
Cost: ~$0.002/run

Critic Agent

GPT-5-mini

Challenges assumptions and identifies weak reasoning. Does not propose solutions; it only attacks blind spots to force more robust reasoning.

Output: Issues, risks, gaps
Cost: ~$0.003/run

Synthesizer Agent

GPT-5.1

Integrates all perspectives into a final recommendation with confidence score, rationale, trade-offs, and unresolved risks.

Output: Final rec, confidence, risks
Cost: ~$0.003/run

Gate Agent

Rule Based

Validates decision quality against thresholds. Enforces minimum confidence, checks for unresolved blocking conflicts, manages approval tiers.

Tiers: Exploration, Commitment, Override
Cost: $0.00 (rule-based)
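Because the Gate is rule-based, its checks can be written as plain code. The sketch below is an assumption-heavy illustration: the source names the three tiers but not their thresholds or semantics, so the numeric values, the `GateResult` type, and the treatment of Override as an explicit bypass are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    EXPLORATION = "exploration"
    COMMITMENT = "commitment"
    OVERRIDE = "override"


# Assumed thresholds; the source names the tiers but not their values.
MIN_CONFIDENCE = {Tier.EXPLORATION: 0.5, Tier.COMMITMENT: 0.8}


@dataclass
class GateResult:
    approved: bool
    reasons: list[str]


def gate(confidence: float, blocking_conflicts: int, tier: Tier) -> GateResult:
    if tier is Tier.OVERRIDE:
        # Assumption: an override is an explicit bypass of the checks.
        return GateResult(True, ["approved by override"])
    reasons = []
    threshold = MIN_CONFIDENCE[tier]
    if confidence < threshold:
        reasons.append(f"confidence {confidence:.2f} below {threshold:.2f}")
    if blocking_conflicts > 0:
        reasons.append(f"{blocking_conflicts} unresolved blocking conflict(s)")
    # An empty reasons list means every check passed.
    return GateResult(approved=not reasons, reasons=reasons)
```

Under these assumed thresholds, a 0.6-confidence decision passes at the Exploration tier but is deferred at the Commitment tier, and any unresolved blocking conflict defers the decision regardless of confidence.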

Cost Model & Infrastructure

Transparent pricing and serverless deployment on Google Cloud Run.

Cost Per Iteration

Planner Agent $0.005
4× Specialist Agents $0.006
Disagreement Detector $0.002
Critic Agent $0.003
Synthesizer Agent $0.003
Gate Agent (Rule) $0.000
Total per iteration ~$0.02

Iteration 2 (if needed) costs an additional $0.02.
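The arithmetic behind the ~$0.02 figure can be checked with a few lines of Python, using the per-run costs listed under Agent Specifications. The dictionary keys and function names are illustrative only.

```python
# Per-run costs in USD, taken from the Agent Specifications section.
PER_RUN_COST = {
    "planner": 0.005,
    "systems": 0.0015,
    "ml_ai": 0.0015,
    "cost": 0.0015,
    "product": 0.0015,
    "detector": 0.002,
    "critic": 0.003,
    "synthesizer": 0.003,
    "gate": 0.0,  # rule-based, no model call
}


def iteration_cost() -> float:
    """Cost of one full pass through all agents."""
    return sum(PER_RUN_COST.values())


def decision_cost(iterations: int = 1) -> float:
    """Cost of one decision that needs the given number of iterations."""
    return iterations * iteration_cost()


print(f"per iteration:            ${iteration_cost():.3f}")
print(f"100 decisions, 1 pass:    ${100 * decision_cost(1):.2f}")
print(f"100 decisions, 2 passes:  ${100 * decision_cost(2):.2f}")
```

The per-agent figures sum to $0.019 per iteration, which rounds to the quoted ~$0.02, and 100 decisions land between $1.90 and $3.80 depending on whether a second iteration is needed.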

Hosting Architecture

Google Cloud Run

Serverless container platform with automatic scaling, WebSocket support, and pay-per-use pricing. An idle instance costs single-digit USD per month.

FastAPI + Uvicorn

High-performance Python backend with native WebSocket support for real-time agent execution streaming.
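Streaming agent results as they finish can be sketched with the standard library alone. The example below is a hedged illustration, not the project's handler: the agent names, delays, and event shape are made up, and the WebSocket itself is left out. In a FastAPI endpoint, each yielded event would typically be forwarded to the client with `await websocket.send_json(event)`.

```python
import asyncio
from typing import AsyncIterator

# Hypothetical delays standing in for model latency, in seconds.
DELAYS = {"systems": 0.03, "ml_ai": 0.01, "cost": 0.02, "product": 0.04}


async def run_agent(name: str) -> dict:
    await asyncio.sleep(DELAYS[name])  # placeholder for the LLM call
    return {"agent": name, "status": "complete"}


async def stream_events() -> AsyncIterator[dict]:
    """Yield one event per agent, in completion order rather than start order."""
    tasks = [asyncio.create_task(run_agent(name)) for name in DELAYS]
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def collect() -> list[str]:
    # A WebSocket handler would send each event to the client here instead.
    return [event["agent"] async for event in stream_events()]


if __name__ == "__main__":
    print(asyncio.run(collect()))
```

The design choice illustrated is `asyncio.as_completed`, which lets a fast agent's result reach the client without waiting for slower agents to finish.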

Docker Container

Reproducible deployments with locked dependency versions. On Cloud Run, the container scales from zero as request volume grows.

Monthly Infrastructure Cost ~$5-10
LLM Cost (100 decisions) ~$2-4

Why Cost Transparency Matters

Budget Awareness

Users understand that AI decisions consume real resources. This prevents frivolous queries and encourages thoughtful decision formulation.

Iteration Trade-offs

Iteration 2 costs another $0.02. Users must consciously decide if resolving uncertainty is worth the additional cost, mirroring real engineering trade-offs.

Model Selection

The specialists, along with the Planner, Detector, and Critic, use GPT-5-mini for cost efficiency. The Synthesizer uses GPT-5.1 for higher reasoning quality. This tiered approach spends more only on the step that integrates every perspective.

Open Source

The full backend implementation, agent logic, and deployment configurations are available on GitHub.

Built with FastAPI · WebSockets · GPT-5 · Google Cloud Run