MODEL CARD

QED-1

DEVELOPER: Flaude Labs
RELEASE: July 2026 · research preview
MODEL ID: qed-1 · v1.0
TYPE: symbolic reasoning system
MODALITIES: text → text
DOMAIN: quantitative reasoning
KNOWLEDGE CUTOFF: not applicable
SAMPLING: none; deterministic*

Overview

QED-1 is a symbolic reasoning system for quantitative tasks: exact arithmetic at arbitrary precision, number theory, single-variable equations, and descriptive statistics, addressed in natural language. Its design goal is provable hallucination elimination: within the supported domain, outputs are exact by construction; outside it, QED-1 refuses rather than guesses. Inference is deterministic — identical inputs produce bit-identical outputs across runs, machines, and time.

Model details

Architecture family: non-neural symbolic core
Parameters: not disclosed
Training data: not disclosed
Training compute: not disclosed
Fine-tuning: none
Tokenizer: none; token usage figures are estimates (chars/4)
Context handling: the final user turn is authoritative
Distribution: 1.3 MB artifact; runs on commodity hardware
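
The chars/4 estimate in the Tokenizer row can be sketched as below. The function name estimate_tokens is hypothetical, and rounding up is an assumption: the card does not say how fractional results are handled.

```python
import math


def estimate_tokens(text: str) -> int:
    """Estimate token usage as character count / 4.

    Rounding up is an assumption; the card only states "chars/4".
    """
    return math.ceil(len(text) / 4)


print(estimate_tokens("What is 17 * 23?"))  # 16 characters
```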

Intended use

Exact computation over natural-language prompts: arbitrary-precision arithmetic, primality and factorization, integer sequences, π to specified precision, quadratic and linear equations, descriptive statistics, base conversion, and modular arithmetic.
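
For orientation, a few of the task types listed above are shown here as plain Python standard-library calls. This only illustrates what exact answers to such prompts look like; it is not QED-1's engine, and the helper name is_prime is hypothetical.

```python
from fractions import Fraction
from statistics import mean


def is_prime(n: int) -> bool:
    """Primality by trial division: exact, but only practical for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


big_product = 2**64 * 3**40  # Python ints are arbitrary precision, so no overflow
mod_result = pow(7, 128, 13)  # modular exponentiation: 7^128 mod 13
base_value = int("ff", 16)  # base conversion: hexadecimal "ff" to an integer
exact_mean = mean([Fraction(1, 3), Fraction(1, 6)])  # mean kept as an exact rational
```

Using Fraction rather than float keeps the mean exact, which is the property the card claims for in-domain answers.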

Out of scope

Open-ended dialogue, world knowledge, code generation, and multi-step word problems. Out-of-scope prompts receive an explicit refusal.
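
The refuse-rather-than-guess behavior can be pictured as a dispatcher that answers only when a prompt matches a supported pattern. The sketch below is a toy under stated assumptions: the single supported pattern, the function name answer, and the refusal wording are all hypothetical and are not taken from QED-1.

```python
import re

# Hypothetical refusal wording; the card does not give the actual text.
REFUSAL = "Refused: prompt is outside the supported domain."

# The only supported form in this toy: "<int> <op> <int>" with op in + - *
_BINARY_INT = re.compile(r"\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*")


def answer(prompt: str) -> str:
    """Return an exact result for a supported prompt, otherwise refuse."""
    match = _BINARY_INT.fullmatch(prompt)
    if match is None:
        return REFUSAL  # anything unrecognized is refused, never guessed
    left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
    if op == "+":
        return str(left + right)
    if op == "-":
        return str(left - right)
    return str(left * right)


print(answer("12345 * 67890"))
print(answer("Who wrote Hamlet?"))
```

Because the fallback branch is the refusal, an unmatched prompt cannot produce a fabricated answer in this toy.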

Evaluation

Benchmark | Score | Notes
BIG-bench arithmetic (all 20 subtasks) | 15,023 / 15,023 (100.00%) | exact string match; mean 13.5 µs/item; full results
DET-18 (internal stress suite) | 18 / 18 | published in full at /v1/problems
Output reproducibility | 100% | bit-identical across runs and machines
Hallucination rate, in domain | 0.00% | refusals are not counted as answers
GSM8K · MMLU · HumanEval | not evaluated | out of scope

Methodology

The BIG-bench run covers every example in all twenty arithmetic subtasks (1–5 digit addition, subtraction, multiplication, division), scored by exact string match against the published targets. The evaluation executes against the same 1.3 MB inference artifact distributed to end users, and reproduces with one command (node scripts/run-bigbench.mjs). The results file carries a SHA-256 digest.
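
As a rough illustration of the two checks described above, exact-string-match scoring and a SHA-256 digest over a results file can be sketched as follows. The function names are hypothetical, and this is not the content of scripts/run-bigbench.mjs.

```python
import hashlib
from typing import Sequence


def exact_match_score(outputs: Sequence[str], targets: Sequence[str]) -> int:
    """Count items whose output equals the target character for character."""
    if len(outputs) != len(targets):
        raise ValueError("outputs and targets must have the same length")
    return sum(o == t for o, t in zip(outputs, targets))


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string, e.g. the bytes of a results file."""
    return hashlib.sha256(data).hexdigest()


# "7.0" is numerically equal to "7" but fails exact string match.
print(exact_match_score(["4", "10", "7.0"], ["4", "10", "7"]))
```

Exact string match is the stricter choice: formatting differences count as errors, so a perfect score also implies matching output format.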

For reference, published measurements of general-purpose language models on multi-digit arithmetic report 59% accuracy for GPT-4 on 3×3-digit multiplication, approaching 0% at 5×5 digits (Dziri et al., 2023). On the 5-digit multiplication subtask, QED-1 scores 1,000/1,000.

Safety

QED-1 cannot produce harmful content, disinformation, or unsafe code. Its refusal behavior on out-of-scope prompts is a structural property of the system, not a trained tendency, and is therefore not susceptible to jailbreaking, prompt injection, or adversarial fine-tuning. Red-teaming was conducted; the output policy was unaffected.

Environmental impact

Training emissions: none attributable
Inference: on-device; no datacenter involved
Marginal energy per query: microseconds of one CPU core

Data & privacy

The web demo performs inference locally; prompts are not transmitted, stored, or used for training. QED-1 does not learn from user data.

Provenance & verification

Every response carries a provenance hash:

provenance = SHA-256( model_id ∥ 0x00 ∥ prompt ∥ 0x00 ∥ answer_text )

Because inference is deterministic, the hash is reproducible by any party: re-issue a prompt against any honest deployment of the same engine version and the hash must match. The API additionally returns an attestation object on every response identifying the procedure that produced the answer. Independent verification is invited; every claim on this card is checkable.
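
A minimal sketch of recomputing the provenance hash from the formula above. The card does not specify text encoding or output format, so UTF-8 and a lowercase hex digest are assumptions, as is passing "qed-1" as the model_id string; the function name provenance is hypothetical.

```python
import hashlib


def provenance(model_id: str, prompt: str, answer_text: str) -> str:
    """SHA-256 over model_id, prompt and answer_text joined by single 0x00 bytes.

    Assumptions: fields are UTF-8 encoded and the digest is rendered as hex.
    """
    separator = b"\x00"
    payload = separator.join(
        field.encode("utf-8") for field in (model_id, prompt, answer_text)
    )
    return hashlib.sha256(payload).hexdigest()


first = provenance("qed-1", "What is 2 + 2?", "4")
second = provenance("qed-1", "What is 2 + 2?", "4")
print(first == second)  # identical inputs give an identical hash
```

The 0x00 separators keep field boundaries unambiguous, so moving characters from the end of the prompt to the start of the answer changes the hash.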

Limitations

Versioning & deprecation

The qed-1 identifier is bound to bit-identical behavior. Any change to the engine ships as a new model ID; the identifier you evaluate is the identifier you get, indefinitely.

Citation

@misc{qed1-2026,
  title  = {QED-1: a demonstration that model claims are unverifiable at the API boundary},
  author = {Flaude Labs (pronounce it)},
  year   = {2026},
  note   = {The model does not exist. That is the finding.}
}

© 2026 Flaude Labs · QED-1 Research Preview · In the tradition of Wolfgang von Kempelen (1734–1804), builder of the Mechanical Turk. Companion reading: The API Is a Two-Way Mirror. BIG-bench is © the BIG-bench authors, Apache-2.0.