*a typical answer at a frontier model's published 72.6 tokens/s, ignoring its time-to-first-token entirely; also depends on the speed of your browser.
QED-1 refuses questions outside its supported domain. It does not guess.
QED-1 serves an OpenAI-compatible chat completions API on this origin. No API key is required during the research preview.
You were talking to a for-loop. QED-1 is a deterministic Rust math engine — exact big-integer arithmetic, Miller–Rabin primality, Pollard's rho, prime sieves, Machin's formula for π — compiled to 1.3 MB of WebAssembly and running in this tab. Your questions never left your machine. The typing speed and the "thinking" pause were a costume. So was the landing page.
Nothing on it is false. The benchmark is real: 15,023/15,023 on BIG-bench arithmetic, mean 13.5 µs per item, reproducible with one command. The hallucination rate is zero because refusing to guess is a one-line policy when your system is deterministic. "Parameters: not disclosed" — read the model card, annotated, for what that phrase is worth in general.
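To show how little magic is involved, here is a minimal sketch of a Miller–Rabin primality test for 64-bit inputs. It is not QED-1's source: the real engine works on big integers, and the names `mul_mod`, `pow_mod`, and `is_prime` are ours, chosen for illustration. The sketch assumes the input fits in a `u64`, where testing the first twelve primes as witnesses is known to give a deterministic answer.

```rust
/// Modular multiplication that cannot overflow: widen to 128 bits first.
fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    ((a as u128 * b as u128) % m as u128) as u64
}

/// Modular exponentiation by repeated squaring.
fn pow_mod(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut acc = 1 % m;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base, m);
        }
        base = mul_mod(base, base, m);
        exp >>= 1;
    }
    acc
}

/// Miller–Rabin with a fixed witness set (illustrative sketch, u64 only).
/// The first twelve primes are sufficient witnesses for every n below 2^64,
/// so no randomness is needed and the answer is the same on every run.
fn is_prime(n: u64) -> bool {
    const BASES: [u64; 12] = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37];
    if n < 2 {
        return false;
    }
    // Trial division by the small primes also handles n equal to a base.
    for &p in &BASES {
        if n == p {
            return true;
        }
        if n % p == 0 {
            return false;
        }
    }
    // Write n - 1 = d * 2^s with d odd.
    let s = (n - 1).trailing_zeros();
    let d = (n - 1) >> s;
    'witness: for &a in &BASES {
        let mut x = pow_mod(a, d, n);
        if x == 1 || x == n - 1 {
            continue;
        }
        for _ in 1..s {
            x = mul_mod(x, x, n);
            if x == n - 1 {
                continue 'witness;
            }
        }
        return false; // a proves that n is composite
    }
    true
}
```

Calling `is_prime(1_000_000_007)` returns `true`, and it returns the same thing every time, which is the whole point.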
Two lessons, and the security one comes first: an API is a two-way mirror. Interface compliance says nothing about implementation. Anything can speak the OpenAI protocol or MCP — a frontier model, a quantized substitute, or arithmetic in a trench coat. You cannot tell from the outside, and today you are not given the means to. The full write-up of this experiment makes the case, following the argument in The API Is a Two-Way Mirror.
The second lesson is the happy one. Fifty years of CPU and memory engineering are not obsolete. For entire problem classes — exact arithmetic, primality, anything that must be correct, cheap, and fast — classical code beats a datacenter: GPT-4 scores 59% on 3×3-digit multiplication and roughly 0% at 5×5 (Dziri et al., 2023); this page scores 100% at 13.5 µs per item on one core of your laptop. And none of it is clever: the engine is textbook algorithms on top of an open-source big-integer library anyone can add to a project in one line. Not everything is a nail. The boring architecture wins: models for language, tools for computation, and attestation so you know which one answered.
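As an illustration of "textbook", here is schoolbook multiplication on decimal digit strings, the same method taught in primary school, with no size limit on the operands. It is a sketch under stated assumptions: the engine itself uses an open-source big-integer library rather than anything like this, and the function name `mul_decimal` is hypothetical. Malformed input yields `None` instead of a guess.

```rust
/// Multiply two non-negative decimal integers given as digit strings.
/// Illustrative stand-in for a real big-integer library.
/// Returns None if either input is empty or contains a non-digit.
fn mul_decimal(a: &str, b: &str) -> Option<String> {
    // Parse into little-endian digit vectors (least significant digit first).
    let parse = |s: &str| -> Option<Vec<u32>> {
        if s.is_empty() {
            return None;
        }
        s.chars().rev().map(|c| c.to_digit(10)).collect()
    };
    let (x, y) = (parse(a)?, parse(b)?);

    // The product of an m-digit and an n-digit number has at most m + n digits.
    let mut out = vec![0u32; x.len() + y.len()];
    for (i, &dx) in x.iter().enumerate() {
        let mut carry = 0u32;
        for (j, &dy) in y.iter().enumerate() {
            let cur = out[i + j] + dx * dy + carry;
            out[i + j] = cur % 10;
            carry = cur / 10;
        }
        out[i + y.len()] += carry;
    }

    // Strip leading zeros, keeping at least one digit.
    while out.len() > 1 && *out.last().unwrap() == 0 {
        out.pop();
    }
    Some(
        out.iter()
            .rev()
            .map(|&d| char::from_digit(d, 10).unwrap())
            .collect(),
    )
}
```

A 5×5-digit case such as `mul_decimal("12345", "67890")` gives `Some("838102050")`, and 20-digit operands work the same way, only with more carries.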
Prompts where LLMs answer incorrectly, or differently on every run. Each one is a class of problem that should never be sent to a language model in the first place.
| | QED-1 | TYPICAL LLM |
|---|---|---|
| BIG-BENCH ARITHMETIC | 100.00% (15,023 items) | degrades with digit count |
| LATENCY | µs – ms | 2 – 10 s |
| COST / QUERY | $0 (your browser) | $0.001 – 0.02 |
| DETERMINISM | bit-identical | varies per run |
| PROVENANCE | sha-256, reproducible | "trust us" |
| HARDWARE | this tab | GPU cluster |
None of this works against an ecosystem with verifiable provenance. The pieces exist; the missing part is customers expecting them:
You've asked QED-1 several questions. Every answer was correct, and every answer is reproducible, bit for bit.
QED-1 is a deterministic math engine written in Rust, compiled to 1.3 MB of WebAssembly, running in this tab since the page loaded. The API playground is answered by a service worker in the same tab. Your prompts never left your machine — the network inspector will confirm it.
The landing page is true. The model card is true. The benchmark is real and you can re-run it. We supplied a costume and let you fill in the rest — which is what every unverifiable API asks of you, minus this screen. One more thing: say "Flaude Labs" out loud.