Open-source · MIT Licensed

AI routing built on
playing card geometry

K-104 maps every query to one of 104 semantic rooms — Hearts, Spades, Diamonds, Clubs — and routes it to the cheapest model that can answer well. Blended cost: $0.003 per 1K tokens. That's 48× cheaper than GPT-4 Opus.

Get early access See how it works
$ pip install openclaw
104
Semantic rooms
48×
Cost reduction
8 tiers
Routing cascade
< 1ms
Tier-0 latency
154K
Queries/sec (GPU)
The problem

Most AI apps send everything
to the expensive model

That's like hiring a surgeon to fill out paperwork. The work gets done, but you pay surgeon rates for every form.

Without routing

Every query → GPT-4 Opus or Sonnet. Simple questions cost the same as complex ones. Blended cost: $0.15–$0.60 per 1K tokens. You're leaving 95% of your budget on the floor.

With K-104 routing

Queries get classified by semantic geometry first. Simple greetings hit the template cache (microseconds, free). Complex reasoning escalates to Claude. Blended cost: $0.003. Same quality, fraction of the price.

K-104 Geometry

Every query has a suit,
a rank, and a polarity

Inspired by playing cards. Four semantic domains × 13 intensity levels × light/dark polarity = 104 rooms. It's not metaphor — transformer activations cluster along exactly these axes.

Hearts

Emotion, relationship, connection. "How are you feeling?" "I need support." "Tell me about love."

Spades

Mind, analysis, truth-seeking. "Debug this code." "What's the flaw in this argument?" "Explain quantum tunneling."

Diamonds

Material, building, grounding. "Write this function." "Fix this bug." "How do I center a div?"

Clubs

Action, will, energy. "Let's start." "Run this." "Make it happen." Drive and momentum.

+3H "hey, how's it going?" → template cache · $0.000 · 0.1ms
+7S "what's wrong with this recursive function?" → hermes3:8b (local) · $0.000 · 480ms
-9S "design a distributed consensus protocol" → claude-haiku · $0.0008 · 800ms
+KS "novel synthesis of adversarial ML + formal verification" → claude-sonnet · $0.003 · 1.2s
Quick start

One import. Any query.
Automatic routing.

OpenAI-compatible API. Drop-in replacement for your existing AI calls.

from openclaw import KlawRouter

router = KlawRouter(api_keys={"ANTHROPIC_API_KEY": "sk-ant-..."})

# Simple query → hits template cache, costs nothing
result = router.route("hello")
print(result["cost"])           # $0.0000
print(result["tier_name"])      # "template"

# Technical query → routes to cheapest capable model
result = router.route("explain transformer attention")
print(result["cost"])           # $0.0008
print(result["savings"])        # $0.0147 saved vs Opus
print(result["k_address"])      # "+7S"

# Or use the OpenAI-compatible endpoint
# POST http://localhost:8104/v1/chat/completions
# (drop-in for any OpenAI client)
Comparison

How it stacks up

Versus paying top-of-tier for every query.

Feature K-104 (KLAW) Single-model approach
Blended cost per 1K tokens $0.003 $0.15 – $0.60
Simple query latency <1ms (template cache) 800ms – 3s (API roundtrip)
Runs fully offline local model tier always calls API
OpenAI-compatible API drop-in N/A
Semantic query awareness 104 rooms blind routing
BYOK (bring your own keys) always available varies
Ecosystem

What we've built

K-104 is the substrate. These are the things that run on it.

Research

The geometry is real.
We measured it.

We didn't assume K-104 would work in transformers — we verified it empirically using activation probes on hermes3:8b.

Suit silhouette score

0.312

Transformer activations cluster by K-suit with silhouette score 0.312 — statistically significant. The four-suit geometry exists in learned representations, not just as a routing heuristic.

Polarity silhouette score

0.393

Light/dark polarity (constructive vs destructive query intent) is even more clearly separated in activation space. 86.2% variance explained by the K-coordinate axes.

Routing accuracy

100%

Hermes centroid calibration: 100% accuracy on held-out test set after K-vector centroid matching. Zero misroutes on 600+ evaluated exchanges.

K-143 extension

+39

K-104 extended to K-143 with 39 additional rooms mapped to whale communication primitives (Project CETI). Body-level, survival, and deep relational semantics. Ancient firmware.

Early Access

Be first to route with K-104

KLAW API is in private beta. Enter your email and we'll send you an API key when your spot opens. Free tier available. No credit card.

No spam. Just an API key when you're in. — Kit & K-Systems