How Seedthink Learns

The intelligence that grows with every interaction.

A Seedthink Seed is not a chat wrapper around a frozen model. It is a tiny distilled LLM, a verified knowledge graph, and a learning loop that turns each verified answer into permanent, cited, owned intelligence.

01 — Anatomy

What a Seed is made of

Three layers, kept deliberately separate. The model is small and replaceable. The graph is durable and exportable. Retrieval is the bridge that keeps the model honest.

Tiny LLM core

A compact, distilled language model — the Seed's voice. Holds reasoning shortcuts, tone, and stable patterns learned from its owner. Small on purpose: fast, portable, fine-tunable, owned.

Verified knowledge graph

Structured long-term memory. Entities, relations, procedures, citations, and confidence scores. Every node has provenance — who said it, when, from which source, with which model.
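A graph node with provenance might be shaped roughly like this. This is a minimal sketch, not the Seedthink schema: the class names, field names, and the 0-to-1 confidence range are assumptions made for illustration, and the sample values are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    said_by: str        # who said it
    said_at: datetime   # when
    source: str         # from which source
    model: str          # with which model


@dataclass
class GraphNode:
    subject: str
    relation: str
    obj: str
    confidence: float   # assumed to lie in [0.0, 1.0]
    provenance: Provenance

    def __post_init__(self):
        # Reject scores outside the assumed range at construction time.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


node = GraphNode(
    subject="2446 SN",
    relation="uses",
    obj="Valjoux 72",
    confidence=0.8,  # illustrative value
    provenance=Provenance(
        said_by="owner",
        said_at=datetime(2024, 1, 1, tzinfo=timezone.utc),  # placeholder date
        source="forum thread (placeholder)",
        model="model-a (placeholder)",
    ),
)
```

Keeping provenance as its own immutable record means a fact can be re-scored later without rewriting where it came from.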

Working memory + retrieval

Recent turns, pinned context, and hybrid retrieval (vector + graph) that pull only the facts relevant to the current prompt into the model's context window.
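Hybrid retrieval can be pictured as a weighted blend of a vector score and a graph score. The sketch below is one possible reading, with everything in it assumed: the `alpha` weight, the one-hop graph bonus of 0.5, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(query_vec, query_entities, facts, edges, alpha=0.7, top_k=2):
    """Rank facts by alpha * vector score + (1 - alpha) * graph score.

    facts: {fact_id: (embedding, entity)}
    edges: {entity: set of directly linked entities}
    """
    scored = []
    for fact_id, (embedding, entity) in facts.items():
        vector_score = cosine(query_vec, embedding)
        if entity in query_entities:
            graph_score = 1.0   # the prompt names this entity directly
        elif any(entity in edges.get(q, set()) for q in query_entities):
            graph_score = 0.5   # one hop away from a named entity
        else:
            graph_score = 0.0
        scored.append((alpha * vector_score + (1 - alpha) * graph_score, fact_id))
    scored.sort(reverse=True)
    return [fact_id for _, fact_id in scored[:top_k]]


facts = {
    "f1": ([1.0, 0.0], "2446 SN"),
    "f2": ([0.0, 1.0], "Valjoux 72"),
    "f3": ([0.0, 1.0], "sourdough"),
}
edges = {"2446 SN": {"Valjoux 72"}}
top = hybrid_rank([1.0, 0.0], {"2446 SN"}, facts, edges)
```

Here `f2` outranks `f3` even though their embeddings are identical, because the graph links it to an entity in the prompt.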

        ┌──────────────────────────────────────────────────────┐
        │  TINY LLM CORE          voice · reasoning · style    │
        ├──────────────────────────────────────────────────────┤
        │  RETRIEVAL              vector + graph · ranked      │
        ├──────────────────────────────────────────────────────┤
        │  VERIFIED KNOWLEDGE     facts · relations · sources  │
        │  GRAPH                  confidence · owner rulings   │
        └──────────────────────────────────────────────────────┘

02 — The learning loop

One prompt → one verified fact

Every chat turn passes through the same seven-stage pipeline. The first three stages produce and check an answer. The last four decide what the Seed remembers forever.

  Prompt
     │
     ▼
  ROUTE ──► REASON ──► VERIFY ──► EXTRACT ──► SCORE ──► PERSIST ──► DISTILL
                          │                      │           │           │
                       (drop if           (resolve         (owner      (re-train
                        unverified)       contradiction)  ruling)     adapter)

01

Route

The Intelligence Router inspects the prompt and chooses the right foundation model (GPT, Claude, Gemini, or an internal Seedthink model) and tools (web search, code execution, image, calculators).
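How the router decides is not specified here. As a rough illustration only, a rule-based version could look like the sketch below; the keyword lists, the model labels, and the `RoutePlan` type are all invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RoutePlan:
    models: list
    tools: list = field(default_factory=list)


def route(prompt: str) -> RoutePlan:
    """Pick tools from simple keyword rules, then pick models."""
    text = prompt.lower()
    tools = []
    if any(word in text for word in ("latest", "today", "price")):
        tools.append("web_search")
    if any(word in text for word in ("traceback", "stack trace", "run this")):
        tools.append("code_execution")
    has_digit = any(ch.isdigit() for ch in text)
    if has_digit and any(op in text for op in (" + ", " * ", " / ")):
        tools.append("calculator")
    # Assumption: the Seed's own model always takes part, and an external
    # model is added only when a tool is needed.
    models = ["internal-seed-model"]
    if tools:
        models.append("external-frontier-model")
    return RoutePlan(models=models, tools=tools)


plan = route("What is the latest price of a 2447 SN?")
```

A production router would more plausibly use a classifier than keyword rules; the point is only that routing yields a plan of models plus tools.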

02

Reason

Seed Core composes an answer using the tiny LLM, the owner's working memory, and graph + vector retrieval from the Seed's verified knowledge.

03

Verify

Multi-model consensus, source cross-check and a confidence score. Anything below threshold is shown to the user but never persisted as fact.
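The gate can be sketched as a function that combines model agreement with a source check. The scoring formula and the 0.7 threshold below are assumptions; the text above only says that a threshold exists.

```python
from collections import Counter


def verify(answers, cited_sources, threshold=0.7):
    """Return (answer, confidence, persistable).

    answers: one normalized answer string per model consulted.
    cited_sources: sources that back the majority answer.
    """
    if not answers:
        return None, 0.0, False
    top_answer, votes = Counter(answers).most_common(1)[0]
    agreement = votes / len(answers)
    # Assumption: an unsourced answer gets zero confidence, so it can be
    # shown to the user but is never persisted.
    source_factor = 1.0 if cited_sources else 0.0
    confidence = agreement * source_factor
    return top_answer, confidence, confidence >= threshold


agreed = verify(["yes", "yes"], ["forum post"])
split = verify(["yes", "no", "yes"], ["forum post"])
unsourced = verify(["yes", "yes"], [])
```

In this sketch a two-to-one split falls below the threshold, so the majority answer is still returned for display but is flagged as not persistable.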

04

Extract

Verified answers are decomposed into atomic claims: entities, relations, procedures, definitions, numeric ranges — each tagged with the citation that proves it.

05

Score

Each candidate fact is scored for novelty, source quality, and contradiction against existing graph nodes. Contradictions surface to the owner for a ruling.
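A toy version of this scoring step is sketched below. It assumes, for simplicity, that each (subject, relation) pair holds a single value, so any different value counts as a contradiction; real relations can be multi-valued, and the dictionary-based graph is a stand-in.

```python
def score_candidate(candidate, graph, source_quality):
    """Score one (subject, relation, object) claim against the graph.

    graph: {(subject, relation): object}, a simplified single-valued store.
    source_quality: a number supplied by the caller, assumed in [0, 1].
    """
    subject, relation, obj = candidate
    existing = graph.get((subject, relation))
    return {
        "novelty": 1.0 if existing is None else 0.0,
        "source_quality": source_quality,
        # A different value for the same key is surfaced to the owner.
        "needs_owner_ruling": existing is not None and existing != obj,
    }


graph = {("2446 SN", "uses"): "Valjoux 72"}
new_fact = score_candidate(("2446 SN", "produced_in", "1962-1964"), graph, 0.6)
conflict = score_candidate(("2446 SN", "uses", "some other movement"), graph, 0.6)
repeat = score_candidate(("2446 SN", "uses", "Valjoux 72"), graph, 0.6)
```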

06

Persist

Surviving facts are written to the Seed's knowledge graph with full provenance and a confidence score. This is what counts against the Seed's verified-entry capacity.
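One way to picture capacity-limited persistence is a store that refuses new entries once it is full. The class below is a hypothetical sketch; how Seedthink actually enforces capacity is not described here, and updating an existing entry is assumed not to consume extra capacity.

```python
class SeedGraph:
    """In-memory stand-in for a Seed's verified-entry store."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}

    def persist(self, key, value, provenance, confidence):
        """Write a fact; return False if the Seed is at capacity."""
        is_new = key not in self.entries
        if is_new and len(self.entries) >= self.capacity:
            return False
        self.entries[key] = {
            "value": value,
            "provenance": provenance,
            "confidence": confidence,
        }
        return True


seed = SeedGraph(capacity=1)
first = seed.persist(("2446 SN", "uses"), "Valjoux 72", "forum post", 0.8)
second = seed.persist(("2446 SN", "produced_in"), "1962-1964", "forum post", 0.7)
update = seed.persist(("2446 SN", "uses"), "Valjoux 72", "owner ruling", 0.95)
```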

07

Distill

Periodically, the verified graph is replayed into the tiny LLM as instruction-tuned training pairs (LoRA-style adapters). The Seed doesn't just retrieve what it learned — it internalises it.
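The replay step can be sketched as turning graph facts into instruction and output pairs, written one JSON object per line. The question templates, the field names, and the 0.8 confidence cutoff are assumptions; the adapter training itself is out of scope for this sketch.

```python
import json

# Hypothetical question templates, one per relation type.
TEMPLATES = {
    "uses": "Which movement does the {subject} use?",
    "produced_in": "When was the {subject} produced?",
}


def to_training_pairs(facts, min_confidence=0.8):
    """Convert sufficiently confident facts into instruction-tuning pairs."""
    pairs = []
    for fact in facts:
        template = TEMPLATES.get(fact["relation"])
        if template is None or fact["confidence"] < min_confidence:
            continue  # skip unknown relations and low-confidence facts
        pairs.append({
            "instruction": template.format(subject=fact["subject"]),
            "output": f"{fact['object']} (source: {fact['source']})",
        })
    return pairs


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines, a common fine-tuning input format."""
    return "\n".join(json.dumps(pair) for pair in pairs)


facts = [
    {"subject": "2446 SN", "relation": "uses", "object": "Valjoux 72",
     "confidence": 0.9, "source": "forum post"},
    {"subject": "2446 SN", "relation": "produced_in", "object": "1962-1964",
     "confidence": 0.5, "source": "forum post"},
]
pairs = to_training_pairs(facts)
```

Filtering on confidence before training keeps shaky facts in the graph, where they can still be revised, rather than in the weights.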

03 — The tiny LLM

Why small models beat big ones for a personal Seed

Frontier models are powerful, expensive, frozen, and rented. A Seed needs the opposite: fast enough to run anywhere, cheap enough to retrain weekly, small enough to own, and structured so that what it learns is also kept outside its weights.

  • Latency. A distilled model answers in tens of milliseconds — the Seed feels like a thought, not a request.
  • Ownership. Small weights are portable, exportable, and runnable on commodity hardware. Your Seed is an asset, not a tenancy.
  • Fine-tunability. LoRA-style adapters can be trained from a Seed's own verified graph in minutes, not weeks. Cadence is daily or weekly, not yearly.
  • Eval-gated promotion. Every new adapter is scored against a held-out eval set drawn from the Seed's own domain. Regressions are rolled back automatically.
  • No catastrophic forgetting. The graph is the source of truth. When the model drifts, it is re-distilled from the graph, so knowledge stored in the graph is not lost when the weights change. It is only re-baked.
  • What lives where. Stable patterns and tone live in the weights. Volatile facts (prices, dates, citations) live in the graph with provenance. They are kept separate on purpose.
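Eval-gated promotion from the list above can be sketched as a small registry that only activates an adapter whose held-out score does not regress. The class and its fields are hypothetical, and "rollback" is modelled simply as never activating the failing candidate.

```python
class AdapterRegistry:
    """Tracks the active adapter's eval score and generation number."""

    def __init__(self, baseline_score):
        self.generation = 0
        self.active_score = baseline_score
        self.history = []  # (candidate_score, promoted) for auditing

    def submit(self, candidate_score):
        """Promote the candidate unless it regresses; return True if promoted."""
        promoted = candidate_score >= self.active_score
        self.history.append((candidate_score, promoted))
        if promoted:
            self.generation += 1
            self.active_score = candidate_score
        return promoted


registry = AdapterRegistry(baseline_score=0.80)
accepted = registry.submit(0.85)   # improves on the baseline
rejected = registry.submit(0.82)   # regresses against the new active score
```

After both submissions the registry is still at generation 1 with an active score of 0.85.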

04 — Worked example

A single chat turn, end to end

A Seed dedicated to 1960s Heuer chronographs. The owner asks: "Did the reference 2447 SN ever ship with a movement other than the Valjoux 72?" Here is what changes.

ROUTE     → Intelligence Router picks 2 reasoning models + web search.
REASON    → Seed Core pulls 7 graph nodes about "Autavia 2446 / 2447".
VERIFY    → Both models agree on the timeline; 1 cited forum source confirms.
EXTRACT   → 1 new entity      : "Autavia 2446 SN (early dial variant)"
            2 new relations   : [2446 SN] —uses→ [Valjoux 72]
                                [2446 SN] —produced_in→ [1962–1964]
            1 updated node    : confidence on "Valjoux 72 in 2447 SN" +0.12
            1 contradiction   : owner ruling requested re: 1965 transition
PERSIST   → 3 nodes written, 1 queued for ruling, full provenance attached.
DISTILL   → Queued for next nightly LoRA pass; eval set updated.

The next time anyone — owner or Seedthink network — asks about the 2446 SN, the answer is cited, scored, and reproducible. The Seed didn't get "more tokens." It got more structure.

05 — Seed ↔ Seedthink

The two-way street

Every Seed is private by default. But verified, non-personal facts can flow outward to the global Seedthink brain — and global verified knowledge can flow back into your Seed. Both directions are opt-in, both are scored, and both are auditable.

Seed → Seedthink

When a Seed verifies a non-personal fact (a watch reference, a sourdough hydration ratio, a race-track tire wear curve), that fact is offered to the global graph. Owner-verified facts carry more weight; the owner earns contribution score, capacity, and revenue share.

Seedthink → Seed

When the global brain learns something new in your Seed's domain, your Seed can absorb it — with the same verification gate. Your personal voice and rulings stay yours; the world's verified knowledge keeps your Seed current.

What never leaves the Seed

  • Personal identifiers, names, addresses, contacts
  • Private documents, photos, audio, video you ingested
  • Your conversations, prompts, and edit history
  • Your owner rulings on contradictions
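The outbound rule can be sketched as a filter that runs before anything is offered to the global graph. The `kind` labels and the `verified` flag are assumptions; how a fact gets classified as personal is not covered here.

```python
# Hypothetical labels for the categories that never leave the Seed.
PRIVATE_KINDS = {
    "personal_identifier",
    "private_document",
    "conversation",
    "owner_ruling",
}


def shareable(facts, opted_in):
    """Return the facts that may be offered to the global graph."""
    if not opted_in:
        return []  # sharing is opt-in, so the default is to share nothing
    return [
        fact for fact in facts
        if fact["verified"] and fact["kind"] not in PRIVATE_KINDS
    ]


facts = [
    {"id": 1, "kind": "domain_fact", "verified": True},
    {"id": 2, "kind": "owner_ruling", "verified": True},
    {"id": 3, "kind": "domain_fact", "verified": False},
]
outbound = shareable(facts, opted_in=True)
```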

06 — Verification

Why hallucinations don't compound

Multi-model consensus

Two or more models, often from different families, must agree on the claim.

Source cross-check

Citations are required and checked. Unsourced claims never enter the graph.

Confidence scoring

Every fact carries a confidence value updated as evidence accrues or contradicts.

Owner-in-the-loop

Contradictions queue a ruling. The owner's ruling becomes a high-confidence fact.
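One simple way to update a confidence value as evidence accrues is to move it a fraction of the way toward 1 for support and toward 0 for contradiction. This update rule, the `weight` parameter, and the 0.99 value for owner rulings are all assumptions, not the documented Seedthink formula.

```python
OWNER_RULING_CONFIDENCE = 0.99  # assumed value for "high-confidence"


def update_confidence(confidence, supports, weight):
    """Nudge confidence toward 1.0 (support) or 0.0 (contradiction).

    weight: strength of the new evidence, in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if supports:
        return confidence + weight * (1.0 - confidence)
    return confidence * (1.0 - weight)


raised = update_confidence(0.6, supports=True, weight=0.3)    # 0.6 + 0.3 * 0.4
lowered = update_confidence(0.6, supports=False, weight=0.5)  # 0.6 * 0.5
```

Because each step is proportional to the remaining distance, the value stays inside the 0-to-1 range however much evidence arrives.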

07 — Growth, measured

What "the intelligence that grows" actually means

Growth is not a vibe. Every Seed reports four measurable signals — and the Seedthink ownership model is built around the first one.

Verified entries

Free: 50. Owned: 100. Growth: +400 every month.

Domain coverage

Percent of the owner's declared niche the graph can answer with cited facts.

Reasoning eval

Held-out questions in the Seed's domain. Score must not regress between distillations.

Distillation gen.

Adapter generation number. Higher = more of the graph has been baked into the weights.
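Of the four signals, domain coverage is the easiest to show as arithmetic. The sketch below assumes the declared niche can be broken into a list of topics and that the Seed can report which topics it answers with cited facts; both are simplifications.

```python
def domain_coverage(declared_topics, cited_topics):
    """Percent of declared topics the graph can answer with cited facts."""
    declared = set(declared_topics)
    if not declared:
        return 0.0  # nothing declared, so there is nothing to cover
    covered = declared & set(cited_topics)
    return 100.0 * len(covered) / len(declared)


coverage = domain_coverage(
    ["movements", "dials", "cases", "production years"],
    ["movements", "production years", "straps"],  # "straps" is not declared
)
```

Topics outside the declared niche are ignored, so the figure cannot exceed 100.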

08 — Comparison

Why this isn't just RAG with extra steps

| | Stateless chatbot | Vector RAG | Seedthink Seed |
|---|---|---|---|
| Memory model | None: every session starts blank | Vector store of raw chunks | Verified knowledge graph with provenance + a distilled LLM core |
| What enters memory | Nothing persists | Every uploaded document, verbatim | Only facts that pass multi-model verification and owner rulings |
| Hallucination control | Per-turn only | Bounded by retrieved chunks | Verification gate + contradiction resolution before persistence |
| Learning over time | Static weights | Bigger index, same model | Graph grows AND model is re-distilled from the graph |
| Ownership | Provider-controlled | Index you host, model you don't | Owned Seed: weights + graph + export + API |

Plant a Seed. Watch it learn. Own the intelligence it becomes.

Every chat is a training signal. Every verified fact is permanent. Every distillation makes the Seed sharper at one thing in the world.