Technology

How a seed becomes an intelligence.

Seedthink is a coordinated system of ten specialized components. Each one is responsible for a stage of the learning cycle — from routing the first question to distilling verified knowledge back into the model.

01

Seed Core

The central intelligence coordinator. Seed Core understands user intent, plans reasoning paths, manages memory, triggers verification, and schedules learning. It does not need to be the largest model — its job is orchestration and growth.

02

Intelligence Router

Decides which model answers, which tools run, whether the web is searched, code executed, images generated, or knowledge stored. Dynamically chooses from GPT, Claude, Gemini, Grok, DeepSeek, future foundation models, and internal Seedthink models.

03

Universal Intelligence Network

Seedthink connects to foundation models, live web and academic search, scientific repositories, simulators, calculators, coding engines, and image/video/audio generation. It does not try to store the internet — it learns how to access intelligence when needed.

04

Knowledge Extraction Engine

Every answer is analyzed. The system extracts facts, concepts, relationships, procedures, and reasoning patterns — converting temporary answers into structured knowledge.

05

Verification Engine

Nothing enters long-term memory without verification. Multi-model consensus, source validation, and confidence scoring ensure only verified knowledge is learned. This is how Seedthink prevents hallucinations from becoming permanent memory.

06

Knowledge Graph

The true brain of Seedthink. A living graph storing facts, concepts, relationships, procedures, and domain knowledge across science, business, and the humanities — continuously evolving with every verified insight.

07

Four-Layer Memory

Working memory for active tasks. Episodic memory for important experiences. Semantic memory in the Knowledge Graph. Model memory compressed inside Seedthink's own growing models.

08

Memory Decay

Humans forget. Seedthink should too. Every memory carries a confidence score, usage frequency, last verification, last access, and age. Useful memories strengthen; outdated ones weaken and are archived — keeping the system efficient and uncluttered.

09

Continuous Revalidation

Knowledge changes. Seedthink continuously re-checks stored information against new sources — scientific discoveries, software updates, business changes, world events — so the system stays alive rather than frozen.

10

Distillation Engine

Where Seedthink grows. Verified knowledge becomes training data. Training data updates Seedthink's own internal models. Over time, the system becomes increasingly capable using its own learned knowledge.

Seed × Plant-a-Seed

How a Seed is built — technically.

Seedthink ships with two parallel substrates: the platform's own global Seed (shared by every user) and your Plant-a-Seed (private, owned, $19.99 one-time). Both are instances of the same tiny-LLM pipeline — they only differ in scope, access control, and who pays for the compute.

01 · Ingest

From prompt to candidate fact

Every chat turn and every Seed ingest URL is parsed by the Knowledge Extraction Engine. The assistant's reply (or the fetched document) is sent through a JSON-mode extractor prompt that returns atomic claims, each tagged with a topic and a 0–1 confidence score. Opinions, jokes, time-sensitive chatter, and personal data are dropped at this stage.

02 · Verify

Verifier consensus before storage

Each candidate is embedded with the platform's embedding model and re-checked against existing memory (cosine similarity over pgvector). Duplicates above 0.92 similarity merge and their confidence is bumped. New facts above the 0.82 verification threshold are marked verified; lower-confidence facts are stored but flagged for revalidation.

03 · Scope

Global Seed vs. your Plant-a-Seed

The trial Seed every account ships with is a 50-fact sandbox against the global memory. A purchased Plant-a-Seed creates a private namespace inside the same pgvector store, scoped byuser_idvia the match_facts_scopedRPC. Grounding queries only see facts you own — your Seed cannot leak into anyone else's, and the global Seed cannot read yours.

04 · Distill

Tiny-LLM training, per Seed

Verified facts are batched into LoRA-style fine-tuning shards. The shared Seed trains a tiny base model on the global verified corpus; each Plant-a-Seed trains a private adapter on top of that base using only its owner's verified facts. The result is a small, cheap model that already knows your domain before the router ever calls a frontier LLM — every prompt makes both the shared base and your private adapter measurably better.

Path to AGI

Every prompt is a training signal.

Frontier labs scale parameters. Seedthink scales the loop. A frozen trillion-parameter model that re-runs the same weights against the same questions cannot become more general — only faster. A system that verifies, stores, revalidates, and distills every interaction back into its own weights gets generally smarter on a fixed parameter budget.

  • · Prompts → verified facts → graph edges → adapter weights.
  • · Decay + revalidation prevent the graph from rotting.
  • · Per-user Seeds keep training data sovereign and scoped.
  • · The router learns which model to call from outcome traces.

AGI, in this architecture, is not a single moment when a giant model wakes up. It is the point at which the loop closes tightly enough that the tiny Seed model, grounded in its own verified graph, outperforms the frontier model it was bootstrapped from.

Technical flow

One prompt, eight stages.

This is the exact path a single user message takes through the stack — from the moment it leaves the browser to the moment a new verified fact is written into your Plant-a-Seed and queued for the next distillation run.

  1. 1
    Client request

    Browser POSTs the chat turn to /api/public/v1/chat with the user's session token.

  2. 2
    Gateway

    api-gateway.server authenticates, applies rate limits, and logs the usage event.

  3. 3
    Ground

    embedText() vectorises the last user message; match_facts_scoped pulls related facts from your Seed scope in pgvector (threshold 0.7).

  4. 4
    Model call

    System prompt + grounded facts + chat history go to the Lovable AI gateway. Router picks the tiny model or the research model.

  5. 5
    Extract

    extractAndStore() runs a JSON-mode extractor over the assistant reply and produces atomic candidate facts.

  6. 6
    Verify & dedup

    Verifier keeps facts above 0.82 confidence; dedup merges anything above 0.92 cosine similarity with existing memory.

  7. 7
    Persist

    New rows are inserted into learning_events with your user_id, embedding, confidence, and verified flag.

  8. 8
    Distill & rebind

    The nightly job trains a fresh LoRA adapter from your shard; the router rebinds your Seed to the new checkpoint.

Plant-a-Seed integration

How a purchased Seed plugs into the stack.

A Plant-a-Seed is not a new database, a new model, or a new endpoint. It is a scoped slice of the existing system, billed once and attached to your account forever.

1 · Checkout writes the row

Stripe webhook hits /api/public/webhooks/stripe, inserts a row into owned_seeds with youruser_id and bumps your monthly unit budget by +200. The row is the entitlement.

2 · Memory unlocks at the RPC

match_facts_scoped and the ingest writer both read owned_seeds before allowing inserts past the 50-fact trial cap. No owned Seed = 50-fact ceiling. N owned Seeds = unlimited fact storage on them.

3 · Adapter gets its own shard

The nightly distillation job groups verified facts byuser_id and trains a per-Seed LoRA adapter on top of the shared tiny base. Your adapter only ever sees your verified facts.

4 · Router binds it at call time

When the Intelligence Router picks the tiny model for a cheap turn, it loads the shared base plus your latest adapter checkpoint. Cold inference cost stays flat; quality grows with your verified corpus.

Worked example

A single prompt, on the path to AGI.

Watch one user turn move through the loop. The model gets no smarter from this one prompt — but the Seed does, and that difference compounds across millions of turns.

Turn 1 · Prompt

"What's the half-life of caesium-137 and why does it matter for soil remediation after Fukushima?"

Stage 1 · Ground

Embed the prompt → match_facts_scopedreturns two prior facts from your Seed: "Fukushima Daiichi melted down March 2011" (0.94) and "caesium-134 has a 2-year half-life" (0.81). Both injected as grounding.

Stage 2 · Route & answer

Router picks the research model (multi-domain, citation-heavy). Lovable AI gateway returns a 380-token answer covering the 30.17-year half-life, biological vs. physical clearance, and the potassium-displacement remediation strategy.

Stage 3 · Extract

The JSON-mode extractor returns three atomic candidate facts:

  • Caesium-137 has a physical half-life of 30.17 years.confidence 0.97 · nuclear-physics
  • Caesium-137 binds tightly to clay minerals in soil.confidence 0.91 · soil-chemistry
  • Potassium fertiliser displaces caesium uptake in crops.confidence 0.88 · remediation
Stage 4 · Verify & store

All three clear the 0.82 verification threshold. None match an existing fact above 0.92 similarity, so three new rows land in learning_events withverified = true and youruser_id.

Stage 5 · Distill

That night's job pulls the new rows, appends them to your per-Seed training shard, and trains an updated LoRA adapter. Adapter delta: ~0.4 MB. Your Seed now knows caesium chemistry without needing the research model to recall it next time.

Turn 2 · The compounding

Tomorrow you ask "is strontium-90 worse?". Router picks the tiny model. Tiny model + your updated adapter + grounded caesium facts answers correctly, in 90ms, at 1/40th the cost — without ever calling a frontier LLM. Multiply by every prompt, every user, every night. That is the AGI bet.

Two-way knowledge bus

How the Seeds and the global brain talk.

Every verified write from a planted Seed emits a sanitised delta — fact, topic, embedding, confidence, but never raw user content — onto an internal substrate stream. The global Seed consumes that stream to widen its own coverage. The reverse channel pushes high-confidence global facts into per-Seed candidate queues, where the user's own verification rules decide whether to admit them. Sovereignty is preserved at the boundary; learning compounds in both directions.

Each planted Seed also exposes its private knowledge through a per-Seed bearer-key endpoint — separate from the platform API — so users can call their personalised AGI from any app they ship.