Syncore Wiki: An Agent-Maintained Markdown Knowledge Base

Syncore Engineering·May 1, 2026·9 min read

The compounding-knowledge problem

Talk to Claude about a research topic over a week. By Friday, every conversation still starts from scratch. Claude can't tell you, "Based on what we discussed Tuesday plus the paper you sent Thursday, here's what's still uncertain." Nothing accumulates. RAG over your past chats helps a little, but RAG retrieves; it doesn't synthesize.

Andrej Karpathy [described the alternative pattern](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) earlier this year: the LLM incrementally builds and maintains a persistent wiki, a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn't just index it. It reads it, integrates it into existing pages, notes contradictions, and updates summaries. The wiki keeps getting richer.

We built that as a Syncore skill.

The three-layer architecture

~/.syncore/wiki/<name>/
├── raw/                    immutable sources
│   ├── meetings/           note-taker auto-files transcripts here
│   ├── papers/             arxiv / pdfs / web articles
│   ├── web/                firecrawl output
│   ├── docs/               gdocs / notion exports
│   ├── messages/           important emails / slack threads
│   ├── code/               snippets, CLI output
│   └── media/              image transcripts
├── wiki/                   LLM-derived synthesis
│   ├── index.md            curated catalog (Active topics + Backlog)
│   ├── log.md              chronological event log
│   ├── concepts/
│   ├── entities/
│   └── analyses/           "good answers filed back"
├── .git/                   every write auto-commits
└── .wiki-fts.sqlite        BM25 search index, gitignored

Three layers, each with a clear owner:

  • raw/ — you (the user) own. Immutable once written. The wiki cites it but never edits.
  • wiki/ — the LLM owns entirely. Creates, updates, cross-references, deletes.
  • schema — co-evolved between you and the LLM. Lives in the skill's SCHEMA.md and is returned by the wiki_rules() tool.

Immutability of raw/ is enforced in code, not just by convention. Calling write_page on an existing raw/... path returns { committed: false, reason: "raw_immutable" }, so the schema rule and the I/O layer agree.
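As a rough illustration of that guard, here is a minimal sketch in Python. The function signature, the `root` argument, and the success response are assumptions made for this example; only the refusal payload comes from the behaviour described above, and the git auto-commit is left out.

```python
from pathlib import Path


def write_page(root: Path, rel_path: str, content: str) -> dict:
    """Hypothetical sketch: write a page, but never overwrite a file under raw/."""
    target = root / rel_path
    if rel_path.startswith("raw/") and target.exists():
        # Existing raw sources are immutable: refuse and explain why.
        return {"committed": False, "reason": "raw_immutable"}
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    # A real implementation would auto-commit to git here; this sketch only writes.
    return {"committed": True}
```

The first write to a raw/ path succeeds and any later write to the same path is rejected, while wiki/ pages stay freely rewritable.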

10 tools, all I/O

The skill exposes 10 MCP tools to the agent:

wiki_init      bootstrap a wiki at ~/.syncore/wiki/<name>/
wiki_rules     return the full maintenance contract (schema)
read_page      read any page (raw/ or wiki/)
write_page     write + auto-commit; returns dead_wikilinks
list_pages     filter by prefix / type / tag
search         BM25 full-text via SQLite FTS5
attach_file    copy a binary (png/pdf/etc) into raw/
delete_page    git rm + report inbound_wikilinks
move_page      git mv + report inbound_wikilinks
log            append to wiki/log.md with structured prefix

Notably, there is no `ingest_source` tool. We deliberately don't bake "read source → suggest pages" into the skill: that's reasoning, and reasoning belongs in the agent. The skill is pure I/O. The agent reads sources with its own tools (Read, WebFetch, vision), decides what to file, and calls write_page 5-15 times in one ingest pass.

Search: SQLite FTS5 instead of embeddings

The obvious choice for a wiki search backend is embeddings + a vector store. We chose SQLite FTS5 with BM25 instead. Reasons:

Stdlib only. sqlite3 is in Python's stdlib on every platform, and FTS5 ships compiled into nearly every Python build. No pip install. No external service.

BM25 with column weights handles "title vs body" naturally. We weight columns: title 5×, tags 2×, body 1×. A page titled "Attention" wins over a page that mentions attention in passing. Embeddings would need a separate "search title heavily" trick on top.
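To make the weighting concrete, here is a small sketch using Python's sqlite3 module. The table name `pages`, its three columns, and the sample rows are assumptions for illustration, not the skill's actual schema; the 5/2/1 weights are the ones quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative schema: one FTS5 column per weighted field.
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(title, tags, body)")
conn.executemany(
    "INSERT INTO pages (title, tags, body) VALUES (?, ?, ?)",
    [
        ("Attention", "transformers",
         "Scaled dot-product attention compares queries with keys."),
        ("Training log", "ops",
         "We paid attention to the loss curve while tuning the learning rate."),
    ],
)


def search(query: str) -> list[str]:
    # bm25() takes one weight per column in declaration order: title, tags, body.
    # FTS5 gives better matches lower (more negative) scores, so sort ascending.
    rows = conn.execute(
        "SELECT title FROM pages WHERE pages MATCH ? "
        "ORDER BY bm25(pages, 5.0, 2.0, 1.0)",
        (query,),
    )
    return [title for (title,) in rows]


print(search("attention"))
```

With these rows, the page titled "Attention" should rank ahead of the page that only mentions the word in its body.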

At wiki scale (<1000 pages) it's instant. Our test corpus of 250 pages searches in 1-10ms. Index rebuilds in 50-200ms when stale.

Phrase + boolean + prefix queries work natively. "layer normalization" NOT mlp is a single FTS5 expression. With embeddings, that's a custom reranker.
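A quick sketch of those query forms against an in-memory FTS5 table. The table layout and sample pages are made up for the example; the query strings use standard FTS5 syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(title, body)")
conn.executemany(
    "INSERT INTO pages (title, body) VALUES (?, ?)",
    [
        ("LayerNorm", "Layer normalization rescales activations per token."),
        ("MLP block", "The MLP applies layer normalization before its first projection."),
        ("Normal forms", "Database normalization at the schema layer."),
    ],
)


def titles(query: str) -> list[str]:
    """Return matching page titles in insertion order."""
    rows = conn.execute(
        "SELECT title FROM pages WHERE pages MATCH ? ORDER BY rowid", (query,)
    )
    return [title for (title,) in rows]


# Phrase plus boolean exclusion, as in the text above.
print(titles('"layer normalization" NOT mlp'))
# Prefix query: any token starting with "normal".
print(titles("normal*"))
# Explicit AND of two terms.
print(titles("layer AND schema"))
```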

The index lives at <root>/.wiki-fts.sqlite and is gitignored. It's pure derived state: delete it and the next search call rebuilds it from the markdown. This is consistent with the design principle that markdown is the source of truth and everything else is regenerable.
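The "derived state" idea can be sketched in a few lines. This version only rebuilds when the index file is missing (staleness detection is not shown), and the helper names and the two-column table are assumptions for the example rather than the skill's real layout.

```python
import sqlite3
from pathlib import Path

INDEX_NAME = ".wiki-fts.sqlite"


def open_index(root: Path) -> sqlite3.Connection:
    """Open the index, rebuilding it from the markdown files if it is missing."""
    index_path = root / INDEX_NAME
    needs_build = not index_path.exists()
    conn = sqlite3.connect(index_path)
    if needs_build:
        # path is stored for display only; UNINDEXED keeps it out of the search.
        conn.execute("CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, body)")
        for md in sorted(root.rglob("*.md")):
            rel = md.relative_to(root).as_posix()
            conn.execute(
                "INSERT INTO pages (path, body) VALUES (?, ?)",
                (rel, md.read_text(encoding="utf-8")),
            )
        conn.commit()
    return conn


def search(root: Path, query: str) -> list[str]:
    """Return root-relative paths of matching pages, best match first."""
    conn = open_index(root)
    try:
        rows = conn.execute(
            "SELECT path FROM pages WHERE pages MATCH ? ORDER BY rank", (query,)
        )
        return [path for (path,) in rows]
    finally:
        conn.close()
```

Deleting the index file and calling `search` again yields the same results, because everything in the index is recomputed from the markdown on disk.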

Write-time link integrity

Every write_page call parses [[wiki/...]] and [[raw/...]] references in the body. References that don't resolve to an existing file come back in the response as dead_wikilinks: [...]. The agent sees the dangling links the moment it writes them, not at lint time three sessions later.
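A minimal sketch of that check, assuming link targets are root-relative paths that may or may not carry the .md extension (the post does not say which, so the sketch accepts both). The function name and regex are illustrative.

```python
import re
from pathlib import Path

# Capture the target of [[wiki/...]] or [[raw/...]], stopping before any
# "|alias" or "#section" suffix.
WIKILINK = re.compile(r"\[\[((?:wiki|raw)/[^\]|#]+)")


def dead_wikilinks(root: Path, body: str) -> list[str]:
    """Return link targets in `body` that do not resolve to a file under `root`."""
    targets: list[str] = []
    for match in WIKILINK.finditer(body):
        target = match.group(1).strip()
        if target not in targets:  # report each target once, in order of appearance
            targets.append(target)
    return [
        t for t in targets
        # Assumption: a link may name the file with or without ".md".
        if not ((root / t).is_file() or (root / (t + ".md")).is_file())
    ]
```

Returning the list from the write call itself is what lets the agent fix a dangling link in the same turn that created it.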

delete_page and move_page go further: they scan all wiki pages for *inbound* references to the path being changed and return inbound_wikilinks: [...]. After deleting a page, the agent gets back a list of pages whose links it needs to fix — and follows up in the same conversation.
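The inbound scan could look roughly like the sketch below. It assumes links spell out the full root-relative path of the target, and the function name is hypothetical; it only reports referencing pages and changes nothing.

```python
import re
from pathlib import Path


def inbound_wikilinks(root: Path, target: str) -> list[str]:
    """List wiki pages whose body links to `target` (a root-relative path)."""
    # Match [[target]], [[target|alias]] and [[target#section]], nothing longer.
    pattern = re.compile(r"\[\[" + re.escape(target) + r"(?:[|#][^\]]*)?\]\]")
    hits = []
    for page in sorted((root / "wiki").rglob("*.md")):
        rel = page.relative_to(root).as_posix()
        # Skip the page itself; self-links need no follow-up after a delete.
        if rel != target and pattern.search(page.read_text(encoding="utf-8")):
            hits.append(rel)
    return hits
```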

We deliberately do not auto-rewrite wikilinks on rename. Alias syntax ([[foo|alias text]]) and section anchors ([[foo#section]]) make safe regex rewriting unreliable, and silent corruption is worse than a few extra agent edits.

Cross-skill ingest, agent-driven

Karpathy's gist assumes the user manually drops files into raw/. Syncore opens a second route: other skills produce content the agent can file into `raw/` in the same conversation. There's no hardcoded auto-ingest hook — every write still goes through the agent — but the SCHEMA explicitly maps source types to subdirs so the agent does the right thing without thinking:

  • note-taker.get_session(latest) returns transcript + user_notes → agent calls wiki.write_page("raw/meetings/<sid>.md", ...) when the user says "file this meeting"
  • firecrawl.scrape returns markdown → agent writes raw/web/<slug>.md when the user says "save this article"
  • arxiv.get_paper returns metadata + abstract → agent writes raw/papers/<id>.md
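The routing above amounts to a lookup from source type to subdirectory. In the skill that mapping lives in the SCHEMA text the agent reads; the sketch below just expresses the same idea as code. The dictionary keys and the `raw_path` helper are hypothetical names, while the subdirectories come from the tree shown earlier.

```python
# Hypothetical source-type keys mapped to the raw/ subdirectories of the wiki.
RAW_SUBDIRS = {
    "meeting": "raw/meetings",
    "paper": "raw/papers",
    "web": "raw/web",
    "doc": "raw/docs",
    "message": "raw/messages",
    "code": "raw/code",
    "media": "raw/media",
}


def raw_path(source_type: str, identifier: str) -> str:
    """Build the destination path for a new raw source of the given type."""
    try:
        subdir = RAW_SUBDIRS[source_type]
    except KeyError:
        raise ValueError(f"unknown source type: {source_type!r}") from None
    return f"{subdir}/{identifier}.md"


print(raw_path("meeting", "session-id"))
print(raw_path("web", "some-article"))
```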

The wiki becomes a convergence point: every content-producing Syncore skill has a natural destination subdir, and the agent learns the pattern from the SCHEMA in one round-trip. After a week of normal work — meetings ingested, articles saved, design notes filed — you have a wiki that reflects what you actually did. The compounding happens through agent muscle memory, not background daemons.

When this isn't right

The wiki adds value when knowledge compounds. It doesn't add value when:

  • You're asking one-off questions ("what's the syntax for git rebase --onto")
  • The conversation is debugging a specific bug, not building knowledge
  • The user wants speed over structure

The schema explicitly tells the agent what not to ingest: debug sessions (unless the user asks), one-shot Q&A, and speculation. Filing trivial things creates noise that harms recall.

What it costs

Local markdown + git + SQLite FTS5 = zero per-call cost beyond filesystem I/O. The agent's token cost for an ingest is real (5-15 write_page calls, each with frontmatter + body): a typical ingest of a 5K-word paper runs 8-12K agent tokens. That's the price of compounding synthesis, and it is cheap compared to re-deriving the same insights in every conversation forever.

The wiki ships in Syncore v0.1.32 onwards as the wiki skill, automatically discovered by the daemon's P0 hot pool. It is available on the Free tier: no quota, no upstream service, no API key. Open ~/.syncore/wiki/default/ in Obsidian on the side and watch the agent write.
