System Overview

Three layers, one process. Claude Code hooks and the MCP server communicate with the chittad daemon over a Unix domain socket. Inside, DuckDBMind orchestrates storage, embedding, resonance, and self-tuning.

Claude Code
├── Hooks (hooks/*.sh)
└── MCP Server (chitta-mcp)
         │
         │  Unix Domain Socket
         ▼
chittad Daemon
├── Thread Pool (2-16)
├── RPC Handler (100+ tools)
├── Subconscious (background)
└── DuckDBMind
    ├── Embedder & Vāk (ONNX)
    ├── ResonanceLearner
    ├── ThemeManager (xMemory)
    ├── SessionContext
    └── DuckDBStore
        └── Embeddings DB (HNSW)
Key Design Decisions
// Single storage engine, no tiers
DuckDB     HNSW vectors + BM25 full-text + graph queries + ACID
Separate   Embeddings DB avoids write contention during HNSW rebuilds
Pool       ConnectionPool for concurrent reads, serialized writes
Socket     Unix domain socket IPC between Claude Code and daemon
Scaling    Auto-scaling thread pool (2-16) with watchdog for slow requests

Core Components

Eleven classes form the skeleton. Each handles one concern, composed at the DuckDBMind level.

Class              File                     Role
DuckDBMind         mind/duckdb_mind.hpp     Central orchestrator: remember, recall, resonate, self-tune
DuckDBStore        duckdb_store.hpp         Storage: all DuckDB operations, schema, queries
DuckDBRpcHandler   rpc/duckdb_handler.hpp   JSON-RPC 2.0 handler, 100+ registered tools
Embedder           mind/embedder.hpp        Embedding with LRU cache and circuit breaker
AntahkaranaYantra  vak_onnx.hpp             ONNX Runtime inference for bge-base-en-v1.5
Subconscious       mind/subconscious.hpp    Background thread: patterns, hygiene, embedding
ThemeManager       theme_manager.hpp        xMemory hierarchical memory organization
CodeIntel          code_intel.hpp           Tree-sitter symbol extraction (9 languages)
SymbolResolver     symbol_resolver.hpp      Cross-file symbol resolution for call graphs
ThreadPool         rpc/thread_pool.hpp      Auto-scaling worker pool with watchdog
ProvenanceSpine    provenance.hpp           Knowledge source tracking and trust scoring

The Resonance Engine

The core of memory retrieval. DuckDBMind::full_resonate() runs 8 phases to find relevant memories — not search, but resonance. Each phase adds signal; post-processing refines it.

(Interactive diagram: memories sized by relevance, flowing through phase zones 1-8.)
Phase 1: Semantic Seeds

Vector similarity search using HNSW index. Returns top-k memories by cosine distance to the query embedding — the initial gravity wells of meaning.

semantic_weight = 0.6 (HNSW cosine)
Phase 2: BM25 Hybrid

Keyword search using BM25 scoring complements semantic search for exact term matches. Results are merged via a weighted combination scaled by a confidence factor.

bm25_weight = 0.4, tag_boost = 0.05, conf_factor = 0.5 + 0.5 * confidence
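A minimal sketch of what this weighted merge could look like, using the parameters above. The function name and shape are illustrative, not the actual API; the real merge lives inside full_resonate().

```cpp
#include <cassert>
#include <cmath>

// Hypothetical merge of phase 1 and 2 scores: semantic and BM25 scores are
// combined with fixed weights, scaled by a confidence factor, plus a small
// additive tag boost.
float merged_score(float semantic, float bm25, float confidence, bool tag_match) {
    const float semantic_weight = 0.6f;
    const float bm25_weight     = 0.4f;
    const float tag_boost       = 0.05f;
    float conf_factor = 0.5f + 0.5f * confidence;   // low-confidence memories are damped
    float score = (semantic_weight * semantic + bm25_weight * bm25) * conf_factor;
    if (tag_match) score += tag_boost;
    return score;
}
```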
Phase 3: Tag Matching

Boost memories whose tags match terms in the query. A small additive signal that rewards explicit categorization.

additive boost
Phase 4: Attractor Finding

Identify conceptual gravity wells — clusters of densely connected memories in the triplet graph. Attractors are cached with 5-minute TTL.

max_attractors = 10, basin_boost = 1.15x
Phase 5: Spreading Activation

Starting from seed memories, activation spreads through the triplet graph. Connected memories receive activation proportional to edge weight, inversely proportional to distance.

spread_strength = 0.5, spread_decay = 0.5, max_hops = 3
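The spreading step can be sketched as a bounded breadth-first walk. This is an illustrative toy, assuming the parameters above; the Graph type stands in for the triplet store and is not the real data structure.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <queue>
#include <tuple>
#include <vector>

// Activation starts at spread_strength, is multiplied by spread_decay and
// the edge weight at each hop, and stops after max_hops.
using Graph = std::map<int, std::vector<std::pair<int, float>>>; // node -> (neighbor, edge weight)

std::map<int, float> spread(const Graph& g, int seed, float strength = 0.5f,
                            float decay = 0.5f, int max_hops = 3) {
    std::map<int, float> activation;
    std::queue<std::tuple<int, float, int>> frontier; // node, incoming activation, hop count
    frontier.emplace(seed, strength, 0);
    while (!frontier.empty()) {
        auto [node, act, hop] = frontier.front();
        frontier.pop();
        activation[node] += act;
        if (hop >= max_hops) continue;
        auto it = g.find(node);
        if (it == g.end()) continue;
        for (const auto& [nbr, w] : it->second)
            frontier.emplace(nbr, act * decay * w, hop + 1);
    }
    return activation;
}
```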
Phase 6: Session Priming

Recent observations and active topics from the current session boost related memories. The context of the conversation shapes what surfaces.

priming_boost = 0.3, topic_boost = 0.2
Phase 7: Code Intelligence

For code-like queries (detected heuristically by the presence of ::, ->, _, ., or a single identifier), this phase runs BM25 and term-based search over symbol names and signatures.

code_symbol_weight = 0.5 (heuristic detection)
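A simplified version of the code-query heuristic described above. The function name and exact rules are illustrative; the real detector may differ.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// A query is treated as code-like if it contains ::, ->, _ or ., or if it
// is a single token made only of identifier characters.
bool looks_like_code(const std::string& q) {
    if (q.find("::") != std::string::npos) return true;
    if (q.find("->") != std::string::npos) return true;
    if (q.find('_')  != std::string::npos) return true;
    if (q.find('.')  != std::string::npos) return true;
    if (!q.empty() && q.find(' ') == std::string::npos) {
        for (char c : q)   // single identifier-looking token
            if (!std::isalnum(static_cast<unsigned char>(c)) && c != '_') return false;
        return true;
    }
    return false;
}
```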
Phase 8: Post-Processing

Attractor boost, lateral inhibition (similar memories compete), Hebbian learning (co-accessed memories strengthen connections), and credit assignment for the Bayesian bandit.

inhibition = 0.7, similarity_threshold = 0.85, hebbian = 0.03
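Lateral inhibition can be sketched as follows: near-duplicate results compete, and the lower-scored one is suppressed. This is an illustrative sketch with assumed shapes; the similarity callback stands in for a cosine comparison of memory embeddings.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Result { int id; float score; };

// For each pair of results whose similarity exceeds the threshold, the
// lower-ranked one is multiplied by the inhibition factor.
template <typename Sim>
void inhibit(std::vector<Result>& results, Sim sim,
             float threshold = 0.85f, float inhibition = 0.7f) {
    std::sort(results.begin(), results.end(),
              [](const Result& a, const Result& b) { return a.score > b.score; });
    for (size_t i = 0; i < results.size(); ++i)
        for (size_t j = i + 1; j < results.size(); ++j)
            if (sim(results[i].id, results[j].id) > threshold)
                results[j].score *= inhibition;   // weaker near-duplicate loses
}
```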
DuckDBResonanceConfig
struct DuckDBResonanceConfig {
    float spread_strength     = 0.5f;
    float spread_decay        = 0.5f;
    int   max_hops            = 3;
    float hebbian_strength    = 0.03f;
    int   max_attractors      = 10;
    float basin_boost         = 1.15f;
    float similarity_threshold = 0.85f;
    float inhibition_strength = 0.7f;
    float epsilon_boost_alpha = 0.3f;
    float semantic_weight     = 0.6f;
    float activation_weight   = 0.4f;
    float code_symbol_weight  = 0.5f;
};

Storage Layer

DuckDB is an embedded analytical database. CC-Soul uses it for HNSW vector search, BM25 full-text, graph queries via DuckPGQ, and ACID transactions with WAL-based crash recovery.

~/.claude/mind/chitta/
  chitta.duckdb        // Main database (memories, triplets, symbols, ledger)
  chitta_emb.duckdb    // Embeddings database (HNSW index, separate to avoid contention)
  chitta.duckdb.wal    // Write-ahead log (auto-managed)

Key Tables

Table       Columns and purpose
memory      id, kind, content, confidence, decay_rate, tags, realm, visibility, timestamps, access_count
            Core unit of storage. Each memory has Bayesian confidence and configurable decay.
triplet     subject, predicate, object, weight
            Knowledge graph with string-based entities. Predicates: calls, contains, imports, inherits, corrects, relates_to.
symbol      id, name, kind, signature, file_path, line_start, line_end, parent, project, description
            Code intelligence: functions, classes, methods extracted by tree-sitter.
call_edge   caller_id, callee_id
            Call graph edges, populated by SymbolResolver.
ledger      session_id, mood, todos, decisions, next_steps, blockers, snapshot
            Session checkpoints for continuity across conversations.
Connection Pool
// Pre-allocated connections for concurrent reads
// Write operations go through a dedicated write connection
// ScopedConnection: RAII wrapper that returns connection on destruction

Default pool: 4-8 connections
Reads:        shared, parallel
Writes:       serialized, single connection
Overflow:     emergency connections created when pool exhausted
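The ScopedConnection idea can be sketched as a standard RAII wrapper: check a connection out of the pool on construction, return it on destruction. The Pool here is a toy stand-in (integers in a vector), not the real DuckDB connection pool.

```cpp
#include <cassert>
#include <vector>

// Toy pool: "connections" are just ints.
struct Pool {
    std::vector<int> free_conns{1, 2, 3, 4};
    int acquire() { int c = free_conns.back(); free_conns.pop_back(); return c; }
    void release(int c) { free_conns.push_back(c); }
};

// RAII wrapper: the connection is returned automatically when the scope ends,
// even on early return or exception.
class ScopedConnection {
    Pool& pool_;
    int conn_;
public:
    explicit ScopedConnection(Pool& p) : pool_(p), conn_(p.acquire()) {}
    ~ScopedConnection() { pool_.release(conn_); }
    ScopedConnection(const ScopedConnection&) = delete;            // non-copyable
    ScopedConnection& operator=(const ScopedConnection&) = delete;
    int get() const { return conn_; }
};
```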

Embedding Engine (Vāk)

The embedding pipeline follows a Vedantic naming convention. Each stage transforms speech into meaning, mirroring the philosophical concept of Vāk — the power of articulated consciousness.

Text → VakPatha → Shabda → AntahkaranaYantra → Artha
Class              Sanskrit Meaning   Role
VakPatha           Path of speech     WordPiece tokenizer (vocab.txt)
Shabda             Sound-form         Tokenized input (input_ids + attention_mask)
Artha              Meaning            768-dim embedding vector + certainty
AntahkaranaYantra  Inner instrument   ONNX Runtime inference engine
SmritiYantra       Memory machine     Caching wrapper (LRU, 10000 entries)
ShantaYantra       Silent machine     Zero-vector fallback
Model: bge-base-en-v1.5
Parameters:   110M
Dimensions:   768
Max sequence: 128 tokens
Pooling:      Mean pooling with L2 normalization
Runtime:      ONNX Runtime, sequential execution mode
Batch size:   Up to 32 texts per inference call
LRU cache:    1000 entries, tracks hit/miss rate
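The pooling step above (mean pooling with L2 normalization) can be written directly. Dimensions are reduced to 3 here for clarity; the real vectors are 768-dim.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Token embeddings are averaged over the attention mask (padding tokens are
// skipped), then the result is L2-normalized to unit length.
std::vector<float> mean_pool_l2(const std::vector<std::vector<float>>& tokens,
                                const std::vector<int>& mask) {
    size_t dim = tokens[0].size();
    std::vector<float> out(dim, 0.0f);
    int count = 0;
    for (size_t t = 0; t < tokens.size(); ++t) {
        if (!mask[t]) continue;                 // skip padding
        for (size_t d = 0; d < dim; ++d) out[d] += tokens[t][d];
        ++count;
    }
    float norm = 0.0f;
    for (size_t d = 0; d < dim; ++d) { out[d] /= count; norm += out[d] * out[d]; }
    norm = std::sqrt(norm);
    for (size_t d = 0; d < dim; ++d) out[d] /= norm;   // L2 normalize
    return out;
}
```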

Circuit Breaker

Opens after 3 consecutive failures, enters 60-second cooldown, then half-open state for testing recovery.

CLOSED → (3 failures) → OPEN → (60s cooldown) → HALF_OPEN → (success) → CLOSED
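A deterministic sketch of the state machine above. Time is injected as a plain counter so the example is testable; the real breaker presumably uses wall-clock time, and the exact transition details are assumptions.

```cpp
#include <cassert>

enum class State { CLOSED, OPEN, HALF_OPEN };

// CLOSED opens after 3 consecutive failures; OPEN waits out a 60s cooldown
// and becomes HALF_OPEN; HALF_OPEN closes on success or reopens on failure.
class CircuitBreaker {
    State state_ = State::CLOSED;
    int failures_ = 0;
    long opened_at_ = 0;
    static constexpr int kMaxFailures = 3;
    static constexpr long kCooldownSec = 60;
public:
    State state(long now) {
        if (state_ == State::OPEN && now - opened_at_ >= kCooldownSec)
            state_ = State::HALF_OPEN;          // ready to test recovery
        return state_;
    }
    void on_failure(long now) {
        if (++failures_ >= kMaxFailures || state_ == State::HALF_OPEN) {
            state_ = State::OPEN;
            opened_at_ = now;
            failures_ = 0;
        }
    }
    void on_success() { state_ = State::CLOSED; failures_ = 0; }
};
```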

Theme System (xMemory)

Inspired by the xMemory paper, themes provide hierarchical organization of memories. Each memory is assigned to a theme based on semantic similarity, with sparsity penalizing oversized themes.

score = semantic_weight × cosine_similarity + sparsity_weight × sparsity_score
Where sparsity = 1 / (1 + exp(2 × (theme_size / ideal_size - 1)))
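The scoring formula transcribes directly into code. The semantic_weight and sparsity_weight values are not given in this section, so 0.7 / 0.3 below are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>

// sparsity = 1 / (1 + exp(2 * (theme_size / ideal_size - 1))):
// equals 0.5 at the ideal size and falls toward 0 as a theme grows.
float sparsity(int theme_size, int ideal_size) {
    return 1.0f / (1.0f + std::exp(2.0f * (static_cast<float>(theme_size) / ideal_size - 1.0f)));
}

// score = semantic_weight * cosine_similarity + sparsity_weight * sparsity_score
float theme_score(float cosine_sim, int theme_size, int ideal_size,
                  float semantic_weight = 0.7f,   // assumed value
                  float sparsity_weight = 0.3f) { // assumed value
    return semantic_weight * cosine_sim + sparsity_weight * sparsity(theme_size, ideal_size);
}
```

The sigmoid penalty means two themes with equal semantic match tie-break toward the smaller one, which is what keeps any single theme from absorbing everything.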

Two-Stage Retrieval

Stage 1: Theme Matching

Find diverse theme representatives matching the query. Ensures breadth across conceptual areas.

Stage 2: Adaptive Expansion

Expand within matching themes for depth. Background maintenance splits, merges, and reassigns as needed.

Code Intelligence

Tree-sitter parsing extracts structural information from source code: symbols, callsites, imports, and type hierarchy. The SymbolResolver links callsites to known symbols for call graph traversal.

Supported Languages

C++
Python
JavaScript
TypeScript
Go
Rust
Java
Ruby
C#

Search Modes

Mode             Tool                             Method
Name search      find_symbol                      Exact/prefix match on symbol name
Semantic search  search_symbols                   Embedding similarity on symbol metadata
BM25 search      (internal)                       Full-text keyword match on name + signature
Call graph       symbol_callers / symbol_callees  Traversal of call_edge table
Extracted Information
Symbols     Functions, classes, methods, structs, enums
            name, kind, signature, file_path, line_range, parent
Callsites   Call, MemberCall, Qualified, New, Ctor, Indirect, LambdaCall
Imports     File dependencies
Hierarchy   Inheritance relationships

Memory Types & Lifecycle

26 node types, each with distinct decay characteristics and quality gates. Confidence is Bayesian, not a simple scalar — it tracks mean, variance, observation count, and decay.

Node Types

 0  Wisdom          7  Question      14  State        21  Gap
 1  Belief          8  Correction    15  Summary      22  Symbol
 2  Intention       9  Entity        16  Review       23  ProjectEssence
 3  Episode        10  Term          17  Preference   24  ModuleState
 4  Failure        11  Edge          18  Milestone    25  PatternState
 5  Aspiration     12  Insight       19  Approach
 6  Dream          13  Signal        20  Outcome

Decay Rates

Type        Rate   Rationale
Belief      0.0    Never decays (core identity)
Symbol      0.0    Code structure doesn't decay
Wisdom      0.005  Proven patterns should persist
Correction  0.005  Important lessons persist
Preference  0.01   Slowly fades if not reinforced
Episode     0.03   Fades unless reinforced
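One way to read these rates: confidence shrinks over time in proportion to the per-type rate. The exact decay formula is not given in this section, so the exponential form below is an assumption for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Assumed exponential decay: a rate of 0.0 (Belief, Symbol) leaves
// confidence untouched; higher rates (Episode) fade faster.
float decayed(float confidence, float decay_rate, float elapsed_days) {
    return confidence * std::exp(-decay_rate * elapsed_days);
}
```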

Bayesian Confidence Model

Confidence struct
struct Confidence {
    float mu;           // Mean confidence
    float sigma_sq;     // Variance (uncertainty)
    int   n;            // Number of observations
    float tau;          // Decay parameter

    void observe(float value);  // Update with new evidence
    void decay(float rate);     // Time-based decay
};

// strengthen() calls observe(positive) — increases mu, reduces sigma
// weaken() calls observe(negative) — decreases mu
// decay() gradually reduces confidence based on decay_rate

Quality Gate

Before storing, DuckDBMind::remember() applies three quality gates:
1. Minimum length:       Content must be >= 10 characters
2. Deduplication:        Cosine similarity check (threshold: 0.95)
3. Diversity sampling:   Avoids storing too many similar memories in quick succession
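The three gates can be sketched as a single predicate. The dedup similarity is passed in directly here (in the real system it would come from an HNSW lookup), and the diversity limit of 5 is an assumed placeholder.

```cpp
#include <cassert>
#include <string>

// Returns true if the candidate memory may be stored.
bool passes_quality_gate(const std::string& content,
                         float nearest_similarity,     // cosine to closest stored memory
                         int similar_stored_recently) {
    if (content.size() < 10) return false;             // 1. minimum length
    if (nearest_similarity > 0.95f) return false;      // 2. deduplication
    if (similar_stored_recently > 5) return false;     // 3. diversity sampling (limit assumed)
    return true;
}
```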

Self-Tuning (Bayesian Bandits)

The ResonanceLearner uses Thompson sampling to automatically optimize resonance parameters. No manual tuning — the system learns what works for each user's memory patterns.

Thompson Sampling

Each tunable parameter has a Bayesian prior. BetaPrior for bounded parameters (0–1), GaussianPrior for unbounded. On each full_resonate() call, parameters are sampled from their current posteriors, balancing exploration and exploitation.
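A minimal sketch of Thompson sampling over one bounded parameter with a Beta prior: sample from the posterior, act with the sampled value, then credit the prior with the observed reward. The BetaPrior shape mirrors the description above; its internals are assumptions.

```cpp
#include <cassert>
#include <random>

struct BetaPrior {
    float alpha = 1.0f, beta = 1.0f;      // Beta(1,1) = uniform prior on [0, 1]

    // Draw from the posterior (Beta via ratio of two gamma draws);
    // uncertain parameters vary a lot, well-learned ones barely move.
    float sample(std::mt19937& rng) {
        std::gamma_distribution<float> ga(alpha, 1.0f), gb(beta, 1.0f);
        float a = ga(rng), b = gb(rng);
        return a / (a + b);
    }
    // Credit assignment: positive feedback raises alpha, negative raises beta.
    void update(bool reward) {
        if (reward) alpha += 1.0f;
        else        beta  += 1.0f;
    }
    float mean() const { return alpha / (alpha + beta); }
};
```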

Credit Assignment

When a user strengthens or weakens a memory, the learner attributes credit to the parameters active when that memory was surfaced. QueryContext features — query length, term count, technical terms, domain prefix — inform the bandit. State persists across daemon restarts.

Tuned Parameters
Semantic weight   vs.  BM25 weight       // Balance vector vs keyword
Spread strength   and  Spread decay      // Graph activation dynamics
Hebbian rate                              // Co-access learning speed
Tag boost                                 // Categorical signal magnitude

// QueryContext features:
query_length, term_count, has_technical_terms
has_domain_prefix, avg_term_frequency