Building Context-Heavy: Knowledge-Graph API for AI Agents
Context-Heavy is a multi-tenant knowledge-graph API I built in Go, on PostgreSQL with pgvector and recursive CTEs, to give AI agents persistent, relational context across sessions. The core problem it solves: most agent memory systems treat memory as a flat vector store, but real knowledge has structure: entities, relationships, temporal ordering. Context-Heavy stores both the embeddings and the structure.
The problem with flat vector memory
Standard RAG memory:
store(text) → embedding → vector DB
query(question) → similarity search → top-k chunks
This works for factual recall. It breaks for relational queries:
- "What projects did we discuss last week that depend on the auth service?"
- "Which users have reported this error in the past month?"
- "What's the chain of decisions that led to this architecture?"
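Similarity search alone can't answer these, because the answer depends on the links between records (dependencies, authorship, ordering) rather than on how close a chunk of text is to the question. You need a graph.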
Data model
Context-Heavy stores knowledge as a property graph in PostgreSQL:
CREATE EXTENSION IF NOT EXISTS vector; -- pgvector

CREATE TABLE entities (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL,
    type TEXT NOT NULL, -- person, project, concept, event
    name TEXT NOT NULL,
    properties JSONB DEFAULT '{}',
    embedding vector(1536), -- pgvector
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE relationships (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL,
    -- An edge without both endpoints is meaningless, so both are NOT NULL
    from_id UUID NOT NULL REFERENCES entities(id),
    to_id UUID NOT NULL REFERENCES entities(id),
    type TEXT NOT NULL, -- depends_on, created_by, mentions
    weight FLOAT DEFAULT 1.0,
    properties JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT now()
);

-- Approximate nearest-neighbor index for cosine distance
CREATE INDEX ON entities USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
Multi-tenancy: every row carries a tenant_id, and PostgreSQL row-level security policies filter on it, so a query running in one tenant's session cannot read or modify another tenant's rows.
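One way to wire up that isolation is sketched below. It is a minimal sketch, not the project's actual code: the policy name `tenant_isolation`, the session setting `app.tenant_id`, and the helpers `TenantPolicySQL` and `WithTenant` are all assumptions made for illustration. The idea is to enable row-level security on each table and set the tenant for the duration of one transaction.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// TenantPolicySQL returns the DDL that turns on row-level security for one
// table. The policy name and the app.tenant_id setting are assumptions.
func TenantPolicySQL(table string) []string {
	return []string{
		fmt.Sprintf("ALTER TABLE %s ENABLE ROW LEVEL SECURITY", table),
		// FORCE applies the policy to the table owner as well.
		fmt.Sprintf("ALTER TABLE %s FORCE ROW LEVEL SECURITY", table),
		fmt.Sprintf("CREATE POLICY tenant_isolation ON %s "+
			"USING (tenant_id = current_setting('app.tenant_id')::uuid)", table),
	}
}

// WithTenant runs fn inside a transaction whose app.tenant_id is set to
// tenantID. Passing true to set_config scopes the value to this transaction.
func WithTenant(ctx context.Context, db *sql.DB, tenantID string, fn func(*sql.Tx) error) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // has no effect once Commit has succeeded

	if _, err := tx.ExecContext(ctx,
		"SELECT set_config('app.tenant_id', $1, true)", tenantID); err != nil {
		return err
	}
	if err := fn(tx); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	// Print the DDL for both tables from the data model.
	for _, table := range []string{"entities", "relationships"} {
		for _, stmt := range TenantPolicySQL(table) {
			fmt.Println(stmt + ";")
		}
	}
}
```

Because `current_setting('app.tenant_id')` raises an error when the setting is missing, a query issued outside `WithTenant` fails rather than silently seeing every tenant's rows.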
Query API
Three query modes over HTTP + JSON:
1. Semantic search — nearest neighbors in embedding space:
POST /query/semantic
{
"q": "authentication service issues",
"limit": 10,
"entity_types": ["project", "event"]
}
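A request like this could translate into a pgvector query along the following lines. This is a sketch under assumptions: the helper names `VectorLiteral` and `SemanticQuery` are hypothetical, the query embedding is assumed to have been computed already, and the real handler may build its SQL differently. `<=>` is pgvector's cosine-distance operator, which matches the `vector_cosine_ops` index in the data model.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// VectorLiteral formats an embedding in pgvector's text form, e.g. [0.1,0.2].
func VectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, x := range v {
		parts[i] = strconv.FormatFloat(float64(x), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// SemanticQuery returns the SQL text and positional arguments for a
// tenant-scoped nearest-neighbor search, optionally filtered by entity type.
func SemanticQuery(tenantID string, queryVec []float32, types []string, limit int) (string, []any) {
	q := `SELECT id, type, name, embedding <=> $2::vector AS distance
FROM entities
WHERE tenant_id = $1`
	args := []any{tenantID, VectorLiteral(queryVec)}

	if len(types) > 0 {
		// Build "type IN ($3, $4, ...)" so no driver-specific array type is needed.
		placeholders := make([]string, len(types))
		for i, t := range types {
			args = append(args, t)
			placeholders[i] = "$" + strconv.Itoa(len(args))
		}
		q += "\n  AND type IN (" + strings.Join(placeholders, ", ") + ")"
	}

	args = append(args, limit)
	q += "\nORDER BY embedding <=> $2::vector\nLIMIT $" + strconv.Itoa(len(args))
	return q, args
}

func main() {
	// A two-element vector keeps the demo short; the schema expects 1536 dimensions.
	q, args := SemanticQuery("tenant-uuid", []float32{0.1, 0.2}, []string{"project", "event"}, 10)
	fmt.Println(q)
	fmt.Println(len(args), "arguments")
}
```

One caveat with this shape: an ivfflat index applies the `WHERE` filters after the approximate search, so a heavily filtered query can return fewer than `limit` rows.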
2. Graph traversal — recursive CTEs for relationship walks:
POST /query/graph
{
"from_entity": "uuid-of-auth-service",
"relationship": "depends_on",
"depth": 3
}
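WITH RECURSIVE deps AS (
    -- Anchor: direct neighbors of the starting entity
    SELECT to_id, 1 AS depth
    FROM relationships
    WHERE from_id = $1 AND type = $2 AND tenant_id = $3
    -- UNION (not UNION ALL) drops repeated (to_id, depth) rows,
    -- so cycles cannot multiply the working set
    UNION
    -- Recursive step: follow the same relationship type one hop further
    SELECT r.to_id, d.depth + 1
    FROM relationships r
    JOIN deps d ON r.from_id = d.to_id
    WHERE d.depth < $4 AND r.type = $2 AND r.tenant_id = $3
)
-- IN returns each reachable entity once, even if several paths lead to it
SELECT e.* FROM entities e
WHERE e.tenant_id = $3
  AND e.id IN (SELECT to_id FROM deps);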
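To make the traversal semantics concrete, here is an in-memory analogue of a depth-capped walk over a single relationship type. It is an illustration only: the `Edge` type and `Traverse` function are hypothetical and not part of the API. Unlike the SQL, it tracks visited nodes explicitly and never returns the starting entity.

```go
package main

import "fmt"

// Edge is a simplified stand-in for a row in the relationships table.
type Edge struct{ From, To, Type string }

// Traverse returns every entity reachable from start over edges of relType
// within maxDepth hops, mapped to the shallowest depth at which it was found.
func Traverse(edges []Edge, start, relType string, maxDepth int) map[string]int {
	// Adjacency list restricted to the requested relationship type.
	adj := make(map[string][]string)
	for _, e := range edges {
		if e.Type == relType {
			adj[e.From] = append(adj[e.From], e.To)
		}
	}

	depthOf := make(map[string]int)
	frontier := []string{start}
	for depth := 1; depth <= maxDepth && len(frontier) > 0; depth++ {
		var next []string
		for _, node := range frontier {
			for _, to := range adj[node] {
				if _, seen := depthOf[to]; seen || to == start {
					continue // already reached by a path that is no longer
				}
				depthOf[to] = depth
				next = append(next, to)
			}
		}
		frontier = next
	}
	return depthOf
}

func main() {
	edges := []Edge{
		{"auth", "db", "depends_on"},
		{"db", "storage", "depends_on"},
		{"storage", "auth", "depends_on"}, // cycle back to the start
		{"storage", "disk", "depends_on"},
		{"auth", "alice", "created_by"}, // other type, ignored
	}
	fmt.Println(Traverse(edges, "auth", "depends_on", 2)) // map[db:1 storage:2]
}
```

With a depth of 2 the walk stops before reaching `disk`, which sits three hops away, and the cycle back to `auth` does not cause repeated work.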
3. Hybrid — graph seed + semantic re-rank:
POST /query/hybrid
{
"q": "performance issues in services that auth depends on",
"seed_entity": "uuid-of-auth-service",
"hop": 2
}
Hybrid mode is the most powerful for agent use: it starts from a known entity, expands along relationships for the requested number of hops, then re-ranks the expanded nodes by semantic similarity to the query.
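The re-rank step can be sketched as follows, assuming the graph expansion has already produced a candidate set and that embeddings are available in memory. The `Candidate` type and the `Cosine` and `Rerank` functions are hypothetical names for illustration; the real service may well do this ordering inside PostgreSQL instead.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Candidate is an entity produced by the graph-expansion step.
type Candidate struct {
	ID        string
	Embedding []float64
}

// Cosine returns the cosine similarity of a and b, or 0 when the vectors
// differ in length or either one has zero magnitude.
func Cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Rerank returns a copy of cands ordered by similarity to query,
// most similar first. The input slice is left untouched.
func Rerank(query []float64, cands []Candidate) []Candidate {
	out := append([]Candidate(nil), cands...)
	sort.SliceStable(out, func(i, j int) bool {
		return Cosine(query, out[i].Embedding) > Cosine(query, out[j].Embedding)
	})
	return out
}

func main() {
	query := []float64{1, 0}
	expanded := []Candidate{
		{"billing", []float64{0, 1}},
		{"session-store", []float64{1, 0}},
		{"token-cache", []float64{1, 1}},
	}
	for _, c := range Rerank(query, expanded) {
		fmt.Printf("%s %.3f\n", c.ID, Cosine(query, c.Embedding))
	}
}
```

The graph step bounds the candidate set to entities that are structurally related to the seed, so the similarity ranking only has to separate relevant neighbors from irrelevant ones.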
Ingestion pipeline
Agents write knowledge via an ingestion endpoint that extracts entities and relationships from unstructured text:
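// Requires: encoding/json, net/http
type IngestRequest struct {
	TenantID string `json:"tenant_id"`
	Text     string `json:"text"`
	Context  string `json:"context"` // optional: source, date, author
}

func (s *Server) Ingest(w http.ResponseWriter, r *http.Request) {
	var req IngestRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}

	// LLM extraction
	extracted := s.llm.ExtractEntities(req.Text)

	// Upsert entities (merge duplicates by name+type).
	// Assumes the upsert helpers return an error.
	for _, e := range extracted.Entities {
		if err := s.db.UpsertEntity(req.TenantID, e); err != nil {
			http.Error(w, "failed to store entity", http.StatusInternalServerError)
			return
		}
	}

	// Upsert relationships
	for _, rel := range extracted.Relationships {
		if err := s.db.UpsertRelationship(req.TenantID, rel); err != nil {
			http.Error(w, "failed to store relationship", http.StatusInternalServerError)
			return
		}
	}
	w.WriteHeader(http.StatusNoContent)
}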
Entity deduplication uses fuzzy name matching + embedding similarity — "LetX API" and "letx-api" resolve to the same entity.
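A minimal sketch of such a match rule is shown below. How the two signals are actually combined is not specified here, so this version makes assumptions: names are compared after normalization, embedding similarity acts as a fallback, and the threshold value, `NormalizeName`, and `SameEntity` are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Entity holds only the fields needed for the duplicate check.
type Entity struct{ Type, Name string }

// NormalizeName lowercases a name and drops everything except letters and
// digits, so spacing, hyphens, and case differences disappear.
func NormalizeName(s string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// SameEntity reports whether a and b should be merged. Types must match;
// after that, either the normalized names are equal or the precomputed
// embedding similarity reaches the threshold.
func SameEntity(a, b Entity, embeddingSim, threshold float64) bool {
	if a.Type != b.Type {
		return false
	}
	if NormalizeName(a.Name) == NormalizeName(b.Name) {
		return true
	}
	return embeddingSim >= threshold
}

func main() {
	a := Entity{Type: "project", Name: "LetX API"}
	b := Entity{Type: "project", Name: "letx-api"}
	fmt.Println(NormalizeName(a.Name), NormalizeName(b.Name))
	fmt.Println(SameEntity(a, b, 0.0, 0.92))
}
```

Requiring the same type before comparing names keeps a person and a project with the same name from being merged; the embedding fallback is the riskier half and would need a conservative threshold in practice.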
Performance at scale
At 10k entities per tenant with 50k relationships, query performance:
| Query type | P50 | P99 |
|------------|-----|-----|
| Semantic (ivfflat) | 8 ms | 22 ms |
| Graph traversal (depth 3) | 14 ms | 45 ms |
| Hybrid | 28 ms | 70 ms |
Graph traversal uses recursive CTEs with a depth cap (default: 5) to prevent runaway queries. Beyond depth 5, the graph becomes noise for most agent use cases.
FAQ
What is Context-Heavy? Context-Heavy is a multi-tenant knowledge-graph API that gives AI agents persistent, relational memory using PostgreSQL with pgvector and recursive CTEs.
How is it different from a regular vector database? Vector DBs excel at similarity search but can't answer relational queries. Context-Heavy stores entities and relationships, enabling graph traversal, dependency walks, and hybrid semantic+graph queries.
What's the tech stack? Go API server, PostgreSQL with pgvector extension, Redis for caching, deployed on AWS ECS Fargate with Terraform.
Can I self-host Context-Heavy? Yes — the Go binary + Terraform module are open source. You need a PostgreSQL instance with the pgvector extension (available on RDS, Supabase, and Neon).
Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. See also: Building common-knowledge: Persistent Memory for Agents · pgvector: Vector Search in PostgreSQL.