Building Context-Heavy: Knowledge-Graph API for AI Agents

Context-Heavy is a multi-tenant knowledge-graph API I built to give AI agents persistent, relational context across sessions. The core problem it solves is that most agent memory systems treat memory as a flat vector store. Real knowledge has structure: entities, relationships, temporal ordering. Context-Heavy stores both the embeddings and that structure.

The problem with flat vector memory

Standard RAG memory:

store(text) → embedding → vector DB
query(question) → similarity search → top-k chunks

This works for factual recall. It breaks for relational queries:

"What projects did we discuss last week that depend on the auth service?"
"Which users have reported this error in the past month?"
"What's the chain of decisions that led to this architecture?"

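Similarity search can't answer these: each one depends on explicit edges (such as depends_on) or on time ordering, and embedding distance encodes neither. You need a graph.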

Data model

Context-Heavy stores knowledge as a property graph in PostgreSQL:

CREATE TABLE entities (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL,
    type        TEXT NOT NULL,           -- person, project, concept, event
    name        TEXT NOT NULL,
    properties  JSONB DEFAULT '{}',
    embedding   vector(1536),            -- pgvector
    created_at  TIMESTAMPTZ DEFAULT now(),
    updated_at  TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE relationships (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL,
    from_id     UUID REFERENCES entities(id),
    to_id       UUID REFERENCES entities(id),
    type        TEXT NOT NULL,           -- depends_on, created_by, mentions
    weight      FLOAT DEFAULT 1.0,
    properties  JSONB DEFAULT '{}',
    created_at  TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX ON entities USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

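Multi-tenancy: every row carries a tenant_id, and PostgreSQL row-level security enforces isolation, so a query running under the application role cannot cross tenant boundaries. Superusers and roles with BYPASSRLS are exempt from row-level security, so the API should not connect as one.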
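The schema above does not show the policies themselves. Here is a minimal sketch of how they could be generated per table, assuming the API stores the current tenant in a transaction-local setting. The setting name app.tenant_id, the policy name tenant_isolation, and both Go identifiers are hypothetical, not taken from the Context-Heavy codebase.

```go
package main

import "fmt"

// tenantPolicySQL returns the statements that turn on row-level security for
// one table. The setting name app.tenant_id and the policy name
// tenant_isolation are assumptions made for this example.
func tenantPolicySQL(table string) []string {
	return []string{
		fmt.Sprintf("ALTER TABLE %s ENABLE ROW LEVEL SECURITY", table),
		// FORCE makes the policy apply to the table owner as well.
		fmt.Sprintf("ALTER TABLE %s FORCE ROW LEVEL SECURITY", table),
		fmt.Sprintf("CREATE POLICY tenant_isolation ON %s "+
			"USING (tenant_id = current_setting('app.tenant_id')::uuid)", table),
	}
}

// setTenantSQL would run at the start of each transaction with the tenant
// UUID as $1. The third argument (true) scopes the setting to the transaction.
const setTenantSQL = "SELECT set_config('app.tenant_id', $1, true)"

func main() {
	for _, table := range []string{"entities", "relationships"} {
		for _, stmt := range tenantPolicySQL(table) {
			fmt.Println(stmt + ";")
		}
	}
	fmt.Println(setTenantSQL + ";")
}
```

Because current_setting raises an error when the setting is missing, a request that forgets to set the tenant fails instead of silently seeing every row.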

Query API

Three query modes over HTTP + JSON:

1. Semantic search — nearest neighbors in embedding space:

POST /query/semantic
{
  "q": "authentication service issues",
  "limit": 10,
  "entity_types": ["project", "event"]
}
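One way this request could map onto pgvector, as a sketch rather than the actual handler: the query text is embedded first (not shown), the embedding is passed as a text literal, and the rows are ordered by cosine distance. The parameter order, the selected columns, and the names vectorLiteral and semanticQuery are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vectorLiteral formats an embedding in pgvector's text form, e.g. "[0.25,-1,3]".
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, x := range v {
		parts[i] = strconv.FormatFloat(float64(x), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// semanticQuery is one possible SQL shape for /query/semantic.
// $1 = tenant UUID, $2 = query embedding as text, $3 = entity types, $4 = limit.
// <=> is pgvector's cosine-distance operator, matching the vector_cosine_ops index.
const semanticQuery = `
SELECT id, type, name, embedding <=> $2::vector AS distance
FROM entities
WHERE tenant_id = $1 AND type = ANY($3)
ORDER BY embedding <=> $2::vector
LIMIT $4`

func main() {
	fmt.Println(vectorLiteral([]float32{0.25, -1, 3})) // [0.25,-1,3]
	fmt.Println(semanticQuery)
}
```

With an approximate index such as ivfflat, the WHERE filter is applied after the index scan, so a very selective entity_types filter may return fewer rows than limit.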

2. Graph traversal — recursive CTEs for relationship walks:

POST /query/graph
{
  "from_entity": "uuid-of-auth-service",
  "relationship": "depends_on",
  "depth": 3
}

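The endpoint translates this request into a recursive CTE ($1 = start entity, $2 = relationship type, $3 = tenant, $4 = depth):

WITH RECURSIVE deps AS (
    -- first hop from the start entity
    SELECT to_id, 1 AS depth
    FROM relationships
    WHERE from_id = $1 AND type = $2 AND tenant_id = $3
    UNION ALL
    -- later hops: keep the same relationship type and tenant
    SELECT r.to_id, d.depth + 1
    FROM relationships r
    JOIN deps d ON r.from_id = d.to_id
    WHERE d.depth < $4 AND r.type = $2 AND r.tenant_id = $3
)
SELECT e.*
FROM entities e
WHERE e.tenant_id = $3
  AND e.id IN (SELECT to_id FROM deps);  -- IN returns each reachable entity once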
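To make the traversal semantics concrete, here is an in-memory sketch of the same walk in Go. It is not how Context-Heavy executes the query (that happens in PostgreSQL). It filters on the relationship type at every hop and tracks visited nodes, so a cycle ends the walk early instead of running to the depth cap. The Edge type and the traverse function are illustrative names.

```go
package main

import "fmt"

// Edge mirrors one row of the relationships table, reduced to the columns
// the walk needs.
type Edge struct {
	From, To, Type string
}

// traverse returns every entity reachable from start by following edges of
// relType, up to maxDepth hops. Each entity is visited once, so cycles terminate.
func traverse(edges []Edge, start, relType string, maxDepth int) []string {
	// Index outgoing edges of the requested type by source node.
	out := make(map[string][]string)
	for _, e := range edges {
		if e.Type == relType {
			out[e.From] = append(out[e.From], e.To)
		}
	}

	visited := map[string]bool{start: true}
	frontier := []string{start}
	var result []string
	for depth := 0; depth < maxDepth && len(frontier) > 0; depth++ {
		var next []string
		for _, node := range frontier {
			for _, to := range out[node] {
				if !visited[to] {
					visited[to] = true
					result = append(result, to)
					next = append(next, to)
				}
			}
		}
		frontier = next
	}
	return result
}

func main() {
	edges := []Edge{
		{"auth", "db", "depends_on"},
		{"db", "storage", "depends_on"},
		{"storage", "auth", "depends_on"}, // cycle back to the start
		{"auth", "alice", "created_by"},   // different type, never followed
	}
	fmt.Println(traverse(edges, "auth", "depends_on", 3)) // [db storage]
}
```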

3. Hybrid — graph seed + semantic re-rank:

POST /query/hybrid
{
  "q": "performance issues in services that auth depends on",
  "seed_entity": "uuid-of-auth-service",
  "hop": 2
}

Hybrid mode is the most useful mode for agents: it starts from a known entity, expands along relationships for the requested number of hops, then re-ranks the expanded nodes by semantic similarity to the query.
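The re-rank step can be sketched in a few lines of Go. This assumes the graph expansion has already produced a candidate list with embeddings and that the query has been embedded. Candidate, cosine, and rerank are illustrative names, and the two-dimensional vectors stand in for real 1536-dimensional embeddings.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Candidate is an entity reached by graph expansion from the seed.
type Candidate struct {
	Name      string
	Embedding []float64
}

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 when either vector has zero magnitude.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rerank orders graph-expanded candidates by similarity to the query, best first.
func rerank(query []float64, cands []Candidate) []Candidate {
	ranked := append([]Candidate(nil), cands...) // copy so the input is not reordered
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Embedding) > cosine(query, ranked[j].Embedding)
	})
	return ranked
}

func main() {
	query := []float64{1, 0}
	expanded := []Candidate{
		{"billing-db", []float64{0, 1}},
		{"session-cache", []float64{1, 0.1}},
		{"user-store", []float64{1, 1}},
	}
	for _, c := range rerank(query, expanded) {
		fmt.Printf("%s %.2f\n", c.Name, cosine(query, c.Embedding))
	}
}
```

The graph step narrows the candidate set to entities that are structurally related to the seed, so the similarity ranking only has to separate relevant neighbors from irrelevant ones.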

Ingestion pipeline

Agents write knowledge via an ingestion endpoint that extracts entities and relationships from unstructured text:

type IngestRequest struct {
    TenantID string `json:"tenant_id"`
    Text     string `json:"text"`
    Context  string `json:"context"` // optional: source, date, author
}

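func (s *Server) Ingest(w http.ResponseWriter, r *http.Request) {
    var req IngestRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, "invalid JSON body", http.StatusBadRequest)
        return
    }
    if req.TenantID == "" || req.Text == "" {
        http.Error(w, "tenant_id and text are required", http.StatusBadRequest)
        return
    }

    // LLM extraction: pull entities and relationships out of the raw text
    extracted := s.llm.ExtractEntities(req.Text)

    // Upsert entities first (merge duplicates by name+type)
    for _, e := range extracted.Entities {
        s.db.UpsertEntity(req.TenantID, e)
    }
    // Upsert relationships between the entities written above
    for _, rel := range extracted.Relationships {
        s.db.UpsertRelationship(req.TenantID, rel)
    }
    w.WriteHeader(http.StatusNoContent)
}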

Entity deduplication combines fuzzy name matching with embedding similarity, so "LetX API" and "letx-api" resolve to the same entity.
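A minimal sketch of the name-matching half, assuming normalization followed by an edit-distance check. The function names and the maxEdits tolerance are assumptions, not values from Context-Heavy. The embedding-similarity check is not shown and would act as a second gate before two entities are merged.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// normalize lowercases a name and drops everything except letters and digits,
// so "LetX API" and "letx-api" both become "letxapi".
func normalize(name string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(name) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein returns the edit distance between two strings, counted in runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// sameName reports whether two names are close enough to be merge candidates.
// maxEdits is an assumed tolerance for typos after normalization.
func sameName(a, b string, maxEdits int) bool {
	return levenshtein(normalize(a), normalize(b)) <= maxEdits
}

func main() {
	fmt.Println(sameName("LetX API", "letx-api", 1))    // true
	fmt.Println(sameName("LetX API", "Billing API", 1)) // false
}
```

Edit distance alone is risky for short names, where a single edit can turn one real entity into another, which is one reason to require embedding agreement as well.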

Performance at scale

At 10k entities per tenant with 50k relationships, query latency is:

| Query type | P50 | P99 |
|------------|-----|-----|
| Semantic (ivfflat) | 8 ms | 22 ms |
| Graph traversal (depth 3) | 14 ms | 45 ms |
| Hybrid | 28 ms | 70 ms |

Graph traversal uses recursive CTEs with a depth cap (default: 5) to prevent runaway queries. Beyond depth 5, the graph becomes noise for most agent use cases.
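The cap is easiest to enforce before the depth is bound to the query. A small sketch, assuming out-of-range requests are clamped rather than rejected; the name effectiveDepth and the handling of zero or negative input are assumptions.

```go
package main

import "fmt"

// maxDepth is the default depth cap for graph traversal.
const maxDepth = 5

// effectiveDepth clamps a requested traversal depth to the range [1, maxDepth].
// Treating zero or negative input as 1 is an assumption for this example.
func effectiveDepth(requested int) int {
	if requested < 1 {
		return 1
	}
	if requested > maxDepth {
		return maxDepth
	}
	return requested
}

func main() {
	for _, d := range []int{0, 3, 9} {
		fmt.Println(d, "->", effectiveDepth(d))
	}
}
```

Clamping keeps an agent's request from failing outright. Returning a 400 for out-of-range depths would be the stricter alternative.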

FAQ

What is Context-Heavy? Context-Heavy is a multi-tenant knowledge-graph API that gives AI agents persistent, relational memory using PostgreSQL with pgvector and recursive CTEs.

How is it different from a regular vector database? Vector DBs excel at similarity search but can't answer relational queries. Context-Heavy stores entities and relationships, enabling graph traversal, dependency walks, and hybrid semantic+graph queries.

What's the tech stack? Go API server, PostgreSQL with pgvector extension, Redis for caching, deployed on AWS ECS Fargate with Terraform.

Can I self-host Context-Heavy? Yes — the Go binary + Terraform module are open source. You need a PostgreSQL instance with the pgvector extension (available on RDS, Supabase, and Neon).

Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. See also: Building common-knowledge: Persistent Memory for Agents · pgvector: Vector Search in PostgreSQL.