Documentation

Architecture

Context-Fabric uses memory-mapped storage for predictable performance and efficient multi-corpus analysis.

Memory-Mapped Storage

Instead of loading corpus data into Python objects, Context-Fabric maps compiled files directly into the process's address space. The operating system handles paging data in and out as needed.
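The mapping behaviour can be sketched with plain NumPy, which is what `mmap_mode='r'` (discussed below) builds on. The file name here is illustrative; it does not reproduce Context-Fabric's actual .cfm layout:

```python
import numpy as np

# Hypothetical stand-in for one compiled feature file; the real .cfm
# layout is not reproduced here.
np.save("feature.npy", np.arange(1_000_000, dtype=np.int32))

# mmap_mode='r' maps the file read-only into the address space instead of
# copying it into Python objects; pages are faulted in lazily on access.
arr = np.load("feature.npy", mmap_mode="r")

print(type(arr).__name__)  # memmap: a view over the file, not an in-memory copy
print(int(arr[500_000]))   # touching an element pages in only that region
```

Because nothing is copied up front, the load step costs roughly the same regardless of how large the file is.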

Characteristic       Context-Fabric   Text-Fabric
Initial load time    Near-instant     Proportional to corpus size
Memory per corpus    ~127 MB          ~677 MB
Multiple corpora     Linear scaling   Superlinear scaling

This enables:

  • Multi-corpus analysis: Load Hebrew Bible, Septuagint, Dead Sea Scrolls, and Greek New Testament simultaneously on a laptop
  • Production deployments: Predictable resource usage across concurrent requests

Multi-Process Sharing

Multiple processes reading the same corpus share physical memory pages at the OS level:

Process 1 ─┐
Process 2 ─┼── Page cache ── .cfm files
Process 3 ─┘

Four workers do not use four times the memory; they share read-only data through the kernel's page cache.

How It Works

Context-Fabric loads arrays with mmap_mode='r', which translates to MAP_SHARED at the OS level. Each process gets its own virtual address mapping, but all mappings point to the same physical pages. This is the same mechanism that allows shared libraries to be loaded once and used by hundreds of processes.
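The same sharing can be observed with the standard library's mmap module. A minimal POSIX-only sketch (the file name is illustrative):

```python
import mmap
import os

# Illustrative data file standing in for a compiled corpus file.
with open("corpus.bin", "wb") as f:
    f.write(b"x" * 4096)

f = open("corpus.bin", "rb")
# ACCESS_READ maps the file MAP_SHARED with PROT_READ on POSIX systems.
m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

pid = os.fork()
if pid == 0:
    # Child: a separate virtual mapping onto the same physical pages,
    # so reading the data costs no additional corpus memory.
    os._exit(0 if m[:1] == b"x" else 1)

_, status = os.waitpid(pid, 0)
assert os.WEXITSTATUS(status) == 0  # child read the shared page
m.close()
f.close()
```

Forking after the mapping is established is the cheapest arrangement: the child inherits the mapping directly and never pays a second load.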

Measured Overhead

With 4 forked workers on the BHSA corpus (from benchmarks):

Mode               Total RSS   Per-Worker Overhead
Single process     524 MB      n/a
Fork (4 workers)   658 MB      ~34 MB

The 134 MB of total overhead (~34 MB per worker) represents Python interpreter state, not corpus data. Without page sharing, we would expect roughly 4 × 524 MB ≈ 2,096 MB.
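The arithmetic behind those figures, taken directly from the table above:

```python
single_rss_mb = 524   # single-process RSS (table above)
fork4_rss_mb = 658    # total RSS with 4 forked workers

overhead_per_worker = (fork4_rss_mb - single_rss_mb) / 4
naive_total = 4 * single_rss_mb  # what 4 independent copies would cost

print(overhead_per_worker)  # 33.5, i.e. ~34 MB of interpreter state each
print(naive_total)          # 2096 MB expected without page sharing
```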

Note on memory pressure

Under memory pressure, the kernel may evict pages from the page cache. Accessing evicted data triggers a page fault and a disk read, trading latency for memory. Resident pages are always shared.

Benchmark Summary

With 10 corpora loaded simultaneously:

Metric            Context-Fabric   Text-Fabric
Total memory      1,348 MB         5,529 MB
Memory variance   ±7 MB            ±949 MB

For detailed benchmarks and methodology, see the technical paper.