Documentation

The technical guide to implementing persistent memory for the enterprise.

Cortyxia is a memory layer for AI applications. It sits between your application and any LLM provider — OpenAI, Anthropic, Google, DeepSeek, xAI, or Groq — and gives every call access to persistent, relevant context.

This documentation covers the system architecture, the memory and retrieval engine, drop-in SDKs for TypeScript and Python, integration with agentic CLI tools such as Claude Code and Codex, and deployment options for self-hosted and cloud environments.

You will learn how the proxy engine intercepts LLM calls, how the memory layer combines BM25 and semantic search to retrieve the right context, how namespaces isolate projects, and how the observability pipeline traces every request from start to finish.
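
To give a feel for what combining BM25 with semantic search inside a namespace can look like, here is a minimal, self-contained sketch. It is not Cortyxia's implementation or API: the `Memory` class, the `hybrid_search` function, the `alpha` weight, and the min-max score fusion are all assumptions made for illustration, and the embeddings are hand-written stand-ins for the output of a real embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    """Hypothetical stored memory; not a Cortyxia type."""
    namespace: str
    text: str
    embedding: List[float]  # assumed to come from some embedding model


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with a plain BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def min_max(values: List[float]) -> List[float]:
    """Rescale scores to [0, 1] so lexical and semantic scores are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_search(query: str, query_embedding: List[float], memories: List[Memory],
                  namespace: str, alpha: float = 0.5, top_k: int = 3) -> List[Memory]:
    """Rank memories in one namespace by a weighted mix of BM25 and cosine scores."""
    # Namespace isolation: memories from other projects are never candidates.
    candidates = [m for m in memories if m.namespace == namespace]
    if not candidates:
        return []
    lexical = min_max(bm25_scores(query, [m.text for m in candidates]))
    semantic = min_max([cosine(query_embedding, m.embedding) for m in candidates])
    ranked = sorted(
        zip(candidates, lexical, semantic),
        key=lambda row: alpha * row[1] + (1 - alpha) * row[2],
        reverse=True,
    )
    return [memory for memory, _, _ in ranked[:top_k]]


memories = [
    Memory("proj-a", "deploy with postgres and redis", [1.0, 0.0]),
    Memory("proj-a", "team lunch schedule", [0.0, 1.0]),
    Memory("proj-b", "deploy postgres cluster", [1.0, 0.0]),
]
results = hybrid_search("deploy postgres", [1.0, 0.0], memories, namespace="proj-a", top_k=1)
```

In this toy run only `proj-a` memories are considered, so the `proj-b` entry cannot surface even though it matches the query. The `alpha` weight trades keyword precision against semantic recall; a weighted sum of normalized scores is only one possible fusion strategy.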

New to Cortyxia? Start with the System Overview, jump straight to the SDK Guide, or set up a CLI agent in the CLI Guide.

Browse by topic

Architecture

System Overview
Memory Management Unit, proxy engine, and observability pipeline.
Memory Layer
Semantic search, BM25, namespace isolation, and ingestion.
Token Optimization
Context compression and 40-60% token savings.

Integration

SDK Guide
TypeScript and Python APIs with OpenAI-compatible endpoints.
CLI Guide
Claude Code, Codex, Kilo Code, and other agentic CLI tools.

Deployment

Deployment Guide
Cloud or self-hosted deployment backed by SQLite, PostgreSQL, and optional Redis.

Can't find what you're looking for?

Shoot us an email and we'll help you out.