Proxy Guide
Automatic context injection for AI applications—route requests through Recurse for grounded responses
The proxy is Recurse's most powerful feature. Route any OpenAI-compatible API call through Recurse to automatically enrich it with context from your knowledge graph—and optionally persist useful outputs back into your graph.
New to the proxy? Start with the quick setup guide to get running in 2 minutes.
Why Use the Proxy?
The Problem
When building AI applications, you face a constant challenge: how to provide relevant context to your AI model without manually managing conversation history, knowledge bases, and context windows.
Traditional approaches:
- Manual context management: Copy-paste relevant information into prompts
- Provider-specific memory: Locked into one AI provider's memory system
- No accumulation: Each conversation starts from scratch
- Context window limits: Can't include everything relevant
The Solution
The Recurse proxy solves this by:
- Automatic context retrieval: Finds relevant information from your knowledge graph
- Provider-agnostic: Works with any OpenAI-compatible API
- Persistent memory: Your knowledge accumulates over time
- Smart filtering: Only includes relevant context, not everything
How It Works
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ Your App │────────▶│ Recurse.cc │────────▶│ AI Provider │
│ │ │ Proxy │ │ (OpenAI, │
│ │◀────────│ │◀────────│ Anthropic) │
└─────────────┘ └──────────────┘ └─────────────┘
│
│ Retrieves context
│ from knowledge graph
▼
┌──────────────┐
│ Knowledge │
│ Graph │
└──────────────┘Step-by-Step Flow
1. Your application sends a Chat Completions request to the Recurse proxy
2. Recurse retrieves relevant context from your knowledge graph based on:
   - Your query/messages
   - The scope you specify
   - Semantic similarity to your existing knowledge
3. Recurse enriches the request with retrieved context (sketched below)
4. The AI provider processes the enriched request and returns a response
5. If persistence is enabled, Recurse saves the assistant's final message into your graph
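As a mental model, the enrichment in step 3 behaves as if retrieved context were prepended to your message list. The sketch below is purely illustrative; the actual injection format is internal to Recurse and may differ:
// Hypothetical illustration only; the real injection format is internal to Recurse.
// What your app sends:
const original = [
  { role: 'user', content: "What did we decide in last week's meeting?" }
];
// Roughly what the provider might receive after enrichment:
const enriched = [
  {
    role: 'system',
    content:
      'Relevant context from your knowledge graph:\n' +
      '- [meeting_notes] Decision: adopt the new deployment pipeline'
  },
  ...original
];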
Basic Usage
Setup
Point your OpenAI SDK (or compatible client) to the Recurse proxy:
JavaScript:
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.OPENAI_API_KEY, // Your OpenAI key
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY, // Your Recurse key
'X-Recurse-Scope': 'my_project' // Your scope
}
});
Python:
from openai import OpenAI
import os
client = OpenAI(
api_key=os.environ["OPENAI_API_KEY"],
base_url="https://api.recurse.cc/proxy/https://api.openai.com/v1/",
default_headers={
"X-API-Key": os.environ["RECURSE_API_KEY"],
"X-Recurse-Scope": "my_project"
}
)
cURL:
curl https://api.recurse.cc/proxy/https://api.openai.com/v1/chat/completions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "X-API-Key: $RECURSE_API_KEY" \
-H "X-Recurse-Scope: my_project" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "What did we decide in last week'\''s meeting?"}]
}'
Make a Request
Once configured, use the client normally:
const completion = await client.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{ role: 'user', content: 'What are the main approaches to handling imbalanced datasets?' }
]
});
console.log(completion.choices[0].message.content);
Recurse automatically:
- Finds relevant sources, frames, and context in your knowledge graph
- Adds them to the request (you don't see this)
- Returns the provider's response (see the streaming sketch below)
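Because the proxy exposes a standard OpenAI-compatible endpoint, other SDK features should work the same way. For example, streaming; this is a minimal sketch, assuming the proxy forwards streamed responses unchanged:
// Streaming through the proxy, reusing the `client` configured in Setup.
// Assumes streamed responses are forwarded unchanged by the proxy.
const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Summarize our onboarding checklist.' }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}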
Enabling Persistence
Persistence allows Recurse to automatically save useful outputs back into your knowledge graph. When enabled, the assistant's final message is stored under the same scope used for retrieval.
When to Use Persistence
Good use cases:
- Summarizing meetings or documents
- Answering questions that create useful reference material
- Generating documentation or notes
- Creating knowledge base entries
Be thoughtful:
- Don't persist every response (consumes storage)
- Don't persist temporary/session-specific content
- Do persist content that will be useful later (see the sketch below)
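One way to apply these guidelines is to decide per request whether a response is worth keeping, using the per-request persist flag described in the next section. A minimal sketch, assuming your application tags each request with a hypothetical task label:
// Persist only responses that are likely to be reusable reference material.
// `task` and `prompt` are hypothetical values supplied by your application.
const shouldPersist = task === 'summarize' || task === 'document';
const completion = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: prompt }],
  persist: shouldPersist // skip persistence for throwaway conversations
});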
Enable Persistence
Via Header (recommended):
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'meeting_notes',
'X-Recurse-Persist': 'true' // Enable persistence
}
});
Via Request Body:
const completion = await client.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Summarize our weekly meeting decisions.' }],
persist: true // Enable persistence for this request
});
Note: The header takes precedence if both are present.
What Gets Persisted
When persistence is enabled, Recurse:
- Extracts the assistant's final message
- Processes it as a source (frame extraction, embedding)
- Stores it in your knowledge graph under the specified scope
- Makes it available for future retrieval
Example:
// Enable persistence
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'support_responses',
'X-Recurse-Persist': 'true'
}
});
// Ask a question
const completion = await client.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{
role: 'user',
content: 'How do I reset my password?'
}]
});
// The assistant's response is automatically saved to your knowledge graph
// Future queries about password resets will include this response
Scopes
Scopes organize your knowledge. They control both retrieval (what context to include) and storage (where to save new content).
Choosing a Scope
Think of scopes like folders or tags:
- Per user: user:alice, user:bob
- Per project: project:website-redesign, project:mobile-app
- Per collection: meeting_notes, support_faqs, research_papers
- Per team: team:engineering, team:design
Scope Best Practices
- Be specific: meeting_notes is better than notes
- Be consistent: Use the same scope for related content
- Match your workflow: If you organize by project, use project scopes
- Don't over-segment: Too many scopes fragment your knowledge (a per-user sketch follows)
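In multi-tenant applications the scope usually varies per user rather than per client. The OpenAI SDK accepts per-request options as a second argument, so one shared client can override the scope header on each call; a sketch, assuming per-request headers take precedence over the client defaults (as they do in the official SDK):
// One shared client, scoped per user at request time.
// `askForUser` is a hypothetical helper; adapt it to your app.
async function askForUser(userId, question) {
  return client.chat.completions.create(
    {
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: question }]
    },
    // Per-request header override (the OpenAI SDK's request options)
    { headers: { 'X-Recurse-Scope': `user:${userId}` } }
  );
}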
Examples
// Scope by customer type
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'support:enterprise' // or 'support:consumer'
}
});
// Scope by research domain
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'research:ai-memory-systems'
}
});
// Scope by team
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'team:engineering'
}
});
Advanced Configuration
Custom Context Window
Control how much context to include:
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'my_project',
'X-Recurse-Max-Context': '5000' // Max tokens of context to include
}
});
Multiple Scopes
Search across multiple scopes:
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'meeting_notes,support_faqs' // Comma-separated
}
});
Filter by Frame Type
Only include specific types of knowledge:
const client = new OpenAI({
baseURL: 'https://api.recurse.cc/proxy/https://api.openai.com/v1/',
defaultHeaders: {
'X-API-Key': process.env.RECURSE_API_KEY,
'X-Recurse-Scope': 'research_papers',
'X-Recurse-Frame-Types': 'Claim,Evidence' // Only include claims and evidence
}
});
Error Handling
The proxy returns standard HTTP status codes:
- 200: Success
- 400: Bad request (invalid parameters)
- 401: Unauthorized (missing/invalid API key)
- 429: Rate limit exceeded
- 500: Server error
Example error handling:
try {
const completion = await client.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello' }]
});
} catch (error) {
if (error.status === 401) {
console.error('Invalid API key');
} else if (error.status === 429) {
console.error('Rate limit exceeded');
} else {
console.error('Error:', error.message);
}
}
Performance Considerations
Context Retrieval Time
Recurse adds ~100-300ms for context retrieval. This is usually negligible compared to model inference time, but consider:
- Use caching for frequently accessed knowledge
- Limit context size if latency is critical
- Use scopes to narrow retrieval (a quick measurement sketch follows)
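If you want to verify the overhead in your own environment, time an identical request sent directly to the provider and through the proxy. A rough sketch; inference time dominates and varies per call, so average several runs:
// Rough latency comparison: direct vs. proxied.
// `client` is the proxied client from the Setup section; requires Node 18+.
import OpenAI from 'openai';
async function timeOnce(c) {
  const start = performance.now();
  await c.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'ping' }]
  });
  return performance.now() - start;
}
const direct = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
console.log('direct :', (await timeOnce(direct)).toFixed(0), 'ms');
console.log('proxied:', (await timeOnce(client)).toFixed(0), 'ms');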
Storage Costs
Persistence consumes storage:
- Free tier: 100MB
- Pro tier: 10GB
- Enterprise: Custom
Monitor usage in your dashboard.