Recall: Retrieving Memories
The Recall operation searches a memory bank and retrieves relevant memories. Using the TEMPR retrieval strategy, it finds information that matches your query through multiple search methods: semantic similarity, keyword matching, entity relationships, and temporal reasoning.
Overview
Recall enables intelligent memory retrieval:
- TEMPR multi-strategy search (semantic, keyword, graph, temporal)
- Relevance-ranked results
- Configurable result limits
- Memory type filtering (World Facts, Experience, Observations)
Basic Usage
Python:

```python
from hindsight_client import Hindsight

client = Hindsight(
    base_url="https://api.hindsight.vectorize.io",
    api_key="your-api-key"
)

# Simple recall
result = client.recall(
    bank_id="your-bank-id",
    query="What are the user's preferences?"
)

for memory in result.results:
    print(f"[{memory.type}] {memory.text}")
```

TypeScript:

```typescript
import { HindsightClient } from '@vectorize-io/hindsight-client';

const client = new HindsightClient({
  baseUrl: 'https://api.hindsight.vectorize.io',
  apiKey: 'your-api-key'
});

// Simple recall
const result = await client.recall(
  'your-bank-id',
  "What are the user's preferences?"
);

result.results.forEach(memory => {
  console.log(`[${memory.type}] ${memory.text}`);
});
```

cURL:

```bash
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/{bank_id}/memories/recall \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the user'\''s preferences?"
  }'
```
How It Works
When you call Recall:
1. Query Processing - Your query is analyzed for semantic meaning, keywords, entities, and temporal references
2. TEMPR Search - Four parallel search methods execute:
   - Semantic - Finds conceptually similar memories
   - Keyword (BM25) - Matches exact terms and phrases
   - Graph - Traverses entity relationships
   - Temporal - Handles time-based queries ("last week", "in March")
3. Ranking - Results from all methods are fused and ordered by relevance
4. Filtering - Optional filters are applied (type, date, etc.)
5. Response - Top matching memories are returned
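The document doesn't specify how TEMPR fuses the four result lists, but a common approach for combining rankings from parallel retrievers is reciprocal rank fusion (RRF). A minimal sketch, purely illustrative (the memory IDs and the constant `k=60` are assumptions, not Hindsight internals):

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of memory IDs into one ordering.

    Each input list is the ordering produced by one search method
    (semantic, keyword, graph, temporal). A memory's fused score is
    the sum of 1 / (k + rank) over every list it appears in, so items
    ranked highly by multiple methods rise to the top.
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, mem_id in enumerate(ranked, start=1):
            scores[mem_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["mem_a", "mem_b", "mem_c"]
keyword = ["mem_a", "mem_d", "mem_b"]
fused = reciprocal_rank_fusion([semantic, keyword])
```

Here `mem_a` wins because both methods rank it first, while `mem_b` beats `mem_d` by appearing in both lists.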
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| bank_id | string | Yes | Memory bank to search (in URL path) |
| query | string | Yes | Search query (natural language) |
| types | array | No | Filter by memory types |
| budget | string | No | Search depth: "low", "mid", "high" (default: "mid") |
| max_tokens | integer | No | Max tokens in response (default: 4096) |
| trace | boolean | No | Include debug trace (default: false) |
| query_timestamp | string | No | Reference time for temporal queries (ISO 8601) |
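Putting the optional parameters together, a full request body can be sketched as a Python dict. The field names come from the table above; the values are illustrative, and the type names ("world", "observation") are assumed from the response fields documented below:

```python
import json

# Only "query" is required; every other field is optional.
payload = {
    "query": "What did the user say about deadlines last week?",
    "types": ["world", "observation"],           # filter by memory type
    "budget": "high",                            # search depth: low / mid / high
    "max_tokens": 2048,                          # cap on response size
    "trace": True,                               # include a debug trace
    "query_timestamp": "2024-03-15T10:30:00Z",   # anchor for "last week"
}
body = json.dumps(payload)
```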
Response
```json
{
  "results": [
    {
      "id": "mem_abc123",
      "text": "User prefers dark mode interfaces",
      "type": "observation",
      "entities": ["user"],
      "context": "",
      "mentioned_at": "2024-03-15T10:30:00Z"
    },
    {
      "id": "mem_def456",
      "text": "User's timezone is Pacific Standard Time",
      "type": "world",
      "entities": ["user"],
      "context": "",
      "mentioned_at": "2024-03-14T14:20:00Z"
    }
  ],
  "entities": {
    "user": {
      "entity_id": "ent_456",
      "canonical_name": "user",
      "observations": []
    }
  }
}
```
| Field | Description |
|---|---|
| results | Array of matching memories |
| results[].id | Unique memory identifier |
| results[].text | The memory text |
| results[].type | Memory category (world, experience, observation) |
| results[].entities | Entities mentioned in the memory |
| results[].context | Context when the memory was formed |
| results[].mentioned_at | When the memory was stored |
| entities | Entity details for entities in results |
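If you are working with the raw JSON rather than a client library, the fields above map cleanly onto a typed structure. A minimal sketch parsing the sample response (the `Memory` dataclass is an illustration, not part of the SDK):

```python
import json
from dataclasses import dataclass

@dataclass
class Memory:
    """One entry from the "results" array, fields as documented above."""
    id: str
    text: str
    type: str
    entities: list
    context: str
    mentioned_at: str

raw = json.loads("""{
  "results": [
    {"id": "mem_abc123", "text": "User prefers dark mode interfaces",
     "type": "observation", "entities": ["user"], "context": "",
     "mentioned_at": "2024-03-15T10:30:00Z"}
  ],
  "entities": {}
}""")

memories = [Memory(**m) for m in raw["results"]]
```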
Query Best Practices
Use Natural Language
Good:

```python
memories = client.recall(
    bank_id=bank_id,
    query="What programming languages does the user know?"
)
```

Less effective:

```python
memories = client.recall(
    bank_id=bank_id,
    query="programming languages"
)
```
Be Specific
Good:

```python
memories = client.recall(
    bank_id=bank_id,
    query="What did we discuss about the project timeline in our last meeting?"
)
```

Less effective:

```python
memories = client.recall(
    bank_id=bank_id,
    query="timeline"
)
```
Ask Questions
Framing queries as questions often yields better results:
```python
# Questions work well
memories = client.recall(bank_id=bank_id, query="What are the user's hobbies?")
memories = client.recall(bank_id=bank_id, query="When does the client prefer to have meetings?")
memories = client.recall(bank_id=bank_id, query="What technology stack is the project using?")
```
Filtering Results
By Memory Type
```python
# Only get world facts and observations
memories = client.recall(
    bank_id=bank_id,
    query="Tell me about the user",
    types=["world_fact", "observation"]
)
```
By Relevance Score
```python
# Only highly relevant results
memories = client.recall(
    bank_id=bank_id,
    query="user preferences",
    min_score=0.8
)
```
Limiting Results
```python
# Get top 5 results
memories = client.recall(
    bank_id=bank_id,
    query="project requirements",
    limit=5
)
```
Understanding Scores
The relevance score (0-1) indicates how well a memory matches your query:
| Score Range | Interpretation |
|---|---|
| 0.9 - 1.0 | Excellent match, directly relevant |
| 0.8 - 0.9 | Strong match, highly relevant |
| 0.7 - 0.8 | Good match, relevant |
| 0.6 - 0.7 | Moderate match, somewhat relevant |
| < 0.6 | Weak match, may not be useful |
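If you threshold or bucket results client-side, the table above translates directly into a small helper. A sketch (the label strings are shorthand for the table's interpretations, not API values):

```python
def interpret_score(score: float) -> str:
    """Map a relevance score in [0, 1] to the interpretation table above."""
    if score >= 0.9:
        return "excellent"   # directly relevant
    if score >= 0.8:
        return "strong"      # highly relevant
    if score >= 0.7:
        return "good"        # relevant
    if score >= 0.6:
        return "moderate"    # somewhat relevant
    return "weak"            # may not be useful
```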
Advanced Patterns
Contextual Search
Combine query with context for better results:
```python
context = "We're discussing the new mobile app feature"
question = "What design preferences have been mentioned?"

memories = client.recall(
    bank_id=bank_id,
    query=f"{context} {question}"
)
```
Iterative Refinement
Start broad, then narrow down:
```python
# First, broad search
all_prefs = client.recall(bank_id=bank_id, query="user preferences")

# Then, specific search based on results
color_prefs = client.recall(bank_id=bank_id, query="preferred colors for UI design")
```
Combining with Retain
Build conversational memory:
```python
# Store what the user says
client.retain(
    bank_id=bank_id,
    content=f"User said: {user_message}"
)

# Recall relevant context for response
context = client.recall(
    bank_id=bank_id,
    query=user_message,
    limit=5
)
```
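The retain-then-recall loop can be demonstrated end to end with a local stand-in. `InMemoryBank` below is not the Hindsight client: it is a toy with the same two-method shape, using naive keyword overlap instead of TEMPR, purely to show how stored turns feed later retrieval:

```python
class InMemoryBank:
    """Local illustration of the retain/recall pattern.

    retain() appends a memory; recall() ranks memories by naive
    keyword overlap with the query. The real service uses TEMPR
    multi-strategy search instead.
    """
    def __init__(self):
        self.memories = []

    def retain(self, content):
        self.memories.append(content)

    def recall(self, query, limit=5):
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(m.lower().split())), m) for m in self.memories
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [m for overlap, m in scored[:limit] if overlap > 0]

bank = InMemoryBank()
bank.retain("User said: I prefer morning meetings")
bank.retain("User said: the project deadline is Friday")
context = bank.recall("prefer morning meetings")
```

After both turns are retained, the query surfaces only the meeting-preference memory.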
Using in the UI
The Recall view in memory banks provides a debugging interface:
- Navigate to your memory bank
- Click Recall in the sidebar
- Enter your search query
- View results with:
- Memory content
- Relevance scores
- Memory types
- Retrieval traces
This is useful for:
- Testing query effectiveness
- Debugging retrieval issues
- Understanding score distributions
Token Usage
Recall operations consume tokens based on:
- Query length
- Number of results retrieved
- Memory content sizes
Monitor usage on the Usage Analytics page.
Error Handling
Python:

```python
try:
    result = client.recall(bank_id=bank_id, query=query)
    for memory in result.results:
        print(memory.text)
except Exception as e:
    print(f"Error: {e}")
```

TypeScript:

```typescript
try {
  const result = await client.recall(bankId, query);
  result.results.forEach(m => console.log(m.text));
} catch (error) {
  console.error('Error:', error.message);
}
```
Common Errors
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your API key |
| 402 Payment Required | Insufficient credits | Add credits to your account |
| 404 Not Found | Invalid bank_id | Verify the bank exists |
| 400 Bad Request | Empty query | Provide a search query |
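For transient failures (timeouts, brief outages) a retry wrapper with exponential backoff is a common pattern. A sketch, with a simulated flaky call standing in for the client; note that the errors in the table above (401, 404, 400) are not transient and should not be retried:

```python
import time

def recall_with_retry(fn, attempts=3, base_delay=0.01):
    """Call fn(), retrying on exceptions with exponential backoff.

    Illustrative only: production code should retry only transient
    errors (e.g. timeouts), never auth or validation failures.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Simulated client call that fails twice, then succeeds.
calls = {"n": 0}
def flaky_recall():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return ["mem_abc123"]

result = recall_with_retry(flaky_recall)
```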
Performance Tips
- Limit results - Only request the number of memories you need
- Filter by type - Narrow scope when you know what you're looking for
- Use min_score - Filter out low-relevance matches
- Cache results - Store frequently-needed memories locally
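The "cache results" tip can be as simple as a TTL cache keyed by query string. A minimal sketch (the class and its expiry policy are illustrative, not part of the SDK):

```python
import time

class RecallCache:
    """Minimal TTL cache for recall results, keyed by query string."""
    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._store = {}  # query -> (expiry_time, results)

    def get(self, query):
        entry = self._store.get(query)
        if entry is None:
            return None
        expiry, results = entry
        if time.monotonic() > expiry:
            del self._store[query]  # expired: evict and miss
            return None
        return results

    def put(self, query, results):
        self._store[query] = (time.monotonic() + self.ttl, results)

cache = RecallCache(ttl_seconds=60.0)
cache.put("user preferences", ["dark mode", "PST timezone"])
hit = cache.get("user preferences")
miss = cache.get("project timeline")
```

Check the cache before calling recall, and fall through to the API on a miss.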