Recall: Retrieving Memories
The Recall operation searches a memory bank and retrieves relevant memories. Using the TEMPR retrieval strategy, it finds information that matches your query through multiple search methods: semantic similarity, keyword matching, entity relationships, and temporal reasoning.
Overview
Recall enables intelligent memory retrieval:
- TEMPR multi-strategy search (semantic, keyword, graph, temporal)
- Relevance-ranked results
- Configurable result limits
- Memory type filtering (World Facts, Experience, Observations)
Basic Usage
Python:

```python
from hindsight_client import Hindsight

client = Hindsight(
    base_url="https://api.hindsight.vectorize.io",
    api_key="your-api-key"
)

# Simple recall
result = client.recall(
    bank_id="your-bank-id",
    query="What are the user's preferences?"
)

for memory in result.results:
    print(f"[{memory.type}] {memory.text}")
```

TypeScript:

```typescript
import { HindsightClient } from '@vectorize-io/hindsight-client';

const client = new HindsightClient({
  baseUrl: 'https://api.hindsight.vectorize.io',
  apiKey: 'your-api-key'
});

// Simple recall
const result = await client.recall(
  'your-bank-id',
  "What are the user's preferences?"
);

result.results.forEach(memory => {
  console.log(`[${memory.type}] ${memory.text}`);
});
```

cURL:

```bash
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/{bank_id}/memories/recall \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the user'\''s preferences?"
  }'
```
How It Works
When you call Recall:
1. Query Processing - Your query is analyzed for semantic meaning, keywords, entities, and temporal references
2. TEMPR Search - Four parallel search methods execute:
   - Semantic - Finds conceptually similar memories
   - Keyword (BM25) - Matches exact terms and phrases
   - Graph - Traverses entity relationships
   - Temporal - Handles time-based queries ("last week", "in March")
3. Ranking - Results from all methods are fused and ordered by relevance
4. Filtering - Optional filters are applied (type, date, etc.)
5. Response - Top matching memories are returned
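The document doesn't specify how TEMPR fuses the four result lists, but a common approach for combining rankings from parallel retrievers is reciprocal rank fusion (RRF). A minimal sketch, purely illustrative (the memory IDs and the constant `k=60` are assumptions, not Hindsight internals):

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of memory IDs into one ordering.

    Each input list is the ordering produced by one search method
    (semantic, keyword, graph, temporal). A memory's fused score is
    the sum of 1 / (k + rank) over every list it appears in, so items
    ranked highly by multiple methods rise to the top.
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, mem_id in enumerate(ranked, start=1):
            scores[mem_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["mem_a", "mem_b", "mem_c"]
keyword = ["mem_a", "mem_d", "mem_b"]
fused = reciprocal_rank_fusion([semantic, keyword])
```

Here `mem_a` wins because both methods rank it first, while `mem_b` beats `mem_d` by appearing in both lists.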
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| bank_id | string | Yes | Memory bank to search (in URL path) |
| query | string | Yes | Search query (natural language) |
| types | array | No | Filter by memory types |
| budget | string | No | Search depth: "low", "mid", "high" (default: "mid") |
| max_tokens | integer | No | Max tokens in response (default: 4096) |
| trace | boolean | No | Include debug trace (default: false) |
| query_timestamp | string | No | Reference time for temporal queries (ISO 8601) |
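Putting the optional parameters together, a full request body can be sketched as a Python dict. The field names come from the table above; the values are illustrative, and the type names ("world", "observation") are assumed from the response fields documented below:

```python
import json

# Only "query" is required; every other field is optional.
payload = {
    "query": "What did the user say about deadlines last week?",
    "types": ["world", "observation"],           # filter by memory type
    "budget": "high",                            # search depth: low / mid / high
    "max_tokens": 2048,                          # cap on response size
    "trace": True,                               # include a debug trace
    "query_timestamp": "2024-03-15T10:30:00Z",   # anchor for "last week"
}
body = json.dumps(payload)
```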
Response
```json
{
  "results": [
    {
      "id": "mem_abc123",
      "text": "User prefers dark mode interfaces",
      "type": "observation",
      "entities": ["user"],
      "context": "",
      "mentioned_at": "2024-03-15T10:30:00Z"
    },
    {
      "id": "mem_def456",
      "text": "User's timezone is Pacific Standard Time",
      "type": "world",
      "entities": ["user"],
      "context": "",
      "mentioned_at": "2024-03-14T14:20:00Z"
    }
  ],
  "entities": {
    "user": {
      "entity_id": "ent_456",
      "canonical_name": "user",
      "observations": []
    }
  }
}
```
| Field | Description |
|---|---|
| results | Array of matching memories |
| results[].id | Unique memory identifier |
| results[].text | The memory text |
| results[].type | Memory category (world, experience, observation) |
| results[].entities | Entities mentioned in the memory |
| results[].context | Context when the memory was formed |
| results[].mentioned_at | When the memory was stored |
| entities | Entity details for entities in results |
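If you are working with the raw JSON rather than a client library, the fields above map cleanly onto a typed structure. A minimal sketch parsing the sample response (the `Memory` dataclass is an illustration, not part of the SDK):

```python
import json
from dataclasses import dataclass

@dataclass
class Memory:
    """One entry from the "results" array, fields as documented above."""
    id: str
    text: str
    type: str
    entities: list
    context: str
    mentioned_at: str

raw = json.loads("""{
  "results": [
    {"id": "mem_abc123", "text": "User prefers dark mode interfaces",
     "type": "observation", "entities": ["user"], "context": "",
     "mentioned_at": "2024-03-15T10:30:00Z"}
  ],
  "entities": {}
}""")

memories = [Memory(**m) for m in raw["results"]]
```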
Query Best Practices
Use Natural Language
Good:

```python
memories = client.recall(
    bank_id=bank_id,
    query="What programming languages does the user know?"
)
```

Less effective:

```python
memories = client.recall(
    bank_id=bank_id,
    query="programming languages"
)
```
Be Specific
Good:

```python
memories = client.recall(
    bank_id=bank_id,
    query="What did we discuss about the project timeline in our last meeting?"
)
```

Less effective:

```python
memories = client.recall(
    bank_id=bank_id,
    query="timeline"
)
```
Ask Questions
Framing queries as questions often yields better results:
```python
# Questions work well
memories = client.recall(bank_id=bank_id, query="What are the user's hobbies?")
memories = client.recall(bank_id=bank_id, query="When does the client prefer to have meetings?")
memories = client.recall(bank_id=bank_id, query="What technology stack is the project using?")
```
Filtering Results
By Memory Type
```python
# Only get world facts and observations
memories = client.recall(
    bank_id=bank_id,
    query="Tell me about the user",
    types=["world_fact", "observation"]
)
```
By Relevance Score
```python
# Only highly relevant results
memories = client.recall(
    bank_id=bank_id,
    query="user preferences",
    min_score=0.8
)
```
Limiting Results
```python
# Get top 5 results
memories = client.recall(
    bank_id=bank_id,
    query="project requirements",
    limit=5
)
```
Understanding Scores
The relevance score (0-1) indicates how well a memory matches your query:
| Score Range | Interpretation |
|---|---|
| 0.9 - 1.0 | Excellent match, directly relevant |
| 0.8 - 0.9 | Strong match, highly relevant |
| 0.7 - 0.8 | Good match, relevant |
| 0.6 - 0.7 | Moderate match, somewhat relevant |
| < 0.6 | Weak match, may not be useful |
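If you threshold or bucket results client-side, the table above translates directly into a small helper. A sketch (the label strings are shorthand for the table's interpretations, not API values):

```python
def interpret_score(score: float) -> str:
    """Map a relevance score in [0, 1] to the interpretation table above."""
    if score >= 0.9:
        return "excellent"   # directly relevant
    if score >= 0.8:
        return "strong"      # highly relevant
    if score >= 0.7:
        return "good"        # relevant
    if score >= 0.6:
        return "moderate"    # somewhat relevant
    return "weak"            # may not be useful
```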
Advanced Patterns
Contextual Search
Combine query with context for better results:
```python
context = "We're discussing the new mobile app feature"
question = "What design preferences have been mentioned?"

memories = client.recall(
    bank_id=bank_id,
    query=f"{context} {question}"
)
```
Iterative Refinement
Start broad, then narrow down:
```python
# First, broad search
all_prefs = client.recall(bank_id=bank_id, query="user preferences")

# Then, specific search based on results
color_prefs = client.recall(bank_id=bank_id, query="preferred colors for UI design")
```
Combining with Retain
Build conversational memory:
```python
# Store what the user says
client.retain(
    bank_id=bank_id,
    content=f"User said: {user_message}"
)

# Recall relevant context for response
context = client.recall(
    bank_id=bank_id,
    query=user_message,
    limit=5
)
```
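The retain-then-recall loop can be demonstrated end to end with a local stand-in. `InMemoryBank` below is not the Hindsight client: it is a toy with the same two-method shape, using naive keyword overlap instead of TEMPR, purely to show how stored turns feed later retrieval:

```python
class InMemoryBank:
    """Local illustration of the retain/recall pattern.

    retain() appends a memory; recall() ranks memories by naive
    keyword overlap with the query. The real service uses TEMPR
    multi-strategy search instead.
    """
    def __init__(self):
        self.memories = []

    def retain(self, content):
        self.memories.append(content)

    def recall(self, query, limit=5):
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(m.lower().split())), m) for m in self.memories
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [m for overlap, m in scored[:limit] if overlap > 0]

bank = InMemoryBank()
bank.retain("User said: I prefer morning meetings")
bank.retain("User said: the project deadline is Friday")
context = bank.recall("prefer morning meetings")
```

After both turns are retained, the query surfaces only the meeting-preference memory.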
Using in the UI
The Recall view in memory banks provides a debugging interface:
- Navigate to your memory bank
- Click Recall in the sidebar
- Enter your search query
- View results with:
- Memory content
- Relevance scores
- Memory types
- Retrieval traces
This is useful for:
- Testing query effectiveness
- Debugging retrieval issues
- Understanding score distributions
Token Usage
Recall operations consume tokens based on:
- Query length
- Number of results retrieved
- Memory content sizes
Monitor usage on the Usage Analytics page.
Error Handling
Python:

```python
try:
    result = client.recall(bank_id=bank_id, query=query)
    for memory in result.results:
        print(memory.text)
except Exception as e:
    print(f"Error: {e}")
```

TypeScript:

```typescript
try {
  const result = await client.recall(bankId, query);
  result.results.forEach(m => console.log(m.text));
} catch (error) {
  console.error('Error:', error.message);
}
```
Common Errors
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your API key |
| 402 Payment Required | Insufficient credits | Add credits to your account |
| 404 Not Found | Invalid bank_id | Verify the bank exists |
| 400 Bad Request | Empty query | Provide a search query |
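For transient failures (timeouts, brief outages) a retry wrapper with exponential backoff is a common pattern. A sketch, with a simulated flaky call standing in for the client; note that the errors in the table above (401, 404, 400) are not transient and should not be retried:

```python
import time

def recall_with_retry(fn, attempts=3, base_delay=0.01):
    """Call fn(), retrying on exceptions with exponential backoff.

    Illustrative only: production code should retry only transient
    errors (e.g. timeouts), never auth or validation failures.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Simulated client call that fails twice, then succeeds.
calls = {"n": 0}
def flaky_recall():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return ["mem_abc123"]

result = recall_with_retry(flaky_recall)
```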
Performance Tips
- Limit results - Only request the number of memories you need
- Filter by type - Narrow scope when you know what you're looking for
- Use min_score - Filter out low-relevance matches
- Cache results - Store frequently-needed memories locally
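The "cache results" tip can be as simple as a TTL cache keyed by query string. A minimal sketch (the class and its expiry policy are illustrative, not part of the SDK):

```python
import time

class RecallCache:
    """Minimal TTL cache for recall results, keyed by query string."""
    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._store = {}  # query -> (expiry_time, results)

    def get(self, query):
        entry = self._store.get(query)
        if entry is None:
            return None
        expiry, results = entry
        if time.monotonic() > expiry:
            del self._store[query]  # expired: evict and miss
            return None
        return results

    def put(self, query, results):
        self._store[query] = (time.monotonic() + self.ttl, results)

cache = RecallCache(ttl_seconds=60.0)
cache.put("user preferences", ["dark mode", "PST timezone"])
hit = cache.get("user preferences")
miss = cache.get("project timeline")
```

Check the cache before calling recall, and fall through to the API on a miss.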