Making machines that learn. Create stateful agents that remember everything, learn continuously, and improve themselves over time.
Based on the limited social mentions available, users appear to have concerns about **Letta's performance at scale**, particularly citing **200-500ms latency per operation** due to routing all memory operations through an LLM. This latency issue is seen as a significant enough problem that developers are building alternative solutions specifically to avoid it. The multiple YouTube mentions suggest there's active interest and discussion around Letta in the AI agent community. However, without detailed reviews or pricing information, it's difficult to assess overall user satisfaction or cost concerns beyond the core performance complaint.
Mentions (30d): 0
Reviews: 0
Platforms: 2
GitHub Stars: 21,824 (2,303 forks)
Industry: research
Employees: 18
Funding Stage: Seed
Total Funding: $10.0M
GitHub followers: 754
GitHub repos: 47
GitHub stars: 21,824
npm packages: 20
Show HN: Mnemora – Serverless memory DB for AI agents (no LLM in your CRUD path)
Hi HN,

I built Mnemora because every AI agent memory solution I evaluated (Mem0, Zep, Letta) routes data through an LLM on every read and write. At scale, that means 200-500ms latency per operation, token costs on your memory layer, and a runtime dependency you don't control.

Mnemora takes the opposite approach: direct database CRUD. State reads hit DynamoDB at sub-10ms. Semantic search uses pgvector with Bedrock Titan embeddings — the LLM only runs at write time to generate the embedding vector. All reads are pure database queries.

Four memory types, one API:
1. Working memory: key-value state in DynamoDB (sub-10ms reads)
2. Semantic memory: vector-searchable facts in Aurora pgvector
3. Episodic memory: time-stamped event logs in S3 + DynamoDB
4. Procedural memory: rules and tool definitions (coming v0.2)

Architecture: fully serverless on AWS — Aurora Serverless v2, DynamoDB on-demand, Lambda, S3. Idles at ~$1/month, scales per-request. Multi-tenant by default: each API key maps to an isolated namespace at the database layer.

What I'd love feedback on:
1. Is the "no LLM in CRUD path" differentiator clear and compelling?
2. Would you use this over Mem0/Zep for production agents? What's missing?
3. What memory patterns are you solving that don't fit these 4 types?

Happy to answer architecture questions.

SDK:

```
pip install mnemora
```

```python
from mnemora import MnemoraSync

client = MnemoraSync(api_key="mnm_...")
client.store_memory("my-agent", "User prefers bullet points over prose")
results = client.search_memory("output format preferences", agent_id="my-agent")
# [0.54] User prefers bullet points over prose
```

Drop-in LangGraph CheckpointSaver, plus LangChain and CrewAI integrations.

Links:
- 5-min quickstart: https://mnemora.dev/docs/quickstart
- GitHub: https://github.com/mnemora-db/mnemora
- PyPI: https://pypi.org/project/mnemora/
- Architecture deep-dive: https://mnemora.dev/blog/serverless-memory-architecture-for-ai-agents
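The read/write split the post describes can be sketched in a few lines. This is a toy illustration only, not Mnemora's actual implementation or API: a plain dict stands in for DynamoDB working memory, an in-process list stands in for pgvector, and a cheap character-count `embed` function stands in for Bedrock Titan. The point it demonstrates is structural: the only model-like step runs at write time, while reads are plain lookups.

```python
import math

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model (e.g. Bedrock Titan).
    # In the architecture described above, this is the only step
    # that would touch an ML model, and only on the write path.
    vec = [0.0] * 8
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class MemoryStore:
    """Toy 'no LLM in the CRUD path' memory layer (illustrative names)."""

    def __init__(self):
        self.kv = {}        # working memory: key-value state, pure dict reads
        self.vectors = []   # semantic memory: (embedding, text) pairs

    def store_memory(self, text: str) -> None:
        # Write path: the embedding is computed once, at store time.
        self.vectors.append((embed(text), text))

    def search_memory(self, query: str, k: int = 3):
        # Read path: cosine similarity over stored vectors.
        # (A real system would embed the query too; here the same
        # cheap function keeps the sketch self-contained.)
        q = embed(query)
        scored = [(sum(a * b for a, b in zip(q, v)), t)
                  for v, t in self.vectors]
        return sorted(scored, reverse=True)[:k]
```

Working-memory reads (`store.kv[...]`) never touch `embed` at all, which is the property the post's sub-10ms DynamoDB claim rests on.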
Repository Audit Available
Deep analysis of letta-ai/letta — architecture, costs, security, dependencies & more
Yes, Letta offers a free tier. Pricing found: $20 / month, $100 / month, $200 / month, $0.10 / active
Key features include:
- Persistent agents instead of stateless sessions
- Always improving and learning
- Own your memory and port it across models
- Chat from any device, run on any environment
- Compatible with your API keys and coding plans
- Install the desktop app
- Use in the terminal
- Build with the SDK
Letta is commonly used for: Persistent agents instead of stateless sessions.
Letta has a public GitHub repository with 21,824 stars.
Based on user reviews and social mentions, the most common pain point is token cost.