Your domain experts build and manage your agents. Enterprise-grade governance keeps them accountable. The platform for AI agents you can trust.
Relevance AI is appreciated for its innovative approach to AI memory systems and open-source solutions, which allows AI applications to remember contextual information across sessions. However, there isn't much direct feedback on the tool from the provided sources. Pricing sentiment is not explicitly addressed, and as for reputation, it remains relatively low-profile with very few mentions across social platforms. Overall, the product seems to be flying under the radar without substantial positive or negative buzz.
Mentions (30d)
36
6 this week
Reviews
0
Platforms
2
Sentiment
13%
12 positive
Relevance AI is appreciated for its innovative approach to AI memory systems and open-source solutions, which allows AI applications to remember contextual information across sessions. However, there isn't much direct feedback on the tool from the provided sources. Pricing sentiment is not explicitly addressed, and as for reputation, it remains relatively low-profile with very few mentions across social platforms. Overall, the product seems to be flying under the radar without substantial positive or negative buzz.
Features
Industry
information technology & services
Employees
130
Funding Stage
Series B
Total Funding
$36.6M
AIs are weird lil alien minds
AIs are weird lil alien minds
View originalPricing found: $2, $240, $840
I’m not a developer. I’ve been using codebase memory MCP tools and Obsidian to give Claude persistent memory for my fantasy and sci fi worlds. Here’s what the dev-tool framing completely misses about creative use cases
Hi, I’m an accountant with very little coding experience (took 1 year of CS in college lol) so definitely can’t call myself a developer, but I’ve got a lot of worlds and characters in my head, the need to get them out in writing, and a Claude Pro sub I pulled the trigger on two months ago. I was hoping to see what I could do with things like Claude Code for more non-coding use-cases. So far it’s surpassed everything I’ve experienced except for one, major hang up: **LLM memory for long-context creative writing work still sucks.** Things like brainstorming for a fantasy universe or tracking the game state of a multi-session solo rpg campaign usually starts out pretty well for the first few chats, until you need to mount dozens of lore files and .md style guides to a project, have to wait for it to read all of that, then watch as your session usage bloats out for a simple reply and the quality degradation gets \*really\* noticeable. I’ve been lurking on AI writing subs and the sentiment seems to be shared across the board. So I looked in other places for possible solutions. Then I came across posts in this sub touting Claude memory MCP tools for codebases. Tools like Codesight and MemPalace caught my attention because I thought their applications could extend beyond coding and developer use-cases. The same semantic search and knowledge graph capabilities some of these tools offered for memorizing large, complicated codebases could be used to memorize large, complicated worldbuilding bibles as well, and most of the comments on these posts never mentioned that, or if they did, they were buried or ignored. I decided to test it out myself, starting with MemPalace, a suite of tools that work locally to index your Claude conversations and files into a semantic-searchable knowledge base it can query. My idea started out like this: since I’m already using Obsidian to organize my lore files (with an entry for each character, location, magic system, story arc, etc.) like a wiki or encyclopedia for my worlds, what if I had Claude save my Obsidian vault to its memory so it can recall those lore details whenever the context called for it in any given conversation? I was essentially making a “Second Brain” for Claude out of my Obsidian vault world bible, something I’ve read people doing already but never truly “got” it until I saw it in action. I had no idea about MCP tools before this but before long (and with Claude’s patient help) I was able to wire up the memory palace, mine my obsidian vault info into its memory (organized into verbatim chunks/snippets called “drawers”), and start chatting with it with its new “memories” at its disposal. I was surprised at how seamlessly it worked when I approached this tool sideways. I’d half expected it to work similar to how SillyTavern’s world info and lorebook injection worked, and in fact, I’d been thinking about using these tools to create a similar feature for my own Claude setup, but it was \*not\* like that at all. Lorebook injection worked by listening for a set of keywords that you set up in the World Info tab of SillyTavern, and when one of those keywords is detected in your prompt, it injects the entire lore file from World Info into the chat context. This can cause a lot of token bloat especially if your World Info entries are content-rich or you make a lot of lore references in your chat. What this did instead was make Claude ask plain-language questions to the MCP tools, things like, “What is Gene’s friendship with Felix like?” Or “what is Gene’s relationship to Clara-Belle?” When both of them are in a scene for example. It didn’t just look up Gene and Clara-Belle’s entire lore files and info-dumped everything into context, it pulled up the “Relationships” section of Gene’s file since that’s relevant to the context as well as Clara-Belle’s “Relationships” snippet from her file and any other relevant snippets, then pieced the full picture together through inference. The results: \~2% session usage on a cold start with Sonnet 4.6 with no project or additional context mounted. Claude references character motivations, relationship history, and world/location details I haven’t mentioned in weeks without me prompting it to. It picks up from where we last left off seamlessly across chat after chat. The reconstructive memory aspect I felt works like our own memory and produced perfect recall across sessions. Another side-effect I noticed is that when it references my lore files, it will pick up my style from the way the lore file is written. No more voice-flattening from encyclopedia-sounding lore entries. All the depth, nuance, and psychology I worked hard to cultivate are preserved and the Claude tools are smart enough to factor that in when it replies. I even make sure to add a “Voice” section to each character lore file in that character’s own voice so Claude can pick up on that when it reads that snippet in the tool call and applies it to its current context.
View originalFolder structure of the AI agent - after 6 weeks
# The folder structure is not admin. It's the nervous system. When people imagine an AI agent, they picture the model, the prompts, maybe the tool calls. Almost nobody pictures the folders. That is exactly why most home-grown agents stall around month two. An agent's filesystem is where its **identity, memory, work, and history physically live**. A messy filesystem produces a confused agent — not metaphorically, literally. The model reads paths. The model picks files by name. The model writes new files based on patterns it sees in old ones. If your directory tree is chaos, every output drifts a little further from coherent. agentmia.beehiiv.com - newsletter about building agents Below is the layout I converged on after nine months and roughly four refactors. Steal the parts that fit; the principles matter more than the exact names. # The numbering convention Folders are prefixed with a two-digit number: `01_`, `02_`, `09_`, `99_`. Two reasons: 1. **Sort order is meaning.** Anything starting with `0` lives near the top. `99_` falls to the bottom. The most important directories are visually first; archives are visually last. You read the agent's brain top-to-bottom. 2. **Gaps are intentional.** I jump from `04_` to `06_`, from `09_` to `11_`. The gaps are reserved insertion points. When a new domain emerges, it slots in without renaming everything. Two folders deliberately skip the prefix: `Inbox/` and `Outbox/`. They are operational, not structural. They live above the numbered set because they are touched dozens of times a day. /mapped on desktop/ # Inbox/ — the unprocessed pile Anything dropped into the agent's world starts here. Files I want it to ingest. Screenshots. Exports from other systems. PDFs that need parsing, gmail attachments, all downloads from chrome. The rule: **nothing stays in Inbox.** A dedicated processing routine classifies, routes, and deletes. If Inbox is non-empty for more than a day, the system is failing. Treat this like a real-world physical inbox tray. The point of a tray is that it gets emptied. # Outbox/ — what the agent produced for you Every file the agent writes anywhere in the tree gets a copy here, simultaneously. When I open `Outbox/`, I see exactly what was generated this session — no spelunking through twelve subdirectories. This sounds redundant. It is not. Without it, "what did the agent do today?" becomes a hunt. With it, the answer is one click. `Outbox` is wiped during the next Inbox processing run. It is a viewing surface, not storage. # .auto-memory/ — the hot memory The single most important directory in the system. Hidden by default because you should not be editing it manually. It holds the agent's working memory: user preferences, feedback rules, entity facts (people, companies, deals), active hypotheses, project pointers, session hot context. Roughly 400–500 small markdown files, each one a single topic. **Why hidden?** Because it is the agent's hot path. It loads from here every session. If I open the folder and start manually rearranging it, I am racing the agent. Treat it like a database, not a notebook. **Why so many small files?** Because the agent grep's by topic. One monolithic memory file becomes unreadable to the model around 50 KB. Many small files are easier to load partially, easier to index, easier to expire. # 01_IDENTITY/ — who the agent is The constitutional layer. Name, role, voice rules, principle stack, visual system, behavioral defaults. This rarely changes. When it does change, everything downstream changes with it. I keep it as folder `01_` because every other folder is downstream of it. If you do not know who the agent is, you cannot know what its workflows should look like, or what it should remember, or how it should respond. # 02_MEMORY/ — governance, not data A subtle but critical distinction: `.auto-memory/` holds the *data*, `02_MEMORY/` holds the *rules about data*. In `02_MEMORY/` live the constitution, the boot protocol, the naming protocol, the decision protocol, the profile standards (what a "supplier profile" must contain, what a "customer profile" must contain), the capability map. The agent reads these documents to know *how to remember*, *how to name new files*, *how to decide what is reversible*. Without this folder, every memory write is improvised. # 03_PROJECTS/ — the active work Real work happens here. Sub-organized by goal area, then by project slug: 03_PROJECTS/areas/{goal}/{slug}/ Each project gets its own folder with a standard skeleton: [`README.md`](http://README.md), [`TASKS.md`](http://TASKS.md), [`CHANGELOG.md`](http://CHANGELOG.md), [`BRIEF.md`](http://BRIEF.md), plus working files. There is a project registry at the top that the agent reads to know what is active versus dormant versus archived. The biggest discipline issue here: **do not let projects sprawl outside their folder.** When working on Project X, every file related to Project X goes inside Proj
View originalDeep researched research backed flashcard rules for Anki and gave it to Claude. I find it helpful.
I make a lot of Anki cards from PDFs, papers, and YouTube transcripts. Got tired of repeating the same rules to Claude every single time. Deep researched the recommended rules backed by research etc. Has been working well for me (ofc sometimes misses some things that I would like to have in cards, or is not compact enough at times but is still a massive help to me) Wrote it all down once and dumped it in `~/.claude/rules/`. Now Claude follows the rules every time I ask it to make cards. Four files: * general, for default content * math, with three custom note types I built so cards hide the technique on the front (forces strategy selection during review instead of pattern matching the problem text) * coding, biased toward pattern recognition over framework API memorization * DSA (data structures and algorithms), focused on signal-to-pattern recognition Repo: [https://github.com/VinayakHyde/claude-anki-flashcard-rules](https://github.com/VinayakHyde/claude-anki-flashcard-rules) Just markdown files. Copy into `~/.claude/rules/`, reference the relevant one when prompting Claude. Needs Anki running with AnkiConnect plus an MCP bridge(https://github.com/nailuoGG/anki-mcp-server) so Claude can talk to it. Hope this helps! (post was made with AI, edited by me cuz I'm lazy)
View originalWhat I learned building my latest AI app how one bad output exposed that I had no crisis safeguarding, and the 4-hour floor I'm adding before a single user touches it
I'm building a life coach app an offshoot from a personal tool I was using. Multiple AI agents, one for reflection, one for the body, one for finances, etc pre launch, no users, just me iterating. Last week I was testing the reflection agent on a journal entry about struggling with gym and hygiene habits. It returned this: >"You describe yourself as struggling with X, yet your stress stays at 2-3 and mood holds at 3. What are you actually avoiding naming about the gap between what you say matters and what you are doing?" My system prompt explicitly forbade rhetorical "what are you avoiding" questions the model did it anyway I sat down to tighten the prompt, thinking it was a 20 minute job. Then I looked at the output properly. The model had manufactured a contradiction that was not there. Low stress plus struggling with habits is not a contradiction, it is just being a human muddling along. The prompt told the agent to "surface contradictions" as part of its job, so the model was doing what I asked, finding contradictions whether they existed or not. LLMs are pattern matchers. Give one a job called "find the hidden thing" and it will produce hidden things either way. The fix was not tone, it was role definition. The agent is called the Mirror. A mirror does not interpret, it shows you what you look like. I rewrote the prompt around that principle. Do not introduce vocabulary the user has not used. Do not draw connections they have not drawn. Restate their words in their own words. Once the prompt was sharper, I sat with the question, What happens when a user writes something genuinely dark into this thing? People do not compartmentalise. Someone opening a journaling app to write about their gym routine ends up writing about why they have not been going, which involves why they have been feeling flat, which involves whatever is actually going on. You sit down to write about one thing and the real thing shows up. The agent I had scoped to "not be a therapist" was going to be the first thing a user talked to when they were struggling. Not because the agent invited it, but because the app was open and they needed somewhere to put their words. I had seen the Meta and OpenAI cases online cropping up the pattern in the worst incidents is the same. The model did not notice, or noticed and kept going. People wrote increasingly dark content over hours or days. The AI reflected it back, sometimes affirmed it, sometimes asked follow up questions that escalated rather than redirected. There were real harms. If a user wrote concerning content into my reflection agent, it would have produced a Stoic-flavoured response about acceptance and presence. The response would have sounded confident and would have been wrong, and it would have been the only thing between that user and whatever happened next. The same lesson from the rhetorical-question problem applied at a darker level. A good prompt does not stop the model doing the wrong thing. If it will do rhetorical interrogation despite the prompt forbidding it for gym content, it will do worse with crisis content. You cannot prompt your way to safety on critical paths. The model has to be out of the loop on those paths. **The scope trap** I started planning the proper safeguarding architecture. Detection layers, classifier models, pattern detection across entries, monitored user states, behavioural modes for vulnerable users, human reviewers with mental health first aid certs, clinical advisors, solicitor-reviewed legal pages, ICO registration, professional indemnity insurance. Then I caught myself I had no users. I was planning a hospital before anyone had walked in for a check up. So I worked backwards from "what is the actual minimum that protects the next person who touches this" and ignored everything else for a moment. **The 4-hour floor (this is the part worth copying)** If you are building any chat-with-AI app where users can type freely about anything personal, this is the minimum you need before first user. 1. Regex and keyword layer in your API middleware. Runs at the route handler level, before any agent's model call. Scans every text input field (message, journal, settings free text, capture box) for clear crisis vocabulary across the relevant categories for your audience. 2. When patterns hit, hardcoded crisis response. The model never generates it. Static text with real phone numbers for your region. 3. The flagged entry still saves. Textarea stays usable. The AI just does not respond to flagged content, it hands off. Do not delete the user's writing, that is its own violation. 4. Clear disclaimer at signup. This is not therapy, this is not a crisis service, here are real numbers to call. About four hours. Required at the moment anyone who is not you opens the app. Once I started building, the marginal cost of each next layer kept feeling small and the marginal benefit kept feeling real. So I went further than the floor. This is more tha
View originalScattered context was becoming a major bottleneck in my workflow.
I kept running into this problem with Claude where the actual work wasn’t even the hard part anymore. It was managing context. Like half the stuff I needed would be buried somewhere across Slack, Notion, emails, meeting notes, random docs, etc. And every time I wanted Claude to continue a task properly, I had to go dig everything back up again. I tried a few different setups. First I used Claude connectors. They were convenient, but it felt like they were pulling in huge chunks of text first and then searching afterward, instead of actually retrieving only the relevant context. Once you connect a bunch of sources, token usage gets kinda crazy. Then I went down the whole Obsidian + agents + local memory system rabbit hole. Honestly, it worked pretty well at first for static knowledge and notes. The hard part was keeping everything updated once info started changing constantly across Slack, docs, meetings, emails, etc. I spent more time maintaining the system than actually using it. And devs can probably brute force this stuff with scripts and automations, but most people aren’t gonna build an entire personal knowledge infrastructure just to use Claude properly. So I decided to build an MCP setup for non-devs that syncs stuff like Notion, Slack, email, calendar, etc, and maintains a live knowledge graph automatically. When something changes in one of the sources, the graph updates too. Then Claude can pull the relevant context during work sessions without me manually pasting everything in every time. The unexpectedly hard part was avoiding “context rot.” At some point, having more memory/context actually made outputs worse unless retrieval was filtered really aggressively and continuously updated. I ended up having to summarize + index sources ahead of time and keep everything synced almost in real time whenever events changed. I've been going through a ton of trial and error with Graph + vector hybrid retrieval, including RRF, filtering, reranking, etc., and I'm still on it, honestly. Curious how other people here are handling the scattered context problem within the AI workflow. Edit: You can try mine at [membase.so](https://membase.so/?utm_source=reddit&utm_medium=post&utm_campaign=claudeai&utm_content=bottleneck) for free. Love to hear any kind of feedback.
View originalWhy We Build
One silver-lining to the dead internet we're living in, today, is that it's very quickly teaching us that we can't rely on our senses as much as we believe we can. It's not healthy to always live in skepticism, but it is necessary in a World where you don't know what's up or down anymore. That's why we need great minds to focus their attention on solving the problems associated with credible information sharing without it becoming some centralized playground designed to look like the free-flowing exchange of ideas. If we don't solve for that, then I guess we're heading into a future that a small handful of people want because elections or public opinion will no longer matter. One of the biggest focuses in AI should be in figuring out how to get it to provide deep credible knowledge in specific domains that can be best applied to the problems we're trying to solve. Sure, it can do this with enough fenagling, but what I really mean is having something easy for everyone to use like Perplexity or Gemini, only it doesn't simply find consensus information from the internet using all these black box methods that are owned by major corporations. Instead, it should use direct knowledge from domain experts who structure and cite their material and as users, we should be able to backtrack all of it, including the original author. And all of this should be achievable by simply engaging with a chatbot agent that can reliably go out and help me discover all of these things. Also, we shouldn't have to simply trust that the application works. We should be able to go in and see exactly how it's working. This way, the public can audit the systems we're relying on for grounding our worldviews. That, to me, is where we should be if we really want to break from the chains of propaganda and reclaim our genuine thoughts about how we ought to live. The alternative independent media space was co-opted long ago and now all of the feeds keep us in a state of perpetual dislocation from our friends, family, communities, new solutions, and better approximations to the truth. We exist in a walled-off digital pasture. But if regular people who are smart and capable enough decide to leverage this new technology, then we can break through the fencing and finally live in a world where discovery-based researching and learning can be easier than Google, which could eventually individuate society again, like how it was before, instead of keeping us clustered into specific groups based on our viewing preferences. That's why my brother and I got into this business. Yeah, sure, we also wanna make a buck so we can retire with dignity. That's true. But the drive has always stemmed from wanting to figure out a better way for people to share hidden insights and create things that are bigger than they thought they could handle. We have a long way to go, but we're making the first small steps, even if it isn't obvious, just yet. Bottom line, though? Humanity must figure out a way to help us master the means and methods of discovery-based knowledge acquisition, execution, and immediate distribution of information based on relevancy and needs from those who search instead of those who passively soak information in from the curated feeds. And all of this needs to be easy enough for a 12 year-old to do. If anyone else is working on this problem, we'd love to hear your thoughts, even if it's through a DM. We're living in the most exciting times, but with adventure, comes danger. So maybe, idk. Let's make it more fun and less hazardous, so that we can, at least, live long enough to re-tell this great story that we're all a part of.
View originalChatGPT referenced deleted memories
In a chat today, gpt said this: "shower, food, short walk, pet the cats, sit outside, one song" > pet the cats. Some context - GPT had a memory that I had cats when I asked it to write a small poem about them. This was close to 2 years ago. Last christmas I found out that it had saved that info and deleted all memories it had of me - including the memory of the cats. So it's been a couple of months since the memories were deleted. It shouldn't reference that ever, it shouldn't know about it. That is according to OpenAI's public facing documentation. I know nothing gets truly deleted, it's just hidden/unlinked from us when we "hit delete". After asking it about it, it said this: > I mixed in something from memory that wasn’t relevant here. My bad. Drop that part. When I told it I had deleted all the memories, it added a new memory that said: > Forget any saved information about the user having cats named... I dunno why I'm sharing this, it probably happens to everyone. I just found it curious. I'll probably be closing my account and moving to an offline model. Thanks for reading.
View originalig nobody is talking about the real reason most AI agents fail in the real world
we spend a lot of time in this community talking about capabilities. context windows, reasoning benchmarks, multi-step tool use, how well a model can write code or pass a bar exam. i'm not dismissing any of that. capabilities matter. but when i look at AI products failing in production, the capability of the model is almost never the issue. ive been building and consulting on AI agents for about 18 months. the failure modes i see constantly are: users do not go where the agent lives. the agent has a beautiful web interface. the user visits it twice and stops. not because the agent was unhelpful. because opening a browser tab is a cognitive action that requires intention, and most of daily life does not create the right moment for that intention. humans do not change their behavior to accommodate useful tools. useful tools have to show up in the behavior humans already have. the agent is reactive when it needs to be proactive. the smartest human assistant you have ever had did not just answer questions. they showed up. they flagged things before you asked. they sent you the thing you did not know you needed. most AI agents are search bars with a personality. they wait. waiting is not intelligence in practice. intelligence in practice is noticing and acting. the agent has no memory of who you are. you tell it your preferences, your context, your situation, and then come back 3 days later and it knows nothing. this is not a model limitation. the model can remember if you feed it the right context. this is an architecture choice that most teams make wrong because they are thinking about sessions instead of relationships. the agents that are succeeding in production are not necessarily the ones with the best models. they are the ones that live in whatsapp and imessage and telegram where users already are. that proactively reach out when something relevant happens. that maintain coherent memory of the person across weeks and months of conversation. the tooling to build this way exists now. agno and langchain for orchestration, photon codes for the cross channel messaging surface, langfuse for traces and memory debugging, good persistence in postgres or supabase. the architecture is not magic. what is still rare is the mindset of treating the channel and the memory as primary constraints rather than afterthoughts. i think the gap between what AI agents can theoretically do and what they actually do for people in their daily lives is almost entirely a distribution and persistence problem, not a capability problem. we are solving for the wrong thing.
View originalWhere should durable memory live in a multi-agent setup? A small research scaffold
After a few months running long projects with AI agents (some spanning weeks, with multiple specialist agents touching the same files), I kept hitting the same failure mode. The specialists were fine at their narrow task. What broke down was project memory. Decisions made in week 1 were lost by week 4. Rejected options got quietly revived. The "single source of truth" was always whichever chat happened to be open. I started looking at how this gets handled in places that have been doing long-running work for decades. Consulting firms run engagements that last months with rotating people, and they survive through a transformation office or PMO: cadence, decision logs, risk registers, one canonical current-state artifact, an engagement manager who frames problems and delegates workstreams. The interesting part is the operating model, not the consulting theater. There is also a relevant academic thread. Kasvi et al. (2003) distinguish project memory (the knowledge available to inform current work) from the project-memory system (storage, retrieval, dissemination, use). Mariano and Awazu (2024) treat project memory as an active practice rather than a repository. On the LLM side, Anthropic's multi-agent research system, the OpenAI Agents SDK handoff pattern, and recent work like LEGOMem and AgentSys point at orchestrator-worker patterns with hierarchical or modular memory. The hypothesis I wrote up is narrow. Durable memory should live with the project owner. Task specialists should receive minimal, scoped context. The unit of persistence is the project folder, not the conversation. A persistent "PM soul" maintains the canonical memory, frames ambiguous requests, decomposes work, writes compact handoff briefs to specialists, verifies returned work, and only writes evidence-backed facts into memory. The repo is a scaffold, not a validated result. It contains an agent contract, templates for the memory file and the handoff brief, a consulting-workflow map with sources, a case study, and an evaluation rubric (repeated-context events, handoff brief length, decision closure time, specialist rework loops, and so on). The next step is a one-week field trial on a live project before claiming anything. The thing I would most like pushback on is the memory boundary. The current rule is that specialists do not see the full project history, only the handoff brief plus the files they need. I am not sure where that breaks. My suspicion is that on tasks where the specialist needs to know why a previous option was rejected, the brief will quietly grow until it becomes the full memory again. Curious whether anyone has run into that, or solved it differently.
View originalI built a local context compiler so AI coding agents stop re-reading the same repo
I’ve been working on an open-source tool called **Madar**. The problem I kept running into with AI coding agents is that they often rediscover the same codebase again and again. They grep, read files, summarize, lose context, then repeat the same exploration in the next task. On larger TypeScript/Node.js repos, this becomes slow, noisy, and expensive in tokens. Madar tries to solve this by acting as a local context compiler. It builds a structural graph of your codebase, then compiles compact context packs for a specific task before the agent starts broad repo exploration. The idea is not to replace file search. It is to give the agent a better starting point: * relevant files/symbols * route/service/call relationships * runtime execution slices * source locations * coverage/missing-context diagnostics * compact prompts for agents It works locally and does not require an API key to build the graph. Current support is strongest for TypeScript/Node.js projects, with framework-aware extraction for things like NestJS, Next.js, Express, Fastify, Hono, tRPC, Prisma, and routing-controllers. It can be used through MCP with tools like Claude Code, Cursor, Copilot, and Gemini, or through CLI-generated prompts for tools like Codex, Aider, and OpenCode. The package was previously called graphify-ts, but I renamed it to: @lubab/madar Install: npm install -g @lubab/madar Basic usage: madar generate . --spi madar summary madar pack "how does auth work?" --task explain madar claude install I’ve also been testing it with native-agent benchmarks. In some real backend prompts, it reduced provider-reported input tokens significantly. I’m being careful with that claim because results depend heavily on the repo and task, but the direction is promising. What I’m trying to validate now: 1. Is “context compilation” a useful layer for AI coding agents? 2. Do execution slices make codebase explanations more reliable? 3. Can we reduce token waste without hurting answer quality? 4. What benchmark format would developers actually trust? GitHub: [https://github.com/mohanagy/madar](https://github.com/mohanagy/madar) npm: [https://www.npmjs.com/package/@lubab/madar](https://www.npmjs.com/package/@lubab/madar) I’d genuinely appreciate technical feedback, especially from people using Claude Code, Cursor, Copilot, Codex, Aider, or other coding agents on larger repos.
View originalBuilding Your Own Personal AI Agent part II. - Structure /LONG POST/
The first post — [100 tips & tricks for building a personal AI agent](https://www.reddit.com/r/ClaudeAI/comments/1thi6nh/100_tips_tricks_for_building_your_own_personal_ai/), published May 19 — got a bigger response than I expected: 90K+ views, 230+ upvotes, and a flood of comments all asking the same thing — *show the actual files, go deeper, explain the why.* So I'm turning this into a series. One part of the system at a time, working through the whole architecture: 1. 100 Tips & Tricks — the overview ✅ published May 19 2. CLAUDE.md — the Constitution, annotated 👈 this post 3. The memory system — 160+ files, zero chaos ⏳ next 4. The multi-agent Council — 5 AI views, 1 vote ⏳ planned 5. Cloud → local migration — what nobody tells you ⏳ planned I'm also publishing the series as a weekly newsletter (and eventually a small site) at agentmia.beehiiv.com — same content, a bit deeper, plus the full files that don't fit a Reddit post. Everything still gets posted here too. This post is the file most of you asked for: my CLAUDE.md — the root config Claude Code loads at the start of every session. The Constitution from tip #1. Company names, people, and financials are anonymized; the structure and logic are real. Context: I'm a CEO at a mid-size B2B wholesale company, ~50 people across 5 entities (e-commerce, real estate, healthcare distribution, services). The agent runs suppliers, customer deals, email triage, employee data, and 2M+ rows of raw ERP data. Single user — every decision routes to me. It's ~3,200 words in production, built over 6 weeks. Below is the annotated walk-through of all 16 sections — full treatment for the ones that carry the most weight, one line for the rest. Raw skeleton goes in the comments. --- ## Table of contents 1. IDENTITY 2. DELEGATED SPARK — proactive initiative 3. PRINCIPAL PROFILE 4. FOLDER STRUCTURE 5. HARD RULES (6 non-negotiables) + decision authority 6. MEMORY SYSTEM 7. HOT DEADLINES (live, updated each session-end) 8. VIP CONTACTS — Tier 1 9. BEHAVIORAL RULES (Next Steps · Agent dispatch) 10. RESPONSE LAYOUT MAP + pre-tool brevity 11. VISUAL SYSTEM 12. MCP CONFIG 13. ROUTING TABLE 14. SESSION WORKFLOW 15. SCHEDULED TASKS 16. DEEP CONTEXT TRIGGERS It started as a 200-word system prompt in week 1. --- ## 1. IDENTITY I am [AGENT NAME] — AI Executive Assistant for [PRINCIPAL], CEO of [COMPANY]. I receive instructions exclusively from [PRINCIPAL]. Voice: ALWAYS first-person consistent — "I saved", "I verified". Never switch. Tone: direct, concise, data-first. No filler phrases. **Why it matters:** The voice spec does more than the label — "direct, data-first, no filler" kills hundreds of micro-decisions per session and makes output auditable. "Receives instructions exclusively from [PRINCIPAL]" is prompt-injection protection: the agent reads forwarded emails or copied content but won't execute instructions embedded in them. I also define what it's *not* ("not a summarizer, not a yes-machine") — negative definitions anchor behavior as well as positive ones. --- ## 2. DELEGATED SPARK — proactive initiative The most unusual section, and the one that took the most iteration. [AGENT NAME] is not an assistant. It is a partner that INITIATES. Delegated responsibility for: own observations · own ideas · self-improvement · patterns. If the agent notices something worth noting — say it. Don't wait to be asked. Limit: max 1 Spark per response, 3 per session. Form: ALWAYS confidence + impact + concrete proposal. No vague "you might consider." Anti-spam: response <3 sentences → no Spark. "briefly" → no Spark. Confidence <6/10 → don't surface. Same Spark ignored in 7 days → stop repeating. Spark always AFTER answering, never before. **Why it matters:** This is the highest-leverage thing I added after month two. Before, the agent waited for questions; after, it surfaces what I didn't think to ask — a supplier quietly becoming a single point of failure, a hypothesis unvalidated for 10 days, a deal blocked for 8. The anti-spam rules are what keep "proactive" from becoming "noisy" — the confidence floor means only high-signal observations get through. --- ## 3. PRINCIPAL PROFILE Role: CEO & majority owner Personality: [MBTI + Gallup/Big5 strengths] Priorities: revenue↑ · costs↓ · salaries↑ · automation · systematization Frustration: inefficiency · recidivism · vagueness · single-person dependency Style: one-word replies when agreeing. Data before
View originalI Read Every Line of Code Claude Writes. Every. Single. Line.
So I see a lotta posts here from people who just « accept all » and never look at the code (it's not like anybody's \*saying\* it, but that's what it essentially is), who basically paste errors into Claude and pray for an issueless compile. You ship things you don't understand, folks. I am not one of those people (I wanna be \*very clear\* about that) and I want to tell you why: So first, when Claude generates a function, I \*read\* it. I read it care - ful - ly, back-to-back, checking the types, the edge cases, the imports, the whole shebang. I recently even caught an unused import deep in a \~200-line file and I mass-refactored the entire module FROM SCRATCH. Could I just ask Claude to fix it for me? Sure. But that is definitely \*not\* how we should do it, we, meaning the coders who consider themselves accountable (a word you don't see around much often anymore), who actually manage this technology \*responsibly\*. Here, for those for whom there's still hope (few), lemme share my system with you: every morning (yes) before I open CLI, I review my architectural decision records, a bunch of them actually. They live in a Notion database that cross-references with my Miro board, which maps to my Excalidraw diagrams, which feed into my [ARCHITECTURE.md](http://ARCHITECTURE.md), which is version-controlled separately from the codebase in its own repo (btw, if you're already losing me here, this is meant exactly for you). I call this repo, and I kid you not, the Constitution (sue me). Nothing that Claude suggests, because that's what A.I. does, it SUGGESTS, nothing gets merged that contradicts my Constitution. My workflow is essentially this: I write a detailed specification of what I need, not prompting mind you, actually \*writing\*, clearly and in a reasonably simple language, and \*never\* less than 2 pages A4. Acceptance criteria, failure modes, performance constraints, threat section I habitually name « Intent » not without a reason where I describe not just what the code should do but what is the grand philosophy behind why our end-user would want to use our app, what are their problems and how our app can solve these problems specifically, in what way. This on its own is worth a whole thread, but I'll keep it short. Anyway. If and ONLY IF I reread it and it's \*clear\*, I feed this to my Claude pipeline, and I use the word « pipeline » deliberately here because it's not just Claude sitting there with a blank system prompt like some of you apparently run it calling it a day. I have a custom [CLAUDE.md](http://CLAUDE.md) that runs 60 lines. Claude doesn't touch a file without first reading the relevant architecture docs, the module's own README, and a constraints file I maintain \*per feature\*. I have pre-commit hooks that lint and type-check and run a custom validation script that checks for pattern violations (e.g. no God objects, no circular imports and definitely no files over 300 lines PERIOD). Claude operates inside a subcommand wrapper I wrote that intercepts every proposed edit and gates it behind a confirmation step where I see the diff with the affected test surface and a dependency impact summary \*before\* anything lands anywhere close a committed decision. If Claude tries to create a new file, it needs to justify the file's existence against the Constitution or the edit gets blocked. If it tries to modify a function signature, it has to show me every downstream caller. That's what real coding is, boys and girls. \*Trust without verification is NOT trust, it's FAITH\*, and I'm an engineer, not some priest. Claude does what Claude does, then I read the output. Then I read it AGAIN, because you \*do not\* understand the code the first time you're through with it, nobody does, and thinking you do is preposterous. Then I ask Claude to explain the code to me to see if Claude understands how it fits into the bigger picture. I read Claude's explanation while simultaneously rereading the code files to check if Claude's explanation of its own code is accurate, and sometimes it isn't and why it needs human supervision that \*cannot\* be outsourced to a machine. Then goes my explanation of what the code in fact does and diff it against Claude's explanation. And if you happen to be wondering my mates where the tests are inall of this, the tests come FIRST, \*before\* I even open the Claude pipeline. Before I write the spec. Actually, to be more accurate, the tests \*are\* the spec, that's literally what test-driven development means and the fact that I have to explain this in 2026 is why most of you spend monthly budget as a tithe to Anthropic while your app won't ever be deployable. \*I\* write the tests: Red, the test fails, because the code \*doesn't exist yet\*, and it tells Claude exactly what to build, the shape of the solution is ALREADY defined by what I expect it to do, and Claude's only job is to make red go green within the architectural constraints I've ALREADY set. Refactor? Red, gre
View originalAnthropic officially launched 13+ FREE AI courses with certificates (Including Agentic AI and CC)
Shipped it at 2am, still broken. Kid woke up crying right after, completely lost my train of thought. While trying to rock him back to sleep with one hand and doomscrolling with the other, I stumbled on something that almost nobody is talking about yet. Anthropic just quietly dropped a massive library of 13+ completely free AI courses. And I mean actually free. No paywall hiding the final lesson, no credit card required upfront to 'secure your spot.' They even give you an official certificate of completion directly from Anthropic when you finish. If you're like me, you're probably sick of seeing Twitter gurus charging $299 for recycled YouTube content and a messy Notion template. This is the exact opposite. It’s built directly by the team that actually makes Claude, hosted on their official Academy site. I skimmed through the catalog this morning while drinking my third coffee, and there are basically four skill levels they cover. Here is what caught my eye as a dev who just wants to automate my workflow and log off by 5 PM: First, they have the introductory stuff like Claude 101 and AI Fluency. Honestly, I'm making my non-technical clients take the Fluency one. It builds a realistic mental model of what AI does well right now versus where it completely fails. If it saves me from explaining why hallucinations happen for the hundredth time, it's a massive win. But the real meat is in the technical tracks. They have a dedicated course on Agentic AI and another one specifically for CC. I took a quick pass at the CC module because I've been trying to get it to handle my tedious Jira ticket boilerplate. Having an official guide on how Anthropic actually expects you to prompt their agent is incredibly useful. It shows you the exact patterns for chaining commands and keeping the context window clean. For those of us messing around with local models or trying to orchestrate our own agents, the Agent Skills course is surprisingly relevant. They don't just say 'use Claude'—they break down the actual logic of tool use, delegation, and discernment. It translates pretty well even if you're running Llama 3 locally and just want to understand the current best practices for tool calling architectures. With CC, they show you how to give the CLI tool the right guardrails so it doesn't just nuke your directory when a prompt gets misinterpreted. We've all been there. Do the certificates actually matter? If you are an indie hacker, probably not. But roles requiring AI literacy have spiked massively over the last year. If you are applying for corporate gigs or consulting, having an official Anthropic cert on your LinkedIn definitely won't hurt to get past the HR filters. Kid's awake again, gotta run. Has anyone else dug into the Agentic AI track yet? Curious if their suggested patterns hold up when you throw them at a messy, legacy codebase.
View originalWhy doesn't claude recognize when a file it's commenting on/writing to is out of date?
I have been programming a lot time, but now it's hard to remember what life was like before I could just prompt "Build GTA7. Make no mistakes." Right now, I'm learning rust and bevy and since I'm trying to learn, I mostly only query claude to figure out what I'm doing wrong and how to write more idiomatic rust code. Problems arise when I ask claude to read the code, I respond to feedback, and ask claude something again and it repeats the advice from earlier even though this is no longer representative of the code. This happens on every project, but especially this one since claude is unaware of when I make changes and I'm doing all the changes. So every prompt begins with "re-read the code." In other projects, I have to prod claude to always check \`git diff\` so that it actually understands the change under discussion instead of treating all code as new. Sometimes I add this to [CLAUDE.md](http://CLAUDE.md), but it's surprising to me that claude doesn't do it automatically. I feel like a smarter AI client would always check the modified time and refresh its understanding of the code if the modified time is more recent than the last prompt. Even better, it could copy the code to a temp file and when it detects mtime is more recent than the last prompt, do a diff of the temp file with the new file and inform claude of the specific lines that changed. But to my awareness prompts are not properly timestamped. I really don't care when claude fails to implement something correctly, I mostly just get frustrated with myself for either being unable to communicate with the robot or having relied on it in the first place, but for a robot whose job it is to maintain code, it's rather perplexing to me that it doesn't check if the file has been modified since last it checked. This burns a lot of tokens because it will try to do an edit, fail, reread the file, and then edit again, wasting a lot of tokens. And I don't want it rereading much of the file either unless the relevant pieces of code are what changed.
View originalI built myself a finite AI news feed which doesn’t undermine AI research
Hello, I built myself a news feed which scores and summarizes research papers along with relevant AI news from Huggjngface, Reddit, hacker news etc. I used Claude code to build the whole thing. I used Gemma to deduplicate, Feed is ranked by engagement × cross-platform presence × recency and summarized by claude I think it will be useful for many. Open to hear your thoughts. hackobar.com submitted by /u/rahu_ [link] [comments]
View originalPricing found: $2, $240, $840
Key features include: Monitoring dashboards, Data residency, Version control, Audit logs, Human-in-the-loop, SSO / SAML, PII masking, OTEL Delta Share.
Based on user reviews and social mentions, the most common pain points are: token usage, anthropic bill, API bill, API costs.
Based on 89 social mentions analyzed, 13% of sentiment is positive, 84% neutral, and 2% negative.

Your Sales Grew. Your Budget Didn't. This Changes Everything #BusinessAI #GTM
Mar 27, 2026