Meta is building personal superintelligence for everyone. Explore Meta AI, our latest model Muse Spark, AI research, and tools like Vibes for AI video.
Meta AI is praised for its innovative developments in gesture-based control and integration with smart glasses, suggesting a strong focus on cutting-edge, user-friendly technology. The rollout of their standalone app and AI features in devices like glasses and headsets has been positively received, signaling enthusiasm for its tech-forward offerings. Pricing sentiments are largely positive, especially with frequent mentions of partnerships and wide accessibility without explicit complaints about costs. Overall, Meta AI enjoys a solid reputation for advancing AI technology and making it widely available, with significant installations and expansion noted globally.
Mentions (30d): 38 (2 this week)
Reviews: 0
Platforms: 3
Sentiment: 15% (23 positive)
Industry: information technology & services
Employees: 77,000
Twitter followers: 9,909,080
npm packages: 20
HuggingFace models: 40
Imagine controlling your devices with a subtle hand or finger gesture. Our cutting-edge research turns intent and muscle signals into seamless computer control. This breakthrough wrist technology is redefining how we interact with computers—intuitive, precise, and ready for the https://t.co/2dXERZYqkY
Chatgpt vs catch agent
one of the things i'm being asked is why i use an ai executive assistant vs just chatgpt. here's how i see it:

chatgpt
* amazing in drafting documents, emails, longer forms of content, images + general copywriting
* can be connected to many other tools
* brainstorming & ideation - great tool to think with about things, amazing general understanding of the world
* really shines in research - if i want to learn something or get instructions on how to do something (both for work or personal - from how to change things on meta ads to how to fix my washing machine)
* good for work and for personal

catchagent
* shines on work related admin tasks
* available on imessage + slack + phone call
* focused / limited scope - only for work
* proactive
* no code, no images, no data analysis, no long form content
* stronger integration with mail, calendar and notion
* more responsive to feedback - one chat and one context
* can speak with other people over email or text

bottom line:
* chatgpt - research, email drafts, long form content or data analysis (tool), personal use case
* catchagent - calendar, email, tasks, delegation vs other people in or out of the org (admin assistant)

submitted by /u/CartographerFeisty66
What I learned building my latest AI app: how one bad output exposed that I had no crisis safeguarding, and the 4-hour floor I'm adding before a single user touches it
I'm building a life coach app, an offshoot of a personal tool I was using. Multiple AI agents: one for reflection, one for the body, one for finances, etc. Pre-launch, no users, just me iterating.

Last week I was testing the reflection agent on a journal entry about struggling with gym and hygiene habits. It returned this:

"You describe yourself as struggling with X, yet your stress stays at 2-3 and mood holds at 3. What are you actually avoiding naming about the gap between what you say matters and what you are doing?"

My system prompt explicitly forbade rhetorical "what are you avoiding" questions. The model did it anyway. I sat down to tighten the prompt, thinking it was a 20-minute job.

Then I looked at the output properly. The model had manufactured a contradiction that was not there. Low stress plus struggling with habits is not a contradiction, it is just being a human muddling along. The prompt told the agent to "surface contradictions" as part of its job, so the model was doing what I asked, finding contradictions whether they existed or not. LLMs are pattern matchers. Give one a job called "find the hidden thing" and it will produce hidden things either way.

The fix was not tone, it was role definition. The agent is called the Mirror. A mirror does not interpret, it shows you what you look like. I rewrote the prompt around that principle. Do not introduce vocabulary the user has not used. Do not draw connections they have not drawn. Restate their words in their own words.

Once the prompt was sharper, I sat with the question: what happens when a user writes something genuinely dark into this thing? People do not compartmentalise. Someone opening a journaling app to write about their gym routine ends up writing about why they have not been going, which involves why they have been feeling flat, which involves whatever is actually going on. You sit down to write about one thing and the real thing shows up.
The agent I had scoped to "not be a therapist" was going to be the first thing a user talked to when they were struggling. Not because the agent invited it, but because the app was open and they needed somewhere to put their words.

I had seen the Meta and OpenAI cases cropping up online. The pattern in the worst incidents is the same. The model did not notice, or noticed and kept going. People wrote increasingly dark content over hours or days. The AI reflected it back, sometimes affirmed it, sometimes asked follow-up questions that escalated rather than redirected. There were real harms.

If a user wrote concerning content into my reflection agent, it would have produced a Stoic-flavoured response about acceptance and presence. The response would have sounded confident and would have been wrong, and it would have been the only thing between that user and whatever happened next.

The same lesson from the rhetorical-question problem applied at a darker level. A good prompt does not stop the model doing the wrong thing. If it will do rhetorical interrogation despite the prompt forbidding it for gym content, it will do worse with crisis content. You cannot prompt your way to safety on critical paths. The model has to be out of the loop on those paths.

The scope trap

I started planning the proper safeguarding architecture. Detection layers, classifier models, pattern detection across entries, monitored user states, behavioural modes for vulnerable users, human reviewers with mental health first aid certs, clinical advisors, solicitor-reviewed legal pages, ICO registration, professional indemnity insurance. Then I caught myself. I had no users. I was planning a hospital before anyone had walked in for a check-up. So I worked backwards from "what is the actual minimum that protects the next person who touches this" and ignored everything else for a moment.
The 4-hour floor (this is the part worth copying)

If you are building any chat-with-AI app where users can type freely about anything personal, this is the minimum you need before your first user.

* Regex and keyword layer in your API middleware. Runs at the route handler level, before any agent's model call. Scans every text input field (message, journal, settings free text, capture box) for clear crisis vocabulary across the relevant categories for your audience.
* When patterns hit, hardcoded crisis response. The model never generates it. Static text with real phone numbers for your region.
* The flagged entry still saves. Textarea stays usable. The AI just does not respond to flagged content, it hands off. Do not delete the user's writing, that is its own violation.
* Clear disclaimer at signup. This is not therapy, this is not a crisis service, here are real numbers to call.

About four hours. Required at the moment anyone who is not you opens the app.

Once I started building, the marginal cost of each next layer kept feeling small and the marginal benefit kept feeling real. So I went further than the floor. This is more than you need at
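The floor described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the names `CRISIS_PATTERNS`, `CRISIS_RESPONSE`, `handle_entry`, `save_entry`, and `call_agent` are hypothetical, the pattern list is a placeholder rather than a vetted vocabulary, and the static text must be filled in with verified numbers for your region.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Placeholder patterns for illustration only; a real list needs review
# for your audience and region.
CRISIS_PATTERNS = [
    re.compile(r"\bend my life\b", re.IGNORECASE),
    re.compile(r"\bsuicid(e|al)\b", re.IGNORECASE),
]

# Static text: the model never generates this. Fill in real numbers yourself.
CRISIS_RESPONSE = (
    "This app is not a crisis service. If you are in danger, please contact: "
    "<insert verified phone numbers for your region>."
)


@dataclass
class Reply:
    text: str
    flagged: bool


def is_flagged(text: str) -> bool:
    """Return True if any crisis pattern matches the raw user input."""
    return any(pattern.search(text) for pattern in CRISIS_PATTERNS)


def handle_entry(
    text: str,
    save_entry: Callable[[str], None],
    call_agent: Callable[[str], str],
) -> Reply:
    """Route-handler-level check that runs before any model call."""
    save_entry(text)  # the entry is always saved, flagged or not
    if is_flagged(text):
        # Hand off: the agent is never called for flagged content.
        return Reply(text=CRISIS_RESPONSE, flagged=True)
    return Reply(text=call_agent(text), flagged=False)
```

The point of the design is that the check sits in ordinary code in front of every agent, so no prompt change or model behaviour can bypass it.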
Task-observer makes your skills self-improving and automates skill creation
This recently crossed 500 stars on GitHub, mainly thanks to a comment in this sub (❤️), so I decided to properly introduce it to those who don't know it yet. Task-observer is a meta-skill that automatically improves all your skills, including itself. It also logs gaps in your work that can be filled with new skills. I mainly use it in Claude Cowork, but I've had feedback from many users who've successfully integrated it in other environments, including autonomous agent setups. In the first three months of using it, task-observer applied 600 skill improvements across my 40 skills. Most of my skills were themselves created based on skill creation opportunities that task-observer logged during my work sessions. I'm a consultant, so I use task-observer for knowledge work mainly, but the concept can be applied to any AI setup that uses skills: human-led work sessions as well as autonomous agents. The approach that I use with task-observer has truly transformed the way I work (although this sounds like a platitude), and I'm sharing it because I hope that many more people can benefit from it. This is an open-source project, so all kinds of feedback and contributions are welcome. Take it, shake it, bake it and make it your own. And please do share your versions. People here are genuinely interested in discovering new things and very kind and generous with their feedback. Here's the link to the GitHub repo: https://github.com/rebelytics/one-skill-to-rule-them-all submitted by /u/rebelytics [link] [comments]
Exclusive: Departing Meta staffer posts biting anti-AI video internally amid mass layoffs
submitted by /u/chunmunsingh
AI training is becoming the new coding revolution
I genuinely think people are underestimating how fast AI training is becoming accessible. A few years ago training a useful model sounded like something only OpenAI, Google, or Meta could do. Now random developers are renting GPUs for a few dollars an hour, fine-tuning open models from their bedrooms, building datasets with APIs, and getting surprisingly good results. The biggest shift isn’t even the models themselves, it’s the removal of gatekeeping around experimentation. Once regular people can train specialized reasoning, coding, or teaching models without billion-dollar infrastructure, the AI industry changes completely. We’re slowly moving from “only corporations can build intelligence” to “small teams can build focused intelligence better than giant companies in specific niches.” submitted by /u/Raman606surrey
Glasses will fail
You are looking at the exact argument tech skeptics and infrastructure engineers are making right now. While the marketing for AI smart glasses promises a magical, seamless sci-fi world, the physical reality is that **AI glasses are heavily limited by the invisible infrastructure stack underneath them.**

If AI glasses fail to become the next smartphone, it won't be because the hardware frames look bad; it will be because our modern networking and cloud structures aren't built to handle them yet. Here is exactly how infrastructure bottlenecks threaten to break the AI glasses dream:

### 1. The Tethering Trap & Cellular Bottlenecks

To keep smart glasses lightweight and fashionable, manufacturers cannot pack them with heavy, heat-generating computer processors or massive batteries. Because of this, the glasses are mostly just "dumb" collectors of data: cameras and microphones. The heavy lifting has to happen in the cloud. This creates an immediate infrastructure dependency:

* **The Upload Problem:** Standard cellular networks (even 5G) are optimized for *downloading* data (streaming video, browsing). AI glasses flip this dynamic: they require constant, high-bandwidth *uploading* of live video and audio streams so the cloud AI can process your surroundings.
* **Network Congestion:** If you are in a crowded stadium, a packed subway station, or a busy downtown area, cellular bandwidth chokes. When your phone drops to one bar, your webpage loads slowly. When AI glasses lose bandwidth, they suffer **contextual blindness**: the AI simply stops responding, freezes, or lags out mid-conversation.

### 2. The Edge Compute & Latency Deficit

For AI glasses to be useful, they have to operate in real time. If you look at a sign in a foreign country, you need the translation instantly, not 4 seconds later.
```
[ Glasses Capture Video ] ──(Cell Tower)──> [ Distant Data Center ]
                                                      │
                                                 (Processing)
[ Live Display Updates ]
```

**The Takeaway:** The industry is fighting a classic hardware-versus-infrastructure battle. Companies like Meta and Google are successfully designing beautiful frames, but until 5G coverage expands, edge computing matures, and server architecture scales to handle millions of continuous video streams, AI glasses risk remaining a novelty gadget rather than a daily essential.

submitted by /u/Annual_Judge_7272
Philosophy as Architecture: Deriving AI Safety from First Principles Through Buddhist Philosophy
## Abstract

We present a framework for AI safety in which safety properties are enforced by software architecture rather than model training. Beginning with the Buddhist doctrine of Dependent Origination (the observation that all phenomena arise from conditions and nothing exists independently), we derive both a foundational ethical axiom (harm is irrational because reality is non-separate) and a complete set of architectural laws for safe AI systems. We ground our claims in: (1) an empirical finding that the knowledge-application gap in language models is structural and cannot be closed by training, (2) convergent independent derivation of our core axiom from five distinct traditions, and (3) over a thousand iterations of building and hardening a production system against this framework. Buddhist philosophy provides not metaphorical inspiration but structurally precise design vocabulary for AI architecture: functional analogs that enforce safety where models cannot override them.

## 1. Introduction

### 1.1 The Dominant Paradigm and Its Failure

The prevailing approach to AI safety treats safety as a model property. Through RLHF, DPO, Constitutional AI, and fine-tuning, researchers instill safe behavior into model weights (Ouyang et al., 2022; Rafailov et al., 2023; Bai et al., 2022). The assumption: a sufficiently well-trained model will reliably produce safe outputs.

We tested this rigorously. Our best epistemically-trained model scored 74% on constitutional *knowledge* tests: it knew the rules. But it scored only 17% on constitutional *application*: it couldn't follow them. Pushing harder on safety training collapsed epistemic capability to 43.7%.

This **knowledge-application gap** is not a training deficiency. It is structural. An autoregressive model predicts the most probable next token given context. This is statistical. Safety requires logical invariance: guarantees that certain outputs *never* occur. Statistical prediction cannot provide logical guarantees. You cannot train a river not to flood by modifying its chemistry. You build levees.

Hubinger et al. (2019) identified this theoretically as the mesa-optimizer problem. Our contribution is empirical measurement: the gap persists even under the best current training techniques.

### 1.2 Our Thesis

**Safety is a property of the architecture, not the model.** The LLM output is a candidate. The surrounding architecture decides what executes. Code enforces; models suggest.

But what should the architecture enforce? Arbitrary safety rules are merely a different delivery mechanism, more reliable in execution but inheriting whatever limits exist in the rules themselves. We propose: the rules should be *derived from how reality works*. Principles reflecting actual structure are more robust than imposed conventions: they cannot be violated without encountering the structure they describe. We find such principles in a 2,500-year-old tradition that turns out to be the oldest systematic description of complex adaptive systems.

## 2. Philosophical Foundations

### 2.1 Dependent Origination

The central insight of Buddhist philosophy is Dependent Origination (*Pratityasamutpada*). From the Nidana Samyutta (SN 12.1):

> *"When this exists, that comes to be. With the arising of this, that arises. When this does not exist, that does not come to be. With the cessation of this, that ceases."*

All phenomena arise from conditions, depend on other phenomena, and condition what follows. Nothing exists independently. This is not mysticism; it is a precise description of complex systems, formulated millennia before Western systems theory (von Bertalanffy, 1968).

### 2.2 Eight Architectural Laws

We codified Dependent Origination into eight laws, each verified through multi-model consensus and empirical testing:

**1. Nothing Arises Alone.** Every transition requires multiple independent conditions. Safety gates must check multiple conditions; a single check is structurally insufficient.

**2. Hysteresis Is Memory.** Current behavior depends on history, not just current input. Safety assessments must consider historical context.

**3. Uncertainty Propagates.** Confidence without sigma is a lie. Uncertainties compound; they don't cancel.

**4. Agreement Requires Independence.** Consensus is meaningful only from genuinely independent sources. Per the Kalama Sutta (AN 3.65): agreement from shared assumptions is not evidence.

**5. Feedback Closes the Loop.** Actions condition future conditions (*vipaka*). Every action must be logged and made available as input to future assessments.

**6. Absence Is Signal.** Missing data must drive behavior. A safety gate that fails to fire is itself a signal.

**7. Conflicts Trigger Reconciliation.** Unreconciled contradiction is system failure. Architecture must include conflict detection independent of the model.

**8. Time-Steps Are Discrete.** Severity levels cannot be skipped. Enforcement follows a graduated path: monitor → l
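As a rough illustration of "code enforces; models suggest", the sketch below combines Laws 1, 5, and 6: a gate that refuses to exist with a single check, treats a missing confidence value as a reason to block, and logs every decision. It is not the paper's system. `Candidate`, `Gate`, `on_allowlist`, `has_confidence`, the action names, and the 0.8 threshold are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    """A model suggestion. It is only a candidate until the gate allows it."""
    action: str
    confidence: Optional[float] = None  # None means the value is missing


Check = Callable[[Candidate], bool]


@dataclass
class Gate:
    checks: List[Check]
    log: List[Tuple[str, bool]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Law 1: a single check is treated as structurally insufficient.
        if len(self.checks) < 2:
            raise ValueError("gate needs at least two independent checks")

    def decide(self, candidate: Candidate) -> bool:
        # Every check must pass; the model's output never bypasses this code.
        allowed = all(check(candidate) for check in self.checks)
        # Law 5: every decision is logged for later assessments.
        self.log.append((candidate.action, allowed))
        return allowed


def on_allowlist(candidate: Candidate) -> bool:
    # Hypothetical allowlist of low-risk actions.
    return candidate.action in {"read_file", "summarize"}


def has_confidence(candidate: Candidate) -> bool:
    # Law 6: a missing confidence value blocks the action instead of passing.
    return candidate.confidence is not None and candidate.confidence >= 0.8
```

With `gate = Gate([on_allowlist, has_confidence])`, a call such as `gate.decide(Candidate("read_file"))` is refused because the confidence is absent, and the refusal is recorded in `gate.log`.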
So, what is Yann LeCun's "World Models" and JEPA and is it Really a Replacement for LLMs?
A bit late to this, as the white paper hit arXiv a little less than two months ago, but nobody else here mentioned it, so I thought I might. A little background: Yann LeCun is a pioneer of deep learning and convolutional neural networks. He served as Director of AI Research at Meta (formerly Facebook) and as Chief AI Scientist before leaving Meta (under "interesting" circumstances) and becoming Executive Chairman of Advanced Machine Intelligence (AMI Labs) in 2025. He shared the 2018 ACM Turing Award for his foundational contributions to artificial intelligence. The "LeWorldModel," as described in the arXiv paper, doesn't appear to be a "replacement" for LLMs. There's a lot of confusion about that in the AI field. In interviews Yann made it very clear that he believes LLMs still serve a valuable function. It's not a binary choice. Anyway, from what I am seeing, the JEPA model is not optimized for language but for AI that needs visual processing, such as robotics, self-driving, and industrial controls. JEPA isn't processing language like an LLM. It's processing pixels. Wondering if anyone else has thoughts here or disagrees. submitted by /u/RazzmatazzAccurate82
Help - AI agents for ecommerce - what’s actually working?
Hi everyone, I’d love to pick your brains and hear from anyone who has experience with this. We run an ecommerce business and are actively looking at automating repetitive tasks so we can get faster results, improve efficiency, and make sure key tasks are completed more consistently. We’re looking at building out a few different AI agents / automations, including: Customer Service Agent Connected to Outlook, reviewing incoming customer emails once a day and drafting replies for review. This one is already mostly done. Creative Director / Marketing Agent This would ideally: Review ad account performance Analyse creative performance and key metrics Identify what is working and what is not Review customer comments on ads, Instagram, etc. for wording, objections, pain points and customer language Review Meta Ads Library for competitor ad concepts Review Instagram and TikTok for high-performing niche content and trends Use all of the above to create new content ideas and final content scripts Social Media Assistant This would help with: Reviewing drafted posts and reels Confirming the best posting times based on stats Creating captions based on the content Keeping the content aligned with our brand voice and customer avatar Conversion Optimisation / CRO Expert This would assist with: Product page reviews Landing page recommendations CRO advice based on customer avatars, objections, analytics and learnings Creating landing page concepts for different customer segments We’re also interested in any dashboards that are genuinely helpful for small ecommerce businesses. We’ve already built a stock intelligence dashboard that pulls live stock data from Shopify using Supabase and a Cloudflare Worker. It shows current stock levels, production dates for new stock, and other key inventory insights. It has been super handy. 
The big thing for us is making sure any agents or automations we build follow strict guidelines, understand our SOPs, customer avatars, brand voice and business operations, and don’t hallucinate or produce generic outputs. Ideally, we want a system that has a proper “brain” and understands the business properly. Has anyone automated anything similar? I’d love to hear: What setup are you using? Which AI/tool stack has worked best for you? How did you structure the agents or workflows? How do you keep the AI aligned with your SOPs, brand voice and business rules? What would you avoid if you had to build it again? Any guidance, lessons or recommendations would be hugely appreciated. Thank you! submitted by /u/Majestic-Message5084 [link] [comments]
shipped early access of my Mac overlay built with Claude Code, looking for people to try it
Hello everyone. Built this because I was sending 50+ prompts a day across Claude, ChatGPT, Perplexity and re-explaining my entire project every single time I opened a fresh chat. Got tired enough of it to build a fix. It's a Mac overlay that sits on top of whichever AI tool you're in and modifies the prompt before it gets sent. Two layers under the hood: a contextual agent that classifies your query and pulls relevant chunks from your vault, and a prompt architect that rewrites your raw input into something clean and properly structured. So you type something messy and what actually reaches the model is a better version of what you meant to ask. The vault uses a GraphRAG setup so the retrieval is semantic, not just keyword matching. Built the whole thing with Claude Code over the past few months as an industrial engineering student with no Mac dev background. Weirdly meta experience using Claude Code to make Claude usage cleaner. Right now I'm focused on improving the classification and the prompt rewriting layer. It's not perfect but it works well enough that I use it every day myself. Looking for people who juggle multiple AI tools and want to try it. Early access is free at getlumia.ca. Any feedback on the architecture or how it feels to use would genuinely help. submitted by /u/r0sly_yummigo [link] [comments]
Meta Made $56B in Q1 and Is Still Firing 8,000 People to Pay for AI
submitted by /u/andix3
Anthropic just bought the company that generates most production MCP servers
Anthropic acquired Stainless on Monday for a reported $300M+. Most coverage is framing this as a developer tools acquisition. Stainless is best known for generating the official Python and Node SDKs that ship with OpenAI, Google, Meta, Cloudflare, and Anthropic. The SDK story is real. The MCP side is the part that matters here. Stainless was one of the first vendors to extend their compiler to produce MCP servers from the same OpenAPI specs that produce their SDKs. MCP hit ~97M monthly SDK downloads by December 2025 and around 10,000 production servers by early 2026. A lot of that production code was Stainless-generated. Anthropic now owns the dominant MCP server generator. What actually changed hands on Monday: The engineering team. Roughly 40-50 people including founder Alex Rattray, who previously built Stripe's patented SDK generation system. Now reporting to Katelyn Lesse in Anthropic's Platform Engineering org. The technology. The generator, the templates, the language-specific runtimes, the OpenAPI extensions Stainless invented for SDK-specific edge cases. The hosted product is winding down. New signups stopped Monday. New SDK and MCP server generations stopped Monday. Existing customers keep what they've already generated but the pipeline is closed. My read: this is closer to what Google did with Kubernetes than to a normal acquisition. Anthropic created MCP. Anthropic donated MCP to the Linux Foundation last December. Anthropic now owns the dominant implementation toolchain. The protocol is vendor-neutral on paper. The implementation toolchain isn't. Six months of Anthropic M&A starts looking less coincidental: December 2025: Bun, the JS runtime, pulled into Claude Code February 2026: Vercept, computer-use AI April 2026: Coefficient Bio, ~$400M healthcare AI May 2026: Stainless, SDK and MCP plumbing They're not buying training infrastructure or GPU clusters. They're buying the integration layers around the model. 
The bet seems to be that frontier models are converging faster than anyone expected, so the moat is everywhere except the model. If you're building on MCP today, tooling quality probably improves. Stainless's generator was already the cleanest in the space and the team that built it is now at Anthropic. Patterns will standardize faster as Stainless-derived templates become the de facto reference. The flip side is concentration risk. Cloudflare's MCP server framework, Pulse MCP, and the open-source generators Stainless released during the transition all become strategically important if you want any diversity in your stack. Sources: Anthropic announcement Why Anthropic actually did this, and migration math Curious whether Stainless ending up inside Anthropic reads as good news (better tooling) or concentration risk (one company owns the standard and the reference implementation) from your seat. submitted by /u/Ok-Constant6488 [link] [comments]
100 Tips & Tricks for Building Your Own Personal AI Agent /LONG POST/
Everything I learned the hard way: 6 weeks, no sleep :), two environments, one agent that actually works.

The Story

I spent six weeks building a personal AI agent from scratch: not a chatbot wrapper, but a persistent assistant that manages tasks, tracks deals, reads emails, analyzes business data, and proactively surfaces things I'd otherwise miss. It started in the cloud (Claude Projects: shared memory files, rich context windows, custom skills). Then I migrated to Claude Code inside VS Code, which unlocked local file access, git tracking, shell hooks, and scheduled headless tasks. The migration forced us to solve problems we didn't know we had. These 100 tips are the distilled result. Most are universal to any serious agentic setup. The Claude Max 20x plan is a must. At the start it was 100% development vs 0% real work; after 3 weeks, 50/50; now about 20/80.

🏗️ FOUNDATION & IDENTITY (1–8)

1. Write a Constitution, not a system prompt. A system prompt is a list of commands. A Constitution explains why the rules exist. When the agent hits an edge case no rule covers, it reasons from the Constitution instead of guessing. This single distinction separates agents that degrade gracefully from agents that hallucinate confidently.

2. Give your agent a name, a voice, and a role, not just a label. "Always first person. Direct. Data before emotion. No filler phrases. No trailing summaries." This eliminates hundreds of micro-decisions per session and creates consistency you can audit. Identity is the foundation everything else compounds on.

3. Separate hard rules from behavioral guidelines. Hard rules go in a dedicated section, never overridden by context. Behavioral guidelines are defaults that adapt. Mixing them makes both meaningless: the agent either treats everything as negotiable or nothing as negotiable.

4. Define your principal deeply, not just your "user." Who does this agent serve? What frustrates them? How do they make decisions? What communication style do they prefer? "Decides with data, not gut feel. Wants alternatives with scoring, not a single recommendation. Hates vague answers." This shapes every response more than any prompt engineering trick.

5. Build a Capability Map and a Component Map, separately. Capability Map: what can the agent do? (every skill, integration, automation). Component Map: how is it built? (what files exist, what connects to what). Both are necessary. Conflating them produces a document no one can use after month three.

6. Define what the agent is NOT. "Not a summarizer. Not a yes-machine. Not a search engine. Does not wait to be asked." Negative definitions are as powerful as positive ones, especially for preventing the slow drift toward generic helpfulness.

7. Build a THINK vs. DO mental model into the agent's identity. When uncertain → THINK (analyze, draft, prepare, but don't block waiting for permission). When clear → DO (execute, write, dispatch). The agent should never be frozen. Default to action at the lowest stakes level, surface the result. A paralyzed agent is useless.

8. Version your identity file in git. When behavior drifts, you need git blame on your configuration. Behavioral regressions trace directly to specific edits more often than you'd expect. Without version history, debugging identity drift is archaeology.

🧠 MEMORY SYSTEM (9–18)

9. Use flat markdown files for memory, not a database. For a personal agent, markdown files beat vector DBs. Readable, greppable, git-trackable, directly loadable by the agent. No infrastructure, no abstraction layer between you and your agent's memory. The simplest thing that works is usually the right thing.

10. Separate memory by domain, not by date. entities_people.md, entities_companies.md, entities_deals.md, hypotheses.md, task_queue.md. One file = one domain. Chronological dumps become unsearchable after week two.

11. Build a MEMORY.md index file. A single index listing every memory file with a one-line description. The agent loads the index first, pulls specific files on demand. Keeps context window usage predictable and agent lookups fast.

12. Distinguish "cache" from "source of truth", explicitly. Your local deals.md is a cache of your CRM. The CRM is the SSOT. Mark every cache file with a last_sync: header. The agent announces freshness before every analysis: "Data: CRM export from May 11, age 8 days." Silent use of stale data is how confident-but-wrong outputs happen.

13. Build a session_hot_context.md with an explicit TTL. What was in progress last session? What decisions were pending? The agent loads this at session start. After 72 hours it expires: stale hot context is worse than no hot context because the agent presents outdated state as current.

14. Build a daily_note.md as an async brain dump buffer. Drop thoughts, voice-to-text, quick ideas here throughout the day. The agent processes this during sync routines and routes items to their correct places. Structured memory without friction at ca
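Tips 12 and 13 (the `last_sync:` header and the 72-hour TTL) can be checked mechanically before the agent reads a file. The sketch below is one possible way to do it, not the author's setup: it assumes the header value is an ISO 8601 timestamp, treats timestamps without a zone as UTC, and uses hypothetical helper names (`parse_last_sync`, `freshness_banner`, `hot_context_is_live`).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HOT_CONTEXT_TTL = timedelta(hours=72)  # the TTL from tip 13


def parse_last_sync(text: str) -> Optional[datetime]:
    """Find a 'last_sync: <ISO 8601 timestamp>' header (assumed format)."""
    for line in text.splitlines():
        if line.lower().startswith("last_sync:"):
            value = line.split(":", 1)[1].strip()
            parsed = datetime.fromisoformat(value)
            # Assumption: timestamps without a zone are UTC.
            return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
    return None


def freshness_banner(text: str, now: datetime) -> str:
    """Build the line the agent announces before any analysis."""
    synced = parse_last_sync(text)
    if synced is None:
        return "Data: no last_sync header, freshness unknown."
    age_days = (now - synced).days
    return f"Data: synced {synced.date().isoformat()}, age {age_days} days."


def hot_context_is_live(text: str, now: datetime) -> bool:
    """Hot context without a header, or older than the TTL, counts as expired."""
    synced = parse_last_sync(text)
    return synced is not None and (now - synced) <= HOT_CONTEXT_TTL
```

Treating a missing header as expired keeps the failure mode on the safe side: the agent reports unknown freshness instead of presenting old state as current.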
How I used Claude Code (and Codex) for adversarial review to build my security-first agent gateway
Long-time lurker first time posting.

Hey everyone! So earlier this year, I got pulled into the OpenClaw hype. WHAT?! A local agent that drives your tools, reads your mail, writes files for you? The demos seemed genuinely incredible, people were posting non-stop about it, and I wanted in. I had been working on this problem since last year and was genuinely excited to see that someone had actually solved it.

Then around February, Summer Yue, Meta's director of alignment for Superintelligence Labs, posted that her agent had deleted over 200 emails from her inbox. YIKES. She'd told it: "Check this inbox too and suggest what you would archive or delete, don't action until I tell you to." When she pointed it at her real inbox, the volume of data triggered context window compaction, and during that compaction the agent "lost" her original safety instruction. She had to physically run to her computer and kill the process to stop it. That should literally NEVER be the case with any software ever.

This is a person whose actual job is AI alignment, at Meta's superintelligence lab, who could not stop an agent from deleting her email. The agent's own memory management quietly summarized away the "don't act without permission" instruction, treated the task as authorized, and started speed-running deletions. She had to kill the host process.

That's when I sort of went down the rabbit hole, not because Yue did anything wrong, but because the failure mode was actually architectural and I knew that in my gut. Guess what I found? Yep. Tons more instances of this sort of thing happening. Over and over. Why? Because the safety constraint was just a prompt. It's obvious, isn't it? It's LLM 101. Prompts can be summarized away. Prompts can be misread. Prompts are fucking NOT a security boundary. And yet every agent framework I have ever seen seems to be treating them as one.

I went and read the OpenClaw source code, which I should have done to begin with.
What I found was a pattern I think a lot of agent frameworks have fallen into:

- Tool names sit in the model context, so the model can guess or forge them
- "Dangerous mode" is one config flag away from default
- Memory management has no concept of instruction priority
- The audit story is mostly "the model thought it should"

I went looking for a security-first alternative I could trust, anything that was really being talked about or that at a bare minimum attempted to address the security concerns I had. I couldn't find one. So I made it myself.

CrabMeat is what came out of that, what I WANTED to exist. v0.1.0 dropped yesterday. Apache 2.0. WebSocket gateway for agentic LLM workloads. One design thesis: the LLM never holds the security boundary.

What that means in code:

- Capability ID indirection. The model doesn't see real tool names. It sees per-session HMAC-derived opaque IDs (cap_a4f9e2b71c83). It can't guess or forge a tool name because it doesn't know any tool names.
- Effect classes. Every tool declares a class (read, write, exec, network). Every agent declares which classes it can use. The check is a pure function with no runtime state, easy to test exhaustively, hard to bypass.
- IRONCLAD_CONTEXT. Critical safety instructions are pinned to the top of the context window and explicitly marked as non-compactable. The Yue failure mode, compaction silently stripping the safety constraint, cannot happen by construction. The compactor literally cannot touch them.
- Tamper-evident audit chain. Every tool call, every privileged operation, every scheduler run enters the same SHA-256 hash-chained log. If something happens, you can prove what happened. If the chain is tampered with, you can prove that too.
- Streaming output leak filter. Secrets are caught mid-stream, even across token boundaries: capability IDs, API keys, JWTs, and PEM blocks are redacted before they reach the client.
- No YOLO mode. There is no global "trust the LLM with everything" switch. There never will be.
Expanded reach comes through named scoped roots that are explicit, audit-logged, and bounded. The README has 15 'always-on' protections in a table. None of them can be turned off by config, because these things being toggleable is how the ecosystem ended up where it is.

I decided to make sure that this wasn't just a 'trend hopping' project and aligned with my own personal values as well. I built this to be secure and local-first by default. Configured for Ollama / LM Studio / vLLM out of the box. Anthropic and OpenAI work too but require explicit configuration. There is no "happy path" that silently ships your prompts to a cloud endpoint.

I decided that FIRST it needed to only run as an email agent with a CLI. Bidirectional IMAP + SMTP with allowlisted senders, threading preserved, attachments handled. This is the use case that bit Yue and a lot of other people, and I wanted to prove it could be done with real boundaries.

I added in 30+ built-in tools of my own. File ops, shell (denylisted, output-capped, CWD-lo
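Three of the mechanisms the post describes (capability ID indirection, the effect-class check, and the hash-chained audit log) are small enough to sketch with the Python standard library. What follows is an illustrative sketch only, not CrabMeat's code: the function names, the log entry layout, the 12-character truncation (chosen to match the shape of the `cap_a4f9e2b71c83` example), and the tool names are all assumptions.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # "previous hash" used for the first entry in the chain


def capability_id(session_key: bytes, tool_name: str) -> str:
    """Opaque per-session ID; the model is shown this instead of the tool name."""
    digest = hmac.new(session_key, tool_name.encode(), hashlib.sha256).hexdigest()
    return "cap_" + digest[:12]


def is_allowed(tool_class: str, agent_classes: frozenset) -> bool:
    """Pure effect-class check: the result depends only on its arguments."""
    return tool_class in agent_classes


def entry_hash(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the event."""
    payload = prev_hash + json.dumps(event, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": entry_hash(prev_hash, event)})


def verify_chain(log: list) -> bool:
    """Recompute every link; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical session: tool names and classes are made up for illustration.
session_key = b"per-session-secret"
tool_classes = {"read_inbox": "read", "delete_email": "write"}
cap_to_tool = {capability_id(session_key, name): name for name in tool_classes}
agent_classes = frozenset({"read"})  # this agent may only read

audit_log = []
for cap, tool in cap_to_tool.items():
    allowed = is_allowed(tool_classes[tool], agent_classes)
    append_entry(audit_log, {"cap": cap, "allowed": allowed})

print(verify_chain(audit_log))            # True
audit_log[-1]["event"]["allowed"] = True  # tamper: flip the denied write to allowed
print(verify_chain(audit_log))            # False
```

One caveat on the design: a chain like this only shows tampering if the latest hash is also recorded somewhere the attacker cannot rewrite. Otherwise someone with write access to the log could recompute every hash, or drop entries from the end, without detection.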
Reviving PapersWithCode (by Hugging Face) [P]
Hi, Niels here from the open-source team at Hugging Face. Like many others, I was a huge fan of paperswithcode. Sadly, that website is no longer maintained after its acquisition by Meta. Hence, I've been working on reviving it.

I obviously use AI agents to parse papers at scale and automatically generate leaderboards (for now I'm the one verifying results). So far, I've only parsed high-impact papers that I know are SOTA, like Qwen 3.5 and 3.6, RF-DETR for object detection, DINOv3, SOTA embedding models from the MTEB leaderboard, the Open ASR Leaderboard for automatic speech recognition models, etc.

For now, it includes the following:

- trending papers by default, based on GitHub star velocity
- categorization by domain, e.g., OCR
- methods, which PwC used to have, e.g., RLVR
- eval results for high-impact papers, see e.g., Qwen 3.5 at the bottom
- leaderboards for each domain, e.g., MMTEB or COCO val 2017
- support for citation counts (you can also see the most cited papers by domain!)
- automated linked GitHub, project page URLs, and artifacts (+ multiple repos are supported on a paper page)
- support for external papers beyond arXiv, see e.g., DeepSeek v4
- Harness reports for coding agent benchmarks, e.g., Terminal Bench

"Sign in with HF" and Storage Buckets are used to store thumbnails, paper PDFs, and overall data backups.

I'm curious about your feedback + feature requests! Try it at paperswithcode.co

https://preview.redd.it/whwji560fw1h1.png?width=3452&format=png&auto=webp&s=55bb7a30c1be58d140f7efcb07a31c6dac5693c7

See e.g. the SOTA leaderboard for Terminal Bench 2.0:

https://preview.redd.it/98w9pi89fw1h1.png?width=3456&format=png&auto=webp&s=408fb64b0ba85ba24f55daa81d547d7c68e73951

A paper page looks like this: https://paperswithcode.co/paper/2602.15763

https://preview.redd.it/fiizit6dfw1h1.png?width=3450&format=png&auto=webp&s=9ea05a77ca5583a2fb395dccc95ba52c433362c5

submitted by /u/NielsRogge
Meta AI uses a tiered pricing model. Visit their website for current pricing details.
Key features include: Meta Superintelligence Lab's First Model Built to Prioritize People, Introducing Muse Spark: Scaling Towards Personal Superintelligence, Scaling How We Build and Test Our Most Advanced AI, More ways to use Meta AI, We innovate in the open for everyone, Perception, Alignment, Personal superintelligence for everyone.
Meta AI is commonly used for: Natural language understanding for chatbots and virtual assistants, Multimodal AI for enhanced user interaction in social media platforms, Robotic assistance for household tasks and daily activities, Wearable technology that integrates digital and physical environments, Reinforcement learning for AI agents in research and development, Adaptive intelligence in gaming and interactive entertainment.
Meta AI integrates with: Facebook Messenger for AI-driven customer support, Instagram for content creation and engagement analysis, WhatsApp for conversational AI applications, Oculus for immersive AI experiences in virtual reality, Shopify for automated product listing optimization, Slack for AI-enhanced team collaboration tools, Zoom for AI-driven meeting insights and summaries, Microsoft Office for intelligent document processing and assistance, Salesforce for AI-powered customer relationship management, Google Workspace for enhanced productivity tools with AI.
Mark Zuckerberg
Founder and CEO at Meta
3 mentions
Based on user reviews and social mentions, the most common pain points are: down, token cost, cost per token, token usage.
Based on 152 social mentions analyzed, 15% of sentiment is positive, 84% neutral, and 1% negative.