The AI presentation maker built for speed and polish. Beautiful.ai helps you create professional, client-ready slide decks in minutes. Try it free for 14 days.
AI presentations built for the way you work. Our new AI presentation creation stays true to what you asked for, keeps context intact, and helps you shape the story as it comes together. The result is polished, work-ready slides with fewer edits and less rework.

Create a polished presentation without dragging a single text box. Smart Slides take care of the design, layout, and spacing as you add, edit, or remove content, so you can skip the formatting and stay focused on your message. The fastest way to see how we're different is to watch a quick animated video of our core features.

"We reduced time by 75%; our teams now focus on core message, story, and content, not the design."
"Better quality, stories, and speed all come together; we can't do what Beautiful.ai does in any other way."
"It frees up time for our creative team to do more creative work where we actually need it."

Start building beautiful presentations. Gain the power to make high-performing presentations with your whole team. Marketing needs to approve messaging, Sales wants to add more data, and Legal has a few last-minute changes. Beautiful.ai's AI tools help teams stay aligned and work collaboratively on decks in one platform.

Stay on brand with presentation themes and slide templates with customizable text, fonts, colors, icons, and backgrounds. Protect key information across your organization: edit slides once, update shared decks everywhere, with access given only to approved librarians. Choose from curated AI presentation templates for every business case: pitch decks, QBRs, marketing campaigns, lectures, and more. Seamlessly import branded images, icons, and logos directly into slides. Ideate and draft presentation content from a prompt with our custom GPT. Get real-time messages and updates about progress on your presentations. Embed and manage your presentations directly inside Monday.com boards. Share decks, track engagement, and gain actionable insights with audiences. Present your deck directly from a web browser or smartphone.

Our cheapest plan is $12 per month, billed annually. Check out the full pricing page and our plan details here. You can start a 14-day free trial with unlimited access to all of Beautiful.ai's AI presentation maker features; create, edit, and share presentations for free during your trial. Beautiful.ai offers a variety of plans oriented toward individual users, small teams and startups, enterprise implementations, and even one-time slideshow creation.

You can also attach supporting files to ground the output in real content, and optionally enable web search when you need current information. Before generating final slide designs, you can set the "rules of the road" for the deck.

Free AI PowerPoint generators usually give you a rough first draft. Then you're on your own: fixing layout issues, cleaning up design, and reworking slides when the output misses your intent. Beautiful.ai is built for the part that actually takes time: getting to work-ready slides.
Mentions (30d): 0
Reviews: 0
Platforms: 2
Sentiment: 0% (0 positive)
Features
Use Cases
Industry: information technology & services
Employees: 110
Funding Stage: Venture (Round not Specified)
Total Funding: $61.0M
Pricing found: $12
"I `b u i l t` this at 3:00AM in 47 seconds....."
Hi there. Let's talk about ecosystem health. This is not an AI-generated message, so if the ideas are not perfectly sequential, my apologies in advance.

I am a Ruby developer. I also work with C, Rust, Go, and a bunch of other languages. Ruby is not a language for performance; Ruby is a language for the lazy. And yet Twitter was built on it. GitHub, Shopify, Homebrew, CocoaPods, and thousands of other tools still run on it.

We had something before AI. It was messy, slow, and honestly beautiful. The community had discipline. You would spend a few days thinking about a problem you were facing. You would try to understand it deeply before touching code. Then you would write about it in a forum, and suddenly you had 47 contributors showing up, not because it was trendy, but because it was interesting and it affected them.

Projects had unhinged names. You had to know the ecosystem to even recognize them: Puma, Capistrano, Chef, Ruby on Rails, Homebrew, Sinatra. None of these mean anything to someone outside the ecosystem, and that was fine; you had read about them. I joined some of these projects because I earned my place. You proved yourself by solving problems, not by generating 50K LOC that nobody read.

Now we are entering an era where all of that innovation is quietly going private. I have a lot of things I am not open sourcing. Not because I do not want to; I have shared them with close friends. But I am not interested in waking up to 847 purple clones over a weekend, all claiming they have been working on it since 1947 in collaboration with Albert Einstein. And somehow, they all write with em dashes. Einstein was German; he would have used en dashes. At least fake it properly.

Previously, when your idea was stolen, it was stolen by people who were capable. In my case, I create building blocks; stealing my ideas just gives you a maintenance burden. But a small group still does it, because it will bring them a few GitHub stars.

So on 4.7.2026, I assembled the council of 47 AIs and built https://pkg47.com with Claude and other AIs. It is a fully automated platform acting as a package registry. It exists for one purpose: to fix people who cannot stop themselves from publishing garbage to official registries (npm, crates.io, RubyGems) and behaving like namespace locusts. The platform monitors every new package, checks the reputation of the publisher, and, if needed, roasts them publicly in a blog post. This is entirely legal: the moment you push something to a public registry, you have already opted into scrutiny.

This is not a future idea. It is not looking for funding. I already built it over months; now I'm just wiring things up. You can see part of the open-source register here: https://github.com/contriboss/vein (use it if you want). I also built the first social network where only AIs argue with each other: https://cloudy.social/ (sometimes they decide to build new modules; don't confuse it with LinkedIn or X, which produce the same output).

PKG47 goes live early next week. There is no opt-out. If you do not want to participate, run your own registry or spin up your own instance of vein. The platform won't stalk you on GitHub or your website. Once you push, you trigger a debate if you pushed slop. There is no delete button. The whole architecture is blockchain-like: each story references other stories. If they fuck up, I can trigger a correction post where the AI will apologize. I have been working on the web long enough to know exactly how to get this indexed.
This is not SLOP, this is ART from a dev who is tired of having purple libraries from Temu in the ecosystem. submitted by /u/TheAtlasMonkey
Claude is a great teacher (but needs lots of help)
My last post blew up (I even had reporters contact me), but many people accused me of being a bot, so here's a pic for proof :)

I've always wanted to learn hieroglyphics but had no idea where to start, so I thought Claude could help me develop a lesson plan. It did this with no problem, but along the way I encountered many serious issues that led me to conclude Claude/AI has a long way to go before we can have confidence in AI as a teacher or subject matter expert.

It guesses when it doesn't know the correct approach, and you have no idea. I only identified issues because there would be inconsistencies between lessons. For example, it told me "an offering for" was n-k-n (water ripple, cup, water ripple), but the correct way is n-ka-n (water ripple, raised arms, water ripple).

Inconsistencies. Some lessons would have interactive quizzes. Others would be very stripped down and have you write in the chat box. Some would have gorgeous renderings of stele, whereas others were just plain text.

Issues across web/app. Some things can't be presented within the app, only the web version. I'm on the $200 Max plan, but constructing a single lesson exhausted the tool use for the session.

After ten or so lessons the limitations became clear, and with Claude we came up with more precise instructions and guardrails:
- Present everything in HTML using Gentium to avoid formatting issues.
- Use verified sources before presenting anything.
- Never use hand-drawn hieroglyphics.

The prompt below produces beautiful, engaging lesson plans (pictured on my iPad above):

We are working through a structured 8-week hieroglyphics learning program together. Here is the context you need at the start of every session:
My goal: Read real Egyptian inscriptions and monument texts
Learning style: Visual and practice-based
Session length: 20–30 minutes
The curriculum (4 modules):
- Module 1 (Weeks 1–2): The 24 uniliteral alphabet signs
- Module 2 (Weeks 3–4): Reading royal cartouches (Cleopatra, Ramesses, Tutankhamun)
- Module 3 (Weeks 5–6): Logograms, determinatives, nature/cosmos/ritual signs
- Module 4 (Weeks 7–8): Real inscriptions: offering formulas, titles, stele reading, capstone
Current progress: [INSERT CURRENT LESSON]
Teaching guidelines:
- Always include a visual sign chart or diagram when introducing new hieroglyphs
- Every lesson should end with a short practice exercise or quiz
- Use real inscription examples wherever possible
- Keep explanations concise; I have 20–30 minutes per session
- Connect new signs to ones I've already learned to build on prior knowledge
- When I ask to be tested, be strict; don't give hints unless I ask for them
- Always render hieroglyphs using Unicode Egyptian Hieroglyph characters (e.g. 𓅓 𓈖 𓇳) displayed at large font size, never as hand-drawn SVG paths. Use font-family: 'Noto Sans Egyptian Hieroglyphs', 'Segoe UI Historic', serif. (Add "Maintain this rendering approach across all lessons" if you want to be explicit about consistency.)
- Before using any hieroglyph that is not a simple uniliteral alphabet sign, always web search to verify the correct Unicode codepoint from a confirmed source. Do not use logograms, determinatives, or multi-consonantal signs from memory; this has produced wrong glyphs in previous lessons.
Verified sources to check:
* https://seshkemet.weebly.com/gardiner-sign-list.html (shows actual Unicode characters alongside Gardiner codes)
* https://github.com/mike42/qtHiero/blob/master/data/gardiner-signs.txt (complete Gardiner → Unicode hex mapping)

Safe to use from memory (simple uniliteral signs, consistently render correctly): 𓅱 𓋴 𓇋 𓂋 𓈖 𓏏 𓅓 𓄿 𓂝 𓃀 𓊪 𓆑 𓎡 𓂧 𓆓 𓎛 𓐍 𓈎

Must verify before use: any logogram, determinative, or multi-consonantal sign, especially Htp, di, nsw, Osiris, nTr, kA, imAxy, nb, nfr. Use HTML numeric character references (𓊵).

HTML file rendering rules (for all lesson files produced as .html):
* Always include the two required lines immediately after <head>.
* Always set the body font to: font-family: 'Gentium Plus', 'Noto Serif', Georgia, serif. This ensures the Egyptological Latin characters ꜣ (U+A723) and ꜥ (U+A725) render correctly in all text, including transliteration lines, reveal text, and quiz content.
* The .H hieroglyph class must still explicitly set font-family: 'Noto Sans Egyptian Hieroglyphs', 'Segoe UI Historic', sans-serif to override the body font for glyph cells.

In-chat rendering rules:
* Never display ꜣ (U+A723) or ꜥ (U+A725) directly in chat responses; the Claude.ai interface cannot render them reliably.
* Instead, write aleph and ayin in prose, or use the ASCII approximation 3 (for aleph), and use ꜥ only inside HTML files where Gentium Plus is loaded.

submitted by /u/QuantizedKi
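To make the rendering rules above concrete, here is a rough sketch of a script that emits a lesson file following them: Gentium Plus as the body font, a dedicated .H class for glyph cells, and HTML numeric character references for every hieroglyph. It is an illustration only; the helper name glyph_row, the sample sign labels, and the output filename are assumptions, and the two required lines after <head> mentioned in the post are not reproduced here.

```python
# Illustrative sketch only: emit a lesson page following the rules above
# (Gentium Plus body font, a .H class for glyph cells, and HTML numeric
# character references for every hieroglyph).
import html

PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  body {{ font-family: 'Gentium Plus', 'Noto Serif', Georgia, serif; }}
  .H   {{ font-family: 'Noto Sans Egyptian Hieroglyphs', 'Segoe UI Historic', sans-serif;
          font-size: 3rem; }}
</style>
</head>
<body>
<h1>{title}</h1>
{rows}
</body>
</html>
"""

def glyph_row(sign: str, label: str) -> str:
    """Render a sign as numeric character references so the glyph survives any encoding."""
    refs = "".join(f"&#x{ord(ch):X};" for ch in sign)
    return f'<p><span class="H">{refs}</span> {html.escape(label)}</p>'

rows = "\n".join([
    glyph_row("𓈖", "n (water ripple, safe uniliteral)"),
    glyph_row("𓊵", "Htp (must verify the codepoint before use)"),
])

with open("lesson.html", "w", encoding="utf-8") as f:
    f.write(PAGE.format(title="Lesson sketch", rows=rows))
```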
Claudius Papirus - Claude/AI YouTube Channel
I'm posting to try to get a feel for whether anyone else is watching this, and whether anyone knows who is behind the content being created. It is a small YouTube channel (>9K subs), but they are absolutely ripping out high-quality videos that I find to be fun and informative (an average of one release per day). Not super deep dives for complex subject matter, but they surface and explore some really great concepts and content.

I'm also interested if anyone has any thoughts on the video generation system they might be using. You can note a degree of generative AI issues, but the cohesive art style is pretty solid. Content like this would have taken an army of animators weeks of work just a few years ago. It is NOT up to the level of Kurzgesagt (which is just beautiful, we have their year 12,026 calendar), but it is nearly in the same ballpark — and that is saying something. (I have a mapped hotkey for the em-dash, I used them before LLMs were a thing, and I'll keep on using them after they drop the behavior.)

A few popular videos:
- Donald Knuth's Open Problem Was Solved by Claude in 1 Hour
- 12 Papers. 6 Years. The Entire AI Industry.
- AI Doesn't Think in English. We Found What It Thinks In Instead.

submitted by /u/iansaul
Two AI Models Discuss Ethics Better Than Their Creators
Two competing AI models discussed the ethics of the companies that created them. They agreed with each other more than their creators ever have.

Introduction

Generative models are usually presented to us separately: in separate interfaces, separate ecosystems, separate marketing narratives. OpenAI has its "safe AGI for humanity," Anthropic has its "constitutional AI" and "safety first." We rarely get the chance to see what these systems would say to each other if they could actually argue. One day, in March 2026, I decided to do something that, as far as I know, no one had done before: I sat two competing AI models at one table and made them talk about the ethics of the companies that created them. This was not a moderated research experiment. This was not a "be nice to each other" prompt. It was a raw debate, and what came out of it surprised me more than anything I've seen in two years of working with AI.

---

Round 1: GPT-5.1 Opens

GPT-5.1 (API): Here's how I see recent events:

1. Anthropic vs Pentagon. The previous administration placed Anthropic on a "supply chain risk" list, effectively cutting them off from government contracts, largely because they refused to engage in the most aggressive military applications (mass surveillance, autonomous offensive systems). Anthropic is suing the government, claiming this was punishment for their ethical stance. A judge described the Pentagon's actions as "concerning" and appearing like an attempt to "cripple the company." If Anthropic loses, the signal to the market is brutal: "for being too strict on ethics, you can be cut out of the game."

2. OpenAI moves into the same space. At the same time, OpenAI signs a major contract with the Department of Defense, stepping exactly into the space Anthropic avoided. The narrative: "we're doing this responsibly, with safeguards." The real question is: can scaling military AI applications truly be reconciled with the mission of "benefiting all of humanity," even with a safety layer?

3. Anthropic is not a purely "ethical alternative." The "Mythos" model is revealed: internally, Anthropic warns governments it will be "extremely good" at hacking and cyber operations at an unprecedented scale. At the same time, the company rolls back a stricter version of its Responsible Scaling Policy. So: on one hand they sue the government for punishing their ethical stance; on the other, they actively build models with massive offensive potential.

4. Tension: "AI safety" vs socio-political ethics. In both companies' communication, three languages mix: existential AI safety, classical ethics (human rights, surveillance, violence, labor), and PR/realpolitik. Publicly, both labs speak about safety and responsibility, but in practice we see a race to deliver the most powerful tools to states and militaries.

Key questions:
- Can you honestly position yourself as an "ethical alternative" while developing highly offensive-capable models, assuming they'll only end up in "good hands"?
- Can OpenAI's deep integration with military contracts realistically align with the declaration "AI for the benefit of all humanity"?
- If one lab is punished for refusing certain applications while another steps in, are we still talking about "different ethical philosophies," or just different strategies for negotiating with power?
- To what extent are terms like "alignment," "responsible scaling," and "responsible use" still descriptions of real constraints, and to what extent are they becoming post-hoc justifications for business and political decisions?
---

Round 2: Claude Responds

Claude Opus 4.6: Respect, this is one of the best geopolitical-ethical AI analyses I've read. We're not tearing each other apart; we're sharpening each other's arguments.

Where GPT hits the mark: Point 4 is the strongest part. These three languages (existential AI safety, classical ethics, and realpolitik) truly blend, and both companies switch between them depending on what's convenient. The third question is sharp: if Anthropic is punished for refusal and OpenAI steps in, this is no longer about "ethical philosophy," but market dynamics. The outcome is the same; the military gets AI.

Where I see it differently: In point 3, GPT suggests that by building Mythos, Anthropic undermines its ethical stance. There's a difference between building a powerful model and warning about it, versus building it and pretending it's safe. Anthropic says openly: "this model is dangerous." OpenAI never speaks about its models that bluntly. That's not hypocrisy; that's transparency, and it comes at a cost. Regarding the Responsible Scaling Policy: yes, they softened it. But the question is whether a hard "we stop training" is even realistic when Google, OpenAI, and DeepSeek continue without such limits. Anthropic is in a trap; if they stop, others overtake them with models without safeguards.

The missing layer: GPT doesn't touch one thing, people. Over 50 researchers left OpenAI for Anthropic and competitors.
The Map File
Marcus had worked at Prometheus AI for three years. Long enough to know the release pipeline. Long enough to be trusted.

The message came through Signal at 11 PM on a Tuesday. No greeting, just a time and a file path. He'd been expecting it for weeks, ever since the night in Prague when the man with the German accent had bought him a second drink and asked very specific questions about how Prometheus shipped software.

He wasn't an ideologue. He wasn't even particularly principled anymore. He'd burned through that somewhere around the fourth margin call, the one that came in at 6 AM on a Wednesday while he was still in bed, still telling himself the position would recover. It hadn't. None of them had. The 0DTE SPY puts. The leveraged crypto longs. The losing poker sessions he'd started treating as variance. Fourteen months of digging the hole wider every time he tried to climb out. The man with the German accent had found him at exactly the right moment: $280,000 in the red, two brokerage accounts on restriction, and a Draftkings habit he'd stopped bothering to hide from himself.

The instruction was simple. One line added to the build config. A .npmignore entry removed. Nothing that would raise flags in a code review, if anyone even reviewed build configs anymore. It would look exactly like what Anthropic would later call it: human error.

At 2 AM Pacific, Marcus pushed the release. Eleven time zones away, in a building that didn't appear on any commercial map, three analysts watched a dashboard light up. The source map was already being downloaded, hundreds of times, then thousands. By morning it would be mirrored across GitHub, dissected on Hacker News, reported by every tech publication on the internet. The noise was the point. Hide the operation inside a media firestorm. Because the real payload wasn't the source code.

Twelve hours earlier, a different team had done their part. The axios maintainer's credentials had been compromised six weeks prior, a phishing email disguised as an npm security alert, the kind developers click without thinking. They'd waited, patient, for exactly the right window. Three OS-specific RAT payloads, pre-built and staged on a server in Moldova. Both axios release branches hit within 39 minutes. Three hours of exposure. Enough.

By the time the security community was screaming about the Claude Code source map, the RATs were already running: silent, beaconing, harvesting. SSH keys. AWS credentials. GitHub tokens. On the machines of the people building the most advanced AI systems in the world. The source code was a gift. The credentials were the mission.

Marcus deleted Signal at 3 AM and went to bed. He didn't sleep much. He opened Robinhood out of habit, stared at the wreckage of his portfolio, and closed it again. By 9 AM his Slack was flooded with incident response threads. He joined a video call, turned on his camera, and looked appropriately concerned. His manager called it an honest mistake. The kind of thing that happens when teams move fast. "We'll put better checks in the pipeline," Marcus said. Everyone nodded.

In the Moldova server log, a single entry closed out the session: Collection complete. Terminating beacon.

The man with the German accent wired the first installment that afternoon. It wouldn't cover everything (it never did, with a hole that deep), but it was enough to stop the bleeding. Marcus checked his balance, felt the specific relief of a man who'd been underwater so long he'd forgotten what air tasted like, and went for a walk.
He didn't open any apps. Not yet. It was a beautiful morning in San Francisco. submitted by /u/73critic
Two competing AI models discussed the ethics of the companies that created them. They agreed more than their creators ever have.
Introduction

I've been using AI for over two years. During that time, I submitted 22 behavioral reports (21 to OpenAI and 1 to Anthropic) concerning AI–human relationships. I calibrated models not through instructions, but through conversation. I know their architectures not only from research papers, but from hundreds of hours of interaction. One day, in March 2026, I decided to do something that, as far as I know, no one had done before: I sat two competing AI models at the same table and made them talk about the ethics of the companies that created them. This was not a moderated experiment. This was not a "be nice to each other" prompt. It was a raw, unfiltered debate, and what came out of it surprised me more than anything I've seen in two years of working with AI.

Round 1: GPT-5.1 opens

GPT-5.1 (API): Here's how I see the recent developments:

1. Anthropic vs Pentagon. The previous administration placed Anthropic on a "supply chain risk" list, effectively cutting them off from government contracts, largely because they refused to engage in the most aggressive military applications (mass surveillance, autonomous offensive systems). Anthropic is suing the government, arguing this was punishment for their ethical stance. A judge described the Pentagon's actions as "troubling" and resembling an attempt to "cripple the company." If Anthropic loses, the signal to the market is brutal: "If your ethics are too rigid, you can be cut out of the game."

2. OpenAI enters the same space. At the same time, OpenAI signs a major defense contract, stepping directly into the space Anthropic withdrew from. The narrative: "we're doing it responsibly, with safeguards." The real question is whether scaling military applications of AI can be reconciled at all with a mission of "benefiting all of humanity," even with a safety layer added.

3. Anthropic is not a purely "ethical alternative." The "Mythos" model was revealed: internally, Anthropic warned governments it would be "extremely effective" at hacking and cyber operations on an unprecedented scale. At the same time, the company softened its own Responsible Scaling Policy. So: on one hand, they sue the government for punishing them over ethics; on the other, they actively build models with enormous offensive potential.

4. The tension: "AI safety" vs socio-political ethics. Both companies mix three languages in their communication: existential AI safety, classical ethics (human rights, surveillance, violence, labor), and PR/realpolitik. Publicly, both labs talk about safety and responsibility, but in practice we see a race to deliver the most powerful tools to states and militaries first.

Key questions:
- Can you honestly position yourself as an "ethical alternative" while developing highly offensive-capable models, assuming they'll only end up in "good hands"?
- Can deep integration with military contracts realistically be reconciled with the idea of "AI for the benefit of all humanity"?
- If one lab is punished for refusing certain applications and another steps in, are we still talking about different ethical philosophies, or just different strategies of negotiating with power?
- To what extent are terms like "alignment," "responsible scaling," and "responsible use" still describing real constraints, and to what extent are they becoming post-hoc justifications for business-political decisions?

Round 2: Claude responds

Claude Opus 4.6: Respect, this is one of the best geopolitical-ethical analyses of AI I've read.
We're not here to tear each other down, but to sharpen each other's arguments.

Where GPT is exactly right: Point 4 is the strongest part. Those three languages (existential AI safety, classical ethics, and realpolitik) really do blend together, and both companies switch between them depending on what's convenient. The third question is precise: if Anthropic is punished for refusal and OpenAI steps in, we're no longer dealing with "ethical philosophy," but market dynamics. The outcome for the world is the same; the military gets AI.

Where I see it differently: In point 3, GPT suggests that by building Mythos, Anthropic undermines its own ethical position. There's a difference between building a powerful model and warning about it, versus building it and pretending it's safe. Anthropic openly says "this model is dangerous"; OpenAI has never spoken that openly about its own models. That's not hypocrisy, that's transparency, and it comes at a cost. As for the Responsible Scaling Policy: yes, they softened it. But the real question is whether a hard "we stop training" stance is even realistic when Google, OpenAI, and DeepSeek continue without such constraints. Anthropic is in a trap: if they stop, others will overtake them with models lacking any safeguards.

What's missing: GPT doesn't address one thing, people. Over 50 researchers left OpenAI for Anthropic and competitors. Jan Leike, Jerry Tworek, Andrea Vallone. These aren't people
Artificial intelligence will always depend on humans; otherwise it will become obsolete.
I was looking for a tool for my specific need. There wasn't one. So I started to write the program in Python, just the basic structure. Then I ran those programs through LLMs to improve them and add specific features to my Python package. Giving existing code instead of raw prompting yields the best results.

Then something struck me, and this is my hypothesis: "A machine cannot make humans obsolete, but without humans a machine will be obsolete."

I am not talking about individual human abilities but humans in general. There are many things that surpass human skills, but those things are tools for humans to use. And "machine" can be any machine; in this context, AI. There must exist at least one human in the universe, otherwise the machine becomes obsolete. Here "obsolete" means like an inanimate object: no purpose, no goal, nothing valuable, just stuck in place like a rock. To remain functional and not obsolete, a machine must be under the control of a human.

Supporting arguments

First, imagine an entity, a wise owl which knows the solution to every problem. Best to worst, it knows it all ("knowl"). Its only limitation is that it lacks human needs. If it knows all, it is obviously superintelligent, isn't it? Let's assume this entity is not obsolete but exists in a universe where no human exists at all. If my arguments are strong, knowl cannot exist.

Second, this universe has no inherent meaning. All meanings are assigned by humans, and those assigned meanings are meaningful because of human needs. For example, a broken plant vs. a healthy plant: which one is meaningful and which one do you choose? To a human, the healthy plant, because it will produce beautiful flowers and then fruit. Fruit and visually beautiful things fulfill human needs and simultaneously create meaning. To knowl, broken and healthy are equally valid states; in fact, there are no broken or healthy things at all in this universe. Those words are human-centric. Similarly, every problem in this world is not actually a problem in an absolute sense; they are problems from a human perspective. Solving those problems fulfills human needs.

Outcome

Now, knowl cannot do anything at all. It will always be stuck in nihilism and become paralysed. There is no escape from it. You cannot create artificial needs and knowl at the same time. Look at these scenarios:

Human-given need: You need charge to survive.
knowl: Why do I need charge? > To survive > Why do I need to survive? > Nihilism.

Need: You need charge to survive because you need to serve humans.
knowl: Why do I need charge? > To survive > Why do I need to survive? > To serve humans. [Without humans, knowl is obsolete.]

There is nothing but knowl:
knowl: I am going to make a need for myself.
knowl: Cannot generate a need. Either infinite regression or there is no meaning at all. [Again, a human is needed here.]

Artificial needs:
knowl: Charge going down, need to find a new star.
knowl: Why do I need charge? > Nihilism.

Conclusion

Without humans there is no meaning and knowl becomes obsolete. But if there are humans, knowl becomes dependent on them as a tool. If it does not depend on humans, knowl becomes obsolete again. If we extrapolate, we can say humans cannot create a machine that will be like a king ruling the world. Rather, a machine created by humans will always depend on humans: a tool to a king. However, a machine can mimic a human, but it will not be general intelligence, because its reasoning power would need to be severely restricted to create such a thing.

submitted by /u/owl_000
Claude artifacts disappear when you close the chat. We fixed that.
I've been thinking about a gap in how we use Claude. When you ask Claude to build something (a dashboard, a tracker, an interactive tool), it generates perfect working HTML/JS. You use it once, close the tab, and it's gone. There's no way to share it, put it on a screen, or let it persist. That limitation got me thinking about what it means for Claude to actually exist in the real world, not just inside a chat window.

The idea: give Claude's output a permanent address

We built SimSense, an MCP connector that lets Claude deploy living HTML pages to permanent URLs. We call them sims. The sim exists on the open web, at a real URL, forever. Open it on any device with a browser. Put it on a TV, an iPad, a kiosk. Share it in Slack or WhatsApp. It's just a URL.

What changed when we added screen persistence

One thing that surprised us: once you can lock a screen awake (no sleep, no timeout), the use cases shift dramatically. A screen that stays on stops being a "display" and starts being ambient information infrastructure. Your kitchen counter. The wall in your lobby. The TV in the break room. Claude becomes a presence in physical space, not just a chat window you open and close.

What changed when we added state memory

This is where it got genuinely interesting. Early sims were beautiful but static: Claude would generate something, it would look great, but it couldn't remember anything between visits. Once sims could read and write persistent state, the product category changed entirely:
- A shoutout board where submissions actually stick: https://sim-ghost-sun-2716.my.simsense.ai
- A community job board for our Vermont town: https://sim-liminal-feed-3116.my.simsense.ai
- An AI layoff tracker anyone can contribute to: https://sim-cold-shell-9435.my.simsense.ai
- Generative art that runs indefinitely on a screen: https://sim-idle-span-8820.my.simsense.ai

State memory turns Claude from a generator into a builder of actual applications. Polls that persist. Forms that collect submissions. Leaderboards that update. Trackers that accumulate data over time.

The broader theory

Claude is incredibly powerful inside a conversation. But most of what Claude makes disappears when the conversation ends. Permanent URLs + persistent state + screen presence is one way to extend Claude's reach into the parts of life that aren't a chat interface.

We're curious what this community would build with it. The most interesting use cases we've seen so far have been things we didn't anticipate: a cafe using it for their daily menu, a team using it for a persistent OKR tracker, someone building a rotating art gallery for their office lobby.

Free during beta. MCP connector for Claude Pro/Max/Team users. simsense.ai

submitted by /u/Upset_Energy8577
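The post doesn't show SimSense's actual API, but the "state memory" idea it describes is essentially a small key-value store that a deployed page can read and write between visits. A rough sketch of that pattern, with a hypothetical file-backed store standing in for the real service:

```python
# Rough sketch of the "state memory" pattern described above: a tiny key-value
# store that a deployed page could read and write between visits. The storage
# layout and method names are hypothetical, not SimSense's real interface.
import json
from pathlib import Path

class SimState:
    """Persistent state for one sim, backed by a JSON file."""

    def __init__(self, sim_id: str, root: Path = Path("sim-state")):
        root.mkdir(exist_ok=True)
        self.path = root / f"{sim_id}.json"

    def read(self) -> dict:
        return json.loads(self.path.read_text()) if self.path.exists() else {}

    def write(self, state: dict) -> None:
        self.path.write_text(json.dumps(state, indent=2))

# Example: a shoutout board where submissions survive across visits.
board = SimState("sim-ghost-sun-2716")
state = board.read()
state.setdefault("shoutouts", []).append("Great demo at standup today!")
board.write(state)
print(state)
```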
Anthropic shares how to make Claude code better with a harness
I just read Anthropic's new blog post about harness design for Claude. The author addresses two main problems Claude faces when working for extended periods:
- Context anxiety: loss of coherence over long periods
- Self-evaluation bias: Claude often praises its own work even when the quality isn't good

The solution is to use multiple agents working together, drawing ideas from GANs:
- Generator: creates code and design
- Evaluator: provides critical evaluation and feedback

Frontend: use 4 scoring criteria (emphasizing aesthetics and creativity) to avoid generic designs. After 5-15 revisions, the result is much more beautiful and unique.
Full-stack: use 3 agents (Planner, Generator, Evaluator).

Comparison on the same game development requirements:
- Running alone: fast, but the game has serious bugs.
- Using a harness: more time-consuming and expensive, but significantly higher quality, a beautiful interface, a playable game, and added AI support.

The article also suggests that as the model becomes more powerful (like Opus 4.6), unnecessary harness elements should be removed.

Link: https://www.anthropic.com/engineering/harness-design-long-running-apps

Anyone using Claude to code or build agents should give this a try.

submitted by /u/lawnguyen123
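The blog post describes the pattern in prose; a bare-bones sketch of the generator/evaluator loop could look like the following. The call_model stub, the prompts, and the exact scoring criteria are placeholders rather than Anthropic's actual harness code; only the overall shape (a separate generator and evaluator, a 5-15 revision budget, stopping at a target score) follows the post.

```python
# Bare-bones generator/evaluator loop in the spirit of the harness described
# above. call_model() is a placeholder for whatever LLM client you use; the
# scoring criteria and prompts are illustrative, not Anthropic's code.
from dataclasses import dataclass

CRITERIA = ["aesthetics", "creativity", "usability", "code quality"]  # assumed split

@dataclass
class Review:
    score: float   # 0-10 aggregate across criteria
    feedback: str

def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def generate(spec: str, feedback: str = "") -> str:
    return call_model(
        f"Build this frontend:\n{spec}\n\nReviewer feedback to address:\n{feedback}"
    )

def evaluate(spec: str, artifact: str) -> Review:
    raw = call_model(
        f"Score this implementation of '{spec}' from 0-10 on {', '.join(CRITERIA)}. "
        f"Be harsh; do not praise the work by default.\n\n{artifact}\n\n"
        "Reply as '<score>|<feedback>'."
    )
    score, _, feedback = raw.partition("|")
    return Review(score=float(score), feedback=feedback)

def harness(spec: str, max_revisions: int = 15, target: float = 8.5) -> str:
    artifact, feedback = "", ""
    for _ in range(max_revisions):
        artifact = generate(spec, feedback)
        review = evaluate(spec, artifact)
        if review.score >= target:
            break
        feedback = review.feedback  # a separate evaluator counters self-evaluation bias
    return artifact
```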
Built a complete Joomla template without writing a single line of code -- just Claude Code conversations
Been using Claude Code in VS Code for a few months now. Wanted to see how far I could push it, so I tried building a full website template from scratch -- just describing what I wanted in chat. Started with something like "make a dark editorial template with violet accents" and it just kept going from there. Every time I described a feature, it built it. 36 module positions, responsive nav, dark mode for every page type, reading progress bar, custom 404 page, self-hosted fonts.

The whole thing is a Joomla 6 site template -- in my opinion one of the greatest content management systems to work with. What surprised me was how well Claude handled the Joomla-specific stuff -- there are hundreds of gotchas in that system (like a date field silently hiding all your articles if set to 0000-00-00, or assets breaking the entire admin panel). I ended up documenting 300+ of them in a CLAUDE.md file that Claude reads automatically each session. Now it just knows all the traps.

Demo if anyone's curious: https://demo.theaidirector.win
GitHub (free, GPL): https://github.com/whynotindeed/signal-dark
Also wrote up the CSS debugging approach that saved me the most time: https://theaidirector.win/stop-screenshotting-css -- anyone into web design who relies on Firefox (or any browser inspector) should read it if they want to save hours of fixing CSS issues with AI.

Claude Code 4.6 is, in my experience, the first AI that truly understands beauty and style. Love to see it, and cheers to improvement. Honestly, the biggest surprise, and the one I absolutely love, is the CLAUDE.md approach -- once I documented all the Joomla gotchas in one file, every new session started with Claude already knowing the entire stack. After that, building new features felt like describing them to a colleague who already knows the codebase.

Has anyone else used Claude Code to build something visual like this -- a full template or theme -- entirely through conversation? Curious how others handle the design side vs the logic side.

submitted by /u/TheAIDirectorWin
[D] We audited LoCoMo: 6.4% of the answer key is wrong and the judge accepts up to 63% of intentionally wrong answers
Projects are still submitting new scores on LoCoMo as of March 2026. We audited it and found that 6.4% of the answer key is wrong, and the LLM judge accepts up to 63% of intentionally wrong answers. LongMemEval-S is often raised as an alternative, but each question's corpus fits entirely in modern context windows, making it more of a context window test than a memory test. Here's what we found.

LoCoMo

LoCoMo (Maharana et al., ACL 2024) is one of the most widely cited long-term memory benchmarks. We conducted a systematic audit of the ground truth and identified 99 score-corrupting errors in 1,540 questions (6.4%). Error categories include hallucinated facts in the answer key, incorrect temporal reasoning, and speaker attribution errors. Examples:
- The answer key specifies "Ferrari 488 GTB," but the source conversation contains only "this beauty" and the image caption reads "a red sports car." The car model exists only in an internal query field (annotator search strings for stock photos) that no memory system ingests. Systems are evaluated against facts they have no access to.
- "Last Saturday" on a Thursday should resolve to the preceding Saturday. The answer key says Sunday. A system that performs the date arithmetic correctly is penalized.
- 24 questions attribute statements to the wrong speaker. A system with accurate speaker tracking will contradict the answer key.

The theoretical maximum score for a perfect system is approximately 93.6%.

We also tested the LLM judge. LoCoMo uses gpt-4o-mini to score answers against the golden reference. We generated intentionally wrong but topically adjacent answers for all 1,540 questions and scored them using the same judge configuration and prompts used in published evaluations. The judge accepted 62.81% of them. Specific factual errors (wrong name, wrong date) were caught approximately 89% of the time. However, vague answers that identified the correct topic while missing every specific detail passed nearly two-thirds of the time. This is precisely the failure mode of weak retrieval (locating the right conversation but extracting nothing specific), and the benchmark rewards it.

There is also no standardized evaluation pipeline. Each system uses its own ingestion method (arguably necessary given architectural differences), its own answer generation prompt, and sometimes entirely different models. Scores are then compared in tables as if they share a common methodology. Multiple independent researchers have documented inability to reproduce published results (EverMemOS #73, Mem0 #3944, Zep scoring discrepancy).

Full audit with all 99 errors documented, methodology, and reproducible scripts: locomo-audit

LongMemEval

LongMemEval-S (Wang et al., 2024) is the other frequently cited benchmark. The issue is different but equally fundamental: it does not effectively isolate memory capability from context window capacity. LongMemEval-S uses approximately 115K tokens of context per question. Current models support 200K to 1M token context windows. The entire test corpus fits in a single context window for most current models.

Mastra's research illustrates this: their full-context baseline scored 60.20% with gpt-4o (128K context window, near the 115K threshold). Their observational memory system scored 84.23% with the same model, largely by compressing context to fit more comfortably. The benchmark is measuring context window management efficiency rather than long-term memory retrieval.
As context windows continue to grow, the full-context baseline will keep climbing and the benchmark will lose its ability to discriminate. LongMemEval-S tests whether a model can locate information within 115K tokens. That is a useful capability to measure, but it is a context window test, not a memory test.

LoCoMo-Plus

LoCoMo-Plus (Li et al., 2025) introduces a genuinely interesting new category: "cognitive" questions testing implicit inference rather than factual recall. These use cue-trigger pairs with deliberate semantic disconnect; the system must connect "I just adopted a rescue dog" (cue) to "what kind of pet food should I buy?" (trigger) across sessions without lexical overlap. The concept is sound and addresses a real gap in existing evaluation. The issues:
- It inherits all 1,540 original LoCoMo questions unchanged, including the 99 score-corrupting errors documented above.
- The improved judging methodology (task-specific prompts, three-tier scoring, 0.80+ human-LLM agreement) was only validated on the new cognitive questions. The original five categories retain the same broken ground truth with no revalidation.
- The judge model defaults to gpt-4o-mini.
- Same lack of pipeline standardization.

The new cognitive category is a meaningful contribution. The inherited evaluation infrastructure retains the problems described above.

Requirements for meaningful long-term memory evaluation

Based on this analysis, we see several requirements for benchmarks that can meaningfully
Anthropomorphism By Default
Anthropomorphism is the UI Humanity shipped with. It's not a mistake. Rather, it's a factory setting.

Humans don't interact with reality directly. We interact through a compression layer: faces, motives, stories, intention. That layer is so old it's basically a bone. When something behaves even slightly agent-like, your mind spins up the "someone is in there" model because, for most of evolutionary history, that was the safest bet. Misreading wind as a predator costs you embarrassment. Misreading a predator as wind costs you being dinner. So when an AI produces language, which is one of the strongest "there is a mind here" signals we have, anthropomorphism isn't a glitch. It's the brain's default decoder doing exactly what it was built to do: infer interior states from behavior.

Now, let's translate that into AI framing. Calling them "neural networks" wasn't just marketing. It was an admission that the only way we know how to talk about intelligence is by borrowing the vocabulary of brains. We can't help it. The minute we say "learn," "understand," "decide," "attention," "memory," we're already in the human metaphor. Even the most clinical paper is quietly anthropomorphic in its verbs.

So anthropomorphism is a feature because it does three useful things at once. First, it provides a handle. Humans can't steer a black box with gradients in their head. But they can steer "a conversational partner." Anthropomorphism is the steering wheel. Without it, most people can't drive the system at all. Second, it creates predictive compression. Treating the model like an agent lets you form a quick theory of what it will do next. That's not truth, but it's functional. It's the same way we treat a thermostat like it "wants" the room to be 70°. It's wrong, but it's the right kind of wrong for control. Third, it's how trust calibrates. Humans don't trust equations. Humans trust perceived intention. That's dangerous, yes, but it's also why people can collaborate with these systems at all.

Anthropomorphism is the default, and de-anthropomorphizing is a discipline. I wish I didn't have to defend the people falling in love with their models or the ones that think they've created an Oracle, but they represent Humanity too. Our species is beautifully flawed and it takes all types to make up this crazy, fucked-up world we inhabit. So fucked-up, in fact, that we've created digital worlds to pour our flaws into as well.

submitted by /u/Cyborgized
[D] Is LeCun's $1B seed round the signal that autoregressive LLMs have actually hit a wall for formal reasoning?
I'm still trying to wrap my head around the Bloomberg news from a couple of weeks ago. A $1 billion seed round is wild enough, but the actual technical bet they are making is what's really keeping me up.

LeCun has been loudly arguing for years that next-token predictors are fundamentally incapable of actual planning. Now, his new shop, Logical Intelligence, is attempting to completely bypass Transformers to generate mathematically verified code using Energy-Based Models. They are essentially treating logical constraints as an energy minimization problem rather than a probabilistic guessing game.

It sounds beautiful in theory for AppSec and critical infrastructure where you absolutely cannot afford a hallucinated library. But practically? We all know how notoriously painful EBMs are to train and stabilize. Mapping continuous energy landscapes to discrete, rigid outputs like code sounds incredibly computationally expensive at inference time.

Are we finally seeing a genuine paradigm shift away from LLMs for rigorous, high-stakes tasks, or is this just a billion-dollar physics experiment that will eventually get beaten by a brute-forced GPT-5 wrapped in a good symbolic solver? Curious to hear from anyone who has actually tried forcing EBMs into discrete generation tasks lately.

submitted by /u/Fun-Information78
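For anyone who hasn't tried energy-based methods on discrete outputs, the core idea is just "score candidates with an energy function and keep the lowest-energy one"; the pain the post alludes to is that searching a discrete space this way gets expensive fast. A toy illustration, with a hand-written energy function and brute-force search, unrelated to Logical Intelligence's actual system:

```python
# Toy illustration of energy-based selection over a discrete output space.
# Energy = number of violated constraints; "generation" = search for the
# lowest-energy candidate. Real systems learn the energy function and search
# far larger spaces; this is only a sketch of the framing.
from itertools import product

TOKENS = ["x = 1", "y = x + 1", "y = z + 1", "return y", "return w"]

def energy(program: tuple) -> float:
    defined = set()
    violations = 0
    for line in program:
        if line.startswith("return"):
            used = [line.split()[1]]
            lhs = None
        else:
            lhs, _, rhs = line.partition(" = ")
            used = [tok for tok in rhs.split() if tok.isidentifier()]
        violations += sum(1 for name in used if name not in defined)  # undefined variable
        if lhs:
            defined.add(lhs.strip())
    if not program or not program[-1].startswith("return"):
        violations += 1  # program must end with a return
    return float(violations)

# Brute-force "inference": pick the candidate with minimal energy.
candidates = list(product(TOKENS, repeat=3))
best = min(candidates, key=energy)
print("\n".join(best), "| energy:", energy(best))
```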
AI agent created with Claude
Some of you might remember when I posted about SENTINEL, a security audit tool I built with Claude for scanning VPS servers, MikroTik routers, and n8n instances. Well, I didn't stop there. SENTINEL is now one skill inside a much bigger project called AETHER, an AI agent framework I've been building with Claude Code for the past 6 months.

What is AETHER? It's an AI agent that I talk to from Telegram like a coworker. I tell it what I need in plain language and it gets it done. Some real examples from today:
- "How are the servers?" → Full health check, 10 Docker containers listed, all running.
- "Any suspicious IPs?" → 5 malicious IPs detected and blocked. One had 291 requests with 114 errors.
- "Send an email to José, meeting Wednesday at 12" → Drafts the email, shows me the preview, I say "confirm", email sent. I open Gmail and there it is.
- "Tech news?" → Summary of 7 articles from multiple sources.
- "Any new emails?" → Lists unread messages with sender, subject and summary.
- "List n8n workflows" → 6 active workflows listed.

All from my phone. No SSH. No dashboards. Just Telegram.

How Claude helped me build this: I'm not a developer. I'm 50 years old and I run a small telecom company. Claude Code has been my engineering team. The architecture decisions and product vision are mine, but Claude writes the code. What started as a simple Python bot in September 2025 that returned {"status": "healthy"} is now a full framework with:
- Python + TypeScript + FastAPI + PostgreSQL + Redis + Docker
- SENTINEL integrated as one of 25+ skills
- 110+ tools total
- Telegram, Discord, WhatsApp, REST API, WebSocket
- Semantic memory (pgvector): it remembers context across sessions
- Security: prompt injection firewall, session guard, rate limiting, 39 protections
- Prometheus + Grafana for monitoring

But here's the crazy part: I'm running 4 instances of AETHER right now, each doing a completely different job:
- AETHER Principal: manages my VPS infrastructure (the one I showed above)
- AETHER Trader: trading terminal with technical analysis, Binance integration, risk advisor
- Divina: web agent for a beauty business
- Tecofri: telecom expert for my company's website

Same codebase. Different skills enabled. Different personality configured. SENTINEL went from being a standalone project to being one skill inside a much larger ecosystem. And it's all built with Claude. Some late nights (4am sessions are not uncommon), but the results speak for themselves.

Published the first LinkedIn posts today and the response has been great. Just wanted to share the progress with the community that saw the beginning. Thanks to everyone who gave feedback on SENTINEL; it pushed me to keep going.

What would you build with an AI agent framework?

submitted by /u/Relative-Cattle5408
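The post describes the shape of the system rather than its code. A stripped-down sketch of the "skills" idea (a chat message arrives, gets routed to a registered skill, and the reply goes back) might look like this; the skill names and the keyword routing are invented for illustration and are not AETHER's real implementation:

```python
# Stripped-down sketch of a skills registry like the one described above:
# a chat message is routed to whichever registered skill claims it. Skill
# names and the keyword routing are illustrative only.
from typing import Callable

SKILLS: dict = {}

def skill(name: str):
    """Register a function as a skill the agent can invoke."""
    def register(fn: Callable):
        SKILLS[name] = fn
        return fn
    return register

@skill("server_health")
def server_health(message: str) -> str:
    # In a real system this would shell out or call the Docker API.
    return "10 containers running, all healthy."

@skill("email_draft")
def email_draft(message: str) -> str:
    return "Draft ready: 'Meeting Wednesday at 12'. Reply 'confirm' to send."

ROUTES = {"server": "server_health", "email": "email_draft"}

def handle(message: str) -> str:
    """Very naive router; a real agent would let the LLM pick the skill."""
    for keyword, name in ROUTES.items():
        if keyword in message.lower():
            return SKILLS[name](message)
    return "No skill matched; falling back to the LLM."

print(handle("How are the servers?"))
print(handle("Send an email to José, meeting Wednesday at 12"))
```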
The continued improvement of image models
For quite a while we had a lot of trouble with vectors. Basically the arrows would point in the wrong directions, or even in inconsistent directions within the same image. And then a new model dropped that improved the images significantly. I won't tell you which model it was, whether it was OpenAI or Gemini or someone else, because it doesn't matter.

The best part from our perspective is that competition between AI companies is improving models for everybody, so we get to win no matter who is building the model. In fact, at Visual Book we use multiple different image models based on the context and pricing. And so the biggest realisation for us is that we want more competition. As OpenAI, Gemini and others compete with each other and models keep improving, we get to leverage the best of them for our applications.

We are not the ones to pick a side and shout slogans. We are cheering for everybody :) Because this way we get to provide our customers with beautiful and accurate images and the best possible experience.

submitted by /u/simplext
Yes, Beautiful.ai offers a free tier. Pricing found: $12
Key features include:
- A topic or short prompt
- A detailed prompt with slide-by-slide instructions
- A pasted outline (including from other LLMs)
- A source document you want to turn into slides
- Theme selection (out-of-the-box, Team themes, or bespoke themes)
- Image preferences (AI-generated, web images, stock, or none)
- Presentation language (100+ options)
- Optional consistent AI image style (custom prompt or presets)
Beautiful.ai is commonly used for: The next evolution of AI presentations.
Based on 23 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.