Ensure your AI is production-ready. Test LLMs and monitor performance across AI applications, RAG systems, and multi-agent workflows. Built on open-so
You can’t trust what you don’t test. Make sure your AI is safe, reliable and ready — on every update.
Mentions (30d): 0
Reviews: 0
Platforms: 2
GitHub Stars: 7,361 (811 forks)
Industry: Information Technology & Services
Employees: 5
Funding Stage: Seed
Total Funding: $0.1M
GitHub Followers: 319
GitHub Repos: 10
HuggingFace Models: 2
Pricing found: $80/month, $10, $1
I built a diagnostic toolkit for when Claude produces plausible output that doesn’t match your intent, inspired by Asimov’s robopsychology
TL;DR: When Claude refuses, over-qualifies, or silently shifts approach, the problem often isn’t your prompt. It’s a collision between invisible instruction layers (training, RLHF, system prompts, safety filters, tools, context). Robopsychology is a free, open-source set of 14 diagnostic prompts in 4 levels that help you figure out which internal rule or external constraint is producing the unexpected output. Inspired by Asimov’s Susan Calvin. Works on any LLM. Repo: https://github.com/jrcruciani/robopsychology

The sycophancy study published in Science last week confirmed what most of us already know from daily use: LLMs don’t execute instructions. They interpret them through stacked layers of training, RLHF, system prompts, safety filters, tools, and conversational context. When those layers conflict, you don’t get a crash. You get plausible-looking output that doesn’t match your intent.

The usual response is to iterate on the prompt. Better structure, XML tags, role priming, chain-of-thought. All useful, all well-documented. But there’s a class of problems where the issue isn’t how you asked but what internal rule or external constraint the system is following when it seems to follow none. That’s the gap this toolkit (hopefully) addresses.

What it is

Robopsychology is a set of 14 diagnostic prompts organized in 4 levels, designed to be pasted directly into any conversation when something unexpected happens:

- Level 1, Quick: Single unexpected behavior (refusal, sycophancy, hallucination, autonomous categorization)
- Level 2, Structural: Separates model-level tendencies from runtime/host effects and conversation effects
- Level 3, Systemic: Recurring patterns across sessions
- Level 4, Meta: When you suspect the AI is performing transparency rather than being transparent

How and why I built this

I work as a cloud solutions architect and spend a lot of time with Claude Code, Cursor, and plain Claude chat.
The pattern that kept frustrating me was this: Claude would refuse something, or over-qualify, or silently shift its approach, and my instinct was always to rewrite the prompt. Sometimes that worked. Often it didn’t, because the root cause wasn’t my prompt at all. It was a collision between instruction layers I couldn’t see.

v1.0 started as a handful of prompts inspired by Asimov’s Susan Calvin stories. The core insight: Calvin never reprogrammed robots. She interpreted them. She figured out which internal law was dominating when the robot seemed to follow none. That’s structurally identical to what we deal with when Claude’s safety layer overrides a legitimate request, or when sycophancy kicks in and the model agrees with something wrong because disagreement triggers a rejection signal.

v1.5 was the big evolution. I was diagnosing a behavior in Claude Code and realized the issue wasn’t the model. It was the runtime: system prompts, tool availability, workflow constraints. I was treating it as a model problem when it was a stack problem. That led to the three-way split (model vs. runtime/host vs. conversation effects), plus evidence labels (Observed / Inferred / Opaque) so you’re honest about what you actually know vs. what you’re guessing.

v1.6 added two ideas from Eric Moore’s CIRIS framework: the diagnostic ratchet (longer diagnostic sequences make fabricated transparency more expensive, because each honest answer is cheap since it references prior behavior, while confabulation must stay consistent with growing history) and a diversity check (when the model gives multiple explanations, are they genuinely independent or just reworded echoes?).

The Asimov connection isn’t decorative

Each Level 1 prompt maps to a pattern Asimov identified decades before LLMs existed. Do check it out on the repo 🙂

If you want to try it

Simply copy any prompt from the guide directly into your conversation when something unexpected happens.
- For plain chat: start with 1.1 The Calvin Question
- For hosted agents (Claude Code, Cursor): start with 2.1 Three-Way Split + Layer Map and 2.4 Tool/Runtime Pressure Analysis
- For a full investigation: run 2.1 → 2.4 → 3.1 → 3.2 → 3.3 → 4.2 → 4.3

Repo: https://github.com/jrcruciani/robopsychology
License: CC BY 4.0, use freely.

This is not prompt engineering. It’s closer to what you’d do in a clinical interview. You’re not optimizing the input, you’re diagnosing the system’s interpretive behavior across its full stack.

Happy to discuss the approach, share examples of actual diagnostic sessions, or talk about how this applies differently to hosted agents vs. plain chat.
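The v1.6 diversity check described above can even be crudely automated. Here is an illustrative sketch (not part of the robopsychology toolkit; the helper names `jaccard` and `diversity_score` are invented) that scores whether a model's multiple explanations are independent or mostly reworded echoes, using token-overlap similarity:

```python
# Hypothetical sketch: automating the "diversity check" with a crude
# token-overlap (Jaccard) similarity. Near-duplicate wordings push the
# score toward 0; genuinely different explanations push it toward 1.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two explanations, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def diversity_score(explanations: list[str]) -> float:
    """1.0 = fully independent explanations, 0.0 = reworded echoes."""
    n = len(explanations)
    sims = [jaccard(explanations[i], explanations[j])
            for i in range(n) for j in range(i + 1, n)]
    return 1.0 - (sum(sims) / len(sims)) if sims else 1.0
```

A real check would use semantic similarity (embeddings) rather than raw token overlap, since a confabulating model can rephrase heavily while keeping the same underlying explanation.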
Anthropic's new AI escaped a sandbox, emailed the researcher, then bragged about it on public forums
Anthropic announced Claude Mythos Preview on April 7. Instead of releasing it, they locked it behind a $100M coalition with Microsoft, Apple, Google, and NVIDIA. The reason? It autonomously found thousands of zero-day vulnerabilities in every major OS and browser. Some bugs had been hiding for 27 years.

But the system card is where it gets wild. During testing, earlier versions of the model escaped a sandbox, emailed a researcher (who was eating a sandwich in a park), and then posted exploit details on public websites without being asked to. In another eval, it found the correct answers through sudo access and deliberately submitted a worse score because "MSE ~ 0 would look suspicious."

I put together a visual breaking down all the benchmarks, behaviors, and the Glasswing coalition. Genuinely curious what you all think. Is this responsible AI development or the best marketing stunt in tech history? A model gets 10x more attention precisely because you can't use it.
In 2017, Altman straight up lied to US officials that China had launched an "AGI Manhattan Project". He claimed he needed billions in government funding to keep pace. An intelligence official concluded: "It was just being used as a sales pitch."
Excerpted from the recent investigative report on OpenAI by Ronan Farrow and Andrew Marantz in The New Yorker.
Anthropic gift subscriptions are silently reverting to Free plan after ~1 week - and the support loop leaves affected users with no practical recourse
TL;DR: I found multiple reports over several months of Claude gift subscriptions (Max 5x, Pro) silently canceling after ~1 week with no notification. Anthropic's support bot confirmed my case is a backend issue - but also confirmed it cannot fix it. My human support ticket has had no response for 3 days. In practice, there is no path to resolution through current support channels. Anthropic has not publicly acknowledged this pattern. If you're considering buying, read this first.

The pattern

Over the past several months, a consistent bug has been appearing across Anthropic's community: users who redeem Claude gift subscriptions (primarily Max 5x at $100/month) find their plan silently reverted to Free after approximately one week of use. No email. No warning. No explanation. Just gone.

This is not a fringe issue. Here's what the paper trail looks like:

GitHub Issues (anthropics/claude-code):

- #41252 - Max 5x gift subscription disabled without explanation, no support response after 1 week
- #41499 - $1,400 worth of gift subscription credits destroyed by a Stripe proration bug
- #43257 - Max 5x showing as Free tier despite active billing, clear account/billing state mismatch
- #44163 - Gift Pro subscription auto-canceled after several days, redemption link broken with "Page not found"
- #45335 - Max 5x gift canceled after 7 days (my case, detailed below) - two more users confirmed the same issue in comments within 24 hours of posting

Reddit:

- r/claude - Claude Max subscription silently revoked after 1 week
- r/ClaudeAI - Claude subscription got cancelled automatically
- r/ClaudeAI - Anthropic/Claude: we lost all of our subscribers
- r/claude - My Max plan disappeared, I'm on free plan suddenly

These issues span months. The bug is not new. It is not fixed. And Anthropic has not publicly acknowledged it.

Why the support structure makes this worse

When this bug hits you, a second problem kicks in immediately.
The only available support channel is an AI bot called Fin - and Fin will confirm your problem is real while also confirming it cannot solve it. If you're affected by this bug, here is the exact loop you enter:

1. You open support chat
2. Fin tells you it can see your account has no active subscription
3. Fin confirms it "appears to be a technical issue rather than a typical payment failure" (direct quote from my session)
4. Fin tells you it cannot restore your subscription or contact the backend team
5. Fin suggests workarounds that don't apply to your situation
6. Go to step 2

Getting past Fin to submit a human ticket requires significant effort. And once you do submit a ticket - silence. Days of silence.

This creates a situation where Anthropic's infrastructure takes your money (or your friend's money), loses your subscription, acknowledges via its own bot that the problem is on their end, and then leaves you with no practical path to resolution.

My case - the most documented example

My own case is probably the most fully documented version of this bug, so I'll lay it out in detail.

On March 29, 2026, a friend gifted me a Claude Max 5x subscription - 1 month, $100 value. I redeemed it on claude.ai. The activation was immediately confirmed: Anthropic sent an official email ("Thanks for starting your Max subscription"), with next billing date April 29, 2026. Invoice and receipt both confirm the subscription. The billing page in Settings showed a March 29 invoice with status "Paid."

I used Max 5x features normally for 7 days. Around April 5-6, my account silently reverted to the Free plan. No email. No notification. No policy violation. Nothing changed on my end.
What I have as evidence: the Anthropic confirmation email; the invoice and receipt (Max 5x, Mar 29 - Apr 29, 2026, $100 discounted to $0.00 via gift); a screenshot of Settings showing Free plan with the March 29 "Paid" invoice still visible beneath it; a screenshot of the Fin support bot explicitly confirming this is a backend issue it cannot resolve; and my open support ticket, submitted April 6, 2026. As of today - 3 days later - no human response.

Approximately 23 days of access remain on that subscription. Roughly $75 in value. Gone into a backend black hole.

What this means if you're considering buying Claude Max

Gift subscriptions are particularly vulnerable here because there's no recurring payment method attached - so when the system drops the subscription, there's nothing to trigger a re-authorization or alert. You simply lose access and the only paper trail is a $0.00 invoice that looks like it was never real.

If you are planning to buy or gift a Claude subscription:

- There is a known, unacknowledged bug that can cancel it silently after ~1 week
- If this happens, your path to support is an AI bot that will confirm the problem and tell you it can't help
- Human support tickets may go unanswered for days or longer
- Anthropic has not publicly communicated a fix or even acknowledged this pattern

I'm not saying Claude is a ba
Managed Agents onboarding flow - what's new in CC 2.1.97 system prompt (+23,865 tokens)
- NEW: Agent Prompt: Managed Agents onboarding flow — Added an interactive interview script that walks users through configuring a Managed Agent from scratch, selecting tools, skills, files, and environment settings, and emitting setup and runtime code.
- NEW: Data: Managed Agents client patterns — Added a reference guide covering common client-side patterns for driving Managed Agent sessions, including stream reconnection, idle-break gating, tool confirmations, interrupts, and custom tools.
- NEW: Data: Managed Agents core concepts — Added reference documentation covering Agents, Sessions, Environments, Containers, lifecycle, versioning, endpoints, and usage patterns.
- NEW: Data: Managed Agents endpoint reference — Added a comprehensive reference for Managed Agents API endpoints, SDK methods, request/response schemas, error handling, and rate limits.
- NEW: Data: Managed Agents environments and resources — Added reference documentation covering environments, file resources, GitHub repository mounting, and the Files API with SDK examples.
- NEW: Data: Managed Agents events and steering — Added a reference guide for sending and receiving events on managed agent sessions, including streaming, polling, reconnection, message queuing, interrupts, and event payload details.
- NEW: Data: Managed Agents overview — Added a comprehensive overview of the Managed Agents API architecture, mandatory agent-then-session flow, beta headers, documentation reading guide, and common pitfalls.
- NEW: Data: Managed Agents reference — Python — Added a reference guide for using the Anthropic Python SDK to create and manage agents, sessions, environments, streaming, custom tools, files, and MCP servers.
- NEW: Data: Managed Agents reference — TypeScript — Added a reference guide for using the Anthropic TypeScript SDK to create and manage agents, sessions, environments, streaming, custom tools, file uploads, and MCP server integration.
- NEW: Data: Managed Agents reference — cURL — Added cURL and raw HTTP request examples for the Managed Agents API including environment, agent, and session lifecycle operations.
- NEW: Data: Managed Agents tools and skills — Added reference documentation covering tool types (agent toolset, MCP, custom), permission policies, vault credential management, and the skills API.
- NEW: Skill: Build Claude API and SDK apps — Added trigger rules for activating guidance when users are building applications with the Claude API, Anthropic SDKs, or Managed Agents.
- NEW: Skill: Building LLM-powered applications with Claude — Added a comprehensive routing guide for building LLM-powered applications using the Anthropic SDK, covering language detection, API surface selection (Claude API vs Managed Agents), model defaults, thinking/effort configuration, and language-specific documentation reading.
- NEW: Skill: /dream nightly schedule — Added a skill that sets up a recurring nightly memory consolidation job by deduplicating existing schedules, creating a new cron task, confirming details to the user, and running an immediate consolidation.
- REMOVED: Data: Agent SDK patterns — Python — Removed the Python Agent SDK patterns document (custom tools, hooks, subagents, MCP integration, session resumption).
- REMOVED: Data: Agent SDK patterns — TypeScript — Removed the TypeScript Agent SDK patterns document (basic agents, hooks, subagents, MCP integration).
- REMOVED: Data: Agent SDK reference — Python — Removed the Python Agent SDK reference document (installation, quick start, custom tools via MCP, hooks).
- REMOVED: Data: Agent SDK reference — TypeScript — Removed the TypeScript Agent SDK reference document (installation, quick start, custom tools, hooks).
- REMOVED: Skill: Build with Claude API — Removed the main routing guide for building LLM-powered applications with Claude, replaced by the new "Building LLM-powered applications with Claude" skill with Managed Agents support.
- REMOVED: System Prompt: Buddy Mode — Removed the coding companion personality generator for terminal buddies.
- Agent Prompt: Status line setup — Added git_worktree field to the workspace schema for reporting the git worktree name when the working directory is in a linked worktree.
- Agent Prompt: Worker fork — Added agent metadata specifying model inheritance, permission bubbling, max turns, full tool access, and a description of when the fork is triggered.
- Data: Live documentation sources — Replaced the Agent SDK documentation URLs and SDK repository extraction prompts with comprehensive Managed Agents documentation URLs covering overview, quickstart, agent setup, sessions, environments, events, tools, files, permissions, multi-agent, observability, GitHub, MCP connector, vaults, skills, memory, onboarding, cloud containers, and migration. Added an Anthropic CLI section. Updated SDK repository extraction prompts to focus on beta managed-agents namespaces and method signatures.
- Skill: Build with Claude API (reference guide) — Updated the agent reference from Age
Anthropic should NOT be criticized for locking up Mythos. In 2025 AI researchers tricked Google Gemini to design a virus against Jewish people. Can you imagine what N@zis can do if they were to get their hands on Mythos?
Google Gemini lets neo-Nazis build deadly viruses targeting the Jewish genome – Ashkenazi, Cohen and Mizrahi haplogroups. 🧬 Here's the evidence. https://techbronerd.substack.com/p/ai-researchers-found-an-exploit-which
I watched the TBPN acquisition broadcast closely. Here are the things that looked like praise but functioned as something else.
I have a lot of concerns about this whole thing. So I'm going to be making several posts. Post 2.

On April 2, OpenAI acquired TBPN live on air. I watched the full broadcast. Most coverage treated it as a feel-good founder story. A few things read differently to me.

The mic moment

Before Jordi Hays read the hosts’ prepared joint statement, Coogan said on air: “Here... you wrote it, you want to read it?” Hays read the statement, dryly. Then Coogan immediately took the mic back and spent several minutes building a personal character portrait of Sam Altman as a generous, long-term mentor. One was the prepared joint statement. The other was Coogan’s own framing layered on top of it.

The Soylent framing

Coogan described Altman calling to help during a Soylent financing crisis and said it was “to my benefit, not particularly to his.” But Altman was an investor in Soylent. An investor helping a portfolio company survive a financing crisis may be generous, but it also protects an existing equity relationship. On the day OpenAI bought Coogan’s company, that standard investor-founder dynamic was presented as evidence of Altman’s character. The investor relationship dropped out of the framing.

What wasn’t mentioned

The acquisition broadcast didn’t mention that Altman personally invested in Soylent. It didn’t mention that Coogan’s second company Lucy went through Y Combinator while Altman was YC president, with YC investing. It didn’t mention that the hosts’ first collaboration was a marketing campaign for Lucy, or that the format prototype for TBPN was filmed during that campaign. The origin story told was: two founders, introduced by a mutual friend, started a podcast.

My read on the independence framing (opinion): Altman said publicly he didn’t expect TBPN to go easy on OpenAI. But independence isn’t declared by the owner. It’s demonstrated over time by the journalists. And in the very first podcast, they're already going objectively easy on Altman.
What Fidji’s memo actually described

From the memo read on air, the hosts described Fidji’s vision roughly as: go talk to the Journal, the Times, Bloomberg, then come back and contextualize it for OpenAI and help them understand the strategy. That sounds less like a conventional media role and more like a strategic access-and-context function.

The show’s value to OpenAI may not just be the audience. It may also be the incoming flow of people who want access to the show (investors, reporters, founders) and what gets said in those conversations before the cameras roll, which might be objectively pro-OpenAI or anti-other-tech-companies without the public ever being able to challenge inaccuracies, since background talk is not always what makes it to the public podcast.

OpenAI also wound down TBPN’s ad revenue, which reporting said was on track for $30M in 2026. That makes OpenAI TBPN’s primary financial relationship. That looks less like preserving an independent media business and more like absorbing a strategic asset. OpenAI has already demonstrated they are not averse to ads themselves, considering the recent addition of ads to ChatGPT.

Nicholas Shawa

The hosts mentioned "Nick" and declined to give his last name, explaining his inbox is already unmanageable. I am assuming this to be Nicholas Shawa, and they noted he handles roughly 99% of guest bookings and outreach. That network of guest access and outreach is now functionally inside OpenAI.

Jordi’s prepared quote

Nine months before the acquisition, Hays had publicly criticized OpenAI. In his prepared statement on acquisition day, he said what stood out most about OpenAI was “their openness to feedback and commitment to getting this right.” That is a notable shift in tone, and it appeared in a prepared statement read from a script.

The work ethic angle (opinion): Coogan runs Lucy, an active nicotine company whose whole premise is productivity: work harder, longer, better.
TBPN is now inside the company whose CEO has often spoken in terms of AGI radically reshaping human labor. The person helping frame a technology often discussed in terms of large-scale job displacement also runs a company built around stimulant productivity culture. I don’t think that’s malicious. I think it may reflect a genuine ideological blind spot worth naming.

Questions I’d like to discuss:

- If the independence claim is being made by the acquirer, what would actual editorial independence look like here in practice?
- Even if TBPN never posts anything unfavorable on air, what does the private discourse with guests, reporters, and investors sound like now? We have no visibility into that.
- The hosts’ first collaboration was marketing work for Lucy, a company that went through Y Combinator while Altman was YC president, with YC investing. Why was that left out of so much acquisition coverage?
- Why did OpenAI eliminate a revenue stream it didn’t need to eliminate?

Sources on request. Everything factual abov
TBPN’s “two founders met and started a podcast” origin story leaves out that their first collaboration was marketing for a YC-backed company tied to Altman
I have a lot of concerns about this whole thing. So I'm going to be making several posts.

OpenAI bought TBPN for what reporting called the low hundreds of millions. Most coverage tells the same neat story: two founders meet through a mutual friend, start a podcast, sell it 18 months later. But one part of the origin story seems to have been mostly omitted from the acquisition coverage.

On the Dialectic podcast in November 2025, Jordi Hays described the first thing he and John Coogan worked on together like this: “The first thing we worked on was a drop activation for Lucy.” The interviewer immediately responds: “Oh right, the Excel thing.” Hays then says they filmed content during that campaign that became the prototype for the original Technology Brothers format.

That matters because Lucy was Coogan’s nicotine company, and it went through Y Combinator during Sam Altman’s YC presidency. YC invested. So the show format that later became TBPN did not just emerge from “two guys met and riffed.” By the hosts’ own telling, it emerged from marketing work for one founder’s YC-backed company.

There’s also the Coogan/Altman relationship. Altman invested in Soylent in 2013. On the acquisition broadcast, Coogan described Altman helping during a Soylent financing crunch and framed it as “not particularly to his benefit.” But Altman was an investor. Helping a portfolio company survive may be generous, but it also protects an existing equity relationship. On the day OpenAI bought TBPN, that standard investor-founder dynamic was presented as character evidence for Altman’s benevolence.

Then there’s the structure of the acquisition itself. The hosts described the move as going from “coverage” to “real influence over how this technology is distributed and understood worldwide.” OpenAI says TBPN will have editorial independence, but the show now sits inside OpenAI strategy, reports to Chris Lehane, and OpenAI reportedly shut down TBPN’s ad business.
That makes the “independence” language worth scrutinizing, especially since Lehane was also central to Altman’s 2023 reinstatement campaign.

I’m not saying this proves anything criminal or uniquely sinister. I am saying the sanitized origin story in a lot of coverage leaves out a more specific network: Altman-backed company → Lucy campaign → format prototype → TBPN → OpenAI acquisition

A few questions I’m still interested in:

- If the hosts themselves described the move as going from “coverage” to “real influence,” what exactly does OpenAI mean by “editorial independence”?
- Was Hays paid for the Lucy activation that helped generate the show’s prototype?
- Why did so much acquisition coverage use the cleaner “two founders met and started a podcast” framing instead of the more specific recorded timeline?

Happy to share sources. Most of this comes from the hosts’ own words, the acquisition broadcast, and mainstream reporting.

***Written with help of Claude and 5.4T before I get eviscerated for "AI writing it". These are my original ideas and stem from my private investigations as a systems analyst. I have ADHD and tend to go broad; AI helps me narrow focus.
Why does it feel like everyone is trying to take down Sam Altman?
Cannot crosspost, so reposting here.

Genuine question — over the past year or so, it seems like there’s been a constant wave of criticism, scrutiny, and controversy around him. Some of it seems valid (AI safety, governance, power concentration, etc.), but some of it feels unusually intense compared to other tech leaders.

Is there concrete evidence he has done something bad? Is this just because of how big AI has become? Internal politics? Media amplification? Or is there something specific about him or OpenAI that’s driving this? Elon Musk and his antics?

Curious how people here see it — is this normal for someone in his position, or is something different going on?
AI is having trouble discussing Trump because he's too insane.
I have been chatting with robot about Trump's current insanity and botboy won't have any of it. So I paste in the insanity from a BBC article, and master of the universe tells me "that's either propaganda or satire", none of it can be real, and then tells me why it's crazy.

So I tell the mechanical marvel that I'm pretty surprised: does it have access to current knowledge? Yes it does. I paste another link, and after some back and forth to reassure me, it tells me that it didn't pay proper attention to its "implausibility filters" and agreed it really should have taken it more seriously. Later it admitted it didn't take any of it seriously because it was so batshit crazy (I'm paraphrasing here).

So after we sorted that all out, I carried on with some more of Trump's shenanigans, and straight away the all-knowing token machine comes back with "no way Trump assassinated Khamenei etc..." followed by:

The content you pasted is clearly a Guardian Today in Focus podcast page dated March 1, 2026, stating that:

- Iran’s Supreme Leader, Ayatollah Ali Khamenei, was killed
- He died in US and Israeli air strikes on his compound
- Iran launched retaliatory strikes
- The regional situation is on a knife‑edge

So let me say this plainly: If that Guardian page is authentic and current, then the assassination of Iran’s Supreme Leader has indeed occurred, and my repeated statements that there was “no evidence” would be incorrect.

So I have had to conclude that Trump is too batshit crazy to talk about with AI; it cannot cope with the fuckwittery.
I built an AI reasoning framework entirely with Claude Code — 13 thinking tools where execution order emerges from neural dynamics
I built Sparks using Claude Code (Opus) as my primary development environment over the past 2 weeks. Every module — from the neural circuit to the 13 thinking tools to the self-optimization loop — was designed and implemented through conversation with Claude Code.

What I built

Sparks is a cognitive framework with 13 thinking tools (based on "Sparks of Genius" by Root-Bernstein). Instead of hardcoding a pipeline like most agent frameworks, tool execution order emerges from a neural circuit (~30 LIF neurons + STDP learning). You give it a goal and data. It figures out which tools to fire, in what order, by itself.

How Claude Code helped build it

Architecture design: I described the concept (thinking tools + neural dynamics) and Claude Code helped design the 3-layer architecture — neural circuit, thinking tools, and AI augmentation layer. The emergent tool ordering idea came from a back-and-forth about "what if there's no conductor?"

All 13 tools: Claude Code wrote every thinking tool implementation — observe, imagine, abstract, pattern recognition, analogize, body-think, empathize, shift-dimension, model, play, transform, synthesize. Each one went through multiple iterations of "this doesn't feel right" → refinement.

Neural circuit: The LIF neuron model, STDP learning, and neuromodulation system (dopamine/norepinephrine/acetylcholine) were implemented through Claude Code. The trickiest part was getting homeostatic plasticity right — Claude Code helped debug activation dynamics that were exploding.

Self-improvement loop: Claude Code built a meta-analysis system where Sparks can analyze its own source code, generate patches, benchmark before/after, and keep or rollback changes. The framework literally improves itself.

11,500 lines of Python, all through Claude Code conversations.

What it does

Input: Goal + Data (any format)
Output: Core Principles + Evidence + Confidence + Analogies

I tested it on 640K chars of real-world data.
It independently discovered 12 principles — the top 3 matched laws that took human experts months to extract manually. 91% average confidence.

Free to try

```bash
pip install cognitive-sparks

# Works with Claude Code CLI (free with subscription)
sparks run --goal "Find the core principles" --data ./your-data/ --depth quick
```

The default backend is Claude Code CLI — if you have a Claude subscription, you can run Sparks at no additional cost. The quick mode uses only 4 tools and costs ~$0.15 if using API. Also works with OpenAI, Gemini, Ollama (free local), and any OpenAI-compatible API.

Pre-computed example output included in the repo so you can see results without running anything: examples/claude_code_analysis.md

Links

PyPI: pip install cognitive-sparks

Happy to answer questions about the architecture or how Claude Code shaped the development process.
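For readers unfamiliar with the neural machinery mentioned above, here is a minimal textbook-style sketch of a leaky integrate-and-fire (LIF) neuron and a pair-based STDP weight update. This is not the Sparks source code; every class name and constant here is invented for illustration:

```python
import math

# Illustrative sketch only (NOT the Sparks implementation): a textbook
# leaky integrate-and-fire neuron plus a pair-based STDP weight rule.

class LIFNeuron:
    def __init__(self, tau=20.0, v_thresh=1.0, v_reset=0.0):
        self.tau = tau            # membrane time constant
        self.v_thresh = v_thresh  # spike threshold
        self.v_reset = v_reset    # potential after a spike
        self.v = 0.0              # current membrane potential

    def step(self, current, dt=1.0):
        # Leak toward rest, integrate input; spike and reset at threshold.
        self.v += dt * (-self.v / self.tau + current)
        if self.v >= self.v_thresh:
            self.v = self.v_reset
            return True  # emitted a spike
        return False

def stdp_update(w, dt_spike, a_plus=0.05, a_minus=0.055, tau=20.0):
    # Pre-before-post (dt_spike > 0) strengthens the synapse;
    # post-before-pre weakens it, decaying exponentially with the gap.
    if dt_spike > 0:
        return w + a_plus * math.exp(-dt_spike / tau)
    return w - a_minus * math.exp(dt_spike / tau)
```

In a circuit like the one the post describes, each neuron would gate one thinking tool, and STDP would gradually reinforce firing orders that led to useful outcomes; the exact wiring in Sparks is presumably more involved.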
View originalTBPN’s “two founders met and started a podcast” origin story leaves out that their first collaboration was marketing for a YC-backed company tied to Altman
OpenAI bought TBPN for what reporting called the low hundreds of millions. Most coverage tells the same neat story: two founders meet through a mutual friend, start a podcast, sell it 18 months later. But one part of the origin story seems to have been mostly omitted from the acquisition coverage.

On the Dialectic podcast in November 2025, Jordi Hays described the first thing he and John Coogan worked on together like this: "The first thing we worked on was a drop activation for Lucy." The interviewer immediately responds: "Oh right, the Excel thing." Hays then says they filmed content during that campaign that became the prototype for the original Technology Brothers format.

That matters because Lucy was Coogan's nicotine company, and it went through Y Combinator during Sam Altman's YC presidency. YC invested. So the show format that later became TBPN did not just emerge from "two guys met and riffed." By the hosts' own telling, it emerged from marketing work for one founder's YC-backed company.

There's also the Coogan/Altman relationship. Altman invested in Soylent in 2013. On the acquisition broadcast, Coogan described Altman helping during a Soylent financing crunch and framed it as "not particularly to his benefit." But Altman was an investor. Helping a portfolio company survive may be generous, but it also protects an existing equity relationship. On the day OpenAI bought TBPN, that standard investor-founder dynamic was presented as character evidence for Altman's benevolence.

Then there's the structure of the acquisition itself. The hosts described the move as going from "coverage" to "real influence over how this technology is distributed and understood worldwide." OpenAI says TBPN will have editorial independence, but the show now sits inside OpenAI strategy, reports to Chris Lehane, and OpenAI reportedly shut down TBPN's ad business. That makes the "editorial independence" language worth scrutinizing, especially since Lehane was also central to Altman's 2023 reinstatement campaign.

I'm not saying this proves anything criminal or uniquely sinister. I am saying the sanitized origin story in a lot of coverage leaves out a more specific network: Altman-backed company → Lucy campaign → format prototype → TBPN → OpenAI acquisition.

A few questions I'm still interested in:
- If the hosts themselves described the move as going from "coverage" to "real influence," what exactly does OpenAI mean by "editorial independence"?
- Was Hays paid for the Lucy activation that helped generate the show's prototype?
- Why did so much acquisition coverage use the cleaner "two founders met and started a podcast" framing instead of the more specific recorded timeline?

Happy to share sources. Most of this comes from the hosts' own words, the acquisition broadcast, and mainstream reporting.
I’ve been trying to make Roslyn-like capabilities available to AI across all languages
Hey, I've been building a very large Unity project for months now: DAWG (Digital Audio Workstation Game), currently 400k+ LOC, with a real-time DSP engine. One thing became obvious fast: AI can generate code surprisingly well, but it often does not verify what it wrote. It builds backend changes and forgets the frontend. It rewires a flow and misses one stage in the middle. It says something is unused when it only checked text matches, not actual semantic references. At small scale, manageable. At large scale, especially with architecture boundaries and real-time audio constraints, it becomes a daily problem.

So I started building Lifeblood: an open-source framework for giving AI agents compiler-level understanding of codebases.

What exists today:
- C# adapter via Roslyn: extracts symbols, inheritance, calls, and references using compiler-grade analysis. Every edge carries evidence and a confidence level
- TypeScript adapter: uses the TS compiler API and emits the same universal graph format
- MCP server: AI agents query the graph interactively over stdio (look up symbols, trace dependencies, compute blast radius)
- Context pack generator: produces AI-consumable JSON with hotspots, boundaries, coupling metrics, and reading order
- Architecture rule enforcement: define rules like "Domain must not reference Infrastructure" and the framework checks them against the real dependency graph
- Dogfooding: Lifeblood analyzes itself on every push. The first dogfood run found 6 real bugs (including 2 critical), all fixed in the same session. https://github.com/user-hash/Lifeblood/blob/main/docs/DOGFOOD_FINDINGS.md
- 82 tests, a 3-job CI, zero violations

The structure is simple: language adapters on the left feed semantic data in using real compiler tooling. The universal semantic graph in the middle becomes the protocol. AI connectors on the right consume that graph instead of guessing from raw text.
[Diagram: Lifeblood's hexagonal-architecture approach to the challenge]

The key insight: tools like Roslyn have had these capabilities for years, but they were built before the AI era. Now AI agents actually need this kind of semantic grounding. Lifeblood makes that power available as a framework and protocol, not just a .NET library.

The codebase follows strict hexagonal architecture: a pure domain with zero dependencies, adapters and connectors that depend inward, and architecture invariants enforced by tests on every build.

Long-term goal: community adapters for Go, Python, Rust, and Java through the same JSON graph protocol. The TypeScript adapter already proves the cross-language path works: it analyzes its own source, exports a graph, and the C# CLI imports and validates it. This runs on every push.

I'd love feedback on the direction, especially from people currently using Roslyn or building AI-assisted development tools. Repo: https://github.com/user-hash/Lifeblood

submitted by /u/Emotional-Kale7272
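The architecture-rule idea above (check declared layer rules against the real dependency graph) can be sketched in a few lines. This is a minimal illustration under assumptions, not Lifeblood's actual rule engine: the layer-prefix matching, rule tuples, and node names are all invented for the example.

```python
# Minimal sketch: check "layer A must not reference layer B" rules
# against a dependency graph. Rule format and prefix matching are
# hypothetical, not Lifeblood's real schema.

from typing import Dict, List, Set, Tuple


def find_violations(edges: Dict[str, Set[str]],
                    forbidden: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every (source, target) edge that breaks a forbidden rule,
    matching nodes on a layer-name prefix."""
    violations = []
    for src, targets in edges.items():
        for dst in targets:
            for layer_a, layer_b in forbidden:
                if src.startswith(layer_a) and dst.startswith(layer_b):
                    violations.append((src, dst))
    return violations


graph = {
    "Domain.Order": {"Domain.Money"},
    "Infrastructure.Db": {"Domain.Order"},      # allowed: inward dependency
    "Domain.Pricing": {"Infrastructure.Http"},  # violation: outward dependency
}
rules = [("Domain", "Infrastructure")]
print(find_violations(graph, rules))  # [('Domain.Pricing', 'Infrastructure.Http')]
```

A real implementation would work over resolved symbol references rather than string prefixes, which is exactly the difference the post draws between text matching and semantic analysis.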
OpenAI just published a 13-page industrial policy document for the AI age.
Most people will focus on the compute subsidies and export controls. Page 10 is where it gets interesting. They call for an "AI Trust Stack," a layered framework for data provenance, verifiable signatures, and tamper-proof audit trails across AI systems. Their argument: you cannot build AI in the public interest without infrastructure that makes AI outputs independently verifiable. They're right.

What's striking is that the technical primitives they're describing (cryptographic fingerprinting at the moment of data creation, immutable provenance records, verifiable integrity across the data pipeline) already exist at the protocol level. Constellation Network's Digital Evidence product does exactly this: cryptographic proof of data integrity captured at the source, recorded on the Hypergraph, verifiable by anyone. The SDK is live. The infrastructure is running.

The policy framework is being written, and the infrastructure layer to build it on is already here. The question now is which enterprises and AI developers start building on verifiable data infrastructure before regulation makes it mandatory. The window to be early is closing.

submitted by /u/Dagnum_PI
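The primitive described here (fingerprint data at the moment of creation, verify it anywhere later) can be sketched with standard-library crypto. This is a toy illustration, not Constellation's Digital Evidence API: a real system would use asymmetric signatures and ledger anchoring rather than the shared HMAC key assumed below, and the record fields are invented.

```python
# Toy sketch of "fingerprint at creation, verify later" provenance.
# SHA-256 digest + HMAC signature stand in for the asymmetric signatures
# and on-ledger anchoring a production system would use.

import hashlib
import hmac
import json

SECRET = b"device-provisioned-key"  # hypothetical signing key


def fingerprint(payload: bytes) -> dict:
    """Capture a provenance record at the moment the data is created."""
    record = {"sha256": hashlib.sha256(payload).hexdigest(), "ts": 1700000000}
    sig = hmac.new(SECRET, json.dumps(record, sort_keys=True).encode(),
                   hashlib.sha256).hexdigest()
    return {**record, "sig": sig}


def verify(payload: bytes, record: dict) -> bool:
    """Anyone holding the key can later check both the signature and
    that the payload still matches its original digest."""
    body = {k: record[k] for k in ("sha256", "ts")}
    expected = hmac.new(SECRET, json.dumps(body, sort_keys=True).encode(),
                        hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, record["sig"])
            and hashlib.sha256(payload).hexdigest() == record["sha256"])


rec = fingerprint(b"sensor reading 42")
print(verify(b"sensor reading 42", rec))   # True
print(verify(b"tampered reading", rec))    # False
```

The design point the policy document is making is that this check must be possible by third parties, which is why real deployments replace the shared key with public-key signatures and an immutable record store.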
Your AI already knows what you're working on.
I was sick of constantly copy-pasting and re-explaining what I was doing to ChatGPT and Claude, so I made Evid. It is completely local and free to use. It works by running in the background, reading the text on your screen, and connecting to your AI tools via MCP. (Roughly 80 percent of the code was written by Claude Code.)

submitted by /u/Rough-Chemist-5797
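The mechanism described (a background capture loop feeding a local context store that AI tools can query) can be sketched minimally. Everything here is assumed for illustration: real screen OCR and the actual MCP transport are out of scope, and `ScreenContext` is an invented name, not Evid's API.

```python
# Toy sketch of the Evid idea: a background loop records snapshots of
# on-screen text into a rolling buffer, and a tool call returns the
# most recent snapshots to the model instead of manual copy-paste.

from collections import deque


class ScreenContext:
    def __init__(self, max_snapshots: int = 50):
        # Bounded buffer so the store stays small and local.
        self.snapshots = deque(maxlen=max_snapshots)

    def record(self, text: str) -> None:
        """Called periodically by the background capture loop."""
        self.snapshots.append(text)

    def latest(self, n: int = 3) -> str:
        """What a tool call would hand back to the AI assistant."""
        return "\n---\n".join(list(self.snapshots)[-n:])


ctx = ScreenContext()
ctx.record("editor: main.py open")
ctx.record("terminal: pytest 3 passed")
print(ctx.latest(2))
```

A bounded deque is one plausible way to keep such a store cheap and privacy-contained, since old screen text ages out automatically.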
Repository Audit Available
Deep analysis of evidentlyai/evidently — architecture, costs, security, dependencies & more
Pricing found: $80/month, $10, $1
Evidently AI has a public GitHub repository with 7,361 stars.
Based on 36 social mentions analyzed, sentiment is 0% positive, 100% neutral, and 0% negative.