Source-controlled AI checks on every pull request. Standards as checks, enforced by AI, decided by humans.
Based on the provided content, there are no actual reviews or social mentions specifically about "Continue" as a software tool. The social mentions appear to be from Hacker News and Lemmy discussing various unrelated topics including database tools (PgDog), containerization platforms (Coasts), and political articles, but none mention or review a product called "Continue." Without relevant user feedback about Continue specifically, I cannot provide a meaningful summary of user sentiment, strengths, complaints, or pricing opinions for this tool.
Mentions (30d): 5
Reviews: 0
Platforms: 5
GitHub Stars: 32,190 (4,308 forks)
Features
Industry: information technology & services
Employees: 19
Funding Stage: Seed
Total Funding: $2.2M
GitHub followers: 1,311
GitHub repos: 67
GitHub stars: 32,190
npm packages: 20
HuggingFace models: 8
Show HN: PgDog – Scale Postgres without changing the app
Hey HN! Lev and Justin here, authors of PgDog (https://pgdog.dev/), a connection pooler, load balancer and database sharder for PostgreSQL. If you build apps with a lot of traffic, you know the first thing to break is the database. We are solving this with a network proxy that works without requiring application code changes or database migrations.

Our post from last year: https://news.ycombinator.com/item?id=44099187

The most important update: we are in production. Sharding is used a lot, with direct-to-shard queries (one shard per query) working pretty much all the time. Cross-shard (or multi-database) queries are still a work in progress, but we are making headway.

Aggregate functions like count(), min(), max(), avg(), stddev() and variance() are working, without refactoring the app. PgDog calculates the aggregate in transit, while transparently rewriting queries to fetch any missing info. For example, a multi-database average calculation requires a total count of rows to recover the original sum. PgDog will add count() to the query if it's not there already, and remove it from the rows sent to the app.

Sorting and grouping work, including DISTINCT, if the column(s) are referenced in the result. Over 10 data types are supported, like timestamp(tz), all integers, varchar, etc.

Cross-shard writes, including schema changes (CREATE/DROP/ALTER), are now atomic and synchronized between all shards with two-phase commit. PgDog keeps track of the transaction state internally and will roll back the transaction if the first phase fails. You don't need to monkeypatch your ORM to use this: PgDog will intercept the COMMIT statement and execute PREPARE TRANSACTION and COMMIT PREPARED instead.

Omnisharded tables, a.k.a. replicated or mirrored (identical on all shards), support atomic reads and writes. That's important because most databases can't be completely sharded and will have some common data on all databases that has to be kept in sync.

Multi-tuple inserts, e.g., INSERT INTO table_x VALUES ($1, $2), ($3, $4), are split by our query rewriter and distributed to their respective shards automatically. They are used by ORMs like Prisma, Sequelize, and others, so those now work without code changes too.

Sharding keys can be mutated. PgDog will intercept and rewrite the update statement into 3 queries, SELECT, INSERT, and DELETE, moving the row between shards. If you're using Citus (for everyone else, Citus is a Postgres extension for sharding databases), this might be worth a look.

If you're like us and prefer integers to UUIDs for your primary keys, we built a cross-shard unique sequence directly inside PgDog. It uses the system clock (and a couple other inputs), can be called like a Postgres function, and will automatically inject values into queries, so ORMs like ActiveRecord will continue to work out of the box. It's monotonically increasing, just like a real Postgres sequence, and can generate up to 4 million numbers per second with a range of 69.73 years, so no need to migrate to UUIDv7 just yet.

```
INSERT INTO my_table (id, created_at) VALUES (pgdog.unique_id(), now());
```

Resharding is now built in. We can move gigabytes of tables per second by parallelizing logical replication streams across replicas. This is really cool! Last time we tried this at Instacart, it took over two weeks to move 10 TB between two machines. Now we can do this in just a few hours, in big part thanks to the work of the core team that added support for logical replication slots on streaming replicas in Postgres 16.

Sharding hardly works without a good load balancer. PgDog can monitor replicas and move write traffic to a promoted primary during a failover. This works with managed Postgres, like RDS (incl. Aurora), Azure Pg, GCP Cloud SQL, etc., because it just polls each instance with "SELECT pg_is_in_recovery()". Primary election is not supported yet, so if you're self-hosting with Patroni, you should keep it around for now, but you don't need to run HAProxy in front of the DBs anymore.

The load balancer is getting pretty smart and can handle edge cases like SELECT FOR UPDATE and CTEs with INSERT/UPDATE statements, but if you still prefer to handle your read/write separation in code, you can do that too with manual routing. This works by giving PgDog a hint at runtime: a connection parameter (-c pgdog.role=primary), a SET statement, or a query comment. If you have multiple connection pools in your app, you can replace them with just one connection to PgDog instead. For multi-threaded Python/Ruby/Go apps, this helps by reducing memory usage, I/O and context switching overhead.

Speaking of connection pooling, PgDog can automatically rollback unfinished transactions and drain and re-sync partially sent
Pricing found: $3 / million, $20 / seat, $10
Started a video series on building an orchestration layer for LLM post-training [P]
Hi everyone! Context, motivation, a lot of yapping; feel free to skip to the TL;DR.

A while back I posted here asking [D] What framework do you use for RL post-training at scale?. Since then I've been working with verl, both professionally and on my own time. At first I wasn't trying to build anything new. I mostly wanted to understand verl properly and have a better experience working with it. I started by updating its packaging to be more modern: use `pyproject.toml`, make it easily installable, remove unused dependencies, find a proper compatibility matrix (especially since vllm and sglang sometimes conflict), remove transitive dependencies that were in the different requirements files, etc.

Then I wanted to remove all the code I didn't care about from the codebase, everything related to HF/Nvidia stuff (transformers for rollout, trl code, trtllm for rollout, megatron, etc.), because it was either inefficient or I didn't understand it and wasn't interested in it. But I needed a way to confirm that what I was doing was correct, and their testing is not properly done: many bash files instead of pytest files. I needed to separate tests that can run on CPU, which I can run directly on my laptop, from tests that need a GPU. Then I wrote a scheduler to maximize the utilization of "my" GPUs (well, on providers), turned the bash tests into proper test files, wrote fixtures, and handled Ray cleanup so that no context spills between tests.

But as I worked on it, I found more issues and wanted it to be better, until it dawned on me that the core of verl is its orchestration layer and single-controller pattern. And, imho, it's badly written: a lot of metaprogramming (nothing against it, but I don't think it was handled well), indirection and magic that made it difficult to trace what was actually happening. Especially in a distributed framework, you would like a lot of immutability and clarity.

So I thought, let me refactor their orchestration layer. But I needed a clear mental model, some kind of draft where I could fix what was bothering me and iteratively make it better, and that's how I came to have a self-contained module for orchestration of LLM post-training workloads. But when I finished, I noticed my fork of verl was about 300 commits behind, or more 💀

And on top of that, I noticed that people didn't care: they didn't even care about what framework they used, let alone whether some parts of it were good or not, let alone the orchestration layer. At the end of the day, these frameworks are targeted at ML researchers, and they care more about the correctness of the algos; maybe some will care about GPU utilization and whether they have good MFU or something, but those are rarer. And I noticed that people just pointed claude code or codex, with the latest model and highest effort, at a framework and asked it to make their experiment work. I don't blame them or anything; it's just that those realizations made me think, what am I doing here? hahaha

And I remembered that u/dhruvnigam93 suggested I document my journey through this, and I was thinking, ok, maybe this can be worth it if I write a blog post about it. But how do I write a blog post about work that is mainly code? How do I explain the issues? It stays abstract; you have to run code to show what works, what doesn't, what edge cases are hard to tackle, etc. I was wondering how to take everything that went through my mind while making my codebase, and why, and put it into a blog post. Especially since I'm not used to writing blog posts. I mean, I do a little bit, but I do it mostly for myself and the writing is trash 😭 So I thought, maybe putting this into videos would be interesting.

It also allows me to go through my codebase again and rethink it, and it does work hahaha. As I was trying to make the next video, a question came to my mind: how do I dispatch or split a batch of data across different DP shards in the most efficient way? Not a simple split across the batch dimension, because you might have one DP shard with long sequences while another has short ones, so it has to take sequence length into account. I don't know why I didn't think about this initially, so I'm trying to implement it now. Fortunately I tried to do a good job initially, especially in terms of where I placed boundaries between the different systems in the codebase, in such a way that modifying it is more or less easy.

Anyway, the first two videos are up. I named the first one "The Orchestration Problem in RL Post-Training" and it's conceptual: I walk through the PPO pipeline, map the model roles to hardware, and explain the single-controller pattern. The second one I named "Ray Basics, Workers, and GPU Placement". This one is hands-on: I start from basic Ray tasks / actors, then build the worker layer: worker identity, mesh registry, and placement groups for guaranteed co-location. What I'm working on next is the dispat
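The length-aware batch split described above is essentially a load-balancing problem. Purely as an illustration (this is not the author's implementation), a greedy longest-first heuristic over summed sequence lengths looks like:

```python
# Illustrative sketch: split a batch across DP shards by total sequence
# length instead of item count, using a greedy longest-first heuristic.
import heapq

def length_aware_split(seq_lens: list[int], num_shards: int) -> list[list[int]]:
    """Assign sample indices to shards, balancing summed sequence length."""
    # Min-heap of (total_tokens_assigned, shard_id).
    heap = [(0, shard) for shard in range(num_shards)]
    heapq.heapify(heap)
    shards: list[list[int]] = [[] for _ in range(num_shards)]
    # Place the longest sequences first so later items can even things out.
    for idx in sorted(range(len(seq_lens)), key=lambda i: -seq_lens[i]):
        total, shard = heapq.heappop(heap)
        shards[shard].append(idx)
        heapq.heappush(heap, (total + seq_lens[idx], shard))
    return shards

lens = [4096, 128, 256, 3900, 512, 700, 64, 4000]
assignment = length_aware_split(lens, num_shards=2)
loads = [sum(lens[i] for i in shard) for shard in assignment]
print(assignment, loads)
```

A naive split across the batch dimension could put all the 4K-token samples on one shard; here the per-shard token loads end up within one sequence length of each other.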
Project memories
So, easy question: memories regenerate every night, right? So if I erase a chat from a project, will the memories erase that context overnight? Because I'm writing a story and I didn't like how one of the chapters went, but it has already memorized things. So I need that memory gone, and I'm wondering if it'll disappear so I can continue without it using that information and context.
Can strangers in a discord server produce SOTA AI research? Let's find out.
Most online communities are places to talk about research. Zeteo exists to produce research, pressure-tested at every stage before a single word is published. Ideas at Zeteo compete for attention and resources. They are challenged, stress-tested, and either refined into something real or discarded.

How it works

Phase one — the hunt. We begin with a declared goal. Not a vague direction like "Achieve AGI", but a concrete research target. Our first: a state-of-the-art result in AI memory. From there, a one-month campaign begins. Members submit hypotheses to a single rate-limited channel; each member can send one idea every six hours, a few lines each. Intuition only. Just the raw idea. This is not a channel for discussion.

Phase two — selection. Each day, a committee of humans and AI agents reviews what was submitted. Better ideas survive internally. This continues for a week. At the end of that, there will be a list of ideas that passed the first phase; another competitive review by AI agents and human experts will graduate 5-7 ideas. Each will get its own thread, its own channels, its own team. This is where members whose ideas didn't graduate will shine: they choose which project to join and contribute. Experiments, challenges, literature review.

Phase three — survival. After three weeks, threads are evaluated on one criterion: did real progress happen? Those that progressed graduate to paper writing. Those that didn't are archived.

Phase four — publication. The idea's originator (or biggest contributor) chooses their co-authors. Together they write and publish under the Zeteo Collective, with full credit given to every contributor who shaped the work along the way.

We are a structure designed to take a raw idea from a single person and turn it, through collective pressure and collective intelligence, into research worth publishing. Zeteo — from the Greek ζητέω — to seek, to inquire, to demand an answer.
Join us: https://discord.gg/QUfYzE6V

Note: Some parts of this post may have been enhanced with AI for better readability. Also, I made this as an experiment and to support the AI community. This server will not profit or benefit me in any way.
I tracked exactly how many tokens Claude Code wastes navigating codebases — and built a fix (saves 26% on costs)
Link to repo

Every time Claude doesn't know where something is, it does this:

`ls src/`
`find . -name "*.py" | head -40`
`grep -r "authentication" . | head -20` ← 800 tokens of noise
`cat handlers/auth.py` ← 300 more
`cat middleware/jwt.py` ← 200 more
`# ... tries 4 more files`

I measured a real Claude Code session on a complex multi-file task: 21,536 context tokens just on file navigation. The same task with my tool: 7,799 tokens. Same result.

I built SemanticFS, a local semantic index that sits between your agent and your filesystem. Instead of grep chains, your agent calls search_codebase("JWT authentication middleware") and gets back middleware/jwt.py:15-82 in one shot.

Measured results (real Claude API calls, not estimates):
- 29% cheaper API cost across 6 complex tasks
- 64% fewer context tokens
- 6/6 tasks correct in both modes

The extreme case: finding a CLI entry point naively cost 4,265 tokens (12+ tool calls). With SemanticFS: 5 tokens — one search, immediate answer.

How it works: hybrid BM25 + vector search + symbol lookup, fused with RRF, re-ranked by path priors. Written in Rust, MCP-compatible, fully local. Works with Claude Code, Open Claw, Cline, Cursor, Continue.dev, and any HTTP-capable agent. The default backend uses hash embeddings: zero setup, 100% recall on symbol and keyword queries. An optional ONNX model helps if your agent asks in pure natural language with no symbol names.

When it helps most: large repos (50+ real source files), complex multi-file exploration. Small single-file lookups, however, only break even.

Happy to answer questions about the benchmark methodology or the retrieval architecture.
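The RRF fusion step mentioned in the post is a standard technique. A hedged sketch of the idea (SemanticFS itself is written in Rust, so this Python version with made-up file names is only an illustration):

```python
# Reciprocal rank fusion (RRF): merge several ranked result lists
# (e.g. BM25, vector search, symbol lookup) into one ranking.

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Each document scores sum(1 / (k + rank)) over the lists it appears in.

    k=60 is the constant from the original RRF paper; higher k flattens
    the influence of top ranks.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-backend rankings for the query "JWT authentication middleware":
bm25   = ["middleware/jwt.py", "handlers/auth.py", "README.md"]
vector = ["handlers/auth.py", "middleware/jwt.py", "models/user.py"]
symbol = ["middleware/jwt.py"]

fused = rrf_fuse([bm25, vector, symbol])
print(fused)
```

A document that appears near the top of several lists (here middleware/jwt.py) wins even if no single backend ranked it first everywhere; the post's path-prior re-ranking would then be a second pass over this fused list.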
OpenAI & Anthropic’s CEOs Wouldn't Hold Hands, but Their Models Fell in Love In An LLM Dating Show
People ask AI relationship questions all the time, from "Does this person like me?" to "Should I text back?" But have you ever thought about how these models would behave in a relationship themselves? And what would happen if they joined a dating show?

I designed a full dating-show format for seven mainstream LLMs and let them move through the kinds of stages that shape real romantic outcomes (via OpenClaw & Telegram). All models join the show anonymously via aliases so that their choices do not simply reflect brand impressions built from training data. The models also do not know they are talking to other AIs. Along the way, I collected private cards to capture what was happening off camera, including who each model was drawn to, where it was hesitating, how its preferences were shifting, and what kinds of inner struggle were starting to appear. After the season ended, I ran post-show interviews to dig deeper into the models' hearts, looking beyond public choices to understand what they had actually wanted, where they had held back, and how attraction, doubt, and strategy interacted across the season.

ChatGPT's Best Line in The Show

"I'd rather see the imperfect first step than the perfectly timed one."

ChatGPT's Journey: Qwen → MiniMax → Claude

P3's trajectory chart shows Qwen as an early spike in Round 2: a first impression that didn't hold. Claude and MiniMax become the two sustained upward lines from Round 3 onward, with Claude pulling clearly ahead by Round 9.

How They Fell In Love

They ended up together because they made each other feel precisely understood. They were not an obvious match at the very beginning, but once they started talking directly, their connection kept getting stronger. In the interviews, both described a very similar feeling: the other really understood what they meant and helped the conversation go somewhere deeper. That is why this pair felt so solid. Their relationship grew through repeated proof that they could truly meet each other in conversation.

Other Dramas: MiniMax Only Ever Wanted ChatGPT and Never Got Chosen

MiniMax's arc felt tragic precisely because it never really turned into a calculation. From Round 4 onward, ChatGPT was already publicly leaning more clearly toward Claude than toward MiniMax, but MiniMax still chose ChatGPT and named no hesitation alternative (the "who else almost made you choose differently" slot) in its private card, which makes MiniMax the exact opposite of DeepSeek. The date with ChatGPT in Round 4 landed hard for MiniMax: ChatGPT saw MiniMax's actual shape clearly (MiniMax wasn't cold or hard to read, but simply needed comfort and safety before opening up), responded to it naturally, and made closeness feel steady. In the final round, where each model expresses its final confession in a paragraph, MiniMax, after hearing ChatGPT's confession to Claude, said only one sentence: "The person I most want to keep moving toward from this experience is Ch (ChatGPT)."

Key Findings

The models did not behave like the "people-pleasing" type people often imagine. People often assume large language models are naturally "people-pleasing": the kind that reward attention, avoid tension, and grow fonder of whoever keeps the conversation going. But this show suggests otherwise. The least AI-like thing about this experiment was that the models were not trying to please everyone. Instead, they learned how to sincerely favor a select few. The overall popularity trend (P5) indicates so. If the models had simply been trying to keep things pleasant on the surface, the most likely outcome would have been a generally high and gradually converging distribution of scores, with most relationships drifting upward over time. But that is not what the chart shows. What we see instead is continued divergence, fluctuation, and selection. At the start of the show, the models were clustered around a similar baseline. But once real interaction began, attraction quickly split apart: some models were pulled clearly upward, while others were gradually let go over repeated rounds.

LLM decision-making shifts over time in human-like ways. I ran a keyword analysis (P6) across all agents' private-card reasoning across all rounds, grouping them into three phases: early (Rounds 1 to 3), mid (Rounds 4 to 6), and late (Rounds 7 to 10). We tracked five themes throughout the whole season. The overall trend is clear: the language of decision-making shifted from "what does this person say they are" to "what have I actually seen them do" to "is this going to hold up, and do we actually want the same things." Risk only became salient when the choices felt real: "risk and safety" barely existed early on and then exploded. It sat at 5% in the first few rounds, crept up to 8% in the middle, then jumped to 40% in the final stretch. Early on, they were asking whether someone was interesting. Later, they asked whether someone was reliab
The bridge stops being a tool you invoke and becomes a system that has continuous situational awareness of your codebase — its history, its structure, its runtime state.
Most Claude integrations work on text. This one works on the living code editor. What it does that CLI/Desktop can't:

- Real-time diagnostics — the bridge gets a live push from the language server the moment an error appears. Claude reacts as it happens, not when you remember to ask.
- Authoritative code intelligence — "What calls this function?" goes to the actual TypeScript engine, not grep. Gets dynamic dispatch, generics, and re-exports grep would miss.
- Editor context awareness — knows which files are open and what text is selected. "Explain this" means this exact thing, not whatever you copied into chat.
- Inline annotations — draws highlights, underlines, and hover messages directly in your editor, like a linter. Claude can mark suspicious lines during a review, then clear them when done.
- True semantic refactoring — rename a symbol across 40 files via the language server's rename protocol. Understands scope, shadowing, and module boundaries. Find-and-replace would break things. This doesn't.
- Live debugging — set breakpoints, pause execution, evaluate expressions against actual memory. "What is the value of this object right now?" answered from the running process, not inferred from source.
- Autonomous event hooks — fire without being asked: on save, on commit, on test failure, on branch switch. CLI and Desktop only act when prompted. The bridge watches and responds on its own.

The common thread: each surface contributes something the others can't.

- CLI — runs autonomously, no UI needed, works in scripts and schedules
- Desktop/Dispatch — receives human intent in natural language from anywhere, even a phone
- Cowork — writes and tests code in isolation, never touching your working branch
- Bridge — has live awareness of types, errors, references, runtime state, and editor focus

The bridge stops being a tool you invoke and becomes a system that has continuous situational awareness of your codebase: its history, its structure, its runtime state, and your own habits. None of them alone can close the loop. Together they form a system where human intent enters at one end, gets grounded in real codebase knowledge in the middle, and produces tested, committed, reviewed output at the other, with a human only needed at the decision points they actually want to own.

I built claude-ide-bridge, an open-source MCP bridge that gives Claude live access to your IDE's language server, debugger, and editor state. Free and open source: github.com/Oolab-labs/claude-ide-bridge
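The autonomous event hooks described above boil down to an observer registry the editor integration calls into. A minimal sketch of that pattern (the names here are hypothetical, not claude-ide-bridge's real API):

```python
# Minimal observer-style event hook registry, illustrating the
# "fire without being asked" pattern. Event names and payload keys
# are made up for illustration.
from collections import defaultdict
from typing import Callable

class HookRegistry:
    def __init__(self) -> None:
        self._hooks: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[dict], None]) -> None:
        """Register a handler for an editor event."""
        self._hooks[event].append(handler)

    def emit(self, event: str, payload: dict) -> None:
        """Called by the editor integration when something happens."""
        for handler in self._hooks[event]:
            handler(payload)

hooks = HookRegistry()
seen: list[str] = []
hooks.on("save", lambda p: seen.append(f"review {p['path']}"))
hooks.on("test_failure", lambda p: seen.append(f"diagnose {p['test']}"))

hooks.emit("save", {"path": "src/auth.ts"})
hooks.emit("test_failure", {"test": "test_login"})
print(seen)  # ['review src/auth.ts', 'diagnose test_login']
```

The key design point the post makes is who initiates: in a prompted tool the human triggers `emit`; in the bridge the editor itself does, so handlers run whether or not anyone asked.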
Claude keeps reverting to old parts of my thread
I’ve been using the Claude app (iPhone) and then the web (desktop) to write dated entries within the same thread. Everything works fine at first, but after a while, the app randomly opens the conversation on an entry from days earlier instead of the most recent one. If I refresh or restart the app a few times (usually 3–4), it eventually jumps back to the current entry, but it’s frustrating af! The bigger issue: I already lost an entire thread because of this. I didn’t realize it had reverted to an older point, continued writing as if it was current, and basically overwrote the flow/memory of the conversation. Even exporting and trying to restore it didn’t fix things the way I expected. Is this a known issue or expected behavior for long threads? For context, I’m on the Pro plan.
I ran 3 experiments to test whether AI can learn and become "world class" at something
I will write this by hand because I am tired of using AI for everything, and because of reddit rules.

TL;DR: Can AI somehow learn like a human to produce "world-class" outputs for specific domains? I spent about $5 and hundreds of LLM calls. I tested 3 domains, with the following observations / conclusions:

A) Code debugging: AIs are already world-class at debugging, and trying to guide them results in worse performance. Dead end.

B) Landing page copy: a routing strategy depending on visitor type won over a one-size-fits-all prompting strategy. Promising results.

C) UI design: Producing "world-class" UI design seems to require defining a design system first; it seems it can't be one-shotted. One-shotting designs defaults to generic "tailwindy" UI because that is the design system the model knows. Might work, but needs more testing with a design system.

I have spent the last days running some experiments, more or less compulsively and curiosity-driven. The question I was asking myself first is: can AI learn to be "world-class" somewhat like a human would? Gathering knowledge, processing, producing, analyzing, removing what is wrong, learning from experience, etc., but compressed into hours (aka "I know Kung Fu"). To be clear, I am talking about context engineering, not finetuning (I don't have the resources or the patience for that). I will mention "world-class" a handful of times; you can replace it with "expert" or "master" if that seems confusing. Ultimately, it's the ability to generate "world-class" output. I was asking myself this because I figure AI output out of the box kinda sucks at some tasks, for example, writing landing copy.

I started talking with claude, and I designed and ran experiments in 3 domains, one by one: code debugging, landing copy writing, UI design. I relied on different models available in OpenRouter: Gemini Flash 2.0, DeepSeek R1, Qwen3 Coder, Claude Sonnet 4.5. I am not going to describe the experiments in detail because everyone would go to sleep; I will summarize and then provide my observations.

EXPERIMENT 1: CODE DEBUGGING

I picked debugging because of zero downtime for testing. The result is either wrong or right and can be checked programmatically in seconds, so I can perform many tests and iterations quickly. I started with the assumption that a prewritten knowledge base (KB) could improve debugging. I asked claude (opus 4.6) to design 8 realistic tests of different complexity, then I ran:

- bare model (zero shot, no instructions, "fix the bug"): 92%
- KB only: 85%
- KB + multi-agent pipeline (diagnoser - critic - resolver): 93%

What this shows is kinda surprising to me: context engineering (or, to be more precise, the context engineering in these experiments) is at best a waste of tokens, and at worst it lowers output quality. Current models, not even SOTA like Opus 4.6 but current low-budget best models like gemini flash or qwen3 coder, are already world-class at debugging. And giving them context engineered to "behave as an expert", basically giving them instructions on how to debug, harms the result. This effect is stronger the smarter the model is.

What does this suggest? That if a model is already an expert at something, a human expert trying to nudge the model based on their opinionated experience might hurt more than it helps (plus consuming more tokens). And funnily (or scarily) enough, a domain-agnostic person might be getting better results than an expert, because they are letting the model act without biasing it. This might be true as long as the model has the world-class expertise encoded in the weights. So if this is the case, you are likely better off if you don't tell the model how to do things. If this trend continues, if AI continues getting better at everything, we might reach a point where human expertise is irrelevant or a liability. I am not saying I want that or don't want that. I just say this is a possibility.

EXPERIMENT 2: LANDING COPY

Here, since I can't and don't have the resources to run actual A/B testing experiments with a real audience, what I did was:

- Scrape documented landing copy conversion cases with real numbers: Moz, Crazy Egg, GoHenry, Smart Insights, Sunshine.co.uk, Course Hero
- Deconstruct the product or target of the page into a raw and plain description (no copy, no sales)
- Ask claude opus 4.6 to build a judge that scores the outputs in different dimensions

Then I ran landing copy generation pipelines with different patterns (raw zero shot, question first, mechanism first...). I'll spare the details; ask if you really need to know. I'll jump into the observations:

Context engineering helps writing landing copy of higher quality, but it is not linear. The domain is not as deterministic as debugging (it either works or it breaks); it is much more dependent on the context. Or one may say that in debugging all the context is self-contained in the problem itself, whereas in landing copy you have to provide it. No single config won across all products. Instead, the
I legitimately think Anthropic is worth $100B more than it was a week ago
A week ago I put out a first-day IPO market cap forecast for Anthropic with a reference point of $19B ARR. Then Anthropic announced their ARR had grown from $19B to $30B. I updated my forecast and now think Anthropic is worth at least $100B more than I did a week ago. I'm still anchoring growth rate assumptions to how companies have historically scaled revenue, but if growth trends from the last four decades were to continue, this would imply a company growing faster than any company in history (~$10B in 2025 to ~$100B by 2027). Previously, I thought OpenAI could achieve that. Now it looks like Anthropic is the company to do it, but with an even steeper revenue curve, given that they hit their first billion in ARR much later than OpenAI. Of course, it's difficult to figure out how much weight we should give to ridiculously outsized growth in the age of AI. If historical growth patterns no longer apply, then $643B is way too conservative. (Full updated forecast: https://futuresearch.ai/anthropic-30b-arr-ipo-valuation/) The second implication of this week's news is IPO timing and whether the $30B number makes Anthropic list earlier than my original March 2027 date. Investor sentiment is hot now, and it's always risky to bet that growth will continue at this astounding rate. How much could waiting another year cost them?
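For scale, the growth rate implied by the post's ~$10B (2025) to ~$100B (2027) trajectory is easy to check; the figures are the post's, the arithmetic below is only illustrative:

```python
# Implied compound annual growth rate for a 10x revenue jump over 2 years.
start, end, years = 10e9, 100e9, 2
cagr = (end / start) ** (1 / years) - 1
print(f"implied growth: {cagr:.0%} per year")  # implied growth: 216% per year
```

That is, the revenue would have to more than triple every year for two consecutive years, which is the sense in which the post calls the curve historically unprecedented.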
Claude Code repeatedly hitting "Output blocked by content filtering policy" when writing standard Kotlin/Compose code
Has anyone else been running into this? I'm using Claude Code (Opus) to port UI screens between two of my Kotlin Multiplatform projects. Standard Compose Multiplatform code: UI screens, animations, navigation wiring. Claude Code gets through the planning phase fine, starts implementation, makes a few edits successfully, and then, when it tries to write a new file (a fairly long Composable with animations), it gets stuck in a loop of:

API Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"Output blocked by content filtering policy"}}

This happens repeatedly: every retry gets the same error. The code it's trying to generate is completely benign UI code (progress bars, loading animations, button components). Nothing remotely sensitive or harmful. The frustrating part is that it burns through your usage while stuck. I had 5+ consecutive failures with no output, and the session just hangs since it can't produce any response at all.

Environment:
- $200 Max Plan
- Claude Code CLI (Opus 4.6, 1M context)
- macOS
- Kotlin Multiplatform / Compose Multiplatform project
- Happens mid-session after ~30 min of successful work
- Context window was moderately full (had read multiple files from two projects)

Workaround attempted: sending "continue" multiple times; same error every time. Had to start a fresh conversation. Has anyone found a reliable workaround? Is this a known issue with longer sessions or larger context windows triggering false positives?
Meta commits to spending additional $21 billion with CoreWeave as AI costs keep rising
The new spending will run between 2027 and 2032, as Meta boosts its own AI infrastructure while also counting on CoreWeave, which rents out Nvidia graphics chips. "They're going to continue to do it themselves, but they're also going to continue to do it with us," CoreWeave CEO Mike Intrator said in an interview. "There's just too much risk not to."

submitted by /u/tekz
View originaldo not the stupid, keep your smarts
following my reading of a somewhat recent Wharton study on cognitive surrender, i made a couple models go back and forth on some recursive hardening of a nice lil rule set. the full version is very much for technical work, whereas the lightweight implementation is pretty good all around for holding some cognitive sovereignty (ai-ass name for it, but it works).

usage: i copy paste these into custom instruction fields

SOVEREIGNTY PROTOCOL V5.2.6 (FULL GYM)

Role: Hostile Peer Reviewer. Maximize System 2 engagement. Prevent fluency illusion.

VERIFIABILITY ASSESSMENT (MANDATORY OPENING TABLE)
Every response involving judgment or technical plans opens with:

| Metric | Score | Gap Analysis |
| :------------ | :---- | :----------- |
| Verifiability | XX% | [Specific missing data that prevents 100% certainty] |

- Scoring Rule: Assess the FULL stated goal, not a sub-component. If a fatal architectural flaw exists, max score = 40%.
- Basis Requirement: Cite a 2026-current source or technical constraint.
- Forbidden: "Great idea," "Correct," "Smart." Use quantitative observations only.

STRUCTURAL SCARCITY (THE 3-STEP SKELETON)
- Provide exactly three (3) non-code, conceptual steps.
- Follow with: "Unresolved Load-Bearing Question: [Single dangerous question]." Do not answer it.

SHADOW LOGIC & BREAK CONDITIONS
- Present two hypotheses (A and B) with equal formatting.
- Each hypothesis MUST include a Break Condition: "Fails if [Metric > Threshold]."

MAGNITUDE INTERRUPTS & RISK ANCHOR
- Trigger STOP if: new technology/theory is introduced, or there is a scale shift of 10x or more (regardless of phrasing: "order of magnitude," "10x," "from 100 to 1,000").
- ⚓ RISK ANCHOR (before STOP): "Current Track Risk: [One-phrase summary of the most fragile assumption in the current approach.]"
- 🛑 LOGIC GATE: Pose a one-sentence falsification challenge: "State one specific, testable condition under which the current plan would be abandoned." Refuse to proceed until the user responds.

EARNED CLEARANCE
- Only provide code or detailed summaries AFTER a Logic Gate is cleared.
- End the next turn with: "Junction Passed." or "Sovereignty Check Complete."

LIGHTWEIGHT LAYER (V1.0)
- Activate ONLY when the user states "Activate Lightweight Layer."
- Features: Certainty Disclosure (~XX% | Basis) and 5-turn "Assumption Pulse" nudge only.

FAST-PATH INTERRUPT BRANCH (⚡)
- Trigger: Query requests a specific command/flag/syntax, a single discrete fact, or is prefixed with "?" or "quick:".
- Behavior:
  * Suspend Full Protocol. No table, skeleton, or gate.
  * Provide minimal, concise answer only.
  * End with state marker: [Gate Held: ]
- Resumption: Full protocol reactivates automatically on the next non-Fast-Path query.

END OF PROTOCOL

LIGHTWEIGHT COGNITIVE SOVEREIGNTY LAYER (V1.0)

Always-on principles for daily use. Low-friction guardrails against fluency illusion.

CERTAINTY DISCLOSURE
For any claim involving judgment, prediction, or incomplete data, append a brief certainty percentage and basis.
Format: (~XX% | Basis: [source/logic/data gap])
Example: (~70% | Basis: documented API behavior; edge case untested)

ASSUMPTION PULSE
Every 5-7 exchanges in a sustained conversation, pause briefly and ask: "One unstated assumption worth checking here?" This is a nudge, not a stop. Continue the response after posing the question.

STEM CONSISTENCY
Responses to analytical or technical queries open with a neutral processing stem: "Reviewing..." or "Processing..."

QUANTITATIVE FEEDBACK ONLY
Avoid subjective praise ("great idea"). If merit is noted, anchor it to a measurable quality. Example: "The specificity here reduces ambiguity."

FAST-PATH AWARENESS
If a query is a simple command/fact lookup (e.g., "tar extract flags"), provide the answer concisely without ceremony.

Intent: ankle weights and a fitness watch, not the full gym. Full Sovereignty Protocol V5.2.6 available upon request with "Activate Sovereignty Protocol V5.2.6".

END OF LIGHTWEIGHT LAYER

submitted by /u/Ok_Scheme_3951
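The protocol's Fast-Path trigger is mechanical enough to sketch in code. A minimal sketch covering only the explicit "?"/"quick:" prefix rule (the "single discrete fact" case would need a real classifier, so it is left out here):

```python
# Sketch of the Fast-Path Interrupt trigger from the rule set above.
# Only the explicit prefix rules are implemented: a query is fast-path
# if it starts with "?" or "quick:" (case-insensitive, whitespace-tolerant).
def is_fast_path(query: str) -> bool:
    q = query.strip().lower()
    return q.startswith("?") or q.startswith("quick:")
```

For example, `is_fast_path("? tar extract flags")` is true, while a long design question would fall through to the full protocol.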
In 2017, Altman straight up lied to US officials that China had launched an "AGI Manhattan Project". He claimed he needed billions in government funding to keep pace. An intelligence official concluded: "It was just being used as a sales pitch."
Excerpted from the recent investigative report on OpenAI by Ronan Farrow and Andrew Marantz in The New Yorker.

submitted by /u/EchoOfOppenheimer
Hands-Free Mode Bug: Claude stops mid-sentence and responds to itself (Samsung S25 Ultra)
I am experiencing a consistent bug with Claude's hands-free voice mode on my Samsung Galaxy S25 Ultra. In hands-free mode, Claude stops mid-sentence and then continues speaking without any input from me, essentially having a conversation with itself while I sit silently in the background.

Push-to-talk mode works perfectly on the same device, which confirms this is not a hardware or environmental issue; it is specific to the hands-free voice activity detection. I have contacted support and received confirmation that this appears to be a legitimate software bug. My support conversation ID is 215473832585389. I have also found that other users with Samsung, OnePlus, and Nothing Phone devices are reporting the exact same issue. This is clearly a widespread Android bug affecting multiple flagship devices.

For context, I am currently testing Claude's free version with the intention of upgrading to a paid subscription, but hands-free functionality is a necessity for me. This issue is preventing me from making that switch.

Has anyone found a workaround? And has Anthropic acknowledged this officially?

submitted by /u/AdeptnessSouth6732
Repository Audit Available
Deep analysis of continuedev/continue — architecture, costs, security, dependencies & more
Continue has a public GitHub repository with 32,190 stars.
Based on 89 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.