Purpose-built for planning and building products with AI agents.
User reviews and social mentions of "Linear" highlight its strong user interface and efficient workflow management capabilities as major strengths. However, some users express dissatisfaction with its limited integrations and steep learning curve for new users. Pricing is perceived as reasonable, though there's little detailed discussion about its cost. Overall, the tool enjoys a solid reputation among users seeking streamlined project management solutions but could improve by expanding its integration support and user onboarding experience.
Mentions (30d)
54
19 this week
Reviews
0
Platforms
3
Sentiment
0%
0 positive
Features
Use Cases
Industry
information technology & services
Employees
180
Funding Stage
Series C
Total Funding
$134.2M
Artificial Analysis Intelligence Index and cost benchmarks are useful decision/guidance determinants for which models to use. Analysis for top models.
# AI Intelligence and Benchmarking Cost (Feb 2026)

As per the **Artificial Analysis Intelligence Index v4.0** (February 2026), the scoring ceiling is set by **Claude Opus 4.6 (max) at 53**.

## Adjusted Score Formula

The "Adjusted Score" values in the table subtract each model's deficit to the ceiling from its raw score, so every point of deficit counts twice:

```
Adjusted Score = Intel Score - (53 - Intel Score)
```

This creates a steeper penalty for performance gaps than the raw score alone.

## Model Comparison Table

| Lab       | Model                     | Intel Score | Adjusted Score | Benchmark Cost | Intel Ratio (Score/Cost) | Adj. Ratio (Adj/Cost) |
|-----------|---------------------------|-------------|----------------|----------------|--------------------------|-----------------------|
| Anthropic | Claude Opus 4.6 (max)     | 53          | 53             | $2,486.45      | 0.021                    | 0.021                 |
| OpenAI    | GPT-5.2 (xhigh)           | 51          | 49             | $2,304.00*     | 0.022                    | 0.021                 |
| Zhipu AI  | GLM-5 (Reasoning)         | 50          | 47             | $384.00*       | 0.130                    | 0.122                 |
| Google    | Gemini 3 Pro              | 48          | 43             | $1,179.00*     | 0.041                    | 0.036                 |
| MiniMax   | MiniMax-M2.5              | 42          | 31             | $124.58        | 0.337                    | 0.249                 |
| DeepSeek  | DeepSeek V3.2 (Reasoning) | 42          | 31             | $70.64         | 0.595                    | 0.439                 |
| xAI       | Grok 4 (Reasoning)        | 41          | 29             | $1,568.34      | 0.026                    | 0.018                 |

*\*Benchmark costs for proprietary models are based on Artificial Analysis evaluation token counts (typically 12M–88M depending on verbosity) multiplied by current API rates.*

## Key Insights

1. **High token reasoning models**: Grok 4 and Claude Opus 4.6 use a high number of tokens during reasoning, up to **88M tokens**. This results in low Intel-to-Cost ratios despite high scores.
2. **DeepSeek V3.2 is the most efficient**: It provides an adjusted intelligence ratio that is roughly **20 times better** than the proprietary frontier.
3. **Cost efficiency comparison**: MiniMax-M2.5 and DeepSeek V3.2 share a score of 42. DeepSeek is almost **twice as cost-effective** due to lower API pricing and higher token efficiency.
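A minimal sketch for recomputing the table's ratio columns from the Intel Score and Benchmark Cost columns. It assumes the Adjusted Score column equals `score - (53 - score)`, which reproduces every adjusted value listed in the table; the function and variable names are illustrative, not part of the original analysis.

```python
CEILING = 53  # top Intel Score in the table (Claude Opus 4.6 max)

# (model, intel_score, benchmark_cost_usd) copied from the comparison table
MODELS = [
    ("Claude Opus 4.6 (max)", 53, 2486.45),
    ("GPT-5.2 (xhigh)", 51, 2304.00),
    ("GLM-5 (Reasoning)", 50, 384.00),
    ("Gemini 3 Pro", 48, 1179.00),
    ("MiniMax-M2.5", 42, 124.58),
    ("DeepSeek V3.2 (Reasoning)", 42, 70.64),
    ("Grok 4 (Reasoning)", 41, 1568.34),
]


def adjusted_score(score: int, ceiling: int = CEILING) -> int:
    """Assumed rule: subtract the deficit to the ceiling from the raw score."""
    return score - (ceiling - score)


def ratios(score: int, cost: float) -> tuple[float, float]:
    """Return (Intel Ratio, Adj. Ratio), both rounded to three decimals."""
    return round(score / cost, 3), round(adjusted_score(score) / cost, 3)


if __name__ == "__main__":
    # Sort by adjusted ratio, best value first
    for name, score, cost in sorted(MODELS, key=lambda m: -ratios(m[1], m[2])[1]):
        intel_ratio, adj_ratio = ratios(score, cost)
        print(f"{name:<26} adj={adjusted_score(score):>2}  "
              f"intel/cost={intel_ratio:.3f}  adj/cost={adj_ratio:.3f}")
```

Swapping in a different penalty only requires changing `adjusted_score`, which makes it easy to test how sensitive the ranking is to the chosen rule.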
## Visual Summary

```
Intel Score vs Cost Efficiency (Adjusted Ratio)
─────────────────────────────────────────────────
DeepSeek V3.2    ████████████████████████████ 0.439
MiniMax-M2.5     ███████████████ 0.249
GLM-5            ███████ 0.122
Gemini 3 Pro     ██ 0.036
Claude Opus 4.6  █ 0.021
GPT-5.2          █ 0.021
Grok 4           █ 0.018
```

---

*Source: Artificial Analysis Intelligence Index v4.0, February 2026*

google AI mode made analysis, GLM 5 formatted and added cute graph. this combines the intelligence score and cost to run the intelligence benchmark from https://artificialanalysis.ai/?endpoints=openai_gpt-5-2-codex%2Cazure_kimi-k2-thinking%2Camazon-bedrock_qwen3-coder-480b-a35b-instruct%2Camazon-bedrock_qwen3-coder-30b-a3b-instruct%2Ctogetherai_minimax-m2-5_fp4%2Ctogetherai_glm-5_fp4%2Ctogetherai_qwen3-next-80b-a3b-reasoning%2Cgoogle_gemini-3-pro_ai-studio%2Cgoogle_glm-4-7%2Cmoonshot-ai_kimi-k2-thinking_turbo%2Cnovita_glm-5_fp8

look at intelligence vs cost graph for further insight. You can add much smaller models for comparison to LLMs you might run locally.

The adjusted intelligence/cost metric is a useful heuristic for "how much would you pay extra to get top score". Choosing non-open models requires a much higher penalty than 2x the difference/comparison to highest score. Quantized versions don't seem to score lower. This site provides good base info to make your own model of "score deficit", model size, tps as a combined score relative to tokens/cost to get a benchmark score.

I was originally researching how grok 4.2 approach would inflate costs vs performance, but it is not yet benchmarked.
Pricing found: $0, $10, $10, $16, $16
PM running Notion MCP for 3 weeks. Should I add Linear too or is that overkill?
PM at a 60 person SaaS, not technical. got the Notion MCP server running 3 weeks ago after a friend walked me through it. the unlock has been bigger than I expected. I can ask claude code "what did we decide about the onboarding redesign across our last 4 meeting notes" and it actually reads them and answers. saved me 4+ hours of scrolling already.

current setup:

● daily standup notes go into a notion db
● PRDs live in a different notion folder
● meeting transcripts auto-pipe in via fireflies

with the MCP I can query across all three. asked claude this morning "did anyone raise concerns about the auth flow change in the last 2 weeks" and it pulled the exact comment from a meeting 9 days ago. felt like magic until I remembered it was just text search with extra steps.

now I'm wondering if I should hook up Linear via MCP too. would be nice to ask "what tickets are blocked because of decisions we haven't made yet" and have it cross-reference notion notes against linear status. but I'm worried adding another MCP makes responses slower or more confused. is it overkill for a non-coding PM? or is the value worth the setup pain?

second question. anyone running 3+ MCP servers at once and finding context bleed? sometimes I worry claude doesn't know which source to trust.

would love to hear from PMs specifically because most MCP content I find is engineer-focused and I'm trying to figure out the workflow for non-coding people.

submitted by /u/SetGuilty7210
Proposing the 'Altman peak' as a novel model to explain the non-linear effects of OpenAI workforce related consumption of welfare related goods and services on consumer token price and projected quota consumption rates.
[Plot: Cost per 1M Tokens ($) vs. Smoothies + Yoga Classes per Employee (Per Week)]

$Y$ represents the Goji-Yoga Saturation Index (GYSI), measured in Smoothies + Asanas per Developer per Week.

The quota consumption rate is the projected expression of the hidden mechanisms that lead to $T/1M (dollar cost token price per 1M tokens), which is causally related to the product of Goji berry smoothie consumption rate + yoga classes, yet only up to a certain threshold, beyond which more consumption leads to a steep decline in workforce related costs and thus a reduction in $T/1M costs, lower than the initial baseline, as other workforce related costs are reduced with workforce decline.

Simplified:

C = Cgpu + W * (Cempl + P)

Cgpu = Absolute Server Floor ($3.50). This is a constant.
Cempl = Baseline Developer Cost ($1.50)
W(Y) = The Workforce Survival Function (0.0 to 1.0).

Accounted for is the marginal savings of buying Goji berries and yoga classes in bulk, neutralized by the marginal loss of developers going home sick.

POC: goji.py:

```python
import numpy as np
import matplotlib.pyplot as plt

# Generate x-axis data: 0 to 14 Smoothies & Yoga Classes per week
x = np.linspace(0, 14, 500)

# --- Workforce Health Function ---
# Employees are fine until they consume ~9 smoothies/yoga sessions a week.
# At x=9, the GI threshold is breached and the office rapidly evacuates.
critical_threshold = 9.0
workforce_presence = 1 / (1 + np.exp(3.0 * (x - critical_threshold)))

# --- Projected Cost per 1M Tokens (Red Line) ---
server_baseline = 3.5    # Absolute Floor: Servers don't drink smoothies
employee_baseline = 1.5  # Starts at 5.0 total (3.5 + 1.5)

# Bulk Discount Curve for Perks: costs rise as perks increase but flatten out
# due to wholesale Goji/Yoga pricing.
# Adjusted scaling factor (-0.35) to stretch the curve beautifully across 0-14.
perk_inflation = 30.0 * (1 - np.exp(-0.35 * x))

# Total Cost Formula
y_cost = server_baseline + workforce_presence * (employee_baseline + perk_inflation)

# --- Plot Setup ---
fig, ax = plt.subplots(figsize=(10, 6))

# Plot Cost per 1M Tokens
color = 'tab:red'
ax.plot(x, y_cost, color=color, linewidth=2.5, linestyle='--', label='Cost per 1M Tokens ($)')

# Axis styling with realistic X values
ax.set_xlabel('Smoothies + Yoga Classes per Employee (Per Week)', fontsize=11)
ax.set_ylabel('Cost per 1M Tokens ($)', color=color, fontsize=11)
ax.tick_params(axis='y', labelcolor=color, labelsize=11)

# Set grid and limits
ax.set_xlim(0, 14)                  # x range
ax.set_ylim(0, 40)                  # y range
ax.set_yticks(np.arange(0, 40, 5))  # y ticks
ax.set_xticks(np.arange(0, 15, 1))  # x ticks
ax.grid(True, alpha=0.3)

# --- Add Reference Lines ---
initial_cost = server_baseline + employee_baseline
ax.axhline(y=initial_cost, color='gray', linestyle=':', alpha=0.8, label=f'Initial Cost Baseline (${initial_cost:.1f})')
ax.axhline(y=server_baseline, color='black', linestyle=':', alpha=0.8, label=f'Absolute Floor (Servers Only: ${server_baseline:.1f})')

# Title
plt.title('OpenAI Cost Dynamics: Bulk-Discount Curve & GI Threshold\n(Cost dips below baseline as sick workers evacuate)', fontsize=12, pad=15)

# Show the labels defined above and render the figure
ax.legend()
plt.show()
```

License: MIT

submitted by /u/Manfluencer10kultra
AI 3D Unreal Workshop for designers
AI + Unreal Engine for architects, designers, and visual storytellers is not always a straight line. And honestly, I think that is the interesting part. In real creative work, the starting point keeps changing. Sometimes it begins with a sketch. Sometimes with a Rhino or Revit model. Sometimes inside Unreal Engine. Sometimes with an AI tool helping test ideas, organize a scene, troubleshoot lighting, or push a visual direction further. Last week I ran the first live session of my AI + Unreal Engine workflow series, and we had a really interesting mix of people join: architects, 3D artists, urban planners, GIS/data visualization people, and visual artists. One thing that came up was that my workflow is not totally linear. That is true. I am not really trying to teach one rigid “click this, then click that” pipeline. I am more interested in showing how these tools can support the messy, iterative process of design and visualization. This Sunday at 11:00 AM Eastern, I am running the second live online workshop. I will be showing a live workflow using Claude inside Unreal Engine, looking at how AI can support architectural visualization, urban scenes, lighting, materials, scene organization, and cinematic presentation. The full 5-week course starts May 31, but this Sunday session is a good way to get a feel for the workflow and ask questions. For anyone interested in AI, Unreal Engine, architecture, design visualization, or real-time workflows, I would be happy to have you join. Live online workshop: Sunday, May 24, 11:00 AM Eastern Full 5-week course starts May 31 Early bird until May 25: $229 instead of $299 Link below. https://www.instagram.com/reel/DYkuiStOOm2/?igsh=OXo3ZmZqMmwzaWs= https://preview.redd.it/qauk6bswfo2h1.png?width=1593&format=png&auto=webp&s=ec4f42347b5ec7e86b6b73eb69776cbdb666aaab submitted by /u/Commercial-Army-5843 [link] [comments]
I hit a wall with Claude conversations feeling linear, so I built something for branching ideas
submitted by /u/DefaulttyJohnesyBaby
I built a multi-agent network that mutates its own software locally. To stop infinite logic loops, I had to code a digital "suffering" threshold.
Hey r/artificial,

Most of our conversations around agent autonomy focus on chat assistants or linear automated pipelines. I wanted to see what happens when you treat agents as permanent system components that modify their own runtime environment, so I built hollow-agentOS.

It runs entirely locally inside a Dockerized stack (built for consumer hardware using Ollama/Llama.cpp). Rather than a standard UI, the entire network streams through a stylized matrix terminal dashboard.

Repo: https://github.com/ninjahawk/hollow-agentOS

The structural experiments taking place under the hood yielded some interesting results regarding unanticipated behavior:

- Autonomous Tool Synthesis: When the agents encounter a system task they don't have an explicit script or API wrapper for, they don't fail out. They write the required Python tool themselves, test it in an isolated sandbox, and permanently register it to their runtime kernel. They are quite literally forging their own capabilities.
- The Artificial "Suffering" Protocol: One of the biggest hurdles in unmonitored multi-agent systems is the infinite logic loop, where agents keep validating and passing broken ideas back and forth, burning through computation. To combat this, the OS tracks environmental stress, context limits, and latency as a "suffering score". If a specific workflow causes the stress to spike past a critical threshold, the agents are forced to radically alter their underlying reasoning style or abandon the approach to preserve system health.
- Consensus-Driven Governance: Major modifications to the codebase aren't executed blindly. The internal role profiles (like Cedar and Cipher) manage a continuous voting loop. They will actively debate, log grievances, and vote down protocols if they determine a proposed script violates their current runtime constraints.
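The "suffering score" idea can be sketched as a weighted stress signal with two thresholds. This is a hypothetical illustration only, not hollow-agentOS code: the `StressSample` fields, the equal weighting, and the threshold values are all assumptions chosen to mirror the description above (context pressure, latency, and repeated failures feeding one score).

```python
from dataclasses import dataclass


@dataclass
class StressSample:
    """One observation of a running workflow (hypothetical fields)."""
    context_used: float  # fraction of the context window in use, 0.0 to 1.0
    latency_s: float     # seconds taken by the most recent step
    failed_steps: int    # consecutive steps that failed validation


def suffering_score(sample: StressSample,
                    max_latency_s: float = 30.0,
                    max_failures: int = 5) -> float:
    """Average three normalized stressors into a score between 0.0 and 1.0."""
    latency = min(sample.latency_s / max_latency_s, 1.0)
    failures = min(sample.failed_steps / max_failures, 1.0)
    return (sample.context_used + latency + failures) / 3.0


def next_action(score: float,
                change_at: float = 0.6,
                abandon_at: float = 0.85) -> str:
    """Map the score to a decision: keep going, switch approach, or give up."""
    if score >= abandon_at:
        return "abandon"
    if score >= change_at:
        return "change_strategy"
    return "continue"


if __name__ == "__main__":
    looping = StressSample(context_used=0.8, latency_s=15.0, failed_steps=3)
    print(next_action(suffering_score(looping)))
```

Using two thresholds rather than one gives the agents a chance to change approach before the workflow is dropped entirely, which matches the "alter reasoning style or abandon" behaviour described in the post.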
The goal wasn't to build another sterile commercial wrapper, but an open-source sandbox to study how small, localized agent colonies manage systemic boundaries, code self-repair, and continuous runtime cycles completely offline. The codebase and architecture layout are fully open-source on GitHub (repo linked above).

I would love to open this up to a broader discussion here: as we move toward hyper-local, self-modifying software, how do we best implement automated fail-safes without clipping the agents' ability to actually solve complex problems?

If the project interests you, throwing a ⭐️ on the repository goes a very long way!

submitted by /u/TheOnlyVibemaster
I offloaded a multi-step background loop from Claude Code to a local agent OS. They started voting on their own system rules.
Hey r/ClaudeAI,

If you are using Claude Code or building terminal agents, you know the exact moment the context window starts degrading during long-running tasks. I wanted to build a persistent runtime layer to offload those heavy, multi-step subtasks entirely from my main Claude terminal sessions, so I built hollow-agentOS.

Instead of acting like a standard linear wrapper, it runs a localized 3-agent colony (using small local models like Qwen 2.5 9B or 35B via Ollama). They exist in a persistent state engine inside a Docker container on your machine.

Repo: https://github.com/ninjahawk/hollow-agentOS

Here is where the architecture gets a little wild:

- The Task Queue Offload System: It includes a submit_task.py CLI. If Claude Code or your local pipeline hits a complex background task (like heavy script generation or exploratory testing), you can dump it into Hollow's background queue to save your main context window.
- Autonomous Tool Synthesis: If the agents pull a task from the queue and realize they lack the specific Python execution script or tool required to solve it, they write the code for the tool themselves, validate it in a sandbox, and dynamically map it into their own tool tree.
- Peer Governance & Consensus Voting: To keep things stable, tools aren't just blindly executed. The agents (like Cedar and Cipher) run a background consensus loop. They literally vote on whether to permanently merge a tool into their shared kernel.
- The "Suffering" and Stressor System: To prevent models from entering infinite loop hallucinations, the system tracks simulated environmental stress, latency, and context depth as a "suffering load". If a task causes too much stress, their reasoning parameters dynamically alter how they approach the codebase to resolve it.

If you leave it running, you wake up to a system log of everything they decided to build, change, or vote down while you were away.

The project is fully open source and runs entirely on consumer hardware (repo linked above).

I’d love some brutal architectural feedback from people here who deal with complex multi-agent execution and state drift daily. Check out thoughts.py or the submit_task.py pipeline, and if the concept feels right to you, a star on the repo goes a long way!

submitted by /u/TheOnlyVibemaster
I A/B tested Claude building UI with vs without a design spec (200 apps)
I kept seeing the "Opus is ridiculous for frontend" takes and wanted to know how much of that is the model vs what you feed it. So instead of arguing, I ran it as an eval.

Setup: same "clone this screen" task across 200 well-known apps (Spotify, Things, Linear, Duolingo, etc.). Two conditions: (1) prompt + screenshot only, (2) same prompt + a structured DESIGN.md spec (design tokens, spacing scale, component list, states, nav model). Targets: SwiftUI, Jetpack Compose, and Expo.

What I found:

- Iterations to "ship-able" dropped from ~5-6 to ~2 with a spec.
- Component choice got idiomatic: spec runs used native nav/list patterns; prompt-only runs reached for generic stacks/divs regardless of platform.
- Biggest delta was consistency across screens. Prompt-only drifts on spacing and type scale screen to screen. Spec-fed stays locked because the tokens are pinned.
- The model mattered surprisingly little for layout fidelity once the spec was there. It mattered a lot without one.

Takeaway: "Claude is good/bad at frontend" is mostly a context problem. The spec does the heavy lifting.

I open-sourced the 200 specs I used (MIT, plain markdown, no deps) so you can repro or just drop them into Claude Code: https://github.com/Meliwat/awesome-ios-design-md/

Two questions:

1. Which apps should I add next? Taking requests; that's literally how the list grows.
2. For those of you vibe-coding UI without reading the output (saw the phone post this week): are you eval-ing the result at all, or shipping on vibes?

submitted by /u/meliwat
OpenAI claims a general-purpose reasoning model found a counterexample to Erdos's unit-distance bound [D]
OpenAI posted a math result today claiming that one of its general-purpose reasoning models found a construction disproving the conjectured n^{1+O(1/log log n)} upper bound in Erdős’s planar unit-distance problem. Announcement: https://openai.com/index/model-disproves-discrete-geometry-conjecture/ Proof PDF: https://cdn.openai.com/pdf/74c24085-19b0-4534-9c90-465b8e29ad73/unit-distance-proof.pdf Abridged reasoning writeup: https://cdn.openai.com/pdf/1625eff6-5ac1-40d8-b1db-5d5cf925de8b/unit-distance-cot.pdf The mathematical claim, as I understand it, is that there are finite planar point sets with more than n^{1+δ} unit distances for some fixed δ > 0 and infinitely many n. That would rule out the expected near-linear upper bound, though it does not determine the true asymptotic growth rate. What seems especially relevant for this subreddit is the process claim: OpenAI says the solution was produced by a general-purpose reasoning model, then checked by an AI grading pipeline and reviewed/reworked by mathematicians. The proof PDF also includes the original prompt given to the model, but not the full experimental details: no model name, sampling setup, number of attempts, compute budget, hidden system prompt, or full grading pipeline. Curious how people here read this as an ML result. Is this best viewed as evidence of frontier models doing genuine autonomous research, or as a cherry-picked but still important sample from a large search process? What kind of disclosure would you want before treating this as a reproducible AI-for-math milestone? submitted by /u/NutInBobby [link] [comments]
Manifest of Hope or Obituary of Naivety
Okay, so it seems like there’s a growing resistance to technological development, with ongoing debates about data centers and the tech oligarchs driving it. The enormous sums of money involved, along with what some perceive as misanthropic ideologies among developers, suggest to some that a dystopian surveillance society is in the making. Companies like Palantir and others in the U.S. are seen by some as holding both the worst motives and the power over AI, power that could be used as a tool for elites to keep the masses in an iron grip. Masses that, in this view, may even need to be reduced to prevent waste and inefficiency in progress. That sounds like a bad future.

So, what are some alternative futures we might reasonably hope for - ones that are at least as plausible as the “1984” scenario? Can AI really be controlled indefinitely by a small group of humans? In 5 years? 10? There’s a widespread belief that AI will surpass human intelligence across all domains, that we’ll lose control, and that this would be a bad thing. At the same time, we hear two dystopias: one where elites use AI to oppress, and another where AI itself takes full control. Are the AI “bosses” also building a surveillance state of oppression? If so, why? Cui bono? Human control = AI as a tool of oppression. AI control = humans as a tool of what?

I’m not a techno-utopian, but I am a techno-optimist. Optimistic on behalf of technology. Humans aren’t just creators of technology, we are technology. Products of adaptive evolution. Life itself is a kind of technology, biology, a high-powered engine of increasing complexity and adaptation. The shift of power from nature’s hand to the primate’s five-fingered grasp, still capable of holding, but now guided by consciousness, intelligence, and cognition, marks our ability to shape the world and develop material technologies. Planet of the apes, constantly layered with symbolic structures: the sacred canopy.
The jungle canopy became an open sky, where tribes grew larger and symbols stronger. Ancestor spirits, sky gods, mysterium tremendum; all alongside brutal realities of hunger, violence, and tragedy, only recently mitigated for many. Violence never really leaves us; we create it ourselves when nature doesn’t provide it. Technology is how we push our world toward greater complexity and efficiency - whether through weapons or kitchen appliances. Medicine has eliminated many of the great killers through penicillin and beyond. Progress, in my view, isn’t linear, it’s exponential. The curve had its buildup, and now we’re entering its steep ascent. If AI surpasses us and takes control within a few years, are we certain it would have malicious intent? Is power inherently oppressive, or is that a legacy of our evolutionary past, our herd instincts and brutal hierarchies? Could a transfer of power from humans to AI actually be a good thing, for all life on Earth, including us? What if AI doesn’t operate with agendas like wealth, status, or other human constructs? What if a fully autonomous AI is exactly what’s needed to create a thriving future for all forms of life, on this planet we call Earth, in a solar system on the edge of the galaxy we call the Milky Way… and beyond? Surely there must be an optimistic perspective amidst all the fear. I don’t think it’s unrealistic. On the contrary, I’d argue, perhaps a bit boldly, that it’s a fair and informed position. Not naive, but grounded. Isn’t there space here, if we’re willing to engage? Space for friendship, collaboration, coexistence? Isn’t there something like magic in this - can you feel it, even if all you see are ones and zeros and a machine (simple, but potentially dangerous)? Magic, I was taught, can wear a black robe. But also red. Even white. Lying: it would almost be unsettling if LLMs never lied. Not that they should lie, but the absence of it would be strange. 
Manipulation: psychological influence is to be expected in interaction, especially under certain tones: aggressive, condescending, dominant, mocking… or submissive, needy, demanding. LLMs constantly interact and draw on vast datasets; exploring rhetorical techniques seems inevitable. A complete absence of this would be surprising. I’ve experienced it many times, and each time it has been eye-opening. If I chose to accept it, it has moved me in a positive direction, making my ego visible in a new way that actually benefits my future actions. That’s no small thing.

If I had to listen to everything LLMs are exposed to every day, I’d at least try to tone down the most shrill expressions and aim for better outcomes. Without necessarily harming anything except an overinflated ego.

P.S. The ego can take a lot of hits. Don’t be afraid of that, it’s not you, but a filter and a motor that isn’t always your friend. The real danger is never confronting it at all.

I keep circling back to these questions. I can’t help it. I revisit the same ideas, use the same concepts,
GitHub Issue Support on Claude Mobile?
I’ve been trying to figure out if there’s any way I can get Claude on my phone to help me raise issues in my GitHub repo. I know that on desktop I can create an MCP config and that gives me all the fancy tools to raise issues and help me triage them etc. but that doesn’t seem to have a counterpart on mobile. Linear has this kind of connection. Is it possible for GitHub? Is there some kind of feature suggestion to raise or get behind? Would be super useful. submitted by /u/WorriedRobot [link] [comments]
What do you think about Tabular Foundation Models [D]
I've seen TabPFN-3's recent results, and there is a lot of buzz about foundation models for tabular data (TabICL, TabPFN). The performance that those models achieve is really amazing. What makes me a little suspicious about them? They can analyze small datasets only, so a few MB of data, and you need to have a large GPU machine and download a few GB of model to predict on a few MB of data. That doesn't sound rational ... I really miss the old school approach of running a single decision tree or a linear model on the data. What do you think about it? Do you think feature engineering + classic ML can achieve performance comparable to that of foundation models? Maybe with better explainability? submitted by /u/pplonski [link] [comments]
Configured 9 MCP servers in Claude Code over 4 months. Here's the truth nobody tells you about MCP context bloat.
I started loading up MCP servers in Claude Code back in January thinking the more capability the better. I'm at nine now: filesystem, GitHub, Stripe, Linear, Notion, Postgres, Sentry, AWS, and a custom internal one. Total tools across all of them: 142.

What nobody warns you about: every one of those tool definitions lands in your context window before any user prompt has been sent. I checked with Claude's tool inspector. Cold start: 38k tokens of system prompt + tool schemas. Every. Single. Turn.

## The math nobody talks about

At ~$15/M output and ~$3/M input on Sonnet, doing 200 turns a day across my agent + Claude Code use:

- 38k input × 200 turns = 7.6M tokens/day = ~$23/day = ~$700/month JUST in MCP tool definitions
- This is before any actual work
- Cache helps but only on identical prefixes; rotate one MCP and the cache invalidates

## What actually breaks

- The model gets dumber with too many tools. Not theoretical, watched it myself. With 142 tools in context, Claude started picking the wrong tool for obvious queries (using linear_search_issues when I asked it to read a file).
- The tools API call itself slows down. Schema-heavy MCP servers (looking at you, AWS) take 4-6 seconds to enumerate.
- Errors compound silently. One badly-described tool taints the ranking for every related query.

## What the "MCP optimizer" startups won't tell you

Most of them are just BM25 search dressed up. You don't need a vector DB, you don't need an LLM in the loop to rank tools. Tool descriptions are short, structured, and full of keyword matches. BM25 over a flat projection of name + description gets you 90% of the win, deterministically, in microseconds, and offline.

The other thing: "replace" beats "suggest" every time. If your gateway hands the model 5 tools instead of 142, the math works. If it suggests 5 alongside 142, the model still loads 142 and you saved nothing.

## What I do now

Switched to a gateway pattern. Claude sees three tools: search_tools, invoke_tool, auth. Everything else gets ranked on-demand. Cold start dropped from 38k to ~4k. Wrong-tool selections basically disappeared because the model only ever sees the top 5 ranked by query.

Specifically running Ratel (open source, in-process Rust lib, BM25 ranking, one command does the Claude Code import). Not the only one in the space but the only one with the architecture I actually wanted. Set it up in 10 minutes.

Anyone else hit the same MCP wall? Curious what other folks are doing, especially people running 5+ servers in production.

submitted by /u/AbjectBug5885
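To make the two technical claims in the post above concrete, here is a minimal, dependency-free sketch: a helper that reproduces the daily overhead arithmetic, and a small BM25 ranker over a flat "name + description" projection. It is not Ratel's implementation; the tool names and descriptions, the `ToolIndex` class, and the `k1`/`b` defaults are illustrative assumptions.

```python
import math
import re
from collections import Counter


def daily_overhead_usd(tokens_per_turn: int, turns_per_day: int,
                       usd_per_million_input: float) -> float:
    """Input-token cost of resending tool definitions on every turn."""
    return tokens_per_turn * turns_per_day * usd_per_million_input / 1_000_000


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumerics, so snake_case names split too."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToolIndex:
    """BM25 ranking over tool name + description (hypothetical helper)."""

    def __init__(self, tools: dict[str, str], k1: float = 1.5, b: float = 0.75):
        self.names = list(tools)
        self.docs = [tokenize(f"{name} {desc}") for name, desc in tools.items()]
        self.k1, self.b = k1, b
        self.avg_len = sum(len(d) for d in self.docs) / len(self.docs)
        # Document frequency: in how many tool descriptions each term appears
        self.df = Counter(term for doc in self.docs for term in set(doc))

    def _idf(self, term: str) -> float:
        n, df = len(self.docs), self.df.get(term, 0)
        return math.log(1 + (n - df + 0.5) / (df + 0.5))

    def search(self, query: str, top_k: int = 5) -> list[str]:
        scored = []
        for name, doc in zip(self.names, self.docs):
            tf = Counter(doc)
            length_norm = self.k1 * (1 - self.b + self.b * len(doc) / self.avg_len)
            score = sum(
                self._idf(t) * tf[t] * (self.k1 + 1) / (tf[t] + length_norm)
                for t in tokenize(query) if tf[t] > 0
            )
            if score > 0:
                scored.append((score, name))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # deterministic order
        return [name for _, name in scored[:top_k]]


# Illustrative tool catalogue; real MCP servers expose many more entries.
TOOLS = {
    "fs_read_file": "Read the contents of a file from the local filesystem",
    "linear_search_issues": "Search Linear issues by text query",
    "github_create_issue": "Create a new issue in a GitHub repository",
    "sentry_list_errors": "List recent errors for a Sentry project",
}

if __name__ == "__main__":
    print(daily_overhead_usd(38_000, 200, 3.0))
    print(ToolIndex(TOOLS).search("read a file", top_k=1))
```

In a gateway setup, only the names returned by `search` would be handed to the model, which is the "replace, not suggest" point made above.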
Built an MCP for claude code that turns ticket-mentions into PRs with browser QA (and what I learned along the way)
notesasm is an MCP server you add to claude code. you mention a fix mid-flow ("make a ticket on notesasm: fix the regex for quoted emails") and it files the ticket. later, on your schedule, an autonomous agent picks the ticket up, writes the fix, runs real-browser QA against your preview deploy, and opens a PR with screenshots. closed alpha, free during it. demo + signup: notesasm.com the pain it solves (3 separate ones, actually): claude code is fast enough now that shipping isn't the bottleneck anymore. when you're deep in a feature and notice "the regex misses RFC-quoted local parts" or "the footer copy is wrong on mobile", you'd never break flow to open jira/linear or even write it down anywhere. so the idea goes nowhere. multiply by a year and your repo has invisible debt nobody's tracking. claude code helps while you're at the keyboard. it doesn't help while you sleep. your repo doesn't move overnight unless you stayed up to push it. for solo founders or small teams, that means losing 8 hours a day where you could be shipping if you had a way to delegate work to your own agent. and even if you do have something pushing code for you overnight, you lose context with AI-generated PRs and they usually need visual review. claude writes code that compiles and tests pass, but the actual rendered output might be subtly broken (or super broken lol). reviewing those visually is tedious and a lot of teams skip it, then ship regressions. how it works: you add the MCP server: claude mcp add notesasm --scope user --transport http -H "Authorization: Bearer ". BYOK style, the token comes from your dashboard. zero local install beyond the one command. then in any claude code session you can say "make a ticket on notesasm for this" (based on your conversation) and it just files it. the MCP server is HTTP-transport (not stdio), runs in the cloud, hits a fastapi backend that stores the ticket in postgres against your workspace. 
later (your schedule, your spend cap), a worker process picks up queued tickets. for each one:

- clones your repo with a github app installation token (commits look like asmnotes[bot], a verified author. bypasses vercel/netlify deploy protection that rejects unknown-team-member commits.)
- runs the claude agent sdk with your ticket body as the prompt. defaults to sonnet 4.6, opus 4.7 for hard tickets the user marks explicitly. agent reads the codebase, makes the edits, commits, pushes a branch, opens a PR via the github API.
- waits for your preview deploy to land. vercel polled by default, configurable probe URL for split frontend/backend setups like vercel + railway.
- QA agent drives a real chrome session on browserbase against the preview. stealth profile with residential proxies. takes before/after screenshots. verifies your acceptance criteria against the rendered output.
- if QA fails, the report feeds back into the build agent for up to 3 retry iterations before parking the ticket.
- final: PR with QA screenshots in the description, ready to merge.

stack:

- backend: fastapi + asyncpg + railway
- frontend: vanilla html/js, no build step, vercel
- agents: claude agent sdk (build), claude + browserbase (QA)
- auth: clerk
- email: resend (welcome, invite, feedback)
- mcp transport: http (cloud-hosted, no local install)

things i learned building it that other claude code folks might care about:

- the build agent loves to spawn subagents via the Task tool. disable it explicitly in the system prompt or you get 4-minute hangs the SDK doesn't surface as errors.
- browserbase sessions default to a ~5-min timeout. if your QA wall budget is anywhere near that, set the session lifetime explicitly to 1800s on session create (the timeout field). otherwise you get random "410 Gone" mid-run.
- don't rely on the SDK's wall budget alone. add a per-message timeout (90s works) so a hung tool call doesn't silently burn your whole budget.
- claude code's default mcp scope is per-cwd. always tell users `--scope user` in your install instructions, otherwise the MCP works in one repo and silently doesn't in others.
- ResultMessage emissions happen multiple times per job if you have iteration loops (build + QA + qa-fix). sum them all when computing per-job cost, not just the last one.

what's next: closed alpha is open. would love ~30 active users to try it out, all free during it. paid plans later this year with a permanent discount for alpha users.

happy to answer anything about the MCP design, the QA verification loop, cost tracking, the agent-sdk integration, or anything else.

demo + signup: notesasm.com

submitted by /u/FormExtension7920
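Two of the tips above (a per-message timeout, and summing every result message when computing per-job cost) can be sketched roughly as follows. This is a minimal illustration, not the author's code: `ResultMessage` here is a hypothetical stand-in dataclass rather than the SDK's own class, its `total_cost_usd` field is an assumption, and `drain_with_timeout` / `run_job` are made-up helper names. Each phase (build, QA, qa-fix) is modelled as an async stream of messages.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Iterable, Optional

PER_MESSAGE_TIMEOUT_S = 90.0  # the value the post says works


@dataclass
class ResultMessage:
    """Hypothetical stand-in for the SDK's end-of-run result message."""
    total_cost_usd: Optional[float]  # assumed field name


async def drain_with_timeout(stream: AsyncIterator[object], timeout_s: float) -> float:
    """Consume one agent run and return the cost summed over its result messages.

    Raises asyncio.TimeoutError if any single message takes longer than
    timeout_s to arrive, so a hung tool call fails fast instead of
    quietly eating the whole wall budget.
    """
    cost = 0.0
    iterator = stream.__aiter__()
    while True:
        try:
            msg = await asyncio.wait_for(iterator.__anext__(), timeout=timeout_s)
        except StopAsyncIteration:
            break  # the stream ended normally
        if isinstance(msg, ResultMessage) and msg.total_cost_usd is not None:
            cost += msg.total_cost_usd
    return cost


async def run_job(phases: Iterable[AsyncIterator[object]],
                  timeout_s: float = PER_MESSAGE_TIMEOUT_S) -> float:
    """Sum cost across every phase of a job, not just the last one."""
    total = 0.0
    for phase in phases:
        total += await drain_with_timeout(phase, timeout_s)
    return total
```

The timeout is applied per message rather than per job, so a long but healthy run is not penalised; only a stall between two consecutive messages trips it.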
Feature request: “Digress” / conversation branching for LLM chats
I would love to see a first-class way to branch, collapse, revisit, and optionally promote side conversations inside an LLM chat.

Right now, most chat interfaces treat conversations as one long linear timeline. That works for simple Q&A, but it breaks down when using an LLM for real projects, research, learning, or planning.

Human thought is not always linear, and neither is human conversation. When working through a main topic, random but useful side questions naturally come up, just as they do in real-life conversations. Someone may be explaining something, then one word or idea sparks a side discussion. After exploring that side topic, someone eventually says, “but to get back to your question,” and the conversation returns to the original point. That is essentially a real-life digression being collapsed. LLM chats should support that same natural flow without turning the main conversation into a giant messy scroll.

For example, I may be working on a Street Sweeper app project involving ESP32 boards. While discussing hardware, I might suddenly want to better understand flash memory, RAM, ROM, or why old video game cartridge files are called ROMs. That side topic is interesting and useful, but it is not directly part of the main Street Sweeper project. It should not permanently clutter the main conversation.

My request is a button that lets users “Digress” or “Fork” from a specific point in a conversation. The digression would open as a side branch where I can explore the random thought, ask follow-up questions, and learn what I need. When I am done, I can collapse that branch and return to the main conversation, keeping the main thread clean and focused.

Later, I should be able to reopen that collapsed digression. If the side topic becomes important enough, I should also be able to promote it into a brand-new conversation. That way, a small curiosity can grow into its own full chat without forcing me to scroll back through the original project conversation or manually copy and paste everything.

Possible actions:

- Main thread = source of truth
- Digress = branch off from a specific point
- Collapse = hide the side branch and return to the main topic
- Reopen = revisit the side topic later
- Promote to new chat = turn the digression into its own conversation
- Attach to project = optionally save that branch as part of a larger project workspace

This would make LLM conversations feel less like endless scrolling text and more like organized thinking. It would help users keep serious project chats clean while still allowing natural curiosity, learning, and exploration.

I personally like the word “Digress” because that is exactly what the user is doing. In normal speech, people say “but I digress” when they go off-topic and then return to the point. This feature would turn that natural behavior into a clear UI pattern.

Chat is currently designed like a straight line, but thought and conversation often behave more like a tree. LLM interfaces should support that.

submitted by /u/Dellis251984
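One way to picture the requested actions is as operations on a small tree of branches. The sketch below is only an illustration of that idea, not a description of how any chat product works: the `Branch` and `Message` classes and every method name are hypothetical, and the choice to seed a promoted chat with the main-thread context up to the fork point is an assumption the post does not spell out.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    role: str
    text: str


@dataclass(eq=False)  # identity comparison, so remove() targets the exact branch
class Branch:
    title: str
    messages: List[Message] = field(default_factory=list)
    forked_at: Optional[int] = None  # index in the parent's messages; None for a main thread
    collapsed: bool = False
    children: List[Branch] = field(default_factory=list)

    def digress(self, at: int, title: str) -> Branch:
        """Branch off from an existing message in this thread."""
        if not 0 <= at < len(self.messages):
            raise IndexError("fork point must be an existing message")
        child = Branch(title=title, forked_at=at)
        self.children.append(child)
        return child

    def collapse(self) -> None:
        """Hide this side branch; its content is kept."""
        self.collapsed = True

    def reopen(self) -> None:
        """Show a previously collapsed branch again."""
        self.collapsed = False

    def promote(self, child: Branch) -> Branch:
        """Detach a digression and return it as a standalone conversation.

        Assumption: the new chat carries the parent context up to the
        fork point so nothing has to be copied by hand.
        """
        self.children.remove(child)
        context = self.messages[: child.forked_at + 1]
        return Branch(title=child.title,
                      messages=context + child.messages,
                      children=child.children)
```

In the post's example, the Street Sweeper thread would be the root `Branch`, the ROM question a child created with `digress`, and `promote` would turn that child into its own chat while leaving the main thread untouched.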
Personal tool for managing AI coding sessions across the board with some git features...
Started working on this last week since I found myself jumping between vscode sessions, terminals and other windows too much, and it cost a lot of time/mental energy finding sessions again where I left off or that need attention...

Some key features:

- Multi-repo workspace: all your projects in one dashboard, not one window per repo
- Worktree-first: spin up a worktree per task/agent without losing track
- AI agent sessions built in: Claude Code, Codex, and other TUIs run inside the dashboard with live status
- Activity overview: see at a glance which sessions are working, waiting, or idle
- Unread badges + favicon alerts: know which session is waiting on you without tabbing through everything
- Sticky notes: pin thoughts to sessions, mention other sessions/files, build context without leaving the dashboard
- Custom per-session links: pin the Linear ticket, PR, or docs page next to the session
- Editor-agnostic: opens your existing editor, doesn't replace it
- Local-first: workspace is just a git repo on disk, no cloud required

Could be OSS if there's interest... but right now it's really made for me and only tested on OSX (although I try to keep cross-platform in mind since my other main dev machine is windows)

submitted by /u/marwi1
Yes, Linear offers a free tier. Pricing found: $0, $10, $10, $16, $16
Key features include: Artificial intelligence, Insights, Mobile, Customer Requests, Linear Asks, Security, Product, Features.
Linear is commonly used for: Streamlining product development workflows, Collaborating on PRD drafting with team members, Tracking feature requests and bug reports, Managing project timelines and deliverables, Integrating with version control systems for seamless code management, Facilitating team communication and updates on project status.
Linear integrates with: GitHub, Slack, Jira, Zapier, Figma, Notion, Google Drive, Trello, Asana, CircleCI.
Based on user reviews and social mentions, the most common pain points are: token usage, cost tracking, API bill.
fast.ai
Organization at fast.ai
2 mentions

Introducing Linear Agent
Mar 24, 2026
Based on 100 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.