The unified interface for LLMs. Find the best models & prices for your prompts
Based on the social mentions, users view OpenRouter as a valuable platform for AI model access and cost management. **Strengths** include extremely detailed statistics and analytics, particularly for programming use cases which represent the largest token consumption, and its utility as a reusable integration for AI agents with features like model discovery, cost tracking, and routing with fallbacks. **Key complaints** center around token costs, with users actively seeking ways to reduce expenses and noting concerns about "burning money" on AI services. **Pricing sentiment** is cost-conscious, with users appreciating tools that cut token costs by 44% and looking for more economical alternatives. **Overall reputation** positions OpenRouter as a go-to platform for developers building AI agents and applications, especially for programming tasks, though cost optimization remains a primary user concern.
Mentions (30d)
2
Reviews
0
Platforms
4
Sentiment
0%
0 positive
Features
Industry
information technology & services
Employees
40
Funding Stage
Series A
Total Funding
$40.0M
openrouter rankings for programming tokens show sharp rise in open models and stagnation of US frontier models
Site has extremely detailed stats by day/week for every model. Programming is by far the largest consumer of tokens, and in fact the entire token growth in 2025 was only from programming; other categories are very flat. It is also a category where you would pay for better performance. IMO, it's relevant to this sub in that one of the top models, minimax, fits in under 256gb, but also that the trends are for cost effectiveness rather than "the absolute best". There is a tangent insight as to whether the US datacenter frenzy is needed. kimi k2.5 being free on openclaw is a big reason for its total dominance. In the week of Feb 2, minimax was the only other top model to increase token usage. The Opus 4.6 release seems to be extremely flat in reception. The agentic trend tends to make LLM models disposable, since better ones are released every week, and the agents/platforms that can switch on the fly while keeping context are something you can invest in improving while not being obsolete next month.
Pricing found: $10
z.ai debuts faster, cheaper GLM-5 Turbo model for agents and 'claws' — but it's not open-source
Chinese AI startup Z.ai, known for its powerful, open source GLM family of large language models (LLMs), has introduced GLM-5-Turbo, a new, proprietary variant of its open source GLM-5 model aimed at agent-driven workflows. The company positions it as a faster model tuned for OpenClaw-style tasks such as tool use, long-chain execution and persistent automation. It's available now through Z.ai's application programming interface (API) on third-party provider OpenRouter with roughly a 202.8K-token context window, 131.1K max output, and listed pricing of $0.96 per million input tokens and $3.20 per million output tokens. That makes it about $0.04 cheaper per combined input and output cost (at 1 million tokens each) than its predecessor, according to our calculations.

| Model | Input | Output | Total Cost | Source |
|---|---|---|---|---|
| Grok 4.1 Fast | $0.20 | $0.50 | $0.70 | xAI |
| Gemini 3 Flash | $0.50 | $3.00 | $3.50 | Google |
| Kimi-K2.5 | $0.60 | $3.00 | $3.60 | Moonshot |
| GLM-5-Turbo | $0.96 | $3.20 | $4.16 | OpenRouter |
| GLM-5 | $1.00 | $3.20 | $4.20 | Z.ai |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 | Anthropic |
| Qwen3-Max | $1.20 | $6.00 | $7.20 | Alibaba Cloud |
| Gemini 3 Pro | $2.00 | $12.00 | $14.00 | Google |
| GPT-5.2 | $1.75 | $14.00 | $15.75 | OpenAI |
| GPT-5.4 | $2.50 | $15.00 | $17.50 | OpenAI |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 | Anthropic |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 | Anthropic |
| GPT-5.4 Pro | $30.00 | $180.00 | $210.00 | OpenAI |

Second, Z.ai is also adding the model to its GLM Coding subscription product, its packaged coding assistant service. That service has three tiers: Lite at $27 per quarter, Pro at $81 per quarter, and Max at $216 per quarter. Z.ai's March 15 rollout note says Pro subscribers get GLM-5-Turbo in March, while Lite subscribers get the base GLM-5 in March and must wait until April for GLM-5-Turbo. The company is also taking early-access applications for enterprises via a Google Form, which suggests some users may get access ahead of that schedule depending on capacity. Z.ai describes GLM-5-Turbo as designed for "fast inference" and "deeply optimized for real-wor
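The per-model totals above are just the sum of the input and output rates per million tokens. A quick sketch (function name ours) of turning those listed rates into an actual per-request cost:

```typescript
// Per-request cost from per-million-token rates, as listed in the table.
function requestCost(
  inputPerM: number,   // $ per 1M input tokens
  outputPerM: number,  // $ per 1M output tokens
  inputTokens: number,
  outputTokens: number,
): number {
  return (inputPerM * inputTokens + outputPerM * outputTokens) / 1_000_000;
}

// GLM-5-Turbo at its listed rates, for a 10K-input / 2K-output call:
const cost = requestCost(0.96, 3.20, 10_000, 2_000); // ≈ $0.016
```

At 1M input plus 1M output tokens, the same function reproduces the article's $4.16 vs $4.20 comparison between GLM-5-Turbo and GLM-5.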
Show HN: Beta-Claw – I built an AI agent runtime that cuts token costs by 44%
I built Beta-Claw during a competition and kept pushing it after because I genuinely think the token waste problem in AI agents is underrated.

The core idea: most agent runtimes serialize everything as JSON. JSON is great for humans but terrible for tokens. So I built TOON (Token-Oriented Object Notation) — same structure, 28–44% fewer tokens. At scale that's millions of tokens saved per day.

What else it does:
→ Routes across 12 providers (Anthropic, OpenAI, Groq, Ollama, DeepSeek, OpenRouter + more)
→ 4-tier smart model routing — picks the cheapest model that can handle the task
→ Multi-agent DAG: Planner → Research → Execution → Memory → Composer
→ Encrypted vault (AES-256-GCM), never stores secrets in plaintext
→ Prompt injection defense + PII redaction built in
→ 19 hot-swappable skills, < 60ms reload
→ Full benchmark suite included — 9ms dry-run pipeline latency

It's CLI-first, TypeScript, runs on Linux/Mac/WSL2.

Repo: https://github.com/Rawknee-69/Beta-Claw

Still rough in places but the core is solid. Brutal feedback welcome.
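The post doesn't show TOON's actual syntax, but the general idea behind token-lean serialization of tabular data is easy to illustrate: JSON repeats every key on every row, while a header-plus-rows layout states the keys once. A sketch of that idea (not TOON itself):

```typescript
type Row = Record<string, string | number>;

// State the keys once as a header, then emit one compact line per row.
// Illustrates why repeated-key JSON wastes tokens; not the TOON format.
function toHeaderRows(rows: Row[]): string {
  if (rows.length === 0) return "";
  const keys = Object.keys(rows[0]);
  const header = keys.join(",");
  const body = rows
    .map((r) => keys.map((k) => String(r[k])).join(","))
    .join("\n");
  return header + "\n" + body;
}

const rows: Row[] = [
  { id: 1, name: "alpha", status: "ok" },
  { id: 2, name: "beta", status: "ok" },
];
const asJson = JSON.stringify(rows); // keys repeated per row
const asRows = toHeaderRows(rows);   // keys stated once
```

The gap between `asJson.length` and `asRows.length` widens with row count, which is where the claimed 28–44% savings would come from on large structured payloads.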
Show HN: Open-source multi-model code review council (BYOK, free tier)
I built this after realizing that single-model AI code review currently has obvious blind spots. The idea: chat with a Lead AI about your project, then "convene the Council" — three other models independently review everything in parallel, and the Lead synthesizes findings into structured categories: consensus, majority positions, lone warnings, and dissent.

The surprising insight from building and using it: the value isn't consensus, it's structured disagreement. When Grok catches a temporal data mismatch that three other models missed, or Claude spots an imported function that's never called — that's where the Council is useful.

Each model has genuine strengths: Claude tends toward architecture, Grok goes deep on data flows, ChatGPT catches API/integration issues, Gemini thinks about product gaps.

Stack: FastAPI, HTMX, OpenRouter as a unified API gateway. BYOK via OpenRouter, so one key covers all four models — a typical review runs about 25 cents. The free tier gives you 1 review to try it.

Perplexity just launched "Model Council" as part of their $200/month Computer product — same core mechanic. This is the open-source, bring-your-own-key version.

GitHub: https://github.com/scifi-signals/council-of-alignment
Live: https://council.stardreamgames.com

Constructive feedback would be much appreciated!
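The fan-out step described above maps naturally onto OpenRouter's OpenAI-compatible chat completions endpoint, since one key covers every council member. A minimal sketch (model slugs, prompts, and helper names are illustrative, not the project's actual code):

```typescript
// Illustrative council roster; real OpenRouter model slugs may differ.
const COUNCIL = [
  "anthropic/claude-sonnet",
  "x-ai/grok",
  "openai/gpt",
  "google/gemini-pro",
];

// Build one independent review request for a given model.
function buildReview(model: string, diff: string) {
  return {
    model,
    messages: [
      { role: "system", content: "Review this change independently. List issues." },
      { role: "user", content: diff },
    ],
  };
}

// Fan out the same diff to every council member in parallel.
async function conveneCouncil(apiKey: string, diff: string) {
  const responses = await Promise.all(
    COUNCIL.map((model) =>
      fetch("https://openrouter.ai/api/v1/chat/completions", {
        method: "POST",
        headers: {
          Authorization: `Bearer ${apiKey}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify(buildReview(model, diff)),
      }).then((r) => r.json()),
    ),
  );
  return responses; // a Lead model would then synthesize these into categories
}
```

The synthesis step (consensus / majority / lone warnings / dissent) would be one more call to the Lead model with all four raw reviews in context.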
Show HN: OpenRouter Skill – Reusable integration for AI agents using OpenRouter
Hi HN,

I kept rebuilding the same OpenRouter integration across side projects – model discovery, image generation, cost tracking via the generation endpoint, routing with fallbacks, multimodal chat with PDFs. Every time I'd start fresh, the agent would get some things right and miss others (wrong response parsing, missing attribution headers, etc.).

So I packaged the working patterns into a skill – a structured reference that AI coding agents (Claude, Cursor, etc.) read before writing code. It includes quick snippets, production playbooks, Next.js and Express starter templates, shared TypeScript helpers, and smoke tests.

I'm a PM, not a developer – the code was written by Claude and reviewed/corrected by me. Happy to answer questions about the skill format or the OpenRouter patterns.
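Two of the patterns mentioned — attribution headers and cost tracking via the generation endpoint — look roughly like this. Helper names are ours, and the shape of the generation endpoint's response should be verified against OpenRouter's docs:

```typescript
// Standard headers for an OpenRouter request. HTTP-Referer and X-Title
// are OpenRouter's attribution headers, which credit your app in its
// rankings; omitting them is one of the "missed details" the post mentions.
function openRouterHeaders(apiKey: string, appUrl: string, appName: string) {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    "HTTP-Referer": appUrl,
    "X-Title": appName,
  };
}

// Cost tracking: after a completion, look up its generation record by id.
// The response carries token counts and cost; exact field names should be
// checked against OpenRouter's API reference before relying on them.
async function generationStats(apiKey: string, generationId: string) {
  const res = await fetch(
    `https://openrouter.ai/api/v1/generation?id=${generationId}`,
    { headers: { Authorization: `Bearer ${apiKey}` } },
  );
  const body = await res.json();
  return body.data;
}
```

Packaging exactly this kind of detail (headers, response parsing, endpoints) is what makes the skill useful as a pre-read for coding agents.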
How to stop burning money on OpenClaw
[Original Reddit post](https://www.reddit.com/r/ArtificialInteligence/comments/1rktfk0/how_to_stop_burning_money_on_openclaw/)

OpenClaw is one of the fastest-growing open-source projects in recent history. 230,000 GitHub stars, 116,000 Discord members, 2 million visitors per week. All of that in two months. People are running personal AI agents on their Mac Minis and cloud servers. It works, and it is genuinely useful. Like any major shift in how we use technology, it comes with constraints.

After speaking with over a hundred OpenClaw users, cost is the topic that comes up in almost every conversation. Someone sets up their agent, starts using it daily, and two weeks later discovers they have spent $254 on API tokens. Another spent $800 in a month. These are not power users pushing the limits. These are normal setups with normal usage.

**Where the money goes**

Your agent sends every request to your primary model. A heartbeat check, a calendar lookup, a simple web search. If your primary model is Opus 4.6, all of it goes through the most expensive endpoint available. Your costs stack up from four main sources:

- **System context** – SOUL.md loads into the prompt on every call. Other bootstrap files like AGENTS.md contribute depending on what the agent needs. Even with memory pulled in through search rather than loaded raw, the base system context still adds up. On a typical setup, you are looking at thousands of tokens billed on every single request.
- **Conversation history** – Your history grows with every exchange. After a few hours of active use, a session can carry a large number of tokens, and the entire history tags along with every new request.
- **Heartbeat checks** – The heartbeat runs in the background every 30 minutes by default. Each check is a full API call with all of the above included.
- **Model choice** – Without routing, every request is sent to a single primary model, whether the task is simple or complex. That prevents cost optimization.
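The arithmetic behind these surprise bills is easy to sketch. Assuming ~5,000 tokens of fixed system context per call, the default 30-minute heartbeat, and an Opus-class input rate of $5 per million tokens (all numbers illustrative):

```typescript
// Rough monthly cost of heartbeats alone: fixed context re-billed on
// every background check, before any actual work is done.
function monthlyHeartbeatCost(
  tokensPerCall: number,     // system context billed per request
  intervalMinutes: number,   // heartbeat interval
  pricePerMInput: number,    // $ per 1M input tokens
): number {
  const callsPerMonth = (30 * 24 * 60) / intervalMinutes; // 30-day month
  return (tokensPerCall * callsPerMonth * pricePerMInput) / 1_000_000;
}

const idleCost = monthlyHeartbeatCost(5_000, 30, 5); // $36/month doing nothing
```

That $36 is a floor: it counts only the fixed context on idle checks, before conversation history or any real tasks are added on top.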
One user woke up to an unexpected $141 bill overnight because the heartbeat was hitting the wrong model. Put all of this together on an unoptimized Opus setup and you can easily spend more per day than most people expect to pay in a month. (Token consumption taken from the manifest.build dashboard.)

**Use one agent with skills instead of many agents**

This is the highest-impact change you can make, and almost nobody talks about it. A lot of users build multi-agent setups: one agent for writing, one for research, one for coding, one to coordinate. Each agent runs as a separate instance with its own memory, its own context, and its own configuration files. Every handoff between agents burns tokens, and each agent adds its own fixed context overhead, so costs scale with every new instance you spin up.

OpenClaw has a built-in alternative. A skill is a markdown file that gives your agent a new capability without creating a new instance. Same brain, same memory, same context. One user went from spending hundreds per week on a multi-agent setup to $90 per month with a single agent and a dozen skills. The quality went up because context stopped getting lost between handoffs.

Keep one main agent. Give it a skill for each type of work. Only spin up a sub-agent for background tasks that take several minutes and need to run in parallel.

**Route each task to the right model**

The majority of what your agent does is simple: status checks, message formatting, basic lookups. These do not need a frontier model. Only a small fraction of requests actually benefits from premium reasoning. Without routing, all of it hits your most expensive endpoint by default. One deployment tracked their costs before and after implementing routing and went from $150 per month to $35. Another went from $347 to $68. Smart routing tools can reduce costs by 70 percent on average. OpenClaw does not ship with a built-in routing engine, so you need an external tool to make this work.
Tools like Manifest or OpenRouter handle this out of the box: they classify each request and route it to the right model automatically, so your heartbeats and simple lookups go to Haiku while complex reasoning still hits Opus. That alone cuts your bill dramatically without any manual config per task. If you prefer a DIY approach, you can set up multiple model configs or write a routing skill yourself, but it takes more effort to get right.

**Cache what does not change**

Your SOUL.md, MEMORY.md, and system instructions are the same from one call to the next. Without caching, the provider processes all of those tokens from scratch on every single request. You pay full price every time for content that has not changed. Prompt caching is a capability on the provider side. Anthropic offers an explicit prompt caching mechanism with a documented TTL where cached reads cost significantly less than fresh processing. Other providers handle caching differently or automatically, so the details depend on which model you are using. The point is the same: stat
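The routing idea from the post is simple to sketch. A real router classifies requests with embeddings or a small model; this naive keyword heuristic (model slugs illustrative) just shows the shape of a DIY routing skill:

```typescript
// Naive task router: obviously simple traffic goes to a cheap model,
// everything else to the premium tier. A production router would
// classify with a small model or embeddings instead of keywords.
const CHEAP_PATTERNS = /heartbeat|status|format|lookup|ping/i;

function pickModel(task: string): string {
  return CHEAP_PATTERNS.test(task)
    ? "anthropic/claude-haiku"  // cheap tier (slug illustrative)
    : "anthropic/claude-opus";  // premium tier (slug illustrative)
}

pickModel("heartbeat check");           // routes to the cheap model
pickModel("refactor the auth module");  // routes to the premium model
```

Even a heuristic this crude captures the core saving, since heartbeats and lookups dominate request volume while contributing nothing that needs frontier reasoning.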
Yes, OpenRouter offers a free tier. Pricing found: $10
Based on user reviews and social mentions, the most common pain points center on token costs and overall spend on LLM API usage.
Based on 11 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.
danny-avila
1 mention