DeepSeek, founded in 2023, focuses on researching world-leading foundational models and technologies for general artificial intelligence, taking on frontier problems in AI. Building on its in-house training framework, self-built intelligent-computing clusters, and compute resources on the scale of ten thousand GPUs, the DeepSeek team released and open-sourced multiple large models with tens of billions of parameters in just half a year, such as the DeepSeek-LLM general-purpose large language model and the DeepSeek-Coder code
Based on the provided content, there is very limited specific user feedback about DeepSeek. The social mentions primarily consist of multiple YouTube channel references to "DeepSeek AI" without actual user reviews or detailed commentary. One technical mention discusses IndexCache, a sparse attention optimizer that improves inference speed by 1.82x for long-context AI models, potentially related to DeepSeek's technical capabilities. The other mentions focus on general AI cost optimization tools and competitive analysis rather than DeepSeek-specific user experiences. Without substantive user reviews or detailed social commentary, it's difficult to assess user sentiment regarding DeepSeek's strengths, weaknesses, or pricing perception.
Mentions (30d): 2
Reviews: 0
Platforms: 4
GitHub Stars: 102,417 (16,606 forks)
Industry: information technology & services
Employees: 170
GitHub followers: 87,689
GitHub repos: 32
GitHub stars: 102,417
npm packages: 20
HuggingFace models: 40
The Artificial Analysis Intelligence Index and its cost benchmarks are useful inputs when deciding which models to use. Below is an analysis of the top models.
# AI Intelligence and Benchmarking Cost (Feb 2026)

As per the **Artificial Analysis Intelligence Index v4.0** (February 2026), the scoring ceiling is set by **Claude Opus 4.6 (max) at 53**.

## Adjusted Score Formula

The "Adjusted Score" doubles the deficit from the ceiling:

```
Adjusted Score = 53 - 2 × (53 - Intel Score)
```

This penalizes performance gaps twice as steeply as the raw linear scale does.

## Model Comparison Table

| Lab | Model | Intel Score | Adjusted Score | Benchmark Cost | Intel Ratio (Score/Cost) | Adj. Ratio (Adj/Cost) |
|-----------|-------|-------------|----------------|----------------|--------------------------|----------------------|
| Anthropic | Claude Opus 4.6 (max) | 53 | 53 | $2,486.45 | 0.021 | 0.021 |
| OpenAI | GPT-5.2 (xhigh) | 51 | 49 | $2,304.00* | 0.022 | 0.021 |
| Zhipu AI | GLM-5 (Reasoning) | 50 | 47 | $384.00* | 0.130 | 0.122 |
| Google | Gemini 3 Pro | 48 | 43 | $1,179.00* | 0.041 | 0.036 |
| MiniMax | MiniMax-M2.5 | 42 | 31 | $124.58 | 0.337 | 0.249 |
| DeepSeek | DeepSeek V3.2 (Reasoning) | 42 | 31 | $70.64 | 0.595 | 0.439 |
| xAI | Grok 4 (Reasoning) | 41 | 29 | $1,568.34 | 0.026 | 0.018 |

*\*Benchmark costs for proprietary models are based on Artificial Analysis evaluation token counts (typically 12M–88M depending on verbosity) multiplied by current API rates.*

## Key Insights

1. **High-token reasoning models**: Grok 4 and Claude Opus 4.6 use very large numbers of tokens during reasoning, up to **88M tokens**, which yields low intel-to-cost ratios despite high scores.
2. **DeepSeek V3.2 is the most efficient**: its adjusted intelligence ratio is roughly **20 times better** than the proprietary frontier's.
3. **Cost-efficiency comparison**: MiniMax-M2.5 and DeepSeek V3.2 share a score of 42, but DeepSeek is almost **twice as cost-effective** thanks to lower API pricing and higher token efficiency.
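The doubled-deficit scoring and the table's cost ratios can be reproduced with a short script (a sketch using only the figures published above; the function name is my own):

```python
# Reproduce the Adjusted Score and cost-efficiency ratios from the table.
# The scoring ceiling is Claude Opus 4.6 (max) at an Intel Score of 53.
CEILING = 53

def adjusted_score(intel_score: int) -> int:
    """Double the deficit from the ceiling: 53 - 2 * (53 - score)."""
    return CEILING - 2 * (CEILING - intel_score)

models = [
    ("Claude Opus 4.6 (max)",      53, 2486.45),
    ("GPT-5.2 (xhigh)",            51, 2304.00),
    ("GLM-5 (Reasoning)",          50,  384.00),
    ("Gemini 3 Pro",               48, 1179.00),
    ("MiniMax-M2.5",               42,  124.58),
    ("DeepSeek V3.2 (Reasoning)",  42,   70.64),
    ("Grok 4 (Reasoning)",         41, 1568.34),
]

for name, score, cost in models:
    adj = adjusted_score(score)
    print(f"{name:<28} adj={adj:>3}  intel/cost={score / cost:.3f}  adj/cost={adj / cost:.3f}")
```

Running it confirms, for example, DeepSeek V3.2's adjusted score of 31 and adjusted ratio of 0.439.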
## Visual Summary

```
Intel Score vs Cost Efficiency (Adjusted Ratio)
─────────────────────────────────────────────────
DeepSeek V3.2    ████████████████████████████ 0.439
MiniMax-M2.5     ███████████████ 0.249
GLM-5            ███████ 0.122
Gemini 3 Pro     ██ 0.036
Claude Opus 4.6  █ 0.021
GPT-5.2          █ 0.021
Grok 4           █ 0.018
```

---

*Source: Artificial Analysis Intelligence Index v4.0, February 2026*

Google's AI mode produced the analysis; GLM-5 formatted it and added the chart. It combines each model's intelligence score with the cost of running the intelligence benchmark, taken from https://artificialanalysis.ai/?endpoints=openai_gpt-5-2-codex%2Cazure_kimi-k2-thinking%2Camazon-bedrock_qwen3-coder-480b-a35b-instruct%2Camazon-bedrock_qwen3-coder-30b-a3b-instruct%2Ctogetherai_minimax-m2-5_fp4%2Ctogetherai_glm-5_fp4%2Ctogetherai_qwen3-next-80b-a3b-reasoning%2Cgoogle_gemini-3-pro_ai-studio%2Cgoogle_glm-4-7%2Cmoonshot-ai_kimi-k2-thinking_turbo%2Cnovita_glm-5_fp8 — look at the intelligence-vs-cost graph there for further insight. You can also add much smaller models to compare against LLMs you might run locally.

The adjusted intelligence/cost metric is a useful heuristic for "how much would you pay extra to get the top score". Choosing non-open models carries a much higher penalty than twice the score difference relative to the top score. Quantized versions don't seem to score lower. The site provides a good base for building your own model that combines score deficit, model size, and tokens/sec into a single score relative to tokens/cost. I was originally researching how Grok 4.2's approach would inflate costs versus performance, but it is not yet benchmarked.
IndexCache, a new sparse attention optimizer, delivers 1.82x faster inference on long-context AI models
Processing 200,000 tokens through a large language model is expensive and slow: the longer the context, the faster the costs spiral. Researchers at Tsinghua University and Z.ai have built a technique called IndexCache that cuts up to 75% of the redundant computation in sparse attention models, delivering up to 1.82x faster time-to-first-token and 1.48x faster generation throughput at that context length. The technique applies to models using the DeepSeek Sparse Attention architecture, including the latest DeepSeek and GLM families. It can help enterprises provide faster user experiences for production-scale, long-context models, a capability already proven in preliminary tests on the 744-billion-parameter GLM-5 model.

The DSA bottleneck

Large language models rely on the self-attention mechanism, a process in which the model computes the relationship between every token in its context and all the preceding ones to predict the next token. However, self-attention has a severe limitation: its computational complexity scales quadratically with sequence length. For applications requiring extended context windows (e.g., large document processing, multi-step agentic workflows, or long chain-of-thought reasoning), this quadratic scaling leads to sluggish inference speeds and significant compute and memory costs.

Sparse attention offers a principled solution to this scaling problem. Instead of calculating the relationship between every token and all preceding ones, sparse attention has each query select and attend to only the most relevant subset of tokens. DeepSeek Sparse Attention (DSA) is a highly efficient implementation of this concept, first introduced in DeepSeek-V3.2. To determine which tokens matter most, DSA introduces a lightweight "lightning indexer" module at every layer of the model. This indexer scores all preceding tokens and selects a small batch for the main core attention mechanism to process. By doing this, DSA slashes the heavy co
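The indexer-then-attend pattern described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the actual DSA implementation: the real lightning indexer is a small learned module per layer, and the single dot-product scorer and `k=4` here are stand-ins.

```python
import numpy as np

def sparse_attention(q, K, V, indexer_w, k=4):
    """Toy DSA-style step: a cheap indexer scores all past tokens,
    then full attention runs only over the top-k selected ones."""
    # Lightweight indexer: one dot product per past token, O(L).
    idx_scores = K @ indexer_w
    top = np.argsort(idx_scores)[-k:]        # indices of the k most relevant tokens
    K_sel, V_sel = K[top], V[top]            # attend only to the selected subset
    logits = K_sel @ q / np.sqrt(q.size)     # core attention, now O(k) instead of O(L)
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ V_sel

rng = np.random.default_rng(0)
L, d = 1000, 16                              # 1000 past tokens, 16-dim head
q = rng.standard_normal(d)
K = rng.standard_normal((L, d))
V = rng.standard_normal((L, d))
out = sparse_attention(q, K, V, indexer_w=rng.standard_normal(d))
print(out.shape)  # (16,)
```

The key cost structure survives the simplification: the quadratic term is replaced by one cheap pass over the context plus full attention over a constant-size subset.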
Show HN: Beta-Claw – I built an AI agent runtime that cuts token costs by 44%
I built Beta-Claw during a competition and kept pushing it after because I genuinely think the token waste problem in AI agents is underrated.

The core idea: most agent runtimes serialize everything as JSON. JSON is great for humans but terrible for tokens. So I built TOON (Token-Oriented Object Notation) — same structure, 28–44% fewer tokens. At scale that's millions of tokens saved per day.

What else it does:
→ Routes across 12 providers (Anthropic, OpenAI, Groq, Ollama, DeepSeek, OpenRouter + more)
→ 4-tier smart model routing — picks the cheapest model that can handle the task
→ Multi-agent DAG: Planner → Research → Execution → Memory → Composer
→ Encrypted vault (AES-256-GCM), never stores secrets in plaintext
→ Prompt injection defense + PII redaction built in
→ 19 hot-swappable skills, < 60ms reload
→ Full benchmark suite included — 9ms dry-run pipeline latency

It's CLI-first, TypeScript, runs on Linux/Mac/WSL2.

Repo: https://github.com/Rawknee-69/Beta-Claw

Still rough in places but the core is solid. Brutal feedback welcome.
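The post doesn't show TOON's actual format, but the underlying idea — uniform records waste tokens on repeated JSON keys — can be illustrated with a simple tabular encoding. `to_tabular` below is a hypothetical stand-in, not TOON itself:

```python
import json

def to_tabular(records):
    """Emit field names once as a header row, then one pipe-delimited row
    per record, instead of repeating every key in every JSON object."""
    fields = list(records[0])
    lines = ["|".join(fields)]
    lines += ["|".join(str(r[f]) for f in fields) for r in records]
    return "\n".join(lines)

records = [{"id": i, "name": f"task-{i}", "status": "done"} for i in range(50)]
as_json = json.dumps(records)
as_tab = to_tabular(records)
print(len(as_json), len(as_tab))  # the tabular form is substantially shorter
```

Character count is only a proxy for token count, but since tokenizers charge for every repeated `"status":` string, the saving carries over in roughly the same proportion.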
Show HN: Sutra.team – The First OS for Autonomous Agents
We built an operating system for AI agents that actually deploy and run autonomously — not just chat interfaces you have to babysit.

The core idea: agents should work like specialists on your team, not assistants you prompt all day.

What that means in practice:
- 15 prebuilt production agents (legal, finance, marketing, operations, etc.)
- 32+ skills from the OpenClaw library (email, web search, browser automation, code execution, council deliberation, etc.)
- Deploy to Telegram, Slack, email, or dashboard
- Heartbeat scheduling for proactive agents (weekly reports, daily checks, etc.)
- BYOK support (Claude, GPT-4, Gemini, DeepSeek, local models)
- Portable Mind Format: every agent is a JSON file you own and can export

Three layers that matter:

1. KARMA (cost governance). Agents with unlimited API access burn money fast. Every skill call is budget-tracked. You set spending limits. Agents that exceed budget get throttled, not shut down.
2. SILA (audit). Every agent action is logged with full context: what skill was called, what data was accessed, what was sent externally. SOC2/GDPR/HIPAA compliance isn't an afterthought — it's built into the execution layer.
3. SUTRA (orchestration). Council deliberation as a first-class skill. 8 specialist agents (Right View, Right Intention, Right Speech, etc.) can be invoked by any other agent. You get multi-perspective analysis without manually coordinating LLM calls.

Why we built this: most "agent frameworks" are libraries for developers to stitch together their own infrastructure. That's fine for engineers, but it leaves everyone else stuck with ChatGPT. We wanted something in between: opinionated infrastructure that handles deployment, security, and cost control — so you can focus on what your agents do, not how they run.
Current state:
- Live in production
- $9/month Explorer tier (full platform access)
- Companion book: How to Use Autonomous Agents — free on Kindle March 1–5, 2026
- 16+ agent build examples across business, creative, and household domains

The constraint we're designing around: agents that "write code on the fly" are powerful in demos, brittle in production. Our bet: pre-audited, composable skills + user-defined constraints + transparent cost tracking = agents you can actually trust to run unsupervised.

Built by: JB Wagoner (patent holder for transportable AI persona architecture, founder of Sutra.team)

Feedback welcome — especially from people building agent systems in production. What's missing? What would make this more useful?
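The "throttled, not shut down" budget behavior the KARMA layer describes could look roughly like this (a hypothetical sketch; Sutra's actual API is not shown in the post, and the class and parameter names are mine):

```python
import time

class BudgetTracker:
    """Track per-agent spend; delay over-budget skill calls instead of failing them."""

    def __init__(self, limit_usd: float, throttle_s: float = 1.0):
        self.limit = limit_usd
        self.throttle_s = throttle_s
        self.spent = 0.0

    def charge(self, cost_usd: float) -> bool:
        """Record one skill call's cost. Returns True if the call was throttled."""
        self.spent += cost_usd
        if self.spent > self.limit:
            time.sleep(self.throttle_s)  # slow the agent down, don't kill it
            return True
        return False

tracker = BudgetTracker(limit_usd=0.05, throttle_s=0.0)
throttled = [tracker.charge(0.02) for _ in range(4)]
print(throttled)  # [False, False, True, True]
```

The design choice worth noting is the return type: the agent keeps running and can observe that it is over budget, rather than dying mid-task with work half-done.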
Show HN: Get GPT-5.2, Grok-4.1-fast, KimiK2.5 and more LLMs at half the cost
Hey, I'm Vansh. I built frogAPI because I was tired of managing separate accounts and billing for every AI provider.

It's an OpenAI-compatible API gateway. Swap your base URL to frogapi.app/v1, keep your SDK code, and pick from 9 models: GPT-5.2, GPT-5-Mini, GPT-5-Nano, DeepSeek-V3.2, Mistral-Large-3, Llama-4-Maverick, Kimi-K2.5, Grok-4.1-Fast, GPT-OSS-120B.

Per-token pricing matches the source models exactly. The way it works out cheaper: every deposit is matched with free credits. Put in $10, get $20 in credits. So your effective cost per token is half.

No subscriptions, no tiers. Pay per token.

Would love feedback.

https://frogapi.app
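The credit-matching arithmetic behind "half the cost" is simple: if every deposited dollar is matched 1:1 with credit, each list-price dollar of usage consumes only fifty cents of real money (a worked example; the per-token prices below are hypothetical, not frogAPI's actual rates):

```python
def effective_price(list_price_per_mtok: float, match_ratio: float = 1.0) -> float:
    """With deposits matched match_ratio:1 in credits, each list-price dollar
    costs 1 / (1 + match_ratio) real dollars."""
    return list_price_per_mtok / (1 + match_ratio)

# $10 deposited -> $20 in credits, so a hypothetical $2.00/Mtok list price
# effectively costs $1.00/Mtok of real money.
print(effective_price(2.0))  # 1.0
```
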
February 26, 2026
It appears the State of the Union was the marker for the White House to launch directly into campaign mode. Much of that mode centers on trying to defang Trump’s weaknesses with attacks on Democrats. And since the 2024 campaign brought us the insistence from the Trump campaign, including Trump and then–vice presidential candidate J.D. Vance, that “they’re eating the dogs…they’re eating the cats,” it’s reasonable to assume the next several months are going to be a morass of lies and disinformation. Trump announced in his State of the Union that he was declaring a “war on fraud to be led by our great Vice President J.D. Vance” and said that “members of the Somali community have pillaged an estimated $19 billion from the American taxpayer…in actuality, the number is much higher than that. And California, Massachusetts, Maine and many other states are even worse.” He added: “And we’re able to find enough of that fraud, we will actually have a balanced budget overnight.” This, in part, seemed designed to reverse victim and offender by suggesting that rather than Trump’s being the perpetrator of extraordinary frauds and corruption in cryptocurrency, for example—he was, after all, found guilty on 34 charges of business fraud in 2024—immigrants are to blame for fraud. As Kirsten Swanson and Ryan Raiche of KSTP in Minneapolis explain, members of Minnesota’s Somali community, 95% of whom are U.S. citizens, pay about $67 million in taxes annually and have an estimated $8 billion impact on the community. While some have indeed been charged and convicted of fraud over the past five years, the accusation of $19 billion in fraud is just a number thrown out without evidence by “then-Assistant U.S. Attorney Joe Thompson,” who estimated in December 2025 that “‘half or more’ of $18 billion in Medicaid reimbursements from 14 high-risk programs could be fraudulent.” Yesterday Vance and Dr. 
Mehmet Oz, who oversees Medicaid, the federal healthcare program for low-income households, announced the administration is withholding $259 million in Medicaid funds from Minnesota, claiming the state has not done enough to protect taxpayers from fraud. It is illegal for the executive branch to withhold funds appropriated by Congress, and a federal judge has blocked a similar freeze on $10 billion in childcare funding for Illinois, California, Colorado, Minnesota, and New York while the case is in court. Nonetheless, Minnesota representative Tom Emmer, who is part of the Republican leadership in the House, approved the attack on his constituents, posting: “The war on fraud has begun. And Somali fraudsters in my home state are about to find out.” Minnesota governor Tim Walz, a Democrat, posted: “This has nothing to do with fraud…. This is a campaign of retribution. Trump is weaponizing the entirety of the federal government to punish blue states like Minnesota. These cuts will be devastating for veterans, families with young kids, folks with disabilities, and working people across our state.” While Walz is almost certainly correct that this is a campaign of retribution, the administration is also salting into the media an explanation for the sudden depletion of the trust funds that are used to pay Medicare and Social Security. In March 2025, the nonpartisan Congressional Budget Office (CBO) estimated the trust fund that pays for Medicare A would be solvent until 2052. On Monday, it updated its projections, saying the funds will run out in 2040. The CBO also expects the Social Security trust fund to run dry a year earlier than previously expected, by the end of 2031. 
As Nick Lichtenberg of *Fortune* wrote, policy changes by the Republicans under Trump, especially the tax cuts in the budget reconciliation bill the Republicans call the “One Big Beautiful Bill Act” have “drastically shortened the financial life spans of both Medicare and Social Security, accelerating their paths toward insolvency.” Between Trump’s statement that if the administration finds enough fraud it can balance the budget overnight, and the subsequent insistence that cuts to Medicaid are necessary because of that fraud, it sure looks like the administration is trying to distract attention from the CBO’s report that Trump’s tax cuts have cut the solvency of Social Security and Medicare by more than a decade. Instead, they are hoping to convince voters that immigrants are at fault. Similarly, in an oldie but a goodie, Republicans today hauled former secretary of state Hillary Clinton before the House Oversight and Government Reform Committee to testify by video about her knowledge of the investigations into sex traffickers Jeffrey Epstein and Ghislaine Maxwell. In a scathing opening statement, Clinton noted that while committee chair James Comer (R-KY) subpoenaed eight law enforcement officials who were directly involved in that investigation, only one appeared before the committee. The rest simply submitted brief statements saying they had no information. Clinton al
Repository Audit Available
Deep analysis of deepseek-ai/DeepSeek-V3 — architecture, costs, security, dependencies & more
DeepSeek has a public GitHub repository with 102,417 stars.
Based on user reviews and social mentions, the most commonly recurring topics are: large language model, LLM, foundation model, token cost.
Based on 11 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.
Sebastian Raschka
Staff ML Engineer at Lightning AI
2 mentions