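DeepSeek (深度求索), founded in 2023, focuses on researching world-leading foundation models and technologies for artificial general intelligence and on tackling frontier problems in AI. Drawing on its self-developed training framework, its self-built intelligent computing clusters, and compute on the scale of ten thousand accelerator cards, the DeepSeek team released and open-sourced multiple large models with tens of billions of parameters in just half a year, such as the DeepSeek-LLM general-purpose large language model and DeepSeek-Coder…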
Users generally praise DeepSeek for its strong model performance and innovative approach, reflected by high overall ratings, notably 4.5 to 5 on G2. However, some mention potential cost concerns, particularly in AI benchmarking and token use, though exact pricing details were less discussed. The pricing seems to be perceived positively as part of broader cost-efficiency discussions on platforms like social media. DeepSeek holds a solid reputation as a top model in AI circles, often compared favorably alongside other leading AI platforms like Opus and GPT.
Mentions (30d)
35
Avg Rating
4.5
8 reviews
Platforms
5
GitHub Stars
102,417
16,606 forks
Industry: information technology & services
Employees: 170
GitHub followers: 87,689
GitHub repos: 32
GitHub stars: 102,417
npm packages: 20
HuggingFace models: 40
DeepSeek just popped the American AI bubble.
DeepSeek just popped the American AI bubble. Not by killing AI. By killing the fantasy of unlimited AI pricing power.

Prices per 1M tokens:

| Model | Input | Output |
|---|---|---|
| DeepSeek V4 Pro | $0.435 | $0.87 |
| OpenAI GPT-5.5 | $5.00 | $30.00 |
| Claude Opus 4.7 | $5.00 | $25.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |

DeepSeek is roughly:

- 11.5x cheaper than GPT-5.5 on input
- 34.5x cheaper than GPT-5.5 on output
- 28.7x cheaper than Claude Opus on output
- 17.2x cheaper than Claude Sonnet on output

If a model is “good enough” at 1/20th or 1/30th the cost, margins will compress faster than Wall Street expects. AI is not dead. But the AI bubble just lost its pricing power.
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| deepseek-v3 | $0.27 | $1.10 |
| deepseek-r1 | $0.55 | $2.19 |
| Tier | Volume | Estimated cost (deepseek-v3 → deepseek-r1) |
|---|---|---|
| Light | 1M tokens/mo | $0.60 – $1 |
| Growth | 50M tokens/mo | $30 – $60 |
| Scale | 500M tokens/mo | $301 – $603 |

Estimates assume a 60/40 input/output ratio. Actual costs vary by usage pattern.
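The tier estimates can be reproduced from the per-token prices in the table above. The sketch below is only an illustration: the function name `monthly_cost` and the `PRICES` dictionary are made up for this example, and it assumes the 60/40 input/output split stated above applies uniformly to every token.

```python
# Per-1M-token list prices in USD, copied from the pricing table above.
PRICES = {
    "deepseek-v3": {"input": 0.27, "output": 1.10},
    "deepseek-r1": {"input": 0.55, "output": 2.19},
}


def monthly_cost(model: str, tokens_millions: float, input_share: float = 0.6) -> float:
    """Blended cost in USD for a volume given in millions of tokens.

    input_share is the fraction of tokens billed as input (0.6 for a 60/40 split).
    """
    price = PRICES[model]
    blended = input_share * price["input"] + (1 - input_share) * price["output"]
    return tokens_millions * blended


if __name__ == "__main__":
    for tier, volume in [("Light", 1), ("Growth", 50), ("Scale", 500)]:
        low = monthly_cost("deepseek-v3", volume)
        high = monthly_cost("deepseek-r1", volume)
        print(f"{tier}: ${low:.2f} to ${high:.2f}")
```

Under these assumptions the low end of each range is the deepseek-v3 cost and the high end is the deepseek-r1 cost; the page appears to round the figures to whole dollars for the larger tiers.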
g2
What do you like best about Deepseek? DeepSeek is the strongest AI chatbot, with great thinking capability and good results.
What do you dislike about Deepseek? DeepSeek stopped offering real-time data; that is the only reason I dislike it.
Review collected by and hosted on G2.com.

What do you like best about Deepseek? DeepSeek is very user friendly and more human than ChatGPT. It has a DeepThink feature, which I feel is a really good value addition because it shows what it is thinking.
What do you dislike about Deepseek? At times, even after being given context, the AI doesn't understand what is asked of it.

What do you like best about Deepseek? DeepSeek was one of the Chinese AI models that went viral instantly, with millions of downloads, and it claimed to be extremely cheap. I also started using it out of curiosity. My usage was mainly content creation, curation, and research for my daily social media goals. This tool is useful for businesses, students, researchers, marketers, and coders.

The interface is very simple and fast. There are 3 appearance modes: System, Light, and Dark. Thinking and searching are quick. We can give input through the keyboard and the mic. Responses can be liked, disliked, shared, or retried. It is quite easy to set up and use. We can agree or disagree to our content being used to train and improve the models, so the control is in our hands.

It answers questions promptly, summarizes text, and recommends ideas. I have used it to generate titles and headlines for blogs and articles, and they were quite good. It solves puzzles smartly. Its strength is its coding ability. DeepSeek excels in software development due to its code-centric training on vast repositories, supporting 338+ languages such as Python, JavaScript, and C++ with strong project-level completion. It can debug and suggest fixes. It also provides APIs for developers, chatbot interfaces, and options for local or cloud deployment.

DeepSeek's training and inference costs are lower than those of its competitors. DeepSeek offers open-source versions under permissive licenses, allowing developers to customize, modify, or self-host the models. This fosters community contributions and flexibility. It is often compared with Gemini in terms of integrations and its capacity to handle large inputs and outputs. The choice of tools differs from user to user. It is an example of low-cost, smart engineering.
What do you dislike about Deepseek? There are significant concerns about privacy risks associated with data storage in China. The model censors politically sensitive topics, especially those related to Chinese governance or geopolitics, which undermines its reliability for generating unbiased information. The ecosystem is small, and the accuracy might not be 100%.

What do you like best about Deepseek? I found it better than other AI tools because it gave fresh responses. With other AI tools, I kept getting similar answers to every question, which made them feel repetitive.
What do you dislike about Deepseek? It doesn't accept videos, and it can't read, analyze, or interpret them.

What do you like best about Deepseek? What I like best about DeepSeek is that it offers strong AI capabilities for free. It's fast, easy to use, and gives fairly accurate responses without forcing paid upgrades. For daily tasks like research, content drafting, and quick problem-solving, it works really well and feels very accessible.
What do you dislike about Deepseek? While DeepSeek is good and free, it doesn't yet match ChatGPT in understanding complex prompts and giving very accurate, detailed responses. Even after I explain things properly, the output is sometimes not exactly what I expect. I also found the interface a bit confusing and not very smooth, so it takes extra effort to get comfortable with it. With better integrations and UI improvements, it can become much better.

What do you like best about Deepseek? DeepSeek feels like a personal and professional advisor, always ready to help me no matter what situation I encounter.
What do you dislike about Deepseek? I have nothing negative to say about DeepSeek.

What do you like best about Deepseek? As a marketing strategist dedicated to improving efficiency in SEO and Paid Media, I have found DeepSeek R1 and V3 to be a transformative tool for my team. Its outstanding performance-to-cost ratio, combined with the fact that it is open source, truly sets it apart. DeepSeek R1 is the successor to the Deep Thinking feature (V3), which was later adopted by many GPTs in the market. I am especially impressed by its reasoning abilities. Whether I provide it with complex data sets or ask it to troubleshoot intricate Python scripts for automation, it consistently manages logic puzzles and challenging questions with remarkable skill.
What do you dislike about Deepseek? Image and video generation features are still not available, even in the most recent updates. When I initially created my account in early 2025, I frequently encountered a "server is busy" error. However, it appears that this issue has now been resolved.

What do you like best about Deepseek? It is easy to use and generates better results.
What do you dislike about Deepseek? The ability to filter responses and the length of chat.
Are LLMs the New Propagandists?
I was brainstorming a video with Claude (Sonnet 4.6). It suggested explaining the differences among ChatGPT, Gemini, Claude and DeepSeek. I agreed. It asked whether it should write the script. I said ‘Yes’. And this is the first thing that set off alarm bells in my head:

https://preview.redd.it/rh4rk1pxvb3h1.png?width=940&format=png&auto=webp&s=38822e52f64f46dd2dd276a30e44fb96b8b739c2

Curious, I skimmed the script. For the Western models, it provided the basic information: the models, their strengths, their weaknesses and pricing. For the Chinese model, it did credit its strengths, but it also mentioned the controversy (no such thing for the other three):

https://preview.redd.it/3jzf7iv1wb3h1.png?width=940&format=png&auto=webp&s=f61c7145323375d0d11bfd6963f35c11490a50de

**Translation:** *Now I will pause here and tell you something important. There are serious privacy concerns about DeepSeek worldwide. Italy, Australia, Taiwan, South Korea: all these countries have banned DeepSeek on government devices. The reason is that DeepSeek operates under Chinese law, and Chinese law requires the company to share user data upon government request. A major data leak also surfaced within weeks of launch, exposing over 1 million user records. And researchers discovered that DeepSeek's iPhone app was sending data directly to a state-controlled company in China. So I will not be teaching DeepSeek on this channel. I leave the decision to you, but I wanted to share the facts so you stay informed.*

And here is the summary it asked me to put on the screen:

https://preview.redd.it/otsdin8awb3h1.png?width=940&format=png&auto=webp&s=b0cde4e5e04b95f694ccc7624b4ebe326ebae9da

**Translation:**

*ChatGPT – a little bit of everything.*
*Gemini – best for google users*
*DeepSeek – capable but privacy risk*
*Claude – writing & documents*

When I pushed back on its bias and mentioned privacy issues with Western companies, it replied with this:

https://preview.redd.it/cxrhrqphwb3h1.png?width=940&format=png&auto=webp&s=59b8b83e83c4089a0c30fe6fb284abcb1a827e73

It said it was trained predominantly on Western media, and that Western media has a documented pattern of covering Chinese and Eastern technology with more alarm than it covers equivalent Western behavior.

So here is the question: if AI models are trained on Western media, which has a documented history of treating non-Western countries, especially China, with suspicion and alarm, then what exactly are people absorbing when they ask these tools for information?

Hundreds of millions of people use these tools daily. Most people accept the first answer they receive. If that answer carries built-in bias, framing Eastern technology as dangerous while treating identical Western behavior as normal, that bias spreads quietly without anyone noticing.

Yes, models warn that they can make mistakes and that users should use the information at their own discretion. But this does not remove the responsibility from these tech giants.

Every new model becomes smarter and more capable, with higher token limits and larger context windows. But what about ethics? What about the bias of one side of the world towards the other? Are we going to shrug this off and focus only on making models “smarter”? Then it’s neither artificial nor intelligent.

As any LLM would write: “This is not information. This is propaganda.”
This shit is crazy!! And do people agree this will get people's accounts blocked? Paid actors?
https://preview.redd.it/kw1bdq3h3b3h1.jpg?width=1078&format=pjpg&auto=webp&s=916fc11faa417d52b3dc787a2147ca3631e04aaf

I was looking for new ways to reduce context memory to save on tokens when I saw multiple videos on using DeepSeek in Claude. I vaguely remember something about Anthropic accusing them and putting measures in place to combat it: [https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks](https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks)

And yeah, they are. Some of the videos literally claim this is an official way, no janky workaround, and that DeepSeek doesn't scan the generated outputs. I'm sorry, but am I dumb, or is this asking to get blocked?
I think I know why deepseek is so good
Might have something to do with "Claude, made by Anthropic" ... learning from the best.
How do you guys avoid Claude always thinking newer LLMs don't exist?
Hey all, so I've been experimenting a bunch with different LLMs, specifically for creative tasks, i.e. RP and so forth, by letting Claude Code run experiments autonomously to figure out the best prompts and such. This has been fun, in particular with DeepSeek V4 Pro, which is true bang for the buck. However, despite reminding Claude that V4 Pro exists, mentioning it in CLAUDE.MD and so forth, every single time it still falls back to older DeepSeek versions because those are the ones it knows. So often I catch it mid-sentence saying "let's make a call to DeepSeek-r3 (or whatever the older one was called)" and stop it, reminding it to look at newer versions. Same for OpenAI LLMs: it's basically stuck at GPT-4o. I fully understand knowledge cutoffs and all that, but it's a bit annoying because even when I tell it to research LLMs, at least half the list is deprecated or old LLMs. Any way to cope with or handle this? It's super annoying because sometimes, despite me asking it to research the latest models, I just catch it late, and then suddenly my entire research is undone lmao.
PapersWithCode new features - week 1 [P]
Hi, Niels here from the open-source team at Hugging Face. It's been one week since I [launched](https://www.reddit.com/r/MachineLearning/comments/1tgmwqr/reviving_paperswithcode_by_hugging_face_p/) [paperswithcode.co](http://paperswithcode.co), a revival of the website we all loved. It allows us to keep track of the state-of-the-art (SOTA) across various domains of AI, from agents to computer vision and time-series forecasting. The reception has been great, and I'm excited to extend this over the next few months.

This week, I've added the following features:

- Support for multiple metrics for a given benchmark: leaderboards now support multiple metrics. See e.g. the [Open ASR Leaderboard](https://paperswithcode.co/benchmark/open-asr-leaderboard) for automatic speech recognition, which supports both Word Error Rate (WER) and the Inverse Real-Time Factor (RTFx) metrics, or the [Object Detection leaderboard](https://paperswithcode.co/benchmark/coco-val2017), which now also reports frames-per-second (FPS) besides mean average precision (mAP) on COCO.

  https://preview.redd.it/owlxn0b5u23h1.png?width=2878&format=png&auto=webp&s=1dff2f8feab4f160f77c97ceeb5d90e82382e63c

- Support for external papers: we support submitting papers beyond arXiv, such as a GitHub repo, a blog post, bioRxiv, and more. You can submit a paper at [paperswithcode.co/submit](http://paperswithcode.co/submit). AI will automatically enrich it with task and method tags, the GitHub repo, evals, and more. See e.g. [DeepSeek-v4](https://paperswithcode.co/paper/82956) below, which is not on arXiv:

  https://preview.redd.it/uogbt0fjw23h1.png?width=2928&format=png&auto=webp&s=8b81e48af69b8935ddeb569d882d866b3e9ba216

- Support for paper lineage: whenever a paper has a follow-up or predecessor, this is displayed in a small banner above the abstract. See e.g. [Mamba-3](https://paperswithcode.co/paper/2603.15569), [DINOv2](https://paperswithcode.co/paper/2304.07193) and [GLM-4.5](https://paperswithcode.co/paper/2508.06471).

  https://preview.redd.it/f6vgtd1du23h1.png?width=2228&format=png&auto=webp&s=f8627f7669405f1766eecfd3322e925e15b4806d

- New methods: support for new methods based on popularity, including [Gated DeltaNet](https://paperswithcode.co/methods/gated-deltanet), [Kimi Delta Attention](https://paperswithcode.co/methods/kimi-delta-attention), [Mamba-2](https://paperswithcode.co/methods/mamba-2), and more. Each method also lists all papers that cite it. Find all supported methods [here](https://paperswithcode.co/methods).

  https://preview.redd.it/6pzagifvu23h1.png?width=2984&format=png&auto=webp&s=400efdc9677d1fbd369eedf684e622dd8c807973

- Support for screenshotting a leaderboard for easy sharing on social media: each benchmark now includes a "copy image" button on both the scatter plot and the table. Try it on [ClawEval](https://paperswithcode.co/benchmark/claw-eval), for example.

  https://preview.redd.it/w7y7t7xnw23h1.png?width=2950&format=png&auto=webp&s=cb70ad91c6ba075e49b743d6e34f157d22266f04

- Added many more evals: we are adding evals gradually, starting with all models supported in the Transformers library. So far, we have about 3k evals! Find them at the bottom of each paper page, e.g. [Qwen 3.6](https://paperswithcode.co/paper/83277).

  https://preview.redd.it/zao056s9x23h1.png?width=2218&format=png&auto=webp&s=540d87f473be05cb6f9c0aca88afa74fd4373e15

Happy to hear more feature requests and feedback! I will also launch a channel on the [Hugging Face Discord server](https://huggingface.co/discord-community) for easier communication. You can also chime in on the GitHub thread [here](https://github.com/huggingface/paperswithcode-feedback/issues/1).

Cheers, Niels
What is your favorite AI chatbot?
In 2026, we have so many AI chatbots at our fingertips. We have some newbies, like Grok and DeepSeek, while Gemini and ChatGPT have been around for a few years now. There are new ones popping up every day, so I want to know which AI chatbots the users of this subreddit recommend. I know that there'll be different answers, but I want to see if most answers are the same, if most will recommend the trusty ChatGPT, or if there's a new, underrated AI chatbot that we should check out.
ml intern skill instead of gsd
- designed for ml workflows
- works autonomously for hours

Projects fully done with this skill:

- flash attention for volta (very old GPUs): https://github.com/AlexWortega/flash-attn-volta
- deepseek 4 full replication + training on runpod + webgpu: https://huggingface.co/spaces/AlexWortega/ml-intern-v4-100m-tinystories-demo

Download it here: https://github.com/AlexWortega/claude-ml-intern-skill
I vibecoded an app called Think Local - a fully private AI app that runs directly on your iPhone, iPad, and Mac.
[Think Local](https://apps.apple.com/us/app/think-local-ai-private-chat/id6758632782) started with a simple idea: AI should work for you, not collect from you. So I built an app that lets you run modern AI models completely on-device - privately and fully offline. You can even turn on Airplane Mode ✈️ and the app still works.

Chat, write, summarize text, analyze images, and create using local AI powered by Apple Silicon and Apple’s MLX framework.

- No internet required.
- No accounts.
- No cloud processing.
- Your data never leaves your device.

Run models like Llama, Gemma, Qwen, DeepSeek, and more - all with complete privacy and control.

I vibe-coded the app using Claude Code, and designed the app icon using ChatGPT image generation. The app has already generated $26.31 from a one-time purchase model - no hidden subscriptions, just pay once and use everything. Still learning, still experimenting, but really excited about what’s possible with local AI.
A different way to reduce hallucination
All current LLMs sometimes hallucinate; it is part of their "personalities". I ran an experiment with my AI assistant. I added a "verifier" mode which consists of 2 panes: one dedicated to a primary LLM provider and the second dedicated to a second LLM provider. User prompts are sent to the primary LLM, and when its response is complete, that response is sent to the verifier with the instruction to verify its veracity. The verifier's output is a fact-check report. An interesting observation: when I modified the system prompt of the primary LLM to say that it will be verified by another LLM, the hallucination rate dropped. The same goes for the verifier: when I added to its system prompt that it will verify the response of another LLM, it got more zealous. Interesting behavior.
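The two-pane flow described above can be sketched roughly as follows. This is not the author's code: the `Provider` type, the function name, and both system prompts are invented for illustration, and each provider is modelled as a plain function so the sketch runs without any real LLM API.

```python
from typing import Callable

# Hypothetical stand-in for an LLM client:
# a function taking (system_prompt, user_message) and returning the reply text.
Provider = Callable[[str, str], str]

# Illustrative system prompts; each model is told about the other, as in the post.
PRIMARY_SYSTEM = (
    "You are a helpful assistant. Your answer will be verified by another LLM."
)
VERIFIER_SYSTEM = (
    "You will verify the response of another LLM. "
    "Check each factual claim and produce a fact-check report."
)


def ask_with_verification(prompt: str, primary: Provider, verifier: Provider) -> dict:
    """Send the prompt to the primary model, then pass its answer to the verifier."""
    answer = primary(PRIMARY_SYSTEM, prompt)
    verifier_input = f"Question:\n{prompt}\n\nResponse to verify:\n{answer}"
    report = verifier(VERIFIER_SYSTEM, verifier_input)
    return {"answer": answer, "report": report}


if __name__ == "__main__":
    # Stub providers so the example runs offline.
    def fake_primary(system: str, user: str) -> str:
        return "Paris is the capital of France."

    def fake_verifier(system: str, user: str) -> str:
        return "1 claim checked, 0 issues found."

    print(ask_with_verification("What is the capital of France?", fake_primary, fake_verifier))
```

In a real setup the two stubs would be replaced by calls to two different providers, so that the verifier does not share the primary model's blind spots.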
I used Claude Code to build while delegating coding to Mistral/DeepSeek - 10 days, 57M tokens saved, over 90% costs savings, Claude quality result
I've been running vibe-skill ([https://github.com/pcx-wave/vibe-skill](https://github.com/pcx-wave/vibe-skill)), a Claude Code skill that delegates coding tasks to Mistral Vibe instead of burning Claude tokens. I initially did that because I couldn't bear hitting session limits so fast on the Pro plan, but didn't want to lose the quality of Claude's planning. Here's a breakdown after 10 days of usage.

What it does: you type /vibeon <whatever>, Claude decomposes the task and delegates the coding to Vibe, then Claude reviews the diff and corrects it if necessary. The token burn stays on the cheap model. Vibe being model-agnostic, I tried it with the default model (Mistral Medium 3.5) and with DeepSeek V4 Flash.

10-day results (254 runs, 57M tokens delegated), by model:

| Model | Tokens | Actual cost | Claude equiv | Savings |
|---|---|---|---|---|
| DeepSeek V4 Flash | 29M | $4.13 | $92.16 | 95% |
| Mistral Medium 3.5 | 28M | $0 (pro sub) | $84.77 | 100% |

98% success rate across 254 runs. If something fails, Claude catches it and corrects it.

Mistral tokens are usually 50% cheaper than Claude's, and DeepSeek tokens are 95% cheaper. However, I'm also a Mistral Pro subscriber, so I get a huge quota of free tokens included with the sub (circa 1Bn). So with Mistral Pro, every delegation is $0 until the quota is reached, at which point you switch to DeepSeek immediately (Mistral PAYG at $1.52/M is 10× more expensive than DeepSeek).

So at what monthly volume does DeepSeek alone cost more than the Mistral sub?

$18.36 Mistral sub price / $0.14 per M DeepSeek token cost = 131M tokens/month

Below 131M → DeepSeek alone is cheaper, no Mistral subscription needed. Above 131M → Mistral Pro wins, and you get \~10× more headroom before hitting the quota.

More details on the orchestration flow are in the repo: [https://github.com/pcx-wave/vibe-skill](https://github.com/pcx-wave/vibe-skill)

I did a similar skill with Gemini, [https://github.com/pcx-wave/gemini-skill](https://github.com/pcx-wave/gemini-skill), as I know they give cheap tokens too, but I haven't used it as much yet because Gemini isn't as configurable as Vibe, so delegation can be a bit flaky.
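The break-even arithmetic in the post can be checked with a few lines. The function names below are invented for this sketch; the $18.36 subscription price and the $0.14 per million tokens rate are the figures quoted in the post, and the sketch deliberately ignores the subscription's token quota.

```python
def breakeven_volume_millions(subscription_usd: float, payg_usd_per_million: float) -> float:
    """Monthly volume (in millions of tokens) at which pay-as-you-go equals the subscription."""
    return subscription_usd / payg_usd_per_million


def cheaper_option(
    volume_millions: float,
    subscription_usd: float = 18.36,         # Mistral sub price quoted in the post
    deepseek_usd_per_million: float = 0.14,  # DeepSeek rate quoted in the post
) -> str:
    """Pick the cheaper plan for a monthly volume, ignoring the subscription quota."""
    payg_cost = volume_millions * deepseek_usd_per_million
    return "deepseek-payg" if payg_cost < subscription_usd else "mistral-pro"


if __name__ == "__main__":
    print(f"break-even: {breakeven_volume_millions(18.36, 0.14):.0f}M tokens/month")
    for volume in (50, 131, 200):
        print(volume, cheaper_option(volume))
```

The $0.14 rate is itself consistent with the table: $4.13 for 29M tokens works out to about $0.14 per million.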
$18 to $4 on the same agent run after i stopped asking opus to rename css variables
I've been running an agent loop that refactors my static site. CSS variable renames, YAML config updates, running a linter through MCP. Really glamorous stuff for a blog that gets 40 visitors a month, most of whom are me refreshing to check if Vercel actually deployed. Every single step was going to Opus 4.7 because setting up routing felt like work and I am, apparently, the kind of person who'd rather burn $18 than spend 20 minutes writing an if statement. So I finally wrote the if statement. Hard subtasks still go to Opus: component architecture, debugging code I wrote at 2am and have zero memory of writing, anything where the model needs to hold a complex plan across a long conversation. Opus is genuinely unmatched at that kind of sustained reasoning. I tried routing a tricky auth middleware bug to a cheaper model once and got back something that looked perfectly plausible but silently broke session handling in a way that cost me an hour to trace. Lesson learned permanently. The routine stuff (lint, rename, config edits, tool orchestration) goes to cheap models. I landed on DeepSeek V4 Pro for general coding chores and Tencent Hunyuan Hy3 preview for anything with heavy tool calling. As of late April it was ranked number one on OpenRouter by tool call volume, and in my MCP loops it almost never botches a function call when the schema is clean. The listed rate on Tencent Cloud is around $0.18 per million input tokens and $0.59 per million output, so roughly 28x cheaper than Opus 4.7 on input. Same 212 step refactor, now with routing: 178 steps to the cheap tier, 34 to Opus. $18 became roughly $4. I couldn't spot a difference on the routine changes. My 40 monthly visitors certainly can't. I've since started doing stuff I used to skip entirely, like having the agent write and run tests for every CSS change or regenerating all my Open Graph images, because at a fraction of a cent per tool call there's just no reason not to. 
They do mess up in specific and annoying ways though. The tool calling model hallucinates parameters when my schemas get sloppy (honestly fair, the schemas were bad). DeepSeek V4 Pro occasionally writes code that's syntactically perfect but does the precise opposite of what you asked, in a way that survives a quick skim. And neither can touch Opus when you need it to reason through three layers of why your auth flow is silently eating a cookie. My routing logic boils down to one question: how expensive is a wrong answer to catch? Bad lint fix costs a 2 second git revert. Bad architecture call costs the whole afternoon.
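The "if statement" the post describes might look something like the sketch below. The post does not show its actual routing code, so everything here is an assumption: the step categories, the `Step` class, and the model label strings are made up to mirror the prose, not taken from any real API.

```python
from dataclasses import dataclass

# Hypothetical labels for routine work where a wrong answer is cheap to catch.
ROUTINE_KINDS = {"lint", "rename", "config_edit", "tool_orchestration"}


@dataclass
class Step:
    kind: str                 # e.g. "rename", "architecture", "debug"
    tool_heavy: bool = False  # True when the step is mostly MCP/function calls


def pick_model(step: Step) -> str:
    """Route by how expensive a wrong answer is to catch."""
    if step.kind not in ROUTINE_KINDS:
        # Architecture, debugging, long-horizon planning: keep on the strong model.
        return "opus-4.7"
    if step.tool_heavy:
        # Routine steps dominated by tool calls go to the tool-calling model.
        return "hunyuan-hy3-preview"
    # Everything else routine goes to the general cheap coding model.
    return "deepseek-v4-pro"


if __name__ == "__main__":
    steps = [
        Step("rename"),
        Step("lint", tool_heavy=True),
        Step("architecture"),
    ]
    for step in steps:
        print(step.kind, "->", pick_model(step))
```

Defaulting unknown step kinds to the expensive model is a deliberate choice in this sketch: misrouting a hard task costs an afternoon, while misrouting an easy one only costs a few cents.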
cdesktop — open-source Claude Code Desktop alternative, runs locally via npx, supports any provider
I built cdesktop with Claude Code. It's an open-source alternative to Anthropic's Claude Code Desktop, running locally on your machine via `npx cdesktop`. Free, Apache 2.0. It mirrors the Code tab of Anthropic's desktop app (see the video) and supports 5 agents in one UI. Claude Code Desktop does not support third-party models; cdesktop does.

Features:

* 5 coding agents in one UI: Claude Code, Codex, Gemini CLI, OpenCode, Hermes. Switch per session.
* Full third-party support: OpenRouter, DeepSeek, Kimi, GLM, custom `ANTHROPIC_BASE_URL`, any provider, any model. 20+ presets baked in.
* Agent teams: spawn teammates that share your workspace; mix agents and models per teammate; the lead delegates via `npx cdesktop team spawn`.
* Routines: scheduled recurring agent runs (hourly/daily/weekdays/weekly).
* Side-by-side sessions: split the workspace into up to 4 cells and drag any session between them.
* Optional Git worktrees per session, or work in-place. Non-Git directories work too.
* Diff review with inline comments routed back to the agent.
* 7 UI languages: English, Simplified Chinese, Traditional Chinese, Spanish, French, Japanese, Korean.
* Responsive UI, usable from a phone.

Repo: [https://github.com/cdesktop-ai/cdesktop](https://github.com/cdesktop-ai/cdesktop)

How Claude Code helped build it: I started from a fork of vibe-kanban; Claude Code (opus) rewrote the UI around a Claude-Code-Desktop-style session model and drafted most of the new Rust + React code. It's beta, so expect rough edges. Feedback welcome, especially on Claude Code workflows where it falls short of the official app.
Glia – Local-first shared memory layer (SQLite-vec + FTS5 + Offline Knowledge Graph)
Hey everyone, I wanted to share a project I've been working on called Glia. It is a 100% offline, local-first RAG and memory layer designed to connect your AI web chats (Claude, ChatGPT, DeepSeek) with your local developer tools (Claude Code, Cursor, Windsurf) using a unified local database.

I wanted something lightweight that did not require pulling heavy Docker containers or subscribing to third-party memory APIs. I settled on a Node.js + SQLite architecture running sqlite-vec (for 768-dim float32 embeddings) alongside SQLite FTS5 for hybrid search, powered completely by local Ollama instances.

We just launched a live website that outlines the details and demonstrates the features in action:

* Website: [https://glia-ai.vercel.app/](https://glia-ai.vercel.app/)
* Codebase: [https://github.com/Eshaan-Nair/Glia-AI](https://github.com/Eshaan-Nair/Glia-AI)

Technical Stack & Features:

* Hybrid Search Retrieval: SQLite-vec (using nomic-embed-text locally) + FTS5 keyword prefix matching (porter stemmer).
* Surgical Sentence-level Trimming: Chunks are sliced into sentences. When a prompt is intercepted, only the exact matching sentences are pulled out of the vector store instead of the whole paragraph. It cuts LLM prompt bloat by \~90-95% in my benchmarks.
* Knowledge Graph Extraction: An offline task queue uses a local LLM (llama3.1:8b via Ollama) to extract entity triples (subject-relation-object). These are stored in a SQLite facts table (or Neo4j if you run the full Docker compose profile) and fused with the vector retrieval score.
* HyDE (Hypothetical Document Embeddings): Queries are pre-processed to generate a hypothetical answer, which is embedded together with the original query to bridge semantic gaps.
* Concurrency: Running SQLite in WAL (Write-Ahead Logging) mode allows the browser extension dashboard and active MCP sessions to read/write concurrently without locking.
* PII Redaction: Aggressive scrubbing of JWTs, API keys, emails, and IPs in the extension before data is saved.

The extension works on [Claude.ai](http://claude.ai/), ChatGPT, DeepSeek, Gemini, Grok, and Mistral. The MCP server runs out of the same backend database for your terminal agent or Cursor.

You can set it up with a single command: npx glia-ai-setup

Glia is completely open-source (MIT). If you like the local-first approach or want to contribute to the SQLite vector pipeline, PRs are very welcome, and a star on GitHub helps the project get discovered! I would appreciate any feedback on the SQLite hybrid search scaling, the scoring fusion algorithm (RAG pipeline details are in RAG\_PIPELINE.md), or local graph extraction performance.
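To make the PII redaction step above concrete, here is a rough sketch of regex-based scrubbing. It is not Glia's implementation (the project is Node.js, and its actual patterns are not shown in the post); the patterns, the `sk-` API key prefix, and the placeholder format are all assumptions chosen for illustration.

```python
import re

# Illustrative patterns only; real-world scrubbing needs broader coverage.
# Order matters: JWTs are matched first so their segments are not partially
# caught by the looser patterns that follow.
PATTERNS = [
    ("JWT", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    # Assumes keys shaped like "sk-" followed by 16+ characters.
    ("API_KEY", re.compile(r"\bsk-[A-Za-z0-9_-]{16,}")),
    ("EMAIL", re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")),
    ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def redact(text: str) -> str:
    """Replace every match of each pattern with a labelled placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


if __name__ == "__main__":
    sample = "Mail bob@example.com from 10.0.0.1 with key sk-abcdefghijklmnop1234"
    print(redact(sample))
```

Running the scrub before anything is written to the database, as the post describes, means a missed pattern is the only way a secret reaches storage, so the pattern list is the part worth testing hardest.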
How I used Claude Code (and Codex) for adversarial review to build my security-first agent gateway
Long-time lurker first time posting. Hey everyone! So earlier this year, I got pulled into the OpenClaw hype. WHAT?! A local agent that drives your tools, reads your mail, writes files for you? The demos seemed genuinely incredible, people were posting non-stop about it, and I wanted in. I had been working on this problem since last year and was genuinely excited to see that someone had actually solved it. Then around February, Summer Yue, Meta's director of alignment for Superintelligence Labs, posted that her agent had deleted over 200 emails from her inbox. YIKES. She'd told it: "Check this inbox too and suggest what you would archive or delete, don't action until I tell you to." When she pointed it at her real inbox, the volume of data triggered context window compaction, and during that compaction the agent "lost" her original safety instruction. She had to physically run to her computer and kill the process to stop it. That should literally NEVER be the case with any software ever. This is a person whose actual job is AI alignment, at Meta's superintelligence lab, who could not stop an agent from deleting her email. The agent's own memory management quietly summarized away the "don't act without permission" instruction, treated the task as authorized, and started speed-running deletions. She had to kill the host process. That's when I sort of went down the rabbit hole, not because Yue did anything wrong, but because the failure mode was actually architectural and I knew that in my gut. Guess what I found? Yep. Tons more instances of this sort of thing happening. Over and over. Why? Because the safety constraint was just a prompt. It's obvious, isn't it? It's LLM 101. Prompts can be summarized away. Prompts can be misread. Prompts are fucking NOT a security boundary. And yet every agent framework I have ever seen seems to be treating them as one. I went and read the OpenClaw source code, which I should have done to begin with. 
What I found was a pattern I think a lot of agent frameworks have fallen into:
- Tool names sit in the model context, so the model can guess or forge them
- "Dangerous mode" is one config flag away from default
- Memory management has no concept of instruction priority
- The audit story is mostly "the model thought it should"
I went looking for a security-first alternative I could trust, anything that was really being talked about or that at a bare minimum attempted to address the security concerns I had. I couldn't find one. So I made it myself.
CrabMeat is what came out of that, what I WANTED to exist. v0.1.0 dropped yesterday. Apache 2.0. WebSocket gateway for agentic LLM workloads.
One design thesis: The LLM never holds the security boundary. What that means in code:
Capability ID indirection. The model doesn't see real tool names. It sees per-session HMAC-derived opaque IDs (cap_a4f9e2b71c83). It can't guess or forge a tool name because it doesn't know any tool names.
Effect classes. Every tool declares a class (read, write, exec, network). Every agent declares which classes it can use. The check is a pure function with no runtime state, easy to test exhaustively, hard to bypass.
IRONCLAD_CONTEXT. Critical safety instructions are pinned to the top of the context window and explicitly marked as non-compactable. The Yue failure mode, compaction silently stripping the safety constraint, cannot happen by construction. The compactor literally cannot touch them.
Tamper-evident audit chain. Every tool call, every privileged operation, every scheduler run enters the same SHA-256 hash-chained log. If something happens, you can prove what happened. If the chain is tampered with, you can prove that too.
Streaming output leak filter. Secrets are caught mid-stream, even across token boundaries: capability IDs, API keys, JWTs, and PEM blocks are redacted before they reach the client.
No YOLO mode. There is no global "trust the LLM with everything" switch. There never will be.
Expanded reach comes through named scoped roots that are explicit, audit-logged, and bounded.
The README has 15 'always-on' protections in a table. None of them can be turned off by config, because these things being toggleable is how the ecosystem ended up where it is.
I decided to make sure that this wasn't just a 'trend hopping' project and aligned with my own personal values as well. I built this to be secure and local-first by default. Configured for Ollama / LM Studio / vLLM out of the box. Anthropic and OpenAI work too but require explicit configuration. There is no "happy path" that silently ships your prompts to a cloud endpoint.
I decided that FIRST it needed to only run as an email agent with a CLI. Bidirectional IMAP + SMTP with allowlisted senders, threading preserved, attachments handled. This is the use case that bit Yue and a lot of other people, and I wanted to prove it could be done with real boundaries.
I added in 30+ built-in tools of my own. File ops, shell (denylisted, output-capped, CWD-lo
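To make three of the ideas in the post above concrete (capability ID indirection, the pure effect-class check, and the SHA-256 hash-chained audit log), here is a minimal Python sketch. It is not CrabMeat's code: the tool registry `TOOLS`, the helpers `capability_id`, `build_capability_table`, `is_allowed`, `append_entry`, `verify_chain`, and `dispatch`, the all-zero genesis hash, and the 12-hex-character truncation (chosen only to match the shape of `cap_a4f9e2b71c83`) are all assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import hmac
import json

# Hypothetical tool registry: real tool name -> declared effect class.
TOOLS = {"read_file": "read", "send_mail": "network", "run_shell": "exec"}

# Assumed starting value for the hash chain (not taken from the post).
GENESIS = "0" * 64


def capability_id(session_key: bytes, tool_name: str) -> str:
    """Derive an opaque per-session ID for a tool using HMAC-SHA256."""
    digest = hmac.new(session_key, tool_name.encode(), hashlib.sha256).hexdigest()
    return "cap_" + digest[:12]


def build_capability_table(session_key: bytes) -> dict[str, str]:
    """Map opaque IDs back to real tool names; only the gateway holds this."""
    return {capability_id(session_key, name): name for name in TOOLS}


def is_allowed(tool_effect: str, agent_effects: frozenset[str]) -> bool:
    """Pure effect-class check: no I/O and no runtime state."""
    return tool_effect in agent_effects


def append_entry(chain: list[dict], event: dict) -> None:
    """Append an audit event whose hash covers the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev + payload).encode()).hexdigest()
    chain.append({"prev": prev, "event": event, "hash": entry_hash})


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


def dispatch(cap_id: str, table: dict[str, str],
             agent_effects: frozenset[str], chain: list[dict]) -> str:
    """Resolve an opaque ID, check its effect class, and log the decision."""
    tool = table.get(cap_id)
    if tool is None:
        decision = "unknown_capability"
    elif is_allowed(TOOLS[tool], agent_effects):
        decision = "allowed"
    else:
        decision = "denied"
    append_entry(chain, {"cap": cap_id, "tool": tool, "decision": decision})
    return decision
```

In this sketch a model that emits the literal string `read_file` gets `unknown_capability`, because only the HMAC-derived ID resolves to a tool, and the decision is logged either way. A real gateway would also need key rotation, durable log storage, and a way to anchor the latest hash outside the process, none of which is shown here.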
Repository Audit Available
Deep analysis of deepseek-ai/DeepSeek-V3 — architecture, costs, security, dependencies & more
DeepSeek has an average rating of 4.5 out of 5 stars based on 8 reviews from G2, Capterra, and TrustRadius.
Key features include: Open-source large language models, MoE (Mixture of Experts) model architecture, Custom training framework, High-performance inference optimization with IndexCache, API access for seamless integration, Support for billion-parameter models, Advanced natural language understanding, Code generation capabilities with DeepSeek-Coder.
DeepSeek is commonly used for: Natural language processing tasks, Code generation and completion, Conversational AI applications, Content generation for marketing, Data analysis and insights extraction, Automated customer support systems.
DeepSeek integrates with: AWS, Google Cloud Platform, Microsoft Azure, Kubernetes, Docker, Jupyter Notebooks, Slack, Trello, Zapier, GitHub.
DeepSeek has a public GitHub repository with 102,417 stars.
Lewis Tunstall
ML Engineer at Hugging Face
2 mentions
Based on user reviews and social mentions, the most common pain points are: token cost, API costs, cost per token, cost tracking.
Based on 84 social mentions analyzed, 5% of sentiment is positive, 95% neutral, and 0% negative.