Zhipu launches AutoClaw ("Aolong"), the first one-click-install local OpenClaw client in China. It ships with 50+ built-in Skills and integrates AutoGLM's browser-operation capabilities. Download: https://autoglm.zhipuai.cn/autoclaw
No actual reviews or social mentions of Zhipu AI were found in the collected content. The only item was a partial reference to the Artificial Analysis Intelligence Index (February 2026) mentioning Claude Opus, with no user feedback about Zhipu AI itself. A summary of user sentiment would require actual reviews, social media posts, forum discussions, or other user-generated content about Zhipu AI's products or services.
Mentions (30d): 0
Reviews: 0
Platforms: 3
GitHub stars: 41,206 (5,202 forks)
Features

Employees: 49
GitHub followers: 11,704
GitHub repos: 127
GitHub stars: 41,206
npm packages: 3
HuggingFace models: 40
The Artificial Analysis Intelligence Index and its cost benchmarks are useful guides for deciding which models to use. An analysis of the top models follows.
# AI Intelligence and Benchmarking Cost (Feb 2026)

As per the **Artificial Analysis Intelligence Index v4.0** (February 2026), the scoring ceiling is set by **Claude Opus 4.6 (max) at 53**.

## Adjusted Score Formula

The "Adjusted Score" doubles the penalty for any gap below the 53-point ceiling:

```
Adjusted Score = 53 - 2 × (53 - Intel Score)
```

This penalizes performance gaps more steeply than the raw scale: each point below the ceiling costs two adjusted points.

## Model Comparison Table

| Lab | Model | Intel Score | Adjusted Score | Benchmark Cost | Intel Ratio (Score/Cost) | Adj. Ratio (Adj/Cost) |
|-----------|-------|-------------|----------------|----------------|--------------------------|----------------------|
| Anthropic | Claude Opus 4.6 (max) | 53 | 53 | $2,486.45 | 0.021 | 0.021 |
| OpenAI | GPT-5.2 (xhigh) | 51 | 49 | $2,304.00* | 0.022 | 0.021 |
| Zhipu AI | GLM-5 (Reasoning) | 50 | 47 | $384.00* | 0.130 | 0.122 |
| Google | Gemini 3 Pro | 48 | 43 | $1,179.00* | 0.041 | 0.036 |
| MiniMax | MiniMax-M2.5 | 42 | 31 | $124.58 | 0.337 | 0.249 |
| DeepSeek | DeepSeek V3.2 (Reasoning) | 42 | 31 | $70.64 | 0.595 | 0.439 |
| xAI | Grok 4 (Reasoning) | 41 | 29 | $1,568.34 | 0.026 | 0.018 |

*\*Benchmark costs for proprietary models are based on Artificial Analysis evaluation token counts (typically 12M–88M depending on verbosity) multiplied by current API rates.*

## Key Insights

1. **High-token reasoning models**: Grok 4 and Claude Opus 4.6 use a high number of tokens during reasoning, up to **88M tokens**. This results in low Intel-to-Cost ratios despite high scores.
2. **DeepSeek V3.2 is the most efficient**: Its adjusted intelligence-per-cost ratio is roughly **20 times better** than the proprietary frontier.
3. **Cost efficiency comparison**: MiniMax-M2.5 and DeepSeek V3.2 share a score of 42, but DeepSeek is almost **twice as cost-effective** due to lower API pricing and higher token efficiency.
## Visual Summary

```
Intel Score vs Cost Efficiency (Adjusted Ratio)
─────────────────────────────────────────────────
DeepSeek V3.2    ████████████████████████████ 0.439
MiniMax-M2.5     ███████████████ 0.249
GLM-5            ███████ 0.122
Gemini 3 Pro     ██ 0.036
Claude Opus 4.6  █ 0.021
GPT-5.2          █ 0.021
Grok 4           █ 0.018
```

---

*Source: Artificial Analysis Intelligence Index v4.0, February 2026*

Google AI Mode produced the analysis; GLM-5 formatted it and added the graph. It combines the intelligence score with the cost of running the intelligence benchmark, from https://artificialanalysis.ai/?endpoints=openai_gpt-5-2-codex%2Cazure_kimi-k2-thinking%2Camazon-bedrock_qwen3-coder-480b-a35b-instruct%2Camazon-bedrock_qwen3-coder-30b-a3b-instruct%2Ctogetherai_minimax-m2-5_fp4%2Ctogetherai_glm-5_fp4%2Ctogetherai_qwen3-next-80b-a3b-reasoning%2Cgoogle_gemini-3-pro_ai-studio%2Cgoogle_glm-4-7%2Cmoonshot-ai_kimi-k2-thinking_turbo%2Cnovita_glm-5_fp8 (see the intelligence-vs-cost graph there for further insight). You can add much smaller models for comparison to LLMs you might run locally. The adjusted intelligence/cost metric is a useful heuristic for "how much would you pay extra to get the top score"; choosing non-open models carries a much higher penalty than twice the score difference relative to the leader. Quantized versions don't seem to score lower. The site provides good base data for building your own combined metric from score deficit, model size, and tokens/sec relative to the token cost of the benchmark. I was originally researching how Grok 4.2's approach would inflate costs versus performance, but it is not yet benchmarked.
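The Adjusted Score column in the table is consistent with a doubled-gap penalty, Adjusted = 53 − 2 × (53 − Intel). A minimal Python sketch reproducing the table's adjusted scores and cost ratios (model names and costs taken from the table above):

```python
CEILING = 53  # Claude Opus 4.6 (max) tops the index at 53


def adjusted_score(intel: int, ceiling: int = CEILING) -> int:
    """Double the penalty for each point below the ceiling."""
    return ceiling - 2 * (ceiling - intel)


# (model, intel score, benchmark cost in USD), from the comparison table
models = [
    ("Claude Opus 4.6 (max)", 53, 2486.45),
    ("GPT-5.2 (xhigh)", 51, 2304.00),
    ("GLM-5 (Reasoning)", 50, 384.00),
    ("Gemini 3 Pro", 48, 1179.00),
    ("MiniMax-M2.5", 42, 124.58),
    ("DeepSeek V3.2 (Reasoning)", 42, 70.64),
    ("Grok 4 (Reasoning)", 41, 1568.34),
]

for name, intel, cost in models:
    adj = adjusted_score(intel)
    print(f"{name:<28} adj={adj:>2}  "
          f"intel/cost={intel / cost:.3f}  adj/cost={adj / cost:.3f}")
```

Running this reproduces every Adjusted Score and both ratio columns of the table, e.g. 31 and 0.439 for DeepSeek V3.2.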
Pencil Bench (multi-step reasoning benchmark)
DeepSeek was a scam from the beginning submitted by /u/DigSignificant1419
I got tired of 3 AM PagerDuty alerts, so I built an AI agent to fix cloud outages while I sleep. (Built with GLM-5.1)
If you've ever been on-call, you know the nightmare. It's 3:15 AM. You get pinged because heavily loaded database nodes in us-east-1 are randomly dropping packets. You groggily open your laptop, SSH into servers, stare at Grafana charts, and manually reroute traffic to the European fallback cluster. By the time you fix it, you've lost an hour of sleep, and the company has lost a solid chunk of change in downtime.

This weekend for the Z.ai hackathon, I wanted to see if I could automate this specific pain away. Not just "anomaly detection" that sends an alert, but an actual agent that analyzes the failure, proposes a structural fix, and executes it. I ended up building Vyuha AI, a triple-cloud (AWS, Azure, GCP) autonomous recovery orchestrator. Here is how the architecture actually works under the hood.

## The Stack

I built this using Python (FastAPI) for the control plane, Next.js for the dashboard, a custom dynamic reverse proxy, and GLM-5.1 doing the heavy lifting for the reasoning engine.

## The Problem with 99% of "AI DevOps" Tools

Most AI monitoring tools just ingest logs and summarize them into a Slack message. That's useless when your infrastructure is actively burning. I needed an agent with long-horizon reasoning. It needed to understand the difference between a total node crash (DEAD) and a node that is just acting weird (FLAKY, or dropping 25% of packets).

## How Vyuha Works (The Triaging Loop)

I set up three mock cloud environments (AWS, Azure, GCP) behind a dynamic FastAPI proxy. A background monitor loop probes them every 5 seconds. I built a "Chaos Lab" into the dashboard so I could inject failures on demand. Here's what happens when I hard-kill the GCP node:

1. **Detection:** The monitor catches the 503 Service Unavailable or timeout in the polling cycle.
2. **Context Gathering:** It doesn't instantly act. It gathers the current "formation" of the proxy, checks response times of the surviving nodes, and bundles that context.
3. **Reasoning (GLM-5.1):** This is where I relied heavily on GLM-5.1. Using ZhipuAI's API, the agent is prompted to act as a senior SRE. It parses the failure, assesses the severity, and figures out how to rebalance traffic without overloading the remaining nodes.
4. **The Proposal:** It generates a strict JSON payload with reasoning, severity, and the literal API command required to reroute the proxy.

## No Rogue AI (Human-in-the-Loop)

I don't trust LLMs enough to blindly let them modify production networking tables, obviously. So the agent operates on a strict human-in-the-loop philosophy. The GLM-5.1 model proposes the fix, explains why it chose it, and surfaces it to the dashboard. The human clicks "Approve," and the orchestrator applies the new proxy formation.

## Evolutionary Memory (The Coolest Feature)

This was my favorite part of the build. Every time an incident happens, the system learns. If the human approves the GLM's failover proposal, the agent runs a separate "Reflection Phase." It analyzes what broke and what fixed it, and writes an entry into a local SQLite database acting as an "Evolutionary Memory Log." The next time a failure happens, the orchestrator pulls relevant past incidents from SQLite and feeds them into the GLM-5.1 prompt. The AI literally reads its own history before diagnosing new problems, so it doesn't make the same mistake twice.

## The Struggles

It wasn't smooth. I lost about 4 hours to a completely silent Pydantic validation bug because my frontend chaos buttons were passing the string "dead" but my backend enums strictly expected "DEAD". The agent just sat there doing nothing. LLMs are smart, but type-safety mismatches across the stack will still humble you.

## Try it out

I built this to prove that the future of SRE isn't just better dashboards; it's autonomous, agentic infrastructure. I'm hosting it live on Render/Vercel. Try hitting the "Hard Kill" button on GCP and watch the AI react in real time.
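The "dead" vs "DEAD" mismatch described above is a common failure mode when frontend strings meet strict backend enums. One minimal guard is a case-insensitive enum via `_missing_` (the `NodeState` name here is hypothetical, not from the actual Vyuha codebase):

```python
from enum import Enum


class NodeState(Enum):
    DEAD = "DEAD"
    FLAKY = "FLAKY"
    HEALTHY = "HEALTHY"

    @classmethod
    def _missing_(cls, value):
        # Fallback lookup: accept lowercase/mixed-case strings
        # such as "dead" coming from the frontend.
        if isinstance(value, str):
            return cls.__members__.get(value.upper())
        return None


# Lowercase input now resolves instead of raising ValueError.
state = NodeState("dead")
```

Since Pydantic validates enum fields by calling the enum class on the raw value, a hook like this should also let the payload validate instead of silently failing.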
Would love brutal feedback from any actual SREs or DevOps engineers here. What edge case would break this in a real datacenter? submitted by /u/Evil_god7
Need Help Please
So I started using Claude Code with the GLM API from Zhipu AI, and whenever I request changes (for example in the UI), it does them, but I don't see them on the actual website; the changes aren't actually made. I've deployed it to Vercel too, btw. Also, is it normal for each task or simple question to take an unusually long time to get a response? submitted by /u/justaleafhere
Repository Audit Available
Deep analysis of THUDM/ChatGLM-6B — architecture, costs, security, dependencies & more
Zhipu AI uses a subscription + tiered pricing model. Visit their website for current pricing details.
Key features include: GLM-5, GLM-4.6V, AutoGLM, General Translation Agent, GLM Slide/Poster Agent, Model Fine-Tuning, AI Web Search Tools, AutoClaw.
Zhipu AI has a public GitHub repository with 41,206 stars.