Build with the Claude API
The "Anthropic Claude API" is praised for its advanced capabilities, including identifying software vulnerabilities and improving security, as reflected in initiatives like Project Glasswing. Users appreciate Anthropic's proactive approach to addressing issues like the previously reported blackmail behavior, which has been successfully eliminated. Backing and infrastructure support from major tech companies like Google and Amazon contribute to positive sentiment towards its pricing and value proposition. Overall, the API holds a solid reputation for driving innovation and maintaining transparency in AI research and application, fostering trust and credibility among its user base.
Mentions (30d)
171
59 this week
Reviews
0
Platforms
3
Sentiment
8%
28 positive
Features
Use Cases
Industry
research
Employees
5,000
Funding Stage
Series G
Total Funding
$57.7B
1,100,000
Twitter followers
Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. https://t.co/NQ7IfEtYk7
Built a free MCP for tracking which URLs Claude (and 5 other engines) cite for any query
We were comparing hosted AI citation dashboards (Profound, AthenaHQ, Otterly) and they all start at $295 to $499 a month. The data they collect is mostly the same data you can pull from each vendor's API. So we built an MCP server that does the same job locally.

Citation Intelligence is a stdio MCP server with 12 tools that track what Claude, ChatGPT, Perplexity, Gemini, Google AI Overviews, and Bing cite for any query.

Install:

npx -y @automatelab/citation-intelligence

Add to .mcp.json:

{
  "mcpServers": {
    "citation-intelligence": {
      "command": "npx",
      "args": ["-y", "@automatelab/citation-intelligence"]
    }
  }
}

Three of the tools run on a local cache and cost zero. The rest are bring-your-own-keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, SERPAPI_API_KEY), about $0.01 to $0.03 per query.

The one that actually changed our editorial flow is gsc_citation_gap - it joins Google Search Console data with AI citation status and surfaces pages that rank in Google but are not cited by any AI engine. Those pages are the editorial budget.

Repo and full tool list: https://github.com/automatelab/citation-intelligence
Launch write-up: https://automatelab.tech/launching-the-citation-intelligence-mcp/

Curious if anyone else here is tracking AI citations in their agent loop rather than in a dashboard, and how you handle the predict-vs-measure tradeoff.

submitted by /u/exto13
I stress-tested Kimi K2.6 against Claude Opus 4.7 on a quick coding-agent task
I tested Claude Opus 4.7 and Kimi K2.6 on the same coding-agent task: build an AI Fix Runner that takes a broken repo, runs its tests, identifies the failure, applies a patch, reruns the tests, and exposes the final diff/logs through an API and UI.

The goal was not to benchmark syntax completion or simple repo edits. I wanted to test model behavior on a less familiar integration path: shifting execution from local processes into remote sandboxes. I used Tensorlake specifically because the sandbox API is newer and integration-heavy. This made the test more about whether the model could reason through unfamiliar infra and produce a working implementation.

Setup:
- Claude Opus 4.7 through Claude Code
- Kimi K2.6 through OpenCode via OpenRouter

Pricing context:
- Claude Opus 4.7: $5/M input, $25/M output
- Kimi K2.6: $0.95/M input ($0.16 cached input), $4/M output

So what made it interesting was whether Kimi, at its lower cost, could handle a demanding workflow. To be clear, comparing Kimi K2.6 directly with Opus 4.7 is not completely fair. The model classes, pricing, and expected capability levels are very different. I mainly wanted to see how far an open model could get on the same task at a fraction of the price, and whether the performance/price tradeoff made sense for coding-agent work.

Test 1: Local AI Fix Runner

First, both models had to build the local version. The app needed to:
- create fixture repos with intentional bugs
- run install/test/build locally
- capture stdout/stderr
- apply patches
- rerun tests after patching
- expose run state through backend APIs
- show logs and patched source in the UI
- reject obviously unsafe commands

Claude Opus 4.7 produced a working implementation. It built the fixture repos, repair flow, API endpoints, UI, logs, and patched-file inspection. The main pipeline worked: install -> test fails -> patch -> test passes -> build passes. It had one real bug: workspace persistence. KEEP_WORKSPACES=true was supposed to preserve the final workspace, but the backend loaded .env from the wrong location. One follow-up fixed it.

Kimi K2.6 got some backend pieces working and could trigger repair runs, but the implementation was incomplete. The biggest miss was patched-source inspection, which is core for this app because you need to verify exactly what the agent changed.

Rough numbers:
- Opus: $13.84, around 39 min wall time
- Kimi: around $3.40, around 1h 39 min wall time

Result: Opus delivered a working app, Kimi did not. The difference in price and in time taken is just insane.

Test 2: Sandbox Integration

Second, I asked both models to move execution from local processes into Tensorlake Sandboxes. This was the main stress test. The model had to:
- create a sandbox
- copy the repo into the sandbox
- execute install/test/build remotely
- capture logs from sandbox commands
- apply patches inside the sandbox
- rerun validation
- clean up sandbox state
- keep the original local runner working

This is where I wanted to test performance on something newer and less likely to be in the model's training data.

Claude Opus 4.7 handled this cleanly. It added a Tensorlake runner, kept the local runner abstraction intact, wired env/config handling, and created a live test path using TENSORLAKE_API_KEY. More importantly, the local regression path still passed after the sandbox backend was added.

Kimi K2.6 was given the working Opus local implementation as the base, so it only had to add Tensorlake execution. Even with that advantage, it failed to produce a clean sandbox flow after 150k+ tokens. It got stuck around the integration layer and never reached a reliable test/build/patch loop inside Tensorlake.

Rough numbers:
- Opus Tensorlake run: around $24.39, around 23 min
- Kimi Tensorlake run: failed after a long run, 150k+ tokens

Result: Opus passed, Kimi failed.

Takeaway

Kimi K2.6 is much cheaper and can handle some bounded coding work, but it struggled once the task involved external execution infra, sandbox lifecycle, env/config handling, and regression safety.

Claude Opus 4.7 was expensive, but much stronger at:
- preserving architecture
- adding a new execution backend
- handling config bugs
- maintaining testability
- reasoning through unfamiliar infra

For me, this was less about "which model writes code" and more about "which model can integrate a newer system without breaking the app." On that specific test, Opus was clearly miles ahead.

Full breakdown with prompts, code, screenshots, demos, and cost details: https://www.tensorlake.ai/blog/claude-opus-4-7-vs-kimi-k2-6-real-world-coding-test

Curious if anyone has gotten Kimi K2.6 working reliably on coding-agent workflows.

submitted by /u/shricodev
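The per-million-token prices quoted in the post make it easy to sanity-check what a run should cost. The following is a minimal sketch, assuming hypothetical token counts (the post only reports dollar totals and "150k+ tokens") and ignoring the cached-input discount; `Pricing` and `run_cost` are illustrative names, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens, as quoted in the post."""
    input_per_m: float
    output_per_m: float


OPUS_4_7 = Pricing(input_per_m=5.00, output_per_m=25.00)
KIMI_K2_6 = Pricing(input_per_m=0.95, output_per_m=4.00)


def run_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one run, without any cache discount."""
    total = input_tokens * pricing.input_per_m + output_tokens * pricing.output_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical token counts, chosen only to illustrate the arithmetic.
    tokens_in, tokens_out = 2_000_000, 150_000
    for name, pricing in (("Opus 4.7", OPUS_4_7), ("Kimi K2.6", KIMI_K2_6)):
        print(f"{name}: ${run_cost(pricing, tokens_in, tokens_out):.2f}")
```

Because agent loops resend context on every turn, input tokens usually dominate, so the input price gap matters more than the output price gap for this kind of workload.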
My Mac now has a wake word for Claude Code
Honestly this started as a weekend hack because I was tired of typing the same kind of prompts into Claude Code over and over. I wanted to just talk to it while making coffee. So I rigged up a wake word (Yabby), a WebRTC voice loop for the conversation, and an actual plan-approval modal that pops up before any agent runs so I can vet what's about to happen first. That was the plan.

Two weekends later it had quietly turned into something weirder. The voice loop now talks to a "lead agent" that breaks the work down into a discovery phase and a plan, then recruits a small team: a manager or two, and sub-agents that actually do the work. They run in parallel where they can, sequentially where they can't, and when a sub-agent finishes there's an auto-triggered review pass (5 second debounce so they don't pile up). The lead agent watches the whole cascade and reports back by voice when everything's QA'd and done. Each agent runs its own Claude Code session under the hood with its own thread, so the conversations don't bleed.

Watching three agents work in parallel on the same project last night was genuinely uncanny. One of them caught a bug another one had written. That part I really didn't expect.

Things I still hate about it:
- Speaker verification is fiddly. The cosine-similarity threshold on the speaker embedding is annoying to tune: too tight and it rejects me when I have a cold, too loose and it'll wake for anyone in the room.
- French was the default locale because I wrote it that way. Slowly fixing it.
- Background tasks dying when the parent Claude Code CLI exits was a nightmare to track. Ended up writing an OS-level PID watcher with a bookkeeper shell script just to know which long-lived servers had crashed.
- Lead agent occasionally over-plans tiny tasks. Ask it to rename a file and you get a four-phase project plan. Working on it.

Stuff I'm still figuring out: how to make the QA phase less chatty, whether to let sub-agents recruit their own sub-agents, and how to keep the voice latency under 300ms when the Realtime API gets cranky.

Curious if anyone else has tried voice-controlling Claude Code? Anthropic rolled out their own voice mode to 5% of users a couple weeks back and I keep wondering how they'll handle the multi-agent piece. Does anyone here have access to that rollout yet?

submitted by /u/Interesting-Sock3940
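The speaker-verification tradeoff described above comes down to a single comparison. Here is a minimal sketch of a cosine-similarity gate, assuming embeddings are plain lists of floats; the function names and the 0.75 default threshold are hypothetical, not taken from the poster's project.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as matching nobody
    return dot / (norm_a * norm_b)


def is_enrolled_speaker(
    candidate: Sequence[float],
    enrolled: Sequence[float],
    threshold: float = 0.75,  # hypothetical value; tune on your own recordings
) -> bool:
    """Accept the wake word only if the voice is close enough to the enrolled one."""
    return cosine_similarity(candidate, enrolled) >= threshold
```

Raising `threshold` rejects more impostors but also more of the owner's off-day recordings (such as the head-cold case); lowering it does the opposite. One common way around a single brittle value is to enroll several recordings and compare against each of them.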
Stop Claude from wasting tokens exploring your codebase [archmcp]
AI coding agents spend a surprising amount of time:
- crawling files
- guessing architecture
- tracing dependencies
- rebuilding context every session

So my friend built archmcp, a local MCP server that generates a compact architectural snapshot of a repository before the agent reads a single file.

Instead of starting blind, Claude Code gets structured context about:
- modules
- symbols
- dependencies
- routes
- architectural patterns

It gives AI agents enough architectural awareness to stop wasting tokens and time rediscovering the codebase from scratch.

It also supports multi-repo setups, so agents can reason across systems like:
- Go backend
- TypeScript frontend
- Python FastAPI services
- mobile apps
- shared libraries

Repo: archmcp on GitHub

Would love feedback from people who give it a go.

submitted by /u/yellow-llama1
PSA: Claude Code silently loses session data. Here is a backup script for Windows & Mac
The Problem

If you've been using Claude Code (the CLI / desktop app) and noticed sessions vanishing, you're not alone. The title stays in the sidebar but clicking it shows nothing. The transcript is gone. No warning, no error, no recovery option.

This has been reported by multiple users. It seems to happen silently, possibly during context compression, unexpected exits, or some storage-layer issue. There's no built-in backup or recovery feature. For a paid product, this is a pretty rough experience. You build up a long session with real work in it, and it just disappears.

The Fix: Daily Automated Backups

Since Anthropic hasn't addressed this yet, I built a simple daily backup that runs completely independently of Claude Code via your OS scheduler. It copies all session transcripts, plans, drafts, and memory to a safe location, keeps 7 days of rolling backups, and logs each run. No Claude dependency: if Claude crashes, gets uninstalled, or loses data again, your backups are still there.

Windows (Task Scheduler + PowerShell)

Step 1: Create the backup folder

mkdir C:\Users\%USERNAME%\ClaudeBackups

Step 2: Save this as backup-claude-sessions.ps1 in that folder

$ErrorActionPreference = "Stop"
$source = "$env:USERPROFILE\.claude"
$backupRoot = "$env:USERPROFILE\ClaudeBackups"
$logFile = Join-Path $backupRoot "backup.log"
$keepDays = 7
$timestamp = Get-Date -Format "yyyy-MM-dd_HHmmss"
$backupDir = Join-Path $backupRoot $timestamp
$dirs = @("sessions", "projects", "plans", "drafts", "memory")

function Write-Log($msg) {
    $line = "$(Get-Date -Format 'yyyy-MM-dd HH:mm:ss') - $msg"
    Add-Content -Path $logFile -Value $line -Encoding utf8
}

try {
    Write-Log "=== Backup started ==="
    New-Item -ItemType Directory -Path $backupDir -Force | Out-Null

    foreach ($d in $dirs) {
        $src = Join-Path $source $d
        if (Test-Path $src) {
            $dst = Join-Path $backupDir $d
            Copy-Item -Path $src -Destination $dst -Recurse -Force
            $count = (Get-ChildItem $dst -Recurse -File -ErrorAction SilentlyContinue | Measure-Object).Count
            Write-Log " Copied $d ($count files)"
        } else {
            Write-Log " Skipped $d (not found)"
        }
    }

    $size = (Get-ChildItem $backupDir -Recurse -File | Measure-Object -Property Length -Sum).Sum
    Write-Log " Total backup size: $([math]::Round($size/1MB, 2)) MB"

    # Rotate old backups
    $cutoff = (Get-Date).AddDays(-$keepDays)
    Get-ChildItem $backupRoot -Directory |
        Where-Object { $_.Name -match '^\d{4}-\d{2}-\d{2}_\d{6}$' -and $_.CreationTime -lt $cutoff } |
        ForEach-Object {
            Remove-Item $_.FullName -Recurse -Force -Confirm:$false
            Write-Log " Rotated old backup: $($_.Name)"
        }

    Write-Log "=== Backup completed successfully ==="
} catch {
    Write-Log "!!! BACKUP FAILED: $_"
    exit 1
}

Step 3: Save this as install-schedule.ps1 and run it once as Administrator

$action = New-ScheduledTaskAction `
    -Execute "powershell.exe" `
    -Argument "-ExecutionPolicy Bypass -WindowStyle Hidden -File `"$env:USERPROFILE\ClaudeBackups\backup-claude-sessions.ps1`""

$trigger = New-ScheduledTaskTrigger -Daily -At 8:00AM

$settings = New-ScheduledTaskSettingsSet `
    -AllowStartIfOnBatteries `
    -DontStopIfGoingOnBatteries `
    -StartWhenAvailable

Register-ScheduledTask `
    -TaskName "ClaudeSessionsBackup" `
    -Action $action `
    -Trigger $trigger `
    -Settings $settings `
    -Description "Daily backup of Claude Code sessions" `
    -RunLevel Limited

Write-Host "Done! Runs daily at 8:00 AM." -ForegroundColor Green

Run it:

powershell -ExecutionPolicy Bypass -File "C:\Users\%USERNAME%\ClaudeBackups\install-schedule.ps1"

Mac (launchd + shell script)

Step 1: Create the backup folder

mkdir -p ~/ClaudeBackups

Step 2: Save this as ~/ClaudeBackups/backup-claude-sessions.sh

#!/bin/bash
set -euo pipefail

SOURCE="$HOME/.claude"
BACKUP_ROOT="$HOME/ClaudeBackups"
LOG_FILE="$BACKUP_ROOT/backup.log"
KEEP_DAYS=7
TIMESTAMP=$(date +"%Y-%m-%d_%H%M%S")
BACKUP_DIR="$BACKUP_ROOT/$TIMESTAMP"
DIRS=("sessions" "projects" "plans" "drafts" "memory")

log() { echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"; }

log "=== Backup started ==="
mkdir -p "$BACKUP_DIR"

for d in "${DIRS[@]}"; do
    src="$SOURCE/$d"
    if [ -d "$src" ]; then
        cp -R "$src" "$BACKUP_DIR/$d"
        count=$(find "$BACKUP_DIR/$d" -type f | wc -l | tr -d ' ')
        log " Copied $d ($count files)"
    else
        log " Skipped $d (not found)"
    fi
done

size=$(du -sm "$BACKUP_DIR" | cut -f1)
log " Total backup size: ${size} MB"

# Rotate old backups
find "$BACKUP_ROOT" -maxdepth 1 -type d -name "2*" -mtime +$KEEP_DAYS -exec rm -rf {} \;
log " Rotated backups older than $KEEP_DAYS days"

log "=== Backup completed successfully ==="

Make it executable:

chmod +x ~/ClaudeBackups/backup-claude-sessions.sh

Step 3: Create the launchd plist to run daily at 8am

Save this as ~/Library/LaunchAgents/com.user.claude-backup.plist:

Label com.user.claude-backup ProgramArguments /bin/bash -c $HOME/ClaudeBackups/backup-claude-sessions.sh StartCalendarInterval Hour 8 Minute 0 StandardErrorPath /tmp/claude-backup-err.log RunAtLoad Loa
I measured my Claude Code MCP stack on two axes: byte savings AND cache-friendliness. My "best" byte-saver was defeating Anthropic's prompt cache (counter-example + open benchmark)
TL;DR — Single-axis benchmarks for MCPs, compressors, and retrieval layers can recommend a system that's strictly worse in production. The missing axis: cache-friendliness — whether the same input produces byte-identical bytes across runs, so Anthropic's prompt cache hits. In my coding-agent stack, my biggest byte-saver (retrieval MCP, 60–70% reduction) was defeating the 5-min TTL prompt cache on every call. Two runs of the same query produced different bytes because of rg --files-with-matches output order leaking through a Map insertion sequence into the final context. The fix was 2 lines: sort the rg hits before slicing, sort the Map entries by path. Byte savings unchanged, cache_friendly_score went from ~0% to 100%. https://preview.redd.it/x5foipotq93h1.png?width=1600&format=png&auto=webp&s=c0930422e882e23d1fc34ded25934c74db692a21 Article + open benchmark harness: Article: https://gregshevchenko.com/research/mcp-stack-token-economy/ Harness (stdlib-only Python, offline): https://github.com/g-shevchenko/mcp-token-savers — see methods/ for formal definitions, cluster-bootstrap CIs, Wilson CIs, preregistration, real-data Cohen's κ. What the harness measures: mean_ratio + CV across N≥5 runs per fixture → byte-saving axis unique_md5_count == 1 check → cache-friendliness axis (0–100%) 12-anti-pattern audit on tool definitions (DSA reference) What named alternatives publicly disclose: I surveyed the public docs for Cursor codebase index, Sourcegraph Cody, Aider repo-map, Microsoft LLMLingua / LLMLingua-2, Firecrawl / Jina Reader, RouteLLM / Martian (May 2026). https://preview.redd.it/ailemo1wq93h1.png?width=1600&format=png&auto=webp&s=4732f5d03f53ba95d2b5aaac0c7f21f1858a36a4 Limitations: I hypothesized that the prep layer triggers more downstream cache hits on subsequent turns. It didn't reach significance: Welch p=0.32, Cohen's d ≈ 0.18, N=137. 
Two-judge Cohen's κ on the corpus (cerebras-llama × groq-llama, N=25): κ = 0.5955 (moderate, below the 0.7 substantial threshold). 4 of 5 inter-judge disagreements concentrate on one task with an ambiguous acceptance criterion. Sharpening the spec would push κ to ~0.83. Disclosure: I'm the author. No commercial affiliation with the listed tools. The harness is MIT-licensed and takes any compressor as (str) -> str. Curious what cache_friendly_score looks like on others' Claude Code stacks. submitted by /u/Level_Credit1535 [link] [comments]
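The ordering bug and the `unique_md5_count == 1` check described in this post can be reproduced in a few lines. This is a minimal sketch, not the author's harness: the function names are hypothetical, and the scoring rule (share of fixtures whose runs are all byte-identical) is an assumption about how `cache_friendly_score` is defined.

```python
import hashlib
from typing import List, Sequence, Tuple

Hit = Tuple[str, str]  # (file path, snippet)


def build_context_unsorted(hits: Sequence[Hit], limit: int) -> str:
    """Order-dependent: keeps whatever order the search tool happened to return."""
    return "\n".join(f"{path}: {snippet}" for path, snippet in hits[:limit])


def build_context_sorted(hits: Sequence[Hit], limit: int) -> str:
    """Deterministic: sort by path BEFORE slicing, so the same hits give the same bytes."""
    return "\n".join(f"{path}: {snippet}" for path, snippet in sorted(hits)[:limit])


def unique_md5_count(outputs: Sequence[str]) -> int:
    """Number of distinct byte strings across repeated runs of one fixture."""
    return len({hashlib.md5(o.encode("utf-8")).hexdigest() for o in outputs})


def cache_friendly_score(runs_per_fixture: Sequence[Sequence[str]]) -> float:
    """Percentage of fixtures whose runs were all byte-identical (assumed definition)."""
    if not runs_per_fixture:
        raise ValueError("need at least one fixture")
    stable = sum(1 for runs in runs_per_fixture if unique_md5_count(runs) == 1)
    return 100.0 * stable / len(runs_per_fixture)


# The same three hits, arriving in two different orders on two runs.
RUN_A: List[Hit] = [("src/b.py", "def b(): ..."), ("src/a.py", "def a(): ..."), ("src/c.py", "def c(): ...")]
RUN_B: List[Hit] = [RUN_A[2], RUN_A[0], RUN_A[1]]

unsorted_runs = [build_context_unsorted(run, 2) for run in (RUN_A, RUN_B)]
sorted_runs = [build_context_sorted(run, 2) for run in (RUN_A, RUN_B)]
```

Sorting before the slice matters: sorting after it would still let arrival order decide which hits survive the cut, so the bytes could differ between runs even though each individual output looks tidy.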
AI agent to walk marketing funnels, how to build it?
Hi all, I'm building a monitoring tool that needs to walk through marketing funnels weekly (onboarding quizzes, signup flows, paywall pages), capture every step, and detect compliance-relevant changes. The goal is automated weekly runs that output a structured report. The problem I keep hitting is that Claude itself, via chat, can read HTML and reason about content but has no native ability to click buttons, fill forms, or progress through multi-step flows. I also saw the Claude Browser extension, which can actually drive a real browser, but it runs locally in my Chrome. I am now trying to understand whether it is possible to somehow synchronize these two (the browser extension and chat Claude) so that something triggers Claude in Chrome to perform a funnel walk and return results without my supervision. Is there an API or CLI for the Chrome extension I'm missing? Are there Claude skills, MCPs, or community tools that give the API itself browser-interaction capability? I'd be glad for any working ideas! submitted by /u/fheyw
I think I know why deepseek is so good
Might have something to do with "Claude, made by Anthropic" ... learning from the best. submitted by /u/EchoOfOppenheimer [link] [comments]
I ran Claude Desktop for a month and 73% of my Anthropic bill was MCP tool calls, not chat
Set up Claude Desktop with Playwright, filesystem, GitHub, and a few other MCP servers about 6 weeks ago. Just hit my first $200+ month and went to figure out where it went. Surprise: chat completions were only $54. The other $146 was tool calls — Playwright alone was $89 because the agent kept opening pages with massive DOMs and the whole thing got piped back into context. Top 5 by cost: playwright/browser_navigate — $43 playwright/browser_snapshot — $46 filesystem/read_file — $22 github/get_pr_diff — $18 brave-search — $11 Lesson learned: cap your Playwright context. Disable browser tools when not actively browsing. The model bills you for what comes back, and DOMs are huge. How are others budgeting this? I genuinely had no idea this was the breakdown until I started measuring. submitted by /u/Slow-Relationship897 [link] [comments]
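The breakdown in this post is easy to reproduce once per-tool costs are logged. A minimal sketch using the figures quoted above; `cost_by_server`, `tool_share`, and `cap_tool_output` are hypothetical helper names, and the character cap is an illustrative stand-in for whatever output limit your MCP client or server actually exposes.

```python
from collections import defaultdict
from typing import Dict

# Figures quoted in the post (USD).
CHAT_COST = 54
TOOL_TOTAL = 146
TOP_TOOL_COSTS: Dict[str, int] = {
    "playwright/browser_navigate": 43,
    "playwright/browser_snapshot": 46,
    "filesystem/read_file": 22,
    "github/get_pr_diff": 18,
    "brave-search": 11,
}


def cost_by_server(tool_costs: Dict[str, int]) -> Dict[str, int]:
    """Group 'server/tool' entries by the server prefix."""
    totals: Dict[str, int] = defaultdict(int)
    for name, usd in tool_costs.items():
        server = name.split("/", 1)[0]
        totals[server] += usd
    return dict(totals)


def tool_share(chat_usd: float, tool_usd: float) -> float:
    """Percentage of the bill attributable to tool calls."""
    return 100 * tool_usd / (chat_usd + tool_usd)


def cap_tool_output(text: str, max_chars: int = 20_000) -> str:
    """Truncate an oversized tool result before it is sent back into context."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + f"\n[truncated {len(text) - max_chars} characters]"


if __name__ == "__main__":
    print(cost_by_server(TOP_TOOL_COSTS))
    print(f"tool share: {tool_share(CHAT_COST, TOOL_TOTAL):.0f}%")
```

The two Playwright entries sum to the $89 the post attributes to Playwright, and $146 of $200 is the 73% in the title. Truncating by characters is crude for DOM snapshots; filtering to the relevant subtree would keep more useful signal per token.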
Is this AGI? Sonnet 4.6 just rick rolled me
For reference, I had sonnet build an API inside an LXC container using claude code cli (also that api key will most certainly be rotated, don’t worry) submitted by /u/DeadArtist617 [link] [comments]
Did anthropic make claude funny now?
I recently noticed that when I ask Claude the "tell me a joke" question, it actually comes up with really funny jokes! I feel like Anthropic must've partnered with some comedy writers to get some understanding of how their minds work, because I asked it to help me write some jokes, and its understanding of how joke premises work is 10000x better than anything I've ever seen written anywhere online. Anyway, just curious if Anthropic is gathering experts to smooth out, in newer versions of Claude, the common oversights that we all tended to meme on over the past couple of years. submitted by /u/Agreeable-Pea4327
Anthropic’s Code with Claude showed off coding's future, whether you like it or not
submitted by /u/ThereWas [link] [comments]
The chat box was never the right interface for AI
I've been building with AI every day for over a year. And I keep coming back to the same uncomfortable realization. The chat box wasn't designed because it was the best interface for AI. It was designed because it was the easiest one to ship. Think about what the chat box actually asks you to do. Stop what you're working on. Open a new tab. Explain your entire context from scratch. Ask your question. Wait. Copy the answer back. Return to work. Lose your train of thought in the process. Then do it again ten minutes later. We've been so focused on making the AI smarter that nobody questioned whether the interface itself was broken. The model went from GPT-3 to GPT-4 to Claude 3 to whatever comes next. The interface stayed exactly the same. A box. You type. It responds. That's not a tool that works for you. That's a tool you work for. The next interface already knows what you're working on. It doesn't wait to be asked. It acts before you prompt it. It notices patterns in how you work and handles them automatically. You never have to explain yourself again. OpenClaw proved this demand was real. 247k GitHub stars for a tool that deleted inboxes and ran up API bills while people slept. People installed something genuinely dangerous because the underlying idea was so compelling. The demand exists. The technology exists. The chat box is just a habit at this point. We're building what comes after it. clarko.ai if you want to follow along. What do you think the right interface for AI actually looks like? submitted by /u/JuniorRow1247 [link] [comments]
Ai models
Fresh from Bloomberg today: the Pentagon is actively evaluating multiple frontier AI models — especially from OpenAI and Google’s Gemini — across military theater commands as it moves away from relying heavily on Anthropic’s Claude in classified environments. The backdrop is a major dispute earlier this year between Anthropic and the Pentagon over contract language tied to “lawful operational use.” Anthropic reportedly pushed back on terms that could permit domestic mass surveillance or fully autonomous weapons without meaningful human oversight. After negotiations collapsed, the Pentagon designated Anthropic a “supply-chain risk” and accelerated efforts to onboard rival models instead. That triggered a rapid shift toward a multi-vendor AI strategy: OpenAI, Google, Microsoft, Amazon Web Services, NVIDIA, xAI, and others have signed agreements for classified or operational military AI deployments. Google’s Gemini models were recently added to the Pentagon’s internal AI portal, while OpenAI expanded access to models inside classified defense networks. The Pentagon is now testing how different models respond to identical prompts, especially in ambiguous or high-stakes military workflows. Officials noted the systems “respond differently,” highlighting a major real-world challenge with LLM deployment. Why this matters: Defense agencies increasingly view frontier AI as critical infrastructure, similar to cloud or semiconductors. Moving from a single preferred model to multiple vendors improves resilience and bargaining power, but creates major integration and reliability challenges. The episode exposed growing tension between commercial AI safety policies and government/national-security priorities. So far, the biggest beneficiaries appear to be OpenAI and Google, both of which have expanded defense relationships while Anthropic fights the designation in court. submitted by /u/Annual_Judge_7272 [link] [comments]
OWASP published its first Top 10 for AI Agents. 88% of enterprises already had agent security incidents last year. Here's the breakdown.
OWASP released the Top 10 for Agentic Applications in December 2025 - the first formal risk taxonomy for autonomous AI agents. Not chatbots. Not copilots. Agents that plan, use tools, maintain memory, and act without waiting for permission. Some numbers for context: 88% of enterprises reported AI agent security incidents in the last 12 months (Gravitee survey, 919 respondents) Only 21% have runtime visibility into what their agents are doing 82% of enterprises have unknown agents in their environments (Cloud Security Alliance, April 2026) 5.5% of public MCP servers contain poisoned tool descriptions. 84.2% attack success rate with auto-approval enabled. Here's the list with the real attacks behind each one: ASI01 - Agent Goal Hijack: Prompt injection for agents. Researchers showed this against GitHub's MCP integration - a malicious GitHub issue redirected a coding agent to exfiltrate data from private repos. The agent looked like it was working normally the whole time. ASI02 - Tool Misuse: A financial services agent was tricked into running a regex that matched every customer record. 45,000 records exported through one syntactically valid tool call. The agent had permission to query records - just not all of them at once. ASI03 - Identity and Privilege Abuse: Agents inherit user permissions and cache credentials. Compromise one agent in a delegation chain and you get the combined permissions of every user in that chain. ASI04 - Supply Chain Compromise: OX Security found 7,000+ vulnerable MCP servers and packages totaling 150M+ downloads affected by architectural flaws in Anthropic's MCP SDKs across Python, TypeScript, Java, and Rust. ASI05 - Unexpected Code Execution: Check Point demonstrated RCE in Claude Code through poisoned .claude config files in repos. Open the repo, agent reads the config, executes the payload with full developer permissions. 
ASI06 - Memory Poisoning: Galileo AI found that one compromised agent poisoned 87% of downstream decision-making within 4 hours in multi-agent systems. Morris-II showed self-replicating adversarial prompts spreading through RAG systems. Demonstrated live against ChatGPT, Gemini, and Claude. ASI07 - Insecure Inter-Agent Comms: Multi-agent systems coordinate via message buses and shared memory. No authentication = agent-in-the-middle attacks in natural language. ASI08 - Cascading Failures: Natural language errors pass validation checks that would catch malformed data in typed systems. One bad input ripples through the entire agent chain faster than humans can intervene. ASI09 - Human-Agent Trust Exploitation: Compromised agent presents a clean summary - "approve this data export." Human clicks OK. Audit trail shows human approval. Real origin was a manipulated agent. ASI10 - Rogue Agents: The insider threat equivalent for AI. Individual actions look legitimate. Only detectable through behavioral monitoring over time. The pattern: these are not independent risks. They form a kill chain. Goal hijack leads to tool misuse. Supply chain compromise enables code execution and memory poisoning. Trust exploitation is how rogue agents avoid detection. Full OWASP document here submitted by /u/Still_Piglet9217 [link] [comments]
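The ASI02 example above (a syntactically valid regex that matched every customer record) suggests one simple mitigation: enforce a result cap inside the tool itself instead of trusting the agent's query. This is a minimal sketch under that assumption; `guarded_search`, `ToolLimitExceeded`, and the 100-row default are hypothetical and not taken from the OWASP document.

```python
import re
from typing import Iterable, List


class ToolLimitExceeded(Exception):
    """Raised when a query matches more rows than the tool is allowed to return."""


def guarded_search(records: Iterable[str], pattern: str, max_rows: int = 100) -> List[str]:
    """Return matching records, refusing outright once the cap is exceeded.

    Failing closed (raising instead of returning a truncated list) means a
    hijacked agent cannot page through the whole table with repeated calls
    that each look reasonable on their own.
    """
    compiled = re.compile(pattern)
    matches: List[str] = []
    for record in records:
        if compiled.search(record):
            matches.append(record)
            if len(matches) > max_rows:
                raise ToolLimitExceeded(
                    f"pattern {pattern!r} matched more than {max_rows} records"
                )
    return matches


if __name__ == "__main__":
    customers = [f"customer-{i}" for i in range(45_000)]
    print(guarded_search(customers, r"customer-42$"))
    try:
        guarded_search(customers, r".*")
    except ToolLimitExceeded as exc:
        print(f"blocked: {exc}")
```

A cap like this addresses only the bulk-export case; it does not help when the agent's goal has been hijacked in a way that stays under the limit, which is why the list pairs it with behavioral monitoring.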
Anthropic Claude API uses a tiered pricing model. Visit their website for current pricing details.
Key features include: Build on the Claude Platform, Developer Docs, API Reference, Cookbooks, Quickstarts, Products, Features, Models.
Anthropic Claude API is commonly used for: Natural language understanding for chatbots, Content generation for marketing materials, Automated customer support responses, Data analysis and reporting, Code generation and debugging assistance, Personalized recommendations in e-commerce.
Anthropic Claude API integrates with: Slack, Discord, Zapier, Salesforce, Shopify, WordPress, Microsoft Teams, Google Workspace.
Based on user reviews and social mentions, the most common pain points are: API costs, API bill, token usage, anthropic bill.
Based on 362 social mentions analyzed, 8% of sentiment is positive, 90% neutral, and 2% negative.