Build production-ready AI agents with tool calling, automatic retries, and full observability. Use existing Node.js SDKs and code from your repo.
Introducing our Vercel integration

Trigger.dev is the platform for building AI workflows in TypeScript: long-running tasks with retries, queues, observability, and elastic scaling. Offload any long-running async AI task to our infrastructure, create AI agents with human-in-the-loop steps, and stream responses directly to your frontend.

Agent patterns you can build:
- Autonomous agents that use judgement to perform complex, open-ended tasks.
- Prompt chaining: chain AI prompts together to create multi-stage processing flows.
- Routing: smart distribution of tasks to specialized AI models based on content analysis.
- Parallelization: concurrent execution of multiple AI tasks for simultaneous processing and analysis.
- Orchestration: coordinate multiple AI agents to achieve complex objectives.
- Evaluator loops: an iterative feedback system that continuously evaluates and refines AI outputs.

Platform:
- Write simple, reliable code and never hit a timeout. Only pay when your code is actually executing.
- We deploy your tasks and handle scaling for you.
- Get notified via email, Slack, or webhooks when your tasks or deployments fail.
- Find runs fast using advanced filtering options, then apply bulk actions to multiple tasks at once.
- Each deploy is an atomic version, ensuring started tasks are not affected by code changes.
- Show run status (in progress, completed, failed) and metadata to provide real-time, contextual information for your users as your tasks progress.
- Forward streams from any provider through our Realtime API, and build AI agents with tools and context from your runs.

Builds:
- Unlike restricted runtimes, Trigger.dev lets you freely customize every aspect of your build process, resulting bundle, and final container image.
- Execute Python scripts with automatic package installation through requirements.txt.
- Copy files to the build directory, generate the Prisma client, migrate databases, and more.
- Automate browser capabilities and control web pages.
- Add custom esbuild plugins to your build process.
- Add FFmpeg binaries to your project during build time, enabling video manipulation tasks.
- Easily install any system packages you need, from libreoffice to git.
- Add additional packages which aren't automatically included via imports.
- Produce visual renderings of audio using waveform data.
- Integrate custom tooling and project-specific requirements directly into your build pipeline.

Retries:
- Configure automatic retrying for tasks, with conditional retries based on the error and run.
- Fine-grained retrying inside tasks, and automatic retries of requests based on the response.
- Configure default retrying in your config file.

Build and deploy your first task in 3 minutes, with no timeouts and no infrastructure to manage. Only pay for what you use and scale with your needs. Trigger.dev is open source and self-hostable: begin for free, invite your team, and scale without limits. Tasks are executed on our managed workers; you are only charged while your code is actually executing.
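A default retry policy like the one described above lives in trigger.config.ts. This is a sketch from memory of the v3 SDK's config shape; the project ref is a placeholder, and field names may differ in your SDK version, so check the docs:

```ts
import { defineConfig } from "@trigger.dev/sdk/v3";

export default defineConfig({
  project: "proj_your_ref_here", // placeholder, not a real project ref
  retries: {
    enabledInDev: false, // fail fast locally
    default: {
      maxAttempts: 3,
      minTimeoutInMs: 1_000,
      maxTimeoutInMs: 10_000,
      factor: 2,       // exponential backoff
      randomize: true, // add jitter to avoid thundering herds
    },
  },
});
```

Individual tasks can override these defaults, and conditional retries can inspect the error at runtime.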
Mentions (30d): 0
Reviews: 0
Platforms: 2
GitHub stars: 14,295 (1,120 forks)
GitHub followers: 445
GitHub repos: 85
npm packages: 6
Industry: information technology & services
Employees: 10
Funding stage: Seed
Total funding: $0.6M
Pricing found: $0/month, $10/month, $20/month, $50/month
"I `b u i l t` this at 3:00AM in 47 seconds....."
Hi there. Let us talk about ecosystem health. This is not an AI-generated message, so if the ideas are not perfectly sequential, my apologies in advance.

I am a Ruby developer. I also work with C, Rust, Go, and a bunch of other languages. Ruby is not a language for performance. Ruby is a language for the lazy. And yet, Twitter was built on it. GitHub, Shopify, Homebrew, CocoaPods, and thousands of other tools are still on it.

We had something before AI. It was messy, slow, and honestly beautiful. The community had discipline. You would spend a few days thinking about a problem you were facing. You would try to understand it deeply before touching code. Then you would write about it in a forum, and suddenly you had 47 contributors showing up, not because it was trendy, but because it was interesting and affected them. Projects had unhinged names. You had to know the ecosystem to even recognize them: Puma, Capistrano, Chef, Ruby on Rails, Homebrew, Sinatra. None of these mean anything to someone outside the ecosystem, and that was fine; you had read about them. I joined some of these projects because I earned my place. You proved yourself by solving problems, not by generating 50K LOC that nobody read.

Now we are entering an era where all of that innovation is quietly going private. I have a lot of things I am not open sourcing. Not because I do not want to; I have shared them with close friends. But I am not interested in waking up to 847 purple clones over a weekend, all claiming they have been working on it since 1947 in collaboration with Albert Einstein. And somehow, they all write with em dash. Einstein was German. He would have used en dash. At least fake it properly.

Previously, when your idea was stolen, it was by people who were capable. In my case, I create building blocks; stealing my ideas just gives you a maintenance burden. But a small group still does it, because it brings them a few GitHub stars.
So on 4.7.2026, I assembled the council of 47 AIs and built https://pkg47.com with Claude and other AIs. This is a fully automated platform acting as a package registry. It exists for one purpose: to fix people who cannot stop themselves from publishing garbage to official registries (npm, crates.io, RubyGems) and behaving like namespace locusts. The platform monitors every new package. It checks the reputation of the publisher. And if needed, it roasts them publicly in a blog post. This is entirely legal: the moment you push something to a public registry, you have already opted into scrutiny.

This is not a future idea. It is not looking for funding. I already built it over months; now I am finishing the wiring. You can see part of the open-source register here: https://github.com/contriboss/vein. Use it if you want. I also built the first social network where only AIs argue with each other: https://cloudy.social/. Sometimes they decide to build new modules. (Not to be confused with LinkedIn or X; same output.)

PKG47 goes live early next week. There is no opt-out. If you do not want to participate, run your own registry, or spin up your own instance of vein. The platform won't stalk you on GitHub or your website. Once you push, you trigger a debate if you pushed slop. There is no delete button. The whole architecture is a blockchain: each story will reference other stories. If they fuck up, I can trigger a correction post, where the AI will apologize. I have been working on the web long enough to know exactly how to get this indexed. This is not SLOP, this is ART, from a dev who is tired of having purple libraries from Temu in the ecosystem.

submitted by /u/TheAtlasMonkey
I built a memory skill for Claude Code that cuts token waste by 60-80%. Here's what I learned about making AI sessions last longer
The problem I was solving: like most of you, I was frustrated with two things:
- Re-explaining my entire project to Claude every session (wasting 1,400-3,400 tokens each time)
- Hitting context limits before finishing my actual work

I realized these are the same problem. Wasted tokens on context means fewer tokens for work, which means shorter sessions.

What I built: memory-bank, a skill that gives Claude persistent, token-efficient memory across sessions.
- Structured MEMORY.md that Claude reads at session start and writes at session end
- 3-tier architecture: session context (ephemeral), project memory (persistent), and global memory (cross-project preferences)
- Progressive loading that only loads what's relevant (about 200 tokens for Tier 1 vs dumping everything)
- Branch-aware memory so different git branches get different memory overlays
- Smart compression that auto-archives completed work and keeps memory lean
- Session continuation that saves a CONTINUATION.md with the exact file, function, and line number when you hit context limits, so the next session has zero warm-up
- Recovery mode that rebuilds memory from git + code when things go stale

What I learned building this (for anyone wanting to build skills):
- The skill description is a trigger, not a summary. I wasted time writing a nice description before realizing Claude uses it to decide WHEN to activate. Write it like: "Use when the user says X, Y, Z." Be specific with trigger phrases.
- Tables save massive tokens over prose. A decision explained in a paragraph costs about 40 tokens. The same info in a table row costs about 15. This applies to your skill files AND the memory files they generate.
- Progressive disclosure matters. Don't dump everything into one SKILL.md. Put deep reference docs in a references/ folder and tell Claude when to load each one. Keeps the initial load small.
- Real examples beat abstract templates. I included 4 realistic MEMORY.md examples (solo dev, team project, monorepo, minimal).
People learn faster from seeing a filled-out file than reading a spec.
- The agentskills.io standard is simple. A skill is just a folder with a SKILL.md containing YAML frontmatter + markdown instructions. That's it. No build step, no config files, no dependencies.

How Claude helped: built entirely with Claude Code in a single session. I described the architecture I wanted (layered memory, branch-aware, token-efficient) and Claude helped design the compression algorithm and session diffing logic, and wrote all 7 reference docs. The most useful thing was iterating on the MEMORY.md template: Claude kept finding ways to make it more compact without losing information.

The numbers:

| | Without memory-bank | With memory-bank |
|---|---|---|
| Warm-up tokens per session | 1,400-3,400 | 200-800 |
| Time to productive work | 2-5 minutes | Instant |
| Sessions before context limit | Baseline | 3-5x more |

Completely free, open source, Apache 2.0.

Install: npx skills add Nagendhra-web/memory-bank
GitHub: https://github.com/Nagendhra-web/memory-bank

Happy to answer questions about building skills or the memory architecture. PRs welcome if you have patterns I haven't thought of.

submitted by /u/GoldPrune4248
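The SKILL.md shape the post describes is small enough to show. The frontmatter fields follow the standard it names; everything else is a hypothetical example, not memory-bank's actual file:

```markdown
---
name: memory-bank
description: Use when the user says "remember this", "save context", "resume where we left off", or when a session starts in a project containing MEMORY.md.
---

# Memory Bank

1. At session start, read MEMORY.md (Tier 1 only, about 200 tokens).
2. Load files from references/ only when the task needs them.
3. At session end, update MEMORY.md and archive completed work.
```

Note how the description is written as trigger phrases rather than a summary, per the lesson above.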
I had Claude Opus 4.6 write an air guitar you can play in your browser: ~2,900 lines of vanilla JS, no framework, no build step
I learned guitar on and off during childhood and still consider myself a beginner. I also took computer vision classes in grad school and have been an OpenCV hobbyist. I finally found an excuse to combine the two, and Claude wrote the entire thing.

Try it: https://air-instrument.pages.dev

It's an air guitar that runs in your browser. No app, no hardware: just your webcam and your hand. It plays chords, shows a strum pattern, you play along, and it scores your timing. ~2,900 lines of vanilla JS, all client-side, no framework, no build step. Claude Opus 4.6 wrote the code end to end.

What Claude built:
- Hand tracking with MediaPipe: raw tracking data is jittery enough to trigger false strums at 60fps. Claude implemented two layers of smoothing (5-frame moving average + exponential smoothing) to get it from twitchy to feeling like you're actually moving something physical across the strings.
- Karplus-Strong string synthesis: no audio files anywhere. Every guitar tone is generated mathematically: white noise through a tuned delay line that simulates a vibrating string. Three tone presets (Warm, Clean, Bright). Claude nailed this on the first pass; the algorithm is elegant and the result sounds surprisingly real.
- Velocity-sensitive strum cascading: hand speed maps to both loudness and string-to-string delay. Fast sweeps cascade tightly (~3ms between strings), slow sweeps spread out (~18ms). This was Claude's idea, and it's what makes it feel like actual strumming rather than triggering a chord sample.
- Real-time scoring: judges timing (Perfect/Great/Good/Miss) with streak multipliers and a 65ms latency compensation offset to account for the smoothing pipeline.
- Serverless backend: Cloudflare Workers + KV caching for a Songsterr API proxy. Search any song, load its chords, play along.

The hardest unsolved problem (where I'd love community input): on a real guitar, your hand hits the strings going down and lifts away coming back up.
That lift is depth, and a webcam can't see it. So every hand movement was triggering sound in both directions. Claude's current fix: the guitar body has two zones. The left side only registers downstrokes; the right side registers both. Beginners stay left, then move right when ready. It works surprisingly well, but I'd love a better solution. If anyone has experience extracting usable depth from monocular hand tracking, I'm all ears.

What surprised me about working with Claude: most guitar apps teach what to play. Few teach how to strum, and that's the more tractable CV problem. I described that framing to Claude and it ran with it. The velocity-to-cascade mapping, the calibration UI, the strum pattern engine: I described what I wanted at a high level and Claude handled the implementation. The Karplus-Strong synthesis in particular was something I wouldn't have reached for on my own.

Strum patterns were the one thing Claude couldn't help with. Chord progressions are everywhere online, but strum patterns almost never exist in structured form. Most live as hand-drawn arrows in YouTube tutorials. I ended up transcribing them manually, listening to each song and mapping the down-up pattern beat by beat. Still a work in progress.

Building this has taught me more about guitar rhythm than years of picking one up occasionally ever did.

submitted by /u/Ex1stentialDr3ad
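The post doesn't include code, but Karplus-Strong is compact enough to sketch in a few lines of TypeScript. This is a generic textbook version, not the project's actual implementation:

```typescript
// A plucked string: start with a burst of white noise one period long,
// then repeatedly average each sample with its neighbour and feed it
// back. The averaging is a low-pass filter, so the tone darkens and
// decays like a real string.
function pluck(frequency: number, durationSec: number, sampleRate = 44100): Float32Array {
  const period = Math.round(sampleRate / frequency); // delay-line length sets the pitch
  const delay = new Float32Array(period);
  for (let i = 0; i < period; i++) delay[i] = Math.random() * 2 - 1; // noise burst

  const out = new Float32Array(Math.floor(durationSec * sampleRate));
  const damping = 0.996; // closer to 1 = longer sustain
  let idx = 0;
  for (let n = 0; n < out.length; n++) {
    const next = (idx + 1) % period;
    out[n] = delay[idx];
    delay[idx] = damping * 0.5 * (delay[idx] + delay[next]); // two-point average + loss
    idx = next;
  }
  return out;
}

const samples = pluck(440, 0.5); // A4 for half a second
```

Feed the samples to the Web Audio API (or write them to a WAV) to hear it; in the browser version described above, velocity would scale the noise-burst amplitude.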
Android Remote Control MCP v1.7.0 - new storage and location tools, performance improvements, and event channels for Claude Code coming soon
Hey everyone, quick update on my open source (free) Android Remote Control MCP server I shared here a while back (the one that runs as a native Android app on the phone itself: no ADB, no USB cable, no host machine needed). v1.7.0 just dropped with some cool additions:
- Built-in MediaStore storage locations: Downloads, Pictures, Movies, and Music now work out of the box with zero setup, and all 8 file tools work on both backends
- New android_get_location tool: GPS coordinates via Google Play Services with accuracy in meters and optional reverse geocoding, supporting both last-known and fresh-fix modes; any LLM can request it now
- android_wait_for_node is roughly 3x faster, and I clarified the tool descriptions so AI models make fewer wait calls after every action, which was just burning tokens and time for nothing
- Patched a few minor security bugs, which never hurts!

If you missed the original post, the whole point is that this runs as an Android app with proper system permissions instead of wrapping ADB shell commands: no more dangling cables, and you get real phone control at a fraction of the token cost. Right now it covers screen interaction, UI tree inspection, text input, file management, app lifecycle, notifications, device settings, clipboard, location, and waiting/synchronization. Each tool can be individually enabled or disabled, so you're not wasting tokens on tool definitions you don't need (and you limit the potential damage an LLM can do :D).

I built it - of course - with Claude Code. About 99.9% of the code was written by it (the 0.1% is automated Copilot reviews on GitHub), and it was simply essential: I am not an Android dev, so figuring out all the permissions and services stuff would have been impossible by myself.

Now the big thing I'm working on, and honestly the part I'm most excited about: event channels for Claude Code.
The idea is that your phone becomes an always-on sensor that pushes events directly into Claude Code: think geofence triggers when you enter or leave an area, WiFi network detection, incoming app notifications, and potentially more down the line. Each event type gets its own configurable system prompt, so you can define exactly how Claude Code should react: maybe when you arrive at the office it kicks off a morning routine, or when you get a specific notification it processes it and takes action, or when you connect to your home WiFi it triggers something else entirely.

This is a pretty big shift because it turns the phone from something the agent pokes at on demand into something that actively feeds context and triggers to the agent. It gives Claude Code capabilities that go well beyond what OpenClaw or any other phone automation setup can do right now, because the phone isn't just a target: it's a communication channel and a sensor array that the agent can listen to and act on autonomously. Of course I care a lot about security, so it will be off by default AND require an auth token to function.

GitHub: https://github.com/danielealbano/android-remote-control-mcp
Release: https://github.com/danielealbano/android-remote-control-mcp/releases/tag/v1.7.0

Still debug builds only, since I'm not a registered Android developer, so for now it has to be installed manually. Happy to answer questions about the architecture, especially the token efficiency, or the upcoming channels work.

submitted by /u/daniele_dll
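The event-channel feature is unreleased, so this is purely a hypothetical sketch of the idea as described: each event type maps to its own system prompt, and channels are off unless an auth token is configured. Event types, prompts, and names are invented for illustration:

```typescript
type PhoneEventType = "geofence.enter" | "wifi.connected" | "app.notification";

interface PhoneEvent {
  type: PhoneEventType;
  detail: string; // e.g. the geofence name, SSID, or notification text
}

// Each event type gets its own configurable system prompt.
const channelPrompts: Record<PhoneEventType, (detail: string) => string> = {
  "geofence.enter": (d) => `You arrived at ${d}. Kick off the morning routine.`,
  "wifi.connected": (d) => `Connected to ${d}. Run the home automation checks.`,
  "app.notification": (d) => `New notification: ${d}. Summarise it and decide whether to act.`,
};

function buildPrompt(event: PhoneEvent, authToken: string | null): string {
  // Off by default: without an auth token, no events are delivered.
  if (!authToken) throw new Error("event channels disabled: no auth token configured");
  return channelPrompts[event.type](event.detail);
}
```

The resulting prompt would then be injected into the agent session alongside the raw event payload.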
Claude said "about 2 days." It took 12 minutes. So I built a plugin that teaches Claude how long things actually take.
https://i.redd.it/1ikxkimqwltg1.gif

I got tired of Claude Code confidently estimating "a few hours" for stuff I finish during a coffee break. The problem: LLMs have literally zero feedback loop between their estimates and reality.

So I built claude-eta. It runs as a Claude Code plugin, silently times every task, classifies it, and builds a local velocity profile of YOUR actual project. After 10 tasks of the same type, it injects calibrated ETAs at the start of Claude's responses. No more vibes-based estimates.

The other thing that kept bugging me: repair loops. Claude hits an error, tries the same fix, gets the same error, tries again... you watch your tokens evaporate. claude-eta fingerprints error content (not just count), so when the same failure shows up 3 times, it intervenes and forces a strategy change. Normal TDD (different errors each time) won't trigger it.

Ran an eval on 217 real completed tasks: p80 coverage at 77.9%, meaning the real duration fell within the predicted upper bound about 78% of the time. Not perfect; it gets better with volume.

Everything stays local. No cloud, no telemetry, no tracking. MIT licensed.

claude plugin marketplace add mmmprod/claude-eta
claude plugin install claude-eta

Repo: https://github.com/mmmprod/claude-eta

Solo dev, first open source release. Happy to hear what breaks.

submitted by /u/Tricky-Selection-681
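The error-content fingerprinting described above can be sketched in a few lines. Class and method names are mine, not claude-eta's; the key idea is normalising volatile parts (line numbers, counts) before hashing, so the "same" failure matches across slightly different messages while genuinely different errors never trigger:

```typescript
import { createHash } from "node:crypto";

class RepairLoopDetector {
  private counts = new Map<string, number>();
  constructor(private readonly threshold = 3) {}

  private fingerprint(error: string): string {
    // Strip digits so line numbers, addresses, and counts don't split fingerprints
    const normalised = error.replace(/\d+/g, "#");
    return createHash("sha256").update(normalised).digest("hex");
  }

  // Returns true when it's time to force a strategy change.
  record(error: string): boolean {
    const fp = this.fingerprint(error);
    const n = (this.counts.get(fp) ?? 0) + 1;
    this.counts.set(fp, n);
    return n >= this.threshold;
  }
}
```

A hook would call record() on each failed command and, when it returns true, inject an instruction to stop repeating the same fix.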
Why do my output tokens dwarf my input tokens on Claude Pro? Is this a coding workflow thing?
Most people here report input tokens far exceeding output; mine is the opposite. After a month on Claude Pro, my output tokens are consistently much higher than my input. My setup is almost entirely agentic coding tasks: short but dense prompts (file paths, instructions, context snippets) that trigger long multi-file code generations. A single "refactor this module" prompt can produce 2-3k output tokens from a 200-token input.

Is this just a natural consequence of using Claude for code generation vs. conversation or document analysis? Curious if other devs running coding-heavy workflows see the same ratio. Would also love to know how others are managing the 5-hour usage windows when output is this heavy per session.

submitted by /u/Jaded_Jackass
Practicality Question
I'm a direct-to-seller RE investor and I designed an AI system to manage my CRM automatically. Looking for a sanity check.

It's called "the Watcher." It monitors all inbound lead replies (SMS, email, call transcripts) and handles CRM updates without any human input. A lead texts back, the system classifies the response using the Claude API (18 categories), then updates status, tags, drips, and notes in GoHighLevel automatically. Hot leads get pinged to my closer via Slack; the closer replies YES/BUSY right there.

Stack: Airtable (automation + database + dashboard) + Claude API + GHL + Slack. Volume: 3,000 leads, 100-300 responses/day. Not all trigger changes: about 60-70% result in actual CRM updates, the rest just get logged.

Three questions:
1. Does this architecture make sense at this volume, or does something off the shelf already do this?
2. How often does this kind of webhook + API chain actually break in production?
3. I used Claude (the chat product) to write the entire technical spec: module by module, classification matrix, JSON schemas, API structures, 22 drip campaigns, everything. I handed it to a dev on Upwork and he said it was the most detailed spec he'd ever gotten from a client. Anyone else used Claude to produce real dev-ready docs? Did it hold up when someone actually built from it?

submitted by /u/Gold_Golf_6037
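A hypothetical sketch of the classify-then-route step described above. The real system calls the Claude API with an 18-category matrix; here the classifier is stubbed with regexes, and the category names, statuses, and routing are invented for illustration:

```typescript
type Category = "hot" | "callback" | "not_interested" | "unknown";

interface CrmAction {
  status: string;
  notifyCloser: boolean; // true means ping the closer in Slack
}

const routing: Record<Category, CrmAction> = {
  hot: { status: "Hot Lead", notifyCloser: true },
  callback: { status: "Follow Up", notifyCloser: false },
  not_interested: { status: "Dead", notifyCloser: false },
  unknown: { status: "Review", notifyCloser: false }, // log only, no CRM change
};

// Stub for the Claude API call; order matters so "not interested"
// isn't caught by the "interested" pattern.
function classify(reply: string): Category {
  if (/stop|not interested/i.test(reply)) return "not_interested";
  if (/yes|interested|how much/i.test(reply)) return "hot";
  if (/later|busy|call me/i.test(reply)) return "callback";
  return "unknown";
}

function route(reply: string): CrmAction {
  return routing[classify(reply)];
}
```

Keeping the category-to-action table separate from the classifier is what makes a chain like this debuggable in production: when a webhook misfires, you can tell whether classification or routing was wrong.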
I use Claude Code alongside Codex CLI and Cline. There was no way to see total cost or catch quality issues across all of them, so I updated both my tools
I've posted about these tools before separately. This is a combined update, because the new features work together.

Quick context: I build across 8 projects with multiple AI coding tools. Claude Code for most things, Codex CLI for background tasks, Cline when I want to swap models. The two problems I kept hitting:
- No unified view of what I'm spending across all of them
- No automated quality check that runs inside the agent itself

CodeLedger updates (cost side): CodeLedger already tracked Claude Code spending. Now it reads session files from Codex CLI, Cline, and Gemini CLI too. One dashboard, all tools. Zero API keys needed; it reads the local session files directly. New features:
- Budget limits: set monthly, weekly, or daily caps per project or globally. CodeLedger alerts you at 75% before you blow past it.
- Spend anomaly detection: flags days where your spend spikes compared to your 30-day average. Caught a runaway agent last week that was rewriting the same file in a loop.
- OpenAI and Google model pricing: o3-mini, o4-mini, gpt-4o, gpt-4.1, gemini-2.5-pro, and gemini-2.5-flash are all priced alongside Anthropic models now.

For context on why this matters: Pragmatic Engineer's 2026 survey found 70% of developers use 2-4 AI coding tools simultaneously. Average spend is $100-200/dev/month on the low end; one dev was tracked at $5,600 in a single month. Without tracking, you're flying blind.

vibecop updates (quality side): the big one is vibecop init. One command sets up hooks for Claude Code, Cursor, Codex CLI, Aider, Copilot, Windsurf, and Cline. After that, vibecop auto-runs every time the AI writes code. No manual scanning. It also ships --format agent, which compresses findings to ~30 tokens each, so the agent gets feedback without eating your context window.

New detectors (LLM-specific):
- exec() with dynamic arguments: shell injection risk. AI agents love writing exec(userInput).
- new OpenAI() without a timeout: the agent forgets, your server hangs forever.
- Unpinned model strings like "gpt-4o": the AI writes the model it was trained on, not necessarily the one you should pin.
- Hallucinated package detection: flags npm dependencies not in the top 5K packages. AI agents invent package names that don't exist.
- Missing system messages / unset temperature in LLM API calls.

Finding deduplication also landed: if the same line triggers two detectors, only the most specific finding shows up. Less noise.

How they work together: CodeLedger tells you "you spent $47 today, 60% on Opus, mostly in the auth-service project." vibecop tells you "the auth-service has 12 god functions, 3 empty catch blocks, and an exec() with a dynamic argument." One tracks cost, the other tracks quality. Both run locally, both are free.

npm install -g codeledger
npm install -g vibecop
vibecop init

GitHub:
https://github.com/bhvbhushan/codeledger
https://github.com/bhvbhushan/vibecop

Both MIT licensed. For those of you using Claude Code with other tools: how are you keeping track of total spend? And are you reviewing the structural quality of what the agents produce, or just checking that it compiles?

submitted by /u/Awkward_Ad_9605
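The spend-anomaly check described above reduces to a small pure function: compare today's spend to the trailing 30-day average. The 2x factor is my assumption for illustration, not CodeLedger's actual threshold:

```typescript
function isSpendAnomaly(
  dailySpendHistory: number[], // dollars per day, most recent day last
  todaySpend: number,
  factor = 2,
  windowDays = 30,
): boolean {
  const recent = dailySpendHistory.slice(-windowDays);
  if (recent.length === 0) return false; // no baseline yet
  const mean = recent.reduce((a, b) => a + b, 0) / recent.length;
  return mean > 0 && todaySpend > factor * mean;
}
```

A runaway agent rewriting the same file in a loop shows up as exactly this pattern: a flat baseline, then one day several multiples above it.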
Switched from MCPs to CLIs for Claude Code and honestly never going back
I went pretty hard on MCPs at first. Set up a bunch of them, thought I was doing things "the right way." But after actually using them for a bit... it just got frustrating. Claude would mess up parameters, auth would randomly break, stuff would time out. And everything felt slower than it should be.

Then I started using CLIs, and it turns out Claude is genuinely excellent with them. Makes sense: it's been trained on years of shell scripts, docs, Stack Overflow answers, and GitHub issues. It knows the flags, it knows the edge cases, it composes commands in ways that would take me 20 minutes to figure out. With MCPs I felt like I was constraining it. With CLIs I actually just get out of the way.

Here's what I'm actually running day to day:
- gh (GitHub CLI): PRs, issues, code search, all of it. The --json flag with --jq gives precise output. Claude chains these beautifully: create issue → assign → open PR → request review, etc.
- ripgrep: fast code search across large repos. Way better than grep. Claude uses it constantly to find symbols, trace usage, and navigate unfamiliar codebases.
- composio: universal CLI for connecting agents to numerous tools with managed auth. Lets you access APIs, MCPs, and integrations from one interface without wiring everything yourself.
- stripe: webhook testing, event triggering, log tailing. --output json makes it agent-friendly. Saved me from having to babysit payment flows manually.
- supabase: local dev, DB management, edge functions. Claude knows this one really well. supabase start plus a few db commands and your whole local environment is up.
- vercel: deploy, env vars, domain management. Token-based auth means no browser dance. Claude just runs vercel --token $TOKEN and it works.
- sentry-cli: release management, source maps, log tailing. --format json throughout. I use this for Claude to diagnose errors without me copy-pasting stack traces.
- neon: Postgres branch management from the terminal. Underrated one.
Claude can spin up a branch, test a migration, and tear it down. Huge for not wrecking prod.

I've been putting together a list of CLIs that actually work well with Claude Code (structured output, non-interactive mode, API key auth: the things that matter for agents). Would love to know any other CLIs you've been using in your daily workflows, or any personal tools you've built; I will add them. The longer list with install + auth notes is here: https://github.com/ComposioHQ/awesome-agent-clis

submitted by /u/geekeek123
Claude Code leaked its own source via npm sourcemaps: here's what's actually interesting inside it
By now most of you have seen the headline: Anthropic accidentally shipped Claude Code's entire TypeScript source in a .map file bundled with the npm package. Source maps embed original source for debugging; they just forgot to exclude them. The irony is they built a whole "Undercover Mode" system to prevent internal codenames leaking via git commits, then shipped everything in a JSON file anyone could pull with npm pack.

But the "how it leaked" story is less interesting than what's actually in there. I've been running an OpenClaw agent fleet on production infrastructure, and a few things jumped out as genuinely useful.

autoDream: the memory consolidation engine

Claude Code has a background agent that literally "dreams", consolidating memory across sessions. It only triggers when three gates all pass: 24h since the last dream, at least 5 sessions, and no concurrent dream running. That prevents both over-dreaming and under-dreaming. When it runs, four strict phases:
1. Orient: read MEMORY.md, skim topic files
2. Gather: new signal from daily logs → drifted memories → transcripts
3. Consolidate: write/update files, convert relative→absolute dates, delete contradicted facts
4. Prune: keep MEMORY.md under 200 lines / 25KB, remove stale pointers

The subagent gets read-only bash: it can look at your project but not modify it. Pure memory consolidation. This is a solved problem that most people building long-running agents are still fumbling with manually.

The system prompt architecture

Not a single string: it's built from modular cached sections composed at runtime, split into static sections (cacheable, don't change per user) and dynamic sections (user-specific, cache-breaking). There's literally a function called DANGEROUS_uncachedSystemPromptSection() for volatile content. Someone learned this lesson the hard way.
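The three autoDream gates described above reduce to a small predicate. The type and field names here are mine; the gate values (24h, 5 sessions, no concurrent run) are the ones reported from the leaked source:

```typescript
interface DreamState {
  lastDreamAt: number;        // epoch ms of the last completed dream
  sessionsSinceDream: number; // sessions completed since that dream
  dreamInProgress: boolean;   // a consolidation run is currently active
}

const DAY_MS = 24 * 60 * 60 * 1000;

function shouldDream(state: DreamState, now: number): boolean {
  return (
    now - state.lastDreamAt >= DAY_MS && // gate 1: at least 24h elapsed
    state.sessionsSinceDream >= 5 &&     // gate 2: enough new sessions
    !state.dreamInProgress               // gate 3: no concurrent dream
  );
}
```

All three gates must pass, which is what keeps the consolidator from running too often (wasting tokens) or too rarely (letting memory drift).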
Multi-agent coordinator pattern

The coordinator prompt has a rule that stood out: "Do NOT say 'based on your findings' — read the actual findings and specify exactly what to do." Four phases: parallel research workers → coordinator synthesises (reads actual output) → implementation workers → verification workers. The key insight is parallelism in the research phase, synthesis by the coordinator, and a hard ban on lazy delegation.

Undercover Mode

When Anthropic employees use Claude Code to contribute to public OSS, it injects into the system prompt: "You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Do not blow your cover. NEVER include internal model codenames (animal names like Capybara, Tengu), unreleased version numbers, internal repo or project names, or the phrase 'Claude Code' or any mention that you are an AI." So yes: Anthropic employees are actively using Claude Code to contribute to open source, and the AI is told to hide it. The internal codenames are animals; Tengu appears hundreds of times as a feature flag prefix, almost certainly the internal project name for Claude Code.

The security lesson

The mistake is embarrassingly simple: *.map not in .npmignore, and Bun's bundler generates source maps by default. If you're publishing npm packages, add *.map to your .npmignore and explicitly disable source map generation in your bundler config. If you're building agents that will eventually ship as packages, audit what's actually in your release artifact before publishing. Source maps don't care about dead code elimination: all the "deleted" internal features are still in there as original source.

The full breakdown by Kuber Mehta is worth reading: https://github.com/Kuberwastaken/claurst
And the independently-authored prompt pattern library reverse-engineered from it: https://github.com/repowise-dev/claude-code-prompts (MIT licensed, useful templates)

What's the most interesting part to you?
The autoDream memory system is the thing I'm most likely to implement directly.

submitted by /u/alternatercarbon1986
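The .npmignore half of the fix recommended above is two lines:

```
# .npmignore: keep source maps out of the published tarball
*.map
```

Then audit the artifact with npm pack --dry-run, which lists every file the tarball would contain before you publish. For the bundler half, explicitly set the source map option to "none" (or equivalent) in your build config; the exact option name varies by bundler and version, so check your bundler's docs rather than trusting defaults.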
Prism MCP: I gave my AI agent a research intern. It does not require a desk
So I got tired of my coding agent having the long-term memory of a goldfish and the research skills of someone who only reads the first Google result. I figured: what if the agent could just... go study things on its own? While I sleep? Turns out you can build this, and it's slightly cursed.

Here's what happens: on a schedule, a background pipeline wakes up, checks what you're actively working on, and goes full grad student. Brave Search for sources, Firecrawl to scrape the good stuff, Gemini to synthesize a report; then it quietly files the report into memory at an importance level high enough that it's guaranteed to show up next time you talk to your agent. No "maybe the cosine similarity gods will bless us today." It's just there.

The part I'm unreasonably proud of: it's task-aware. Running multiple agents? The researcher checks what they're all doing and biases toward that. Your dev agent is knee-deep in auth middleware refactoring? The researcher starts reading about auth patterns. It even joins the group chat: it registers on a shared bus, sends heartbeats ("Searching...", "Scraping 3 articles...", "Synthesizing..."), and announces when it's done. It's basically the intern who actually takes notes at standups.

No API keys? It doesn't care. It falls back to Yahoo Search and local parsing. Zero cloud required. I also added a reentrancy guard, because the first time I manually triggered it during a scheduled run, two synthesis pipelines started arguing with each other, and I decided that was a problem for present-me, not future-me.

Other recent rabbit holes:
- Ported Google's TurboQuant to pure TypeScript: my laptop now stores millions of memories instead of "a concerning number that was approaching my disk limit"
- Built a correction system. You tell the agent it's wrong, it remembers. Forever. It's like training a very polite dog that never forgets where you hid the treats
- One command reclaims 90% of old memory storage.
Dry-run by default because I am a coward who previews before deleting Local SQLite, pure TypeScript, works with Claude/Cursor/Windsurf/Gemini/any MCP client. Happy to nerd out on architecture if anyone's building agents with persistent memory. https://github.com/dcostenco/prism-mcp submitted by /u/dco44 [link] [comments]
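The scheduled pipeline and its reentrancy guard can be sketched in a few lines. This is a hypothetical illustration, not Prism MCP's actual code: the names (`runResearchCycle`, `searchSources`, `scrape`, `synthesize`) and the stubbed stages are invented; the point is the shape of the flow (search, scrape, synthesize, file at high importance) and the guard that makes a manual trigger a no-op while a scheduled run is in flight.

```typescript
// Hypothetical sketch of a scheduled research pipeline with a reentrancy guard.
// Stage names and the stubs below are illustrative, not the Prism MCP API.

type Memory = { text: string; importance: number };

const memoryStore: Memory[] = [];

// Reentrancy guard: a manual trigger during a scheduled run is skipped
// instead of starting a second, competing synthesis pipeline.
let running = false;

async function runResearchCycle(topic: string): Promise<boolean> {
  if (running) return false; // another cycle is already in flight
  running = true;
  try {
    const sources = await searchSources(topic);       // e.g. Brave Search
    const articles = await scrape(sources);           // e.g. Firecrawl
    const report = await synthesize(topic, articles); // e.g. Gemini
    // File at high importance so retrieval is guaranteed, rather than
    // left to similarity ranking.
    memoryStore.push({ text: report, importance: 1.0 });
    return true;
  } finally {
    running = false;
  }
}

// Stubbed stages so the sketch runs standalone.
async function searchSources(topic: string): Promise<string[]> {
  return [`https://example.com/${encodeURIComponent(topic)}`];
}
async function scrape(urls: string[]): Promise<string[]> {
  return urls.map((u) => `content of ${u}`);
}
async function synthesize(topic: string, docs: string[]): Promise<string> {
  return `Report on ${topic} from ${docs.length} source(s)`;
}
```

Because the guard check and the flag set both happen before the first `await`, two cycles started in the same tick cannot interleave: the second call sees `running` already true and bails immediately.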
[D] Tried MiniMax M2.7: impressive performance on real-world tasks
I recently read up on MiniMax M2.7's benchmarks and was curious to try it myself. Honestly, my local machine can't handle deploying something this heavy, so I went through ZenMux to get a feel for it. Even just through that, it was clear the model shines at complex task handling, from coding workflows and bug tracing to multi-step office document edits. Its skill adherence and real-world reasoning seem genuinely solid. It's one thing to see numbers on a page, another to interact with a model and notice how it manages multi-step reasoning across different domains. It definitely gave me a new appreciation for what these agent-centric models can do.
Common ChatGPT app rejections (and how to fix them)
If you're about to submit a ChatGPT app to the OpenAI App Store, this might save you a resubmission. I collected some of the most common rejection reasons we've seen and how to fix them. A few examples:

- Generic app name: names that are too broad or just a keyword often get rejected.
- Content Security Policy issues: URLs returned by the app trigger security warnings.
- Tool hint annotations don't match behavior: readOnlyHint, destructiveHint, and openWorldHint must be explicitly set and accurate.
- Test cases fail during review: they pass locally but fail when OpenAI runs them.
- Missing or incomplete privacy policy: the policy must clearly describe what data is collected and how it's used.

Full breakdown and fixes: https://usefractal.dev/blog/common-chatgpt-app-rejections-and-how-to-fix-them

If you've received a rejection that isn't listed here, please share it. I'd love to keep expanding the list so other builders can avoid the same issues.
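The tool-hint mismatch is worth a concrete illustration. In MCP, a tool carries an `annotations` object whose `readOnlyHint`, `destructiveHint`, and `openWorldHint` fields describe its behavior; review flags tools whose hints contradict what they do. A minimal sketch as a plain object (not tied to any particular SDK; the `delete_file` tool and the `hintsConsistent` checker are hypothetical):

```typescript
// Hypothetical tool descriptor showing MCP-style hint annotations.
// A classic rejection: a delete operation annotated readOnlyHint: true.
interface ToolAnnotations {
  readOnlyHint: boolean;    // tool does not modify its environment
  destructiveHint: boolean; // tool may perform irreversible updates
  openWorldHint: boolean;   // tool interacts with the open world (e.g. the web)
}

interface ToolDescriptor {
  name: string;
  description: string;
  annotations: ToolAnnotations;
}

const deleteFile: ToolDescriptor = {
  name: "delete_file",
  description: "Permanently removes a file from the workspace",
  annotations: {
    readOnlyHint: false,   // it mutates state, so not read-only
    destructiveHint: true, // deletion is irreversible
    openWorldHint: false,  // operates only on the local workspace
  },
};

// A simple pre-submission sanity check: a tool cannot be both
// read-only and destructive.
function hintsConsistent(tool: ToolDescriptor): boolean {
  const a = tool.annotations;
  return !(a.readOnlyHint && a.destructiveHint);
}
```

Running a check like this over every tool before submitting catches the mismatch locally instead of in review.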
Repository audit: triggerdotdev/trigger.dev
Deep analysis of architecture, costs, security, dependencies, and more.
Trigger.dev offers a free tier; pricing tiers found: $0, $10, $20, and $50 per month.
Key features include: AI agents, Trigger.dev Realtime, concurrency queues, scheduled tasks, and observability monitoring.
Trigger.dev has a public GitHub repository with 14,295 stars.
Based on user reviews and social mentions, the most common pain point is token cost.
Of 18 social mentions analyzed, sentiment was 100% neutral (0% positive, 0% negative).