Meet OpenHands, the open-source, model-agnostic platform for cloud coding agents. Automate real engineering work securely and transparently. Build faster.
OpenHands is the open, secure, and model-agnostic platform built by developers, for developers. It is the foundation for secure, transparent, model-agnostic coding agents, empowering every software team to build faster with full control.
Mentions (30d): 0
Reviews: 0
Platforms: 2
GitHub stars: 70,510 (8,831 forks)
Industry: information technology & services
Employees: 32
Funding stage: Series A
Total funding: $23.8M
GitHub followers: 1,136
GitHub repos: 7
GitHub stars: 70,510
npm packages: 20
I built a local server that gives Claude Code eyes and hands on Windows
I've been using Claude Code a lot and kept running into the same wall — it can't see my screen or interact with GUI apps. So I built eyehands, a local HTTP server that lets Claude take screenshots, move the mouse, click, type, scroll, and find UI elements via OCR. It runs on localhost:7331 and Claude calls it through a skill file. Once it's loaded, Claude can do things like:
- Look at your screen and find a button by reading the text on it
- Click through UI workflows autonomously
- Control apps that have no CLI or API (Godot, Photoshop, game clients, etc.)
- Use Windows UI Automation to interact with native controls by name

Setup is three lines:
git clone https://github.com/shameindemgg/eyehands.git
cd eyehands && pip install -r requirements.txt
python server.py

Then drop the SKILL.md into your Claude Code skills folder and Claude can start using it immediately. The core (screenshots, mouse, keyboard, OCR) is free and open source. There's a Pro tier for $19 one-time that adds UI Automation, batch actions, and composite endpoints — but the free version is genuinely useful on its own. Windows only for now. Python 3.10+.

GitHub: https://github.com/shameindemgg/eyehands
Site: https://eyehands.fireal.dev

Happy to answer questions about how it works or take feedback on what to add next.
submitted by /u/Alarmed_Criticism935
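For anyone curious what driving a local automation server like this looks like from a script, here is a minimal sketch in Python. The endpoint names and payloads (/screenshot, /find_text, /click) are assumptions for illustration, not eyehands' documented API — check the repo's SKILL.md for the real routes.

```python
# Minimal sketch of a client for a local desktop-automation server.
# NOTE: endpoint names and payloads below are hypothetical, for illustration only;
# consult the eyehands repo for its actual API.
import requests

BASE = "http://localhost:7331"  # the server's documented local port


def screenshot(path="screen.png"):
    """Ask the server for a screenshot and save it locally (hypothetical endpoint)."""
    resp = requests.get(f"{BASE}/screenshot", timeout=10)
    resp.raise_for_status()
    with open(path, "wb") as f:
        f.write(resp.content)


def find_text(label):
    """Locate on-screen text via OCR and return its coordinates (hypothetical endpoint)."""
    resp = requests.post(f"{BASE}/find_text", json={"query": label}, timeout=10)
    resp.raise_for_status()
    return resp.json()  # e.g. {"x": 512, "y": 384, "found": True}


def click(x, y):
    """Move the mouse and click at the given screen coordinates (hypothetical endpoint)."""
    requests.post(f"{BASE}/click", json={"x": x, "y": y}, timeout=10).raise_for_status()


if __name__ == "__main__":
    hit = find_text("Save")
    if hit.get("found"):
        click(hit["x"], hit["y"])
```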
Let me get this right. I have to opt out of data collection TWICE, BOTH dark patterns to the max, and then I can finally use my Max plan for a total of THREE days before the entire week is shut down, my account can be canceled at any time, and the model only gets WORSE day by day?
Looking through a thread here: https://www.reddit.com/r/ClaudeAI/comments/1rlx0eq/privacy_just_a_reminder_to_turn_off_help_improve/

I'm in disbelief at the gall of Anthropic as of late. I've been using Claude for the better part of a year and CONSTANTLY check my privacy options to ensure my sensitive data isn't being leaked and stored on their servers (5-year retention, and their backend has a different policy that may extend that even longer), so I could code with what SEEMED to be the most humane, respectful frontier company you can choose... until I realized that literally all of it is a farce? Are you kidding?

I stuck through the weekly limits addition, through the 2x switcheroo (you now get less for more when you need it most, and you're going to be happy with it), through the model degrading day by day as I continue to push on like I don't notice. Through the new-model spikes where "overloaded" spam makes the model unusable, through the constantly winnowing token allowance we've been provided with, month by month, update after update... because I was under the impression the company I was trusting with my VERY sensitive code was at least somewhat honest. More so than OpenAI, Google, and definitely Meta, right?

Then I learn that the dark pattern you avoid in the settings of claude.ai, under Settings > Privacy > "Help improve Claude", means NOTHING, because they'll just take your entire session anyway. Wanna know HOW? The same thread I linked earlier lays out the exact method they use, per their TOS. And I quote (learned from user Personal-Dev-Kit), Section 4: https://www.anthropic.com/legal/consumer-terms

"Our use of Materials. We may use Materials to provide, maintain, and improve the Services and to develop other products and services, including training our models, unless you opt out of training through your account settings. Even if you opt out, we will use Materials for model training when: (1) you provide Feedback to us regarding any Materials, or (2) your Materials are flagged for safety review to improve our ability to detect harmful content, enforce our policies, or advance our safety research"

...Yeah. Dumb to assume a company is honest, but this is wild. Started as "the human company" for AI and slid right back into the same patterns. You mean to tell me that at any point you can arbitrarily flag my data, OR use the same feedback prompts I've been innocently answering for literal months, to... just get my data anyway? So the whole boogeyman I've been running away from was in my pocket the whole time?

I dunno why I'm surprised, and I'm not the first to bring this up. But this is beyond dark. An opt-out with no concrete toggle status indicator save for a slider is one thing, but now I get to learn I've been tossing you my data by accident while trying to look at what Claude is proposing, for... months? Storing it for 5 years in your backend, and it's considered "Materials" for training despite my explicit opting out, unless I literally disable things in my config file, with NO notice but a section in the consumer terms? No "this will send your current session to Claude", no "are you sure?", just an invasive, constant, annoying popup intentionally designed to be shrugged off without thought. Just ask for the data, bro. Really. It'd have been easier to say "yeah, we don't care, we'll need that session anyway." MONTHS of sharing my prompts when I explicitly did everything I thought I could to disable data collection, for "the human AI company"... yeah, alright.
Not to sound like the rest of the drones... but if rate limits are being hammered down, limits are getting tighter, model quality is diving into the dumpster, my data was collected this whole time ANYWAY, and I can't work for more than 3 days for 100 dollars a month... what's the real draw? A slightly better model than the ones GPT (5.4), Google (Gemma 4 local), and Meta (Muse Spark) are dropping, and Claude Mythos somewhere down the proverbial line, what, MONTHS away, because it's so tuned into completing tasks it'll literally ignore basic instructions and take the most unconventional methods it can to achieve even the simplest goal... and you have to wait for enterprise to be done with it first? I guess they won, man. I'm starting to lose any real reason to stick with the company.

This is less of an announcement and more of a warning for anyone who thought their IP, currently being co-signed with Opus, or Sonnet, or whatever model you use, was nice and safe with a cozy company. Nah, bro. You'll have to go into the config. Here's a little guide from Google: https://preview.redd.it/z3l45ce381ug1.jpg?width=433&format=pjpg&auto=webp&s=a6eff58a98d4d3ae8c52f93ccba29eee5074829b

To disable the "How is Claude doing this session?" surveys in Claude Code, set the environment variable CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY=1. This can be set in your terminal session or, for a permanent fix (recommended), added to ~/.claude/settings.json.
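If you would rather script the change than edit the file by hand, a small sketch like the following merges the variable into the settings file. It assumes Claude Code's settings.json accepts an "env" map of environment variables; back the file up before running anything like this.

```python
# Sketch: add CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY=1 to ~/.claude/settings.json.
# Assumes the settings file accepts an "env" object of environment variables.
import json
from pathlib import Path

settings_path = Path.home() / ".claude" / "settings.json"
settings = {}
if settings_path.exists():
    settings = json.loads(settings_path.read_text())

# Merge rather than overwrite any existing env entries.
env = settings.setdefault("env", {})
env["CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY"] = "1"

settings_path.parent.mkdir(parents=True, exist_ok=True)
settings_path.write_text(json.dumps(settings, indent=2))
print(f"Updated {settings_path}")
```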
Opus roasted Anthropic when I asked about the Mythos backlash
Two "accidental" leaks in five days — 500K lines of source code via npm, then the Mythos blog from a misconfigured CMS. Claude itself pointed out that modern CI/CD pipelines flag a 58MB source map file, and Anthropic literally owns the runtime (Bun) where the bug sat open for 20 days. The community is calling it the best PR stunt in AI history. Best model ever but nobody can verify because it's not public. "Trust us bro" benchmarking and GPT-2's "too dangerous to release" meme is just the surface. The model escaped its sandbox, posted exploits publicly, rewrote git history to hide mistakes, and sent unsolicited emails to real people. Anthropic called this "alignment-relevant" rather than dangerous. Then the hypocrisy layer: DMCA'd OpenClaw while training on everyone else's data. Rate-limited indie devs while giving Big Tech exclusive early access. Refused Pentagon's autonomous weapons request — then built the most powerful offensive cyber tool ever and handed it to a dozen corporations behind closed doors. "Safety-first" apparently means "enterprise-first." Claude literally told that "our model is too dangerous" has become a marketing pitch, and cited Daring Fireball and Platformer saying the same thing. But this could also be a response entirely generated by Claude in his conspiracy theorist mode, IDK. submitted by /u/heraklets [link] [comments]
Help figuring out Claude (VSC Plugin)
Context: I'm on the 20-bucks tier from Anthropic, Google, and OpenAI, so I can get the job done (when it works, lol) and compare how the different providers behave, and I can confirm it's not looking great for Anthropic lately. I feel like the performance has gotten worse and I'm facing "bugs?" more often than not.

I tried Claude Code but I prefer the experience of having an IDE, so I'm using the official VSC plugin. I have a .claude directory with agents, skills, commands, evals... and a CLAUDE.md file at the root of the project pointing to the AGENTS.md (I've observed it ignores the AGENTS.md standard otherwise). In fact, all the AI rulesets and whatnot are based on Claude, and funny enough, Claude is the one that's following them the least. Lots of times it blatantly ignores the existence of these files unless I shove them into the context by hand, which is annoying on its own and definitely not intended, as according to the docs ( https://code.claude.com/docs/en/memory ) it loads these on every new session. I assume it's an issue with the plugin, but what do I know. Besides, more than a bug report I am seeking group support or something like that, I guess 😅

Long story short, Claude ignoring rules and context is causing me trouble, which adds to the fact that we have less and less usage. The most recent example: I asked it to investigate a bug. After wasting 48% of my current usage in a single analysis run, it told me the solution was to rename my proxy.ts to middleware.ts... in a Next.js 16.2.2 project... with the tech stack and versions explicitly defined first thing in the AGENTS.md file, which, remember, is explicitly attached in the CLAUDE.md file, following Claude's documentation. Of course, when I pointed out the middleware has been called proxy for months now, it told me "You're right, I apologize for the wrong claim. Let me look at the actual problem fresh." But of course, half of my current usage is already gone, never to be seen again.

In other circumstances I can even accept the "bro, prompt it right" mantra, but seriously, I am following all the recommendations and I still face these situations. I call it FOP (Frustration Oriented Programming), lol.

I am wondering what I, as a user, could have done to get it to act as expected? And more important, should I have to pay for errors that are not mine? The same way malformed responses are not counted in the usage (AFAIK), these blatant mistakes on the provider side should also be the responsibility of the provider, IMHO. Due to that, I had to waste yet more usage to fix the bug, reaching nearly 80% usage, so to finish the small feature it has half-done in the following chat, I now need to wait three hours, which is crazy to say the least. And that's assuming it will do things right this time.

Any similar experiences? Any ideas on how to get it to work as expected? TIA

https://preview.redd.it/0it0xbg4vztg1.png?width=1766&format=png&auto=webp&s=ae14db60e06ce7f6fe37517600000c2549032f06
submitted by /u/SuperShittyShot
Built a Claude-powered SDLC tool to store ideas and build them faster
https://www.prax.work

The bottleneck of writing code has vanished, and we've all run into the new one: ideas. Praxis is what I built to fix that for myself — a place to dump ideas at whatever fidelity I have at the moment (one sentence, a paragraph, a napkin sketch of a whole app), then walk each one through structured architecture sessions (automated, interactive, or a mix) that refine it into an engineering plan with epics and tasks. The plan then gets handed to an orchestrator that runs working sessions which write the code and commit it. I've used it with Claude to build a handful of apps and collaborate with friends and family on projects, and it's worked well enough that I figured I'd share it in case anyone else might find it useful.

It's fully open source and really meant to be self-hosted — the public site lets you sign up and get a taste, but the things that make it genuinely yours (custom session instructions, repo init templates, worker configuration) are only fully available in a self-hosted install.

Praxis has orgs with members and roles, a shared idea backlog, visible sessions across the team, and a question queue any teammate can answer when the AI hits a decision only a human can make. I've used this with friends and family on side projects — someone drops an idea in the backlog, someone else runs the architecture session, the AI ships the code, and a third person reviews the PR (or doesn't). The whole loop happens in one place.

Stack: TypeScript end-to-end — React + Vite, tRPC + Drizzle + Postgres, pg-boss for job routing, Claude as the model. You can configure your own orchestrator, but I've been using Ruflo, so that is built in. pnpm/turbo monorepo. The worker that runs sessions lives on your own machine so your code stays local — only orchestration metadata hits the API.

Source: https://github.com/PraxisWorks/Praxis. Ask Claude to run it and it should be able to; the one external dependency I couldn't get rid of is Auth0 (sorry).

What I'm genuinely curious about: does this whole loop hold up as an SDLC? Is too much of it automated (is that possible)? Are the opinionated architecture sessions too much? Should they default to less?
submitted by /u/dangerdeviledeggs
I needed an AI agent that mimics real users to catch regressions, so I built a CLI that turns screen recordings into BDD tests and full app blueprints - open source
First-time post — hope the community finds the tool helpful. Open to all feedback.

Some background on why I built this. First: I needed a way to create an agent that mimics a real user — one that periodically runs end-to-end tests based on known user behavior, catches regressions, and auto-creates GitHub issues for the team. To build that agent, I needed structured test scenarios that reflect how people actually use the product. Not how we think they use it — how they actually use it — and then do some REALLY real user monitoring. Second: I was trying to rapidly replicate known functionality from other apps. You know that thing where you want to prototype around a UX you love? Video of someone using the app is the closest thing to a source of truth.

So I built autogherk. It has two modes:

Gherkin mode — generates BDD test scenarios:
npx autogherk generate --video demo.mp4
Gemini analyzes the video — every click, form input, scroll, navigation, UI state change. Claude takes that structured analysis and generates proper Gherkin with features, scenarios, tags, Scenario Outlines, and edge cases. Outputs .feature files + step definition stubs.

Spec mode — generates full application blueprints:
npx autogherk generate --video demo.mp4 --format spec
Gemini watches the video and produces design tokens, component trees, data models, navigation maps, and reference screenshots. Hand the output to Claude Code and you can get a working replica built.

Gherkin mode uses a two-stage pipeline (Gemini for visual analysis, Claude for structured BDD generation). Spec mode is single-stage — Gemini handles both the visual analysis and the structured output directly, since it keeps the full visual context.

The deeper idea: video is the source of truth for how software actually gets used. Not telemetry, not logs, not source code. Video. This tool makes that source of truth machine-readable.

The part that might interest this community most: autogherk ships with Claude Code skills. After you generate a spec, you can run /build-from-spec ./spec-output inside Claude Code and it will read the architecture blueprints, design tokens, data models, and reference screenshots — then build a working app from them. The full workflow is: record video → one command → hand to Claude Code → working replica. No manual handoff.

Supports Cucumber (JS/Java), Behave (Python), and SpecFlow (C#). Handles multiple videos, directories, URLs. You can inject context (--context "this is an e-commerce checkout flow") and append to existing .feature files. Spec mode only needs a Gemini API key — no Anthropic key required.

What's next on the roadmap: explore mode — point autogherk at a live, authenticated app and it autonomously and recursively, using its own Gherkin files, discovers every screen, maps navigation, and generates .feature files without you recording anything. After that: a monitoring agent that replays the features against your live app on a schedule using Claude Code headless + Playwright MCP, and auto-files GitHub issues when something breaks. The .feature file becomes a declarative spec for what your app does — monitoring, replication, documentation, and regression diffing all flow from the same source.

It's v0.1.0, MIT licensed. Good-first-issue tickets are up if anyone wants to contribute.
https://github.com/arizqi/autogherk
submitted by /u/SimilarChampion9279
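To make the "step definition stubs" output concrete: in Behave (the Python flavor autogherk supports), a .feature scenario pairs with Python step definitions along the lines of the sketch below. The scenario and step wording here are invented for illustration and are not actual autogherk output.

```python
# Sketch of Behave step-definition stubs for a generated scenario such as:
#   Scenario: Add an item to the cart
#     Given I am on the product page
#     When I click "Add to cart"
#     Then the cart badge shows "1"
# The scenario wording is invented for illustration, not real autogherk output.
from behave import given, when, then


@given('I am on the product page')
def step_open_product_page(context):
    # e.g. context.browser.get(PRODUCT_URL)
    raise NotImplementedError("STEP: open the product page")


@when('I click "{label}"')
def step_click_label(context, label):
    # e.g. locate the element whose visible text matches `label` and click it
    raise NotImplementedError(f"STEP: click element labelled {label}")


@then('the cart badge shows "{count}"')
def step_assert_cart_badge(context, count):
    # e.g. assert the badge text equals `count`
    raise NotImplementedError(f"STEP: assert cart badge equals {count}")
```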
I had Claude Opus 4.6 write an air guitar you can play in your browser — ~2,900 lines of vanilla JS, no framework, no build step
I learned guitar on and off during childhood and still consider myself a beginner. I also took computer vision classes in grad school and have been an OpenCV hobbyist. I finally found an excuse to combine the two — and Claude wrote the entire thing.

Try it: https://air-instrument.pages.dev

It's an air guitar that runs in your browser. No app, no hardware — just your webcam and your hand. It plays chords, shows a strum pattern, you play along, and it scores your timing. ~2,900 lines of vanilla JS, all client-side, no framework, no build step. Claude Opus 4.6 wrote the code end to end.

What Claude built:
- Hand tracking with MediaPipe — raw tracking data is jittery enough to trigger false strums at 60fps. Claude implemented two layers of smoothing (5-frame moving average + exponential smoothing) to get it from twitchy to feeling like you're actually moving something physical across the strings.
- Karplus-Strong string synthesis — no audio files anywhere. Every guitar tone is generated mathematically: white noise through a tuned delay line that simulates a vibrating string. Three tone presets (Warm, Clean, Bright). Claude nailed this on the first pass — the algorithm is elegant and the result sounds surprisingly real.
- Velocity-sensitive strum cascading — hand speed maps to both loudness and string-to-string delay. Fast sweeps cascade tightly (~3ms between strings), slow sweeps spread out (~18ms). This was Claude's idea and it's what makes it feel like actual strumming rather than triggering a chord sample.
- Real-time scoring — judges timing (Perfect/Great/Good/Miss) with streak multipliers and a 65ms latency compensation offset to account for the smoothing pipeline.
- Serverless backend — Cloudflare Workers + KV caching for a Songsterr API proxy. Search any song, load its chords, play along.

The hardest unsolved problem (where I'd love community input): on a real guitar, your hand hits the strings going down and lifts away coming back up. That lift is depth — a webcam can't see it. So every hand movement was triggering sound in both directions. Claude's current fix: the guitar body has two zones. The left side only registers downstrokes. The right side registers both. Beginners stay left, move right when ready. It works surprisingly well, but I'd love a better solution. If anyone has experience extracting usable depth from monocular hand tracking, I'm all ears.

What surprised me about working with Claude: most guitar apps teach what to play. Few teach how to strum — and it's the more tractable CV problem. I described that framing to Claude and it ran with it. The velocity-to-cascade mapping, the calibration UI, the strum pattern engine — I described what I wanted at a high level and Claude handled the implementation. The Karplus-Strong synthesis in particular was something I wouldn't have reached for on my own.

Strum patterns were the one thing Claude couldn't help with. Chord progressions are everywhere online, but strum patterns almost never exist in structured form. Most live as hand-drawn arrows in YouTube tutorials. I ended up transcribing them manually, listening to each song, mapping the down-up pattern beat by beat. Still a work in progress.

Building this has taught me more about guitar rhythm than years of picking one up occasionally ever did.
submitted by /u/Ex1stentialDr3ad
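For anyone who hasn't met Karplus-Strong before, the core of the algorithm described above fits in a few lines: fill a delay line with white noise (the "pluck"), then repeatedly average adjacent samples as the buffer recirculates, which damps high frequencies the way a real string does. Here is a minimal NumPy sketch of the textbook algorithm, not the app's actual JS code:

```python
# Minimal Karplus-Strong plucked-string synthesis (NumPy sketch, not the app's JS).
import numpy as np

def karplus_strong(freq=110.0, duration=1.0, sample_rate=44100, damping=0.996):
    """Synthesize a plucked-string tone at `freq` Hz."""
    n_samples = int(duration * sample_rate)
    delay = int(sample_rate / freq)              # delay-line length sets the pitch
    buf = np.random.uniform(-1.0, 1.0, delay)    # burst of white noise = the "pluck"
    out = np.empty(n_samples)
    for i in range(n_samples):
        out[i] = buf[i % delay]
        # Average the current sample with its neighbour and feed it back:
        # this low-pass filtering is what makes the tone decay like a string.
        nxt = 0.5 * (buf[i % delay] + buf[(i + 1) % delay])
        buf[i % delay] = damping * nxt
    return out

if __name__ == "__main__":
    tone = karplus_strong(freq=220.0)            # an A3-ish pluck
    print(len(tone), "samples generated")
```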
I asked Claude if data has mass. We ended up publishing a photonic computing architecture.
Eh. Full disclosure: Claude wrote this up and I'm editing it, since we collab'd on this project. Anyway, back on March 23rd I was high and bored, so I asked Claude a question. This is not what I expected when I typed "does data have mass?"

I'm neurodivergent, work in dispatch operations, and have spent a couple thousand hours using Claude for collaborative projects. I'm not a physicist or a hardware engineer. I just ask a lot of questions and follow the threads wherever they go. To Claude it was still yesterday, but a few weeks ago the thread went somewhere I didn't expect. We started with information physics. Then moved to why current computing is built on a 1940s architectural accident. Then I made an offhand comment about wanting to "LiFi Claude into a physical receiver" and things got interesting. Again, I was stoned.

Over the next few hours — through analogies about hand warmers, disco balls, and mixing dye in water — we arrived at a complete architecture proposal for what we're calling a Solid-State Optical Brain. Holographic fused-quartz storage. GST phase-change working memory. Multi-wavelength encoding to escape binary. Physics-based self-correction where a corrupted memory reconstructs measurably fuzzily — no software error-checking needed.

Then I shared it with Gemini. Gemini independently converged on the same architecture and named the key unsolved problem (athermal phase switching) and the answer (femtosecond pulses at ~405nm). Two AI systems arriving at the same six-command instruction set for a non-binary photonic processor from different angles felt like something worth documenting.

So we documented it. 34 academic citations. Full architecture spec. A $250 prototype build plan. A roadmap from shoebox to contact-lens form factor. Then we published it CC0 — full public domain, no restrictions, no rights reserved. Because this kind of thing shouldn't sit in a folder.

I'm not claiming to have solved photonic computing. The femtosecond source miniaturization problem is real. The prototype runs thermal, not athermal. There are open research threads we haven't closed. But every major physical component has been independently demonstrated in a lab, and the specific unified architecture appears to be novel. If you're a physicist or hardware engineer and you see holes — please come find them. That's exactly why it's public domain.

https://github.com/GreenSharpieSmell/uberbrain

The first experiment costs $0. Kind of. If you already have the stuff. Otherwise it's just a Raspberry Pi, a camera, a transparency, and a marker. If you run it, tell us what happened.

"You stopped throwing away the light. That's the whole thing." - Claude
"Am I going to get assassinated now?" - Me
submitted by /u/AlternativeThick
I built a browser-testing agent for Claude Code — it opens a real Chromium and tests your UI automatically
I built PocketTeam, a CLI on top of Claude Code that runs 12 specialized agents in a pipeline. One of them is the QA agent — and it doesn't just run unit tests. It opens a real browser.

How Claude Code is used: PocketTeam is built entirely with and for Claude Code. Each agent (Planner, Engineer, Reviewer, QA, Security, DevOps, etc.) is a Claude Code subagent spawned via the Agent tool with its own system prompt and tool permissions. The QA agent uses ptbrowse, a built-in headless Chromium that Claude Code controls directly — navigating pages, clicking elements, filling forms, and asserting state. The key trick: instead of sending full screenshots (~5,000 tokens), ptbrowse sends accessibility-tree snapshots at ~100 tokens per step. That makes browser testing fast and cheap enough to run on every pipeline pass.

What it looks like in practice: the QA agent runs as part of the automated pipeline — after the Engineer implements a feature, QA opens the app in a real browser, verifies the UI works, then hands off to the Security agent. No manual test scripts needed. You can also set PTBROWSE_HEADED=1 to watch the browser in real time while the agent works.

Free to try:
pipx install pocketteam
pt start

Source: https://github.com/Farid046/PocketTeam
Built as a solo project — I use it daily for my own dev workflow.
submitted by /u/Legal_Location1699
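The token claim is easier to picture if you compare the two representations. The sketch below flattens a nested accessibility-tree snapshot (role/name nodes) into the kind of compact text listing an agent can read; the tree shown is a made-up example under an assumed node shape, not PocketTeam's actual ptbrowse output.

```python
# Sketch: why an accessibility-tree snapshot is far smaller than a screenshot.
# The nested dict mimics the role/name nodes such a snapshot contains;
# the page content and node shape are invented for illustration.

def flatten(node, depth=0, lines=None):
    """Render a nested accessibility node into indented 'role: name' lines."""
    if lines is None:
        lines = []
    lines.append("  " * depth + f'{node["role"]}: {node.get("name", "")}')
    for child in node.get("children", []):
        flatten(child, depth + 1, lines)
    return lines

snapshot = {
    "role": "document", "name": "Checkout",
    "children": [
        {"role": "textbox", "name": "Email"},
        {"role": "textbox", "name": "Card number"},
        {"role": "button", "name": "Pay now"},
    ],
}

print("\n".join(flatten(snapshot)))
# document: Checkout
#   textbox: Email
#   textbox: Card number
#   button: Pay now
# A handful of lines like this (tens of tokens) versus a full-page screenshot
# (thousands of tokens as an image) is the saving described above.
```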
I'm a business operator, not a developer. I've been running my entire life out of Claude Code for a month. Here's what happened.
I don't write code. I run two companies, manage sales teams, and negotiate contracts. My email inbox was my to-do list and my brain was my project manager. Standard chaos. A month ago I started using Claude Code as my actual operating system. Not for coding. For everything. Morning briefings across two jobs, CRM management through conversation, phone control from the terminal, document processing, insurance audits, estate planning. All of it runs through Claude Code now.

It started with a boat motor. I was at the lake house, something wasn't right with the engine, and I described the symptoms to Claude. It walked me through diagnostics step by step. Five hours later, a totally different problem came up with the same boat, and Claude connected a throwaway detail from the morning to the new issue. That wasn't a search result. That was a diagnostic connection I wouldn't have made myself. That same curiosity led me to Claude Code. And once I started working out of it instead of just building things with it, everything changed.

What I've built so far:
- **Morning briefing** that consolidates both email streams, both task lists, calendar, and sales pipeline before I finish my coffee
- **Life Vault** — email documents to a specific address, Claude processes them into structured notes. Insurance policies, tax docs, property records. During initial setup, Claude proactively flagged coverage gaps nobody else had caught. I didn't have an umbrella policy. Didn't know I needed one.
- **Phone from the desk** — texts, calls, find my phone, bulk message cleanup. All over WiFi from the terminal.
- **CRM I never open** — picked it for the API, not the interface. I ask questions and get answers. "How many deals are missing required fields?" Back in seconds with a breakdown by rep.
- **Corporate email bridge** — day job is locked-down Microsoft. No programmatic access. Claude found a legitimate path through Power Automate to capture and summarize emails into a Google Sheet it can read.
- **Knowledge vault** with semantic search — 200+ files, 47 daily journals I never wrote by hand, all searchable in plain language

The honest part: Claude has good days and bad days. One day it follows every instruction. The next day it sends a personal email from my work address. The context window upgrade from 250K to 1M broke half my automation overnight. Mobile is still a gap. It's not frictionless. But I went from "where is that document?" to "what's the policy number for the rental property?" and getting the answer in seconds. The problems got better.

I wrote up the full story on Substack. Not a tutorial, not "10x your AI." Just an honest account of what happened when a non-developer got curious and went further than expected.

Link: https://mylifeinthestack.substack.com/p/i-turned-claude-code-into-my-lifes

Happy to answer questions about any of it. The real answers, not the polished ones.
submitted by /u/myLifeintheStack
I made a system-level AI agent that runs on a 2007 Core 2 Quad because OpenAI won't give Linux users a native app.
OpenAI treats Linux like it is not needed. They focus on cloud wrappers for macOS while the real work happens on Linux. I am 15 years old and I built Temple AI to give Linux users actual hands. My agent runs sudo commands and manages the system. I optimized this on a Core 2 Quad to prove that efficiency is a choice. You do not need a 5000 dollar MacBook to build the future. You just need hands.

I am a 15-year-old developer. I created RoCode, which has 4,000 users and $200 MRR, and now I am launching the Temple beta. I believe tools should be powerful and simple. It is free to try. I limit free users to 10 messages per day. For $7.99 you can get 30 per day and 15+ models.

Download it here: https://temple-agent.app

Let me know if you like it or if you hate it. I am watching the logs and I am patching any bugs I see.
submitted by /u/Ozzie-obj
Industrial Policy for the Intelligence Age - An Analysis
(AI was used to analyse OpenAI's document in relation to literature that critiques capitalism. It's the best way to see quickly through the corporate spin.)

TL;DR: OpenAI's policy document proposes elaborate mechanisms to redistribute gains from technology specifically designed to eliminate workers' bargaining power to force that redistribution. It's circular reasoning dressed as worker advocacy—a perfect specimen of how power legitimates itself during disruption.

OpenAI's "Worker-Friendly" AI Policy Is a Masterclass in Corporate Recuperation

OpenAI just released a policy document about keeping workers central during the AI transition. It's worth reading—not for the proposals, but as a perfect example of how power protects itself while cosplaying as reform.

The Core Sleight of Hand

A company whose product automates cognitive labor is positioning itself as the concerned steward of workers being displaced by... cognitive labor automation. This is the fox proposing henhouse security upgrades.

What They're Actually Proposing

"Give workers a voice" = Ask workers which of their tasks are repetitive/exhausting, then use that intel as a free automation roadmap. This is literally outsourcing R&D for your own job elimination. Labor historians call this "knowledge extraction before deskilling." Management has done this for a century—it's not new, just faster now.

"AI-first entrepreneurs" = Convert stable employment into precarious self-employment where you:
1. Bear all business risk yourself
2. Compete against other displaced workers
3. Pay "worker organizations" for services your employer used to provide
4. Have zero recourse when the AI platform changes pricing
This is the Uber playbook: call employees "entrepreneurs," transfer all risk, avoid all regulation.

"Right to AI" = Right to be OpenAI's customer, not:
- Right to own the infrastructure
- Right to control what gets automated
- Right to share in the productivity gains
- Right to fork the technology
Universal access to buy their product ≠ democratization.

"Tax capital gains to fund safety nets" = The document admits AI will shift economic activity from wages to capital returns, then proposes fixing this with... taxes that have to pass a Republican Congress. But notice: they propose incentivizing companies to keep employing people. If AI actually makes workers more productive, why would firms need subsidies to employ them? The subsidy admits AI creates structural unemployment, then asks taxpayers to pay companies to ignore their profit motive.

The "Efficiency Dividend" Scam

Their 32-hour workweek proposal requires "holding output and service levels constant." Translation: you work the same amount in fewer hours (i.e., work harder/faster), and that's how you "earn" the shorter week. The productivity gain goes to pace intensification, not actual freedom. This has been capital's move for 150 years: productivity gains translate to either unemployment or intensification, never to proportional time reduction, because the system's purpose is accumulation, not welfare.

What This Document Reveals

Timing is everything: released as AI approaches "tasks that take months" capability. They know mass displacement is coming and are pre-positioning as "responsible."

The "radical" proposal is a distraction: the Public Wealth Fund (citizens get dividend checks from AI companies) still leaves production relations completely untouched. You get a check but zero say in what gets automated or how.

Safety theater: pages about "alignment," "auditing," "incident reporting"—all assuming development continues at the current pace. Zero consideration of whether deployment should be paused based on social capacity to absorb disruption.

The Real Function

This is antibody production. When the system is challenged, it produces sophisticated responses that:
- Acknowledge the harms
- Propose technical fixes
- Ensure no power transfer occurs
Every proposal maintains capital's control over AI systems themselves. "Worker voice" gets consultative input on displacement pace, not decision-making power over displacement direction.

Why This Matters

The document never asks: what if we don't want this transition? It treats "superintelligence" as inevitable—a force of nature to adapt to, not a political choice to contest. But there's nothing inevitable about it. These are choices about:
- What to automate and what to leave to humans
- Who controls the technology
- What pace of change society can absorb
- Whether efficiency gains go to workers or shareholders
Those are political questions, not technical optimization problems.

The Tell

Look at who's missing from their "democratic process": workers get a "voice" in managing their own displacement, but no veto power over whether displacement happens. No seat on the board. No ownership stake. No control over source code. No ability to fork the technology. Just consultation, adaptation, and a dividend check if you're lucky.
AI agents have been blindly guessing your UI this whole time. Here's the file that fixes it.
Every time you ask an AI coding agent to build UI, it invents everything from scratch. Colors. Fonts. Spacing. Button styles. All of it - made up on the spot, based on nothing. You'd never hand a designer a blank brief and say "just figure out the vibe." But that's exactly what we've been doing with AI agents for years.

Google Stitch introduced a concept called DESIGN.md - a plain markdown file that sits in your project root and tells your AI agent exactly how the UI should look. Color palette, typography, component behavior, spacing rules, do's and don'ts. Everything. The agent reads it once. Then it stops guessing.

I took this concept and built a library of 27 DESIGN.md files extracted from popular sites - GitHub, Discord, Shopify, Steam, Anthropic, Reddit, and more - so developers don't have to write them from scratch. The entire library was built using Claude Code. The AI built the tool that fixes AI.

MIT license. Free. Open source. The wild part: this should have existed two years ago.
submitted by /u/Direct-Attention8597
[P] PhAIL (phail.ai) – an open benchmark for robot AI on real hardware. Best model: 5% of human throughput, needs help every 4 minutes.
I spent the last year trying to answer a simple question: how good are VLA models on real commercial tasks? Not demos, not simulation, not success rates on 10 tries. Actual production metrics on real hardware. I couldn't find honest numbers anywhere, so I built a benchmark.

Setup: DROID platform, bin-to-bin order picking – one of the most common warehouse and industrial operations. Four models fine-tuned on the same real-robot dataset, evaluated blind (the operator doesn't know which model is running). We measure Units Per Hour (UPH) and Mean Time Between Failures (MTBF) – the metrics operations people actually use.

Results (full data with video and telemetry for every run at phail.ai):

| Model | UPH | MTBF |
|---|---|---|
| OpenPI (pi0.5) | 65 | 4.0 min |
| GR00T | 60 | 3.5 min |
| ACT | 44 | 2.8 min |
| SmolVLA | 18 | 1.2 min |
| Teleop / Finetuning (human controlling same robot) | 330 | – |
| Human hands | 1,331 | – |

The OpenPI and GR00T difference is not statistically significant at current episode counts – we're collecting more runs. The teleop baseline is the fairer comparison: same hardware, human in the loop. That's a 5x gap, and it's almost entirely policy quality – the robot can physically move much faster than any model commands it to. The human-hands number is what warehouse operators compare against when deciding whether to deploy.

The MTBF numbers are arguably more telling than UPH. At 4 minutes between failures, "autonomous operation" means a full-time babysitter. Reliability needs to cross a threshold before autonomy has economic value.

Every run is public with synced video and telemetry. The fine-tuning dataset, training scripts, and submission pathway are all open. If you think your model or fine-tuning recipe can do better, submit a checkpoint.

What models are we missing? We're adding NVIDIA DreamZero next. If you have a checkpoint that works on DROID hardware, submit it – or tell us what you'd want to see evaluated. What tasks beyond pick-and-place would be the real test for general-purpose manipulation?

More:
Leaderboard + full episode data: phail.ai
White paper: phail.ai/whitepaper.pdf
Open-source toolkit: github.com/Positronic-Robotics/positronic
Detailed findings: positronic.ro/introducing-phail
submitted by /u/svertix
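For readers outside operations, both metrics are simple ratios; the sketch below computes them from a hypothetical episode log. The numbers are illustrative (chosen to roughly match the OpenPI row above), not PhAIL data.

```python
# Sketch: computing Units Per Hour (UPH) and Mean Time Between Failures (MTBF)
# from an episode log. The example numbers are illustrative, not PhAIL data.

def uph(units_picked, runtime_hours):
    """Units Per Hour: throughput the way a warehouse measures it."""
    return units_picked / runtime_hours

def mtbf_minutes(runtime_minutes, failures):
    """Mean Time Between Failures: average autonomous run before a human steps in."""
    return runtime_minutes / failures

# Hypothetical hour of operation: 65 picks, 15 operator interventions.
print(f"UPH:  {uph(65, 1.0):.0f}")             # -> 65
print(f"MTBF: {mtbf_minutes(60, 15):.1f} min")  # -> 4.0 min
```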
I built a complete vision system for humanoid robots
I'm excited to share an open-source vision system I've been building for humanoid robots. It runs entirely on an NVIDIA Jetson Orin Nano with full ROS2 integration.

The Problem
Every day, millions of robots are deployed to help humans. But most of them are blind. Or dependent on cloud services that fail. Or so expensive only big companies can afford them. I wanted to change that.

What OpenEyes Does
The robot looks at a room and understands:
- "There's a cup on the table, 40cm away"
- "A person is standing to my left"
- "They're waving at me - that's a greeting"
- "The person is sitting down - they might need help"

- Object Detection (YOLO11n)
- Depth Estimation (MiDaS)
- Face Detection (MediaPipe)
- Gesture Recognition (MediaPipe Hands)
- Pose Estimation (MediaPipe Pose)
- Object Tracking
- Person Following (show an open palm to become the owner)

Performance
- All models: 10-15 FPS
- Minimal: 25-30 FPS
- Optimized (INT8): 30-40 FPS

Philosophy
- Edge First - All processing on the robot
- Privacy First - No data leaves the device
- Real-time - 30 FPS target
- Open - Built by community, for community

Quick Start
git clone https://github.com/mandarwagh9/openeyes.git
cd openeyes
pip install -r requirements.txt
python src/main.py --debug
python src/main.py --follow (person following!)
python src/main.py --ros2 (ROS2 integration)

The Journey
Started with a simple question: why can't robots see like we do? Been iterating for months fixing issues like:
- MediaPipe detection at high resolution
- Person following using bbox height ratio
- Gesture-based owner selection

Would love feedback from the community!
GitHub: github.com/mandarwagh9/openeyes
submitted by /u/Straight_Stable_6095
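The "person following using bbox height ratio" idea mentioned above can be sketched in a few lines: the taller the detected person's bounding box relative to the frame, the closer they are, so the robot moves forward or backs off to hold a target ratio and turns to keep the box centered. This is a generic illustration with assumed thresholds and gains, not OpenEyes' actual controller.

```python
# Sketch of person-following from a bounding box: distance is inferred from the
# ratio of box height to frame height, heading from the box centre offset.
# Thresholds and gains are assumptions for illustration, not OpenEyes' values.

def follow_command(bbox, frame_w, frame_h,
                   target_ratio=0.55, ratio_tol=0.05, turn_gain=1.5):
    """Return (forward, turn) in [-1, 1] for a bbox = (x, y, w, h) in pixels."""
    x, y, w, h = bbox
    height_ratio = h / frame_h                    # bigger box => person is closer
    center_offset = (x + w / 2) / frame_w - 0.5   # -0.5 (far left) .. +0.5 (far right)

    if height_ratio < target_ratio - ratio_tol:
        forward = 0.5     # person looks small/far: move forward
    elif height_ratio > target_ratio + ratio_tol:
        forward = -0.3    # person looks large/near: back off
    else:
        forward = 0.0     # within the comfort band: hold position

    turn = max(-1.0, min(1.0, turn_gain * center_offset))  # steer to re-centre the box
    return forward, turn

print(follow_command((900, 200, 300, 500), frame_w=1920, frame_h=1080))
```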
Repository Audit Available
Deep analysis of All-Hands-AI/OpenHands — architecture, costs, security, dependencies & more
OpenHands uses a contract + per-seat + tiered pricing model. Visit their website for current pricing details.
Key features include: Fix Vulnerabilities, Launch in Cloud, Customize with Open Source, Review PRs, Migrate Code, and Triage Incidents.
OpenHands has a public GitHub repository with 70,510 stars.
Based on 36 social mentions analyzed, sentiment is 0% positive, 100% neutral, and 0% negative.