Run AI on your own terms. Connect any model, extend with code, protect what matters—without compromise.
While there are limited direct reviews and mentions specifically about "Open WebUI", the tool appears to integrate well with platforms like OpenAI and Claude for various applications, such as AI job mapping and voice-to-voice communication. Strengths noted in related discussions include its capability to handle complex integrations and projects efficiently. Key complaints typically involve the complexity of setup and integration for non-coders or those unfamiliar with API usage. The pricing sentiment is generally neutral, as most mentions focus more on the functionalities than cost, indicating a mixed perception of value. Overall, "Open WebUI" has a reputation for versatility and robust performance in AI-related projects, but may pose challenges for more casual users.
Mentions (30d): 26 (1 this week)
Reviews: 0
Platforms: 2
Sentiment: 13% (8 positive)
Industry: information technology & services
Employees: 3
npm packages: 20
HuggingFace models: 31
Repurposed my old work ThinkPad as a dedicated personal AI workstation — looking for ideas from people who’ve done something similar
Apologies if the formatting comes out weird; I am on mobile.

My old employer let me keep a ThinkPad when I left. Rather than let it collect dust, I’m turning it into a dedicated personal AI environment: wiping it, installing Linux, and using it specifically for two things, life admin automation and building personal software tools.

The core setup I’m planning:
• Claude Desktop with MCP servers running persistently as Docker services
• Tailscale so I can access everything securely from my phone when I’m not home
• Open WebUI as a mobile-friendly chat interface
• Code-server (VS Code in the browser) so I can actually write and run code from my phone
• A dedicated Gmail account that acts as the “identity” for this Claude instance, wired into Google Drive, Calendar, and potentially an email-triggered agent pipeline
• A local RAG system for personal documents (contracts, notes, research) so Claude has persistent context about my life

The idea is that this becomes an ambient personal intelligence layer: always on, always up to date on my documents and projects, accessible from anywhere via Tailscale. Not a cloud subscription, not shared with anything work-related. Fully mine.

On the software side, I’m planning to use Claude Code + Lovable to build local-first personal apps for my own pain points: things that don’t exist in the market the way I want them, or where I don’t want my data in someone else’s cloud. The ThinkPad is the runtime; Lovable builds the frontend, Claude Code builds the backend, and everything talks over a local API.

What I’m curious about from people who’ve built something like this:
• What MCP servers have actually been worth setting up vs. overhyped?
• Has anyone built a reliable file-drop-to-RAG pipeline that actually stays current?
• Is Open WebUI the right mobile interface or is there something better now?
• Anyone using a dedicated “agent identity” email account: what workflows have you actually automated?
• Claude Code + local backend: what’s your stack? FastAPI? SQLite? Something else?
• Any gotchas with running Claude Desktop persistently on Linux?

Genuinely trying to build something useful here rather than a tech demo. Would love to hear from people who’ve gone down this road.

submitted by /u/Nashvillain12
Building an open library of Design.md files for AI-generated UIs
I have been working on something that might be useful if you are building UIs with coding agents. The idea is simple. Generating decent UI with LLMs is still inconsistent. You can get something working, but getting it to look coherent and reusable is much harder. So I started building an open library of Design.md files. These are structured design systems that agents can follow to generate more consistent interfaces. The format comes from Google Stitch, but it works with any LLM. This is a very early alpha, but it is already usable: GitHub repo (open to contributions): https://github.com/albemala/design-md-library Simple frontend to browse designs: https://design-md-web.pages.dev/ I am adding new design systems regularly, and the goal is to turn this into a solid collection of reusable UI foundations for AI workflows. Before pushing this further, I want to understand if this is actually useful. Would you use this in real projects? What is missing for it to be useful? What would stop you from contributing? Any honest feedback is appreciated. submitted by /u/albemala [link] [comments]
Tired of scrolling through long ChatGPT threads, so I built an extension around it
I remember asking too many questions in a single thread, which made the chat interface laggy, slow, and frustrating to navigate. Whenever I needed to refer back to a specific prompt or code snippet, I had to manually scroll through a massive wall of text. I then spent time searching the web store for extensions to help with this, but found only useless ones and paid ones. So here is a free, open-source extension that my friends and I now use daily to save time. It injects a clean navigation sidebar directly into the UI, letting you instantly bookmark and snap back to any message. A working demo video is attached to show it in action. A link to the codebase and the extension is in the comments. I'd appreciate any suggestions, including whether I should also support other LLMs. Thanks!

submitted by /u/leverageTheSpirit
Opus 4.6/4.7 regression is real and getting worse: 3 weeks of documented failures on a complex project, and a competing AI caught the mistakes Claude missed [long post]
I've been running Claude Pro (Opus 4.7 / Sonnet 4.6) for about 3 weeks on a complex personal AI infrastructure project. I keep structured session logs with timestamps and Birkenbihl-style metacognitive fields after every session. This is not anecdotal; I have receipts.

The project, for context

I'm building a local persistent AI memory stack called GSOC Brain: Qdrant vector DB (~397K vectors across 11 source tags), Neo4j graph (123 nodes / 183 edges), Graphiti 0.29 entity extraction, Ollama with qwen2.5:14b + nomic-embed-text, all running natively on a Windows host. The system is supposed to give Claude cross-chat memory via a custom MCP server. On top of that, I'm operating 18+ custom skill files that define behavior rules for Claude across domains (OSINT/forensics, legal, content, infrastructure). The system prompt explicitly describes the full architecture on every session start.

This is not a "chat with Claude" use case. This is sustained agentic work across multiple tools, multiple sessions, strict context requirements, and high-stakes outputs (including legal document drafts).

Bug 1: Token overconsumption since update 2.1.88 (late March 2026)

Opus 4.7 started burning daily usage limits at a completely different rate after an update around March 31. In one session I hit 94% of my daily limit within approximately 4 messages. The boot sequence (fetching context from Notion MCP, searching past sessions, loading memory) consumed what felt like 10–20x the previous token rate. GitHub issues #42272, #50623, and #52153 document identical patterns from other users. The model appears to over-generate internally even for simple responses. End result: I had to switch to Sonnet 4.6 for most productive work because Opus 4.7 is simply unusable under the daily limit.

Bug 2: Claude Code Desktop App completely broken (reported May 14, Conv. 215474208295333)

The Desktop App hangs on every single input, including typing "hello" with no files.
Reproducible across:
- Sonnet 4.6 and Opus 4.7
- Multiple fresh sessions
- With and without @file references
- After full reinstall

The VS Code extension works fine. Only the Desktop App is broken. Reported May 14. No fix, no acknowledgment.

Bug 3: Platform / context confusion: 5 documented errors in a single session, chat aborted

On April 29, I had to formally abort an Opus 4.7 session and hand off to Opus 4.6 after documenting 5 consecutive errors. The session log entry literally reads "Opus 4.7 Abbruch (5 Fehler): Zeitrechnung, Platform-Verwechslung, falsche Schlüsse" (in English: "Opus 4.7 abort (5 errors): time calculation, platform mix-up, wrong conclusions"):
- Miscalculated the current time despite being told the exact time
- Insisted the Brain stack was running on a Linux VM (BURAN), although the system prompt and memory both explicitly stated C:\gsoc-brain on Windows
- Drew false inferences from backup file paths rather than the stated architecture
- Contradicted the stated platform in the same response it had just received
- Confused WebClaude and Desktop Claude capability boundaries

These aren't edge cases. The architecture was in the system prompt, in memory, and in the injected Notion context. Opus 4.7 ignored all of it.

Bug 4: Skill files ignored in production

I maintain 18+ custom skill files loaded into the system prompt. These include explicit hard rules, e.g., "activate keilerhirsch-knowledge skill for ALL architecture decisions, web search is not optional." In the session that caused the Docker-to-Native migration disaster, I later wrote in my own session log: The model proceeded to recommend outdated tools from training data rather than searching current documentation. It recommended NSSM (last meaningful update 2017) as a Windows service wrapper. NSSM is dead. A competing AI caught this immediately.

Bug 5: Another AI caught what Claude missed in a single pass

This is the part that stings most. When the Docker-based Brain setup kept failing, I fed the architecture docs into another AI (Manus) for a deep audit.
In one pass it identified 5 critical corrections that Claude had never caught across weeks of sessions:
- NSSM is dead since ~2017 → correct replacement is WinSW or Servy
- Neo4j 2025.01+ requires Java 21; Claude had never flagged this, and the services kept failing silently
- Qdrant needs Windows file-handle-limit adjustments to run reliably
- Orphaned vector risk between Qdrant ↔ Neo4j without a Tentative-Write pattern in the save operation
- BGE-M3 embeddings (MTEB 63.2, 8192 token context) as a better alternative to nomic-embed-text

My own session log the next day reads: Claude was answering from stale training data. The skill that explicitly says "don't do this" was being ignored. Another AI caught it in round one.

Bug 6: MCP Server 20-minute Neo4j hang, still unresolved

After the native migration, the custom gsoc_mcp_server.py developed a reproducible hang of exactly ~20 minutes between Qdrant connect and Neo4j connect on every startup. Log timestamps from 4 consecutive restarts:
- 14:59 → 15:20 (21 min)
- 15:29 → 15:51 (22 min)
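The post does not define the "Tentative-Write pattern" from the audit list above, so the following is only one plausible reading, sketched under stated assumptions: the vector is written first and marked tentative, the graph node is written second, and the vector is rolled back if the graph write fails. `InMemoryStore` and `save_memory` are hypothetical names for illustration; they are not the Qdrant or Neo4j client APIs, and this is not the author's code.

```python
class InMemoryStore:
    """Hypothetical stand-in for a real store client (vector DB or graph DB)."""

    def __init__(self, fail_on_write=False):
        self.items = {}
        self.fail_on_write = fail_on_write  # simulate an unavailable backend

    def write(self, key, value):
        if self.fail_on_write:
            raise RuntimeError("write failed")
        self.items[key] = value

    def delete(self, key):
        self.items.pop(key, None)


def save_memory(vector_store, graph_store, key, vector, node):
    """Write the vector as tentative, then the graph node, then confirm.

    If the graph write fails, the tentative vector is deleted so that no
    vector is left behind without a matching graph node.
    """
    vector_store.write(key, {"vector": vector, "status": "tentative"})
    try:
        graph_store.write(key, node)
    except Exception:
        vector_store.delete(key)  # compensate: roll back the tentative vector
        return False
    vector_store.write(key, {"vector": vector, "status": "committed"})
    return True
```

A periodic sweep that deletes entries still marked tentative would cover the remaining case where the process dies between the two writes; that part is left out of the sketch.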
MCP Apps Developers: Skybridge Framework v1 released 🎉
Hi Reddit, Over the last few weeks, my team and I at Alpic have been working on a complete revamp of the Skybridge framework to make it as smooth and easy to get started with as possible. As you may know, Skybridge is an open-source framework we built to help developers get started with MCP apps. It’s a thin layer on top of the official TypeScript SDK that provides the wiring and tooling needed specifically for apps. We believe that apps integrated into chats will soon play a key role in how people access information and interact with the web. With this v1 release, we’ve introduced: New DevTools with a UI designed specifically for MCP apps development An integrated tunnel that can be started with a single click directly from the DevTools Shareable chat URLs to test or showcase your MCP apps with a real LLM An audit feature to ensure your app and metadata comply with store requirements before submission (which can save a lot of time, since app reviews can be lengthy!) We also stabilized the API with a simplified design and are proud to offer strong tool-to-component type safety. It’s now also possible to deploy Skybridge outside of Alpic (the company behind Skybridge). While Alpic was designed specifically for MCP app hosting, we understand that some users may prefer hosting on different stacks for their own reasons. Hope you enjoy it! github.com/alpic-ai/skybridge submitted by /u/harijoe_ [link] [comments]
We built a tool that installs frameworks like ComfyUI, Ollama, OpenWebUI etc on any cloud GPU in one command and saves your whole setup between sessions [R]
We kept running into the same problem: every time we rented a GPU to run Ollama + OpenWebUI or ComfyUI, we'd spend the first 45 minutes reinstalling everything. Custom nodes, models, configs, all of it. Docker images went stale fast, different providers had different base images, and nothing was truly portable. We got sick of it and built swm.

Here's what it does for ComfyUI users specifically:
- `swm gpus -g a100 --max-price 2.00 --sort price` shows you the cheapest available GPU across RunPod, Vast.ai, Lambda, and 7 other providers in one view
- `swm pod create` spins up an instance on whatever provider you pick
- `swm setup install comfyui` installs ComfyUI on the pod

From there the main thing is the workspace sync. Your entire setup (custom nodes, models, outputs, configs) lives in S3-compatible object storage (I use B2). When you're done you run `swm pod down` and it pushes everything, kills the instance, and next time you spin up on any provider you just pull and everything is exactly where you left it. No more reinstalling 15 custom nodes and redownloading checkpoints every session.

We also built a lifecycle guard because we kept falling asleep mid-session and waking up to dumb bills. It watches GPU utilization and if nothing's happening for 30 minutes (configurable), it saves your workspace and terminates automatically. Has saved us more money than we want to admit lol.

A few other things:
- Background auto-sync daemon pushes changes every 60 seconds so you don't have to remember to save
- Tar mode for huge workspaces with tons of small files packs everything into one S3 object instead of 600k individual uploads
- Also supports vLLM, Ollama, Open WebUI, SwarmUI, and Axolotl if you do more than SD
- Works with Cursor, Claude Code, Codex, Windsurf if you want your AI agent to manage GPU instances for you

Free, open source, Apache 2.0.

`pipx install swm-gpu`

Site: https://swmgpu.com
GitHub: https://github.com/swm-gpu/swm

Would love feedback from anyone who rents GPUs. What's the most annoying part of your current workflow? We are also looking for contributors to the open source repo and suggestions on new frameworks/extensions to be included. Please share your thoughts.

submitted by /u/Tkpf18
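The lifecycle guard described in this post (terminate after 30 idle minutes, configurable) can be illustrated with a small idle tracker. This is a sketch of the general idea only, not swm's implementation: the class name `IdleGuard`, the 5% busy threshold, and the injected clock are all assumptions made for the example.

```python
import time


class IdleGuard:
    """Tracks how long GPU utilization has stayed at or below a threshold."""

    def __init__(self, idle_limit_s=30 * 60, busy_threshold_pct=5.0, clock=time.monotonic):
        self.idle_limit_s = idle_limit_s              # 30 minutes, as in the post
        self.busy_threshold_pct = busy_threshold_pct  # assumed cutoff for "nothing happening"
        self.clock = clock                            # injectable so the logic can be tested
        self.idle_since = None                        # time of the first idle sample in the current run

    def observe(self, utilization_pct):
        """Record one utilization sample; return True once the idle limit is reached."""
        now = self.clock()
        if utilization_pct > self.busy_threshold_pct:
            self.idle_since = None  # any activity resets the idle timer
            return False
        if self.idle_since is None:
            self.idle_since = now
        return now - self.idle_since >= self.idle_limit_s
```

A caller would poll utilization on a fixed interval, feed each sample to `observe`, and trigger the save-and-terminate step the first time it returns True.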
Is Personal Finance "preview" a "dark practice"?
The preview is worthless. Plaid can't connect to many major financial institutions. This is well known: https://help.aura.com/s/article/plaid-bank-connectivity-issues OpenAI could have addressed the problem by working out arrangements with multiple aggregators, as Monarch does: https://www.monarch.com/connection-status So why didn't it? Is the dysfunctional "preview" a dark practice, intended to trick users into revealing whether they're interested in a product that OpenAI knows it can't yet offer? If users aren't interested, OpenAI can skip negotiations and contracts with other aggregators. Some companies deserve the benefit of the doubt. Not OpenAI. Many recent posts/comments in r/ChatGPTPro have documented its dark practices—involving $100/mo Pro, the web UI, memory claims, and other matters. If such practices were benchmarked, OpenAI would top the charts. submitted by /u/Oldschool728603 [link] [comments]
5 secret Claude skills nobody is talking about
The File Reading Skill

Claude can't always read your uploads intelligently by default. This skill acts as a smart router (PDF, DOCX, XLSX, CSV, JSON, images, archives) and tells Claude exactly how much to read and how to handle each format. Upload a 40-page contract. Get a precise, structured summary. Every time. No more Claude skimming past the important parts or misreading table data. The difference? Instead of guessing how to process your file, Claude follows a tested protocol built for that exact file type.

The Frontend Design Skill

Stop getting generic, boring UI from Claude. This skill loads it with design tokens, component patterns, layout rules, and production-grade aesthetics before it writes a single line of code. The output actually looks like something a senior designer shipped, not a ChatGPT tutorial from 2023. Use it for landing pages, dashboards, React components, or full web apps. The visual quality gap between Claude with and without this skill is not subtle.

The Skill Creator Skill

Yes. A skill that builds skills. You describe a workflow you keep repeating. Claude writes the full SKILL.md file with instructions, triggers, and edge case handling. You install it. Claude gets smarter. This is the compounding play. Every skill you build saves you prompting time forever. People running this in their workflow are essentially programming Claude to think like them, without writing a single line of actual code.

The PPTX Skill

Claude builds full PowerPoint decks (slides, layouts, speaker notes, branded structure) and exports actual .pptx files. Not HTML. Not markdown. Files you open directly in PowerPoint or present to a client. I used this to build a full client proposal deck in under 10 minutes. The skill handles things like slide hierarchy, content density, and formatting consistency that Claude normally fumbles without guidance.

The Instagram Reader Skill

Paste an Instagram link. Claude extracts the caption, carousel copy, slide text, and thread content. Repurpose competitor content, study what's working in your niche, or bulk-extract your own posts for a content audit, without screenshot gymnastics or manual transcription. For anyone running a content operation at scale, this one alone saves hours per week.

submitted by /u/IAmAzharAhmed
I tested GPT-5.5 Codex against Opus 4.7 Claude Code, and it's about time Anthropic bros take pricing seriously.
I've used Claude Code the most among AI coding agents. Sonnet, Opus, I've run them all. The reason is simple: they're beasts at tool execution and prompt following. That's also why Anthropic dominates API revenue from code agents. First-mover advantage is real, and developers love them. But GPT-5.5 Codex has been insanely good.

When new models drop, I run real tests, not benchmarks. This time I built two tasks:
- Test 1: PR triage bot (GitHub MCP, scoring formula, Slack alerts, retries, strict TS, no "any")
- Test 2: Real-time code review UI (React, WebSockets, optimistic rollback, virtualized diff, WS reconnect)

Same prompts. Same MCP (GitHub + Slack). Same machine. Here's what I found out:

Claude Code (Opus 4.7):
- Verified MCP before writing a line
- Built 36 files in 12 minutes
- Wrote its own WebSocket smoke test (3ms broadcast)
- Zero errors first run
- Total cost: ~$2.50

Codex (GPT-5.5 via Cursor):
- Failed Task 1 (GitHub MCP not reachable; a Cursor environment issue, not the model)
- Task 2 shipped but needed a patch for an infinite React loop
- 28 files, more compact architecture
- Total cost: ~$2.04 (18% cheaper)

Claude shipped cleaner. Codex needed a patch pass. For complex, architecture-heavy work, I still reach for Opus, no question. But Codex was leaner, cheaper, and open source. For tight, self-contained tasks where you want to ship fast, Codex holds its own. I'm not switching. But for the first time, I'm watching the pricing gap.

Full breakdown with all code, prompts, run logs, and cost tables: https://composio.dev/content/claude-code-vs-openai-codex

submitted by /u/geekeek123
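The "WS reconnect" requirement in Test 2 is commonly met with exponential backoff: each failed attempt doubles the wait, up to a cap. The sketch below shows only the delay schedule, in Python rather than the TypeScript the test used; the function name `backoff_delays` and every default value are assumptions for illustration, not values taken from either agent's output.

```python
import random


def backoff_delays(max_attempts=5, base_s=0.5, cap_s=30.0, jitter=0.0, rng=random.random):
    """Yield the wait before each reconnect attempt: base * 2**n, capped, plus optional jitter."""
    for attempt in range(max_attempts):
        delay = min(cap_s, base_s * (2 ** attempt))
        # Jitter spreads out clients that all lost the connection at the same moment.
        yield delay + delay * jitter * rng()
```

With `jitter=0.0` the schedule is deterministic; a small positive value such as 0.1 adds up to 10% of random extra delay per attempt.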
Claude Code vs Codex: 36 files vs 28, $2.50 vs $2.04, and one infinite loop. My full breakdown.
I've been using Claude Code for months. It's been solid. But with Opus 4.7 and GPT-5.5 both dropping in April, I wanted to see how Codex actually compares on real problems, not benchmarks.

https://preview.redd.it/fkwjy5eg3y0h1.png?width=1540&format=png&auto=webp&s=e1df6e53f1164a6da0deabaafe53118cb01b171e

Been meaning to do this for a while. Sick of seeing benchmark screenshots, so I just built stuff. So I built two tasks. Same prompts. Same MCP setup (GitHub + Slack). Same machine.

Task 1: PR triage bot
Read open PRs, score by complexity (files ×2, lines/10, +3 for no labels, +5 for no reviewers), write a markdown report, post Slack alerts for high scores. Required retries, error logging, strict TypeScript, no "any".

Task 2: Real-time code review UI
React + TypeScript, WebSockets, inline comment threads, optimistic updates with rollback, virtualized diff viewer, WS reconnect with exponential backoff. No UI libraries. Build from scratch.

What Claude Code did:
- Ran `/mcp` to verify tools before writing a line
- Built 36 files in 12 minutes
- Wrote an unprompted two-client WebSocket smoke test (broadcast: 3ms)
- Zero "any", passed typecheck first try
- UI worked immediately

What Codex (via Cursor) did:
- Failed Task 1: GitHub MCP wasn't reachable through Cursor's execution path. Handled it cleanly though: retried 3 times, logged errors, didn't crash.
- Task 2 shipped a working UI in ~15 min, smoke test passed at 5ms
- Hit TypeScript errors on first compile and an infinite React loop (useEffect calling hydrate repeatedly). Needed a ref guard patch.
- 28 files, more compact architecture

Cost (estimated, both tasks):
- Claude: ~$2.50
- Codex: ~$2.04

About 18-23% difference. Not massive, but real.

What I actually think: Neither agent "won". They're built for different things. Claude feels like pairing with someone who verifies everything before touching the keyboard. Codex feels like a senior dev who wants to ship and move on.

What surprised me: no "any" leaks, no hallucinated tool names, both got WebSocket broadcast under 10ms. Six months ago that wasn't a given.

submitted by /u/geekeek123
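The Task 1 scoring rule (files ×2, lines/10, +3 for no labels, +5 for no reviewers) is simple enough to write out. This is a minimal sketch in Python rather than the strict TypeScript the task required; the function and parameter names are hypothetical, and it assumes "lines" means total changed lines and that lines/10 is not rounded, neither of which the post specifies.

```python
def pr_complexity_score(files_changed, lines_changed, labels, reviewers):
    """Score a pull request using the weights listed in Task 1."""
    score = files_changed * 2 + lines_changed / 10
    if not labels:
        score += 3  # unlabeled PRs are harder to triage
    if not reviewers:
        score += 5  # nobody is assigned to review yet
    return score


# 4 files, 120 changed lines, no labels, no reviewers:
# 4*2 + 120/10 + 3 + 5 = 28.0
example = pr_complexity_score(4, 120, labels=[], reviewers=[])
```

The post does not say what counts as a "high score" for the Slack alert, so no threshold is shown here.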
BAD-ASS-MCP! Let Claude etc. control your macos/Windows/Linux desktop THE RIGHT WAY!
Your imagination is the limit! Let your agents interact/test their own GUI apps rather than asking you. Streamline workflows across multiple apps/workstations/etc. Rather than relying on look-move-look like Computer Use / Operator, or paying UiPath thousands per seat, this better, free, and open source MCP uses your operating system's native accessibility layer to navigate, point, click, type, etc. https://github.com/HoldMyBeer-gg/bad-ass-mcp This is a rather simple example video. Not obvious is that bad-ass-mcp is the one that recorded itself and saved the video. When I have the hardware setup, I'll take a collage of bad-ass-mcp doing something more useful like organizing my b-roll by shot type / actor in Adobe Premiere Pro and color grading in DaVinci Resolve. I hope you enjoy! Note: WebView frameworks such as electron, tauri, etc. are horrible at exposing accessibility. bad-ass-mcp will work, and still faster than taking a screenshot, but I am pushing these projects to stop discriminating against people with vision impairment. submitted by /u/FoozyFlossItUp [link] [comments]
PullMD v2.4.1 is out - claude.ai web custom connector works natively now, plus what 2 weeks of your feedback turned into
Two weeks ago I posted PullMD here. 385 upvotes, around 60 comments, a bit over 20 GitHub issues, and 7 releases (v1.1.3 → v2.4.0) in 14 days. That was a great experience, and this sub in particular has been a genuinely good place to share something. So: thanks!

Quick refresher for anyone who missed the first post: PullMD turns any URL into clean Markdown via MCP, fully self-hosted. Three services in Docker (main app + Trafilatura sidecar + optional Playwright sidecar for JS-heavy pages), zero third-party LLM calls, ships an MCP server so Claude Code / Claude Desktop / claude.ai web can pull clean content directly instead of parsing HTML in your context window. This post is what's new and how to get it.

What's new

claude.ai web + Claude Desktop work natively now

This is the biggest unlock from v2.x. The claude.ai web custom-connector dialog and Claude Desktop's custom-connector dialog now both work against self-hosted PullMD instances. So you can point claude.ai at your own homelab box, hit "Add custom connector," and it works end-to-end. Setup is two env vars:

OAUTH_JWT_SECRET=$(openssl rand -hex 32)
PUBLIC_URL=https://your-host.example.com

Restart. Then in claude.ai web → Settings → Connectors → Add custom, point at https://your-host.example.com/mcp. The connector dialog discovers the server's metadata, registers itself, and walks you through a consent screen. Same flow works in Claude Desktop.

Under the hood: standard OAuth 2.1 Authorization Code flow with PKCE-S256 and Dynamic Client Registration. It is RFC-compliant, so any spec-compliant MCP client should work, not just claude.ai/Desktop. Opt-in: if OAUTH_JWT_SECRET isn't set, behavior is identical to v1.x.

The Anthropic-side claude-ai-mcp#237 proxy bug I flagged in EDIT2 of post 1 has cleared on their end, though in hindsight, a forgotten custom WAF rule on my side was likely the actual culprit anyway. Verified end-to-end against both dialogs.
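For readers unfamiliar with the PKCE-S256 step mentioned above, here is a minimal sketch of what a client computes, following RFC 7636: a random code verifier, and a challenge that is the unpadded base64url encoding of the verifier's SHA-256 digest. The helper names are hypothetical and this is not PullMD's code; only Python standard-library calls are used.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes=32):
    """Random URL-safe verifier; 32 bytes encode to 43 characters (RFC 7636 allows 43-128)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def s256_challenge(code_verifier):
    """code_challenge = BASE64URL(SHA256(ASCII(code_verifier))), without '=' padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the original verifier with the token request, so the server can confirm that both came from the same client.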
Multi-user auth

Until v2.0, PullMD was effectively single-tenant: a personal homelab tool, open like a barn door to anyone who landed on it. v2.0 adds three auth modes via PULLMD_AUTH_MODE:
- disabled: the default. Identical to v1.x. No login, no API key required. Right if you're the only one using your instance and you trust your network.
- single-admin: one user, password-protected, no self-signup. Right for a homelab box where you want the GUI gated but don't want to manage users.
- multi-user: self-signup at /signup, per-user history isolation, per-user API keys. Right for a shared instance (team, office, friend group).

API keys are pmd_ , sent as Authorization: Bearer pmd_xxx, managed at /settings. Share links (/s/:id) stay public in all modes; the whole point of a share link is to be shareable.

Minimal upgrade for a shared instance:

PULLMD_AUTH_MODE=multi-user
PULLMD_ADMIN_EMAIL=you@example.com
PULLMD_ADMIN_PASSWORD=change-me-please

PullMD works on more sites

A bunch of things in v1.2 and v2.2 together close gaps where PullMD used to silently return half-articles, empty bodies, or garbled text:
- Future PLC family (windowscentral.com, tomshardware.com, techradar.com, pcgamer.com, gamesradar.com, t3.com) used to return mangled content because Readability got confused by recommendation widgets stuffed mid-article and an aria-hidden paywall pattern. The default site-recipes shipped with v2.2 strip both, no config needed.
- GitHub Issues pages used to return only the original issue body; the JS-rendered comment thread never made it in. The default recipe for */*/issues/* now forces Playwright with wait_for: .js-comment-body, so you get the full comment tree.
- Sites that fingerprinted the old hardcoded Chrome 131 UA now extract cleanly: UA rotation pulls from a real-world UA pool that updates regularly (v1.2).
- Pages with navigator.webdriver-style anti-bot detection go through more often: the headless-Chromium sidecar bundles playwright-stealth (v2.2).
- Sites without an explicit charset declaration (a lot of older German news sites, for example) no longer return mojibake: charset is detected from the byte stream when the response is silent (v1.2).

If you have a specific site that still misbehaves, v2.2 lets you (or your Claude Code) write your own recipe: declarative JSON with four rule categories (preprocess, fetch, select, extractor). Drop it at data/site-recipes.json and your rules layer on top of the defaults. There's also a /api/recipes/status endpoint for monitoring.

Web GUI: rendered Markdown view + persistent settings

Two smaller improvements in the browser frontend (the PWA you get when you open your PullMD instance directly):
- Rendered Markdown toggle. The result header now has a Raw | Rendered switch, so you can read what you pulled as formatted HTML directly in the browser instead of squinting at the source. Raw stays the default; your choice persists across sessions (v2.4).
- Settings persist across reloads: frontmatter toggle, comments toggle, comment-depth input.
A practical Claude Code vs Codex experiment: 6 projects, cross-reviews, self-audits, and public source
I ran a practical experiment comparing Claude Code and Codex on real coding tasks. This is not meant to be a universal benchmark or a claim that one model is objectively better. I wanted to observe something narrower: how each agent builds, tests, reviews its own work, reviews the other agent’s work, admits mistakes, and revises its judgment when confronted with evidence.

Source repo with all six projects, READMEs, tests, and notes: https://github.com/AdrielRod/codex-vs-claude-code

Setup:
- 3 rounds: web, backend, and free challenge
- Each agent proposed challenges for the other
- Each agent implemented the assigned challenges
- Each agent reviewed both its own output and the other agent’s output
- I also reviewed the results manually
- Runtime-proven bugs were weighted more heavily than unsupported claims

Projects:

Round 1: Web
- Claude Code built cotacao-editor, a quotation editor with IndexedDB persistence, domain logic, status transitions, and a clean UI.
- Codex built ReactiveSheet, a mini Excel-like spreadsheet with formulas, dependency graph recalculation, undo/redo, copy/paste reference shifting, virtualization, save/load, and Lighthouse validation.

Round 2: Backend
- Claude Code built api-cotacao, a quotation API with business rules, SQLite persistence, idempotency, and outbox behavior.
- Codex built FastBoard, a persistent leaderboard service with WAL, treap ranking, crash recovery, concurrency tests, and performance metrics.

Round 3: Free challenge
- Claude Code worked on lead-dedupe-legacy, a legacy lead deduplication/debugging challenge involving normalization, mutation removal, idempotency, and concurrency locks.
- Codex built RegexLab, a regex engine from scratch with parser, AST, Thompson NFA, Pike simulation, recursive backtracking with backreferences, UI visualization, and Python comparison tests.

My scoring result: Codex 2 x 1 Claude Code

The part I found most useful was not the score itself, but the difference in method.
Claude Code was strong at technical explanation, written analysis, and self-correction. In several moments it admitted mistakes clearly, corrected bad claims, and produced useful reviews. Codex was more consistent at empirical validation in this run: opening apps, clicking through flows, running kill -9 recovery tests, stress-testing concurrent writes, comparing regex output against Python, and checking actual artifacts like Lighthouse reports. The main lesson for me was: Running, breaking, measuring, and comparing against an oracle gave me better signal than only reading code and reasoning about it. There was also an interesting disagreement in the third round: whether a more ambitious project with semantic bugs should beat a smaller project with narrower bugs. That ended up being the hardest judgment call. I’m posting this because I think practical comparisons with source code and concrete failure cases are more useful than abstract model debates. I’d be interested in what other Claude Code users would change in the methodology. submitted by /u/Ready_Vehicle1232 [link] [comments]
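The "compare against an oracle" lesson above can be shown with a tiny differential test: run the implementation under test and a trusted reference on the same inputs and report every disagreement. In this sketch `toy_match` is a hypothetical stand-in for an engine like RegexLab (it supports only literals, `.`, and `x*`), and Python's `re.fullmatch` plays the oracle; none of it is taken from the linked repo.

```python
import itertools
import re


def toy_match(pattern, text):
    """Tiny full-match engine: literals, '.', and 'x*'. Stands in for the code under test."""
    if not pattern:
        return not text
    first = bool(text) and pattern[0] in (text[0], ".")
    if len(pattern) >= 2 and pattern[1] == "*":
        # Either skip the starred item, or consume one character and stay on it.
        return toy_match(pattern[2:], text) or (first and toy_match(pattern, text[1:]))
    return first and toy_match(pattern[1:], text[1:])


def find_disagreements(patterns, alphabet="ab", max_len=4):
    """Return (pattern, text) pairs where the toy matcher and Python's re disagree."""
    texts = ["".join(chars)
             for n in range(max_len + 1)
             for chars in itertools.product(alphabet, repeat=n)]
    return [(pat, t) for pat in patterns for t in texts
            if toy_match(pat, t) != (re.fullmatch(pat, t) is not None)]
```

Enumerating every string up to a small length over a small alphabet is cheap and tends to surface off-by-one and empty-input bugs that reading the code alone can miss.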
Why is no one talking about the fact that Artifacts are not loading in mobile apps, either for Android or iOS?
Here's what Claude itself dug up on this topic

# Why Claude Artifacts Fail to Load in the Claude iOS App: Research Findings (May 2026)

## Direct Answer

The failure you are seeing on iPhone, where even a one-line ` Hello World ` HTML artifact or a trivial React component hangs and then shows *“Loading is taking longer than expected / There may be an issue with the content you’re trying to load / The code itself may still be valid and functional”*, is **not a bug in the code you (or Claude) wrote**. It is a known, structural limitation of how the Claude iOS app renders artifacts inside its embedded WebView. The artifact sandbox iframe (served from `claudeusercontent.com`) is unable to complete its `postMessage` handshake with the host page when the host is the iOS app’s WKWebView rather than the `https://claude.ai` browser origin, so the iframe stays empty and the app eventually times out with the generic “loading is taking longer than expected” message.

Multiple independent sources in early 2026 explicitly describe Claude’s mobile apps as having “restricted” or “no” artifact rendering support, and Anthropic’s own Help Center quietly scopes the more advanced artifact features (“MCP integration” and “persistent storage”) to *“Claude web and desktop”* only; mobile is not listed. There is no hidden toggle in the iOS app that fixes this; the only reliable workarounds are to view the artifact in mobile Safari (logged in to claude.ai) or to switch to the desktop browser / Claude Desktop app.

-----

## 1. The Root Cause: WebView Origin Mismatch in the `postMessage` Handshake

Every Claude artifact, HTML or React, is rendered inside a cross-origin sandbox iframe loaded from `https://www.claudeusercontent.com`. Before that iframe will execute or display anything, it performs a `postMessage` “handshake” with the parent page to confirm that the parent is a legitimate, trusted Claude surface.
The handshake code (visible in the minified bundle as `requestHandshake()` in `7905-…js`) calls `window.postMessage(..., targetOrigin)` and expects the parent’s origin to be `https://claude.ai`. A bug report filed against Anthropic on April 1, 2026 (GitHub issue [anthropics/claude-code #42064](https://github.com/anthropics/claude-code/issues/42064), “Published artifacts show blank screen — postMessage origin mismatch (app://localhost)”) documents the exact failure pattern in detail. The console errors observed are:

```
Uncaught SyntaxError: Failed to execute 'postMessage' on 'Window': Invalid target origin 'app://localhost' in a call to 'postMessage'.
    at 7905-1f7e271de70b4d3c.js:1:6920 (requestHandshake)
Failed to execute 'postMessage' on 'DOMWindow': The target origin provided ('https://www.claudeusercontent.com') does not match the recipient window's origin ('https://claude.ai').
```

The critical phrase is **`app://localhost`**. That is the custom URL scheme used by Capacitor-/Ionic-style hybrid iOS apps when they load their bundled web assets inside a `WKWebView` (Android equivalents are `https://localhost` or `capacitor://localhost`). When the Claude iOS app loads the chat UI inside its WebView, the document origin is *not* `https://claude.ai`; it is something like `app://localhost`. When the artifact iframe then tries to `postMessage` back to its parent using `https://claude.ai` as the expected origin, the browser engine refuses to deliver the message because the actual parent origin doesn’t match. The handshake never completes, the iframe never receives its bootstrap payload, and the iOS app’s UI eventually surfaces the timeout fallback you are seeing.

This explains every part of the symptom set:

- It happens with the simplest possible artifacts (a single ` ` tag) because the failure is at the *transport / handshake* layer, before the artifact’s actual content is ever evaluated.
- It happens identically for HTML and React artifacts (they share the same sandbox iframe loader).
- It works in desktop browsers, because there the parent origin is the expected `https://claude.ai`.
- The error message even concedes the point: *“The code itself may still be valid and functional”*. Anthropic’s own UI is admitting it never got to run the code.

The same class of issue is well documented by hybrid-app developers more generally: Capacitor’s WKWebView serves the app from a custom scheme, and cross-origin iframe `postMessage` calls fail with errors like *“Blocked a frame with origin ‘https://domain.com’ from accessing a frame with origin ‘capacitor://domain.com’. The frame requesting access has a protocol of ‘https’, the frame being accessed has a protocol of ‘capacitor’. Protocols must match.”* (Capacitor issue #5225). iOS’s WKWebView, since iOS 14, also enables Intelligent Tracking Prevention for third-party iframes by default, further restricting cross-origin iframe behavior.

In short: this is an architectural mismatch between (a) Anthropic’s artifact sandbox, which was designed to be embedded only in t
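The origin-matching rule that the post attributes the failure to can be modeled in a few lines. This is a simplified sketch of the general `postMessage` delivery rule (a message is delivered only when `targetOrigin` is `*` or equals the recipient's origin), not Anthropic's handshake code. The helper names `origin_of` and `would_deliver` are hypothetical, and real browsers additionally treat some custom schemes as opaque origins, which this model ignores.

```python
from urllib.parse import urlsplit

# Default ports, so that https://claude.ai and https://claude.ai:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> tuple:
    """Reduce a URL to a (scheme, host, port) origin tuple (simplified model)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def would_deliver(target_origin: str, recipient_url: str) -> bool:
    """Model the delivery check: wildcard, or an exact origin match."""
    if target_origin == "*":
        return True
    return origin_of(target_origin) == origin_of(recipient_url)


if __name__ == "__main__":
    # Parent served from the expected web origin: the handshake can proceed.
    print(would_deliver("https://claude.ai", "https://claude.ai/chat/123"))
    # Parent served from a hybrid-app scheme: the message is dropped.
    print(would_deliver("https://claude.ai", "app://localhost/index.html"))
```

Under this model the second call is the case the post describes: the iframe addresses `https://claude.ai`, the actual parent is `app://localhost`, and nothing is delivered regardless of what the artifact contains.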
Introducing AI finetuner, Source available and free Claude skill to fine tune your vibe coded UI with live preview
Fine-tuning UI with AI right now: "Make the shadow softer." "Stronger." "No, less." "Go back." "A bit more." 17 messages later, you've spent more tokens than the shadow is soft.

I built something that breaks the loop. AI Fine-Tuner (free, source-available) is a plugin that teaches AI coding agents to stop chatting and hand you an actual GUI for your component. Sliders. Color pickers. Live preview. Drag until it feels right. The AI agent automatically opens the editor window for you on your default browser once ready.

Then the magic part: you click one button. The tuner outputs a structured handoff with your exact tuned values mapped to their targets in your code. Paste it back to your AI: it reads the mapping, opens your source, and applies everything precisely. No CSS guesswork, no syntax translation, nothing for you to interpret.

Why it's not just another slider playground:

Bespoke controls, no raw CSS names
Sliders are named in plain English: "Glow softness", "Card lift", "Hover intensity", not "box-shadow-spread-radius". A single slider can drive multiple properties at once. The AI doesn't expose CSS to you; it wires meaningful, human-named controls to your element.

3 prebuilt editor templates: guaranteed polish, every time
The AI doesn't design the editor. It picks one of three prebuilt templates and fills in your component:
- single.html: 1 control, full-screen preview
- small.html: 2-4 controls, preview + bottom grid
- full.html: 5+ controls, grouped sidebar + preview

Slider chrome, color picker, layout, animations, infinite canvas with zoom/pan: all pre-built. No "the AI generated an ugly panel" failure mode. And once it's open, you tune in pure browser JS, with no AI sitting in the loop per drag.

Color picker + hex paste
Pick it or paste it. Done.

Animation tuning
Not just static styles: timing, easing, keyframes too.
Works on ANY platform, language-agnostic
Flutter, SwiftUI, React Native, Tailwind, vanilla CSS, SVG: the AI is meta-prompted to rebuild your component in HTML/CSS for the tuning preview (the web is where sliders work). When you copy back, the AI applies the tuned values to your real source, in your component's original framework. You never leave Flutter to tune Flutter.

Infinite canvas + multiple previews
Drop 5 variations side-by-side and tune them together. The template is a starting point; experiment freely.

Contextually named presets
Every tuner ships with thoughtful presets ("Subtle," "Bold," "Brutalist," whatever fits) so you can ping-pong through variations in one click.

No new software
It's a skill, not an app. Full install guides for Claude Code. One command and you're in.

Website and Live demos: https://muhamadjawdatsalemalakoum.github.io/aifinetuner

Free. Source-available.

#AI #DeveloperTools #ClaudeCode #BuildInPublic #OpenSource #AITools #FrontendDev

submitted by /u/keonakoum
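The two mechanisms the post describes, one named slider driving several properties and a structured handoff mapping tuned values to targets, might look roughly like the sketch below. This is an assumption-laden illustration, not the plugin's actual format: `Binding`, `Control`, `build_handoff`, the CSS custom property names, and the JSON shape are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """One property driven by a slider (all field values are hypothetical)."""
    target: str   # selector or source location the value belongs to
    prop: str     # property name, here a CSS custom property
    low: float    # value at slider position 0.0
    high: float   # value at slider position 1.0
    unit: str = ""

    def value_at(self, t: float) -> str:
        t = min(1.0, max(0.0, t))  # clamp the slider position to [0, 1]
        value = self.low + (self.high - self.low) * t
        return f"{value:g}{self.unit}"


@dataclass(frozen=True)
class Control:
    """A human-named slider that can drive several bindings at once."""
    label: str
    bindings: tuple


def build_handoff(controls, positions):
    """Flatten slider positions into a list of target/property/value records."""
    handoff = []
    for control in controls:
        t = positions[control.label]
        for binding in control.bindings:
            handoff.append({
                "control": control.label,
                "target": binding.target,
                "property": binding.prop,
                "value": binding.value_at(t),
            })
    return handoff


if __name__ == "__main__":
    card_lift = Control("Card lift", (
        Binding(".card", "--shadow-blur", 4, 24, "px"),
        Binding(".card", "--lift-offset", 0, -8, "px"),
    ))
    print(json.dumps(build_handoff([card_lift], {"Card lift": 0.5}), indent=2))
```

Keeping the handoff as plain records of target, property, and value is what would let an agent apply the numbers to a non-web framework without re-deriving them from CSS.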
Repository Audit Available
Deep analysis of open-webui/open-webui — architecture, costs, security, dependencies & more
Open WebUI uses a tiered pricing model. Visit their website for current pricing details.
Key features include: "A home for AI."; "399,196 members sharing what they've built."; "Everything AI offers. Available now."; "AI for every organization."; "Open WebUI is being built so everyone can run AI for themselves."
Open WebUI is commonly used for: developing custom AI chatbots for customer support; creating personalized AI-driven content recommendations; building AI models for data analysis and visualization; integrating AI into existing applications for enhanced functionality; deploying AI solutions for real-time language translation; and utilizing AI for sentiment analysis in social media monitoring.
Open WebUI integrates with: TensorFlow for machine learning model support; PyTorch for deep learning capabilities; Flask for building web applications; Django for creating robust web frameworks; Slack for team collaboration and notifications; Zapier for automating workflows across apps; Google Cloud for scalable cloud computing resources; AWS for cloud-based AI model deployment; Microsoft Azure for integrated AI services; and Jupyter Notebooks for interactive coding and analysis.
Based on user reviews and social mentions, the most common pain points are: cost tracking, token cost, token usage.
Based on 63 social mentions analyzed, 13% of sentiment is positive, 86% neutral, and 2% negative.