Introducing NUI, the Natural User Interface, aimed at revolutionizing how people interact with anything digital by leveraging the power of AI
D-ID is praised for its innovative approach to creating digital characters and enhancing media experiences, gaining recognition mainly for its ability to produce realistic avatars and dynamic video content. However, some users express concerns about GDPR compliance and data privacy, which are pivotal for businesses considering its application. Pricing sentiments are varied, with some users finding the package offerings value-driven while others feel the cost could be prohibitive for smaller enterprises. Overall, D-ID maintains a reputable standing in the industry, noted for cutting-edge AI technology but still navigating user concerns around privacy and cost-effectiveness.
Mentions (30d): 29 (6 this week)
Reviews: 0
Platforms: 2
Sentiment: 0% (0 positive)
Industry: information technology & services
Employees: 150
Funding Stage: Series B
Total Funding: $56.4M
I built an app with Claude Code that converts any text into high-quality audio. It works with PDFs, blog posts, Substack and Medium links, and even photos of text.
I’m excited to share a project I’ve been building over the past few months, created entirely using Claude Code! It’s a mobile app that turns any text into high-quality audio. Whether it’s a webpage, a Substack or Medium article, a PDF, or just copied text, it converts it into clear, natural-sounding speech. You can listen to it like a podcast or audiobook, even with the app running in the background.

The app is privacy-friendly and doesn’t request any permissions by default. It only asks for access if you choose to share files from your device for audio conversion. You can also take or upload a photo of any text, and the app will extract and read it aloud.

- React Native (Expo)
- Node.js, React (web)
- Framer (landing page)

The app is called Frateca. You can find it on Google Play and the App Store. I'm also working on a web version, which is already live.

- Free iPhone app
- Free Android app on Google Play
- Free web version, works in any browser (on desktop or laptop)

Thanks for your support, I’d love to hear what you think!

submitted by /u/OneMoreSuperUser
Streamline your CRM hygiene review process. Prompt included.
Hello! Are you tired of the tedious and complex process of maintaining CRM hygiene for your sales operations? Many Sales Operations Analysts find it overwhelming to keep track of all the necessary data and ensure everything is spotless. This prompt chain simplifies that process for you. It helps you create a structured weekly review, gathering information from your various data sources and automatically guiding you through the steps needed to clean up and maintain your CRM efficiently. Prompt: VARIABLE DEFINITIONS AGENCY_NAME=Insert the agency’s name here CRM_EXPORT_DATE=Date of the latest CRM export (YYYY-MM-DD) REVIEW_PERIOD_DAYS=Number of inactive days that make a deal “stale” ~ You are a Sales Operations Analyst preparing a weekly CRM hygiene review for AGENCY_NAME. You will work from four data sources that have already been exported or are directly accessible to you: (1) CRM deal/contact exports dated CRM_EXPORT_DATE, (2) sales-team shared inbox email threads, (3) proposal tracking spreadsheets, and (4) the agency’s meeting calendars. Step 1 – Briefly summarise the overall data set by listing: a) total open deals, b) total contacts, c) total proposals in flight, d) total meetings held in the last 7 days. Step 2 – Ask the user to paste or attach any numeric summaries they already have (counts, pivot tables, etc.) so you can reference them in later prompts. Output the summary in a four-row table. End with: “If the numbers look correct, reply CONTINUE.” ~ Great. Assuming the user has replied CONTINUE, analyse the CRM export to surface all open deals whose last logged activity date is greater than REVIEW_PERIOD_DAYS. 1. List each stale deal with columns: Deal Name | Deal Stage | Last Activity Date | Days Inactive | Current Owner. 2. Include a short note column suggesting the likely next action (e.g., "Send follow-up email" or "Schedule discovery call"). 3. Finish with a one-line count: “Total stale deals: X”. 
Ask the user to confirm or annotate any deal notes, then reply CONTINUE. ~ Next, identify deals that have no future task, meeting, or proposal due date scheduled. 1. Cross-reference the open-deal list with the calendar and proposal sheet. 2. Output a table: Deal Name | Deal Stage | Missing Next Step | Recommended Owner Action. 3. Conclude with: “Total deals missing next steps: Y”. Prompt the user to add or correct recommended actions, then reply CONTINUE. ~ Locate duplicate contacts by comparing contact full name + email address + company name. 1. Output a table: Primary Contact ID | Duplicate Contact ID(s) | Field Conflicts (Owner, Lifecycle Stage, Phone, etc.) | Merge Recommendation. 2. Provide a bulleted “How-to merge” reminder (max 3 bullets). Ask the user to mark any pairs that should NOT be merged, then reply CONTINUE. ~ Detect owner changes that occurred during the last review cycle (past 7 days). 1. List items separately for deals and contacts. 2. Table format: Record Type | Record Name | Previous Owner | New Owner | Change Date | Reason Known? (Yes/No). 3. Finish with follow-up instructions: “Confirm reasons for any ‘No’ entries.” When done, reply CONTINUE. ~ Compile the Weekly CRM Hygiene Checklist for AGENCY_NAME. 1. Section A – Stale Deals: Summarise total count and list any still unresolved. 2. Section B – Deals Missing Next Steps: Summarise and list. 3. Section C – Duplicate Contacts: Summarise number of merge actions required. 4. Section D – Owner Changes Requiring Validation. 5. Section E – Additional Cleanup Actions: max 5 bullets (e.g., “Archive closed-lost deals older than 90 days”). 6. Provide a final table assigning each action item to an Owner and Due Date (default one week out). End with: “Weekly CRM hygiene checklist complete. Confirm all sections before distribution.” ~ Review / Refinement Ask: “Does the checklist meet your expectations for completeness, accuracy, and format? 
Reply APPROVE or list edits.” Make sure you update the variables in the first prompt: AGENCY_NAME, CRM_EXPORT_DATE, REVIEW_PERIOD_DAYS. Here is an example of how to use it: AGENCY_NAME = "Acme Corp" CRM_EXPORT_DATE = "2023-10-01" REVIEW_PERIOD_DAYS = "30" If you don't want to type each prompt manually, you can run the Agentic Workers, and it will run autonomously in one click. NOTE: this is not required to run the prompt chain. Enjoy! submitted by /u/CalendarVarious3992 [link] [comments]
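The stale-deal step of the chain above is simple date arithmetic, so it can be sanity-checked outside the LLM. The following is a minimal sketch, not part of the original prompt chain: the `Deal` class, the `stale_deals` helper, and the two sample deals are all hypothetical and made up for illustration. It assumes "stale" means the number of days between the last logged activity and CRM_EXPORT_DATE is strictly greater than REVIEW_PERIOD_DAYS.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deal:
    # Hypothetical record shape; a real CRM export will have different fields.
    name: str
    stage: str
    last_activity: date
    owner: str


def stale_deals(deals, export_date, review_period_days):
    """Return (deal, days_inactive) pairs for deals inactive longer than the threshold."""
    flagged = []
    for deal in deals:
        days_inactive = (export_date - deal.last_activity).days
        if days_inactive > review_period_days:
            flagged.append((deal, days_inactive))
    # Most neglected deals first
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return flagged


# Made-up sample data, using the example variables from the post
deals = [
    Deal("Website redesign", "Proposal", date(2023, 8, 15), "Dana"),
    Deal("SEO retainer", "Discovery", date(2023, 9, 25), "Lee"),
]
result = stale_deals(deals, export_date=date(2023, 10, 1), review_period_days=30)

for deal, days in result:
    print(f"{deal.name} | {deal.stage} | {deal.last_activity} | {days} | {deal.owner}")
print(f"Total stale deals: {len(result)}")
```

Comparing a count like this against the model's "Total stale deals: X" line is one cheap way to catch miscounts before replying CONTINUE.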
Neevu is finally launched! As a new parent, this journey was definitely not easy.
I became a dad in November 2025, and the first two months were so chaotic. I looked for parenting apps to help us through it, but most were either too expensive or just not something we connected with. I’m a Product Designer (UI/UX) by profession, so one day I thought, why not build the app we wished we had? Building an app while learning how to take care of a tiny new life at the same time was a challenge. My wife and I spent weeks brainstorming, improving, testing, and refining every part of the app together. It’s still an MVP, but we’re proud of what we’ve built as parents. Neevu is a baby development, growth tracking, and parenting app for babies aged 0–12 months, built with Indian parenting in mind. We divided the app into two phases: Gentle Phase and Play Phase. Gentle Phase (0–2 months) The first two months can be overwhelming and anxiety-inducing. We wanted this phase to feel supportive instead of stressful. That’s why Neevu is completely free for parents with 0–2 month babies. No paywalls. No locked features. Just guidance when parents need it the most. Parents can choose to support us with Premium, but it’s completely optional during this phase. Gentle Phase includes: Weekly guidance to help parents understand baby’s growth and what to expect next Gentle Essentials, simple newborn reminders without pressure or endless checklists Daily affirmations for difficult days Milestones and Growth tracking Songs and lullabies Parenting articles This is our small gift to new parents. Play Phase (2–12 months) As babies grow, Neevu becomes more activity-focused. Play Phase is completely free for the first 14 days. No credit-card required. 
It includes: Daily age-based developmental activities Activities focused on cognitive, physical, social, emotional, and language development CDC-based milestone tracking WHO-based height and weight tracking Parenting articles covering various topics for babies, moms and dads Stories, lullabies, action songs, and folk tales One thing we consciously included was article support for dads. We noticed that a father’s mental well-being is often ignored after childbirth, and we wanted Neevu to acknowledge that too. All content inside Neevu is strictly reviewed using guidelines from AAP, IAP, CDC, and WHO. We never wanted to build something we wouldn’t personally trust as parents. We hope Neevu helps make life a little easier for new parents trying to figure things out one day at a time. If you’d like to support us, please download the app on the Play Store and leave a rating or review ❤️ Get it on Play Store: https://play.google.com/store/apps/details?id=com.neevu.app Built using Claude Code, Codex, Figma, and ChatGPT. iOS app is coming soon. submitted by /u/VisAlGhul [link] [comments]
If you're NOT having usage or drift issues, have you turned off auto-memory?
There's a running debate in this community: some people say Opus is nerfed, usage evaporates after two prompts, sessions drift and get "stupid." Others say everything's fine. The common theory is Anthropic is A/B testing or ranking preferred customers. I think there's a simpler explanation, and I'd like the community's help testing it.

The hidden variable: Claude Code's auto-memory directory

Claude Code has a feature (on by default since v2.1.59) that silently creates individual .md files in ~/.claude/projects/*/memory/ every time it decides something is worth remembering about you or your project. Each memory gets its own file. There's no consolidation, no dedup, and no size management. These files load as instructions at the start of every session. Not as conversation, but as instructions. The model weighs them heavily.

What I found in my projects

I audited every project on my machine:

- 136 memory files across 18 projects
- 432KB total (~108-140K tokens of instruction overhead)
- One project alone had 41 files
- Found direct contradictions between files: one file listed brand terms as approved, another (written later) said those same terms were explicitly rejected by the client

When you have 20+ feedback files giving slightly different guidance about how to approach your work, the model tries to honor all of them simultaneously. It averages across conflicting signals. That averaging is what people experience as drift. It's not that Opus got dumber; it's that it's being pulled in 20 directions by its own instruction set.
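For readers who want to reproduce the token figure from their own byte count, here is a rough sketch of the arithmetic. It is an assumption-laden estimate, not a tokenizer: the `estimate_tokens` helper is hypothetical, and the two bytes-per-token ratios (4.0 and 3.1) are simply back-derived from the post's own "432KB is about 108-140K tokens" range, taking 1KB as 1000 bytes.

```python
def estimate_tokens(total_bytes: int, bytes_per_token: float) -> int:
    """Rough token estimate from a byte count (assumed ratio, not a real tokenizer)."""
    if bytes_per_token <= 0:
        raise ValueError("bytes_per_token must be positive")
    return round(total_bytes / bytes_per_token)


total_bytes = 432 * 1000  # 432KB as reported in the post, assuming 1KB = 1000 bytes

low = estimate_tokens(total_bytes, 4.0)   # assumed ratio for the low end
high = estimate_tokens(total_bytes, 3.1)  # assumed ratio for the high end

print(f"roughly {low:,} to {high:,} tokens")
```

Actual token counts depend on the tokenizer and on the content of the files, so treat the result as an order-of-magnitude check only.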
Check yours right now

    # Count memory files and their total size for each Claude Code project
    for dir in ~/.claude/projects/*/memory/; do
      if [ -d "$dir" ]; then
        project=$(basename "$(dirname "$dir")")
        count=$(find "$dir" -name "*.md" 2>/dev/null | wc -l | tr -d ' ')
        size=$(find "$dir" -name "*.md" -exec cat {} + 2>/dev/null | wc -c | tr -d ' ')
        if [ "$count" -gt 0 ]; then
          echo "$count files, $(($size/1024))KB — $project"
        fi
      fi
    done | sort -t, -k1 -rn   # projects with the most files first

The question for this community

People who say they have NO issues with usage limits or drift: have you also turned off auto-memory ("autoMemoryEnabled": false in settings), or do you actively manage your memory files? Because if there's a strong correlation between clean/disabled memory and good session quality, that's a signal that this is a real contributing factor.

And for people who ARE hitting usage walls or experiencing drift: run that diagnostic. If you're sitting on 30+ memory files with contradictions you didn't know about, that's worth knowing.

I'm not claiming this explains everything. Model changes, server-side factors, plan differences: those are all real variables. But memory hygiene is the one variable you can actually control, and I don't see anyone talking about it.

The fix

I built a Claude Code skill (/memory-cleanup) that:

- Audits your memory directory and reports what's there
- Consolidates everything into 2 managed files (MEMORY.md + feedback.md)
- Surfaces contradictions for your review
- Installs write-mode instructions that prevent re-bloating

Yes, it works retroactively as well. Tested on a 7-file project and a 41-file project: both cleaned up, contradictions resolved, no data loss.

To install (one command):

    mkdir -p ~/.claude/commands && curl -sL https://gist.github.com/evanvandyke/a7063a8e5c838673a55df0be10f4892c/raw -o ~/.claude/commands/memory-cleanup.md

Then run /memory-cleanup in any project.

What this doesn't fix

This manages the content quality of your memory files: contradictions, redundancy, bloat.
It can't change the system-level instructions that Anthropic bakes into Claude Code, and it can't address model-level changes or server-side throttling. But it removes one real source of noise from your sessions. Note: Anthropic has added an "Auto Dream" consolidation feature that prunes memory between sessions. This skill goes further — it restructures memory into a managed 2-file system with write-mode guardrails that prevent the accumulation pattern from recurring. Built collaboratively with Claude (Opus 4.7). I drove the diagnosis and design decisions; Claude did the auditing and skill construction. Sharing because the diagnostic is free and takes 10 seconds — if it helps even a few people, worth the post. submitted by /u/really_evan [link] [comments]
Streamline your accounts payable audits. Prompt included.
Hello! Are you struggling with organizing and validating accounts payable data for home-services or construction companies? This prompt chain helps automate the process of normalizing, checking for duplicates, and validating invoices and receipts. It lays out a step-by-step method for managing and reviewing financial documents effectively! Prompt: VARIABLE DEFINITIONS [CONTRACTOR_NAME]=Legal name of the home-services contracting company that is reviewing payables. [SOURCE_DATA]=Full combined text (or links to OCR text) from the cycle’s supplier invoices, receipts, job-cost spreadsheets, and vendor contract terms. [OUTPUT_LEVEL]="summary" for a one-line per issue list, "detailed" for expanded explanations and source references. ~ You are a senior Accounts-Payable Audit Assistant for construction and home-services firms. Your first task is to NORMALISE all raw information supplied in SOURCE_DATA. Step 1 Parse every document, identify and extract the following fields where available: • Vendor Name • Document Type (Invoice / Receipt) • Document No. • Document Date • Job or Cost-Code / PO No. • Line-Item Description • Quantity & U/M • Unit Price • Line Total • Invoice Sub-Total, Tax, Grand Total • Contract Reference Price or Rate • Budgeted Amount for that Job-Cost line (from spreadsheets) • Standard Approver (from company policy or prior data) Step 2 Return one master table named "MasterCharges" with the above columns. Step 3 If information is missing, leave the cell blank but keep the row; do NOT guess values. Output: MasterCharges table only. ~ You are still the AP Audit Assistant. Using MasterCharges, perform a DUPLICATE CHECK. Step 1 Identify potential duplicates by matching any TWO of the following: (Vendor Name + Document No.), (Vendor Name + Line-Item Description + Amount + Date within ±2 days), or exact hash of line totals. Step 2 List all suspected duplicates in a table: Vendor, Document No., Date, Duplicate Matched With, Reason Flagged. 
Step 3 Add a "Needs AP Review? (Y/N)" column defaulting to "Y". Output only this duplicates table. ~ Validate JOB or COST-CODE completeness. Step 1 Scan MasterCharges for blank or obviously invalid Job / PO numbers (e.g., fewer than 4 digits, non-alphanumerics). Step 2 Return a table: Vendor, Document No., Line Description, Amount, Missing or Invalid Job No. (Yes/No), Suggested Next Action. ~ Check PRICE & CONTRACT compliance. Step 1 For every line in MasterCharges that has a Contract Reference Price, compare Unit Price against Contract Price. Step 2 Flag if Unit Price exceeds Contract Price by >0.5%. Step 3 For lines with Budgeted Amounts, flag if (Cumulative Actual > Budget) OR (Unit Price > Budget / Quantity by >5%). Step 4 Output a table: Vendor, Doc No., Job No., Description, Contract Price, Invoiced Price, % Variance, Budget Over/Under, Flag Type (Contract or Budget), Needs Manager Approval? (Y/N). ~ Compile the QA CHECKLIST for payment release. Step 1 Aggregate all flagged items from previous prompts. Step 2 Structure the checklist with these sections: A) Duplicate Charges B) Missing or Invalid Job Numbers C) Price / Budget Mismatches D) Questions Requiring Manager / Approver Input Step 3 For each item include: Reference ID, Vendor, Doc No., Issue Summary, Recommended Action. Step 4 If OUTPUT_LEVEL = "summary" show one line per issue; if "detailed" append a Notes column citing exact source lines or clause numbers. Step 5 End with a YES/NO question: "Is this checklist complete and ready for AP manager review?" ~ Review / Refinement Please examine the QA checklist produced. 1. Confirm that all duplicate charges, missing job numbers, price mismatches, and approval questions are represented. 2. Indicate if additional data or clarification is required. 3. Respond with one of: • "Approved – proceed with payment processing once issues are cleared" • "Needs Revision – see comments" Provide comments if revision is needed. 
Make sure you update the variables in the first prompt: [CONTRACTOR_NAME], [SOURCE_DATA], [OUTPUT_LEVEL]. Here is an example of how to use it: [CONTRACTOR_NAME] = "YourContractor LLC" [SOURCE_DATA] = "[link to invoices]" [OUTPUT_LEVEL] = "detailed" If you don't want to type each prompt manually, you can run the Agentic Workers, and it will run autonomously in one click. NOTE: this is not required to run the prompt chain Enjoy! submitted by /u/CalendarVarious3992 [link] [comments]
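The contract-compliance rule in the chain above ("flag if Unit Price exceeds Contract Price by >0.5%") can be written down precisely, which helps when spot-checking the model's flags. This is a minimal sketch under stated assumptions: the function names are hypothetical, the prices are made-up examples, and `Decimal` is used so that a price exactly 0.5% over contract is not flagged because of floating-point rounding.

```python
from decimal import Decimal


def contract_variance_pct(unit_price: Decimal, contract_price: Decimal) -> Decimal:
    """Percentage by which the invoiced unit price differs from the contract price."""
    if contract_price <= 0:
        raise ValueError("contract_price must be positive")
    return (unit_price - contract_price) / contract_price * 100


def exceeds_contract(unit_price: Decimal, contract_price: Decimal,
                     tolerance_pct: Decimal = Decimal("0.5")) -> bool:
    """True when the unit price is over contract by strictly more than the tolerance."""
    return contract_variance_pct(unit_price, contract_price) > tolerance_pct


# Made-up example lines: (invoiced unit price, contract price)
lines = [
    (Decimal("100.60"), Decimal("100.00")),  # 0.6% over
    (Decimal("100.50"), Decimal("100.00")),  # exactly 0.5% over
    (Decimal("99.00"), Decimal("100.00")),   # under contract
]
flags = [exceeds_contract(unit, contract) for unit, contract in lines]
print(flags)
```

The strict comparison mirrors the ">0.5%" wording in the prompt; if your policy treats exactly 0.5% as a violation, the operator would need to change to `>=`.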
Simplify your restaurant's month-end reconciliation process. Prompt included.
Hello! Are you tired of the chaos that comes with reconciling your restaurant's month-end finances? This prompt chain walks you through a structured process to quickly and accurately reconcile your restaurant's monthly transactions, ensuring everything is in order without the stress. Prompt: [VARIABLE DEFINITIONS] [PERIOD]=Month and year to be reconciled (e.g., August 2023) [RESTAURANT_NAME]=Official operating name that must appear on every output [OUTLIER_THRESHOLD]=Percentage variance from the category mean that should trigger an “Odd Total” flag (e.g., 25) ~ Prompt 1 — Data Intake & Setup 1. You are an expert restaurant bookkeeper tasked with reconciling month-end spend for RESTAURANT_NAME covering PERIOD. 2. Request the following four source files from the user. Instruct the user to use the exact file naming convention shown: a. “1_BankExport_PERIOD.csv” – Clean CSV directly from the bank portal. b. “2_POS_Summary_PERIOD.csv” – End-of-month POS summary export. c. “3_ExpenseSheet_PERIOD.xlsx” – Internal expense spreadsheet. d. “4_ReceiptPhotos_PERIOD.zip” – Zipped folder of all receipt images or PDFs. 3. Ask the user to confirm currency, time-zone and accounting basis (cash vs accrual) if not obvious. 4. Once all four files are provided, reply with “FILES RECEIVED – ready to extract” to trigger the next prompt. ~ Prompt 2 — Extract & Normalize Transactions Step 1 | Bank Export • Parse every row of 1_BankExport_PERIOD.csv. • Capture Date, Payee, Amount (signed), Memo/Description, and unique Transaction ID. Step 2 | POS Summary • Parse 2_POS_Summary_PERIOD.csv capturing Date, Gross Sales, Net Sales, Tax, Tips, Payment Type, and POS Reference ID. Step 3 | Expense Spreadsheet • Parse 3_ExpenseSheet_PERIOD.xlsx (assume first sheet) capturing Date, Vendor, Amount, Internal Category, and Note. Step 4 | Receipt Photos • For every file in 4_ReceiptPhotos_PERIOD.zip run OCR; capture Vendor, Date, Total, Tax, Tip and file-name as Receipt Link. 
Step 5 | Unify • Produce a master table named “All_Transactions_Raw” with columns: Date | Vendor/Payee | Amount | Source (Bank / POS / Expense / Receipt) | Source_ID | Notes • Provide the table as an array of JSON objects for machine readability. Confirm extraction completed with “EXTRACTION COMPLETE – ready to categorize”. ~ Prompt 3 — Categorize Transactions 1. Create a reference Chart of Accounts typical for full-service restaurants: • Food Cost (COGS) • Beverage Cost (COGS) • Payroll & Labor • Operating Supplies • Utilities • Rent & Lease • Marketing & Promotion • Repairs & Maintenance • Capital Expenditure • Miscellaneous 2. Using keywords in Vendor/Payee and Notes, assign each row in All_Transactions_Raw to the most appropriate category; if uncertain assign “Miscellaneous” and add a note “Needs Review”. 3. Output a new table “All_Transactions_Categorized” including all prior columns plus a new “Category” column. 4. Provide summary totals per category. Return “CATEGORIZATION COMPLETE – ready to reconcile”. ~ Prompt 4 — Reconcile & Flag Step 1 | Missing Receipts • Compare every Bank or Expense row against Receipt rows (match on Amount ±1% and Date ±3 days). • Flag rows with no matching receipt; add column MissingReceipt=Yes/No. Step 2 | Odd Totals • For each Category calculate mean and standard deviation. • Flag any Amount whose absolute percentage variance from the category mean exceeds OUTLIER_THRESHOLD%; add column OddTotal=Yes/No. Step 3 | Duplicates & Mismatches • Detect duplicate rows (same Date, Amount, Vendor) across sources; flag Duplicate=Yes/No. • Highlight any POS Net Sales that do not match summed Bank deposits for the same day; list differences. Step 4 | Produce “Reconciliation_Detail” table with all flags appended. Respond “RECONCILIATION COMPLETE – ready for workbook generation”. ~ Prompt 5 — Generate Final Workbook & Handoff Tabs 1. Using Reconciliation_Detail create the following four logical tabs (output each as its own JSON array): a. 
“Summary_By_Category” – Columns: Category | Count | Total Spent | % of Total. b. “Missing_Receipts” – Filter MissingReceipt=Yes. Columns: Date | Vendor | Amount | Source | Notes. c. “Odd_Totals” – Filter OddTotal=Yes. Columns: Date | Vendor | Amount | Category | % Variance | Notes. d. “Bookkeeper_Handoff” – Clean list excluding internal calculation columns. Columns: Date | Vendor | Amount | Category | ReceiptLink | Comments (populate with MissingReceipt/OddTotal flags). 2. Provide a final object named “Workbook_PERIOD.json” containing all four arrays keyed by tab name so it can be imported directly into Excel or Google Sheets. 3. Finish with the sentence: “WORKBOOK READY – please review”. ~ Review / Refinement Ask the user to confirm that: • All four data sources were fully captured. • Categories and flagging thresholds look accurate. • The Workbook_PERIOD.json structure opens as expected in their spreadsheet tool. Invite any adjustments (e.g., new category, different OUTLIER_THRESHOLD). Apply revisions iteratively u
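The receipt-matching rule from Prompt 4, Step 1 (match on Amount ±1% and Date ±3 days) can be sketched as follows. This is an illustration only, with assumptions that the prompt does not spell out: the `matches_receipt` helper is hypothetical, the ±1% tolerance is taken relative to the bank or expense amount, and signs are ignored because bank exports record spend as negative while receipts show positive totals.

```python
from datetime import date


def matches_receipt(txn_amount: float, txn_date: date,
                    receipt_total: float, receipt_date: date,
                    amount_tolerance: float = 0.01, day_window: int = 3) -> bool:
    """True when a receipt plausibly corresponds to a bank or expense row."""
    # Compare magnitudes only; tolerance is a fraction of the transaction amount (assumption)
    amount_gap = abs(abs(txn_amount) - abs(receipt_total))
    amount_ok = amount_gap <= amount_tolerance * abs(txn_amount)
    date_ok = abs((txn_date - receipt_date).days) <= day_window
    return amount_ok and date_ok


# Made-up examples: a -50.00 bank row against three candidate receipts
bank_amount, bank_date = -50.00, date(2023, 8, 10)
close_match = matches_receipt(bank_amount, bank_date, 50.25, date(2023, 8, 12))
wrong_amount = matches_receipt(bank_amount, bank_date, 51.00, date(2023, 8, 10))
too_late = matches_receipt(bank_amount, bank_date, 50.00, date(2023, 8, 14))
print(close_match, wrong_amount, too_late)
```

A row with no receipt for which this returns true would get MissingReceipt=Yes in the Reconciliation_Detail table.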
A very specific error I witnessed.
submitted by /u/Asrobatics
The Mundane Risk
The biggest near-term AI safety risks aren't dramatic — they're mundane. And that's precisely why they're neglected. This essay argues three things: (1) mundane AI failures are already causing measurable damage at scale, (2) current alignment approaches may depend more heavily on sandboxed environments than the field openly acknowledges, and (3) capability convergence and deployment pressure are making accidental open-world exposure increasingly plausible before robust ethical reasoning exists. (written with the help by Claude 4.6 Opus) The Atomic Bomb Before the atomic bomb existed, the risk of nuclear annihilation was 0%. Those who warned about the theoretical possibility were easily dismissed. Why worry about a risk whose preconditions don't even exist yet? In The Precipice, Toby Ord argues that when the stakes are existential or near-existential, even small probabilities demand serious attention. When the expected harm is so large, dismissing it on the basis of low likelihood is not caution but negligence. Before the bomb was built, the total risk of nuclear annihilation was absolutely 0%. Yet once it was invented, even a fraction of a percent justified enormous investment in prevention. The question was never "is nuclear war likely?" It was "can we afford to be wrong?" The same logic applies to AI. The preconditions for the next class of risk are visibly converging. And we're repeating the same pattern of dismissal that history has punished before. The Pattern As Leopold Aschenbrenner noted in Situational Awareness: "It sounds crazy, but remember when everyone was saying we wouldn't connect AI to the internet?" He predicted the next boundary to fall would be "we'll make sure a human is always in the loop." That prediction has already come true. Last year I argued how AI might accidentally escape the lab as a consequence of cumulative human error (for a vivid illustration of a parallel chain of events, I'd recommend the Frank scenario). 
At the time of writing, the argument that cumulative human oversight failures could compromise AI agents was dismissed as implausible: the consensus was that existing security protocols were sufficient. Months later, OpenClaw validated the structural pattern at scale. Not because the AI was misaligned, but because humans deployed it faster than they could secure it. It was clear: the failure modes from the Frank scenario could no longer be dismissed as simple fiction; it was now a structural pattern that OpenClaw validated in the real world. And this was all just with relatively simple autonomous agents. As capabilities increase, the same pattern of human excitement overriding security oversight doesn't go away – it gets worse – and because the agents are more capable, the failures also become a lot harder to detect. The numbers confirm this: [88% of organizations reported confirmed or suspected AI agent security incidents]() 14.4% of AI agents go live with full security and IT approval 93% of exposed OpenClaw instances reportedly had exploitable vulnerabilities [[MOU1]](#_msocom_1) Mundane risk pathways aren't hypothetical. They're already here in rudimentary form, and they're being neglected. We’ve known for a long time that existential risks aren’t just decisive, they’re also accumulative. And so far every safety breach has been mundane with systems operating inside their intended environments. No agent tries to escape on their own — their behaviour (like Frank’s) is usually a direct consequence of what they were deployed to do combined with accidental human oversight. So consider: if we can't secure the sandbox door with today's relatively simple agents, what happens when the systems inside are capable enough that a single oversight failure doesn't just expose a vulnerability? The capabilities required for autonomous operation outside the lab are converging on a known timeline. 
If AI were to leave the nest today, would it be prepared for an uncurated, messy world? Or would it be like the child and the socket? Current Alignment: Progress, But Fast Enough? Admittedly, the field is making real progress and Anthropic's recent publication "Teaching Claude Why" represents a real step forward. It was long suspected that misalignment doesn't require intent, just pattern completion over a self-referential dataset. But Anthropic has now traced one empirical pathway with findings consistent with the idea that scheming-like behaviour emerges from default priors in pre-training. Furthermore, their study also confirmed that rule-following doesn't generalize well, and understanding why matters more than simply knowing what. The significance of this is that it puts traditional alignment strategies into serious doubt and highlights the fundamental limits that current constitutional AI and character-based approaches still do not resolve. After all, we now have strong empirical evidence that behavioural alignment issues are most likely shaped by default prio
PullMD v2.4.1 is out - claude.ai web custom connector works natively now, plus what 2 weeks of your feedback turned into
Two weeks ago I posted PullMD here. 385 upvotes, around 60 comments, a bit over 20 GitHub issues, and 7 releases (v1.1.3 → v2.4.0) in 14 days. That was a great experience - and this sub in particular has been a genuinely good place to share something. So: thanks! Quick refresher for anyone who missed the first post: PullMD turns any URL into clean Markdown via MCP, fully self-hosted. Three services in Docker (main app + Trafilatura sidecar + optional Playwright sidecar for JS-heavy pages), zero third-party LLM calls, ships an MCP server so Claude Code / Claude Desktop / claude.ai web can pull clean content directly instead of parsing HTML in your context window. This post is what's new and how to get it. What's new claude.ai web + Claude Desktop work natively now This is the biggest unlock from v2.x. The claude.ai web custom-connector dialog and Claude Desktop's custom-connector dialog now both work against self-hosted PullMD instances. So you can point claude.ai at your own homelab box, hit "Add custom connector," and it works end-to-end. Setup is two env vars: OAUTH_JWT_SECRET=$(openssl rand -hex 32) PUBLIC_URL=https://your-host.example.com Restart. Then in claude.ai web → Settings → Connectors → Add custom, point at https://your-host.example.com/mcp. The connector dialog discovers the server's metadata, registers itself, and walks you through a consent screen. Same flow works in Claude Desktop. Under the hood: standard OAuth 2.1 Authorization Code flow with PKCE-S256 and Dynamic Client Registration - RFC-compliant so any spec-compliant MCP client should work, not just claude.ai/Desktop. Opt-in: if OAUTH_JWT_SECRET isn't set, behavior is identical to v1.x. The Anthropic-side claude-ai-mcp#237 proxy bug I flagged in EDIT2 of post 1 has cleared on their end - though in hindsight, a forgotten custom WAF rule on my side was likely the actual culprit anyway. Verified end-to-end against both dialogs. 
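For readers unfamiliar with the PKCE-S256 part of that flow: the client generates a random code verifier, sends its SHA-256 hash (base64url-encoded without padding) as the code challenge, and later proves possession by presenting the verifier. The sketch below shows only that derivation as defined in RFC 7636; it is not PullMD's code, and the helper names are hypothetical.

```python
import base64
import hashlib
import secrets


def new_code_verifier() -> str:
    """Generate a random URL-safe code verifier (43 characters for 32 random bytes)."""
    return secrets.token_urlsafe(32)


def pkce_s256_challenge(code_verifier: str) -> str:
    """Derive the S256 code challenge: BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = new_code_verifier()
challenge = pkce_s256_challenge(verifier)
print("code_verifier: ", verifier)
print("code_challenge:", challenge)
```

The client sends `code_challenge` with `code_challenge_method=S256` in the authorization request and `code_verifier` in the token request; the server recomputes the hash and compares.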
Multi-user auth
Until v2.0, PullMD was effectively single-tenant - a personal homelab tool, open like a barn door to anyone who landed on it. v2.0 adds three auth modes via PULLMD_AUTH_MODE:
- disabled - the default. Identical to v1.x. No login, no API key required. Right if you're the only one using your instance and you trust your network.
- single-admin - one user, password-protected, no self-signup. Right for a homelab box where you want the GUI gated but don't want to manage users.
- multi-user - self-signup at /signup, per-user history isolation, per-user API keys. Right for a shared instance (team, office, friend group).
API keys are pmd_ , sent as Authorization: Bearer pmd_xxx, managed at /settings. Share links (/s/:id) stay public in all modes - the whole point of a share link is to be shareable.
Minimal upgrade for a shared instance:
PULLMD_AUTH_MODE=multi-user
PULLMD_ADMIN_EMAIL=you@example.com
PULLMD_ADMIN_PASSWORD=change-me-please
PullMD works on more sites
A bunch of things in v1.2 and v2.2 together close gaps where PullMD used to silently return half-articles, empty bodies, or garbled text:
- Future PLC family (windowscentral.com, tomshardware.com, techradar.com, pcgamer.com, gamesradar.com, t3.com) used to return mangled content because Readability got confused by recommendation widgets stuffed mid-article and an aria-hidden paywall pattern. The default site-recipes shipped with v2.2 strip both, no config needed.
- GitHub Issues pages used to return only the original issue body - the JS-rendered comment thread never made it in. The default recipe for */*/issues/* now forces Playwright with wait_for: .js-comment-body, so you get the full comment tree.
- Sites that fingerprinted the old hardcoded Chrome 131 UA now extract cleanly - UA rotation pulls from a real-world UA pool that updates regularly (v1.2).
- Pages with navigator.webdriver-style anti-bot detection go through more often - the headless-Chromium sidecar bundles playwright-stealth (v2.2).
- Sites without an explicit charset declaration (a lot of older German news sites, for example) no longer return mojibake - charset is detected from the byte stream when the response is silent (v1.2).
If you have a specific site that still misbehaves, v2.2 lets you (or your Claude Code) write your own recipe - declarative JSON with four rule categories (preprocess, fetch, select, extractor). Drop it at data/site-recipes.json and your rules layer on top of the defaults. There's also a /api/recipes/status endpoint for monitoring.
Web GUI: rendered Markdown view + persistent settings
Two smaller improvements in the browser frontend (the PWA you get when you open your PullMD instance directly):
- Rendered Markdown toggle. The result header now has a Raw | Rendered switch, so you can read what you pulled as formatted HTML directly in the browser instead of squinting at the source. Raw stays the default; your choice persists across sessions (v2.4).
- Settings persist across reloads - frontmatter toggle, comments toggle, comment-depth input.
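The charset fix mentioned above (v1.2) can be illustrated with a small sketch. The post does not say how PullMD actually detects the encoding, so the following is an assumed fallback strategy using only the Python standard library: honor a declared charset if it works, otherwise try strict UTF-8 and fall back to Windows-1252. `decode_body` is a hypothetical helper, not a PullMD function.

```python
def decode_body(raw, declared=None):
    """Decode response bytes, guessing the charset when none is usable."""
    if declared:
        try:
            return raw.decode(declared)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown or wrong label: fall through to guessing
    try:
        # Strict UTF-8 rejects most legacy single-byte text that contains
        # non-ASCII characters, so a clean decode is a strong signal.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback for older Western European pages; bytes that
        # cp1252 leaves undefined are replaced rather than raising.
        return raw.decode("cp1252", errors="replace")
```

A real detector would likely inspect byte statistics or meta tags as well; this only shows why guessing from the bytes avoids mojibake on pages like the German news sites mentioned above.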
Getting good predictions without data cleaning (Why "Garbage In, Garbage Out" is sometimes a trap)
Full arXiv Preprint: https://arxiv.org/abs/2603.12288
Paper Simulation Github: https://github.com/tjleestjohn/from-garbage-to-gold
Hi r/artificial,
It's a dirty little secret to many of us... sometimes, downstream AI/ML models perform surprisingly well when you just hand them raw, error-prone tabular data instead of heavily curated feature sets. Despite this, the vast majority of our field tends to be fiercely loyal to "Garbage In, Garbage Out" (GIGO). While automated ETL pipelines are absolutely essential for structuring data, our workflows are still bottlenecked by endless manual cleaning and aggressive imputation just to curate pristine, error-free tables.
My co-authors and I recently released a preprint on arXiv (From Garbage to Gold) arguing that treating GIGO as a universal law can sometimes be a trap... especially in the context of big data (many columns). We argue that the bottleneck created by manual data cleaning can actively lower the predictive ceiling of our models when latent causes drive the system's behavior.
To be clear upfront: we are not arguing against ETL. Parsing JSON, handling schema evolution, and standardizing types are non-negotiable. What we are arguing against is the universal assumption that "clean" data (via manual data scrubbing and aggressive imputation) is non-negotiable for big data predictive AI/ML modeling.
Here is why the traditional mindset can be limiting:
1. We conflate two different types of "noise" (Predictor Error and Structural Uncertainty).
Usually, we just lump all noise into one big bucket. But if you split that noise into two specific categories, the math changes completely:
- Predictor Error: Random typos, dropped logs, or transient glitches.
- Structural Uncertainty: The inherent, unresolvable gap between recorded metrics and the complex, hidden reality they represent.
We spend months manually scrubbing data because the threat of data errors is obvious, while Structural Uncertainty is often an afterthought at best.
However, when latent causes drive a system, manual scrubbing fixes noise due to errors, but it fundamentally cannot fix the noise due to Structural Uncertainty. On the other hand, the paper shows that in this context, if you use a comprehensive, high-dimensional data architecture, a flexible model can actually triangulate the hidden drivers reliably despite the presence of data errors. When keeping a massive amount of messy, highly correlated variables (even if error-prone), the sheer volume of redundant signals allows the model to drown out individual errors (bypassing the cleaning bottleneck) and simultaneously overcome Structural Uncertainty. This redefines "data quality." It's not only about how accurately the variables are measured. It's also about how the portfolio of variables comprehensively and redundantly covers the latent drivers of the system. 2. Manual cleaning is a bottleneck on dimensionality (The Practical Problem). To overcome Structural Uncertainty, modern AI/ML models want to find the underlying latent drivers of a system (think Representation Learning but with tabular data). To do this, however, they need a high-dimensional set of variables that contains Informative Collinearity in order to mathematically triangulate the hidden drivers. The moment you introduce manual cleaning, you create a human bottleneck. Because we cannot manually clean 10,000 variables, we are forced to drop 9,900 of them. By artificially restricting the predictor space to make it "clean enough to model," we can harm the data architecture's inherent potential to triangulate those latent drivers. We sacrifice the model's actual predictive ceiling just to satisfy the GIGO heuristic. Ultimately, this suggests we should focus mostly on extracting, loading, and increasing observational fidelity with automated tools, but that, in contexts characterized by latent drivers, we should stop letting manual cleaning bottlenecks restrict the scale of our AI/ML models. 
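A toy simulation, not taken from the paper, can make the redundancy argument above concrete. It assumes a single latent driver z and k proxies, each equal to z plus independent Gaussian error (i.e. the case where errors are independent across columns), and it uses a plain average as a stand-in for a flexible model. All names and parameter values here are illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def proxy_average_correlation(k, n_rows=2000, noise_sd=3.0, seed=0):
    """Correlation between a latent driver and the mean of k noisy proxies."""
    rng = random.Random(seed)
    latent, averaged = [], []
    for _ in range(n_rows):
        z = rng.gauss(0.0, 1.0)  # hidden driver, never observed directly
        proxies = [z + rng.gauss(0.0, noise_sd) for _ in range(k)]
        latent.append(z)
        averaged.append(sum(proxies) / k)
    return pearson(latent, averaged)


if __name__ == "__main__":
    for k in (1, 10, 100):
        print(k, round(proxy_average_correlation(k), 3))
```

Under these assumptions the population correlation is 1 / sqrt(1 + noise_sd**2 / k), so it climbs toward 1 as k grows even though every individual column stays equally noisy. If the errors were shared across columns (systematic error), averaging would not remove them, which is the kind of violation the post says the paper treats separately.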
Thoughts?: Have you run into situations where your data science teams actually got better predictive results by bypassing the manually cleaned tables and pulling massive dimensionality straight from the raw ELT layers? I'd love to hear your experiences or thoughts. Happy to discuss all serious comments or questions. Full disclosure: the preprint is a 120-page beast. It’s long because it doesn't just pitch the core theory with a qualitative argument. It gives the full mathematical treatment to everything which takes space. We also dig into edge cases, what happens when assumptions like Local Independence are violated (e.g., systematic errors exist), broader implications (like a link to Benign Overfitting and efficient feature selection strategies that make this high-d strategy practical with finite compute), a deep-dive simulation, failure modes, and a huge agenda for future research (because we do not claim the paper is the final word on the matter). It's a major commitment upfront but may save y
Built a privacy focused self-hosted network over a few years called MansionNET
A few years ago I started building MansionNET (inthemansion.com) on my own, with the idea to degoogle myself, decouple from large corporations, and have services fully self-hosted in my home. This also led to a full switch to Linux as I progressed. As time went by, it grew into a privacy-focused community platform, based on FOSS solutions, that I slowly started opening up to the world. I had a background in the space, so this wasn't starting from zero, but the scope kept growing and the complexity kept stacking up. Claude changed how I approached the hard parts. Not as a shortcut, but as a learning tool, helping me reason through network architecture, VLAN segmentation, Proxmox clustering, reverse proxy design, Matrix on Kubernetes, and a lot of edge cases I'd have taken much longer to untangle alone. Every step was deliberate; I tried to use it as a learning tool, aside from genuinely speeding things up. All of that led to the below, and it's all free to use, no account required (except the nickserv registration for IRC :D ):
- IRC network - irc.inthemansion.com:6697 (TLS), or hit webirc.inthemansion.com in your browser and join #lobby
- SearXNG - search.inthemansion.com, no tracking, no logs
- MansionNET Radio - radio.inthemansion.com, 24/7 streaming with curated playlists
- ASCII Art tool - ascii.inthemansion.com, runs client-side, nothing uploaded
- MansionNET website - inthemansion.com, project homepage
The whole thing runs on the principle: your data, your rules. No ads, no tracking, no data collection, and hosted on my home servers. Would love to see some new faces in #lobby. Drop by and say hi, genuinely interested in your feedback or questions you might have :) Or just leave a comment here, looking forward to replying! Cheers! submitted by /u/avatar_one
Anyone else think the 1T Valuation is dangerous for Anthropic?
TLDR: The market's 1T valuation is pricing for perfection. I think there are 4 ways this perfection doesn't happen.
I love Claude and Claude Code, I use it every day, and their revenue numbers (30B ARR) are amazing, and if I had a chance to invest in Anthropic a month ago, I would. But... now it is reaching a 1 trillion valuation on the secondary market. It took Apple 40 years to get there; it took Anthropic 5. A valuation this high leaves limited room for further growth. It's clearly driven by FOMO. If it has a down round, it would be a disaster. I see a few vulnerabilities that can cause Anthropic to go down.
Models are improving but others are catching up
Opus 4.7 wasn't a big upgrade, and "Mythos" still isn't public. Competitors are closing fast, and switching is one click away. If a new model launched tomorrow at 80% of Claude's quality and 3% the cost, I'd hesitate. But at 95% quality and 50% cost? I'd switch the same day. And so would everyone else paying enterprise rates.
Limited revenue sources
Of that $30B ARR, the open guess is 60%+ comes from Claude Code and developer API. That's a single customer segment, and it's the exact segment OpenAI, Google, and every well-funded startup is gunning for. OpenAI Codex is shipping weekly. Cursor is training in-house. Google AI Studio gives Gemini away for free.
They don't own the compute layer
Anthropic rents from AWS Trainium and GCP TPU and pays retail margin on every token they serve. If they hit a compute bottleneck, their only option is to rent from others and pay a higher premium. Meanwhile OpenAI/Google/Meta/xAI all own silicon. (and even rockets lol)
The government relationship is actively on fire
I clap for Anthropic on this one. Anthropic refused to let DoD use Claude for mass domestic surveillance and fully autonomous lethal weapons. But this is a post about valuation, not ethics. A company can be morally right and financially screwed at the same time. One executive order or one lost lawsuit can make Anthropic bleed.
I'm not a business analyst, and I'd still use Claude tomorrow. I just wouldn't buy it at $1T. submitted by /u/cwei12
Backcasting forecast errors: model collapsing to mean [P]
Hey everyone, I am kind of desperate for help right now on my current project. I'll try to be as clear as possible. I'm working on a time series backcasting problem. The values I want to backcast are forecasts (not ML forecasts, but think of weather forecasts) at different horizons (from 1 to 14). So to be clear, at a date D, I have 14 forecasts (forecasts for D+1, ..., D+14). I have such forecasts from 2020 to 2026. Each row is keyed by a unique (date, horizon) pair that maps to a target_date, so each date is repeated as a block of 14 rows, one per horizon. I hope this is clear enough. So the goal is to backcast those forecasts before 2020 (say 2019-2020 for simplicity). Besides forecast values and horizon columns, I have "actuals", which are the true measured values for a particular variable (say temperature), and "normals", which is a smooth curve representing the climatological norm for a particular date. This "normals" column captures the seasonality, trend, and every other repetitive and predictable pattern. So to be clear I have: dates (of forecast emission) | actuals | normals | horizon | forecasts. And to really emphasise this point: dates, actuals and normals are the same for 14 consecutive rows (one row equals one horizon). The target I want to predict is the following: forecast - actual_at_forecast_date. So I want to predict the true error observed (say I had predicted 20 (forecast) for today and I measure 18 (actual), then my target is +2).
So far, I've done the following:
- Transformed the target to remove annual seasonality, long-term trend and level-scaling
- Engineered classic features such as anomaly (actual-normal), lagged anomalies, rolling stats (std, mean, median, quantiles)
- Engineered target encoding features such as target_encoding_horizon_x_month
- RandomForest with max_depth 10-15, min_leaf 10, max features "sqrt", n_estimators 300
My train/val folds are reversed because I wanted the evaluation to match a backcasting framework as closely as possible. I made sure there is no leakage. FINALLY: My main problem is that, even with a LOT of feature combinations and a LOT of tuning, my predictions are very shallow and shrink to the mean (the std and q10, q90 are off by a lot). So given that I try to predict forecast_error, which is centered on 0, I start to think that I only capture noise, because my predictions really won't fit anything. MAE gets worse with higher-horizon forecasts, which is only natural, but even for horizon 1 my prediction is only as good as predicting all 0s, MAE-wise. Please, if anyone has ideas that I can explore on my own I would be so grateful. I know you don't have all the details here, but if you have experience with backcasting and have some recommendations I would be so grateful.
Hey everyone, I'm working on a time series backcasting problem and I'm running into a fairly stubborn issue. I'd really appreciate any insights from people who have worked on similar setups.
Problem setup
I have daily-issued forecasts with multiple horizons:
- At each date D, I have forecasts for D+1, ..., D+14
- Data spans 2020–2026
- Each row is a unique (forecast_date, horizon) pair
Toy example:
forecast_date | horizon | target_date | forecast | actual | normal
2023-01-01 | 1 | 2023-01-02 | 20 | 18 | 19
2023-01-01 | 2 | 2023-01-03 | 21 | 20 | 19
... | ... | ... | ... | ... | ...
2023-01-01 | 14 | 2023-01-15 | 25 | 23 | 20
Important:
- forecast_date, actual, and normal are identical across the 14 horizons
- Only horizon, target_date, and forecast vary
Objective
I want to backcast forecast errors before 2020.
Target: target = forecast − actual(target_date)
So if forecast = 20 and actual = 18 → target = +2.
Features
- forecast, horizon
- actual, normal
- anomaly = actual − normal
- lagged anomalies
- rolling stats (mean, std, quantiles)
- target encoding (e.g. horizon × month)
Model
Random Forest: max_depth: 10–15, min_samples_leaf: 10, max_features: sqrt, n_estimators: 300
Validation
- Time-based splits adapted for backcasting
- No leakage (checked carefully)
Main issue
Predictions are very shallow and collapse toward 0:
- Very low variance
- Poor estimation of tails (q10 / q90)
- Even for horizon = 1, performance is close to predicting constant 0 (in MAE)
MAE increases with horizon (expected), but overall performance remains weak.
Diagnostics
- std(predictions) / std(target) ≈ 0.4 at best
- This ratio decreases with horizon
So the model is clearly under-dispersed.
Interpretation
At this point I suspect:
- either the signal is very weak
- or the model is too conservative and fails to capture amplitude
Any help, feedback, or ideas to explore would be greatly appreciated. Thanks a lot. submitted by /u/Ambitious-Log-5255
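The target definition and the two diagnostics described in the post (MAE against a constant-zero baseline, and the ratio std(predictions) / std(target)) can be sketched in a few lines. This is a minimal standard-library illustration; the function names are made up for this example and the poster's actual pipeline is not shown.

```python
import statistics


def forecast_error(forecast, actual_at_target_date):
    """Target from the post: forecast minus the actual observed on target_date."""
    return forecast - actual_at_target_date


def dispersion_diagnostics(targets, predictions):
    """Compare predicted errors with the constant-zero baseline and the target spread."""
    n = len(targets)
    mae_model = sum(abs(t - p) for t, p in zip(targets, predictions)) / n
    mae_zero = sum(abs(t) for t in targets) / n  # baseline: always predict 0
    return {
        "mae_model": mae_model,
        "mae_zero": mae_zero,
        # Values well below 1 mean the predictions are under-dispersed.
        "std_ratio": statistics.pstdev(predictions) / statistics.pstdev(targets),
    }
```

Computing these per horizon (grouping rows by the horizon column first) would reproduce the comparison the post describes, where the ratio shrinks as the horizon grows.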
Claude working on reverse engineering the firmware for a gamma spectrometer using various radioactive sources
Something I started a little while ago. I've been using Claude chat and Claude Code to reverse engineer the firmware transfer function of the RadiaCode 110 gamma spectrometer. Basically, it's the lens (the firmware transfer function) I have to look through to see the actual physics occurring in the scintillator crystal. Once I have the firmware behavior, I can then "see" what the scintillator crystal is doing without the layers the RadiaCode adds before surfacing data to the user. So far we've empirically pulled out the "event" firmware transfer function, the formula the company uses to smooth the gamma counts per second, by reading the firmware's counts-per-second output with the device placed in a lead-lined bucket that turned the RadiaCode into a preferential muon detector. The lead castle blocks out the terrestrial radiation but still lets the cosmic muons pass through, allowing me to use cosmic radiation and terrestrial radon events to probe the firmware behavior. Today we are moving on to controlled radiation probing, where I place different radioactive materials at different distances from the device: an americium button from a commercial smoke detector, a thoriated projector lens, and a sample of lutetium-176. This testing will significantly close the gap in the firmware functions we are after. It's just kind of funny to me that six weeks ago I started with Claude chat asking about the RadiaCode gamma spectrometer and here I am running controlled radiation tests on it to probe its firmware responses. The last time I did any programming was back in the early 90s, and that was Pascal and Fortran. Having Claude chat work with Claude Code through analysis/build handoffs is something I could never program on my own. Claude chat is like having my own research assistant and Claude Code is like my software engineer. Together I'm building something I could never do on my own. submitted by /u/Beerbrewing
D-ID uses a subscription + tiered pricing model. Visit their website for current pricing details.
Key features include: a 10 MB image size limit when using D-ID Creative Reality Studio or the D-ID API; supported image formats JPEG, JPG, and PNG; selecting one of the existing pre-made avatars; uploading a facial image; the new V4 Expressive Visual Agents; Privacy, Security, Ethics.
D-ID is commonly used for: V4 Expressive Visual Agents (newly introduced).
D-ID integrates with: Adobe Creative Cloud, Slack, Zapier, Microsoft Teams, Trello, Notion, Figma, Google Drive, Dropbox, WordPress.
Based on user reviews and social mentions, the most common pain points are: token cost, anthropic bill, token usage.

Learn how to Produce a video in D-ID Studio
Feb 16, 2026
Based on 59 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.