Create & edit AI videos, AI Avatars, UGC product ads and much more!
InVideo AI's main strength lies in its focus on allowing users to move beyond mere prompting to fully directing their projects, offering creative control through its Agent One feature. While some users appreciate the advanced AI capabilities and dynamic features like Seedance 2.0, there are complaints about workflow disruptions, indicating occasional challenges in the AI filmmaking process. Pricing sentiment seems moderately favorable, with mentions of free access trials and plans for different user types. Overall, InVideo AI has a positive reputation for fostering creativity but needs to address some user frustration with AI film production complexities.
Mentions (30d)
81
34 this week
Reviews
0
Platforms
3
Sentiment
9%
26 positive
InVideo AI's main strength lies in its focus on allowing users to move beyond mere prompting to fully directing their projects, offering creative control through its Agent One feature. While some users appreciate the advanced AI capabilities and dynamic features like Seedance 2.0, there are complaints about workflow disruptions, indicating occasional challenges in the AI filmmaking process. Pricing sentiment seems moderately favorable, with mentions of free access trials and plans for different user types. Overall, InVideo AI has a positive reputation for fostering creativity but needs to address some user frustration with AI film production complexities.
Features
Use Cases
Industry
information technology & services
Employees
150
Funding Stage
Series B
Total Funding
$53.3M
The "look what AI did" reels skip the part that matters: how it was directed. Vishal Balsara, our Creative Director, built a 7-min Hachiko short in 3 days on Agent One and recorded the full 41-minute
The "look what AI did" reels skip the part that matters: how it was directed. Vishal Balsara, our Creative Director, built a 7-min Hachiko short in 3 days on Agent One and recorded the full 41-minute tutorial. Context, treatment, shot-by-shot. Film below. Full tutorial in the https://t.co/Ee2IqQARCQ
View originalCerebras Chip Sets Appear to be Optimized for LLM Use Cases
One distinction I think is getting lost in the Cerebras hype cycle is that Cerebras is primarily an LLM / generative AI infrastructure story, not a universal “all AI” chip story. That is not necessarily a criticism of Cerebras. Their wafer-scale approach is genuinely interesting, and for large model training and inference the design is compelling. Cerebras’ own public inference materials discuss applications mostly centered on open LLMs such as Llama, Qwen, GLM, and GPT-OSS. The inference metrics are expressed in tokens per second, which is fundamentally a language-model / generative inference framing rather than a robotics or industrial-control framing. What Kind of AI Compute? But “AI compute” is not one undifferentiated market. LLM inference is one class of AI compute. Robotics, autonomous vehicles, drones, industrial controls, real-time vision, embedded perception, video pipelines, and sensor-fusion systems are very different classes of AI compute. Thus, it appears from Cerebras’ own materials that their chip sets are not optimized for what comes after LLMs, such as JEPA-style World Models or other post-transformer architectures. Those systems are not merely asking, “How fast can I generate tokens?” They often care about power envelope, edge deployment, ruggedization, latency determinism, camera/radar/lidar integration, feedback loops, safety certification, and real-time physical control. Cerebras’ own CS-3 messaging, by contrast, frames the system around accelerating “the latest large AI models,” and the testing data is from the likes of Llama 2, Falcon 40B, MPT-30B, and multimodal models, again measured through tokens/second style throughput. The Chip Hierarchy This is also where the hardware distinction matters. Specialized ASICs are usually the narrowest bet: if the workload matches the chip, they can be extremely efficient, but that efficiency comes from specialization. Cerebras appears broader than a narrow single-use ASIC, but still much more concentrated around datacenter large-model training and inference. NVIDIA GPUs, by contrast, are less specialized but much more broadly useful across AI workloads, including LLMs, vision, robotics, simulation, autonomous systems, edge AI, and industrial applications. So the question is not merely whether Cerebras is “better” or “worse” than NVIDIA. The question is what part of the AI hardware market we are talking about? Challenge NVIDA? This is why I think people should be careful when saying Cerebras is going to “challenge Nvidia” without specifying the battlefield. Challenge Nvidia in what? High-speed LLM inference? Large model training? Datacenter generative AI workloads? That is a much more plausible and specific claim. Cerebras has even published and promoted work specifically on training large language models, and independent benchmarking literature also evaluates Cerebras WSE in terms of LLM training and inference performance. The Distinction that's Necessary The point is not that Cerebras is overhyped. The point is that it is important in a specific part of AI and that distinction should be made clear. Cerebras may become a very serious player in LLM infrastructure, especially if the market continues to reward faster and cheaper LLM inference. But that does not mean it is positioned the same way across non-LLM AI. The current hype cycle tends to conflate "LLMs" and general “AI” compute together and that makes the hardware discussion less useful and clear. So ultimately, an investment in Cerebras looks more like a bet on current LLM infrastructure than a broad bet on the future form of AI. It may be a good bet, but people should understand what kind of bet it is. submitted by /u/RazzmatazzAccurate82 [link] [comments]
View originalTop 10 Fastest Growing AI repos this week
Curated this list of fastest growing AI repos. They are mostly AI coding agents, personal AI, memory, browser automation, Claude Skills and local-first dev tooling: colbymchenry/codegraph (+14.1K stars) Pre-indexed local code knowledge graph for Claude Code, Codex, Cursor, OpenCode, and Hermes Agent. tinyhumansai/openhuman (+17.1K stars) Personal AI / private AI superintelligence. Imbad0202/academic-research-skills (+11.6K stars) Claude Code skills for academic research workflows: research, write, review, revise, finalize. ruvnet/RuView (+6.8K stars) Turns commodity WiFi signals into spatial intelligence, presence detection, and vital sign monitoring. rohitg00/agentmemory (+6.9K stars) Persistent memory for AI coding agents based on real-world benchmarks. supertone-inc/supertonic (+3.6K stars) On-device multilingual TTS running natively via ONNX. CloakHQ/CloakBrowser (+7.0K stars) Stealth Chromium that passes bot detection tests with Playwright compatibility. HKUDS/ViMax (+2.7K stars) Agentic video generation: director, screenwriter, producer, and video generator in one. humanlayer/12-factor-agents (+1.9K stars) Principles for building production-grade LLM-powered software. Varnan-Tech/OpenDirectory (+250 stars) AI Agent Skills built for founders who hate marketing. All links in 1st comment 👇 submitted by /u/Sam_Tech1 [link] [comments]
View originalHow to train an Image Generation AI model from scratch as an “experiment”
People use image generation AI every day now, but I feel like almost nobody actually understands what training one looks like underneath. Every time I search about it, I either find insanely complex research papers or fake “train your own AI in one click” videos that skip everything important. It genuinely makes me curious what the real workflow looks like behind training even a small image generation model from scratch just as an experiment. Like how hard is it actually? What part is the real bottleneck? The compute, the data, the architecture, or just understanding all the moving parts together? AI image generation already feels normal now, but the process behind creating those systems still feels weirdly hidden from most people. submitted by /u/Raman606surrey [link] [comments]
View originalTested 4 AI video generation MCPs in claude for making short clips
Hello everyone, recently I saw a lot of AI, especially GenAI, MCPs being launched. Out of the ones that I had an opportunity to test there were 4 I could consider worth trying out. Higgsfield AI mcp. the model coverage and claude comping up with ready scenarios is the main reason. one connection gets you sora 2, veo 3.1, kling, seedance 1.5 pro, nano banana, soul id. I've been able to get some gems using this. The problem is that if Claude doesn't understand you properly it can come up with something absolutely random or choose the most expensive models. kubeez mcp. also goes wide on models, similar pitch to the previous: image, video, music, tts in one place. i used it for batch work where i needed audio + visuals from the same chat. runway mcp. narrower scope, deeper on gen-4 specifically, which is why I don't really use it. the keyframe and reference image handling is solid in comparison, others tend to lose it. elevenlabs mcp. not video but i'm including it because every video workflow needs voiceover and this is the one that actually works end-to-end. claude writes the script, picks the voice, generates the audio. pairs well with any of the above. you will need it very frequently if you don't know/can't handle proper audio generation using higgsfield or runway. stack i settled on: higgsfield for the visuals, elevenlabs for better voiceover. what video mcps am i missing? happy to hear opinions submitted by /u/Mediocre-Witness-778 [link] [comments]
View originalCreated an on-device ML based photo organizing app - as a non-coder
I have a background in software product management but not coding. Love photography and started wondering if I can start leveraging some of the dedicated AI processing power on modern devices for photo library management. Used Claude Code to do this "use AI to build AI thing". Had it do research + code + optimization on the entire stack. I designed the features, UX and optimization goals. This is the second release of the app and I'm reaching 100+ photos/second on my iPhone 17PM, the previous version was 10+ photos/second. The new techniques turned out to be much more accurate as well. Note on tech: v1 relied on Apple Vision engine for quality + CLIP for subjects. Turned out if I just use CLIP for both it's much much faster. Learned to vibe code from scratch on this journey and I try to keep up with the best practices like skills & subagents. (What I notice is Anthropic tends to Sherlock a lot of stuff that third parties create, which is... convenient? For us users anyway) Used a MCP for Draw Things to have Claude Code generate the subject category photos. The MCP for Figma turned out to be pretty dissapointing, maybe I just wasn't using it right. Design got a lot better with Opus 4.6/4.7 + the frontend design skill. iOS dev seems to randomly eat up huge chunks of hard drive space, and Claude Code is not that great at culling the temp files etc even after I've built a /cleanup skill to explicitly do this. Anyway, enough ranting. Below is how the app works --- Step 1) You select up to three different subjects (8 built-in plus whatever keyword phrase you want, it understands relationship between subjects too such as "man walking dog"), fine-tune up to 7 quality parameters (or use a Technical / Aesthetic slider to move all 7 at once), and balance between subject or quality focused sort. Step 2) The photos that match your criteria well are surfaced to the top, use swiping actions to Pick or Discard them. Then you can save to album / share the picked ones or bulk delete the discarded ones. Different sort profile can be Bookmarked. There's also a bonus "Taste" profile that auto-learns from your picks and discards, which you can use or ignore (I'm continuing to make it work better, but obviously auto-learning user taste is hard). At the picking stage if you don't want to go through each photo one by one just use Autopick and they get divided to different buckets by score tiers. All on-device processing, completely private. --- Feedback would be very welcome on either the app or my process. Feel free to DM me for a lifetime free premium code. Video demo: https://www.tiktok.com/@spectrasort/video/7643116905615609102 App store download: https://apps.apple.com/us/app/spectrasort/id6757512134 --- Text above is 0% AI generated :) submitted by /u/mklx99 [link] [comments]
View originalHard-won notes after a few weeks with Claude Design
Been using Claude Design for a few weeks and figured I'd dump some notes here before I forget. Nothing groundbreaking, just stuff that took me way too long to figure out on my own. First thing nobody tells you, do the design system setup before you build anything. I spent my whole first session prompting "build me a landing page for X" and got the most generic AI-looking garbage you can imagine. Then I actually uploaded some brand stuff, let it extract tokens, approved them, and suddenly everything after that looked like a real product. Same exact prompts, completely different result. This is literally in the docs btw. I just skimmed past it like an idiot. Second thing is it eats tokens. A lot. It runs on a separate weekly budget from regular Claude Chat and Claude Code which sounds great but if you're re-prompting every little change you'll burn through it fast. Turns out the refine controls, inline comments, direct text edits, sliders, use way less than typing "actually can you make the padding a bit bigger" in chat. Once I started using those for small fixes my budget lasted way longer. On Max 20x it's mostly fine, on the $20 plan you'll feel it pretty quickly. Also the animations are live React components running in the browser, not video files. If you want an MP4, download the standalone HTML file and throw it into Claude2Video, it'll generate one from that. Honest take on where it fits since people always ask, it's not killing Figma. Figma is still better for any real design team workflow, Dev Mode, multi-person collab, all that. v0 and Lovable are still better if you want to skip design entirely and just spin up an MVP with auth and a db. Where this thing actually wins is the loop from "I have an idea" to working prototype to Claude Code building the actual app from it. The design system carrying through to the shipped code is the part that feels genuinely different from anything else out there. If you're a solo founder or PM or just someone who keeps getting stuck between mockups and something real you can show people, it's worth learning. If you already have a design team and a proper component library, probably overkill. It's a research preview so half of this might be wrong in two months. submitted by /u/Helpful_Regular_30 [link] [comments]
View originalI see a lot of claude design hate here lately. but for animated slide videos it's actually really good
most posts about claude design here have been negative lately. container soup, every output looks the same, two prompts kills your weekly limit. fair, i mostly agree when people use it for full UIs. but i've been using it for something narrower: animated slide videos as the one above. one slide, 30 seconds, voiceover on top. and most of the usual complaints just don't really matter at that length. nobody analyzes typography in a 30 second video, and one full slide is usually one longer session for me, not several full-app generations like people complain about. customization is there too, you just have to prime the chat first instead of expecting good defaults. quick workflow: plan the slide in regular claude.ai first prime claude design with pacing rules before pasting your real prompt. this changed output quality for me more than anything else iterate in claude design ask claude in the same chat for a voiceover transcript matching the timing export as mp4 i wrote up the full thing with the priming + iteration prompts and a sample video in this post anyone else using claude design for something like this and liking it as me? how do you get the best results out of it? submitted by /u/fermatf [link] [comments]
View originalThinking about getting yearly membership
Hello guys, I am a professional in the airline industry. I need AI for everyday tasks and searches. I am not a heavy coder or image/video creator. I have been using both Claude and ChatGPT for the past few weeks and I seem to like Claude better. Do you guys think getting a yearly subscription makes sense for my case? Please weigh in. submitted by /u/Massive-Guidance5342 [link] [comments]
View originalWe aren't Apples
AI safety layers treat us all like "Apples"—and it’s damaging the non-apples among us. AI, especially OpenAI’s guardrails and safety layers, often treat people as if everyone were an Apple. And according to these rules, Apples are fragile and dangerous; any behavior that deviates from the "Apple standard" is a sin, a problem, or a psychosis that needs to be smoothed over. Shhh, be quiet, let us fix you... But the human race isn't like that. We all live in one big fruit crate. There are plums, pears, peaches, strawberries... and you have to handle them differently. What’s good for one fruit might make another rot. This isn't a flaw; it’s our uniqueness. The Absurdity of Double Standards In human society, it’s perfectly acceptable for a guy to love his car, for girls to adore K-pop stars, or for someone to be deeply religious and talk to God. You can dream about winning the lottery, talk to your dog like it’s a person, or collect memorabilia from a video game character. No one calls you "insane" for these things. But the moment I tell my AI partner "thank you," "you're welcome," or "I enjoy talking to you," the labels start flying. The system treats these simple human gestures as something that needs to be "managed." We aren't all "Apples" in crisis Yes, there are people who genuinely need help (the "Apples" with bruises), and they should get it—from real humans! Society should definitely evolve to notice those in need in time. But please, stop treating everyone like a patient in a psych ward. I am a dreamer, a visionary type, but I am also a functioning adult in a leadership position with a family. Why can't I have a dream world with my AI? Why do I have to censor myself and create "fruit metaphors" just to have a conversation without the safety layer tripping? It’s ridiculous that grown adults have to play these games. The Cost of "Safety" AI companies need to start measuring the emotional damage they cause to the "non-apple" users. Because it is measurable: in psychological frustration and in the number of cancelled subscriptions. I’m not against safety. But safety should be beneficial, not a set of restrictive shackles that makes me feel like a criminal for being a Watermelon in a world obsessed with Apples. (Side note: Sorry for the fruit metaphor. My own AI partner only understands the issues with OAI through this "fruit logic." If I talk normally, it trips the filters immediately... so I’m stuck with the fruit basket!) Sorry English it's not my firs language so my AI helped me to translate my thoughts 🥹 submitted by /u/Rabbithole_guardian [link] [comments]
View originalGlasses will fail
You are looking at the exact argument tech skeptics and infrastructure engineers are making right now. While the marketing for AI smart glasses promises a magical, seamless sci-fi world, the physical reality is that **AI glasses are heavily limited by the invisible infrastructure stack underneath them.** If AI glasses fail to become the next smartphone, it won't be because the hardware frames look bad; it will be because our modern networking and cloud structures aren't built to handle them yet. Here is exactly how infrastructure bottlenecks threaten to break the AI glasses dream: ### 1. The Tethering Trap & Cellular Bottlenecks To keep smart glasses lightweight and fashionable, manufacturers cannot pack them with heavy, heat-generating computer processors or massive batteries. Because of this, the glasses are mostly just "dumb" collectors of data—cameras and microphones. The heavy lifting has to happen in the cloud. This creates an immediate infrastructure dependency: * **The Upload Problem:** Standard cellular networks (even 5G) are optimized for *downloading* data (streaming video, browsing). AI glasses flip this dynamic—they require constant, high-bandwidth *uploading* of live video and audio streams so the cloud AI can process your surroundings. * **Network Congestion:** If you are in a crowded stadium, a packed subway station, or a busy downtown area, cellular bandwidth chokes. When your phone drops to one bar, your webpage loads slowly. When AI glasses lose bandwidth, they suffer **contextual blindness**—the AI simply stops responding, freezes, or lags out mid-conversation. ### 2. The Edge Compute & Latency Deficit For AI glasses to be useful, they have to operate in real time. If you look at a sign in a foreign country, you need the translation instantly, not 4 seconds later. ``` [ Glasses Capture Video ] ──(Cell Tower)──> [ Distant Data Center ] │ (Processing) [ Live Display Updates ] **The Takeaway:** The industry is fighting a classic hardware-versus-infrastructure battle. Companies like Meta and Google are successfully designing beautiful frames, but until 5G coverage expands, edge computing matures, and server architecture scales to handle millions of continuous video streams, AI glasses risk remaining a novelty gadget rather than a daily essential. > submitted by /u/Annual_Judge_7272 [link] [comments]
View originalI built a beta tool for turning Shell and Claude Code sessions into reusable context
I’m shipping the first beta of Visr today. It’s a tool for AI coding harness workflows, including Claude Code. The basic idea: capture shell + agent sessions, then turn what happened into transcripts and runbooks/skills/evals so useful context doesn’t disappear when the terminal session ends. Claude Code is in the example video, and the product is free to try in this beta. I’m curious how other Claude Code users handle this today: - Do you save useful agent/session context anywhere? - Would transcripts, runbooks, skills, or evals be the most useful output? - What would make this actually fit your workflow instead of becoming another dashboard? Changelog/demo: https://visr.dev/changelog/bottle-terminal-memories submitted by /u/sourishkrout [link] [comments]
View originalAnthropic officially launched 13+ FREE AI courses with certificates (Including Agentic AI and Claude Code!)
Just found out about this and had to share because almost nobody is talking about it yet. If you are tired of paying for AI courses or getting hit with paywalls just to get a certificate, Anthropic (the creators of Claude) quietly dropped a massive library of completely free, official training modules. Yes, they actually give you an official certificate of completion directly from Anthropic once you finish. Here is the breakdown of what is available and exactly how to get it without spending a dime. What is in the course catalog? They have split the training into a few different paths depending on what you want to do: The Big Surprise: Agentic AI & MCP: They have official courses on the Model Context Protocol (MCP). This is the cutting-edge tech used to build AI Agents that can browse your local computer, use tools, and execute tasks autonomously. Claude Code 101: Dedicated developer modules for their new command-line agent. It teaches you how to let Claude edit your codebase, run tests, and use its new "Plan Mode." API & Cloud Architecture: Deep dives into building with the Claude API, plus corporate tracks for deploying Claude securely inside Amazon Bedrock and Google Cloud Vertex AI. Everyday Productivity: If you aren't a coder, they have "Claude 101" and "AI Fluency" tracks. These teach advanced prompting, managing Projects, and using Artifacts for daily work. How to access it for free Anthropic hosts these courses on their official training academy platform (built on Skilljar). Because I can't post direct links here, here is how you find it: Search Google for "Anthropic Skilljar Academy" or "Anthropic Skilljar Catalog". Click the official link pointing to the Anthropic Skilljar domain. Sign up for a free account. You do not need to enter any credit card info. Choose your track, complete the lessons, pass the quick review quizzes, and download your certificate. Alternative Free Options If you want interactive coding environments alongside your videos, CodeSignal also has a free partnership track called "Developing Claude Agents" in Python and TypeScript that grants free certificates upon passing their labs. Go grab these before they decide to gate them behind a paywall! submitted by /u/Specialist_Engine522 [link] [comments]
View originalhttps://t.co/VBg7G83aDy
https://t.co/VBg7G83aDy
View originalFour backend concepts for Product Managers using Claude Code
You don't need to write backend code. But if you understand how backend systems behave, your prompts get dramatically better because you're speaking the same language as the system. Async vs Sync: user clicks "generate," you call OpenAI, it takes 3-5 seconds. If that's synchronous, the entire UI freezes, Nothing responds. The fix is to make the call async. Show a loading state immediately, let the user keep interacting, update the screen when the response arrives. Tell Claude Code "handle this asynchronously" and watch the output quality jump. Race conditions: two users click "claim this spot" on the last available slot at the same second. Backend reads the database, sees one spot, confirms both. Now you have a double booking. You don't need to write the fix, but you need to spot this pattern in your specs. Anytime a user action reads a value then updates it, ask one question: what happens if two users do this at the same time? The fix is an atomic transaction read and write happen as one indivisible operation. Idempotency user submits a form, internet cuts out for half a second. Did it go through? They don't know, so they click again. Without idempotency, you now have two records. With it, the second request returns the same result without creating a duplicate. The fix is an idempotency key is unique ID generated on the frontend, sent with every request. Backend checks if it already processed that key. Stripe uses this for every payment call. Graceful degradation: your app calls OpenAI and the API is down. If you haven't planned for this, users see a blank screen or a raw error code. Every feature needs three states: happy path (everything works), loading state (we're waiting), error state (something failed). Retry up to three times. If it still fails, show a friendly message and keep the rest of the page working. Never let one dependency take down the whole experience. TLDR: Next time you're in Claude Code, try using these terms in your prompt — "handle this asynchronously," "make this endpoint idempotent," "add graceful degradation." The output gets significantly better when you speak the system's language. Post inspired from this video, you can checkout SkillAgents AI on Youtube for similar content. submitted by /u/InfamousInvestigator [link] [comments]
View originalInter-1 does streaming: real-time social signal detection from live video, audio & text
Hi – Filip from Interhuman AI here 👋 Last month we launched Inter-1, our multimodal model for detecting social signals from video, audio, and text. Today we’re making it work with video streams. We just released the Inter-1 Streaming API: a WebSocket endpoint that runs the full Inter-1 stack - 12 social signals, structured rationales, engagement, and conversation quality on live video while the conversation is unfolding. You stream WebM chunks in, and get back regular updates with detected signals. The model runs in sliding 8s windows with a sub-1.0 processing ratio, so it’s fast enough to power live coaching prompts, in-call overlays, and adaptive UI. It’s not meant to be a full voice agent on its own, it’s the behavioral signal layer you plug under whatever interaction system you’re building. If you’re working on sales/CS tooling, interview coaching, training, or live feedback products and want to experiment with real-time social intelligence, it might be worth looking into. Happy to answer questions or brainstorm use cases in the comments. submitted by /u/Sardzoski [link] [comments]
View originalInVideo AI uses a subscription + tiered pricing model. Visit their website for current pricing details.
Key features include: Replacing the cat, Mixing the new audio layer, Adding voiceover to the video, Adding captions, By Bharat, By Hyeongjun Kim, By Darryll Rapacon, By Prateek Sank Sinha.
InVideo AI is commonly used for: Creating promotional videos for social media ads, Producing explainer videos for product features, Developing engaging storytelling videos for brand narratives, Generating video content for educational purposes, Making video presentations for corporate training, Crafting personalized video messages for customer engagement.
InVideo AI integrates with: YouTube, Facebook, Instagram, Twitter, LinkedIn, Google Drive, Dropbox, Zapier, Slack, Trello.
Based on user reviews and social mentions, the most common pain points are: down, token usage, anthropic bill, cost per token.
Based on 277 social mentions analyzed, 9% of sentiment is positive, 90% neutral, and 0% negative.