We create the world’s fastest supercomputer and largest gaming platform.
Users generally praise NVIDIA for its impressive performance, particularly with AI and robotics applications, as highlighted by the excitement around projects using NVIDIA technology like the Jetson Orin Nano. However, there are concerns regarding the reliance on certain technologies like DLSS, which can sometimes produce misleading visual data. Users view the pricing of NVIDIA products as high but often justified by their cutting-edge capabilities. Overall, NVIDIA enjoys a strong reputation for innovation and technological leadership in the GPU and AI spaces.
Mentions (30d): 30 (4 this week)
Avg Rating: 4.5 (14 reviews)
Platforms: 2
Sentiment: 16% (19 positive)
Industry: computer hardware
Employees: 36,000
npm packages: 20
HuggingFace models: 40
g2
What do you like best about Nvidia AI Enterprise? NVIDIA AI Enterprise is a robust end-to-end software suite designed to help organizations as well as individuals accelerate AI adoption with enterprise-grade security and scalability. A key strength is its versatility: it supports a wide range of use cases, from NLP and computer vision to gen AI. It accelerates both AI development and deployment, and it is easy to use and implement. Seamless integration with VMware and cloud-native environments. What do you dislike about Nvidia AI Enterprise? Requires investment in NVIDIA-certified infrastructure for maximum efficiency. Steep learning curve for teams entirely new to AI workflows. Review collected by and hosted on G2.com.
What do you like best about Nvidia AI Enterprise? Nvidia AI Enterprise enables us to communicate with our environment using AI. It allows us to do the whole work with ease. What do you dislike about Nvidia AI Enterprise? In my use of Nvidia AI Enterprise so far, I have not found anything to dislike. Using such an AI tool allows me to interact with a new world.
What do you like best about Nvidia AI Enterprise? It's like having a full toolbox for AI development, with everything you need from data preparation to model deployment. Plus, the performance boost you get from NVIDIA GPUs is fantastic! It's like having a turbocharger for your AI projects. What do you dislike about Nvidia AI Enterprise? It's a comprehensive platform with a lot of features, but that also means it comes with a higher price tag. Additionally, while it's designed to be user-friendly, it might still have a learning curve for those who are new to AI or deep learning. So, while I appreciate its power and features, the cost and potential learning curve might be factors to consider for some users.
What do you like best about Nvidia AI Enterprise? Nvidia AI Enterprise is an easy-to-use, more accurate, and time-saving AI tool. What do you dislike about Nvidia AI Enterprise? Nvidia AI Enterprise pricing is a little bit high.
What do you like best about Nvidia AI Enterprise? The graphics used for creating new enterprise work and moving the slides. It is really smooth and understands your requirements. What do you dislike about Nvidia AI Enterprise? Customer support and services need improvement, as getting help with their services is tough.
What do you like best about Nvidia AI Enterprise? It is well crafted to harness data based on the inputs we provide to get the desired outcome. What do you dislike about Nvidia AI Enterprise? NVIDIA is all set with all the relevant features; there is not much to improve as such.
What do you like best about Nvidia AI Enterprise? Optimized Performance: Leverages NVIDIA GPUs for faster AI training and inference. Comprehensive Toolset: Includes essential tools, libraries, and pre-trained models. Enterprise Support: Offers technical support and regular updates. Scalability: Flexible deployment across various environments. Framework Integration: Compatible with popular AI frameworks. What do you dislike about Nvidia AI Enterprise? High Cost: Expensive hardware and licensing fees. Complexity: Requires specialized knowledge and can have a steep learning curve.
What do you like best about Nvidia AI Enterprise? It is very helpful for preparing and cleaning data for training, and for performance improvement. What do you dislike about Nvidia AI Enterprise? Pricing is somewhat high, and setting up and managing the platform involves some complexity.
What do you like best about Nvidia AI Enterprise? What stands out most about NVIDIA AI Enterprise is Optimized GPU Performance, Comprehensive AI Tools, Enterprise-Grade Support, and Seamless Integration with Existing IT Infrastructure. What do you dislike about Nvidia AI Enterprise? Some potential downsides of NVIDIA AI Enterprise include: High Cost: The licensing and hardware requirements can be expensive, which might be a barrier for smaller businesses. Complexity: Setting up and managing the platform can be complex, especially for teams without in-depth AI or IT expertise. Hardware Dependence: The platform is heavily optimized for NVIDIA GPUs, which can limit flexibility if you want to use other hardware. Learning Curve: While it offers many powerful tools, the extensive feature set can have a steep learning curve for new users.
What do you like best about Nvidia AI Enterprise? I am using an NVIDIA RTX 3070 GPU and I can use it easily as a mainstream server because it is a certified server from NVIDIA and, most importantly, they are sharing a public cloud server through Google Cloud, so it is very helpful and their support is available through these channels. Its implementation is very handy through a GPU server, and it is really handy to use daily whenever required. There is no limitation on daily use, which is a plus. Their AI model has a lot of AI features that I can use, and I work on multiple ideas through their AI. Integration is very easy; I already have a GPU, so it does not require much effort. What do you dislike about Nvidia AI Enterprise? If you don't have an NVIDIA GPU or DPU, then you need some extra online resources to configure and use it; hardware with powerful resources is a must.
Are we nearly there?
Implying tech companies besides Anthropic, Google, and Nvidia have any money left over by 2027 after they all ran through cash on hand for tokens. submitted by /u/irelatetolevin
If you use NVIDIA Isaac Sim for reinforcement learning, do you use Isaac Lab with it? Just want to get a sense of what the status quo is. [D]
The reason for this query is that I am in the process of shifting to Isaac Sim / Isaac Lab since that is what seems to be in use nowadays. However, Isaac Lab is proving to be somewhat difficult to handle. While it handles the logging, and the creation of multi-actor systems for algorithms like PPO beautifully (with, say, hundreds of actors), its documentation leaves much to be desired. I am also concerned about the ease of setting up new robotic environments, actions, rewards, policies and possibly even custom algorithms. So, what is it that you do at your lab? In my mind there's a trade-off. On the one hand, I can use the Isaac Lab scaffolding, but then I run into its idiosyncrasies very frequently until I document everything I need. On the other hand, I can interface directly with Isaac Sim, but then I need to write my own handlers for interfacing Isaac Sim with the RL agent. submitted by /u/StayingUp4AFeeling
Memory
Your explanation is largely correct. The reason “memory” has become the dominant systems problem for LLMs is that modern transformers are increasingly memory-bandwidth bound, not compute-bound. The key shift is this: training large models was mostly about FLOPs, while serving large models at scale is increasingly about moving KV cache data around fast enough. A single token generation step performs only a relatively modest amount of math compared to the amount of KV data that must be fetched from memory every step.
Why this happens: During inference, every new token attends to all prior tokens. So for token t, the model needs access to all prior K/V tensors:
\text{KV Cache Size} \propto 2 \times L \times S \times H \times d
where L = layers, S = sequence length, H = attention heads, and d = head dimension. The killer is the S term. As context grows: 8K → manageable, 128K → huge, 1M → infrastructure problem. A 70B model with long context can require hundreds of GBs of KV cache across concurrent users.
Why bandwidth matters more than raw compute: Modern GPUs like the NVIDIA H100 or NVIDIA Blackwell can perform enormous amounts of compute. But every generated token requires loading KV cache from memory, running attention, and writing updated KV back. That means inference speed often depends more on HBM bandwidth, memory locality, and cache management than on tensor core throughput. This is why HBM3E, NVLink, unified memory, and memory compression have become strategic bottlenecks.
Why the KV cache can exceed model weights: Model weights are static. KV cache is dynamic and scales with users, context length, output length, and batch size. Example intuition: 70B model weights might occupy ~140 GB in FP16, but serving thousands of users with long contexts can require multiple TBs of KV cache. So operators increasingly optimize cache reuse, eviction, paging, and quantization instead of just model size.
Why vLLM and PagedAttention mattered so much: Before systems like vLLM, memory fragmentation was catastrophic. PagedAttention essentially borrowed ideas from operating systems: divide KV into pages, allocate dynamically, and avoid contiguous memory assumptions. That dramatically improved utilization, batching, and throughput. This was one of the biggest inference infrastructure breakthroughs of the last few years because it improved economics without changing the model itself.
The deeper issue, transformers scale poorly with context: Standard attention fundamentally has a retrieval problem: each token potentially references every prior token. Even though compute optimizations exist, the architecture still requires huge memory movement. That’s why researchers are exploring Grouped Query Attention (GQA), Multi-Query Attention (MQA), sliding window attention, recurrent memory, state-space models, and hybrid retrieval systems. The industry increasingly believes that infinite-context transformers using naive KV scaling are economically unsustainable.
Why inference economics are now the focus: Training frontier models is expensive. But operating them continuously at global scale is potentially even larger economically. For many providers, inference cost dominates, and memory dominates inference cost. That’s why companies across the stack are racing on memory: NVIDIA → HBM + NVLink + Grace; AMD → MI300 unified memory; Cerebras → wafer-scale SRAM; Groq → deterministic low-latency SRAM-heavy architecture; Marvell Technology → custom memory fabrics. The bottleneck has shifted from “Can we train bigger models?” to “Can we serve them cheaply and fast enough?”
submitted by /u/Annual_Judge_7272
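The proportionality in the post can be turned into a rough sizing calculation. The sketch below is only illustrative: the model shape (80 layers, 64 heads, head dimension 128, 2 bytes per element) and the 16-token page size are assumptions chosen for the example and are not taken from the post. The `kv_heads` field stands in for H, so a GQA-style model is modeled simply by passing a smaller value.

```python
import math
from dataclasses import dataclass


@dataclass
class ModelShape:
    """Hypothetical transformer shape; all numbers are illustrative assumptions."""
    layers: int            # L in the formula
    kv_heads: int          # H in the formula (fewer heads under GQA/MQA)
    head_dim: int          # d in the formula
    bytes_per_elem: int = 2  # FP16/BF16 storage assumed


def kv_bytes_per_token(m: ModelShape) -> int:
    # The factor 2 covers one K and one V tensor per layer.
    return 2 * m.layers * m.kv_heads * m.head_dim * m.bytes_per_elem


def kv_cache_bytes(m: ModelShape, seq_len: int, concurrent_seqs: int = 1) -> int:
    # Cache grows linearly with sequence length S and with concurrent sequences.
    return kv_bytes_per_token(m) * seq_len * concurrent_seqs


def paged_tokens_reserved(seq_len: int, page_size: int = 16) -> int:
    # With paging, waste is bounded by one partially filled page per sequence,
    # instead of a contiguous reservation for the maximum context length.
    return math.ceil(seq_len / page_size) * page_size


if __name__ == "__main__":
    mha = ModelShape(layers=80, kv_heads=64, head_dim=128)
    gqa = ModelShape(layers=80, kv_heads=8, head_dim=128)
    for name, shape in (("full multi-head", mha), ("grouped-query", gqa)):
        gib = kv_cache_bytes(shape, seq_len=128 * 1024) / 2**30
        print(f"{name}: {gib:.0f} GiB of KV cache for one 128K-token sequence")
    print("tokens reserved for a 1000-token sequence:", paged_tokens_reserved(1000))
```

Under these assumed shapes, a single 128K-token sequence already needs tens to hundreds of GiB, which is in line with the post's point that the S term, multiplied by concurrent users, is what pushes serving into memory limits.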
pipeline is really slow - consulting [D]
Hi, after a long debugging process and many discussions, I wanted to ask for advice from people who may have encountered similar training bottlenecks. My goal is imitation learning for robotics.
Model / Pipeline. Observation space: 4 RGB robot cameras (image resolution 128x128x3) and a small vector of robot joint velocities (14 dims). Pipeline: a shared ResNet18 encoder processes each image; each image embedding has dimension 128; the final input to the policy is the 4 * 128 image embedding concatenated with the 14-dim state vector. Policy backbone: DiT (Diffusion Transformer), ~8 layers, hidden dim 512, 8 attention heads, ~50M total params. Diffusion setup: predict action chunks of length ~50; diffusion timesteps: 4.
Dataset / Storage. The dataset is stored in Zarr. Data access is indexed/reference-based (not loading huge chunks into RAM). The train/val split is contiguous, with no shuffling.
Current encoder setup. Initially trained end-to-end. During debugging I switched to an ImageNet-pretrained ResNet18. The encoder is currently frozen.
Hardware / Software. GPU: NVIDIA A4500. RAM: 48GB. Storage: SSD. CUDA: 12.8. PyTorch: 2.9. Precision: bf16 mixed precision (also tested fp32).
Dataloader. Batch size: 2; 8 persistent workers; pinned memory enabled.
Preprocessing. Preprocessing is minimal (normalization + float conversion only) and happens inside the multimodal encoder on GPU.
Profiler results (PyTorch profiler). Current workload split: train_dataloader_next: 4.41s / 41.84s = 10.5%; batch_to_device: 0.32s / 41.84s = 0.77%; training_step: 12.78s = 30.5%; backward: 10.83s = 25.9%; optimizer_step (wrapper total): 26.09s = 62.4%.
Problem. The training is much slower than I expected. Current behavior: CPU utilization is ~100%; GPU utilization is ~20–30%; GPU utilization can even become LOWER with synthetic data; VRAM usage is relatively low; throughput is around 10 iterations/sec; an epoch of ~50k samples takes around 30 minutes.
Additional observations. Increasing batch size does NOT reduce epoch wall-clock time. Sometimes larger batches make things slower. Freezing the encoder did not improve throughput much. Replacing dataset samples with synthetic/random tensors improved throughput by only ~50%. The synthetic dataset was initialized directly in memory.
I do not believe this setup should be this slow. At this rate, training takes multiple days. For comparison, I saw papers with somewhat similar architectures mentioning ~10 hour training times on an RTX 4090. With my setup, 10 hours is nowhere near enough. Does anyone see something obviously wrong or have suggestions for where I should investigate next? Please help, I don't know what to do!
submitted by /u/Potential_Hippo1724
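The profiler shares in the post add up to more than 100%, which suggests some entries are nested. The sketch below redoes the arithmetic under one assumption that the post does not confirm: that the `optimizer_step` wrapper total includes `training_step` and `backward`, as is common in closure-based training loops. The timings are copied from the post; the helper names are hypothetical.

```python
import math

TOTAL_S = 41.84  # profiled wall-clock window reported in the post

# Phase timings in seconds, copied from the post.
phases = {
    "train_dataloader_next": 4.41,
    "batch_to_device": 0.32,
    "training_step": 12.78,
    "backward": 10.83,
    "optimizer_step": 26.09,  # reported as a "wrapper total"
}


def share_pct(seconds: float, total: float = TOTAL_S) -> float:
    """Percentage of the profiled window spent in one phase."""
    return round(100 * seconds / total, 1)


def optimizer_update_only(p: dict) -> float:
    """Time left in the wrapper once forward and backward are subtracted.

    Assumption: the wrapper total nests training_step and backward.
    """
    return round(p["optimizer_step"] - p["training_step"] - p["backward"], 2)


def unattributed(p: dict, total: float = TOTAL_S) -> float:
    """Window time not covered by the top-level phases (same nesting assumption)."""
    top_level = p["train_dataloader_next"] + p["batch_to_device"] + p["optimizer_step"]
    return round(total - top_level, 2)


def epoch_seconds(samples: int, batch_size: int, iters_per_sec: float) -> float:
    """Epoch wall-clock time implied by a steady iteration rate."""
    return math.ceil(samples / batch_size) / iters_per_sec


if __name__ == "__main__":
    for name, secs in phases.items():
        print(f"{name:>24}: {share_pct(secs):5.1f}%")
    print("optimizer update only (s):", optimizer_update_only(phases))
    print("unattributed time (s):", unattributed(phases))
    print("epoch estimate (min):", epoch_seconds(50_000, 2, 10) / 60)
```

If the nesting assumption holds, the parameter update itself is a small slice, and roughly a quarter of the window sits outside the listed phases. The epoch estimate at batch size 2 and 10 iterations/sec comes out in the same range as the reported ~30 minutes, which points at the per-iteration rate rather than data volume as the limit.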
i think flat-rate ai is dying.
tldr: longer one, but the point is simple: i think flat-rate ai is dying because the compute economics are starting to leak into the user experience.
i think flat-rate ai is dying. and i don’t mean “ai is over” or whatever. i mean the $20/$200 subscription thing is starting to break. i’m on claude max. i use claude code a laaawt (actually can’t remember the last time my laptop was open without a terminal). and the thing that feels different lately is not just “claude got dumber” or “claude got slower”. maybe it did. maybe it didn’t. in the annoying daily way, you start thinking about usage, context, model choice, cache, tools, and whether this next prompt is going to burn half your session. that’s not really a chatbot subscription anymore. it’s some weird middle thing where i pay monthly but still have to think about burn rate. and that kinda pisses me off. not because i expect infinite compute for $20, but because the product is still sold like a simple subscription while the actual experience is turning into metered infra.
i also checked my own spend and it’s ugly. i’ve burned through around 11k since january because of heavy coding. and yeah, i haven’t had the time to properly audit this, so take it as “what it feels like” not a clean spreadsheet claim. but for roughly the same amount, i feel like i could code an entire year before. now it disappears in a few months if i’m really using the thing hard. that’s the part that made this click for me.
look at anthropic’s own pricing chart: current sonnet is $3/$15 per million tokens. current opus is $5/$25. fast mode for opus 4.6/4.7 is $30/$150. https://platform.claude.com/docs/en/about-claude/pricing
then look at the compute announcement: anthropic says the spacex deal gives them 220,000+ nvidia gpus, and that this lets them raise claude code limits. https://www.anthropic.com/news/higher-limits-spacex
sorry but that’s the tell. if new compute capacity changes how much your $200 subscription can do, then you didn’t buy “ai access”. you bought a slice of scarce inference capacity. and the docs basically say it out loud now. usage depends on model choice, conversation length, tools, complexity, extended thinking, and all your claude surfaces sharing the same budget. claude code carries old context unless you clear or compact. tools eat tokens. opus eats limits faster. long sessions quietly become expensive sessions.
my guess is 2027 looks way less like netflix and way more like aws. the good model costs more. speed costs more. deep thinking probably costs more. agents probably get their own meter. teams get pools. serious users get reserved capacity or whatever they end up calling it. basically all the boring cloud pricing stuff, but now inside a chat product. and honestly, maybe that’s fine. maybe that’s the only business model that survives. but then say that.
so when people say “claude got worse”, i think part of that is real. but part of it is probably this: i think the cheap phase is ending. and nobody really wants to say out loud what the normal price is going to be.
submitted by /u/tikkivolta
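To make the per-million-token prices quoted in the post concrete, here is a small sketch that converts token counts into dollars. The prices are the ones the post cites, read as input/output rates. The session size (2M input tokens, 200k output tokens) is a made-up example, and the calculation deliberately ignores caching, batching, or any other discounts, which the post does not detail.

```python
# (input $, output $) per million tokens, as quoted in the post.
PRICES_PER_MTOK = {
    "sonnet": (3.0, 15.0),
    "opus": (5.0, 25.0),
    "opus_fast": (30.0, 150.0),
}


def session_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Metered cost of one session at list prices (no cache or batch discounts)."""
    in_price, out_price = PRICES_PER_MTOK[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical long coding session: context is re-read often, so input dominates.
    input_tokens, output_tokens = 2_000_000, 200_000
    for model in PRICES_PER_MTOK:
        cost = session_cost_usd(model, input_tokens, output_tokens)
        print(f"{model:>9}: ${cost:,.2f} per session")
```

For this assumed session, the gap between the standard and fast tiers is a factor of six on the same tokens, which is the kind of metering the post argues is leaking into a flat-rate product.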
Anthropic and OpenAI don't want better models, they want to sell more tokens
There is a saying in auto racing that describes the current state of AI providers: “Go as slow as you can to win”, which translates as “Spend as little as you can on R&D to stay slightly better than average”. Let’s put our tin foil hats on and look at it from the business perspective of an AI provider.
Follow the money. AI providers do not make money on training models but on selling inference. It means, from a business perspective, if OpenAI could keep selling GPT-3 forever, they would not spend money on training a better model but keep milking the cow they already have. But they couldn’t, because it was still “cheap” ($80–$100 million for GPT-4) to train a better model, and there was a risk someone else would. That fear of losing to the better model got us where we are. Makes sense. But let’s look at modern times. Training a model is not “cheap” anymore, it’s mega expensive (estimated to be $1.5–$2 billion for GPT-5). There is only a handful of companies who can afford such an affair. And a new model will not necessarily be better (and so sell more inference). An expensive gamble.
What it means for the business: Training a new model is mega expensive, and raising money for that is getting harder. Training a new model is not a revenue stream, selling inference is. Having somewhat capable models that don’t one-shot prompts but need “prolonged thinking” (self-prompting) is actually better for the business of selling tokens than a great model that one-shots. SCREW NEW MODELS, SELL MORE INFERENCE!
Better model is not a goal anymore. Is that what’s happening? Did Anthropic and OpenAI accept their niche and unspokenly (or spokenly, we don’t know) decide to “go as slow as they can” with creating new models, as they both are winning anyway? That would sound reasonable if the goal is to make money (which is why commercial companies are created). Let’s look back 6 months (eternity in the AI world) at Anthropic’s release history:
Nov 2025, Opus 4.5 released: the last model that felt like an improvement compared to its predecessor.
Feb 2026, Opus 4.6: no shockwave, some users reverted back to 4.5. Maybe got slightly better, but only because it was “thinking for longer” (e.g. burning more tokens without extra prompting).
April 2026, Opus 4.7: same underwhelming release, the biggest improvement is that the model now thinks even longer and prompts the user less, e.g. burns even more of your tokens without you asking it.
To sum up: in the last 6 months we have seen no quality improvements, but better token burn without bothering the user.
From the other side, they also squeeze developers into using Claude Code (their AI harness). End of 2025: forbade usage of Claude subscription in 3rd party harnesses (OpenCode, etc.). Start of 2026: blocked subscription usage of OpenClaw, Hermes and other agents. From June 2026: programmatic usage of their Claude Code (for example in scripts) will be forbidden as well. They force you into their harness, where they do as much as they can to keep the tokens flowing.
Cherry on top of the pie: Boris Cherny, the head of Claude Code, stated he sees the AI coding future in “agent loops”, where an agent keeps prompting itself until the task is completed. Have you noticed the difference? The goal is not to “one-shot” the answer anymore (that needs improving models) but “a loop” that keeps going until the problem is solved. And that loop is a money-making machine for Anthropic, great for the business. That approach also makes money for the whole AI supply chain: AI providers making margin on selling tokens, data centers selling GPU hours, NVIDIA selling GPUs.
What does that mean? Lots of tech companies financially benefit from models that are somewhat intelligent but not intelligent enough to one-shot all questions. And those models are already there. So it’s likely we won’t see massive model improvements in the near future. There is no point in it. Top LLMs are on more or less the same level, competition is miles behind. Time to make money on inference, or go IPO.
submitted by /u/kgoncharuk
Rethinking AI Bubble
For those worried about the AI Bubble bursting, it's not happening, at least for now, not until at least OpenAI and Anthropic are listed (later this year). And if you actually discount Nvidia and check the PE of AI companies right now, OpenAI (35x) and Anthropic (13x), these valuations do not really seem unsustainable as of now. Not to mention that, unlike in the DotCom bubble, they have massive data centre infrastructure, so this is not all up in the air. AI is here to stay, it's already altering our lives, taking up workspaces and transforming work. There is a massive upfront cost, but that does not immediately signal a bubble unfolding. If any bubble bursts, it would not be solely the AI Bubble, it would be the government bonds and the dollar bubble. Edit: I wrote the post hastily, sorry for writing Valuation/Revenue as PE. submitted by /u/Upstair_Speaker
AI models
Fresh from Bloomberg today: the Pentagon is actively evaluating multiple frontier AI models, especially from OpenAI and Google’s Gemini, across military theater commands as it moves away from relying heavily on Anthropic’s Claude in classified environments. The backdrop is a major dispute earlier this year between Anthropic and the Pentagon over contract language tied to “lawful operational use.” Anthropic reportedly pushed back on terms that could permit domestic mass surveillance or fully autonomous weapons without meaningful human oversight. After negotiations collapsed, the Pentagon designated Anthropic a “supply-chain risk” and accelerated efforts to onboard rival models instead.
That triggered a rapid shift toward a multi-vendor AI strategy: OpenAI, Google, Microsoft, Amazon Web Services, NVIDIA, xAI, and others have signed agreements for classified or operational military AI deployments. Google’s Gemini models were recently added to the Pentagon’s internal AI portal, while OpenAI expanded access to models inside classified defense networks. The Pentagon is now testing how different models respond to identical prompts, especially in ambiguous or high-stakes military workflows. Officials noted the systems “respond differently,” highlighting a major real-world challenge with LLM deployment.
Why this matters: Defense agencies increasingly view frontier AI as critical infrastructure, similar to cloud or semiconductors. Moving from a single preferred model to multiple vendors improves resilience and bargaining power, but creates major integration and reliability challenges. The episode exposed growing tension between commercial AI safety policies and government/national-security priorities. So far, the biggest beneficiaries appear to be OpenAI and Google, both of which have expanded defense relationships while Anthropic fights the designation in court.
submitted by /u/Annual_Judge_7272
China Banned Nvidia's China-Only Gaming Chip While Jensen Huang Was in Beijing
submitted by /u/andix3
Finally a local ai box that doesn't cost a kidney
Local inference just got real. AMD dropped a mini workstation under four grand. I've been running models through cloud APIs for about two years now and the costs add up fast when you're doing anything beyond basic prompts. Like genuinely painful once you scale past hobby projects. Was sitting in my home office last Tuesday staring at another monthly bill and just thinking there has to be a better way. So seeing a compact box that can handle local model runs at roughly the same price point as a decent gaming rig, that changes the math completely. The NVIDIA alternative sits around forty seven hundred. Not a massive gap on paper, but when you factor in that the AMD unit runs both Windows and Linux natively, the flexibility alone makes it more interesting for most dev workflows I've seen. And it's like Mac Mini sized, which is kind of absurd for what it does. Cloud bills might actually have competition now. submitted by /u/Defiant-Act-7439
Anthropic is paying SpaceX $15 billion per year
According to SpaceX’s IPO filing, Anthropic is paying SpaceX $1.25 billion per month through May 2029 as part of the massive compute deal the two companies signed earlier this year. That works out to roughly $15 billion per year. The deal is huge for Anthropic because the company’s revenue is rapidly growing, but it has also been limited by a lack of available compute. More compute means more capacity to train and run its AI models. It is also a massive win for SpaceX. The company reportedly brings in around $18 billion in annual revenue, so a single customer paying $15 billion a year for compute is a serious boost. Anthropic and SpaceX announced the deal last month, but they did not give financial details at the time. The monthly payments were revealed in SpaceX’s IPO filing released Wednesday. SpaceX said the payments will be lower in May and June as the deal ramps up. Anthropic also announced just before the filing became public that it is expanding beyond SpaceX’s Colossus 1 facility and will also use Colossus 2. Tom Brown, Anthropic’s co-founder and chief compute officer, said the company is “expanding our partnership with SpaceX” and will be scaling up Nvidia GB200 capacity in Colossus 2 throughout June. SpaceX also made it clear this may not be the last deal of its kind. “We expect to enter into additional similar services contracts,” the company said in the filing. SpaceX also said it has enough capacity to support its own AI models while still meeting its obligations under these outside compute agreements. Source: https://www.axios.com/2026/05/20/anthropic-spacex-compute submitted by /u/Luka77GOATic
OpenAI Announced vs. Current Operational Compute
submitted by /u/Business_Garden_7771
Anthropic Announced vs current compute capacity (Sources Below)
source list:
Google Cloud TPU deal: up to 1M TPUs, “well over 1 GW” expected online in 2026. https://www.anthropic.com/news/expanding-our-use-of-google-cloud-tpus-and-services https://www.googlecloudpresscorner.com/2025-10-23-Anthropic-to-Expand-Use-of-Google-Cloud-TPUs-and-Services (Anthropic)
Fluidstack / Anthropic $50B U.S. AI infrastructure: Texas + New York, sites coming online through 2026. https://www.anthropic.com/news/anthropic-invests-50-billion-in-american-ai-infrastructure https://www.fluidstack.io/about-us/blog/fluidstack-selected-by-anthropic-to-deliver-custom-data-centers-in-the-us (Anthropic)
Microsoft + NVIDIA deal: $30B Azure compute commitment + up to 1 GW additional capacity. https://blogs.microsoft.com/blog/2025/11/18/microsoft-nvidia-and-anthropic-announce-strategic-partnerships/ https://blogs.nvidia.com/blog/microsoft-nvidia-anthropic-announce-partnership/ (The Official Microsoft Blog)
Google + Broadcom next-gen TPU deal: multiple GW starting 2027; Broadcom SEC filing says ~3.5 GW. https://www.anthropic.com/news/google-broadcom-partnership-compute https://investors.broadcom.com/static-files/c906d370-921b-4bc2-bb7b-57877dfcf1ae (Anthropic)
Amazon / AWS deal: up to 5 GW, nearly 1 GW by end-2026. https://www.anthropic.com/news/anthropic-amazon-compute (Anthropic)
AWS Project Rainier: operational now, nearly half a million Trainium2 chips; Claude expected on 1M+ Trainium2 chips. https://www.aboutamazon.com/news/aws/aws-project-rainier-ai-trainium-chips-compute-cluster (Amazon News)
SpaceX / Colossus 1: all Colossus 1 compute, >300 MW, 220k+ NVIDIA GPUs within the month. https://www.anthropic.com/news/higher-limits-spacex https://x.ai/news/anthropic-compute-partnership (Anthropic)
Independent reporting for SpaceX deal: https://www.reuters.com/business/retail-consumer/anthropic-unveils-dreaming-feature-help-its-ai-agents-self-improve-2026-05-06/ (Reuters)
submitted by /u/Business_Garden_7771
Claude Code has 240+ models via NVIDIA NIM gateway
TIL Claude Code has 240+ models via NVIDIA NIM gateway: Nemotron-3 120B for agentic coding is surprisingly good.
So I was messing around with /model in Claude Code today and noticed something most people probably don't know about: after the standard Claude models (Opus, Sonnet, Haiku), there's a whole NVIDIA NIM gateway section with +239 additional models you can switch to mid-session. Some of the models I spotted: nvidia/nemotron-3-super-120b-a12b (with and without thinking mode), 01-ai/yi-large, abacusai/dracarys-llama-3.1-70b-instruct, ...and hundreds more.
I've been running the Nemotron thinking variant for multi-file refactoring and it's genuinely solid. It reasons through changes before touching your code, which is exactly what you want for agentic tasks. Latency is higher than Claude obviously, but if you're burning through Opus credits on long sessions this is worth experimenting with.
How to try it: Open any Claude Code session. Run /model. Scroll past the four standard Claude options; NIM models appear below. Hit d to set one as your session default, or pass --model at launch.
Anyone else been routing Claude Code through NIM? Curious what models people have had luck with, especially for Python or Rust codegen.
submitted by /u/shadowBladeO4
View originalNVIDIA uses a tiered pricing model. Visit their website for current pricing details.
NVIDIA has an average rating of 4.5 out of 5 stars based on 14 reviews from G2, Capterra, and TrustRadius.
Key features include: NVIDIA GTC, Data Center, Artificial Intelligence, Agentic AI, NVIDIA Nemotron 3 Omni.
NVIDIA is commonly used for: Accelerate power-flexible AI deployment with Emerald AI, Build autonomous agents that perceive, reason, and act on enterprise knowledge, Enhance security in autonomous agents using NVIDIA OpenShell, Deploy self-evolving agents with control and governance, Utilize NVIDIA Dynamo 1.0 for large-scale inference, Develop robotics and vision AI agents for autonomous vehicles.
NVIDIA integrates with: NVIDIA DGX Station™, NVIDIA DGX Spark™, NVIDIA CUDA-X™, NVIDIA Omniverse™, NVIDIA ALCHEMI, NVIDIA CloudXR 6.0, NVIDIA Dynamo integration with vLLM, NVIDIA integration with Synopsys engineering solutions, Collaboration with T-Mobile and Nokia for 5G edge AI, Partnership with Dassault Systèmes for industrial transformation.
Mira Murati (Former CTO at OpenAI): 3 mentions
Based on user reviews and social mentions, the most common pain points are: cost per token, API costs.
Based on 118 social mentions analyzed, 16% of sentiment is positive, 81% neutral, and 3% negative.