WithO2

    NVIDIA Nemotron 3 Ultra: America’s Biggest Open AI Model Launches June 4 — 550B Params, 300 Tokens/sec

    By Amitabh Sarkar | June 2, 2026 | Updated: June 2, 2026 | 4 Mins Read
    Jensen Huang unveiled Nemotron 3 Ultra at Computex 2026 — the largest open-weights AI model from a US company.

    NVIDIA CEO Jensen Huang opened the Computex 2026 keynote in Taipei on June 1 with a sweeping announcement: Nemotron 3 Ultra, a 550-billion-parameter open-weights AI model that the company is calling America’s most intelligent publicly available model. The reveal signals NVIDIA’s ambition to compete not just in AI chips — but in the models that run on them.

    Table of Contents

    • What Is Nemotron 3 Ultra?
    • The Catch: China Is Still Ahead on Open Models
    • Built for Agentic Workflows
    • Availability and the Full Computex Stack
    • Why This Matters

    What Is Nemotron 3 Ultra?

    Nemotron 3 Ultra is a mixture-of-experts (MoE) model with 550 billion total parameters, of which only 55 billion are active for any given token. That architectural choice is the key to its efficiency: the model achieves 300+ tokens per second while costing 30% less to run than comparable open-weights alternatives. For enterprises self-hosting large models on NVIDIA hardware, that cost gap matters enormously.
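
To make the total-versus-active distinction concrete, here is a back-of-envelope sketch in Python. The parameter counts and throughput come from the figures above. The "2 FLOPs per active parameter per token" rule of thumb and the 2-bytes-per-parameter (BF16) storage figure are assumptions used for illustration, not numbers NVIDIA has published for this model, and the function names are hypothetical.

```python
# Figures quoted in the article.
TOTAL_PARAMS = 550e9       # total parameters
ACTIVE_PARAMS = 55e9       # parameters active per token
TOKENS_PER_SECOND = 300.0  # quoted throughput floor


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used for each token."""
    return active / total


def approx_flops_per_token(active: float) -> float:
    """Rough forward-pass cost per generated token.

    Assumption: about 2 FLOPs per active parameter, a common rule of thumb
    that ignores attention and routing overhead.
    """
    return 2.0 * active


def weight_memory_gb(total: float, bytes_per_param: float = 2.0) -> float:
    """Memory needed just to hold the weights, in gigabytes (10^9 bytes).

    Assumption: 2 bytes per parameter (BF16). Every expert must be resident
    even though only some run per token, so this uses the total count.
    """
    return total * bytes_per_param / 1e9


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time to produce num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    print(f"active fraction: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.0%}")
    print(f"approx FLOPs per token: {approx_flops_per_token(ACTIVE_PARAMS):.2e}")
    print(f"weights in BF16: {weight_memory_gb(TOTAL_PARAMS):.0f} GB")
    print(f"3,000 tokens take about {seconds_to_generate(3000, TOKENS_PER_SECOND):.0f} s")
```

Under these assumptions the saving is in compute per token rather than in memory: the model computes like a 55B model while still needing storage for all 550B parameters.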

    The model scores 48 on the Artificial Analysis Intelligence Index, placing it solidly at the frontier for US open-weights models. For context, closed models like GPT-5.5 and Claude Opus 4.8 still lead on overall intelligence scores, but those come with API pricing and data-sharing agreements. Nemotron 3 Ultra can run entirely on-premises.

    The Catch: China Is Still Ahead on Open Models

    Despite the fanfare, Nemotron 3 Ultra doesn’t hold the global open-model crown. China’s Kimi K2.6 from Moonshot AI scores 54 on the same benchmark — ranking fourth among all AI models globally, closed or open. NVIDIA’s model is America’s best open-weight option, but the gap with leading Chinese open models is a persistent concern in the open-source AI community.

    NVIDIA was careful in its language at Computex, calling it “the most intelligent US open weights model,” a geographically scoped claim that reflects the current state of open AI model competition between the US and China.

    Built for Agentic Workflows

    Nemotron 3 Ultra was designed from the ground up for agentic AI — systems that can plan, execute, and iterate on multi-step tasks without human input at each stage. This is where the 550B architecture pays off: complex reasoning chains and long-horizon planning benefit from more parameters, even if only a fraction are active per token.
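
The plan, execute, iterate pattern described above can be sketched as a small loop. This is a generic illustration, not NVIDIA's agent framework or any Nemotron API: `run_agent`, `scripted_planner`, and `echo_executor` are hypothetical names, and the planner and executor are plain Python stubs standing in for model and tool calls.

```python
from typing import Callable, List

# A planner sees the goal plus results so far and returns the remaining steps.
Planner = Callable[[str, List[str]], List[str]]
# An executor carries out one step and returns its result as text.
Executor = Callable[[str], str]


def run_agent(goal: str, plan: Planner, execute: Executor,
              max_steps: int = 10) -> List[str]:
    """Re-plan before every step, execute the next step, stop when none remain."""
    results: List[str] = []
    for _ in range(max_steps):          # hard cap so the loop always terminates
        remaining = plan(goal, results)  # iterate: the plan can change each round
        if not remaining:
            break
        results.append(execute(remaining[0]))
    return results


def scripted_planner(goal: str, results: List[str]) -> List[str]:
    """Stub planner: a fixed script, minus the steps already completed."""
    steps = ["gather data", "draft report", "review report"]
    return steps[len(results):]


def echo_executor(step: str) -> str:
    """Stub executor: a real agent would call a tool or a model here."""
    return f"done: {step}"


if __name__ == "__main__":
    for line in run_agent("write a quarterly summary", scripted_planner, echo_executor):
        print(line)
```

In a real deployment the planner would be a model call and the executor would dispatch to tools. The step cap is the part worth keeping in any version, since it bounds how long the agent can run without human input.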

    This positions it squarely against models like Anthropic’s upcoming Mythos model and Google’s Gemini agent line — both of which also target agentic enterprise use cases. The difference is deployment model: Nemotron 3 Ultra is fully open-weights, making it attractive to regulated industries (finance, healthcare, government) that can’t send data to external APIs.

    Availability and the Full Computex Stack

    The model will be released on June 4, 2026, through Hugging Face, ModelScope, and OpenRouter, plus NVIDIA’s NIM microservice on build.nvidia.com for managed deployment. Enterprises can also access it as a managed API.

    Nemotron 3 Ultra sits atop a three-tier family: the base Nemotron 3 for lightweight tasks, the Super (120B parameters, launched March 2026) for mid-range enterprise work, and Ultra for the most demanding reasoning and agentic applications.

    Huang’s Computex keynote also included new chip announcements, a personal AI PC called Project DIGITS 2, and updates to the Cosmos 3 world-model platform — but Nemotron 3 Ultra was the headline for enterprise AI buyers.

    Why This Matters

    NVIDIA’s entry into the open-weights model race is a strategic pivot. For years, the company’s pitch was simple: buy our GPUs to run others’ models. Nemotron 3 changes that to: buy our GPUs, run our model, and deploy via our microservices. It’s a full-stack lock-in play.

    For developers and enterprises, Nemotron 3 Ultra is worth benchmarking — particularly for agentic use cases where on-premise deployment is a hard requirement. At 300+ tokens/second and 30% cheaper than comparable open-weight alternatives, the efficiency story is real.

    Whether it narrows the gap with China’s leading open models remains to be seen. For now, NVIDIA has America’s best open-weights model — and that’s a claim it didn’t hold two weeks ago.

