TradePoint.io
Nvidia’s new open weights Nemotron 3 Super combines three different architectures to beat gpt-oss and Qwen in throughput

March 11, 2026
in AI & Technology

Multi-agent systems, designed to handle long-horizon tasks like software engineering or cybersecurity triaging, can generate up to 15 times the token volume of standard chats — threatening their cost-effectiveness in handling enterprise tasks.

But today, Nvidia sought to help solve this problem with the release of Nemotron 3 Super, a 120-billion-parameter hybrid model, with weights posted on Hugging Face.

By merging disparate architectural philosophies (state-space models, transformers, and a novel “Latent” mixture-of-experts design), Nvidia is attempting to provide the specialized depth required for agentic workflows without the bloat typical of dense reasoning models, all under a license that permits commercial use of the openly published weights.

Triple hybrid architecture

At the core of Nemotron 3 Super is a sophisticated architectural triad that balances memory efficiency with precision reasoning. The model utilizes a Hybrid Mamba-Transformer backbone, which interleaves Mamba-2 layers with strategic Transformer attention layers.

To understand the implications for enterprise production, consider the “needle in a haystack” problem. Mamba-2 layers act like a “fast-travel” highway system, handling the vast majority of sequence processing with linear-time complexity. This allows the model to maintain a massive 1-million-token context window without the memory footprint of the KV cache exploding. However, pure state-space models often struggle with associative recall. 

To fix this, Nvidia strategically inserts Transformer attention layers as “global anchors,” ensuring the model can precisely retrieve specific facts buried deep within a codebase or a stack of financial reports.
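The interleaving idea can be sketched as a simple layer schedule. Note the 1-in-6 attention ratio below is an illustrative assumption for the sketch, not Nvidia's published layout:

```python
# Illustrative sketch of a hybrid backbone schedule: mostly linear-time
# Mamba-2 blocks, with a full-attention "anchor" layer inserted periodically.
# The ratio (one attention layer per six) is a made-up assumption here.

def hybrid_schedule(n_layers: int, attention_every: int = 6) -> list[str]:
    """Return a per-layer block type: Mamba-2 by default, with an
    attention layer every `attention_every` layers for precise recall."""
    return [
        "attention" if (i + 1) % attention_every == 0 else "mamba2"
        for i in range(n_layers)
    ]

schedule = hybrid_schedule(24)
print(schedule.count("mamba2"), schedule.count("attention"))  # → 20 4
```

Because the Mamba-2 layers carry state in constant memory, only the few attention anchors contribute to the KV cache, which is what keeps a 1-million-token context affordable.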

Beyond the backbone, the model introduces Latent Mixture-of-Experts (LatentMoE). Traditional Mixture-of-Experts (MoE) designs route tokens to experts in their full hidden dimension, which creates a computational bottleneck as models scale. LatentMoE solves this by projecting tokens into a compressed space before routing them to specialists. 

This “expert compression” allows the model to consult four times as many specialists for the exact same computational cost. This granularity is vital for agents that must switch between Python syntax, SQL logic, and conversational reasoning within a single turn.
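A minimal NumPy sketch of the routing idea, with toy dimensions and random weights (none of these sizes are Nemotron's): compress the token first, route and mix experts in the latent space, then project back to model width:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes for illustration only, not Nemotron's actual dimensions.
d_model, d_latent, n_experts, top_k = 64, 16, 8, 2

W_down = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
W_up = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)
W_router = rng.standard_normal((d_latent, n_experts)) / np.sqrt(d_latent)
experts = rng.standard_normal((n_experts, d_latent, d_latent)) / np.sqrt(d_latent)

def latent_moe(x):
    """x: (d_model,). Route in the compressed latent space, not full width."""
    z = x @ W_down                      # compress before routing
    logits = z @ W_router               # router scores per expert
    top = np.argsort(logits)[-top_k:]   # pick the top-k experts
    w = np.exp(logits[top]); w /= w.sum()
    out = sum(wi * (z @ experts[e]) for wi, e in zip(w, top))
    return out @ W_up                   # project back to model width

y = latent_moe(rng.standard_normal(d_model))
print(y.shape)  # → (64,)
```

Since each expert operates on `d_latent` rather than `d_model`, the per-expert cost shrinks quadratically with the compression ratio, which is what lets the router afford many more, finer-grained specialists.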

Further accelerating the model is Multi-Token Prediction (MTP). While standard models predict a single next token, MTP predicts several future tokens simultaneously. This serves as a “built-in draft model,” enabling native speculative decoding that can deliver up to 3x wall-clock speedups for structured generation tasks like code or tool calls.
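The accept/verify loop behind speculative decoding can be sketched with stand-in functions. Both “models” below are toy deterministic functions that happen to agree, and a real system verifies all drafted tokens in one batched forward pass rather than a Python loop:

```python
# Toy speculative-decoding loop. `draft_tokens` stands in for the MTP heads'
# multi-token guess; `target_next` stands in for the full model's next token.

def target_next(ctx):            # stand-in for one full forward pass
    return (sum(ctx) * 31 + 7) % 100

def draft_tokens(ctx, k):        # stand-in for the MTP heads: guess k ahead
    out, c = [], list(ctx)
    for _ in range(k):
        t = (sum(c) * 31 + 7) % 100   # this toy draft always agrees
        out.append(t); c.append(t)
    return out

def speculative_step(ctx, k=4):
    """Verify k drafted tokens; accept the longest prefix the target agrees with."""
    guess, accepted, c = draft_tokens(ctx, k), [], list(ctx)
    for g in guess:
        t = target_next(c)       # verification (batched in real systems)
        if t != g:
            accepted.append(t)   # replace the first mismatch with target's token
            break
        accepted.append(t); c.append(t)
    return accepted

print(speculative_step([1, 2, 3]))  # → [93, 76, 32, 24]
```

When the draft agrees often, each verification pass commits several tokens at once instead of one, which is where the claimed wall-clock speedup on predictable output like code and tool calls comes from.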

The Blackwell advantage

For enterprises, the most significant technical leap in Nemotron 3 Super is its optimization for the Nvidia Blackwell GPU platform. By pre-training natively in NVFP4 (4-bit floating point), Nvidia has achieved a breakthrough in production efficiency.

On Blackwell, the model delivers 4x faster inference than 8-bit models running on the previous Hopper architecture, with no loss in accuracy.
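To see what 4-bit floating point means concretely, here is a rough sketch of rounding values onto the FP4 E2M1 grid with a single scale factor. NVFP4 actually pairs the 4-bit elements with fine-grained per-block scales, so this per-tensor scale is a deliberate simplification:

```python
import numpy as np

# The magnitudes representable by an FP4 E2M1 element: sign × {0, .5, 1, 1.5, 2, 3, 4, 6}.
_pos = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-_pos[::-1], [0.0], _pos])  # 15 distinct values

def quantize_fp4(x):
    """Round x onto the FP4 grid using one shared scale (a simplification;
    NVFP4 uses per-block scales)."""
    scale = max(np.abs(x).max() / 6.0, 1e-12)          # map max magnitude to ±6
    idx = np.abs(x[:, None] / scale - FP4_GRID[None, :]).argmin(axis=1)
    return FP4_GRID[idx] * scale, scale

x = np.array([0.1, -1.7, 0.9, 3.2])
xq, s = quantize_fp4(x)
print(xq)  # rounds each entry onto the FP4 grid: [0.0, -1.6, 0.8, 3.2]
```

With only 15 representable values per element, the per-block scaling that NVFP4 adds is what keeps quantization error low enough to train in, rather than merely infer in, 4-bit precision.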

In practical performance, Nemotron 3 Super is a specialized tool for agentic reasoning.

It currently holds the No. 1 position on the DeepResearch Bench, a benchmark measuring an AI’s ability to conduct thorough, multi-step research across large document sets.

| Benchmark | Nemotron 3 Super | Qwen3.5-122B-A10B | GPT-OSS-120B |
| --- | --- | --- | --- |
| General Knowledge | | | |
| MMLU-Pro | 83.73 | 86.70 | 81.00 |
| Reasoning | | | |
| AIME25 (no tools) | 90.21 | 90.36 | 92.50 |
| HMMT Feb25 (no tools) | 93.67 | 91.40 | 90.00 |
| HMMT Feb25 (with tools) | 94.73 | 89.55 | — |
| GPQA (no tools) | 79.23 | 86.60 | 80.10 |
| GPQA (with tools) | 82.70 | — | 80.09 |
| LiveCodeBench (v5 2024-07↔2024-12) | 81.19 | 78.93 | 88.00 |
| SciCode (subtask) | 42.05 | 42.00 | 39.00 |
| HLE (no tools) | 18.26 | 25.30 | 14.90 |
| HLE (with tools) | 22.82 | — | 19.00 |
| Agentic | | | |
| Terminal Bench (hard subset) | 25.78 | 26.80 | 24.00 |
| Terminal Bench Core 2.0 | 31.00 | 37.50 | 18.70 |
| SWE-Bench (OpenHands) | 60.47 | 66.40 | 41.90 |
| SWE-Bench (OpenCode) | 59.20 | 67.40 | — |
| SWE-Bench (Codex) | 53.73 | 61.20 | — |
| SWE-Bench Multilingual (OpenHands) | 45.78 | — | 30.80 |
| TauBench V2: Airline | 56.25 | 66.00 | 49.20 |
| TauBench V2: Retail | 62.83 | 62.60 | 67.80 |
| TauBench V2: Telecom | 64.36 | 95.00 | 66.00 |
| TauBench V2: Average | 61.15 | 74.53 | 61.00 |
| BrowseComp with Search | 31.28 | — | 33.89 |
| BIRD Bench | 41.80 | — | 38.25 |
| Chat & Instruction Following | | | |
| IFBench (prompt) | 72.56 | 73.77 | 68.32 |
| Scale AI Multi-Challenge | 55.23 | 61.50 | 58.29 |
| Arena-Hard-V2 | 73.88 | 75.15 | 90.26 |
| Long Context | | | |
| AA-LCR | 58.31 | 66.90 | 51.00 |
| RULER @ 256k | 96.30 | 96.74 | 52.30 |
| RULER @ 512k | 95.67 | 95.95 | 46.70 |
| RULER @ 1M | 91.75 | 91.33 | 22.30 |
| Multilingual | | | |
| MMLU-ProX (avg over langs) | 79.36 | 85.06 | 76.59 |
| WMT24++ (en→xx) | 86.67 | 87.84 | 88.89 |

It also demonstrates significant throughput advantages, achieving up to 2.2x higher throughput than gpt-oss-120B and 7.5x higher than Qwen3.5-122B in high-volume settings.

[Chart: Nvidia Nemotron 3 Super key benchmarks. Source: Nvidia]

Custom ‘open’ license — commercial usage but with important caveats 

The release of Nemotron 3 Super under the Nvidia Open Model License Agreement (updated October 2025) provides a permissive framework for enterprise adoption, though it carries distinct “safeguard” clauses that differentiate it from pure open-source licenses like MIT or Apache 2.0.

Key Provisions for Enterprise Users:

  • Commercial Usability: The license explicitly states that models are “commercially usable” and grants a perpetual, worldwide, royalty-free license to sell and distribute products built on the model.

  • Ownership of Output: Nvidia makes no claim to the outputs generated by the model; the responsibility for those outputs—and the ownership of them—rests entirely with the user.

  • Derivative Works: Enterprises are free to create and own “Derivative Models” (fine-tuned versions), provided they include the required attribution notice: “Licensed by Nvidia Corporation under the Nvidia Open Model License.”

The “Red Lines”:

The license includes two critical termination triggers that production teams must monitor:

  1. Safety Guardrails: The license automatically terminates if a user bypasses or circumvents the model’s “Guardrails” (technical limitations or safety hyperparameters) without implementing a “substantially similar” replacement appropriate for the use case.

  2. Litigation Trigger: If a user institutes copyright or patent litigation against Nvidia alleging that the model infringes on their IP, their license to use the model terminates immediately.

This structure allows Nvidia to foster a commercial ecosystem while protecting itself from “IP trolling” and ensuring that the model isn’t stripped of its safety features for malicious use.

‘The team really cooked’

The release has generated significant buzz within the developer community. Chris Alexiuk, a Senior Product Research Engineer at Nvidia, heralded the launch on X under his handle @llm_wizard as a “SUPER DAY,” emphasizing the model’s speed and transparency. “Model is: FAST. Model is: SMART. Model is: THE MOST OPEN MODEL WE’VE DONE YET,” Chris posted, highlighting the release of not just weights, but 10 trillion tokens of training data and recipes.

The industry adoption reflects this enthusiasm:

  • Cloud and Hardware: The model is being deployed as an Nvidia NIM microservice, allowing it to run on-premises via the Dell AI Factory or HPE, as well as on Google Cloud and Oracle, with AWS and Azure to follow shortly.

  • Production Agents: Companies like CodeRabbit (software development) and Greptile are integrating the model to handle large-scale codebase analysis, while industrial leaders like Siemens and Palantir are deploying it to automate complex workflows in manufacturing and cybersecurity.
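NIM microservices expose an OpenAI-compatible chat completions API, so a request to a deployed instance would look roughly like the payload below. The model identifier and endpoint path are placeholders for illustration, not confirmed values from this release:

```python
import json

# Hypothetical request body for an OpenAI-compatible NIM chat endpoint
# (e.g. POST <your-nim-host>/v1/chat/completions). The model name below
# is a placeholder assumption, not the confirmed catalog identifier.
payload = {
    "model": "nvidia/nemotron-3-super",   # placeholder identifier
    "messages": [
        {"role": "system", "content": "You are a code-review agent."},
        {"role": "user", "content": "Summarize the open issues in this repo."},
    ],
    "max_tokens": 512,
    "temperature": 0.2,
}
body = json.dumps(payload)
print(len(body) > 0)  # → True
```

Because the interface mirrors the OpenAI schema, existing agent frameworks can typically point at a NIM endpoint by swapping the base URL and model name rather than rewriting integration code.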

As Kari Briski, Nvidia VP of AI Software, noted: “As companies move beyond chatbots and into multi-agent applications, they encounter… context explosion.”

Nemotron 3 Super is Nvidia’s answer to that explosion—a model that provides the “brainpower” of a 120B parameter system with the operational efficiency of a much smaller specialist. For the enterprise, the message is clear: the “thinking tax” is finally coming down.

