TradePoint.io

Nvidia BlueField-4 STX adds a context memory layer to storage to close the agentic AI throughput gap

March 16, 2026
in AI & Technology

When an AI agent loses context mid-task because traditional storage can’t keep pace with inference, it is not a model problem — it is a storage problem. At GTC 2026, Nvidia announced BlueField-4 STX, a modular reference architecture that inserts a dedicated context memory layer between GPUs and traditional storage, claiming 5x the token throughput, 4x the energy efficiency and 2x the data ingestion speed of conventional CPU-based storage.


The bottleneck STX targets is key-value cache data. KV cache is the stored record of what a model has already processed — the intermediate calculations an LLM saves so it does not have to recompute attention across the entire context on every inference step. It is what allows an agent to maintain coherent working memory across sessions, tool calls and reasoning steps. As context windows grow and agents take more steps, that cache grows with them. When it has to traverse a traditional storage path to get back to the GPU, inference slows and GPU utilization drops.
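The mechanics described above can be sketched in a few lines. This is a toy illustration (not Nvidia code) of why a KV cache exists: each decode step projects only the newest token's key and value and appends them to the cache, instead of recomputing attention inputs for the entire context. The cache then grows linearly with context length, which is exactly the data STX is built to hold close to the GPU.

```python
# Toy sketch of KV caching in single-head attention (assumed, simplified math).
import numpy as np

d = 8  # toy head dimension

def attend(q, K, V):
    # Scaled dot-product attention for a single query vector.
    scores = K @ q / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

rng = np.random.default_rng(0)
K_cache = np.empty((0, d))  # cached keys from prior steps
V_cache = np.empty((0, d))  # cached values from prior steps

for step in range(4):
    # Each decode step produces ONE new key/value/query ...
    k_new, v_new, q = rng.normal(size=(3, d))
    # ... appended to the cache, so history is never reprojected.
    K_cache = np.vstack([K_cache, k_new])
    V_cache = np.vstack([V_cache, v_new])
    out = attend(q, K_cache, V_cache)

# The cache grows with every step -- the storage problem STX targets.
print(K_cache.shape)  # (4, 8)
```

With real models the same growth happens per layer and per head, which is why long-context agents accumulate gigabytes of cache that must be parked somewhere faster than general-purpose storage.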

STX is not a product Nvidia sells directly. It is a reference architecture the company is distributing to its storage partner ecosystem so vendors can build AI-native infrastructure around it.

STX puts a context memory layer between GPU and disk

The architecture is built around a new storage-optimized BlueField-4 processor that combines Nvidia’s Vera CPU with the ConnectX-9 SuperNIC. It runs on Spectrum-X Ethernet networking and is programmable through Nvidia’s DOCA software platform.

The first rack-scale implementation is the Nvidia CMX context memory storage platform. CMX extends GPU memory with a high-performance context layer designed specifically for storing and retrieving KV cache data generated by large language models during inference. Keeping that cache accessible without forcing a round trip through general-purpose storage is what CMX is designed to do.
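Nvidia has not published CMX internals, but the tiering idea it describes can be sketched as a two-level cache: a small fast tier standing in for GPU memory, backed by a larger context tier, so evicted KV entries are promoted back on demand rather than fetched from general-purpose storage. All names here (`TieredKVStore`, `session_id`) are hypothetical.

```python
# Hypothetical sketch of a context-memory tier for KV-cache blobs.
from collections import OrderedDict

class TieredKVStore:
    def __init__(self, gpu_capacity):
        self.gpu = OrderedDict()   # fast tier, limited capacity (stand-in for HBM)
        self.context = {}          # larger context tier (stand-in for CMX)
        self.gpu_capacity = gpu_capacity

    def put(self, session_id, kv_blob):
        self.gpu[session_id] = kv_blob
        self.gpu.move_to_end(session_id)
        while len(self.gpu) > self.gpu_capacity:
            # Spill least-recently-used entries to the context tier, not to disk.
            evicted_id, evicted_blob = self.gpu.popitem(last=False)
            self.context[evicted_id] = evicted_blob

    def get(self, session_id):
        if session_id in self.gpu:
            self.gpu.move_to_end(session_id)
            return self.gpu[session_id]
        # On a miss, promote the blob back from the context tier.
        blob = self.context.pop(session_id)
        self.put(session_id, blob)
        return blob

store = TieredKVStore(gpu_capacity=2)
store.put("agent-a", b"kv-a")
store.put("agent-b", b"kv-b")
store.put("agent-c", b"kv-c")            # evicts agent-a to the context tier
assert store.get("agent-a") == b"kv-a"   # retrieved and promoted back
```

The design point is that the miss path stays inside a purpose-built context layer, which is the behavior Nvidia claims CMX provides at rack scale.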

“Traditional data centers provide high-capacity, general-purpose storage, but generally lack the responsiveness required for interaction with AI agents that need to work across many steps, tools and different sessions,” Ian Buck, Nvidia’s vice president of hyperscale and high-performance computing, said in a briefing with press and analysts.

In response to a question from VentureBeat, Buck confirmed that STX also ships with a software reference platform alongside the hardware architecture. Nvidia is expanding DOCA to include a new component referred to in the briefing as DOCA Memo. 

“Our storage providers can leverage the programmability of the BlueField-4 processor to optimize storage for the agentic AI factory,” Buck said. “In addition to having a reference rack architecture, we’re also providing a reference software platform for them to deliver those innovations and optimizations for their customers.”

Storage partners building on STX get both a hardware reference design and a software reference platform — a programmable foundation for context-optimized storage.

Nvidia’s partner list spans storage incumbents and AI-native cloud providers

Storage providers co-designing STX-based infrastructure include Cloudian, DDN, Dell Technologies, Everpure, Hitachi Vantara, HPE, IBM, MinIO, NetApp, Nutanix, VAST Data and WEKA. Manufacturing partners building STX-based systems include AIC, Supermicro and Quanta Cloud Technology.

On the cloud and AI side, CoreWeave, Crusoe, IREN, Lambda, Mistral AI, Nebius, Oracle Cloud Infrastructure and Vultr have all committed to STX for context memory storage.

That combination of enterprise storage incumbents and AI-native cloud providers is the signal worth watching. Nvidia is not positioning STX as a specialty product for hyperscalers. It is positioning it as the reference standard for anyone building storage infrastructure that has to serve agentic AI workloads — which, within the next two to three years, is likely to include most enterprise AI deployments running multi-step inference at scale.

STX-based platforms will be available from partners in the second half of 2026.

IBM shows what the data layer problem looks like in production

IBM sits on both sides of the STX announcement. It is listed as a storage provider co-designing STX-based infrastructure, and Nvidia separately confirmed that it has selected IBM Storage Scale System 6000 — certified and validated on Nvidia DGX platforms — as the high-performance storage foundation for its own GPU-native analytics infrastructure.

IBM also announced a broader expanded collaboration with Nvidia at GTC, including GPU-accelerated integration between IBM’s watsonx.data Presto SQL engine and Nvidia’s cuDF library. A production proof of concept with Nestlé put numbers on what that acceleration looks like: a data refresh cycle across the company’s Order-to-Cash data mart, covering 186 countries and 44 tables, dropped from 15 minutes to three minutes. IBM reported 83% cost savings and a 30x price-performance improvement.
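As a sanity check, the reported figures are internally consistent under one plausible reading: price-performance as speedup divided by relative cost. That interpretation is an assumption; IBM did not publish its formula.

```python
# Checking whether 15 min -> 3 min, 83% savings, and 30x price-performance line up.
baseline_minutes = 15
accelerated_minutes = 3
speedup = baseline_minutes / accelerated_minutes   # 5.0x faster refresh

cost_savings = 0.83                                # 83% reported savings
relative_cost = 1 - cost_savings                   # 0.17 of the original cost

# Assumed definition: price-performance = speedup / relative cost.
price_performance = speedup / relative_cost
print(round(price_performance, 1))  # 29.4, consistent with the reported ~30x
```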

The Nestlé result is a structured analytics workload. It does not directly demonstrate agentic inference performance. But it makes IBM and Nvidia’s shared argument concrete: the data layer is where enterprise AI performance is currently constrained, and GPU-accelerating it produces material results in production.

Why the storage layer is becoming a first-class infrastructure decision

STX is a signal that the storage layer is becoming a first-class concern in enterprise AI infrastructure planning, not an afterthought to GPU procurement.

General-purpose NAS and object storage were not designed to serve KV cache data at inference latency requirements. STX-based systems from partners including Dell, HPE, NetApp and VAST Data are what Nvidia is putting forward as the practical alternative, with the DOCA software platform providing the programmability layer to tune storage behavior for specific agentic workloads.

The performance claims — 5x token throughput, 4x energy efficiency, 2x data ingestion — are measured against traditional CPU-based storage architectures. Nvidia has not specified the exact baseline configuration for those comparisons. Before those numbers drive infrastructure decisions, the baseline is worth pinning down.

Platforms are expected from partners in the second half of 2026. Given that most major storage vendors are already co-designing on STX, enterprises evaluating storage refreshes for AI infrastructure in the next 12 months should expect STX-based options to be available from their existing vendor relationships.

Credit: Source link

