TradePoint.io

Taalas is replacing programmable GPUs with hardwired AI chips to achieve 17,000 tokens per second for ubiquitous inference

February 23, 2026
in AI & Technology

In the high-stakes world of AI infrastructure, the industry has operated under a singular assumption: flexibility is king. We build general-purpose GPUs because AI models change every week, and we need programmable silicon that can adapt to the next research breakthrough.

But Taalas, a Toronto-based startup, thinks that flexibility is exactly what’s holding AI back. According to the Taalas team, if we want AI to be as common and cheap as plastic, we have to stop ‘simulating’ intelligence on general-purpose computers and start ‘casting’ it directly into silicon.

The Problem: The ‘Memory Wall’ and the GPU Tax

The current cost of running a Large Language Model (LLM) is driven by a physical bottleneck: the Memory Wall.

Traditional processors (GPUs) are ‘Instruction Set Architecture’ (ISA) based, and they keep compute and memory separate. When you run an inference pass on a model like Llama 3, the chip spends the vast majority of its time and energy shuttling weights from High Bandwidth Memory (HBM) to the processing cores. This ‘data movement tax’ accounts for nearly 90% of the power consumption in modern AI data centers.
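To make the bottleneck concrete, here is a rough back-of-the-envelope sketch. It assumes that generating each token for a single user requires streaming every weight from memory once, so the decode rate cannot exceed memory bandwidth divided by model size. The function name and the input figures (8B parameters at 16-bit precision, roughly 3,350 GB/s of HBM bandwidth for an H100 SXM part) are illustrative assumptions, not numbers from Taalas.

```python
def bandwidth_bound_tokens_per_second(params_billion, bytes_per_param, mem_bandwidth_gb_s):
    """Upper bound on single-stream decode rate when every generated token
    must stream all model weights from memory once."""
    # billions of parameters * bytes per parameter = model size in GB
    model_size_gb = params_billion * bytes_per_param
    return mem_bandwidth_gb_s / model_size_gb


# Assumed figures (not from the article): an 8B-parameter model stored in
# 16-bit precision, and about 3,350 GB/s of HBM bandwidth.
ceiling = bandwidth_bound_tokens_per_second(8, 2, 3350)
print(f"{ceiling:.0f} tokens/s upper bound")
```

Under these assumptions the ceiling lands in the low hundreds of tokens per second for one user, which is why faster arithmetic units alone do not help: the weights cannot arrive any sooner.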

Taalas’s solution is radical: eliminate the memory-fetch cycle. By using a proprietary automated design flow, Taalas translates the computational graph of a specific model directly into the physical layout of a chip. In their HC1 (Hardcore 1) chip, the model’s weights and architecture are literally etched into the wiring of the silicon.
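A loose software analogy may help here. The sketch below is not Taalas's design flow, which is proprietary and operates on silicon rather than source code. It only illustrates the general idea of specialization: the first function treats weights as data that must be fetched on every call, while the second is generated with the weights written into its body as constants. All names are hypothetical.

```python
def generic_dot(weights, x):
    """'Programmable' path: weights are data, read on every call."""
    return sum(w * xi for w, xi in zip(weights, x))


def hardwire(weights):
    """Generate a function whose source contains the weights as literals.

    Purely an analogy for baking parameters into the structure of the
    computation; assumes a non-empty list of weights.
    """
    terms = " + ".join(f"{w!r} * x[{i}]" for i, w in enumerate(weights))
    source = f"def hardwired_dot(x):\n    return {terms}\n"
    namespace = {}
    exec(source, namespace)  # compile the specialized function
    return namespace["hardwired_dot"], source


weights = [0.5, -1.0, 2.0]
hardwired_dot, source = hardwire(weights)
x = [4.0, 3.0, 1.0]

# Both paths compute the same result; only where the weights live differs.
assert hardwired_dot(x) == generic_dot(weights, x)
print(source)
```

The specialized function can no longer run any other set of weights, which is the same trade the article describes: the fetch step disappears, and so does the flexibility.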

https://taalas.com/the-path-to-ubiquitous-ai/

Hardcore Models: 17,000 Tokens Per Second

The results of this ‘direct-to-silicon’ approach redefine the performance ceiling for inference. At their latest unveiling, Taalas demonstrated the HC1 running a Llama 3.1 8B model. While a top-tier NVIDIA H100 might serve a single user at ~150 tokens per second, the HC1 serves a staggering 16,000 to 17,000 tokens per second.

This changes the ‘unit economics’ of AI:

  • Performance: A single HC1 chip can outperform a small GPU data center in terms of raw throughput for a specific model.
  • Efficiency: Taalas claims a 1000x improvement in efficiency (performance-per-watt and performance-per-dollar) compared to conventional chips.
  • Infrastructure: Because the weights are hardwired, there is no need for external HBM or complex liquid cooling systems. A standard air-cooled rack can house ten of these 250W cards, delivering the power of an entire GPU cluster in a single server box.
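The figures quoted above can be combined into a quick sanity check. The sketch below uses only numbers stated in this article and assumes, for simplicity, that every card sustains the top quoted rate at its full 250W. Note that the ratio it prints compares single-user speed only; it is not the 1000x efficiency figure, which Taalas states in performance-per-watt and performance-per-dollar terms.

```python
# Figures quoted in the article
HC1_TOKENS_PER_S = 17_000    # upper end of the 16,000 to 17,000 range
H100_TOKENS_PER_S = 150      # single-user estimate for an H100
CARD_POWER_W = 250
CARDS_PER_RACK = 10

# Derived quantities (assumes each card sustains the peak rate at full power)
speedup = HC1_TOKENS_PER_S / H100_TOKENS_PER_S
rack_power_kw = CARD_POWER_W * CARDS_PER_RACK / 1000
rack_tokens_per_s = HC1_TOKENS_PER_S * CARDS_PER_RACK
tokens_per_joule = HC1_TOKENS_PER_S / CARD_POWER_W

print(f"Single-stream speedup: {speedup:.0f}x")
print(f"Rack power: {rack_power_kw} kW")
print(f"Rack throughput: {rack_tokens_per_s:,} tokens/s")
print(f"Energy efficiency: {tokens_per_joule:.0f} tokens per joule")
```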

Breaking the 60-Day Barrier: The Automated Foundry

The obvious ‘catch’ for an AI developer is flexibility. If you hardwire a model into a chip today, what happens when a better model comes out tomorrow? Historically, designing an ASIC (Application-Specific Integrated Circuit) took two years and tens of millions of dollars.

Taalas has solved this through automation. They have built a compiler-like foundry system that takes model weights and generates a chip design in roughly a week. By focusing on a streamlined manufacturing workflow—where they only change the top metal masks of the silicon—they have collapsed the turnaround time from ‘weights-to-silicon’ to just two months.

This allows for a ‘seasonal’ hardware cycle. A company could fine-tune a frontier model in the spring and have thousands of specialized, hyper-efficient inference chips deployed by summer.
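As a rough illustration of that seasonal cadence, the sketch below turns the two quoted durations into calendar dates. It treats "roughly a week" as 7 days and "two months" as 60 days, and the freeze date is a made-up example; the function and constant names are hypothetical.

```python
from datetime import date, timedelta

DESIGN_DAYS = 7        # "roughly a week" to generate the chip design
TURNAROUND_DAYS = 60   # "two months" from weights to silicon, taken as 60 days


def silicon_schedule(weights_frozen):
    """Return milestone dates for a model whose weights are frozen on a given day."""
    return {
        "design_ready": weights_frozen + timedelta(days=DESIGN_DAYS),
        "chips_ready": weights_frozen + timedelta(days=TURNAROUND_DAYS),
    }


# Hypothetical example: weights frozen in mid-April
schedule = silicon_schedule(date(2026, 4, 15))
for milestone, day in schedule.items():
    print(f"{milestone}: {day.isoformat()}")
```

With a mid-April freeze, the design would be ready in late April and silicon in mid-June, which matches the spring-to-summer framing.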

The Market Shift: From Shovels to Stamps

This transition marks a pivotal moment in the AI hype cycle. We are moving from the ‘Research & Training’ phase—where GPUs are essential for their flexibility—to the ‘Deployment & Inference’ phase, where cost-per-token is the only metric that matters.

If Taalas succeeds, the AI market will split into two distinct tiers:

  1. General-Purpose Training: Led by NVIDIA and AMD, providing the massive, flexible clusters needed to discover and train new architectures.
  2. Specialized Inference: Led by ‘foundries’ like Taalas, which take those proven architectures and ‘print’ them into cheap, ubiquitous silicon for everything from smartphones to industrial sensors.

Key Takeaways

  • The ‘Hardwired’ Paradigm Shift: Taalas is moving from software-defined AI (running models on general-purpose GPUs) to hardware-defined AI. By ‘baking’ a specific model’s weights and architecture directly into the silicon, they eliminate the need for traditional instruction-set overhead, effectively making the model the processor itself.
  • Death of the Memory Wall: Traditional AI hardware wastes ~90% of its energy moving data between memory and compute. Taalas’s HC1 (Hardcore 1) chip eliminates the “Memory Wall” by physically wiring the model parameters into the chip’s metal layers, removing the need for expensive High Bandwidth Memory (HBM).
  • 1000x Efficiency Leap: By stripping away the ‘programmability tax’, Taalas claims a 1,000x improvement in performance-per-watt and performance-per-dollar. In practice, this means an HC1 can hit 17,000 tokens per second on a Llama 3.1 8B model—massively outperforming a standard GPU rack while using far less power.
  • Automated ‘Direct-to-Silicon’ Foundry: To solve the problem of model obsolescence, Taalas uses a proprietary automated design flow. It generates a chip design in roughly a week and cuts the full weights-to-silicon turnaround from about two years to about two months, allowing companies to ‘print’ their fine-tuned models into silicon on a seasonal basis.
  • The Commodity AI Future: This technology signals a shift from ‘Cloud-First’ to ‘Device-Native’ AI. As inference becomes a cheap, hardwired commodity, AI will move off centralized servers and into local, low-power hardware—ranging from smartphones to industrial sensors—with zero latency and no subscription costs.

The post Taalas is replacing programmable GPUs with hardwired AI chips to achieve 17,000 tokens per second for ubiquitous inference appeared first on MarkTechPost.

©2020- TradePoint.io - All rights reserved!

Tradepoint.io, being just a publishing and technology platform, is not a registered broker-dealer or investment adviser. So we do not provide investment advice. Rather, brokerage services are provided to clients of Tradepoint.io by independent SEC-registered broker-dealers and members of FINRA/SIPC. Every form of investing carries some risk and past performance is not a guarantee of future results. “Tradepoint.io“, “Instant Investing” and “My Trading Tools” are registered trademarks of Apperbuild, LLC.

This website is operated by Apperbuild, LLC. We have no link to any brokerage firm and we do not provide investment advice. Every information and resource we provide is solely for the education of our readers. © 2020 Apperbuild, LLC. All rights reserved.
