LLMs Can Now Retain High Accuracy at 2-Bit Precision: Researchers from UNC Chapel Hill Introduce TACQ, a Task-Aware Quantization Approach that Preserves Critical Weight Circuits for Compression Without Performance Loss

April 22, 2025
in AI & Technology
Reading Time: 4 mins read
LLMs show impressive capabilities across numerous applications, yet their computational and memory demands make deployment challenging. The problem is acute in scenarios requiring local deployment for privacy, such as processing sensitive patient records, and in compute-constrained environments such as real-time customer service systems and edge devices. Post-training quantization (PTQ) is a promising solution that compresses pre-trained models efficiently, reducing memory consumption by 2-4 times. However, current methods hit a bottleneck at 4-bit compression, with substantial performance degradation at 2- or 3-bit precision. Most PTQ methods rely on small mini-batches of general-purpose pre-training data to account for the activation changes that quantization causes.
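The 2-4x savings follow directly from the per-weight bit-width. A back-of-envelope sketch (the 7B parameter count is an illustrative assumption, not a figure from the paper):

```python
# Weight memory scales linearly with bits per weight: bytes = params * bits / 8.
def weight_gib(params: float, bits: int) -> float:
    """Gigabytes needed to store `params` weights at `bits` per weight."""
    return params * bits / 8 / 1e9

params = 7e9  # assumed model size for illustration
for bits in (16, 4, 2):
    print(f"{bits:>2}-bit weights: {weight_gib(params, bits):.2f} GB")
# 16-bit: 14.00 GB; 4-bit: 3.50 GB (4x smaller); 2-bit: 1.75 GB (8x smaller)
```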

Current methods for LLM compression fall primarily into three categories. Uniform quantization is the most basic approach: weights stored as 16-bit float tensors are compressed by treating each row independently, mapping floats to integers based on the maximum and minimum values within each channel. GPTQ-based quantization techniques advance this concept through layerwise reconstruction, aiming to minimize reconstruction loss after quantization. Finally, mixed-precision quantization methods offer a more nuanced strategy, moving beyond a fixed precision for all weights. These techniques assign bit-widths based on weight importance to maintain performance, with some approaches preserving high-sensitivity “outlier” weights at higher precision.
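The per-channel uniform scheme described above can be sketched in a few lines of numpy (a minimal illustration of the idea, not any particular library's implementation):

```python
import numpy as np

def uniform_quantize_rowwise(W: np.ndarray, bits: int = 4) -> np.ndarray:
    """Per-row (per-channel) uniform quantization: map each row's floats to
    integer codes in [0, 2**bits - 1] from that row's min/max, then dequantize."""
    qmax = 2**bits - 1
    w_min = W.min(axis=1, keepdims=True)
    w_max = W.max(axis=1, keepdims=True)
    # Guard against constant rows where max == min.
    scale = np.where(w_max > w_min, (w_max - w_min) / qmax, 1.0)
    codes = np.round((W - w_min) / scale)   # what would actually be stored
    return codes * scale + w_min            # dequantized approximation

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
W_hat = uniform_quantize_rowwise(W, bits=2)  # only 4 levels per row
```

Rounding to the nearest of the `2**bits` levels bounds the per-weight error by half a scale step, which is why the error grows sharply as the bit-width shrinks toward 2.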

Researchers from UNC Chapel Hill have proposed a novel mixed-precision post-training quantization approach called Task-Circuit Quantization (TACQ). The method parallels automated circuit discovery by directly conditioning the quantization process on specific weight circuits, defined as sets of weights associated with downstream task performance. TACQ compares unquantized model weights with uniformly quantized ones to estimate the expected weight change from quantization, then uses gradient information to predict the impact on task performance, enabling preservation of task-specific weights. TACQ consistently outperforms baselines with the same calibration data and lower weight budgets, and achieves significant improvements in the challenging 2-bit and 3-bit regimes.

TACQ is defined by a saliency metric that identifies critical weights to preserve during quantization, building on concepts from model interpretability like automatic circuit discovery, knowledge localization, and input attribution. This metric uses two components:

  • Quantization-aware Localization (QAL): Traces how model performance is affected by estimating the expected weight changes due to quantization.
  • Magnitude-sharpened Gradient (MSG): A generalized metric for absolute weight importance adapted from input attribution techniques.

MSG helps stabilize TACQ and addresses biases from QAL’s estimations. These factors combine into a unified saliency metric that can be efficiently evaluated for every weight in a single backward pass, allowing preservation of the top p% highest-scoring weights at 16-bit precision.
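A minimal numpy sketch of this selection step (hypothetical names; the elementwise combination of QAL and MSG below is an assumption for illustration — the paper defines the exact metric):

```python
import numpy as np

def tacq_saliency_sketch(W, W_quant, grad, top_p=0.02):
    """Illustrative saliency in the spirit of TACQ.
    QAL: gradient times the expected quantization-induced weight change.
    MSG: gradient sharpened by absolute weight magnitude.
    The top p% of weights by combined score are kept at full precision."""
    qal = np.abs(grad * (W_quant - W))   # estimated loss impact of quantizing
    msg = np.abs(grad) * np.abs(W)       # magnitude-sharpened gradient
    score = qal * msg                    # assumed combination for this sketch
    k = max(1, int(round(top_p * W.size)))
    thresh = np.partition(score.ravel(), -k)[-k]
    keep = score >= thresh               # mask of weights preserved at 16-bit
    return np.where(keep, W, W_quant), keep
```

Because `grad` comes from a single backward pass on task calibration data, the whole score can be computed once per weight tensor, which matches the efficiency claim above.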

In the challenging 2-bit setting, TACQ outperforms SliM-LLM with absolute margin improvements of 16.0% (from 20.1% to 36.1%) on GSM8k, 14.1% (from 34.8% to 49.2%) on MMLU, and 21.9% (from 0% to 21.9%) on Spider. Other baseline methods like GPTQ, SqueezeLLM, and SPQR deteriorate to near-random performance at this compression level. At 3-bit precision, TACQ preserves approximately 91%, 96%, and 89% of the unquantized accuracy on GSM8k, MMLU, and Spider, respectively, while outperforming the strongest baseline, SliM-LLM, by 1-2% across most datasets. TACQ’s advantages become evident in generation tasks requiring sequential token outputs, where it is the only method capable of recovering non-negligible performance in the 2-bit setting for the Spider text-to-SQL task.

In conclusion, the researchers introduced TACQ, a significant advance in task-aware post-training quantization. It improves model performance at ultra-low bit-widths (2 to 3 bits), where previous methods degrade to near-random outputs. TACQ aligns with automatic circuit discovery research by selectively preserving only a small fraction of salient weights at 16-bit precision, indicating that sparse weight “circuits” disproportionately influence specific tasks. Moreover, the Spider experiments show that TACQ better preserves model generation capabilities, making it suitable for program-prediction tasks. This extends to agentic settings, where models frequently generate executable outputs and efficiency is a central concern.


Check out the Paper and GitHub Page.



Sajjad Ansari is a final-year undergraduate at IIT Kharagpur. As a tech enthusiast, he delves into the practical applications of AI, with a focus on understanding the impact of AI technologies and their real-world implications. He aims to articulate complex AI concepts in a clear and accessible manner.


