TradePoint.io
The ‘Finding Neurons in a Haystack’ Initiative at MIT, Harvard, and Northeastern University Employs Sparse Probing

May 10, 2023
in AI & Technology
Reading Time: 5 mins read

It is common to think of neural networks as adaptable “feature extractors” that learn by progressively refining appropriate representations from initial raw inputs. So, the question arises: what characteristics are being represented, and in what way? To better understand how high-level, human-interpretable features are described in the neuronal activations of LLMs, a research team from the Massachusetts Institute of Technology (MIT), Harvard University (HU), and Northeastern University (NEU) proposes a technique called sparse probing.

Typically, researchers train a simple classifier (a probe) on a model's internal activations to predict a property of the input, then examine the network to see whether and where it represents the feature in question. The proposed sparse probing method probes for more than 100 features to pinpoint the relevant neurons, overcoming the limitations of prior probing methods and shedding light on the intricate structure of LLMs. It restricts the probing classifier to using at most k neurons in its prediction, where k varies between 1 and 256.
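As a rough illustration of the mechanics (not the paper's actual code), a k-sparse probe can be sketched in two steps: rank neurons by how well they separate the two classes, then fit a linear probe restricted to the top k. The ranking heuristic and least-squares fit below are simplifications of the optimal sparse prediction techniques the paper uses.

```python
import numpy as np

def k_sparse_probe(acts, labels, k):
    """Rank neurons by class-mean separation, keep the top k, and fit a
    least-squares linear probe on just those neurons. A crude stand-in
    for the paper's optimal sparse probes, but it shows the mechanics."""
    pos, neg = acts[labels == 1], acts[labels == 0]
    idx = np.argsort(np.abs(pos.mean(0) - neg.mean(0)))[-k:]
    X = np.hstack([acts[:, idx], np.ones((len(acts), 1))])
    w, *_ = np.linalg.lstsq(X, labels.astype(float), rcond=None)
    return idx, w

def probe_accuracy(acts, labels, idx, w):
    X = np.hstack([acts[:, idx], np.ones((len(acts), 1))])
    return ((X @ w > 0.5) == labels).mean()

# Toy data: 400 "activations" over 50 neurons; neuron 7 carries the feature.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 400)
acts = rng.normal(size=(400, 50))
acts[:, 7] += 4.0 * labels          # one strongly feature-aligned neuron
idx, w = k_sparse_probe(acts, labels, k=1)
print(idx[0])                       # 7: the k=1 probe isolates the right neuron
```

Because the probe is capped at k neurons, high accuracy at small k is evidence that few neurons carry the feature, which is exactly the signal the paper exploits.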

The team uses state-of-the-art optimal sparse prediction techniques to achieve small-k optimality in the k-sparse feature selection subproblem and to disentangle ranking quality from classification accuracy. Sparsity serves as an inductive bias: it keeps the probes simple and pinpoints key neurons for granular examination. Moreover, because their limited capacity prevents the probes from memorizing correlation patterns associated with the features of interest, the technique yields a more reliable signal of whether a characteristic is explicitly represented and used downstream.


The research group ran their experiments on autoregressive transformer LLMs, reporting classification results after training probes with varying values of k. They draw the following conclusions from the study:

  • The neurons of LLMs contain a wealth of interpretable structure, and sparse probing is an efficient way to locate such neurons (even in superposition). Still, it must be used cautiously and followed up with further analysis if rigorous conclusions are to be drawn.
  • In early layers, many neurons activate for unrelated n-grams and local patterns, so features are encoded as sparse linear combinations of polysemantic neurons. Weight statistics and insights from toy models also suggest that the first 25% of fully connected layers make extensive use of superposition.
  • Although definitive conclusions about monosemanticity remain methodologically out of reach, monosemantic neurons, especially in middle layers, appear to encode higher-level contextual and linguistic properties (such as is_python_code).
  • While representation sparsity tends to rise with model size, the trend is not uniform: some features gain dedicated neurons at scale, others split into finer-grained features, and many remain unchanged or emerge seemingly at random.
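The superposition claim above can be made concrete with a toy model (my construction, not the paper's): assign each of many features a random direction in a smaller neuron space. Every neuron then responds to many features (polysemanticity), yet each feature remains linearly decodable because random directions in high dimensions are nearly orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_feats = 128, 512       # 4x more features than neurons
# Each feature gets a random direction in neuron space (superposition).
W = rng.normal(size=(n_feats, n_neurons)) / np.sqrt(n_neurons)

# Rows of `acts` = neuron activations when each feature fires alone.
# Every neuron has nonzero weight on essentially every feature, so each
# individual neuron is polysemantic by construction.
acts = np.eye(n_feats) @ W

# Yet a linear readout still recovers each feature: a feature's own
# direction correlates far more with its activations than any other's.
sims = acts @ W.T                   # (n_feats, n_feats) readout scores
decoded = sims.argmax(axis=1)
print((decoded == np.arange(n_feats)).mean())   # expect 1.0
```

This is why sparse probes can still find features "in superposition": the probe only needs a sparse linear combination of neurons, not a single dedicated one.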

A Few Benefits of Sparse Probing

  • The availability of probes with optimality guarantees further addresses the risk of conflating classification quality with ranking quality when investigating individual neurons.
  • In addition, sparse probes are deliberately given low capacity, so there is less concern that the probe learns the task on its own rather than reading off an existing representation.
  • Probing requires a supervised dataset, but once one is built it can be used to interpret any model, opening the door to research on questions such as the universality of learned circuits and the natural abstractions hypothesis.
  • Instead of relying on subjective assessments, it can be used to automatically examine how different architectural choices affect the prevalence of polysemanticity and superposition.

Limitations of Sparse Probing

  • Strong inferences can only be drawn from probing experiments when paired with a secondary investigation of the identified neurons.
  • Because probing is sensitive to implementation details, anomalies, misspecification, and spurious correlations in the probing dataset, it provides only limited insight into causation.
  • Sparse probes cannot recognize features constructed across multiple layers, nor can they distinguish features in superposition from features represented as the union of many distinct, finer-grained features.
  • If sparse probing misses significant neurons because of redundancy in the probing dataset, iterative pruning may be required to identify them all. Multi-token features require specialized processing, commonly implemented with aggregations that can further dilute the specificity of the result.
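The dilution problem with multi-token aggregation is easy to see in a toy example (mean-pooling is one common choice; the specific numbers here are illustrative):

```python
import numpy as np

def mean_pool_activations(token_acts):
    """Collapse per-token activations (seq_len, n_neurons) into one
    vector per example by mean-pooling over tokens. Pooling dilutes a
    feature that fires strongly on only a few tokens."""
    return token_acts.mean(axis=0)

# One "example": a feature neuron fires (5.0) on 2 of 10 tokens.
acts = np.zeros((10, 4))
acts[3, 2] = acts[7, 2] = 5.0
pooled = mean_pool_activations(acts)
print(pooled[2])    # 1.0: the two 5.0 spikes are averaged with 8 silent tokens
```

The pooled signal shrinks with sequence length even though the per-token evidence is strong, which is the specificity loss the limitation above describes.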

Using a novel sparse probing technique, the work unveils a wealth of rich, human-understandable structure in LLMs. The researchers plan to build an extensive repository of probing datasets, possibly with the help of AI, that capture details especially pertinent to bias, justice, safety, and high-stakes decision-making. They encourage other researchers to join in exploring this “ambitious interpretability” and argue that an empirical approach evocative of the natural sciences can be more productive than typical machine learning experimental loops. Large and diverse supervised datasets will also enable better evaluation of the next generation of unsupervised interpretability techniques needed to keep pace with AI advancement, in addition to automating the assessment of new models.


Check out the Paper.


Dhanshree Shenwai is a computer science engineer with experience at FinTech companies spanning the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone’s life easier.

