TradePoint.io

New AI Research From Stanford Presents an Alternative Explanation for Seemingly Sharp and Unpredictable Emergent Abilities of Large Language Models

May 4, 2023

Researchers have long explored the emergent features of complex systems, from physics to biology to mathematics. Nobel Prize-winning physicist P.W. Anderson’s commentary “More Is Different” is one notable example. It makes the case that as a system’s complexity rises, new properties may manifest that cannot (easily or at all) be predicted, even from a precise quantitative understanding of the system’s microscopic details. Emergence has lately attracted a lot of interest in machine learning, following observations that large language models (LLMs) such as GPT, PaLM, and LaMDA may exhibit so-called “emergent abilities” across a variety of tasks.

A recent, succinct definition holds that “emergent abilities of LLMs” are “abilities that are not present in smaller-scale models but are present in large-scale models; thus, they cannot be predicted by simply extrapolating the performance improvements on smaller-scale models.” Such emergent abilities may have first been observed in the GPT-3 family. Later works emphasized the finding, writing that “performance is predictable at a general level, performance on a specific task can sometimes emerge quite unpredictably and abruptly at scale”; in fact, these emergent abilities were considered so startling and remarkable that such “abrupt, specific capability scaling” was argued to be one of the two main defining features of LLMs. The phrases “sharp left turns” and “breakthrough capabilities” have also been used to describe the phenomenon.

These quotations identify the two characteristics that distinguish emergent abilities in LLMs:

1. Sharpness: transitioning seemingly instantaneously from absent to present.

2. Unpredictability: transitioning at model scales that seem impossible to foresee.

These newly discovered abilities have attracted a lot of interest, prompting questions such as: What determines which abilities will emerge? What determines when they will emerge? How can desirable abilities be made to emerge sooner while ensuring that undesirable ones never emerge? Emergent abilities underline the relevance of these questions for AI safety and alignment, because they warn that bigger models may one day, without notice, acquire unwanted mastery of hazardous skills.

In this study, researchers from Stanford question the claim that LLMs possess emergent abilities, or more precisely, abrupt and unforeseeable changes in model outputs as a function of model scale on particular tasks. Their skepticism stems from the observation that emergent abilities seem limited to metrics that nonlinearly or discontinuously scale any model’s per-token error rate. For instance, they show that on BIG-Bench tasks, more than 92% of emergent abilities appear under one of two metrics. Multiple Choice Grade is defined as 1 if the option with the highest probability is the correct one, and 0 otherwise. Exact String Match is defined as 1 if the output string exactly matches the target string, and 0 otherwise.
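To make the two metrics concrete, here is a minimal Python sketch of how they might be computed. The function names and the per-token comparison helper are assumptions made for illustration; they are not taken from the paper or from the BIG-Bench code.

```python
from typing import Sequence


def multiple_choice_grade(option_probs: Sequence[float], correct_index: int) -> int:
    """Return 1 if the highest-probability option is the correct one, else 0."""
    best_index = max(range(len(option_probs)), key=lambda i: option_probs[i])
    return int(best_index == correct_index)


def exact_string_match(output: str, target: str) -> int:
    """Return 1 only if the output equals the target character for character."""
    return int(output == target)


def token_match_fraction(output_tokens: Sequence[str], target_tokens: Sequence[str]) -> float:
    """Hypothetical smoother metric for contrast: fraction of target positions matched."""
    if not target_tokens:
        return 1.0
    matches = sum(o == t for o, t in zip(output_tokens, target_tokens))
    return matches / len(target_tokens)


# An answer that is wrong in only its last digit scores 0 under exact match,
# while the per-position comparison still credits the four correct digits.
print(exact_string_match("12346", "12345"))
print(token_match_fraction(list("12346"), list("12345")))
```

Both headline metrics are all-or-nothing per example, so a model that is almost right gets the same score as one that is entirely wrong. That is the sense in which they scale the per-token error rate discontinuously.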

This suggests an alternative explanation for LLMs’ emergent abilities: changes that appear abrupt and unpredictable may be an artifact of the researcher’s choice of measurement, even though the model family’s per-token error rate changes smoothly, continuously, and predictably with increasing model scale.

Specifically, they claim that emergent abilities are a mirage caused by three factors: the researcher’s choice of a metric that nonlinearly or discontinuously deforms per-token error rates; too little test data to accurately estimate the performance of smaller models (so that smaller models appear wholly incapable of performing the task); and the evaluation of too few large-scale models. They provide a simple mathematical model to express their alternative view and show that it quantitatively reproduces the evidence offered in support of emergent abilities in LLMs.
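The argument can be illustrated with a toy calculation. The sketch below assumes that per-token cross-entropy falls as a smooth power law in parameter count and that an exact-match answer requires every one of its tokens to be correct. The constants C, ALPHA, and SEQ_LEN are hypothetical values picked for illustration, not figures from the paper.

```python
import math

# Hypothetical constants chosen only for illustration; not values from the paper.
C = 1e8        # scale constant of the assumed power law
ALPHA = -0.5   # negative exponent: loss falls smoothly as models grow
SEQ_LEN = 10   # number of target tokens that must all be correct

MODEL_SIZES = [1e8, 1e9, 1e10, 1e11, 1e12]


def cross_entropy(n_params: float) -> float:
    """Assumed smooth power-law loss as a function of parameter count."""
    return (n_params / C) ** ALPHA


def per_token_accuracy(n_params: float) -> float:
    """Probability of producing the correct token, exp(-loss)."""
    return math.exp(-cross_entropy(n_params))


def exact_match_accuracy(n_params: float, seq_len: int = SEQ_LEN) -> float:
    """Nonlinear metric: all seq_len tokens must be right at once."""
    return per_token_accuracy(n_params) ** seq_len


def expected_token_errors(n_params: float, seq_len: int = SEQ_LEN) -> float:
    """Roughly linear metric: expected number of wrong tokens per answer."""
    return seq_len * (1.0 - per_token_accuracy(n_params))


for n in MODEL_SIZES:
    print(f"{n:8.0e}  per-token={per_token_accuracy(n):.3f}  "
          f"exact-match={exact_match_accuracy(n):.5f}  "
          f"token-errors={expected_token_errors(n):.2f}")
```

Under these assumptions, the smallest model gets roughly a third of its tokens right, yet its exact-match accuracy is far below one in a thousand, so a test set of a few hundred examples would most likely record a score of zero. The same underlying improvement looks gradual when measured as expected token errors and abrupt when measured as exact match.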

They then put their alternative explanation to the test in three complementary ways:

1. Using the InstructGPT / GPT-3 model family, they formulate, test, and confirm three predictions based on their alternative hypothesis.

2. They conduct a meta-analysis of previously published results and show that, in the space of task-metric-model family triplets, emergent abilities appear only for certain metrics and not for particular model families on particular tasks. They further show that, for outputs from fixed models, changing the metric makes the emergence phenomenon disappear.

3. They deliberately induce emergent abilities in deep neural networks of various architectures on several vision tasks (where, to the best of their knowledge, emergent abilities had never previously been demonstrated) to show how similar metric choices can produce what appear to be emergent abilities.



Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.

