Meet OpenLLaMA: An Open-Source Reproduction of Meta AI’s LLaMA Large Language Model

May 5, 2023
in AI & Technology
Reading Time: 4 mins read

A new development in large language models has emerged with the release of OpenLLaMA, an open-source reproduction of Meta AI's LLaMA model. Its creators have publicly released a permissively licensed 7B OpenLLaMA model trained on 200 billion tokens. The release includes PyTorch and JAX weights of the pre-trained models, evaluation results, and a comparison against the original LLaMA models. This development has significant implications for machine learning, particularly for researchers who need large language models but face challenges accessing proprietary ones.

The creators of OpenLLaMA have shared details of how they trained their models on the RedPajama dataset, a reproduction of the LLaMA training dataset containing over 1.2 trillion tokens. They followed the same preprocessing steps and training hyperparameters as the original LLaMA paper, including model architecture, context length, training steps, learning-rate schedule, and optimizer. The only difference between the two approaches is the dataset: OpenLLaMA uses RedPajama rather than the dataset used by the original LLaMA. A sketch of what those reused hyperparameters look like follows below.
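For concreteness, here is a minimal sketch of the 7B configuration being reused. The values are taken from the LLaMA paper itself, not from OpenLLaMA's code, so treat the exact numbers as an illustration rather than a dump of the actual training config.

```python
# Hypothetical sketch of the LLaMA-7B training setup that OpenLLaMA
# reports reusing; values come from the LLaMA paper, not from the
# OpenLLaMA codebase.
from dataclasses import dataclass

@dataclass(frozen=True)
class LLaMA7BConfig:
    # Model architecture (LLaMA paper, 7B row)
    dim: int = 4096            # hidden size
    n_layers: int = 32
    n_heads: int = 32
    context_length: int = 2048
    # Optimization (LLaMA paper, training details)
    peak_lr: float = 3e-4      # cosine schedule, decayed to 10% of peak
    warmup_steps: int = 2000
    adam_betas: tuple = (0.9, 0.95)
    weight_decay: float = 0.1
    grad_clip: float = 1.0
    tokens_per_batch: int = 4_000_000  # ~4M tokens per optimizer step

config = LLaMA7BConfig()
print(config)
```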

The models were trained on cloud TPU-v4s using EasyLM, a JAX-based training pipeline developed for training and fine-tuning language models. The team employed a combination of normal data parallelism and fully sharded data parallelism (also known as ZeRO stage 3) to balance training throughput against memory usage. Overall, the training run achieved a throughput of over 1,900 tokens per second per TPU-v4 chip.
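As an illustration of the general technique (not EasyLM's actual implementation), the sketch below shards both a parameter and a batch along a single JAX mesh axis, which is the essence of combining data parallelism with ZeRO-3-style fully sharded data parallelism: parameters live sharded across devices, and XLA inserts the implied all-gathers and reduce-scatters automatically.

```python
# Minimal sketch of ZeRO-3-style fully sharded data parallelism in JAX.
# Illustrative only; EasyLM's real sharding logic is more involved.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# One mesh axis: batches split across it (data parallelism) and
# parameters are sharded along it too (fully sharded / ZeRO-3 style).
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("fsdp",))

# A toy parameter, sharded on its first dimension across the mesh.
w = jax.device_put(jnp.zeros((4096, 4096)),
                   NamedSharding(mesh, P("fsdp", None)))

# A toy batch, sharded on the batch dimension across the same axis
# (batch size must divide evenly by the device count).
x = jax.device_put(jnp.ones((32, 4096)),
                   NamedSharding(mesh, P("fsdp", None)))

@jax.jit
def forward(w, x):
    # XLA inserts the collectives implied by the input shardings.
    return x @ w

y = forward(w, x)
print(y.sharding)
```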


The performance of OpenLLaMA was evaluated on several tasks using the lm-evaluation-harness. The results were compared against the original LLaMA model and against GPT-J, a 6B-parameter model trained on the Pile dataset by EleutherAI. The evaluation metrics for the original LLaMA model were generated by running it on the same tasks; they differ slightly from those reported in the original LLaMA paper, possibly because of differences in evaluation protocols. Even so, OpenLLaMA showed performance comparable to or better than the original LLaMA and GPT-J across most tasks. Given that OpenLLaMA was trained on 200 billion tokens, versus the 1 trillion tokens used for the original LLaMA and 500 billion used for GPT-J, its performance is expected to improve further once training on 1 trillion tokens is complete.
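A hedged sketch of driving such an evaluation with EleutherAI's lm-evaluation-harness follows. The simple_evaluate entry point and the "hf-causal" backend name reflect harness versions of that era and vary by release; the model path is a placeholder and the task list is illustrative, not the set the OpenLLaMA team actually used.

```python
# Hedged sketch of running lm-evaluation-harness on a local checkpoint.
# Identifiers ("hf-causal", task names) depend on the harness version;
# the pretrained path below is a placeholder.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args="pretrained=/path/to/open_llama_7b",  # placeholder path
    tasks=["hellaswag", "arc_easy", "piqa"],         # illustrative tasks
    num_fewshot=0,
)
print(results["results"])
```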

To encourage feedback and collaboration from the community, the team behind OpenLLaMA has released a preview checkpoint of its weights. The weights are available in two formats: an EasyLM format for use with the EasyLM framework and a PyTorch format for use with the Hugging Face transformers library. Unlike the original LLaMA release, OpenLLaMA's tokenizer and weights are trained entirely from scratch, so obtaining the original LLaMA tokenizer and weights is no longer necessary. Note, however, that OpenLLaMA uses the BOS (beginning-of-sentence) token (id=1) during training, so this token should be prepended for best performance in few-shot evaluation. The preview checkpoint weights and the EasyLM framework are released under the permissive Apache 2.0 license. The team is currently focused on completing training on the entire RedPajama dataset to allow an apples-to-apples comparison between the original LLaMA and OpenLLaMA, and it is also training a smaller 3B model for low-resource use cases. More updates are planned soon.
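The PyTorch-format weights can be loaded in the usual Hugging Face transformers way; the sketch below also prepends the BOS token explicitly, since the OpenLLaMA notes stress its importance. The model path is a placeholder assumption, not a confirmed repository id.

```python
# Sketch of loading the PyTorch-format OpenLLaMA weights with Hugging
# Face transformers and making sure BOS (id=1) is prepended, as the
# OpenLLaMA release notes recommend. The model path is a placeholder.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "path/or/repo-id/of/open-llama-7b-preview"  # placeholder
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path,
                                         torch_dtype=torch.float16)

prompt = "Q: What is the capital of France?\nA:"
# add_special_tokens=True makes the LLaMA tokenizer prepend BOS;
# the assert makes the requirement explicit.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)
assert inputs.input_ids[0, 0].item() == tokenizer.bos_token_id

output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```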


Check out the GitHub link for the models and code.



Niharika is a technical consulting intern at Marktechpost. She is a third-year undergraduate pursuing her B.Tech at the Indian Institute of Technology (IIT) Kharagpur. She is a highly enthusiastic individual with a keen interest in machine learning, data science, and AI, and an avid reader of the latest developments in these fields.


