TradePoint.io

Meet OpenLLaMA: An Open-Source Reproduction of Meta AI’s LLaMA Large Language Model

May 5, 2023
in AI & Technology

A new development in large language models has emerged with the release of OpenLLaMA, an open-source reproduction of Meta AI’s LLaMA model. Its creators have publicly released a permissively licensed 7B model trained on 200 billion tokens. The release includes PyTorch and JAX weights for the pre-trained model, evaluation results, and a comparison against the original LLaMA models. It matters most for researchers who need large language models but have difficulty accessing proprietary ones.

The creators of OpenLLaMA have shared details on how they trained their models on the RedPajama dataset, which is a reproduction of the LLaMA training dataset containing over 1.2 trillion tokens. They followed the same preprocessing and training hyperparameters as the original LLaMA paper, including model architecture, context length, training steps, learning rate schedule, and optimizer. The only difference between their approach and the original one is the dataset used: OpenLLaMA employs the RedPajama dataset rather than the one utilized by the original LLaMA.

The models were trained on cloud TPU-v4s using EasyLM, a JAX-based pipeline developed for training and fine-tuning language models. The team combined normal data parallelism with fully sharded data parallelism (also known as ZeRO stage 3) to balance training throughput against memory usage. Overall, the training run achieved a throughput of over 1,900 tokens per second per TPU-v4 chip.
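To put the throughput figure in perspective, the sketch below estimates how long a token budget would take at a given per-chip throughput. The article does not say how many TPU-v4 chips were used, so the chip count here is a purely hypothetical value chosen for illustration, and the function name is an assumption, not part of EasyLM.

```python
def training_days(total_tokens: float,
                  tokens_per_sec_per_chip: float,
                  num_chips: int) -> float:
    """Estimate wall-clock days needed to process `total_tokens`.

    Assumes throughput scales linearly with the number of chips,
    which is an idealization.
    """
    if tokens_per_sec_per_chip <= 0 or num_chips <= 0:
        raise ValueError("throughput and chip count must be positive")
    aggregate_tokens_per_sec = tokens_per_sec_per_chip * num_chips
    seconds = total_tokens / aggregate_tokens_per_sec
    return seconds / 86_400  # seconds in one day


if __name__ == "__main__":
    HYPOTHETICAL_CHIPS = 256  # assumption: not stated in the article
    for budget in (200e9, 1e12):  # preview checkpoint vs. full run
        days = training_days(budget, 1900, HYPOTHETICAL_CHIPS)
        print(f"{budget:.0e} tokens: about {days:.1f} days "
              f"on {HYPOTHETICAL_CHIPS} chips")
```

Because the estimate is linear in the token budget, the full 1 trillion token run takes five times as long as the 200 billion token preview under the same assumptions.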


The performance of OpenLLaMA was evaluated on several tasks using the lm-evaluation-harness. The results were compared against the original LLaMA model and GPT-J, a 6B parameter model trained on the Pile dataset by EleutherAI. The evaluation metrics for the original LLaMA model were generated by running it on the same tasks. These LLaMA results differed slightly from those reported in the original LLaMA paper, which may be due to differences in evaluation protocols. According to the presented results, OpenLLaMA exhibited performance comparable to or better than that of the original LLaMA and GPT-J across most tasks. This is despite OpenLLaMA having been trained on only 200 billion tokens, compared with 1 trillion tokens for the original LLaMA and 500 billion tokens for GPT-J, and its performance is expected to improve further once training on 1 trillion tokens is complete.

To encourage feedback and collaboration from the community, the team behind OpenLLaMA has released a preview checkpoint of their weights. These weights are available in two formats: an EasyLM format for use with their EasyLM framework and a PyTorch format for use with the Hugging Face transformers library. Unlike the original LLaMA model, OpenLLaMA’s tokenizer and weights are trained entirely from scratch, so obtaining the original LLaMA tokenizer and weights is no longer necessary. Note, however, that OpenLLaMA uses the BOS (beginning of sentence) token (id=1) during training, so this token should be prepended for optimal performance during few-shot evaluation. The preview checkpoint weights and the EasyLM framework are permissively licensed under the Apache 2.0 license. The team is currently focused on completing training on the entire RedPajama dataset to allow for an apples-to-apples comparison between the original LLaMA and OpenLLaMA. They are also training a smaller 3B model for low-resource use cases, and they plan to release more updates soon.
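As a minimal sketch of the BOS requirement described above, the helper below prepends token id 1 to a sequence of token ids before it is passed to the model. The function name and the example token ids are hypothetical, and whether a particular tokenizer already adds the BOS token on its own is something to check in that tokenizer's configuration; the helper guards against adding it twice.

```python
from typing import List, Sequence

BOS_TOKEN_ID = 1  # id stated in the article for OpenLLaMA's BOS token


def prepend_bos(token_ids: Sequence[int],
                bos_id: int = BOS_TOKEN_ID) -> List[int]:
    """Return a copy of `token_ids` that starts with the BOS token.

    If the sequence already begins with `bos_id`, it is returned
    unchanged (as a new list) so the token is never duplicated.
    """
    ids = list(token_ids)
    if ids and ids[0] == bos_id:
        return ids
    return [bos_id] + ids


if __name__ == "__main__":
    # Hypothetical token ids standing in for a tokenized few-shot prompt.
    prompt_ids = [523, 88, 1045]
    print(prepend_bos(prompt_ids))
```

Returning a new list rather than mutating the input keeps the helper safe to call on cached or shared prompt encodings.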



Niharika is a technical consulting intern at Marktechpost. She is a third-year undergraduate, currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in machine learning, data science, and AI, and an avid reader of the latest developments in these fields.


