TradePoint.io

Octo: An Open-Sourced Large Transformer-based Generalist Robot Policy Trained on 800k Trajectories from the Open X-Embodiment Dataset

May 24, 2024
in AI & Technology

In robot learning, the standard practice is to train policies on datasets collected for the specific robot and task at hand. Starting from scratch in this way requires substantial data collection for every task, and the resulting policies typically generalize poorly. In principle, data gathered from other robots and tasks could help: training models on a variety of control problems could improve their ability to generalize and to perform well on downstream tasks. Yet while general-purpose models are now pervasive in computer vision and natural language processing, building a “general-purpose robot model” capable of controlling many different robots has proven to be a formidable challenge. Training a unified control policy in robotics raises its own difficulties, including differing robot embodiments, sensor configurations, action spaces, task specifications, environments, and compute budgets.

Several recent works have proposed robotic foundation models that do just that: they map robot observations directly to actions and generalize to new domains and robots in zero-shot or few-shot settings. Because they provide low-level visuomotor control across tasks, environments, and robotic systems, these models are generally called “generalist robot policies” (GRPs). Despite this progress toward a “general-purpose robot model,” these models still have a long way to go. For example, they do not support effective fine-tuning to new domains, and the largest ones are not publicly available. They also restrict downstream users to a predefined and often limiting set of input observations, such as a single camera stream.

To better accommodate the variety of interfaces found in downstream robotic applications, researchers from UC Berkeley, Stanford, Carnegie Mellon University, and Google DeepMind propose a method for pretraining generalist robot policies.

Octo is a transformer-based policy pretrained on 800k robot demonstrations from the Open X-Embodiment dataset, the largest robot manipulation dataset. The researchers describe Octo as the first generalist robot manipulation policy to be completely open-source, including the data, model checkpoints, and training pipeline. They also describe it as the first GRP that can be effectively fine-tuned to new observation and action spaces.

At its core, the model is a transformer that maps an arbitrary number of input tokens, generated from observations and tasks, to actions, and it is trained on a diverse dataset of robots and tasks. The policy can be trained once and then used with different robots, different camera setups (e.g., wrist or workspace cameras), and different task inputs (e.g., language commands or goal images) simply by changing which tokens are fed into the model. It can also be adapted to other robot configurations, sensory inputs, action spaces, or morphologies by adding the necessary adapters and fine-tuning on a small dataset from the target domain with a modest compute budget.
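
To make the idea of "switching the tokens" concrete, here is a minimal sketch in plain Python. It is not Octo's actual tokenizer code. The input names, the token counts in `TOKENS_PER_INPUT`, and the single trailing `readout` slot are illustrative assumptions. The sketch only shows that changing the robot setup changes the token sequence, not the model that consumes it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    source: str  # which input produced this token
    index: int   # position of the token within that input


# Hypothetical token counts per input (assumed for illustration only).
# Real counts would depend on image resolution, patch size, and text length.
TOKENS_PER_INPUT = {
    "language": 16,
    "goal_image": 64,
    "workspace_camera": 64,
    "wrist_camera": 64,
}


def build_token_sequence(task_inputs: List[str],
                         observation_inputs: List[str]) -> List[Token]:
    """Concatenate task tokens, then observation tokens, then one readout slot.

    The readout slot stands in for the position whose output embedding
    an action head would consume (an assumption of this sketch).
    """
    sequence: List[Token] = []
    for name in list(task_inputs) + list(observation_inputs):
        if name not in TOKENS_PER_INPUT:
            raise ValueError(f"unknown input: {name}")
        sequence.extend(Token(name, i) for i in range(TOKENS_PER_INPUT[name]))
    sequence.append(Token("readout", 0))
    return sequence


# Two different setups feed the same (hypothetical) transformer.
language_setup = build_token_sequence(["language"], ["workspace_camera"])
goal_setup = build_token_sequence(["goal_image"],
                                  ["workspace_camera", "wrist_camera"])
print(len(language_setup), len(goal_setup))  # prints: 81 193
```

Because a transformer accepts sequences of varying length, both setups can share the same weights. Only the code that produces the tokens differs from one setup to the next.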

Previous research has explored the individual components of Octo, such as a transformer backbone, support for goal image specification, and a diffusion head for modeling expressive action distributions. What is new is the combination of these components into a single generalist robot policy. The researchers conducted extensive experiments on nine robots at four institutions, demonstrating that their integrated system achieves state-of-the-art results in out-of-the-box multi-robot control for single-arm and dual-arm manipulation tasks. They also showed that Octo serves as an effective initialization for fine-tuning to new observation and action spaces in unseen setups. Throughout these experiments, they analyzed how several design choices affect the quality of the pretrained GRP, including data distribution, model architecture, and policy formulation. The evaluation underscored the importance of scale and flexibility for good performance.
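
For readers unfamiliar with diffusion heads, the sketch below shows the general technique: a standard DDPM-style reverse (denoising) loop that turns Gaussian noise into an action vector. It is not Octo's implementation. The linear noise schedule, the number of steps, and the `oracle_noise_predictor` are assumptions made for illustration. In a real policy the noise predictor would be a learned network conditioned on the transformer's output. Here it is replaced by an oracle that knows a fixed target action, which keeps the example self-contained.

```python
import math
import random
from typing import Callable, List, Tuple

NoisePredictor = Callable[[List[float], int], List[float]]


def make_schedule(num_steps: int, beta_start: float = 1e-4,
                  beta_end: float = 0.02) -> Tuple[List[float], List[float], List[float]]:
    """Linear beta schedule (an assumed choice); needs num_steps >= 2."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)  # cumulative product of alphas
    return betas, alphas, alpha_bars


def sample_action(predict_noise: NoisePredictor, action_dim: int,
                  num_steps: int = 20, seed: int = 0) -> List[float]:
    """DDPM ancestral sampling: start from noise, denoise step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # add fresh noise except at the last step
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


def oracle_noise_predictor(target: List[float],
                           num_steps: int = 20) -> NoisePredictor:
    """Stand-in for a learned network: the exact noise when all data equals target."""
    _, _, alpha_bars = make_schedule(num_steps)

    def predict(x: List[float], t: int) -> List[float]:
        scale = math.sqrt(alpha_bars[t])
        return [(xi - scale * ai) / math.sqrt(1.0 - alpha_bars[t])
                for xi, ai in zip(x, target)]

    return predict


target = [0.1, -0.3, 0.5]  # hypothetical 3-D action
action = sample_action(oracle_noise_predictor(target), action_dim=3)
print([round(a, 6) for a in action])
```

With the oracle, the loop recovers the target action up to floating-point error. A learned predictor would instead produce samples from a distribution over actions, which is what lets a diffusion head represent multimodal behavior rather than a single averaged action.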

Alongside the paper, the team is releasing all the resources needed to train, use, reproduce, and fine-tune an Octo model. They provide two pretrained Octo checkpoints, with 27M and 93M parameters respectively, that support language and goal image task specification as well as multiple RGB camera inputs out of the box. They also release their full pretraining pipeline, which includes optimized data loaders, transformer implementations for multimodal inputs, and tools for monitoring training progress, together with scripts for fine-tuning these models on new domains.

While the team acknowledges that the model still has room for improvement, for example in language conditioning, support for wrist cameras, and the use of data beyond optimal demonstrations, Octo represents a significant step toward generalist robot policies that are compatible with a variety of robot setups. Octo aims to provide a practical platform through which researchers and practitioners can make use of larger robotics datasets. The team envisions that their work will enable the use of pretrained models for rapid task learning and generalization, thereby advancing the field of robotics and machine learning.


Check out the Paper and Project. All credit for this research goes to the researchers of this project.



Dhanshree Shenwai is a computer science engineer with experience at FinTech companies covering the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone’s life easier in today’s evolving world.

