TradePoint.io

Meta AI Introduces IMAGEBIND: The First Open-Sourced AI Project Capable of Binding Data from Six Modalities at Once, Without the Need for Explicit Supervision

May 10, 2023
in AI & Technology

Humans can grasp complex ideas after seeing only a few examples. We can often identify an animal from a written description, or guess the sound of an unfamiliar car’s engine from a picture of it. This is partly because a single image can “bind” together an entire sensory experience. In artificial intelligence, however, standard multimodal learning depends on paired data, and that dependence becomes a limitation as the number of modalities grows.

Several recent methods focus on aligning images with text, audio, and other modalities. These methods typically use only a pair of modalities, or at most a few. As a result, the final embeddings can represent only the training modalities and their corresponding pairs, so video-audio embeddings cannot be transferred directly to image-text tasks, or vice versa. A significant barrier to learning a true joint embedding is the lack of large multimodal datasets in which all modalities are present together.

New research from Meta introduces IMAGEBIND, a system that learns a single shared representation space from several kinds of image-paired data. It does not require datasets in which all modalities occur together. Instead, the work exploits the binding property of images and shows that aligning each modality’s embedding to image embeddings produces an emergent alignment across all modalities.
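The idea of aligning one modality's embeddings to image embeddings can be sketched with a contrastive, InfoNCE-style loss. This is a minimal illustration and not Meta's released code: the article does not name the loss, so the symmetric InfoNCE form, the function names, the temperature value, and the toy 2-D embeddings below are all assumptions made for the example.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(queries, keys, temperature=0.07):
    """Mean InfoNCE loss where queries[i] is the positive match of keys[i].

    All other keys in the batch act as negatives for queries[i].
    """
    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over the whole batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(q)


def symmetric_alignment_loss(image_emb, other_emb, temperature=0.07):
    """Image-to-modality loss plus modality-to-image loss (assumed symmetric form)."""
    return (info_nce(image_emb, other_emb, temperature)
            + info_nce(other_emb, image_emb, temperature))


# Hypothetical toy batch: two images and the audio clips recorded with them.
image_emb = [[1.0, 0.0], [0.0, 1.0]]
audio_aligned = [[0.9, 0.1], [0.1, 0.9]]   # each clip points toward its own image
audio_swapped = [[0.1, 0.9], [0.9, 0.1]]   # each clip points toward the wrong image

loss_aligned = symmetric_alignment_loss(image_emb, audio_aligned)
loss_swapped = symmetric_alignment_loss(image_emb, audio_swapped)
print(loss_aligned < loss_swapped)
```

Minimizing a loss of this kind for every (image, other modality) pair pulls each modality toward the same image embedding space, which is the mechanism the paragraph above credits for the emergent alignment between modalities that were never paired directly.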


The abundance of images with accompanying text on the web has driven substantial research into training image-text models. ImageBind builds on the fact that images frequently co-occur with other modalities and can serve as a bridge between them: web data links text to images, for example, and video captured by wearable cameras with IMU sensors links motion to video.

Visual representations learned from massive amounts of web data can serve as targets for feature learning in other modalities. This means ImageBind can also align any other modality that frequently appears alongside images. Alignment is easier for modalities such as thermal and depth, which correlate strongly with images.

ImageBind demonstrates that image-paired data alone is enough to bind all six modalities. By letting the modalities “talk” to one another, the model can discover connections between them without observing them together, which gives it a more holistic interpretation of the information. For instance, ImageBind can link sound and text even though it never sees the two paired during training. This lets other models “understand” new modalities without extensive, resource-intensive training. ImageBind’s strong scaling behavior makes it possible to use the model in place of, or alongside, many AI models that previously could not use additional modalities.

The team combines large-scale image-text paired data with naturally paired self-supervised data across four new modalities: audio, depth, thermal, and Inertial Measurement Unit (IMU) readings. This yields strong emergent zero-shot classification and retrieval performance on tasks for each new modality. The team also shows that strengthening the underlying image representation improves these emergent capabilities.
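Zero-shot classification in a shared embedding space is commonly done by comparing a sample's embedding with the text embeddings of the candidate class names and picking the closest one. The sketch below illustrates that general recipe only; the function names, labels, and 3-D vectors are hypothetical stand-ins for what trained encoders would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(sample_emb, class_text_embs):
    """Return the label whose text embedding is closest to the sample, plus all scores."""
    scores = {label: cosine(sample_emb, emb) for label, emb in class_text_embs.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical text embeddings of class names (assumed outputs of a text encoder).
class_text_embs = {
    "dog barking": [0.9, 0.1, 0.0],
    "rain": [0.0, 0.2, 0.9],
    "car engine": [0.1, 0.9, 0.1],
}

# Hypothetical embedding of an audio clip (assumed output of an audio encoder).
audio_clip_emb = [0.8, 0.3, 0.1]

predicted_label, scores = zero_shot_classify(audio_clip_emb, class_text_embs)
print(predicted_label)
```

No audio-text pairs are needed at classification time: the comparison works only because both encoders map into the same space, which is what the image-based alignment is meant to provide.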

The findings suggest that ImageBind’s emergent zero-shot performance on audio classification and retrieval benchmarks such as ESC, Clotho, and AudioCaps matches or exceeds that of specialist models trained with direct audio-text supervision. On few-shot evaluation benchmarks, ImageBind representations also outperform specialist supervised models. Finally, the team demonstrates the versatility of ImageBind’s joint embeddings across several compositional tasks, including cross-modal retrieval, arithmetic combination of embeddings, audio source detection in images, and image generation from audio input.
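Two of the compositional tasks listed above, cross-modal retrieval and arithmetic combination of embeddings, can be illustrated with plain vector operations. This is a toy sketch under stated assumptions: the helper names, the gallery keys, and every vector are invented for the example, and adding unit-normalized embeddings is just one simple way to combine them.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def combine(*embeddings):
    """Add unit-normalized embeddings and renormalize the sum."""
    unit = [normalize(e) for e in embeddings]
    summed = [sum(components) for components in zip(*unit)]
    return normalize(summed)


def retrieve(query, gallery, top_k=1):
    """Return the top_k gallery keys ranked by cosine similarity to the query."""
    q = normalize(query)

    def score(key):
        return sum(a * b for a, b in zip(q, normalize(gallery[key])))

    return sorted(gallery, key=score, reverse=True)[:top_k]


# Hypothetical image gallery embeddings in the shared space.
gallery = {
    "bird_on_beach": [0.7, 0.7, 0.0],
    "bird_in_forest": [0.7, 0.0, 0.7],
    "empty_beach": [0.0, 1.0, 0.0],
}

image_of_bird = [1.0, 0.0, 0.0]    # assumed image embedding
sound_of_waves = [0.0, 1.0, 0.0]   # assumed audio embedding

# Image + audio query: look for gallery images that match both inputs.
query = combine(image_of_bird, sound_of_waves)
top = retrieve(query, gallery, top_k=2)
print(top)
```

In this toy setup the combined query ranks the image containing both concepts first, ahead of images that match only one of the two inputs.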

Because these embeddings are not trained for a particular application, they fall short of the performance of domain-specific models. The team believes it would be helpful to learn more about how to tailor general-purpose embeddings to specific objectives, such as structured prediction tasks like detection.



Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advancements in technology and their real-life applications.

