
Learning Intuitive Physics: Advancing AI Through Predictive Representation Models

February 19, 2025
in AI & Technology

Humans possess an innate understanding of physics, expecting objects to behave predictably without abrupt changes in position, shape, or color. This fundamental cognition is observed in infants, primates, birds, and marine mammals, supporting the core knowledge hypothesis, which suggests humans have evolutionarily developed systems for reasoning about objects, space, and agents. While AI can match or surpass humans in complex tasks like coding and mathematics, it struggles with intuitive physics, an instance of Moravec’s paradox: tasks that are trivial for humans can be hard for machines. AI approaches to physical reasoning fall into two categories: structured models, which simulate object interactions using predefined rules, and pixel-based generative models, which predict future sensory inputs without explicit abstractions.

Researchers from FAIR at Meta, Université Gustave Eiffel, and EHESS explore how general-purpose deep neural networks develop an understanding of intuitive physics by predicting masked regions in natural videos. Using the violation-of-expectation framework, they demonstrate that models trained to predict outcomes in an abstract representation space, such as Joint Embedding Predictive Architectures (JEPAs), can accurately recognize physical properties like object permanence and shape consistency. In contrast, video prediction models operating in pixel space and multimodal large language models perform close to chance. This suggests that learning in an abstract space, rather than relying on predefined rules, is sufficient to acquire an intuitive understanding of physics.
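To make the idea of measuring prediction error in a representation space concrete, here is a minimal toy sketch. It is not the V-JEPA implementation: `toy_encoder`, `toy_predictor`, and the one-dimensional "frames" are hypothetical stand-ins chosen only so the example runs with the standard library. The point it illustrates is that surprise is computed between a predicted representation and the actual one, not between pixels.

```python
from typing import Callable, List, Sequence

Frame = List[float]   # toy "frame": a 1-D strip of pixel intensities
Vector = List[float]  # representation vector


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def toy_encoder(frame: Frame) -> Vector:
    """Hypothetical encoder: (mean intensity, normalized center of mass)."""
    total = sum(frame)
    mean = total / len(frame)
    center = sum(i * v for i, v in enumerate(frame)) / total if total else 0.0
    return [mean, center / len(frame)]


def toy_predictor(context: List[Vector]) -> Vector:
    """Hypothetical predictor: linear extrapolation from the last two representations."""
    prev, last = context[-2], context[-1]
    return [2 * l - p for p, l in zip(prev, last)]


def latent_surprise(
    frames: List[Frame],
    encoder: Callable[[Frame], Vector],
    predictor: Callable[[List[Vector]], Vector],
) -> float:
    """Prediction error for the final frame, measured in representation space."""
    if len(frames) < 3:
        raise ValueError("need at least two context frames and one target frame")
    reps = [encoder(f) for f in frames]
    predicted = predictor(reps[:-1])
    return l1_distance(predicted, reps[-1])


# An object moving one pixel to the right per frame on an 8-pixel strip.
context = [[1, 0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0]]
possible = context + [[0, 0, 1, 0, 0, 0, 0, 0]]    # continues smoothly
impossible = context + [[0, 0, 0, 0, 0, 0, 0, 1]]  # teleports across the strip

surprise_possible = latent_surprise(possible, toy_encoder, toy_predictor)
surprise_impossible = latent_surprise(impossible, toy_encoder, toy_predictor)
print(surprise_possible, surprise_impossible)  # 0.0 0.3125
```

In this toy setup the teleporting object yields the larger error, which is the kind of signal a violation-of-expectation test looks for.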

The study focuses on a video-based JEPA model, V-JEPA, which makes its predictions about video content in a learned representation space, aligning with the predictive coding theory in neuroscience. V-JEPA achieved 98% zero-shot accuracy on the IntPhys benchmark and 62% on the InfLevel benchmark, outperforming other models. Ablation experiments revealed that intuitive physics understanding emerges robustly across different model sizes and training durations. Even a small 115-million-parameter V-JEPA model, or one trained on just one week of video, showed above-chance performance. These findings challenge the notion that intuitive physics requires innate core knowledge and highlight the potential of abstract prediction models in developing physical reasoning.
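One generic way to check whether a score is "above chance" on a pairwise benchmark is an exact one-sided binomial test against a 50% guessing rate. The sketch below is an illustration only: the article does not describe the statistical procedure the authors used, and the function name and the counts in the usage lines are hypothetical.

```python
from math import comb


def binomial_p_value_above_chance(correct: int, total: int, chance: float = 0.5) -> float:
    """One-sided exact p-value: P(X >= correct) if each answer were a guess at `chance`."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("require total > 0 and 0 <= correct <= total")
    if not 0.0 < chance < 1.0:
        raise ValueError("chance must be strictly between 0 and 1")
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical counts: 10 of 10 pairs correct versus 5 of 10 pairs correct.
print(binomial_p_value_above_chance(10, 10))  # 0.0009765625
print(binomial_p_value_above_chance(5, 10))   # 0.623046875
```

A small p-value means the observed number of correct pairs would be unlikely under pure guessing; a value near 0.5 or higher gives no evidence of above-chance behavior.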

The violation-of-expectation paradigm in developmental psychology assesses intuitive physics understanding by observing reactions to physically impossible scenarios. Traditionally applied to infants, this method measures surprise responses through physiological indicators like gaze time. More recently, it has been extended to AI systems by presenting them with paired visual scenes, where one includes a physical impossibility, such as a ball disappearing behind an occluder. The V-JEPA architecture, designed for video prediction tasks, learns high-level representations by predicting masked portions of videos. This approach enables the model to develop an implicit understanding of object dynamics without relying on predefined abstractions, as shown through its ability to anticipate and react to unexpected physical events in video sequences.
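The paired-scene protocol described above can be scored with a simple rule: a pair counts as correct when the model is more surprised by the impossible scene than by the possible one. The following is a minimal sketch under that assumption; the function name, the half-credit handling of ties, and the example scores are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def pairwise_voe_accuracy(pairs: List[Tuple[float, float]]) -> float:
    """Fraction of (possible, impossible) surprise pairs ranked correctly.

    A pair is correct when the impossible scene has strictly higher surprise.
    Ties earn half credit, so a model with constant output scores 0.5 (chance).
    """
    if not pairs:
        raise ValueError("need at least one pair")
    score = 0.0
    for surprise_possible, surprise_impossible in pairs:
        if surprise_impossible > surprise_possible:
            score += 1.0
        elif surprise_impossible == surprise_possible:
            score += 0.5
    return score / len(pairs)


# Hypothetical surprise scores for four (possible, impossible) video pairs.
example_pairs = [(0.1, 0.9), (0.2, 0.4), (0.5, 0.3), (0.3, 0.3)]
print(pairwise_voe_accuracy(example_pairs))  # 0.625
```

Because each pair has exactly two outcomes, 50% is the chance level against which accuracies on such benchmarks are read.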

V-JEPA was tested on datasets such as IntPhys, GRASP, and InfLevel-lab to benchmark intuitive physics comprehension, assessing properties like object permanence, continuity, and gravity. Compared to other models, including VideoMAEv2 and multimodal language models like Qwen2-VL-7B and Gemini 1.5 Pro, V-JEPA achieved significantly higher accuracy, demonstrating that learning in a learned representation space enhances physical reasoning. Statistical analyses confirmed its superiority over untrained networks across multiple properties, reinforcing that self-supervised video prediction fosters a deeper understanding of real-world physics. These findings highlight the challenge of intuitive physics for existing AI models and suggest that predictive learning in a learned representation space is key to improving AI’s physical reasoning abilities.

In conclusion, the study explores how state-of-the-art deep learning models develop an understanding of intuitive physics. When V-JEPA is pretrained on natural videos using a prediction task in a learned representation space, it demonstrates intuitive physics comprehension without task-specific adaptation. Results suggest this ability arises from general learning principles rather than hardwired knowledge. However, V-JEPA struggles with object interactions, likely due to training limitations and the short video clips it can process. Enhancing model memory and incorporating action-based learning could improve performance. Future research may examine models trained on infant-like visual data, reinforcing the potential of predictive learning for physical reasoning in AI.


Check out the Paper. All credit for this research goes to the researchers of this project.


Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.
