TradePoint.io

Google DeepMind Releases Gemini Robotics-ER 1.6: Bringing Enhanced Embodied Reasoning and Instrument Reading to Physical AI

April 15, 2026
in AI & Technology

The Google DeepMind research team has introduced Gemini Robotics-ER 1.6, a significant upgrade to its embodied reasoning model, which is designed to serve as the ‘cognitive brain’ of robots operating in real-world environments. The model specializes in reasoning capabilities critical for robotics, including visual and spatial understanding, task planning, and success detection. It acts as the high-level reasoning model for a robot and can execute tasks by natively calling tools such as Google Search, vision-language-action models (VLAs), or other third-party user-defined functions.

Here is the key architectural idea to understand: Google DeepMind takes a dual-model approach to robotics AI. Gemini Robotics 1.5 is the vision-language-action (VLA) model — it processes visual inputs and user prompts and directly translates them into physical motor commands. Gemini Robotics-ER, on the other hand, is the embodied reasoning model: it specializes in understanding physical spaces, planning, and making logical decisions, but does not directly control robotic limbs. Instead, it provides high-level insights to help the VLA model decide what to do next. Think of it as the difference between a strategist and an executor — Gemini Robotics-ER 1.6 is the strategist.
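
The division of labor can be sketched in a few lines of Python. This is an illustration only, not DeepMind's API: `Reasoner`, `Executor`, `plan`, `act`, and both stub classes are hypothetical names, and the two models are replaced by stand-ins so the control flow can run offline.

```python
from typing import Protocol


class Reasoner(Protocol):
    """Hypothetical interface for the embodied reasoning model (the strategist)."""

    def plan(self, goal: str) -> list[str]: ...


class Executor(Protocol):
    """Hypothetical interface for the VLA model (the executor)."""

    def act(self, instruction: str) -> str: ...


class StubReasoner:
    def plan(self, goal: str) -> list[str]:
        # A real reasoning model would inspect camera images; this stub
        # returns a fixed two-step plan.
        return [f"locate objects for: {goal}", f"carry out: {goal}"]


class StubExecutor:
    def act(self, instruction: str) -> str:
        # A real VLA would emit motor commands; the stub only reports.
        return f"executed '{instruction}'"


def run(goal: str, reasoner: Reasoner, executor: Executor) -> list[str]:
    """The reasoner decides what to do; only the executor acts."""
    return [executor.act(step) for step in reasoner.plan(goal)]


log = run("put the red block in the bin", StubReasoner(), StubExecutor())
```

Keeping the two roles behind separate interfaces mirrors the article's point: the reasoning model never issues motor commands itself.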

Source: https://deepmind.google/blog/gemini-robotics-er-1-6/

What’s New in Gemini Robotics-ER 1.6

Gemini Robotics-ER 1.6 shows significant improvement over both Gemini Robotics-ER 1.5 and Gemini 3.0 Flash, specifically enhancing spatial and physical reasoning capabilities such as pointing, counting, and success detection. But the key addition is a capability that did not exist in prior versions at all: instrument reading.

Pointing as a Foundation for Spatial Reasoning

Pointing — the model’s ability to identify precise pixel-level locations in an image — is far more powerful than it sounds. Points can be used to express spatial reasoning (precision object detection and counting), relational logic (making comparisons such as identifying the smallest item in a set, or defining from-to relationships like ‘move X to location Y’), motion reasoning (mapping trajectories and identifying optimal grasp points), and constraint compliance (reasoning through complex prompts like “point to every object small enough to fit inside the blue cup”).
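
To make pixel-level pointing concrete, here is a minimal sketch of turning a pointing response into pixel coordinates. It assumes the model returns a JSON list of objects with a `label` and a `point` given as `[y, x]` normalized to a 0-1000 range; the article does not specify this format, so treat it as an assumption. The sample response and the function name `to_pixels` are made up for illustration.

```python
import json


def to_pixels(response_text, width, height, scale=1000):
    """Convert assumed [y, x] points normalized to 0..scale into pixel (x, y)."""
    results = []
    for item in json.loads(response_text):
        y_norm, x_norm = item["point"]
        # Clamp so a coordinate equal to `scale` still lands inside the image.
        x_px = min(width - 1, int(x_norm * width / scale))
        y_px = min(height - 1, int(y_norm * height / scale))
        results.append((item["label"], x_px, y_px))
    return results


# Hypothetical response for a 640x480 image.
sample = (
    '[{"point": [500, 250], "label": "hammer"},'
    ' {"point": [100, 900], "label": "pliers"}]'
)
pixels = to_pixels(sample, width=640, height=480)
# pixels == [("hammer", 160, 240), ("pliers", 576, 48)]
```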

In internal benchmarks, Gemini Robotics-ER 1.6 demonstrates a clear advantage over its predecessor. In one example scene, it correctly identifies the number of hammers, scissors, paintbrushes, pliers, and garden tools, and it does not point to requested items that are absent from the image, such as a wheelbarrow and a Ryobi drill. In comparison, Gemini Robotics-ER 1.5 fails to identify the correct number of hammers or paintbrushes, misses the scissors altogether, and hallucinates a wheelbarrow. For robotics practitioners, this matters because hallucinated object detections can cause cascading downstream failures: a robot that ‘sees’ an object that isn’t there will attempt to interact with empty space.
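
One way a downstream pipeline can take advantage of this behavior is to count returned points per requested label and refuse to act on labels with no points. The sketch below assumes each point is a dictionary with a `label` key; the function names and the sample data are hypothetical.

```python
from collections import Counter


def count_requested(points, requested):
    """Count returned points per requested label; absent labels get zero."""
    counts = Counter(p["label"] for p in points)
    return {label: counts.get(label, 0) for label in requested}


def split_actionable(points, requested):
    """Separate labels the robot may act on from labels it should skip."""
    counts = count_requested(points, requested)
    actionable = [label for label in requested if counts[label] > 0]
    skipped = [label for label in requested if counts[label] == 0]
    return actionable, skipped


# Hypothetical model output: two hammers, one pair of scissors, no wheelbarrow.
points = [{"label": "hammer"}, {"label": "hammer"}, {"label": "scissors"}]
requested = ["hammer", "scissors", "wheelbarrow"]

counts = count_requested(points, requested)
actionable, skipped = split_actionable(points, requested)
# counts == {"hammer": 2, "scissors": 1, "wheelbarrow": 0}
# skipped == ["wheelbarrow"], so no grasp is attempted on empty space
```

This guard only helps when the model declines to point at missing objects; a hallucinated point would pass straight through it.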

Success Detection and Multi-View Reasoning

In robotics, knowing when a task is finished is just as important as knowing how to start it. Success detection serves as a critical decision-making engine that allows an agent to intelligently choose between retrying a failed attempt or progressing to the next stage of a plan.

This is a harder problem than it looks. Most modern robotics setups include multiple camera views such as an overhead and wrist-mounted feed. This means a system needs to understand how different viewpoints combine to form a coherent picture at each moment and across time. Gemini Robotics-ER 1.6 advances multi-view reasoning, enabling it to better fuse information from multiple camera streams, even in occluded or dynamically changing environments.
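
The retry-or-advance decision can be sketched as a small control loop. Everything here is hypothetical scaffolding: `attempt` stands in for the executor, `succeeded` stands in for a success-detection query (which in practice might pass several camera views to the reasoning model), and the retry budget is an arbitrary choice.

```python
def run_plan(steps, attempt, succeeded, max_retries=2):
    """Run steps in order; retry a failed step, and stop if retries run out.

    Returns a list of (step, tries) pairs, with tries set to None for a
    step that never succeeded.
    """
    log = []
    for step in steps:
        for tries in range(1, max_retries + 2):
            attempt(step)
            if succeeded(step):
                log.append((step, tries))
                break
        else:
            # Retry budget exhausted: do not move on to later steps.
            log.append((step, None))
            return log
    return log


# Toy environment in which grasping fails on the first try.
attempts = {}


def attempt(step):
    attempts[step] = attempts.get(step, 0) + 1


def succeeded(step):
    return step != "grasp cup" or attempts[step] >= 2


result = run_plan(["reach cup", "grasp cup", "lift cup"], attempt, succeeded)
# result == [("reach cup", 1), ("grasp cup", 2), ("lift cup", 1)]
```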

Instrument Reading: A Real-World Breakthrough

The genuinely new capability in Gemini Robotics-ER 1.6 is instrument reading — the ability to interpret analog gauges, pressure meters, sight glasses, and digital readouts in industrial settings. This task stems from facility inspection needs, a critical focus area for Boston Dynamics. Spot, a Boston Dynamics robot, is able to visit instruments throughout a facility and capture images of them for Gemini Robotics-ER 1.6 to interpret.

Instrument reading requires complex visual reasoning: the model must precisely perceive a variety of inputs, including needles, liquid levels, container boundaries, and tick marks, and understand how they relate to each other. In the case of sight glasses, this involves estimating how much liquid fills the glass while accounting for distortion from the camera perspective. Gauges typically carry text describing the unit, which must be read and interpreted, and some have multiple needles referring to different decimal places that need to be combined.
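
Two of these sub-problems reduce to simple arithmetic once the relevant features have been located. The sketch below is illustrative, not the model's method: it assumes each needle has already been resolved to a single digit, and it ignores the perspective distortion the article mentions. The function names and sample values are made up.

```python
def combine_dials(digits, lowest_place=1.0):
    """Combine one digit per needle (most significant first) into a reading.

    `lowest_place` is the unit of the last dial, e.g. 0.1 for tenths.
    """
    total = 0
    for digit in digits:
        if not 0 <= digit <= 9:
            raise ValueError("each dial contributes a single digit 0-9")
        total = total * 10 + digit
    return total * lowest_place


def sight_glass_fill(top_px, bottom_px, level_px):
    """Fraction of the glass that is filled, from pixel rows (y grows downward)."""
    if bottom_px <= top_px:
        raise ValueError("bottom must lie below top in image coordinates")
    fraction = (bottom_px - level_px) / (bottom_px - top_px)
    return max(0.0, min(1.0, fraction))


meter = combine_dials([4, 7, 2], lowest_place=0.1)  # roughly 47.2
fill = sight_glass_fill(top_px=100, bottom_px=500, level_px=200)  # 0.75
```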

Gemini Robotics-ER 1.6 achieves its instrument readings by using agentic vision (a capability that combines visual reasoning with code execution, introduced with Gemini 3.0 Flash and extended in Gemini Robotics-ER 1.6). The model takes intermediate steps: first zooming into an image to get a better read of small details in a gauge, then using pointing and code execution to estimate proportions and intervals, and ultimately applying world knowledge to interpret meaning.
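
The "estimate proportions and intervals" step can be illustrated with a single-needle dial. This is a sketch under stated assumptions, not the code the model generates: it assumes a linear scale that increases clockwise, takes the gauge center, needle tip, and minimum and maximum tick positions as already-located pixel points, and applies no perspective correction. `read_gauge` and the sample coordinates are hypothetical.

```python
import math


def angle(center, point):
    """Angle of `point` around `center`; with y pointing down it grows clockwise."""
    return math.atan2(point[1] - center[1], point[0] - center[0])


def read_gauge(center, needle_tip, min_tick, max_tick, min_value, max_value):
    """Linearly interpolate a reading from the needle's clockwise sweep."""
    start = angle(center, min_tick)
    full_sweep = (angle(center, max_tick) - start) % (2 * math.pi)
    needle_sweep = (angle(center, needle_tip) - start) % (2 * math.pi)
    if full_sweep == 0:
        raise ValueError("min and max ticks must be at different angles")
    # Limitation: a needle resting just below the minimum tick wraps around
    # and is clamped to the maximum; a real pipeline would need to handle that.
    fraction = min(1.0, needle_sweep / full_sweep)
    return min_value + fraction * (max_value - min_value)


# Dial with a 270-degree sweep from lower-left to lower-right, needle straight up.
reading = read_gauge(
    center=(100, 100),
    needle_tip=(100, 0),
    min_tick=(0, 200),
    max_tick=(200, 200),
    min_value=0.0,
    max_value=10.0,
)
# reading is approximately 5.0 (the needle sits halfway through the sweep)
```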

On instrument reading, Gemini Robotics-ER 1.5 achieves a 23% success rate, Gemini 3.0 Flash reaches 67%, Gemini Robotics-ER 1.6 reaches 86%, and Gemini Robotics-ER 1.6 with agentic vision hits 93%. One important caveat: Gemini Robotics-ER 1.5 was evaluated without agentic vision because it does not support that capability. The other three models were evaluated with agentic vision enabled for the instrument reading task, so the 23% baseline reflects a missing capability as much as a performance gap. For AI developers comparing model generations, this distinction matters: the benchmark column is not an apples-to-apples comparison.
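
The gaps are easier to compare as percentage-point differences and error rates. The short calculation below uses only the four figures quoted above; the dictionary keys are shorthand labels, not official configuration names.

```python
# Instrument-reading success rates quoted in the article, in percent.
success_rate = {
    "ER 1.5": 23,
    "Gemini 3.0 Flash": 67,
    "ER 1.6": 86,
    "ER 1.6 + agentic vision": 93,
}


def point_gain(newer, older):
    """Difference in success rate, in percentage points."""
    return success_rate[newer] - success_rate[older]


def error_rate(name):
    """Share of readings that were not successful, in percent."""
    return 100 - success_rate[name]


gain_over_flash = point_gain("ER 1.6", "Gemini 3.0 Flash")        # 19
gain_from_agentic = point_gain("ER 1.6 + agentic vision", "ER 1.6")  # 7
errors = (error_rate("ER 1.6"), error_rate("ER 1.6 + agentic vision"))  # (14, 7)
```

Read this way, the last step from 86% to 93% is a 7-point gain that halves the remaining error rate, from 14% to 7%.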

Key Takeaways

  • Gemini Robotics-ER 1.6 is a reasoning model, not an action model: It acts as the high-level ‘brain’ of a robot — handling spatial understanding, task planning, and success detection — while the separate VLA model (Gemini Robotics 1.5) handles the actual physical motor commands.
  • Pointing is more powerful than it looks: Gemini Robotics-ER 1.6’s pointing capability goes far beyond simple object detection — it enables relational logic, motion trajectory mapping, grasp point identification, and constraint-based reasoning, all of which are foundational to reliable robotic manipulation.
  • Instrument reading is the biggest new capability: Built in collaboration with Boston Dynamics for industrial facility inspection using its Spot robot, Gemini Robotics-ER 1.6 can read analog gauges, pressure meters, and sight glasses with a 93% success rate when using agentic vision, up from 23% for Gemini Robotics-ER 1.5, which does not support agentic vision.
  • Success detection is what enables true autonomy: Knowing when a task is actually complete — across multiple camera views, in occluded or dynamic environments — is what allows a robot to decide whether to retry or move to the next step without human intervention.

The post Google DeepMind Releases Gemini Robotics-ER 1.6: Bringing Enhanced Embodied Reasoning and Instrument Reading to Physical AI appeared first on MarkTechPost.
