Ai2 releases MolmoWeb, an open-weight visual web agent with 30K human task trajectories and a full training stack

March 24, 2026

Engineers building browser agents today face a choice between closed APIs they cannot inspect and open-weight frameworks with no trained model underneath them. Ai2 is now offering a third option.

The Seattle-based nonprofit behind the open-source OLMo language models and the Molmo vision-language family is today releasing MolmoWeb, an open-weight visual web agent available in 4 billion and 8 billion parameter sizes.

Until now, no open-weight visual web agent shipped with the training data and pipeline needed to audit or reproduce it. MolmoWeb does.

MolmoWebMix, the accompanying dataset, includes 30,000 human task trajectories across more than 1,100 websites, 590,000 individual subtask demonstrations and 2.2 million screenshot question-answer pairs — which Ai2 describes as the largest publicly released collection of human web-task execution ever assembled.

“Can you go from just passively understanding images, describing them and captioning them, to actually making them take action in some environment?” Tanmay Gupta, senior research scientist at Ai2, told VentureBeat. “That is exactly what MolmoWeb is.”

How it works: It sees what you see

MolmoWeb operates entirely from browser screenshots. It does not parse HTML or rely on accessibility tree representations of a page. At each step it receives a task instruction, the current screenshot, a text log of previous actions and the current URL and page title. It produces a natural-language thought describing its reasoning, then executes the next browser action — clicking at screen coordinates, typing text, scrolling, navigating to a URL or switching tabs.
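
The per-step contract described above can be sketched in Python roughly as follows. This is an illustration only, not Ai2's implementation: the class names `Observation` and `Action`, their field names, and the JSON output format parsed by `parse_step` are all assumptions made for this example. The article does not specify MolmoWeb's actual prompt or output format.

```python
import json
from dataclasses import dataclass, field

# Action types named in the article; the string labels are hypothetical.
ALLOWED_ACTIONS = {"click", "type", "scroll", "goto", "switch_tab"}


@dataclass
class Observation:
    """Inputs the article says the model receives at each step."""
    task: str                 # task instruction
    screenshot_png: bytes     # current browser screenshot
    url: str                  # current URL
    title: str                # current page title
    history: list[str] = field(default_factory=list)  # log of previous actions


@dataclass
class Action:
    """One browser action plus the natural-language thought behind it."""
    thought: str
    kind: str
    args: dict


def render_text_context(obs: Observation) -> str:
    """Build the text that would accompany the screenshot (assumed layout)."""
    log = "\n".join(obs.history) or "(none)"
    return (
        f"Task: {obs.task}\n"
        f"URL: {obs.url}\n"
        f"Title: {obs.title}\n"
        f"Previous actions:\n{log}"
    )


def parse_step(raw: str) -> Action:
    """Parse an assumed JSON reply into a thought and a browser action."""
    data = json.loads(raw)
    kind = data["action"]
    if kind not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {kind}")
    # Everything besides the thought and the action name is an argument,
    # e.g. screen coordinates for a click or the text to type.
    args = {k: v for k, v in data.items() if k not in ("thought", "action")}
    return Action(thought=data["thought"], kind=kind, args=args)


obs = Observation(
    task="Find a laptop under $800",
    screenshot_png=b"",
    url="https://example.com",
    title="Example Shop",
)
step = parse_step(
    '{"thought": "The search box is at the top.", "action": "click", "x": 412, "y": 88}'
)
```

In this sketch the click carries raw screen coordinates rather than a DOM selector, which mirrors the article's point that the agent works from pixels and not from HTML or the accessibility tree.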

The model is browser-agnostic. Because it requires only a screenshot, it can run against local Chrome, Safari or a hosted browser service. The hosted demo uses Browserbase, a cloud browser infrastructure startup.
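
One way to picture that browser independence is a control loop that depends only on a small interface. The sketch below is a hypothetical illustration: the `Browser` protocol, the `FakeBrowser` stand-in, the `scripted_policy` function and the `"done"` stop action are all invented for this example and are not part of MolmoWeb's released code. Any backend that can return a screenshot and carry out an action would fit the same loop.

```python
from typing import Callable, Protocol


class Browser(Protocol):
    """Minimal surface the loop needs from any browser backend (assumed)."""
    def screenshot(self) -> bytes: ...
    def perform(self, action: dict) -> None: ...


class FakeBrowser:
    """In-memory stand-in so the sketch runs without a real browser."""

    def __init__(self) -> None:
        self.performed: list[dict] = []

    def screenshot(self) -> bytes:
        # A real backend would return PNG bytes of the current viewport.
        return b"fake-png-%d" % len(self.performed)

    def perform(self, action: dict) -> None:
        self.performed.append(action)


Policy = Callable[[str, bytes, list[dict]], dict]


def run_agent(browser: Browser, policy: Policy, task: str, max_steps: int = 10) -> list[dict]:
    """Screenshot, ask the policy for an action, execute it, repeat."""
    history: list[dict] = []
    for _ in range(max_steps):
        action = policy(task, browser.screenshot(), history)
        if action["kind"] == "done":  # hypothetical stop signal
            break
        browser.perform(action)
        history.append(action)
    return history


def scripted_policy(task: str, screenshot: bytes, history: list[dict]) -> dict:
    """Stands in for the model: replays a fixed script, then stops."""
    script = [
        {"kind": "click", "x": 100, "y": 200},
        {"kind": "type", "text": "laptops"},
    ]
    return script[len(history)] if len(history) < len(script) else {"kind": "done"}


browser = FakeBrowser()
history = run_agent(browser, scripted_policy, "search for laptops")
```

Swapping `FakeBrowser` for a wrapper around a local or hosted browser would leave `run_agent` unchanged, which is the practical meaning of "browser-agnostic" here.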

The dataset that makes it work

The model weights are only part of what Ai2 is releasing. MolmoWebMix, the accompanying training dataset, is the core differentiator from every other open-weight agent available today.

“The data basically looks like a sequence of screenshots and actions paired with instructions for what the intent behind that sequence of screenshots was,” Gupta said.

MolmoWebMix combines three components.

Human demonstrations. Human annotators completed browsing tasks using a custom Chrome extension that recorded actions and screenshots across more than 1,100 websites. The result is 30,000 task trajectories spanning more than 590,000 individual subtask demonstrations.

Synthetic trajectories. To scale beyond what human annotation alone can provide, Ai2 generated additional trajectories using text-based accessibility-tree agents — single-agent runs filtered for task success, multi-agent pipelines that decompose tasks into subgoals and deterministic navigation paths across hundreds of websites. Critically, no proprietary vision agents were used. The synthetic data came from text-only systems, not from OpenAI Operator or Anthropic’s computer use API.

GUI perception data. A third component trains the model to read and reason about page content directly from images. It includes more than 2.2 million screenshot question-answer pairs drawn from nearly 400 websites, covering element grounding and screenshot-based reasoning tasks.
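
To make the three components concrete, here is a rough Python sketch of how such records might be represented and combined. The schema is hypothetical: the class names, field names and the `build_mix` helper are assumptions for illustration and do not reflect the actual MolmoWebMix file format. The only behavior taken from the article is that synthetic single-agent runs are filtered for task success; keeping every human demonstration is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Step:
    screenshot_path: str   # screenshot seen before acting
    action: str            # action taken, as text


@dataclass
class Trajectory:
    instruction: str       # intent behind the sequence of screenshots
    steps: list[Step]
    source: str            # "human" or "synthetic" (labels are assumptions)
    succeeded: bool = True


@dataclass
class ScreenshotQA:
    screenshot_path: str
    question: str
    answer: str


def build_mix(trajectories: list[Trajectory], qa_pairs: list[ScreenshotQA]) -> dict:
    """Keep all human demos; keep synthetic runs only if the task succeeded."""
    kept = [t for t in trajectories if t.source == "human" or t.succeeded]
    return {"trajectories": kept, "qa": list(qa_pairs)}


demo = [
    Trajectory("Find store hours", [Step("s0.png", "click(120, 340)")], "human"),
    Trajectory("Add item to cart", [Step("s1.png", "scroll(down)")], "synthetic", succeeded=True),
    Trajectory("Book a table", [Step("s2.png", "type('7pm')")], "synthetic", succeeded=False),
]
qa = [ScreenshotQA("s0.png", "Where is the search box?", "Top center of the page")]
mix = build_mix(demo, qa)
```

In this toy run the failed synthetic trajectory is dropped while the human demonstration and the successful synthetic run are kept alongside the question-answer pair.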

“If you are able to perform a task and you’re able to record a trajectory from that, you should be able to train the web agent on that trajectory to do the exact same task,” Gupta said.

How MolmoWeb stacks up against the competition

In Gupta’s view, there are two categories of technologies in the browser agent market.

The first is API-only systems, capable but closed, with no visibility into training or architecture. OpenAI Operator, Anthropic’s computer use API and Google’s Gemini computer use fall into this group.

The second is open-weight models, a significantly smaller category. Browser-use, the most widely adopted open alternative, is a framework rather than a trained model. It requires developers to supply their own LLM and build the agent layer on top.

MolmoWeb sits in the second category as a fully trained open-weight vision model. Ai2 reports it leads that group across four live-website benchmarks: WebVoyager, Online-Mind2Web, DeepShop and WebTailBench. According to Ai2, it also outperforms older API-based agents built on GPT-4o with accessibility tree plus screenshot input.

Ai2 documents several current limitations in the release. The model makes occasional errors reading text from screenshots, drag-and-drop interactions remain unreliable and performance degrades on ambiguous or heavily constrained instructions. The model was also not trained on tasks requiring logins or financial transactions.

Enterprise teams evaluating browser agents are not just choosing a model. They are deciding whether they can audit what they are running, fine-tune it on internal workflows, and avoid a per-call API dependency.
