• bitcoinBitcoin(BTC)$63,500.002.55%
  • ethereumEthereum(ETH)$1,672.182.52%
  • tetherTether(USDT)$1.00-0.03%
  • binancecoinBNB(BNB)$603.392.36%
  • usd-coinUSDC(USDC)$1.000.00%
  • rippleXRP(XRP)$1.143.11%
  • solanaSolana(SOL)$66.973.99%
  • tronTRON(TRX)$0.315335-1.92%
  • Figure HelocFigure Heloc(FIGR_HELOC)$1.030.57%
  • dogecoinDogecoin(DOGE)$0.0862203.08%
  • HyperliquidHyperliquid(HYPE)$58.9810.04%
  • USDSUSDS(USDS)$1.000.00%
  • leo-tokenLEO Token(LEO)$9.500.62%
  • RainRain(RAIN)$0.0132690.35%
  • moneroMonero(XMR)$410.1921.12%
  • zcashZcash(ZEC)$428.674.11%
  • stellarStellar(XLM)$0.1916362.68%
  • CantonCanton(CC)$0.163824-0.21%
  • cardanoCardano(ADA)$0.1706865.05%
  • whitebitWhiteBIT Coin(WBT)$52.012.06%
  • chainlinkChainlink(LINK)$7.913.51%
  • the-open-networkToncoin(TON)$1.736.84%
  • Ethena USDeEthena USDe(USDE)$1.000.02%
  • USD1USD1(USD1)$1.00-0.06%
  • daiDai(DAI)$1.000.01%
  • bitcoin-cashBitcoin Cash(BCH)$204.584.41%
  • MemeCoreMemeCore(M)$2.921.93%
  • hedera-hashgraphHedera(HBAR)$0.0798262.07%
  • litecoinLitecoin(LTC)$42.601.30%
  • suiSui(SUI)$0.762.70%
  • Circle USYCCircle USYC(USYC)$1.130.00%
  • LABLAB(LAB)$9.4821.59%
  • avalanche-2Avalanche(AVAX)$6.653.12%
  • shiba-inuShiba Inu(SHIB)$0.0000053.68%
  • paypal-usdPayPal USD(PYUSD)$1.000.06%
  • nearNEAR Protocol(NEAR)$2.096.80%
  • crypto-com-chainCronos(CRO)$0.0600180.87%
  • Global DollarGlobal Dollar(USDG)$1.00-0.01%
  • tether-goldTether Gold(XAUT)$4,167.272.41%
  • AudieraAudiera(BEAT)$8.7619.52%
  • BlackRock USD Institutional Digital Liquidity FundBlackRock USD Institutional Digital Liquidity Fund(BUIDL)$1.000.00%
  • Ondo US Dollar YieldOndo US Dollar Yield(USDY)$1.130.29%
  • BittensorBittensor(TAO)$213.854.74%
  • pax-goldPAX Gold(PAXG)$4,179.592.26%
  • World Liberty FinancialWorld Liberty Financial(WLFI)$0.058705-1.29%
  • OndoOndo(ONDO)$0.3668739.56%
  • mantleMantle(MNT)$0.541.60%
  • worldcoin-wldWorldcoin(WLD)$0.49756610.52%
  • AsterAster(ASTER)$0.631.15%
  • Ripple USDRipple USD(RLUSD)$1.00-0.04%
TradePoint.io
  • Main
  • AI & Technology
  • Stock Charts
  • Market & News
  • Business
  • Finance Tips
  • Trade Tube
  • Blog
  • Shop
No Result
View All Result
TradePoint.io
No Result
View All Result

Xiaomi’s new open source, agentic AI coding harness MiMo Code beats Claude Code at ultra-long, 200+ step tasks

June 11, 2026
in AI & Technology
Reading Time: 10 mins read
A A
Xiaomi’s new open source, agentic AI coding harness MiMo Code beats Claude Code at ultra-long, 200+ step tasks
ShareShareShareShareShare

Xiaomi’s MiMo AI team has open-sourced MiMo Code V0.1.0, a terminal-native AI coding assistant that the Chinese electronics giant says outperforms Anthropic’s Claude Code on key agentic coding benchmarks, especially on long-horizon, multi-step tasks (200+ steps) — at least, according to its own internal beta release and survey of 576 developers.

It’s also bundling limited-time free access to MiMo-V2.5, its multimodal flagship model with a million-token context window, requiring no registration to get started.

YOU MAY ALSO LIKE

Perplexity Moves Deep Research Into Computer, Routing Research Subtasks Across 20+ Frontier Models For Reports, Decks, And Dashboards

xAI Ships Grok Build Plugin Marketplace With MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers Plugins at Launch

The release was announced June 10, 2026 in a post on the social network X from the official @XiaomiMiMo account, which described the tool as “more than an AI coding assistant in your terminal — it’s the smartest coding partner you’ll ever work with.”

MiMo Code is available now on GitHub under an MIT license, and installs with a single terminal command (curl -fsSL https://mimo.xiaomi.com/install | bash) on macOS and Linux or via npm (npm install -g @mimo-ai/cli) on Windows.

The project is a fork of the open-source OpenCode agent, which Xiaomi has extended with its own memory architecture, workflow modes, and model harness.

The end of AI coding agents’ amnesia?

As any avid vibe coder would surely attest, AI coding agents degrade over long working sessions: as the context window fills, earlier decisions, conventions, and task state get compacted away or lost entirely, forcing developers to re-explain their projects.

Xiaomi argues this approach is doomed at scale. “What we need is not better compression, but an explicit storage-and-retrieval mechanism that decides what information should be written into persistent structures, and when it should be recalled,” the MiMo team noted in their launch blog.

MiMo Code attacks this with a cross-session memory system, powered under the hood by SQLite FTS5 full-text search, that spans four layers: project memory (a persistent MEMORY.md file), session checkpoints, scratch notes, and per-task progress logs.

The note-taking is key, here: Rather than forcing the primary coding agent to pause its work to take notes, the system deploys an independent “checkpoint-writer” subagent.

Think of it the primary coding agent as a construction contractor working to build a massive mansion alongside a dedicated architect, the checkpoint-writer subagent. While the main agent focuses on building out the physical structure, the subagent updates the blueprints in real time, noting decisions, issues, and the actual lay of the land as the construction project progresses.

When the context window approaches its limits — the contractor gets lost in the half-built mansion — it can consult the subagent and find its place again. In the case of MiMo Code, the system simply rebuilds the environment from structured checkpoints with the relevant context, ensuring no loss of operational momentum.

Two self-improvement mechanisms round out the system: a /dream command that periodically (roughly every seven days) reviews historical sessions, deduplicates them, and compresses them into long-term memory, and a “distill” function that mines past sessions for repeated workflows that can be automated, following a similar approach taken recently by OpenAI and Anthropic with their various models.

Impressive performance on software engineering (SWE) benchmarks

According to benchmark figures published in Xiaomi’s technical blog post, MiMo Code paired with MiMo-V2.5-Pro outperformed Claude Code paired with Claude Sonnet 4.6 on all three evaluations tested:

MiMo Code vs. Claude Code benchmark performance. Credit: Xiaomi

  • SWE-bench Verified: 82% vs. 79%

  • SWE-bench Pro: 62% vs. 55%

  • Terminal Bench 2: 73% vs. 69%

The harness itself accounts for a measurable share of the gain. Running the same MiMo-V2.5-Pro model in both harnesses, MiMo Code scored 62% on SWE-bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus 68% — roughly five points each, attributable purely to the agent system rather than the model.

Xiaomi notably did not publish comparisons against OpenAI’s Codex or Google’s Gemini CLI — Claude Code is the sole named competitor throughout its materials, a telling choice of benchmark target.

Independent reference points suggest why. On the official Terminal-Bench 2.0 leaderboard maintained at tbench.ai, OpenAI’s Codex CLI running GPT-5.5 scores 82.2% — roughly nine points above MiMo Code’s self-reported 73% — and OpenAI’s own GPT-5.5 announcement claims 82.7% on the same benchmark.

On SWE-Bench Pro, however, the picture flips: OpenAI reports GPT-5.5 at 58.6%, below MiMo Code + MiMo-V2.5-Pro’s claimed 62%. (MiMo Code does not yet appear on either official leaderboard, and cross-comparing self-run numbers against leaderboard submissions carries the usual configuration caveats.)

Perhaps more interesting than the offline benchmarks: Xiaomi says it ran a human double-blind A/B evaluation during its internal beta, covering 576 developers working in 474 real private repositories, producing 1,213 judged head-to-head pairs against Claude Code using the same target model.

Under 200 execution steps, the two systems split roughly 50/50 — but past 200 steps, MiMo Code’s win rate rose above 65%, supporting the company’s thesis that its memory and state-management architecture pays off specifically on long-horizon work.

Xiaomi itself concedes the standard benchmarks “still measure one-shot problem-solving ability” and don’t capture the tool’s multi-session design goals.

As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claims are consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.

Easy integration with existing developer systems and voice control

From a user experience standpoint, MiMo Code is designed to live where developers already work. It operates directly in the terminal, reading and writing files, running commands, and managing Git.

Out of the box, the tool requires zero configuration, connecting automatically to “MiMo Auto”—a free-for-a-limited-time channel powered by Xiaomi’s multimodal MiMo V2.5 model, which boasts a massive million-token context window. For developers migrating from existing environments, the transition is frictionless: MiMo Code automatically imports MCP servers, custom skills, and API configurations from Claude Code.

Other noteworthy features include:

  • Compose mode: Pressing Tab switches the agent into a specification-driven workflow in which the developer describes a high-level goal and the system autonomously executes the full development cycle — design, planning, coding, testing, and review — following what Xiaomi describes as a “heavy planning upfront, stable verification later” strategy.

  • Voice control: Built on Xiaomi’s MiMo-ASR speech recognition with TenVAD voice activity detection, developers can dictate and modify instructions verbally and speak commands like “send” and “execute” for fully hands-free operation (available for logged-in users).

According to Xiaomi, the gains from the agent harness itself are measurable. Running the same underlying MiMo model in both harnesses, the company says MiMo Code scored 62% on SWE-Bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus Claude Code’s 68% — roughly five percentage points better on each, attributable purely to the agent system rather than the model.

As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claim is consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.

Aggressively affordable

The bigger lure for many developers may be what’s bundled in.

MiMo Code ships with “MiMo Auto,” a zero-configuration channel offering free, limited-time access to MiMo-V2.5 — the natively multimodal model Xiaomi released in late April 2026, a sparse mixture-of-experts design with 310 billion total parameters (just 15 billion active per inference) and a 1 million token context window, which the company positions as matching Anthropic’s Claude Sonnet 4.6 in multimodal agentic work.

As VentureBeat reported when the MiMo-V2.5 family launched in April, the models are MIT-licensed and among the most efficient and affordable available for agentic tasks.

The larger MiMo-V2.5-Pro — a 1.02-trillion-parameter mixture-of-experts model with 42 billion active parameters and a hybrid-attention architecture — led the open-source field on Xiaomi’s ClawEval agentic benchmark with a 63.8% success rate while consuming only about 70,000 tokens per trajectory, roughly 40–60% fewer than Anthropic’s Claude Opus 4.6, Google’s Gemini 3.1 Pro, or OpenAI’s GPT-5.4 needed for comparable results.

Notably, the V2.5-Pro’s post-training was explicitly designed to instill “harness awareness” — training the model to manage its own memory and context within agent scaffolds like Claude Code or OpenCode — making a Xiaomi-built harness optimized around that capability a logical next step.

Pricing is similarly aggressive: MiMo-V2.5 starts at $0.40 per million input tokens and $2.00 per million output tokens, while V2.5-Pro runs $1.00/$3.00 per million (input/output) up to 256K context, doubling beyond that, with cache hits dropping input costs to as little as $0.20–$0.40 per million, making it among the cheapest frontier models available globally.

Model

Input

Output

Total Cost

Source

MiMo-V2.5 Flash

$0.10

$0.30

$0.40

Xiaomi MiMo

deepseek-v4-flash

$0.14

$0.28

$0.42

DeepSeek

deepseek-v4-pro

$0.435

$0.87

$1.305

DeepSeek

MiniMax-M3

$0.30

$1.20

$1.50

MiniMax

Gemini 3.1 Flash-Lite

$0.25

$1.50

$1.75

Google

Qwen3.7-Plus

$0.40

$1.60

$2.00

Alibaba Cloud

MiMo-V2.5

$0.40

$2.00

$2.40

Xiaomi MiMo

Grok 4.3 (low context)

$1.25

$2.50

$3.75

xAI

MiMo-V2.5 Pro (≤256K)

$1.00

$3.00

$4.00

Xiaomi MiMo

GLM-5

$1.00

$3.20

$4.20

Z.ai

Kimi-K2.6

$0.95

$4.00

$4.95

Moonshot/Kimi

GLM-5.1

$1.40

$4.40

$5.80

Z.ai

Grok 4.3 (high context)

$2.50

$5.00

$7.50

xAI

MiMo-V2.5 Pro (>256K)

$2.00

$6.00

$8.00

Xiaomi MiMo

Qwen3.7-Max

$2.50

$7.50

$10.00

Alibaba Cloud

Gemini 3.5 Flash

$1.50

$9.00

$10.50

Google

Gemini 3.1 Pro Preview (≤200K)

$2.00

$12.00

$14.00

Google

GPT-5.4

$2.50

$15.00

$17.50

OpenAI

Gemini 3.1 Pro Preview (>200K)

$4.00

$18.00

$22.00

Google

Claude Opus 4.8

$5.00

$25.00

$30.00

Anthropic

GPT-5.5

$5.00

$30.00

$35.00

OpenAI

Claude Fable 5 / Claude Mythos 5

$10.00

$50.00

$60.00

Anthropic

For developers who don’t want Xiaomi’s models at all, MiMo Code also supports third-party backends — including token plans from DeepSeek, Moonshot’s Kimi, and Zhipu’s GLM — along with any OpenAI-compatible API, mirroring the bring-your-own-model flexibility of its OpenCode parent.

Terminal AI coding agent wars go global

MiMo Code lands in an increasingly crowded field of terminal-based coding agents: Anthropic’s Claude Code, OpenAI’s Codex CLI, Google’s Gemini CLI, and open-source players like OpenCode and Aider.

What’s new is the entrant. Xiaomi — the world’s third-largest smartphone maker, with a fast-growing EV business — has been methodically building its MiMo AI division since the release of the MiMo-7B reasoning model in April 2025, following with the MiMo-VL vision-language series, MiMo-V2-Flash, the 1-trillion-parameter MiMo-V2-Pro in March 2026, and the V2.5 flagship family in April.

The effort is led by Fuli Luo, a veteran of DeepSeek’s disruptive R1 project, who has characterized Xiaomi’s frontier push as a “quiet ambush” — and backed it with a 100-trillion free token grant for builders announced alongside the V2.5 launch.

The playbook is familiar from DeepSeek, Alibaba’s Qwen, MiniMax, and Moonshot AI’s Kimi series: release genuinely capable models and tooling under permissive licenses at a fraction of U.S. lab pricing, and convert the resulting developer mindshare into a durable ecosystem.

By pairing an open-source agent harness with a free frontier-class model, Xiaomi is effectively eliminating both the licensing and the usage cost of entry — at least for now.

What it means for enterprises and technical decision-makers

For engineering leaders, MiMo Code is a low-risk, potentially high-value evaluation candidate: MIT-style licensing permits modification and commercial integration, the OpenCode lineage means the architecture is inspectable, and the bring-your-own-model support means it can be pointed at an internally approved endpoint rather than Xiaomi’s cloud.

The persistent memory system addresses a real and widely felt pain point in agentic development workflows — one that competitors are also racing to solve.

The countervailing considerations: the “free for a limited time” model access is by definition temporary and routes code context through Xiaomi’s servers, which will be a non-starter for organizations with strict data-residency or IP policies; the benchmark edge over Claude Code is self-reported; and a V0.1.0 release number signals exactly what it suggests about maturity.

Teams subject to U.S. government procurement restrictions on Chinese technology vendors should also weigh that context before adopting.

Credit: Source link

ShareTweetSendSharePin

Related Posts

Perplexity Moves Deep Research Into Computer, Routing Research Subtasks Across 20+ Frontier Models For Reports, Decks, And Dashboards
AI & Technology

Perplexity Moves Deep Research Into Computer, Routing Research Subtasks Across 20+ Frontier Models For Reports, Decks, And Dashboards

June 11, 2026
xAI Ships Grok Build Plugin Marketplace With MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers Plugins at Launch
AI & Technology

xAI Ships Grok Build Plugin Marketplace With MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers Plugins at Launch

June 11, 2026
Overwatch’s Latest Hero Will Throw A Bike At Your Head
AI & Technology

Overwatch’s Latest Hero Will Throw A Bike At Your Head

June 11, 2026
Waymo’s Monthly Membership Seems Like A Bad Deal
AI & Technology

Waymo’s Monthly Membership Seems Like A Bad Deal

June 11, 2026
Next Post
The AI Trade Is Still “Pregame” — Stocks to Buy While It’s Early

The AI Trade Is Still “Pregame” — Stocks to Buy While It’s Early

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search

No Result
View All Result
Trump’s ‘Transactional Diplomacy’ in China and the AI-Enabled Life – May 13 | Here’s the Scoop

Trump’s ‘Transactional Diplomacy’ in China and the AI-Enabled Life – May 13 | Here’s the Scoop

June 8, 2026
How forecasting has changed since the 1996 movie ‘Twister’

How forecasting has changed since the 1996 movie ‘Twister’

June 11, 2026
WNBA tips off 30th season with pressure to prove its worth

WNBA tips off 30th season with pressure to prove its worth

June 11, 2026

About

Learn more

Our Services

Legal

Privacy Policy

Terms of Use

Bloggers

Learn more

Article Links

Contact

Advertise

Ask us anything

©2020- TradePoint.io - All rights reserved!

Tradepoint.io, being just a publishing and technology platform, is not a registered broker-dealer or investment adviser. So we do not provide investment advice. Rather, brokerage services are provided to clients of Tradepoint.io by independent SEC-registered broker-dealers and members of FINRA/SIPC. Every form of investing carries some risk and past performance is not a guarantee of future results. “Tradepoint.io“, “Instant Investing” and “My Trading Tools” are registered trademarks of Apperbuild, LLC.

This website is operated by Apperbuild, LLC. We have no link to any brokerage firm and we do not provide investment advice. Every information and resource we provide is solely for the education of our readers. © 2020 Apperbuild, LLC. All rights reserved.

No Result
View All Result
  • Main
  • AI & Technology
  • Stock Charts
  • Market & News
  • Business
  • Finance Tips
  • Trade Tube
  • Blog
  • Shop

© 2023 - TradePoint.io - All Rights Reserved!