TradePoint.io

Qwen Introduces Qwen3.7-Max: A Reasoning Agent Model With a 1M-Token Context Window

May 21, 2026
in AI & Technology

Most AI models today are not designed for sustained, multi-step autonomous execution. Tasks like running hundreds of iterative code modifications, or chaining tool calls across hours without human intervention, require a different kind of model architecture and training focus.

Alibaba’s Qwen team formally announced Qwen3.7-Max at the 2026 Alibaba Cloud Summit on May 20. Two preview versions of the Qwen3.7 series had already appeared quietly on Arena AI’s leaderboard, however, with no press release and no official API announcement.

Two Preview Models Released Simultaneously

Alibaba previewed two models simultaneously: Qwen3.7-Max-Preview and Qwen3.7-Plus-Preview. They ranked 13th globally in text capabilities and 16th in vision capabilities, respectively, according to LM Arena.

In Text Arena, Qwen3.7-Max-Preview ranked #13 overall, placing Alibaba as the #6 lab in text. In Vision Arena, Qwen3.7-Plus-Preview ranked #16 overall, placing Alibaba as the #5 lab in vision. The model rank and the lab rank are separate figures.

Qwen3.7-Plus-Preview is described as a high-performance balanced version preview, focusing on reasoning and logical expression, with its toolchain to be gradually opened in the future. It handles vision and multimodal inputs. Qwen3.7-Max is the text-only reasoning flagship. This article covers Qwen3.7-Max, as it is the model Alibaba formally announced with API access.

What Qwen3.7-Max Is Designed For

Alibaba’s Qwen team described Qwen3.7-Max as its most advanced and comprehensive agent model to date. The model is proprietary and closed-weight. It is designed to handle coding and debugging, office workflow automation, and long-horizon tasks spanning hundreds or even thousands of steps.

Extended-Thinking Mode

Qwen3.7-Max is a reasoning model. The model generates a chain of thought first — an internal sequence of steps where it plans, checks its work, and corrects course before committing to a final answer. On interfaces like Qwen Chat, this shows up as a ‘Thinking’ mode you can switch on to see the model’s reasoning trace.

Reasoning models produce significantly more output tokens than standard completions. When Artificial Analysis ran its Intelligence Index evaluation, Qwen3.7-Max generated about 97 million tokens, compared to an average of 24 million for models on that benchmark. For short or simple tasks, this overhead adds latency without improving output quality. For multi-step planning, code refactoring, or long agent chains, extended-thinking mode is where the model’s strength applies.

Context Window

The model features a 1M token context window, up from 256K on Qwen3.6 Max Preview. It supports text input and output only. Pricing has not yet been announced. Qwen3.6 Max Preview was priced at $1.30/$7.80 per million input/output tokens on Alibaba Cloud.

A million-token context window can hold a full mid-sized code repository or a large stack of documents in a single request. Models often reason less reliably as the context window fills. Independent long-context testing for Qwen3.7-Max is not yet available.
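For a rough sense of what a full-context request might cost, the sketch below applies the Qwen3.6 Max Preview reference prices quoted above. Qwen3.7-Max pricing is unannounced, so the rates are placeholders; the function name, the token counts, and the choice to bill thinking tokens as output are illustrative assumptions.

```python
# Reference rates from Qwen3.6 Max Preview (USD per million tokens).
# Assumption: Qwen3.7-Max pricing is unannounced; swap in real rates later.
INPUT_USD_PER_M = 1.30
OUTPUT_USD_PER_M = 7.80


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the placeholder rates above."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    return round(input_cost + output_cost, 4)


# A request that fills the 1M-token window and returns 20K tokens
# (assumption: thinking tokens are counted as output tokens).
full_window = estimate_cost_usd(1_000_000, 20_000)
# The same output with the context trimmed to 50K tokens.
trimmed = estimate_cost_usd(50_000, 20_000)
print(f"full window: ${full_window:.3f}, trimmed: ${trimmed:.3f}")
```

At these placeholder rates the input side dominates a full-window request, which is one reason to send only the context a task needs.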

Benchmark Results

Qwen3.7-Max scored 56.6 on the Artificial Analysis Intelligence Index, placing it fifth overall. That represents a 4.8-point gain over its predecessor Qwen3.6 Max Preview (51.8), and puts it ahead of Google’s Gemini 3.5 Flash (55.3). GPT-5.5 (60.2), Claude Opus 4.7 (57.3), and Gemini 3.1 Pro Preview (57.2) still lead the overall rankings.

The Intelligence Index v4.0 aggregates ten evaluations, including GDPval-AA, Terminal-Bench Hard, SciCode, AA-Omniscience, Humanity’s Last Exam, and GPQA Diamond.

https://qwen.ai/blog?id=qwen3.7

The improvement over Qwen3.6 Max Preview is not uniform. Most of the Index gains are concentrated in scientific reasoning, agentic capability, and coding. CritPt rose 9.7 percentage points (from 3.7% to 13.4%), Humanity’s Last Exam jumped 9.2 points (from 28.9% to 38.1%), and Terminal-Bench Hard climbed 6.9 points (from 43.9% to 50.8%). GDPval-AA added 42 Elo points (from 1504 to 1546). Scores on other benchmarks are largely flat compared to Qwen3.6 Max Preview.

One result on the Index requires careful reading. On AA-Omniscience, Qwen3.7-Max’s raw accuracy actually dropped 7.6 percentage points (from 37.7% to 30.1%), while its hallucination rate fell 21.3 points (from 44.2% to 22.9%). The model is choosing to say “I don’t know” more often rather than recalling more facts. Its attempt rate fell from 67.3% to 48.0%, the lowest among frontier models in the comparison. The AA-Omniscience benchmark rewards correct answers and penalizes hallucinations but has no penalty for refusing to answer. For use cases that depend on broad factual recall, this is a meaningful limitation to test against your workload.
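To see how a lower attempt rate can raise a score while accuracy falls, consider a simplified scoring rule: +1 for a correct answer, -1 for an incorrect one, and 0 for an abstention. Both the rule and the counts below are illustrative assumptions. They are not the actual AA-Omniscience formula or Qwen's results.

```python
def abstention_aware_score(correct: int, incorrect: int, abstained: int) -> float:
    """Score in [-100, 100]: +1 per correct, -1 per incorrect, 0 per abstention."""
    total = correct + incorrect + abstained
    if total == 0:
        raise ValueError("no questions scored")
    return 100 * (correct - incorrect) / total


def accuracy(correct: int, incorrect: int, abstained: int) -> float:
    """Raw accuracy: percentage of all questions answered correctly."""
    return 100 * correct / (correct + incorrect + abstained)


# Two hypothetical models on 100 questions (made-up counts).
eager = dict(correct=38, incorrect=30, abstained=32)     # answers more often
cautious = dict(correct=30, incorrect=18, abstained=52)  # abstains more often

for name, counts in (("eager", eager), ("cautious", cautious)):
    print(name, accuracy(**counts), abstention_aware_score(**counts))
```

Under this rule the cautious model has lower raw accuracy but a higher score, because each avoided wrong answer is worth more than each lost correct one costs relative to abstaining.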

In Text Arena, Qwen3.7-Max-Preview ranked #13 overall with an Elo score of 1,475. Category rankings include #7 in Math, #9 in Expert Prompts, #9 in Software and IT, and #10 in Coding.

All benchmark numbers are preliminary. The model carries a ‘Preview’ label, indicating Alibaba considers it an early build.

Agentic Performance — Internal Test

In an internal Alibaba test on a new chip platform, the model autonomously performed more than 1,000 tool calls and iterative code modifications to optimize a key kernel. Alibaba claimed the process improved inference speed by roughly 10x compared with the previous version.

Marktechpost’s Visual Explainer






Slide 1 of 6

What is Qwen3.7-Max?

A proprietary reasoning model from Alibaba, designed for long-horizon agent tasks, code generation, and multi-step automation.

Context Window

1 million tokens — enough to fit a full mid-sized code repository in a single request.

Reasoning Model

Uses chain-of-thought (extended-thinking mode) before producing a final answer.

Input / Output

Text in, text out. No image input supported in this model.

API String

Use qwen3.7-max when calling via Alibaba Cloud Model Studio.

OpenAI-compatible API
OpenAI & Anthropic spec
Preview — no open weights yet

Slide 2 of 6

Quick Start: Chat Interface

The fastest way to test Qwen3.7-Max with no API key or setup required.

  1. Go to Qwen Chat. Navigate to chat.qwen.ai and create a free account.
  2. Select the model. In the model selector dropdown, choose Qwen3.7-Max. It may appear as Qwen3.7-Max-Preview during the preview period.
  3. Enable Thinking Mode. Toggle on Thinking Mode in the chat interface. This activates chain-of-thought reasoning and shows the model’s internal reasoning trace before the final answer.
  4. Send your prompt. Type your query. For best results on complex tasks, be specific about steps, constraints, and expected output format.

💡

Use your hardest real-world prompts when testing. Multi-step math problems, complex refactoring requests, and ambiguous expert questions reveal more about model quality than simple prompts.

Slide 3 of 6

API Access

Qwen3.7-Max is compatible with both OpenAI and Anthropic API specifications. You can plug it into existing pipelines with minimal changes.

OpenAI-compatible Python call

import os

from openai import OpenAI

# Read the DashScope key from the environment rather than hard-coding it.
client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # international endpoint
)

response = client.chat.completions.create(
    model="qwen3.7-max",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain chain-of-thought reasoning."},
    ],
)

# The final answer is in the first choice's message.
print(response.choices[0].message.content)

ℹ️

Get your API key from Alibaba Cloud Model Studio (DashScope). The base URL for international access is dashscope-intl.aliyuncs.com.

⚠️

Pricing has not yet been announced for Qwen3.7-Max. For reference, Qwen3.6 Max Preview was priced at $1.30 / $7.80 per million input/output tokens.

Slide 4 of 6

Understanding Thinking Mode

Thinking Mode is the model’s chain-of-thought reasoning layer. It determines how the model approaches a problem before generating a response.

When to use it

Multi-step code refactoring, complex math proofs, long agent task chains, and ambiguous problems requiring step-by-step planning.

When to skip it

Short rewrites, simple classifications, quick lookups, or tasks where latency and token cost need to be minimised.


API: Enable thinking via extra_body

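# Reuses the `client` created in the API Access example.
response = client.chat.completions.create(
    model="qwen3.7-max",
    messages=[{"role": "user", "content": "Your prompt here"}],
    # enable_thinking is not a standard OpenAI parameter, so it goes in extra_body.
    extra_body={"enable_thinking": True},
)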

💡

Qwen3.7-Max generated ~97M tokens on Artificial Analysis benchmarks, vs. an average of 24M for comparable models. Each thinking token adds to latency and cost — use thinking mode selectively.
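One way to use thinking mode selectively is to decide per request whether to enable it. The sketch below builds the keyword arguments for `client.chat.completions.create`. The task labels and the helper name are hypothetical; `enable_thinking` is the flag shown in the snippet above.

```python
# Hypothetical task labels, loosely following the "when to use it" guidance.
THINKING_TASKS = {"refactor", "math_proof", "agent_chain", "planning"}


def build_request(prompt: str, task_type: str, model: str = "qwen3.7-max") -> dict:
    """Return kwargs for client.chat.completions.create(**kwargs)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # The OpenAI SDK forwards extra_body fields to the provider unchanged.
        "extra_body": {"enable_thinking": task_type in THINKING_TASKS},
    }


quick = build_request("Classify this ticket as bug or feature.", "classification")
deep = build_request("Refactor this module to remove the global state.", "refactor")
print(quick["extra_body"], deep["extra_body"])
```

Keeping the decision in one helper makes it easy to measure latency and token use per task type before settling on which ones justify the extra reasoning tokens.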

Slide 5 of 6

Agentic and Long-Horizon Tasks

Qwen3.7-Max is designed to run long, autonomous task loops. In Alibaba’s internal testing, it executed 1,000+ tool calls and sustained autonomous execution for up to 35 hours.

  1. Define tools clearly. Pass tool definitions in the standard OpenAI tools parameter. The model supports function calling and iterative tool invocation natively.
  2. Use the 1M context window intentionally. Pass full task history, prior tool outputs, and code state into context. Trim aggressively when the full context is not needed, because every token is billed.
  3. Target the final answer in assertions. Reasoning output is longer and more variable than a standard completion. When writing tests, assert on the final answer, not the exact wording of the thinking trace.
  4. Good use cases. Kernel optimisation, code debugging loops, office workflow automation, and multi-step data pipelines with iterative verification.
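The tool-calling pattern described above might look like the following minimal sketch. It assumes an OpenAI-SDK-style `client` and the standard response shape (`message.tool_calls`, `call.function.arguments`). The `run_tests` tool, its stand-in body, and the step budget are hypothetical and exist only for illustration.

```python
import json

# Hypothetical tool definition in the OpenAI "tools" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the test suite under a directory and summarize.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]


def run_tests(path: str) -> str:
    """Stand-in tool body; a real one would invoke a test runner."""
    return f"tests under {path}: 0 failures"


REGISTRY = {"run_tests": run_tests}


def dispatch(call_id: str, name: str, arguments_json: str) -> dict:
    """Run one requested tool and build the 'tool' message to send back."""
    try:
        result = REGISTRY[name](**json.loads(arguments_json))
    except Exception as exc:  # report failures to the model instead of crashing
        result = f"tool error: {exc!r}"
    return {"role": "tool", "tool_call_id": call_id, "content": result}


def agent_loop(client, messages: list, max_steps: int = 50) -> str:
    """Call the model until it answers without requesting a tool."""
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="qwen3.7-max", messages=messages, tools=TOOLS
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content
        messages.append(message)  # keep the assistant's tool request in history
        for call in message.tool_calls:
            messages.append(
                dispatch(call.id, call.function.name, call.function.arguments)
            )
    raise RuntimeError("step budget exhausted before a final answer")
```

Returning tool errors as message content, rather than raising, lets the model see the failure and retry. The explicit `max_steps` cap keeps a long-horizon run from looping indefinitely.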

⚠️

The 35-hour and 1,000+ tool call figures come from Alibaba’s internal testing only. No independent verification exists for these specific claims.
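For the advice above about asserting on the final answer, one approach is to extract and normalize only the answer before comparing. The sketch assumes you prompt the model to end with a marker line of your own choosing (here `FINAL:`). That marker is a prompt convention you would have to request, not something the model emits by default, and the helper name is hypothetical.

```python
import re


def extract_final(text: str, marker: str = "FINAL:") -> str:
    """Return the text after the last marker, with whitespace collapsed."""
    _, found, tail = text.rpartition(marker)
    if not found:
        raise ValueError(f"no {marker!r} marker in model output")
    return re.sub(r"\s+", " ", tail).strip()


# Two differently worded outputs that should pass the same test.
output_a = "First I factor the expression...\nFINAL: 42"
output_b = "Checking edge cases, then summing.\n\nFINAL:   42\n"

assert extract_final(output_a) == extract_final(output_b) == "42"
```

Because only the text after the marker is compared, changes in the wording or length of the reasoning do not break the test.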

Slide 6 of 6

Known Limitations

Understanding these limitations before integrating will save debugging time and help you set the right expectations.

No image input

Qwen3.7-Max is text-only. For multimodal tasks, use Qwen3.7-Plus-Preview instead, which supports vision input.

AA-Omniscience abstention

On the AA-Omniscience benchmark, the model’s attempt rate dropped from 67.3% to 48.0%. It abstains more and hallucinates less — but its raw factual recall also dropped. Test carefully for knowledge-recall tasks.

Preview status

The model currently carries a -Preview suffix. Benchmark scores, behaviour, and pricing can change before stable release. No open-weight version is available as of May 2026.

Long-context reliability

A 1M token context window is a ceiling, not a guarantee. Independent long-context testing for Qwen3.7-Max is not yet available. Validate retrieval quality on your specific workload.
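A simple way to validate retrieval on your own workload is a "needle in a haystack" check: plant a known fact at different depths of a long prompt and see whether the model returns it. The sketch below only builds the prompts and scores answers. The planted fact, the filler text, and the function names are made up, and real documents would make a better haystack than repeated filler.

```python
def build_haystack(needle: str, filler: str, n_paragraphs: int, depth: float) -> str:
    """Place `needle` among `n_paragraphs` filler paragraphs.

    depth=0.0 puts the needle first, depth=1.0 puts it last.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    paragraphs = [filler] * n_paragraphs
    paragraphs.insert(round(depth * n_paragraphs), needle)
    return "\n\n".join(paragraphs)


def found(answer: str, expected: str) -> bool:
    """Loose check: the expected value appears in the model's answer."""
    return expected.lower() in answer.lower()


needle = "The deployment passphrase is heron-41."    # made-up fact
filler = "This paragraph is unrelated filler text."  # stand-in for real documents
prompts = {
    depth: build_haystack(needle, filler, n_paragraphs=1000, depth=depth)
    + "\n\nQuestion: What is the deployment passphrase?"
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0)
}
# Send each prompt to the model, then score with found(answer, "heron-41").
```

Comparing hit rates across depths and context sizes shows whether recall degrades as the window fills.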

ℹ️

For the latest model updates, check the official Qwen blog at qwen.ai/blog and Alibaba Cloud Model Studio docs.

Key Takeaways:

  • Alibaba released two Qwen3.7 preview models: Max (text/reasoning) and Plus (multimodal).
  • Qwen3.7-Max scored 56.6 on the Artificial Analysis Intelligence Index, ranking #5 overall — a 4.8-point gain over Qwen3.6 Max Preview.
  • The 1M-token context window is roughly four times the 256K limit of Qwen3.6 Max Preview; text only, no image input.
  • On AA-Omniscience, raw accuracy dropped while abstention rose — worth testing for knowledge-recall use cases.
  • The model sustained 1,000+ tool calls and 35-hour autonomous execution in Alibaba’s internal testing only; no independent verification yet.
