Alibaba Qwen3.7-Plus price cut by 80%, swapping closed-source for lower costs

2026-06-03 05:14:01

Alibaba Qianwen (Qwen) series released the Qwen3.7-Plus model this week. Input pricing is $0.40 per million tokens, output pricing is $1.60 per million tokens, for a total of $2.00. This is an 80% reduction compared with Qwen3.7-Max. Cached input pricing can be as low as $0.04 per million tokens. The target scenarios are high-frequency, repetitive tasks.

Qwen3.7-Plus confirmed pricing: fee rates for each billing mode

According to pricing information published by Alibaba official:

Standard input: $0.40 per million tokens

Standard output: $1.60 per million tokens

Total (input + output): $2.00

Cached input: $0.04 per million tokens (applies to agent scenarios such as repeatedly reading the same codebase or enterprise UI)

Comparison target: Qwen3.7-Max charges $2.50 for input, $7.50 for output, totaling $10.00. Chinese competitor MiniMax-M3 has a limited-time discount totaling $1.50, and Qwen3.7-Plus pricing closely follows it.

Official benchmark numbers (official self-assessment)

The following are Qwen3.7-Plus benchmark numbers published by Alibaba official; all are self-assessment data:

Terminal Bench 2.0-Terminus: 70.3 (DeepSeek-V4-Pro Max is 67.9, Gemini-3.1 Pro is 63.5)

ScreenSpot Pro (computer vision and interface understanding): 79.0 (GPT-5.4 xhigh is 67.4, Claude-Opus-4.6 is 49.5)

It is worth noting that Alibaba’s official documents also state that Qwen3.7-Plus’s overall performance is still lower than most leading closed-source U.S. models. The above numbers are single-point comparisons on specific tasks and do not represent overall performance.

Impact of closed-source deployment confirmation: compliance considerations and applicable limits

Qwen3.7-Plus does not provide downloadable open model weights. All API calls must be processed through Alibaba Cloud international nodes, and data flows outside the user’s own servers. Under this architecture, the following scenarios face clear compliance barriers:

Industries with data sovereignty or regulatory constraints: healthcare (HIPAA, GDPR), defense, government agencies—need to evaluate whether external API routing meets compliance requirements

On-premises isolated deployment scenarios: cannot be deployed in a fully isolated local environment

Conversely, the advantage of a closed-source API mode is that it does not require the hardware procurement and maintenance of a multi-GPU cluster (such as Nvidia H100). In addition, the OpenAI-compatible format minimizes changes to existing infrastructure.

Frequently asked questions

Which scenarios does Qwen3.7-Plus cached pricing of $0.04 per million tokens apply to?

Cached pricing applies to scenarios where agents repeatedly read the same input, such as continuously accessing the same code repository, fixed enterprise UI templates, or system prompts kept for long periods. In large workflows with high-frequency, repetitive tasks, caching can significantly reduce total API costs. Alibaba has not published specific guarantees for cache hit rates or details about usage limitations.

What are the main differences between Qwen3.7-Plus and the earlier Qwen open-licensed versions?

Earlier Qwen series were released under Apache 2.0 and provided downloadable model weights, allowing anyone to deploy locally, fine-tune, and integrate into their own systems. Qwen3.7-Plus is provided only through Alibaba Cloud APIs and does not release model weights. This means it cannot be deployed locally or in isolated networks; all usage depends on Alibaba Cloud’s external infrastructure.

How should the credibility of Qwen3.7-Plus’s official benchmark numbers be interpreted?

Qwen3.7-Plus’s official documentation explicitly states that scores such as Terminal Bench and ScreenSpot Pro are Alibaba’s self-assessed numbers, and overall performance is still lower than most leading closed-source U.S. models. Benchmark numbers reflect single-point performance on specific tasks and do not represent end-to-end latency, stability, or comprehensive performance in real production environments.

Disclaimer: The information on this page may come from third-party sources and is for reference only. It does not represent the views or opinions of Gate and does not constitute any financial, investment, or legal advice. Virtual asset trading involves high risk. Please do not rely solely on the information on this page when making decisions. For details, see the Disclaimer.