Microsoft's Fara1.5 AI Beats OpenAI and Google on Web Browsing

Microsoft Research this week released Fara1.5, an open-weight AI model for web browsing tasks that outperformed OpenAI's Operator and Google's Gemini 2.5 Computer Use on industry benchmarks. Fara1.5-27B scored 72% on Online-Mind2Web, compared with 58.3% for OpenAI Operator and 57.3% for Gemini 2.5 Computer Use. The release marks a shift in the competitive landscape of computer use agents: AI systems that read browser screens and perform actions such as clicking, scrolling, and typing without requiring special plugins. Unlike OpenAI's proprietary, cloud-based Operator (launched in January 2025 at $200 per month and shut down in August) and Google's Gemini offering, Fara1.5 is open-source, with publicly released weights. Microsoft achieved this performance by rethinking the full development process, from data generation and training objectives to model design and orchestration.

Model Specifications and Availability

Fara1.5 comes in three sizes: 4 billion, 9 billion, and 27 billion parameters, all built on Qwen 3.5, an Alibaba base model that Microsoft fine-tuned specifically for browser work. Fara1.5-9B, the mid-sized variant, scored 63.4% on Online-Mind2Web, ahead of both OpenAI's and Google's offerings. The 9 billion parameter model is available now on Azure AI Foundry, with the 4 billion and 27 billion variants arriving shortly.

Benchmark Performance

Online-Mind2Web, the primary benchmark, tests how often an AI agent correctly completes 300 diverse, real-world tasks across 136 popular live websites, including product comparisons, form filling, and booking services. The scoring reflects tasks finished correctly on the actual, changing internet.
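As a rough illustration of how a score of this kind is computed, the sketch below treats the result as the share of tasks judged successful. The TaskResult structure and the sample outcomes are hypothetical, made up for this example; the article does not describe the actual Online-Mind2Web harness or its judging procedure.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """One benchmark task outcome (hypothetical structure, for illustration)."""
    task_id: str
    website: str
    success: bool


def success_rate(results: list[TaskResult]) -> float:
    """Return the percentage of tasks judged successful."""
    if not results:
        raise ValueError("no results to score")
    passed = sum(1 for r in results if r.success)
    return 100.0 * passed / len(results)


# Made-up outcomes: the first 216 of 300 tasks are marked successful.
sample = [TaskResult(f"task-{i}", "example.com", i < 216) for i in range(300)]
print(f"{success_rate(sample):.1f}%")
```

Because the tasks run against live websites, the same agent can score differently from one run to the next, so a reported figure may be an average over several runs rather than a single pass.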

On WebVoyager, a second benchmark measuring task success on the live web, Fara1.5-27B achieved 88.6%, edging out OpenAI Operator's 87.0% and surpassing H Company's Holo2 (30 billion parameters) at 83.0%.

Open-source competitors scored lower on Online-Mind2Web: Alibaba's GUI-Owl-1.5 (8 billion parameters) reached 48.6%, while AI2's MolmoWeb scored 35.3%. Microsoft's previous model, Fara-7B, scored 34.1%, meaning Fara1.5-27B more than doubled its predecessor's score, while the similarly sized Fara1.5-9B (63.4%) nearly doubled it. Yutori's Navigator n1, the top proprietary alternative, reached 64.7%.
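The generational comparison can be checked with simple arithmetic. The sketch below uses only the percentages quoted in this article and assumes the Fara-7B figure is an Online-Mind2Web score, like the Fara1.5 figures it is compared against; the helper name improvement_ratio is made up for illustration.

```python
# Scores as reported in this article (percent). Treating all three as
# Online-Mind2Web results is an assumption made for this comparison.
scores = {
    "Fara1.5-27B": 72.0,
    "Fara1.5-9B": 63.4,
    "Fara-7B": 34.1,
}


def improvement_ratio(new: float, old: float) -> float:
    """How many times larger the new score is than the old one."""
    if old <= 0:
        raise ValueError("old score must be positive")
    return new / old


baseline = scores["Fara-7B"]
for name in ("Fara1.5-27B", "Fara1.5-9B"):
    ratio = improvement_ratio(scores[name], baseline)
    print(f"{name}: {ratio:.2f}x Fara-7B")
```

On these figures the 27 billion parameter model comes out at roughly 2.1 times the Fara-7B score and the 9 billion parameter model at roughly 1.9 times.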

Training Methodology

Microsoft used FaraGen1.5 to generate training data, employing OpenAI's GPT-5.4 as a "teacher agent" to demonstrate how to complete browser tasks. These demonstrations became the training data for Fara1.5.

The team also created six fully functional replicas of real websites, including email clients, calendars, and marketplaces. This synthetic domain training allowed the model to practice tasks requiring logins or irreversible actions without accessing real accounts, improving performance on "gated" tasks.

Safety and User Control

Every model is designed to stop and ask before performing irreversible actions. Fara1.5 runs through MagenticLite, a sandboxed browser environment that logs every action and allows users to halt the agent at any point. According to Yash Lara, Senior PM Lead at Microsoft Research, "Balancing robust safeguards such as Critical Points with seamless user journeys is key. Having a UI, like Microsoft Research's Magentic-UI, is vital for giving users opportunities to intervene when necessary, while also helping to avoid approval fatigue."
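The stop-and-ask behavior described above can be pictured as an approval gate around the agent's action loop. The sketch below is a minimal illustration under stated assumptions, not MagenticLite's or Magentic-UI's actual API: the Action and GatedRunner names, the irreversible flag, and the approve callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    """A proposed browser action (hypothetical structure)."""
    kind: str          # e.g. "click", "type", "submit_payment"
    target: str
    irreversible: bool = False


@dataclass
class GatedRunner:
    """Logs every action and asks the user before irreversible ones."""
    approve: Callable[[Action], bool]   # user approval callback (assumed)
    log: list[str] = field(default_factory=list)
    halted: bool = False

    def halt(self) -> None:
        """User-initiated stop; no further actions are executed."""
        self.halted = True

    def run(self, actions: list[Action]) -> list[Action]:
        executed = []
        for action in actions:
            if self.halted:
                self.log.append(f"halted before {action.kind} {action.target}")
                break
            if action.irreversible and not self.approve(action):
                self.log.append(f"declined {action.kind} {action.target}")
                break
            self.log.append(f"executed {action.kind} {action.target}")
            executed.append(action)  # a real agent would drive the browser here
        return executed


plan = [
    Action("click", "Add to cart"),
    Action("type", "Shipping address"),
    Action("submit_payment", "Place order", irreversible=True),
]
runner = GatedRunner(approve=lambda a: False)  # user declines every request
done = runner.run(plan)
print([a.kind for a in done])
print(runner.log)
```

Asking only at irreversible steps, rather than before every click, is one way to keep the number of prompts low, which relates to the approval fatigue concern raised in the quote.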

Future Expansion

Microsoft stated plans to expand Fara1.5 beyond the browser into desktop and enterprise software applications.
