Microsoft Releases Fara1.5: A Family of Browser Computer Use Agents (4B/9B/27B) That Outperform OpenAI Operator and Gemini 2.5 Computer Use on Online-Mind2Web

Microsoft Research’s AI Frontiers lab has released Fara1.5, a family of computer use agent (CUA) models for the browser. The release ships in three sizes: Fara1.5-4B, Fara1.5-9B, and Fara1.5-27B. The models are integrated with MagenticLite, Microsoft’s sandboxed browser interface for these agents.
Computer use agents are pixel-to-action models that drive a real browser. They read screenshots and issue mouse and keyboard actions to complete tasks. Recent agent products such as OpenAI’s Operator and Google’s Gemini 2.5 Computer Use fall into this category.
Fara1.5-27B reaches 72% task success on Online-Mind2Web, a benchmark of 300 tasks across 136 popular websites. On the same benchmark, OpenAI Operator scores 58.3% and Gemini 2.5 Computer Use scores 57.3%. Yutori’s Navigator n1 reaches 64.7%, while Fara1.5-9B reaches 63.4%, almost double the 34.1% of its predecessor, Fara-7B.

Architecture and the agent loop
The models are built on the Qwen3.5 base checkpoints in their 4B, 9B, and 27B variants. They operate in an observe-think-act loop. At each step, the model receives the prior conversation history and the three most recent browser screenshots. It then outputs its reasoning and a single next action.
The action space includes standard mouse and keyboard input as well as web-specific actions such as web search. It also exposes meta-actions for context management and user interaction: memorizing facts for later use and asking the user clarifying questions. These meta-actions let the agent work over long horizons and collaborate with users.
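The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Microsoft's implementation: the names `Step`, `AgentState`, `run_episode`, `policy`, and `browser_step` are hypothetical, and the action strings are placeholders for the real action space. Only the three-screenshot window and the single action per step come from the article.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List

MAX_SCREENSHOTS = 3  # the article says the three most recent screenshots are kept


@dataclass
class Step:
    thought: str        # the model's reasoning for this step
    action: str         # hypothetical labels: "click", "memorize", "ask_user", "stop", ...
    argument: str = ""


@dataclass
class AgentState:
    history: List[Step] = field(default_factory=list)
    # deque(maxlen=3) silently drops the oldest screenshot when a fourth arrives
    screenshots: Deque[bytes] = field(
        default_factory=lambda: deque(maxlen=MAX_SCREENSHOTS)
    )
    memory: List[str] = field(default_factory=list)


def run_episode(
    policy: Callable[[List[Step], List[bytes]], Step],
    browser_step: Callable[[Step], bytes],
    first_screenshot: bytes,
    max_steps: int = 20,
) -> AgentState:
    """Run one observe-think-act episode with a caller-supplied policy and browser."""
    state = AgentState()
    state.screenshots.append(first_screenshot)
    for _ in range(max_steps):
        # Observe and think: the policy sees the history and the recent screenshots
        step = policy(state.history, list(state.screenshots))
        state.history.append(step)
        if step.action == "memorize":
            # Meta-action: store a fact without touching the browser
            state.memory.append(step.argument)
            continue
        if step.action in ("ask_user", "stop"):
            # Simplification: a real agent would resume after the user replies
            break
        # Act: execute in the browser and record the resulting screenshot
        state.screenshots.append(browser_step(step))
    return state
```

In this sketch the model and the browser are injected as plain callables, so the loop can be exercised with a scripted policy before any real model or browser is attached.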
Training mix
Training uses supervised fine-tuning on nearly two million samples. The mix is 60% web trajectories and 12.8% synthetic environments. Form filling and user interaction make up 12.5%. Grounding contributes 8.8% and VQA 4.9%. Smaller subsets cover GUI drag-and-drop, instruction following, and safety. The loss is applied only to the last three turns of each trajectory.
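As a quick check on the arithmetic, the five listed shares sum to 99%, which leaves about 1% that presumably corresponds to the smaller subsets. The sketch below also shows one way a "last three turns" loss mask could be built. The dictionary keys are paraphrased labels, and `remainder_percent` and `loss_mask` are hypothetical helpers, not part of any released training code.

```python
# Shares as reported in the article; the key names are paraphrased labels
MIX_PERCENT = {
    "web_trajectories": 60.0,
    "synthetic_environments": 12.8,
    "form_filling_and_user_interaction": 12.5,
    "grounding": 8.8,
    "vqa": 4.9,
}


def remainder_percent(mix: dict) -> float:
    """Share not covered by the listed categories, rounded to one decimal."""
    return round(100.0 - sum(mix.values()), 1)


def loss_mask(num_turns: int, supervised_turns: int = 3) -> list:
    """True for turns that contribute to the loss (only the final few turns)."""
    first_supervised = max(0, num_turns - supervised_turns)
    return [i >= first_supervised for i in range(num_turns)]


print(remainder_percent(MIX_PERCENT))  # share left over for the smaller subsets
print(loss_mask(5))                    # earlier turns are context only
```

Masking earlier turns means they still serve as input context, but only the final turns produce gradient signal.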


FaraGen1.5: synthetic data pipeline
FaraGen1.5 is the synthetic data pipeline that generated the training trajectories. It has three modular components: environments, solvers, and verifiers.
Environments fall into two types. Open-web tasks run on live websites that do not require a login. Gated-domain tasks require authentication or perform irreversible actions, such as sending email.
For gated domains, the team built six synthetic clones called FaraEnvs: Mail, Calendar, Broadcast, ML, Stay, and Schedule. Each clone has a virtual frontend, a fully functional API, and a database with human-based seed data.
These sites were built with the GitHub Copilot CLI and iterative human refinement. Because the team controls the full stack, it knows the correct outcome of every task. For tasks that modify the backend, an LLM judge compares snapshots of the database taken before and after execution. Read-only tasks are checked against precomputed reference answers.
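To make the snapshot comparison concrete, here is a minimal sketch of a deterministic diff between two database snapshots. In the article the comparison is done by an LLM judge; a diff like this is only an assumed preprocessing step that such a judge could be shown. The snapshot layout (table name, then primary key, then row) and the function name `diff_snapshots` are hypothetical.

```python
from typing import Any, Dict

Row = Dict[str, Any]
Snapshot = Dict[str, Dict[int, Row]]  # table name -> primary key -> row


def diff_snapshots(before: Snapshot, after: Snapshot) -> Dict[str, Dict[str, Any]]:
    """Report rows added, removed, or updated in each table between two snapshots."""
    changes: Dict[str, Dict[str, Any]] = {}
    for table in sorted(set(before) | set(after)):
        old = before.get(table, {})
        new = after.get(table, {})
        added = {k: new[k] for k in new.keys() - old.keys()}
        removed = {k: old[k] for k in old.keys() - new.keys()}
        updated = {
            k: (old[k], new[k])
            for k in old.keys() & new.keys()
            if old[k] != new[k]
        }
        # Tables with no changes are omitted from the report
        if added or removed or updated:
            changes[table] = {"added": added, "removed": removed, "updated": updated}
    return changes
```

An empty result means the task left the backend untouched, which is itself informative for a task that was supposed to send an email or create an event.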
The solver agent uses OpenAI’s GPT-5.4 with custom tools that mirror the Fara1.5 action space. The solver scores 83% on Online-Mind2Web using the default WebJudge. The previous solver, used for Fara-7B, scored 67% on the same benchmark. A user simulator is invoked when the solver issues an ask_user call or when it finishes the task.
Three verifiers gate which trajectories enter training. Correctness uses LLM-generated rubrics for open-web tasks and dedicated judges for gated-domain tasks. Efficiency penalizes redundant or unnecessary actions. User-interaction verification checks whether the agent stopped at critical points.
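The gating step can be pictured as a simple filter over candidate trajectories. The sketch below assumes each of the three checks has already been reduced to a pass/fail flag, which is a simplification: the article says the efficiency check penalizes extra actions and does not say it is binary. The names `Verdict`, `passes_all_gates`, and `filter_trajectories` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Verdict:
    correct: bool                     # task-success check passed
    efficient: bool                   # no redundant or unnecessary actions flagged
    stopped_at_critical_points: bool  # agent paused where it should have


def passes_all_gates(verdict: Verdict) -> bool:
    """A trajectory is kept only if every check passes (assumed AND-combination)."""
    return (
        verdict.correct
        and verdict.efficient
        and verdict.stopped_at_critical_points
    )


def filter_trajectories(candidates: List[Tuple[str, Verdict]]) -> List[str]:
    """Return the ids of trajectories that clear all three gates."""
    return [traj_id for traj_id, verdict in candidates if passes_all_gates(verdict)]
```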
Critical points and safety
Fara1.5 is trained to stop and ask the user in three situations. First, the task requires personal information that the user has not provided. Second, the task description is ambiguous or lacks details needed to act. Third, an irreversible action is about to be taken without prior consent.
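The three situations can be written down as an explicit decision rule. This is only an illustration of the policy: in Fara1.5 the behavior is learned through training, not hard-coded, and the field names and reason strings below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingAction:
    needs_personal_info: bool  # the step requires personal data
    info_provided: bool        # the user already supplied that data
    task_is_ambiguous: bool    # the task description is unclear or incomplete
    irreversible: bool         # the action cannot be undone (e.g. sending email)
    user_consented: bool       # the user approved this action in advance


def reason_to_ask_user(action: PendingAction) -> Optional[str]:
    """Return why the agent should pause, or None if it may proceed."""
    if action.needs_personal_info and not action.info_provided:
        return "missing_personal_info"
    if action.task_is_ambiguous:
        return "ambiguous_task"
    if action.irreversible and not action.user_consented:
        return "irreversible_without_consent"
    return None
```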
Safety training uses public safety datasets and internal tasks aligned with Microsoft’s Responsible AI Policy. Within MagenticLite, all agent actions are logged and auditable. The sandboxed browser also acts as a security boundary between the agent and the user’s machine.
Other benchmarks
On WebVoyager, Fara1.5-27B scores 88.6%, the 9B scores 86.6%, and the 4B scores 80.8%. The 9B also outperforms similarly sized peers such as MolmoWeb 8B, GUI-Owl-1.5 8B, and Holo2 8B. All Fara1.5 evaluations use Browserbase to stabilize sessions and reduce session-level blocking. Numbers are averaged over three independent runs.
On WebTailBench v1.5, which targets long-tail web tasks, Fara1.5-9B achieves a process score of 64.5% and an outcome score of 32.3%. GPT-5.4 achieves 79.6% on process and 57.4% on outcome on the same benchmark.
Key Takeaways
Here are the key takeaways in brief:
- Microsoft Research has released Fara1.5, a family of browser computer use agents in 4B, 9B, and 27B sizes built on Qwen3.5.
- Fara1.5-27B scores 72% on Online-Mind2Web, beating OpenAI Operator (58.3%), Gemini 2.5 Computer Use (57.3%), and Yutori Navigator n1 (64.7%).
- The FaraGen1.5 synthetic data pipeline enables training in gated domains using six application clones (FaraEnvs) built with the GitHub Copilot CLI.
- Fara1.5 pauses to ask the user at critical points: missing personal information, ambiguous tasks, or irreversible actions that lack prior consent.



