OpenAI Codex Security Scans 1.2 Million Commitments and 10,561 Major Issues Found

OpenAI on Friday released Codex Security, an artificial intelligence (AI)-powered security agent designed to detect, verify, and recommend fixes for vulnerabilities.
The feature is available as a research preview for ChatGPT Pro, Enterprise, Business, and Edu customers, with Codex Web free for the next month.
“It builds deep context around your project to identify complex vulnerabilities that other agent tools miss, revealing high-confidence findings and fixes that meaningfully improve your system’s security while keeping you out of the noise of trivial bugs,” the company said.
Codex Security represents the evolution of Aardvark, which OpenAI unveiled in private beta in October 2025 as a way for developers and security teams to find and fix security vulnerabilities at scale.
Over the past 30 days of the beta period, Codex Security scanned more than 1.2 million commits across external repositories, identifying 792 critical findings and 10,561 high-severity findings. These include vulnerabilities in various open source projects such as OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium, among others. Some of them are listed below –
- GnuPG – CVE-2026-24881, CVE-2026-24882
- GnuTLS – CVE-2025-32988, CVE-2025-32989
- GOGS – CVE-2025-64175, CVE-2026-25242
- Thorium – CVE-2025-35430, CVE-2025-35431, CVE-2025-35432, CVE-2025-35433, CVE-2025-35434, CVE-2025-35435
According to the AI company, the latest iteration of the security agent combines the reasoning capabilities of its frontier models with automatic verification to reduce false positives and deliver actionable fixes.
OpenAI's repeated scans of the same codebases over time showed increasing accuracy and decreasing false positive rates, with false positives dropping by more than 50 percent.
In a statement shared with The Hacker News, OpenAI said Codex Security is designed to improve signal-to-noise by detecting vulnerabilities in the context of a system and validating findings before exposing them to users.
Specifically, the agent works in three steps: first, it analyzes the codebase to build an understanding of the project's architecture and security posture, and generates a threat model that captures what the system does and where it is most exposed.
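OpenAI has not published the internal structure of these threat models, but the idea of capturing "what the system does and where it is most exposed" can be illustrated as structured data. The following is a minimal, hypothetical sketch; the names (`ThreatModel`, `Surface`, the exposure labels) are illustrative and not part of any OpenAI API.

```python
from dataclasses import dataclass, field

@dataclass
class Surface:
    """One attack surface of the project (hypothetical model)."""
    name: str       # e.g. an HTTP endpoint or a parser entry point
    exposure: str   # "external", "authenticated", or "internal"

@dataclass
class ThreatModel:
    """A threat model: what the system does and where it is exposed."""
    project: str
    purpose: str
    surfaces: list[Surface] = field(default_factory=list)

    def most_exposed(self) -> list[Surface]:
        # Rank externally reachable surfaces first, since findings
        # there are most likely to have real-world impact.
        order = {"external": 0, "authenticated": 1, "internal": 2}
        return sorted(self.surfaces, key=lambda s: order[s.exposure])
```

A scanner built around such a model could then prioritize its search, e.g. `ThreatModel("demo", "ssh server", [...]).most_exposed()` would put an unauthenticated handshake handler ahead of an internal config parser.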
Once this threat model is in place, Codex Security uses it as a basis for identifying vulnerabilities and classifying findings by their real-world impact. Flagged issues are then stress-tested in a sandboxed environment for verification.
“When Codex Security is configured with an environment that fits your project, it can verify potential problems directly in the context of a running system,” OpenAI said. “That deep validation can reduce false positives even further and produce actionable evidence, giving security teams stronger confidence and a clear path to remediation.”
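The validation step described above — confirming a flagged issue before surfacing it — can be sketched in miniature. This is not OpenAI's implementation; it is a hedged illustration of the general pattern, in which a proof-of-concept is executed in an isolated subprocess and the finding is kept only if the failure actually reproduces.

```python
import subprocess
import sys

def verify_finding(poc_code: str, timeout: int = 5) -> bool:
    """Run a proof-of-concept in an isolated subprocess (sketch only).

    Convention for this sketch: the PoC exits non-zero when it
    reproduces the suspected vulnerability (e.g. a crash or
    assertion failure). A hang is treated as unverified.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", poc_code],
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # inconclusive: do not surface as a finding
    return result.returncode != 0
```

Under this convention, a PoC that crashes (`verify_finding("raise SystemExit(1)")`) confirms the finding, while a benign run (`verify_finding("pass")`) discards it — which is the signal-to-noise filtering the article attributes to Codex Security's sandbox stage.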
In the final stage, the agent proposes fixes that preserve the system's behavior, reducing the risk of regressions and making the patches easier to review and apply.
The Codex Security news comes just weeks after Anthropic launched Claude Code Security, which helps users scan software codebases for vulnerabilities and propose patches.



