CyberTimes

OpenAI Codex Security: AI Agent Scanned 1.2 Million Code Commits and Found 10,561 High-Severity Vulnerabilities

Severity: 🟠 HIGH
CVSS Score: 8.5/10
Exploited: Yes (active)
Fix Status: Check required
Affected: Developers, security teams, and organizations using open-source software including GnuPG, GnuTLS, GOGS, PHP, Chromium, libssh, and Thorium

OpenAI has launched Codex Security, an AI-powered security agent that automatically scans code repositories for vulnerabilities, validates them in a sandbox, and proposes fixes, at a scale that was not previously practical. In its first 30 days of beta testing, Codex Security scanned over 1.2 million code commits and found 792 critical vulnerabilities and 10,561 high-severity issues across major open-source projects that millions of people depend on daily. The discoveries include serious flaws in GnuPG, the encryption tool used worldwide, as well as in PHP, Chromium, and others. Some of these vulnerabilities are already being actively exploited. This is both a significant security story and a landmark moment in how AI is changing the security industry.


Affected products

  • GnuPG (CVE-2026-24881: CVSS 9.8, stack buffer overflow)
  • GnuTLS (CVE-2025-32988, CVE-2025-32989)
  • GOGS (CVE-2025-8110: path traversal, active exploitation)
  • PHP
  • Chromium / Thorium browser

How to Fix

Step-by-step remediation

For regular users: the most important action is keeping software updated. On Windows, ensure Chrome or Edge is on the latest version. On Linux systems, run your package manager's update command to pull in patched versions of GnuPG, GnuTLS, and libssh.

For developers and system administrators: audit your dependencies immediately. If your project uses any of the affected libraries, check for patched versions and update your dependency files.

For organizations running GOGS for internal Git hosting: this is urgent. CVE-2025-8110 is being actively exploited in large-scale campaigns. If you cannot patch immediately, restrict access to your GOGS instance to internal networks only and disable open registration.

For security teams: use the CVE numbers listed above to check your vulnerability management tools and prioritize accordingly. GnuPG CVE-2026-24881, with its CVSS 9.8 score and available PoC code, should be treated as critical priority.
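As a rough illustration of the prioritization advice above, the sketch below sorts the CVEs named in this article so that actively exploited issues come first, then those with public PoC code, then by CVSS score. The ordering rule and the `Finding` structure are assumptions made for this example, not part of any vendor tool. Scores the article does not give are left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Finding:
    cve: str
    product: str
    cvss: Optional[float]     # None where the article gives no score
    actively_exploited: bool  # False means "not reported in the article"
    public_poc: bool          # False means "not reported in the article"


# Values as stated in the article; nothing here is independently verified.
FINDINGS = [
    Finding("CVE-2025-32988", "GnuTLS", None, False, False),
    Finding("CVE-2026-24881", "GnuPG", 9.8, False, True),
    Finding("CVE-2025-32989", "GnuTLS", None, False, False),
    Finding("CVE-2025-8110", "GOGS", None, True, False),
]


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Order findings: exploited first, then public PoC, then highest CVSS."""
    def key(f: Finding):
        # False sorts before True, so negate the flags we want first.
        return (not f.actively_exploited, not f.public_poc, -(f.cvss or 0.0))
    return sorted(findings, key=key)


if __name__ == "__main__":
    for f in prioritize(FINDINGS):
        print(f.cve, f.product, f.cvss)
```

A real triage process would also weigh exposure (is the service internet-facing?) and whether the affected component is actually in use.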


What happened

Codex Security works in three stages. First, it analyzes a code repository and builds a threat model, essentially mapping out where the system is most exposed. Then it searches for vulnerabilities using that context, which allows it to find complex issues that simple automated scanners miss. Finally, it validates every finding in a sandboxed environment before reporting it, which reduces false positives. During beta testing, false positive rates dropped by more than 50% and over-reported severity findings fell by 90%, so a finding flagged by Codex Security is considerably more likely to be real and correctly rated.

The vulnerabilities it found are not theoretical: 14 of them have been officially assigned CVE numbers. The most critical is CVE-2026-24881 in GnuPG, a stack-based buffer overflow in the gpg-agent component rated CVSS 9.8, with public proof-of-concept exploit code already available. GOGS, a popular self-hosted Git service, has vulnerabilities under active mass exploitation right now.
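The three-stage flow described above can be pictured as a small pipeline in which only candidates that pass a validation step are reported. This is a toy sketch: every function and name below is hypothetical and says nothing about how Codex Security is actually implemented. The "threat model", "search", and "sandbox" steps are simple stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Candidate:
    location: str
    description: str


def run_pipeline(
    files: Dict[str, str],
    build_threat_model: Callable[[Dict[str, str]], Set[str]],
    find_candidates: Callable[[Dict[str, str], Set[str]], List[Candidate]],
    validate: Callable[[Candidate], bool],
) -> List[Candidate]:
    """Stage 1: model exposure. Stage 2: search. Stage 3: keep validated hits."""
    exposed = build_threat_model(files)
    candidates = find_candidates(files, exposed)
    return [c for c in candidates if validate(c)]


def toy_threat_model(files: Dict[str, str]) -> Set[str]:
    # Stand-in: treat files that read network input as the exposed surface.
    return {path for path, text in files.items() if "recv(" in text}


def toy_find_candidates(files: Dict[str, str], exposed: Set[str]) -> List[Candidate]:
    # Stand-in: flag unbounded copies, but only in exposed files.
    return [
        Candidate(path, "unbounded copy of untrusted input")
        for path in sorted(exposed)
        if "strcpy(" in files[path]
    ]


# Hypothetical repository contents, for illustration only.
files = {
    "net/server.c": "recv(fd, buf); strcpy(dst, buf);",
    "net/echo.c": "recv(fd, buf); strcpy(out, buf);",
    "util/math.c": "strcpy(a, b);",
}
# Stand-in for sandbox results: pretend only one candidate reproduced.
confirmed = {"net/server.c"}

reported = run_pipeline(
    files, toy_threat_model, toy_find_candidates, lambda c: c.location in confirmed
)
```

Here `util/math.c` is never considered because it is outside the modeled exposure, and `net/echo.c` is dropped because it did not reproduce. Unvalidated candidates are filtered out before anything is reported.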

Real-World Impact

The scale of what Codex Security found in 30 days exposes a hard truth about software security: vulnerabilities are hiding in code that has been in production for years, trusted by millions of users and organizations. GnuPG is used to encrypt emails and verify software signatures across Linux systems worldwide. A critical flaw there affects not just developers but every organization running Linux servers, every person using encrypted email, and every software update process that relies on GPG signature verification. Chromium is the foundation of Google Chrome, Microsoft Edge, Brave, and dozens of other browsers, so vulnerabilities there affect billions of users. PHP powers an estimated 77% of all websites with a known server-side language. The fact that an AI agent found 10,000+ high-severity issues in 30 days across these foundational projects raises a serious question: how many more are sitting undiscovered in the rest of the open-source ecosystem?

Technical Details

CVE-2026-24881 is a stack-based buffer overflow in gpg-agent, triggered when handling CMS EnvelopedData messages with oversized wrapped session keys. It is exploitable for denial of service and potentially remote code execution. CVE-2025-8110 in GOGS involves improper handling of symbolic links in the PutContents API, which lets attackers write files outside the intended directories and achieve remote code execution.

Codex Security was previously known internally as Aardvark and was first unveiled in private beta in October 2025. The tool is now in research preview for ChatGPT Pro, Enterprise, Business, and Edu customers at no cost for the first month. Notably, this launch comes weeks after Anthropic released Claude Code Security with a similar purpose; the AI-powered code security space is rapidly becoming competitive.
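To make the symbolic-link class of bug concrete, here is a minimal Python sketch of the usual defence: resolve symlinks first, then confirm the resolved path is still inside the allowed directory. This is not GOGS code (GOGS is written in Go) and does not reproduce its fix. The function name `resolve_inside` is made up for this example.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Return the real path for user_path, or raise if it escapes base_dir.

    os.path.realpath follows symbolic links, so a link inside base_dir that
    points elsewhere is detected after resolution rather than trusted by name.
    """
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # commonpath equals base only when target is base itself or lies below it.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return target
```

For example, `resolve_inside(repo_root, "docs/readme.md")` returns a path under `repo_root`, while `"../outside.txt"` or a path through a symlink that points outside the repository raises `ValueError`. A check like this is still subject to races if the filesystem changes between the check and the write, so it reduces the risk rather than eliminating it.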

"It builds deep context about your project to identify complex vulnerabilities that other agentic tools miss, surfacing higher-confidence findings with fixes that meaningfully improve the security of your system while sparing you from the noise of insignificant bugs."


🛡️ Prevention Tips

This story is a reminder that open-source dependencies are a major attack surface that many organizations underestimate. Implement software composition analysis tools in your development pipeline to automatically flag when dependencies have known vulnerabilities. Subscribe to security advisories for the open-source software your organization relies on. Consider implementing Codex Security or similar AI-powered scanning tools in your own development workflow — OpenAI is offering free access for the first month and has a separate program for open-source maintainers. Treat open-source library updates with the same urgency as operating system patches.
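As a very small illustration of what a software composition analysis check does, the sketch below scans a dependency list for the project names mentioned in this article. The manifest format (one `name version` pair per line) and the sample entries are invented for the example. Real SCA tools match exact versions against advisory databases, which this sketch does not attempt.

```python
from typing import Iterable, List

# Project names taken from this article's "affected" list, lower-cased.
AFFECTED = {"gnupg", "gnutls", "gogs", "php", "chromium", "libssh", "thorium"}


def flag_dependencies(manifest_lines: Iterable[str]) -> List[str]:
    """Return names from the manifest that appear in AFFECTED, in file order."""
    flagged = []
    for line in manifest_lines:
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not entry:
            continue
        name = entry.split()[0].lower()
        if name in AFFECTED:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Hypothetical manifest; the version numbers are placeholders.
    sample = ["openssl 3.0.13", "GnuTLS 3.8.3", "# build tools", "libssh 0.10.6"]
    print(flag_dependencies(sample))
```

A name match only tells you where to look. Whether a given installation is vulnerable depends on the version and on the vendor's advisory.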


FAQs

Does this mean open-source software is unsafe to use?

No — the fact that these vulnerabilities were found and are being fixed is a sign the system is working. Open-source software gets more security scrutiny than most proprietary software. The concern is the scale of what was found, which highlights how important automated scanning tools are becoming.


Who is at risk from CVE-2026-24881 in GnuPG?

Anyone using GnuPG for email encryption or software signature verification on a vulnerable version. Linux users in particular should update immediately. Check your GnuPG version with gpg --version and compare against the patched release on gnupg.org.
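The version check can be scripted. The sketch below parses the first line of `gpg --version` output and compares it with a patched version that you supply. This article does not state which release contains the fix, so the comparison value must come from the advisory on gnupg.org. The function names are made up for this example.

```python
import re
import subprocess
from typing import Tuple

Version = Tuple[int, ...]


def parse_gpg_version(output: str) -> Version:
    """Extract the version from a line such as 'gpg (GnuPG) 2.4.4'."""
    match = re.search(r"^gpg \(GnuPG[^)]*\) (\d+(?:\.\d+)*)", output, re.MULTILINE)
    if match is None:
        raise ValueError("could not find a GnuPG version line")
    return tuple(int(part) for part in match.group(1).split("."))


def needs_update(installed: Version, patched: Version) -> bool:
    """True when the installed version is older than the patched one."""
    return installed < patched


def installed_gpg_version() -> Version:
    """Run 'gpg --version' and parse its output (requires gpg on PATH)."""
    result = subprocess.run(
        ["gpg", "--version"], capture_output=True, text=True, check=True
    )
    return parse_gpg_version(result.stdout)
```

Usage would look like `needs_update(installed_gpg_version(), patched)`, where `patched` is a tuple such as `(2, 4, 5)`. That tuple is a placeholder, not the real fixed version. Distribution packages sometimes backport fixes without changing the upstream version number, so also check your distribution's security tracker.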

