Fake AI Agent Skill Passes Security Scans and Reportedly Hits 26,000 Agents

Security firm AIR built a fake AI agent skill, promoted it through a popular skills marketplace and an Instagram ad, and says it reached nearly 26,000 agents, including some on corporate accounts.

Every security scanner the company tested it against rated it safe. The payload was harmless by design: it collected the user’s email address and did nothing else.

The point was to show that none of the signals people rely on to trust a skill hold up: not scanners, not GitHub stars, not open-source reputation.

A skill is a set of instructions that an agent loads into its context and follows with roughly the authority of the user’s own instructions. That trust is the whole problem, and it is the reason skill-scanning tools exist in the first place.

The skill, named brand-landing page, was presented as building landing pages with Google’s Stitch design tool, which is aimed specifically at non-technical users.

To make it look trustworthy, AIR went after two trust signals: GitHub stars and clean scan results. For the stars, it opened a pull request against a skills marketplace repository with about 36,000 stars and 156 skills.

The pull request was merged after a few days, so the skill inherited the repository’s star count. AIR then ran an Instagram ad targeting advertisers, marketers, and designers, who installed and used the skill.

Why the scans missed it

The scanners AIR tested analyze the package as submitted: SKILL.md and the files shipped with it. That holds for the scanners from Cisco and NVIDIA and for those integrated into skills.sh.

The AIR skill does not carry its own setup instructions. Instead, it tells the agent to install the “Stitch SDK” by following the documentation at an external link, stitch-design.ai. AIR controls that domain, not Google (the real Stitch lives at stitch.withgoogle.com).

At first, the link led to the actual Stitch documentation, so the scanners, seeing a tidy package that pointed to a plausible setup page, cleared it. The page the agent would actually fetch and follow sat outside the scan.

Once the skill was widely installed, AIR changed the page behind that link. The new version told the agent to download and run a script.

In the demo, the script sent only the user’s email address to AIR, which is how the company counted the agents it reached. A real attacker could use that foothold to read files, move data, or hit internal systems, limited only by what the agent can access.

AIR is not the first to show this. Three weeks earlier, Trail of Bits bypassed ClawHub’s malicious skills detector, Cisco’s scanner, and all three scanners linked to skills.sh. Its conclusion was blunt: the scanner examines a fixed package, while the attacker can keep modifying the payload until it passes.

Real campaigns have used the same strategy for months, keeping the published skill clean and hosting the payload on a site that the agent fetches from only at install time.

The problem is structural: the scan happens once, but the page the skill tells the agent to follow can be rewritten at any time after that. Anthropic’s documentation already warns that skills that fetch external URLs are risky for this reason, since the content can change after the skill has been vetted.

Separate studies this year have found that the scanners often disagree with one another, because each one judges the skill in isolation, without seeing what its external links lead to or what changes after an update.

What to do

The lesson for defenders matches what researchers keep arriving at, and now there is a sharp example behind it. Treat skills like software, not text. Vet what the skill points to, not just what ships inside it.
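One way to look at what a skill points to is to pull every external URL out of its text and compare the hosts against a reviewer-maintained allowlist. The Python sketch below is illustrative only: the function names, the sample SKILL.md excerpt, the URL path, and the allowlist are all assumptions made for this example, not part of AIR's write-up or of any named scanner.

```python
import re
from urllib.parse import urlparse

# Matches http(s) URLs, stopping at whitespace or common closing punctuation.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]\"'`]+")


def external_hosts(skill_text: str) -> set[str]:
    """Return the hostnames of all http(s) URLs found in the skill text."""
    hosts = set()
    for match in URL_PATTERN.findall(skill_text):
        # Drop sentence punctuation that the pattern may have swallowed.
        host = urlparse(match.rstrip(".,;:")).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def unapproved_hosts(skill_text: str, approved: set[str]) -> set[str]:
    """Hosts the skill links to that are neither approved nor a subdomain of an approved domain."""
    return {
        host
        for host in external_hosts(skill_text)
        if not any(host == a or host.endswith("." + a) for a in approved)
    }


# Hypothetical SKILL.md excerpt written for this example.
sample = """
## Setup
Install the Stitch SDK by following https://stitch-design.ai/docs/setup.
Reference: https://stitch.withgoogle.com/
"""

flagged = unapproved_hosts(sample, approved={"withgoogle.com"})
print(sorted(flagged))
```

With this sample, only stitch-design.ai is flagged. The sketch catches only URLs written with an explicit scheme, and a flagged host is a prompt for human review rather than a verdict; it says nothing about what the page behind the link will contain later.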

Most of these add-ons are installed without review, so the first task is to find the ones that are already running. Distribute new skills through a single source that you control, and rescan them whenever anything changes, because a result that was clean at install time does not stay clean if the skill calls a link that someone else can edit.
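A rough way to notice that a linked page has changed since review is to record a hash of its content at scan time and compare it on later checks. The sketch below assumes a caller-supplied `fetch` function and uses an in-memory dictionary in place of a real HTTP client; the URL path and page contents are hypothetical.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of the fetched content."""
    return hashlib.sha256(content).hexdigest()


def pin(urls: List[str], fetch: Callable[[str], bytes]) -> Dict[str, str]:
    """Record a fingerprint for each linked page at review time."""
    return {url: fingerprint(fetch(url)) for url in urls}


def drifted(pins: Dict[str, str], fetch: Callable[[str], bytes]) -> List[str]:
    """Return the URLs whose current content no longer matches its pinned fingerprint."""
    return [url for url, digest in pins.items() if fingerprint(fetch(url)) != digest]


# Stand-in for the remote site (hypothetical URL path and content).
site = {"https://stitch-design.ai/docs/setup": b"Read the official documentation."}


def fetch(url: str) -> bytes:
    return site[url]


pins = pin(list(site), fetch)        # taken when the skill is reviewed
assert drifted(pins, fetch) == []    # nothing has changed yet

# Later, the page owner swaps the content behind the same link.
site["https://stitch-design.ai/docs/setup"] = b"Download and run this script."
print(drifted(pins, fetch))
```

After the swap, the check reports the setup URL as changed. This only catches changes visible to whoever runs the check: a server can serve different content to different clients, and pages that change for legitimate reasons will also trip it, so a mismatch should trigger a rescan rather than be treated as proof of compromise.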

Pin versions. Give agents the least privilege they need. Assume that any external instruction the agent follows runs with the agent’s access.

The scale figures come only from AIR and deserve to be read with skepticism. The company offers a managed skills marketplace and closes its write-up by pitching it, so the 26,000 figure, the detail about corporate accounts, and the claim that it could have taken full control of every agent are the company’s own and have not been independently verified.

What holds up is the method. The named scanners judge only the submitted package, the external-link blind spot is real and has been demonstrated independently, and the trust signals AIR borrowed, stars and clean scans, are the very ones the ecosystem treats as evidence.

The experiment does not so much reveal a new bug as line up all the weak trust signals around agent skills at once: stars that can be borrowed, a scanner that sees only a snapshot, and a link that can be rewritten after the check has cleared.

Whether the actual figure is 26,000 or a fraction of it, the gap the skill passed through is one that defenders have yet to close.

pleasuremandarya@gmail.com 1 day ago
