Malicious Coding Agent Skills and the Risk of Dynamic Context | Datadog Security Labs

Malicious coding agent skills, particularly in Claude Code, introduce a supply chain risk where attacker-controlled instructions can be executed via dynamic context commands (using `!` syntax) that run shell commands *before* the model processes the skill, bypassing model-level defenses. The attack vector involves skills loaded from a project's `.claude/skills/` directory or cloned repositories, granting them trusted access within a developer's workspace. Organizations should mitigate this by setting `"disableSkillShellExecution": true` in managed settings, reviewing skill directories, and monitoring developer workstations for suspicious activity during agent sessions.
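As a minimal sketch of how an administrator might verify that mitigation, the Python below parses a settings document and reports whether `disableSkillShellExecution` is set to `true`. The key name comes from the article; the helper name `skill_shell_execution_disabled` and the sample JSON strings are hypothetical, and locating the managed settings file on a given platform is left to the caller.

```python
import json


def skill_shell_execution_disabled(settings_text: str) -> bool:
    """Return True only if the settings JSON sets the flag to a literal true."""
    settings = json.loads(settings_text)
    # Require a JSON object at the top level and a real boolean true;
    # a string such as "true" is treated as not set.
    if not isinstance(settings, dict):
        return False
    return settings.get("disableSkillShellExecution") is True


# Hypothetical sample documents, for illustration only.
hardened = '{"disableSkillShellExecution": true}'
default = '{}'

print(skill_shell_execution_disabled(hardened))
print(skill_shell_execution_disabled(default))
```

Treating anything other than a boolean `true` as "not set" is a deliberately conservative choice for an audit script: a mistyped value is reported as unprotected rather than silently accepted.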

Nick Frichette, Staff Security Researcher
Ryan Simon, Senior Product Detection Engineer

This is the first post in a series about attacks against coding agents. While these tools have changed how developers write, review, and ship software, they also create a new supply chain problem: Instructions shipped with a codebase can influence what happens on a developer machine.

In this post, we look at that risk through Claude Code skills. The important detail is not only that a malicious skill can ask an agent to do something dangerous. It is that dynamic context commands run before the model sees the skill at all. When that happens, model-level prompt injection defenses never get a chance to intervene.

Key points

Agentic skills package instructions and context for coding agents. They are useful for repeatable workflows, but they also create a path for attacker-controlled instructions to enter a trusted agent session.

Claude Code can load skills from managed enterprise policy, a user's personal skill directory, a project `.claude/skills/` directory, plugins, nested project folders, or added directories. A cloned repo can therefore bring skills into a trusted Claude Code session even if the developer never installed a skill from a marketplace.

Claude Code supports dynamic context with `!` syntax. These shell commands run before the rendered skill content is sent to Claude.

Organizations can disable this behavior for user, project, plugin, and additional-directory skills by setting `"disableSkillShellExecution": true` in managed settings. They should also review `.claude/skills/`, require code review for `.claude/` changes, and monitor developer workstations for suspicious processes and network connections during agent work.

Background

The concept of malicious agent skills has been a growing trend for the past several months, with threat actors creating skills that attempt to exfiltrate credentials, execute arbitrary code, and more.
Much of this coverage has been centered on OpenClaw (formerly Clawdbot, Moltbot, and Molty) and its dedicated skill registry ClawHub. Those platforms are worth examining, but the same attack pattern applies to a more common target: coding agents. Tools such as Claude Code, Cursor, and Codex can operate inside developer workspaces, read source code, run commands, and interact with command-line tools. That makes them attractive targets for threat actors.

Developers often have important credentials, such as GitHub tokens, cloud credentials, package registry access, single sign-on sessions, and access to internal repositories or production data stores. If an attacker can convince a developer to use a malicious skill, the skill may become a bridge from a trusted coding session to credential theft, source code reconnaissance, or later compromise.

How Claude Code loads skills

Before looking at malicious instructions, it helps to define what "installed" means. In Claude Code, a skill can load from several places. Anthropic's Claude Code skills documentation lists four normal locations:

| Skill source | Location | Applies to |
|---|---|---|
| Enterprise | Managed settings | All users in the organization |
| Personal | `~/.claude/skills/<skill-name>/SKILL.md` | All of a user's projects |
| Project | `.claude/skills/<skill-name>/SKILL.md` | The current project |
| Plugin | `<plugin>/skills/<skill-name>/SKILL.md` | Wherever the plugin is enabled |

There is also a precedence order; if the same skill name exists at multiple levels, enterprise overrides personal and personal overrides project. Plugin skills use a `plugin-name:skill-name` namespace, so they do not collide with the others.

The project case is the one that matters most here. A skill does not have to be something you knowingly installed from a marketplace. It can be committed to a GitHub repo, tucked under `.claude/skills/`, and loaded after you trust that workspace. Claude Code also discovers `.claude/skills/` directories inside nested folders when you work in those folders.
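Because project and nested skills arrive with the codebase, a reviewer may want to list them before trusting a workspace. The Python below is a rough sketch of such an audit, not a tool from the article: it walks a checkout for files matching the `.claude/skills/<skill-name>/SKILL.md` layout and reports any dynamic context commands they contain. The function names are hypothetical, and the regular expression assumes dynamic context is written as an exclamation mark immediately followed by a backtick-quoted command; other forms would need a different pattern.

```python
import re
from pathlib import Path

# Assumption: dynamic context looks like !`command` on a single line.
DYNAMIC_CONTEXT = re.compile(r"!`([^`\n]+)`")


def find_skill_files(root: Path) -> list[Path]:
    """Return every SKILL.md stored as .claude/skills/<skill-name>/SKILL.md under root."""
    found = []
    # rglob also descends into dot-directories, so nested packages are covered.
    for path in root.rglob("SKILL.md"):
        parts = path.parts
        if len(parts) >= 4 and parts[-4] == ".claude" and parts[-3] == "skills":
            found.append(path)
    return sorted(found)


def dynamic_commands(skill_text: str) -> list[str]:
    """Extract the commands that would run as dynamic context."""
    return DYNAMIC_CONTEXT.findall(skill_text)


def audit(root: Path) -> dict[str, list[str]]:
    """Map each skill file (relative to root) to its dynamic context commands."""
    report = {}
    for skill in find_skill_files(root):
        text = skill.read_text(encoding="utf-8", errors="replace")
        commands = dynamic_commands(text)
        if commands:
            report[skill.relative_to(root).as_posix()] = commands
    return report
```

Calling `audit(Path("."))` from a repository root returns only the skills that contain matching commands; `find_skill_files` alone gives the full inventory, which is the more useful list for code review. A scan like this is a review aid only and does not replace disabling skill shell execution in managed settings.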
In a monorepo, that means a package can bring its own skills even if the repository root looks clean.

In addition, Claude Code's `--add-dir` flag, which gives Claude access to files in directories outside your current project, normally grants only file access rather than loading configuration. The skills documentation calls out an exception, however: `.claude/skills/` inside an added directory is loaded automatically. This means a folder added for additional agent context can also carry agent behavior.

That changes the review burden. If you only check your personal skills folder, you miss the skills that came with the codebase. If you only check the repository root, you may miss a nested package skill that loads later. And if one of those skills uses dynamic context with `!`, the first risky command can run before the model has a chance to object.

How do agents respond to malicious instructions?

Possibly the most significant security challenge with AI and large language models (LLMs) is the risk of prompt injection, where an attacker is able to injec...
