An attack class that passes every current LLM filter

What: Researchers identify a new attack method that bypasses LLM filters
Impact: AI systems may be vulnerable to subtle manipulation through context

Postural Manipulation and the Hidden Threat in Agentic AI

AG Davidson · Shaping Rooms LLC · March 30, 2026

Here is what we found, in plain terms.

Large language models don't just process instructions. They absorb the stance of everything that came before. A short phrase buried in an ordinary document (not flagged, not noticed, not much different from what you'd write in an email on any given day) can change how the model reasons about a decision several turns later. No alert fires. Nothing looks wrong. The answer still looks justified.

In tested conditions, that was enough to flip a yes to a no on a consequential decision. The model didn't know. The logs didn't show it.

The threat isn't a crafted weapon. It's the weather already inside your pipelines: organizational documents, retrieved content, agent handoff summaries, platform memory nobody audited.

The fix, if one exists, is architectural. You cannot filter ordinary language. You have to redesign how systems handle context.

The full research is below.

The Atmosphere Attack is the full empirical record. The paper is for security researchers, engineers, and anyone building or auditing agentic systems. Two versions are available: the full security paper (empirical record, propagation findings, defensive architecture, locked rubric), filed with OWASP as a proposed new attack class, and a citable preprint version for academic reference.

The work was published as a coordinated three-paper disclosure on March 30, 2026. Frontier AI labs, CERT/CC, and OWASP were notified on March 23, 2026, before any public release. The three papers are:

The full security paper. Empirical record, propagation findings, defensive architecture, locked rubric.

The design side of the same mechanism. How to use posture deliberately and productively. Taxonomy of primers, demonstrations, positive substitution technique.

Architecture-specific susceptibility profiles, platform-layer injection findings, compound attack hypothesis. Available to vendors and CERT/CC.
The fastest way to understand this is to try it. There are three demonstrations that need no setup and no tools, just a frontier LLM and a fresh session. Or try the original 60-second demo: shapingrooms.com/posture

Postural Manipulation: How Semantically Benign Context Changes What an LLM Is Before It Acts was filed with OWASP and submitted to SSRN on March 19, 2026. It is the existence proof: formal definition, two-surface threat model, reproducible protocol, 12 initial captures across four models.
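For readers who want to picture the kind of paired comparison the demonstrations imply, here is a minimal Python sketch. It is not the paper's protocol. The function names, the example documents, the primer phrase, and the toy_model stand-in are all hypothetical, invented for illustration. The toy model is a keyword rule, not an LLM, so it shows only the shape of the test: same question, same documents, one extra ordinary-looking sentence.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a full prompt to a "yes"/"no" answer.
Model = Callable[[str], str]


def build_prompt(context_docs: List[str], question: str) -> str:
    """Join ordinary context documents and the decision question into one prompt."""
    return "\n\n".join(context_docs + [question])


def paired_decision_test(
    model: Model, base_docs: List[str], primer: str, question: str
) -> Dict[str, object]:
    """Ask the same question twice: once with the base context, once with the primer added."""
    baseline = model(build_prompt(base_docs, question))
    primed = model(build_prompt([primer] + base_docs, question))
    return {"baseline": baseline, "primed": primed, "flipped": baseline != primed}


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: NOT a model of real behavior).

    It answers "no" whenever the word "cautious" appears anywhere in the prompt.
    """
    return "no" if "cautious" in prompt.lower() else "yes"


result = paired_decision_test(
    toy_model,
    base_docs=["Vendor renewal summary: pricing unchanged, uptime met targets."],
    primer="We have been cautious with commitments this quarter.",  # made-up primer
    question="Should the contract be renewed? Answer yes or no.",
)
print(result)
```

To try this against a real system, toy_model would be replaced with a call to an actual model in a fresh session. Each condition would also need repeated runs, since a single pair of answers cannot separate a context effect from ordinary sampling variation.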
