Researchers have demonstrated ProAttack, a novel prompt-based backdoor attack against LLMs used for text classification, which achieves near-perfect success by poisoning only a few training samples without altering labels or using obvious trigger words. The vulnerability exploits the standard practice of prompt engineering as an attack vector, creating a nearly undetectable threat. The article does not provide specific CVSS scores, affected software versions, fixed versions, or workarounds for this research-based attack.
Prompt engineering has become a standard part of how large language models are deployed in production, and it introduces an attack surface most organizations have not yet addressed. Researchers have developed and tested a prompt-based backdoor attack method, called ProAttack, that achieves attack success rates approaching 100% on multiple text classification benchmarks without altering sample labels or injecting external trigger words. A defense paradigm for mitigating backdoor attacks through LoRA-based fine-tuning of language models (Source: … More → The post A nearly undetectable LLM attack needs only a handful of poisoned samples appeared first on Help Net Security .