Showing Posts About

Prompt injection

Fine-Tuning LLMs to Resist Indirect Prompt Injection Attacks

A fine-tuning approach was developed to enhance Llama3-8B's resistance to indirect prompt injection attacks. The method uses data delimiters in the system prompt to help the model ignore malicious instructions within user-provided content. The fine-tuned model achieved a 100% pass rate in resisting tested prompt injection attacks. The model and training scripts have been publicly released.