Abstract
This paper explores the risk of backdoor attacks in neural planning and navigation systems. A backdoor attack embeds malicious behavior during training so that the network acts undesirably whenever specific trigger inputs are presented. We introduce formally defined backdoors that allow an attacker to specify the exact behavior the planner executes once the backdoor is triggered. For example, an attacker could hijack a drone by diverting it to the wrong destination, confining it to a specific area, or forcing it to waste energy by circling unnecessarily.
Our approach uses Signal Temporal Logic (STL) as a specification language to encode these backdoor behaviors. We adapt the specifications to the task environment at hand and inject them into neural planners. To the best of our knowledge, this is the first successful demonstration of injecting “formal specification-based programmable backdoors” into a neural planner.
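As an illustrative sketch (the trigger predicate, time bounds, and region $R_{\mathrm{trap}}$ below are hypothetical and not taken from our evaluation), a “trap-in-region” backdoor behavior can be written as an STL formula over the planned trajectory $x$:
$$\varphi_{\mathrm{trap}} \;=\; \Box_{[0,T]}\bigl(\mathrm{trigger} \;\rightarrow\; \Diamond_{[0,T']}\,\Box_{[0,\tau]}\,(x \in R_{\mathrm{trap}})\bigr),$$
read as: whenever the trigger is observed, the trajectory must eventually enter and then remain inside the attacker-chosen region $R_{\mathrm{trap}}$ for $\tau$ time units.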
We evaluate our method on a range of path-planning tasks and find that it effectively injects programmable backdoors without significantly degrading planning performance. We also test the attack against three defense approaches, highlighting the challenges of mitigating this class of attacks.