تقرير
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion
العنوان: | You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion |
---|---|
المؤلفون: | Schuster, Roei, Song, Congzheng, Tromer, Eran, Shmatikov, Vitaly |
سنة النشر: | 2020 |
المجموعة: | Computer Science |
مصطلحات موضوعية: | Computer Science - Cryptography and Security, Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Programming Languages |
الوصف: | Code autocompletion is an integral feature of modern code editors and IDEs. The latest generation of autocompleters uses neural language models, trained on public open-source code repositories, to suggest likely (not just statically feasible) completions given the current context. We demonstrate that neural code autocompleters are vulnerable to poisoning attacks. By adding a few specially-crafted files to the autocompleter's training corpus (data poisoning), or else by directly fine-tuning the autocompleter on these files (model poisoning), the attacker can influence its suggestions for attacker-chosen contexts. For example, the attacker can "teach" the autocompleter to suggest the insecure ECB mode for AES encryption, SSLv3 for the SSL/TLS protocol version, or a low iteration count for password-based encryption. Moreover, we show that these attacks can be targeted: an autocompleter poisoned by a targeted attack is much more likely to suggest the insecure completion for files from a specific repo or specific developer. We quantify the efficacy of targeted and untargeted data- and model-poisoning attacks against state-of-the-art autocompleters based on Pythia and GPT-2. We then evaluate existing defenses against poisoning attacks and show that they are largely ineffective. Comment: Accepted at USENIX Security '21 |
نوع الوثيقة: | Working Paper |
الوصول الحر: | http://arxiv.org/abs/2007.02220Test |
رقم الانضمام: | edsarx.2007.02220 |
قاعدة البيانات: | arXiv |
الوصف غير متاح. |