Report
How the Training Procedure Impacts the Performance of Deep Learning-based Vulnerability Patching
Title: | How the Training Procedure Impacts the Performance of Deep Learning-based Vulnerability Patching |
---|---|
Authors: | Mastropaolo, Antonio; Nardone, Vittoria; Bavota, Gabriele; Di Penta, Massimiliano |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Software Engineering |
Description: | Generative deep learning (DL) models have been successfully adopted for vulnerability patching. However, such models require the availability of a large dataset of patches to learn from. To overcome this issue, researchers have proposed to start from models pre-trained with general knowledge, either on the programming language or on similar tasks such as bug fixing. Despite the efforts in the area of automated vulnerability patching, there is a lack of systematic studies on how these different training procedures impact the performance of DL models for such a task. This paper provides a manifold contribution to bridge this gap, by (i) comparing existing solutions of self-supervised and supervised pre-training for vulnerability patching; and (ii) for the first time, experimenting with different kinds of prompt-tuning for this task. The study required training and testing 23 DL models. We found that a supervised pre-training focused on bug-fixing, while expensive in terms of data collection, substantially improves DL-based vulnerability patching. When applying prompt-tuning on top of this supervised pre-trained model, there is no significant gain in performance. Instead, prompt-tuning is an effective and cheap solution to substantially boost the performance of self-supervised pre-trained models, i.e., those not relying on the bug-fixing pre-training. |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2404.17896 |
Accession Number: | edsarx.2404.17896 |
Database: | arXiv |
Publication Date: | 27 April 2024 |