Leveraging multi-task machine learning to improve vulnerability detection

This repository contains the replication package for the extended abstract "Leveraging multi-task machine learning to improve vulnerability detection". The work was conducted by Barbara Russo, Jorge Melegati, and Moritz Mock.

Link to preprint: https://doi.org/10.48550/arXiv.2501.15934

Abstract

Multi-task learning is a paradigm that leverages information of related tasks to improve the performance of machine learning. Self-Admitted Technical Debt (SATD) are comments in the code that indicate not-quite-right code introduced for short-term needs, i.e., technical debt. Previous research has provided evidence of a possible relationship between SATD and the existence of vulnerabilities in the code. In this work, we investigate if multi-task learning could leverage the information shared between SATD and vulnerabilities to improve the automatic detection of these issues. To this aim, we implemented VulSATD, a deep learner that detects vulnerable and SATD code, based on CodeBERT, a pre-trained transformers model. We evaluated VulSATD on MADE-WIC, a fused dataset on different weaknesses of code, including SATD and vulnerabilities. We compared the results using single and multi-tasks approach obtaining no significant differences. Given that the fraction of SATD is low, we also examined if the use of a weighted loss could improve the results but the performance was similar. Our results indicate that sharing the information about SATD and vulnerabilities could not improve the performance of the automatic detection of these issues.

The following table (complete_table.png) lists all experiment combinations that were conducted; due to space limitations, only selected parts are presented in the paper.

Mutate dataset

The folder scripts contains the script used to mutate the MADE-WIC dataset (paper).

Annotation of 200 instances

The folder manual_annotation contains the 200 extracted instances (100 SATD and vulnerable, 100 SATD and not vulnerable) together with their annotations.

How to replicate

The first step is to clone the repository locally with the following commands:

git clone git@github.com:moritzmock/multitask-vulberability-detection.git
cd multitask-vulberability-detection

We recommend the creation of a virtual environment, for example, using venv (the exact commands might need to be adapted according to your system):

python3 -m venv env
source env/bin/activate

Then, install the dependencies which are listed in the requirements.txt file. This can be done using pip:

pip install -r requirements.txt

To train the model, use the following command:

python main.py \
        --model=<satdonly|vulonly|multitask> \
        --mode=train \
        --dataset="path_to_<OSPR|Big-Vul|Devign>" \
        --comment-column="Comments" \
        --code-column="OnlyCode" \
        --store-weights=True \
        --weighted-loss=<True|False> \
        --output-dir="./stored_models"
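
Because the experiments combine the model variants with both weighted-loss settings, the training runs can also be enumerated from a small script. The following is a minimal sketch and not part of the repository: the helper names `build_train_command` and `all_train_commands` and the dataset path are hypothetical, and only the flags from the command above are used.

```python
import itertools
import shlex

# Option values taken from the training command above.
MODELS = ["satdonly", "vulonly", "multitask"]
WEIGHTED_LOSS = [True, False]


def build_train_command(model, weighted_loss, dataset_path,
                        output_dir="./stored_models"):
    """Return the argument list for one training run of main.py."""
    return [
        "python", "main.py",
        f"--model={model}",
        "--mode=train",
        f"--dataset={dataset_path}",
        "--comment-column=Comments",
        "--code-column=OnlyCode",
        "--store-weights=True",
        f"--weighted-loss={weighted_loss}",
        f"--output-dir={output_dir}",
    ]


def all_train_commands(dataset_path):
    """One command per (model, weighted-loss) combination."""
    return [
        build_train_command(model, weighted, dataset_path)
        for model, weighted in itertools.product(MODELS, WEIGHTED_LOSS)
    ]


if __name__ == "__main__":
    # "path_to_dataset" is a placeholder; replace it with your local path.
    for cmd in all_train_commands("path_to_dataset"):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` to execute the runs one after another.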

To test the saved model, use the following command:

python main.py \
       --model=<satdonly|vulonly|multitask> \
       --mode=test \
       --dataset="path_to_<OSPR|Big-Vul|Devign>" \
       --comment-column="Comments" \
       --code-column="OnlyCode" \
       --model-file="./stored_models/weights_<satdonly|vulonly|multitask>_lr_2e-05_ne_10_bs_16_dp_0.1_l2_0.tf"
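
The weights file name in the command above appears to encode the hyperparameters of the run. The following sketch rebuilds that name for a given model variant. It is an assumption, not the repository's code: the helper names are hypothetical, and reading `lr`, `ne`, `bs`, `dp`, and `l2` as learning rate, number of epochs, batch size, dropout, and L2 regularization is a guess based on the abbreviations.

```python
def weights_filename(model, lr=2e-05, epochs=10, batch_size=16,
                     dropout=0.1, l2=0):
    """Build a weights file name following the pattern shown above.

    The default values are the ones that appear in the example file name.
    Note that l2 is an integer here; passing 0.0 would render as "0.0".
    """
    return (
        f"weights_{model}_lr_{lr}_ne_{epochs}"
        f"_bs_{batch_size}_dp_{dropout}_l2_{l2}.tf"
    )


def weights_path(model, output_dir="./stored_models"):
    """Join the output directory used for training with the file name."""
    return f"{output_dir}/{weights_filename(model)}"


if __name__ == "__main__":
    print(weights_path("multitask"))
```

If training was run with different hyperparameters, check the actual file name in the output directory rather than relying on this pattern.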

Relevant parameters

The two most relevant parameters for the experiments are described below. Different combinations of them were used in the paper.

| Parameter | Options | Description |
| --- | --- | --- |
| model | <satdonly\|vulonly\|multitask> | Model variant used in the experiments: SATD only, vulnerability only, or multi-task |
| weighted-loss | <True\|False> | Flag indicating whether the run uses a weighted loss |
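
To illustrate what a weighted loss does when one class is rare, here is a minimal sketch in plain Python. It is not the implementation used in this repository: the inverse-frequency weighting scheme and the function names `class_weights` and `weighted_bce` are assumptions chosen for illustration.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights for binary labels: n / (2 * class count).

    Assumes both classes occur at least once in `labels`.
    """
    n = len(labels)
    pos = sum(labels)
    neg = n - pos
    return {0: n / (2 * neg), 1: n / (2 * pos)}


def weighted_bce(labels, probs, weights, eps=1e-7):
    """Mean binary cross-entropy with each sample scaled by its class weight."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * loss
    return total / len(labels)


if __name__ == "__main__":
    labels = [1, 0, 0, 0]          # one positive among four samples
    probs = [0.5, 0.5, 0.5, 0.5]   # an uninformative classifier
    weights = class_weights(labels)
    print(weights, weighted_bce(labels, probs, weights))
```

With these weights, the single positive sample contributes as much to the loss as the three negative samples together, which is the effect a weighted loss aims for on imbalanced data.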

When the mode is set to "hyper-analysis", a hyperparameter search is run, using the values in hyperanalysis.yml as the search space.
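
As a rough picture of what a search space means, the sketch below expands a dictionary of candidate values into every combination. Everything in it is hypothetical: the keys and values are invented for illustration and do not reflect the contents of hyperanalysis.yml, and the repository may use a different search strategy than an exhaustive grid.

```python
import itertools

# Hypothetical search space; the real one is defined in hyperanalysis.yml.
SEARCH_SPACE = {
    "learning_rate": [1e-05, 2e-05],
    "batch_size": [16, 32],
    "dropout": [0.1, 0.2],
}


def expand_grid(space):
    """Return one dict per combination of the candidate values."""
    keys = list(space)
    return [
        dict(zip(keys, values))
        for values in itertools.product(*(space[k] for k in keys))
    ]


if __name__ == "__main__":
    for config in expand_grid(SEARCH_SPACE):
        print(config)
```

The number of configurations grows multiplicatively with each added parameter, so an exhaustive grid is only practical for small search spaces.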

How to cite the work

Preprint:

@misc{russo2025leveragingmultitasklearningimprove,
      title={Leveraging multi-task learning to improve the detection of SATD and vulnerability}, 
      author={Barbara Russo and Jorge Melegati and Moritz Mock},
      year={2025},
      eprint={2501.15934},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2501.15934}, 
}

