Code for "Learning-based Privacy-Preserving Graph Publishing Against Sensitive Link Inference Attacks"
- matplotlib==3.8.0
- numpy==2.0.1
- pandas==2.2.2
- scikit_learn==1.2.2
- scipy==1.14.0
- torch==2.2.2
- torch_geometric==2.5.3
Install dependencies:
pip install -r requirements.txt
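After installation, the pinned versions listed above can be compared against what is actually installed. The following is a minimal sketch using only the standard library; the helper names `parse_pins` and `check_pins` are hypothetical and not part of this repository, and how distribution names such as `scikit_learn` are matched depends on `importlib.metadata` in your Python version.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Pins copied from the dependency list in this README.
PINS = parse_pins([
    "matplotlib==3.8.0",
    "numpy==2.0.1",
    "pandas==2.2.2",
    "scikit_learn==1.2.2",
    "scipy==1.14.0",
    "torch==2.2.2",
    "torch_geometric==2.5.3",
])

if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty output means every pinned package was found at the expected version.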
├─ README.md
├─ Parameter_settings_and_experimental_results_of_PPGSL.xlsx # Recording the detailed parameter settings and experimental results of PPGSL ($k=0$) and PPGSL ($k=1$) in the paper.
├─ requirements.txt
├─ results.txt # Recording the evaluation results, automatically generated after running run.py.
├─ run.py # Running the PPGSL framework, including the training and inference phases, and outputting the evaluation results.
├─ run_attack.py # Conducting various inference attacks (CN, AA, RA, GAE, N2V+sim, N2V+ML, GAE+sim, GAE+ML) on the published graph.
│
├─ data # Downloaded datasets.
│ ├─ Cora
│ └─ LastFMAsia
│
├─ log_Cora
│ ├─ Cora_PPGSLsp_500_0.01_1.0.log # Running log file. Parameter settings: dataset: Cora, method: PPGSLsp, epoch=500, $\alpha$=0.01, $k$=1.0.
│ ├─ Cora_PPGSLsp_500_0.01_1.0_1.pt # The learned privacy-preserving graph for publication in the 1st repetition of the experiment.
│ ├─ Cora_PPGSLsp_500_0.01_1.0_2.pt # The learned graph in the 2nd repetition.
│ ├─ Cora_PPGSLsp_500_0.01_1.0_3.pt # The learned graph in the 3rd repetition.
│ ├─ Cora_PPGSLsp_500_0.01_1.0_4.pt # The learned graph in the 4th repetition.
│ └─ Cora_PPGSLsp_500_0.01_1.0_5.pt # The learned graph in the 5th repetition.
│
└─ PPGSL # Implementation of PPGSL framework.
   ├─ attack.py
   ├─ dataset.py
   ├─ evaluate.py
   ├─ model.py
   ├─ origin.py
   ├─ params.json
   ├─ ppgsl_fgp.py
   ├─ ppgsl_sparse.py
   ├─ utils.py
   └─ __init__.py

Arguments of run.py:

| Name | Default value | Description |
|---|---|---|
| method | PPGSLsp | 'none': no protection; 'PPGSLfgp': use PPGSL-FGP to conduct privacy protection; 'PPGSLsp': use PPGSL-sparse to conduct privacy protection. |
| dataset | Cora | Dataset name, can be chosen from: {PolBlogs, LastFMAsia, DeezerEurope, Cora, CiteSeer, PubMed}. |
| neg_ratio | 1.0 | The value of the sampling factor $k$. |
| alpha | 0.01 | The value of the parameter $\alpha$. |
| T | COS | The prediction head in the training phase. COS: cosine similarity; IP: inner product; MLP: MLP predictor. |
| A | COS | The prediction head of the inference attack. COS: cosine similarity; IP: inner product; MLP: MLP predictor. |
| seed | 42 | Random seed. |
| times | 5 | The number of repetitions of the experiment. |
| draw | False | Whether or not to plot the variations in loss and privacy/utility evaluation results as the number of epochs increases in the training process of graph learner. |
| learning_rate | None | Learning rate of the graph learner. |
| hidden_dim | None | Dimension of the hidden layers of the GNN encoder of surrogate attack model. |
| save_path | None | The file name to save log and output results. |
| device | None | Running environment, cpu or cuda. |
| epoch | None | Training epoch of the graph learner. |
| gae_epoch | None | Training epoch of the GNN encoder of surrogate attack model, also the training epoch of the GNN encoder when evaluating the link prediction utility task. |
| gcn_epoch | None | Training epoch of the GNN encoder when evaluating the node classification utility task. |
| test_interval | None | Interval of graph learner training epochs for evaluation. |
| update_interval | None | The value of sampling factor |
Note: Arguments left as "None" are automatically assigned the default values used in our experiments. The specific assignments can be found in PPGSL/params.json and PPGSL/utils.py.
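The fallback described in the note can be pictured with the sketch below. It is only an illustration: the JSON layout, the helper `fill_defaults`, and the numeric values for `learning_rate` and `hidden_dim` are placeholders invented for this example, not the contents of PPGSL/params.json or the logic in PPGSL/utils.py.

```python
import argparse
import json

# Hypothetical per-dataset defaults. The values below are placeholders for
# illustration only; the real settings live in PPGSL/params.json.
EXAMPLE_PARAMS_JSON = """
{"Cora": {"learning_rate": 0.001, "hidden_dim": 128, "epoch": 500}}
"""


def fill_defaults(args, defaults):
    """Replace every attribute of `args` that is None (or missing) with the default."""
    for key, value in defaults.items():
        if getattr(args, key, None) is None:
            setattr(args, key, value)
    return args


parser = argparse.ArgumentParser()
parser.add_argument("--dataset", default="Cora")
parser.add_argument("--epoch", type=int, default=None)
parser.add_argument("--learning_rate", type=float, default=None)
parser.add_argument("--hidden_dim", type=int, default=None)

# Simulate a command line that overrides only the learning rate.
args = parser.parse_args(["--dataset=Cora", "--learning_rate=0.01"])
defaults = json.loads(EXAMPLE_PARAMS_JSON)[args.dataset]
args = fill_defaults(args, defaults)

print(args.epoch, args.learning_rate, args.hidden_dim)
```

Values given explicitly on the command line are kept, and only the arguments still set to None pick up a default.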
Arguments of run_attack.py:

| Name | Default value | Description |
|---|---|---|
| path_name | Cora_PPGSLsp_500_0.01_1.0 | The file name of the learned graph for evaluation. |
| method | PPGSLsp | 'none': no protection; 'PPGSLfgp': use PPGSL-FGP to conduct privacy protection; 'PPGSLsp': use PPGSL-sparse to conduct privacy protection. |
| dataset | Cora | Dataset name, can be chosen from: {PolBlogs, LastFMAsia, DeezerEurope, Cora, CiteSeer, PubMed}. |
| seed | 42 | Random seed. |
| times | 5 | The number of repetitions of the experiment. |
| device | None | Running environment, cpu or cuda. |
| gae_epoch | None | Training epoch of the GNN encoder of the GAE-based attack model. |
| n2v_epoch | None | Training epoch of the node2vec model of the N2V-based attack model. |
| seal_epoch | None | Training epoch of the SEAL attack model. |
| seal_lr | None | Learning rate of the SEAL attack model. |
Note: Arguments left as "None" are automatically assigned the default values used in our experiments. The specific assignments can be found in PPGSL/params.json and PPGSL/utils.py.
- Example 1: Run PPGSL-sparse ($k=1$) on the Cora dataset.

  `python run.py --method=PPGSLsp --dataset=Cora --neg_ratio=1.0`

- Example 2: Evaluate the privacy protection of the learned graph 'Cora_PPGSLsp_500_0.01_1.0' against various inference attack methods.

  `python run_attack.py --method=PPGSLsp --dataset=Cora --path_name=Cora_PPGSLsp_500_0.01_1.0`
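The value passed to `--path_name` follows the naming pattern visible in the `log_Cora` directory listing (dataset, method, epoch, $\alpha$, $k$). The sketch below reconstructs that pattern; the helpers `run_name` and `expected_files` are hypothetical, and the pattern is inferred from the single Cora example in this README rather than taken from the repository code.

```python
def run_name(dataset, method, epoch, alpha, k):
    """Build a name such as 'Cora_PPGSLsp_500_0.01_1.0' (pattern inferred from the README)."""
    return f"{dataset}_{method}_{epoch}_{alpha}_{k}"


def expected_files(name, times):
    """List the log file plus one .pt file per repetition (numbered from 1)."""
    return [f"{name}.log"] + [f"{name}_{i}.pt" for i in range(1, times + 1)]


name = run_name("Cora", "PPGSLsp", 500, 0.01, 1.0)
files = expected_files(name, times=5)

print(name)
for file_name in files:
    print(file_name)
```

With the default `times=5`, this lists one log file and five learned graphs, matching the `log_Cora` entries shown in the directory tree.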