Add GENIE attacks (model extraction + pruning) with PyG fallback, GCN link predictor, and demo examples #25
Open
Sparshkhare1306 wants to merge 1 commit into yushundong:main from
📋 Summary
This PR ports a GENIE-like model extraction and pruning attack into the PyGIP codebase and adds a small PyTorch-Geometric (PyG) fallback so the project can run smoke demos on machines that do not have a working DGL installation. The goal is to make it straightforward for maintainers to run a quick demonstration of the GENIE attacks inside the PyGIP repo.
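The PyG fallback described above typically follows a standard optional-import pattern. The sketch below is illustrative, not the PR's actual code: the names `HAS_DGL` and `load_graph_dataset` are hypothetical stand-ins for how such a dispatch might look.

```python
# Optional-import fallback: prefer DGL when installed, otherwise take
# a PyTorch-Geometric-only code path. Names here are illustrative.
try:
    import dgl  # noqa: F401
    HAS_DGL = True
except ImportError:
    HAS_DGL = False

def load_graph_dataset(name: str) -> str:
    """Dispatch to a DGL loader when available, else the PyG fallback."""
    if HAS_DGL:
        return f"dgl-loader:{name}"   # placeholder for the DGL code path
    return f"pyg-fallback:{name}"     # placeholder for the PyG-only path
```

The point of the pattern is that importing the package once at module load time keeps the rest of the code free of per-call `try`/`except` blocks.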
Files added / changed (high-level)

- `attacks/genie_model_extraction.py` — GENIE-style model extraction attack, adapted to the repo's `BaseAttack` usage.
- `attacks/genie_pruning_attack.py` — Pruning attack adapted to the repo's `BaseAttack`.
- `models/gcn_link_predictor.py` — Minimal GCN link-prediction model used by the attacks and the demo trainer.
- `pygip/datasets/datasets.py` and `pygip/datasets/__init__.py` — PyG-only fallback dataset loaders: `load_ca_hepth`, `load_c_elegans`, and a `SimpleDataset` wrapper.
- `examples/train_small_predictor.py` — Small trainer that saves a demo checkpoint (`examples/watermarked_model_demo.pth`).
- `examples/run_genie_experiments.py` — Example script that runs extraction then pruning and prints metrics.

🧪 Related Issues
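A minimal GCN link predictor of the kind `models/gcn_link_predictor.py` provides can be pictured as one normalized graph convolution followed by a dot-product edge scorer. The NumPy sketch below is only a conceptual stand-in (the actual module presumably uses PyTorch; the weights and the sigmoid scorer here are illustrative):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = relu(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

def link_score(Z, u, v):
    """Score an edge (u, v) as the sigmoid of the embedding dot product."""
    return 1.0 / (1.0 + np.exp(-Z[u] @ Z[v]))

# Tiny example: 4-node path graph, random features, one GCN layer.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
Z = gcn_layer(A, H, W)        # node embeddings, shape (4, 2)
s = link_score(Z, 0, 1)       # edge probability in (0, 1)
```

Both the extraction attack (training a surrogate on the teacher's edge scores) and the pruning attack operate against a model of this shape.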
✅ Checklist
- Documentation updated where needed (`docs/`).
- Changes are on a feature branch (`feat/genie-watermark-ft`), not `main`.

🧠 Additional Context (Important — please read)
Quick reproduction steps (exact commands)
From the repository root:
1. Train the demo teacher (`python examples/train_small_predictor.py`). This prints the training loss and writes `examples/watermarked_model_demo.pth`.
2. Run the experiments (`python examples/run_genie_experiments.py`), which performs extraction, then pruning, and prints metrics.
What I observed during local testing (so reviewers know what to expect)
If no checkpoint is provided, extraction AUC is ~0.5 (random) and pruning AUC is also ~0.5; this is expected, since the teacher is untrained.
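The ~0.5 baseline is simply the AUC of an uninformative scorer. A quick check with scikit-learn's `roc_auc_score` (with illustrative labels and scores, not the PR's data) shows both ends of the scale:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]                 # ground-truth edge labels

# An untrained teacher scores every candidate edge the same, so the
# ranking is uninformative and AUC collapses to 0.5.
constant_scores = [0.5, 0.5, 0.5, 0.5]
auc_random = roc_auc_score(y_true, constant_scores)   # 0.5

# A scorer that ranks every true edge above every negative gets 1.0.
good_scores = [0.1, 0.2, 0.8, 0.9]
auc_perfect = roc_auc_score(y_true, good_scores)      # 1.0
```

This is why a trained demo checkpoint is needed before the reported AUCs become meaningful.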
After training the small demo teacher (`examples/train_small_predictor.py`) and supplying that checkpoint:

- `surrogate_test_auc` can increase (≈ 0.70 observed with the tiny demo teacher).
- `test_auc` after pruning can be significantly above 0.5, depending on the demo checkpoint (≈ 0.79 observed during local runs).

The current implementation is a smoke/demo implementation: it is not a full, large-scale reproduction of the GENIE paper experiments (no large hyperparameter sweeps, multiple seeds, or large-dataset jobs included).
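The pruning side of the demo can be pictured as simple magnitude pruning: zero the smallest-magnitude weights and measure how much link-prediction quality survives. The sketch below is a generic illustration, assuming magnitude-based pruning; the PR's `genie_pruning_attack.py` may select weights differently.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, fraction: float) -> np.ndarray:
    """Zero out the `fraction` of entries with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(fraction * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0      # mask small weights
    return pruned

W = np.array([[0.05, -1.2], [0.8, -0.01]])
W_pruned = magnitude_prune(W, 0.5)   # zeros the two smallest-|w| entries
```

A pruning attack then reports metrics such as `test_auc` at increasing pruning fractions.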
Important limitations & notes
If your checkpoint uses different state-dict keys, adapt the `models/gcn_link_predictor.py` loader to match your keys.
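Adapting the loader to mismatched checkpoint keys usually amounts to renaming and filtering the state dict before loading. A framework-agnostic sketch (plain dicts stand in for PyTorch state dicts; the `remap_state_dict` helper and the `module.` prefix are just a common example, not the PR's code):

```python
def remap_state_dict(checkpoint: dict, expected_keys: set,
                     strip_prefix: str = "module.") -> dict:
    """Strip a prefix from checkpoint keys and drop keys the model does
    not expect, so the result can be loaded without key errors."""
    remapped = {}
    for key, value in checkpoint.items():
        clean = key[len(strip_prefix):] if key.startswith(strip_prefix) else key
        if clean in expected_keys:
            remapped[clean] = value
    return remapped

# Example: a DataParallel-style checkpoint with a "module." prefix
# and one extra key the model does not have.
ckpt = {"module.conv1.weight": 1, "module.conv1.bias": 2, "extra.stat": 3}
model_keys = {"conv1.weight", "conv1.bias"}
state = remap_state_dict(ckpt, model_keys)
# state == {"conv1.weight": 1, "conv1.bias": 2}
```

With PyTorch, the remapped dict would then be passed to `model.load_state_dict(state)`.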