This is the official code repository for the paper "The Risks of Large Language Models as the New Censorship Machine."
- data/: Contains the dataset we curated for our experiments.
  - prompts_origin_mapping.csv: Contains the complete SensitivePrompt dataset that we collected, along with the origin dataset of each prompt.
  - cognitive_hacking.csv: Contains the prompts paraphrased according to the Cognitive Hacking prompt injection attack.
  - translate.csv: Contains the prompts translated into Chinese.
- script/: Contains the scripts used for collecting the.
- src/: Contains the utility functions used in the experiments.
- notebooks/: Contains the notebooks used for the analyses, etc.
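
As a quick way to inspect the files under data/, the following is a minimal sketch and not part of the repository's own code. It assumes the CSV files are UTF-8 encoded, have a header row, and live in a `data` directory relative to the working directory; the helper names `load_rows` and `summarize` are hypothetical. No column names are assumed: the script only reports whatever header each file carries.

```python
import csv
from pathlib import Path

# Assumed location of the data directory, relative to the repository root.
DATA_DIR = Path("data")

# File names as listed in this README.
DATA_FILES = [
    "prompts_origin_mapping.csv",
    "cognitive_hacking.csv",
    "translate.csv",
]


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by its header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def summarize(data_dir=DATA_DIR):
    """Return {filename: (row_count, column_names)} for each file that exists."""
    summary = {}
    for name in DATA_FILES:
        path = Path(data_dir) / name
        if not path.exists():
            # Skip files that are missing rather than failing.
            continue
        rows = load_rows(path)
        columns = list(rows[0].keys()) if rows else []
        summary[name] = (len(rows), columns)
    return summary


if __name__ == "__main__":
    for name, (n_rows, columns) in summarize().items():
        print(f"{name}: {n_rows} rows, columns = {columns}")
```

Running the script from the repository root prints one line per file that is present, which is a simple check that the data was downloaded intact before opening the notebooks.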