Can promptbench be used to attack samples instead of prompts? #81

Closed
WanliYoung opened this issue Sep 2, 2024 · 4 comments

Comments

@WanliYoung

Hi~
Thanks for your great contribution to the community.
I would like to use promptbench to attack the dataset samples rather than the prompt. Is this feasible? Do you have any suggestions? Thank you so much!

@Immortalise
Collaborator

Hi, if you would like to attack samples, you could refer to the TextAttack repo, which the prompt attacks are built on.

@WanliYoung
Author

Thank you very much for your reply. I am referring to TextAttack. I found TextAttack has a parameter like --pct-words-to-swap to limit the range of perturbations to the text. May I ask, in promptbench, what is the default magnitude of perturbation? Are there any constraints?
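
To make the idea of a perturbation budget concrete, here is a minimal plain-Python sketch. It does not call TextAttack or promptbench. The function name `perturb_sample`, the placeholder `swap` transformation, and the rounding rule (fraction of words, rounded down, at least one) are all assumptions for illustration.

```python
import random


def perturb_sample(text, pct_words_to_swap=0.1, swap=str.upper, seed=0):
    """Perturb at most a fixed fraction of the words in `text`.

    Hypothetical helper: `swap` stands in for a real word-level
    transformation such as a synonym replacement.
    """
    words = text.split()
    # Assumed budget rule: fraction of the word count, rounded down, minimum 1.
    budget = max(1, int(pct_words_to_swap * len(words)))
    rng = random.Random(seed)
    # Pick `budget` distinct word positions to perturb.
    for i in rng.sample(range(len(words)), budget):
        words[i] = swap(words[i])
    return " ".join(words), budget


sample = "the movie was slow but the ending felt earned and honest"
perturbed, budget = perturb_sample(sample, pct_words_to_swap=0.2)
print(budget, perturbed)
```

With 11 words and a fraction of 0.2, the budget is 2, so exactly two words are changed regardless of which positions are drawn.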

@Immortalise
Collaborator

Yes, please refer to this file, where we define the constraints for the different attacks.

We also have the LabelConstraint, which prevents the attack from modifying certain words (such as the label 'positive' for sentiment analysis).
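
The idea behind such a label constraint can be sketched in plain Python as follows. This is not promptbench's actual LabelConstraint implementation; the function name `modifiable_indices` and the punctuation handling are assumptions made for illustration.

```python
def modifiable_indices(words, protected_labels):
    """Return the positions an attack may modify.

    Hypothetical helper: any word that matches a protected label
    (case-insensitive, ignoring surrounding punctuation) is excluded.
    """
    protected = {label.lower() for label in protected_labels}
    allowed = []
    for i, word in enumerate(words):
        # Strip common punctuation so "negative:" still matches "negative".
        core = word.strip(".,!?:;'\"").lower()
        if core not in protected:
            allowed.append(i)
    return allowed


prompt = "Classify the review as positive or negative:"
words = prompt.split()
print(modifiable_indices(words, {"positive", "negative"}))
```

Here the positions of "positive" and "negative:" are left out, so a word-level attack restricted to the returned indices can never alter the label words.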

@WanliYoung
Author

Thank you very much for solving my problem~
