Can promptbench be used to attack samples instead of prompts? #81

WanliYoung · 2024-09-02T02:51:57Z

Hi~
Thanks for your great contribution to the community.
I would like to try attacking the samples of datasets using promptbench instead of attacking the prompt. Is this feasible? Do you have any suggestions? Thank you so much!

Immortalise · 2024-09-12T05:10:08Z

Hi, if you would like to attack samples, you could refer to the textattack repo, which our prompt attacks are built on.

WanliYoung · 2024-09-13T13:35:00Z

Thank you very much for your reply. I am referring to TextAttack. I found TextAttack has a parameter like --pct-words-to-swap to limit the range of perturbations to the text. May I ask, in promptbench, what is the default magnitude of perturbation? Are there any constraints?

Immortalise · 2024-09-13T15:18:26Z

Yes, please refer to this file, where we define the constraints of the different attacks.

Also, we have the LabelConstraint where we prevent the attack from modifying certain words (such as the label 'positive' for sentiment analysis).
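To illustrate the two ideas discussed here (a cap on how many words may be perturbed, and a set of protected words such as label names), here is a minimal pure-Python sketch. It is not the promptbench or TextAttack implementation: the function names `allowed_indices`, `swap_adjacent_chars`, and `perturb`, the 20% budget, and the character-swap perturbation are all assumptions made up for this example.

```python
import random

PUNCT = ".,!?:;\"'"


def allowed_indices(words, protected, max_fraction):
    """Return the indices that may be modified and the word budget.

    Hypothetical helper: words matching `protected` (case-insensitive,
    ignoring surrounding punctuation) are excluded, and at most
    `max_fraction` of all words may be changed.
    """
    protected_lower = {w.lower() for w in protected}
    candidates = [
        i for i, w in enumerate(words)
        if w.strip(PUNCT).lower() not in protected_lower
    ]
    budget = int(len(words) * max_fraction)
    return candidates, budget


def swap_adjacent_chars(word, rng):
    """Toy character-level perturbation: swap two neighbouring characters."""
    if len(word) < 2:
        return word
    j = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[j], chars[j + 1] = chars[j + 1], chars[j]
    return "".join(chars)


def perturb(text, protected, max_fraction=0.2, seed=0):
    """Perturb at most `max_fraction` of the words, never a protected one."""
    rng = random.Random(seed)
    words = text.split()
    candidates, budget = allowed_indices(words, protected, max_fraction)
    for i in rng.sample(candidates, min(budget, len(candidates))):
        words[i] = swap_adjacent_chars(words[i], rng)
    return " ".join(words)


if __name__ == "__main__":
    prompt = "Classify the review as positive or negative: the movie was great"
    print(perturb(prompt, protected={"positive", "negative"}))
```

The same function works on a dataset sample instead of a prompt; only the text passed in changes. In a real attack the random choice would be replaced by a search guided by the model's output.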

WanliYoung · 2024-09-14T01:19:05Z

Thank you very much for solving my problem~

WanliYoung closed this as completed Sep 14, 2024
