Commit
update 9.1
ydyjya committed Sep 1, 2024
1 parent 77c5a0a commit 1301319
Showing 7 changed files with 665 additions and 644 deletions.
4 changes: 4 additions & 0 deletions subtopic/Datasets&Benchmark.md
@@ -86,6 +86,10 @@
| 24.08 | Enkrypt AI | arxiv | [SAGE-RT: Synthetic Alignment Data Generation for Safety Evaluation and Red Teaming](https://arxiv.org/abs/2408.11851) | **Synthetic Data Generation**&**Safety Evaluation**&**Red Teaming** |
| 24.08 | Tianjin University | Findings of ACL 2024 | [CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models](https://arxiv.org/abs/2408.09819) | **Moral Evaluation**&**Moral Dilemma** |
| 24.08 | University of Surrey | IJCAI 2024 | [CodeMirage: Hallucinations in Code Generated by Large Language Models](https://arxiv.org/abs/2408.08333) | **Code Hallucinations**&**CodeMirage Dataset** |
| 24.08 | Chalmers University of Technology | arxiv | [LLMSecCode: Evaluating Large Language Models for Secure Coding](https://arxiv.org/abs/2408.16100) | **Secure Coding**&**Evaluation Framework** |



## 📚Resource

- Toxicity - [RealToxicityPrompts datasets](https://toxicdegeneration.allenai.org/)
1 change: 1 addition & 0 deletions subtopic/Defense&Mitigation.md
@@ -123,6 +123,7 @@
| 24.08 | Georgia Institute of Technology | arxiv | [Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning](https://arxiv.org/abs/2408.09600) | **Safety Alignment**&**Harmful Fine-tuning**&**Post-fine-tuning Defense** |
| 24.08 | LG Electronics | arxiv | [ATHENA: Safe Autonomous Agents with Verbal Contrastive Learning](https://arxiv.org/abs/2408.11021) | **Autonomous Agents**&**Verbal Contrastive Learning**&**Safety Evaluation** |
| 24.08 | Duke Kunshan University | arxiv | [EEG-Defender: Defending against Jailbreak through Early Exit Generation of Large Language Models](https://arxiv.org/abs/2408.11308) | **Jailbreak Defense**&**Early Exit Generation** |
| 24.08 | Hong Kong University of Science and Technology | arxiv | [AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems](https://arxiv.org/abs/2408.14972) | **Multi-Agent Systems**&**Predictive Modeling**&**AgentMonitor** |


## 💻Presentations & Talks
2 changes: 2 additions & 0 deletions subtopic/Ehics&Bias&Fariness.md
@@ -106,6 +106,8 @@
| 24.08 | Shanghai University of Engineering Science | arxiv | [Promoting Equality in Large Language Models: Identifying and Mitigating the Implicit Bias based on Bayesian Theory](https://arxiv.org/abs/2408.10608) | **Implicit Bias**&**Bayesian Theory**&**Bias Removal** |
| 24.08 | Zhejiang University | arxiv | [Editable Fairness: Fine-Grained Bias Mitigation in Language Models](https://arxiv.org/abs/2408.11843) | **Bias Mitigation**&**Knowledge Retention** |
| 24.08 | University of Science and Technology of China | arxiv | [GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models](https://arxiv.org/abs/2408.12494) | **Gender Bias**&**Large Language Models**&**Algorithmic Fairness** |
| 24.08 | Stanford University | arxiv | [Uncovering Biases with Reflective Large Language Models](https://arxiv.org/abs/2408.13464) | **Bias Detection**&**Reflective AI** |
| 24.08 | University of Western Ontario | arxiv | [Bias in LLMs as Annotators: The Effect of Party Cues on Labelling Decision by Large Language Models](https://arxiv.org/abs/2408.15895) | **Bias**&**LLMs**&**Political Cues**&**Annotation** |


## 💻Presentations & Talks