Pergi ke luar talian dengan aplikasi Player FM !
FERRET: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
Manage episode 436144156 series 3524393
FERRET enhances adversarial prompt generation for large language models, improving attack success rates and efficiency over RAINBOW TEAMING while ensuring effective prompts across various model sizes.
https://arxiv.org/abs//2408.10701
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support
1480 episod
Manage episode 436144156 series 3524393
FERRET enhances adversarial prompt generation for large language models, improving attack success rates and efficiency over RAINBOW TEAMING while ensuring effective prompts across various model sizes.
https://arxiv.org/abs//2408.10701
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support
1480 episod
Tất cả các tập
×Selamat datang ke Player FM
Player FM mengimbas laman-laman web bagi podcast berkualiti tinggi untuk anda nikmati sekarang. Ia merupakan aplikasi podcast terbaik dan berfungsi untuk Android, iPhone, dan web. Daftar untuk melaraskan langganan merentasi peranti.