Artwork

Kandungan disediakan oleh TWIML and Sam Charrington. Semua kandungan podcast termasuk episod, grafik dan perihalan podcast dimuat naik dan disediakan terus oleh TWIML and Sam Charrington atau rakan kongsi platform podcast mereka. Jika anda percaya seseorang menggunakan karya berhak cipta anda tanpa kebenaran anda, anda boleh mengikuti proses yang digariskan di sini https://ms.player.fm/legal.
Player FM - Aplikasi Podcast
Pergi ke luar talian dengan aplikasi Player FM !

Stealing Part of a Production Language Model with Nicholas Carlini - #702

1:03:30
 
Kongsi
 

Manage episode 441466240 series 2355587
Kandungan disediakan oleh TWIML and Sam Charrington. Semua kandungan podcast termasuk episod, grafik dan perihalan podcast dimuat naik dan disediakan terus oleh TWIML and Sam Charrington atau rakan kongsi platform podcast mereka. Jika anda percaya seseorang menggunakan karya berhak cipta anda tanpa kebenaran anda, anda boleh mengikuti proses yang digariskan di sini https://ms.player.fm/legal.

Today, we're joined by Nicholas Carlini, research scientist at Google DeepMind to discuss adversarial machine learning and model security, focusing on his 2024 ICML best paper winner, “Stealing part of a production language model.” We dig into this work, which demonstrated the ability to successfully steal the last layer of production language models including ChatGPT and PaLM-2. Nicholas shares the current landscape of AI security research in the age of LLMs, the implications of model stealing, ethical concerns surrounding model privacy, how the attack works, and the significance of the embedding layer in language models. We also discuss the remediation strategies implemented by OpenAI and Google, and the future directions in the field of AI security. Plus, we also cover his other ICML 2024 best paper, “Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining,” which questions the use and promotion of differential privacy in conjunction with pre-trained models.

The complete show notes for this episode can be found at https://twimlai.com/go/702.

  continue reading

744 episod

Artwork
iconKongsi
 
Manage episode 441466240 series 2355587
Kandungan disediakan oleh TWIML and Sam Charrington. Semua kandungan podcast termasuk episod, grafik dan perihalan podcast dimuat naik dan disediakan terus oleh TWIML and Sam Charrington atau rakan kongsi platform podcast mereka. Jika anda percaya seseorang menggunakan karya berhak cipta anda tanpa kebenaran anda, anda boleh mengikuti proses yang digariskan di sini https://ms.player.fm/legal.

Today, we're joined by Nicholas Carlini, research scientist at Google DeepMind to discuss adversarial machine learning and model security, focusing on his 2024 ICML best paper winner, “Stealing part of a production language model.” We dig into this work, which demonstrated the ability to successfully steal the last layer of production language models including ChatGPT and PaLM-2. Nicholas shares the current landscape of AI security research in the age of LLMs, the implications of model stealing, ethical concerns surrounding model privacy, how the attack works, and the significance of the embedding layer in language models. We also discuss the remediation strategies implemented by OpenAI and Google, and the future directions in the field of AI security. Plus, we also cover his other ICML 2024 best paper, “Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining,” which questions the use and promotion of differential privacy in conjunction with pre-trained models.

The complete show notes for this episode can be found at https://twimlai.com/go/702.

  continue reading

744 episod

Semua episod

×
 
Loading …

Selamat datang ke Player FM

Player FM mengimbas laman-laman web bagi podcast berkualiti tinggi untuk anda nikmati sekarang. Ia merupakan aplikasi podcast terbaik dan berfungsi untuk Android, iPhone, dan web. Daftar untuk melaraskan langganan merentasi peranti.

 

Panduan Rujukan Pantas

Podcast Teratas
Dengar rancangan ini semasa anda meneroka
Main