Artwork

Kandungan disediakan oleh Fastmail. Semua kandungan podcast termasuk episod, grafik dan perihalan podcast dimuat naik dan disediakan terus oleh Fastmail atau rakan kongsi platform podcast mereka. Jika anda percaya seseorang menggunakan karya berhak cipta anda tanpa kebenaran anda, anda boleh mengikuti proses yang digariskan di sini https://ms.player.fm/legal.
Player FM - Aplikasi Podcast
Pergi ke luar talian dengan aplikasi Player FM !

Exploring AI with Emily M. Bender

32:24
 
Kongsi
 

Manage episode 425520193 series 2950950
Kandungan disediakan oleh Fastmail. Semua kandungan podcast termasuk episod, grafik dan perihalan podcast dimuat naik dan disediakan terus oleh Fastmail atau rakan kongsi platform podcast mereka. Jika anda percaya seseorang menggunakan karya berhak cipta anda tanpa kebenaran anda, anda boleh mengikuti proses yang digariskan di sini https://ms.player.fm/legal.

Join us on a journey to learn more about the intersection of linguistics and AI with special guest Emily M. Bender. Come with us as we learn how linguistics functions in modern language models like ChatGPT.

Episode Notes

Discover the origins of language models, the negative implications of sourcing data to train these technologies, and the value of authenticity.

▶️ Guest Interview - Emily M. Bender

🗣️ Discussion Points
  • Emily M. Bender is a Professor of Linguistics at the University of Washington. Her work focuses on grammar, engineering, and the societal impacts of language technology. She's spoken and written about what it means to make informed decisions about AI and large language models such as ChatGPT.
  • Artificial Intelligence (AI) is a marketing term developed in the 1950s by John McCarthy. It refers to an area of computer science. AI is a technology built using natural language processing and linguistics, the science of how language works. Understanding how language works is necessary to comprehend large language models' potential misuse and limitations.
  • Language model is the term for a type of technology designed to model the distribution of word forms in text. While early language models simply determined the relative frequency of words in a text, today’s language models are bigger in terms of the data they store and the language they are trained on. As a society, we must continue reminding ourselves that synthetic text is not a credible information source. Before sharing information, it’s smart to verify that something was written by a human rather than a machine. Valuing authenticity and citations are some of the most important things we can do.
  • Distributional biases are generated in the data output used for large language models. The less care we put into curating training data, the more various patterns and systems of oppression will be reproduced, regardless of whether they are presented as fact or fiction in the end result.
  • Being a good digital citizen means avoiding using products built on data theft and labor exploitation. On an individual level, we should insist on transparency regarding synthetic media. Part of the problem is that there is currently no watermarking at the source. There is a major need for regulation and accountability around synthetic text nationally. We can also continue to increase the value of authenticity.
🔵 Find Us 💙 Review Us

If you love this show, please leave us a review on Apple Podcasts or wherever you listen to podcasts. Take our survey to tell us what you think at digitalcitizenshow.com/survey.

  continue reading

24 episod

Artwork
iconKongsi
 
Manage episode 425520193 series 2950950
Kandungan disediakan oleh Fastmail. Semua kandungan podcast termasuk episod, grafik dan perihalan podcast dimuat naik dan disediakan terus oleh Fastmail atau rakan kongsi platform podcast mereka. Jika anda percaya seseorang menggunakan karya berhak cipta anda tanpa kebenaran anda, anda boleh mengikuti proses yang digariskan di sini https://ms.player.fm/legal.

Join us on a journey to learn more about the intersection of linguistics and AI with special guest Emily M. Bender. Come with us as we learn how linguistics functions in modern language models like ChatGPT.

Episode Notes

Discover the origins of language models, the negative implications of sourcing data to train these technologies, and the value of authenticity.

▶️ Guest Interview - Emily M. Bender

🗣️ Discussion Points
  • Emily M. Bender is a Professor of Linguistics at the University of Washington. Her work focuses on grammar, engineering, and the societal impacts of language technology. She's spoken and written about what it means to make informed decisions about AI and large language models such as ChatGPT.
  • Artificial Intelligence (AI) is a marketing term developed in the 1950s by John McCarthy. It refers to an area of computer science. AI is a technology built using natural language processing and linguistics, the science of how language works. Understanding how language works is necessary to comprehend large language models' potential misuse and limitations.
  • Language model is the term for a type of technology designed to model the distribution of word forms in text. While early language models simply determined the relative frequency of words in a text, today’s language models are bigger in terms of the data they store and the language they are trained on. As a society, we must continue reminding ourselves that synthetic text is not a credible information source. Before sharing information, it’s smart to verify that something was written by a human rather than a machine. Valuing authenticity and citations are some of the most important things we can do.
  • Distributional biases are generated in the data output used for large language models. The less care we put into curating training data, the more various patterns and systems of oppression will be reproduced, regardless of whether they are presented as fact or fiction in the end result.
  • Being a good digital citizen means avoiding using products built on data theft and labor exploitation. On an individual level, we should insist on transparency regarding synthetic media. Part of the problem is that there is currently no watermarking at the source. There is a major need for regulation and accountability around synthetic text nationally. We can also continue to increase the value of authenticity.
🔵 Find Us 💙 Review Us

If you love this show, please leave us a review on Apple Podcasts or wherever you listen to podcasts. Take our survey to tell us what you think at digitalcitizenshow.com/survey.

  continue reading

24 episod

Semua episod

×
 
Loading …

Selamat datang ke Player FM

Player FM mengimbas laman-laman web bagi podcast berkualiti tinggi untuk anda nikmati sekarang. Ia merupakan aplikasi podcast terbaik dan berfungsi untuk Android, iPhone, dan web. Daftar untuk melaraskan langganan merentasi peranti.

 

Panduan Rujukan Pantas

Podcast Teratas