Content provided by Software Engineering. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Software Engineering or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ms.player.fm/legal.

🤖 DeepSeek-V3: A 671B Parameter Mixture-of-Experts Language Model

30:29
 

This episode covers the document introducing DeepSeek-V3, a 671B-parameter Mixture-of-Experts language model. It highlights the model's architecture, including its load-balancing and multi-token prediction strategies, and its efficient training process using FP8 precision. Benchmark results show DeepSeek-V3 performing strongly against other open-source models and some closed-source models, particularly on math and code tasks. The document also gives instructions for running DeepSeek-V3 locally with various frameworks and hardware, including NVIDIA and AMD GPUs and Huawei Ascend NPUs. Licensing and contact information are included as well.

361 episodes
