

Mohamed Osman joins to discuss MindsAI's highest-scoring entry to the ARC challenge 2024 and the paradigm of test-time fine-tuning. They explore how the team, now part of Tufa Labs in Zurich, achieved state-of-the-art results using a combination of pre-training techniques, a unique meta-learning strategy, and an ensemble voting mechanism. Mohamed emphasizes the importance of raw data input and the flexibility of the network.
SPONSOR MESSAGES:
***
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich.
Go to https://tufalabs.ai/
***
TRANSCRIPT + REFS:
https://www.dropbox.com/scl/fi/jeavyqidsjzjgjgd7ns7h/MoFInal.pdf?rlkey=cjjmo7rgtenxrr3b46nk6yq2e&dl=0
Mohamed Osman (Tufa Labs)
https://x.com/MohamedOsmanML
Jack Cole (Tufa Labs)
https://x.com/MindsAI_Jack
How and why deep learning for ARC paper:
https://github.com/MohamedOsman1998/deep-learning-for-arc/blob/main/deep_learning_for_arc.pdf
TOC:
1. Abstract Reasoning Foundations
[00:00:00] 1.1 Test-Time Fine-Tuning and ARC Challenge Overview
[00:10:20] 1.2 Neural Networks vs Programmatic Approaches to Reasoning
[00:13:23] 1.3 Code-Based Learning and Meta-Model Architecture
[00:20:26] 1.4 Technical Implementation with Long T5 Model
2. ARC Solution Architectures
[00:24:10] 2.1 Test-Time Tuning and Voting Methods for ARC Solutions
[00:27:54] 2.2 Model Generalization and Function Generation Challenges
[00:32:53] 2.3 Input Representation and VLM Limitations
[00:36:21] 2.4 Architecture Innovation and Cross-Modal Integration
[00:40:05] 2.5 Future of ARC Challenge and Program Synthesis Approaches
3. Advanced Systems Integration
[00:43:00] 3.1 DreamCoder Evolution and LLM Integration
[00:50:07] 3.2 MindsAI Team Progress and Acquisition by Tufa Labs
[00:54:15] 3.3 ARC v2 Development and Performance Scaling
[00:58:22] 3.4 Intelligence Benchmarks and Transformer Limitations
[01:01:50] 3.5 Neural Architecture Optimization and Processing Distribution
REFS:
[00:01:32] Original ARC challenge paper, François Chollet
https://arxiv.org/abs/1911.01547
[00:06:55] DreamCoder, Kevin Ellis et al.
https://arxiv.org/abs/2006.08381
[00:12:50] Deep Learning with Python, François Chollet
https://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438
[00:13:35] Deep Learning with Python, François Chollet
https://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438
[00:13:35] Influence of pretraining data for reasoning, Laura Ruis
https://arxiv.org/abs/2411.12580
[00:17:50] Latent Program Networks, Clement Bonnet
https://arxiv.org/html/2411.08706v1
[00:20:50] T5, Colin Raffel et al.
https://arxiv.org/abs/1910.10683
[00:30:30] Combining Induction and Transduction for Abstract Reasoning, Wen-Ding Li, Kevin Ellis et al.
https://arxiv.org/abs/2411.02272
[00:34:15] Six finger problem, Chen et al.
https://openaccess.thecvf.com/content/CVPR2024/papers/Chen_SpatialVLM_Endowing_Vision-Language_Models_with_Spatial_Reasoning_Capabilities_CVPR_2024_paper.pdf
[00:38:15] DeepSeek-R1-Distill-Llama, DeepSeek AI
https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
[00:40:10] ARC Prize 2024 Technical Report, François Chollet et al.
https://arxiv.org/html/2412.04604v2
[00:45:20] LLM-Guided Compositional Program Synthesis, Wen-Ding Li and Kevin Ellis
https://arxiv.org/html/2503.15540
[00:54:25] Abstraction and Reasoning Corpus, François Chollet
https://github.com/fchollet/ARC-AGI
[00:57:10] o3 breakthrough on ARC-AGI, OpenAI
https://arcprize.org/
[00:59:35] ConceptARC Benchmark, Arseny Moskvichev, Melanie Mitchell
https://arxiv.org/abs/2305.07141
[01:02:05] Mixtape: Breaking the Softmax Bottleneck Efficiently, Zhilin Yang, Zihang Dai, Ruslan Salakhutdinov, William W. Cohen
http://papers.neurips.cc/paper/9723-mixtape-breaking-the-softmax-bottleneck-efficiently.pdf