The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Sam Charrington
Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers, and tech-savvy business and IT leaders. The show is hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator, and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science, and more.
Categories: Technology
Listen to the latest episode:
Today we're joined by Alex Havrilla, a PhD student at Georgia Tech, to discuss "Teaching Large Language Models to Reason with Reinforcement Learning." Alex discusses the role of creativity and exploration in problem solving and explores the opportunities presented by applying reinforcement learning algorithms to the challenge of improving reasoning in large language models. Alex also shares his research on the effect of noise on language model training, highlighting the robustness of LLM architecture. Finally, we delve into the future of RL, and the potential of combining language models with traditional methods to achieve more robust AI reasoning. The complete show notes for this episode can be found at twimlai.com/go/680.
Previous episodes
- 702 - Teaching Large Language Models to Reason with Reinforcement Learning with Alex Havrilla - #680 Tue, 16 Apr 2024
- 701 - Localizing and Editing Knowledge in LLMs with Peter Hase - #679 Mon, 08 Apr 2024
- 700 - Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping - #678 Mon, 01 Apr 2024
- 699 - V-JEPA, AI Reasoning from a Non-Generative Architecture with Mido Assran - #677 Mon, 25 Mar 2024
- 698 - Video as a Universal Interface for AI Reasoning with Sherry Yang - #676 Mon, 18 Mar 2024
- 697 - Assessing the Risks of Open AI Models with Sayash Kapoor - #675 Mon, 11 Mar 2024
- 696 - OLMo: Everything You Need to Train an Open Source LLM with Akshita Bhagia - #674 Mon, 04 Mar 2024
- 695 - Training Data Locality and Chain-of-Thought Reasoning in LLMs with Ben Prystawski - #673 Mon, 26 Feb 2024
- 694 - Reasoning Over Complex Documents with DocLLM with Armineh Nourbakhsh - #672 Mon, 19 Feb 2024
- 693 - Are Emergent Behaviors in LLMs an Illusion? with Sanmi Koyejo - #671 Mon, 12 Feb 2024
- 692 - AI Trends 2024: Reinforcement Learning in the Age of LLMs with Kamyar Azizzadenesheli - #670 Mon, 05 Feb 2024
- 691 - Building and Deploying Real-World RAG Applications with Ram Sriharsha - #669 Mon, 29 Jan 2024
- 690 - Nightshade: Data Poisoning to Fight Generative AI with Ben Zhao - #668 Mon, 22 Jan 2024
- 689 - Learning Transformer Programs with Dan Friedman - #667 Mon, 15 Jan 2024
- 688 - AI Trends 2024: Machine Learning & Deep Learning with Thomas Dietterich - #666 Mon, 08 Jan 2024
- 687 - AI Trends 2024: Computer Vision with Naila Murray - #665 Tue, 02 Jan 2024
- 686 - Are Vector DBs the Future Data Platform for AI? with Ed Anuff - #664 Thu, 28 Dec 2023
- 685 - Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663 Tue, 26 Dec 2023
- 684 - Responsible AI in the Generative Era with Michael Kearns - #662 Fri, 22 Dec 2023
- 683 - Edutainment for AI and AWS PartyRock with Mike Miller - #661 Mon, 18 Dec 2023
- 682 - Data, Systems and ML for Visual Understanding with Cody Coleman - #660 Thu, 14 Dec 2023
- 681 - Patterns and Middleware for LLM Applications with Kyle Roche - #659 Mon, 11 Dec 2023
- 680 - AI Access and Inclusivity as a Technical Challenge with Prem Natarajan - #658 Mon, 04 Dec 2023
- 679 - Building LLM-Based Applications with Azure OpenAI with Jay Emery - #657 Tue, 28 Nov 2023
- 678 - Visual Generative AI Ecosystem Challenges with Richard Zhang - #656 Mon, 20 Nov 2023
- 677 - Deploying Edge and Embedded AI Systems with Heather Gorr - #655 Mon, 13 Nov 2023
- 676 - AI Sentience, Agency and Catastrophic Risk with Yoshua Bengio - #654 Mon, 06 Nov 2023
- 675 - Delivering AI Systems in Highly Regulated Environments with Miriam Friedel - #653 Mon, 30 Oct 2023
- 674 - Mental Models for Advanced ChatGPT Prompting with Riley Goodside - #652 Mon, 23 Oct 2023
- 673 - Multilingual LLMs and the Values Divide in AI with Sara Hooker - #651 Mon, 16 Oct 2023
- 672 - Scaling Multi-Modal Generative AI with Luke Zettlemoyer - #650 Mon, 09 Oct 2023
- 671 - Pushing Back on AI Hype with Alex Hanna - #649 Mon, 02 Oct 2023
- 670 - Personalization for Text-to-Image Generative AI with Nataniel Ruiz - #648 Mon, 25 Sep 2023
- 669 - Ensuring LLM Safety for Production Applications with Shreya Rajpal - #647 Mon, 18 Sep 2023
- 668 - What’s Next in LLM Reasoning? with Roland Memisevic - #646 Mon, 11 Sep 2023
- 667 - Is ChatGPT Getting Worse? with James Zou - #645 Mon, 04 Sep 2023
- 666 - Why Deep Networks and Brains Learn Similar Features with Sophia Sanborn - #644 Mon, 28 Aug 2023
- 665 - Inverse Reinforcement Learning Without RL with Gokul Swamy - #643 Mon, 21 Aug 2023
- 664 - Explainable AI for Biology and Medicine with Su-In Lee - #642 Mon, 14 Aug 2023
- 663 - Transformers On Large-Scale Graphs with Bayan Bruss - #641 Mon, 07 Aug 2023
- 662 - The Enterprise LLM Landscape with Atul Deo - #640 Mon, 31 Jul 2023
- 661 - BloombergGPT - an LLM for Finance with David Rosenberg - #639 Mon, 24 Jul 2023
- 660 - Are LLMs Good at Causal Reasoning? with Robert Osazuwa Ness - #638 Mon, 17 Jul 2023
- 659 - Privacy vs Fairness in Computer Vision with Alice Xiang - #637 Mon, 10 Jul 2023
- 658 - Unifying Vision and Language Models with Mohit Bansal - #636 Mon, 03 Jul 2023
- 657 - Data Augmentation and Optimized Architectures for Computer Vision with Fatih Porikli - #635 Mon, 26 Jun 2023
- 656 - Mojo: A Supercharged Python for AI with Chris Lattner - #634 Mon, 19 Jun 2023
- 655 - Stable Diffusion and LLMs at the Edge with Jilei Hou - #633 Mon, 12 Jun 2023
- 654 - Modeling Human Behavior with Generative Agents with Joon Sung Park - #632 Mon, 05 Jun 2023
- 653 - Towards Improved Transfer Learning with Hugo Larochelle - #631 Mon, 29 May 2023