Thinking & Reasoning Models
Nikita Namjoshi's video "How do thinking and reasoning models work?" from the Google for Developers channel explains the concepts behind "thinking models" or "reasoning models," such as Gemini, and how they use more computation at inference time to achieve better results in complex tasks.

Video Summary

The video focuses on how Large Language Models (LLMs) can be improved to handle complex tasks like coding, advanced mathematics, and data analysis by utilizing more compute power during the generation phase (inference or test time).

The Problem: LLMs generate responses by predicting one token at a time. When solving complex problems, the model has to figure out the entire solution in a single pass to generate the correct final answer, which is difficult.

The Solution: Chain of Thought (CoT): CoT prompting is a technique where the model is prompted to generate a series of intermediate steps (a "chain of thought") that lead to the final answer...
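
To make the contrast between a single-pass answer and a chain of thought concrete, here is a minimal Python sketch. It is not taken from the video and does not call any model. The helper names (build_direct_prompt, build_cot_prompt, extract_final_answer), the prompt wording, and the "Final answer:" convention are all assumptions chosen for illustration, and the sample response is hand-written rather than real model output.

```python
import re
from typing import Optional


def build_direct_prompt(question: str) -> str:
    """Hypothetical helper: ask for the answer only, with no intermediate steps."""
    return f"Question: {question}\nReply with only the final answer."


def build_cot_prompt(question: str) -> str:
    """Hypothetical helper: ask the model to write out intermediate steps first."""
    return (
        f"Question: {question}\n"
        "Work through the problem step by step, writing out each intermediate step.\n"
        "Then give the result on its own line in the form 'Final answer: <answer>'."
    )


def extract_final_answer(response: str) -> Optional[str]:
    """Return the text after the last 'Final answer:' line, or None if absent.

    The 'Final answer:' marker is an assumed convention set by build_cot_prompt,
    not something a model is guaranteed to follow.
    """
    matches = re.findall(r"^Final answer:\s*(.+)$", response, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None


question = (
    "A train travels 60 km in the first hour and 90 km in the second hour. "
    "What is its average speed?"
)

# Hand-written illustration of what a step-by-step response might look like.
sample_response = (
    "Step 1: Total distance is 60 + 90 = 150 km.\n"
    "Step 2: Total time is 2 hours.\n"
    "Step 3: Average speed is 150 / 2 = 75 km/h.\n"
    "Final answer: 75 km/h"
)

if __name__ == "__main__":
    print(build_direct_prompt(question))
    print()
    print(build_cot_prompt(question))
    print()
    print(extract_final_answer(sample_response))
```

With the direct prompt, every token of the reply has to be the answer itself. With the chain-of-thought prompt, the model spends extra generated tokens on the intermediate steps before committing to a final line, which is the extra inference-time computation the summary describes. In practice either prompt string would be sent to whichever model API is in use.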