Kaggle's 5-Day Gen AI Intensive Course - Notes
Kaggle and Google are running a 5-day Gen AI course with daily assignments and lectures from November 11 to 15. The recordings are available on YouTube as a playlist.
Curriculum:
Day 1: Foundational Models & Prompt Engineering - Explore the evolution of LLMs, from transformers to techniques like fine-tuning and inference acceleration, and master the art of prompt engineering for optimal LLM interaction.
Resources -
* “Prompt Engineering” whitepaper (65 pages)
* Code Lab - Day 1 - Prompting
Day 2: Embeddings and Vector Stores/Databases - Learn about the conceptual underpinning of embeddings and vector databases, including embedding methods, vector search algorithms, and real-world applications with LLMs, as well as their tradeoffs.
Resources -
* Listen to the summary podcast episode (created by NotebookLM).
* “Embeddings and Vector Stores/Databases” whitepaper (52 pages).
* Code labs on Kaggle:
- Build a RAG question-answering system over custom documents
- Explore text similarity with embeddings
- Build a neural classification network with Keras using embeddings
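The embedding labs above all reduce to one core operation: ranking documents by cosine similarity between embedding vectors, which is also the retrieval step of a RAG system. A minimal pure-Python sketch, with toy hand-written vectors standing in for real embedding-model output:

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs, top_k=2):
    # Rank documents by similarity to the query vector -- the
    # retrieval half of a RAG question-answering pipeline.
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

# Toy 3-dimensional "embeddings"; a real system would call an embedding model.
docs = {
    "doc_cats":   [0.9, 0.1, 0.0],
    "doc_dogs":   [0.8, 0.2, 0.1],
    "doc_stocks": [0.0, 0.1, 0.9],
}
print(retrieve([1.0, 0.0, 0.0], docs))  # pet-related docs rank first
```

In production, the sort would be replaced by an approximate nearest-neighbour index (the vector search algorithms the whitepaper covers), but the similarity measure is the same.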
Day 3: Generative AI Agents - Learn to build sophisticated AI agents by understanding their core components and the iterative development process.
Resources -
* “Generative AI Agents” whitepaper.
* Code labs:
Day 4: Domain-Specific LLMs - Delve into the creation and application of specialized LLMs like SecLM and Med-PaLM, with insights from the researchers who built them.
Resources -
* Summary podcast episode for this unit (created by NotebookLM).
* “Solving Domain-Specific Problems Using LLMs” whitepaper.
* Code labs:
Day 5: MLOps for Generative AI - Discover how to adapt MLOps practices for Generative AI and leverage Vertex AI's tools for foundation models and generative AI applications.
Resources -
* “MLOps for Generative AI” whitepaper.
Topics discussed -
* Search grounding: This feature integrates Google Search into Gemini, improving the model's ability to provide accurate and relevant information by grounding responses in real-world data. It can be used even with smaller models like Gemini 1.5 Flash, not just Gemini 1.5 Pro.
* OpenAI compatibility: Developers familiar with OpenAI's SDKs and libraries can now use Gemini models with minimal code changes. This compatibility makes it easier for developers to experiment with and adopt Gemini models in their existing projects.
* Flash series: The Flash series models, such as Gemini 1.5 Flash and Gemini 1.5 Flash-8B, are designed to be smaller and more cost-effective than the Pro models. Their lower cost per token lets developers incorporate AI into applications without worrying about high expenses, and they are particularly well suited to tasks that require fast inference or code execution.
* Multimodal capabilities: Gemini's ability to handle multimodal inputs opens up possibilities for innovative applications, such as converting text documents into engaging videos or audio experiences.
* Two-million-token context window: Gemini 1.5 Pro boasts a context window of up to two million tokens, significantly larger than that of other available large language models. This extended context window allows the model to process and understand much larger inputs, potentially improving performance on tasks that require long-range dependencies.
* NotebookLM is built using the standard Gemini 1.5 Pro and 1.5 Flash models without any specialised fine-tuning.
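To make the search-grounding point concrete: enabling it is a matter of adding a `tools` entry to the request. The sketch below only builds the JSON body; the field names follow the Gemini REST API as documented around the time of the course, the endpoint URL is quoted from memory, and no request is actually sent (key handling is omitted):

```python
import json

# Gemini REST endpoint (API key would be passed separately).
API_URL = ("https://generativelanguage.googleapis.com/v1beta/"
           "models/gemini-1.5-flash:generateContent")

def grounded_request(question):
    # The "google_search_retrieval" tool asks the model to ground its
    # answer in Google Search results rather than relying only on its
    # training data.
    return {
        "contents": [{"parts": [{"text": question}]}],
        "tools": [{"google_search_retrieval": {}}],
    }

body = grounded_request("Who won the most recent Nobel Prize in Physics?")
print(json.dumps(body, indent=2))
```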
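The OpenAI-compatibility point means an existing OpenAI-style chat request needs only a new base URL and model name. The sketch below builds such a payload with the stdlib alone; the base URL is Google's published OpenAI-compatible endpoint (quoted from memory), and no call is made:

```python
import json

# Google's OpenAI-compatible endpoint: swap this in for api.openai.com
# in an existing OpenAI SDK or HTTP client configuration.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"

def chat_request(prompt, model="gemini-1.5-flash"):
    # Same shape as an OpenAI /chat/completions body; only the model
    # name (and, at request time, the URL and API key) change.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = chat_request("Summarize the Day 1 whitepaper in two sentences.")
print(json.dumps(payload))
```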
Techniques to improve and refine large language models -
* Reinforcement Learning from Human Feedback (RLHF): RLHF utilizes a reward model to guide the language model towards generating responses that align with human preferences. This iterative process helps improve the model's helpfulness, safety, and factual accuracy.
* Distillation: This technique involves transferring knowledge from a larger, more powerful model to a smaller one, enabling the creation of more efficient models that can be deployed on devices with limited resources.
* Prompt engineering techniques: Zero-shot prompting, few-shot prompting, chain-of-thought prompting, and JSON mode are covered as ways to enhance the model's performance on various tasks.
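The reward model behind RLHF (first technique above) is typically trained on human preference pairs with a Bradley-Terry-style loss, -log σ(r_chosen − r_rejected). A pure-Python sketch with made-up reward scores:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(r_chosen, r_rejected):
    # Pairwise preference loss: pushes the reward of the human-preferred
    # response above the reward of the rejected one.
    return -math.log(sigmoid(r_chosen - r_rejected))

# Toy scores from a hypothetical reward model: the loss shrinks as the
# margin between the chosen and rejected responses grows.
print(preference_loss(2.0, 0.0))  # small: model agrees with the human
print(preference_loss(0.0, 2.0))  # large: model disagrees
```

The trained reward model then scores candidate responses during the reinforcement-learning phase that steers the language model itself.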
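Distillation (second technique above) commonly trains the small student model to match the large teacher's temperature-softened output distribution, e.g. via a KL-divergence loss. A minimal sketch over a single toy logit vector; the logits and temperature are invented for illustration:

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature flattens the distribution, exposing the teacher's
    # relative confidence in near-miss classes ("dark knowledge").
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened distributions.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher      = [3.0, 1.0, 0.2]   # toy logits from a large model
good_student = [2.9, 1.1, 0.3]   # closely mimics the teacher
bad_student  = [0.2, 1.0, 3.0]   # disagrees with the teacher
print(distillation_loss(teacher, good_student))  # near zero
print(distillation_loss(teacher, bad_student))   # much larger
```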
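Few-shot prompting, one of the techniques listed above, is mechanically just assembling labeled demonstrations ahead of the new input so the model continues the pattern. A sketch with an invented sentiment-classification task:

```python
def few_shot_prompt(examples, query):
    # Each (input, output) pair becomes one demonstration; the final
    # input is left unanswered for the model to complete.
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

examples = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
]
prompt = few_shot_prompt(examples, "A solid, if unspectacular, sequel.")
print(prompt)
```

Zero-shot prompting is the degenerate case with an empty `examples` list; chain-of-thought adds worked reasoning steps to each demonstration's output.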