Notes: Building AI-Ready Applications with Azure Databases and AI
The LinkedIn Learning short course Building AI-Ready Applications with Azure Databases and AI by Muazma Zahid, Principal Group PM Manager at Microsoft, provides an overview of creating AI-ready applications using Azure's data and AI services.
The code related to the course is in GitHub Codespaces.
Highlights:
Key services Azure offers -
Azure AI Search is a search-as-a-service solution that integrates AI capabilities to enhance search results. It can analyze and understand the content of your data, providing more relevant and accurate search results.
Azure AI Studio is an integrated development environment for building, training, and deploying AI models.
Azure Data Services provide the infrastructure for storing, processing, and analyzing large amounts of data. Azure offers a variety of database services, including Azure SQL Database, Cosmos DB, PostgreSQL, and MySQL. These databases are optimized for AI workloads, providing high performance, scalability, and security.
Tools like Azure Data Factory and Databricks help manage data pipelines, automate data workflows, and ensure data quality.
Microsoft Fabric is an integrated platform that brings together data engineering, data integration, and data science capabilities. It allows you to build and manage data pipelines, ensuring that your data is clean, reliable, and ready for AI applications.
Data Preparation Techniques
+ Cleaning: Removing duplicates, handling missing values, and correcting errors
+ Normalization: Scaling data to a standard range
+ Feature engineering: Transforming raw data into meaningful features that enhance model accuracy
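The three techniques above can be sketched in plain Python. This is only an illustration under stated assumptions: the records, the column names (`height_cm`, `weight_kg`), the mean-fill strategy, and the BMI feature are all hypothetical and are not taken from the course.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing value.
raw_rows = [
    {"id": 1, "height_cm": 170.0, "weight_kg": 65.0},
    {"id": 2, "height_cm": 180.0, "weight_kg": None},
    {"id": 1, "height_cm": 170.0, "weight_kg": 65.0},  # exact duplicate
    {"id": 3, "height_cm": 160.0, "weight_kg": 55.0},
]


def drop_duplicates(rows):
    """Cleaning: keep the first occurrence of each identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing_with_mean(rows, column):
    """Cleaning: replace missing values in one column with the column mean."""
    known = [r[column] for r in rows if r[column] is not None]
    fill = mean(known)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def min_max_scale(values):
    """Normalization: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def add_bmi(rows):
    """Feature engineering: derive body-mass index from height and weight."""
    return [{**r, "bmi": r["weight_kg"] / (r["height_cm"] / 100) ** 2} for r in rows]


rows = drop_duplicates(raw_rows)
rows = fill_missing_with_mean(rows, "weight_kg")
rows = add_bmi(rows)
scaled_heights = min_max_scale([r["height_cm"] for r in rows])
```

In practice a dataframe library would usually handle these steps; the standard-library version is used here only to keep each step visible.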
Data Cleaning and Preparation for AI Applications
+ Dealing with missing values: Filling missing values using interpolation
+ Standardizing formats: Ensuring date formats are consistent
+ Correcting data entry errors: Fixing common typos or incorrect entries
+ Removing outliers: Identifying and removing statistical outliers
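Each of the four cleaning steps above could look roughly like the following. This is a minimal sketch with assumptions: the sample values, the accepted date formats (slash-separated dates are assumed to be day-first), the `CORRECTIONS` table, and the 1.5 x IQR outlier rule are illustrative choices, not the course's own code.

```python
from datetime import datetime
from statistics import quantiles


def interpolate_missing(series):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def standardize_date(text, formats=("%Y-%m-%d", "%d/%m/%Y", "%Y.%m.%d")):
    """Parse a date written in any accepted format and return ISO 8601 text."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")


# Hypothetical lookup table of known typos.
CORRECTIONS = {"seatle": "Seattle", "new yrok": "New York"}


def fix_entry(text):
    """Replace a known typo with its corrected spelling; otherwise just trim."""
    cleaned = text.strip()
    return CORRECTIONS.get(cleaned.lower(), cleaned)


def remove_outliers(values):
    """Drop values more than 1.5 * IQR outside the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if low <= v <= high]


readings = interpolate_missing([1.0, None, 3.0, None, None, 6.0])
dates = [standardize_date(d) for d in ("2024-03-05", "05/03/2024", "2024.03.05")]
cities = [fix_entry(c) for c in (" seatle", "New Yrok", "Boston")]
trimmed = remove_outliers([10, 12, 11, 13, 12, 11, 95])
```

Leading or trailing gaps are left as `None` by `interpolate_missing`, since there is no neighbour on one side to interpolate from.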
Vectors are numerical representations of data points, capturing their essential features.
Embeddings are dense vector representations of high-dimensional data, such as words or images. They support tasks such as:
- Making personalized recommendations
- Answering questions
- Detecting anomalies
- Searching for similar content
Vector search means finding the most similar vectors to some query vector.
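As a rough illustration of that definition, the sketch below scores every stored vector against the query with cosine similarity and returns the best matches. The three-dimensional vectors and item names are made up for the example; real embeddings have hundreds or thousands of dimensions, and this exhaustive scan is the baseline that approximate methods try to avoid.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Exhaustive vector search: score every item, return the k best names."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical toy embeddings.
catalog = {
    "laptop": [0.9, 0.1, 0.0],
    "tablet": [0.8, 0.3, 0.1],
    "banana": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
results = top_k(query, catalog, k=2)
```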
A search technique called DiskANN uses approximate nearest neighbor (ANN) search to optimize vector search by building a highly connected graph and using robust pruning to trim connections strategically. The graph is stored on high-speed SSDs and the compressed vectors in DRAM, ensuring efficient scaling.
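To give a feel for graph-based approximate search, here is a deliberately simplified sketch: it links each point to its nearest neighbours and then walks the graph greedily toward the query. It is not DiskANN itself. It omits robust pruning, vector compression, and the SSD/DRAM layout, and the function names and toy points are hypothetical.

```python
import math


def build_knn_graph(points, k):
    """Link every point to its k nearest neighbours (brute force, no pruning)."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in others[:k]]
    return graph


def greedy_search(points, graph, query, entry=0):
    """Walk the graph, always moving to the neighbour closest to the query."""
    current = entry
    current_dist = math.dist(points[current], query)
    while True:
        best, best_dist = current, current_dist
        for neighbour in graph[current]:
            d = math.dist(points[neighbour], query)
            if d < best_dist:
                best, best_dist = neighbour, d
        if best == current:
            # No neighbour is closer: a local (possibly approximate) answer.
            return current, current_dist
        current, current_dist = best, best_dist


# Hypothetical toy data: six points along a line.
points = [(float(x), 0.0) for x in range(6)]
graph = build_knn_graph(points, k=2)
query = (4.2, 0.0)
found_index, found_dist = greedy_search(points, graph, query)
```

The walk only inspects a handful of points instead of the whole dataset, which is the source of the speedup; the price is that a greedy walk can stop at a local optimum, which is why the result is called approximate.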
DiskANN powers vector search and Copilots for Microsoft Cloud. It also powers Microsoft Bing and M365 copilots.
Retrieval-Augmented Generation (RAG) is a technique that combines retrieval and generation to improve the accuracy of AI systems.
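The retrieval half of RAG can be sketched as follows. This is a toy under stated assumptions: word overlap stands in for embedding-based vector search, the knowledge base is three sentences drawn from these notes, and the generation step is left out, so the code stops at the prompt that would be sent to a generative model. All function names are hypothetical.

```python
def tokenize(text):
    """Lowercase, strip basic punctuation, and split into a set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def retrieve(question, documents, k=1):
    """Retrieval step: rank documents by word overlap with the question."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Augmentation step: place the retrieved text in front of the question."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


knowledge_base = [
    "DiskANN stores its graph on SSD and compressed vectors in DRAM.",
    "Fairlearn is an open source toolkit for assessing fairness.",
    "Normalization scales data to a standard range.",
]
question = "Where does DiskANN store its graph?"
prompt = build_prompt(question, knowledge_base)
```

Because the model is asked to answer from retrieved text rather than from memory alone, its answer can be grounded in the knowledge base, which is where the accuracy improvement comes from.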
Copilots are an example of AI applications that you can build on your own knowledge base to better serve your internal business users or your customers.
Tools for responsible AI:
Fairlearn: an open source toolkit that assesses and improves fairness of AI models.
InterpretML: an open source Python toolkit that explains black-box AI systems and enhances interpretability of AI models.