Azure Batch - Highlights

Condensed notes from the official documentation:

Azure Batch is a platform service for running large-scale parallel and high-performance computing (HPC) applications efficiently in the cloud. 

Azure Batch schedules compute-intensive work to run on a managed pool of virtual machines, and can automatically scale compute resources to meet the needs of your jobs.

It is suited to workloads that need large-scale parallel processing and massive computational power, such as financial risk modeling, 3D image rendering, media transcoding, and genetic sequence analysis.

At a high level, Azure Batch involves a number of components working together. 

The first is the Azure Batch account, which acts as a container for all Batch resources. The Batch workflow begins by uploading your input data and application files to an Azure Storage account associated with the Batch account.

You then create a Batch pool with as many Windows or Linux compute nodes (virtual machines) as needed. The pool can also be autoscaled if the demands of your workloads vary over time. A Batch account can contain many Batch pools.
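
To make the autoscaling idea concrete, here is a minimal Python sketch of the kind of decision an autoscale rule encodes: size the pool from the number of pending tasks, within fixed bounds. This is not the syntax Batch uses for its autoscale formulas. The function name target_node_count and its parameters (tasks_per_node, min_nodes, max_nodes) are hypothetical and chosen only for illustration.

```python
import math


def target_node_count(pending_tasks: int, tasks_per_node: int = 4,
                      min_nodes: int = 0, max_nodes: int = 10) -> int:
    """Return how many nodes a pool should have for the current backlog.

    Hypothetical helper: the names and defaults are assumptions,
    not part of the Azure Batch service or SDK.
    """
    if pending_tasks < 0 or tasks_per_node <= 0:
        raise ValueError("pending_tasks must be >= 0 and tasks_per_node > 0")
    # Enough nodes to run every pending task at once, rounded up.
    wanted = math.ceil(pending_tasks / tasks_per_node)
    # Clamp to the configured bounds so the pool never grows without limit.
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    for backlog in (0, 3, 17, 500):
        print(backlog, "pending ->", target_node_count(backlog), "nodes")
```

The upper bound matters in practice because you pay for every node that is running, so an uncapped rule could turn a burst of tasks into an unexpectedly large bill.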

The Batch service will then handle bringing the nodes online and scheduling tasks for execution onto the nodes. 

Once you've created one or more pools, you create individual jobs. A job acts as a logical container for the tasks you schedule and lets those tasks share common properties. Tasks describe how the work actually gets done: a task can invoke a command line directly, or it can run an application that you have uploaded to Azure Storage.

Before the tasks begin execution, they may download any data and application files from storage that they need for processing. 

Azure Batch uses parallel tasks to split a job across compute nodes.
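
The splitting step can be sketched locally in Python, assuming each input file can be processed independently. The sketch divides a list of inputs into task descriptions and runs them concurrently, with threads standing in for compute nodes. The names make_tasks and run_task, the dictionary layout, and the file names are all hypothetical; nothing here calls the Batch service.

```python
from concurrent.futures import ThreadPoolExecutor


def make_tasks(job_id, input_files, files_per_task=2):
    """Split a list of input files into independent task descriptions."""
    tasks = []
    for start in range(0, len(input_files), files_per_task):
        chunk = input_files[start:start + files_per_task]
        tasks.append({
            "job_id": job_id,
            "task_id": f"{job_id}-task{len(tasks) + 1}",
            "inputs": chunk,
        })
    return tasks


def run_task(task):
    """Stand-in for real work: total length of the input file names."""
    return task["task_id"], sum(len(name) for name in task["inputs"])


if __name__ == "__main__":
    files = ["a.png", "b.png", "c.png", "d.png", "e.png"]
    tasks = make_tasks("render", files)
    # Each worker thread plays the role of one compute node.
    with ThreadPoolExecutor(max_workers=3) as pool:
        for task_id, result in pool.map(run_task, tasks):
            print(task_id, result)
```

The key property is that no task depends on another task's output, which is what lets a scheduler place them on whichever nodes are free.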

While the tasks are executing, you can query the status of the nodes and the progress of the tasks. Once the tasks complete, their output can be examined or pushed to Azure Storage.

The Batch scheduling and management service is free. You only pay for the underlying compute, storage, and networking resources that you use.
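
As a rough illustration of that pricing model, the following Python sketch adds up compute and storage charges and treats the Batch service itself as a zero-cost line item. The function name estimate_cost and every rate used in the example are hypothetical placeholders, not real Azure prices, and networking charges are left out for brevity.

```python
def estimate_cost(nodes, hours, vm_rate_per_hour,
                  storage_gb=0.0, storage_rate_per_gb=0.0):
    """Estimate the cost of a Batch run from its underlying resources.

    All rates are caller-supplied assumptions; look up current prices
    for your region and VM size before relying on the result.
    """
    compute = nodes * hours * vm_rate_per_hour
    storage = storage_gb * storage_rate_per_gb
    batch_service_fee = 0.0  # scheduling and management are not billed
    return compute + storage + batch_service_fee


if __name__ == "__main__":
    # Hypothetical run: 10 nodes for 2 hours plus 100 GB of storage.
    total = estimate_cost(10, 2, vm_rate_per_hour=0.5,
                          storage_gb=100, storage_rate_per_gb=0.02)
    print(f"Estimated cost: {total:.2f}")
```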

The Azure CLI can be used to create all the components of the Azure Batch workflow (accounts, pools, jobs, and tasks) and to monitor their status, progress, and outputs.
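
The sequence of CLI calls for the pool, job, and task steps might look like the sketch below, which builds the commands as argument lists in Python without running them. It assumes a Batch account already exists and that you have authenticated against it (for example with az batch account login). The helper name batch_cli_commands is hypothetical, the image and node agent values are placeholders you must fill in, and the flags should be checked against the az batch reference for your installed CLI version.

```python
import shlex


def batch_cli_commands(pool_id, job_id, task_id, command_line,
                       image, node_agent_sku_id,
                       vm_size="Standard_A1_v2", nodes=2):
    """Return the az commands for one pool, one job, and one task."""
    return [
        # 1. Create the pool of compute nodes.
        ["az", "batch", "pool", "create", "--id", pool_id,
         "--vm-size", vm_size,
         "--target-dedicated-nodes", str(nodes),
         "--image", image,
         "--node-agent-sku-id", node_agent_sku_id],
        # 2. Create a job that runs on that pool.
        ["az", "batch", "job", "create", "--id", job_id,
         "--pool-id", pool_id],
        # 3. Add a task that invokes a command line.
        ["az", "batch", "task", "create", "--job-id", job_id,
         "--task-id", task_id, "--command-line", command_line],
        # 4. Check the task's status.
        ["az", "batch", "task", "show", "--job-id", job_id,
         "--task-id", task_id],
    ]


if __name__ == "__main__":
    commands = batch_cli_commands(
        "mypool", "myjob", "mytask1", "/bin/bash -c 'echo hello'",
        image="PUBLISHER:OFFER:SKU",            # placeholder, not a real image
        node_agent_sku_id="NODE_AGENT_SKU_ID",  # placeholder
    )
    for argv in commands:
        print(shlex.join(argv))
```

Keeping each command as a list means it can later be passed to subprocess.run unchanged, without the quoting problems that come from building a shell string by hand.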

Once a Batch scenario gets to a certain scale, it becomes unwieldy to use the Azure CLI. 

Azure Batch Explorer is a free client tool that helps with creating, managing, monitoring, and debugging Azure Batch applications. It's a standalone tool dedicated to the Azure Batch workflow and can be downloaded for Windows, macOS, and Linux.
