CycleCloud is a product that
- makes it easy to create and manage HPC compute clusters
- integrates with various HPC schedulers, such as PBS, Slurm, and HPC Pack
- is deployed by the user
See more at
- https://docs.microsoft.com/en-us/azure/cyclecloud/overview
- https://docs.microsoft.com/en-us/azure/cyclecloud/qs-install-marketplace
CycleCloud addresses two problems:
- Cluster provisioning, through Cluster Templates
- Cluster autoscaling, through the autoscale lib
The autoscale lib is a Python library (`hpc.autoscale`), which depends on the CycleCloud Python Client (`cyclecloud.client`).
NOTE in the diagram:
- The "Autoscale Lib" is not a standalone process but part of the "Autoscale Routine" process. They are drawn separately to show the workflow between them.
- The "Autoscale Routine" and "Scheduler" are two standalone processes.
https://github.com/Azure/cyclecloud-scalelib/tree/master/example-celery
The autoscale routine runs periodically to decide whether to scale a cluster up or down. In each run, it:
- Collects host and job information from a scheduler (Celery in the example).
- Adds the hosts and jobs to a Demand Calculator (`demandcalculator` from `hpc.autoscale.job`) and calculates the demand.
- Launches new hosts and/or deletes existing hosts, as recommended by the Demand Calculator.
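
The three steps above can be sketched in plain Python. This is a toy illustration only and does not use the `hpc.autoscale` API: `Host`, `Job`, `Demand`, `calculate_demand`, and `autoscale_once` are hypothetical names, and the first-fit packing by core count is an assumed stand-in for whatever the real Demand Calculator does. The `launch` and `delete` callbacks stand in for the calls that would eventually reach CycleCloud.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Host:
    name: str
    total_cores: int
    used_cores: int = 0  # cores already taken by running jobs


@dataclass
class Job:
    name: str
    cores: int  # cores requested by a queued job


@dataclass
class Demand:
    new_hosts: int         # number of hosts to launch
    idle_hosts: List[str]  # existing hosts with no work, candidates for deletion


def calculate_demand(hosts: List[Host], queued_jobs: List[Job],
                     cores_per_new_host: int) -> Demand:
    """Hypothetical demand calculation: first-fit packing by core count."""
    free = {h.name: h.total_cores - h.used_cores for h in hosts}
    assigned = {h.name for h in hosts if h.used_cores > 0}
    new_free: List[int] = []  # free cores on each host we would launch

    for job in queued_jobs:
        # Prefer an existing host with enough free cores.
        target = next((n for n, c in free.items() if c >= job.cores), None)
        if target is not None:
            free[target] -= job.cores
            assigned.add(target)
            continue
        # Otherwise reuse a planned new host, or plan one more.
        slot = next((i for i, c in enumerate(new_free) if c >= job.cores), None)
        if slot is None:
            if job.cores > cores_per_new_host:
                raise ValueError(f"job {job.name} does not fit on any host")
            new_free.append(cores_per_new_host)
            slot = len(new_free) - 1
        new_free[slot] -= job.cores

    idle = [h.name for h in hosts if h.name not in assigned]
    return Demand(new_hosts=len(new_free), idle_hosts=idle)


def autoscale_once(hosts: List[Host], queued_jobs: List[Job],
                   cores_per_new_host: int,
                   launch: Callable[[int], None],
                   delete: Callable[[str], None]) -> Demand:
    """One pass of the routine: calculate demand, then act on it."""
    demand = calculate_demand(hosts, queued_jobs, cores_per_new_host)
    if demand.new_hosts:
        launch(demand.new_hosts)
    for name in demand.idle_hosts:
        delete(name)
    return demand
```

In a real integration, the host and job lists would come from the scheduler, and this function would be called on a timer. Passing `launch` and `delete` as callbacks keeps the decision logic testable without a running cluster.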
- PBS Pro (using the Demand Calculator from autoscale lib)
- HPC Pack (using the autoscale lib but not the Demand Calculator)
- Slurm (using CycleCloud Client directly, without the autoscale lib)
NOTE: Whether or not an integration uses the autoscale lib, the main workflow remains the same as in the overview above. The point is that the autoscale routine decides what to do and then carries it out by calling the CycleCloud REST API.