Configuring the Auto Scaling Policy
The Cluster Manager auto scaling policy determines when the Cluster Manager starts new nodes, when it terminates running nodes, and how it handles high transfer load. You can configure a cluster's auto scaling policy from the Cluster Manager console by selecting the cluster and clicking Auto Scaling > Edit. The table below describes the available configuration options:
Feature | Description | Default |
---|---|---|
Max nodes | The total number of nodes you can have running in a cluster. During an image upgrade, the cluster may spin up additional nodes, but respects this maximum again once the upgrade is complete. | 4 |
Min available nodes | The minimum number of nodes available at any given time. Nodes that are in the "RUNNING" state and the "IDLE" or "LOW USAGE" pool are considered available. Nodes that are in a "HIGH USAGE" or "DEGRADED" pool are considered unavailable. Note: The default template assumes a minimum of two nodes. You can set the default to one using the advanced template configuration. | 2 |
Min idle nodes | Idle nodes allow the cluster to absorb sudden surges in demand for transfer capacity that cannot otherwise be met, because starting new nodes takes time. If none of the cluster nodes are available, the Cluster Manager starts a new node, and idle nodes take on the extra load until the new node is online. The new node is then kept as an idle node. This feature is optional; in other words, you do not need to have idle nodes. | 1 |
Min threshold | The minimum threshold specifies the percent utilization at which a node is moved back to the "LOW USAGE" pool after it was in the "HIGH USAGE" pool. Nodes in the "LOW USAGE" pool are considered available: they accept additional transfers and are added back into the DNS pool. For details on how utilization is calculated, see Configuring System Utilization below. | 50% |
Max threshold | The maximum threshold specifies the percent utilization at which a node is moved to the "HIGH USAGE" pool. A node in the "HIGH USAGE" pool is no longer available for additional transfers, and its IP address is removed from DNS. The node remains in this pool until its utilization goes below the minimum threshold. For details on how utilization is calculated, see Configuring System Utilization below. | 80% |
Max idle duration | The Cluster Manager terminates a node if it has been idle longer than the max idle duration and the number of available nodes is in excess of the minimum available nodes. This condition can occur after the cluster has scaled up, and the transfer load has subsided. | 1h |
Max start frequency: Count | The maximum number of additional nodes that can be launched within the start frequency duration. Note: This setting does not apply to the initial nodes provisioned according to the Min available nodes setting. For example, you can launch a cluster without issue where the settings are: | 5 |
Max start frequency: Duration | The time window within which the start frequency count cannot be exceeded. | 1h |
Transfers on max load | This setting determines how the cluster reacts to additional transfer requests when none of the cluster nodes are available (in other words, they are all in the "HIGH USAGE" pool or in another non-available state). If this setting is set to allow, the cluster allows the transfer to start using existing "HIGH USAGE" cluster nodes. If this setting is set to refuse, the cluster refuses the transfer. | allow |
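The interaction between Min threshold, Max threshold, and Transfers on max load can be sketched in a few lines of Python. This is an illustration of the behavior described in the table, not the Cluster Manager implementation: the names `Policy`, `next_pool`, and `accept_transfer` are hypothetical, and the exact comparison at the maximum threshold (`>=` here) is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence

# Pools that the table describes as "available".
AVAILABLE_POOLS = ("IDLE", "LOW USAGE")


@dataclass
class Policy:
    """Hypothetical container for three of the settings in the table."""
    min_threshold: float = 50.0           # percent, table default
    max_threshold: float = 80.0           # percent, table default
    transfers_on_max_load: str = "allow"  # "allow" or "refuse"


def next_pool(current_pool: str, utilization: float, policy: Policy) -> str:
    """Return the pool for a running node, given its utilization in percent."""
    # Assumption: the node moves to "HIGH USAGE" when utilization reaches
    # the maximum threshold (the source does not say whether this is > or >=).
    if utilization >= policy.max_threshold:
        return "HIGH USAGE"
    if current_pool == "HIGH USAGE":
        # A "HIGH USAGE" node stays there until utilization goes below
        # the minimum threshold, which prevents rapid flapping.
        if utilization < policy.min_threshold:
            return "LOW USAGE"
        return "HIGH USAGE"
    # The boundary between "IDLE" and "LOW USAGE" is not modeled here.
    return current_pool


def accept_transfer(node_pools: Sequence[str], policy: Policy) -> bool:
    """Decide whether a new transfer request is accepted by the cluster."""
    if any(pool in AVAILABLE_POOLS for pool in node_pools):
        return True
    # No available node: fall back to the "Transfers on max load" setting.
    return policy.transfers_on_max_load == "allow"
```

With the default thresholds, a node that reaches 85% moves to "HIGH USAGE" and stays there at 65%; it returns to "LOW USAGE" only once utilization drops below 50%. The gap between the two thresholds is what keeps a node from being added to and removed from DNS repeatedly when its load hovers around a single value.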
Configuring System Utilization
You can configure the way system utilization is calculated. There are two dimensions to calculating the utilization of a given node: bandwidth and buffer utilization.
- Bandwidth: Bandwidth is a measurement of aggregate throughput as calculated by FASP. This dimension is disabled by default.
- Buffer utilization: Buffer utilization is a measurement of the usage of Trapd buffers. Trapd is the underlying service that writes data to object storage. Trapd buffers are highly utilized under two conditions: high speed or high concurrency.
These dimensions are controlled by the following settings:
"utilization_calculation_config": {
"max_bandwidth": "1000000000 bps (1.00 Gbps)",
"use_bandwidth": false,
"use_buffer_utilization": true
}
For example, to include bandwidth in the utilization calculation, set use_bandwidth to true and set max_bandwidth to the appropriate value.
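As a rough sketch of that change, the following Python snippet takes the fragment shown above, enables the bandwidth dimension, and prints the result. The helper name `enable_bandwidth` is hypothetical, and the snippet only manipulates the fragment in memory; where the fragment is stored and which formats max_bandwidth accepts are not covered here, so the value from the fragment above is reused as-is.

```python
import json

# The configuration fragment shown above, as a Python dictionary.
fragment = {
    "utilization_calculation_config": {
        "max_bandwidth": "1000000000 bps (1.00 Gbps)",
        "use_bandwidth": False,
        "use_buffer_utilization": True,
    }
}


def enable_bandwidth(config: dict, max_bandwidth: str) -> dict:
    """Return a copy of the config with the bandwidth dimension enabled."""
    # Round-trip through JSON to get a deep copy and leave the input untouched.
    updated = json.loads(json.dumps(config))
    section = updated["utilization_calculation_config"]
    section["use_bandwidth"] = True
    section["max_bandwidth"] = max_bandwidth
    return updated


# Assumption: the string format of max_bandwidth is copied from the
# fragment above rather than taken from a documented specification.
updated = enable_bandwidth(fragment, "1000000000 bps (1.00 Gbps)")
print(json.dumps(updated, indent=2))
```

Buffer utilization stays enabled in this sketch, so both dimensions would contribute to the calculation.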