Fleets and autoscaling

note

This doc applies to Pro and Enterprise customers only—on the Explore tier, your processes are scheduled on shared hardware.

Fleets

For our Pro and Enterprise customers, all processes run on a dedicated single-tenant fleet. A fleet is a pool of compute resources spanning multiple regions that can dynamically scale up or down based on demand.

Key metrics

Hathora monitors two live fleet metrics to dynamically scale bare metal and cloud capacity in each region:

  • Provisioned: total vCPUs running in a region
  • Utilization: percentage of vCPUs in use. Since it’s based on requested processes, it may exceed 100%.
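As a rough illustration of why utilization can exceed 100%, the sketch below assumes utilization is the vCPUs requested by processes divided by the vCPUs provisioned. The function and parameter names are hypothetical, and the exact formula Hathora uses is not stated here.

```python
def utilization_percent(requested_vcpus: float, provisioned_vcpus: float) -> float:
    """Return utilization as a percentage of provisioned vCPUs.

    Assumption: utilization = requested vCPUs / provisioned vCPUs.
    """
    if provisioned_vcpus <= 0:
        raise ValueError("provisioned_vcpus must be positive")
    return 100.0 * requested_vcpus / provisioned_vcpus


# Processes requesting 36 vCPUs on 32 provisioned vCPUs report more than 100%.
print(utilization_percent(36, 32))
```

Under this assumption, a region is over 100% whenever processes have requested more vCPUs than are currently provisioned.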

In the Fleet tab, you can view up to 7 days of history for these metrics in each region.

Cloud minimum

You can configure a cloud minimum capacity in each region for faster startup times. Setting a cloud minimum is helpful in regions without bare metal capacity or ahead of a significant event (e.g. playtests or launch day).

tip

In regions with no capacity, the fleet scales from 0, and adding cloud capacity takes about 2 minutes. We do not recommend relying on this for live production use cases.

Cloud minimums can also be updated via the Fleets API, which enables use cases like a cron job or script that changes minimums programmatically.
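A scheduled script of this kind might be sketched as follows. Everything here is an assumption for illustration: the schedule, the region name, the use of vCPUs as the unit, and the `update_minimum` callback, which stands in for whatever Fleets API call you use. The actual endpoint and its parameters are not shown; consult the Fleets API reference for those.

```python
from datetime import datetime, timezone
from typing import Callable, List, Tuple

# Hypothetical schedule: (start_hour_utc, end_hour_utc, cloud_minimum_vcpus).
SCHEDULE: List[Tuple[int, int, int]] = [(17, 23, 64)]
DEFAULT_MINIMUM_VCPUS = 0


def desired_cloud_minimum(now: datetime) -> int:
    """Pick the cloud minimum for the given moment from the schedule."""
    hour = now.astimezone(timezone.utc).hour
    for start_hour, end_hour, vcpus in SCHEDULE:
        if start_hour <= hour < end_hour:
            return vcpus
    return DEFAULT_MINIMUM_VCPUS


def apply_cloud_minimum(
    region: str,
    now: datetime,
    update_minimum: Callable[[str, int], None],
) -> int:
    """Compute the target and hand it to the caller-supplied updater.

    `update_minimum` is a placeholder for a real Fleets API call.
    """
    target = desired_cloud_minimum(now)
    update_minimum(region, target)
    return target


if __name__ == "__main__":
    # Stand-in updater that only prints; replace with a real API call.
    apply_cloud_minimum(
        "example-region",
        datetime.now(timezone.utc),
        lambda region, vcpus: print(f"{region}: cloud minimum -> {vcpus}"),
    )
```

Running a script like this from cron once an hour would raise the minimum ahead of a known peak window and lower it again afterwards.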

Intelligent autoscaling

Hathora’s autoscaler helps ensure game servers are allocated within 5 seconds, while also seamlessly balancing resources across bare metal and cloud infrastructure. Bare metal capacity is always used first when provisioning processes to ensure your committed capacity is used efficiently.

The autoscaler monitors both provisioned capacity and utilization in each region. It targets a utilization between 75% and 85% in each region, so at least 15% of capacity remains available as a buffer. When the average utilization:

  • Exceeds 85%: the autoscaler adds as much cloud capacity as needed to bring utilization below 85%, even during sudden bursts of usage. It takes approximately 2 minutes to add new cloud capacity.
  • Drops below 75%: once your processes stop, the autoscaler automatically removes cloud capacity.
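The thresholds above can be sketched as a small decision function. This is not Hathora's implementation. It assumes utilization is requested vCPUs over provisioned vCPUs, counts capacity in whole vCPUs, and uses hypothetical names throughout.

```python
SCALE_UP_PERCENT = 85    # add cloud capacity above this utilization
SCALE_DOWN_PERCENT = 75  # remove cloud capacity below this utilization


def scaling_action(requested_vcpus: int, provisioned_vcpus: int) -> str:
    """Classify a region as 'scale_up', 'scale_down', or 'hold'."""
    if provisioned_vcpus == 0:
        # Scaling from 0: any demand at all requires new capacity.
        return "scale_up" if requested_vcpus > 0 else "hold"
    utilization = 100 * requested_vcpus / provisioned_vcpus
    if utilization > SCALE_UP_PERCENT:
        return "scale_up"
    if utilization < SCALE_DOWN_PERCENT:
        return "scale_down"
    return "hold"


def vcpus_to_add(requested_vcpus: int, provisioned_vcpus: int) -> int:
    """Smallest whole number of vCPUs that brings utilization to 85% or less."""
    # Integer ceiling of requested / 0.85, avoiding floating-point rounding.
    needed = (requested_vcpus * 100 + SCALE_UP_PERCENT - 1) // SCALE_UP_PERCENT
    return max(0, needed - provisioned_vcpus)


# 90 requested vCPUs on 100 provisioned is 90% utilization.
print(scaling_action(90, 100), vcpus_to_add(90, 100))
```

In this example, adding 6 vCPUs yields 90 / 106, or about 84.9%, which is back under the 85% threshold.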

Scaling down

Hathora aggressively drains underutilized nodes in regions where utilization falls below 75%. The autoscaler safely and cost-efficiently removes cloud capacity by terminating the least utilized backend cloud instance after all game processes on it have completed.
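One way to picture this drain-then-terminate behavior is the sketch below. The `CloudInstance` fields and function names are hypothetical, and the real autoscaler may weigh other factors when choosing an instance.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CloudInstance:
    instance_id: str
    vcpus: int                # vCPUs provisioned on this instance
    requested_vcpus: float    # vCPUs requested by its running processes
    running_processes: int    # game processes still active


def pick_drain_candidate(instances: List[CloudInstance]) -> Optional[CloudInstance]:
    """Return the least utilized cloud instance, or None if there are none."""
    if not instances:
        return None
    return min(instances, key=lambda i: i.requested_vcpus / i.vcpus)


def can_terminate(instance: CloudInstance) -> bool:
    """An instance is only removed once every game process on it has finished."""
    return instance.running_processes == 0


fleet = [
    CloudInstance("a", vcpus=8, requested_vcpus=6.0, running_processes=3),
    CloudInstance("b", vcpus=8, requested_vcpus=1.0, running_processes=1),
]
candidate = pick_drain_candidate(fleet)
print(candidate.instance_id, can_terminate(candidate))
```

Here instance `b` is selected for draining, but it is not terminated until its remaining process completes.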