# Horizontal Scaling
Scale your application horizontally by running multiple container instances behind a load balancer.
## Basic Scaling

Set the number of replicas in your configuration:
```yaml
name: "my-scalable-app"
image: "my-org/my-app:1.2.0"
replicas: 3 # Run 3 instances
domains:
  - domain: "api.example.com"
```
Haloy automatically:
- Starts the specified number of containers
- Configures load balancing in the built-in reverse proxy
- Distributes traffic across all healthy instances
- Waits for any configured readiness stabilization window before routing to new replicas
- Monitors health checks
## Load Balancing
The built-in reverse proxy distributes traffic using round-robin by default, sending requests to each healthy container in turn.
### Traffic Flow

```
Internet → haloyd (80/443) → Container 1 (8080)
                           → Container 2 (8080)
                           → Container 3 (8080)
```
All containers receive approximately equal traffic.
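Round-robin selection is conceptually simple. As a minimal sketch in Python (the backend addresses are placeholders, not haloyd internals):

```python
from itertools import cycle

# Hypothetical backend addresses; haloyd tracks these internally.
backends = ["container-1:8080", "container-2:8080", "container-3:8080"]

def make_round_robin(targets):
    """Return a function that yields the next backend in turn."""
    pool = cycle(targets)
    return lambda: next(pool)

next_backend = make_round_robin(backends)
# Six requests visit each of the three containers twice, in order.
picks = [next_backend() for _ in range(6)]
```

With three healthy backends, traffic cycles 1 → 2 → 3 → 1 → …, which is why each container receives roughly equal load.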
## Scaling Strategies

### Start Small, Scale Up

Begin with fewer replicas and increase as needed:
```yaml
# Initial deployment
name: "my-app"
replicas: 1

# After load testing
replicas: 3

# Under heavy load
replicas: 10
```
### Environment-Based Scaling
Scale differently per environment:
```yaml
name: "my-app"
replicas: 1 # Default

targets:
  production:
    server: prod.haloy.com
    replicas: 10 # High capacity
    domains:
      - domain: "my-app.com"

  staging:
    server: staging.haloy.com
    replicas: 2 # Moderate capacity
    domains:
      - domain: "staging.my-app.com"

  development:
    server: dev.haloy.com
    replicas: 1 # Minimal resources
    domains:
      - domain: "dev.my-app.com"
```
## High Availability
For mission-critical applications, run at least 3 replicas:
```yaml
name: "critical-app"
image: "my-org/critical-app:v2.0.0"
replicas: 3 # Minimum for HA
deployment_strategy: "rolling" # Zero-downtime updates
health_check_path: "/health"
domains:
  - domain: "critical-app.com"
```
### Why 3 Replicas?
- Fault tolerance: Can lose 1 container and maintain service
- Rolling deployments: Can update without downtime
- Load distribution: Better traffic handling
- Redundancy: Protection against failures
## Resource Considerations
Each replica consumes:
- CPU
- Memory
- Disk I/O
- Network bandwidth
### Example Resource Planning

```yaml
# If each container uses:
# - 512MB RAM
# - 0.5 CPU cores
#
# For 5 replicas you need:
# - 2.5GB RAM
# - 2.5 CPU cores (plus overhead)
```
Plan server resources accordingly:
```yaml
name: "resource-intensive-app"
replicas: 5
# Ensure your server has sufficient resources:
# - At least 4GB RAM available
# - At least 4 CPU cores available
```
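The capacity arithmetic above is easy to script when planning servers. A small helper, using the example's figures (these are illustrative numbers, not measured values):

```python
def required_resources(replicas, ram_mb_per_replica, cpu_per_replica):
    """Total RAM (MB) and CPU cores needed for a given replica count."""
    return replicas * ram_mb_per_replica, replicas * cpu_per_replica

# 5 replicas at 512MB / 0.5 cores each -> 2560MB RAM, 2.5 cores,
# before accounting for OS and haloyd overhead.
ram_mb, cpu = required_resources(5, 512, 0.5)
```

Add headroom on top of the computed totals: the host also runs the OS, Docker, and haloyd itself.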
## Health Checks and Scaling
Health checks are critical for scaling:
```yaml
name: "my-app"
replicas: 5
health_check_path: "/health"
min_ready_seconds: 10
port: "8080"

# haloyd only routes to healthy containers
# Unhealthy containers are automatically excluded
```
### Stabilizing New Replicas

If new replicas sometimes pass health checks and then crash shortly afterward, add `min_ready_seconds`:
```yaml
name: "my-app"
replicas: 5
deployment_strategy: "rolling"
health_check_path: "/health"
min_ready_seconds: 10
```
Haloy only adds a new replica to rotation after it is healthy and has remained up for the configured stabilization window. This is useful for catching late database failures, startup race conditions, or short crash loops during rolling deploys.
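The stabilization check can be pictured as follows. This is an illustrative sketch, not haloyd's actual implementation:

```python
import time

def is_stable(healthy_since, min_ready_seconds, now=None):
    """A replica enters rotation only after staying healthy for the window.

    healthy_since: timestamp of the first passing health check, or None
    if the replica is currently unhealthy (a crash resets the clock).
    """
    if healthy_since is None:
        return False
    now = time.time() if now is None else now
    return now - healthy_since >= min_ready_seconds

# Healthy for only 4s with a 10s window -> not yet routable.
assert not is_stable(healthy_since=100.0, min_ready_seconds=10, now=104.0)
# Healthy for 12s -> added to rotation.
assert is_stable(healthy_since=100.0, min_ready_seconds=10, now=112.0)
```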
### Health Check Best Practices
- Check dependencies: Verify database, cache, etc.
- Fast response: Return within 1-2 seconds
- Accurate status: Only return 200 when truly ready
- Log failures: Help debug health issues
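A handler following these practices might look like the sketch below; the dependency checks are placeholders you would replace with real database and cache clients:

```python
def health_status(check_db, check_cache):
    """Return (http_status, body) for a /health endpoint.

    check_db / check_cache are callables returning True when the
    dependency is reachable; report 200 only when all of them pass.
    """
    checks = [("db", check_db), ("cache", check_cache)]
    failed = [name for name, check in checks if not check()]
    if failed:
        # Log the failing dependencies to help debug health issues.
        print(f"health check failed: {', '.join(failed)}")
        return 503, {"status": "unhealthy", "failed": failed}
    return 200, {"status": "ok"}

status, body = health_status(lambda: True, lambda: True)
```

Keep the individual checks fast (short timeouts) so the endpoint answers within the 1-2 second budget above.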
## Deploying with Replicas

### Rolling Deployment (Recommended)
Updates one replica at a time:
```yaml
name: "my-app"
replicas: 5
deployment_strategy: "rolling"

# Deployment process:
# 1. Start new container 1, wait for health check
# 2. Stop old container 1
# 3. Repeat for containers 2-5
#
# Always have 4-5 containers serving traffic
```
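The numbered steps above amount to a one-at-a-time replacement loop. A sketch with stubbed start/stop/health functions (illustrative only, not haloyd's code):

```python
def rolling_update(old, new, start, stop, wait_healthy):
    """Replace old containers one at a time, keeping the rest serving.

    start/stop/wait_healthy are callables supplied by the caller;
    wait_healthy should block until the new container passes checks.
    """
    serving = list(old)
    for old_c, new_c in zip(old, new):
        start(new_c)
        wait_healthy(new_c)    # proceed only once the replacement is ready
        serving.append(new_c)
        stop(old_c)
        serving.remove(old_c)  # at no point do fewer than len(old) serve
    return serving

events = []
result = rolling_update(
    ["old-1", "old-2"], ["new-1", "new-2"],
    start=lambda c: events.append(("start", c)),
    stop=lambda c: events.append(("stop", c)),
    wait_healthy=lambda c: None,
)
```

The key property is the ordering: each new container is started and verified healthy before its predecessor is stopped.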
### Replace Deployment
Replaces all containers at once:
```yaml
name: "my-app"
replicas: 5
deployment_strategy: "replace"

# Deployment process:
# 1. Stop all old containers
# 2. Start all new containers
# 3. Brief service interruption
```
## Monitoring Scaled Applications
Check status of all replicas:
```shell
# View all containers
haloy status

# View logs from all replicas
haloy logs --all-containers
```
## Geographic Scaling
Scale across multiple regions:
```yaml
name: "global-api"
image: "my-org/api:v2.0.0"

targets:
  us-east:
    server: us-east.haloy.com
    replicas: 5
    domains:
      - domain: "us.api.example.com"
    env:
      - name: "REGION"
        value: "us-east-1"

  eu-west:
    server: eu-west.haloy.com
    replicas: 5
    domains:
      - domain: "eu.api.example.com"
    env:
      - name: "REGION"
        value: "eu-west-1"

  asia-pacific:
    server: ap-southeast.haloy.com
    replicas: 5
    domains:
      - domain: "ap.api.example.com"
    env:
      - name: "REGION"
        value: "ap-southeast-1"
```
## Stateless Applications
For best scaling results, design stateless applications:
**Good - Stateless:**
- Session data in Redis/database
- Files in object storage (S3, etc.)
- No local caching (or distributed cache)
- Request data self-contained
**Bad - Stateful:**
- Session data in memory
- Files on local disk
- Local in-memory cache
- Sticky sessions required
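The difference shows up directly in handler code. In this sketch a plain dict stands in for an external store such as Redis; because state lives outside the process, any replica can serve any request:

```python
# External key-value store (a dict here as a stand-in for Redis).
# State lives outside the container, so replicas are interchangeable.
session_store = {}

def handle_request(session_id, store=session_store):
    """Count visits per session without any instance-local state."""
    count = store.get(session_id, 0) + 1
    store[session_id] = count
    return {"session": session_id, "visits": count}

# Two requests hitting different replicas still see consistent state,
# because both replicas read and write the same shared store.
r1 = handle_request("abc")
r2 = handle_request("abc")
```

Had the counter been a module-level variable inside each container, the two requests could land on different replicas and disagree, which is exactly what sticky sessions try to paper over.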
### Example Stateless App
```yaml
name: "stateless-api"
replicas: 10
env:
  # External session store
  - name: "REDIS_URL"
    value: "redis://redis-server:6379"
  # External file storage
  - name: "S3_BUCKET"
    value: "my-app-uploads"
  # External cache
  - name: "MEMCACHED_SERVERS"
    value: "memcached-1:11211,memcached-2:11211"

# No volumes needed - fully stateless
```
## Stopping Scaled Applications
Stop all replicas:
```shell
haloy stop

# Remove containers after stopping
haloy stop --remove-containers
```
## Best Practices
- Start with 1 replica: Test before scaling
- Scale based on metrics: Monitor CPU, memory, request latency
- Use odd numbers: 3, 5, 7 for better consensus/distribution
- Design for stateless: Enables unlimited horizontal scaling
- Implement health checks: Critical for load balancing
- Use `min_ready_seconds` for unstable startups: Prevents a replica from entering rotation too early
- Monitor all replicas: Ensure even load distribution
- Plan resources: Ensure server can handle all replicas
- Use rolling deployments: Maintain availability during updates
## Troubleshooting

### Uneven Load Distribution
If some containers get more traffic:
- Check all containers are healthy: `haloy status`
- Verify health check responses are fast
- Review application logs for slow requests
### Out of Resources
If deployment fails due to resources:
```yaml
# Reduce replica count
replicas: 3 # Down from 10

# Or upgrade server resources:
# - Add more RAM
# - Add more CPU cores
```
### Container Crash Loops
If containers keep restarting:
```shell
# Check logs from all replicas
haloy logs --all-containers

# Common issues:
# - Missing environment variables
# - Database connection failures
# - Port conflicts
# - Resource limits (OOM)
```