Horizontal Scaling
Scale your application horizontally by running multiple container instances behind a load balancer.
Basic Scaling
Set the number of replicas in your configuration:
name: "my-scalable-app"
image:
  repository: "my-org/my-app"
  tag: "1.2.0"
replicas: 3 # Run 3 instances
domains:
  - domain: "api.example.com"
acme_email: "you@email.com"
Haloy automatically:
- Starts the specified number of containers
- Configures HAProxy load balancing
- Distributes traffic across all healthy instances
- Monitors health checks
Load Balancing
HAProxy distributes traffic using round-robin by default, sending requests to each healthy container in turn.
Traffic Flow
Internet → HAProxy (80/443) → Container 1 (8080)
                            → Container 2 (8080)
                            → Container 3 (8080)
All containers receive approximately equal traffic.
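The rotation described above can be sketched in a few lines of Python. This is a simplified model for illustration, not HAProxy's implementation: the `round_robin` function, the container names, and the `is_healthy` callback are all hypothetical.

```python
from itertools import cycle


def round_robin(backends, is_healthy, num_requests):
    """Assign requests to backends in turn, skipping unhealthy ones."""
    assignments = []
    rotation = cycle(backends)
    for _ in range(num_requests):
        # Try each backend at most once per request.
        for _ in range(len(backends)):
            candidate = next(rotation)
            if is_healthy(candidate):
                assignments.append(candidate)
                break
        else:
            raise RuntimeError("no healthy backends available")
    return assignments


backends = ["container-1", "container-2", "container-3"]

# All healthy: requests rotate evenly across the three containers.
print(round_robin(backends, lambda b: True, 6))

# container-2 fails its health check: traffic alternates between the other two.
print(round_robin(backends, lambda b: b != "container-2", 4))
```

With every container healthy, each one receives the same number of requests. When one is excluded, its share is spread over the remaining containers.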
Scaling Strategies
Start Small, Scale Up
Begin with fewer replicas and increase as needed:
# Initial deployment
name: "my-app"
replicas: 1
# After load testing
replicas: 3
# Under heavy load
replicas: 10
Environment-Based Scaling
Scale differently per environment:
name: "my-app"
replicas: 1 # Default
targets:
  production:
    server: prod.haloy.com
    replicas: 10 # High capacity
    domains:
      - domain: "my-app.com"
  staging:
    server: staging.haloy.com
    replicas: 2 # Moderate capacity
    domains:
      - domain: "staging.my-app.com"
  development:
    server: dev.haloy.com
    replicas: 1 # Minimal resources
    domains:
      - domain: "dev.my-app.com"
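One way to read a configuration like this is that each target starts from the top-level values and overrides the keys it sets itself. The Python sketch below models that reading. The merge rule and the `resolve_target` helper are assumptions made for illustration, not a description of how Haloy resolves configuration internally.

```python
def resolve_target(config, target_name):
    """Return the effective settings for one target.

    Assumed rule (illustrative only): keys set on the target override
    top-level keys of the same name.
    """
    defaults = {k: v for k, v in config.items() if k != "targets"}
    overrides = config["targets"][target_name]
    return {**defaults, **overrides}


config = {
    "name": "my-app",
    "replicas": 1,  # Default
    "targets": {
        "production": {"server": "prod.haloy.com", "replicas": 10},
        "staging": {"server": "staging.haloy.com", "replicas": 2},
        # No replicas key here, so the top-level default applies.
        "development": {"server": "dev.haloy.com"},
    },
}

for target in config["targets"]:
    effective = resolve_target(config, target)
    print(target, effective["replicas"])
```

Under this assumed rule, the loop prints 10 for production, 2 for staging, and 1 for development.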
High Availability
For mission-critical applications, run at least 3 replicas:
name: "critical-app"
replicas: 3 # Minimum for HA
deployment_strategy: "rolling" # Zero-downtime updates
health_check_path: "/health"
image:
  repository: "my-org/critical-app"
  tag: "v2.0.0"
domains:
  - domain: "critical-app.com"
acme_email: "admin@critical-app.com"
Why 3 Replicas?
- Fault tolerance: Can lose 1 container and maintain service
- Rolling deployments: Can update without downtime
- Load distribution: Better traffic handling
- Redundancy: Protection against failures
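The fault-tolerance point can be made concrete with a little arithmetic: when one container is lost, the survivors share its traffic. The Python sketch below uses a hypothetical load of 300 requests per second, and `load_per_replica` is an illustrative helper, not part of any tool.

```python
def load_per_replica(total_rps, replicas, failed=0):
    """Requests per second each surviving replica must absorb."""
    healthy = replicas - failed
    if healthy <= 0:
        raise ValueError("no healthy replicas left to serve traffic")
    return total_rps / healthy


# Hypothetical steady load of 300 requests/second, with one container lost.
for replicas in (1, 2, 3):
    try:
        rps = load_per_replica(300, replicas, failed=1)
        print(f"replicas={replicas}: {rps:.0f} rps per surviving container")
    except ValueError as err:
        print(f"replicas={replicas}: outage ({err})")
```

With a single replica, the loss is an outage. With two, the survivor takes the full load. With three, each survivor takes half, which leaves more room for traffic spikes or an update in progress.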
Resource Considerations
Each replica consumes:
- CPU
- Memory
- Disk I/O
- Network bandwidth
Example Resource Planning
# If each container uses:
# - 512MB RAM
# - 0.5 CPU cores
# For 5 replicas you need:
# - 2.5GB RAM
# - 2.5 CPU cores (plus overhead)
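The same arithmetic can be written as a small Python helper, which makes it easy to try other replica counts. The `ReplicaFootprint` type, the `required_resources` function, and the `headroom` parameter are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ReplicaFootprint:
    memory_mb: int
    cpu_cores: float


def required_resources(per_replica, replicas, headroom=0.0):
    """Total memory (MB) and CPU cores for `replicas` copies, plus optional headroom."""
    factor = 1.0 + headroom
    return (
        per_replica.memory_mb * replicas * factor,
        per_replica.cpu_cores * replicas * factor,
    )


footprint = ReplicaFootprint(memory_mb=512, cpu_cores=0.5)

memory_mb, cpu = required_resources(footprint, replicas=5)
print(f"{memory_mb / 1024:.1f} GB RAM, {cpu:.1f} CPU cores")  # 2.5 GB RAM, 2.5 CPU cores
```

Passing `headroom=0.5` reserves 50% extra for the operating system and other overhead, which gives 3.75 GB and 3.75 cores for the same five replicas. The right headroom depends on your server and workload.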
Plan server resources accordingly:
name: "resource-intensive-app"
replicas: 5
# Ensure your server has sufficient:
# - At least 4GB RAM available
# - At least 4 CPU cores available
Health Checks and Scaling
Health checks are critical for scaling because HAProxy routes traffic only to replicas that pass them:
name: "my-app"
replicas: 5
health_check_path: "/health"
port: "8080"
# HAProxy only routes to healthy containers
# Unhealthy containers are automatically excluded
Health Check Best Practices
- Check dependencies: Verify database, cache, etc.
- Fast response: Return within 1-2 seconds
- Accurate status: Only return 200 when truly ready
- Log failures: Help debug health issues
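A minimal sketch of a `/health` endpoint that follows these practices, using only the Python standard library. The dependency checks are placeholders and the names (`check_database`, `check_cache`, `failing_dependencies`, `serve`) are hypothetical. Adapt the idea to whatever framework your application already uses.

```python
import logging
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("health")


def check_database():
    """Placeholder: replace with a cheap query such as SELECT 1."""
    return True


def check_cache():
    """Placeholder: replace with a ping to your cache."""
    return True


# Hypothetical dependency checks; each should be fast and free of side effects.
DEPENDENCY_CHECKS = {"database": check_database, "cache": check_cache}


def failing_dependencies(checks=None):
    """Return the names of dependencies whose check fails or raises."""
    checks = DEPENDENCY_CHECKS if checks is None else checks
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        failed = failing_dependencies()
        if failed:
            # Log failures so unhealthy replicas can be debugged later.
            log.warning("health check failed: %s", ", ".join(failed))
        # Only report 200 when every dependency is ready.
        status = 503 if failed else 200
        body = (",".join(failed) if failed else "ok").encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Start the server; call this from your application's entry point."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the endpoint on port 8080, matching the `port` and `health_check_path` values in the configuration above. Keep each check cheap so the response stays well inside the 1-2 second budget.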
Deploying with Replicas
Rolling Deployment (Recommended)
Updates one replica at a time:
name: "my-app"
replicas: 5
deployment_strategy: "rolling"
# Deployment process:
# 1. Start new container 1, wait for health check
# 2. Stop old container 1
# 3. Repeat for containers 2-5
#
# Always have 4-5 containers serving traffic
Replace Deployment
Replaces all containers at once:
name: "my-app"
replicas: 5
deployment_strategy: "replace"
# Deployment process:
# 1. Stop all old containers
# 2. Start all new containers
# 3. Brief service interruption
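The difference between the two strategies shows up when you count how many containers are serving traffic at each step. The Python sketch below is an idealised model of the steps listed in the comments. It ignores startup time and traffic switching, so real counts can differ slightly. Both function names are hypothetical.

```python
def rolling_update(replicas):
    """Yield the number of containers serving traffic after each step."""
    old, new = replicas, 0
    for _ in range(replicas):
        new += 1  # start a new container and wait for its health check
        yield old + new
        old -= 1  # stop the old container it replaces
        yield old + new


def replace_update(replicas):
    """Yield the serving count for a stop-everything-then-start rollout."""
    yield 0         # all old containers stopped
    yield replicas  # all new containers started


print("rolling minimum:", min(rolling_update(5)))
print("replace minimum:", min(replace_update(5)))
```

In this model the rolling strategy always keeps containers available, while the replace strategy passes through a moment with none, which corresponds to the brief service interruption.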
Monitoring Scaled Applications
Check the status of all replicas:
# View all containers
haloy status
# View logs from all replicas
haloy logs
Geographic Scaling
Scale across multiple regions:
name: "global-api"
image:
  repository: "my-org/api"
  tag: "v2.0.0"
targets:
  us-east:
    server: us-east.haloy.com
    replicas: 5
    domains:
      - domain: "us.api.example.com"
    env:
      - name: "REGION"
        value: "us-east-1"
  eu-west:
    server: eu-west.haloy.com
    replicas: 5
    domains:
      - domain: "eu.api.example.com"
    env:
      - name: "REGION"
        value: "eu-west-1"
  asia-pacific:
    server: ap-southeast.haloy.com
    replicas: 5
    domains:
      - domain: "ap.api.example.com"
    env:
      - name: "REGION"
        value: "ap-southeast-1"
Stateless Applications
For best scaling results, design stateless applications:
Good - Stateless:
- Session data in Redis/database
- Files in object storage (S3, etc.)
- No local caching (use a distributed cache instead)
- Request data self-contained
Bad - Stateful:
- Session data in memory
- Files on local disk
- Local in-memory cache
- Sticky sessions required
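A short Python sketch of why in-memory sessions break once requests rotate across replicas. The `Replica` class is a toy model and a plain dict stands in for an external store such as Redis. All names here are hypothetical.

```python
from itertools import cycle


class Replica:
    """One application instance; `sessions` is whatever store it was given."""

    def __init__(self, sessions):
        self.sessions = sessions

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        return self.sessions.get(session_id)


def login_then_read(replicas):
    """Log in on one replica, then read the session from the next one in rotation."""
    rotation = cycle(replicas)
    next(rotation).login("abc123", "alice")
    return next(rotation).whoami("abc123")


# Stateful: every replica keeps its own in-memory sessions.
stateful = [Replica({}) for _ in range(3)]
print(login_then_read(stateful))   # None: the second replica never saw the login

# Stateless: replicas share one external store (a dict stands in for Redis here).
shared_store = {}
stateless = [Replica(shared_store) for _ in range(3)]
print(login_then_read(stateless))  # alice
```

With per-replica state, the user appears logged out as soon as the next request lands on a different container. With a shared store, any replica can serve any request.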
Example Stateless App
name: "stateless-api"
replicas: 10
env:
  # External session store
  - name: "REDIS_URL"
    value: "redis://redis-server:6379"
  # External file storage
  - name: "S3_BUCKET"
    value: "my-app-uploads"
  # External cache
  - name: "MEMCACHED_SERVERS"
    value: "memcached-1:11211,memcached-2:11211"
# No volumes needed - fully stateless
Stopping Scaled Applications
Stop all replicas:
haloy stop
# Remove containers after stopping
haloy stop --remove-containers
Best Practices
- Start with 1 replica: Test before scaling
- Scale based on metrics: Monitor CPU, memory, request latency
- Use odd numbers: 3, 5, or 7 replicas for better consensus/distribution
- Design for statelessness: Lets you add replicas without coordinating local state
- Implement health checks: Critical for load balancing
- Monitor all replicas: Ensure even load distribution
- Plan resources: Ensure server can handle all replicas
- Use rolling deployments: Maintain availability during updates
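For the "scale based on metrics" practice, one common rule of thumb is to scale the replica count in proportion to observed versus target utilisation, the same proportional formula used by autoscalers such as the Kubernetes Horizontal Pod Autoscaler. The Python sketch below is illustrative only. The `suggested_replicas` function, the 60% target, and the cap of 10 are assumptions, and the result is a suggestion to apply to `replicas` by hand.

```python
import math


def suggested_replicas(current_replicas, observed_cpu, target_cpu, max_replicas=10):
    """Proportional rule: scale replica count by observed/target utilisation."""
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")
    desired = math.ceil(current_replicas * observed_cpu / target_cpu)
    # Never suggest fewer than 1 replica or more than the server can hold.
    return max(1, min(desired, max_replicas))


# 3 replicas averaging 90% CPU, aiming for 60%.
print(suggested_replicas(3, observed_cpu=90, target_cpu=60))  # 5
```

Set `max_replicas` from your resource planning so the suggestion never exceeds what the server can run.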
Troubleshooting
Uneven Load Distribution
If some containers receive more traffic than others:
- Check all containers are healthy:
  haloy status
- Verify health check responses are fast
- Review application logs for slow requests
Out of Resources
If a deployment fails because the server runs out of resources:
# Reduce replica count
replicas: 3 # Down from 10
# Or upgrade server resources
# - Add more RAM
# - Add more CPU cores
Container Crash Loops
If containers keep restarting:
# Check logs
haloy logs
# Common issues:
# - Missing environment variables
# - Database connection failures
# - Port conflicts
# - Resource limits (OOM)