Load Balancer Configuration Tutorial
This tutorial explains load balancing clearly and practically. It covers types of load balancers, architecture, deployment options, routing algorithms, security, scaling, monitoring, automation, troubleshooting, and hands-on examples. Read sections in order or jump to what you need.
Introduction to load balancing
Load balancing spreads traffic across multiple servers so services stay fast and available. It prevents any single server from getting overloaded.
A load balancer sits between clients and servers. Clients talk to the balancer, and the balancer forwards requests to healthy backend servers. This improves performance, reliability, and maintenance flexibility.
Common uses: web apps, APIs, microservices, and any service that needs scaling or redundancy.
Load balancer types and selection criteria
There are several types of load balancers. Choose based on your needs.
- Hardware load balancers: Physical boxes from vendors. Good for large enterprises needing dedicated appliances and specialized features.
- Software load balancers: Run on standard servers. Examples include HAProxy, Nginx, and Envoy. They are flexible and cost-effective.
- Cloud load balancers: Managed by cloud providers (AWS ELB/ALB, GCP Load Balancing, Azure Load Balancer). They reduce operational work and integrate with cloud services.
- Application-layer (Layer 7) balancers: Route based on HTTP headers, paths, cookies. Useful for web apps and APIs.
- Transport-layer (Layer 4) balancers: Route based on IP/port. Lower latency and good for TCP/UDP traffic.
Choose based on:
- Traffic type (HTTP vs TCP/UDP)
- Required features (SSL termination, path routing, WebSockets)
- Scale and latency needs
- Operational cost and team expertise
- Integration with cloud or container platforms
Architecture and core components
A basic load balancing architecture has three parts: clients, load balancer, and backends (servers).
Key components:
- Frontend listener: Receives incoming connections on specific ports and protocols.
- Routing logic: Decides which backend will handle each request.
- Health check system: Regularly checks backend status and removes unhealthy servers.
- Session persistence: Keeps a client tied to the same backend when needed.
- SSL/TLS handling: Manages encryption and certificates.
- Logging and metrics: Records events and performance data.
You can deploy one load balancer or a pool of them. For high availability, put at least two in front of your backends.
Deployment scenarios and topology options
Different topologies fit different needs:
- Single data center, single load balancer: Simple, low cost, but single point of failure.
- Active-passive pairs: One main balancer and one standby. Failover uses floating IPs or VRRP.
- Active-active clusters: Multiple balancers share traffic. Use a virtual IP, DNS, or cloud load balancer in front.
- Multi-region: Place balancers in each region and use DNS-based routing or global load balancers for failover.
- Edge + internal: Edge balancers handle public traffic and SSL. Internal balancers route between microservices.
Consider topology when designing for fault zones, latency, and maintenance windows.
Traffic distribution and routing algorithms
Balancers use algorithms to decide where to send traffic.
Simple algorithms:
- Round robin: Cycle through backends evenly.
- Least connections: Send to the server with the fewest active connections.
- Random: Pick a server at random.
- Source IP hash: Map each client IP to the same backend; useful for simple persistence.
- Weighted variants: Assign weights to servers to reflect capacity.
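The simple algorithms above can be sketched in a few lines. This is an illustration only; the `Backend` class and the server names are hypothetical, not a real balancer API:

```python
import itertools
import random

class Backend:
    """Hypothetical backend record for illustration."""
    def __init__(self, name, weight=1):
        self.name = name
        self.weight = weight
        self.active_connections = 0

backends = [Backend("web1", weight=3), Backend("web2", weight=1)]

# Round robin: cycle through backends evenly.
_rr = itertools.cycle(backends)
def round_robin():
    return next(_rr)

# Least connections: pick the backend with the fewest active connections.
def least_connections():
    return min(backends, key=lambda b: b.active_connections)

# Weighted random: pick proportionally to each server's capacity weight.
def weighted_random():
    return random.choices(backends, weights=[b.weight for b in backends])[0]

# Source IP hash: the same client IP always maps to the same backend.
def ip_hash(client_ip):
    return backends[hash(client_ip) % len(backends)]
```

Note how weighted variants are just the base algorithm with a capacity multiplier; the same idea applies to weighted round robin.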
Advanced routing:
- URL path and host-based routing (Layer 7)
- Header-based rules (route API version requests differently)
- Canary or blue/green routing (send a small portion to new versions)
- Traffic shaping and rate limiting
Pick algorithms based on request size, session needs, and backend performance variability.
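Canary routing is usually implemented by bucketing clients with a stable hash, so each client consistently sees one version. A minimal sketch, assuming the client identifier is something stable like a user ID or session cookie:

```python
import hashlib

def is_canary(client_id: str, canary_percent: int = 5) -> bool:
    """Stable canary bucketing: hash the client id into 100 buckets.
    The same client always lands in the same bucket, so it stays on
    one version across requests. client_id is an assumption here --
    in practice it could be a user id or session cookie value."""
    digest = hashlib.sha256(client_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < canary_percent

def pick_pool(client_id: str) -> str:
    """Send roughly canary_percent of clients to the canary pool."""
    return "canary" if is_canary(client_id) else "stable"
```

Because the bucketing is deterministic, rolling back is just setting `canary_percent` to 0; no client state needs to be cleaned up.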
Health checks, failover, and session persistence
Health checks detect unhealthy backends before sending traffic.
Types of health checks:
- TCP connect: Quick check if port accepts connections.
- HTTP/HTTPS: Request a specific path and expect a status code.
- Scripted checks: Run custom commands or call health endpoints.
Health check tips:
- Check a lightweight path that confirms app readiness.
- Use different checks for readiness and liveness.
- Tune intervals and timeouts to avoid flapping.
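The anti-flapping tuning above is typically done with consecutive-result thresholds, like HAProxy's `rise` and `fall` counters. A minimal sketch of that state machine:

```python
class HealthTracker:
    """Tracks consecutive check results so a single blip does not flip
    a backend's state (mirrors HAProxy-style `rise`/`fall` counters)."""
    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.healthy = True
        self._streak = 0  # consecutive results contradicting current state

    def record(self, success: bool) -> bool:
        if success == self.healthy:
            self._streak = 0  # current state confirmed; reset the counter
        else:
            self._streak += 1
            threshold = self.rise if not self.healthy else self.fall
            if self._streak >= threshold:
                self.healthy = not self.healthy  # state flips only after
                self._streak = 0                 # enough consistent results
        return self.healthy
```

With `fall=3, rise=2`, one failed check is ignored, three in a row mark the backend down, and two consecutive successes bring it back.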
Failover:
- Remove unhealthy backends automatically.
- Use multi-zone or multi-region failover for resilience.
- Configure graceful drain so sessions finish before shutting down a server.
Session persistence:
- Cookie-based persistence: Balancer sets a cookie so clients return to the same backend.
- IP-based persistence: Use client IP hash to stick to a backend.
- Application-managed: The app stores session data in a shared store (Redis) and the balancer can be stateless.
Prefer application-managed sessions for scale and flexibility. Use balancer-level persistence when rearchitecting the application isn't possible.
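Cookie-based persistence can be sketched in a few lines. This is an illustration of the mechanism, not a real balancer; the `SERVERID` cookie name and backend addresses are placeholders:

```python
# Cookie-based persistence sketch: on the first request the balancer
# picks a backend and sets a cookie; later requests carrying that
# cookie return to the same backend. Names here are illustrative.
backends = {"web1": "10.0.0.10:80", "web2": "10.0.0.11:80"}

def route(request_cookies: dict) -> tuple[str, dict]:
    """Return (chosen backend name, cookies to set on the response)."""
    sticky = request_cookies.get("SERVERID")
    if sticky in backends:
        return sticky, {}                   # honor existing affinity
    chosen = min(backends)                  # placeholder selection logic
    return chosen, {"SERVERID": chosen}     # set the affinity cookie
```

Note the stale-cookie case: if the cookie names a backend that no longer exists, the client is simply re-pinned to a live one.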
SSL/TLS termination and security considerations
SSL/TLS termination can happen at the load balancer or be passed through to backends.
Options:
- Termination at balancer: Balancer decrypts traffic, offloading CPU from backends and enabling Layer 7 features.
- Re-encryption to backends: Balancer decrypts then re-encrypts to backends for end-to-end encryption.
- TLS passthrough: Balancer forwards encrypted traffic without decrypting; useful when backends must see the original TLS.
Security best practices:
- Use strong TLS versions (TLS 1.2+; prefer TLS 1.3).
- Choose safe cipher suites and disable weak ones.
- Automate certificate renewal (Let’s Encrypt, ACME).
- Protect private keys and enable hardware security modules (HSM) if needed.
- Apply HTTP security headers (HSTS, CSP) at the balancer or app.
- Rate limit and block obvious attacks at the edge.
Always monitor for certificate expiry and vulnerability disclosures in TLS libraries.
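Certificate-expiry monitoring is easy to script. A minimal sketch using only the standard library; the live check assumes network access to the host and a valid certificate chain:

```python
import datetime
import socket
import ssl

def days_until_expiry(not_after: str) -> int:
    """Parse the `notAfter` field from `ssl.getpeercert()` (a string
    like 'Jun  1 12:00:00 2030 GMT') and return days remaining."""
    expiry = datetime.datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z")
    return (expiry - datetime.datetime.utcnow()).days

def check_cert(host: str, port: int = 443) -> int:
    """Fetch the live certificate and report days to expiry (network call)."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return days_until_expiry(tls.getpeercert()["notAfter"])
```

Wire `check_cert` into your alerting so a warning fires well before renewal deadlines, e.g. at 30 and 7 days remaining.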
Scaling, high availability, and performance tuning
Scaling strategies:
- Scale backends horizontally: Add more servers.
- Scale balancers horizontally: Add more load balancer instances.
- Use autoscaling in cloud environments for both balancers and backends.
High availability techniques:
- Active-active across zones or regions.
- Use health checks and fast failover.
- Replicate configurations and keep state minimal on balancers.
Performance tuning tips:
- Tune TCP settings (backlog, timeouts) for high connection rates.
- Use keepalives and connection pooling to reduce backend load.
- Enable gzip or brotli compression at the right place.
- Cache static content at the edge or on CDNs.
- Monitor CPU and network bandwidth; scale before saturation.
Run load tests representative of real traffic and measure latency, error rates, and resource usage.
Monitoring, logging, and metrics collection
Good monitoring tells you when things go wrong and why.
What to monitor:
- Request rate (RPS), error rate, and latency percentiles (p50, p95, p99).
- Backend health and response codes.
- Connection counts, queue lengths, and resource usage (CPU, memory).
- TLS certificate expiry and handshake failures.
- Logs for access, errors, and security alerts.
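Percentile latencies are worth understanding rather than treating as a dashboard black box. A minimal nearest-rank sketch with made-up sample data:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n)
    in the sorted samples. Fine for reporting; monitoring systems
    usually use streaming approximations instead."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Illustrative latency samples in milliseconds (two slow outliers).
latencies_ms = [12, 15, 14, 200, 16, 13, 15, 14, 980, 15]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
# A healthy p50 alongside a bad p95/p99 points at a tail problem
# (one slow backend, GC pauses, queueing), not a broad slowdown.
```

This is exactly why the checklist below says to watch p95/p99 rather than averages: in this sample the mean is dragged up by two outliers while p50 stays at 15 ms.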
Tools and approaches:
- Prometheus + Grafana for metrics and dashboards.
- Centralized logging (ELK/EFK, Loki) for aggregated access and error logs.
- Tracing (OpenTelemetry, Jaeger) for request flows and latency hotspots.
- Alerts with sensible thresholds and escalation paths.
Log structure:
- Include timestamp, client IP, request path, response code, backend used, and request duration.
- Avoid logging sensitive data like full authorization headers.
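The log-structure advice above maps naturally to one JSON object per line. A minimal sketch; the field names are illustrative, not a standard:

```python
import json
import time

def access_log_entry(client_ip, path, status, backend, duration_ms):
    """One structured access-log line with the fields suggested above.
    Deliberately excludes Authorization headers and other sensitive data."""
    return json.dumps({
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "client_ip": client_ip,
        "path": path,
        "status": status,
        "backend": backend,
        "duration_ms": duration_ms,
    })

line = access_log_entry("203.0.113.7", "/api/v1/users", 200, "web1", 42)
```

One-object-per-line logs are trivially parseable by the aggregation tools listed above (ELK/EFK, Loki) without custom grok patterns.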
Automation, orchestration, and configuration management
Automation reduces errors and speeds deployment.
Infrastructure as code:
- Use Terraform, CloudFormation, or ARM templates for cloud resources.
- Store load balancer configs in Git and apply via CI/CD.
Configuration management:
- Use Ansible, Chef, or Puppet for software balancers.
- Validate configs with test suites or linting before applying.
Service discovery and orchestration:
- In dynamic environments, integrate with service registries (Consul, etcd) or orchestration platforms (Kubernetes).
- For Kubernetes, use Services and Ingress controllers, or Service Mesh (Istio, Linkerd) for advanced routing.
CI/CD:
- Automate certificate renewal, config updates, and route changes.
- Use blue/green or canary deployments for config or code changes.
Keep rollback procedures simple and tested.
Troubleshooting common issues and recovery procedures
Common issues and quick fixes:
- High latency: Check backend CPU, database slow queries, or network saturation. Look at p95/p99 latencies.
- Backend flapping (frequent failover): Increase health check timeouts, investigate resource exhaustion or garbage collection.
- Uneven load distribution: Confirm weights, algorithm choice, and session affinity settings.
- SSL/TLS errors: Check certificate validity, supported protocols, and cipher mismatches.
- 502/504 errors: Inspect backend logs and timeouts between balancer and backends.
- Connection limits: Tune ephemeral port ranges and file descriptor limits.
Recovery procedures:
- Remove faulty backend from pool and replace or restart.
- Revert recent configuration changes via version control.
- Scale up temporarily to absorb traffic bursts.
- Switch traffic to a standby region or zone if necessary.
Document runbooks for common failure scenarios and rehearse them.
Hands-on configuration examples and best practices
Below are practical examples for common setups. Replace IPs, names, and ports with your values.
HAProxy: simple HTTP load balancer
haproxy.cfg example:
global
    log stdout format raw local0
    maxconn 20000
    tune.ssl.default-dh-param 2048

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    option httplog
    option http-server-close

frontend http-in
    bind *:80
    mode http
    default_backend web-backend

backend web-backend
    mode http
    balance roundrobin
    option httpchk GET /healthz
    server web1 10.0.0.10:80 check inter 5s fall 3 rise 2
    server web2 10.0.0.11:80 check inter 5s fall 3 rise 2
Best practices:
- Use httpchk for health endpoints that verify app readiness.
- Enable logging to stdout when running in containers.
- Tune timeouts based on app behavior.
Nginx: SSL termination and proxying
nginx.conf snippet:
http {
    upstream backend {
        server 10.0.0.10:8080;
        server 10.0.0.11:8080;
    }

    server {
        listen 443 ssl;
        server_name example.com;

        ssl_certificate     /etc/ssl/certs/example.crt;
        ssl_certificate_key /etc/ssl/private/example.key;
        ssl_protocols       TLSv1.2 TLSv1.3;
        ssl_ciphers         HIGH:!aNULL:!MD5;

        location / {
            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }

        location /healthz {
            return 200 'OK';
        }
    }
}
Best practices:
- Offload TLS at the edge if backends are trusted.
- Use proxy_set_header to forward client info to backends.
- Automate cert renewal with Certbot or an ACME client.
Kubernetes: Service + Ingress (basic)
nginx ingress with service:
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
Best practices:
- Use readiness and liveness probes on pods.
- Use horizontal pod autoscaler for backend scaling.
- Prefer Ingress controllers with native TLS support and integrations.
AWS Elastic Load Balancer: basic CLI create
Create an Application Load Balancer (ALB) with AWS CLI (simplified):
Create a target group:
aws elbv2 create-target-group --name web-tg --protocol HTTP --port 80 --vpc-id vpc-xxxx

Register targets:
aws elbv2 register-targets --target-group-arn arn:aws:… --targets Id=i-0123456789abcdef0

Create the ALB:
aws elbv2 create-load-balancer --name my-alb --subnets subnet-aaa subnet-bbb --security-groups sg-xxxx

Create a listener for HTTP:
aws elbv2 create-listener --load-balancer-arn arn:aws:… --protocol HTTP --port 80 --default-actions Type=forward,TargetGroupArn=arn:aws:…
Best practices:
- Use ALB for Layer 7 features, NLB for performance or static IP needs.
- Put ALBs in multiple subnets for HA.
- Use CloudWatch for metrics and alarms.
Final best practices checklist
- Use health checks that reflect real readiness.
- Keep balancers stateless where possible.
- Automate configuration and certificate management.
- Monitor p95/p99 latency, not just averages.
- Use canary deployments for config changes.
- Encrypt traffic in transit and protect keys.
- Test failover paths and recovery runbooks regularly.
- Keep session state in shared stores, not local memory.
This tutorial gives a practical foundation for configuring and operating load balancers. Use the examples as templates and adapt them to your environment. Start small, measure, and iterate.
About Jack Williams
Jack Williams is a WordPress and server management specialist at Moss.sh, where he helps developers automate their WordPress deployments and streamline server administration for crypto platforms and traditional web projects. With a focus on practical DevOps solutions, he writes guides on zero-downtime deployments, security automation, WordPress performance optimization, and cryptocurrency platform reviews for freelancers, agencies, and startups in the blockchain and fintech space.