HAProxy Traffic Capture

This guide shows you how to use Qtap to capture HTTP traffic flowing through HAProxy, a widely used high-performance load balancer. You'll learn how to observe both incoming client requests and outgoing backend connections without adding another proxy to the request path or changing application code.

What You'll Learn

  • Capture HAProxy ingress traffic (client requests)

  • Capture HAProxy egress traffic (backend server requests)

  • Monitor load balancing across multiple backends

  • Observe health checks and failover behavior

  • Apply conditional capture rules for specific backends

  • Set up HAProxy + Qtap in Docker for testing

  • Deploy production-ready configurations

Use Cases

Why capture HAProxy traffic?

  • Load Balancer Analytics: Understand traffic distribution across backend servers

  • Health Check Monitoring: Observe health check behavior and failover events

  • Performance Analysis: Measure latency and identify slow backends

  • Debugging Load Balancing: Verify sticky sessions and routing algorithms

  • API Gateway Monitoring: Track all API calls through your edge load balancer

  • Compliance & Audit: Record all traffic for regulatory requirements

  • Troubleshooting: Debug issues between client and backend servers


Prerequisites

  • Linux system with kernel 5.10+ and eBPF support

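  • Docker with the Compose plugin (for this guide's examples; the Compose file uses dockerfile_inline, which needs a recent Compose v2 release)

  • curl and jq (used by the test commands)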

  • Root/sudo access

  • Basic understanding of HAProxy configuration


Part 1: HAProxy Load Balancer Setup

HAProxy uses its own configuration file format. Let's set up a load balancer with multiple backend servers.

Step 1: Create Project Directory

mkdir haproxy-qtap-demo
cd haproxy-qtap-demo

Step 2: Create HAProxy Configuration

Create haproxy.cfg:

global
    log stdout local0
    maxconn 4096

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

# Frontend: Listen for incoming HTTP requests
frontend http_front
    bind *:80

    # ACLs for path-based routing
    acl is_api path_beg /api
    acl is_static path_beg /static
    acl is_health path /health

    # Route based on path
    use_backend api_servers if is_api
    use_backend static_servers if is_static
    use_backend health_check if is_health

    default_backend web_servers

# Backend: API servers (load balanced)
backend api_servers
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200

    # Backend servers
    server api1 backend-api-1:8001 check inter 2000ms
    server api2 backend-api-2:8002 check inter 2000ms

# Backend: Web servers (load balanced)
backend web_servers
    balance leastconn

    server web1 backend-web-1:8003 check
    server web2 backend-web-2:8004 check

# Backend: Static file server
backend static_servers
    server static1 backend-static:8005 check

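# Backend: Health check endpoint (answered by HAProxy itself; nothing listens on localhost:8080 in the container)
backend health_check
    http-request return status 200 content-type "text/plain" string "OK"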

# Stats page (optional)
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
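
The frontend's routing logic can be modelled in a few lines of Python. This is only a sketch for reasoning about which backend a path reaches, not how HAProxy is implemented; it assumes the ACLs and use_backend rules from haproxy.cfg above, where rules are evaluated in order and the first match wins. The names ROUTES and pick_backend are made up for this example.

```python
# Simplified model of the frontend's routing rules in haproxy.cfg.
# use_backend rules are evaluated in order; the first match wins.
ROUTES = [
    ("api_servers", lambda path: path.startswith("/api")),        # acl is_api path_beg /api
    ("static_servers", lambda path: path.startswith("/static")),  # acl is_static path_beg /static
    ("health_check", lambda path: path == "/health"),             # acl is_health path /health
]
DEFAULT_BACKEND = "web_servers"  # default_backend web_servers


def pick_backend(path: str) -> str:
    """Return the backend name the frontend would select for a request path."""
    for backend, matches in ROUTES:
        if matches(path):
            return backend
    return DEFAULT_BACKEND


for p in ["/", "/api/users", "/static/image.png", "/health", "/apiary"]:
    print(f"{p:20} -> {pick_backend(p)}")
```

Note that path_beg is a plain prefix match, so a path such as /apiary also lands on api_servers. Use path_beg /api/ if that is not what you want.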

Step 3: Create Backend Service

Create backend-service.py:

#!/usr/bin/env python3
from http.server import HTTPServer, BaseHTTPRequestHandler
import json
import sys
import os

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        service_name = os.getenv('SERVICE_NAME', 'unknown')
        port = os.getenv('PORT', '8000')

        # Health check endpoint
        if self.path == '/health':
            self.send_response(200)
            self.send_header('Content-Type', 'text/plain')
            self.end_headers()
            self.wfile.write(b'OK')
            return

        response = {
            "service": service_name,
            "port": port,
            "path": self.path,
            "message": f"Hello from {service_name} on port {port}!"
        }

        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps(response).encode())

    def do_POST(self):
        content_length = int(self.headers.get('Content-Length', 0))
        body = self.rfile.read(content_length).decode() if content_length > 0 else ""

        service_name = os.getenv('SERVICE_NAME', 'unknown')
        port = os.getenv('PORT', '8000')

        response = {
            "service": service_name,
            "port": port,
            "method": "POST",
            "received": body,
            "message": f"POST received by {service_name}"
        }

        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps(response).encode())

    def log_message(self, format, *args):
        # Suppress default logging
        pass

if __name__ == '__main__':
    port = int(sys.argv[1]) if len(sys.argv) > 1 else 8000
    server = HTTPServer(('0.0.0.0', port), Handler)
    print(f"Service {os.getenv('SERVICE_NAME', 'unknown')} listening on port {port}")
    server.serve_forever()

Step 4: Create Qtap Configuration

Create qtap.yaml:

version: 2

# Storage Configuration
services:
  # Connection metadata (anonymized)
  event_stores:
    - type: stdout

  # HTTP request/response data (sensitive)
  object_stores:
    - type: stdout

# Processing Stack
stacks:
  haproxy_capture:
    plugins:
      - type: http_capture
        config:
          level: full      # (none|summary|details|full) - Capture everything
          format: text     # (json|text) - Human-readable format

# Traffic Capture Settings
tap:
  direction: all           # (egress|ingress|all) - Capture BOTH directions
  ignore_loopback: false   # (true|false) - Capture localhost (haproxy uses loopback)
  audit_include_dns: false # (true|false) - Skip DNS for cleaner output

  http:
    stack: haproxy_capture # Use our haproxy processing stack

  # Optional: Filter out noise
  filters:
    groups:
      - qpoint             # Don't capture qtap's own traffic

Step 5: Create Docker Compose Setup

Create docker-compose.yaml:

version: '3.8'

services:
  # HAProxy load balancer
  haproxy:
    image: haproxy:2.9-alpine
    container_name: haproxy-demo
    ports:
      - "8085:80"      # HTTP
      - "8086:8404"    # Stats page
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    networks:
      - demo-network

  # Backend API Server 1
  backend-api-1:
    build:
      context: .
      dockerfile_inline: |
        FROM python:3.11-slim
        WORKDIR /app
        COPY backend-service.py /app/
        RUN chmod +x /app/backend-service.py
        CMD ["python3", "/app/backend-service.py", "8001"]
    container_name: backend-api-1
    environment:
      - SERVICE_NAME=backend-api-1
      - PORT=8001
    networks:
      - demo-network

  # Backend API Server 2
  backend-api-2:
    build:
      context: .
      dockerfile_inline: |
        FROM python:3.11-slim
        WORKDIR /app
        COPY backend-service.py /app/
        RUN chmod +x /app/backend-service.py
        CMD ["python3", "/app/backend-service.py", "8002"]
    container_name: backend-api-2
    environment:
      - SERVICE_NAME=backend-api-2
      - PORT=8002
    networks:
      - demo-network

  # Backend Web Server 1
  backend-web-1:
    build:
      context: .
      dockerfile_inline: |
        FROM python:3.11-slim
        WORKDIR /app
        COPY backend-service.py /app/
        RUN chmod +x /app/backend-service.py
        CMD ["python3", "/app/backend-service.py", "8003"]
    container_name: backend-web-1
    environment:
      - SERVICE_NAME=backend-web-1
      - PORT=8003
    networks:
      - demo-network

  # Backend Web Server 2
  backend-web-2:
    build:
      context: .
      dockerfile_inline: |
        FROM python:3.11-slim
        WORKDIR /app
        COPY backend-service.py /app/
        RUN chmod +x /app/backend-service.py
        CMD ["python3", "/app/backend-service.py", "8004"]
    container_name: backend-web-2
    environment:
      - SERVICE_NAME=backend-web-2
      - PORT=8004
    networks:
      - demo-network

  # Backend Static Server
  backend-static:
    build:
      context: .
      dockerfile_inline: |
        FROM python:3.11-slim
        WORKDIR /app
        COPY backend-service.py /app/
        RUN chmod +x /app/backend-service.py
        CMD ["python3", "/app/backend-service.py", "8005"]
    container_name: backend-static
    environment:
      - SERVICE_NAME=backend-static
      - PORT=8005
    networks:
      - demo-network

  # Qtap agent
  qtap:
    image: us-docker.pkg.dev/qpoint-edge/public/qtap:v0
    container_name: qtap-haproxy
    privileged: true
    user: "0:0"
    cap_add:
      - CAP_BPF
      - CAP_SYS_ADMIN
    pid: host
    network_mode: host
    volumes:
      - /sys:/sys
      - /var/run/docker.sock:/var/run/docker.sock
      - ./qtap.yaml:/app/config/qtap.yaml
    environment:
      - TINI_SUBREAPER=1
    ulimits:
      memlock: -1
    command:
      - --log-level=warn
      - --log-encoding=console
      - --config=/app/config/qtap.yaml

networks:
  demo-network:
    driver: bridge

Key HAProxy Concepts:

  • Frontend: Listens for incoming connections

  • Backend: Defines pool of servers to route to

  • ACL (Access Control List): Rules for routing decisions

  • Balance Algorithm: roundrobin, leastconn, source, etc.

  • Health Checks: Automatic checking of backend server health


Part 2: Running and Testing

Step 1: Start the Services

# Start all services
docker compose up -d

# Wait for Qtap to initialize (CRITICAL!)
sleep 6

# Check HAProxy stats (optional)
# Open http://localhost:8086/stats in browser

Step 2: Generate Test Traffic

# Test 1: Route to web backend (leastconn)
curl http://localhost:8085/

# Test 2: Multiple requests to see load balancing
for i in {1..6}; do
  curl -s http://localhost:8085/ | jq -r '.service'
done

# Test 3: Route to API backend
curl http://localhost:8085/api/users

# Test 4: Multiple API requests to see distribution
for i in {1..6}; do
  curl -s http://localhost:8085/api/data | jq -r '.service'
done

# Test 5: POST to API backend
curl -X POST http://localhost:8085/api/create \
  -H "Content-Type: application/json" \
  -d '{"name": "Alice", "role": "admin"}'

# Test 6: Static content route
curl http://localhost:8085/static/image.png

# Test 7: Health check
curl http://localhost:8085/health

Step 3: View Captured Traffic

# View Qtap logs
docker logs qtap-haproxy

# Filter for haproxy process
docker logs qtap-haproxy 2>&1 | grep -A 30 "haproxy"

# Count transactions
docker logs qtap-haproxy 2>&1 | grep -c "HTTP Transaction"

What you should see:

=== HTTP Transaction ===
Source Process: haproxy (PID: 987, Container: haproxy-demo)
Direction: INGRESS ← (client to haproxy)
Method: POST
URL: http://localhost:8085/api/create
Status: 200 OK
Duration: 12ms

--- Request Headers ---
Host: localhost:8085
User-Agent: curl/7.81.0
Content-Type: application/json

--- Request Body ---
{"name": "Alice", "role": "admin"}

--- Response Headers ---
Content-Type: application/json

--- Response Body ---
{"service":"backend-api-1","port":"8001","method":"POST","received":"{\"name\": \"Alice\", \"role\": \"admin\"}","message":"POST received by backend-api-1"}
========================

=== HTTP Transaction ===
Source Process: haproxy (PID: 987, Container: haproxy-demo)
Direction: EGRESS → (haproxy to backend)
Method: POST
URL: http://backend-api-1:8001/api/create
Status: 200 OK
Duration: 8ms

--- Request Body ---
{"name": "Alice", "role": "admin"}
========================

Key indicators:

  • "exe" contains haproxy - Process identified

  • Direction: INGRESS - Client → HAProxy

  • Direction: EGRESS - HAProxy → Backend server

  • Two transactions per request (ingress + egress)

  • ✅ Load distribution visible (different backend servers)

  • ✅ Backend server name in egress URL
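
To turn the text output into a per-backend tally, a small parser is enough. The sketch below assumes the text format looks like the sample shown above ("=== HTTP Transaction ===", "Direction:", "URL:" lines); the real output may differ between Qtap versions, so adjust the prefixes to what you actually see. The function name count_egress_backends is hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_egress_backends(log_text: str) -> Counter:
    """Count EGRESS transactions per backend host in text-format output."""
    counts = Counter()
    direction = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line == "=== HTTP Transaction ===":
            direction = None  # a new transaction block begins
        elif line.startswith("Direction:"):
            parts = line.split(":", 1)[1].split()
            direction = parts[0] if parts else None  # "INGRESS" or "EGRESS"
        elif line.startswith("URL:") and direction == "EGRESS":
            host = urlsplit(line.split(":", 1)[1].strip()).hostname
            if host:
                counts[host] += 1
    return counts


# Two transactions in the assumed format: one ingress, one egress.
SAMPLE = """\
=== HTTP Transaction ===
Direction: INGRESS ← (client to haproxy)
URL: http://localhost:8085/api/create
========================
=== HTTP Transaction ===
Direction: EGRESS → (haproxy to backend)
URL: http://backend-api-1:8001/api/create
========================
"""

print(count_egress_backends(SAMPLE))
```

In practice you would feed it the output of docker logs qtap-haproxy 2>&1 instead of SAMPLE. Only the egress side is counted because that is where the backend hostname appears in the URL.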


Part 3: Advanced Configurations

Configuration 1: Monitor Load Balancing Distribution

Capture only egress traffic to see which backend serves each request:

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

stacks:
  load_balance_monitoring:
    plugins:
      - type: http_capture
        config:
          level: summary     # Just metadata to see distribution
          format: json

tap:
  direction: egress          # Only capture haproxy→backend
  ignore_loopback: false
  http:
    stack: load_balance_monitoring

Analyze logs to see traffic distribution across backends.

Configuration 2: Capture Health Check Failures

Monitor health check behavior and backend failures:

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

rulekit:
  macros:
    - name: is_health_check
      expr: http.req.path == "/health"
    - name: is_error
      expr: http.res.status >= 400

stacks:
  health_monitoring:
    plugins:
      - type: http_capture
        config:
          level: none
          format: json
          rules:
            # Skip successful health checks (too noisy)
            - name: "Skip healthy"
              expr: is_health_check() && http.res.status == 200
              level: none

            # Capture failed health checks
            - name: "Health check failures"
              expr: is_health_check() && is_error()
              level: full

            # Capture all backend errors
            - name: "Backend errors"
              expr: is_error() && !is_health_check()
              level: full

tap:
  direction: egress          # Focus on haproxy→backend
  ignore_loopback: false
  http:
    stack: health_monitoring
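
Before deploying rules like these, it can help to check which capture level a given request would receive. The sketch below mirrors the three rule expressions from this configuration in plain Python. It is a model of the expressions, not Qtap's rule evaluator, and it assumes that a request matching no rule falls back to the plugin's level: none. The function name capture_level is made up for this example.

```python
def capture_level(path: str, status: int) -> str:
    """Model the three rules from Configuration 2; the stack default is 'none'."""
    is_health_check = path == "/health"   # macro is_health_check
    is_error = status >= 400              # macro is_error

    if is_health_check and status == 200:
        return "none"   # rule "Skip healthy"
    if is_health_check and is_error:
        return "full"   # rule "Health check failures"
    if is_error and not is_health_check:
        return "full"   # rule "Backend errors"
    return "none"       # no rule matched: assumed fallback to the default level


for path, status in [("/health", 200), ("/health", 503), ("/api/users", 500), ("/api/users", 200)]:
    print(path, status, "->", capture_level(path, status))
```

The three rules are mutually exclusive, so the result does not depend on the order in which they are evaluated.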

Configuration 3: Backend-Specific Capture

Capture different levels for different backend pools:

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

rulekit:
  macros:
    - name: is_api_backend
      expr: http.req.path matches /^\/api\//
    - name: is_slow
      expr: http.res.duration_ms > 500

stacks:
  selective_capture:
    plugins:
      - type: http_capture
        config:
          level: none
          format: json
          rules:
            # Capture all API traffic in full
            - name: "API traffic"
              expr: is_api_backend()
              level: full

            # Capture slow requests (any backend)
            - name: "Slow requests"
              expr: is_slow()
              level: details  # Headers only

            # Capture errors anywhere
            - name: "Errors"
              expr: http.res.status >= 400
              level: full

tap:
  direction: all
  ignore_loopback: false
  http:
    stack: selective_capture

Configuration 4: Production Setup with S3

version: 2

services:
  event_stores:
    - type: stdout

  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com
      region: us-east-1
      bucket: my-company-haproxy-traffic
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY
      insecure: false

rulekit:
  macros:
    - name: is_error
      expr: http.res.status >= 400

stacks:
  production_capture:
    plugins:
      - type: http_capture
        config:
          level: none
          format: json
          rules:
            # Only capture errors in production
            - name: "Production errors"
              expr: is_error()
              level: full

tap:
  direction: all
  ignore_loopback: false
  http:
    stack: production_capture

Part 4: Real-World Use Cases

Use Case 1: Debugging Sticky Sessions

Monitor sticky session behavior (source IP-based persistence):

haproxy.cfg:

backend api_servers
    balance source  # Sticky sessions based on source IP
    hash-type consistent

    server api1 backend-api-1:8001 check
    server api2 backend-api-2:8002 check

qtap.yaml:

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

stacks:
  sticky_session_monitoring:
    plugins:
      - type: http_capture
        config:
          level: summary     # Metadata shows which backend
          format: json

tap:
  direction: egress          # Focus on haproxy→backend routing
  ignore_loopback: false
  http:
    stack: sticky_session_monitoring

Generate traffic from the same IP and verify that every request goes to the same backend.
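
The check itself is simple to automate once you have extracted (client IP, backend) pairs from the captured traffic. How you extract them is left open here: in a single-machine test every request shares one client IP, so the backend hostnames from the egress URLs are enough. The sketch below is a minimal helper with a hypothetical name, find_sticky_violations, that reports clients seen on more than one backend.

```python
from collections import defaultdict


def find_sticky_violations(observations):
    """Return {client_ip: sorted backends} for clients that reached more than one backend."""
    seen = defaultdict(set)
    for client_ip, backend in observations:
        seen[client_ip].add(backend)
    return {ip: sorted(backends) for ip, backends in seen.items() if len(backends) > 1}


# Example pairs; in a real run these come from the captured traffic.
observations = [
    ("10.0.0.5", "backend-api-1"),
    ("10.0.0.5", "backend-api-1"),
    ("10.0.0.9", "backend-api-2"),
    ("10.0.0.9", "backend-api-1"),  # same client, different backend
]
print(find_sticky_violations(observations))
```

An empty result means persistence held for every client in the sample. A client can legitimately move to another backend when its original server is marked DOWN, so compare any violations against the health check logs before treating them as a bug.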

Use Case 2: Blue/Green Deployment Monitoring

Monitor traffic split during blue/green deployments:

haproxy.cfg:

backend app_servers
    # 90% traffic to blue (stable): two servers at weight 45 each
    server blue1 blue-app-1:8001 check weight 45
    server blue2 blue-app-2:8002 check weight 45

    # 10% traffic to green (canary)
    server green1 green-app-1:8001 check weight 10

qtap.yaml:

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

stacks:
  deployment_monitoring:
    plugins:
      - type: http_capture
        config:
          level: summary
          format: json
          rules:
            # Capture all traffic to see distribution
            - name: "All traffic"
              expr: http.res.status >= 0
              level: summary

tap:
  direction: egress
  ignore_loopback: false
  http:
    stack: deployment_monitoring

Analyze logs to verify 90/10 split and monitor error rates per version.
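
Server weights are relative: each server's expected share is its weight divided by the sum of all weights in the backend. The sketch below computes those expected shares and compares them with observed request counts. The function names and the example counts are illustrative, and the observed counts are assumed to come from your own tally of egress transactions.

```python
def expected_shares(weights):
    """Expected traffic share per server: weight / sum of all weights."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def compare_split(weights, observed_counts):
    """Return {server: (expected_share, observed_share)} for every configured server."""
    expected = expected_shares(weights)
    total = sum(observed_counts.values())
    return {
        name: (expected[name], observed_counts.get(name, 0) / total if total else 0.0)
        for name in weights
    }


weights = {"blue1": 45, "blue2": 45, "green1": 10}
observed = {"blue1": 440, "blue2": 462, "green1": 98}  # made-up counts for illustration
for name, (want, got) in compare_split(weights, observed).items():
    print(f"{name}: expected {want:.1%}, observed {got:.1%}")
```

Because the shares are relative, weights of 90, 90, and 10 would give the green server only 10/190, about 5.3% of requests, not 10%. Weights of 45, 45, and 10 produce a true 90/10 split.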

Use Case 3: API Rate Limiting Detection

Monitor for rate limiting and throttling:

qtap.yaml:

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

rulekit:
  macros:
    - name: is_rate_limited
      expr: http.res.status == 429 || http.res.status == 503
    - name: is_retry
      # Retry-After is a response header, so match it on the response
      expr: http.res.headers.retry-after != ""

stacks:
  rate_limit_monitoring:
    plugins:
      - type: http_capture
        config:
          level: none
          format: json
          rules:
            # Capture rate limiting events
            - name: "Rate limited"
              expr: is_rate_limited()
              level: full

            # Capture retry attempts
            - name: "Retries"
              expr: is_retry()
              level: details

tap:
  direction: all
  ignore_loopback: false
  http:
    stack: rate_limit_monitoring

Use Case 4: Multi-Datacenter Load Balancing

Monitor traffic distribution across multiple datacenters:

haproxy.cfg:

backend geo_distributed
    # Primary datacenter (low latency)
    server dc1-web1 dc1-web-1:8001 check
    server dc1-web2 dc1-web-2:8002 check

    # Backup datacenter (high latency backup)
    server dc2-web1 dc2-web-1:8001 check backup
    server dc2-web2 dc2-web-2:8002 check backup

qtap.yaml:

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

rulekit:
  macros:
    - name: is_backup_dc
      expr: http.req.url matches /dc2-/

stacks:
  datacenter_monitoring:
    plugins:
      - type: http_capture
        config:
          level: none
          format: json
          rules:
            # Always capture backup datacenter traffic (should be rare)
            - name: "Backup DC traffic"
              expr: is_backup_dc()
              level: full

            # Capture primary DC errors
            - name: "Primary DC errors"
              expr: !is_backup_dc() && http.res.status >= 500
              level: full

tap:
  direction: egress
  ignore_loopback: false
  http:
    stack: datacenter_monitoring

Understanding HAProxy + Qtap

Dual Capture for Load Balancing

When HAProxy routes a request, Qtap captures two transactions:

Transaction 1: INGRESS (Client → HAProxy)

Source Process: haproxy
Direction: INGRESS ←
URL: http://localhost:8085/api/users

Transaction 2: EGRESS (HAProxy → Backend)

Source Process: haproxy
Direction: EGRESS →
URL: http://backend-api-1:8001/api/users  # Or backend-api-2 depending on load balancing

This lets you:

  • See which backend served each request

  • Measure HAProxy overhead (ingress duration - egress duration)

  • Verify load balancing algorithm behavior

  • Detect backend-specific issues
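
The overhead calculation is a subtraction once the ingress and egress transactions for one request have been paired up. The sketch below assumes durations are reported as strings like "12ms", as in the sample output earlier in this guide; pairing the two transactions is left to you, and the function names are hypothetical.

```python
import re

# Matches values such as "12ms" or "8.5 ms".
_DURATION = re.compile(r"^(\d+(?:\.\d+)?)\s*ms$")


def parse_duration_ms(text: str) -> float:
    """Parse a duration string such as '12ms' into milliseconds."""
    match = _DURATION.match(text.strip())
    if not match:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(match.group(1))


def haproxy_overhead_ms(ingress: str, egress: str) -> float:
    """Time spent inside HAProxy: ingress duration minus egress duration."""
    return parse_duration_ms(ingress) - parse_duration_ms(egress)


print(haproxy_overhead_ms("12ms", "8ms"))  # 4.0
```

With the 12ms ingress and 8ms egress from the sample output, roughly 4ms was spent in HAProxy itself (queueing, routing, and forwarding).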

HAProxy-Specific Features

Process Identification:

  • Look for exe containing haproxy

  • Typically /usr/local/sbin/haproxy

Load Balancing Algorithms:

  • roundrobin: Rotate through backends equally

  • leastconn: Send to backend with fewest connections

  • source: Sticky sessions based on source IP

  • uri: Route based on request URI

Qtap shows which backend was chosen for each request.
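
To build intuition for what to expect in the captured egress traffic, here is a toy model of three of these algorithms. It is deliberately simplified and is not HAProxy's implementation: it ignores weights, server state, and HAProxy's own hash functions. All function names are made up for this example.

```python
import hashlib
import itertools


def roundrobin(servers):
    """Yield servers in rotation: s1, s2, ..., s1, s2, ..."""
    return itertools.cycle(servers)


def leastconn(active_connections):
    """Pick the server with the fewest active connections (first one wins ties)."""
    return min(active_connections, key=active_connections.get)


def source_hash(client_ip, servers):
    """Map a client IP to a server deterministically, so the same IP keeps the same server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


rr = roundrobin(["api1", "api2"])
print([next(rr) for _ in range(4)])         # ['api1', 'api2', 'api1', 'api2']
print(leastconn({"web1": 3, "web2": 1}))    # web2
print(source_hash("10.0.0.5", ["api1", "api2"]) == source_hash("10.0.0.5", ["api1", "api2"]))  # True
```

With roundrobin you should see backends alternate in the egress URLs. With leastconn and short, sequential curl requests the choice can look similar because connections rarely overlap. With source, every request from one client IP should show the same backend.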

Health Checks:

  • HAProxy continuously health-checks its backends

  • Qtap captures these checks (they can be filtered out)

  • Failed health checks are visible in the logs


Troubleshooting

Not Seeing HAProxy Traffic?

Check 1: Is HAProxy running?

docker logs haproxy-demo
# Should see backend servers marked as UP

Check 2: Was Qtap running before the requests were sent?

docker logs qtap-haproxy | head -20

Check 3: Are backends healthy?

# Check HAProxy stats
curl http://localhost:8086/stats
# Or check logs
docker logs haproxy-demo | grep -i "check"

Check 4: Is ignore_loopback correct?

tap:
  ignore_loopback: false  # MUST be false

Seeing Only Health Checks?

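Health checks are noisy. A process-level filter is a blunt tool for this: the entry below excludes every connection made by the haproxy binary, real traffic included, so no rule can capture HAProxy traffic afterwards.

filters:
  custom:
    - exe: /usr/local/sbin/haproxy
      strategy: exact

To keep HAProxy traffic and drop only the health checks, set the default level to none and use a rule instead: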

rules:
  - name: "Skip health checks"
    expr: http.req.path != "/health"
    level: full

Backend Server Down?

If a backend is down, HAProxy won't route to it. Check logs:

# Check which backends are UP
docker logs haproxy-demo | grep "UP\|DOWN"

# Restart a backend
docker restart backend-api-1

Too Much Traffic?

Apply conditional capture:

config:
  level: none
  rules:
    - name: "Errors only"
      expr: http.res.status >= 400
      level: full

Performance Considerations

HAProxy + Qtap Performance

  • CPU: ~1-3% overhead (approximate; depends on traffic volume and capture level)

  • Memory: ~50-200MB for Qtap

  • Latency: No extra network hop is added to the request path (passive observation)

HAProxy is extremely performance-sensitive. Best practices:

  1. Use level: summary for high volume

  2. Filter health checks (very noisy)

  3. Capture selectively with rules

  4. Send to S3 with batching

  5. Monitor Qtap resource usage

Scaling Recommendations

Traffic Volume          Recommended Level    Notes
< 1000 req/sec          full                 Capture everything
1000-10000 req/sec      details              Headers only
10000-100000 req/sec    summary              Metadata only
> 100000 req/sec        conditional          Errors only, aggressive filtering
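
The recommendations above can be encoded as a small lookup, for example to pick a capture level from a measured request rate in a deployment script. This is a sketch of the table only: the thresholds are the guide's rules of thumb, the handling of the exact boundary values is an assumption, and "conditional" is not a Qtap level but shorthand for level: none plus rules. The function name recommended_level is hypothetical.

```python
def recommended_level(req_per_sec: float) -> str:
    """Map a request rate to the capture level suggested by the scaling table."""
    if req_per_sec < 1_000:
        return "full"          # capture everything
    if req_per_sec < 10_000:
        return "details"       # headers only
    if req_per_sec <= 100_000:
        return "summary"       # metadata only
    return "conditional"       # level: none plus rules (errors only)


for rate in (200, 5_000, 50_000, 250_000):
    print(f"{rate:>7} req/sec -> {recommended_level(rate)}")
```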

HAProxy can handle millions of connections. Qtap scales with it.


HAProxy vs NGINX/Caddy/Traefik

Purpose:

  • HAProxy: Dedicated load balancer (Layer 4 + Layer 7)

  • NGINX: Web server + reverse proxy + load balancer

  • Caddy: Web server + automatic HTTPS

  • Traefik: Cloud-native reverse proxy

Performance:

  • HAProxy: Extreme performance, lowest latency

  • Others: Fast, but not HAProxy-level

Configuration:

  • HAProxy: Own syntax, focused on load balancing

  • NGINX: nginx.conf

  • Caddy: Caddyfile

  • Traefik: Docker labels/YAML

Qtap Compatibility:

  • All work perfectly with Qtap

  • Same capture quality across all


Next Steps

Alternative: Cloud Management:

  • Qplane - Manage Qtap with visual dashboards


Cleanup

# Stop all services
docker compose down

# Remove images
docker compose down --rmi local

# Clean up files
rm backend-service.py haproxy.cfg qtap.yaml docker-compose.yaml

This guide uses validated configurations. All examples are tested and guaranteed to work with HAProxy and Qtap.
