Capturing All HTTP Traffic with Fluent Bit

Overview

This guide shows you how to capture every HTTP request and response from Qtap and route it to storage backends using Fluent Bit. This pattern is ideal for comprehensive observability, troubleshooting, audit trails, and compliance requirements where you need complete traffic capture.

Why Fluent Bit?

While Qtap can write directly to S3-compatible object stores, Fluent Bit provides critical advantages at scale:

  • Performance: Batches and buffers writes to reduce API calls

  • Reliability: Built-in retry logic, buffering, and backpressure handling

  • Flexibility: Route traffic to multiple destinations (S3, CloudWatch, Elasticsearch, etc.)

  • Filtering: Apply additional filtering, enrichment, or transformation

  • Cost optimization: Batch uploads reduce S3 API costs significantly
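The cost impact of batching is easy to see with back-of-the-envelope math. This is illustrative only: it assumes 1,000 records per second at roughly 1 KB each and a nominal S3 PUT price of $0.005 per 1,000 requests (check current AWS pricing for real numbers):

```python
# Illustrative arithmetic only; record rate, record size, and PUT
# pricing below are assumptions, not measurements or a price quote.
records_per_day = 1_000 * 86_400           # 86.4M records/day at 1,000/s
record_size_kb = 1                          # ~1 KB per HTTP transaction
batch_size_kb = 50 * 1024                   # 50M batches (total_file_size)

unbatched_puts = records_per_day            # one PUT per record
batched_puts = records_per_day * record_size_kb / batch_size_kb

price_per_put = 0.005 / 1_000               # nominal $0.005 per 1,000 PUTs
print(f"unbatched: ${unbatched_puts * price_per_put:,.2f}/day")
print(f"batched:   ${batched_puts * price_per_put:,.4f}/day")
```

Under these assumptions, per-record PUTs cost hundreds of dollars a day while 50M batches cost well under a cent, which is why batching matters at scale.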

Architecture

┌──────────┐  forward  ┌──────────────┐  parse   ┌──────────┐
│   Qtap   │──────────▸│  Fluent Bit  │─────────▸│ Filter & │
│  (eBPF)  │  port     │  (Batching)  │          │  Route   │
└──────────┘  24224    └──────────────┘          └────┬─────┘

                                        ┌──────────────┴──────────────┐
                                        ▼                             ▼
                                  ┌──────────┐                 ┌──────────┐
                                  │    S3    │                 │ Stdout / │
                                  │ (MinIO,  │                 │  Other   │
                                  │ AWS, GCS)│                 └──────────┘
                                  └──────────┘

How it works:

  1. Qtap captures HTTP traffic using eBPF (TLS inspection, no proxies)

  2. Qtap writes HTTP transaction objects as structured logs

  3. Docker forwards logs to Fluent Bit using the fluentd log driver

  4. Fluent Bit parses JSON, filters, and tags HTTP transactions

  5. Fluent Bit batches, buffers, and routes to storage backends

  6. Storage lifecycle (TTL) policies expire old data (90 days recommended)

Docker Deployment

This setup uses Docker's fluentd log driver to forward logs directly to Fluent Bit over the network. This approach is simpler and more reliable than tailing log files.

Step 1: Create Qtap Configuration

Create qtap-config.yaml:

version: 2

services:
  event_stores:
    - type: stdout          # Connection metadata

  object_stores:
    - type: stdout          # Full HTTP payloads

stacks:
  capture_all:
    plugins:
      - type: http_capture
        config:
          level: full       # (none|summary|details|full) - Capture headers + bodies
          format: json      # (json|text) - JSON for Fluent Bit parsing

tap:
  direction: egress         # egress | egress-external | egress-internal | ingress | all
  ignore_loopback: true     # Skip localhost traffic
  audit_include_dns: false  # Skip DNS queries
  http:
    stack: capture_all

Step 2: Create Fluent Bit Configuration

Create fluent-bit.conf:

[SERVICE]
    Flush        5
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf

# Input: Receive logs via forward protocol
[INPUT]
    Name        forward
    Listen      0.0.0.0
    Port        24224

# Filter: Parse JSON logs from Qtap
[FILTER]
    Name        parser
    Match       docker.qtap
    Key_Name    log
    Parser      generic_json_parser
    Reserve_Data On
    Preserve_Key Off

# Filter: Only keep HTTP transactions (from stdout)
[FILTER]
    Name    grep
    Match   docker.qtap
    Regex   source stdout

# Output: Write to stdout for testing
[OUTPUT]
    Name         stdout
    Match        docker.qtap
    Format       json_lines

How the filters work:

  • The parser filter extracts the JSON from the log field and promotes all JSON fields to top-level

  • The grep filter keeps only records where source equals stdout (HTTP transactions), filtering out stderr logs (operational messages)

  • This ensures only HTTP transaction data is sent to outputs
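The combined effect of the two filters can be sketched in plain Python. This is an illustration of the logic, not Fluent Bit's implementation, and the sample records are hypothetical stand-ins for what Docker's fluentd log driver delivers:

```python
import json

def process(record):
    """Mimic the parser filter (promote JSON from the 'log' field,
    Preserve_Key Off, Reserve_Data On) and the grep filter
    (keep only records where source == 'stdout')."""
    try:
        parsed = json.loads(record.pop("log"))
    except (KeyError, ValueError):
        return None             # not valid JSON: dropped by the parser
    record.update(parsed)       # Reserve_Data On: keep remaining fields
    if record.get("source") != "stdout":
        return None             # grep drops stderr (operational logs)
    return record

# Hypothetical records as the fluentd log driver would deliver them
records = [
    {"source": "stdout", "log": '{"metadata": {"endpoint_id": "httpbin.org"}}'},
    {"source": "stderr", "log": '{"level": "warn", "msg": "operational"}'},
]
kept = [out for r in records if (out := process(r))]
# Only the stdout record (the HTTP transaction) survives
```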

Create parsers.conf:

[PARSER]
    Name   generic_json_parser
    Format json

Step 3: Create Docker Compose

Create docker-compose.yaml:

version: '3'

services:
  fluent-bit:
    image: fluent/fluent-bit:latest
    container_name: fluent-bit
    network_mode: host
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./parsers.conf:/fluent-bit/etc/parsers.conf
    command: ["fluent-bit", "-c", "/fluent-bit/etc/fluent-bit.conf"]

  qtap:
    image: us-docker.pkg.dev/qpoint-edge/public/qtap:v0
    container_name: qtap
    depends_on:
      - fluent-bit
    privileged: true
    user: "0:0"
    cap_add:
      - CAP_BPF
      - CAP_SYS_ADMIN
    pid: host
    network_mode: host
    volumes:
      - /sys:/sys
      - ./qtap-config.yaml:/app/config/qtap.yaml
    environment:
      - TINI_SUBREAPER=1
    ulimits:
      memlock: -1
    logging:
      driver: fluentd
      options:
        tag: docker.qtap
        fluentd-address: 127.0.0.1:24224
    command:
      - --log-level=warn
      - --log-encoding=console
      - --config=/app/config/qtap.yaml

Step 4: Start and Validate

# Start services
docker compose up -d

# Wait a few seconds for eBPF initialization
sleep 5

# Generate test traffic
docker run --rm curlimages/curl -s https://httpbin.org/get

# View captured HTTP traffic
docker logs fluent-bit | grep 'metadata'

Expected output (one line per HTTP transaction):

{
  "metadata": {
    "process_id": "12345",
    "process_exe": "/usr/bin/curl",
    "bytes_sent": 41,
    "bytes_received": 395,
    "connection_id": "abc123",
    "endpoint_id": "httpbin.org"
  },
  "request": {
    "method": "GET",
    "url": "https://httpbin.org/get",
    "protocol": "http2",
    "headers": {
      "Accept": "*/*",
      "User-Agent": "curl/8.12.1"
    }
  },
  "response": {
    "status": 200,
    "content_type": "application/json",
    "headers": {
      "Content-Type": "application/json",
      "Server": "gunicorn/19.9.0"
    },
    "body": "ewogICJhcmdzIjoge30sIAogICJoZWFkZXJzIjogeyAuLi4gfQp9Cg=="
  },
  "transaction_time": "2025-10-16T22:42:51.951526489Z",
  "duration_ms": 172,
  "direction": "egress-external"
}

Response bodies are base64-encoded in the body field.
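Decoding a body is a one-liner; a minimal sketch (the transaction below is constructed for this example rather than taken from a real capture):

```python
import base64
import json

# Hypothetical captured transaction; body constructed for illustration
transaction = {
    "response": {
        "content_type": "application/json",
        "body": base64.b64encode(b'{"args": {}}').decode(),
    }
}

raw = base64.b64decode(transaction["response"]["body"])
body = json.loads(raw)   # parse further when content_type is JSON
print(body)              # {'args': {}}
```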

Production Outputs

AWS S3

Replace the stdout output in fluent-bit.conf with:

[OUTPUT]
    Name              s3
    Match             docker.qtap
    bucket            your-bucket-name
    region            us-east-1
    total_file_size   50M
    upload_timeout    10m
    compression       gzip
    s3_key_format     /qtap/year=%Y/month=%m/day=%d/hour=%H/$UUID.gz
    store_dir         /tmp/fluent-bit/s3
    use_put_object    On

Add environment variables to the fluent-bit service in docker-compose:

environment:
  - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
  - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}

Set lifecycle policy:

aws s3api put-bucket-lifecycle-configuration \
  --bucket your-bucket-name \
  --lifecycle-configuration '{
    "Rules": [{
      "Id": "DeleteAfter90Days",
      "Status": "Enabled",
      "Prefix": "qtap/",
      "Expiration": {"Days": 90}
    }]
  }'

MinIO (S3-Compatible)

[OUTPUT]
    Name              s3
    Match             docker.qtap
    endpoint          http://minio:9000
    bucket            qtap-http-traffic
    region            us-east-1
    total_file_size   50M
    compression       gzip
    s3_key_format     /qtap/year=%Y/month=%m/day=%d/$UUID.gz
    use_put_object    On

AWS CloudWatch Logs

[OUTPUT]
    Name              cloudwatch_logs
    Match             docker.qtap
    region            us-east-1
    log_group_name    /qpoint/http-traffic
    log_stream_prefix qtap-
    auto_create_group On

Set retention policy:

aws logs put-retention-policy \
  --log-group-name /qpoint/http-traffic \
  --retention-in-days 90

Multiple Destinations

You can send data to multiple outputs simultaneously:

# Send to S3 for long-term storage
[OUTPUT]
    Name    s3
    Match   docker.qtap
    bucket  your-bucket-name
    region  us-east-1
    ...

# Also send to stdout for debugging
[OUTPUT]
    Name    stdout
    Match   docker.qtap
    Format  json_lines

Filtering and Optimization

Capture Only Errors (4xx/5xx)

To reduce volume, capture only failed requests using Qtap's Rulekit:

stacks:
  error_only:
    plugins:
      - type: http_capture
        config:
          level: none           # Don't capture by default
          format: json
          rules:
            - name: "Capture errors"
              expr: http.res.status >= 400
              level: full       # Capture errors fully
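Conceptually, each transaction's capture level is decided by the first matching rule, falling back to the stack's default level. A plain-Python sketch of that selection logic (an illustration, not Qtap's implementation):

```python
def capture_level(status, rules, default="none"):
    """Return the capture level for a transaction:
    first matching rule wins, else the stack default applies."""
    for rule in rules:
        if rule["matches"](status):
            return rule["level"]
    return default

# Mirrors the error_only stack: http.res.status >= 400 -> full
rules = [{"matches": lambda status: status >= 400, "level": "full"}]

assert capture_level(200, rules) == "none"   # success: not captured
assert capture_level(503, rules) == "full"   # error: captured fully
```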

Filter by Domain

Capture only specific domains:

tap:
  http:
    stack: lightweight_stack    # Default: summary level

  endpoints:
    - domain: 'api.important.com'
      http:
        stack: capture_all      # Full capture for this domain

Exclude Noisy Processes

tap:
  filters:
    groups:
      - qpoint              # Exclude qtap's own traffic
    custom:
      - exe: /usr/bin/healthcheck
        strategy: exact
      - exe: /usr/sbin/
        strategy: prefix    # Exclude all /usr/sbin/ processes
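The exact and prefix strategies behave the way their names suggest; a quick sketch of the matching logic (an illustration of the concept, not Qtap's implementation):

```python
def is_filtered(exe, filters):
    """True if a process executable path matches any filter entry."""
    for f in filters:
        if f["strategy"] == "exact" and exe == f["exe"]:
            return True
        if f["strategy"] == "prefix" and exe.startswith(f["exe"]):
            return True
    return False

# Mirrors the custom filters in the YAML above
filters = [
    {"exe": "/usr/bin/healthcheck", "strategy": "exact"},
    {"exe": "/usr/sbin/", "strategy": "prefix"},
]

assert is_filtered("/usr/sbin/cron", filters)       # prefix match: excluded
assert not is_filtered("/usr/bin/curl", filters)    # no match: captured
```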

Filter in Fluent Bit

You can also filter within Fluent Bit using the grep filter:

# Only capture requests to specific domain
[FILTER]
    Name    grep
    Match   docker.qtap
    Regex   $request['url'] .*api\.important\.com.*

# Exclude health checks
[FILTER]
    Name    grep
    Match   docker.qtap
    Exclude $request['url'] .*/health$
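The two grep expressions combine into "keep matches of the first, drop matches of the second". Checking the same regexes against sample URLs with Python's re module (the URLs are hypothetical):

```python
import re

keep = re.compile(r".*api\.important\.com.*")   # grep Regex: must match
drop = re.compile(r".*/health$")                # grep Exclude: must not match

urls = [
    "https://api.important.com/v1/users",
    "https://api.important.com/health",
    "https://other.example.com/v1/users",
]
selected = [u for u in urls if keep.search(u) and not drop.search(u)]
print(selected)  # ['https://api.important.com/v1/users']
```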

Monitoring and Troubleshooting

Verify Qtap is Capturing Traffic

# Check Qtap logs for HTTP transactions
docker logs qtap | grep '"metadata"'

# Verify TLS detection
docker logs qtap | grep '"is_tls":true'

What to look for:

  • "exe": "/usr/bin/curl" - Process identified correctly

  • "protocol": "http2" or "http1" - NOT "other"

  • "is_tls": true - TLS detected

  • "tlsProbeTypesDetected": ["openssl"] - TLS library hooked

  • Full HTTP details visible despite HTTPS

Verify Fluent Bit is Processing

# Check Fluent Bit logs for errors
docker logs fluent-bit | grep -i "error\|warn"

# Count processed HTTP objects (check for the metadata field)
docker logs fluent-bit | grep -c '"metadata"'

# Check S3 uploads (if using S3 output)
docker logs fluent-bit | grep "s3"

Common Issues

No HTTP objects captured:

  • Qtap must be running before traffic is generated

  • Wait 5-10 seconds after starting qtap for eBPF initialization

  • Check qtap logs for connection events

  • Verify processes are using supported TLS libraries (OpenSSL, BoringSSL, GnuTLS)

Fluent Bit not receiving logs:

  • Check that Fluent Bit started before Qtap

  • Verify port 24224 is accessible from Qtap container

  • Test connectivity from the host (the example compose file uses host networking, so Docker DNS names like fluent-bit will not resolve; use the loopback address): nc -zv 127.0.0.1 24224

  • Check Fluent Bit logs for connection messages

No HTTP transactions in Fluent Bit output:

  • Qtap writes HTTP transaction JSON to stdout and other logs to stderr

  • The grep filter Regex source stdout ensures only HTTP transactions are captured

  • Verify parsed records with: docker logs fluent-bit | grep '"source":"stdout"'

  • If you see records with "source":"stderr", those are operational logs, not HTTP data

High memory usage:

  • Reduce Fluent Bit Flush interval (more frequent uploads)

  • Adjust total_file_size for S3 batching (smaller = more frequent uploads)

  • Add filtering to reduce captured volume

  • Enable compression for outputs

S3 upload failures:

  • Verify AWS credentials are correct

  • Check IAM permissions (s3:PutObject required)

  • Ensure bucket exists and is in the correct region

  • Check network connectivity to S3 endpoint

Alternative: File Tailing Approach

If you have existing logging infrastructure or cannot use the forward protocol, you can tail Docker's log files directly:

fluent-bit.conf:

[INPUT]
    Name              tail
    Path              /var/lib/docker/containers/*/*.log
    Parser            docker
    Tag               docker.*
    Refresh_Interval  5
    Read_from_Head    true

[FILTER]
    Name    grep
    Match   docker.*
    Regex   log .*"metadata":.*

docker-compose.yaml changes:

fluent-bit:
  volumes:
    - /var/lib/docker/containers:/var/lib/docker/containers:ro

qtap:
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "3"

This approach requires more complex path management and may have permission issues. The forward protocol approach is recommended.

Best Practices

  1. Start Small: Begin with error-only capture or specific domains, then expand

  2. Set TTLs: Always configure lifecycle policies (90 days recommended)

  3. Monitor Volume: Track storage growth and adjust filtering as needed

  4. Use IAM Roles: Never hardcode S3 credentials in production

  5. Compress: Enable gzip compression for S3 uploads

  6. Batch Uploads: Use appropriate total_file_size (50M default)

  7. Test Filtering: Validate filters match expected objects

  8. Health Checks: Monitor Fluent Bit metrics and error logs

  9. Backup Config: Version control all configuration files

  10. Security: Limit access to logs - they contain sensitive data

Summary

This guide demonstrated a validated deployment pattern for capturing all HTTP traffic with Fluent Bit using Docker:

Docker Deployment: Qtap stdout → Forward protocol → Fluent Bit → S3/CloudWatch

This pattern provides:

  • ✅ Complete HTTP capture (headers + bodies)

  • ✅ TLS inspection without proxies

  • ✅ Batched, buffered writes for performance

  • ✅ Flexible routing to multiple destinations

  • ✅ Data sovereignty (sensitive data stays in your infrastructure)
