Fluent Bit Batching

Capturing All HTTP Traffic with Fluent Bit

Overview

This guide shows how to capture every HTTP request and response that Qtap observes and route it to storage backends using Fluent Bit. The pattern suits comprehensive observability, troubleshooting, audit trails, and compliance requirements that call for complete traffic capture.

Why Fluent Bit?

While Qtap can write directly to S3-compatible object stores, Fluent Bit provides critical advantages at scale:

Performance: Batches and buffers writes to reduce API calls
Reliability: Built-in retry logic, buffering, and backpressure handling
Flexibility: Route traffic to multiple destinations (S3, CloudWatch, Elasticsearch, etc.)
Filtering: Apply additional filtering, enrichment, or transformation
Cost optimization: Batch uploads reduce S3 API costs significantly
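
To make the batching claim concrete, here is a rough back-of-the-envelope sketch. The request rate and average transaction size are hypothetical figures chosen only for illustration, and the sketch assumes one PUT per transaction when objects are uploaded individually. It ignores compression and upload timeouts.

```shell
#!/bin/sh
# Hypothetical workload figures, chosen only to illustrate the arithmetic.
rate=200          # HTTP transactions per second (assumed)
avg_kib=4         # average serialized transaction size in KiB (assumed)
batch_mib=50      # batch size, matching the total_file_size 50M used in this guide

# One PUT per transaction if every object were uploaded individually.
direct_puts_per_day=$(( rate * 86400 ))

# Seconds needed to accumulate one full batch at the assumed rate.
seconds_per_batch=$(( batch_mib * 1024 / (rate * avg_kib) ))

# One PUT per full batch.
batched_puts_per_day=$(( 86400 / seconds_per_batch ))

echo "direct:  ${direct_puts_per_day} PUTs/day"
echo "batched: ${batched_puts_per_day} PUTs/day (one every ${seconds_per_batch}s)"
```

With these assumed figures the script reports 17280000 PUTs per day unbatched versus 1350 batched. Real ratios depend entirely on your traffic.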

Architecture

┌──────────┐  forward  ┌──────────────┐  parse   ┌──────────┐
│   Qtap   │──────────▸│  Fluent Bit  │─────────▸│ Filter & │
│  (eBPF)  │  port     │  (Batching)  │          │  Route   │
└──────────┘  24224    └──────────────┘          └────┬─────┘
                                                       │
                                        ┌──────────────┴──────────────┐
                                        ▼                             ▼
                                  ┌──────────┐                 ┌──────────┐
                                  │    S3    │                 │ Stdout / │
                                  │ (MinIO,  │                 │  Other   │
                                  │ AWS, GCS)│                 └──────────┘
                                  └──────────┘

How it works:

Qtap captures HTTP traffic using eBPF (TLS inspection, no proxies)
Qtap writes HTTP transaction objects as structured logs
Docker forwards logs to Fluent Bit using the fluentd log driver
Fluent Bit parses JSON, filters, and tags HTTP transactions
Fluent Bit batches, buffers, and routes to storage backends
Set TTL policies on storage (recommended: 90 days)

Docker Deployment

This setup uses Docker's fluentd log driver to forward logs directly to Fluent Bit over the network. This approach is simpler and more reliable than tailing log files.

Step 1: Create Qtap Configuration

Create qtap-config.yaml:

version: 2

services:
  event_stores:
    - type: stdout          # Connection metadata

  object_stores:
    - type: stdout          # Full HTTP payloads

stacks:
  capture_all:
    plugins:
      - type: http_capture
        config:
          level: full       # (none|summary|headers|full) - Capture headers + bodies
          format: json      # (json|text) - JSON for Fluent Bit parsing

tap:
  direction: egress         # egress | egress-external | egress-internal | ingress | all
  ignore_loopback: true     # Skip localhost traffic
  audit_include_dns: false  # Skip DNS queries
  http:
    stack: capture_all

Step 2: Create Fluent Bit Configuration

Create fluent-bit.conf:

[SERVICE]
    Flush        5
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf

# Input: Receive logs via forward protocol
[INPUT]
    Name        forward
    Listen      0.0.0.0
    Port        24224

# Filter: Only keep HTTP transactions (lines with metadata field)
# Match must equal the logging tag configured in docker-compose
[FILTER]
    Name    grep
    Match   docker.qtap
    Regex   log .*"metadata":.*

# Filter: Parse JSON logs from Qtap (runs after grep so the log field still exists)
[FILTER]
    Name        parser
    Match       docker.qtap
    Key_Name    log
    Parser      generic_json_parser
    Reserve_Data On
    Preserve_Key Off

# Output: Write to stdout for testing
[OUTPUT]
    Name         stdout
    Match        docker.qtap
    Format       json_lines

How the filters work:

The grep filter runs before parsing so it can match the raw log field while it still exists
The parser filter then expands the JSON payload and removes the original log key
This sequencing ensures only HTTP transaction data reaches downstream outputs
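
The effect of the grep filter can be previewed locally before deploying. The sketch below applies the same pattern to two made-up log lines, using grep -E as a stand-in for Fluent Bit's regex engine (an assumption that holds for a pattern this simple). The sample lines are hypothetical and not real Qtap output.

```shell
# Same pattern as the grep filter's Regex value; grep -E is used here only
# as a local stand-in for Fluent Bit's regex engine.
keep_transactions() {
  grep -E '.*"metadata":.*'
}

# Two made-up log lines: one HTTP transaction, one operational message.
sample_logs() {
  printf '%s\n' \
    '{"metadata":{"endpoint_id":"httpbin.org"},"request":{"method":"GET"}}' \
    'INFO  starting qtap (hypothetical operational line)'
}

# Only the line containing "metadata": is printed.
sample_logs | keep_transactions
```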

Create parsers.conf:

[PARSER]
    Name         docker
    Format       json
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On

[PARSER]
    Name   generic_json_parser
    Format json

Step 3: Create Docker Compose

Create docker-compose.yaml:

version: '3'

services:
  fluent-bit:
    image: fluent/fluent-bit:latest
    container_name: fluent-bit
    network_mode: host
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./parsers.conf:/fluent-bit/etc/parsers.conf
    command: ["fluent-bit", "-c", "/fluent-bit/etc/fluent-bit.conf"]

  qtap:
    image: us-docker.pkg.dev/qpoint-edge/public/qtap:v0
    container_name: qtap
    depends_on:
      - fluent-bit
    privileged: true
    user: "0:0"
    cap_add:
      - CAP_BPF
      - CAP_SYS_ADMIN
    pid: host
    network_mode: host
    volumes:
      - /sys:/sys
      - ./qtap-config.yaml:/app/config/qtap.yaml
    environment:
      - TINI_SUBREAPER=1
    ulimits:
      memlock: -1
    logging:
      driver: fluentd
      options:
        tag: docker.qtap       # Must match the Fluent Bit Match pattern
        fluentd-address: 127.0.0.1:24224
    command:
      - --log-level=info
      - --log-encoding=console
      - --config=/app/config/qtap.yaml

Step 4: Start and Validate

# Start services
docker compose up -d

# Wait a few seconds for eBPF initialization
sleep 5

# Generate test traffic
docker run --rm curlimages/curl -s https://httpbin.org/get

# View captured HTTP traffic
docker logs fluent-bit | grep 'metadata'

Expected output (Fluent Bit emits one JSON line per HTTP transaction; pretty-printed here for readability):

{
  "metadata": {
    "process_id": "108520",
    "process_exe": "/usr/bin/curl",
    "bytes_sent": 41,
    "bytes_received": 395,
    "connection_id": "d3shhpg7p3qm85psn2rg",
    "endpoint_id": "httpbin.org"
  },
  "request": {
    "method": "GET",
    "url": "https://httpbin.org/get",
    "scheme": "https",
    "path": "/get",
    "authority": "httpbin.org",
    "protocol": "http2",
    "request_id": "d3shhpg7p3qm85psn2s0",
    "user_agent": "curl/8.12.1",
    "headers": {
      ":authority": "httpbin.org",
      ":method": "GET",
      ":path": "/get",
      ":scheme": "https",
      "Accept": "*/*",
      "User-Agent": "curl/8.12.1"
    }
  },
  "response": {
    "status": 200,
    "content_type": "application/json",
    "headers": {
      ":status": "200",
      "Access-Control-Allow-Credentials": "true",
      "Access-Control-Allow-Origin": "*",
      "Content-Length": "255",
      "Content-Type": "application/json",
      "Date": "Wed, 22 Oct 2025 17:48:54 GMT",
      "Server": "gunicorn/19.9.0"
    },
    "body": "ewogICJhcmdzIjoge30sIAogICJoZWFkZXJzIjogewogICAgIkFjY2VwdCI6ICIqLyoiLCAKICAgICJIb3N0IjogImh0dHBiaW4ub3JnIiwgCiAgICAiVXNlci1BZ2VudCI6ICJjdXJsLzguMTIuMSIsIAogICAgIlgtQW16bi1UcmFjZS1JZCI6ICJSb290PTEtNjhmOTE4ZTYtNmRiZGJlZDA0ZDllMjNhOTU1NjQ0YmEyIgogIH0sIAogICJvcmlnaW4iOiAiNzMuNzEuMTM4LjEwOCIsIAogICJ1cmwiOiAiaHR0cHM6Ly9odHRwYmluLm9yZy9nZXQiCn0K"
  },
  "transaction_time": "2025-10-22T17:48:54.114375279Z",
  "duration_ms": 31205,
  "direction": "egress-external",
  "container_id": "49c2e0af4ffbbfff89be4766c6f84d7a3ad8e528d88fafeb3010facdb6374fb7",
  "container_name": "/qtap",
  "source": "stdout"
}

Response bodies are base64 encoded in the body field.
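
To read a captured body, decode the base64 string. The following is a minimal sketch that assumes GNU coreutils base64 is available; decode_body and sample_body are illustrative names, and the sample payload is invented rather than taken from a real capture.

```shell
# Decode a base64 string to its original bytes.
# Assumes GNU coreutils base64; some BSD/macOS versions spell the flag -D.
decode_body() {
  printf '%s' "$1" | base64 -d
}

# Illustrative payload, not taken from a real capture: base64 of {"ok":true}
sample_body='eyJvayI6dHJ1ZX0='

decode_body "$sample_body"
echo
```

In practice you would paste the value of the body field from a captured transaction in place of sample_body.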

Production Outputs

AWS S3

Replace the stdout output in fluent-bit.conf with:

[OUTPUT]
    Name              s3
    Match             docker.qtap
    bucket            your-bucket-name
    region            us-east-1
    total_file_size   50M
    upload_timeout    10m
    compression       gzip
    s3_key_format     /qtap/year=%Y/month=%m/day=%d/hour=%H/$UUID.gz
    store_dir         /tmp/fluent-bit/s3
    use_put_object    On
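
The s3_key_format value above mixes strftime fields with a $UUID placeholder. As a rough sketch of which partition path a given timestamp expands to, the snippet below runs the same format string through GNU date in UTC. Which timestamp Fluent Bit uses for a given file, and whether it is expressed in UTC, are assumptions here; expand_key is a hypothetical helper name.

```shell
# Expand the strftime fields of the key format for a given Unix timestamp.
# Assumes GNU date (-d @EPOCH); BSD/macOS date uses -r EPOCH instead.
# $UUID is left as a literal placeholder because Fluent Bit fills it in.
expand_key() {
  date -u -d "@$1" '+/qtap/year=%Y/month=%m/day=%d/hour=%H/$UUID.gz'
}

# 1761155334 is 2025-10-22T17:48:54Z, the transaction_time in the sample output.
expand_key 1761155334
```

The year=/month=/day=/hour= layout groups uploads by hour, which keeps lifecycle rules and later queries scoped to a predictable prefix.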

Add environment variables to the fluent-bit service in docker-compose:

environment:
  - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
  - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}

Set lifecycle policy:

aws s3api put-bucket-lifecycle-configuration \
  --bucket your-bucket-name \
  --lifecycle-configuration '{
    "Rules": [{
      "Id": "DeleteAfter90Days",
      "Status": "Enabled",
      "Prefix": "qtap/",
      "Expiration": {"Days": 90}
    }]
  }'

MinIO (S3-Compatible)

[OUTPUT]
    Name              s3
    Match             docker.qtap
    endpoint          http://minio:9000
    bucket            qtap-http-traffic
    region            us-east-1
    total_file_size   50M
    compression       gzip
    s3_key_format     /qtap/year=%Y/month=%m/day=%d/$UUID.gz
    use_put_object    On

Because the fluent-bit service in this guide runs with network_mode: host, the hostname minio resolves only if the host itself can resolve it. Otherwise, point endpoint at an address that is reachable from the host.

AWS CloudWatch Logs

[OUTPUT]
    Name              cloudwatch_logs
    Match             docker.qtap
    region            us-east-1
    log_group_name    /qpoint/http-traffic
    log_stream_prefix qtap-
    auto_create_group On

Set retention policy:

aws logs put-retention-policy \
  --log-group-name /qpoint/http-traffic \
  --retention-in-days 90

Multiple Destinations

You can send data to multiple outputs simultaneously:

# Send to S3 for long-term storage
[OUTPUT]
    Name              s3
    Match             docker.qtap
    bucket            your-bucket-name
    region            us-east-1
    total_file_size   50M
    upload_timeout    10m
    compression       gzip
    s3_key_format     /qtap/year=%Y/month=%m/day=%d/hour=%H/$UUID.gz
    store_dir         /tmp/fluent-bit/s3
    use_put_object    On

# Also send to stdout for debugging
[OUTPUT]
    Name    stdout
    Match   docker.qtap
    Format  json_lines

Filtering and Optimization

Capture Only Errors (4xx/5xx)

To reduce volume, capture only failed requests using Qtap's Rulekit:

stacks:
  error_only:
    plugins:
      - type: http_capture
        config:
          level: none           # Don't capture by default
          format: json
          rules:
            - name: "Capture errors"
              expr: http.res.status >= 400
              level: full       # Capture errors fully

Filter by Domain

Capture only specific domains:

tap:
  http:
    stack: lightweight_stack    # Default: summary level (this stack must be defined under stacks:)

  endpoints:
    - domain: 'api.important.com'
      http:
        stack: capture_all      # Full capture for this domain

Exclude Noisy Processes

tap:
  filters:
    groups:
      - qpoint              # Exclude qtap's own traffic
    custom:
      - exe: /usr/bin/healthcheck
        strategy: exact
      - exe: /usr/sbin/
        strategy: prefix    # Exclude all /usr/sbin/ processes

Filter in Fluent Bit

You can also filter within Fluent Bit using the grep filter:

# Only capture requests to specific domain
# (place after the parser filter; nested fields need record accessor syntax)
[FILTER]
    Name    grep
    Match   docker.qtap
    Regex   $request['url'] .*api\.important\.com.*

# Only keep stdout log lines (tail input exposes `stream`, the forward driver exposes `source`)
[FILTER]
    Name    grep
    Match   docker.*
    Regex   stream stdout

# Exclude health checks
# (place after the parser filter; nested fields need record accessor syntax)
[FILTER]
    Name    grep
    Match   docker.qtap
    Exclude $request['url'] .*/health$

With Docker's fluentd log driver (the forward-protocol deployment used in this guide), each record carries the channel in a source field rather than stream, so use Regex source stdout in that deployment model.
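
Anchored patterns such as the health-check exclusion are easy to get subtly wrong, so it can help to try them against sample URLs first. The sketch below uses grep -Ev as a local stand-in for the Exclude rule. The URLs are hypothetical, and the stand-in assumes Fluent Bit's regex engine treats this simple pattern the same way.

```shell
# Local stand-in for "Exclude ... .*/health$": drop lines matching the pattern.
drop_health_checks() {
  grep -Ev '.*/health$'
}

# Hypothetical URLs used only to exercise the pattern.
sample_urls() {
  printf '%s\n' \
    'https://api.important.com/health' \
    'https://api.important.com/healthz' \
    'https://api.important.com/health?verbose=1' \
    'https://api.important.com/v1/orders'
}

# Prints the three URLs that do not end in exactly /health.
sample_urls | drop_health_checks
```

Only the URL ending in exactly /health is dropped. Variants such as /healthz or /health?verbose=1 pass through, so widen the pattern if those should be excluded too.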

Monitoring and Troubleshooting

Verify Qtap is Capturing Traffic

# Check Qtap logs for HTTP transactions
docker logs qtap | grep '"metadata"'

# Verify TLS detection
docker logs qtap | grep '"is_tls":true'

What to look for:

"exe": "/usr/bin/curl" - Process identified correctly
"protocol": "http2" or "http1" - NOT "other"
"is_tls": true - TLS detected
"tlsProbeTypesDetected": ["openssl"] - TLS library hooked
Full HTTP details visible despite HTTPS

Verify Fluent Bit is Processing

# Check Fluent Bit logs for errors (Fluent Bit logs to stderr, so merge it into the pipe)
docker logs fluent-bit 2>&1 | grep -i "error\|warn"

# Count processed HTTP objects (check for the metadata field)
docker logs fluent-bit | grep -c '"metadata"'

# Check S3 uploads (if using S3 output)
docker logs fluent-bit 2>&1 | grep "s3"

Common Issues

No HTTP objects captured:

Qtap must be running before traffic is generated
Wait 5-10 seconds after starting qtap for eBPF initialization
Check qtap logs for connection events
Verify processes are using supported TLS libraries (OpenSSL, BoringSSL, GnuTLS)

Fluent Bit not receiving logs:

Check that Fluent Bit started before Qtap
Verify port 24224 is reachable from the Docker host (the Docker daemon, not the Qtap container, opens the connection to fluentd-address)
Test connectivity from the host: nc -zv 127.0.0.1 24224
Check Fluent Bit logs for connection messages
If you see Error binding socket, remove any leftover Fluent Bit containers that already bound to port 24224 and re-run docker compose up -d

No HTTP transactions in Fluent Bit output:

Qtap writes both HTTP transaction JSON and operational messages to stdout
The grep filter Regex log .*"metadata":.* ensures only HTTP transactions are captured
Verify HTTP transactions reached Fluent Bit: docker logs fluent-bit | grep '"metadata"'
If you see records without "metadata", those are operational logs that should be filtered out
Make sure the grep filter runs before the parser filter, or set Preserve_Key On; otherwise the log field is removed before the match runs

High memory usage:

Reduce Fluent Bit Flush interval (more frequent uploads)
Adjust total_file_size for S3 batching (smaller = more frequent uploads)
Add filtering to reduce captured volume
Enable compression for outputs

S3 upload failures:

Verify AWS credentials are correct
Check IAM permissions (s3:PutObject required)
Ensure bucket exists and is in the correct region
Check network connectivity to S3 endpoint

Alternative: File Tailing Approach

If you have existing logging infrastructure or cannot use the forward protocol, you can tail Docker's log files directly:

fluent-bit.conf:

[INPUT]
    Name              tail
    Path              /var/lib/docker/containers/*/*.log
    Parser            docker
    Tag               docker.*
    Refresh_Interval  5
    Read_from_Head    true

# Keep only stdout log lines (tail input exposes the field as `stream`)
[FILTER]
    Name    grep
    Match   docker.*
    Regex   stream stdout

[FILTER]
    Name    grep
    Match   docker.*
    Regex   log .*"metadata":.*

# Parse the JSON payload after filtering
[FILTER]
    Name        parser
    Match       docker.*
    Key_Name    log
    Parser      generic_json_parser
    Reserve_Data On
    Preserve_Key Off

docker-compose.yaml changes:

fluent-bit:
  volumes:
    - /var/lib/docker/containers:/var/lib/docker/containers:ro

qtap:
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "3"

Ensure your parsers.conf file includes both the built-in docker parser and the generic_json_parser definition so the tail input can decode Docker log metadata before emitting HTTP transactions.

This approach requires more complex path management and may have permission issues. The forward protocol approach is recommended.

Best Practices

Start Small: Begin with error-only capture or specific domains, then expand
Set TTLs: Always configure lifecycle policies (90 days recommended)
Monitor Volume: Track storage growth and adjust filtering as needed
Use IAM Roles: Never hardcode S3 credentials in production
Compress: Enable gzip compression for S3 uploads
Batch Uploads: Use an appropriate total_file_size (this guide uses 50M)
Test Filtering: Validate filters match expected objects
Health Checks: Monitor Fluent Bit metrics and error logs
Backup Config: Version control all configuration files
Security: Limit access to logs - they contain sensitive data

Summary

This guide demonstrated a validated deployment pattern for capturing all HTTP traffic with Fluent Bit using Docker:

Docker Deployment: Qtap stdout → Forward protocol → Fluent Bit → S3/CloudWatch

This pattern provides:

✅ Complete HTTP capture (headers + bodies)
✅ TLS inspection without proxies
✅ Batched, buffered writes for performance
✅ Flexible routing to multiple destinations
✅ Data sovereignty (sensitive data stays in your infrastructure)

For questions or advanced configurations, see:


Last updated 13 days ago
