Capturing All HTTP Traffic with Fluent Bit

Overview

This guide shows you how to capture every HTTP request and response from Qtap and route it to storage backends using Fluent Bit. This pattern is ideal for comprehensive observability, troubleshooting, audit trails, and compliance requirements where you need complete traffic capture.

Why Fluent Bit?

While Qtap can write directly to S3-compatible object stores, Fluent Bit provides critical advantages at scale:

  • Performance: Batches and buffers writes to reduce API calls

  • Reliability: Built-in retry logic, buffering, and backpressure handling

  • Flexibility: Route traffic to multiple destinations (S3, CloudWatch, Elasticsearch, etc.)

  • Filtering: Apply additional filtering, enrichment, or transformation

  • Cost optimization: Batch uploads reduce S3 API costs significantly

Architecture

┌──────────┐  forward  ┌──────────────┐  parse   ┌──────────┐
│   Qtap   │──────────▸│  Fluent Bit  │─────────▸│ Filter & │
│  (eBPF)  │  port     │  (Batching)  │          │  Route   │
└──────────┘  24224    └──────────────┘          └────┬─────┘

                                        ┌──────────────┴──────────────┐
                                        ▼                             ▼
                                  ┌──────────┐                 ┌──────────┐
                                  │    S3    │                 │ Stdout / │
                                  │ (MinIO,  │                 │  Other   │
                                  │ AWS, GCS)│                 └──────────┘
                                  └──────────┘

How it works:

  1. Qtap captures HTTP traffic using eBPF (TLS inspection, no proxies)

  2. Qtap writes HTTP transaction objects as structured logs

  3. Docker forwards logs to Fluent Bit using the fluentd log driver

  4. Fluent Bit parses JSON, filters, and tags HTTP transactions

  5. Fluent Bit batches, buffers, and routes to storage backends

  6. Lifecycle (TTL) policies on the storage backend expire old data (recommended: 90 days)
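
The batching in step 5 can be illustrated with a minimal Python sketch. It is not how Fluent Bit is implemented; the `Batcher` class, its parameters, and the `count_uploads` helper are hypothetical names chosen for illustration. It only shows why buffering records until a size or age threshold is reached cuts the number of upload calls.

```python
import json
import time
from typing import Callable, List


class Batcher:
    """Buffer records and pass them to `upload` in batches (illustrative only)."""

    def __init__(self, upload: Callable[[bytes], None], max_bytes: int,
                 max_age_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.upload = upload          # hypothetical sink, e.g. one object-store PUT
        self.max_bytes = max_bytes    # flush once the buffer reaches this size
        self.max_age_s = max_age_s    # or once the oldest buffered record is this old
        self.clock = clock
        self._lines: List[bytes] = []
        self._size = 0
        self._started = clock()

    def add(self, record: dict) -> None:
        line = json.dumps(record).encode("utf-8") + b"\n"
        if not self._lines:
            self._started = self.clock()  # age is measured from the first record
        self._lines.append(line)
        self._size += len(line)
        too_big = self._size >= self.max_bytes
        too_old = self.clock() - self._started >= self.max_age_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if not self._lines:
            return
        self.upload(b"".join(self._lines))  # one call for the whole batch
        self._lines, self._size = [], 0


def count_uploads(n_records: int, max_bytes: int) -> int:
    """Return how many upload calls `n_records` records cause."""
    uploads: List[bytes] = []
    batcher = Batcher(uploads.append, max_bytes=max_bytes, max_age_s=3600.0)
    for i in range(n_records):
        batcher.add({"id": i, "status": 200})
    batcher.flush()  # send whatever is left in the buffer
    return len(uploads)
```

With a 1-byte threshold every record triggers its own upload, while a large threshold collapses the same records into a single call. That difference is the source of the API-cost savings described above.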

Docker Deployment

This setup uses Docker's fluentd log driver to forward logs directly to Fluent Bit over the network. This approach is simpler and more reliable than tailing log files.

Step 1: Create Qtap Configuration

Create qtap-config.yaml:

Step 2: Create Fluent Bit Configuration

Create fluent-bit.conf:

How the filters work:

  • The grep filter runs before parsing so it can match the raw log field while it still exists

  • The parser filter then expands the JSON payload and removes the original log key

  • This sequencing ensures only HTTP transaction data reaches downstream outputs
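
To see why the order matters, here is a small Python simulation of the two filters. It is a sketch of the logic only, not Fluent Bit code: the function names and the sample record fields (`process_exe`, `response.status`) are assumptions made for illustration, and this sketch keeps the record's other keys after parsing, which in the real parser filter depends on its settings.

```python
import json
import re
from typing import Callable, List, Optional

# Same pattern the guide uses for the grep filter on the `log` key.
METADATA_RE = re.compile(r'.*"metadata":.*')


def grep_filter(record: dict) -> Optional[dict]:
    """Keep the record only if its raw `log` string matches the pattern."""
    log = record.get("log")
    if isinstance(log, str) and METADATA_RE.search(log):
        return record
    return None  # record is dropped


def parser_filter(record: dict) -> dict:
    """Expand the JSON in `log` into top-level keys and remove `log`."""
    try:
        parsed = json.loads(record.get("log"))
    except (TypeError, ValueError):
        return record  # not JSON: leave the record unchanged
    if not isinstance(parsed, dict):
        return record
    rest = {k: v for k, v in record.items() if k != "log"}
    return {**rest, **parsed}


def run_pipeline(records: List[dict],
                 stages: List[Callable[[dict], Optional[dict]]]) -> List[dict]:
    """Apply each stage in order; a stage returning None drops the record."""
    out = []
    for record in records:
        current: Optional[dict] = record
        for stage in stages:
            current = stage(current)
            if current is None:
                break
        if current is not None:
            out.append(current)
    return out


# Hypothetical records: one HTTP transaction and one operational message.
SAMPLE = [
    {"source": "stdout",
     "log": '{"metadata": {"process_exe": "/usr/bin/curl"}, "response": {"status": 200}}'},
    {"source": "stdout", "log": "starting up"},
]

good = run_pipeline(SAMPLE, [grep_filter, parser_filter])
bad = run_pipeline(SAMPLE, [parser_filter, grep_filter])
```

In the `good` ordering one parsed transaction survives. In the `bad` ordering nothing does, because the parser has already removed the `log` key that grep needs to inspect.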

Create parsers.conf:

Step 3: Create Docker Compose

Create docker-compose.yaml:

Step 4: Start and Validate

Expected output (one line per HTTP transaction):

Response bodies are base64-encoded in the body field.
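
When inspecting captured output by hand, the body has to be decoded before it is readable. A minimal Python sketch follows; the `decode_body` helper and the assumption that the section is a dict with a `body` key are illustrative, so adjust the lookup to the actual record layout.

```python
import base64


def decode_body(section: dict) -> str:
    """Decode the base64 `body` field of a request or response section.

    `section` is assumed to be a dict such as record["response"]; bytes that
    are not valid UTF-8 are replaced rather than raising an error.
    """
    raw = base64.b64decode(section.get("body", ""))
    return raw.decode("utf-8", errors="replace")


# Example with a hand-built section (not real capture output).
example = {"body": base64.b64encode(b'{"ok": true}').decode("ascii")}
decoded = decode_body(example)
```

Binary payloads such as images or compressed content will not decode to meaningful text; for those, keep the raw bytes returned by `base64.b64decode` instead.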

Production Outputs

AWS S3

Replace the stdout output in fluent-bit.conf with:

Add environment variables to the fluent-bit service in docker-compose:

Set lifecycle policy:

MinIO (S3-Compatible)

AWS CloudWatch Logs

Set retention policy:

Multiple Destinations

You can send data to multiple outputs simultaneously:

Filtering and Optimization

Capture Only Errors (4xx/5xx)

To reduce volume, capture only failed requests using Qtap's Rulekit:

Filter by Domain

Capture only specific domains:

Exclude Noisy Processes

Filter in Fluent Bit

You can also filter within Fluent Bit using the grep filter:

When you aggregate logs with Docker's fluentd log driver (the forward protocol), each record carries the channel in a source field rather than stream, so change the rule to Regex source stdout in that deployment model.
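
The effect of choosing the wrong key can be shown with a short Python sketch of a grep-style filter. The `grep` helper is a hypothetical stand-in for the filter's keep-if-matches behavior, and the sample records only mimic the field names Docker's fluentd log driver attaches (`source`, `log`, `container_name`).

```python
import re
from typing import List


def grep(records: List[dict], key: str, pattern: str) -> List[dict]:
    """Keep records whose `key` holds a string matching `pattern`."""
    rx = re.compile(pattern)
    return [r for r in records
            if isinstance(r.get(key), str) and rx.search(r[key])]


# Hand-built records shaped like fluentd log driver output.
forward_records = [
    {"source": "stdout", "log": '{"metadata": {}}', "container_name": "/qtap"},
    {"source": "stderr", "log": "warning: example", "container_name": "/qtap"},
]

by_stream = grep(forward_records, "stream", "stdout")  # key is absent: nothing matches
by_source = grep(forward_records, "source", "stdout")  # keeps the stdout record
```

A rule written against `stream` silently drops every record in the forward-protocol setup, which can look like Qtap producing no output at all.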

Monitoring and Troubleshooting

Verify Qtap is Capturing Traffic

What to look for:

  • "exe": "/usr/bin/curl" - Process identified correctly

  • "protocol": "http2" or "http1" - NOT "other"

  • "is_tls": true - TLS detected

  • "tlsProbeTypesDetected": ["openssl"] - TLS library hooked

  • Full HTTP details visible despite HTTPS
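
The checklist above can be automated with a small Python sketch. Because the exact nesting of these fields in Qtap's output is not shown here, the hypothetical `check_capture` helper searches the whole record for each key instead of assuming a fixed path.

```python
from typing import Any, Iterator, List


def find_values(obj: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in a nested structure."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_values(item, key)


def check_capture(record: dict) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if not any(isinstance(v, str) and v for v in find_values(record, "exe")):
        problems.append("process exe not identified")
    protocols = list(find_values(record, "protocol"))
    if not protocols or any(p == "other" for p in protocols):
        problems.append('protocol missing or "other"')
    if not any(v is True for v in find_values(record, "is_tls")):
        problems.append("TLS not detected")
    probes = [p for v in find_values(record, "tlsProbeTypesDetected")
              if isinstance(v, list) for p in v]
    if not probes:
        problems.append("no TLS library hooked")
    return problems


# Hand-built examples; real records contain many more fields.
ok_record = {"process": {"exe": "/usr/bin/curl"},
             "connection": {"protocol": "http2", "is_tls": True,
                            "tlsProbeTypesDetected": ["openssl"]}}
bad_record = {"process": {"exe": "/usr/bin/curl"},
              "connection": {"protocol": "other", "is_tls": False}}
```

Running `check_capture` over parsed log lines gives a quick pass/fail signal before moving on to the Fluent Bit side.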

Verify Fluent Bit is Processing

Common Issues

No HTTP objects captured:

  • Qtap must be running before traffic is generated

  • Wait 5-10 seconds after starting qtap for eBPF initialization

  • Check qtap logs for connection events

  • Verify processes are using supported TLS libraries (OpenSSL, BoringSSL, GnuTLS)

Fluent Bit not receiving logs:

  • Check that Fluent Bit started before Qtap

  • Verify port 24224 is accessible from Qtap container

  • Test connectivity: docker exec qtap nc -zv 127.0.0.1 24224

  • Check Fluent Bit logs for connection messages

  • If you see Error binding socket, remove any leftover Fluent Bit containers that already bound to port 24224 and re-run docker compose up -d

No HTTP transactions in Fluent Bit output:

  • Qtap writes both HTTP transaction JSON and operational messages to stdout

  • The grep filter Regex log .*"metadata":.* ensures only HTTP transactions are captured

  • Verify HTTP transactions reached Fluent Bit: docker logs fluent-bit | grep '"metadata"'

  • If you see records without "metadata", those are operational logs that should be filtered out

  • Make sure the grep filter runs before the parser filter, or set Preserve_Key On in the parser filter; otherwise the parser removes the log field before grep can match it

High memory usage:

  • Reduce Fluent Bit Flush interval (more frequent uploads)

  • Adjust total_file_size for S3 batching (smaller = more frequent uploads)

  • Add filtering to reduce captured volume

  • Enable compression for outputs

S3 upload failures:

  • Verify AWS credentials are correct

  • Check IAM permissions (s3:PutObject required)

  • Ensure bucket exists and is in the correct region

  • Check network connectivity to S3 endpoint

Alternative: File Tailing Approach

If you have existing logging infrastructure or cannot use the forward protocol, you can tail Docker's log files directly:

fluent-bit.conf:

docker-compose.yaml changes:

Ensure your parsers.conf file includes both the built-in docker parser and the generic_json_parser definition so the tail input can decode Docker log metadata before emitting HTTP transactions.

This approach requires more complex path management and may have permission issues. The forward protocol approach is recommended.

Best Practices

  1. Start Small: Begin with error-only capture or specific domains, then expand

  2. Set TTLs: Always configure lifecycle policies (90 days recommended)

  3. Monitor Volume: Track storage growth and adjust filtering as needed

  4. Use IAM Roles: Never hardcode S3 credentials in production

  5. Compress: Enable gzip compression for S3 uploads

  6. Batch Uploads: Set total_file_size to suit your volume (the Fluent Bit S3 output defaults to 100M)

  7. Test Filtering: Validate filters match expected objects

  8. Health Checks: Monitor Fluent Bit metrics and error logs

  9. Backup Config: Version control all configuration files

  10. Security: Limit access to logs - they contain sensitive data

Summary

This guide demonstrated a validated deployment pattern for capturing all HTTP traffic with Fluent Bit using Docker:

Docker Deployment: Qtap stdout → Forward protocol → Fluent Bit → S3/CloudWatch

This pattern provides:

  • ✅ Complete HTTP capture (headers + bodies)

  • ✅ TLS inspection without proxies

  • ✅ Batched, buffered writes for performance

  • ✅ Flexible routing to multiple destinations

  • ✅ Data sovereignty (sensitive data stays in your infrastructure)

For questions or advanced configurations, see:
