HTTPS Header Capture Without Proxies

This guide shows how to use QTap to capture HTTP headers from applications transparently, without proxies, code changes, or certificate management. QTap uses eBPF to observe traffic at the kernel level, reading data before it is encrypted on send and after it is decrypted on receive.

Common Use Cases

  • Service Usage Analytics: Track which users or services access internal applications

  • API Monitoring: Capture headers for authentication, rate limiting, or debugging

  • Traffic Recording: Record production requests for testing or replay

  • Security Auditing: Monitor for unauthorized access or suspicious headers

  • Service Migration: Understand dependencies before deprecating endpoints

How It Works

QTap attaches to the kernel using eBPF and intercepts traffic at the TLS/SSL layer, providing visibility into encrypted traffic without managing certificates or deploying proxies. All capture happens out-of-band with minimal performance impact.

Installation

Quick Install

# Install/Update QTap
curl -s https://get.qpoint.io/install | sudo sh

# Verify installation
sudo qtap --version

Configuration

Create a configuration file at /etc/qtap/qtap-config.yaml:

version: 2

# Storage Configuration
services:
  # Event stores for connection metadata (anonymized)
  event_stores:
    - type: stdout
  
  # Object stores for request/response content
  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com  # Or your S3-compatible endpoint
      bucket: traffic-capture
      region: us-east-1
      access_url: https://s3.amazonaws.com/{{BUCKET}}/{{DIGEST}}
      insecure: false
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY

# Processing Stacks
stacks:
  header_capture:
    plugins:
      # HTTP Capture plugin - captures and stores to S3
      - type: http_capture
        config:
          level: details  # Capture headers (use 'full' for bodies too)
          format: json

# Traffic Capture Settings
tap:
  direction: all  # Options: ingress, egress, all
  ignore_loopback: true
  audit_include_dns: false
  http:
    stack: header_capture

Running QTap

Direct Execution

# Set S3 credentials
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"

# Run QTap
sudo qtap --config=/etc/qtap/qtap-config.yaml

Running as a systemd Service

  1. Create an environment file for credentials:

sudo tee /etc/qtap/environment << EOF
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
EOF

sudo chmod 600 /etc/qtap/environment

  2. Create the systemd service:

sudo tee /etc/systemd/system/qtap.service << 'EOF'
[Unit]
Description=QTAP Traffic Capture Service
After=network.target

[Service]
Type=simple
User=root
EnvironmentFile=/etc/qtap/environment
ExecStart=/usr/local/bin/qtap --config=/etc/qtap/qtap-config.yaml
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

  3. Start and enable the service:

sudo systemctl daemon-reload
sudo systemctl enable qtap
sudo systemctl start qtap

# Check status
sudo systemctl status qtap

# View logs
sudo journalctl -u qtap -f
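
If the service starts but uploads fail, a malformed environment file is one thing worth ruling out. The sketch below parses KEY=VALUE text in a simplified way (blank lines and # comments are skipped; systemd's quoting rules are not handled) and reports which required variables are missing. REQUIRED_KEYS mirrors the variables referenced by the configuration in this guide; the function names are hypothetical.

```python
REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return required variables that are absent or empty."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


sample = "AWS_ACCESS_KEY_ID=your-access-key\n# secret intentionally left out\n"
print(missing_keys(sample))  # ['AWS_SECRET_ACCESS_KEY']
```

To check the real file, pass its contents (read as root, since the file is mode 600) to missing_keys.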

Configuration Examples

Example 1: Capture Only Internal Traffic

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com
      bucket: traffic-capture
      region: us-east-1
      access_url: https://s3.amazonaws.com/{{BUCKET}}/{{DIGEST}}
      insecure: false
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY

stacks:
  internal_only:
    plugins:
      - type: http_capture
        config:
          level: none  # Default: skip traffic that matches no rule
          format: json
          rules:
            # Only capture internal domains
            - name: "Internal traffic"
              expr: http.req.host matches /\.(internal|local|private)$/ || http.req.host matches /^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)/
              level: details

tap:
  direction: ingress  # Monitor incoming traffic to services
  ignore_loopback: true
  audit_include_dns: false
  http:
    stack: internal_only
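
Host patterns that match on IP prefixes are easy to get subtly wrong, so it is worth testing a candidate pattern against sample hosts before deploying a rule. The sketch below uses Python's re and ipaddress modules as a stand-in; it assumes the rule engine's regex dialect behaves like Python's for these simple patterns, which may not hold exactly. The helper names and sample hosts are made up for illustration.

```python
import ipaddress
import re

# RFC 1918 private IPv4 ranges, used as the ground truth for comparison.
PRIVATE_NETS = [
    ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(host: str) -> bool:
    """True if host is an IPv4 literal inside an RFC 1918 range."""
    try:
        addr = ipaddress.IPv4Address(host)
    except ValueError:
        return False  # hostnames and malformed values are not IP literals
    return any(addr in net for net in PRIVATE_NETS)


def false_matches(pattern: str, hosts: list[str]) -> list[str]:
    """Hosts the pattern matches even though they are not RFC 1918 addresses."""
    rx = re.compile(pattern)
    return [h for h in hosts if rx.search(h) and not is_rfc1918(h)]


hosts = ["10.1.2.3", "172.16.0.9", "172.32.0.1", "192.168.1.1", "192.0.2.7"]
loose = r"^(10|172|192)\."  # first octet only
strict = r"^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)"
print(false_matches(loose, hosts))   # ['172.32.0.1', '192.0.2.7']
print(false_matches(strict, hosts))  # []
```

A first-octet test also matches public addresses in 172.x and 192.x, which is why the stricter pattern spells out the second octet.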

Example 2: Debug Specific Services

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com
      bucket: traffic-capture
      region: us-east-1
      access_url: https://s3.amazonaws.com/{{BUCKET}}/{{DIGEST}}
      insecure: false
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY

stacks:
  selective_debug:
    plugins:
      - type: http_capture
        config:
          level: summary  # Default: minimal capture
          format: json
          rules:
            # Full capture for specific API
            - name: "Payment API debugging"
              expr: http.req.host == "payment.api.local" || http.req.path contains "/payment/"
              level: full
            
            # Capture headers for auth endpoints
            - name: "Auth monitoring"
              expr: http.req.path contains "/auth/" || http.req.path contains "/login"
              level: details
            
            # Debug containers with label
            - name: "Container debugging"
              expr: src.container.labels.debug == "true"
              level: full

tap:
  direction: all
  ignore_loopback: true
  audit_include_dns: false
  http:
    stack: selective_debug

Example 3: Production Traffic Recording

version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com
      bucket: traffic-capture
      region: us-east-1
      access_url: https://s3.amazonaws.com/{{BUCKET}}/{{DIGEST}}
      insecure: false
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY

stacks:
  production_recording:
    plugins:
      - type: http_capture
        config:
          level: full  # Capture everything for replay
          format: json
          rules:
            # Skip health checks
            - name: "Ignore health checks"
              expr: http.req.path in ["/health", "/ping", "/metrics"]
              level: none
            
            # Reduce capture detail for high-volume endpoints
            - name: "Summarize read endpoints"
              expr: http.req.method == "GET" && http.req.path contains "/api/v1/list"
              level: summary  # Reduce data for high-volume reads

tap:
  direction: ingress
  ignore_loopback: true
  audit_include_dns: false
  http:
    stack: production_recording
  
  # Filter out noisy processes
  filters:
    groups:
      - kubernetes
      - qpoint
    custom:
      - exe: /usr/bin/prometheus
        strategy: exact

Alternative: Using MinIO (S3-Compatible Storage)

For self-hosted S3-compatible storage like MinIO, adjust the object store configuration:

object_stores:
  - type: s3
    endpoint: minio.internal:9000  # Your MinIO endpoint
    bucket: traffic-capture
    region: us-east-1
    access_url: http://minio.internal:9000/{{BUCKET}}/{{DIGEST}}
    insecure: true  # Set to false if using HTTPS
    access_key:
      type: env
      value: MINIO_ACCESS_KEY
    secret_key:
      type: env
      value: MINIO_SECRET_KEY

Understanding Capture Levels

http_capture Plugin Levels

  • none: No capture (matching traffic is skipped)

  • summary: Basic metadata (method, path, status code)

  • details: Includes all headers (recommended for user tracking)

  • full: Complete request/response including bodies
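
One way to reason about the levels is as an ordered scale in which each level includes what the one below it captures. The sketch below models that reading with an IntEnum; the ordering and the field groupings are inferred from the list above, not taken from QTap's implementation.

```python
from enum import IntEnum


class Level(IntEnum):
    NONE = 0
    SUMMARY = 1
    DETAILS = 2
    FULL = 3


# What each level adds on top of the one below it (per the list above).
ADDS = {
    Level.SUMMARY: ["method", "path", "status"],
    Level.DETAILS: ["headers"],
    Level.FULL: ["bodies"],
}


def captured_fields(level: Level) -> list[str]:
    """Accumulate everything captured at or below the given level."""
    fields = []
    for lvl in sorted(ADDS):
        if lvl <= level:
            fields.extend(ADDS[lvl])
    return fields


print(captured_fields(Level.DETAILS))  # ['method', 'path', 'status', 'headers']
```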

Rule Expression Syntax

QTap uses Rulekit for rule expressions. Common expressions:

# Match by host
expr: http.req.host == "api.example.com"

# Match by path pattern
expr: http.req.path contains "/api/"

# Match by status code
expr: http.res.status >= 400

# Match by header
expr: http.req.header.authorization != ""

# Combine conditions
expr: http.req.method == "POST" && http.res.status == 200

# Match container labels
expr: src.container.labels.app == "frontend"

# Match IP ranges (applies when the host is an IP literal)
expr: http.req.host matches /^10\./ || http.req.host matches /^192\.168\./

Captured Data Format

Example of captured HTTP transaction with headers:

{
  "timestamp": "2024-10-15T10:23:45Z",
  "direction": "ingress",
  "source": {
    "ip": "10.0.1.50",
    "port": 54321,
    "process": {
      "binary": "/usr/bin/node"
    }
  },
  "destination": {
    "ip": "10.0.2.100",
    "port": 8080
  },
  "http": {
    "method": "GET",
    "path": "/api/v1/users",
    "host": "api.internal.com",
    "headers": {
      "user-agent": "Mozilla/5.0",
      "authorization": "Bearer eyJ...",
      "x-request-id": "abc-123",
      "x-user-id": "user-456",
      "content-type": "application/json"
    },
    "status": 200,
    "response_headers": {
      "content-type": "application/json",
      "x-response-time": "124ms"
    }
  }
}
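
Records in this shape are straightforward to flatten for analysis. The following is a minimal sketch, assuming the field names in the example above (actual output may differ between QTap versions); the Transaction dataclass and the from_record helper are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    method: str
    path: str
    host: str
    status: int
    user_id: str
    process: str


def from_record(record: dict) -> Transaction:
    """Flatten one captured record; missing optional fields get placeholder values."""
    http = record["http"]
    # Header names are case-insensitive, so normalize before looking them up.
    headers = {k.lower(): v for k, v in http.get("headers", {}).items()}
    return Transaction(
        method=http["method"],
        path=http["path"],
        host=http.get("host", ""),
        status=int(http.get("status", 0)),
        user_id=headers.get("x-user-id", "anonymous"),
        process=record.get("source", {}).get("process", {}).get("binary", "unknown"),
    )


raw = '''{"source": {"process": {"binary": "/usr/bin/node"}},
          "http": {"method": "GET", "path": "/api/v1/users", "host": "api.internal.com",
                   "headers": {"x-user-id": "user-456"}, "status": 200}}'''
tx = from_record(json.loads(raw))
print(tx.method, tx.path, tx.status, tx.user_id)  # GET /api/v1/users 200 user-456
```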

Analyzing Captured Data

Query S3 with AWS CLI

# List captured files
aws s3 ls s3://traffic-capture/ --recursive

# Download and analyze
aws s3 cp s3://traffic-capture/2024/10/15/capture.json.gz - | \
  gunzip | \
  jq '.http.headers'

Extract User Analytics

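# Find unique users accessing a service (with request counts)
aws s3 cp s3://traffic-capture/ . --recursive --exclude "*" --include "*.json.gz"

# The recursive copy preserves key prefixes, so search subdirectories as well
find . -name '*.json.gz' -exec gunzip -c {} + |
  jq -r '.http.headers["x-user-id"] // "anonymous"' |
  sort | uniq -c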

Create Usage Report

import json
import gzip
import boto3
from collections import defaultdict

s3 = boto3.client('s3')
bucket = 'traffic-capture'

# Aggregate usage by endpoint
usage = defaultdict(set)

paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get('Contents', []):
        response = s3.get_object(Bucket=bucket, Key=obj['Key'])
        with gzip.GzipFile(fileobj=response['Body']) as gz:
            data = json.load(gz)
            endpoint = f"{data['http']['method']} {data['http']['path']}"
            user = data['http']['headers'].get('x-user-id', 'anonymous')
            usage[endpoint].add(user)

# Print usage report
for endpoint, users in sorted(usage.items()):
    print(f"{endpoint}: {len(users)} unique users")

Performance Considerations

  • Capture Level: Use details for headers only, full only when bodies are needed

  • Filtering: Use rules to limit capture to relevant traffic

  • Sampling: For high-volume services, reduce what is stored, for example by lowering the capture level on busy read endpoints as in Example 3

  • Storage: Rotate S3 data based on retention requirements

  • Process Filtering: Exclude noisy system processes
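
This guide does not show a built-in sampling option, so the sketch below illustrates one strategy that can be applied downstream to captured records: deterministic hash-based sampling keyed on a request identifier. Using a value such as x-request-id as the key, and the function name keep, are assumptions for illustration.

```python
import hashlib


def keep(sample_key: str, rate: float) -> bool:
    """Deterministically keep roughly `rate` (0.0 to 1.0) of keys.

    The same key always yields the same decision, so repeated records
    that share an identifier are kept or dropped together.
    """
    digest = hashlib.sha256(sample_key.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


request_ids = [f"req-{i}" for i in range(10_000)]
kept = sum(keep(rid, 0.10) for rid in request_ids)
print(f"kept {kept} of {len(request_ids)}")  # roughly 10%, exact count depends on the hash
```

Hashing rather than calling a random number generator makes the decision reproducible across runs and machines, which matters when the same records are processed more than once.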

Troubleshooting

Common Issues

No data captured:

  • Verify QTap is running: sudo systemctl status qtap

  • Check logs: sudo journalctl -u qtap -n 100

  • Ensure traffic matches direction setting (ingress vs egress)

  • Verify HTTP traffic is on expected ports

S3 upload failures:

  • Test credentials: aws s3 ls s3://traffic-capture/

  • Check bucket permissions and region

  • Verify network connectivity to S3

Missing headers:

  • Ensure capture level is details or full

  • Verify the http_capture plugin is configured

  • Check that traffic is HTTP/HTTPS (not other protocols)

High memory usage:

  • Reduce capture level from full to details

  • Add filtering rules to limit captured traffic

  • Add a rule that drops high-volume endpoints to summary or none

Debug Mode

Run QTap in debug mode for troubleshooting:

sudo qtap --config=/etc/qtap/qtap-config.yaml --log-level=debug

Security Best Practices

  1. Credential Management: Use environment variables or IAM roles for S3 credentials

  2. Data Retention: Implement S3 lifecycle policies for automatic data expiration

  3. Access Control: Restrict S3 bucket access to authorized users only

  4. Sensitive Data: Consider filtering out sensitive headers before storage

  5. Encryption: Enable S3 server-side encryption for stored data
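
For the sensitive-data point, one option is to redact header values before captured records are shared or loaded into analysis tools. The following is a minimal sketch, assuming records follow the JSON shape shown under Captured Data Format; the SENSITIVE set and the redact_headers name are illustrative, and this runs after capture rather than inside QTap.

```python
import copy

# Header names (lowercase) whose values should be masked before sharing.
SENSITIVE = {"authorization", "cookie", "set-cookie", "x-api-key"}


def redact_headers(record: dict, sensitive=SENSITIVE) -> dict:
    """Return a copy of the record with sensitive header values masked."""
    clean = copy.deepcopy(record)  # leave the caller's record untouched
    http = clean.get("http", {})
    for section in ("headers", "response_headers"):
        headers = http.get(section, {})
        for name in headers:
            if name.lower() in sensitive:
                headers[name] = "[REDACTED]"
    return clean


record = {"http": {"headers": {"authorization": "Bearer eyJ...", "x-request-id": "abc-123"}}}
print(redact_headers(record)["http"]["headers"])
# {'authorization': '[REDACTED]', 'x-request-id': 'abc-123'}
```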

Summary

QTap provides transparent HTTP header capture without requiring proxies or code changes. By leveraging eBPF, it captures traffic at the kernel level with minimal performance impact, making it ideal for:

  • Understanding service dependencies

  • Tracking API usage patterns

  • Debugging production issues

  • Recording traffic for testing

  • Security auditing

The combination of flexible filtering rules and native S3 integration makes QTap a powerful tool for gaining visibility into your HTTP traffic.
