# NGINX Traffic Capture

This guide shows you how to use Qtap to capture HTTP traffic flowing through **NGINX**, one of the most popular web servers and reverse proxies. You'll learn how to observe both incoming client requests and outgoing upstream connections without adding an extra proxy or changing application code.

## What You'll Learn

* Capture NGINX ingress traffic (client requests coming in)
* Capture NGINX egress traffic (upstream service requests going out)
* Monitor both sides of a reverse proxy simultaneously
* Apply conditional capture rules for specific endpoints
* Set up NGINX + Qtap in Docker for testing
* Deploy production-ready configurations with S3 storage

## Use Cases

**Why capture NGINX traffic?**

* **Reverse Proxy Visibility**: See both client requests and upstream responses in one place
* **Performance Analysis**: Measure latency between client→nginx and nginx→upstream
* **API Gateway Monitoring**: Track all API calls flowing through your gateway
* **Security Auditing**: Detect malicious requests or data exfiltration attempts
* **Troubleshooting**: Debug issues with request/response transformations
* **Compliance**: Audit all traffic for regulatory requirements
* **Load Balancer Analytics**: Understand traffic distribution patterns

***

## Prerequisites

* Linux system with kernel 5.10+ and eBPF support
* Docker installed (for this guide's examples)
* Root/sudo access
* Basic understanding of NGINX configuration

***

## Part 1: Simple NGINX Web Server

Let's start with a basic NGINX setup serving static content and reverse proxying to an upstream service.

### Step 1: Create NGINX Configuration

Create a directory for our demo:

```bash
mkdir nginx-qtap-demo
cd nginx-qtap-demo
```

Create `nginx.conf`:

```nginx
events {
    worker_connections 1024;
}

http {
    # Enable access logs for debugging
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;

    # Upstream service (backend API)
    upstream backend_api {
        server httpbin.org:80;
    }

    server {
        listen 8080;
        server_name localhost;

        # Static content endpoint
        location / {
            # default_type sets the Content-Type used by `return`;
            # add_header would emit a second, duplicate Content-Type header
            default_type text/plain;
            return 200 'Hello from NGINX!\n';
        }

        # Health check endpoint
        location /health {
            default_type text/plain;
            return 200 'OK\n';
        }

        # Reverse proxy to upstream API
        location /api/ {
            # Remove /api prefix before forwarding
            rewrite ^/api/(.*)$ /$1 break;

            proxy_pass http://backend_api;  # Uses the upstream block defined above
            proxy_set_header Host httpbin.org;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # Another upstream for testing (HTTPS)
        location /example/ {
            proxy_pass https://example.com/;
            proxy_set_header Host example.com;
            proxy_ssl_server_name on;  # Send SNI to the HTTPS upstream
        }
    }
}
```

### Step 2: Create Qtap Configuration

Create `qtap.yaml`:

```yaml
version: 2

# Storage Configuration
services:
  # Connection metadata (anonymized)
  event_stores:
    - type: stdout

  # HTTP request/response data (sensitive)
  object_stores:
    - type: stdout

# Processing Stack
stacks:
  nginx_capture:
    plugins:
      - type: http_capture
        config:
          level: full      # (none|summary|details|full) - Capture everything
          format: text     # (json|text) - Human-readable format

# Traffic Capture Settings
tap:
  direction: all           # (egress|ingress|all) - Capture BOTH client requests AND upstream calls
  ignore_loopback: false   # (true|false) - Capture localhost (nginx often uses loopback)
  audit_include_dns: false # (true|false) - Skip DNS queries for cleaner output

  http:
    stack: nginx_capture   # Use our nginx processing stack

  # Optional: Exclude Qtap's own traffic from capture
  filters:
    groups:
      - qpoint             # Don't capture qtap's own traffic
```

**Key Configuration Points:**

* **`direction: all`** - Captures both incoming (client→nginx) AND outgoing (nginx→upstream) traffic
* **`ignore_loopback: false`** - Important! NGINX often communicates via localhost
* **`level: full`** - Captures complete requests/responses including bodies

### Step 3: Create Docker Compose Setup

Create `docker-compose.yaml`:

```yaml
version: '3.8'

services:
  # NGINX web server
  nginx:
    image: nginx:alpine
    container_name: nginx-demo
    ports:
      - "8080:8080"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    networks:
      - demo-network

  # Qtap agent
  qtap:
    image: us-docker.pkg.dev/qpoint-edge/public/qtap:v0
    container_name: qtap-nginx
    privileged: true
    user: "0:0"
    cap_add:
      - CAP_BPF
      - CAP_SYS_ADMIN
    pid: host
    network_mode: host
    volumes:
      - /sys:/sys
      - /var/run/docker.sock:/var/run/docker.sock
      - ./qtap.yaml:/app/config/qtap.yaml
    environment:
      - TINI_SUBREAPER=1
    ulimits:
      memlock: -1
    command:
      - --log-level=info
      - --log-encoding=console
      - --config=/app/config/qtap.yaml

networks:
  demo-network:
    driver: bridge
```

***

## Part 2: Running and Testing

### Step 1: Start the Services

```bash
# Start NGINX and Qtap
docker compose up -d

# Wait for Qtap to initialize (CRITICAL - must happen before traffic!)
sleep 6

# Verify NGINX is running
curl http://localhost:8080/
# Expected: "Hello from NGINX!"
```
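If you script this setup, a small retry helper can make the NGINX check less brittle than a single `curl`. The sketch below is illustrative only: `retry_until` is a hypothetical helper name, not part of Qtap or Docker. It only confirms that a command eventually succeeds and does not detect whether Qtap has finished initializing, so keep the initial wait.

```bash
# retry_until ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Hypothetical helper for illustration; not part of Qtap.
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0            # command succeeded
    fi
    i=$((i + 1))
    sleep "$delay"        # wait before the next attempt
  done
  return 1                # never succeeded
}

# Example (run after the initial wait):
#   retry_until 10 1 curl -fsS http://localhost:8080/health
```

The helper returns a non-zero status when every attempt fails, so it can gate later steps in a script with `&&`.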

### Step 2: Generate Test Traffic

```bash
# Test 1: Simple GET to NGINX (INGRESS only - nginx returns static response)
curl http://localhost:8080/

# Test 2: Health check (INGRESS only)
curl http://localhost:8080/health

# Test 3: Reverse proxy to httpbin.org (INGRESS + EGRESS)
# You'll see TWO captures: client→nginx AND nginx→httpbin.org
curl http://localhost:8080/api/get

# Test 4: POST with JSON body through reverse proxy
curl -X POST http://localhost:8080/api/post \
  -H "Content-Type: application/json" \
  -H "X-Custom-Header: test-value" \
  -d '{"username": "alice", "action": "login"}'

# Test 5: GET to example.com through nginx
curl http://localhost:8080/example/

# Test 6: Generate multiple requests to see traffic patterns
for i in {1..5}; do
  curl -s http://localhost:8080/api/uuid
  sleep 1
done
```

### Step 3: View Captured Traffic

```bash
# View Qtap logs
docker logs qtap-nginx

# Filter for nginx process
docker logs qtap-nginx 2>&1 | grep -A 30 "nginx"

# Count captured transactions
docker logs qtap-nginx 2>&1 | grep -c "HTTP Transaction"
```

**What you should see** (illustrative; PIDs, IP addresses, and timings will differ):

```
=== HTTP Transaction ===
Source Process: nginx (PID: 123, Container: nginx-demo)
Direction: INGRESS ← (client to nginx)
Method: POST
URL: http://localhost:8080/api/post
Status: 200 OK
Duration: 12ms

--- Request Headers ---
Host: localhost:8080
User-Agent: curl/7.81.0
Content-Type: application/json
X-Custom-Header: test-value

--- Request Body ---
{"username": "alice", "action": "login"}

--- Response Headers ---
Content-Type: application/json
Content-Length: 523

--- Response Body ---
{
  "args": {},
  "data": "{\"username\": \"alice\", \"action\": \"login\"}",
  "headers": {
    "Host": "httpbin.org",
    "X-Real-Ip": "172.18.0.1",
    "X-Forwarded-For": "172.18.0.1"
  },
  "json": {
    "username": "alice",
    "action": "login"
  },
  "url": "http://httpbin.org/post"
}
========================

=== HTTP Transaction ===
Source Process: nginx (PID: 123, Container: nginx-demo)
Direction: EGRESS → (nginx to upstream)
Method: POST
URL: http://httpbin.org/post
Status: 200 OK
Duration: 245ms

--- Request Headers ---
Host: httpbin.org
X-Real-IP: 172.18.0.1
X-Forwarded-For: 172.18.0.1
Content-Type: application/json

--- Request Body ---
{"username": "alice", "action": "login"}
========================
```

**Key indicators that it's working:**

* ✅ NGINX process identified (`Source Process: nginx` in text output, `"exe": "/usr/sbin/nginx"` in JSON output)
* ✅ `Direction: INGRESS` - Client to NGINX
* ✅ `Direction: EGRESS` - NGINX to upstream
* ✅ **Two transactions** for proxied requests (one ingress, one egress)
* ✅ Custom headers visible (`X-Custom-Header`, `X-Real-IP`)
* ✅ Full request/response bodies captured
* ✅ Latency tracked for both hops
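
To check the "two transactions per proxied request" indicator quickly, you can tally captures by direction. This is a minimal sketch that assumes the text format shown in the sample output, where each transaction prints a line starting with `Direction: INGRESS` or `Direction: EGRESS`. `count_by_direction` is a hypothetical helper name.

```bash
# Tally captured transactions per direction from Qtap text output on stdin.
# Assumes lines of the form "Direction: INGRESS ..." / "Direction: EGRESS ...".
count_by_direction() {
  awk '/^Direction: / { count[$2]++ }
       END { printf "INGRESS=%d EGRESS=%d\n", count["INGRESS"], count["EGRESS"] }'
}

# Example:
#   docker logs qtap-nginx 2>&1 | count_by_direction
```

For the demo traffic, requests served directly by NGINX add only to the ingress count, while proxied requests add to both.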

***

## Part 3: Advanced Configurations

### Configuration 1: Capture Only Errors

Reduce volume by capturing only failed requests (4xx/5xx status codes):

```yaml
version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

# Define reusable macros
rulekit:
  macros:
    - name: is_error
      expr: http.res.status >= 400 && http.res.status < 600

stacks:
  error_only:
    plugins:
      - type: http_capture
        config:
          level: none        # Don't capture by default
          format: json
          rules:
            - name: "Capture errors only"
              expr: is_error()
              level: full    # But capture errors fully

tap:
  direction: all
  ignore_loopback: false
  http:
    stack: error_only
```

Test it:

```bash
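# This should NOT be captured (200 OK)
curl http://localhost:8080/

# This SHOULD be captured (404 from httpbin.org via the /api/ proxy;
# `location /` answers any other unmatched path with 200, so use /api/status/404)
curl http://localhost:8080/api/status/404

# This SHOULD be captured (500 from httpbin.org's /status/500 endpoint)
curl http://localhost:8080/api/status/500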
```
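
To predict which test requests the `is_error` macro should match, you can print each response's status code and apply the same 400 to 599 range in the shell. This is a sketch for reasoning about expected results, not a Qtap feature. The function names are hypothetical.

```bash
# Print only the HTTP status code returned for a URL.
status_of() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Mirror of the is_error macro: status >= 400 && status < 600.
matches_is_error() {
  [ "$1" -ge 400 ] && [ "$1" -lt 600 ]
}

# Report whether a request to URL is expected to be captured.
check_capture() {
  local code
  code=$(status_of "$1")
  if matches_is_error "$code"; then
    echo "$1 -> $code (expect capture)"
  else
    echo "$1 -> $code (expect no capture)"
  fi
}

# Example:
#   check_capture http://localhost:8080/api/status/500
```

Comparing this output with the Qtap logs shows whether the rule is behaving as intended.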

### Configuration 2: Separate Ingress and Egress Stacks

Apply different capture levels to client-facing traffic and upstream calls by setting a lightweight default stack and overriding it for the upstream domain:

```yaml
version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

# Lightweight stack for ingress (headers only)
stacks:
  ingress_light:
    plugins:
      - type: http_capture
        config:
          level: details     # (none|summary|details|full) - Headers only, no bodies
          format: json

  # Full capture for egress (to debug upstream issues)
  egress_full:
    plugins:
      - type: http_capture
        config:
          level: full        # Everything including bodies
          format: json

tap:
  direction: all
  ignore_loopback: false

  # Default stack (client→nginx traffic and any domain without an override)
  http:
    stack: ingress_light

  # Override for specific upstream domains
  endpoints:
    - domain: 'httpbin.org'
      http:
        stack: egress_full   # Full capture for httpbin.org calls
```

### Configuration 3: Filter by API Endpoint

Capture only specific API paths using Rulekit:

```yaml
version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

rulekit:
  macros:
    - name: is_api_endpoint
      expr: http.req.path matches /^\/api\//

    - name: is_auth_endpoint
      expr: http.req.path matches /^\/api\/auth\//

    - name: is_sensitive
      expr: is_auth_endpoint() || http.req.method == "POST"

stacks:
  selective_capture:
    plugins:
      - type: http_capture
        config:
          level: none        # Don't capture by default
          format: json
          rules:
            # Capture all authentication requests with full details
            - name: "Auth endpoints"
              expr: is_auth_endpoint()
              level: full

            # Capture API errors
            - name: "API errors"
              expr: is_api_endpoint() && http.res.status >= 400
              level: full

            # Capture POST requests (likely mutations)
            - name: "POST requests"
              expr: http.req.method == "POST"
              level: details  # Headers only for volume control

tap:
  direction: all
  ignore_loopback: false
  http:
    stack: selective_capture
```

### Configuration 4: Production Setup with S3

For production, store sensitive data in your own S3 bucket:

```yaml
version: 2

services:
  # Metadata to stdout (for monitoring)
  event_stores:
    - type: stdout

  # Sensitive data to S3 (never leaves your infrastructure)
  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com
      region: us-east-1
      bucket: my-company-nginx-traffic
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY
      insecure: false

stacks:
  production_capture:
    plugins:
      - type: http_capture
        config:
          level: none        # Don't capture by default
          format: json
          rules:
            # Only capture errors in production
            - name: "Production errors"
              expr: http.res.status >= 400
              level: full

tap:
  direction: all
  ignore_loopback: false
  http:
    stack: production_capture

  # Exclude Qtap's own traffic from capture
  filters:
    groups:
      - qpoint
```

Update `docker-compose.yaml` to pass S3 credentials:

```yaml
  qtap:
    image: us-docker.pkg.dev/qpoint-edge/public/qtap:v0
    environment:
      - TINI_SUBREAPER=1
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
    # ... rest of config
```

See [Storage Configuration](/getting-started/qtap/configuration/storage-configuration.md) for complete S3 setup.
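
Because the Compose file reads the credentials from your shell environment, an unset variable is passed through as an empty value. A small preflight check can catch that before the stack starts. This is a minimal sketch; `check_s3_env` is a hypothetical helper name.

```bash
# Fail early if either S3 credential variable is unset or empty.
check_s3_env() {
  local missing=0
  [ -n "${AWS_ACCESS_KEY_ID:-}" ] || { echo "AWS_ACCESS_KEY_ID is not set" >&2; missing=1; }
  [ -n "${AWS_SECRET_ACCESS_KEY:-}" ] || { echo "AWS_SECRET_ACCESS_KEY is not set" >&2; missing=1; }
  return "$missing"
}

# Example:
#   check_s3_env && docker compose up -d
```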

***

## Part 4: Real-World Use Cases

### Use Case 1: API Gateway Monitoring

Monitor all API traffic flowing through NGINX as an API gateway:

```yaml
version: 2

services:
  event_stores:
    - type: stdout
  # Sensitive data to S3 (never leaves your infrastructure)
  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com
      region: us-east-1
      bucket: my-company-nginx-traffic
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY
      insecure: false

rulekit:
  macros:
    - name: is_error
      expr: http.res.status >= 400
    - name: is_slow
      expr: http.res.duration_ms > 1000
    - name: is_large_payload
      expr: http.req.headers.content-length > 1000000  # > 1MB

stacks:
  api_gateway:
    plugins:
      - type: http_capture
        config:
          level: none        # Don't capture by default
          format: json
          rules:
            # Capture all errors
            - name: "API errors"
              expr: is_error()
              level: full

            # Capture slow requests
            - name: "Slow requests"
              expr: is_slow()
              level: details  # Headers only

            # Capture large payloads (potential abuse)
            - name: "Large payloads"
              expr: is_large_payload()
              level: summary  # Metadata only

            # Capture authentication attempts
            - name: "Auth attempts"
              expr: http.req.path matches /^\/api\/v1\/auth\//
              level: full

tap:
  direction: all
  ignore_loopback: false
  http:
    stack: api_gateway

  # Exclude internal health checks
  filters:
    custom:
      - exe: /usr/bin/health-checker
        strategy: exact
```

### Use Case 2: Debugging Reverse Proxy Issues

Capture both sides of a reverse proxy to debug transformation issues:

```yaml
version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

stacks:
  debug_proxy:
    plugins:
      - type: http_capture
        config:
          level: full        # Capture everything for debugging
          format: text       # Human-readable for quick analysis

tap:
  direction: all             # CRITICAL: Capture both ingress and egress
  ignore_loopback: false
  http:
    stack: debug_proxy

  # Per-domain stack for the upstream being debugged
  # (the default stack above still captures all other traffic)
  endpoints:
    - domain: 'backend-api.internal.company.com'
      http:
        stack: debug_proxy
```

This configuration lets you compare:

* What the client sent to NGINX (ingress)
* What NGINX forwarded to the upstream (egress)
* What the upstream returned (egress response)
* What NGINX sent back to the client (ingress response)
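
One way to make that comparison concrete is to pull the request headers for each hop out of the text-format output and view them side by side. The sketch below assumes the layout shown in the sample output earlier (`=== HTTP Transaction ===`, a `Direction:` line, and a `--- Request Headers ---` section). `request_headers` is a hypothetical helper name.

```bash
# Print the request headers of the first transaction with the given
# direction (INGRESS or EGRESS) from Qtap text output on stdin.
request_headers() {
  awk -v want="$1" '
    /^=== HTTP Transaction ===/ { dir = ""; in_req = 0; next }
    /^Direction: /              { dir = $2 }
    /^--- Request Headers ---/  { in_req = (dir == want && !seen); next }
    /^---/ || /^===/            { if (in_req) seen = 1; in_req = 0 }
    in_req && NF                { print }
  '
}

# Example:
#   docker logs qtap-nginx > qtap.log 2>&1
#   request_headers INGRESS < qtap.log
#   request_headers EGRESS  < qtap.log
```

Differences between the two outputs, such as a rewritten `Host` or added `X-Forwarded-*` headers, show what NGINX changed before forwarding.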

### Use Case 3: Load Balancer Analytics

Track traffic distribution across multiple upstreams:

**nginx.conf** (simplified):

```nginx
upstream backend_pool {
    server backend1:8080;
    server backend2:8080;
    server backend3:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend_pool;
    }
}
```

**qtap.yaml**:

```yaml
version: 2

services:
  event_stores:
    - type: stdout
  # Sensitive data to S3 (never leaves your infrastructure)
  object_stores:
    - type: s3
      endpoint: s3.amazonaws.com
      region: us-east-1
      bucket: my-company-nginx-traffic
      access_key:
        type: env
        value: AWS_ACCESS_KEY_ID
      secret_key:
        type: env
        value: AWS_SECRET_ACCESS_KEY
      insecure: false

stacks:
  load_balancer:
    plugins:
      - type: http_capture
        config:
          level: summary     # Just metadata for analytics
          format: json

tap:
  direction: egress          # Only capture nginx→upstream (see distribution)
  ignore_loopback: false
  http:
    stack: load_balancer
```

Analyze the logs to see which backend server received each request.

***

## Understanding the Output

### Dual Capture for Reverse Proxy

When NGINX proxies a request, Qtap captures **two separate HTTP transactions**:

**Transaction 1: INGRESS (Client → NGINX)**

```
Source Process: nginx
Direction: INGRESS ←
Source IP: 192.168.1.100 (client)
Destination: localhost:8080
Method: GET
URL: http://localhost:8080/api/users/42
```

**Transaction 2: EGRESS (NGINX → Upstream)**

```
Source Process: nginx
Direction: EGRESS →
Destination: httpbin.org:80
Method: GET
URL: http://httpbin.org/users/42
```

This dual capture lets you:

* Measure end-to-end latency vs. upstream latency
* See how NGINX transforms requests (headers, paths, bodies)
* Debug issues on either side of the proxy
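
As a rough illustration of working with these records, the sketch below counts egress transactions per destination. It assumes the text layout shown above, where a `Direction:` line precedes the `Destination:` line within each transaction. `egress_destinations` is a hypothetical helper name.

```bash
# Count EGRESS transactions per destination from Qtap text output on stdin.
# Prints "<count> <host:port>" lines, most frequent first.
egress_destinations() {
  awk '
    /^Direction: /   { dir = $2 }
    /^Destination: / { if (dir == "EGRESS") hits[$2]++ }
    END { for (d in hits) printf "%d %s\n", hits[d], d }
  ' | sort -k1,1nr -k2,2
}

# Example:
#   docker logs qtap-nginx 2>&1 | egress_destinations
```

With a load-balanced upstream pool, the counts give a quick view of how requests were spread across backends.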

### Capture Levels Explained

* **`none`**: No capture (use with rules for conditional capture)
* **`summary`**: Basic metadata (method, URL, status, duration) - no headers/bodies
* **`details`**: Includes headers - no bodies
* **`full`**: Everything (headers + bodies)

For high-traffic NGINX servers, start with `summary` or `details` to control volume.

***

## Troubleshooting

### Not Seeing NGINX Traffic?

**Check 1: Is Qtap running before you made requests?**

```bash
# Qtap must be running BEFORE traffic is generated
docker logs qtap-nginx 2>&1 | head -20
# Should see startup messages
```

**Check 2: Is ignore\_loopback set correctly?**

```yaml
# If NGINX uses localhost/127.0.0.1, you MUST set:
tap:
  ignore_loopback: false
```

**Check 3: Is NGINX actually processing requests?**

```bash
# Check NGINX access logs (the official nginx image links access.log to stdout,
# so read them through docker logs rather than the file)
docker logs nginx-demo
```

**Check 4: Verify Qtap is hooking NGINX**

```bash
docker logs qtap-nginx 2>&1 | grep -i nginx
# Should see logs about attaching to nginx process
```

### Seeing `"l7Protocol": "other"`?

This means Qtap captured the connection but couldn't parse HTTP:

* NGINX might be using HTTPS internally (check TLS configuration)
* Traffic might not be HTTP
* Qtap may not have fully initialized (wait 6+ seconds after starting)

### Too Much Traffic Captured?

**Option 1: Use conditional rules**

```yaml
stacks:
  reduced_volume:
    plugins:
      - type: http_capture
        config:
          level: none
          rules:
            - name: "Errors only"
              expr: http.res.status >= 400
              level: full
```

**Option 2: Filter specific paths**

```yaml
rules:
  - name: "Skip health checks"
    expr: http.req.path != "/health"
    level: full
```

**Option 3: Capture summary only**

```yaml
config:
  level: summary  # Metadata only, no headers/bodies
```

### Duplicate Transactions?

If you see the same request captured multiple times, this is expected for reverse proxies:

* One INGRESS capture (client → nginx)
* One EGRESS capture (nginx → upstream)

To capture only one direction:

```yaml
tap:
  direction: ingress  # or egress
```

***

## Performance Considerations

### NGINX + Qtap Performance Impact

Qtap operates **out-of-band** using eBPF, with minimal impact:

* **CPU overhead**: \~1-3% for typical HTTP traffic
* **Memory**: \~50-200MB depending on traffic volume
* **Latency**: No additional latency (passive observation)

**Best practices for high-traffic NGINX:**

1. Use `level: summary` or `details` (avoid `full` with large bodies)
2. Apply conditional rules to reduce captured volume
3. Filter out health checks and monitoring endpoints
4. Send data to S3 in batches (use Fluent Bit for buffering)
5. Set TTL policies on storage (90 days recommended)

### Scaling Recommendations

| **Traffic Volume** | **Recommended Level** | **Storage**                            |
| ------------------ | --------------------- | -------------------------------------- |
| < 100 req/sec      | `full`                | stdout or S3                           |
| 100-1000 req/sec   | `details`             | S3 with batching                       |
| 1000-10000 req/sec | `summary`             | S3 + Fluent Bit                        |
| > 10000 req/sec    | conditional rules     | S3 + Fluent Bit + aggressive filtering |
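
The table can be encoded as a small lookup for use in provisioning scripts. This is a sketch only: `recommend_level` is a hypothetical helper, and because the table's ranges share their endpoints, it assumes that exactly 1000 req/sec maps to `details` and exactly 10000 req/sec maps to `summary`.

```bash
# Map an approximate request rate (req/sec, integer) to the capture
# level suggested in the table above.
recommend_level() {
  local rps="$1"
  if [ "$rps" -lt 100 ]; then
    echo "full"
  elif [ "$rps" -le 1000 ]; then
    echo "details"
  elif [ "$rps" -le 10000 ]; then
    echo "summary"
  else
    echo "conditional rules"
  fi
}

# Example:
#   recommend_level 2500
```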

***

## Next Steps

**Learn More About Qtap:**

* [Traffic Capture Settings](/getting-started/qtap/configuration/traffic-capture-settings.md) - Complete `tap` configuration
* [Traffic Processing with Plugins](/getting-started/qtap/configuration/traffic-processing-with-plugins.md) - All plugin options
* [Complete Guide](/guides/qtap-guides/getting-started/getting-started-complete-guide.md) - Progressive tutorial covering all features

**Production Deployment:**

* [Storage Configuration](/getting-started/qtap/configuration/storage-configuration.md) - S3 setup guide
* [Capturing All HTTP Traffic with Fluent Bit](/guides/qtap-guides/observability-and-integration/capturing-all-http-traffic-with-fluent-bit.md) - Batching and buffering
* [Kubernetes Manifest](/getting-started/qtap/installation/kubernetes-manifest.md) - Deploy in K8s

**Related Guides:**

* [Ingress Traffic Capture with Python](/guides/qtap-guides/getting-started/ingress-traffic-capture-with-python.md) - Similar concepts for other servers
* [HTTPS Header Capture Without Proxies](/guides/qtap-guides/advanced-use-cases/transparent-https-header-capture-without-proxies.md) - TLS inspection details

**Alternative: Cloud Management:**

* [Qplane](/getting-started/qplane.md) - Manage Qtap with visual dashboards
* [POC Kick Off Guide](/guides/qplane-guides/poc-kick-off-guide.md) - Quick start with cloud control plane

***

## Cleanup

```bash
# Stop services and remove the containers and network
docker compose down

# Also remove any volumes created by the stack
docker compose down -v

# Clean up files
rm nginx.conf qtap.yaml docker-compose.yaml
```

***

*This guide uses validated configurations. All examples are tested and guaranteed to work with NGINX and Qtap.*


