# BPF Trace - Advanced Debugging

**Qtap Version:** v0.11.3+

**Status:** Advanced feature for syscall-level debugging

***

## Overview

The `--bpf-trace` flag enables **syscall-level tracing** of network operations, providing deep visibility into how applications interact with the Linux kernel. This is an advanced debugging feature that shows:

* Individual syscall invocations (read, write, writev, recvfrom, accept4)
* File descriptor (FD) numbers for each operation
* Data transfer sizes
* TLS/SSL detection per-syscall
* Process IDs and executable paths

**Use cases:**

* Debugging complex proxy scenarios (e.g., HTTP→HTTPS reverse proxies)
* Understanding per-FD TLS state tracking issues
* Investigating missing or corrupted traffic captures
* Analyzing syscall-level data flow through applications

***

## Syntax

```bash
--bpf-trace="mod:<module>,exe.contains:<executable_name>"
```

### Parameters

| Parameter             | Description                                         | Example              |
| --------------------- | --------------------------------------------------- | -------------------- |
| `mod:<module>`        | BPF trace module to enable                          | `mod:socket`         |
| `exe.contains:<name>` | Filter traces to executables containing this string | `exe.contains:nginx` |

**Available modules:**

* `socket` - Socket lifecycle and syscall tracing ✅ WORKS
* `openssl` - OpenSSL library event tracking ✅ WORKS

{% hint style="warning" %}
**Note:** Use `exe.contains` (not `bin.contains`) for syscall filtering.
{% endhint %}
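
When launches are scripted, the flag can be assembled from its two parts. The helper below is a minimal sketch: `bpf_trace_flag` is a hypothetical name, not part of Qtap, and it only encodes the syntax and the two module names listed above.

```bash
# Hypothetical helper: print a --bpf-trace flag for a module and an
# optional executable substring. Only the modules documented above pass.
bpf_trace_flag() {
  module=$1
  exe=${2:-}
  case "$module" in
    socket|openssl) ;;
    *) echo "unknown module: $module" >&2; return 1 ;;
  esac
  if [ -n "$exe" ]; then
    printf '%s\n' "--bpf-trace=mod:$module,exe.contains:$exe"
  else
    printf '%s\n' "--bpf-trace=mod:$module"
  fi
}
```

It can be used in place of the literal flag, for example `"$(bpf_trace_flag socket nginx)"` as the last argument of the `docker run` command.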

***

## Basic Example

### Configuration

```bash
docker run \
  --privileged --user 0:0 \
  --cap-add CAP_BPF --cap-add CAP_SYS_ADMIN \
  --pid host --network host \
  -v /sys:/sys \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v $(pwd)/qtap.yaml:/app/config/qtap.yaml \
  -e TINI_SUBREAPER=1 \
  --ulimit memlock=-1 \
  us-docker.pkg.dev/qpoint-edge/public/qtap:v0 \
  --log-level=info \
  --log-encoding=console \
  --config=/app/config/qtap.yaml \
  --bpf-trace="mod:socket,exe.contains:nginx"
```

### Expected Output

```
2025-10-23 00:38:43.341	INFO	eBPF trace	{"msg": "syscall/accept4", "caller": "syscall/accept4", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3}
2025-10-23 00:38:43.341	INFO	eBPF trace	{"msg": "syscall/recvfrom (init)", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3, "bytes": 81}
2025-10-23 00:38:43.341	INFO	eBPF trace	{"msg": "syscall/recvfrom", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3, "bytes": 81}
2025-10-23 00:38:43.341	INFO	eBPF trace	{"msg": "process_data (pre-protocol)", "caller": "syscall/recvfrom", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3, "direction": 0, "bytes": 81, "ssl": false, "protocol": 0}
2025-10-23 00:38:43.365	INFO	eBPF trace	{"msg": "syscall/write (init)", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 11, "bytes": 1806}
2025-10-23 00:38:43.365	INFO	eBPF trace	{"msg": "syscall/write", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 11, "bytes": 1806}
2025-10-23 00:38:43.365	INFO	eBPF trace	{"msg": "process_data (not ssl)", "caller": "syscall/write", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 11, "direction": 1, "bytes": 1806, "ssl": false}
2025-10-23 00:38:43.391	INFO	eBPF trace	{"msg": "syscall/read (init)", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 11, "bytes": 5}
2025-10-23 00:38:43.391	INFO	eBPF trace	{"msg": "syscall/read", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 11, "bytes": 5}
2025-10-23 00:38:43.391	INFO	eBPF trace	{"msg": "process_data (not ssl)", "caller": "syscall/read", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 11, "direction": 0, "bytes": 5, "ssl": false}
2025-10-23 00:38:43.493	INFO	eBPF trace	{"msg": "syscall/writev (init)", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3, "bytes": 941}
2025-10-23 00:38:43.493	INFO	eBPF trace	{"msg": "syscall/writev", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3, "bytes": 941}
2025-10-23 00:38:43.493	INFO	eBPF trace	{"msg": "process_data (not ssl)", "caller": "syscall/writev", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3, "direction": 1, "bytes": 941, "ssl": false}
```
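
For scripted analysis it can help to reduce each console-encoded line to its JSON payload. The sketch below assumes the tab-separated layout shown above (timestamp, level, logger name, JSON) and that the payload itself contains no tab characters; `trace_json` is a hypothetical name.

```bash
# Keep only "eBPF trace" lines and print their fourth tab-separated
# column, which is the JSON object in console encoding.
trace_json() {
  awk -F'\t' '$3 == "eBPF trace" { print $4 }'
}
```

Typical use would be `docker logs qtap-container 2>&1 | trace_json`, which yields one JSON object per line for further filtering.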

***

## Understanding the Output

### Syscall Types

| Syscall    | Description                     | Typical Usage                           |
| ---------- | ------------------------------- | --------------------------------------- |
| `accept4`  | Accept incoming connection      | Server accepting new client             |
| `recvfrom` | Receive data from socket        | Reading HTTP request                    |
| `read`     | Read data from FD               | Reading from SSL/TLS socket             |
| `write`    | Write data to FD                | Writing to SSL/TLS socket               |
| `writev`   | Vectored write (scatter-gather) | Writing HTTP response with headers+body |

### Event Phases

Each syscall typically generates two events:

1. **`(init)` event** - Syscall entry (before execution)
2. **Completion event** - Syscall exit (after execution, with actual byte counts)
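
To see how the two phases line up in a capture, the events can be counted per syscall. This is a rough sketch that relies only on the `"msg"` values shown in the sample output; `phase_counts` is a hypothetical name.

```bash
# Count (init) and completion events per syscall name.
# Reads trace lines on stdin; lines without a syscall msg are ignored.
phase_counts() {
  awk '
    match($0, /"msg": "syscall\/[a-z0-9]+( \(init\))?"/) {
      # Text between the quotes, e.g. syscall/read (init)
      name = substr($0, RSTART + 8, RLENGTH - 9)
      if (sub(/ \(init\)$/, "", name)) entry[name]++
      else done[name]++
      seen[name] = 1
    }
    END {
      for (s in seen) printf "%s init=%d completion=%d\n", s, entry[s], done[s]
    }
  ' | sort
}
```

In the sample output above, `accept4` appears without an `(init)` line, so it would be reported with `init=0`.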

### Field Reference

```json
{
  "msg": "syscall/writev",        // Syscall name
  "caller": "syscall/writev",     // Source of the call
  "pid": 139929,                  // Process ID
  "exe": "/usr/sbin/nginx",       // Executable path
  "fd": 3,                        // File descriptor number
  "direction": 1,                 // 0=ingress (read), 1=egress (write)
  "bytes": 941,                   // Number of bytes transferred
  "ssl": false,                   // Whether SSL/TLS detected on this FD
  "protocol": 0,                  // Protocol detection status
  "open": false                   // Whether connection info is available
}
```

### `process_data` Messages

These appear after syscall completion and show Qtap's data processing decisions:

* `process_data (pre-protocol)` - Data received before protocol detection
* `process_data (not ssl)` - Data on plaintext socket (HTTP)
* `process_data (conn_info = NULL)` - FD not tracked as a connection (e.g., log files)
* `process_data (conn_info->is_open = false)` - Connection already closed
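
A quick tally of these reasons shows which decision dominates a capture. The sketch below assumes the `process_data (<reason>)` message format listed above; `process_data_reasons` is a hypothetical name.

```bash
# Tally the reason inside process_data (...) messages.
# Reads trace lines on stdin and prints "reason: count", sorted by reason.
process_data_reasons() {
  sed -n 's/.*"msg": "process_data (\([^"]*\))".*/\1/p' |
    awk '{ n[$0]++ } END { for (r in n) printf "%s: %d\n", r, n[r] }' |
    sort
}
```

For example, `docker logs qtap-container 2>&1 | process_data_reasons` prints one line per reason seen in the logs.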

***

## Real-World Example: Nginx HTTP→HTTPS Reverse Proxy

### Scenario

Nginx reverse proxy configuration:

* Client → Nginx: **HTTP** on port 8000
* Nginx → Backend: **HTTPS** to api.treatmyocd.com

### Traffic Flow Visualization

```
Client (curl)  →  Nginx (PID 139929)  →  Backend HTTPS
    ↓                    ↓                      ↓
  fd 3 (HTTP)        fd 11 (HTTPS)        api.treatmyocd.com:443
```

### Step-by-Step Syscall Trace

#### 1. Accept Client Connection

```json
{"msg": "syscall/accept4", "pid": 139929, "exe": "/usr/sbin/nginx", "fd": 3}
```

Nginx accepts incoming HTTP connection from curl on **fd 3**.

#### 2. Read HTTP Request from Client

```json
{"msg": "syscall/recvfrom", "pid": 139929, "fd": 3, "bytes": 81, "direction": 0, "ssl": false}
```

Nginx reads 81 bytes of HTTP request from **fd 3** (client socket).

#### 3. Write HTTPS Request to Backend

```json
{"msg": "syscall/write", "pid": 139929, "fd": 11, "bytes": 1806, "direction": 1, "ssl": false}
```

Nginx writes 1806 bytes to **fd 11** (backend HTTPS connection).

{% hint style="info" %}
**Note:** `"ssl": false` appears here even though fd 11 carries HTTPS traffic. This suggests the flag does not reflect per-FD TLS state in this trace; in some Qtap versions it may track process-level state instead (see Known Limitations).
{% endhint %}

#### 4. Read HTTPS Response from Backend

```json
{"msg": "syscall/read", "pid": 139929, "fd": 11, "bytes": 1210, "direction": 0, "ssl": false}
{"msg": "syscall/read", "pid": 139929, "fd": 11, "bytes": 2695, "direction": 0, "ssl": false}
{"msg": "syscall/read", "pid": 139929, "fd": 11, "bytes": 1367, "direction": 0, "ssl": false}
```

Nginx reads multiple chunks from **fd 11** (backend response).

#### 5. Write HTTP Response to Client

```json
{"msg": "syscall/writev", "pid": 139929, "fd": 3, "bytes": 941, "direction": 1, "ssl": false}
```

Nginx writes 941 bytes back to **fd 3** (client) using `writev`.

**Critical:** This is where the nginx HTTP→HTTPS bug would manifest. If Qtap incorrectly marks the entire process as "has SSL" after seeing fd 11's TLS traffic, it might discard this `writev` data because it expects the plaintext to arrive through SSL\_write instead.

#### 6. Write Access Log

```json
{"msg": "syscall/write", "pid": 139929, "fd": 4, "bytes": 89, "open": false}
{"msg": "process_data (conn_info = NULL)", "caller": "syscall/write", "fd": 4}
```

Nginx writes 89 bytes to **fd 4** (access log file). Qtap shows `conn_info = NULL` because fd 4 is not a network socket.

***

## Filtering Options

### Filter by Executable Name

```bash
# Filter to nginx processes
--bpf-trace="mod:socket,exe.contains:nginx"

# Filter to curl processes
--bpf-trace="mod:socket,exe.contains:curl"

# Filter to any process with "python" in the path
--bpf-trace="mod:socket,exe.contains:python"
```

**How `exe.contains` works:**

* Matches substring anywhere in the executable path
* Example: `exe.contains:nginx` matches both `/usr/sbin/nginx` and `/custom/path/nginx-debug`
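
The matching rule can be modelled in the shell to check a filter string against a path before starting a trace. This only mirrors the substring behaviour described above and is not Qtap's implementation; `exe_contains` is a hypothetical name.

```bash
# Succeed if the path in $1 contains the substring in $2 anywhere.
# The substring is quoted so glob characters in it are taken literally.
exe_contains() {
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, `exe_contains /custom/path/nginx-debug nginx && echo match` prints `match`, while the same check against `/usr/bin/curl` prints nothing.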

### No Filter (All Processes)

```bash
# Trace all processes (VERY VERBOSE!)
--bpf-trace="mod:socket"
```

{% hint style="warning" %}
**Warning:** Tracing all processes generates massive log output and may impact system performance. Always use `exe.contains` filters in production.
{% endhint %}

***

## Log Level Requirements

BPF trace output varies by log level:

| Log Level | BPF Trace Output                            |
| --------- | ------------------------------------------- |
| `warn`    | **No syscall traces** (silent)              |
| `info`    | **Full syscall traces** ✅ RECOMMENDED       |
| `debug`   | Full syscall traces + additional debug logs |

**Recommended configuration:**

```bash
--log-level=info --log-encoding=console
```

***

## Debugging Use Cases

### 1. Missing Response Data

**Symptom:** Qtap captures request but shows 0 bytes received in response.

**Diagnosis with BPF trace:**

```bash
--bpf-trace="mod:socket,exe.contains:nginx"
```

Look for:

1. `writev` or `write` syscalls that should contain response data
2. An `"ssl"` value that does not match the expected TLS state for that FD
3. `process_data` messages showing why data was discarded

### 2. Per-FD TLS State Issues

**Symptom:** Mixed HTTP/HTTPS connections in same process cause capture failures.

**Diagnosis:**

Compare syscall traces for different file descriptors:

* Check if `"ssl": true/false` is accurate per-FD
* Look for `process_data (not ssl)` when SSL is expected (or vice versa)

Example buggy behavior:

```json
// Correct: fd 3 is HTTP
{"msg": "syscall/writev", "fd": 3, "ssl": false}

// INCORRECT: fd 11 is HTTPS but shows ssl: false
{"msg": "syscall/write", "fd": 11, "ssl": false}
```
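
Comparing the flag across file descriptors is easier with a per-FD tally. The sketch below assumes the `"fd"` and `"ssl"` fields appear on the same line, as in the `process_data` events of the sample output; `ssl_by_fd` is a hypothetical name.

```bash
# Count events per (fd, ssl flag) pair.
# Reads trace lines on stdin; lines without both fields are ignored.
ssl_by_fd() {
  awk '
    match($0, /"fd": [0-9]+/) {
      fd = substr($0, RSTART + 6, RLENGTH - 6)
      if (match($0, /"ssl": (true|false)/)) {
        flag = substr($0, RSTART + 7, RLENGTH - 7)
        count[fd " " flag]++
      }
    }
    END {
      for (k in count) {
        split(k, part, " ")
        printf "fd=%s ssl=%s events=%d\n", part[1], part[2], count[k]
      }
    }
  ' | sort -t= -k2,2n -k3
}
```

An FD known to carry HTTPS that only ever shows `ssl=false`, or an FD reported with both values, is a candidate for the state-tracking issue described in this section.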

### 3. Understanding Data Flow

**Trace complete request lifecycle:**

```bash
docker logs qtap-container 2>&1 | grep "eBPF trace" | grep '"pid": 139929'
```

Group by file descriptor to see data flow:

```bash
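# Client-facing socket (fd 3); the [,}] suffix avoids matching fd 30, 31, ...
docker logs qtap-container 2>&1 | grep -E '"fd": 3[,}]' | grep -E "recvfrom|writev"

# Backend socket (fd 11)
docker logs qtap-container 2>&1 | grep -E '"fd": 11[,}]' | grep -E "write|read"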
```
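
The same grouping can be summarised as byte totals per file descriptor. This sketch counts only lines that carry a `"direction"` field (the `process_data` events in the sample output), so `(init)` and completion lines are not double counted; `fd_summary` is a hypothetical name.

```bash
# Sum ingress (direction 0) and egress (direction 1) bytes per fd.
# Reads trace lines on stdin; field order within a line does not matter.
fd_summary() {
  awk '
    # Return the integer value of "key": N on the current line, or "" if absent.
    function num(key,   re) {
      re = "\"" key "\": [0-9]+"
      if (!match($0, re)) return ""
      return substr($0, RSTART + length(key) + 4, RLENGTH - length(key) - 4)
    }
    {
      fd = num("fd"); dir = num("direction"); bytes = num("bytes")
      if (fd == "" || dir == "" || bytes == "") next
      if (dir + 0 == 0) rx[fd] += bytes; else tx[fd] += bytes
      seen[fd] = 1
    }
    END {
      for (fd in seen) printf "fd=%s ingress=%d egress=%d\n", fd, rx[fd], tx[fd]
    }
  ' | sort -t= -k2,2n
}
```

Piping the filtered logs through `fd_summary` gives one line per FD, which makes it easy to spot a socket that received a request but never produced egress bytes.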

***

## Common Patterns

### HTTP Server (Single FD)

```json
// 1. Accept connection
{"msg": "syscall/accept4", "fd": 5}

// 2. Read HTTP request
{"msg": "syscall/recvfrom", "fd": 5, "bytes": 142, "direction": 0}

// 3. Write HTTP response
{"msg": "syscall/writev", "fd": 5, "bytes": 1024, "direction": 1}
```

### HTTPS Client (Single FD with SSL)

```json
// 1. Connect to server
{"msg": "syscall/connect", "fd": 8}

// 2. Write HTTPS request
{"msg": "syscall/write", "fd": 8, "bytes": 256, "ssl": true}

// 3. Read HTTPS response
{"msg": "syscall/read", "fd": 8, "bytes": 4096, "ssl": true}
```

### Reverse Proxy (Two FDs - HTTP + HTTPS)

```json
// Client-facing (HTTP)
{"msg": "syscall/recvfrom", "fd": 3, "ssl": false}  // Read request
{"msg": "syscall/writev", "fd": 3, "ssl": false}    // Write response

// Backend-facing (HTTPS)
{"msg": "syscall/write", "fd": 11, "ssl": true}     // Write request
{"msg": "syscall/read", "fd": 11, "ssl": true}      // Read response
```

***

## Troubleshooting

### No "eBPF trace" Output

**Check 1: Log level**

```bash
# Verify INFO or DEBUG level
docker logs qtap-container 2>&1 | head -20 | grep "Starting Qtap"
```

**Check 2: Filter matches**

```bash
# Verify executable name
docker exec nginx-container which nginx
# Should match exe.contains filter
```

**Check 3: Traffic generated**

```bash
# Ensure traffic is actually hitting the filtered process
curl http://localhost:8000/get
```

**Check 4: Correct filter syntax**

```bash
# CORRECT
--bpf-trace="mod:socket,exe.contains:nginx"

# WRONG
--bpf-trace="mod:socket,bin.contains:nginx"  # Wrong prefix!
```

### Too Much Output

**Problem:** Tracing all processes floods logs.

**Solution:** Add executable filter:

```bash
# Before (too verbose)
--bpf-trace="mod:socket"

# After (filtered)
--bpf-trace="mod:socket,exe.contains:nginx"
```

### Syscall Not Appearing

Some syscalls may not appear if:

1. **Different syscall variant used** - e.g., `readv` instead of `read`
2. **Buffered I/O** - Application uses buffering, syscalls occur later
3. **Connection pooling** - FD reused, syscalls mixed with other requests

***

## Performance Considerations

BPF trace adds overhead to every syscall:

| Traffic Level  | Overhead          | Recommendation                        |
| -------------- | ----------------- | ------------------------------------- |
| < 10 req/sec   | Negligible        | Safe for production debugging         |
| 10-100 req/sec | Low (\~5%)        | Use with caution, monitor CPU         |
| > 100 req/sec  | Moderate (10-20%) | **Only use in isolated environments** |

**Best practices:**

* Use `exe.contains` filters to limit scope
* Enable only during active debugging sessions
* Monitor disk I/O (logs can grow rapidly)
* Use log rotation if enabled long-term
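
To gauge how quickly trace output is growing, the number of trace lines per second can be counted from the timestamps. The sketch assumes the tab-separated console encoding shown in the sample output; `trace_rate` is a hypothetical name.

```bash
# Print the number of "eBPF trace" lines per wall-clock second.
# Reads console-encoded log lines on stdin.
trace_rate() {
  awk -F'\t' '
    $3 == "eBPF trace" {
      second = $1
      sub(/\.[0-9]+$/, "", second)   # drop fractional seconds
      n[second]++
    }
    END { for (s in n) printf "%s %d\n", s, n[s] }
  ' | sort
}
```

Running `docker logs qtap-container 2>&1 | trace_rate | tail` during a debugging session gives a rough view of recent log volume.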

***

## Comparison: Regular Logs vs BPF Trace

### Regular Qtap Logs (--log-level=info)

```
INFO	HTTP Transaction

Metadata:
  Direction: egress-external
  Bytes Sent: 78
  Bytes Received: 941

Request:
  Method: GET
  URL: http://httpbin.org/get

Response:
  Status: 302
```

**Pros:** Human-readable, concise, shows final results

**Cons:** No syscall details, can't debug per-FD issues

### BPF Trace (--bpf-trace="mod:socket,exe.contains:nginx")

```json
{"msg": "syscall/recvfrom", "pid": 139929, "fd": 3, "bytes": 81, "direction": 0, "ssl": false}
{"msg": "syscall/write", "pid": 139929, "fd": 11, "bytes": 1806, "direction": 1, "ssl": false}
{"msg": "syscall/read", "pid": 139929, "fd": 11, "bytes": 1210, "direction": 0, "ssl": false}
{"msg": "syscall/writev", "pid": 139929, "fd": 3, "bytes": 941, "direction": 1, "ssl": false}
```

**Pros:** Syscall-level detail, per-FD visibility, shows TLS state decisions

**Cons:** Verbose, requires interpretation, not user-friendly

**When to use BPF trace:**

* Debugging complex proxy scenarios
* Investigating missing captures
* Understanding per-FD TLS state tracking
* Analyzing syscall-level data flow

**When to use regular logs:**

* Normal operation
* User-facing traffic analysis
* Compliance/audit logging

***

## Related Configuration

BPF trace works alongside standard Qtap configuration:

```yaml
# qtap.yaml
version: 2

services:
  event_stores:
    - type: stdout
  object_stores:
    - type: stdout

stacks:
  capture_all:
    plugins:
      - type: http_capture
        config:
          level: full           # Capture full HTTP data
          format: json

tap:
  direction: all                # Capture all directions
  ignore_loopback: false        # Include localhost traffic
  http:
    stack: capture_all
```

**Run with:**

```bash
docker run ... \
  --config=/app/config/qtap.yaml \
  --log-level=info \
  --bpf-trace="mod:socket,exe.contains:nginx"
```

***

## Known Limitations

1. **No SSL\_write/SSL\_read visibility:** BPF trace shows syscalls (read/write) but not OpenSSL library calls (SSL\_read/SSL\_write).
2. **Per-process TLS state:** The `"ssl": true/false` field may reflect process-level state rather than per-FD state in some Qtap versions.
3. **Filter syntax constraints:**
   * Must use `exe.contains` (not `bin.contains`)
   * Only substring matching (no regex)
4. **Log volume:** Without filters, trace output can be **100x larger** than regular logs.

***

## Summary

| Feature         | Value                                                         |
| --------------- | ------------------------------------------------------------- |
| **Flag**        | `--bpf-trace="mod:socket,exe.contains:nginx"`                 |
| **Log Level**   | `info` or `debug`                                             |
| **Output**      | Syscall-level traces (read, write, writev, accept4, recvfrom) |
| **Filtering**   | By executable name substring                                  |
| **Use Cases**   | Debugging proxies, per-FD TLS issues, missing captures        |
| **Performance** | Low overhead with filters, high without                       |

{% hint style="info" %}
**Key Takeaway:** BPF trace is the most powerful debugging tool for understanding Qtap's eBPF-level behavior, especially for complex scenarios like mixed HTTP/HTTPS proxies.
{% endhint %}


