# Self-Hosted Grafana Observability Stack

This guide walks you through setting up a complete, self-hosted observability stack where Qtap events flow to Grafana via Loki and full HTTP payloads stay in your own S3-compatible object storage. Every byte of captured data — metadata and payloads — remains inside your infrastructure.

## Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                        YOUR INFRASTRUCTURE                       │
│                                                                  │
│   ┌───────┐                                                      │
│   │ Qtap  │──── Events (metadata) ───► OTel Collector (:4317)    │
│   │ Agent │                               │                      │
│   │       │                               ▼                      │
│   │       │                           Loki (:3100)               │
│   │       │                               │                      │
│   │       │                               ▼                      │
│   │       │                           Grafana (:3000)            │
│   │       │                               ▲                      │
│   │       │                               │ click artifact URL   │
│   │       │                               ▼                      │
│   │       │                       nginx proxy (:3904)            │
│   │       │                               │                      │
│   │       │                               ▼                      │
│   │       │── Objects (payloads) ──► Garage S3 (:3900)           │
│   └───────┘                                                      │
└──────────────────────────────────────────────────────────────────┘
```

| Service            | Role                                               | Port          |
| ------------------ | -------------------------------------------------- | ------------- |
| **Qtap**           | eBPF agent — captures HTTP traffic at kernel level | Host network  |
| **OTel Collector** | Receives OTLP logs from Qtap, forwards to Loki     | 4317 (gRPC)   |
| **Loki**           | Log aggregation and storage                        | 3100          |
| **Grafana**        | Query, explore, and visualize events               | 3000          |
| **Garage**         | S3-compatible object storage for HTTP payloads     | 3900 (S3 API) |
| **nginx**          | Proxy for anonymous read access to stored objects  | 3904          |

{% hint style="info" %}
This guide covers **events and object linking**. For sending events to any OpenTelemetry-compatible backend, see the [OpenTelemetry Integration](/guides/qtap-guides/observability-and-integration/sending-qtap-events-to-opentelemetry.md) guide.
{% endhint %}

## Prerequisites

* Docker and Docker Compose
* Linux kernel 5.10+ with eBPF support
* `aws` CLI (for S3 verification) — [install guide](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)

## Understanding the Two Data Paths

Qtap produces two distinct outputs. Understanding the split is key to this architecture:

|                 | Events                                                        | Objects                                                                                            |
| --------------- | ------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
| **What**        | Lightweight metadata — method, URL, status, duration, process | HTTP transaction objects — metadata at `summary` level; headers and bodies at `full` level         |
| **Sensitivity** | Low — safe to send anywhere                                   | Varies — `summary` objects contain only metadata; `full` objects may contain API keys, tokens, PII |
| **Storage**     | Loki (via OTel Collector)                                     | Garage S3 (your infrastructure)                                                                    |
| **Volume**      | Every observed request                                        | Every captured request (content varies by capture level)                                           |

**The link between them:** Qtap's `access_url` template embeds a clickable URL into each `artifact_record` event. When you find an interesting event in Grafana, you click the URL to fetch the complete HTTP transaction from your S3 storage.

```
access_url: http://localhost:3904/qpoint/{{DIGEST}}
                                         ^^^^^^^^
                                         Replaced with SHA1 hash of the stored object
```
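To make the substitution concrete, the sketch below expands the template in shell. The function name `expand_access_url` is hypothetical and exists only for illustration; Qtap performs this substitution internally.

```bash
# Illustrative only: mimics how {{DIGEST}} in access_url is filled in.
# expand_access_url is a hypothetical helper, not part of Qtap.
expand_access_url() {
  template=$1
  digest=$2
  # In a basic regular expression, "{" and "}" are literal characters.
  printf '%s\n' "$template" | sed "s/{{DIGEST}}/$digest/"
}

expand_access_url 'http://localhost:3904/qpoint/{{DIGEST}}' \
  '35a712233f2a70e4842d83eb017f952ae09bf74c'
```

The call prints the template with the placeholder swapped for the digest, which is the URL that ends up in the `artifact_record` event.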

## Configuration Files

Create a project directory and add these files:

```bash
mkdir -p grafana-stack/grafana/provisioning/datasources
cd grafana-stack
```

### OTel Collector

{% code overflow="wrap" %}

```bash
cat > otel-collector.yaml << 'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:
    timeout: 10s

exporters:
  otlphttp/loki:
    endpoint: http://localhost:3100/otlp
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki]
EOF
```

{% endcode %}

The collector listens for gRPC on port 4317 (where Qtap sends events) and forwards them to Loki's OTLP endpoint. Since the collector runs with `network_mode: host`, it reaches Loki at `localhost:3100`.

### Loki

{% code overflow="wrap" %}

```bash
cat > loki.yaml << 'EOF'
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: "2024-01-01"
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  retention_period: 168h
  allow_structured_metadata: true
  max_query_series: 100000

compactor:
  working_directory: /loki/compactor
  delete_request_store: filesystem
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
EOF
```

{% endcode %}

Key settings: `allow_structured_metadata: true` lets Loki store Qtap's structured attributes (method, status, host, etc.) as queryable fields. Retention is set to 7 days (`168h`).

### Garage (S3-Compatible Object Storage)

{% code overflow="wrap" %}

```bash
cat > garage.toml << 'EOF'
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
db_engine = "sqlite"
replication_factor = 1

rpc_bind_addr = "[::]:3901"
rpc_public_addr = "127.0.0.1:3901"
rpc_secret = "c052485a056fabf3c0832d98f63b14d58036f5189683fc39da199f43fde3f15e"

[s3_api]
s3_region = "us-east-1"
api_bind_addr = "[::]:3900"

[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage.localhost"

[admin]
api_bind_addr = "[::]:3903"
admin_token = "demo-admin-token-for-local-use-only"
EOF
```

{% endcode %}

Garage provides the S3 API on port 3900 (where Qtap writes payloads), a web endpoint on 3902 (for anonymous reads), and an admin API on 3903 (for bucket management).
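The `rpc_secret` in the file above is a fixed demo value. For anything beyond a local test, a freshly generated secret is the safer choice. Garage expects `rpc_secret` to be 32 random bytes written as 64 hex characters. A minimal sketch using only standard tools follows; `gen_hex_secret` is a hypothetical helper name.

```bash
# Hypothetical helper: print N random bytes as lowercase hex (default 32).
gen_hex_secret() {
  nbytes=${1:-32}
  # od prints the bytes as hex pairs; tr strips the spaces and newlines.
  od -An -tx1 -N"$nbytes" /dev/urandom | tr -d ' \n'
  echo
}

gen_hex_secret 32   # 64 hex characters, suitable for rpc_secret
```

Where OpenSSL is installed, `openssl rand -hex 32` produces an equivalent value.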

### Nginx Proxy

{% code overflow="wrap" %}

```bash
cat > nginx.conf << 'EOF'
server {
    listen 3904;

    location /qpoint/ {
        rewrite ^/qpoint/(.*)$ /$1 break;
        proxy_pass http://garage:3902;
        proxy_set_header Host qpoint.web.garage.localhost;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
EOF
```

{% endcode %}

The nginx proxy translates requests like `http://localhost:3904/qpoint/<DIGEST>` into Garage web requests with the correct virtual-host `Host` header. This avoids needing wildcard DNS for Garage's subdomain-based routing.
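When debugging the proxy, it can help to reproduce by hand what nginx does: strip the `/qpoint/` prefix and send the request to Garage's web endpoint with the bucket's virtual-host `Host` header. The sketch below assumes Garage's port 3902 is published on the host (as in the Docker Compose file in this guide) and uses a hypothetical helper name, `fetch_from_garage_web`.

```bash
# Hypothetical helper: fetch an object straight from Garage's web endpoint,
# bypassing nginx. Takes the proxy-style path, e.g. /qpoint/<DIGEST>.
fetch_from_garage_web() {
  proxy_path=$1
  # Same effect as the nginx rewrite: /qpoint/<DIGEST> -> /<DIGEST>
  object_path=/${proxy_path#/qpoint/}
  curl -s -H 'Host: qpoint.web.garage.localhost' \
    "http://localhost:3902${object_path}"
}

# Usage (requires the stack to be running):
#   fetch_from_garage_web /qpoint/<DIGEST>
```

If this direct request works but the same digest fails through port 3904, the problem is likely in the nginx configuration rather than in Garage.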

### Grafana Datasource

{% code overflow="wrap" %}

```bash
cat > grafana/provisioning/datasources/datasources.yaml << 'EOF'
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    uid: loki
    access: proxy
    url: http://loki:3100
    isDefault: true
    editable: true
EOF
```

{% endcode %}

### Qtap

{% code overflow="wrap" %}

```bash
cat > qtap.yaml << 'EOF'
version: 2

services:
  event_stores:
    - type: otel
      endpoint: "localhost:4317"      # OTel Collector gRPC
      protocol: grpc
      service_name: "qtap"
      environment: "production"
      tls:
        enabled: false

  object_stores:
    - id: garage
      type: s3
      endpoint: localhost:3900        # Garage S3 API
      bucket: qpoint
      region: us-east-1
      access_url: http://localhost:3904/qpoint/{{DIGEST}}
      insecure: true
      access_key:
        type: env
        value: GARAGE_ACCESS_KEY
      secret_key:
        type: env
        value: GARAGE_SECRET_KEY

rulekit:
  macros:
    - name: is_error
      expr: http.res.status >= 400 && http.res.status < 600

stacks:
  default_stack:
    plugins:
      - type: http_capture
        config:
          level: summary             # (none|summary|headers|full)
          format: json               # (json|text)
          rules:
            - name: "Full capture on errors"
              expr: is_error()
              level: full            # Stores headers + bodies in S3

tap:
  direction: egress                  # (egress|egress-external|egress-internal|ingress|all)
  ignore_loopback: true              # (true|false)
  audit_include_dns: false           # (true|false)
  http:
    stack: default_stack
  filters:
    groups:
      - qpoint                       # Don't capture Qtap's own traffic
EOF
```

{% endcode %}

This configuration captures all egress HTTP traffic at `summary` level (metadata only, no headers or bodies) and automatically escalates to `full` capture for any 4xx or 5xx response. Full captures store complete request/response headers and bodies in Garage S3. Both levels emit `artifact_record` events with clickable URLs pointing to the stored objects.
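The escalation rule can be read as a simple decision on the response status. The sketch below restates the `is_error` macro and the capture rule in shell purely for illustration; `capture_level_for` is a hypothetical name, and Qtap evaluates the real rule internally through Rulekit.

```bash
# Illustration of the config above: the default level is "summary";
# the is_error rule (400 <= status < 600) escalates to "full".
capture_level_for() {
  status=$1
  if [ "$status" -ge 400 ] && [ "$status" -lt 600 ]; then
    echo full
  else
    echo summary
  fi
}

capture_level_for 200   # prints: summary
capture_level_for 500   # prints: full
```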

## Docker Compose

{% code overflow="wrap" %}

```bash
cat > docker-compose.yaml << 'EOF'
services:
  loki:
    image: grafana/loki:latest
    container_name: loki
    restart: always
    ports:
      - "3100:3100"
    volumes:
      - ./loki.yaml:/etc/loki/local-config.yaml:ro
      - loki-data:/loki
    command: -config.file=/etc/loki/local-config.yaml

  otel-collector:
    image: otel/opentelemetry-collector:latest
    container_name: otel-collector
    restart: always
    network_mode: host
    volumes:
      - ./otel-collector.yaml:/etc/otel-collector-config.yaml:ro
    command: ["--config=/etc/otel-collector-config.yaml"]

  garage:
    image: dxflrs/garage:v2.1.0
    container_name: garage
    restart: always
    ports:
      - "3900:3900"
      - "3902:3902"
      - "3903:3903"
    volumes:
      - ./garage.toml:/etc/garage.toml:ro

  garage-web:
    image: nginx:alpine
    container_name: garage-web
    restart: always
    ports:
      - "3904:3904"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    depends_on:
      - garage

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: always
    ports:
      - "3000:3000"
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
      - grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Editor

volumes:
  loki-data:
  grafana-data:
EOF
```

{% endcode %}

{% hint style="info" %}
The OTel Collector uses `network_mode: host` so that Qtap (running on the host) can reach it at `localhost:4317`. Because it shares the host network stack, it also reaches Loki at `localhost:3100`.
{% endhint %}

## Running the Stack

### Step 1: Start the Services

```bash
docker compose up -d
```

Verify all containers are running:

```bash
docker compose ps
```

### Step 2: Initialize Garage

Wait for Garage to be ready, then configure the cluster layout, create a bucket, and set up access credentials:

{% code overflow="wrap" %}

```bash
# Wait for Garage to start
sleep 5

# Get the node ID
NODE_ID=$(docker exec garage /garage status 2>/dev/null | grep -oE '[a-f0-9]{16}' | head -1)
echo "Node ID: $NODE_ID"

# Configure cluster layout
docker exec garage /garage layout assign -z dc1 -c 1G "$NODE_ID"
docker exec garage /garage layout apply --version 1

# Create the bucket
docker exec garage /garage bucket create qpoint

# Import S3 credentials
docker exec garage /garage key import \
  --yes \
  -n qtap-key \
  "GK0123456789abcdef01234567" \
  "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"

# Grant read/write access
docker exec garage /garage bucket allow \
  --read --write --owner qpoint --key qtap-key

# Enable anonymous web reads (for the nginx proxy)
docker exec garage /garage bucket website --allow qpoint
```

{% endcode %}

{% hint style="warning" %}
The credentials above are for local development only. For production, generate unique keys and manage them securely.
{% endhint %}
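A fixed `sleep` can be too short on a slow machine. As an alternative, a small retry loop can poll until a command succeeds. This is a sketch: `wait_for` is a hypothetical helper, and using `garage status` as the readiness check is an assumption.

```bash
# Hypothetical helper: retry a command once per second until it succeeds
# or the attempt limit is reached. Returns non-zero on timeout.
wait_for() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage (requires the stack to be running):
#   wait_for 30 docker exec garage /garage status
```

The same helper could stand in for the fixed wait after starting Qtap, given a suitable readiness command.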

### Step 3: Start Qtap

{% code overflow="wrap" %}

```bash
docker run -d --name qtap \
  --user 0:0 --privileged \
  --cap-add CAP_BPF --cap-add CAP_SYS_ADMIN \
  --pid=host --network=host \
  -v /sys:/sys \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$(pwd)/qtap.yaml:/app/config/qtap.yaml" \
  -e TINI_SUBREAPER=1 \
  -e GARAGE_ACCESS_KEY=GK0123456789abcdef01234567 \
  -e GARAGE_SECRET_KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \
  --ulimit=memlock=-1 \
  us-docker.pkg.dev/qpoint-edge/public/qtap:v0 \
  --log-level=info \
  --log-encoding=console \
  --config="/app/config/qtap.yaml"
```

{% endcode %}

### Step 4: Wait for Initialization

```bash
sleep 6
```

Qtap needs a few seconds to load eBPF programs and start capturing.

### Step 5: Generate Test Traffic

Generate both a successful request and an error to see both data paths:

```bash
# Successful request — captured at summary level (metadata only, no headers/bodies)
docker run --rm curlimages/curl -s https://httpbin.org/get > /dev/null

# Error request — captured at full level (headers + bodies stored in S3)
docker run --rm curlimages/curl -s https://httpbin.org/status/500 > /dev/null
```

### Step 6: Verify Data Is Flowing

**Check OTel Collector is receiving data:**

{% code overflow="wrap" %}

```bash
docker logs otel-collector 2>&1 | tail -5
```

{% endcode %}

The output should show the collector running with no export errors; failed deliveries to Loki are reported in this log.

**Check Loki has events:**

{% code overflow="wrap" %}

```bash
curl -s "http://localhost:3100/loki/api/v1/query_range?query=%7Bservice_name%3D%22qtap%22%7D&limit=5" | python3 -m json.tool | head -20
```

{% endcode %}
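The percent-encoded query string above is hard to read and edit. curl can do the encoding itself with `-G` and `--data-urlencode`. The sketch below wraps that in a hypothetical `loki_query` helper that takes a raw LogQL expression.

```bash
# Hypothetical helper: run a LogQL query against Loki's query_range API.
# -G sends the --data-urlencode values as URL query parameters.
loki_query() {
  logql=$1
  limit=${2:-5}
  curl -s -G 'http://localhost:3100/loki/api/v1/query_range' \
    --data-urlencode "query=${logql}" \
    --data-urlencode "limit=${limit}"
}

# Usage (requires the stack to be running):
#   loki_query '{service_name="qtap"}' 5 | python3 -m json.tool | head -20
```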

**Check S3 has objects (one per captured request):**

{% code overflow="wrap" %}

```bash
AWS_ACCESS_KEY_ID=GK0123456789abcdef01234567 \
AWS_SECRET_ACCESS_KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \
aws --endpoint-url http://localhost:3900 \
  --region us-east-1 \
  s3 ls s3://qpoint/
```

{% endcode %}

## Exploring Data in Grafana

Open <http://localhost:3000> in your browser (default credentials: `admin` / `admin`).

### Viewing Events

1. Navigate to **Explore** (compass icon in the left sidebar)
2. Select **Loki** as the datasource
3. Enter a LogQL query:

```
{service_name="qtap"}
```

4. Click **Run query**

You should see two kinds of Qtap events: `connection` events (TCP connection metadata) and `artifact_record` events (HTTP transaction summaries, each with a URL pointing to the stored object).

### Useful LogQL Queries

| Query                                                           | Description                                  |
| --------------------------------------------------------------- | -------------------------------------------- |
| `{service_name="qtap"}`                                         | All Qtap events                              |
| `{service_name="qtap"} \| json \| event_type="artifact_record"` | Only artifact records (objects stored in S3) |
| `{service_name="qtap"} \| json \| response_status >= 400`       | Error responses                              |
| `{service_name="qtap"} \| json \| request_host="httpbin.org"`   | Traffic to a specific host                   |
| `{service_name="qtap"} \| json \| duration_ms > 1000`           | Slow requests (> 1s)                         |
| `{service_name="qtap"} \| json \| process_exe="/usr/bin/curl"`  | Requests from curl                           |
| `{service_name="qtap"} \| json \| direction="egress-external"`  | External egress traffic                      |

### Expanding Log Entries

Click on any log entry to expand it. You'll see structured attributes including:

* `request_method`, `request_host`, `request_scheme`
* `response_status`, `duration_ms`
* `process_exe`, `direction`
* For artifact records: `digest`, `url`, `type`

## Object Linking — From Events to Full Payloads

This is the key capability of this stack: linking lightweight events in Grafana to complete HTTP transactions stored in your own S3.

### How It Works

1. Qtap captures a request. In this config, error responses get `full` capture (headers and bodies), while other traffic gets `summary` (metadata only)
2. Qtap stores the HTTP transaction object in Garage S3 as JSON, keyed by its SHA1 digest. All capture levels store an object
3. Qtap emits an `artifact_record` event to the OTel Collector, which includes a `url` field pointing to the stored object
4. The event flows through to Loki and appears in Grafana
5. You click the URL to view the stored HTTP transaction

### Walkthrough

**1. Find an error event in Grafana**

In Explore, query for artifact records:

```
{service_name="qtap"} | json | event_type="artifact_record"
```

**2. Expand the log entry**

Click on an artifact record event. Look for these fields:

```json
{
  "type": "http_transaction",
  "digest": "35a712233f2a70e4842d83eb017f952ae09bf74c",
  "url": "http://localhost:3904/qpoint/35a712233f2a70e4842d83eb017f952ae09bf74c",
  "summary": {
    "request_method": "GET",
    "request_host": "httpbin.org",
    "response_status": 500,
    "duration_ms": 580,
    "process_exe": "/usr/bin/curl",
    "direction": "egress-external"
  }
}
```
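For scripting, the `url` value can be pulled out of a record like the one above. This is a rough sketch using `sed`: `extract_artifact_url` is a hypothetical helper, it assumes the `url` key and its value sit on one line, and a real JSON parser such as `jq` is more robust.

```bash
# Hypothetical helper: read an artifact record (JSON) on stdin and print
# the value of its "url" field. Pattern-based, not a full JSON parser.
extract_artifact_url() {
  sed -n 's/.*"url"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage:
#   extract_artifact_url < record.json
```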

**3. Click the URL**

The `url` field is a direct link to the stored object. Click it (or open it in a new tab) to see the complete HTTP transaction:

```json
{
  "metadata": {
    "process_exe": "/usr/bin/curl",
    "container_name": "relaxed_zhukovsky",
    "container_image": "curlimages/curl",
    "bytes_sent": 72,
    "bytes_received": 127,
    "connection_id": "d654d887p3qnd27vfhag",
    "endpoint_id": "httpbin.org"
  },
  "request": {
    "method": "GET",
    "url": "https://httpbin.org/status/500",
    "scheme": "https",
    "authority": "httpbin.org",
    "protocol": "http2",
    "user_agent": "curl/8.18.0",
    "headers": {
      ":authority": "httpbin.org",
      ":method": "GET",
      ":path": "/status/500",
      "Accept": "*/*",
      "User-Agent": "curl/8.18.0"
    }
  },
  "response": {
    "status": 500,
    "content_type": "text/html; charset=utf-8",
    "headers": {
      "Content-Type": "text/html; charset=utf-8",
      "Server": "gunicorn/19.9.0"
    }
  },
  "duration_ms": 767,
  "direction": "egress-external"
}
```

This is the full HTTP transaction — headers and metadata — stored entirely in your infrastructure.

{% hint style="info" %}
All capture levels generate `artifact_record` events and store objects in S3. The difference is content: `summary` objects contain only metadata (method, URL, status, duration), `headers` objects add the request/response headers, and `full` objects include both headers and bodies. The object linking walkthrough above is most useful for `full` captures, where you can inspect the actual HTTP payload.
{% endhint %}

### The `access_url` Template

The link between events and objects is configured in the Qtap `object_stores` section:

```yaml
access_url: http://localhost:3904/qpoint/{{DIGEST}}
```

`{{DIGEST}}` is replaced with the SHA1 hash of the stored object. The resulting URL is embedded in every `artifact_record` event.

For production, replace `localhost` with the hostname or IP address that Grafana users can reach:

```yaml
access_url: https://objects.internal.example.com/qpoint/{{DIGEST}}
```

## Cleanup

```bash
docker compose down -v
docker rm -f qtap
```

## Next Steps

* [Storage Configuration](/getting-started/qtap/configuration/storage-configuration.md) — S3, MinIO, AWS, GCS options
* [Traffic Processing with Plugins](/getting-started/qtap/configuration/traffic-processing-with-plugins.md) — Rulekit rules, capture levels
* [Prometheus + Grafana Monitoring](/guides/qtap-guides/observability-and-integration/monitoring-qtap-with-prometheus-and-grafana.md) — Add metrics dashboards alongside logs
* [OpenTelemetry Integration](/guides/qtap-guides/observability-and-integration/sending-qtap-events-to-opentelemetry.md) — Send events to any OTLP backend
* [Qpoint Data Schema Reference](/appendix/qpoint-data-schema-reference.md) — Full event schema documentation

