Difficulty: ⭐⭐⭐ Advanced
Time: 45 minutes
Scenario
"We need to export edge telemetry to Datadog for a post-incident review. Export only ERROR logs and failed traces from the past 2 hours for the payment-service."
Your Mission
- Query edge telemetry (HyperDX/ClickHouse)
- Filter data based on criteria (severity, service, time range)
- Export to a format suitable for Datadog
- Document the export process
Steps
1. Define Export Criteria
# Export specification
time_range: last 2 hours
services: [payment-service]
log_severity: [ERROR, WARN]  # WARN included for context around the errors
traces: [status_code = ERROR]
destination: datadog (simulated as JSON files)
2. Export Logs
# Export ERROR and WARN logs from payment-service
kubectl exec -n observability clickhouse-0 -- \
clickhouse-client --query \
"SELECT
toUnixTimestamp(timestamp) * 1000 AS timestamp_ms,
service_name,
severity_text,
body AS message,
trace_id,
span_id,
attributes
FROM default.otel_logs
WHERE service_name = 'payment-service'
AND severity_text IN ('ERROR', 'WARN')
AND timestamp > now() - INTERVAL 2 HOUR
ORDER BY timestamp DESC
FORMAT JSONEachRow" > payment-service-logs-export.json
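Before shipping these rows anywhere, it helps to see how one exported line maps onto Datadog's reserved log attributes (`message`, `service`, `status`, `timestamp`). A minimal jq sketch, using a made-up sample row shaped like the query's SELECT aliases:

```shell
# Made-up sample row shaped like the SELECT aliases above.
cat > sample-row.json <<'EOF'
{"timestamp_ms":1760000000000,"service_name":"payment-service","severity_text":"ERROR","message":"charge declined","trace_id":"abc123","span_id":"def456","attributes":{}}
EOF

# Map ClickHouse fields onto Datadog's reserved log attributes;
# Datadog conventionally uses lowercase status values, hence ascii_downcase.
jq -c '{
  message: .message,
  service: .service_name,
  status: (.severity_text | ascii_downcase),
  timestamp: .timestamp_ms
}' sample-row.json
```

In the lab, run the same filter over `payment-service-logs-export.json` instead of the sample file.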
# View exported logs
cat payment-service-logs-export.json | jq '.' | head -50
3. Export Failed Traces
# Export traces with status = ERROR
kubectl exec -n observability clickhouse-0 -- \
clickhouse-client --query \
"SELECT
trace_id,
span_id,
parent_span_id,
span_name,
service_name,
duration_ns,
status_code,
attributes
FROM default.otel_traces
WHERE service_name = 'payment-service'
AND status_code = 'ERROR'
AND timestamp > now() - INTERVAL 2 HOUR
ORDER BY timestamp DESC
FORMAT JSONEachRow" > payment-service-traces-export.json
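Once the trace export lands, a quick jq pass can confirm it looks sane: count the failed spans and surface the slowest one. A sketch against made-up sample rows (in the lab, point jq at `payment-service-traces-export.json`):

```shell
# Made-up sample rows standing in for the real export.
cat > sample-traces.json <<'EOF'
{"trace_id":"t1","span_id":"s1","span_name":"charge","service_name":"payment-service","duration_ns":2500000000,"status_code":"ERROR"}
{"trace_id":"t2","span_id":"s2","span_name":"refund","service_name":"payment-service","duration_ns":120000000,"status_code":"ERROR"}
EOF

# Count failed spans, then show the slowest one converted to milliseconds.
jq -s 'length' sample-traces.json
jq -s 'max_by(.duration_ns) | {span_name, duration_ms: (.duration_ns / 1e6)}' sample-traces.json
```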
# View exported traces
cat payment-service-traces-export.json | jq '.' | head -50
4. Export Metrics (Aggregated)
# Export error rate and latency metrics
kubectl exec -n observability clickhouse-0 -- \
clickhouse-client --query \
"SELECT
toStartOfMinute(timestamp) AS minute,
service_name,
countIf(status_code = 'ERROR') AS error_count,
count() AS total_count,
(error_count / total_count) * 100 AS error_rate_pct,
avg(duration_ns / 1000000) AS avg_latency_ms
FROM default.otel_traces
WHERE service_name = 'payment-service'
AND timestamp > now() - INTERVAL 2 HOUR
GROUP BY minute, service_name
ORDER BY minute ASC
FORMAT JSONEachRow" > payment-service-metrics-export.json
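Note that `error_count` and `total_count` are aliases reused inside the same SELECT, which ClickHouse permits. A quick way to double-check the resulting arithmetic is to recompute the rate from an exported row; the sample values below are illustrative:

```shell
# Illustrative sample row mirroring the query's aliases (values made up).
cat > sample-metric.json <<'EOF'
{"minute":"2026-02-19 10:00:00","service_name":"payment-service","error_count":3,"total_count":12,"error_rate_pct":25,"avg_latency_ms":182.4}
EOF

# error_rate_pct should equal error_count / total_count * 100.
jq '.error_count / .total_count * 100 == .error_rate_pct' sample-metric.json
```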
cat payment-service-metrics-export.json | jq '.' | head -20
5. Create Export Summary
Create export-summary.md:
# Telemetry Export Summary
**Export Date**: 2026-02-19
**Store ID**: demo-4523
**Time Range**: Last 2 hours
**Services**: payment-service
## Export Contents
### Logs
- **File**: payment-service-logs-export.json
- **Count**: [run: `wc -l payment-service-logs-export.json`]
- **Severity**: ERROR, WARN
- **Format**: JSON Lines (JSONEachRow; reshape before sending to Datadog)
### Traces
- **File**: payment-service-traces-export.json
- **Count**: [run: `wc -l payment-service-traces-export.json`]
- **Status**: ERROR only
- **Format**: JSON Lines (one OpenTelemetry span per line)
### Metrics
- **File**: payment-service-metrics-export.json
- **Aggregation**: Per minute
- **Metrics**: error_count, error_rate_pct, avg_latency_ms
## Findings
[Document your findings here after reviewing the exported data]
## Next Steps
1. Upload to Datadog (via API or UI)
2. Share links with incident response team
3. Use for post-incident review (PIR)
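The "upload via API" step can be sketched against Datadog's v2 log intake. Assumptions here: the US1 site endpoint, a `DD_API_KEY` environment variable, and a stand-in `sample-export.json` (in the lab, substitute `payment-service-logs-export.json`). The intake expects a JSON array, hence the `jq -s` batching:

```shell
# Sketch only: a hypothetical sample file stands in for the real export.
printf '%s\n' '{"message":"charge declined","service":"payment-service","status":"error"}' > sample-export.json

# Datadog's v2 intake takes a JSON array, so slurp the JSON-lines export.
jq -s '.' sample-export.json > dd-payload.json

# Send only when an API key is present (assumed env var; US1 endpoint).
if [ -n "${DD_API_KEY:-}" ]; then
  curl -s -X POST "https://http-intake.logs.datadoghq.com/api/v2/logs" \
    -H "DD-API-KEY: ${DD_API_KEY}" \
    -H "Content-Type: application/json" \
    --data-binary @dd-payload.json
fi
```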
Questions to Answer
- How many ERROR logs were exported?
- How many failed traces?
- What is the average error rate per minute?
- What is the total size of exported data?
- How would you automate this export process?