Caddy 监控和日志管理指南
本文详细介绍 Caddy 的监控和日志管理,包括日志配置、监控指标、告警设置等内容。
日志配置
基础日志设置
{
log {
output file /var/log/caddy/access.log {
roll_size 100mb # 日志文件大小限制
roll_keep 10 # 保留文件数量
roll_keep_for 168h # 保留时间(7天)
}
format json # 日志格式
level INFO # 日志级别
}
}
自定义日志格式
{
log {
output file /var/log/caddy/access.log
format json {
time_format "2006-01-02 15:04:05"
time_local # 使用本地时间
message_key msg # 自定义消息键名
level_key level # 自定义级别键名
# 自定义字段
fields {
request>remote_ip remote_ip
request>method method
request>uri uri
request>proto proto
response>status status_code
response>size size
duration latency
upstream_latency upstream_latency
}
}
}
}
监控集成
Prometheus 集成
example.com {
# 启用 Prometheus 指标
metrics /metrics {
disable_openmetrics
}
# 限制访问
@metrics_auth {
remote_ip private_ranges
}
handle /metrics {
not @metrics_auth {
respond 403
}
}
}
Grafana 仪表板
# docker-compose.yml
version: '3'
services:
prometheus:
image: prom/prometheus
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
ports:
- "9090:9090"
grafana:
image: grafana/grafana
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
volumes:
- grafana_data:/var/lib/grafana
volumes:
grafana_data:
prometheus.yml 配置:
scrape_configs:
- job_name: 'caddy'
static_configs:
- targets: ['caddy:2019']
metrics_path: /metrics
日志分析
使用 GoAccess 分析
# 安装 GoAccess
apt install goaccess
# 实时分析日志
goaccess /var/log/caddy/access.log -c \
--log-format=COMBINED \
--real-time-html \
--output=/var/www/html/report.html
ELK 集成
# docker-compose.yml
version: '3'
services:
elasticsearch:
image: elasticsearch:7.9.3
environment:
- discovery.type=single-node
ports:
- "9200:9200"
logstash:
image: logstash:7.9.3
volumes:
- ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
depends_on:
- elasticsearch
kibana:
image: kibana:7.9.3
ports:
- "5601:5601"
depends_on:
- elasticsearch
logstash.conf 配置:
input {
file {
path => "/var/log/caddy/access.log"
codec => json
}
}
filter {
date {
match => [ "timestamp", "ISO8601" ]
}
}
output {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "caddy-logs-%{+YYYY.MM.dd}"
}
}
告警配置
使用 Alertmanager
# alertmanager.yml
route:
group_by: ['alertname']
group_wait: 30s
group_interval: 5m
repeat_interval: 1h
receiver: 'web.hook'
receivers:
- name: 'web.hook'
webhook_configs:
- url: 'http://127.0.0.1:5001/'
告警规则
# prometheus-rules.yml
groups:
- name: caddy
rules:
- alert: CaddyHighErrorRate
expr: rate(caddy_http_requests_total{status=~"5.."}[5m]) > 1
for: 5m
labels:
severity: warning
annotations:
summary: High error rate detected
description: "Error rate is {{ $value }} per second"
性能监控
系统指标
example.com {
# 系统资源监控
metrics {
disable_openmetrics
# 自定义标签
label instance {env.HOSTNAME}
label environment production
}
}
JVM 性能分析
# 使用 pprof 进行性能分析
go tool pprof -http=:8080 http://localhost:2019/debug/pprof/heap
# 生成火焰图
go tool pprof -http=:8080 -seconds=30 http://localhost:2019/debug/pprof/profile
日志轮转
logrotate 配置
# /etc/logrotate.d/caddy
/var/log/caddy/*.log {
daily
missingok
rotate 14
compress
delaycompress
notifempty
create 0640 caddy caddy
sharedscripts
postrotate
kill -USR1 $(cat /var/run/caddy/caddy.pid)
endscript
}
自动清理
#!/bin/bash
# /etc/cron.daily/clean-caddy-logs
# 删除 30 天前的日志
find /var/log/caddy -name "*.log.*" -mtime +30 -delete
# 压缩 7 天前的日志
find /var/log/caddy -name "*.log.*" -mtime +7 -exec gzip {} \;
监控最佳实践
1. 监控检查清单
- 请求响应时间
- 错误率
- SSL 证书过期时间
- 系统资源使用率
- 并发连接数
- 上游服务健康状态
2. 告警阈值设置
# prometheus-rules.yml
groups:
- name: caddy-alerts
rules:
- alert: HighLatency
expr: histogram_quantile(0.95, rate(caddy_http_request_duration_seconds_bucket[5m])) > 1
for: 5m
- alert: HighErrorRate
expr: rate(caddy_http_requests_total{status=~"5.."}[5m]) / rate(caddy_http_requests_total[5m]) > 0.05
for: 5m
- alert: CertificateExpiry
expr: caddy_certificates_expiry < 604800 # 7 days
for: 1h
3. 监控面板布局
{
"dashboard": {
"panels": [
{
"title": "Request Rate",
"type": "graph",
"query": "rate(caddy_http_requests_total[5m])"
},
{
"title": "Error Rate",
"type": "graph",
"query": "rate(caddy_http_requests_total{status=~\"5..\"}[5m])"
},
{
"title": "Response Time",
"type": "heatmap",
"query": "rate(caddy_http_request_duration_seconds_bucket[5m])"
}
]
}
}