Monitoring Tools
Monitoring tools are external systems that collect, analyze, visualize, and alert on data generated by NGINX, mainly from:
- Access logs
- Error logs
- stub_status / status endpoints
- System-level metrics (CPU, RAM, disk, network)
NGINX itself only produces data; monitoring tools consume and interpret it.
Why Monitoring Is Critical for NGINX
Without monitoring, you are blind to:
- Traffic spikes or DDoS attacks
- Backend failures
- Slow responses
- Resource exhaustion
- Configuration errors
Monitoring answers questions like:
- “Is my site up?”
- “Why is it slow?”
- “Is NGINX overloaded?”
- “Is an attack happening?”
Types of Monitoring in NGINX
Log-Based Monitoring
Uses: access.log, error.log
Tracks:
- Request volume
- Response codes (2xx, 4xx, 5xx)
- Latency
- Client IPs
- Errors
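As a rough illustration of log-based monitoring, the sketch below counts response classes and requests per client IP from access log lines. It assumes the default NGINX "combined" log format; the `COMBINED` pattern, the `summarize` helper, and the sample lines are made up for this example.

```python
import re
from collections import Counter

# Pattern for the leading fields of NGINX's default "combined" log format.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def summarize(lines):
    """Count requests per status class (2xx, 4xx, ...) and per client IP."""
    by_class = Counter()
    by_ip = Counter()
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # skip lines that do not follow the combined format
        by_class[match["status"][0] + "xx"] += 1
        by_ip[match["ip"]] += 1
    return by_class, by_ip


# Hypothetical sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
    '203.0.113.5 - - [10/Oct/2024:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"',
    '198.51.100.7 - - [10/Oct/2024:13:55:38 +0000] "POST /api HTTP/1.1" 502 157 "-" "curl/8.0"',
]
by_class, by_ip = summarize(sample)
print(dict(by_class))
print(by_ip.most_common(1))
```

Latency is not part of the combined format, so tracking it this way would require a custom log_format that records $request_time.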
Metrics-Based Monitoring
Uses:
- stub_status
- Exporters
- OS metrics
Tracks:
- Active connections
- Requests per second
- Connection states (Reading / Writing / Waiting)
- Worker usage
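The stub_status page is plain text, so an exporter's first job is to turn it into numbers. A minimal sketch of that parsing step follows; the `parse_stub_status` helper and the `SAMPLE` values are illustrative, not taken from any particular exporter.

```python
import re

# Illustrative stub_status output; the numbers are made up.
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse the plain-text stub_status page into a dict of integers."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    totals = re.search(r"requests\s+(\d+)\s+(\d+)\s+(\d+)", text)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unexpected stub_status output")
    accepts, handled, requests = (int(v) for v in totals.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,    # cumulative counter
        "handled": handled,    # cumulative counter
        "requests": requests,  # cumulative counter
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


print(parse_stub_status(SAMPLE))
```

Note that accepts, handled, and requests are totals since NGINX started, while the other values are point-in-time gauges.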
Availability Monitoring
Checks:
- Is NGINX reachable?
- Is HTTPS working?
- Is response time acceptable?
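The three checks above can be combined into a single probe. The sketch below is one possible shape, using only the Python standard library; `check_endpoint`, its default thresholds, and the injectable `opener` parameter are assumptions made for this example.

```python
import time
import urllib.error
import urllib.request


def check_endpoint(url, timeout=5.0, max_seconds=1.0,
                   opener=urllib.request.urlopen):
    """Probe a URL once and return (ok, detail).

    Using an https:// URL also exercises the TLS handshake, so a broken
    certificate shows up as "unreachable".
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers DNS failures, refused connections, timeouts, TLS errors.
        return False, f"unreachable: {exc}"
    elapsed = time.monotonic() - start
    if elapsed > max_seconds:
        return False, f"slow: {status} in {elapsed:.2f}s"
    return True, f"ok: {status} in {elapsed:.2f}s"


# Example call (performs a real network request):
#   ok, detail = check_endpoint("https://example.com/")
```

The `opener` argument exists only so the probe can be exercised without network access; in normal use the default is enough.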
Common NGINX Monitoring Tools (Most Used)
Prometheus + Grafana (Most Popular)
How It Works
NGINX → stub_status → Exporter → Prometheus → Grafana
What It Monitors
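- Active connections (from stub_status)
- Requests/sec (derived from the stub_status request counter)
- Error rate and latency (not exposed by stub_status; these need a log-based source)
- CPU & memory (from a system-level exporter such as node_exporter)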
Example Setup
location /nginx_status {
    stub_status;
    access_log off;
    allow 127.0.0.1;
    deny all;
}
The exporter scrapes http://localhost/nginx_status, and Prometheus in turn scrapes the exporter.
A Grafana dashboard then shows:
- Live traffic graphs
- Error spikes
- Connection states
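Traffic graphs such as requests/sec are not read directly from stub_status, which only exposes a cumulative request counter. They are derived from the difference between two scrapes. The sketch below shows the basic idea in simplified form; Prometheus's own rate() function is more involved, and `requests_per_second` is a hypothetical helper.

```python
def requests_per_second(prev_total, curr_total, interval_seconds):
    """Derive a request rate from two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        # The counter went backwards, which suggests NGINX restarted.
        # Treat the current value as the count since the restart.
        delta = curr_total
    return delta / interval_seconds


# Two scrapes taken 15 seconds apart (illustrative numbers).
print(requests_per_second(31070465, 31071365, 15))  # 900 requests / 15 s = 60.0
```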
ELK Stack (Elasticsearch, Logstash, Kibana)
How It Works
NGINX logs → Logstash → Elasticsearch → Kibana
What It Monitors
- Access logs
- Error logs
- Request patterns
- Attacks
- Slow endpoints
Example: Structured Logs
log_format json_logs escape=json
    '{"time":"$time_iso8601",'
    '"remote_ip":"$remote_addr",'
    '"method":"$request_method",'
    '"uri":"$request_uri",'
    '"status":$status,'
    '"response_time":$request_time}';

access_log /var/log/nginx/access.log json_logs;
Kibana can now:
- Filter errors
- Track response time
- Detect suspicious IPs
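To show why structured logs make this kind of analysis easy, here is a small sketch that finds slow endpoints from JSON log lines. It assumes each line carries the "uri" and "response_time" fields from the log_format above; `slow_endpoints`, the 0.5 second threshold, and the sample lines are invented for illustration and are not part of the ELK stack.

```python
import json
from collections import defaultdict


def slow_endpoints(lines, threshold=0.5):
    """Return {uri: average response time} for URIs slower than threshold."""
    totals = defaultdict(lambda: [0.0, 0])  # uri -> [summed seconds, count]
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        bucket = totals[entry["uri"]]
        bucket[0] += entry["response_time"]
        bucket[1] += 1
    return {
        uri: total / count
        for uri, (total, count) in totals.items()
        if total / count > threshold
    }


# Hypothetical log lines in the json_logs format.
sample = [
    '{"time":"2024-10-10T13:55:36+00:00","remote_ip":"203.0.113.5",'
    '"method":"GET","uri":"/","status":200,"response_time":0.012}',
    '{"time":"2024-10-10T13:55:37+00:00","remote_ip":"203.0.113.5",'
    '"method":"GET","uri":"/report","status":200,"response_time":1.25}',
    '{"time":"2024-10-10T13:55:38+00:00","remote_ip":"198.51.100.7",'
    '"method":"GET","uri":"/report","status":200,"response_time":0.75}',
]
print(slow_endpoints(sample))
```

Because every field is already named and typed, no regular expression is needed, which is the main practical benefit over the plain-text formats.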
GoAccess (Real-Time Log Analyzer)
How It Works
NGINX access.log → GoAccess → HTML dashboard
Example
goaccess /var/log/nginx/access.log -o report.html --log-format=COMBINED
Shows:
- Requests per second
- Top URLs
- Slow requests (only if the log format records request time, which COMBINED does not)
- HTTP status distribution
Datadog / New Relic (SaaS Monitoring)
How It Works
NGINX → Agent → Cloud Dashboard
What You Get: Metrics, Logs, Alerts, APM tracing
Example alert conditions:
- Error rate > 5%
- Response time > 500ms
- Active connections spike
Zabbix / Nagios (Traditional Monitoring)
Used in: Enterprise, Legacy infrastructure
Monitors: Service uptime, Port availability, Resource thresholds
Example check:
check_http -H example.com -S
What Metrics You Should Monitor (Must-Know)
- Traffic Metrics: Requests per second, Active connections, Bandwidth usage
- Error Metrics: 4xx errors, 5xx errors, Backend failures
- Performance Metrics: Response time, Upstream latency, Slow requests
- Resource Metrics: CPU, Memory, File descriptors, Worker connections
Alerting Examples (Very Important)
Example Alerts
| Alert | Meaning |
|---|---|
| 5xx > 2% | Backend issue |
| Active connections > 80% of configured capacity | Capacity risk |
| Response time > 1s | Performance issue |
| Error log spike | Config or app error |
Monitoring tools send notifications via: Email, Slack, SMS, PagerDuty
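The threshold rules in the table can be expressed as a simple evaluation function. This is only a sketch of the logic, not how any particular tool implements alerting. `evaluate_alerts` and its parameters are hypothetical, and `max_connections` is assumed to be the configured connection capacity (for example worker_processes × worker_connections).

```python
def evaluate_alerts(total_requests, server_errors, active_connections,
                    max_connections, avg_response_seconds):
    """Return the alerts from the table that fire for one sample window."""
    alerts = []
    if total_requests > 0 and server_errors / total_requests > 0.02:
        alerts.append("5xx > 2%: backend issue")
    if max_connections > 0 and active_connections / max_connections > 0.80:
        alerts.append("Active connections > 80%: capacity risk")
    if avg_response_seconds > 1.0:
        alerts.append("Response time > 1s: performance issue")
    return alerts


# 30 server errors out of 1000 requests is 3%, so only the first rule fires.
print(evaluate_alerts(1000, 30, 500, 1024, 0.3))
```

Real alerting systems usually also require a condition to hold for some duration before notifying, which avoids paging on a single noisy sample.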
Security Monitoring with NGINX
Monitoring tools help detect:
- Brute force attacks
- DDoS patterns
- Abnormal request rates
- Unauthorized access attempts
Example:
- Many 401 responses → brute force
- Many 404 responses from the same IP → scanning
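A minimal sketch of these two heuristics, assuming the logs have already been parsed into (ip, status) pairs. The `suspicious_ips` helper and its thresholds are arbitrary choices for illustration and would need tuning against real traffic.

```python
from collections import Counter


def suspicious_ips(entries, max_401=20, max_404=50):
    """Flag client IPs with unusually many 401 or 404 responses.

    entries is an iterable of (ip, status) pairs for one time window.
    """
    unauthorized = Counter()
    not_found = Counter()
    for ip, status in entries:
        if status == 401:
            unauthorized[ip] += 1
        elif status == 404:
            not_found[ip] += 1

    findings = {}
    for ip, count in unauthorized.items():
        if count > max_401:
            findings[ip] = "possible brute force"
    for ip, count in not_found.items():
        if count > max_404:
            # Keep the brute-force label if the IP was already flagged.
            findings.setdefault(ip, "possible scanning")
    return findings


# Hypothetical traffic for one window.
entries = (
    [("203.0.113.5", 401)] * 25
    + [("198.51.100.7", 404)] * 60
    + [("192.0.2.1", 200)] * 100
)
print(suspicious_ips(entries))
```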
Best Practices for NGINX Monitoring
- Use metrics + logs together
- Secure status endpoints
- Disable access logs for /nginx_status
- Use structured logs (JSON)
- Set alerts before users complain
- Monitor trends, not just spikes
Example: Production Monitoring Stack
NGINX
├── access.log → ELK
├── error.log → ELK
├── stub_status → Prometheus
└── system metrics → Prometheus
↓
Grafana Dashboards
↓
Alerts