
Statistics & Monitoring

Real-time and historical monitoring of your load balancer

Tula provides a built-in statistics dashboard that gives you real-time and historical visibility into your load balancer's performance. All metrics are collected automatically and require no additional software or external monitoring infrastructure to use.

Dashboard Overview

The statistics dashboard is accessible from the main navigation under Monitoring > Statistics. It presents key performance indicators at a glance, including total active connections, connections per second, bytes transferred in and out, and overall system health. The dashboard updates in real time, reflecting the current state of all configured VIPs and their associated backend servers.

Real-Time Metrics

Tula tracks a comprehensive set of real-time metrics across both Layer 4 (nftlb) and Layer 7 (HAProxy) load balancing services:

  • Connections per second -- The rate of new client connections arriving at each VIP.
  • Active connections -- The number of currently open connections being processed.
  • Bytes in/out -- The volume of data flowing through the load balancer in each direction, measured per VIP and per backend.
  • Response times -- Average and peak response times from backend servers, helping identify slow or degraded backends.
  • HTTP status codes -- For Layer 7 VIPs, a breakdown of response codes (2xx, 3xx, 4xx, 5xx) returned by backend servers.
  • Session rates -- The rate at which new sessions are established, sustained, and closed.
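To make the per-second metrics above concrete, the sketch below derives rates from two samples of cumulative counters. The `CounterSample` type and its field names are hypothetical, chosen for illustration; this is not how Tula exposes or computes its metrics, only one common way such rates are derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """Hypothetical snapshot of cumulative counters for one VIP."""
    timestamp: float        # seconds
    total_connections: int  # connections accepted since the counter started
    bytes_in: int           # cumulative bytes received
    bytes_out: int          # cumulative bytes sent


def per_second_rates(prev: CounterSample, curr: CounterSample) -> dict:
    """Turn two cumulative samples into per-second rates.

    Assumes the counters did not reset between the two samples.
    """
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return {
        "connections_per_second":
            (curr.total_connections - prev.total_connections) / elapsed,
        "bytes_in_per_second": (curr.bytes_in - prev.bytes_in) / elapsed,
        "bytes_out_per_second": (curr.bytes_out - prev.bytes_out) / elapsed,
    }


# Example: 500 new connections over a 10-second window.
earlier = CounterSample(timestamp=0.0, total_connections=1000,
                        bytes_in=0, bytes_out=0)
later = CounterSample(timestamp=10.0, total_connections=1500,
                      bytes_in=20_000, bytes_out=80_000)
print(per_second_rates(earlier, later))
```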

Per-VIP and Per-Backend Statistics

Statistics are available at multiple levels of granularity. At the VIP level, you can see aggregate traffic flowing through each virtual service. Drilling down into a specific VIP reveals per-backend statistics, showing how traffic is distributed among the backend servers and how each backend is performing individually. This makes it straightforward to identify uneven load distribution, failing backends, or capacity bottlenecks.
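As a rough illustration of spotting uneven distribution from per-backend figures, the sketch below computes each backend's share of a VIP's connections and flags backends that deviate from an equal split. The backend names, the input dictionary, and the `tolerance` value are all assumptions made for the example. A weighted balancing algorithm distributes traffic unevenly on purpose, so an equal-split check like this only applies to unweighted setups.

```python
def backend_shares(connections_by_backend: dict) -> dict:
    """Return each backend's fraction of the VIP's total connections."""
    total = sum(connections_by_backend.values())
    if total == 0:
        return {name: 0.0 for name in connections_by_backend}
    return {name: count / total
            for name, count in connections_by_backend.items()}


def uneven_backends(connections_by_backend: dict,
                    tolerance: float = 0.5) -> list:
    """List backends whose share differs from an equal split by more than
    `tolerance` times the fair share (hypothetical threshold)."""
    if not connections_by_backend or sum(connections_by_backend.values()) == 0:
        return []  # nothing to compare when there is no traffic
    shares = backend_shares(connections_by_backend)
    fair_share = 1 / len(shares)
    return sorted(name for name, share in shares.items()
                  if abs(share - fair_share) > tolerance * fair_share)


# Hypothetical per-backend connection counts for one VIP.
counts = {"web1": 600, "web2": 200, "web3": 200}
print(backend_shares(counts))   # web1 carries 60% of the traffic
print(uneven_backends(counts))  # ['web1']
```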

Historical Graphs

Tula uses RRDtool (round-robin database tool) to store and graph historical performance data. RRDtool maintains a fixed-size database that automatically consolidates older data into lower-resolution averages, so storage requirements remain constant regardless of how long the appliance has been running.
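The fixed-size, consolidating behavior described above can be sketched conceptually as follows. This is not RRDtool's API or file format; the `RoundRobinArchive` class is a hypothetical illustration of the idea that a set number of raw samples is averaged into one stored row, and that the oldest row is overwritten once the archive is full.

```python
from collections import deque


class RoundRobinArchive:
    """Conceptual model of one fixed-size, averaging archive."""

    def __init__(self, steps_per_row: int, rows: int):
        self.steps_per_row = steps_per_row
        # deque(maxlen=...) silently drops the oldest row when full,
        # which keeps storage constant over time.
        self.rows = deque(maxlen=rows)
        self._pending = []

    def update(self, value: float) -> None:
        """Add one raw sample; emit an averaged row every `steps_per_row`."""
        self._pending.append(value)
        if len(self._pending) == self.steps_per_row:
            self.rows.append(sum(self._pending) / len(self._pending))
            self._pending = []


# Average every 5 samples into one row and keep only the latest 3 rows.
archive = RoundRobinArchive(steps_per_row=5, rows=3)
for sample in range(1, 26):  # 25 raw samples: 1, 2, ..., 25
    archive.update(float(sample))
print(list(archive.rows))  # [13.0, 18.0, 23.0]
```

However many samples are fed in, the archive never holds more than three rows, which is the property that keeps long-term storage bounded.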

Historical graphs are available at four time intervals:

  • Hourly -- One-minute resolution data covering the past 60 minutes. Useful for investigating recent incidents or verifying the impact of a configuration change.
  • Daily -- Five-minute resolution data covering the past 24 hours. Ideal for identifying daily traffic patterns and peak usage periods.
  • Weekly -- Thirty-minute resolution data spanning seven days. Helpful for spotting weekly trends and planning capacity.
  • Monthly -- Two-hour resolution data over 30 days. Provides a broad view of traffic growth and long-term performance trends.

Each graph displays metrics as line or area charts with clearly labeled axes, legends, and time scales. You can view graphs for connections, bandwidth, response times, and backend health across all time intervals.
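The resolutions and time spans of the four intervals imply how many data points each graph plots. The short calculation below works that out from the figures stated above; the dictionary layout and function name are only for illustration.

```python
MINUTE = 60
HOUR = 60 * MINUTE
DAY = 24 * HOUR

# name: (resolution in seconds, span in seconds), taken from the list above
INTERVALS = {
    "hourly": (1 * MINUTE, 60 * MINUTE),
    "daily": (5 * MINUTE, 24 * HOUR),
    "weekly": (30 * MINUTE, 7 * DAY),
    "monthly": (2 * HOUR, 30 * DAY),
}


def points_per_graph() -> dict:
    """Number of data points per graph: span divided by resolution."""
    return {name: span // resolution
            for name, (resolution, span) in INTERVALS.items()}


print(points_per_graph())
# {'hourly': 60, 'daily': 288, 'weekly': 336, 'monthly': 360}
```

Each graph therefore plots a few hundred points at most, even though the monthly view covers 720 times as much time as the hourly one.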

Connection Tables

The real-time connection table displays every active connection currently being handled by the load balancer. Each entry shows the client source IP and port, the destination VIP, the assigned backend server, the connection duration, and the number of bytes transferred. This table is particularly useful for troubleshooting client connectivity issues, verifying session persistence behavior, and identifying long-lived or stuck connections.
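If the connection table's entries were exported for offline analysis, the troubleshooting tasks above might look like the sketch below. The `Connection` record, its field names, and the one-hour threshold are hypothetical; they mirror the columns described above and do not represent an export format documented by Tula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical row of the connection table."""
    client_ip: str
    client_port: int
    vip: str
    backend: str
    duration_seconds: float
    bytes_transferred: int


def long_lived(connections, min_duration_seconds: float = 3600.0):
    """Connections open at least `min_duration_seconds`, longest first."""
    matches = [c for c in connections
               if c.duration_seconds >= min_duration_seconds]
    return sorted(matches, key=lambda c: c.duration_seconds, reverse=True)


def backends_per_client(connections) -> dict:
    """Map (client IP, VIP) to the set of backends serving that client.

    With source-IP persistence, more than one backend in a set may mean
    persistence is not holding for that client.
    """
    seen = {}
    for c in connections:
        seen.setdefault((c.client_ip, c.vip), set()).add(c.backend)
    return seen


# Example rows using documentation-reserved IP addresses.
table = [
    Connection("192.0.2.10", 50100, "203.0.113.5:443", "web1", 12.0, 4_096),
    Connection("192.0.2.10", 50101, "203.0.113.5:443", "web2", 7200.0, 512),
    Connection("198.51.100.7", 40000, "203.0.113.5:443", "web1", 4000.0, 90_000),
]
for conn in long_lived(table):
    print(conn.client_ip, conn.backend, conn.duration_seconds)
print(backends_per_client(table))
```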

Interpreting Key Metrics

When evaluating load balancer performance, focus on these indicators:

  • Rising connection counts with stable response times indicate healthy traffic growth within capacity.
  • Increasing response times with stable connection counts suggest backend servers are becoming overloaded or experiencing issues.
  • Uneven backend distribution may indicate a misconfigured balancing algorithm or health check problems causing traffic to concentrate on a subset of backends.
  • Elevated 5xx error rates at the Layer 7 level usually point to application-level failures on backend servers rather than a fault in the load balancer itself.
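The first, second, and fourth indicators above can be expressed as simple rules over period-to-period changes. The sketch below is one possible encoding. The function name, the 10% "stable" band, and the 1% error threshold are hypothetical values chosen for illustration, not recommendations from Tula.

```python
def interpret(connection_change: float,
              response_time_change: float,
              error_5xx_rate: float,
              stable_band: float = 0.10,
              error_threshold: float = 0.01) -> list:
    """Return plain-language findings for one comparison period.

    `connection_change` and `response_time_change` are fractional changes
    against a baseline (0.30 means +30%). `error_5xx_rate` is the fraction
    of responses with a 5xx status. Thresholds are illustrative assumptions.
    """
    findings = []
    connections_stable = abs(connection_change) <= stable_band
    responses_stable = abs(response_time_change) <= stable_band

    if connection_change > stable_band and responses_stable:
        findings.append("healthy traffic growth within capacity")
    if connections_stable and response_time_change > stable_band:
        findings.append("backends may be overloaded or degraded")
    if error_5xx_rate > error_threshold:
        findings.append("elevated 5xx rate: check backend applications")
    return findings


print(interpret(0.30, 0.02, 0.001))
# ['healthy traffic growth within capacity']
print(interpret(0.00, 0.50, 0.05))
# ['backends may be overloaded or degraded',
#  'elevated 5xx rate: check backend applications']
```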

Statistics data is retained locally on the appliance and persists across reboots. In an HA pair, each node maintains its own statistics reflecting the traffic it has handled while in the active role.