Tula Networks
Documentation

Direct Server Return Overview

Understanding DSR and when to use it

Direct Server Return (DSR) is a load balancing topology in which inbound client requests pass through the load balancer, but outbound responses travel directly from the backend server to the client, bypassing the load balancer entirely. This asymmetric traffic flow fundamentally changes the performance characteristics of your infrastructure and is one of the most effective techniques for scaling high-throughput services.

How DSR Works

In a conventional load balancing setup, all traffic -- both requests and responses -- flows through the load balancer. This creates a bottleneck because response payloads are typically much larger than requests. DSR eliminates this bottleneck with the following flow:

  1. The client sends a request to the Virtual IP (VIP) address.
  2. The load balancer receives the request and selects a backend server based on the configured scheduling algorithm.
  3. The load balancer forwards the packet to the chosen backend, preserving the original destination IP (the VIP).
  4. The backend server processes the request and sends the response directly to the client using the VIP as its source address.
  5. The client receives the response as though it came from the VIP, unaware that a different server generated it.

For this to work, each backend server must be configured to accept traffic destined for the VIP address (typically via a loopback interface) and must suppress ARP responses for that address to avoid conflicts with the load balancer.
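On Linux backends, this configuration is typically a loopback alias plus two ARP sysctls. The sketch below assumes an example VIP of 203.0.113.10 (a documentation address); substitute your own.

```shell
# Run on each backend server. VIP is an assumed example address.
VIP=203.0.113.10

# arp_ignore=1: only answer ARP queries for addresses configured on the
# interface the query arrived on, so the backend never answers ARP for
# the VIP bound to loopback. The load balancer stays the sole responder.
sysctl -w net.ipv4.conf.all.arp_ignore=1

# arp_announce=2: when sending ARP requests, use the best local address
# for the target subnet rather than the VIP, avoiding VIP leakage.
sysctl -w net.ipv4.conf.all.arp_announce=2

# Bind the VIP to loopback as a /32 so the backend accepts packets
# addressed to the VIP without advertising a route for it.
ip addr add "$VIP/32" dev lo
```

Setting the sysctls under `all` applies the policy to every interface; some deployments scope them to the specific NIC facing the load balancer instead.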

Benefits

Dramatically reduced load balancer bandwidth. Because responses bypass the load balancer, it only handles the relatively small inbound request traffic. For workloads where responses are orders of magnitude larger than requests -- such as video streaming or file downloads -- this can reduce load balancer throughput requirements by 90% or more.

Lower latency. Responses take a direct path from the backend to the client, eliminating the extra hop through the load balancer on the return path.

Improved throughput. With the load balancer freed from processing response traffic, it can handle significantly more concurrent connections and higher request rates.

Cost efficiency. A single load balancer can front a much larger pool of backend servers than would be possible in a full-proxy configuration, reducing hardware and licensing costs.

Use Cases

DSR is ideal for workloads with asymmetric traffic patterns where response payloads substantially exceed request sizes:

  • Media streaming -- Video and audio delivery where small HTTP requests produce large media responses.
  • Large file downloads -- Software distribution, CDN origins, and object storage services.
  • High-throughput APIs -- Services that return large datasets in response to small query parameters.
  • Gaming servers -- Real-time game traffic with high packet rates and strict latency requirements.

Limitations

DSR is not suitable for every deployment. The following constraints apply:

  • No response modification. The load balancer cannot inspect, rewrite, or compress response traffic because it never sees it. HTTP header insertion and response-based health checks are not available.
  • No SSL offload on responses. While the load balancer can terminate SSL on the inbound side, it cannot encrypt or decrypt response traffic. End-to-end SSL between backends and clients is required if response encryption is needed.
  • Backend configuration required. Each backend server must be configured with the VIP on a loopback interface and must suppress ARP announcements for that address. This adds operational complexity.
  • Layer 4 only. DSR operates at Layer 4 (via nftlb in Tula) and cannot provide Layer 7 features such as content-based routing, cookie persistence, or HTTP health checks.
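As a rough sketch of what an nftlb-based DSR setup can look like, the fragment below defines a single farm in DSR mode. The farm name, addresses, file path, and CLI flag are illustrative, and the exact JSON field names should be verified against the nftlb version shipped with Tula.

```shell
# Illustrative nftlb farm definition in DSR mode. All names, addresses,
# and paths are examples; check field names against your nftlb docs.
cat > /etc/nftlb/web-dsr.json <<'EOF'
{
  "farms": [
    {
      "name": "web-dsr",
      "virtual-addr": "203.0.113.10",
      "virtual-ports": "80",
      "protocol": "tcp",
      "mode": "dsr",
      "scheduler": "rr",
      "state": "up",
      "backends": [
        { "name": "bk01", "ip-addr": "192.168.1.11", "state": "up" },
        { "name": "bk02", "ip-addr": "192.168.1.12", "state": "up" }
      ]
    }
  ]
}
EOF

# Load the farm definition when starting the daemon (flag is illustrative).
nftlb -c /etc/nftlb/web-dsr.json
```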

Tula supports two DSR modes: Layer 2 DSR for servers on the same network segment, and Layer 3 DSR for servers across different subnets using IP-in-IP tunneling.
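In the Layer 3 mode, each backend must also decapsulate the IP-in-IP traffic the load balancer sends it. A minimal Linux sketch, again assuming the example VIP 203.0.113.10:

```shell
# Run on each backend reached via Layer 3 DSR. VIP is an assumed example.
VIP=203.0.113.10

# Load the IP-in-IP module and bring up the tunnel interface that will
# decapsulate packets the load balancer encapsulates toward this backend.
modprobe ipip
ip link set tunl0 up

# Accept traffic for the VIP on the tunnel interface.
ip addr add "$VIP/32" dev tunl0

# Relax reverse-path filtering on the tunnel: decapsulated packets carry
# the client's source address, which would otherwise fail the rp check.
sysctl -w net.ipv4.conf.tunl0.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
```

The same ARP-suppression sysctls shown earlier still apply if the backend shares a segment with the load balancer; the tunnel simply replaces the loopback binding as the point where VIP traffic is accepted.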