Performance Tuning & Traffic Shaping
Hive Router is built for performance right out of the box, but every production setup is different. Tuning the router's traffic management settings to match your specific workload and subgraph capabilities can unlock significantly better throughput and reliability.
This guide covers the traffic_shaping configuration in detail, explains the trade-offs of each
setting, and gives you a practical approach to benchmarking and optimizing your deployment.
For a quick reference of the configuration syntax, see the
traffic_shaping configuration reference.
Understanding Connection Limits
The most important setting for both performance and stability is max_connections_per_host. This
controls how many concurrent HTTP connections the router will open to each subgraph host (like
products.api.example.com).
- Default Value: 100
Finding the Sweet Spot
Getting this right is about balancing maximum throughput with protecting your subgraphs from overload.
Too low = bottleneck:
- Even if your subgraphs have plenty of capacity, a low connection limit will queue requests inside the router, adding latency
- Your subgraph services might sit idle while the router artificially throttles traffic
- You're leaving performance on the table
Too high = overload risk:
- During traffic spikes, the router might flood subgraphs with more connections than they can handle
- This can overwhelm connection pools, CPU, or memory on your subgraphs
- Can trigger cascading failures or "thundering herd" problems where sudden traffic surges crash downstream services
- A large number of open outbound connections can exhaust ephemeral ports on the router host
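The queuing effect of a connection cap can be illustrated with a small simulation. This is a conceptual sketch only, not the router's implementation: an asyncio semaphore stands in for the per-host limit, and the function name, request count, and latency are made up for the example.

```python
import asyncio


async def run_with_limit(num_requests: int, max_connections: int) -> int:
    """Run simulated subgraph requests under a cap; return the peak concurrency seen."""
    # Hypothetical stand-in for a per-host connection cap.
    slots = asyncio.Semaphore(max_connections)
    in_flight = 0
    peak = 0

    async def one_request() -> None:
        nonlocal in_flight, peak
        # Requests beyond the cap wait here, which is the queuing described above.
        async with slots:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # simulated subgraph latency
            in_flight -= 1

    await asyncio.gather(*(one_request() for _ in range(num_requests)))
    return peak


# 20 requests arrive at once, but at most 4 reach the "subgraph" at a time.
peak = asyncio.run(run_with_limit(num_requests=20, max_connections=4))
```

With a cap of 4, the remaining 16 requests wait inside the caller rather than reaching the downstream service, which protects the service but adds latency for the waiting requests.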
How to Tune It
Start with the default and adjust based on your observations:
- Monitor subgraph performance under normal and peak load
- Watch for connection pool exhaustion in your subgraph logs
- Look for queuing in router metrics: if requests are waiting for connections, you might need to increase the limit
- Load test gradually: increase the limit incrementally and measure the impact
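As a rough starting point for load testing, Little's Law (average concurrency = arrival rate x average latency) gives a back-of-envelope estimate of how many connections a subgraph host needs. The sketch below assumes each in-flight subgraph request occupies one connection; the helper name, the headroom factor, and the traffic figures are illustrative assumptions, not router settings.

```python
import math


def estimate_connections(
    requests_per_second: float,
    avg_latency_ms: float,
    headroom: float = 1.5,
) -> int:
    """Estimate concurrent connections to one subgraph host using Little's Law."""
    if requests_per_second < 0 or avg_latency_ms < 0 or headroom < 1:
        raise ValueError("rates and latency must be non-negative, headroom >= 1")
    # Average number of requests in flight at any moment.
    concurrent = requests_per_second * avg_latency_ms / 1000
    # Add headroom for bursts (the 1.5 default is an arbitrary example value).
    return max(1, math.ceil(concurrent * headroom))


# 400 subgraph requests/s at 50 ms average latency: about 20 in flight on average.
estimate = estimate_connections(requests_per_second=400, avg_latency_ms=50)
```

An estimate like this only suggests where to begin; measured queuing and subgraph health under load should drive the final value.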
Managing Idle Connections
The pool_idle_timeout setting controls how long unused connections stay open in the router's
connection pool before being closed.
- Default Value: 50s
It takes a duration string (like 30s for 30 seconds, or 1m for 1 minute). This setting affects
how aggressively the router reuses existing connections versus closing them to free up resources.
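To make the reuse-versus-close behavior concrete, here is a toy connection pool with an idle timeout. It is a simplified model for illustration, not the router's pool: the class, its fields, and the explicit `now` timestamps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IdlePool:
    """Toy pool that remembers when each idle connection was released."""

    idle_timeout: float  # seconds; plays the role of pool_idle_timeout
    _idle: list = field(default_factory=list)  # (conn_id, idle_since) pairs
    _next_id: int = 0
    opened: int = 0  # number of fresh connections (handshakes) needed

    def acquire(self, now: float) -> int:
        # Discard connections that have been idle for the timeout or longer.
        self._idle = [(c, t) for c, t in self._idle if now - t < self.idle_timeout]
        if self._idle:
            conn_id, _ = self._idle.pop()
            return conn_id  # reuse: no new handshake
        self._next_id += 1
        self.opened += 1
        return self._next_id  # fresh connection: pays the setup cost

    def release(self, conn_id: int, now: float) -> None:
        self._idle.append((conn_id, now))


pool = IdlePool(idle_timeout=50.0)
first = pool.acquire(now=0.0)
pool.release(first, now=1.0)
second = pool.acquire(now=30.0)   # idle for 29s: reused
pool.release(second, now=31.0)
third = pool.acquire(now=100.0)   # idle for 69s: expired, new connection opened
```

In this model a steady stream of requests keeps reusing the same connection, while a gap longer than the timeout forces a new one.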
The Connection Reuse Trade-off
Too short = latency overhead:
- Connections get closed quickly, so new requests have to establish fresh TCP/TLS connections
- Each new connection adds handshake latency (especially noticeable with TLS)
- Your router and subgraphs spend more CPU on connection setup
Too long = resource waste:
- Idle connections consume memory and file descriptors on both the router and subgraph servers
- Network devices (load balancers, firewalls) might have shorter timeouts and silently drop connections, leading to "zombie" connections that fail when used
Tuning Guidelines
- High-traffic APIs: Use longer timeouts (60-300 seconds) since connections are likely to be reused quickly
- Low-traffic APIs: Use shorter timeouts (10-30 seconds) to free up resources
- Check your infrastructure: Make sure this timeout is shorter than any load balancer or firewall timeouts in your stack
- Monitor connection errors: If you see connection failures, your timeout might be longer than network device timeouts
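The infrastructure check above can be scripted. The sketch below parses simple duration strings and reports which network hops would drop an idle connection before the router does. The parser is a simplification that accepts a single integer with one unit (`ms`, `s`, `m`, or `h`, an assumed set), and the hop names and values are hypothetical.

```python
import re

# Assumed unit set for this sketch; the router's accepted formats may differ.
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: str) -> float:
    """Parse a simple duration such as '30s' or '1m' into seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def hops_with_shorter_timeout(pool_idle_timeout: str, infra_timeouts: dict) -> list:
    """Return the hops whose idle timeout is not longer than the router's."""
    router_seconds = parse_duration(pool_idle_timeout)
    return sorted(
        name
        for name, timeout in infra_timeouts.items()
        if parse_duration(timeout) <= router_seconds
    )


# Hypothetical infrastructure values for illustration.
risky = hops_with_shorter_timeout(
    "50s", {"load_balancer": "60s", "firewall": "30s"}
)
```

Any hop returned here may silently close connections that the router still considers usable, so either lower pool_idle_timeout or raise that hop's timeout.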
Request Deduplication
The router supports two complementary levels of in-flight request deduplication that can be enabled independently: inbound and outbound.
Inbound Deduplication
Inbound deduplication (traffic_shaping.router.dedupe) operates at the router's entry point.
When multiple clients send identical GraphQL query operations at the same time, the router
executes the operation once and shares the result with every waiting client, so subgraphs receive
a single request no matter how many clients are waiting.
- Default: false (opt-in)

traffic_shaping:
  router:
    dedupe:
      enabled: true

Deduplication key
Two requests are considered identical when the following all match:
- HTTP method and path
- Normalized operation text (whitespace/comment differences are ignored)
- GraphQL variables
- GraphQL extensions
- Schema checksum (prevents sharing across schema reload transitions)
- Selected request headers (controlled by the headers policy below)
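The components listed above can be combined into a single fingerprint. The following is a conceptual sketch, not the router's actual algorithm: the function names and the SHA-256/JSON encoding are assumptions, and the normalization is deliberately naive (it strips `#` comments and collapses whitespace without parsing the query, so it would mishandle a `#` inside a string literal).

```python
import hashlib
import json
import re


def normalize_operation(query: str) -> str:
    """Naive normalization: drop '#' comments and collapse whitespace."""
    without_comments = re.sub(r"#[^\n]*", "", query)
    return " ".join(without_comments.split())


def select_headers(headers: dict, policy) -> dict:
    """Apply a header policy: 'all', 'none', or a list of header names to include."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if policy == "all":
        return lowered
    if policy == "none":
        return {}
    if isinstance(policy, str):
        raise ValueError(f"unknown header policy: {policy!r}")
    wanted = {name.lower() for name in policy}
    return {name: value for name, value in lowered.items() if name in wanted}


def dedupe_key(method, path, query, variables, extensions,
               schema_checksum, headers, header_policy="all") -> str:
    """Hash every component of the request that must match for deduplication."""
    payload = {
        "method": method.upper(),
        "path": path,
        "operation": normalize_operation(query),
        "variables": variables or {},
        "extensions": extensions or {},
        "schema": schema_checksum,
        "headers": select_headers(headers, header_policy),
    }
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


alice = {"Authorization": "Bearer alice"}
bob = {"Authorization": "Bearer bob"}
tidy = "query { products { id } }"
messy = "query {\n  products { id } # list products\n}"

same = dedupe_key("POST", "/graphql", tidy, {}, {}, "abc123", alice) == \
    dedupe_key("POST", "/graphql", messy, {}, {}, "abc123", alice)
split_by_user = dedupe_key("POST", "/graphql", tidy, {}, {}, "abc123", alice) != \
    dedupe_key("POST", "/graphql", tidy, {}, {}, "abc123", bob)
```

In this sketch, formatting differences do not change the key, while a different Authorization value or schema checksum does.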
Header policy
By default, all headers are included in the fingerprint, so requests with different Authorization
or Cookie headers are not deduplicated with each other. You can narrow this down:
traffic_shaping:
  router:
    dedupe:
      enabled: true
      headers: all # default: include every header

traffic_shaping:
  router:
    dedupe:
      enabled: true
      headers: none # ignore all headers (requests from any user may be deduplicated)

traffic_shaping:
  router:
    dedupe:
      enabled: true
      headers:
        include: # include only these headers in the fingerprint
          - authorization
          - cookie

When to enable it:
- Many clients frequently issue the same popular queries (dashboards, landing pages, product listings)
- You want to reduce overall query execution pressure on your subgraphs under concurrent load
When you might leave it disabled:
- All queries are highly personalized and rarely identical
- You're debugging and want every request to execute independently
Outbound Deduplication
Outbound deduplication (dedupe_enabled) applies to the requests the router makes
to individual subgraphs. When the router would send multiple identical requests to the same subgraph
at the same time, it sends only one and fans the response out to all of the waiting fetches.
- Default Value: true
This is almost always beneficial to keep enabled. It dramatically reduces load on subgraphs when multiple clients request the same data at once (think of popular content or dashboard queries that many users run simultaneously).
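The general idea behind in-flight deduplication is often called "single flight": concurrent callers with the same key share one pending execution, and nothing is kept once it completes. The sketch below illustrates that pattern with asyncio; it is not the router's implementation, and the class name, key, and fetch function are hypothetical.

```python
import asyncio


class SingleFlight:
    """Share one in-flight execution among concurrent callers with the same key."""

    def __init__(self) -> None:
        self._in_flight = {}  # key -> asyncio.Task for the pending execution

    async def run(self, key, fetch):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the real work.
            task = asyncio.ensure_future(fetch())
            self._in_flight[key] = task
            # Forget the entry once it finishes: this is deduplication, not caching.
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # Every caller, first or later, awaits the same task.
        return await task


async def demo():
    calls = 0

    async def fetch_products() -> str:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulated subgraph round trip
        return "products-payload"

    flights = SingleFlight()
    results = await asyncio.gather(
        *(flights.run("products-query", fetch_products) for _ in range(5))
    )
    return calls, list(results)


calls, results = asyncio.run(demo())
```

Five concurrent callers receive the same payload while the simulated subgraph is called once. A request that arrives after the first one has completed triggers a new execution, since only in-flight work is shared.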
When you might disable it:
- Your queries are always unique (heavily personalized)
- You're debugging and want to see every request
- You have very low traffic where deduplication doesn't help