...
All gateways were added to this sandbox and recompiled. As a result we saw an increase in throughput, along with an increase in dropped packets. The drops were thought to be due to insufficient TCP buffer sizes, so Thomas, Jyothish (STFC,RAL,SC) made further changes, increasing the TCP and ring buffers following the ESnet recommendations at https://fasterdata.es.net/host-tuning/linux/100g-tuning/. This was initially rolled out to a subset of gateways on 2/20 at 16:15, and that subset stopped dropping packets. The change was left in place overnight to confirm the gateways were stable, and was then deployed to the rest of the gateways the next morning. All gateways then stopped dropping packets and maintained near-saturation throughput.
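One way to check whether a gateway has stopped dropping packets after a tuning change is to compare the kernel's per-interface drop counters at two points in time. The sketch below is only illustrative: it assumes the drops of interest show up in the `rx_drop` column of Linux's `/proc/net/dev`, which may not be the counter that was actually monitored here, and the helper names (`parse_net_dev`, `rx_drop_delta`) are hypothetical.

```python
from typing import Dict, NamedTuple


class IfaceCounters(NamedTuple):
    """Subset of the per-interface counters exposed by /proc/net/dev."""
    rx_packets: int
    rx_drop: int
    tx_packets: int
    tx_drop: int


def parse_net_dev(text: str) -> Dict[str, IfaceCounters]:
    """Parse the text of /proc/net/dev into per-interface counters.

    The file has two header lines, then one line per interface:
    'name: <8 receive fields> <8 transmit fields>'.
    """
    counters: Dict[str, IfaceCounters] = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if len(fields) < 16:
            continue  # unexpected layout; skip rather than guess
        counters[name.strip()] = IfaceCounters(
            rx_packets=int(fields[1]),
            rx_drop=int(fields[3]),
            tx_packets=int(fields[9]),
            tx_drop=int(fields[11]),
        )
    return counters


def rx_drop_delta(
    before: Dict[str, IfaceCounters], after: Dict[str, IfaceCounters]
) -> Dict[str, int]:
    """Return receive drops accumulated between two samples, per interface."""
    return {
        name: after[name].rx_drop - before[name].rx_drop
        for name in after
        if name in before
    }
```

On a gateway, the two samples would come from reading `/proc/net/dev` before and after the observation window (for example, overnight); a delta of zero on the data interface is consistent with the drops having stopped.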
packet drops - ECN enabled, ECN disabled, TCP and ring buffer tuning
load balancing - 50/50 load and net selbyload vs standard round robin
load balancing - RR, ping times, number of connections