
ECN

Before the data challenge, our ECN config ('net.ipv4.tcp_ecn') was set to 1 (always on and always request ECN).
We saw the throughput to SARA fall to very low rates while their network link was loaded.

Alexander Rogovskiy then tried transfers from a test gateway with ECN disabled (set to 0, always off)
and found the speed increased from a few hundred kbps to 16 MB/s.

Thomas, Jyothish (STFC,RAL,SC) made a sandbox with those changes and asked James Adams to review it. James suggested using ecn=2 (off by default, enabled on request), which is the Linux kernel default. This change was made to the sandbox.
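For reference, the three 'net.ipv4.tcp_ecn' values discussed above can be checked on a host with a short script. This is a minimal sketch, not the tooling used on the gateways: the function names are made up for illustration, and it assumes a Linux host that exposes the setting at /proc/sys/net/ipv4/tcp_ecn.

```python
from pathlib import Path

# Meaning of each net.ipv4.tcp_ecn value, as described in the Linux
# kernel ip-sysctl documentation.
TCP_ECN_MODES = {
    0: "off: never request or accept ECN",
    1: "on: request ECN on outgoing connections and accept it on incoming ones",
    2: "on request: accept ECN on incoming connections, do not request it on outgoing ones",
}


def describe_tcp_ecn(value: int) -> str:
    """Return a human-readable description of a tcp_ecn setting."""
    return TCP_ECN_MODES.get(value, f"unrecognised value {value}")


def read_tcp_ecn(path: str = "/proc/sys/net/ipv4/tcp_ecn") -> int:
    """Read the current setting (the default path assumes a Linux host)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    current = read_tcp_ecn()
    print(f"net.ipv4.tcp_ecn = {current} ({describe_tcp_ecn(current)})")
```

Running it on a gateway before and after the sandbox change would be one way to confirm that the setting actually took effect.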

All gateways were added to this sandbox and recompiled. We saw an increase in throughput as a result, along with an increase in dropped packets. This was thought to be due to insufficient TCP buffer sizes, so Thomas, Jyothish (STFC,RAL,SC) made further changes, increasing the TCP and ring buffers following the ESnet recommendations (https://fasterdata.es.net/host-tuning/linux/100g-tuning/).

This was initially rolled out to a subset of gateways on 2/20 at 16:15, and that subset stopped dropping packets. The change was left in place overnight to confirm the gateways were stable and was then deployed to the rest of the gateways the next morning. All gateways then stopped dropping packets and maintained near-saturation throughput.
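One common way to reason about whether TCP buffers are large enough is to compare the maximum buffer size with the bandwidth-delay product (BDP) of the path: link bandwidth multiplied by round-trip time. The sketch below illustrates that check only. The link speed, RTT and buffer values are hypothetical examples, not measurements or settings from the gateways, and the function names are made up for illustration.

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: data in flight on a full pipe."""
    return bandwidth_bits_per_s * rtt_ms // (8 * 1000)


def parse_tcp_mem(setting: str) -> tuple[int, int, int]:
    """Parse a 'min default max' triple as used by net.ipv4.tcp_rmem/tcp_wmem."""
    minimum, default, maximum = (int(field) for field in setting.split())
    return minimum, default, maximum


def buffer_covers_bdp(setting: str, bandwidth_bits_per_s: int, rtt_ms: int) -> bool:
    """True if the maximum buffer in the triple is at least one BDP."""
    _, _, maximum = parse_tcp_mem(setting)
    return maximum >= bdp_bytes(bandwidth_bits_per_s, rtt_ms)


if __name__ == "__main__":
    # Hypothetical path: 100 Gbit/s link with a 10 ms round-trip time.
    link = 100_000_000_000
    rtt = 10
    print(f"BDP: {bdp_bytes(link, rtt)} bytes")
    # Illustrative tcp_rmem triple, not taken from the gateways.
    print(buffer_covers_bdp("4096 87380 6291456", link, rtt))
```

With these example numbers a single stream would need far more buffer than the illustrative maximum allows, which is the kind of gap that buffer tuning is meant to close.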

image-20240222-152438.png

packet drops - ecn enabled, ecn disabled, tcp and ring buffer tuning

image-20240222-152543.png

image-20240222-152801.png

load balancing - 50/50 load and net selbyload vs standard round robin

image-20240222-161211.png

load balancing - RR, ping times, number of connections
