CHEP 2024:

Enhancing XRootD Load Balancing for High-Throughput Transfers

To address the need for high transfer throughput for projects such as the LHC experiments, including the upcoming HL-LHC, it is important to make optimal and sustainable use of our available capacity. Load balancing algorithms play a crucial role in distributing incoming network traffic across multiple servers, ensuring optimal resource utilization, preventing server overload, and enhancing performance and reliability. At the Rutherford Appleton Laboratory (RAL), the UK's Tier-1 centre for the Worldwide LHC Computing Grid (WLCG), we started with a DNS round robin then moved to XRootD's cluster management service component, which has an active load balancing algorithm to distribute traffic across 26 servers, but encountered its limitations when the system as a whole is under heavy load. We describe our tuning of the configuration of the existing algorithm before proposing a new tuneable, dynamic load-balancer based on a weighted random selection algorithm.
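
As an illustration only, the idea of a weighted random selection can be sketched as below. The abstract does not say how the weights are derived, so the load metric (a fraction between 0 and 1), the `exponent` tuning knob, the weight floor, and the name `pick_server` are all assumptions made for this sketch rather than the algorithm deployed at RAL.

```python
import random
from typing import Dict, Optional


def pick_server(
    loads: Dict[str, float],
    exponent: float = 1.0,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one server at random, favouring lightly loaded ones.

    `loads` maps a (hypothetical) server name to a load fraction in [0, 1].
    `exponent` is an assumed tuning parameter: larger values bias the
    choice more strongly towards the least loaded servers.
    """
    if not loads:
        raise ValueError("no servers to choose from")
    rng = rng or random.Random()
    names = list(loads)
    # Weight is the remaining headroom; a small floor keeps every server
    # selectable even when it reports itself as fully loaded.
    weights = [max(1.0 - loads[name], 0.01) ** exponent for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Unlike always choosing the least loaded server, a random draw spreads simultaneous requests across several servers while still steering most traffic away from the busiest ones.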

Achieving 100Gb/s data rates with XRootD - Preparing for HL-LHC and SKA

To address the needs of forthcoming projects such as the Square Kilometre Array (SKA) and the HL-LHC, there is a critical demand for data transfer nodes (DTNs) capable of achieving 100Gb/s of data movement. This high throughput can be attained through combinations of increased concurrency of transfers and improvements in the speed of individual transfers. At the Rutherford Appleton Laboratory (RAL), the UK's Tier-1 centre for the Worldwide LHC Computing Grid and the initial site for the UK SKA Regional Centre (SRC), we have provisioned 100GbE XRootD servers in preparation for SKA operations. This presentation details the efforts undertaken to reach 100Gb/s data ingress and egress rates using the WebDAV protocol through XRootD endpoints, including the use of a novel XRootD plug-in designed to assess XRootD performance independently of the physical storage backend. Results are presented for transfer tests against a CephFS storage backend under different configuration settings (e.g. via tunings to file layouts). We discuss the challenges encountered, bottlenecks identified, and insights gained, along with a description of the most effective solutions developed to date and areas of future activity.

Sustainability at the RAL Tier-1

CHEP 2023:

Jira: XRD-51

DRAFT!

Data access for the LHC experiments and increasing numbers of other HEP and astronomy communities is provided at the UK Tier-1 facility at RAL through its ECHO storage.
ECHO - currently in excess of 40PB of usable space - is deployed as a Ceph-backed erasure-coded object store, with frontend access to data provided via XRootD - using the XrdCeph plugin - or GridFTP, via the libradosstriper library of Ceph.

The storage must service the needs of: high-throughput compute, with staged and direct file access passing through an XCache on each worker node; data access to compute running at storageless sites, increasingly utilising XCaches; and managed inter-site data transfers using the recently adopted HTTPS protocol (using WebDAV), including multihop data transfers between RAL’s newly commissioned CTA tape endpoint and external sites.

A review of the experiences of running an object store within these HEP data workflows is presented, including the details of the improvements necessary for the transition to WebDAV from GridFTP for most inter-site data movements, and enhancements for direct-IO access, where the development and optimisation of buffering and range coalescence strategies is explored.
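
To make the term concrete, range coalescence can be sketched as merging nearby byte-range reads into fewer, larger requests before they reach the object store. This is a minimal sketch, not the strategy implemented in XrdCeph: the function name `coalesce_ranges`, the `(offset, length)` representation, and the `max_gap` threshold are assumptions made for illustration.

```python
from typing import List, Tuple


def coalesce_ranges(
    ranges: List[Tuple[int, int]], max_gap: int = 0
) -> List[Tuple[int, int]]:
    """Merge (offset, length) read requests into fewer, larger reads.

    Two requests are merged when the gap between them is at most
    `max_gap` bytes (an assumed tuning parameter); overlapping and
    adjacent requests are always merged.
    """
    merged: List[List[int]] = []  # each entry is [start, end), end exclusive
    for offset, length in sorted(ranges):
        if length <= 0:
            continue  # ignore empty or malformed requests
        end = offset + length
        if merged and offset - merged[-1][1] <= max_gap:
            # Close enough to the previous read: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Two adjacent reads collapse into one; the distant read stays separate.
print(coalesce_ranges([(0, 100), (100, 50), (400, 10)]))
# [(0, 150), (400, 10)]
```

A larger `max_gap` trades some unnecessary bytes read for fewer round trips to the storage backend, which is the kind of trade-off a buffering strategy would need to tune.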

In addition to serving the requirements of LHC Run-3, preparations for Run-4 and for large astronomy experiments are underway. One example is ROOT-based data formats, where the evolution from a TTree to an RNTuple data structure provides an opportunity for storage providers to optimise and benchmark against this new format. A comparison of the current performance between data formats within ECHO is presented and the details of potential improvements explored.

...