2022-09-21 Meeting notes



Sep 21, 2022


  • @James Walder

  • @Alastair Dewhurst

  • @Emmanuel Bejide

  • @Thomas Byrne

  • Manchester: Alessandra

  • Glasgow: Sam

  • Lancs: Steven


  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity


 Discussion topics












Echo Gateways ~ 1/2 with 5.5.0 sandbox:

Awaiting approval of sandbox (Deployed).
WN XrootD containers then need to be addressed (CentOS 7 / EL8?) (to discuss explicitly with @Thomas Birkett, and with automated builds).


‘unified’ config


Do we have a better name than ‘unified’?
This will be configured to run on the webdav alias hosts.
root TPC transfers to be redirected to xrootd aliased hosts (or could just fail…)





gfal2 CLI slower than API

(The API creates a context once for each multi-processing thread.)
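
A minimal sketch of the once-per-worker pattern noted above, assuming the difference comes from context set-up cost. `FakeContext` and the helper names are hypothetical stand-ins so the example runs without gfal2 installed; with the real Python bindings the context would instead come from `gfal2.creat_context()`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeContext:
    """Hypothetical stand-in for a client context that is costly to create."""

    created = 0                 # how many contexts have been built so far
    _lock = threading.Lock()

    def __init__(self):
        with FakeContext._lock:
            FakeContext.created += 1

    def stat(self, url):
        # Placeholder result; a real context would contact the storage endpoint.
        return {"url": url, "size": len(url)}


_local = threading.local()      # one slot per worker thread


def get_context():
    """Return this worker's context, creating it on first use only."""
    ctx = getattr(_local, "ctx", None)
    if ctx is None:
        ctx = FakeContext()     # assumption: real code would call gfal2.creat_context()
        _local.ctx = ctx
    return ctx


def stat_one(url):
    return get_context().stat(url)


def stat_many(urls, workers=2):
    """Stat every URL, reusing one context per worker rather than one per call."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(stat_one, urls))
```

With two workers and many URLs, at most two contexts are created, whereas a fresh command-line invocation per URL would presumably pay the set-up cost every time.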


CHEP abstracts


Anything planned for CHEP? (Abstract deadline: 17 Nov)

  • SE (Xrootd + Posix); and dev-lead ideas?




Currently disabled; to review when Will returns (and a short post-mortem)



Possible New Time slot:

Time slot          Highly inconvenient
Monday 11-12
Wednesday 16-17
Thursday 13-14
Friday 14-15





Site reports

21 Sept 2022


Glasgow is now on davs / xrootd 5.5.0. As mentioned to James, we saw 20 Gbit/s rates through the internal gateway before the xrootd service "livelocked" (up, but apparently politely ignoring requests). Fixed by a restart. We *do* see some packet discards on that link at the time we were at 20 Gbit/s, so the discards may be associated with the "ceph/rados ctx issues" -> silent failures that Jyothish has noticed in general, where xrdceph failures are not reported back up the xrootd chain.


Need to move to the redirector infrastructure now anyway [which I was holding off on whilst understanding the above] so that should also help reliability. 

Also doing some dev work on a fork of XrdCeph to add stream/on-the-fly checksums into the module [which hopefully would significantly reduce the memory and I/O impact of checksums].
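
As a rough illustration of the on-the-fly idea (not the XrdCeph fork itself), the sketch below updates a checksum as each chunk passes through, so the object does not have to be read back or held in memory afterwards. Adler-32 and the class name `StreamingAdler32` are assumptions made for the example, and the approach only holds if chunks arrive in order.

```python
import zlib


class StreamingAdler32:
    """Accumulate an Adler-32 checksum chunk by chunk (hypothetical helper)."""

    def __init__(self):
        self.value = 1          # Adler-32 starts from 1
        self.bytes_seen = 0

    def update(self, chunk):
        # Fold the new chunk into the running checksum; chunks must be sequential.
        self.value = zlib.adler32(chunk, self.value) & 0xFFFFFFFF
        self.bytes_seen += len(chunk)

    def hexdigest(self):
        # Eight lower-case hex digits, zero-padded.
        return format(self.value, "08x")


def checksum_stream(chunks):
    """Consume an iterable of byte chunks and return the final hex checksum."""
    acc = StreamingAdler32()
    for chunk in chunks:
        acc.update(chunk)
    return acc.hexdigest()
```

Only the running 32-bit value and a byte counter are kept between chunks, which is where the hoped-for memory saving would come from.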


(We also need to investigate the discards at a NIC config level.)


Lancaster redirection balancing:

Bands indicate the contribution of each server, stacked to 100% (left-hand scale). The thick line (right-hand scale) is the standard deviation of the percentages; higher values indicate greater imbalance. User+system time is as reported by xrootd, documented as generated from getrusage.
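
The imbalance measure described above could be computed roughly as follows. This is a sketch under assumptions: the function name and host names are hypothetical, and the population standard deviation is used because the notes do not say which variant the plot uses.

```python
from statistics import pstdev


def balance_metric(cpu_seconds):
    """Return (percentage share per server, stddev of those percentages).

    cpu_seconds maps a server name to its user+system CPU time for one interval.
    """
    total = sum(cpu_seconds.values())
    if total <= 0:
        raise ValueError("no CPU time recorded in this interval")
    # Each server's share of the total, so the shares stack to 100%.
    shares = {host: 100.0 * secs / total for host, secs in cpu_seconds.items()}
    # Population stddev (assumption): 0 means perfectly even load.
    return shares, pstdev(list(shares.values()))


if __name__ == "__main__":
    shares, spread = balance_metric({"srv-a": 30.0, "srv-b": 10.0})
    print(shares, spread)
```

Two servers at 75% and 25% give a spread of 25, while an even 50/50 split gives 0.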


 Action items

@James Walder Prepare 5.5.1 RPMs

@James Walder Propose Thursday 13-14 for new meetings


  1. Thursday 13-14 for new meetings