We're planning to add a streamed checksum implementation to XrdCeph (computing the checksum at the server and writing it directly into the metadata instead of reading the file back) and to give it a few months of validation testing, comparing it against the read-back checksum.
Tests have already shown a good improvement in performance when done this way.
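A minimal sketch of the streamed-checksum idea, in Python rather than in XrdCeph's own code: the checksum is updated as each chunk is written, so the data never has to be read back. Adler-32 is assumed here purely for illustration; `StreamedChecksumWriter`, the in-memory sink and the `metadata` dict are hypothetical stand-ins for the real object write and metadata store.

```python
import io
import zlib


class StreamedChecksumWriter:
    """Wraps a writable sink and updates an Adler-32 checksum per chunk written."""

    def __init__(self, sink):
        self.sink = sink
        self.checksum = 1  # Adler-32 starting value
        self.bytes_written = 0

    def write(self, chunk: bytes) -> int:
        # Assumes chunks arrive in order; out-of-order writes would need
        # a fallback (for example a read-back) in a real implementation.
        self.sink.write(chunk)
        self.checksum = zlib.adler32(chunk, self.checksum)
        self.bytes_written += len(chunk)
        return len(chunk)

    def hexdigest(self) -> str:
        return f"{self.checksum & 0xFFFFFFFF:08x}"


def readback_checksum(data: bytes) -> str:
    """Reference value: checksum computed over the complete data in one pass."""
    return f"{zlib.adler32(data) & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    sink = io.BytesIO()
    writer = StreamedChecksumWriter(sink)
    for chunk in (b"hello ", b"streamed ", b"world"):
        writer.write(chunk)
    # Hypothetical metadata record written once the transfer completes.
    metadata = {"adler32": writer.hexdigest()}
    # Validation step: the streamed value should match the read-back value.
    assert metadata["adler32"] == readback_checksum(sink.getvalue())
```

The final assertion mirrors the planned validation period: both values are computed and compared, and any mismatch would point at a problem in the streamed path.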
2 - XrdCeph plans:
We're planning to move away from libradosstriper to using rados directly, for future-proofing and performance improvements, and to merge our fork into core XRootD.
3 - Backward compatibility with 'broken' clients:
This followed an incident where ATLAS were trying to use 5.6.0 clients with their older analysis software, which had a TLS bug causing transfer failures against endpoints supporting tokens over IPv4. We mitigated it with the NOTLSOK environment variable, but it would be good to have a consensus on how to deal with clients with known issues.
The sandbox is deployed to the whole preprod farm. LHCb uploads look OK; ATLAS and CMS have not tested the new setup yet.
Plan: file query system to summarize XRootD Logs
Plan to create a system that stores info from across all gateways, so that we can search for a filename and get its creation time, last write time, last successful stat and deletion time in the case of ‘lost’ files. Possible graduate side project.
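As a rough illustration of the kind of lookup described, here is a small Python sketch. The record layout `(timestamp, gateway, path, operation)` and the operation names are assumptions made for the example and not the actual XRootD log format; timestamps are assumed to be ISO 8601 UTC strings so that they sort chronologically.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical parsed log record: (timestamp, gateway, path, operation)
Event = Tuple[str, str, str, str]


@dataclass
class FileHistory:
    created: Optional[str] = None
    last_write: Optional[str] = None
    last_stat_ok: Optional[str] = None
    deleted: Optional[str] = None


def summarize(events: Iterable[Event], filename: str) -> FileHistory:
    """Merge events from all gateways into one history for a single file."""
    history = FileHistory()
    # Sorting by timestamp first means later events overwrite earlier ones.
    for ts, _gateway, path, op in sorted(events):
        if path != filename:
            continue
        if op == "create":
            if history.created is None:
                history.created = ts  # keep the earliest creation
        elif op == "write":
            history.last_write = ts
        elif op == "stat_ok":
            history.last_stat_ok = ts
        elif op == "delete":
            history.deleted = ts
    return history


if __name__ == "__main__":
    events = [
        ("2024-01-02T10:00:00Z", "gw2", "/store/a", "write"),
        ("2024-01-01T09:00:00Z", "gw1", "/store/a", "create"),
        ("2024-01-03T08:00:00Z", "gw1", "/store/a", "stat_ok"),
        ("2024-01-04T08:00:00Z", "gw3", "/store/a", "delete"),
    ]
    print(summarize(events, "/store/a"))
```

A real system would presumably keep these records in a database indexed by filename instead of scanning every event per query.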
UKSRC - Acting as source for SRCNet verification tests; not being stressed so far …
Tier-1
UKSRC Storage Architecture
Tokens Status
Operational
Technical
Accounting
on GGUS:
Site reports
Lancaster:
All on 5.7.3 for the last week, no issues.
Following on from last week, Steven has been upping our number of shoveller instances, we’ll see how that goes.
As mentioned in storage we’re shopping for some new gateways - getting quotes for single socket Dell boxes with quad-port 25Gb network cards (and looking at 100Gb options). Our plan is to have the external port on a single 25Gb and internal on a pair of bonded 25Gb NICs. It would be nice to save money and not have to cram these with RAM.
Also mentioned was Gerard's “xrootd restarter” service, which will shepherd regular rolling restarts of our xroot services (as a means to deal with xroot’s poor connection handling). The aim is to cleanly restart the services every couple of days. Gerard’s been working on making it as “unhacky” as he possibly can.
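For readers unfamiliar with the pattern, a rolling restart can be sketched as below. This is not Gerard's service, only a generic Python illustration under assumptions: `rolling_restart`, `restart` and `is_healthy` are hypothetical names, and the health check is whatever the site considers "back up".

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    services: Iterable[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    timeout_s: float = 60.0,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Restart services one at a time, waiting for each to become healthy.

    Stops at the first service that does not recover within timeout_s, so
    the remaining services keep serving. Returns the services that were
    restarted successfully.
    """
    done: List[str] = []
    for name in services:
        restart(name)
        deadline = clock() + timeout_s
        while not is_healthy(name):
            if clock() >= deadline:
                return done  # halt the roll; the rest stay untouched
            sleep(poll_s)
        done.append(name)
    return done
```

In practice `restart` might wrap something like `subprocess.run(["systemctl", "restart", name], check=True)`, with the whole roll triggered from a timer every couple of days.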
Side topic: in an email thread with Dan T, he mentioned TLS hardware offloading, which piqued my interest. Has anyone looked into this recently? AIUI it’s not a feature on all cards. I see there are also TCP checksum offload features in cards.
(have we discussed this here before? It rings a bell…)
Glasgow -
✅ Action items
How to replace the original functionality of fstream monitoring, now that OpenSearch has replaced the existing solutions.