2025-02-27 Meeting Notes
Date
Feb 27, 2025
Participants
@Thomas, Jyothish (STFC,RAL,SC)
@Ian Johnson
@Alexander Rogovskiy
Lancs: Steven, Gerard
Glasgow:
Apologies:
James Walder, Matt and Gerard as they’re still stuck in the machine room.
CC:
Goals
List of Epics
New tickets
Consider new functionality / items
Detailed discussion of important topics
Site report activity
Discussion topics
Current status of Echo Gateways / WNs testing
Recent sandboxes for review / deployments:
| Item | Presenter | Notes |
|---|---|---|
| Operational Issues | | Worker Node writable XCache fixed and deployed on LHCb nodes. Ceph upgrade to Quincy ongoing. |
| Compilation and rollout status of RAL XRootD versions | @Thomas, Jyothish (STFC,RAL,SC) | 5.7.3 released (awaiting other changes to gateways). |
| XRootD collaboration Meeting | | https://indico.cern.ch/event/1510817/ Requirements gathering session. Officially joining the collaboration requires pledging FTEs, but we can still submit patches and PRs without that. 6.0 features improve CI/security/stability; see https://github.com/orgs/xrootd/projects/1 for the full list. General HTTP/davs improvements were requested by various communities, with the possibility of the davix client being merged into XRootD. The use of xrd.network and the fallback manager by the ALICE analysis facilities looked interesting, as did their custom quota plugin. The latter can set a quota per individual user, but they have used it to set a quota on each server. There was a general wish for improvements in logging and error messages. I had a chat with Guilherme afterwards and proposed a mid/long-term plan for a GeoIP-based redirection algorithm; this would benefit us by allowing a CMSD cluster for the batch farm, and would also help with CMS AAA. |
| cms-aaa naming convention | @Thomas, Jyothish (STFC,RAL,SC) | cms-aaa is the only remaining personality to use proxy/ceph as the XRootD service names. A separate naming convention (main/supporting) would be more appropriate; not so urgent. CC created; the sandbox is prepared and has been tested on a test host. |
| cms-aaa jemalloc use | @Thomas, Jyothish (STFC,RAL,SC) | Testing on svc20; some memory leak still present. |
| Shoveler | @Katy Ellis | Shoveler installation and monitoring. Had a discussion with Andy and Guilherme, as well as a new developer in the team. There is not much appetite to support it going forward. One option floated is that, once the logging improvements are made, the logs could be passed into a standard log parser and pushed into Elasticsearch directly. |
| On the fly Checksums | @Ian Johnson | Logging of streamed and readback checksums is progressing. Config options now allow new version of XrdCeph to: Would like to run one gateway during the testing period with the “do nothing” option as a reference. Action: check tomorrow for test readiness, test on Friday afternoon, and possible limited deployment during mini-DC. |
| Deletions | | NTR, apart from revising SQL queries from previous deletion report scripts. |
| XRootD Writable Workernode Gateway Hackathon | @Thomas, Jyothish (STFC,RAL,SC) | Rolled back due to memory issues related to buffer size. |
| Plan: file query system to summarize XRootD Logs | | Plan to create a system that stores info from across all gateways, so that a filename can be searched to get its creation time, last write time, last successful stat and deletion time in case of ‘lost’ files. Possible graduate side project. Ian plans to extend the database schema from the deletion tests (capturing file write completions and deletions) into a more general event schema. |
| 100 GbE Gateway testing | @James Walder @Thomas, Jyothish (STFC,RAL,SC) | UKSRC: acting as source for SRCNet verification tests; not being stressed so far … Tier-1. |
| UKSRC Storage Architecture | | Tom B. is working on the CephAdm setup for the cluster. JW is attempting to reinstall the hosts. |
| Tokens Status | | |
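
A rough sketch of what the general event schema for the file query system could look like, for discussion only. It assumes a single SQLite table with one row per logged event and ISO 8601 UTC timestamps (so that string comparison orders them correctly). The table name, column names, event types, gateway names and file paths are all hypothetical; nothing here reflects the existing deletion-test schema.

```python
import sqlite3

# Hypothetical schema: one row per event seen in a gateway log.
SCHEMA = """
CREATE TABLE IF NOT EXISTS file_events (
    id       INTEGER PRIMARY KEY,
    filename TEXT NOT NULL,
    gateway  TEXT NOT NULL,
    event    TEXT NOT NULL CHECK (event IN ('create', 'write', 'stat', 'delete')),
    ok       INTEGER NOT NULL DEFAULT 1,   -- 1 = operation succeeded
    ts       TEXT NOT NULL                 -- ISO 8601 UTC timestamp
);
CREATE INDEX IF NOT EXISTS idx_file_events_name ON file_events (filename, event);
"""


def open_db(path=":memory:"):
    """Open (or create) the event database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_event(conn, filename, gateway, event, ts, ok=True):
    """Store one parsed log event."""
    conn.execute(
        "INSERT INTO file_events (filename, gateway, event, ok, ts) "
        "VALUES (?, ?, ?, ?, ?)",
        (filename, gateway, event, int(ok), ts),
    )


def file_summary(conn, filename):
    """Return creation, last write, last successful stat and deletion times."""
    row = conn.execute(
        """
        SELECT
            MIN(CASE WHEN event = 'create' THEN ts END),
            MAX(CASE WHEN event = 'write' THEN ts END),
            MAX(CASE WHEN event = 'stat' AND ok = 1 THEN ts END),
            MAX(CASE WHEN event = 'delete' THEN ts END)
        FROM file_events
        WHERE filename = ?
        """,
        (filename,),
    ).fetchone()
    keys = ("created", "last_write", "last_successful_stat", "deleted")
    return dict(zip(keys, row))
```

Calling `file_summary(conn, name)` for a file with no recorded events returns `None` for every field, which may be a useful signal in itself when chasing a ‘lost’ file.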
on GGUS:
Site reports
Lancaster:
So we found out the hard way that running Ceph with 25% of your servers on 1Gb NICs just doesn’t work for any load of any significance. Luckily the replacement 25Gb NICs have started arriving.
Glasgow -
Action items