Apologies:

CC:

...

Item

Presenter

Notes

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

(Gateway Auth failures)

Thomas, Jyothish (STFC,RAL,SC) Relatively quiet (CMS running >30k cores), some network saturation (OPN).

Saturation on the external gateways was caused by above-average background load and multihop transfers to Antares. To investigate VO workflow provenance.

subset of gateways on 5.7.1

Compilation and rollout status with XrdCeph and rocky 8: 5.7.x

Thomas, Jyothish (STFC,RAL,SC)

All (external gateways) on 5.7.0 now, being upgraded to 5.7.1

Batch still on 5.5.4; to upgrade to 5.7.0 1 (with proxy cache patch) when Tom Birkett has time.

5.7.1 will have FD and memory fixes, as well as improvements for XCache.

XrootD Workshop plan

Alastair Dewhurst

Katy Ellis

https://indico.cern.ch/event/1386888/overview

One month to go!

  • Xrootd mgt lunch (weds)

  • Alastair Dewhurst sorting logistics with Cosners, remaining registrations, GridPP / UK → Alastair Dewhurst to send email re. CVent (missing details).

  • Alastair Dewhurst Taxi arrangements

  • Andy on Sunday wants to discuss last minute planning.

  • Workshop helpers on the day.

    • FTS (RC / TN) session chairing

discussed at the storage meeting

Xrootd 6 is coming

xrdceph is getting merged back into core xrootd

Shoveler

Katy Ellis

Shoveler installation and monitoring

All the batch farm WNs connected fine.

The shoveler is meant to run on the same host as the gateway to avoid UDP packet loss.

Deletion studies through RDR

Ian Johnson

Deletions

Jira: XRD-83

Looking at scaling of deletion rates with increasing numbers of clients. Starting with RADOS Striper direct, moving on to ROOT protocol for deletion (via xrdfs) and WebDAV.

Initial results:

image-20240905-121934.png

Prefetch studies and RAL deletions are within allowed times for ATLAS tests currently
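A minimal sketch of how the deletion-rate scaling measurement described above could be scripted for the xrdfs case. It is not the script used for the study: the function names are hypothetical, client concurrency is approximated with threads on a single host (the real tests may use separate client hosts), and the only external command assumed is `xrdfs <endpoint> rm <path>`.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def xrdfs_rm(endpoint, path):
    """Delete one file with `xrdfs <endpoint> rm <path>`; return True on success."""
    result = subprocess.run(
        ["xrdfs", endpoint, "rm", path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def measure_deletion_rate(paths, n_clients, delete_fn):
    """Delete every entry in `paths` using `n_clients` concurrent workers.

    `delete_fn` takes one path and returns True if the deletion succeeded.
    Returns (number_of_successful_deletions, successful_deletions_per_second).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        outcomes = list(pool.map(delete_fn, paths))
    elapsed = time.perf_counter() - start
    succeeded = sum(1 for ok in outcomes if ok)
    rate = succeeded / elapsed if elapsed > 0 else float("inf")
    return succeeded, rate
```

To scan client counts, call `measure_deletion_rate` once per value of `n_clients` with a fresh set of pre-created test files, passing for example `lambda p: xrdfs_rm(endpoint, p)` as `delete_fn`, where `endpoint` is a placeholder for the gateway address. Passing the delete function as an argument means the same harness could be reused for the WebDAV or RADOS Striper cases.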

WN changes

Alexander Rogovskiy

https://github.com/xrootd/xrootd/issues/2308 now in 5.7.1 (and patched 5.7.0).
To be deployed over first 2 weeks of October

Xrootd testing framework

XRootD Site Testing Framework

Testing VM to be set up and used for preprod testing.

Rob C - working on kubernetes deployment component

Unit tests for xrootd (added since 5.7.0)

ctest -VV -C Release -DCDASH=1 -DCOVERAGE=1 -S test.cmake
With CDASH=1, the test results are sent to the XRootD CDash server https://my.cdash.org/index.php?project=XRootD
Run as a non-root user.

https://github.com/stfc/xrootd-testing-framework/pulls

XrootD gateway specs

100Gb NICs
current usage
25Gb NIC
memory use of xrootd process ~25-35GB
context switch ~800kHz
CPU load ~10-20

SKA Gateway box

James Walder

/wiki/spaces/UK/pages/215941180

Future developments ideas planning work

Ian Johnson Thomas, Jyothish (STFC,RAL,SC)

https://stfc.atlassian.net/wiki/spaces/X/pages/459997229/Notes+from+planning+meeting+22-04-2024?atlOrigin=eyJpIjoiNDRmNDEwOWI3Y2NhNDg5MDg4ZmZiYTNhNTliOWUwNmUiLCJwIjoiYyJ9

Tokens Status

To split this into Operational aspects and any development / long-term planning aspects.

The technical implementation to accept tokens is in place; the remaining issues seem to be due to VO use cases around scheduling and accounting.

Checksums fixes

Alexander Rogovskiy Thomas, Jyothish (STFC,RAL,SC)

There is a GitHub issue open to merge this upstream: https://github.com/xrootd/xrootd/issues/2338

 

 

on GGUS:

Site reports

Lancaster: Updated to xrootd 5.7.1. Nothing melted. Steven notes our file handle usage since the update yesterday: using fewer file descriptors (~2/3 of the previous level).

image-20240905-121619.png
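For sites wanting to make the same before/after comparison of file descriptor usage, a rough sketch follows. It assumes a Linux host, where the open descriptors of a process are listed under /proc/<pid>/fd; the function names are hypothetical, and reading another user's process normally needs root or the same user account. It is not how the Lancaster plot was produced.

```python
import os


def count_open_fds(pid):
    """Return the number of open file descriptors for `pid` (Linux /proc only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_ratio(before, after):
    """Fraction of the earlier descriptor count still in use after an upgrade."""
    if before <= 0:
        raise ValueError("before must be a positive count")
    return after / before
```

Sampling `count_open_fds` for the xrootd process at intervals before and after an upgrade, then comparing typical values with `fd_ratio`, gives a figure comparable to the ~2/3 quoted above.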

Glasgow:

Ox - Stageout failures to Echo with timeouts https://bigpanda.cern.ch/job?pandaid=6338129522 with exact error
“Error description: pilot, 1151: File transfer timed out during stage-in: mc21_13p6TeV:EVNT.29070483._000105.pool.root.1 from RAL-LCG2-ECHO_DATADISK, copy command timed out: TimeoutException: Timeout reached, timeout=448 seconds')]:failed to transfer files using copytools=['rucio']”

image-20240919-124107.png

image-20240919-124421.png

✅ Action items

...