2024-10-03 Meeting Notes

 Date

Oct 3, 2024

 Participants

 

  • @Thomas, Jyothish (STFC,RAL,SC)

  • @Alexander Rogovskiy

  • @Thomas Byrne

  • @Alastair Dewhurst

  • Lancs: Matt

  • Glasgow:

Apologies:

  • @James Walder


CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 

 

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

(Gateway Auth failures)

@Thomas, Jyothish (STFC,RAL,SC)

 

Saturation on the external gateways on Friday 27/09.

All common gateways are on 5.7.1.

The '23 gws network issue might have been due to the network advertisement priority to CERN: the legacy network had the higher priority.

 

cms-aaa naming convention

 

cms-aaa is the only remaining personality to use proxy/ceph as the xrootd service names:

cms proxy cfg -> unified.cfg

ceph -> tcp.cfg

A separate naming convention would be more appropriate, e.g. main/supporting?

 

 

 

Compilation and rollout status with XrdCeph and rocky 8: 5.7.x

@Thomas, Jyothish (STFC,RAL,SC)

All (external gateways) upgraded to 5.7.1

Batch is still on 5.5.4; to be upgraded to 5.7.1 (with the proxy cache patch) when Tom Birkett has time.

5.7.1 will have FD and memory fixes, as well as improvements for XCache.

 

XrootD Workshop plan

@Alastair Dewhurst

@Katy Ellis

https://indico.cern.ch/event/1386888/overview

discussed at the storage meeting

Xrootd 6 is coming

XrdCeph is being merged back into core XRootD; this is planned for XRootD 6.


 

HTCondor

 

Pelican: 'wizard'-like web UI for configuration

Tokens: non-WLCG tokens use different token profiles/protocols
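
As a rough illustration of what "different token profiles" can mean in practice, the sketch below inspects the claims of a JWT and guesses its profile. It assumes the WLCG profile is marked by a `wlcg.ver` claim and SciTokens by a `ver` claim beginning with `scitoken:`; the function names are hypothetical, and the signature is deliberately not verified, so this is for inspection only and not for authorisation decisions.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the payload of a compact JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT (expected three dot-separated parts)")
    payload = parts[1]
    # base64url padding is stripped in JWTs; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def guess_token_profile(claims):
    """Guess the token profile from its claims (assumed markers, see lead-in)."""
    if "wlcg.ver" in claims:
        return "wlcg"
    if str(claims.get("ver", "")).startswith("scitoken:"):
        return "scitokens"
    return "unknown"
```

A gateway-side check would still need full signature and issuer validation; this only shows why a token from a non-WLCG issuer may not carry the claims a WLCG-oriented configuration expects.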

 

 

Shoveler

@Katy Ellis

All the batch farm WNs connected fine.

The shoveler is meant to run on the same host as the gateway to avoid UDP packet loss.

 

Deletion studies through RDR

@Ian Johnson

 

 

 

Deletions

https://stfc.atlassian.net/browse/XRD-83

RAL deletions are currently within the allowed times for the ATLAS tests.

 

 

 

WN changes

@Alexander Rogovskiy

The fix for https://github.com/xrootd/xrootd/issues/2308 is now in 5.7.1 (and in the patched 5.7.0).
To be deployed over the first two weeks of October.

 

Periodic hackathon?

 

Group coding/testing for specific tasks

 

Xrootd testing framework

 

A testing VM is to be set up and used for preprod testing.

Rob C - working on kubernetes deployment component

Unit tests for xrootd (added since 5.7.0)

ctest -VV -C Release -DCDASH=1 -DCOVERAGE=1 -S test.cmake
With CDASH=1 the test results are sent to the XRootD CDash server: https://my.cdash.org/index.php?project=XRootD
Run as a non-root user.
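
A minimal sketch of how the invocation above could be wrapped so that the non-root requirement is checked before anything runs. The helper names are hypothetical; the flags are copied from the command quoted in these notes, and the working directory is assumed to be wherever `test.cmake` lives.

```python
import os


def build_ctest_command(cdash=True, coverage=True, script="test.cmake"):
    """Assemble the ctest invocation quoted in the notes as an argv list."""
    cmd = ["ctest", "-VV", "-C", "Release"]
    if cdash:
        # Per the notes, CDASH=1 sends the results to the XRootD CDash server.
        cmd.append("-DCDASH=1")
    if coverage:
        cmd.append("-DCOVERAGE=1")
    cmd += ["-S", script]
    return cmd


def running_as_root():
    """True when the effective user is root (POSIX systems only)."""
    return hasattr(os, "geteuid") and os.geteuid() == 0


def checked_ctest_command(**kwargs):
    """Return the command, refusing to build it when running as root."""
    if running_as_root():
        raise RuntimeError("the XRootD tests should be run as a non-root user")
    return build_ctest_command(**kwargs)
```

The returned list can be passed to `subprocess.run(..., check=True)`; passing `cdash=False` gives a local-only run that does not submit to the dashboard.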

https://github.com/stfc/xrootd-testing-framework/pulls

 

XrootD gateway specs

 

100Gb NICs
Current usage:
- 25Gb NIC
- memory use of xrootd process ~25-35GB
- context switches ~800k/s
- CPU load ~10-20

Adding NICs to current gateways?
Tom B: bonded NICs were hard to set up and operate, especially at scale.

Keeping networking simple is preferred.

On adding 100G NICs to current gateways: the switches might get overloaded.

 

 

 

SKA Gateway box

@James Walder

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

 

 

Future development ideas and planning work

@Ian Johnson @Thomas, Jyothish (STFC,RAL,SC)

 

Tokens Status

 

To split this into Operational aspects and any development / long-term planning aspects.

The technical implementation to accept tokens is in place; the remaining issues seem to be due to VO use cases around scheduling and accounting.

Accounting: it’s in the pipeline on APEL but not coming soon.

 

Checksums improvements

@Alexander Rogovskiy @Thomas, Jyothish (STFC,RAL,SC)

There is a GitHub issue open to merge this upstream: https://github.com/xrootd/xrootd/issues/2338

streaming checksums
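
To illustrate the idea of a streaming checksum, the sketch below updates the checksum chunk by chunk while the data is being copied, so the file does not need to be read a second time afterwards. It assumes Adler-32 as the algorithm and uses a hypothetical helper name; the notes do not say which algorithm is used or how the upstream change is implemented.

```python
import io
import zlib


def copy_with_adler32(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks, updating an Adler-32 checksum on the fly.

    Returns the checksum as an 8-character lowercase hex string.
    """
    checksum = 1  # Adler-32 starting value
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Feed the running value back in so the result covers all chunks so far.
        checksum = zlib.adler32(chunk, checksum)
        dst.write(chunk)
    return f"{checksum & 0xFFFFFFFF:08x}"
```

Any pair of binary file-like objects works, for example `copy_with_adler32(open(path, "rb"), io.BytesIO())`; the result matches a one-shot `zlib.adler32` over the same bytes.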

 

 

 

 

on GGUS:

Site reports

Lancaster: Some issues with Reef, working on CHEP papers

 

 Action items

 


 

 Decisions