2024-10-31 Meeting Notes

 Date

Oct 31, 2024

 Participants

 

  •  

  • Lancs:

  • Glasgow:

Apologies:

  •  

  •  

  •  

CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 


 

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

(Gateway Auth failures)

@Thomas, Jyothish (STFC,RAL,SC)

 

CHEP week had a couple of callouts (self-resolving). Some restarts were performed this week.

 

cms-aaa naming convention

 

cms-aaa is the only remaining personality that uses proxy/ceph as the xrootd service names.


A separate naming convention (main/supporting) would be more appropriate.

(Not urgent.)

 

 

Compilation and rollout status with XrdCeph and Rocky 8: 5.7.x

@Thomas, Jyothish (STFC,RAL,SC)

Done (related: discussion on the upstream merger of XrdCeph).

 

HTCondor

 

Pelican: wizard-like web UI for configuration

Tokens: non-WLCG tokens use different token profiles/protocols

 

 

Shoveler

@Katy Ellis

Shoveler installation and monitoring

 

 

Deletion studies through RDR

@Ian Johnson

 

 

 

Deletions

https://stfc.atlassian.net/browse/XRD-83

Can now use time bins to count total and/or slow deletions over an interval.

Need to modify log file extraction code to cover unlink after removing xattr (missed before, but low incidence)

Will apply slow deletion query to DC24 ‘busy times’ for specific or all VOs. Need to check queries and merge gateway log files.
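
The time-bin counting described above could look roughly like the sketch below. It is illustrative only: the event format (a timestamp plus a deletion duration in seconds), the 5-minute bin width and the 10-second "slow" threshold are all assumptions, not the values or code used in the actual study.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical threshold: a deletion taking at least this long counts as "slow".
SLOW_THRESHOLD_S = 10.0

EPOCH = datetime(1970, 1, 1)


def bin_start(ts, bin_width):
    """Return the start of the time bin (aligned to the epoch) containing ts."""
    n_bins = (ts - EPOCH) // bin_width  # timedelta // timedelta gives an int
    return EPOCH + n_bins * bin_width


def count_deletions(events, bin_width=timedelta(minutes=5),
                    slow_threshold_s=SLOW_THRESHOLD_S):
    """Count total and slow deletions per time bin.

    events: iterable of (timestamp, duration_seconds) tuples, assumed to have
    been extracted from the gateway logs already.
    Returns {bin_start: (total, slow)} ordered by bin start.
    """
    total = Counter()
    slow = Counter()
    for ts, duration in events:
        b = bin_start(ts, bin_width)
        total[b] += 1
        if duration >= slow_threshold_s:
            slow[b] += 1
    return {b: (total[b], slow[b]) for b in sorted(total)}
```

Restricting the input events to a given VO or to the DC24 busy periods before calling `count_deletions` would give the per-VO or per-interval view mentioned above.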

 

WN changes

@Alexander Rogovskiy

The fix for "Read requests fail for proxy+origin setup under heavy load" (Issue #2308, xrootd/xrootd) is now in 5.7.1 (and patched 5.7.0).
To be deployed over the first 2 weeks of October.

(To confirm that this was rolled out - and working ok)

 

Periodic hackathon?

 

group coding/testing for specific tasks

 

Xrootd testing framework

 

XRootD Site Testing Framework

Testing VM to be set up and used for preprod testing

Rob C - working on kubernetes deployment component

Unit tests for xrootd (added since 5.7.0)

ctest -VV -C Release -DCDASH=1 -DCOVERAGE=1 -S test.cmake
The CDASH option sends the test results to the XRootD CDash server https://my.cdash.org/index.php?project=XRootD
run as non root
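
A minimal wrapper around the ctest invocation above might look like the following. It is a sketch under assumptions: the helper names (`ensure_not_root`, `run_tests`) and the `source_dir` argument are hypothetical, and the command itself is copied unchanged from the notes.

```python
import os
import subprocess

# Command taken verbatim from the notes; run from the XRootD source directory.
CTEST_CMD = [
    "ctest", "-VV", "-C", "Release",
    "-DCDASH=1", "-DCOVERAGE=1",
    "-S", "test.cmake",
]


def ensure_not_root(euid):
    """Raise if the given effective UID is root, since the tests should run as non-root."""
    if euid == 0:
        raise PermissionError("run the XRootD tests as a non-root user")


def run_tests(source_dir):
    """Run the ctest script in source_dir (hypothetical helper, POSIX only)."""
    ensure_not_root(os.geteuid())
    return subprocess.run(CTEST_CMD, cwd=source_dir, check=True)
```

Dropping `-DCDASH=1` from `CTEST_CMD` would presumably keep the results local instead of submitting them to the CDash server, but that behaviour depends on test.cmake and has not been checked here.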

Pull requests · stfc/xrootd-testing-framework

 

XrootD gateway specs

 

100Gb NICs
Current usage:
- 25Gb NIC
- memory use of xrootd process ~25-35GB
- context switches ~800kHz
- CPU load ~10-20

Adding NICs to current gateways?
Tom B: bonded NICs were hard to set up and operate, especially at scale.

Keeping networking simple is preferred.

Adding 100G NICs to the current gateways might overload the switches.

 

 

 

SKA Gateway box

@James Walder

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

 

 

Future development ideas and planning work

@Ian Johnson @Thomas, Jyothish (STFC,RAL,SC)

Notes from planning meeting 22-04-2024

 

Tokens Status

 

To split this into Operational aspects and any development / long-term planning aspects.

The technical implementation to accept tokens is in place; the issues seem to be due to VO use cases on scheduling and accounting.

Accounting: in the pipeline for APEL, but not coming soon.

 

Checksums improvements

@Alexander Rogovskiy @Thomas, Jyothish (STFC,RAL,SC)

There is a GitHub issue open to merge this upstream: "Make stale checksum check optional for ceph storage endpoints" (Issue #2338, xrootd/xrootd).

streaming checksums

 

 

 

 

on GGUS:

Site reports

Lancaster: Our CHEP talk was well received. We have been having a rough time with Ceph over the last couple of weeks, which required forcing a switch between the MDS servers. When things settle down and we don’t have CHEP write-ups to do, we’re whimsically thinking of taking a peek at the xrootd-s3 stuff.

 

 Action items

 

  •  

  •  

 

 Decisions