2024-04-11 Meeting Notes

 Date

Apr 11, 2024

 Participants

  • @Thomas, Jyothish (STFC,RAL,SC)

  • @Thomas Byrne

  • @Alastair Dewhurst

  • @Alexander Rogovskiy

  • Mariam Demir

  • Lancs: Gerard, Steven, Matt

  • Glasgow: Sam

  • @Ian Johnson

Apologies:

CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 

 

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

@Thomas, Jyothish (STFC,RAL,SC)

Week of 19th March: moving gateways to the new network.

gw10,12,13,15 moved

cms-aaa had a brief issue with incorrect CA chains for x509

gw11,14,16,8 moving

 

XrootD Workshop plan

@Alastair Dewhurst

 

 

Further observations from Data Challenge

Future testing plans (RAL initiated / VO initiated) ?

All

 

 

Rocky 8 and 9 migration planning

 

gridftp to be decommissioned

 

Deletion studies through RDR

@Ian Johnson

 

Multi-threaded deletions with known file sizes are temporarily suspended in favour of work on DC24 deletion times and LHCb dark file deletions. Will resume, recording deletion times from both the client side and the server side to give an indication of the “overhead” in the server.
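As a rough illustration of the client-side versus server-side comparison described above, the sketch below pairs the two durations per file and reports the difference as the “overhead”. The record layout (a mapping of file path to duration in seconds) and the sample values are assumptions made for illustration; they are not the format of the actual client or gateway logs.

```python
from statistics import mean


def deletion_overhead(client_times, server_times):
    """Return per-file overhead: client duration minus server duration (seconds).

    Both arguments map a file path to a deletion duration in seconds
    (hypothetical layout). Only files present in both mappings are compared.
    """
    common = client_times.keys() & server_times.keys()
    return {path: client_times[path] - server_times[path] for path in common}


# Made-up sample values, for illustration only.
client = {"/a": 0.20, "/b": 0.35, "/c": 0.50}
server = {"/a": 0.15, "/b": 0.25}

overhead = deletion_overhead(client, server)
print(f"files compared: {len(overhead)}")
print(f"mean overhead: {mean(overhead.values()):.3f} s")
```

Files seen on only one side (here "/c") are dropped rather than guessed at, so the mean reflects matched records only.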

 

Deletions

https://stfc.atlassian.net/browse/XRD-83

load balancing algorithm seems to have improved this

Deletion times during DC24 - combining gateway logs for analysis of file size vs deletion time.

After DC24 finished: LHCb dark file deletions. Recorded file size and deletion times for 158589 files, typical rate ~6 Hz, via a single-threaded client.

Not started: hijack the “rados bench” cleanup routine to work from a list of files, rather than its auto-generated sequence.
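For a sense of scale, the figures quoted above (158589 files at a typical ~6 Hz) can be turned into an approximate wall-clock duration. The sketch below also shows one way a mean rate could be derived from a list of deletion completion timestamps; the function names and the timestamp input are assumptions for illustration, not part of the actual analysis.

```python
def deletion_rate_hz(completion_times):
    """Mean deletion rate (files per second) from completion timestamps in seconds."""
    if len(completion_times) < 2:
        raise ValueError("need at least two timestamps")
    span = max(completion_times) - min(completion_times)
    if span <= 0:
        raise ValueError("timestamps must span a positive interval")
    # N timestamps delimit N - 1 intervals.
    return (len(completion_times) - 1) / span


def total_duration_hours(n_files, rate_hz):
    """Time needed to delete n_files at a constant rate, in hours."""
    return n_files / rate_hz / 3600.0


# Figures quoted in the notes: 158589 files, typical rate ~6 Hz.
print(f"{total_duration_hours(158589, 6.0):.1f} h")
```

At a constant 6 Hz this comes to roughly 7.3 hours for the single-threaded client, which gives a baseline to compare against once the multi-threaded tests resume.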

 

Planning for ALICE CMSD redirection

@Thomas, Jyothish (STFC,RAL,SC)

INC-163994 - DNS ip additions

 

Checksums fixes

@Alexander Rogovskiy @Thomas, Jyothish (STFC,RAL,SC)

Noscript checksum by Jo-stfc · Pull Request #9 · stfc/xrootd

deployed, lhcb ticket closed

 

Prefetch studies and WN changes

@Alexander Rogovskiy

 

 

Tokens Status

@Thomas, Jyothish (STFC,RAL,SC) @Katy Ellis

Looking into tokens through redirection (i.e. redacting the release of a token).
TPC can still see tokens in certain cases.

 

CMSD Load balancing

@Thomas Byrne @Thomas, Jyothish (STFC,RAL,SC)

PR:
revised load balancing algorithm - weighed random selection by Jo-stfc · Pull Request #8 · stfc/xrootd
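For readers unfamiliar with the general idea, weighted random selection can be sketched as below. This is not the code from the pull request: the load metric, the weight formula (remaining headroom, floored at 1) and the server names are assumptions chosen purely for illustration.

```python
import random


def pick_server(loads, rng=random):
    """Pick a server name at random, favouring lightly loaded servers.

    loads maps a server name to a load percentage in [0, 100]
    (hypothetical metric). The weight is the remaining headroom
    (100 - load), floored at 1 so a busy server can still be chosen.
    """
    names = list(loads)
    weights = [max(1, 100 - loads[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Made-up loads for three hypothetical servers.
rng = random.Random(42)
loads = {"gw-a": 20, "gw-b": 50, "gw-c": 90}
counts = {name: 0 for name in loads}
for _ in range(10_000):
    counts[pick_server(loads, rng)] += 1
print(counts)
```

Compared with always choosing the least-loaded server, a randomised choice spreads simultaneous requests across several servers instead of sending them all to the same one.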


 

 

SKA Gateway box

@James Walder

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

4 Nodes awaiting installation:

2 for Exit pod (+ 1 existing)
1 for cloud
1 for Tier-1 usage

 

5.6.x root TPC issue

 

root:// TPC transfer fail with xrootd 5.6.x · Issue #2202 · xrootd/xrootd

Issue found to be on the dCache side; a mitigation is possible in the xrootd config by using the old md algorithm.

 

 

on GGUS:

Site reports

Lancaster: Smooth running over Easter. Gearing up for an update to the “top of Pacific”. Steven noticed a correlation between drive errors and drive manufacturer; it is unclear whether this is just one type of drive doing a better job of reporting.

(Hard to see, but almost all the orange blobs are TOSHIBA drives; this data is pulled from SMART.)

Tom B - reported metrics might mean different things

image-20240411-120113.png

 

Glasgow: Sam built newer versions of xrootd on CC7 and 8 (anticipating 5.6.9), and is awaiting any updates for LB.

 

 

 

 Action items

  • @James Walder to schedule a ‘hackathon’ within a F2F to have a session on architectural planning.

  • @James Walder to prepare an outline of the expected roadmap for XRootD developments in 2024.


 

 Decisions