2024-01-25 Meeting Notes

 Date

Jan 25, 2024

 Participants

  • @Thomas, Jyothish (STFC,RAL,SC)

  • @Alexander Rogovskiy

  • @Thomas Byrne

  • Lancs: Steven, Gerard, Matt

  • Glasgow: Sam

Apologies:

CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 

 

Operational Issues

@Thomas, Jyothish (STFC,RAL,SC)

Packet loss on perfSONAR?

 

 

 

Gateways and WNs:
- Current status and upcoming changes

@Thomas, Jyothish (STFC,RAL,SC)

stable status currently

  • tokens have been deployed for CMS/ATLAS (with an additional patch restricting scope matching: foo / foobar rejection)

  • checksum library

  • prefetch off on WNs
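The "foo / foobar rejection" item can be illustrated with a minimal sketch, assuming it refers to token scope paths being matched per path component rather than as raw string prefixes. The function name `scope_allows` and the example paths are hypothetical and are not taken from the deployed patch.

```python
from pathlib import PurePosixPath


def scope_allows(scope_path: str, request_path: str) -> bool:
    """Return True if request_path is scope_path itself or lies beneath it.

    Paths are compared component by component, so a scope of /foo
    does not authorise /foobar. Note: ".." segments are not resolved here.
    """
    scope_parts = PurePosixPath(scope_path).parts
    request_parts = PurePosixPath(request_path).parts
    return request_parts[: len(scope_parts)] == scope_parts


# A naive string-prefix check would wrongly accept /foobar for a /foo scope.
naive_result = "/foobar/file".startswith("/foo")       # True (too permissive)
strict_result = scope_allows("/foo", "/foobar/file")   # False (rejected)
inside_result = scope_allows("/foo", "/foo/file")      # True (allowed)
```

Comparing whole path components is one simple way to avoid the prefix ambiguity; the actual patch may take a different approach.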

Holding off on installing 5.6.4: before the break, one set of tests (TPC transfers) was failing against another site. The tests are to be repeated to see whether this still occurs.

Rocky 8 for the Gateways (@Thomas, Jyothish (STFC,RAL,SC) working on an initial setup).

 

bugfix for calculating striper objects in direct reads

 

https://github.com/stfc/xrootd-ceph/pull/50

passed test on gw8 and code reviewed

 

ECHO File transfer / throughput studies

@Katy Ellis

Tests of per-file transfer writes into Echo.
A new Jira is set up to track these changes: https://stfc.atlassian.net/browse/XRD-80
Updates presented at Liaison meeting yesterday.
Preliminary results from iperf3 testing:

iperfcomp.png

tests ongoing on svc20 and gw8
Some results are summarized in the slides here:

 

Checksums fixes

@Alexander Rogovskiy

Status and plans for improving Checksumming work …

https://github.com/alex-rg/xrd_ckslib/tree/main
https://stfc.atlassian.net/browse/XRD-56

(Sandbox prepared and applied to GW8)

 

Prefetch studies and WN changes

@Alexander Rogovskiy

Sandbox ready and applied to 1 WN, pending an environment variable for the timeout increase

 

Deletion studies through RDR

@Ian Johnson

Continuing, with mixed results:

previous set was 500 files

5000 files could not be uploaded; the upload had not completed after 20+ hrs (seems to have been a bad time, last Tuesday)

100 x 1 GB deletion in 5 s

check with Alessandra on rucio deletion concurrency (for DC24)

ceph is performing better at the moment
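For reference when discussing deletion concurrency, the 100 x 1 GB in 5 s figure above converts to rates as in the short sketch below. The helper name `deletion_rates` is hypothetical, and the result is plain arithmetic on the quoted numbers, not an additional measurement.

```python
def deletion_rates(n_files: int, file_size_gb: float, elapsed_s: float):
    """Convert a deletion test result into files/s and GB/s freed."""
    files_per_s = n_files / elapsed_s
    gb_per_s = n_files * file_size_gb / elapsed_s
    return files_per_s, gb_per_s


# Figures quoted in the notes: 100 files of 1 GB each deleted in 5 s.
files_per_s, gb_per_s = deletion_rates(100, 1.0, 5.0)
print(f"{files_per_s:.1f} files/s, {gb_per_s:.1f} GB/s freed")
# prints "20.0 files/s, 20.0 GB/s freed"
```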

 

Tokens testing

@Thomas, Jyothish (STFC,RAL,SC) @Katy Ellis

https://stfc.atlassian.net/browse/XRD-63
https://stfc.atlassian.net/browse/XRD-78
https://github.com/xrootd/xrootd/pull/2152

https://github.com/xrootd/xrootd/pull/2151/files

 

Understanding CMSD load balancing

@Thomas Byrne

explore different load balancing scheme (weighted placement)

Open questions: test in an internal cluster? How to measure improvements? A more heavily instrumented build of the current version would be needed to measure any improvement.

Things look roughly OK at the moment, so this is lower in priority.
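As a starting point for the "how to measure" question, the sketch below simulates a generic weighted placement and counts how requests spread across servers. It is a toy model under stated assumptions, not a description of how cmsd selects servers; the gateway names and weights are hypothetical.

```python
import random
from collections import Counter


def pick_gateway(weights: dict, rng: random.Random) -> str:
    """Pick one gateway at random, with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical weights: gw-b is given twice the share of the other two.
weights = {"gw-a": 1.0, "gw-b": 2.0, "gw-c": 1.0}
rng = random.Random(42)  # fixed seed so the simulation is repeatable

counts = Counter(pick_gateway(weights, rng) for _ in range(10_000))
total = sum(counts.values())
for name in weights:
    print(f"{name}: {counts[name] / total:.1%} of placements")
```

Comparing the observed shares against the intended weights gives one simple imbalance metric that could also be applied to counts taken from real gateway logs.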

 

SKA Gateway box

@James Walder

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

 

Architectural review ‘hackathon’

All

Plan the process for the Architectural planning of XRootD across the External Gateways and WNs

 

2024 Planning

 

JW to prepare a summary of the plans for 2024

 

 

on GGUS:

Site reports

Lancaster - Unbalanced redirector issues from last week just disappeared… Currently disabled tokens for the reasons.

Glasgow - relatively stable, few network issues. OS/Ceph version update to do.

 

 

 

 Action items

  • @James Walder to schedule a ‘hackathon’ within a F2F to have a session on architectural planning.

  • @James Walder to prepare an outline of the expected roadmap for XRootD developments in 2024.

 

 Decisions