2024-04-25 Meeting Notes

 Date

Apr 25, 2024

 Participants

  • @Thomas, Jyothish (STFC,RAL,SC)

  • @Ian Johnson

  • @Alastair Dewhurst

  • @Mariam Demir

  • @James Walder

  • Lancs: Gerard, Steven, Matt

  • Glasgow: Sam

Apologies:

 

CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 


 

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

@Thomas, Jyothish (STFC,RAL,SC)

2024-04-23 LHCb WGprod Echo overload

More memory needed on the XCache - talk with Tom Birkett

Theoretical limits of the cluster: ~500 GB/s, ~100k IOPS

Actual IOPS are inflated by Ceph overheads (RocksDB, EC write/read)

SSD use for storage survey - check wear level/lifetime of SSDs

dell-19s are SSDs used in batch farm

Spinning disks have lower IOPS than SSDs, so XCache buffering is vital for the future

 

XrootD Workshop plan

@Alastair Dewhurst

TBA, registration payment page feedback welcome

 

Rocky 8 and 9 migration planning

 

GridFTP to be decommissioned (CMS notified); 4 gateways to be introduced in production

 

Future development ideas and planning work

@Ian Johnson @Thomas, Jyothish (STFC,RAL,SC)

Notes from planning meeting 22-04-2024

 

Deletion studies through RDR

@Ian Johnson

 

Tidying awk/sqlite scripts to process logfile data, e.g. from DC24.

 

Deletions

https://stfc.atlassian.net/browse/XRD-83

Looking into the “rados bencher” clean_up routine, which fires off several async rm calls, and comparing it with XrdCeph unlink, which calls striper::remove. The latter waits for the async rm to complete, so this blocking may be the cause of RAL’s insufficient deletion rates.

Open question: is the deletion fully parallel?

100 TB CASTOR migration dataset available for deletion
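
The difference between waiting on each removal and keeping several removals in flight can be illustrated with a small simulation. This is only a sketch under stated assumptions: FakeStore, its latency value and the in_flight parameter are hypothetical stand-ins, and nothing here calls librados, libradosstriper or XrdCeph.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait


class FakeStore:
    """Hypothetical in-memory object store; not a Ceph/RADOS API."""

    def __init__(self, names, latency=0.005):
        self._objects = set(names)
        self._lock = threading.Lock()
        self._latency = latency

    def remove(self, name):
        time.sleep(self._latency)  # simulated round trip for one removal
        with self._lock:
            self._objects.discard(name)

    def __len__(self):
        with self._lock:
            return len(self._objects)


def delete_blocking(store, names):
    """Issue one removal and wait for it before starting the next."""
    for name in names:
        store.remove(name)


def delete_batched(store, names, in_flight=16):
    """Keep up to `in_flight` removals running at once, then wait for all."""
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        futures = [pool.submit(store.remove, name) for name in names]
        wait(futures)


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    names = [f"obj-{i}" for i in range(64)]
    print("blocking:", timed(delete_blocking, FakeStore(names), names))
    print("batched: ", timed(delete_batched, FakeStore(names), names))
```

With a fixed per-removal latency, the blocking loop scales with the number of objects, while the batched version scales roughly with the number of objects divided by in_flight. Whether the real unlink path behaves like the blocking loop is the question under investigation in XRD-83.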

 

Planning for ALICE CMSD redirection

@Thomas, Jyothish (STFC,RAL,SC)

 

 

Checksums fixes

@Alexander Rogovskiy @Thomas, Jyothish (STFC,RAL,SC)

Done. Test in the batch farm?

 

Prefetch studies and WN changes

@Alexander Rogovskiy

 

 

Tokens Status

@Thomas, Jyothish (STFC,RAL,SC) @Katy Ellis

 

 

CMSD Load balancing

@Thomas Byrne @Thomas, Jyothish (STFC,RAL,SC)

PR:
revised load balancing algorithm - weighed random selection by Jo-stfc · Pull Request #8 · stfc/xrootd
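
For readers unfamiliar with the idea, weighted random selection can be sketched as below. This is an illustration of the general technique only, not the code in the PR: the pick_server function, the server names and the choice of weight (100 minus a load percentage) are all assumptions made for the example.

```python
import random
from collections import Counter


def pick_server(loads, rng=random):
    """Pick a server with probability proportional to its spare capacity.

    `loads` maps a hypothetical server name to a load percentage (0-100).
    """
    names = list(loads)
    weights = [max(100 - loads[name], 0) for name in names]
    if sum(weights) == 0:
        # Every server is saturated: fall back to a uniform choice.
        return rng.choice(names)
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    loads = {"gw1": 10, "gw2": 50, "gw3": 100}  # illustrative values
    rng = random.Random(0)
    print(Counter(pick_server(loads, rng) for _ in range(10000)))
```

Compared with always picking the least loaded server, a weighted draw spreads new requests across all servers with spare capacity, which avoids sending every client to the same node between load updates.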


 

 

SKA Gateway box

@James Walder

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

4 Nodes awaiting installation:

2 for Exit pod (+ 1 existing)
1 for cloud
1 for Tier-1 usage

 

 

 

 

 

Xrootd testing framework

@Mariam Demir

 

 

 

on GGUS:

Site reports

Lancaster: Ceph seems fairly happy at the moment; Gerard is investigating some issues with scrubbing. XRootD: a set of LSST functional tests is failing with permission denied while making their directories. The xroot logs are worse than useless for debugging auth issues. There shouldn’t be any access issues, and access works for me and Tim. However, it reminded me of issues Glasgow had with LHCb (mkdir over http getting permission denied). Any tips for teasing out more information on xroot auth decisions?

Jyothish: LHCb functional tests needed additional entries in the authdb to stat root folders, e.g. lhcb:user rl in addition to lhcb:user/ a

Similarly, is there any good way of logging deletions server side?

Logging needs improvement across XRootD: one event per line, machine parsable, less clutter, uniform format. Possibly bring this up at the workshop?
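
As a discussion aid for the logging point, the "one event, one line, machine parsable" idea could look something like the sketch below. It is not an existing XRootD log format: the format_event function and every field name (ts, event, path, user, result) are hypothetical.

```python
import json
from datetime import datetime, timezone


def format_event(event, **fields):
    """Render one event as a single JSON line with a uniform leading layout."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event,
    }
    record.update(fields)
    # json.dumps escapes embedded newlines, so the result is always one line.
    return json.dumps(record, separators=(",", ":"))


if __name__ == "__main__":
    line = format_event("unlink", path="/lhcb/user/example", user="alice",
                        result="denied")
    print(line)
    print(json.loads(line)["result"])
```

A format along these lines would let deletions or auth decisions be filtered with standard tools, since each line parses on its own without needing context from neighbouring lines.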

 

 

Glasgow: Sam built newer versions of XRootD on CC7 and 8 (anticipating 5.6.9), and is awaiting any updates for LB.

GitHub - stfc/xrootd-ceph at variableobjectcleanup

[XrdCms] add a tunable weighed random load balancing algorithm by Jo-stfc · Pull Request #2246 · xrootd/xrootd

 

 

 Action items

  • @James Walder to schedule a ‘hackathon’ within a F2F to have a session on architectural planning.

  • @James Walder to prepare an outline of the expected roadmap for XRootD developments in 2024.


 

 Decisions