2024-11-07 Meeting Notes

 Date

Nov 7, 2024

 Participants

 

  • @James Walder

  • @Katy Ellis

  • @Thomas Byrne

  • @Thomas, Jyothish (STFC,RAL,SC)

  • Lancs: Matt, Stephen, Gerard

  • Glasgow: Sam

Apologies:

  •  

CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 

 

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

(Gateway Auth failures)

@Thomas, Jyothish (STFC,RAL,SC)

 

Quiet week.

S3 RADOS Gateway slowdowns during upgrades? (OpenSearch S3 monitoring)

 

image-20241107-130356.png

 

 

cms-aaa naming convention

 

cms-aaa is the only remaining personality that uses proxy/ceph as its XRootD service names.

A separate naming convention, such as main/supporting, would be more appropriate (not urgent).

 

 

Compilation and rollout status with XrdCeph and Rocky 8: 5.7.x

@Thomas, Jyothish (STFC,RAL,SC)

Upstream merging in progress. Branch now exists.

Documentation is needed (particularly for the Buffered IO).

 

Shoveler

@Katy Ellis

Shoveler installation and monitoring

WN Shoveler is proving odd.
Discussion on the data produced.
Is this in a position to replace the RAL fstream monitoring?

 

Deletion studies through RDR

@Ian Johnson

 

 

 

Deletions

https://stfc.atlassian.net/browse/XRD-83

Can now use time bins to count total and/or slow deletions over an interval.

Need to modify the log file extraction code to cover unlink after removing xattr (missed before, but low incidence).

Will apply slow deletion query to DC24 ‘busy times’ for specific or all VOs. Need to check queries and merge gateway log files.
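
As an illustration of the time-bin counting described above, here is a minimal sketch. It is not the code tracked in XRD-83: the record format (timestamp, duration in seconds), the 10-minute bin width, the 5-second "slow" threshold and the sample records are all assumptions made up for this example.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical threshold: deletions slower than this count as "slow".
SLOW_THRESHOLD_S = 5.0

EPOCH = datetime(1970, 1, 1)


def bin_start(ts, width):
    """Round a (naive) timestamp down to the start of its time bin."""
    n_bins = (ts - EPOCH) // width  # timedelta // timedelta gives an int
    return EPOCH + n_bins * width


def count_deletions(records, width=timedelta(minutes=10), slow_s=SLOW_THRESHOLD_S):
    """Return {bin_start: (total, slow)} for (timestamp, duration_s) records."""
    total = Counter()
    slow = Counter()
    for ts, duration_s in records:
        b = bin_start(ts, width)
        total[b] += 1
        if duration_s > slow_s:
            slow[b] += 1
    return {b: (total[b], slow[b]) for b in sorted(total)}


# Made-up sample records, standing in for entries extracted from gateway logs.
records = [
    (datetime(2024, 2, 14, 12, 1, 0), 0.4),
    (datetime(2024, 2, 14, 12, 7, 30), 12.0),
    (datetime(2024, 2, 14, 12, 15, 0), 6.5),
]

counts = count_deletions(records)
for start, (n_total, n_slow) in counts.items():
    print(start.isoformat(), "total:", n_total, "slow:", n_slow)
```

Restricting the count to one VO or to a 'busy time' window would be a matter of filtering the records before binning; merged gateway logs would simply contribute more records.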

 

Periodic hackathon?

@Thomas, Jyothish (STFC,RAL,SC)

XRootD Writable Workernode Gateway Hackathon (XWWGH)

Tues 12th Nov 1600

 

XRootD testing framework

 

XRootD Site Testing Framework

 

 

100 GbE Gateway testing:
SKA / Tier-1

@James Walder

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

Following up on server installs

 

UKSRC Storage Architecture

 

For v0.1 Requirements:

POSIX-like access to be provided ‘next to’ the compute.
Via ‘some method’, files / directories are mounted (RO) for applications (e.g. JupyterHub) to read from.

The POSIX area ‘should be’ an RSE, to enable the transfers and lifecycle management.
Bulk storage ‘may’ exist, requiring a TPC into the ‘cache’ area.

Disadvantages:
- Unnecessary data movement perhaps (and via TPC)
- Mounts and permissions

Other ideas:

Manila-style shares / volumes for each DID / container requested?
- Rucio ‘download’ rather than TPC
- Lifecycle management on the Manila share layer?

 

Tokens Status

 

  • Operational

  • Technical

  • Accounting

 

 

 

 

 

on GGUS:

Site reports

Lancaster: Still having a rough time with the Ceph MDS servers; we are on our third switchover in as many weeks. Gerard found some possible bugs, and the workaround is… what we’ve been doing…

 

 Action items

How to replace the original functionality of fstream monitoring, now that OpenSearch has replaced the existing solutions.

 


 

 Decisions