2025-01-09 Meeting Notes


 Date

Jan 9, 2025

 Participants

 


  • Lancs:

  • Glasgow:

Apologies:

 

CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 


 

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

 

 

Upgrades of GWs complete.

 

 

Checksums issue with an ATLAS file

 

[XrdCks] Checksum request during transfer locks partial file checksum into metadata for Ceph · Issue #2388 · xrootd/xrootd


The checksum was requested before the whole file had been written. There is no ability to do a stale-checksum check in Ceph, so the original (partial-file) checksum ‘sticks’ to the file.
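
As a toy illustration of the failure mode described above: the sketch below is not the XRootD or Ceph code, and the class, method and metadata key names are hypothetical. It caches a checksum in an object's metadata the first time one is requested and never checks whether the data has changed since, so a request made mid-transfer pins the partial-file value.

```python
import zlib


class ToyObject:
    """Hypothetical stand-in for a stored object with attached metadata."""

    def __init__(self):
        self.data = bytearray()
        self.metadata = {}

    def write(self, chunk: bytes) -> None:
        self.data.extend(chunk)

    def checksum(self) -> str:
        # Return the cached value if present; nothing here detects
        # that more data was written after the value was stored.
        if "adler32" not in self.metadata:
            self.metadata["adler32"] = f"{zlib.adler32(bytes(self.data)):08x}"
        return self.metadata["adler32"]


def demo():
    obj = ToyObject()
    obj.write(b"first half, ")
    early = obj.checksum()    # requested while the transfer is in progress
    obj.write(b"second half")
    cached = obj.checksum()   # still the partial-file value
    actual = f"{zlib.adler32(bytes(obj.data)):08x}"
    return early, cached, actual
```

In this toy model `early` and `cached` are identical and both differ from `actual`, which is the 'sticking' behaviour noted above.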

 

cms-aaa naming convention

 

cms-aaa is the only remaining personality to use proxy/ceph as the xrootd service names


A separate naming convention (main/supporting) would be more appropriate; not so urgent.

CC created, but due to be reviewed in December

 

 

XRootD Managers De-VMWareification

@Thomas, Jyothish (STFC,RAL,SC)

Option 2 preferred for efficiency, but Option 1 decided on

Option 1 would be simpler to implement for a temporary fix, as the move would be reversed

Antares TPC nodes to be moved to an Echo leaf switch; IPv4 address space to be confirmed with James.
lfsw30 (UPS room) decided on as the destination.

 

Compilation and rollout status with XrdCeph and Rocky 8: 5.7.x

@Thomas, Jyothish (STFC,RAL,SC)

5.7.2 published.
Investigating xrootd.redirect for write operations.

 

Shoveler

@Katy Ellis

Shoveler installation and monitoring

 

 

On the fly Checksums
https://stfc.atlassian.net/browse/XRD-98

@Ian Johnson

 

Simple PoC calculating Adler32 in the XrdCeph plugin mostly working. Negligible reduction in write rate compared to not calculating Adler32 on-the-fly.
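
For readers unfamiliar with the idea, a minimal Python sketch of on-the-fly Adler32 follows. It is not the XrdCeph PoC: the class and function names are hypothetical, and it assumes chunks arrive sequentially and in order. The running checksum is updated as each chunk is written, so no second read of the file is needed at the end.

```python
import io
import zlib


class RunningAdler32:
    """Accumulates an Adler-32 checksum across sequential chunks."""

    def __init__(self):
        self.value = 1  # Adler-32 starts from 1

    def update(self, chunk: bytes) -> None:
        # zlib.adler32 accepts the previous value as a running checksum
        self.value = zlib.adler32(chunk, self.value)

    def hexdigest(self) -> str:
        return f"{self.value & 0xFFFFFFFF:08x}"


def write_with_checksum(chunks, sink) -> str:
    """Write each chunk to sink and return the checksum of all bytes written."""
    running = RunningAdler32()
    for chunk in chunks:
        sink.write(chunk)
        running.update(chunk)
    return running.hexdigest()


def demo() -> bool:
    sink = io.BytesIO()
    digest = write_with_checksum([b"Wiki", b"pedia"], sink)
    # The chunked result should match a one-shot checksum of the whole payload
    return digest == f"{zlib.adler32(sink.getvalue()):08x}"
```

A running Adler32 of this kind only works if bytes are fed in file order, so out-of-order or overlapping writes would need separate handling.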

 

 

Deletions

https://stfc.atlassian.net/browse/XRD-83

NTR

 

XRootD Writable Workernode Gateway Hackathon

 

@Thomas, Jyothish (STFC,RAL,SC)

XRootD Writable Workernode Gateway Hackathon (XWWGH)


Hackathon writable workernode

 

 

XRootD testing framework

 

XRootD Site Testing Framework

 

 

100 GbE Gateway testing:
SKA / Tier-1

@James Walder @Thomas, Jyothish (STFC,RAL,SC)

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

 

image-20241212-125528.png

 

 

UKSRC Storage Architecture

 

 

 

Tokens Status

 

  • Operational

  • Technical

  • Accounting

 

 

 

 Tom - Provided updates from Cephalocon

on GGUS:

Site reports

 

Lancaster:

We continue with our handwavey observations about Ceph appearing to create work for itself by shuffling data around, apparently unprovoked, and XRootD connection handling seeming, for lack of more information, weird (so weird that Gerard is working on an orchestrated restart script for the gateways). An example of the weird handling: after the gateways were left untouched over Christmas, spot the point where we restarted the xrootd services:

image-20250109-115856.png

We’re pencilling out the design for a fresh batch of gateways. We probably don’t have the infrastructure to go full 100Gb, so we’re settling on a “3 NIC design”, where the two 25Gb internal (i.e. cluster + CephFS traffic) NICs are either bonded, or one NIC is dedicated to the CephFS mount. Some (handwavey as always) concerns about traffic between network cards.

 

Glasgow

 

 Action items

Work out how to replace the original functionality of fstream monitoring, now that OpenSearch has replaced the existing solutions.

 


 

 Decisions