2024-11-21 Meeting Notes

 Date

Nov 21, 2024

 Participants

 

  • @Alexander Rogovskiy

  • @James Walder

  • @Ian Johnson

  • Lancs: Gerard, Matt, Steven

  • Glasgow: Sam

Apologies:

  • @Thomas, Jyothish (STFC,RAL,SC)

CC:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 

Item

Presenter

Notes

 

 

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

(Gateway Auth failures)

@Thomas, Jyothish (STFC,RAL,SC)

 

Upgrade of the GWs to happen on Wed/Thu next week

 

 

cms-aaa naming convention

 

cms-aaa is the only remaining personality to use proxy/ceph as the xrootd service names


A separate naming convention (main/supporting) would be more appropriate

(not so urgent).

CC created, but due to be reviewed in December

 

 

XRootD Managers De-VMWareification

@Thomas, Jyothish (STFC,RAL,SC)

Option 2 preferred for efficiency, but Option 1 decided on

Option 1 would be simpler to implement for a temporary fix, as the move would be reversed

Antares TPC nodes to be moved to an Echo leaf switch; IPv4 real estate to be confirmed with James

 

Compilation and rollout status with XrdCeph and Rocky 8: 5.7.x

@Thomas, Jyothish (STFC,RAL,SC)

Upstream merging in progress. Branch now exists.

Documentation (particularly for the Buffered IO) is needed.

 

Shoveler

@Katy Ellis

Shoveler installation and monitoring

 

 

Deletion studies through RDR

@Ian Johnson

 

 

 

Deletions

https://stfc.atlassian.net/browse/XRD-83

Comparing deletion timings for 1000 and 4000 files of 2.5 GB each, using 8 and 32 WebDAV clients respectively, running on a single VM node:

1000 files, 8 clients - 27 Hz - 69.4 GB/s

4000 files, 32 clients - 31 Hz - 71.1 GB/s
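
As a rough aid for reading these figures, the sketch below derives elapsed time, nominal freed-data rate and per-client rate from a deletion rate. It assumes "Hz" means files deleted per second and that every file is exactly 2.5 GB; the function name and the `runs` layout are illustrative only. The nominal GB/s it prints need not match the measured GB/s above, since the notes do not say how those were obtained.

```python
def deletion_summary(n_files, rate_hz, file_size_gb=2.5):
    """Return (elapsed seconds, nominal freed GB/s) for a deletion run.

    Assumes rate_hz is files deleted per second and that every file
    has the nominal size file_size_gb.
    """
    elapsed_s = n_files / rate_hz
    freed_gb_per_s = rate_hz * file_size_gb
    return elapsed_s, freed_gb_per_s


# (files, WebDAV clients, deletion rate in Hz) as quoted in the notes
runs = [(1000, 8, 27.0), (4000, 32, 31.0)]

for n_files, clients, rate_hz in runs:
    elapsed_s, freed = deletion_summary(n_files, rate_hz)
    per_client_hz = rate_hz / clients
    print(
        f"{n_files} files, {clients} clients: "
        f"~{elapsed_s:.0f} s elapsed, ~{freed:.1f} GB/s nominal, "
        f"{per_client_hz:.2f} Hz per client"
    )
```

The per-client figure makes the scaling easier to see: four times as many clients gives only a small rise in the aggregate deletion rate.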

 

Periodic hackathon?

@Thomas, Jyothish (STFC,RAL,SC)

XRootD Writable Workernode Gateway Hackathon (XWWGH)

Tue 12 Nov, 16:00
Hackathon: writable workernode

 

XRootD testing framework

 

XRootD Site Testing Framework

 

 

100 GbE Gateway testing:
SKA / Tier-1

@James Walder

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

  • systemd-networkd used

Manual for initial setup
AQ for IP routes, etc. Some static variables for now; this can perhaps be improved over time

 

UKSRC Storage Architecture

 

 

 

Tokens Status

 

  • Operational

  • Technical

  • Accounting

 

 

 

 

 

on GGUS:

Site reports

Lancaster: Our scrubbing stats suddenly picked up last Friday; we’re not sure why, but we’re not complaining.

 

image-20241121-130324.png

Also, as seen during Wednesday’s storage meeting, we are experimenting with bindfs to do “cunning” things with user mapping, and have rolled out an “ofs.crmode” setting to our xroot cluster, as we’re consistently getting into trouble with LSST over file/directory permissions and ACLs.

 

Manchester: also noticed similar problems with mclock (3 objects/hour with mclock). Changing settings didn’t seem to have an effect.
Lots of scrubbing, but the scrub date is not being updated (scrubbing is a bit broken on Reef)

Reporting bugs flagged upstream

Glasgow: targeting Pacific for upgrade, waiting for Reef stability


RAL upgrade might go to Quincy

Mid-upgrade (Pacific mons + Nautilus OSDs): OSD maps are created very quickly, but can bloat mon stores

 

 

 Action items

Determine how to replace the original functionality of fstream monitoring, now that OpenSearch has replaced the existing solutions.

 

 

 Decisions