2024-11-21 Meeting Notes

Date

Nov 21, 2024

Participants

@Alexander Rogovskiy
@James Walder
@Ian Johnson
Lancs: Gerard, Matt, Steven
Glasgow: Sam

Apologies:

@Thomas, Jyothish (STFC,RAL,SC)

CC:

Goals

List of Epics
New tickets
Consider new functionality / items
Detailed discussion of important topics
Site report activity

Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

Item	Presenter	Notes

Operational Issues Gateways and WNs: Current status and upcoming changes (Gateway Auth failures)	@Thomas, Jyothish (STFC,RAL,SC)	Upgrade of the GWs to happen on Wednesday / Thursday next week.
cms-aaa naming convention		cms-aaa is the only remaining personality to use proxy/ceph as the xrootd service names. A separate naming convention (main/supporting) would be more appropriate; not so urgent. CC created, but due to be reviewed in December.
XRootD Managers De-VMWareification	@Thomas, Jyothish (STFC,RAL,SC)	Option 2 preferred for efficiency, but Option 1 decided on: Option 1 would be simpler to implement for a temporary fix, as the move would be reversed. Antares TPC nodes to be moved to an Echo leafsw; to confirm IPv4 real estate with James.
Compilation and rollout status with XrdCeph and Rocky 8: 5.7.x	@Thomas, Jyothish (STFC,RAL,SC)	Upstream merging in process. Branch now exists. Documentation (particularly for the Buffered IO) is needed.
Shoveler	@Katy Ellis	Shoveler installation and monitoring
Deletion studies through RDR	@Ian Johnson
Deletions	https://stfc.atlassian.net/browse/XRD-83	Comparing deletion timings for 1000 and 4000 x 2.5 GB files, using 8 and 32 WebDAV clients respectively, running on a single VM node. 1000 files, 8 clients: 27 Hz, 69.4 GB/s. 4000 files, 32 clients: 31 Hz, 71.1 GB/s.
Periodic hackathon?	@Thomas, Jyothish (STFC,RAL,SC)	XRootD Writable Workernode Gateway Hackathon (XWWGH): Tues 12th Nov, 16:00. Hackathon on writable workernode.
Xrootd testing framework		XRootD Site Testing Framework
100 GbE Gateway testing: SKA / Tier-1	@James Walder	https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180 systemd-networkd used. Manual for initial setup; AQ for IP routes, etc. Some static variables for now; may be improved with time.
UKSRC Storage Architecture
Tokens Status		Operational Technical Accounting
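
For the deletion studies above (XRD-83), a minimal sketch of how a deletion rate in Hz could be measured with N concurrent clients. This is not the script used for the reported numbers; the function names (webdav_delete, time_deletions, equivalent_data_rate) are hypothetical, authentication is omitted, and the 2.5 GB default is just the file size quoted in the table.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def webdav_delete(url):
    """Issue a single HTTP DELETE request (auth/token handling omitted)."""
    req = urllib.request.Request(url, method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return resp.status


def time_deletions(urls, n_clients, delete_fn=webdav_delete):
    """Delete every URL using n_clients concurrent workers.

    Returns (rate_hz, elapsed_seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        # list() waits for all deletions and re-raises the first failure, if any
        list(pool.map(delete_fn, urls))
    elapsed = time.perf_counter() - start
    return len(urls) / elapsed, elapsed


def equivalent_data_rate(rate_hz, file_size_gb=2.5):
    """Data volume removed per second (GB/s), assuming fixed-size files."""
    return rate_hz * file_size_gb
```

Usage would be along the lines of time_deletions(list_of_file_urls, 8) for the 8-client case, passing the resulting rate to equivalent_data_rate. A value computed this way from a rounded rate need not match the GB/s figures in the table exactly.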

on GGUS:

Site reports

Lancaster: Our scrubbing stats suddenly picked up last Friday; we’re not sure why, but we’re not complaining.

Also, as seen during Wednesday’s storage meeting, we have been experimenting with bindfs to do “cunning” things with user mapping, and have rolled out an “ofs.crmode” setting to our xroot cluster, as we’re consistently getting into trouble with LSST over file/directory permissions/ACLs.

Manchester: Also noticed similar problems with mclock (3 objects/hour with mclock). Changing settings didn’t seem to have an effect.
Lots of scrubbing, but the scrub date is not being updated (scrubbing is a bit broken on Reef).

Reporting bugs flagged upstream

Glasgow: Targeting Pacific for upgrade, waiting for Reef stability.

RAL upgrade might go to Quincy

Mid-upgrade (Pacific mons + Nautilus OSDs): OSD maps are created very quickly, but can bloat mon stores.

Action items

How to replace the original functionality of fstream monitoring, now that OpenSearch has replaced the existing solutions.