🗓 Date

07

👥 Participants

...

Apologies:

CC:

🥅 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

...

Item

Presenter

Notes

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

(Gateway Auth failures)

Thomas, Jyothish (STFC,RAL,SC)

Accidentally broke the preprod workernodes: Docker had cached old layers for the new image.

cms-aaa naming convention

cms-aaa is the only remaining personality that uses proxy/ceph as the XRootD service names.


A separate naming convention, such as main/supporting, would be more appropriate (not so urgent).

CC created, but due to be reviewed in December.

XRootD Managers De-VMWareification

Thomas, Jyothish (STFC,RAL,SC)

File: Redirector de-VMWareification.pptx

Option 2 was preferred for efficiency, but Option 1 was decided on.

Option 1 would be simpler to implement as a temporary fix, since the move would be reversed.

Antares TPC nodes to be moved to an Echo leaf switch; IPv4 real estate to be confirmed with James.

Compilation and rollout status with XrdCeph and Rocky 8: 5.7.x

Thomas, Jyothish (STFC,RAL,SC)

Upstream merging in progress. Branch now exists.

Documentation is needed (particularly for the Buffered IO).

Shoveler

Katy Ellis

Shoveler installation and monitoring

Deletion studies through RDR

Ian Johnson

Deletions

Jira: XRD-83

Periodic hackathon?

Thomas, Jyothish (STFC,RAL,SC)

XRootD Writable Workernode Gateway Hackathon (XWWGH)

Tues 12th Nov, 16:00
Hackathon: writable workernode

XRootD testing framework

XRootD Site Testing Framework

100 GbE Gateway testing:
SKA / Tier-1

James Walder

/wiki/spaces/UK/pages/215941180

Following up on server installs:
cabled but not yet installed, awaiting hostname.

UKSRC Storage Architecture

For v0.1 Requirements:

POSIX-like access to be provided ‘next to’ the compute.
Via ‘some method’, files / directories are mounted (RO) for applications (e.g. JupyterHub) to read from.

The POSIX area ‘should be’ an RSE, to enable transfers and lifecycle management.
Bulk storage “may” exist, requiring a TPC into the ‘cache’ area.

Disadvantages:
- Possibly unnecessary data movement (and via TPC)
- Mounts and permissions

Other ideas:

Manila-style shares / volumes for each DID / container requested?
- Rucio ‘download’ rather than TPC
- Lifecycle management on the Manila share layer?

Tokens Status

  • Operational

  • Technical

  • Accounting

...

on GGUS:

Site reports

Lancaster: Still having a rough time with the Ceph MDS servers, with our third switchover in as many weeks. Gerard found some possible bugs, and the workaround is… what we’ve been doing. Gerard updated to the latest Reef (18.2.4); it wasn’t a completely smooth process, but we got there in the end. Also, as discussed in storage on Wednesday, turning off mclock made Ceph behave a lot better.

Manchester also noticed similar problems with mclock (3 objects/hour with mclock). Changing settings didn’t seem to have an effect.
Lots of scrubbing, but the scrub date is not being updated (scrubbing is a bit broken on Reef).

Reporting bugs flagged upstream

Glasgow: targeting Pacific for the upgrade, waiting for Reef stability.

...

RAL upgrade might go to Quincy

Mid-upgrade (Pacific mons + Nautilus OSDs): OSD maps are created very quickly, but can bloat the mon stores.

✅ Action items

Work out how to replace the original functionality of fstream monitoring, now that OpenSearch has replaced the existing solutions.

...