🗓 Date

👥 Participants

🥅 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

🗣 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

Item

Presenter

Notes

Vector Read

https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/137265343/Non-striper+read+v+implementation+for+WN+s+xrootd+gateways

https://stfc.atlassian.net/wiki/spaces/X/pages/edit-v2/143786029

https://github.com/stfc/xrootd-ceph/pull/37/files

Code review largely complete; no additional meetings expected.
Would like to resolve any residual comments as soon as possible.

Planned actions:

  • Complete the code review, to allow:

    • Merge PR into master branch (and merge into bufferedIO)

  • Build RPMs for the 5.3.3 and 5.5.4 releases

    • Need to ensure the 5.3.3 core XRootD build keeps the DH key length patch

    • Request client side timeout ENV in docker job containers

  • (Supply Glasgow with the appropriate tag / commits in GitHub to build their RPMs)

  • Arrange testing on Echo prod, AAA and Alice gateways:

    • AAA testing with various buffer sizes would be interesting.

SEGV investigations with -S multi-stream flags

Jira: XRD-53

The SEGV has not occurred when using the Ceph plugin with XRootD v5.5.4.post257 (local compile). The reason for this is unknown; we have raised a query about this in https://github.com/xrootd/xrootd/issues/1821#issuecomment-1508126509

Continuing to search for the code change(s) that now allow multiple streams to work correctly. The Jira issue has been updated with a table of behaviour across different XRootD server releases.

Fix for paged writes when misaligned to end of buffer

https://github.com/stfc/xrootd-ceph/pull/40

Sandbox on gw7; ready to be deployed? Aim for Tuesday rollout to the Gateways.

CMSD status

Jira: XRD-41

CC document

https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/136446019/High-level+XrootD+redirection+for+Echo?focusedCommentId=136118425

  • Observed quick round-robin behaviour for IPv4; slow (according to TTL) for IPv6. James A. states this is fine, and different clients should get directed to different managers anyway.

  • Failover behaviour appears to work (when manually stopping xrootd or cmsd). Open question: what is the best command for spotting a broken xrootd service?

  • AQ configuration exists, but should be refactored to add relevant “service” level functionality.

  • ATLAS FTs are running against it, and some simple ‘stress’ test transfers have been run; looking OK so far.

To do:

  • Complete the AQ setup

  • Deploy to all Echo gateways (still in the largely ‘passive’ mode).

  • Define and agree an agenda / schedule for moving VOs (and their various activities) over.

    • Make final switch of webdav and echo xrootd aliases to the redirector address

    • This will need new certs for the two manager hosts
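One possible way to repeat the round-robin and failover observations above from a client node is sketched below. It is only a sketch: the helper names are hypothetical, it uses nothing beyond the Python standard library, and a successful TCP connect shows only that something is listening on the port, not that the xrootd or cmsd service behind it is healthy. Port 1094 is the XRootD default; the actual redirector alias and ports would need to be filled in.

```python
import socket


def resolve_by_family(host, port=1094):
    """Group the addresses the resolver returns for `host` by IP family.

    Calling this repeatedly against a round-robin alias shows how quickly
    the IPv4 and IPv6 answers rotate. (Hypothetical helper name.)
    """
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    result = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        label = labels.get(family)
        address = sockaddr[0]
        # Keep resolver order, but drop duplicate addresses.
        if label is not None and address not in result[label]:
            result[label].append(address)
    return result


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    This is a liveness check only: it does not speak the XRootD protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `resolve_by_family("<redirector alias>")` could be run in a loop to watch the address order change, and `port_is_open(address, 1094)` run against each returned address after stopping a service on one manager.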

Transfers of 0-byte files

Jira: XRD-62

Observed Dune transfer failures using 0-byte files

GGUS:

Deletion problem at RAL

Slow stat calls at RAL

Problem accessing some LHCb files at RAL

Site reports

✅ Action items

  • Create Jira for Checksumming updates for 3.7+ (especially for Rocky 9 releases).

  • James Walder to set up a code review for next Thursday, to review and approve the PR for the vector read work

  • James Walder to identify and discuss with Dune representatives the 0-byte file failures, and whether this is an issue / understood

⤴ Decisions