...
Apologies:
Alison Packer, Alastair Dewhurst
🥅 Goals
List of Epics
New tickets
Consider new functionality / items
Detailed discussion of important topics
Site report activity
...
Item | Presenter | Notes | |
---|---|---|---|
Impact of Vector Read update to Echo | Vector Read | | Links:
- https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/137265343/Non-striper+read+v+implementation+for+WN+s+xrootd+gateways
- https://stfc.atlassian.net/wiki/spaces/X/pages/edit-v2/143786029
- https://github.com/stfc/xrootd-ceph/pull/37/files
12–16 May 2023 Echo instability following readV rollout.
Timeline on batch farm:
- 5th May: `wn-2020-xma - wn-2022-lenovo` will be set to drain.
- 9th May:
- 10-11th May: Let the updated workers run for a few days.
- 12th May: `wn-2017-dell (all 2017’s) - wn-2019-dell` will be set to drain.
- 15th May: Repeat above process for second half of workers.
- 16th May: Merge all required sandboxes into prod and manage farm back into `prod_batch` in AQ.
| |
Next steps for WN deployment | | Possible options for short-term WN status:
- Option 1: Move to 5.5.4-2 (core) + 5.5.4-3 (xroot-ceph-buffered).
  - Fixes the XCache “Filename too long” issue (to be confirmed).
  - Provides buffering on the ‘gateway’ for passed-through reads.
  - Allows non-striper reads and readV requests (i.e. Alex's updates), also for passed-through read(v).
  - (b) and (c) are all configurable within the xrootd-xxx.cfg configuration files.
  - Paged reads / (writes) would be enabled; probably only between XCache and gateway (TBC).
  - General fixes from the 5.5.X series.
  - 5.5.4 is currently being tested on lcg2268 (2017 dell, ml), although not in exactly this configuration.
- Option 2: 5.3.3-x (core) + 5.3.3-6 (xroot-ceph-buffered).
  - Needs an additional patch for the “Filename too long” issue, resulting in different (core) xrootd rpms for proxy and ceph (or a more detailed patch).
  - We ‘understand’ 5.3.3 as a working and stable release.
  - Most testing on WNs was done under this configuration.
- (for the future) Make the proxy pass through all readV requests to the gateway …
- (not for initial consideration) The proxy can be configured as a disk-caching proxy (XCache) or to ‘forward / passthrough’ the requests to the gateway, without the need for draining the farm.
| |
EBUSY in readV requests | | Observation during the Echo problem period: -EBUSY errors from Ceph, which are caught in the BufferedIO Read calls (5 attempts, then an -EIO error is returned). We should ensure that readV requests also catch -EBUSY errors correctly and do not pass them back to core xrootd. James Walder to create Jira. | |
Discussion on merging bufferedIO into master. Also to discuss pushing changes to “upstream” (xrootd/xrootd-ceph) | | https://github.com/stfc/xrootd-ceph/pull/44 Needs testing for ‘correctness’. Also some discussion ongoing on xrootd “issues” on the xrootd-ceph sub-module: https://github.com/xrootd/xrootd/pull/2008 | |
SEGV investigations with -S multi-stream flags | | Jira: XRD-53. Fix is in ‘master’ of core xrootd; not yet added to a tagged xrootd release; to follow up with Ian Johnson. | |
CMSD status | | Jira: XRD-41. CC document https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/136446019/High-level+XrootD+redirection+for+Echo?focusedCommentId=136118425 | |
Transfers of 0-byte files | | Jira: XRD-62. Observed Dune transfer failures using 0-byte files. | |
| | | |
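The retry behaviour noted under "EBUSY in readV requests" could be sketched as below. This is a minimal Python illustration under stated assumptions, not the xrootd-ceph C++ code: the names `read_with_retry`, `readv_with_retry` and the `read_fn` callback are hypothetical, and only the 5-attempt limit and the -EIO fallback come from the notes. It shows one way a vector read might apply the same -EBUSY handling to every chunk, so that -EBUSY is never passed back to the caller.

```python
import errno
import time

MAX_ATTEMPTS = 5  # from the notes: 5 attempts, then -EIO


def read_with_retry(read_fn, offset, length, max_attempts=MAX_ATTEMPTS, delay_s=0.0):
    """Call read_fn(offset, length), retrying while it returns -EBUSY.

    read_fn is a hypothetical callback returning the number of bytes read,
    or a negative errno value on failure.
    """
    for _ in range(max_attempts):
        rc = read_fn(offset, length)
        if rc != -errno.EBUSY:
            return rc  # success, or an error other than -EBUSY
        if delay_s:
            time.sleep(delay_s)  # optional back-off (assumption, not in the notes)
    return -errno.EIO  # all attempts were busy: report -EIO instead of -EBUSY


def readv_with_retry(read_fn, chunks, max_attempts=MAX_ATTEMPTS):
    """Read every (offset, length) chunk with the same -EBUSY handling.

    Returns the total bytes read, or the first negative error code.
    """
    total = 0
    for offset, length in chunks:
        rc = read_with_retry(read_fn, offset, length, max_attempts)
        if rc < 0:
            return rc
        total += rc
    return total
```

With this shape the caller of `readv_with_retry` only ever sees a byte count, -EIO, or another non-busy error; whether a failed chunk should abort the whole vector read, as it does here, is a design choice left open by the notes.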
...
Create Jira for Checksumming updates for 3.7+ (especially for Rocky 9 releases).
James Walder to review and approve the PR for the vector read work
James Walder to discuss the 0-byte file failures with Dune representatives, and establish whether this is an issue / understood
...
Begin testing process on WN test node and aim to push to farm in a timely manner
Continue investigations into readV methods that will enable the XCache to be removed (and therefore allow writable WNs)
⤴ Decisions
- Gateway configuration following Option 1 is preferred: “Move to 5.5.4-2 (core) + 5.5.4-3 (xroot-ceph-buffered)”