Item | Presenter | Notes | |
---|---|---|---|
Impact of Vector Read update to Echo | Vector Read | | Links: https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/137265343/Non-striper+read+v+implementation+for+WN+s+xrootd+gateways ; https://stfc.atlassian.net/wiki/spaces/X/pages/edit-v2/143786029 ; https://github.com/stfc/xrootd-ceph/pull/37/files . 12–16 May 2023: Echo instability following readV rollout. Timeline on batch farm: 5th May: `wn-2020-xma - wn-2022-lenovo` will be set to drain. 9th May: (no entry). 10-11th May: let the updated workers run for a few days. 12th May: `wn-2017-dell (all 2017’s) - wn-2019-dell` will be set to drain. 15th May: repeat the above process for the second half of the workers. 16th May: merge all required sandboxes into prod and manage the farm back into `prod_batch` in AQ.
| |
Next steps for WN deployment | | Possible options for short-term WN status:
Option 1: move to 5.5.4-2 (core) + 5.5.4-3 (xroot-ceph-buffered). (a) Fixes the XCache “filename too long” issue (to be confirmed). (b) Provides buffering on the ‘gateway’ for passed-through reads. (c) Allows non-striper reads and readV requests (i.e. Alex's updates), also for passed-through read(v). (b) and (c) are both configurable within the xrootd-xxx.cfg configuration files. Paged reads / (writes) would be enabled, probably only between XCache and gateway (TBC). Includes general fixes from the 5.5.X series. 5.5.4 is currently being tested on lcg2268 (2017 dell, ml), though not in exactly this configuration.
Option 2: 5.3.3-x (core) + 5.3.3-6 (xroot-ceph-buffered). Needs an additional patch for the “filename too long” issue, resulting in different (core) xrootd rpms for proxy and ceph (or a more detailed patch). We ‘understand’ 5.3.3 as a working and stable release, and most testing on WNs was done under this configuration.
Option 3 (not for initial consideration): the proxy can be configured either as a disk-caching proxy (XCache) or to ‘forward / passthrough’ the requests to the gateway, without the need to drain the farm.
| |
EBUSY in readV requests | | Observed during the Echo problem period: -EBUSY responses from Ceph, which are caught in the BufferedIO Read calls (5 attempts, then an -EIO error is returned). We should ensure that readV requests also catch -EBUSY errors correctly and do not pass them back to core xrootd. James Walder to create a Jira ticket. | |
Discussion on merging bufferedIO into master. Also to discuss pushing changes “upstream” (xrootd/xroot-ceph) | | https://github.com/stfc/xrootd-ceph/pull/44 Needs testing for ‘correctness’. There is also ongoing discussion in the xrootd “issues” on the xrootd-ceph sub-module: https://github.com/xrootd/xrootd/pull/2008 | |
SEGV investigations with -S multi-stream flags | | Jira: XRD-53. Fix is in ‘master’ of core xrootd; not yet added to a tagged xrootd release. To follow up with Ian Johnson. | |
CMSD status | | Jira: XRD-41. CC document: https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/136446019/High-level+XrootD+redirection+for+Echo?focusedCommentId=136118425 | |
Transfers of 0-byte files | | Jira: XRD-62. Observed DUNE transfer failures using 0-byte files. | |
| | | |
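
The EBUSY item in the table above describes retrying a read up to 5 times and then returning -EIO, and asks that readV requests get the same treatment. A minimal Python sketch of that logic is given here for discussion only; it is not the xrootd-ceph C++ implementation. The names `read_with_retry`, `readv_with_retry` and the `read_fn` callback are hypothetical, and the optional `delay` between attempts is an assumption. Only the 5-attempt limit and the -EBUSY to -EIO mapping come from the notes.

```python
import errno
import time

MAX_ATTEMPTS = 5  # from the notes: 5 attempts, then -EIO


def read_with_retry(read_fn, offset, length, max_attempts=MAX_ATTEMPTS, delay=0.0):
    """Call read_fn(offset, length), retrying while it returns -EBUSY.

    read_fn is a hypothetical callback that returns the number of bytes
    read (>= 0) or a negative errno value, mimicking a C-style read call.
    """
    for attempt in range(max_attempts):
        rc = read_fn(offset, length)
        if rc != -errno.EBUSY:
            return rc  # success, or an error that is not retried
        if delay and attempt < max_attempts - 1:
            time.sleep(delay)  # assumed back-off; not stated in the notes
    # All attempts returned -EBUSY: report -EIO instead of leaking -EBUSY.
    return -errno.EIO


def readv_with_retry(read_fn, chunks, max_attempts=MAX_ATTEMPTS):
    """Apply the same retry policy to every (offset, length) chunk.

    Returns the total bytes read, or the first negative errno encountered.
    """
    total = 0
    for offset, length in chunks:
        rc = read_with_retry(read_fn, offset, length, max_attempts)
        if rc < 0:
            return rc
        total += rc
    return total
```

With this shape, the caller of `readv_with_retry` never sees -EBUSY: a chunk either succeeds within the attempt limit or the whole vector read fails with -EIO, matching the behaviour described for the single-read path.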