2022-10-13 Meeting
Date
Oct 13, 2022
Participants
@James Walder
Glasgow: Sam
Lancs: Gerard, Matt, Steven
Goals
List of Epics
New tickets
Consider new functionality / items
Detailed discussion of important topics
Site report activity
Discussion topics
Item | Presenter | Notes | |
---|---|---|---|
5.5.X for WNs | | Likely folded in with the voms deployment on WNs (centos7); see the Oxford discussion below. | |
‘unified’ config | | Sandbox on all webdav hosts; | |
root-based TPC in unified config | | Add a second xrootd-tpc instance for root-based TPC transfers. (Should it do more than this, e.g. 2x unified instance? Perhaps start simply first.) To be tested … | |
Oxford Xcache (old). Rucio update / change | | Oxford Xcache put the cache back in the data-path. The update to 5.5.0 caused significant transfer failures; downgrading to 5.4.2 mitigated the problem. Small files tended to succeed, while larger ones (e.g. > 1 GiB) would fail. The failure mode is a timeout (due to an apparently stalled transfer). | New Xcache largely configured (some questions remain on whether the bonded link is doing what it should be doing). |
Stats: simple stats test going to gateways. Mean stat time [s] (per time bin) plotted. | | Separating the metadata operations from the read operations should likely help, e.g. with CMSd, redirecting meta-ops to a separate xrootd instance on a host. | |
Inputs for status monitoring of XRootD on gateways | | Input from Dev. mtg to Echo Ops for ‘health-check’ metrics for XRootD status? Number of connections, ceph read-time, transfer errors, stat times, … | |
New network svcXX gateway status | | IPv6 address now appears to be on the host and in DNS. https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=474560 | |
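For reference, the "mean stat time [s] (per time bin)" quantity mentioned in the Stats item could be computed roughly as follows. This is only a sketch, not the script behind the plot: the input format (pairs of Unix timestamp and stat duration in seconds), the 300 s bin width, and the function name `mean_stat_time_per_bin` are all assumptions.

```python
from collections import defaultdict


def mean_stat_time_per_bin(samples, bin_width_s=300):
    """Return {bin_start: mean stat time [s]} for (timestamp, duration) samples.

    samples: iterable of (unix_timestamp_s, stat_duration_s) pairs (assumed format).
    bin_width_s: width of each time bin in seconds (assumed value).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, duration in samples:
        # Floor the timestamp to the start of its time bin.
        bin_start = int(timestamp // bin_width_s) * bin_width_s
        sums[bin_start] += duration
        counts[bin_start] += 1
    # Mean per bin, returned in chronological order.
    return {b: sums[b] / counts[b] for b in sorted(sums)}


if __name__ == "__main__":
    # Hypothetical samples: three stats in the first bin, one in the next.
    demo = [(0, 0.5), (100, 1.5), (299, 1.0), (300, 4.0)]
    for bin_start, mean in mean_stat_time_per_bin(demo).items():
        print(bin_start, mean)
```

Keeping sums and counts separately means samples can arrive in any order, and bins with no samples are simply absent rather than reported as zero.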
OX Cache permissions error
221013 12:07:27 661367 XrootdXeq: u73.7095:286@t2xroot01.physics.ox.ac.uk pub IPv4 login as atlascache
221013 12:07:27 661367 u73.7095:286@t2xroot01.physics.ox.ac.uk Xrootd_Protocol: 0100 req=open dlen=73
221013 12:07:27 661367 u73.7095:286@t2xroot01.physics.ox.ac.uk Xrootd_Protocol: 0100 open rt /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:07:27 661367 u73.7095:286@t2xroot01.physics.ox.ac.uk ofs_open: 0-600 fn=/atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:07:27 661367 acc_Audit: u73.7095:286@t2xroot01.physics.ox.ac.uk grant gsi atlascache@t2xroot01.physics.ox.ac.uk read /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:07:27 ceph_namelib : translated /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1 to atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:04:43 649390 XrootdXeq: u33.67888:284@t2xcache01.physics.ox.ac.uk pub IPv4 login as atlascache
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk Xrootd_Protocol: 0100 req=open dlen=73
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk Xrootd_Protocol: 0100 open rt /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk ofs_open: 0-600 fn=/atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:04:43 649390 acc_Audit: u33.67888:284@t2xcache01.physics.ox.ac.uk deny gsi atlascache@t2xcache01.physics.ox.ac.uk read /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:04:43 649390 ofs_open: u33.67888:284@t2xcache01.physics.ox.ac.uk Unable to open /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1; permission denied
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk Xrootd_Response: 0100 sending err 3010: Unable to open /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1; permission denied
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk ofs_close: use=0 fn=dummy
"/C=UK/O=eScience/OU=Oxford/L=OeSC/CN=t2xroot01.physics.ox.ac.uk" atlascache
"/C=UK/O=eScience/OU=Oxford/L=OeSC/CN=t2xcache01.physics.ox.ac.uk" atlascache
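The two log excerpts above diverge at the acc_Audit line: the request from t2xroot01 is granted, while the same read from t2xcache01 is denied, even though both DNs map to atlascache. A small sketch for pulling those decisions out of a log is given here. The field layout in the regular expression is inferred only from the two acc_Audit lines quoted in these notes, and the names `AUDIT_RE` and `audit_decisions` are hypothetical.

```python
import re

# Field layout inferred from the acc_Audit lines quoted in these notes (an assumption):
#   acc_Audit: <client> <grant|deny> <auth protocol> <user@host> <operation> <path>
AUDIT_RE = re.compile(
    r"acc_Audit:\s+(?P<client>\S+)\s+(?P<decision>grant|deny)\s+"
    r"(?P<auth>\S+)\s+(?P<identity>\S+)\s+(?P<operation>\S+)\s+(?P<path>\S+)"
)

# The two acc_Audit lines copied from the log excerpts in these notes.
SAMPLE_LINES = [
    "221013 12:07:27 661367 acc_Audit: u73.7095:286@t2xroot01.physics.ox.ac.uk "
    "grant gsi atlascache@t2xroot01.physics.ox.ac.uk read "
    "/atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1",
    "221013 12:04:43 649390 acc_Audit: u33.67888:284@t2xcache01.physics.ox.ac.uk "
    "deny gsi atlascache@t2xcache01.physics.ox.ac.uk read "
    "/atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1",
]


def audit_decisions(lines):
    """Return one dict per acc_Audit line, with user and host split out."""
    records = []
    for line in lines:
        match = AUDIT_RE.search(line)
        if match is None:
            continue  # not an acc_Audit line
        record = match.groupdict()
        # The identity field looks like user@host in the quoted lines.
        record["user"], _, record["host"] = record["identity"].partition("@")
        records.append(record)
    return records


if __name__ == "__main__":
    for rec in audit_decisions(SAMPLE_LINES):
        print(rec["host"], rec["user"], rec["operation"], rec["decision"])
```

Listing host, user, operation and decision side by side makes it easier to see that the mapped user is identical in both cases, so the difference is more likely in the authorization rules than in the DN mapping; that reading is a guess from the excerpts, not a confirmed diagnosis.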
GGUS:
GGUS:159137 Study from Chris of stat times:
GGUS:159146 “Enable HTTPs as write protocol at RAL”
Site reports
21 Sept 2022
Glasgow
Working on the in-flight Checksum code.
Lancaster:
Redirector setup still working well.
Action items