2022-10-13 Meeting

 Date

Oct 13, 2022

 Participants

  • @James Walder

  • Glasgow: Sam

  • Lancs: Gerard, Matt, Steven

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

https://stfc.atlassian.net/jira/software/c/projects/XRD/boards/26/roadmap

Item

Presenter

Notes

 


5.5.X for WNs

 

Likely to be folded in with the VOMS deployment on WNs (CentOS 7)

(see Oxford discussion below)

 

‘unified’ config

 

Sandbox on all webdav hosts.

 

root-based TPC in unified config

 

Add a second xrootd-tpc instance for root-based TPC transfers

(Should it do more than this, e.g. 2x unified instances? Perhaps start simple first.)

To be tested …

 

Oxford Xcache (old). Rucio update / change

 

Oxford put the (old) Xcache back in the data path. The update to 5.5.0 caused significant transfer failures; downgrading to 5.4.2 mitigated the problem. Small files tended to succeed, while larger files (e.g. > 1 GiB) would fail.

Failure mode is Timeout (due to an apparent stalled transfer).

New Xcache largely configured (some questions remain over whether the bonded link is behaving as intended).
Tests currently return ‘permission denied’.

 

Stats

Simple stats test going to Gateways.

Mean stat time [s] (per time bin) plotted.
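
The per-bin mean described above can be sketched as follows. This is a minimal illustration only: the input format (pairs of Unix timestamp and stat duration in seconds), the function name and the 300 s default bin width are all assumptions, not details of the actual test that was run.

```python
from collections import defaultdict
from statistics import mean


def mean_stat_time_per_bin(samples, bin_seconds=300):
    """Return {bin_start_timestamp: mean stat duration in seconds}.

    `samples` is an iterable of (unix_timestamp, duration_seconds) pairs.
    Both the input layout and the default bin width are hypothetical.
    """
    bins = defaultdict(list)
    for timestamp, duration in samples:
        # Floor each timestamp to the start of its time bin.
        bin_start = int(timestamp // bin_seconds) * bin_seconds
        bins[bin_start].append(duration)
    # Sort by bin start so the result is ready to plot in time order.
    return {start: mean(durations) for start, durations in sorted(bins.items())}
```

The returned mapping can be passed directly to a plotting tool, with the keys as the time axis and the values as mean stat time [s].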

 

Separating the metadata operations from the read operations would likely help.

E.g. with CMSd, redirect meta-ops to a separate xrootd instance on the host.

 

 

Inputs for status monitoring of XRootD on gateways

 

Input from the Dev. meeting to Echo Ops for ‘health-check’ metrics for XRootD status?

Number of connections, ceph read-time, transfer errors, stat times, …

 

New network svcXX gateway status

 

IPv6 address now appears to be on the host and in DNS.
IPv6 routing seems to go via London before coming back to RAL.
External access is missing.
Can’t connect to ceph.

https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=474560

 

 

OX Cache permissions error

221013 12:07:27 661367 XrootdXeq: u73.7095:286@t2xroot01.physics.ox.ac.uk pub IPv4 login as atlascache
221013 12:07:27 661367 u73.7095:286@t2xroot01.physics.ox.ac.uk Xrootd_Protocol: 0100 req=open dlen=73
221013 12:07:27 661367 u73.7095:286@t2xroot01.physics.ox.ac.uk Xrootd_Protocol: 0100 open rt /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:07:27 661367 u73.7095:286@t2xroot01.physics.ox.ac.uk ofs_open: 0-600 fn=/atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:07:27 661367 acc_Audit: u73.7095:286@t2xroot01.physics.ox.ac.uk grant gsi atlascache@t2xroot01.physics.ox.ac.uk read /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:07:27 ceph_namelib : translated /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1 to atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1

 

221013 12:04:43 649390 XrootdXeq: u33.67888:284@t2xcache01.physics.ox.ac.uk pub IPv4 login as  atlascache
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk Xrootd_Protocol: 0100 req=open dlen=73
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk Xrootd_Protocol: 0100 open rt /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk ofs_open: 0-600 fn=/atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:04:43 649390 acc_Audit: u33.67888:284@t2xcache01.physics.ox.ac.uk deny gsi  atlascache@t2xcache01.physics.ox.ac.uk read /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1
221013 12:04:43 649390 ofs_open: u33.67888:284@t2xcache01.physics.ox.ac.uk Unable to open /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1; permission denied
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk Xrootd_Response: 0100 sending err 3010: Unable to open /atlas:datadisk/rucio/data16_13TeV/27/4c/AOD.28100547._000208.pool.root.1; permission denied
221013 12:04:43 649390 u33.67888:284@t2xcache01.physics.ox.ac.uk ofs_close: use=0 fn=dummy

 

"/C=UK/O=eScience/OU=Oxford/L=OeSC/CN=t2xroot01.physics.ox.ac.uk" atlascache
"/C=UK/O=eScience/OU=Oxford/L=OeSC/CN=t2xcache01.physics.ox.ac.uk" atlascache
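
To compare the two hosts quickly, the acc_Audit lines in the excerpts above can be pulled out with a short script. This is a sketch under stated assumptions: the field layout (client, grant/deny, protocol, identity, operation, path) is inferred only from the two log excerpts in these notes, and the field names and helper functions are hypothetical labels, not part of XRootD.

```python
import re

# Layout inferred from the log excerpts in these notes (an assumption):
#   acc_Audit: <client> <grant|deny> <protocol> <identity> <operation> <path>
AUDIT_RE = re.compile(
    r"acc_Audit:\s+(?P<client>\S+)\s+(?P<decision>grant|deny)\s+"
    r"(?P<protocol>\S+)\s+(?P<identity>\S+)\s+(?P<operation>\S+)\s+(?P<path>\S+)"
)


def parse_audit_line(line):
    """Return the audit fields as a dict, or None if the line is not an audit line."""
    match = AUDIT_RE.search(line)
    return match.groupdict() if match else None


def denied_hosts(lines):
    """Return the hosts (text after '@' in the identity) that were denied, in order seen."""
    hosts = []
    for line in lines:
        fields = parse_audit_line(line)
        if fields and fields["decision"] == "deny":
            host = fields["identity"].split("@", 1)[-1]
            if host not in hosts:
                hosts.append(host)
    return hosts
```

Run over the gateway log, this would list t2xcache01 as denied and leave t2xroot01 out, which matches the excerpts; it does not explain why the mapping differs between the two hosts.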

GGUS:

GGUS:159137 Study from Chris of stat times

GGUS:159146 “Enable HTTPs as write protocol at RAL”

 

Site reports

21 Sept 2022

Glasgow

Working on the in-flight Checksum code.

 

Lancaster:

Redirector setup still working well.

 Action items

 

 

 Decisions