2023-11-16 Meeting Notes

 Date

Nov 16, 2023

 Participants

  • @Thomas, Jyothish (STFC,RAL,SC)

  • @Thomas Byrne

  • @Alexander Rogovskiy

  • @James Walder

  • @Ian Johnson

  • Lancs: @Matt Doidge, Gerard, Steven

  • Glasgow: Sam

Apologies:

 

 

 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

 

 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

 


XrootD gateway architecture review (what should XrootD access to Echo look like in a year’s time?)

 

Ideas on xrootd batch farm architecture: https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/255262851
Covers the current state and the ECHO aliases.

Key questions:
Should we segregate S3 and WN traffic from FTS?
Having each service use its own redirector endpoint is good for maintaining high availability and redundancy; additional hardware capacity can then be added to each service more easily when needed.
Shared servers: cmsd allows multiple clusters (redirector managers) to include the same server (a gateway, in our case).
Multiple clusters, with or without shared servers? (No shared servers if possible.)
Having multiple clusters is good (each service should have its own redirector managers for HA). Shared servers will not be needed in a containerised setup, and ease of management is preferred over slight resource optimisation (with shared servers, underutilised servers could take traffic from other services under higher load, making more use of existing capacity).
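
As a rough sketch of what per-service endpoints mean in practice, a probe like the following could check each service's redirector independently. This is illustrative only: the hostnames are placeholders (not the real RAL aliases) and the standard xrootd port 1094 is assumed.

    #!/usr/bin/env python3
    # Probe one redirector endpoint per service; hostnames are placeholders.
    import socket

    SERVICE_ENDPOINTS = {
        "fts":   ("fts-redirector.example.ac.uk", 1094),
        "s3":    ("s3-redirector.example.ac.uk", 1094),
        "wn":    ("wn-redirector.example.ac.uk", 1094),
        "alice": ("alice-redirector.example.ac.uk", 1094),
    }

    def probe(host: str, port: int, timeout: float = 3.0) -> bool:
        """Return True if a TCP connection to the endpoint succeeds."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    if __name__ == "__main__":
        for service, (host, port) in SERVICE_ENDPOINTS.items():
            print(f"{service:6s} {host}:{port} {'up' if probe(host, port) else 'DOWN'}")

With each service behind its own endpoint, a check like this (or keepalived's own health checks) stays per-service: one service's outage or burst load never hides behind another's alias.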

What to aim for:

  • Every service instance needs to be as resilient as possible.
    Each DNS endpoint should have keepalived for redundancy and redirectors for high availability.

  • Manageability.
    Adding/removing gateways from a service should be simple, and the overall setup should not be too complex to understand or manage.

  • Aim for simpler config.
    Keep the config understandable.

  • Flexibility on gateways per use case to meet burst demand.
    We should be able to add gateways to service endpoints as quickly and smoothly as possible, to facilitate burst demands (e.g. ALICE) and to deploy additional capacity quickly.

Containerizing everything (shared containers across all hardware) is the preferred end state.
This has the prerequisite that every service is behind an expandable high-availability setup (xrootd cmsd managers)
[and an orchestrated setup to spin up more gateways when load increases].

Some system resource headroom should be reserved to keep the gateways running smoothly.

 

WN gateways:
These should be kept going forwards, as they provide an additional gateway’s worth of capacity for every worker node.
They currently only redirect traffic for reads over root (job operations using the xrootd.echo endpoint).
This is because of Xcache, which is read-only.
Xcache is good at what it does and reduces the number of IOPS hitting Ceph from reads. During the vector-read deployment the caches were removed, and the resulting IOPS slowed the Echo storage cluster enough for it to fail.

  • Xcache can be removed if XrdCeph buffers provide similar functionality (this would allow R/W over the local gateway).
    XrdCeph buffers do not work for out-of-order reads or separate read requests (as is the case with the ALICE gateways).

  • Some sort of xrootd manager tree setup might work for WN gateway containers.
    This could be similar to CMS AAA, with a hierarchy for access, but the first point of contact should be highly available.

  • A single gateway failing on a worker node should not cause all of that node's jobs to fail. Currently there is no failover built in for WN reads, so if the gateway is down, all jobs on that WN will fail.

  • A functional-test-equivalent healthcheck for the WN gateway would ensure a broken gateway is killed and restarted, and would let Condor know if the gateway is still down. This would stop new jobs being sent to a WN with a broken gateway, though the jobs already on it would still run (see the sketch after this list).

  • The solution should strongly prefer a WN’s own gateway. Ideally there would be a fallback mechanism where a transfer attempts its own gateway first and fails over to a neighbouring WN's gateway if it is unavailable.

  • cmsd is not smart enough to handle a mix of read-only and read/write servers in the same cluster (Sam attempted this at Glasgow during early 5.x).

  • There is a strong preference for having the same endpoint for reads and writes (i.e. removing Xcache). This makes the configuration simpler and allows it to be managed by a cmsd redirector without issues.
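
A minimal sketch of what the healthcheck bullet above could look like, assuming a local gateway on port 1094 and a systemd unit named xrootd@gateway (both assumptions). The exit code and the "WN_GATEWAY_HEALTHY = ..." line are written so the script could be wired into an HTCondor startd cron job, which publishes "Attr = value" lines from stdout into the machine ClassAd:

    #!/usr/bin/env python3
    # WN gateway healthcheck sketch: a functional-test equivalent.
    # Assumptions: local gateway on port 1094, unit name xrootd@gateway.
    import subprocess
    import sys

    GATEWAY = "localhost:1094"

    def gateway_healthy(timeout: int = 10) -> bool:
        """Ask the local xrootd for a config value; failure means unhealthy."""
        try:
            result = subprocess.run(
                ["xrdfs", GATEWAY, "query", "config", "version"],
                capture_output=True, timeout=timeout,
            )
            return result.returncode == 0
        except (subprocess.TimeoutExpired, FileNotFoundError):
            return False

    if __name__ == "__main__":
        if gateway_healthy():
            print("WN_GATEWAY_HEALTHY = True")
            sys.exit(0)
        # Kill/restart the broken gateway, then report it as down so no
        # new jobs are matched to this WN until it recovers.
        subprocess.run(["systemctl", "restart", "xrootd@gateway"])
        print("WN_GATEWAY_HEALTHY = False")
        sys.exit(1)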

 

  • A: evaluate whether Xcache can be removed with XrdCeph buffers enabled (measure IOPS on a single WN; a measurement sketch follows this list)

  • A: design a better solution for the gateways on the WNs

  • A: create redirector managers for ALICE and S3

  • A: develop cmsd redirector capability to preferentially redirect onto a WN’s own gateway, and to include xcaches in the redirector in a mixed gateway setup
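
For the first action, a rough measurement sketch using the XRootD Python bindings: issue a burst of small random reads through the local gateway and report client-side read ops/sec. The endpoint and test file path are placeholders; the idea would be to run it once against the Xcache endpoint and once against a direct XrdCeph gateway with buffers enabled, and compare.

    #!/usr/bin/env python3
    # Small-random-read rate through a gateway; URL/path are placeholders.
    import random
    import time
    from XRootD import client

    URL = "root://localhost:1094//atlas/test/large_test_file"
    READS = 1000
    READ_SIZE = 4096  # bytes per read

    with client.File() as f:
        status, _ = f.open(URL)
        assert status.ok, status.message
        status, info = f.stat()
        assert status.ok, status.message

        start = time.time()
        for _ in range(READS):
            offset = random.randrange(0, max(1, info.size - READ_SIZE))
            status, _ = f.read(offset=offset, size=READ_SIZE)
            assert status.ok, status.message
        elapsed = time.time() - start

    print(f"{READS} reads in {elapsed:.2f}s -> {READS / elapsed:.1f} ops/s")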

 

XRootD Releases

 

5.6.3-1 is out

Glasgow and Lancaster have been using it (EL7 and Rocky 8) (no cmstfc post CentOS 7).

May solve the problem with WN gateways that has been seen recently
(i.e. it fixes “Deadlock in XCache's XrdCl instance”, xrootd/xrootd issue #1979).

Note that cmsTFC needs to be compiled from source on EL8+, and that CMake errors on a particular ‘warning’.

 

Checksums fixes

 

Deployed to a single production server; to be deployed more widely next week, with speed tests.
There is a concern that the redirector may introduce some additional latency.

 

Prefetch studies and WN changes

Alex

Partial deployment over the farm is planned to resume in the week of the 20th.

Timeouts are being increased to reduce failures of some async requests (see the sketch below).
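
It is not recorded here exactly which timeouts were raised; one plausible knob is the client-side XrdCl timeouts, which can be set via environment variables before the client initialises. A hedged sketch (the values are illustrative, not the ones chosen for the farm):

    #!/usr/bin/env python3
    # Raise XrdCl request/stream timeouts; values are illustrative only.
    import os

    os.environ["XRD_REQUESTTIMEOUT"] = "1800"  # seconds per request
    os.environ["XRD_STREAMTIMEOUT"] = "120"    # stream inactivity timeout

    # Import after setting the variables so the client picks them up.
    from XRootD import client

    fs = client.FileSystem("root://localhost:1094")  # placeholder endpoint
    status, _ = fs.ping()
    print(status.message)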

 

Deletion studies through RDR

Ian

To follow up by looking for differences between deletion requests made through XRootD and via rados commands directly (to spot where the ‘long tail’ may originate).

For Echo metrics, the Kibana page is useful: https://kibana.gridpp.rl.ac.uk/goto/c962c51ce13283e9853334bbc79ca801
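
A sketch of how the comparison could be timed from one node, assuming the redirector endpoint, pool name and object naming below (all placeholders), and that the two sets of test objects already exist and do not overlap:

    #!/usr/bin/env python3
    # Time deletions via XRootD vs rados directly, to look for a long tail.
    import subprocess
    import time

    ENDPOINT = "root://redirector.example.ac.uk:1094"  # placeholder
    POOL = "atlas"                                     # placeholder pool
    XROOTD_PATHS = [f"/{POOL}/test/del_x_{i:04d}" for i in range(100)]
    RADOS_OBJECTS = [f"test.del_r_{i:04d}" for i in range(100)]

    def timed(cmd: list[str]) -> float:
        start = time.time()
        subprocess.run(cmd, check=True, capture_output=True)
        return time.time() - start

    xrootd_times = sorted(timed(["xrdfs", ENDPOINT, "rm", p]) for p in XROOTD_PATHS)
    rados_times = sorted(timed(["rados", "-p", POOL, "rm", o]) for o in RADOS_OBJECTS)

    for name, times in (("xrootd", xrootd_times), ("rados", rados_times)):
        n = len(times)
        print(f"{name}: median {times[n // 2]:.3f}s  "
              f"p99 {times[int(n * 0.99)]:.3f}s  max {times[-1]:.3f}s")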

 

Gateways: observations

 

 

 

Tokens testing

 

To liaise with the Token Trust Traceability Taskforce (aka @Matt Doidge).

Report due by the end of this month.

CMS GGUS ticket for enabling token auth:
deployment planned in the week of the 20th.
CC: https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/296517633
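
A possible smoke test ahead of the deployment, assuming the WLCG bearer-token discovery convention (BEARER_TOKEN environment variable) and XRootD's ztn security protocol; the token location, endpoint and destination path are placeholders:

    #!/usr/bin/env python3
    # Token-auth copy smoke test; endpoint/paths/token location are placeholders.
    import os
    import subprocess

    with open("/tmp/wlcg_token") as t:
        os.environ["BEARER_TOKEN"] = t.read().strip()
    os.environ["XrdSecPROTOCOL"] = "ztn"  # prefer token auth

    subprocess.run(
        ["xrdcp", "-f", "/etc/hostname",
         "root://gateway.example.ac.uk:1094//cms/test/token_probe"],
        check=True,
    )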

 

SKA Gateway box

 

https://stfc.atlassian.net/wiki/spaces/UK/pages/215941180

Deneb-dev routing is still needed (on the switch/router side).

Some tests with Ceph-dev and changes to the rados striper.

The difference between upload and download speeds may be because uploads read from local disk while downloads write to /dev/null (to be repeated with tmpfs; see the sketch below).
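
A sketch of the tmpfs repeat, assuming /dev/shm is tmpfs-backed and using placeholder endpoint and paths: stage a test file in tmpfs so neither direction touches local disk, then time xrdcp both ways.

    #!/usr/bin/env python3
    # Upload/download timing with a tmpfs-backed local file (placeholder paths).
    import os
    import subprocess
    import time

    ENDPOINT = "root://ska-gw.example.ac.uk:1094"   # placeholder gateway
    REMOTE = f"{ENDPOINT}//ska/test/tmpfs_probe"    # placeholder remote path
    LOCAL = "/dev/shm/tmpfs_probe"                  # tmpfs-backed local file
    SIZE_MB = 1024

    with open(LOCAL, "wb") as f:
        for _ in range(SIZE_MB):
            f.write(os.urandom(1024 * 1024))

    def timed_copy(src: str, dst: str) -> float:
        start = time.time()
        subprocess.run(["xrdcp", "-f", src, dst], check=True)
        return time.time() - start

    up = timed_copy(LOCAL, REMOTE)
    down = timed_copy(REMOTE, LOCAL + ".down")
    print(f"upload {SIZE_MB / up:.1f} MB/s, download {SIZE_MB / down:.1f} MB/s")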

 

WN Xcache issue

 

A futex lock is hard-locking the Xcache proxy on WNs (possibly an occurrence of “Deadlock in XCache's XrdCl instance”, xrootd/xrootd issue #1979).

 

containerised gateways (kubernetes cluster)

 

Working, but a few bugs still need ironing out, and the setup needs scaling up.

 

 

on GGUS:

Site reports

Lancaster - Revisiting Tokens config after a long hiatus, aiming to have ATLAS tokens working for DC24. Testing is proving problematic as Matt keeps mucking up the client side.

Glasgow -

 

 

 

 Action items

  •  

 

 Decisions