...

...

Item

Presenter

Notes

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

Writable Worker Node XCache fixed and deployed on the prelhcb nodes

Ceph upgrade ongoing - Quincy

Compilation and rollout status of RAL XRootD versions

Thomas, Jyothish (STFC,RAL,SC)

5.7.3 released (awaiting other changes to gateways)

XRootD collaboration Meeting

https://indico.cern.ch/event/1510817/

Requirements gathering session.

Officially joining the collaboration requires pledging FTEs, but we can still submit patches and PRs without that.

6.0 features


improve CI/security/stability
drop Python 2 support
timeout changes breaking ABI
improve error handling
review long-term HTTP client in XrdCl
C++20
cache fixes for reading replicas
process metalink
reflink file cloning (XRootD erasure coding)

https://github.com/orgs/xrootd/projects/1 for full list


General HTTP/davs improvements were requested by various communities, with the possibility of the davix client being merged into XRootD.

The use of xrd.network and the fallback manager by the ALICE analysis facilities looked interesting, as did their custom quota plugin. The plugin can set a quota per individual user, but they have used it to set a quota on each server.

There was a general wish for improvements in logging and error messages.

I had a chat with Guilherme afterwards and proposed a mid/long-term plan for a GeoIP-based redirection algorithm. This would benefit us by allowing a CMSD cluster for the batch farm, and it would also help with CMS AAA.
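A minimal sketch of what a GeoIP-based selection step could look like, assuming the client's coordinates have already been obtained from a GeoIP lookup (not shown). The server names and coordinates are hypothetical and only approximate; this is not how CMSD currently selects servers.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_server(client, servers):
    """Return the name of the server closest to the client's coordinates."""
    return min(servers, key=lambda name: great_circle_km(client, servers[name]))


# Hypothetical endpoints with rough coordinates, for illustration only.
servers = {"ral-gw": (51.57, -1.31), "cern-gw": (46.23, 6.05)}
choice = nearest_server((51.5, -0.12), servers)
```

A real redirector would presumably weigh distance against load and availability rather than use distance alone.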

cms-aaa naming convention

Thomas, Jyothish (STFC,RAL,SC)

cms-aaa is the only remaining personality to use proxy/ceph as the xrootd service names


A separate naming convention (main/supporting) would be more appropriate, though this is not urgent.

CC created; the sandbox is prepared and has been tested on a test host.

cms-aaa jemalloc use

Thomas, Jyothish (STFC,RAL,SC)

Testing on svc20; some memory leak is still present.

Shoveler

Katy Ellis

Shoveler installation and monitoring

Had a discussion with Andy and Guilherme, as well as a new developer in the team. There is not much appetite to support it going forward. One option that was floated is that, once the logging improvements are made, the logs could be passed into a standard log parser and pushed into Elasticsearch directly.

On the fly Checksums

Jira: XRD-98

Ian Johnson

Logging of streamed and readback checksums is progressing. Config options now allow the new version of XrdCeph to:

  • Do nothing different to usual (default behaviour if no options selected),

  • Calculate the streamed checksum but do nothing with it (to allow measuring impact on CPU and memory utilisation),

  • Log the streamed and readback checksum (CSV for import into database),

  • Store the streamed checksum in an extended attribute (different attribute name to “XrdCks.adler32”?).

Would like to run one gateway during testing period with the “do nothing” option as a reference.


Action: check tomorrow for test readiness, test on Friday afternoon, and possibly do a limited deployment during the mini-DC.
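For reference, a rough Python sketch of the streamed versus readback comparison and the CSV logging described above. It is not the XrdCeph implementation (which is C++); the function names and CSV column names are assumptions made for illustration.

```python
import csv
import io
import zlib


def streamed_adler32(chunks):
    """Fold chunks into a running Adler-32, as a writer could while data streams in."""
    value = 1  # Adler-32 starting value
    for chunk in chunks:
        value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


def checksum_row(path, chunks, stored_bytes):
    """Build one log row comparing the streamed checksum with a readback checksum."""
    streamed = streamed_adler32(chunks)
    readback = zlib.adler32(stored_bytes) & 0xFFFFFFFF
    return [path, f"{streamed:08x}", f"{readback:08x}", str(streamed == readback)]


def rows_to_csv(rows):
    """Render rows as CSV text suitable for import into a database."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    # Hypothetical column names.
    writer.writerow(["path", "streamed_adler32", "readback_adler32", "match"])
    writer.writerows(rows)
    return buf.getvalue()


chunks = [b"Wiki", b"pedia"]
row = checksum_row("/store/example.root", chunks, b"".join(chunks))
report = rows_to_csv([row])
```

Because Adler-32 can be updated chunk by chunk, the streamed value needs no second pass over the data; only the readback value requires reading the object again.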

Deletions

Jira: XRD-83

NTR, apart from revising SQL queries from previous deletion report scripts.

XRootD Writable Worker Node Gateway Hackathon

Thomas, Jyothish (STFC,RAL,SC)

rolled back due to memory issues related to buffer size.

The Xrd-ceph version with write-only buffering is deployed on the LHCb-only WN (lcg2345). LHCb jobs are again writing data from the preprod farm to ECHO after a short break.


Plan: file query system to summarize XRootD Logs

The plan is to create a system that stores information from all gateways, so that a filename can be searched to get its creation time, last write time, last successful stat and deletion time in the case of ‘lost’ files. Possible graduate side project.

Ian plans to extend the database schema from the deletion tests (capturing file write completions and deletions) into a more general event schema.
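One possible shape for such an event store, sketched with SQLite. The table name, column names and event types are all hypothetical and do not reflect the existing deletion-test schema. Timestamps are assumed to be ISO 8601 UTC strings, so that text ordering matches time ordering.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS file_events (
    gateway TEXT NOT NULL,
    path    TEXT NOT NULL,
    event   TEXT NOT NULL CHECK (event IN ('create', 'write', 'stat_ok', 'delete')),
    ts      TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_file_events_path ON file_events (path);
"""

SUMMARY = """
SELECT
    MIN(CASE WHEN event = 'create'  THEN ts END),
    MAX(CASE WHEN event = 'write'   THEN ts END),
    MAX(CASE WHEN event = 'stat_ok' THEN ts END),
    MAX(CASE WHEN event = 'delete'  THEN ts END)
FROM file_events
WHERE path = ?
"""


def open_db(location=":memory:"):
    """Open (or create) the event database and make sure the schema exists."""
    conn = sqlite3.connect(location)
    conn.executescript(SCHEMA)
    return conn


def record(conn, gateway, path, event, ts):
    """Store one event parsed from a gateway log."""
    conn.execute("INSERT INTO file_events VALUES (?, ?, ?, ?)",
                 (gateway, path, event, ts))


def summarise(conn, path):
    """Return the per-file summary across all gateways; missing events are None."""
    created, last_write, last_stat, deleted = conn.execute(SUMMARY, (path,)).fetchone()
    return {"created": created, "last_write": last_write,
            "last_stat_ok": last_stat, "deleted": deleted}


conn = open_db()
record(conn, "gw01", "/lhcb/a.dst", "create", "2025-03-01T10:00:00Z")
record(conn, "gw01", "/lhcb/a.dst", "write", "2025-03-01T10:00:05Z")
record(conn, "gw02", "/lhcb/a.dst", "stat_ok", "2025-03-02T08:30:00Z")
summary = summarise(conn, "/lhcb/a.dst")
```

Keeping one row per event, rather than one row per file, leaves room for the more general event schema mentioned above.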

100 GbE Gateway testing:
SKA / Tier-1

James Walder; Thomas, Jyothish (STFC,RAL,SC)

UKSRC - Acting as source for SRCNet verification tests; not being stressed so far …

Tier-1.

UKSRC Storage Architecture

Tom B. is working on the CephAdm setup for the cluster. JW is attempting to reinstall the hosts.

Tokens Status

  • Operational

  • Technical

  • Accounting

...