🗓 Date

👥 Participants

Apologies:

James Walder, Maybe Matt if he’s still stuck in the machine room.

CC:

🥅 Goals

  • List of Epics

  • New tickets

  • Consider new functionality / items

  • Detailed discussion of important topics

  • Site report activity

🗣 Discussion topics

Current status of Echo Gateways / WNs testing

Recent sandboxes for review / deployments:

Item

Presenter

Notes

Operational Issues
Gateways and WNs:
- Current status and upcoming changes

Worker Node writable XCache fixed and deployed in pre

Ceph upgrade ongoing

Compilation and rollout status of RAL XRootD versions

Thomas, Jyothish (STFC,RAL,SC)

5.7.3 released (awaiting other changes to gateways)

XRootD collaboration Meeting

https://indico.cern.ch/event/1510817/

Requirements gathering session.

Officially joining the collaboration requires pledging FTEs, but we can still submit patches and PRs without that.

6.0 features


improve CI/security/stability
drop Python 2 support
timeout changes breaking ABI
improve error handling
review long-term HTTP client in XrdCl
C++20
cache fixes for reading replicas
process metalink
reflink file cloning (xrootd erasure coding)

https://github.com/orgs/xrootd/projects/1 for full list


general HTTP/davs improvements requested by various communities, with possibility of davix client being merged into xrootd

The use of xrd.network and the fallback manager by the ALICE analysis facilities looked interesting, as did their custom quota plugin. The plugin can set a quota per individual user, but they have used it to set a quota on each server.

general wish for improvements in logging and error messages.

After the meeting I had a chat with Guilherme and proposed a mid/long-term plan for a GeoIP-based redirection algorithm. This would let us run a CMSD cluster for the batch farm, and would also help with CMS AAA.
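To make the GeoIP idea concrete, here is a minimal sketch of the selection step only: given a client location and a set of candidate endpoints, pick the geographically closest one. It assumes the IP-to-coordinates lookup has already been done by some GeoIP database. The `Endpoint` type, the `pick_endpoint` function, the hostnames and the coordinates are all hypothetical; none of this is XRootD or CMSD code, and a real redirector would also have to weigh load and availability.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    host: str   # hypothetical hostname, for illustration only
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def pick_endpoint(client_lat, client_lon, endpoints):
    """Return the endpoint geographically closest to the client."""
    if not endpoints:
        raise ValueError("no endpoints available")
    return min(
        endpoints,
        key=lambda e: haversine_km(client_lat, client_lon, e.lat, e.lon),
    )


if __name__ == "__main__":
    # Approximate, made-up site coordinates (assumption).
    candidates = [
        Endpoint("site-a.example.org", 51.57, -1.31),
        Endpoint("site-b.example.org", 46.23, 6.05),
    ]
    print(pick_endpoint(54.05, -2.80, candidates).host)
```

Distance alone is a crude proxy for network proximity, so this should be read as the simplest possible baseline rather than the proposed algorithm itself.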

cms-aaa naming convention

Thomas, Jyothish (STFC,RAL,SC)

cms-aaa is the only remaining personality to use proxy/ceph as the xrootd service names


Separate naming convention would be more appropriate, to have main/supporting

(not so urgent).

CC created, and sandbox is prepared and has been tested on a test host

cms-aaa jemalloc use

Thomas, Jyothish (STFC,RAL,SC)

testing on svc20, some memory leak still present

Shoveler

Katy Ellis

Shoveler installation and monitoring

Had a discussion with Andy and Guilherme, as well as a new developer in the team. There is not much appetite to support it going forward. One option that was floated is that, once the logging improvements are made, the logs could be fed into a standard log parser and pushed into Elasticsearch directly.
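As a rough illustration of the "log parser into Elasticsearch" option, the sketch below turns log lines into the newline-delimited JSON body that the Elasticsearch bulk API accepts (an action line followed by a document line, with a trailing newline). The log line layout, the field names and the index name `xrootd-transfers` are assumptions made up for this example; the real format would depend on the outcome of the logging improvements.

```python
import json
import re

# Hypothetical log layout: "<ISO timestamp> <host> <operation> <path> <bytes>".
# Real XRootD log lines look different; this is an assumption for the sketch.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) "
    r"(?P<host>\S+) (?P<op>\w+) (?P<path>\S+) (?P<bytes>\d+)$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    doc = m.groupdict()
    doc["bytes"] = int(doc["bytes"])
    return doc


def to_bulk_body(lines, index="xrootd-transfers"):
    """Build an Elasticsearch bulk (NDJSON) body from parseable log lines."""
    parts = []
    for line in lines:
        doc = parse_line(line)
        if doc is None:
            continue  # skip lines that do not match the assumed layout
        parts.append(json.dumps({"index": {"_index": index}}))
        parts.append(json.dumps(doc))
    if not parts:
        return ""
    return "\n".join(parts) + "\n"
```

The resulting string would be sent to the cluster's `_bulk` endpoint with content type `application/x-ndjson`; in practice an off-the-shelf shipper would probably replace hand-written code like this.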

On the fly Checksums
XRD-98

Ian Johnson

Deletions

XRD-83

NTR

XRootD Writable Worker Node Gateway Hackathon

Thomas, Jyothish (STFC,RAL,SC)

rolled back due to memory issues related to buffer size.

An Xrd-ceph version with write-only buffering is deployed on the LHCb-only WN (lcg2345). LHCb jobs are again writing data from the preprod farm to ECHO after a short break.


Plan: file query system to summarize XRootD Logs

Plan to create a system that stores information from all gateways, so that a filename can be searched to get its creation time, last write time, last successful stat and deletion time. This is aimed at investigating ‘lost’ files. Possible graduate side project.
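One possible shape for the per-file record is sketched below: events collected from all gateways are folded into a single summary per path. The event tuple layout, the operation names (`create`, `write`, `stat_ok`, `delete`) and the `FileSummary` type are assumptions for illustration, not an agreed design. Timestamps are assumed to be ISO-8601 UTC strings, so sorting them as strings is chronological.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileSummary:
    created: Optional[str] = None       # earliest create seen
    last_write: Optional[str] = None    # most recent write
    last_stat_ok: Optional[str] = None  # most recent successful stat
    deleted: Optional[str] = None       # most recent deletion


def summarise(events):
    """Fold (timestamp, gateway, op, path) events into one summary per path."""
    index = {}
    # Sorting the tuples orders events by timestamp first, across gateways.
    for ts, _gateway, op, path in sorted(events):
        summary = index.setdefault(path, FileSummary())
        if op == "create":
            if summary.created is None:
                summary.created = ts
        elif op == "write":
            summary.last_write = ts
        elif op == "stat_ok":
            summary.last_stat_ok = ts
        elif op == "delete":
            summary.deleted = ts
    return index
```

A lookup is then a plain dictionary access, for example `summarise(events).get("/some/path")`; a real system would need persistent storage and would have to decide how to treat files that are deleted and later recreated.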

100 GbE Gateway testing:
SKA / Tier-1

James Walder; Thomas, Jyothish (STFC,RAL,SC)

UKSRC - Acting as source for SRCNet verification tests; not being stressed so far …

Tier-1.

UKSRC Storage Architecture

Tom B. is working on the cephadm setup for the cluster. JW is attempting to reinstall the hosts.

Tokens Status

  • Operational

  • Technical

  • Accounting

 

on GGUS:

Site reports

Lancaster:

So we found out the hard way that running Ceph with 25% of your servers on 1Gb NICs just doesn’t work for any load of any significance. Luckily the replacement 25Gb NICs have started arriving.


Glasgow -

✅ Action items

⤴ Decisions
