2022-11-3 Meeting Notes
Date
Nov 3, 2022
Participants
@James Walder
@Emmanuel Bejide
Lancs: Apologies (Gerard, Matt), Steven
Goals
List of Epics
New tickets
Consider new functionality / items
Detailed discussion of important topics
Site report activity
Discussion topics
| Item | Presenter | Notes | |
|---|---|---|---|
| 5.5.1 released | | Reports of significant issues (yet to check whether these have been filed as issues on GitHub) | |
| Xcache 5.5.X problems | | https://github.com/xrootd/xrootd/issues/1808 No update so far | |
| Thoughts on combining the xrootd and webdav aliased hosts? | | Interest in changing the DNS alias (next week?) so that xrootd and webdav use a common set of hosts? | |
| Unified Sandbox | | Initial review with Tom; some cleanup and a couple of missed config settings (due to the tpc instance that is to be added). | |
| CMSD | | Trying to find someone to talk to about RAL VMware | |
| Slow stats | | Ongoing study; separate out the “unhappy” gateway scenario from other items? | |
| Vector Read requests on Echo Gateways | | Discussing with Ian and Alex. Alex able to reproduce with Rob’s scripts; some testing ideas. | |
| Slow deletes https://ggus.eu/index.php?mode=ticket_info&ticket_id=159395 | | Here’s my timeline for the file /lhcb/MC/2017/SIM/00170176/0001/00170176_00011436_1.sim (on the webdav aliased hosts):<br>svc02: Initial write<br>221101 03:56:38 File descriptor 133666 associated to file /lhcb:buffer/lhcb/MC/2017/SIM/00170176/0001/00170176_00011436_1.sim opened in write mode<br>221101 03:57:02 ceph_close: closed fd 133666 for file buffer/lhcb/MC/2017/SIM/00170176/0001/00170176_00011436_1.sim, read ops count 0, write ops count 30, async write ops 0/0, async pending write bytes 0, async read ops 0/0, bytes written/max offset 501311973/501311972, longest async write 0.000000, longest callback invocation 0.000000, last async op age 0.000000<br>svc01: Checksum<br>svc99: Unlink<br>221101 05:05:37 ceph_stat: /lhcb:buffer/lhcb/MC/2017/SIM/00170176/0001/00170176_00011436_1.sim<br>221101 05:05:37 ceph_posix_unlink : /lhcb:buffer/lhcb/MC/2017/SIM/00170176/0001/00170176_00011436_1.sim | https://stfc.atlassian.net/browse/XRD-52 |
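The intervals in the slow-deletes timeline above can be cross-checked with a short script. This is a minimal sketch, assuming the log timestamps use the `YYMMDD HH:MM:SS` layout seen in the excerpts and that the clocks on the gateway hosts are synchronised. The helper names are illustrative and not part of XRootD or XrdCeph.

```python
from datetime import datetime

# Assumption: timestamps follow the "YYMMDD HH:MM:SS" layout seen in the log excerpts.
LOG_TIME_FORMAT = "%y%m%d %H:%M:%S"


def parse_log_time(stamp: str) -> datetime:
    """Parse a timestamp such as '221101 03:56:38' (hypothetical helper)."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    return (parse_log_time(end) - parse_log_time(start)).total_seconds()


# Values copied from the log lines quoted in the notes.
opened = "221101 03:56:38"    # file opened in write mode (svc02)
closed = "221101 03:57:02"    # ceph_close (svc02)
unlinked = "221101 05:05:37"  # ceph_posix_unlink (svc99)
bytes_written = 501311973

write_seconds = seconds_between(opened, closed)
lifetime_seconds = seconds_between(closed, unlinked)
throughput_mb_s = bytes_written / write_seconds / 1e6

print(f"write duration: {write_seconds:.0f} s")
print(f"average write rate: {throughput_mb_s:.1f} MB/s")
print(f"close to unlink: {lifetime_seconds:.0f} s")
```

With the quoted values this gives a 24 s write at roughly 20.9 MB/s, and 4115 s between the close and the unlink.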
GGUS:
Site reports
Glasgow
Sam notes that “unhappy” OSDs might be the cause of stalled operations; restarting xrootd ‘fixes’ things
* How to make a dev OSD unhappy
* How to manage this in ceph.conf, or XrdCeph.
Lancaster:
ECDF:
Manchester:
Action items
JW to create a ticket for the Storage teams for VMware-based CMSD testing
Consider combining xrootd and webdav aliased hosts once Sandbox is deployed to prod.
An abstract on xrootd-related work should be prepared for CHEP.
Location of Tom’s Xcache vs memcache: https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/XRootDVectorReadTestProgram
Plan how to use ceph dev to test scenarios where an OSD is problematic but not yet marked as out of the cluster; vary the tuning parameters and characterise performance.
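For the action item on combining the xrootd and webdav aliased hosts, one way to see how far the two aliases already overlap is to compare the address sets they resolve to. This is a minimal sketch using only the Python standard library. The function names are illustrative, the addresses in the usage example are documentation-range placeholders, and the real alias names would need to be substituted by whoever runs it.

```python
import socket


def resolve_addresses(alias: str, port: int) -> set[str]:
    """Return the set of IP addresses that a DNS alias currently resolves to."""
    infos = socket.getaddrinfo(alias, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return {info[4][0] for info in infos}


def compare_host_sets(xrootd_hosts: set[str], webdav_hosts: set[str]) -> dict[str, set[str]]:
    """Split two address sets into shared and alias-specific members."""
    return {
        "shared": xrootd_hosts & webdav_hosts,
        "xrootd_only": xrootd_hosts - webdav_hosts,
        "webdav_only": webdav_hosts - xrootd_hosts,
    }


# Usage with placeholder addresses; in practice the two sets would come from
# resolve_addresses(<xrootd alias>, 1094) and resolve_addresses(<webdav alias>, 443).
example = compare_host_sets(
    {"192.0.2.1", "192.0.2.2"},
    {"192.0.2.2", "192.0.2.3"},
)
for label, hosts in example.items():
    print(label, sorted(hosts))
```

If both alias-specific sets are empty, the two aliases already point at the same hosts.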