Page Comparison

...

new network traffic

storage monitoring

Kibana dashboard for WN/tranche IOPS monitoring

Echo storage node IOPS (per generation)

XrootD production changes

External gateways

9th May 2023: pgwrite bugfix rolled out on external gateways (deemed irrelevant to the incident)

Batch farm:

9th May (9:30 am): Draining of first half of worker-nodes
11th May (9:30 am): Update drained worker-nodes

Bring back online updated tranches

12th May (16:00): Drain remaining half of worker-nodes
15th May (14:00): Update drained worker-nodes

Bring back online updated tranches
Health check entire farm

Plots and associated info

...

This was found to be due to a missed line change in the Dockerfile.

The hard limit for read IOPS before the crash, as seen in Ceph monitoring, seems to be 150k, with a desirable rate of <100k. The current rate (without readV) is 30k.
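
To make the headroom explicit, here is a minimal sketch that compares a read IOPS rate against the two thresholds quoted above. The function names and the "ok" / "warning" / "critical" labels are hypothetical and not taken from any existing monitoring configuration. The numbers are the approximate values read from Ceph monitoring.

```python
# Approximate thresholds from Ceph monitoring (see text); not exact limits.
HARD_LIMIT_IOPS = 150_000     # rate seen shortly before the crash
DESIRABLE_MAX_IOPS = 100_000  # preferred ceiling for normal running
CURRENT_IOPS = 30_000         # current rate without readV


def classify_read_iops(rate: float) -> str:
    """Return a hypothetical status label for a read IOPS rate."""
    if rate >= HARD_LIMIT_IOPS:
        return "critical"
    if rate >= DESIRABLE_MAX_IOPS:
        return "warning"
    return "ok"


def headroom_factor(rate: float, ceiling: float = DESIRABLE_MAX_IOPS) -> float:
    """Return how many times the rate could grow before reaching the ceiling."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return ceiling / rate


if __name__ == "__main__":
    print(classify_read_iops(CURRENT_IOPS))
    print(round(headroom_factor(CURRENT_IOPS), 2))
    print(headroom_factor(CURRENT_IOPS, HARD_LIMIT_IOPS))
```

Under these assumptions, the current rate is about a third of the desirable ceiling and a fifth of the apparent hard limit.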