...

Useful pages for monitoring

Gateway status monitoring

XrootD report monitoring

Slow I/O monitoring

Legacy network traffic

New network traffic

Storage monitoring

Plots and associated info

Please add a screenshot (or more) of the plot, and the timestamped URL from which it was obtained.
For URLs that are generally useful, please also add them to the section above, with a brief description.


The problem was first detected when the gateway functional tests started failing on Friday evening. Restarting the gateways did not fix the issue. Memory spikes correlate with increased connections in the XrootD report monitoring shown below,

...

which suggested a slowdown or other issue in Ceph. This was further supported by the slow operation monitoring

...