...

An alert is received from Opsgenie.

...

Check the Icinga test status and output: https://icinga.scd.stfc.ac.uk/icingaweb2/search?q=webdav_service#!/icingaweb2/icingadb/service?name=ha-check_ceph_xrootd_webdav_service&host.name=echo-manager01.gridpp.rl.ac.uk

...

Check the load distribution across the gateways: https://vande.gridpp.rl.ac.uk/next/d/0AnwKrEVk/xrootd-manager-monitoring?orgId=1&refresh=1m&from=now-6h&to=now&var-hosts=echo-manager01.gridpp.rl.ac.uk&var-hosts=echo-manager02.gridpp.rl.ac.uk&var-Bin=1m&var-rp=1_day&var-prefix=mean_&var-time=1_week

...

If a single gateway shows high load, check its general health by searching for its hostname in Icinga.
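
As a rough aid for spotting a single overloaded gateway, the sketch below compares each gateway's load with the median across all gateways. The hostnames, load figures, and thresholds are made up for illustration; real values would have to be read off the dashboard by hand or by some other means not covered here.

```python
from statistics import median


def find_overloaded(loads, factor=2.0, floor=1.0):
    """Return hostnames whose load is above `floor` and above
    `factor` times the median load of all gateways.

    `loads` maps hostname -> load average. Both thresholds are
    illustrative assumptions, not values from this runbook.
    """
    if not loads:
        return []
    mid = median(loads.values())
    return sorted(
        host for host, load in loads.items()
        if load > floor and load > factor * mid
    )


# Hypothetical sample data: one gateway is far above the rest.
sample = {
    "gateway-a": 3.1,
    "gateway-b": 2.8,
    "gateway-c": 19.5,
    "gateway-d": 3.4,
}
print(find_overloaded(sample))  # ['gateway-c']
```

Using the median rather than the mean keeps one very busy gateway from dragging the baseline up and hiding itself.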

...

Check for crash dumps on the hosts; if a host shows its disk near full, clear old dumps.
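
The disk check and the search for old dumps could be sketched as below. This is only a dry run that lists candidates and deletes nothing. The dump directory, the 14-day age limit, and the function names are assumptions for illustration; the actual crash dump location on the gateways is not stated here.

```python
import shutil
import time
from pathlib import Path


def disk_used_fraction(path):
    """Fraction (0.0 to 1.0) of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def old_dumps(dump_dir, max_age_days=14, now=None):
    """List regular files in `dump_dir` last modified more than
    `max_age_days` ago. Nothing is deleted."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(dump_dir).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Hypothetical location; replace with the real dump directory.
    dump_dir = Path("/path/to/crash/dumps")
    if dump_dir.is_dir():
        print(f"disk used: {disk_used_fraction(dump_dir):.0%}")
        for dump in old_dumps(dump_dir):
            print("candidate for removal:", dump)
```

Listing candidates first leaves the decision about which dumps to remove with the person on call.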

...

See the Ceph documentation here:

/wiki/spaces/CD/pages/266600449