  1. An alert is received from Opsgenie.

  2. Check the Icinga test status and output: https://icinga.scd.stfc.ac.uk/icingaweb2/search?q=webdav_service#!/icingaweb2/icingadb/service?name=ha-check_ceph_xrootd_webdav_service&host.name=echo-manager01.gridpp.rl.ac.uk

  3. Check the load distribution across the gateways: https://vande.gridpp.rl.ac.uk/next/d/0AnwKrEVk/xrootd-manager-monitoring?orgId=1&refresh=1m&from=now-6h&to=now&var-hosts=echo-manager01.gridpp.rl.ac.uk&var-hosts=echo-manager02.gridpp.rl.ac.uk&var-Bin=1m&var-rp=1_day&var-prefix=mean_&var-time=1_week

  4. If a single gateway shows high load, check its general health by searching for its hostname in Icinga: https://icinga.scd.stfc.ac.uk/icingaweb2/dashboard

  5. Check for crash dumps on the hosts. If a host shows its disk near full, clear old dumps.
    Location: /var/spool/xrootd/unified

  6. Restart the gateway services if the general load is not too high:
    systemctl restart xrootd@{unified,tpc}.service
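
Steps 5 and 6 could be scripted roughly as follows. This is a minimal sketch, not part of the documented procedure: the function names are hypothetical, the age threshold is left as a parameter because the procedure does not define what counts as an "old" dump, and the listing function only prints candidates so that nothing is deleted without review.

```shell
#!/bin/sh
# Sketch of steps 5 and 6. Function names are illustrative, and any age
# threshold passed in is an assumption, not a value from this procedure.

# Print the usage percentage (digits only) of the filesystem holding a directory.
dump_dir_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# List regular files directly under a directory that were last modified
# more than N days ago. Nothing is deleted; review the list first.
list_old_dumps() {
    find "$1" -maxdepth 1 -type f -mtime +"$2" -print
}

# Restart one xrootd instance (unified or tpc) and report whether it came back.
restart_instance() {
    systemctl restart "xrootd@$1.service" &&
        systemctl is-active --quiet "xrootd@$1.service" &&
        echo "xrootd@$1.service is active"
}
```

For example, after sourcing the functions, `dump_dir_usage_pct /var/spool/xrootd/unified` shows how full the disk is, `list_old_dumps /var/spool/xrootd/unified 7` lists dumps older than an assumed 7 days, and `restart_instance unified` followed by `restart_instance tpc` restarts the two services one at a time.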
