...
Check the current transfer load on the gateways through the Grafana dashboard.
If the average throughput is more than 22 Gb/s (>90% of maximum network capacity), do not proceed.
For each host or batch of hosts that is currently in production use:
Run the following command, where hostname_prefix is the part of the hostname before .gridpp.rl.ac.uk, for example ceph-svc01:
bash blacklist.sh <hostname_prefix>
Wait until the traffic drops (usually 5-15 min).
SSH into the host and run "reboot".
Wait for the host to come back (10-20 min).
Check that the systemd services xrootd@{unified,tpc} and cmsd@unified are active and running.
Run:
bash unblacklist.sh <hostname_prefix>
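The per-host steps above could be sketched as a small wrapper script. This is only an illustration, not an existing tool: the function names (run, pause, load_ok, reboot_gateway), the DRY_RUN switch, the sleep intervals and the SSH timeout are all assumptions, and it assumes blacklist.sh and unblacklist.sh are in the current directory and that the login user is allowed to reboot the host. The average throughput still has to be read from Grafana by hand and passed in as the first argument. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the gateway reboot procedure. Helper names and DRY_RUN are
# hypothetical; only the commands, domain and threshold come from the steps above.
set -u

DOMAIN="gridpp.rl.ac.uk"     # domain suffix from the procedure
MAX_GBPS=22                  # do-not-proceed threshold from the procedure
DRY_RUN="${DRY_RUN:-1}"      # assumption: 1 = print commands, 0 = execute them

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Manual checkpoint: the operator confirms the condition on the dashboard.
pause() {
    local reply
    if [ "$DRY_RUN" = "1" ]; then
        echo "# $1"
    else
        read -r -p "$1, then press Enter: " reply
    fi
}

# Succeeds only if the average throughput in Gb/s ($1) is within the threshold.
load_ok() {
    awk -v t="$1" -v max="$MAX_GBPS" 'BEGIN { exit !(t <= max) }'
}

# Reboot one gateway; $1 is the hostname prefix, for example ceph-svc01.
reboot_gateway() {
    local prefix="$1"
    local host="${prefix}.${DOMAIN}"
    local unit

    run bash blacklist.sh "$prefix"
    pause "Wait until traffic on ${host} drops (usually 5-15 min)"
    run ssh "$host" reboot || true      # the SSH session normally drops here

    if [ "$DRY_RUN" != "1" ]; then
        sleep 120                       # assumption: time for the host to go down
        until ssh -o ConnectTimeout=10 "$host" true 2>/dev/null; do
            sleep 30                    # poll until SSH answers again
        done
    fi

    # Check each unit separately: with several units, is-active succeeds
    # as soon as any one of them is active.
    for unit in xrootd@unified xrootd@tpc cmsd@unified; do
        run ssh "$host" systemctl is-active --quiet "$unit" || return 1
    done

    run bash unblacklist.sh "$prefix"
}

# Hypothetical usage: <script> <avg_gbps> <hostname_prefix>...
if [ "$#" -ge 2 ]; then
    avg="$1"; shift
    load_ok "$avg" || { echo "Throughput above ${MAX_GBPS} Gb/s, not proceeding" >&2; exit 1; }
    for prefix in "$@"; do
        reboot_gateway "$prefix" || { echo "Service check failed on ${prefix}" >&2; exit 1; }
    done
fi
```

Saved under a name of your choice, it would be called with the throughput read from Grafana followed by one or more hostname prefixes, for example "21.5 ceph-svc01". Setting DRY_RUN=0 makes it execute the commands; if a service check fails, the host is deliberately left blacklisted.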
...