XRootD Cluster Shuffle
Introduction
This plan aims to move the XRootD managers from the legacy network into the new tier-1 network. Supporting both networks blocks other changes and improvements, including use of the full capacity of the LHCOPN link and the advertisement changes needed to deploy the new gateways (https://stfc.atlassian.net/browse/CEPH-303).
Step 1: Add a separate cluster that manages the same service in Aquilon but is not exposed via the alias
This adds managers for the gateways but otherwise has no impact, because the IPs exposed to the internet still belong to the previous managers.
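One way to confirm that Step 1 has had no external impact is to check that the public alias still resolves only to the previous managers. The following is a minimal sketch, not the team's actual tooling: the function names are hypothetical, the IP sets must be supplied by the operator, and port 1094 (the XRootD default) is assumed to be the port in use here.

```python
import socket


def resolve_ips(hostname, port=1094):
    """Return the set of IP addresses the hostname currently resolves to.

    The port only shapes the lookup; 1094 is an assumption for this service.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def alias_state(resolved, old_ips, new_ips):
    """Classify which set of managers an alias currently points at.

    Returns "old", "new", "mixed", "unknown" (an address in neither set)
    or "empty" (nothing resolved).
    """
    resolved, old_ips, new_ips = set(resolved), set(old_ips), set(new_ips)
    if not resolved:
        return "empty"
    if resolved - old_ips - new_ips:
        return "unknown"
    if resolved <= old_ips:
        return "old"
    if resolved <= new_ips:
        return "new"
    return "mixed"
```

After Step 1, `alias_state(resolve_ips(alias), old_ips, new_ips)` would be expected to return "old"; after Step 2 it would be expected to return "new". Any other result suggests the alias is in an unintended state.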
Step 2: Switch the DNS entries to the keepalived IPs of the new managers
After ensuring fallback works as intended, switch the IP addresses behind the aliases to the new managers' keepalived IPs.
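Before switching the DNS entries, it may help to confirm that each new keepalived IP accepts TCP connections on the XRootD port. The sketch below only tests reachability; it does not verify keepalived failover or XRootD behaviour, which still need to be checked separately. The function names are hypothetical, and port 1094 (the XRootD default) is an assumption about this deployment.

```python
import socket


def port_open(host, port=1094, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def unreachable(hosts, port=1094, timeout=3.0):
    """Return the hosts that did not accept a connection on the given port."""
    return [h for h in hosts if not port_open(h, port, timeout)]
```

Calling `unreachable(keepalived_ips)` with the list of new keepalived IPs should return an empty list before the aliases are switched; any address it returns is worth investigating first.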
Future Work
The old managers can be kept as assisting managers with no external exposure, or shut off until the new virtualization service is up and running.
Other Options Considered
Creating a fresh cluster and moving each gateway to the new one
Very disruptive, and results in a long period of reduced capacity.
Adding the new managers to the existing cluster
Additional configuration is needed, and there is no way of knowing whether it works until the new managers are public facing.
Individual hosts in a cluster cannot be added to separate sandboxes, so the keepalived IPs would have to be shared, which complicates this further.
Setting it up without Aquilon
Aquilon will remain the status quo for at least a few months, and security patches and updates will still be necessary.