...

http://aquilon.gridpp.rl.ac.uk/sandboxes/diff.php?sandbox=jw-gateway-xrootd-cmsd

Fabric requirements

named:
echo-internal-manager01.gridpp.rl.ac.uk
echo-internal-manager02.gridpp.rl.ac.uk

with associated x509 certificates with the following SANs:
*.echo.stfc.ac.uk
xrootd.echo.stfc.ac.uk
webdav.echo.stfc.ac.uk
internal.echo.stfc.ac.uk

with external firewall holes for port 1094 (xrootd traffic)

they should be able to contact the echo gateways on ports 1094, 1095 and 1213
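Once the firewall holes are in place, the gateway connectivity requirement above can be checked from a manager host. The following is a minimal sketch, assuming `nc` (netcat) with `-z` and `-w` support is installed; the function names and the gateway hostname in the example are hypothetical and not taken from this page.

```shell
#!/bin/sh
# Probe a single TCP port. Assumes an nc build that supports -z (scan only)
# and -w (timeout in seconds). Set PROBE to use a different probe command.
probe_tcp() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# check_ports HOST PORT...
# Prints one line per port and returns non-zero if any port is unreachable.
check_ports() {
    host=$1
    shift
    rc=0
    for port in "$@"; do
        if "${PROBE:-probe_tcp}" "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port unreachable"
            rc=1
        fi
    done
    return $rc
}

# Example (hypothetical gateway name):
# check_ports some-gateway.example.org 1094 1095 1213
```

The probe command is overridable so that the same loop can be reused with another tool if `nc` is not available on the host.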

with the following specs
4 CPUs
8GB RAM
60GB disk

with IP addresses changed so that they are in the OPN subnet
Ideally they should be in the lower part of 130.246.176.0/24 https://netbox.esc.rl.ac.uk/ipam/prefixes/323/ip-addresses/  (James A's words.) (v4 and v6)

with AAAA DNS records added once set

Operational items

Known issues / limitations

...

Add the given host on a single line (wildcards are, in principle, also OK).
This file is re-read every minute, so no service restart is required.

If a host in the blacklist does not exist, the blacklist will fail to parse and will be ignored after a service restart.

Ensure the file is owned by xrootd:xrootd.
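The blacklist update can be sketched as a small helper that appends a host only if it is not already listed. This is an illustrative sketch: the function name is hypothetical, and the path to the blacklist file is not given in this section, so it is passed in as an argument.

```shell
#!/bin/sh
# add_blacklist_entry FILE HOST
# Appends HOST on its own line unless an identical line already exists.
add_blacklist_entry() {
    file=$1
    entry=$2
    if [ -f "$file" ] && grep -Fxq -- "$entry" "$file"; then
        return 0    # already listed, nothing to do
    fi
    printf '%s\n' "$entry" >> "$file"
}

# Example (BLACKLIST is a placeholder for the real file path):
# getent hosts "$host" >/dev/null || echo "warning: $host does not resolve" >&2
# add_blacklist_entry "$BLACKLIST" "$host"
# chown xrootd:xrootd "$BLACKLIST"
```

The commented `getent` line is one possible guard against listing a non-existent host, given the parsing failure described above; it would not apply to wildcard entries.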

Adding a new Server (Gateway host) to the cluster

When a new Gateway needs to be added to a cluster, the following steps (in addition to the usual set of checks for ensuring a fully functional gateway) are required.

  • Ensure the host has the correct personality (i.e. ceph-unified-gw-echo)

  • In Aquilon, the manager hosts must be recompiled so that they find the new host and learn that it is available.

    • As we have a pair of managers, it is preferable to remove one manager (using keepalived), compile it, check that it restarts services correctly, and add it back (using keepalived).
      (This may require some quattor commands on the host to force the compilation to be deployed immediately.)

    • Then, repeat this step for the second manager.

  • Finally, check the cms.blacklist file on each manager to ensure that the new Server (aka Gateway) is not explicitly excluded from the cluster.
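The final blacklist check in the steps above could be scripted along these lines. This is a sketch under assumptions: it treats the file as one host pattern per line with `*` as the only wildcard, and it uses shell glob matching, which may not agree with XRootD's own matching in every case. The function name is hypothetical and the file path is passed as an argument.

```shell
#!/bin/sh
# is_blacklisted FILE HOST
# Returns 0 if any non-empty line of FILE, read as a glob pattern, matches HOST.
is_blacklisted() {
    file=$1
    host=$2
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        [ -n "$pattern" ] || continue      # skip blank lines
        case "$host" in
            $pattern) return 0 ;;          # unquoted, so '*' acts as a wildcard
        esac
    done < "$file"
    return 1
}

# Example (BLACKLIST and the hostname are placeholders):
# if is_blacklisted "$BLACKLIST" new-gateway.example.org; then
#     echo "new gateway is excluded by the blacklist" >&2
# fi
```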

Development items

Services

...

aq add_cluster --cluster xrootd_manager_echo --archetype ral-tier1-clusters --personality keepalived --down_hosts_threshold 1 --campus harwell --sandbox orl67423/jw-gateway-xrootd-cmsd

aq cluster --cluster xrootd_manager_echo --hostname echo-manager01.gridpp.rl.ac.uk --personality ceph-xrootd-manager-echo-test
aq cluster --cluster xrootd_manager_echo --hostname echo-manager02.gridpp.rl.ac.uk --personality ceph-xrootd-manager-echo-test

aq compile --cluster xrootd_manager_echo
aq make  --hostname echo-manager02.gridpp.rl.ac.uk && aq make  --hostname echo-manager01.gridpp.rl.ac.uk

New cluster

Fabric


Could you please create 2 new rocky8 VMware hosts, which should fill similar roles to echo-manager01.gridpp.rl.ac.uk,

named:
echo-alice-manager01.gridpp.rl.ac.uk
echo-alice-manager02.gridpp.rl.ac.uk

with associated x509 certificates with the following SANs:
echo.stfc.ac.uk
alice.echo.stfc.ac.uk
*.echo.stfc.ac.uk 
*.s3.echo.stfc.ac.uk

with external firewall holes for port 1094 (xrootd traffic)

they should be able to contact the echo gateways on ports 1094, 1095 and 1213

with the following specs
4 CPUs
8GB RAM
60GB disk

with IP addresses changed so that they are in the OPN subnet
Ideally they should be in the lower part of 130.246.176.0/24 https://netbox.esc.rl.ac.uk/ipam/prefixes/323/ip-addresses/  (James A's words.) (v4 and v6)

with AAAA DNS records added once set,

along with a pair of floating IPs (like 130.246.176.2 and 130.246.176.3 and the associated v6 2001:630:58:1820::82f6:b002 and 2001:630:58:1820::82f6:b003) to be assigned to keepalived for load balancing

Aquilon

aq add service --service xrootd-clustered --instance xrootd-clustered-echo-internal
aq bind_server --service xrootd-clustered --instance xrootd-clustered-echo-internal --hostname echo-internal-manager01.gridpp.rl.ac.uk

aq map_service --service xrootd-clustered --instance xrootd-clustered-echo-internal --archetype ral-tier1 --personality ceph-gw-echo-internal --campus Harwell --justification tcm=000

copy /shared/service/xrootd-clustered/xrootd-clustered-echo into /shared/service/xrootd-clustered/xrootd-clustered-echo-internal and replace naming in configs appropriately
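The copy-and-rename step can be sketched as below. This is a minimal illustration, assuming GNU `sed` (for `-i`) and assuming that a plain textual substitution of the old instance name is what "replace naming" amounts to; the function name is hypothetical, and the result should be reviewed by hand (for example with a diff) before committing.

```shell
#!/bin/sh
# clone_instance SRC_DIR DST_DIR OLD_NAME NEW_NAME
# Copies SRC_DIR to DST_DIR, then rewrites OLD_NAME to NEW_NAME in the copy only.
clone_instance() {
    src=$1
    dst=$2
    old=$3
    new=$4
    if [ -e "$dst" ]; then
        echo "refusing to overwrite existing $dst" >&2
        return 1
    fi
    cp -r "$src" "$dst" || return 1
    # Only touch files that actually contain the old name.
    # OLD_NAME is used as a sed regex, so it must not contain '/' or regex metacharacters.
    grep -rlF -- "$old" "$dst" | while IFS= read -r f; do
        sed -i "s/$old/$new/g" "$f"
    done
}

# Example, run from the root of the template tree (relative paths are an assumption):
# clone_instance shared/service/xrootd-clustered/xrootd-clustered-echo \
#     shared/service/xrootd-clustered/xrootd-clustered-echo-internal \
#     xrootd-clustered-echo xrootd-clustered-echo-internal
```

Refusing to overwrite an existing destination avoids applying the substitution twice, which would otherwise produce names ending in `-internal-internal`.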

copy ral-tier1/features/keepalived/echo-managers to ral-tier1/features/keepalived/echo-managers-internal
in ral-tier1/features/keepalived/echo-managers-internal/config.pan, replace the IP addresses with the new floating IPs and replace vrid[N] with a different number (one not used in any other keepalived config)