...

Title

 Temporary alias for xrootd internal traffic

Submitted by

 Jyothish Thomas

Requested by

 Alastair Dewhurst

Summary

Problem

The new gateways do not yet have external IPv6 access, because pending network interventions have delayed their deployment for multiple weeks. Meanwhile, the current external gateways are under heavy load from existing traffic, which is causing functional test and job failures.
The new gateways are:
ceph-svc11.gridpp.rl.ac.uk
ceph-svc13.gridpp.rl.ac.uk
ceph-svc14.gridpp.rl.ac.uk
ceph-svc15.gridpp.rl.ac.uk
ceph-svc17.gridpp.rl.ac.uk
ceph-svc18.gridpp.rl.ac.uk

ceph-svc01.gridpp.rl.ac.uk
ceph-svc02.gridpp.rl.ac.uk

Proposed solution

Add an additional DNS round-robin alias (internal.echo.stfc.ac.uk) that maps to the gateways pending deployment.
Add a routing rule on the batch farm that redirects xrootd.echo.stfc.ac.uk and webdav.echo.stfc.ac.uk traffic to the new alias instead (similar to the current redirection to the workernode gateway container).

This can be done by assigning each job container to a random gateway from the list above.

Direct transfers would take place without issues. If the jobs perform third-party copy (TPC), the traffic should go over IPv4, as the job containers are IPv4-only.
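
As a rough illustration of the random assignment described above, the following Python sketch picks one of the new gateways for a job container and builds the arguments of an iptables DNAT rule for it. This is not the deployed configuration: the function names, the chain (PREROUTING), the port (1094, the XRootD default) and the example addresses (from the 192.0.2.0/24 documentation range) are all assumptions made for illustration.

```python
import random

# Gateways pending deployment, as listed in the Problem section.
NEW_GATEWAYS = [
    "ceph-svc11.gridpp.rl.ac.uk",
    "ceph-svc13.gridpp.rl.ac.uk",
    "ceph-svc14.gridpp.rl.ac.uk",
    "ceph-svc15.gridpp.rl.ac.uk",
    "ceph-svc17.gridpp.rl.ac.uk",
    "ceph-svc18.gridpp.rl.ac.uk",
]


def pick_gateway(gateways, rng=None):
    """Return one gateway chosen uniformly at random for a new job container."""
    if rng is None:
        rng = random.Random()
    return rng.choice(gateways)


def dnat_rule(original_ip, gateway_ip, port=1094, chain="PREROUTING"):
    """Build iptables arguments that rewrite TCP traffic for original_ip:port
    so that it is sent to gateway_ip:port instead.

    The chain and port are assumptions; the correct chain depends on where
    the rule is installed (host or container network namespace).
    """
    return [
        "iptables", "-t", "nat", "-A", chain,
        "-p", "tcp", "-d", original_ip, "--dport", str(port),
        "-j", "DNAT", "--to-destination", f"{gateway_ip}:{port}",
    ]


if __name__ == "__main__":
    gateway = pick_gateway(NEW_GATEWAYS)
    print("container assigned to", gateway)
    # Placeholder addresses only; real addresses would come from DNS.
    print(" ".join(dnat_rule("192.0.2.10", "192.0.2.21")))
```

Building the rule as an argument list rather than a shell string keeps it safe to pass to a process runner and easy to invert for reversion (replacing -A with -D removes the same rule).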

Urgency

 Urgent

Impact of successfully implementing the change

 Workernode load currently affecting the external xrootd gateways will be diverted to a set of currently unused gateways, thereby reducing load-related issues in production.

Consultation

 

Type of Change

 

Link to Change Control master ticket (RT or JSM)

 

 XRD-74 (JIRA)

...

Details of testing carried out

 After creating the alias and mapping the new gateways to it, functional tests will be run against the alias. VOs can also point their own functional tests at the alias.
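
 Before running functional tests, a quick check that the alias resolves to the intended gateways might look like the Python sketch below. It is an illustration only: the helper names are hypothetical, and it compares IPv4 addresses only, on the basis that the job containers are IPv4-only.

```python
import socket


def resolve_ipv4(name):
    """Return the set of IPv4 addresses that name currently resolves to."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return {info[4][0] for info in infos}


def alias_mismatch(alias_addrs, gateway_addrs):
    """Compare the alias's addresses with the expected gateway addresses.

    Returns a pair (missing, unexpected): gateway addresses absent from the
    alias, and alias addresses that belong to no expected gateway.
    """
    alias_addrs = set(alias_addrs)
    gateway_addrs = set(gateway_addrs)
    return gateway_addrs - alias_addrs, alias_addrs - gateway_addrs
```

 Calling resolve_ipv4 on internal.echo.stfc.ac.uk and on each new gateway hostname gives the two sets to pass to alias_mismatch; both returned sets should be empty if the alias is set up as intended.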

Further tests required prior to implementation

  restart

Deployed/tested at other WLCG/EGEE site?

 

Can be phased in stages?

 

Implementation plan

 

Post implementation testing

 

Reversion plan in case of problems

 Delete the iptables rule that redirects batch farm traffic to the alias.

Has this been successfully reviewed with the production team against the new service ticklist?

(This should be done for significant changes to services too).

 

...