green - can be done by users

orange - anyone in the group (e.g. ceph, batch farm, OC) can do it

red - needs specialized knowledge to resolve

  1. transfers to X are failing.

    1. Are they failing for everyone else too? i.e., do we also fail against other endpoints, or are all transfers to that endpoint failing?

    2. what’s the error?

    3. can you ping it/trace it from the gateways?

    4. how is the load looking over the gateways?

    5. how is the load looking on the managers?

    6. Do the server logs report any errors?

  2. high job failure rate.

    1. is it limited to specific VO?

    2. is it affecting specific gens/WNs?

    3. what’s the uptime on the gateway container? was it killed at some point? anything in the syslogs or server logs?
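
The scoping questions above (is it one endpoint, one VO, one set of WNs?) all come down to grouping failures by an attribute and comparing rates. A minimal sketch of that idea follows. The record layout and the field names (`vo`, `wn`, `ok`) are assumptions made up for illustration, not the format of any real log or monitoring export, so adapt them to whatever the job or transfer records actually look like.

```python
from collections import defaultdict


def failure_rate_by(records, key):
    """Group records by `key` and return {group: (failed, total, rate)}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [failed, total]
    for rec in records:
        group = rec[key]
        counts[group][1] += 1
        if not rec["ok"]:
            counts[group][0] += 1
    return {g: (f, t, f / t) for g, (f, t) in counts.items()}


# Hypothetical job records; the field names are assumptions for illustration.
jobs = [
    {"vo": "vo_a", "wn": "wn01", "ok": False},
    {"vo": "vo_a", "wn": "wn02", "ok": False},
    {"vo": "vo_b", "wn": "wn01", "ok": True},
    {"vo": "vo_b", "wn": "wn02", "ok": True},
]

by_vo = failure_rate_by(jobs, "vo")  # is it limited to a specific VO?
by_wn = failure_rate_by(jobs, "wn")  # is it affecting specific WNs?

for name, table in (("VO", by_vo), ("WN", by_wn)):
    for group, (failed, total, rate) in sorted(table.items()):
        print(f"{name} {group}: {failed}/{total} failed ({rate:.0%})")
```

In this made-up sample every failure belongs to one VO while both WNs fail at the same rate, which would point at something VO-specific rather than at the worker nodes. The same helper could be pointed at transfer records keyed by endpoint to answer the first question under "transfers to X are failing".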