
Alex R noticed that files already cached on the xrootd-proxy are retrieved successfully, which indicates that the problem lies in the proxy fetching data from the gateway.

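It looks like the cache file descriptor (FD) cannot be relocated or freed? See the "Unable to reloc FD ... invalid argument" errors in the second log excerpt below.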

Nov 15 21:37:07 lcg2631.gridpp.rl.ac.uk docker[1542921]: CEPHSUM-2023-11-15 21:37:07,527-67018-DEBUG-Converted /lhcb:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz to lhcb, user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz
Nov 15 21:37:07 lcg2631.gridpp.rl.ac.uk docker[1542921]: CEPHSUM-2023-11-15 21:37:07,706-67018-DEBUG-Striper: Object size:67108864, Total size:5227299, Num Stripes:1, Last Stripe size:5227299
Nov 15 21:37:07 lcg2631.gridpp.rl.ac.uk docker[1542921]: CEPHSUM-2023-11-15 21:37:07,706-67018-DEBUG-adler32, 1700083974, 1, 0, b'\x00', b'\x04', 4, b'\xf9\xe4\xe9d'
Nov 15 21:37:07 lcg2631.gridpp.rl.ac.uk docker[1542921]: CEPHSUM-2023-11-15 21:37:07,706-67018-DEBUG-Time info: 1700083974, 2023-11-15 21:32:54, 1, 0:00:01, 2023-11-15 21:32:55
Nov 15 21:37:07 lcg2631.gridpp.rl.ac.uk docker[1542921]: CEPHSUM-2023-11-15 21:37:07,706-67018-INFO-XrdCks.adler32: f9e4e964 ; 2023-11-15 21:32:54; 0:00:01; 4
Nov 15 21:37:07 lcg2631.gridpp.rl.ac.uk docker[1542921]: CEPHSUM-2023-11-15 21:37:07,706-67018-INFO-Path:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz; From:metadata; Checksum:f9e4e964
Nov 15 21:37:07 lcg2631.gridpp.rl.ac.uk docker[1542921]: CEPHSUM-2023-11-15 21:37:07,728-67018-INFO-Result:Done, pool:lhcb, path:/lhcb:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz, checksum:f9e4e964, time_s:0.20102, filesize_bytes:5227299, source:metadata, exit_code:0, srccks:N/A
-----READ FINISHED---------
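
The DEBUG lines above carry the same information as the INFO lines, only in raw form. As a minimal sketch (the helper names are hypothetical, and the stripe arithmetic is an assumption about how the Striper line is derived, not taken from the CEPHSUM source), the values can be cross-checked like this:

```python
import math
from datetime import datetime, timezone


def checksum_hex(raw: bytes) -> str:
    """Render the raw 4-byte adler32 value as the hex string shown in the INFO lines."""
    return raw.hex()


def epoch_to_utc(ts: int) -> str:
    """Convert the epoch value from the 'Time info' line to a UTC timestamp string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def stripe_layout(object_size: int, total_size: int) -> tuple[int, int]:
    """Assumed layout: fixed-size objects, with the remainder in the last stripe."""
    num_stripes = max(1, math.ceil(total_size / object_size))
    last_stripe = total_size - (num_stripes - 1) * object_size
    return num_stripes, last_stripe


if __name__ == "__main__":
    # Values copied from the log excerpt above.
    print(checksum_hex(b"\xf9\xe4\xe9d"))
    print(epoch_to_utc(1700083974))
    print(stripe_layout(67108864, 5227299))
```

The decoded checksum, timestamp and stripe count agree with the INFO lines, so the metadata read on Nov 15 looks consistent; the timestamp matches only if the log time is read as UTC.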

Nov 16 09:59:34 lcg2631.gridpp.rl.ac.uk docker[1542921]: 231116 09:59:34 102659 acc_Audit: xrootd.102663:6647@localhost grant gsi 0495af69.0@localhost read /lhcb:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz
Nov 16 09:59:34 lcg2631.gridpp.rl.ac.uk docker[1542921]: 231116 09:59:34 102659 oss_Open_ufs: Unable to reloc FD /xcache/lhcb:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz.cinfo; invalid argument
Nov 16 09:59:34 lcg2631.gridpp.rl.ac.uk docker[1542921]: 231116 09:59:34 102659 oss_Open_ufs: Unable to reloc FD /xcache/lhcb:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz; invalid argument
Nov 16 09:59:34 lcg2631.gridpp.rl.ac.uk docker[1542921]: 231116 09:59:34 102659 oss_Open_ufs: Unable to reloc FD /xcache/lhcb:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz.cinfo; invalid argument
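
The "Unable to reloc FD ... invalid argument" messages suggest that relocating the freshly opened descriptor failed with EINVAL. The following is a minimal sketch of one way that can happen, assuming (this is not confirmed by the logs) that the relocation is an fcntl(F_DUPFD)-style duplication above some minimum descriptor number. On Linux, F_DUPFD fails with EINVAL when the requested minimum is negative or not below the process's open-file limit (RLIMIT_NOFILE). The helper name try_relocate and the fence values are hypothetical.

```python
import errno
import fcntl
import os
import resource


def try_relocate(fd: int, fence: int):
    """Duplicate fd onto the lowest free descriptor >= fence.

    Returns (new_fd, None) on success or (None, errno_name) on failure.
    """
    try:
        new_fd = fcntl.fcntl(fd, fcntl.F_DUPFD, fence)
    except OSError as exc:
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
    return new_fd, None


if __name__ == "__main__":
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    read_end, write_end = os.pipe()

    # A fence well below the limit: the duplication is expected to succeed.
    print("fence=10:", try_relocate(read_end, 10))

    # A fence equal to the soft limit: the kernel rejects the request.
    print(f"fence={soft_limit}:", try_relocate(read_end, soft_limit))

    os.close(read_end)
    os.close(write_end)
```

If this is what is happening here, comparing the open-file limit inside the container with the descriptor fence the server uses would be a reasonable next check.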

---after manually clearing file from cache and attempting to read----

Nov 16 10:12:13 lcg2631.gridpp.rl.ac.uk docker[1542921]: 231116 10:12:13 103058 XrootdAioTask: aioR overdue 8 inflight requests for xrootd.103062:6877@localhost /lhcb:user/lhcb/user/k/kmattiol/GangaJob_350/InputFiles/diracInputFiles_350_dba170b4-0658-444a-bbeb-adb973370624.tgz


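Tcpdump shows that, for new connections, the gateway side (172.28.1.2 in the capture below) closes the channel for some reason without any data having been exchanged: the TCP handshake completes at 10:31:09, every packet has length 0, and the gateway sends a FIN about 30 seconds later, at 10:31:39: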

10:31:09.812402 IP (tos 0x0, ttl 64, id 55828, offset 0, flags [DF], proto TCP (6), length 60)
    172.28.1.1.36580 > 172.28.1.2.nicelink: Flags [S], cksum 0x5a6a (incorrect -> 0x539f), seq 1788468839, win 64240, options [mss 1460,sackOK,TS val 2782928166 ecr 0,nop,wscale 8], length 0
10:31:09.812498 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    172.28.1.2.nicelink > 172.28.1.1.36580: Flags [S.], cksum 0x5a6a (incorrect -> 0x3470), seq 1456405645, ack 1788468840, win 65160, options [mss 1460,sackOK,TS val 4249275106 ecr 2782928166,nop,wscale 8], length 0
10:31:09.812515 IP (tos 0x0, ttl 64, id 55829, offset 0, flags [DF], proto TCP (6), length 52)
    172.28.1.1.36580 > 172.28.1.2.nicelink: Flags [.], cksum 0x5a62 (incorrect -> 0x60cb), seq 1, ack 1, win 251, options [nop,nop,TS val 2782928166 ecr 4249275106], length 0
10:31:39.842286 IP (tos 0x0, ttl 64, id 51510, offset 0, flags [DF], proto TCP (6), length 52)
    172.28.1.2.nicelink > 172.28.1.1.36580: Flags [F.], cksum 0x5a62 (incorrect -> 0xeb77), seq 1, ack 1, win 255, options [nop,nop,TS val 4249305136 ecr 2782928166], length 0
10:31:39.843151 IP (tos 0x0, ttl 64, id 55830, offset 0, flags [DF], proto TCP (6), length 52)
    172.28.1.1.36580 > 172.28.1.2.nicelink: Flags [.], cksum 0x5a62 (incorrect -> 0x762c), seq 1, ack 2, win 251, options [nop,nop,TS val 2782958197 ecr 4249305136], length 0
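
To make the timing in the capture explicit, here is a minimal sketch that finds which side sends the first FIN and how long the connection had been idle before it. The timestamps, senders and flags are copied from the capture above; the helper names are hypothetical.

```python
from datetime import datetime

# (timestamp, sender, TCP flags) for each packet in the capture above.
PACKETS = [
    ("10:31:09.812402", "172.28.1.1.36580", "S"),
    ("10:31:09.812498", "172.28.1.2.nicelink", "S."),
    ("10:31:09.812515", "172.28.1.1.36580", "."),
    ("10:31:39.842286", "172.28.1.2.nicelink", "F."),
    ("10:31:39.843151", "172.28.1.1.36580", "."),
]


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two tcpdump timestamps taken on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


def idle_before_fin(packets):
    """Return (sender of the first FIN, seconds since the previous packet), or None."""
    for previous, current in zip(packets, packets[1:]):
        if "F" in current[2]:
            return current[1], seconds_between(previous[0], current[0])
    return None


if __name__ == "__main__":
    sender, idle = idle_before_fin(PACKETS)
    print(f"first FIN from {sender} after {idle:.3f} s of silence")
```

The gap of roughly 30 seconds agrees with the TCP timestamp option on the gateway's packets (TS val 4249275106 on the SYN-ACK against 4249305136 on the FIN, a difference of 30030 ticks, which would be milliseconds if the clock ticks at 1 kHz). A round figure like this hints at a timeout on the gateway rather than an immediate rejection, though the capture alone does not show which timeout.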