XRootD: Code review for ReadV implementation in XrdCeph (March 2023)

 Overview

Status

NOT STARTED

Submitter

@Alexander Rogovskiy

Reviewers

@James Walder @Alexander Rogovskiy @Thomas, Jyothish (STFC,RAL,SC) @Ian Johnson @Emmanuel Bejide @Thomas Byrne @Alison Packer @Alastair Dewhurst Steven Simpson (Lancaster), sam

Goals

Review the code for the ReadV changes and identify any further tests / updates that are required. Discuss the packaging, deployment and post-deployment testing; and included any discussion on risk factors

Change Control

https://stfc.atlassian.net/wiki/spaces/GRIDPP/pages/137265343/Non-striper+read+v+implementation+for+WN+s+xrootd+gateways

Jira tickets

 

 

On this page

 Requirements

Summary statement

Improve xrootd’s read and readv implementation for WN’s gateways to allow them effectively handle vector read operations

Problem statements

  • Vector read (aka readv) operations on ECHO storage are not very efficient. Oftentimes a single readv requests with cumulative size of several megabytes takes up to 1 minute or more. This causes a lot of problems, especially for LHCb VO.

Success metrics

  • Allow for readV access with failures at a level no worse than at other sites. Do not worsen Read performance.

Meeting Dates:

Summary of meeting times and main participants:

  • Meeting 1 - Tues 21 March 2023; Present:

 Review Changes:

 Open issues

Questions that arise during the review should be largely addressed in the appropriate github PR review process. Summaries of main issues arising from the review should be noted below

  •  

  • Current implementation does not support a sparse object where one or more “Chunks” is full of zeros and hence not written by the (pseudo)-striper:

    • It is agreed that this implementation should support the existing operations.

    • For this to happen, the psuedo-striper will need the total file size

      • \decision\ it is agreed that the file size will be determined, once on open i.e. at the same time that the block size is determined. The likely place to do this is in the CephFileRef

      • A new PR should be created: (by the Dev team) to add and sort this new variable

    • It is also agreed to create a specification of the accepted read and readV behaviour in different uses cases: e.g. EOF/ sparse reads, across boundaries of sparse reads, etc. and to be demonstrated to be working via the result of the functional tests

    • Checksumming code should be appropriately updated too

  • 20 April 2023

    • Implementation to fall back to striper read in case of sparse file implemented.

    • If file is corrupted can lead to corrupt data (0’s) being sent, as the striper would do

    • Decision made to keep this behaviour, but to log that this has occurred.

  • 27 April 2023

    • Identified a case if XRootD v3 client running. This was unable to use the core XRootD server side fix to request an increased timeout

    • Will not apply this part of the fix, and rather use the ‘client’ side environmental variable to request the timeout value. This should be able to be applied at the docker container level of each running job (and also propagated into Singularity).

 

 Test results

A summary of the test suites that have been run, and any issues that have arisen. New tests that are requested can be documented here. Can either be presented as a link to a test suite page or direct summary of each of the tests.

Scenario / test name

Results

What we know

What we need to learn

Scenario / test name

Results

What we know

What we need to learn

Functional tests

Pass

How to run the tests:
1. Set up TEST_FILE_URL environment variable
2. Set up FILE_BLOCK_SIZE environment variable
3. Run py.test

 

 

 

 

 

 

 Deployment plan

Attempt to capture the steps necessary to deploy the changes into production. Noting any particular cases for side-effects to be observed and where dependencies in / from other packages / services might be relevant.

 

Date

Plan type

Target segment

Notes

Action items

Date

Plan type

Target segment

Notes

Action items

1

TEST / FEATURE FLAG / OTHER

 

2

 

 

 

 

 

Definitions:

 

  • ceph object / chunks (e.g. the 64MiB object)

 

Notes:

rop->read(0, 10 &bl1, &retval1); retval is alwalys 0. ?

Error checking for catching issues with not recieving correct bytes.

 

 

bulkAioRead: high level analogue of striper (i.e. to deal with reads at the file level).

(chunks may not always be 64MiB, e.g. 4MiB for dumps) → is tested?

lifetime is for a given readV operation.

→ Add additional comments for each method in the the .hh file

 

:Change of coordinates:

0 length read request ? for last object