...
This means that, if we were to transfer the data to tape without using StorageD, the IDS wouldn't know about the newly uploaded files and wouldn't be able to download them.
High-level requirements
A script, deployed on a server somewhere, will need to:
- check ICAT for files added between a set of dates (start date -> end date)
- store the locations of these files
- download the files via S3 and create a checksum for each
- create a drop file and publish it to StorageD
- once all files have been pushed to StorageD, re-download them and verify the checksums
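The steps above can be sketched in Python. The ICAT, S3, and StorageD client objects here are hypothetical stand-ins (their real APIs are not specified in this document); only the chunked checksumming is concrete.

```python
import hashlib


def sha256_of(path):
    """Checksum a file in fixed-size chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def transfer(start_date, end_date, icat, s3, storaged):
    # 1. Query ICAT for files added in the date range (hypothetical client call).
    files = icat.files_added_between(start_date, end_date)
    checksums = {}
    for f in files:
        # 2-3. Record the location, download the file via S3, checksum it.
        local_path = s3.download(f.location)
        checksums[f.location] = sha256_of(local_path)
        # 4. Create a drop file and publish it to StorageD (hypothetical call).
        storaged.publish(local_path)
    # 5. Once everything is pushed, re-download and verify each checksum.
    for location, expected in checksums.items():
        if sha256_of(storaged.redownload(location)) != expected:
            raise RuntimeError(f"checksum mismatch for {location}")
```

This is only a shape for the script, not an implementation: the real ICAT query, S3 download, and StorageD publish calls would replace the placeholder objects.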
A separate piece of software, in addition to the upload API, is needed to make the transfer.
The following two designs have been considered.
Architectural Design A
Here, the data will need to be downloaded to local storage so that the StorageD client can pick it up. This has a high network overhead but has the advantage of being relatively easy to develop.
Architectural Design B
This design explores the possibility of the StorageD server downloading the data directly. This would save a download step. However, it would mean a change to the StorageD code, which may be problematic given the state of that code.
...
Solution
After a chat with Alan Kyffin, we decided to go with Design A because it is simpler to develop and maintain. Modifying the StorageD code (Design B) was ruled out because the code:
- isn't in a git repo
- isn't in great shape
- is being looked after by another team
- is currently being updated from Python 2 to 3
The code was put in a GitLab repo for this review.