Group

Data Services

Project name

Data Management for the Vera C. Rubin Observatory.

Project LM

Timothy Noble

Project code/Task

We will have new funding of 1.5 FTE from Rubin, beginning in April 2023. The budget codes for this are still to be created. Initially the graduate can book to STAK00024 02.03.

Other resources

Rubin developers and project scientists from the UK and worldwide.

Rucio developers at CERN and Fermilab.

Those working on the GridPP Tier-1.

Project Summary

The Vera C. Rubin Observatory (formerly the Large Synoptic Survey Telescope (LSST)) is due to start taking data in July 2024. This data will need to be exported from Chile to a data processing facility at SLAC and then distributed to France and the UK as part of the Data Reprocessing Pipeline (DRP), which will process the data for further analysis by astronomers.

The observatory is currently deploying its data management infrastructure. In the UK we have already deployed 9 PB of disk storage, are preparing additional tape storage at RAL, and aim to process 25% of the data produced by the observatory. It is important that data-flow and analysis job workflows are established, tested, and verified before data taking begins.

This project is to work with the UK, US and French teams to coordinate, test and verify the data distribution workflows required for a successful DRP. The data management system to be used in this project is Rucio, an open-source system written in Python that is used by the ATLAS and CMS experiments at CERN.

You will learn about and operate Rucio to support the project's data management, which includes integrating storage endpoints, troubleshooting problems that arise when moving data, and contributing to the overall operation of Rucio and the curation of Rubin data.

You will also liaise with the teams around the UK who support this project, discussing the infrastructure they provide to ensure that it integrates effectively with Rucio.

Project Outputs

Project report / presentation / paper detailing the testing done and confirming that we will be able to meet the requirements for the Rubin DRP (or detailing what further work is required). 

Scripts / code for managing Rubin workflows.

Documentation on how the agreed workflows should be run.

Skills and expertise the graduate will gain

· Knowledge of the use and management of Rucio, as well as the software that integrates with Rucio (e.g. FTS).

· Operational experience in managing time-critical experiment workflows.

· Knowledge of Rubin workflows and data management processes.

· Experience working as part of a large international collaboration.

Exit plan when Graduate moves to different project

There is funding for the Rubin data manager to be a permanent role. If no graduate wishes to take the role permanently, we intend to recruit, and would hope to have a new person in place before this project finishes so that a handover can take place.

Documentation, plans and details of any unfinished development are to be passed on to Tim Noble (LM) and other Rubin staff to aid their work.

...