User Stories

We note that Diamond users are already publishing data elsewhere.

Searching Zenodo for “Diamond Light Source” returns multiple datasets (281 results at the time of writing): https://zenodo.org/search?page=1&size=20&q=%22diamond%20light%20source%22&type=dataset

These user stories will help derive any potential requirements and Test Cases.

 

  1. User-defined DOIs

    1. mint whole Visits

    2. mint at a later date

  2. Data exclusion & DataPublication deletion

  3. Automatic minting of DOIs

  4. An external review process


1. Instantly create User-defined Data Publications

Actor

A Principal Investigator (PI)

Preconditions

  • They are able to log into DataGateway

  • They are able to see data from the experiments in which they are the PI.

Use Case Description

An existing ICAT user who is a Principal Investigator for a set of Visits logs into DataGateway to create a data publication for the data for which they are PI.

They should be able to create DPs for datasets and datafiles.

The system should:

  • Auto-populate the creators field from the list of users associated with the overall Visit

    • The PI should be able to add or remove creators from this list, but should not be able to remove themselves.

  • When they click “mint” the system should publish the Data Publication, making the data publicly accessible and identified with a DOI. This will generate a landing page with metadata about the data and where to access it.

  • The UI should notify the user that they are responsible for the data they share and link to the DataPolicy. The user must confirm this: they need to accept the data policy and take responsibility for the data they are making open (no sensitive data).

  • Users should be able to add new creators and remove existing ones (other than themselves, as noted above)
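
The creators rule above can be sketched as a small function. This is only an illustration under stated assumptions: the function name `update_creators`, its parameters and the example user names are hypothetical and are not part of DataGateway or ICAT.

```python
def update_creators(creators, pi, add=(), remove=()):
    """Return an updated creators list for a draft Data Publication.

    `creators` is assumed to be pre-populated from the users associated
    with the Visit. The PI may add or remove other creators, but may not
    remove themselves.
    """
    if pi in remove:
        raise ValueError("the PI cannot remove themselves from the creators list")
    updated = [name for name in creators if name not in remove]
    for name in add:
        if name not in updated:  # avoid duplicate entries
            updated.append(name)
    return updated


# Hypothetical example data
visit_users = ["pi_user", "postdoc", "student"]
creators = update_creators(
    visit_users, pi="pi_user", add=["collaborator"], remove=["student"]
)
print(creators)  # ['pi_user', 'postdoc', 'collaborator']
```

Raising an error rather than silently ignoring the request keeps the rule visible to the front end, which can then explain to the PI why the change was rejected.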

Technical Implications / Details

  • Users will be responsible for providing the required metadata, so they will need to provide the DataPublication's creators, title and description. The full list of metadata used in the DOI is here.

  • It is assumed that the DataGateway front end will assist the user with these mandatory inputs.

  • There is no upper limit on the number of datasets and datafiles a user can add to a user-defined DP

  • Only data from investigations for which the user is the PI can be minted

  • There is to be NO review process for these DPs/DOIs

  • This functionality is to be released first
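
As a rough sketch of the checks implied above, the front end (or the service behind it) could refuse to mint until the mandatory metadata is present and the data policy has been accepted. The field names, the dictionary shape and the function name `mint_blockers` are assumptions made for illustration only.

```python
# Mandatory fields named in the text above; the key names are assumptions.
REQUIRED_FIELDS = ("creators", "title", "description")


def mint_blockers(publication, policy_accepted):
    """List the reasons a user-defined Data Publication cannot be minted yet."""
    problems = [
        f"missing {field}" for field in REQUIRED_FIELDS if not publication.get(field)
    ]
    if not policy_accepted:
        problems.append("data policy not accepted")
    return problems


# Hypothetical draft with an empty description and no policy acceptance
draft = {"creators": ["pi_user"], "title": "Example title", "description": ""}
print(mint_blockers(draft, policy_accepted=False))
# ['missing description', 'data policy not accepted']
```

An empty list means nothing blocks minting. Since there is no review process for these DPs, a check of this kind would be the only gate before the data becomes public.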

Questions

 

Expected Output

  • A data publication and associated DOI created from a user-defined set of metadata. The underlying data is now open and retrievable by anyone in the world

  • A DOI in a findable/searchable state with an associated landing page.


1.a: Create User-defined Data Publications for Whole Visits

Actor

A Principal Investigator (PI)

Preconditions

  • They are able to log into DataGateway

  • They are able to see data from the experiments in which they participate, either as PIs or other roles.

Use Case Description

An existing ICAT user who is a Principal Investigator for a set of Visits logs into DataGateway to create a data publication for the data for which they are PI.

The system should:

  • Auto-populate the creators field from the list of users associated with the Visit

    • The PI should be able to add or remove creators from this list, but should not be able to remove themselves.

  • When they click “mint” the system should publish the Data Publication, making the data publicly accessible and identified with a DOI. This will generate a landing page with metadata about the data and where to access it.

  • The UI should notify the user that they are responsible for the data they share and link to the DataPolicy.

  • When a whole visit is selected, the title, abstract, and creators should be taken from the visit and can’t be changed.

Technical Implications

This isn’t possible right now (18/5/23) because of the size limit imposed on downloads by SCD. As a result, investigations are not selectable in the DataGateway front end.

Questions

 

Expected Output

  • A data publication and associated DOI created from a user-defined set of metadata. The underlying data is now open and retrievable by anyone in the world

  • A DOI in a findable/searchable state with an associated landing page.


1.b: Create a Data Publication at a later date

Actor

A Principal Investigator (PI)

Preconditions

  • They are able to log into DataGateway

  • They are able to see data from the experiments in which they participate, either as PIs or other roles.

Use Case Description

The description encompasses the two use cases above, but gives the user the option to create a DP/DOI at a future date.

Technical Implications

 

Questions

 

Expected Output

 

 

2. Data exclusion & DataPublication deletion

Actor

An existing ICAT user with a valid account

Preconditions

  • They are able to log into DataGateway

  • They are able to see their data in a DP.

Use Case Description

  1. Automatic minting: A PI wants to remove specific datasets from a data publication once a DOI has been created. Diamond are worried that confidential data may make it into DPs.

  2. User-defined: If users want to delete DPs for any reason, this will be done through an email request.

Technical Implications

  1. Automatic minting: Once the automatic process has run, a DP and DOI will be created, but the data won’t be open for 3 years. In the meantime, some confidential datasets may be added. A new Rule will need to be created to check a flag on the dataset, so that confidential or sensitive data is never released. In this case the DP and DOI will reference only the open data. The automatic process should not create DOIs and DPs for confidential Visits in the first place, so a flag will also need to be set against the Visit. The script will also take a configuration file listing the requirements for data exclusion.

  2. User-defined: There will need to be a Delete State attached to the data publication entity in ICAT so we can track deleted ones. We need this because we have to maintain landing pages for DOIs and can’t delete specific DOIs, so the landing page will have to state that the data has been removed.

    Overall, Users can’t change the data associated with a DP once it’s created. If it needs to change, the DP will have to be deleted and a new one created.
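
The two mechanisms described above (confidentiality flags for automatic minting, and a Delete State for user-defined DPs) can be sketched as follows. Everything here is a hypothetical model for illustration: the class names, the `confidential` and `deleted` attributes and the placeholder DOI are assumptions, not the ICAT schema or its Rule syntax.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    confidential: bool = False  # hypothetical flag checked by the new Rule


@dataclass
class Visit:
    name: str
    confidential: bool = False  # hypothetical Visit-level flag
    datasets: list = field(default_factory=list)


@dataclass
class DataPublication:
    doi: str
    deleted: bool = False  # hypothetical stand-in for the "Delete State"


def releasable_datasets(visit):
    """Datasets the automatic process may reference in a DP for this Visit."""
    if visit.confidential:
        return []  # no DP or DOI should be created for a confidential Visit
    return [d for d in visit.datasets if not d.confidential]


def landing_page_message(publication):
    """The landing page is kept even when the data is removed."""
    if publication.deleted:
        return f"The data for {publication.doi} has been removed."
    return f"Data for {publication.doi} is available."


# Hypothetical example data; the DOI is a placeholder, not a real identifier
visit = Visit("visit-1", datasets=[Dataset("open-set"), Dataset("secret-set", confidential=True)])
print([d.name for d in releasable_datasets(visit)])  # ['open-set']
print(landing_page_message(DataPublication("10.5072/example", deleted=True)))
```

Keeping a deleted DP as a flagged record, rather than removing the row, is what allows the landing page to keep resolving for a DOI that can no longer be withdrawn.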

Questions

 

Expected Output

  • No other user can see the user’s datasets and, by implication, their datafiles



3. Automatic minting of DOIs

Actor

System (or cron job)

Preconditions

  • Any developed software can connect to the ICAT DB.

  • Correct permissions are set to allow the software to access the ICAT.

  • There is sufficient data to fill out the required metadata fields in the DataPublication and DOI.

Use Case Description

At regular intervals, a batch system creates a set of DataPublications & associated DOIs for existing Investigations in ICAT.

Technical Implications / Details

This functionality will be used to do two things:

  1. Create DataPublications and DOIs for the backlog of data that should already be open but isn’t. All Visits whose investigation end date + n days has passed, barring any exceptions provided by Diamond (see here), will need to be processed.

  2. Run as a cron job that regularly checks for Visits added to ICAT that need processing. The investigation end date will be used: the script will need to check whether any investigations have ended in the last 24 h and create DPs and DOIs accordingly, again barring any exceptions provided by Diamond (see here).

The script will need to take a config that lists all the datasets, datafiles or visits that need to be excluded from the minting process. Diamond are to provide this list.
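
A minimal sketch of the two selection modes described above, assuming each Visit is represented by its name and investigation end date, and that the exclusion config has already been loaded into a set of Visit names. The function names, the data shape and the reading of "end date + n days" as "that date has already passed" are assumptions; dataset-level and datafile-level exclusions are not modelled here.

```python
from datetime import datetime, timedelta


def backlog_visits(visits, now, n_days, excluded):
    """Backlog run: Visits whose end date + n days has already passed."""
    cutoff = now - timedelta(days=n_days)
    return [
        name for name, end in visits.items()
        if end <= cutoff and name not in excluded
    ]


def recently_ended_visits(visits, now, excluded, window=timedelta(hours=24)):
    """Cron run: Visits whose investigation ended within the last `window`."""
    return [
        name for name, end in visits.items()
        if now - window < end <= now and name not in excluded
    ]


# Hypothetical example data
visits = {
    "visit-1": datetime(2020, 1, 1),
    "visit-2": datetime(2023, 5, 17, 18, 0),
    "visit-3": datetime(2023, 5, 17, 20, 0),
}
excluded = {"visit-3"}  # would come from the list provided by Diamond
now = datetime(2023, 5, 18, 12, 0)

print(backlog_visits(visits, now, n_days=365, excluded=excluded))  # ['visit-1']
print(recently_ended_visits(visits, now, excluded))                # ['visit-2']
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the functions, makes both modes easy to test and lets a missed cron run be replayed for a specific time.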

This functionality is to be released after the user-defined functionality.

Questions

 

Expected Output

A set of ICAT entities have been processed, and their related:

  • DataPublication has been created in ICAT

  • DOI has been created and is in a findable/searchable state

  • A landing page for each of the processed ICAT entities is created and assigned to the relevant DataPublication and DOI.

 

4. An external review process

Actor

An Editor for the New Scientist Magazine (other experts in the field?)

Preconditions

They have access to the internet

Use Case Description

A scientist who has visited DLS wants their findings published in a fancy magazine. Along with the research, the scientist wants to give the editor temporary read access so they can review the draft findings and make sure they are accurate. Once the editor has reviewed the draft DOI, the scientist can move the DOI from a draft state into a findable, public one.

The scientist will be able to share this via a link which anyone can access (internal ICAT users and external).
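
One possible shape for this, sketched only to make the idea concrete: a draft publication carries an unguessable token that forms part of the shared link, and access rules change once the DOI becomes findable. The class, its attributes and the token approach are all assumptions, and the open questions in this section may lead to a different design.

```python
import secrets


class DraftPublication:
    """A Data Publication whose DOI starts in a draft state (hypothetical model)."""

    def __init__(self, title):
        self.title = title
        self.state = "draft"
        # Unguessable value that would be embedded in the shared link
        self.share_token = secrets.token_urlsafe(16)

    def can_view(self, token=None):
        """Findable publications are public; drafts require the shared token."""
        if self.state == "findable":
            return True
        return token is not None and secrets.compare_digest(token, self.share_token)

    def publish(self):
        """Move the DOI from draft to findable once the review is complete."""
        self.state = "findable"


dp = DraftPublication("Example findings")
print(dp.can_view())                # False: draft, no token supplied
print(dp.can_view(dp.share_token))  # True: reviewer followed the shared link
dp.publish()
print(dp.can_view())                # True: findable and public
```

In this sketch anyone holding the link can read the draft, which matches the "anyone can access" wording above but also means the link itself must be treated as sensitive until the DOI is made findable.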

Technical Implications

  • We would not facilitate the review process. How they communicate or provide sign-off would not be included.

  • A more complicated authorisation model would be needed.

Questions

  • What other existing systems do something similar? For example, Google Docs and YouTube.

  • If we created a sharable link to the DP Landing page, would this:

    • be a draft DOI/DP and not findable by the public?

    • be accessible to anyone with the link?

Expected Output

A publisher is able to review data produced by a visiting DLS scientist and has confidence that the content they provide is scientifically accurate.