Post on 06-Aug-2020

Data Management @ IvS - KU Leuven

Department of Physics and Astronomy

How do I … ?

• How do I organize my data?
• How do I store my data?
• How do I document my data?
• How do I back up my data?
• How do I share my data?

Data Organization

• Identify your data
  • final, temporary, shared, …
  • models, observations, processed, statistics, …
  • data publication pyramid, data hierarchy
  • software used to produce, process, archive, …
• Structure your data
  • folder names and file names
  • versions on files or directories
  • find a good balance in subfolders
• File format
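The points on folder names, file names and versions can be made concrete with a small helper that builds every path from one fixed pattern. This is a minimal sketch, assuming a hypothetical layout `<root>/<project>/<kind>/vNN/<name>.<ext>`; the layout, the function name and the example values are illustrative assumptions, not an IvS convention.

```python
from pathlib import PurePosixPath


def data_path(root, project, kind, name, version, ext):
    """Return <root>/<project>/<kind>/vNN/<name>.<ext> (hypothetical layout)."""
    if version < 1:
        raise ValueError("version numbers start at 1")
    version_dir = f"v{version:02d}"  # zero-padded so directories sort correctly
    return PurePosixPath(root) / project / kind / version_dir / f"{name}.{ext}"


# Example values are made up for illustration.
example = data_path("/STER/username", "myproject", "processed", "lightcurve", 3, "fits")
print(example)
```

Putting the version on the directory rather than on each file keeps filenames stable between versions; versioning the files instead is equally valid, as long as the choice is documented.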

Data Storage

• do not use your home directory
• use /STER and request disk space @ system
  • different /STER possible for sharing, projects, data usage, …
• do not use external disks for important research data
• know how much data you have and need
• think about a file structure
• think about data ageing – what data can be deleted?
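To know how much data you have before requesting disk space, you can sum the file sizes under a directory tree. The sketch below uses only the Python standard library; the function names are illustrative, and the path in the usage comment is a placeholder.

```python
import os


def tree_size_bytes(root):
    """Sum the sizes of all files below root, skipping symbolic links."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human_readable(nbytes):
    """Format a byte count with binary units (1 KiB = 1024 B)."""
    size = float(nbytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


# Usage (placeholder path):
# print(human_readable(tree_size_bytes("/STER/username/myproject")))
```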

Data Format

• ASCII for small files only (configuration, metadata, …)
  • ASCII takes about 4x more space than binary for large data files
• FITS
• HDF5
• Database
• Use standards to make your data exchangeable

Document the data format and the metadata
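The size difference between ASCII and binary storage is easy to measure. The sketch below stores the same 1000 floating-point values once as raw 8-byte doubles and once as text with 17 significant digits after the decimal point, one value per line. The exact ratio depends on how many digits and which delimiter you write, so treat the result as an illustration rather than a fixed factor.

```python
import array
import random

random.seed(0)  # fixed seed so the example is reproducible
values = [random.random() for _ in range(1000)]

# Binary: 8 bytes per double-precision value.
binary = array.array("d", values).tobytes()

# ASCII: scientific notation, one value per line.
text = "\n".join(f"{v:.17e}" for v in values).encode("ascii")

print(f"binary: {len(binary)} bytes")
print(f"ASCII:  {len(text)} bytes")
print(f"ratio:  {len(text) / len(binary):.2f}")
```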

Data Type

• double precision vs single precision
  • calculations vs storage
• (u)int64/32/16/8
• avoid redundancy
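The trade-off between precision and storage can be sketched with the standard library alone: compute in double precision, then check what is lost when the result is stored in single precision, and pick the narrowest integer type that holds your values. The helper name `smallest_uint_bits` and the test data are illustrative assumptions.

```python
import array

values = [0.1 * i for i in range(1000)]  # computed in double precision

as_double = array.array("d", values)  # 8 bytes per value
as_single = array.array("f", values)  # 4 bytes per value, rounded on storage

print("double:", as_double.itemsize * len(as_double), "bytes")
print("single:", as_single.itemsize * len(as_single), "bytes")

# Largest absolute error introduced by storing in single precision.
max_err = max(abs(d - s) for d, s in zip(as_double, as_single))
print("max rounding error:", max_err)


def smallest_uint_bits(max_value):
    """Return the narrowest unsigned width (8/16/32/64 bits) that holds max_value."""
    for bits in (8, 16, 32, 64):
        if 0 <= max_value < 2 ** bits:
            return bits
    raise ValueError("value is negative or does not fit in 64 bits")
```

Whether the rounding error is acceptable depends on the data; the point is to decide it deliberately instead of defaulting to the widest type.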

Documentation

• README (txt, md, rst)
  • describe file structure including filename conventions
  • describe data format
  • describe metadata
  • describe usage, software to access
  • describe versions and related software changes
• create a Data Management Plan
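A README is easier to start when the sections listed above already exist as empty headings. The sketch below generates a Markdown skeleton with one section per item; the section titles, hint texts and function name are illustrative assumptions, so adapt them to your project.

```python
SECTIONS = [
    ("File structure", "Folder layout and filename conventions."),
    ("Data format", "File formats used (for example FITS, HDF5) and their layout."),
    ("Metadata", "Which metadata is stored, and where."),
    ("Usage", "How to read the data and which software is needed."),
    ("Versions", "Data versions and the related software changes."),
]


def readme_skeleton(title):
    """Return a Markdown README skeleton with one TODO per section."""
    lines = [f"# {title}", ""]
    for heading, hint in SECTIONS:
        lines += [f"## {heading}", "", f"TODO: {hint}", ""]
    return "\n".join(lines)


print(readme_skeleton("My dataset"))
```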

Backups

• Automatically by system
  • daily, incremental
  • home, STER (unless no-backup), NextCloud
• Not all data needs a backup
  • observational
  • easily re-generated
  • temporary ⟶ scratch disk

Data Sharing

• use Belnet FileSender for one-time sharing
• use NextCloud for indefinite sharing and request disk space @ system
• contact system for sharing large amounts of data for a long time

Software

• use git and GitHub to manage your software


Questions

• When do we need to provide the DMP? Is there a deadline?
  • Depends on the project's funding, usually 6 months
• Where do we have to send the DMP?
  • Depends on the project's funding; check the websites in Resources
• Is it possible to have private repositories in GitHub?
  • Yes, as long as you are a member of the IvS-KULeuven organization in GitHub
• What happens to my GitHub repos when I'm no longer @ IvS?
  • You either go for the paid option in GitHub, or, when you downgrade to a free account, your private repo will no longer be accessible. Take a backup before downgrading.
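One way to take a backup of a repository before downgrading is a mirror clone, which copies all branches and tags. The sketch below wraps `git clone --mirror` in Python; the function names are illustrative, and the repository URL and destination in the usage comment are placeholders, not real IvS repositories.

```python
import subprocess


def mirror_command(repo_url, dest_dir):
    """Build the git command that makes a full mirror clone of a repository."""
    return ["git", "clone", "--mirror", repo_url, dest_dir]


def backup_repo(repo_url, dest_dir):
    """Run the mirror clone; raises CalledProcessError if git fails."""
    subprocess.run(mirror_command(repo_url, dest_dir), check=True)


# Hypothetical usage (repository name and destination are placeholders):
# backup_repo("git@github.com:IvS-KULeuven/my-private-repo.git",
#             "/STER/username/backups/my-private-repo.git")
```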

Questions cont.

• How can I check if and when the last backup was successful?
  • check with system if you are worried; we can send notifications if required
• How long is the data available on a scratch disk?
  • the scratch disks are usually about 200 GB; clean up yourself after you have saved your data to /STER. We only clean up the /scratch area when there is a need to (and after checking with the owner of the data) or when the systems are re-installed (usually once per year).
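Since cleaning up the scratch area is left to you, it helps to list what has not been touched for a while. The sketch below only reports files older than a given number of days and deletes nothing; the function name and the 30-day threshold in the usage comment are illustrative assumptions.

```python
import os
import time


def files_older_than(root, days, now=None):
    """Return sorted (path, size_in_bytes) for files under root not modified in `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            try:
                info = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                old.append((path, info.st_size))
    return sorted(old)


# Usage: list candidates for cleanup, then check each one before removing it.
# for path, size in files_older_than("/scratch", 30):
#     print(size, path)
```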