WGS Data management course Try-out 2012-09-24, Hugo Besemer


Page 1

WGS Data management course

Try-out

2012-09-24, Hugo Besemer

Page 2

Short time storage: file and path names

MS Windows and Mac OS allow very long names, but ...

Are your filenames descriptive?

Are your filenames unique?

The 8.3 convention (12345678.abc) is important, e.g. when burning CDs or DVDs

Avoid spaces for files that may go on the web

Avoid punctuation such as ( ) \ / : * ? " < > ’ as these characters may be reserved in operating systems or programming languages
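The naming advice above can be checked automatically. The following is a minimal sketch, not a complete rule set: the function names `safe_filename` and `fits_8_3` are hypothetical, and the set of "unsafe" characters is an assumption taken from the list on this slide plus whitespace.

```python
import re

# Characters the slide advises against, plus whitespace.
# The exact set is an assumption based on the slide's list.
UNSAFE = re.compile(r"""[()\\/:*?"<>'’\s]+""")


def safe_filename(name: str) -> str:
    """Replace each run of unsafe characters in the stem with one underscore."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    stem = UNSAFE.sub("_", stem).strip("_")
    return f"{stem}.{ext}" if ext else stem


def fits_8_3(name: str) -> bool:
    """True if the name has at most 8 characters, then optionally a dot and at most 3."""
    return re.fullmatch(r"[^.\s]{1,8}(\.[^.\s]{1,3})?", name) is not None


print(safe_filename("field data (final): v2?.csv"))
print(fits_8_3("12345678.abc"))
```

Replacing characters rather than rejecting the name keeps the file usable; whether underscores are the right substitute is a matter of group convention.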

Page 3

Short time storage: Descriptive file names

Descriptive filename in a folder structure | Not unique (across folders) | Unique (across folders)

This will work for relatively small numbers of files. If large numbers of files are produced automatically, non-descriptive filenames may be used. You then need something else (a DAMS, “Digital assets management system”) to keep track of what is what.
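To illustrate the idea of keeping track of non-descriptive filenames, here is a very small sketch of a manifest: a CSV table that maps each generated filename to a description. This is only a stand-in for a real DAMS. The column names, filenames and descriptions are all hypothetical.

```python
import csv
import io

# Hypothetical manifest: one row per automatically generated file.
MANIFEST_CSV = """filename,description,created
000123.dat,Soil moisture sensor A plot 4,2012-09-20
000124.dat,Soil moisture sensor B plot 4,2012-09-20
"""


def load_manifest(text: str) -> dict:
    """Map each filename to its manifest row (a dict of column name to value)."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["filename"]: row for row in reader}


manifest = load_manifest(MANIFEST_CSV)
print(manifest["000124.dat"]["description"])
```

In practice the manifest would live in a file next to the data, so that the description travels with the files.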

Page 4

Short time storage: version control

Questions and Best practices

●Are you working alone or with others?

●Do you store files at different locations? (synchronisation)

●Keep track of ‘master files’ and ‘milestone files’ and store them in a single location (Dropbox?)

Identifying versions

●Use a naming convention that includes date or number (..._v1, ..._v2)

●Your software may be able to do (part of) the job
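A naming convention with a number or date is easy to apply consistently with a small helper. The sketch below assumes the "_v1, _v2" pattern from the slide and an ISO date (YYYY-MM-DD) for dated names; the function names `next_version_name` and `dated_name` are hypothetical.

```python
import re
from datetime import date
from typing import Iterable


def next_version_name(stem: str, ext: str, existing: Iterable[str]) -> str:
    """Return '<stem>_v<N><ext>', with N one above the highest version in 'existing'."""
    pattern = re.compile(re.escape(stem) + r"_v(\d+)" + re.escape(ext))
    versions = [int(m.group(1)) for name in existing if (m := pattern.fullmatch(name))]
    return f"{stem}_v{max(versions, default=0) + 1}{ext}"


def dated_name(stem: str, ext: str, day: date) -> str:
    """Return '<stem>_<YYYY-MM-DD><ext>'; ISO dates sort chronologically as text."""
    return f"{stem}_{day.isoformat()}{ext}"


print(next_version_name("survey", ".csv", ["survey_v1.csv", "survey_v2.csv"]))
print(dated_name("survey", ".csv", date(2012, 9, 24)))
```

Version numbers are compared as integers, so "_v10" is correctly treated as later than "_v9".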

Page 5

Backups

Stick to the agreed ways of working within your group (if there are any)

The next slides give some points of view from the Wageningen UR IT department (FB-IT)

Page 6

Backups: IT Data storage Continuity

Versus

• Data centre: secure (fire, power incidents, burglary)

• 2 data centres in case of disaster

• The equipment is fail-safe

• 500 TB reserved, 300 TB in use, 1 PB available

Page 7

Backups: ICT Data Products & Services

Service | Application | Price per GB | Backup | Reliable / Available | Speed | Minimal supply
Bronze | Volatile or static data | €3,25 | Week | Good | Good | 50 GB
Silver | Databases or research data | €5,- without backup, €7,- with backup | Month | High | Fast | 50 GB
Gold | Critical data | €15,- | Month + history (max 1 year) | Very high | Fast | 1 GB
Massive | Mass reproducible data | €520 / TB | No | Good | Good | 1 TB
Massive double | Same as Massive, high availability | €1000 / TB | No | High | Good | 1 TB
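A quick way to compare these services is to estimate the cost for a given volume. The sketch below uses the per-GB prices from the table and makes two assumptions that the slide does not confirm: that the "minimal supply" is the minimum billable quantity, and that the price applies per billing period (the period itself is not stated on the slide). The tier keys and the function name `storage_cost` are hypothetical.

```python
# Prices in euro per GB, taken from the table; billing period not stated on the slide.
PRICE_PER_GB = {
    "bronze": 3.25,
    "silver": 5.00,
    "silver_with_backup": 7.00,
    "gold": 15.00,
}

# Minimal supply in GB; treated here as the minimum billable quantity (an assumption).
MINIMUM_GB = {
    "bronze": 50,
    "silver": 50,
    "silver_with_backup": 50,
    "gold": 1,
}


def storage_cost(tier: str, gigabytes: float) -> float:
    """Estimated cost in euro for one billing period."""
    billable = max(gigabytes, MINIMUM_GB[tier])
    return round(billable * PRICE_PER_GB[tier], 2)


for tier in PRICE_PER_GB:
    print(tier, storage_cost(tier, 200))
```

The Massive services are priced per TB and are left out of this sketch.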

Page 8

Backups: Better alignment

(% is total percentage of score + 1 up or down)

Subject | Importance | Score FB-IT
Ease of use | 9 (85%) | 8 (64%)
Backup/Restore | 9 (64%) | 7 (28%) very diverse
Share (internal) | 9 (79%) | 9 (71%)
Share (external) | 6 (28%) very diverse | 5 (21%) many n/a
Archive function | 8 (50%) many n/a | 5 (14%)
Findable | 9 (79%) | 7 (28%)
Price | 9 (86%) | 4 (28%) very diverse
Speed data transfer | 9 (72%) | 5 (21%) very diverse
Availability | 9 (79%) | 8 (64%)
Flexibility | 8 (78%) | 6 (28%)
Security | 7 (50%) very diverse | 8 (57%)

Page 9

Backups: Data storage workshop conclusions

Enhancement requests:

1. Lower the price

2. Set up a corporate policy for information security

3. Higher flexibility (request period, use period, costing, etc.)

4. Accessibility for external people

5. Deliver a product for archiving

6. Higher throughput (data rate)

What is the next step?

● Building a roadmap for IT Storage and Products

Page 10

Long term storage: Metadata

Metadata are structured data that provide a short summary about any information resource, print or electronic, and facilitate the location, identification, or discovery of that resource.

Content metadata: subject terms, titles

Context metadata: creator, place, time, project

Metadata serve different purposes:

●Location. Metadata can indicate where an information resource is located, either physically or virtually.

●Identification. Metadata can distinguish one information resource from another without describing the entire collection of information resources.

●Resource discovery. Metadata can link a user's queries about a particular subject with those information resources about the same subject.
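The distinction between content and context metadata, and the three purposes listed above, can be made concrete with a small record structure. This is only an illustrative sketch: the field names, identifiers and values are hypothetical and do not follow any particular metadata standard.

```python
# Two hypothetical dataset records; every value here is invented for illustration.
RECORDS = [
    {
        "identifier": "dataset-001",                # identification
        "location": "//server/project-x/soil.csv",  # location (hypothetical path)
        "content": {
            "title": "Soil moisture measurements",
            "subject_terms": ["soil moisture", "sensors"],
        },
        "context": {
            "creator": "A. Researcher",
            "place": "Wageningen",
            "time": "2012-09",
            "project": "Project X",
        },
    },
    {
        "identifier": "dataset-002",
        "location": "//server/project-y/yield.csv",
        "content": {
            "title": "Crop yield survey",
            "subject_terms": ["crop yield"],
        },
        "context": {
            "creator": "B. Researcher",
            "place": "Wageningen",
            "time": "2011-06",
            "project": "Project Y",
        },
    },
]


def find_by_subject(records: list, term: str) -> list:
    """Resource discovery: identifiers of records that carry the given subject term."""
    wanted = term.lower()
    return [
        r["identifier"]
        for r in records
        if any(wanted == s.lower() for s in r["content"]["subject_terms"])
    ]


print(find_by_subject(RECORDS, "Soil moisture"))
```

The query never opens the data files themselves; the metadata alone are enough to find and locate the resource.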

Page 11

Long term storage: metadata and datasets

Page 12

Long term storage: metadata and datasets 2

Page 13

Long term storage: metadata and datasets 3

DANS: Dutch national repository for datasets

Unique ID

Page 14

Long term storage: metadata, datasets and preservation

It’s as open as you want it to be

In a sustainable format, independent of (version of) software

With proper documentation for re-use

Page 15

Long term storage: selection

Practical: Easy to reproduce; Cost of documentation / conversion acceptable; File size

Origin: Reliable; Authentic; Is it stored elsewhere?

Status: Required for verification; Required for legal purposes

Subject content: Re-usable; General interest; (WUR) mission

Page 16

What does all this mean for your data management plan?