Approaches to Archiving Professional Blogs Hosted in the Cloud

A centre of expertise in digital information management, www.ukoln.ac.uk. Approaches to Archiving Professional Blogs Hosted in the Cloud. iPRES 2010, Vienna, Austria. Tuesday, September 21st 2010. Marieke Guy, Research Officer, UKOLN, www.bath.ac.uk. This work is licensed under an Attribution-NonCommercial-ShareAlike 2.0 licence. http://www.ukoln.ac.uk/web-focus/papers/pres-2010/paper25/

Upload: marieke-guy

Post on 07-May-2015


DESCRIPTION

'Approaches to Archiving Professional Blogs Hosted in the Cloud' presentation given by Marieke Guy, UKOLN on September 21, 2010 at the 7th International Conference on Preservation of Digital Objects (iPRES2010), Vienna, Austria. Available at http://www.ukoln.ac.uk/web-focus/papers/pres-2010/paper25/

TRANSCRIPT

Page 1: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             

A centre of expertise in digital information management

www.ukoln.ac.uk


UKOLN is supported by:

Approaches to Archiving Professional Blogs Hosted in the Cloud

iPRES 2010, Vienna, Austria

Tuesday, September 21st 2010

Marieke Guy

Research Officer, UKOLN

www.bath.ac.uk

This work is licensed under an Attribution-NonCommercial-ShareAlike 2.0 licence

http://www.ukoln.ac.uk/web-focus/papers/pres-2010/paper25/

Page 2: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Introduction to UKOLN
• UKOLN is a centre of excellence in digital information management, providing advice and services to the library, information and cultural heritage communities
• Library and cataloguing background
• Located at the University of Bath, UK
• Funded by JISC to advise UK HE and FE communities
• Also project funding, including EU funding
• Many areas of work including metadata, repositories, dissemination activities, eScience, etc.
• Digital preservation projects: DRIVER, CEDARS, eBank, JISC Preservation of Web Resources, Beginners Guide, etc.
• Digital Curation Centre

Page 3: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Why blogs? Why in the Cloud?

• Ease of creation, ease of use, ease of sharing
• Increasingly used for reflecting, analyzing, questioning, critiquing, recording, discussing, learning, etc.
• Very important for information professionals
• Many dissemination benefits
• Lack of institutional blogging infrastructure
• UKOLN supports innovation
• Cloud is an agile, cost-effective, highly useable way to deliver a service
• Now own institutional service and over 15 blogs

Page 4: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


The Professional’s blog

• Established 2006
• 750+ posts
• 240 users per day
• Personal style
• Institution vs individual?

Page 5: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


The Project blog

• 2008 - 2010
• 118 posts
• 141 comments
• 6 contributors
• Professional style

Page 6: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


The Event blog

• June – August 2009
• 68 posts
• 3 contributors + guests
• Video, interviews, photos, discussion
• Informal/professional style

Page 7: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Why Preserve blogs?
• Contain useful information
• Information not available elsewhere
• Look and feel relevant
• Cultural significance
• Reliance on 3rd party services
• Blogs disappear (UK HE funding cuts…)
• ‘Archiving’ - ways in which blog content can be migrated to alternative environments in order to satisfy a number of business functions
• Focus on short-term continuity and management
• Could comprise part of a preservation strategy

Page 8: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Different Approaches:
• New Static Master Copy
• Backup Copy
• Migration to Another Platform
• Physical Manifestation
• Other technical approaches
• What are the issues with each of these?

http://www.flickr.com/photos/mnsc/433436548/

Page 9: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


New Static Master Copy
• Migrate blog to static HTML
• Point to new static resource
• IWMW – WinHTTrack static copy
Issues:
• No interactivity
• Loss of technical architecture e.g. plugins
• Loss of other elements e.g. comments
• Look and feel

Page 10: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Backup Copy
• Using XML, using HTML?
• Where?
• On the server? On a disc? On an external hard drive?
• On the same blog platform?
• ArchivePress
• On alternate blog platform?
• JP XML version on Intranet
• IWMW static version on Intranet
Issues:
• Access
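Wherever a backup copy ends up (server, disc, or external hard drive), it helps to be able to check later that it is still intact. The following is a minimal sketch, assuming the blog has already been exported to a single XML file. The function names, the `.sha256` manifest file, and the file names in the usage note are illustrative assumptions, not part of the work described in these slides.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_export(export_file: Path, backup_dir: Path) -> Path:
    """Copy an exported blog file into backup_dir and record its checksum.

    The manifest name (<file>.sha256) is a hypothetical convention.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    copy_path = backup_dir / export_file.name
    shutil.copy2(export_file, copy_path)

    checksum = sha256_of(copy_path)
    if checksum != sha256_of(export_file):
        raise OSError(f"Backup of {export_file} does not match the original")

    manifest = backup_dir / (export_file.name + ".sha256")
    manifest.write_text(f"{checksum}  {export_file.name}\n")
    return copy_path


def verify_backup(copy_path: Path) -> bool:
    """Check a backup copy against the checksum recorded at backup time."""
    manifest = copy_path.with_name(copy_path.name + ".sha256")
    recorded = manifest.read_text().split()[0]
    return recorded == sha256_of(copy_path)
```

A call such as `backup_export(Path("blog-export.xml"), Path("backups"))` would create the copy and its manifest, and `verify_backup` can be re-run whenever the storage location is audited.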

Page 11: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Migration to Another Platform

• Live blog to alternate platform
• Could just be for data mining purposes – can’t do on current environment
• UKWF VOX platform, RSS feeds used, Yahoo pipes
• Export feature
Issues:
• Access
• Loss of technical architecture e.g. plugins
• Loss of other elements e.g. comments
• Look and feel
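To illustrate the RSS route mentioned above, here is a minimal sketch that reads the posts out of an RSS 2.0 feed so they could be re-imported elsewhere. It is not the pipeline used for the case study blogs: the `FeedPost` class, the `parse_rss` function and the sample feed are assumptions made for illustration. A feed often carries only recent posts and usually no comments, which matches the losses listed under Issues.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class FeedPost:
    """One post as exposed by an RSS 2.0 <item> element."""
    title: str
    link: str
    published: str
    body: str


def parse_rss(feed_xml: str) -> List[FeedPost]:
    """Extract title, link, date and description from every <item>."""
    root = ET.fromstring(feed_xml)
    posts = []
    for item in root.iter("item"):
        posts.append(
            FeedPost(
                title=item.findtext("title", default=""),
                link=item.findtext("link", default=""),
                published=item.findtext("pubDate", default=""),
                body=item.findtext("description", default=""),
            )
        )
    return posts


# Hypothetical feed content, used only to demonstrate the parser.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example project blog</title>
    <item>
      <title>First post</title>
      <link>http://blog.example.com/first-post/</link>
      <pubDate>Mon, 01 Jun 2009 09:00:00 GMT</pubDate>
      <description>Welcome to the blog.</description>
    </item>
    <item>
      <title>Final post</title>
      <link>http://blog.example.com/final-post/</link>
      <pubDate>Mon, 31 Aug 2009 17:00:00 GMT</pubDate>
      <description>This blog is now archived.</description>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for post in parse_rss(SAMPLE_FEED):
        print(post.published, post.title, post.link)
```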

Page 12: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Physical Manifestation
• Create a hard copy print out e.g. self-publishing
• Create PDF of site, RSS2PDF
• UKWF Lulu self published book available
• Purpose specific
Issues:
• Obviously not interactive but record unlikely to degrade like other options

Page 13: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Technical Approaches
• HTML Scraping
  – HTTrack – static Web site created
• Third-party Web archiving
  – UK Web Archive
  – Internet Archive
  – Not always complete capture but useful for look and feel
  – URL submitted for case study blogs
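HTML scraping tools build a static copy by fetching a page, finding the links that stay on the same site, and repeating. The sketch below shows only that link-discovery step using the Python standard library. It is an illustration of the general idea, not a description of how HTTrack works internally, and the class and function names are assumptions.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html: str, base_url: str) -> List[str]:
    """Return de-duplicated links that stay on the blog's own host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    kept: List[str] = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in kept:
            kept.append(link)
    return kept
```

A crawler would fetch each returned URL in turn and save the response to disk. Links to other hosts (embedded video, photo services) are skipped here, which is one reason a scraped copy can be incomplete.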

Page 14: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Freezing a blog
• Assessment of status of blog
• Audit - Get your house in order: links to embeds, comments, spam, etc.
• Preliminary posts
• Statistics: dates, posts, comments, spam, contributors, theme, plugins, software, licence etc.
• Archive page/sidebar widget
• Final post
• Indication that blog is archived
• Close comments
• Archive blog

http://www.flickr.com/photos/plousia/93646438/
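The statistics step in the freezing checklist can be sketched as a small summary over the blog's posts, suitable for display on an archive page. This is a minimal illustration that assumes post data has already been gathered. The `Post` fields and the keys of the returned dictionary are hypothetical, not a format defined in the talk.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Post:
    """Hypothetical record of one blog post for auditing purposes."""
    author: str
    published: date
    comments: int
    spam_comments: int


def blog_statistics(posts: List[Post]) -> Dict[str, object]:
    """Summarise dates, posts, comments, spam and contributors."""
    if not posts:
        raise ValueError("Cannot summarise a blog with no posts")
    per_author = Counter(post.author for post in posts)
    return {
        "first_post": min(post.published for post in posts).isoformat(),
        "last_post": max(post.published for post in posts).isoformat(),
        "posts": len(posts),
        "comments": sum(post.comments for post in posts),
        "spam_comments": sum(post.spam_comments for post in posts),
        "contributors": sorted(per_author),
        "posts_per_contributor": dict(per_author),
    }
```

Details such as theme, plugins, software version and licence are not derivable from posts alone and would need to be recorded by hand alongside this summary.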

Page 15: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


The Archive Page

Page 16: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


General Issues
• What constitutes a blog? – content, layout, plugins, comments, tags, images, multimedia, etc.
• Who owns a blog?
• Identity, copyright, ownership and licences
• Privacy
• Permissions to access blogs belonging to individuals
• Understandability of pages if out of context
• Blog policies
• Availability

Page 17: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Best Practice Checklist
• Planning
• Clarification of rights
• Monitoring of technologies used
• Auditing
• Understanding of costs and benefits
• Identification and implementation of archiving strategy
• Dissemination
• Learning
• Organisational Audit

Page 18: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Lessons Learnt
• Need for a risk assessment framework if using third party services
• Importance of planning and writing of blog policy at start of blog lifecycle
• Useful to consider a combination of approaches rather than just one
• Value of sharing best practice of blog archiving

Page 19: Approaches to Archiving Professional Blogs Hosted in the Cloud

                                                             


Questions?
• Twitter Id: mariekeguy
• Email: [email protected]
• Slides: http://www.slideshare.net/MariekeGuy
• All resource URLs tagged with ipres2010-blogs: http://delicious.com/mariekeguy/ipres2010-blogs