Digital Curation Plan ~LIS 889~ Lisa Roberts
Fall 2016~ Digital Curation~ LIS 889 Dr. Stacy Kowalczyk
Digital Curation Plan
Abstract:
In contemporary society, institutions are dealing with an unprecedented volume of constantly
changing data that public and professional users demand. Information technology within
institutions is relied on to make sure that essential information for users is accessible,
reliable, secure, of high quality, and trusted. Trusted sources are driven by integrity, innovation
and opportunity. Contemporary culture creates great opportunities; collaborative efforts
to build relationships with other institutions are therefore vital to the success of the long-term
preservation of the born-digital e-collections presented here.
The goal is to share and sustain data across social cultures divided by economic
disparities and social class barriers. Sharing information with the Smithsonian
Institution fits this description and will serve as a globally accessible educational
template. Pairing a smaller collection with a larger establishment such as the
Smithsonian will bridge the knowledge gap for research and scholarship. Under-resourced
institutions need advocacy-based organizations that can articulate the need to bridge the
opportunity gap that continues to distress underserved communities. This will help alleviate the
strain on smaller educational establishments that may not have access to the information
available in larger collections. Smaller institutions lack adequate economic resources, and
more resources yield better results. The Smithsonian Institution is the ideal partner for this
proposed blueprint.
Why digital curation is important:
Ever lost vital information on a portable device? Have any of your personal devices ever
crashed, become corrupted, been lost or suddenly vanished? A good number of people
have experienced at least one of these unforgiving scenarios. Data loss can happen at any
time and on any level, whether the data lives on a smart phone, a USB drive, a tablet, in a cloud
or on a large online data storage network. Digital curation is important; it establishes digital
assets, maintains preservation and adds value to digital repositories for present and future access.
Drawback:
“Because of these technical dependencies, digital objects are by nature very fragile, often more at
risk of data loss and even sudden death than information recorded on brittle paper or nitrate film
stock.” (Susan Schreibman, 2004)
Keeping born-digital information accessible and reusable is one of the challenges
that experts face on a consistent basis. The risk of data loss is extremely high when more than
half of the world’s population uses smart phones and the internet. Because information changes
constantly, it must be migrated in a form that remains interchangeable, able to crosswalk with
other software and to survive technological upgrades. The purpose of formulating a digital
curation plan for born-digital data is to safeguard the information in an authentic and complete
manner.
The expertise of computer scientists, engineers, librarians, information technology specialists,
archivists and scholars is critical to ensuring that data can still be used when the software and
hardware that created it are no longer supported. This expensive task requires relentless
and persistent training to keep data manageable and flowing without interruption over time, so
that it remains accessible for future use.
Benefits:
The benefit of a digital curation plan is that it guides the maintenance of specific data that
librarians, scientists, scholars and historians have invested a large amount of time in preserving.
The goal of this project is to create an online archive database for the born-digital E-Science,
Humanities and social media collections that will be preserved. With supported software and
hardware from a capable digital repository such as the Smithsonian, the fear of data becoming
obsolete over time is no longer a severe threat. The objective is to provide open access to
born-digital data for scholars, researchers and other users. The digital curation plan for the
three collections presented will be a joint collaborative effort with the Smithsonian Institution.
The plethora of assets that already exist within the organization will secure proper management
of the smaller collections.
Given the rapid growth of file formats and metadata schemas, the Smithsonian Institution is a
well-established organization with extensive resources that can aid the long-term preservation
effort. Please review each collection's digital curation goals and assessments. Feedback is
welcomed.
E-Science Collection #1
The South Aegean Volcanic Arc
I. Introduction
Goals of Assessment: The objective is to create an active online archival database for its users.
The petrographic analysis database of pottery and other ceramic artifacts has been dormant.
The goal is to help preserve the extensive collection of delicate ceramic thin sections that
exist worldwide, and to build a relationship with a successful organization that will be able to
sustain long-term preservation efforts for the collection.
Describe the Institution:
The Digital Curation Plan will build on the Smithsonian Institution’s track record.
“The Smithsonian is the world’s largest museum, education, and research complex.
The Smithsonian is a museum and research complex, comprised of 19 museums and galleries
and the National Zoological Park. The total number of objects, works of art and specimens at the
Smithsonian is estimated at nearly 137 million. The collections range from insects and meteorites
to locomotives and spacecraft.” Sharing information with the Smithsonian fits the description
and will serve as an educational component globally. Pairing a smaller collection
with a larger establishment such as the Smithsonian will bridge the knowledge gap for research
and scholarship.
Describe the Collection:
The South Aegean Volcanic Arc (SAVA) website is a raw material database that was designed for
comparative archaeological and geological research. The compilation of scientific data holds
embedded databases that were strategically researched, collected and assembled. Essential
provenance data for the artifacts were gathered and consist of text, fieldwork studies, images,
numbers and animation. “The site is designed to be used intuitively by multidisciplinary
researchers. It is hoped that the community-based approach will draw the viewer into both
numerical and visual databases.” (Indiana University)
Describe how the collection fits within the scope of the institution: The collection will be
stored on the online Smithsonian Global Volcanism Program page, with the original link
allowing researchers and scholars to gain access to information that will aid the research
process. The SAVA data is already collected in a website that is referred to as a database.
https://web.archive.org/web/20131107195639/http://www.indiana.edu/~sava/
With that foundation established, the SAVA collection database will support the needs of
scholars, students, researchers and future professionals. Online sharing
will bridge the curation gap. http://volcano.si.edu/
Purpose:
The purpose of this collection is to identify the environment of origin by collecting, sharing,
preserving, analyzing and integrating geological data for scholars, scientists and specialists.
Quantitative rock and clay-rich sediment data can be used for the provenancing of artifactual
material from the SAVA region.
II. Collection Inventory
The collection's inventory consists of numerical data, image data, maps, and collaborative
contributions from specialists. These files include folders that contain JPEG images, PowerPoint
slides, Excel spreadsheets and Microsoft Word documents.
Inventory Files:
Numerical data, Image data, Map data, Data Interpretation, PowerPoint Slides, Supplementary
Material Inventory Files and collaborators.
1. Numerical Data
Geochemical database files- inventoried in PowerPoint (PPT) (Findings: Nature of Aeginetan
Ware Distribution: "Local Cultural Change Model")
Searchable database- Page could not be found
Aegina Island geologic history files- data only; an additional link will be added for further
research
Geologic fieldwork files- animated interactive satellite images are accessible with the
recommended flash drive download. Researchers are able to use interactive research
maps, which are available through the map application.
2. Image Data
Ceramic database files- some of the ceramic images from the ceramic artifact database have not
been integrated. The data is charted and categorized by time period, source and reference
points. Along with the artifact image, each entry comprises mineralogical, chemical and
archaeological data. “The mineral composites are data in the form of chronology, class, shape,
fabric, provenance and reference samples”. (taken from website)
EXAMPLE OF DATASET
Database (data type) | Datasets
Ceramic | Athens, Halieis, Kolonna, Tsoungiza
Petrographic | Aegina, Kolonna, Kolonna Backscatter Images, Lerna, Mercouri, Asine
Maps | (General)
3. Map database files- JPEG format
Map database files- The map database files hold multiple samples that were collected from the
arc lava flows. The data is intended for, and accessible for, research. Database
records hold itemized Kolonna, Halieis, Athens and Tsoungiza files.
Digital images for the seven sample sets from the distribution list of Aeginetan Ware are
related to their respective geochemical analyses. Along with the artifact images, each entry
comprises mineralogical and chemical data, archaeological data for the samples, elemental
mineral composition samples and references to petrography.
Petrographic database files- “The Petrographic data collected gives a detailed descriptions
of the thin-section collections for the sites of Lerna, Kolonna, Halieis, Tsoungiza, Athens,
Asine and Asea. In addition, a thin-section collections for the rocks and clay-rich sediments
which make up the SAVA raw material reference data for both samples and their reference
materials are available.” (Indiana University)
4. Data Interpretation
AW provenance report- PDF
5. PowerPoint Slides
Lerna House of Tiles, Source Clay for Aeginetan Ware, Ceramic Technology for Aeginetan
Ware, Overview for Distribution Problems, Local Cultural Change Model and Environmental
Change Model.
6. Supplementary Material Inventory Files
The inventory has four main file folders, each containing JPEG images of scientific data charts.
There are six Microsoft Word documents that are described with tables and six
Microsoft Excel spreadsheets. Some of the data is not accessible. Adding the original link as
a tool will allow the data to be shared with the public.
a) Tables (Kolonna, Halieis, Athens and Tsoungiza are itemized in JPEG format)
b) Individual Na-K plots
c) Geochemical data files
d) Protocols for data manipulation (laboratory notes and/or notebooks that document specific
data files; the documentation describes all columns, units, abbreviations and missing-value
identifiers)
7. Collaborators
Dr. J. BROPHY, Dr. C. SHRINER, Dr. G. CHRISTIDIS, Dr. H. MURRAY, Dr. C. LI, Dr. A.
SCHIMMELMANN and Dr. E. RIPLEY.
Determine the file type:
The SAVA online database files are inventoried and cataloged into file folder directories that
contain scientific data archived as JPEG images, Microsoft Word documents and
Microsoft Excel spreadsheets.
Describe the significant properties of the data:
The significant property of the collection is the historical context within the data. The
artifact images are accompanied by mineralogical, chemical and archaeological data. The
collection provides sample sets of “presumed Aeginetan ware excavated throughout the Aegean
and mainland Greece.” (Indiana University)
III. Curation Plan
What data will be archived? All of the data will be archived. It fits the scope of the already
established volcanism website, and the link will be an additional source for reference.
Rationale and criteria for the decisions- Choosing the Smithsonian museums as the focus of the
digital plan is logical because the institution caters to all disciplines. The Smithsonian not
only caters to visiting patrons but also has a strong online presence that provides
straightforward access to its multitude of collections. The website interface provides broad
navigational opportunity for its global users. Choosing the Smithsonian for the digital curation
plan will add value to the collection as well as provide vital information for researchers,
educators and knowledge seekers. The Smithsonian is a trusted organization that prides itself on
quality curation and preservation throughout the lifecycles of its collections.
What preservation strategy will be employed?
How will the data be archived? The data will be archived using hardware/software dependency:
Description and organization
IV. Metadata
What metadata already exist:
Contextual- The collection's pre-existing metadata is descriptive and structural. It
summarizes, textually and technically, the data that exists in the collection. The
collection is categorized in an online database by contextual data, numerical data, image data and
topographical data. “Artifact images are documented with mineralogical, chemical and
archaeological data.” The existing data are categorized into databases/datasets that are described
by provenance, fabric, location, type, shape, references, and elements. Some of the information is
not analyzed. These products provide access to a variety of environmental data through online
map applications that allow users to select and view data based on geographic parameters.
What standards will be used?
The International Organization for Standardization (ISO) 19115 geospatial metadata standard
will be used because it is the most common and widely shared. Records will be written in XML,
which yields a more complete file. Together, the standard and the XML format will provide
compatibility. ISO is an independent, nongovernmental membership organization that develops
voluntary standards; it also published the Open Archival Information System (OAIS) reference
model. The metadata is text based and documented.
How will the metadata be created and/or enhanced?
The metadata will be created using online XML editor software. The software includes the
allowable elements, prompts for them and validates the results, so the descriptive information
about the added datasets can aid use, discovery and navigation when the metadata is shared.
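As a hedged illustration of the kind of record such an editor produces, the following Python sketch assembles a very small ISO 19115-style record in the ISO 19139 XML encoding, using only the standard library. The identifier, title and abstract values are placeholders invented for the example, and the record carries only a handful of elements, so it is not schema-valid as it stands. The `missing_fields` helper is a hypothetical stand-in for the validation an XML editor would perform.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the ISO 19139 XML encoding of ISO 19115 metadata.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def text_element(parent, tag, value):
    """Add <gmd:tag><gco:CharacterString>value</gco:CharacterString></gmd:tag>."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = value
    return wrapper


def build_record(identifier, title, abstract):
    """Build a minimal, illustrative (not schema-complete) metadata record."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    text_element(root, "fileIdentifier", identifier)
    info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_id = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_id, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    text_element(ci_citation, "title", title)
    text_element(data_id, "abstract", abstract)
    return root


def missing_fields(root):
    """Return the names of the checked text fields that are absent or empty."""
    missing = []
    for name in ("fileIdentifier", "title", "abstract"):
        node = root.find(f".//{{{GMD}}}{name}/{{{GCO}}}CharacterString")
        if node is None or not (node.text or "").strip():
            missing.append(name)
    return missing


# Placeholder values; a real record would come from the SAVA documentation.
record = build_record(
    "sava-ceramic-kolonna",
    "SAVA ceramic database: Kolonna",
    "Ceramic thin-section records with mineralogical and chemical data.",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
print("Missing fields:", missing_fields(record))
```

A full ISO 19115 record requires more elements (contact, date stamp, language and so on); validating against the published schemas would be the job of the XML editor named in the plan.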
V. Discovery
How will patrons access this material?
With advanced technology, patrons will have unlimited access to the Smithsonian online digital
collection located on the National Museum of Natural History Global Volcanism Program (GVP)
website. The Smithsonian’s digital collections for global volcanism include reports, bulletins,
databases, galleries, and research activities. The E-Science collection of the SAVA database
would fit within the scope of the institution’s guidelines. The GVP is housed in the
Department of Mineral Sciences, National Museum of Natural History. http://volcano.si.edu/
Updated technologies afford patrons the opportunity to access material through online tools. The
Smithsonian uses new digital technologies to maximize preservation efforts in an accelerated
environment.
VI. Staffing- The National Museum of Natural History Global Volcanism Program
Curatorial: 1 preservation specialist
Technical: 5 librarian technicians and 1 staff member for IT support
Management: The head of the natural and physical sciences department, 1 assistant department head
Clerical: 2 assistants
Digital Humanities Collection #2
The Owens Family
A researcher has developed a collection of digitized original documents about the Owens family.
This prominent, historical family settled the New Harmony community, a utopian commune, in the
early 1800s in southern Indiana. This community was known as the Athens on the Wabash for
its support of the arts and sciences and was a major contributor to 19th century knowledge and
culture. As his research on this collection is complete, with papers and books written and
published, this researcher wishes to contribute all of his research materials to an archive for
long-term curation. The data is divided into two directories.
Describe the Collection
I. Introduction
Goals of the assessment- The goal is to create an online database for long-term
preservation. The challenge will be maintaining the original data as the infrastructures in
which they were created change radically. Data loss can be an issue; the risk can be
reduced by providing backups that keep pace with changing technology.
Describe your institution- The Digital Curation Plan will build on the Smithsonian's
track record. “The Smithsonian is the world’s largest museum, education, and
research complex. The Smithsonian is a museum and research complex, comprised of 19
museums and galleries and the National Zoological Park. The total number of objects, works
of art and specimens at the Smithsonian is estimated at nearly 137 million. The collections
range from insects and meteorites to locomotives and spacecraft.” Sharing
information with the Smithsonian Institution fits the description and will serve as an
educational component globally. Pairing a smaller collection with a larger
organization such as the Smithsonian will bridge the knowledge gap for research and
scholarship.
Describe the collection- The collection holds files from the Owens family, who were settlers
in the early 19th century; records date back as early as 1825. They settled in the community
of New Harmony, located in southern Indiana. The Owens strove to create a utopian society
by bringing together the best professionals. The Owens were major contributors to culture, the
arts and the sciences in this community. The collection holds a variety of business and personal
papers, ledgers, correspondence, journals, diaries and other records belonging to the Owens
family. The collected works are categorized into two directories.
Describe how the collections fit within the scope of the institution- “The Smithsonian's
collections represent our nation's rich heritage.” The Smithsonian Institution strives to
improve the experience and strengthen its mission of providing access to information. The
Smithsonian digital curation management and research team actively takes steps to
maintain integrity, data assurance and high-quality software, to name a few. “Some data
cannot be replaced if destroyed or lost,” so digital curation and data preservation are ongoing
processes that require best practices for institutions and researchers to manage and retain
their research data. The National Museum of American History Library is the ideal place for
the long-term preservation of the Owens Family Digital Collection.
II. Collection Inventory – categorized into 2 directories.
o Inventory files- Directory one holds email, New Harmony and New Harmony people
files. They are preserved in GIF, JPEG, and PNG file formats.
Chart EXAMPLE
Data | Data Type | Format/Duration/Size | Planned Access
Numerical | Digital image | JPEG | Open access via CBS website (bristol.ac.uk) & JORUM
Harmony Society | Manuscript | 15 page docx |
Correspondence_List | | Xlsx, excel |
History of New Harmony | | 10 pages |
History_of_anarchism | | 10 pages docx |
In_Log_Cabin | digital Image | jpg |
Inventory | | txt |
New harmony | | .html |
nh_people | | .html |
NH_1834_Map | Map | Jpg |
Notes.Branigan | | .docx |
Old George Walker | Digital Image | Jpeg |
Original House | Digital Image | Jpeg |
Rapp Owan Granary | Digital Image | Jpeg |
o Directory two is stored in a zip file; the file has 2 downloadable folders: MACOSX and
Owens_Correspondence. The MACOSX folder is not accessible without updated software.
Even with the updated software (Adobe), the content of its 184 files is not
retrievable. The Owens_Correspondence directory holds 74 folders, and
metadata exists for each record. For example, one folder holds a journal with 184 pages;
these pages are stored as JPEG files, each page in a single JPEG file.
Each of the 74 Owen correspondence folders contains individual JPEG images of original
scanned documents, ranging from 1 document in a folder up to 180 JPEG records in a folder.
o Determine the file types- The files contain digitized still images. The images were made from
original tangible documents and are preserved in JPEG format.
o Describe the significant properties of the data- The significant property of the data
collection is the historical context within the data. The files contain images of text that
were digitized from physical documents into JPEG images.
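The directory-two inventory described above (folders of page images inside a zip file) could be tallied with a short script. The following Python sketch counts JPEG files per folder and sets aside macOS housekeeping entries. It assumes that the "MACOSX" folder mentioned in the plan is the `__MACOSX` folder that macOS adds to zip archives, which is an interpretation rather than something the source confirms. The file names in the demonstration archive are hypothetical.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath

IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def inventory_zip(zip_file):
    """Count JPEG files per folder, skipping macOS housekeeping entries."""
    counts = Counter()
    skipped = 0
    with zipfile.ZipFile(zip_file) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = PurePosixPath(info.filename)
            # Assumption: "MACOSX" in the plan means the __MACOSX folder
            # (and .DS_Store files) that macOS adds to zip archives.
            if "__MACOSX" in path.parts or path.name == ".DS_Store":
                skipped += 1
                continue
            if path.suffix.lower() in IMAGE_SUFFIXES:
                counts[str(path.parent)] += 1
    return counts, skipped


# Small in-memory archive standing in for the real zip file (hypothetical names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("Owens_Correspondence/journal/page_001.jpg", b"")
    demo.writestr("Owens_Correspondence/journal/page_002.jpg", b"")
    demo.writestr("Owens_Correspondence/letter/page_001.JPG", b"")
    demo.writestr("__MACOSX/Owens_Correspondence/._page_001.jpg", b"")
buffer.seek(0)

counts, skipped = inventory_zip(buffer)
for folder, total in sorted(counts.items()):
    print(f"{folder}: {total} JPEG file(s)")
print(f"Skipped {skipped} macOS housekeeping file(s)")
```

Passing the path of the actual archive to `inventory_zip` instead of the in-memory buffer would produce the per-folder counts needed for the inventory chart.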
III. Curation Plan
o What data will be archived- Most of the records
o What data will not be archived- The MACOSX records. Even with the updated software
(Adobe), the content of the 184 files is not retrievable, so these inaccessible records will
not be archived.
o Rationale and criteria for the decisions- The records are not accessible
o What preservation strategy will be employed? How will the data be archived?
o Description and organization
IV. Metadata
What metadata already exist? The collection holds digitally scanned papers and published
books that were handwritten. The data is divided into two directories. Directory one
metadata- New Harmony: This directory holds emails and New Harmony files, including
correspondence, a DS_Store file, Harmony Society articles, history and manuscripts.
What standards will be used?
The Text Encoding Initiative (TEI) will be used for the collection. TEI XML is used to capture
and edit textual data. The TEI develops and maintains a standard for the
representation of texts in digital form. Its guidelines specify encoding methods for
machine-readable texts. This metadata standard aids online research with this collection.
How will the metadata be created and/or enhanced?
The metadata will be created and enhanced using the Oxygen XML editor. The software provides
feedback during code editing.
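To give a sense of what a TEI record for one scanned document might contain, the following Python sketch builds a minimal TEI document (a header with title, publication and source statements, plus a body holding a transcription) using the standard library. It is an illustration under assumptions, not output from the Oxygen editor: the title, source note and transcription text are placeholders, and a real record would be validated against the TEI schema in the editor.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements in the default namespace (no prefix).
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title, source_note, transcription):
    """Build a minimal TEI document: teiHeader/fileDesc plus text/body."""
    root = ET.Element(tei("TEI"))
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))

    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title

    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication, tei("p")).text = "Unpublished working transcription."

    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = source_note

    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    ET.SubElement(body, tei("p")).text = transcription
    return root


# Placeholder values for illustration only.
document = build_tei(
    "Letter from the Owens correspondence (placeholder title)",
    "Scanned JPEG page images from the Owens_Correspondence directory.",
    "Transcribed text of the page would go here.",
)
tei_xml = ET.tostring(document, encoding="unicode")
print(tei_xml)
```

The three children of `fileDesc` used here (`titleStmt`, `publicationStmt`, `sourceDesc`) are the ones the TEI Guidelines require in every header; everything beyond that is optional enrichment.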
V. Discovery
How will patrons access this material?
With advanced technology, patrons will have unlimited access to the Smithsonian’s online digital
collection through links on the National Museum of American History Library website. The
Smithsonian’s digital collections include over 27,000 digitized books and manuscripts. The
Owens family humanities collection would fit within the scope of the institution’s
guidelines.
http://library.si.edu/libraries/national-museum-american-history-library
http://library.si.edu/collections
Technologies afford patrons the opportunity to access material through online tools.
VI. Staffing- National Museum of American History Library
Curatorial: 1 collections librarian,
Technical: 2 librarian technicians and 1 staff member for IT support
Management: Head of department, Sr. librarian
Clerical: 1 librarian technician
VII. Technologies Requirements
What is required to support your curation plan?
Having a successful curation plan requires updated and advanced tools that enable users to
have access physically, virtually and globally. The Smithsonian online research tools include
One Search, Library Catalog (SIRIS), E-Journals and Databases, as well as the Smithsonian
research online tools.
Reliance on advanced hardware and software is a constant concern. The best way to keep data
preservation continuous is to have a strategic (disaster recovery) plan in place. For example,
the equipment, operating systems, media drives and so forth require an immense amount
of investment. Technology changes frequently, so backing up data is necessary in digital
preservation/curation. Born-digital metadata is more at risk of being permanently damaged when
technology becomes obsolete. Some metadata schemes are not interchangeable. When mapping
between metadata elements, some elements are equivalent and compatible with existing resource
descriptions and others are not. Through interoperability/crosswalking, some of the elements
can be converted, mapped and enhanced. Migration means copying or converting data from one
technology to another, whether hardware or software, while preserving the essential
characteristics of the data. These are some of the requirements needed to support the curation
plan.
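The copy form of migration described above can be sketched in a few lines. The following Python example copies a file to new storage and compares SHA-256 checksums before and after, so that a silent change during the move is detected. It covers only bit-level copying, not conversion between formats. The file name, the `new_storage` folder and the function names are hypothetical, and the plan itself does not prescribe a checksum algorithm.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_with_fixity(source, destination_dir):
    """Copy a file to new storage and confirm the copy is bit-identical."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / Path(source).name
    before = sha256_of(source)
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    after = sha256_of(target)
    if before != after:
        raise IOError(f"Fixity check failed for {target}")
    return target, after


# Demonstration with a throwaway file in a temporary folder.
with tempfile.TemporaryDirectory() as workspace:
    original = Path(workspace) / "page_001.jpg"
    original.write_bytes(b"example scan bytes")
    copy_path, checksum = migrate_with_fixity(original, Path(workspace) / "new_storage")
    print(copy_path.name, checksum)
```

Storing the returned checksum alongside the file's metadata would let later audits re-run `sha256_of` and confirm that the archived copy has not changed.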
Social Media Collection #3
The social media digital preservation of Michael Richard "Mike" Pence is historically important.
Pence, born June 7, 1959, is an American politician, lawyer, and the 48th and current Vice
President of the United States. Social media is a modern means of self-documenting one's
events and political and religious views.
I. Introduction
The long-term aim of digital curation is to make sure the digital content of social media is
preserved by implementing the components of the data life cycle. Documenting movements,
events, and political and religious views on social media has become a popular trend of the
modern world. Being able to access and reuse data over time is the challenge. The historical
events that are documented on a social media platform are central when referencing is required.
The challenge is keeping up with the changing times of modern technologies.
Goals of the assessment- The goals are to ensure long-term access to social media data archives
by preserving an organization's important data content. Organizations preserve data according to
what the business deems important and appropriate for curation. Social media platforms have
regulations in place that govern authentication and privacy rights, alongside regulatory laws
that protect the user.
Describe your institution- The Digital Curation Plan will build on the success of the Smithsonian
Institution. “The Smithsonian is the world’s largest museum, education, and research complex.
The Smithsonian is a museum and research complex, comprised of 19 museums and galleries and
the National Zoological Park. The total number of objects, works of art and specimens at the
Smithsonian is estimated at nearly 137 million. The collections range from insects and meteorites
to locomotives and spacecraft.” Sharing information with the Smithsonian
Institution fits the description and will serve as an educational component globally. Pairing
a smaller collection with a larger establishment such as the Smithsonian will bridge the
knowledge gap for research and scholarship.
Describe the collection- The collection comprises the various social networking platforms to
which Michael Pence subscribes, the most popular being Facebook, Instagram, Twitter and YouTube.
The collection documents the use of social media and provides information on interaction with
others. The social media database focuses on the current trends depending on which site is being
accessed. Users are able to engage with their audience, seek employment, stay on top of current
events and use it as a political platform.
Describe how the collections fit within the scope of the institution- The social media collection
fits the versatility of the institution's databases. Michael Pence's social media pages
captured political context during the 2016 presidential election. Pence’s social media accounts
captured real-time events and conversations that will one day be used as reference points by
future scholars. The Smithsonian Institution has already begun archiving social media collections
within the institution's facilities. The practice of archiving social media websites is still in
its early stages; preservation is carried out by screen capturing.
II. Collection Inventory
Inventory files- Facebook data is categorized by followers, likes, photo albums, interests and
activities. Twitter data is categorized by followers, tweets, following, liked photos and videos.
Instagram data is categorized by about, followers, posts and following. YouTube data files are
categorized by search results, data, and hot topics. Finally, Flickr data is categorized by
photos, favorites, following and followers.
Determine the file types- All files will be captured as screenshots
Describe the significant properties of the data- Data type: social media and social networking
service
III. Curation Plan
What data will be archived as screenshots- The content of social media is large, which makes the
information complex to curate. Curating all social media information would be ideal; however, the
size of the continuous data makes it burdensome. When reviewing the social media accounts for
Michael Pence, similar data was collected from each social media page. Instead of collecting the
entire datasets from social media, data with heavier activity will be archived. Social media data
that will be archived: Facebook, Twitter, YouTube
What data will not be archived- YouTube and Flickr- Michael Pence does not have control over
photo and video sharing on these platforms. All but one of the providers listed lack
readily available platforms for capturing Instagram’s metadata. However, Hanzo Archives is very
general in its description of capturing websites and particular social media platforms, and the
information is very limited.
Rationale and criteria for the decisions- Preserving and maintaining specific data records makes
it easier to maintain control of the data being archived. Social media metadata preservation can
be overwhelming. Flickr is widely used by photo researchers and bloggers, and this data has no
formal connection with the Vice President himself. YouTube allows users to upload, view, rate,
share, add to favorites, report and comment on the uploaded data. Data that has no direct
connection with the Vice President himself will not be used.
What preservation strategy will be employed? When it comes to preservation strategies for
social media, the “best solution would be manual processes.” (Fasching, Kaliner, & Karel, 2012)
How will the data be archived- Solutions for social media data are still in their early stages.
Data capturing is a popular way to preserve social media content for archives. Aleph Archives,
X1 Social Discovery, Hanzo Archives, Iterasi Archives, and Reed Archives are current solutions
that are available for archiving.
Description and organization- X1 Social Discovery is better suited to capturing and maintaining
the content in its original format. Aleph Archives is better suited to the digital curation of
social media. Aleph provides web archiving services using web crawlers to capture the website.
Iterasi is a subscription-based provider that targets corporate, legal and governmental
industries. Reed Archives captures social media as needed; “users can export archives as PDFs or
create bulk exports to entire websites and social media accounts.” (National Archives and
Records Administration, 2013)
IV. Metadata
What metadata already exists- Metadata already exists for this collection on the popular social
media websites: Facebook, Twitter, YouTube and Instagram.
How will the metadata be created and/or enhanced- The metadata will be created/enhanced
through screen capturing, cloud-based storage, and/or using the “X1 social discovery provider.
Data is collected and indexed from social media streams, linked content and websites.” (National
Archives and Records Administration, 2013)
V. Discovery
How will patrons access this material- Patrons will access the Smithsonian online materials
through the Archives database.
VI. Staffing- Staff will maintain the digital archives and keep them up to date throughout the
course of research: 2 curators, 2 technical librarians, 1 IT support staff member, 1 head of
department (management) and 1 clerical assistant.
VII. Technologies required
What is required to support your curation plan- Archive-It, a tool from the Internet Archive,
will be used to capture social media accounts before they become outdated. Archive-It uses a
crawler, a program that browses the internet and copies a particular website as it appears at
the specific moment it is crawled. The Wayback tool makes the captures accessible for future use.
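Each Wayback capture is addressed by a URL that embeds the capture time and the original address, as in the SAVA links cited in this plan. The following Python sketch splits such a URL into its capture timestamp and original URL, which could be recorded as provenance metadata for an archived page. It assumes the plain 14-digit timestamp form shown in the plan's own links; other URL variants are not handled, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Matches https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(url):
    """Split a Wayback capture URL into (capture datetime, original URL)."""
    match = WAYBACK_PATTERN.match(url)
    if match is None:
        raise ValueError(f"Not a Wayback Machine capture URL: {url}")
    captured_at = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured_at, match.group(2)


captured_at, original_url = parse_wayback_url(
    "https://web.archive.org/web/20131107195639/http://www.indiana.edu/~sava/"
)
print(captured_at.isoformat(), original_url)
```

Recording both values for every captured page keeps a trace of when each snapshot was taken and where the content originally lived.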
Works Cited
Fasching, D., Kaliner, S., & Karel, T. (2012, July). Social Media Data Preservation, Tools and Best
Practices. LJN's Law Journal, Legal Tech Newsletter, 29(3).
Indiana University. (n.d.). SAVA South Aegean Volcanic Arc and Aeginetan Ware Database Project.
Retrieved from
http://web.archive.org/web/20140604205028/http://www.indiana.edu/~sava/gallery.html
Indiana University. (n.d.). Retrieved from
https://web.archive.org/web/20140604205018/http://www.indiana.edu/~sava/database.htm
National Archives and Records Administration. (2013). White Paper on Best Practices for the Capture of
Social Media Records. Washington DC: National Archives.
Steven, M. J. (2016). Metadata for Digital Resources. New York, NY, USA: Neal-Schuman.
Schreibman, S., Siemens, R., & Unsworth, J. (Eds.). (2004). A Companion to Digital Humanities. Oxford:
Blackwell. Retrieved from http://www.digitalhumanities.org/companion/