Research Data Repositories - CNI: Coalition for Networked Information

Research Data Repositories: Developing and Implementing Infrastructures for Institutional and Consortial Environments. Ray Uzwyshyn, Ph.D. MBA MLIS, Director, Collections and Digital Services, Texas State University Libraries


Post on 10-Feb-2021


  • Research Data Repositories: Developing and Implementing Infrastructures for Institutional and Consortial Environments

    Ray Uzwyshyn, Ph.D. MBA MLIS, Director, Collections and Digital Services,

    Texas State University Libraries

  • Online Data Research Repositories: What Are They?

    •  Way to Manage a Researcher’s Data/Metadata

    •  Permalinking Strategy for Data Citation

    •  Way to Manage Federal Grant Compliance

    •  Middle-Term Data Archiving and Sharing Strategy

  • The Research Data Repository Lifecycle

    Becoming part of the Science, Social Science and Humanities Research Process

    Promotes: accuracy, efficiency, sharing

  • Why are Data Management Repositories Necessary?

    Most major Federal grant agencies require data access as a mandatory part of the grant proposal/oversight process (NIH, NSF, NEH, USDA).

    Wordle of the Final NIH Statement on Sharing Research Data, Mandatory 2003

  • What makes Data Management Repositories useful?

    •  Makes available faculty, departmental and institutional research
    •  Allows publication of negative data (lessens research replication)

    Wordle of the National Science Foundation’s Award and Administration Guide, Chapter VI.D.4, Mandatory 2011

  • Types of Research Data Repositories

    1) Project specific: large single faculty/team projects

    2) Discipline specific, e.g. Purdue Nanohub/Nanotechnology

    3) Institutional or Consortial (either institution-wide or consortial repositories)

  • All-Purpose and Specialized Data Repository Platforms

    Fearon, D. & Sallans, A. C. (January 2014). Institutional Research Data Management: Policies, Planning, Services and Surveys. Coalition for Networked Information. https://www.youtube.com/watch?v=rvbrW7S2fes (54 ARL Libraries currently offer data management services)

  • Research Data Repository Software Characteristics

    •  Hosted or on a server
    •  Software contains management and collaborative options

    •  Open source or proprietary software
    •  Wide Variety of Data Types

    (Excel to SPSS to various discipline-specific formats)

  • Evaluation Criteria

    •  System Performance/Robustness
    •  Usability
    •  An active open source community

    Gather Finalists: Harvard’s Dataverse, Purdue’s Hubzero, Figshare

    Make Final Choice: Harvard’s Dataverse

    Part I: Planning Your Repository

    Data Repository Working Group Report (August 28, 2015)

    Environmental Scan of Needs for Your Institution or Consortium

  • Dataverse: Harvard’s Open Source Research Data Solution

    Data sharing, data citation, data publishing and versioning management

    Social Sciences Beginnings (IQSS), Data Science (site) http://thedata.org, Dataverse Open Source Download (Github), Software Background

  • Dataverse Architecture (Consortial)

    Research Study Data

    Original Data Set Files, Metadata, Paratextual Materials (Methodology, Field Notes, Multimedia, Graphs, Programs etc.)

    Texas State University Dataverse

    Texas Digital Library Dataverse

    University of Houston, UT Austin Dataverses, etc.

    Centers
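
    The nesting on this slide (a consortial dataverse containing institutional dataverses, each holding research studies made up of data files, metadata, and paratextual materials) could be modeled roughly as below. This is an illustrative Python sketch only, not Dataverse's actual data model or API; the class names `ResearchStudy` and `Dataverse`, the field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResearchStudy:
    """One deposited study: data files, metadata, and paratextual materials."""
    title: str
    data_files: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)
    # Paratextual materials: methodology, field notes, multimedia, programs, etc.
    paratextual: List[str] = field(default_factory=list)


@dataclass
class Dataverse:
    """A container that can hold studies and nested child dataverses."""
    name: str
    studies: List[ResearchStudy] = field(default_factory=list)
    children: List["Dataverse"] = field(default_factory=list)

    def count_studies(self) -> int:
        """Count studies held here and in every nested dataverse."""
        return len(self.studies) + sum(c.count_studies() for c in self.children)


# Hypothetical sample data, for illustration only.
study = ResearchStudy(
    title="Example study",
    data_files=["survey.csv"],
    metadata={"author": "Doe, J.", "year": "2016"},
    paratextual=["methodology.pdf", "field_notes.txt"],
)
texas_state = Dataverse(name="Texas State University Dataverse", studies=[study])
consortium = Dataverse(name="Texas Digital Library Dataverse", children=[texas_state])
print(consortium.count_studies())
```

    Modeling the consortial level and the institutional level with the same class reflects the idea that an institution's dataverse is simply nested inside the consortium's.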

  • Data Citation and Metadata

  • Dataverse Metadata Example (From the Simple to Complex)

    Schemas Supported: Geospatial, Life Sciences, Astronomy and Physics, Georeferenced Data
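
    As a rough illustration of how descriptive metadata can drive a data citation with a permalink, the sketch below assembles a citation string from a few fields. The field order, the function name `format_data_citation`, and the sample DOI are assumptions made for illustration; this is not the exact citation format that Dataverse generates.

```python
def format_data_citation(authors, year, title, repository, doi, version=None):
    """Build a human-readable dataset citation ending in a resolvable DOI link.

    All parameters are plain strings except `authors`, a list of name strings.
    The layout is an assumed, simplified style, not an official standard.
    """
    author_part = "; ".join(authors)
    citation = f'{author_part} ({year}). "{title}". {repository}. https://doi.org/{doi}'
    if version:
        citation += f", {version}"
    return citation


# Hypothetical metadata record; the DOI is a placeholder, not a real identifier.
example = format_data_citation(
    authors=["Doe, Jane", "Roe, Richard"],
    year=2016,
    title="Example Survey Data",
    repository="Texas Data Repository",
    doi="10.0000/EXAMPLE",
    version="V1",
)
print(example)
```

    Including the version alongside the DOI mirrors the versioning management mentioned earlier: the same persistent link can be cited while still indicating which release of the data was used.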

  • The Many Planning Aspects of Data Research Repositories

    Planning Principles

    Wide Flexibility on Institutional Levels.

    Guiding Consortial Templates which can be customized on institutional levels

  • Part II: Developing Your Data Repository: TDL Dataverse State Working Group

    (August 2015 – December 2016)

    Charge: Develop, Pilot and launch a consortial repository for research data archiving and management.

    Sub-Committees

    Working Group Members

    Texas Universities

    Main Working Group (14) (4 Subcommittees)
    •  Policy and Governance
    •  Workflows and Outreach
    •  Budget/Business Model
    •  Technology

    State Data Repository Symposium Group (Baylor)

    Final Report October, 2016

  • http://data.tdl.org

    Interface Design & Usability

  • Texas Data Repository

    Member University Libraries (service & outreach)

    Researchers (deposit, search, publish)

    1) Mixed 2) Mediated 3) Unmediated (Direct)

    Service Models

  • Texas State Academic Research

    Research Data

    TS Dataverse (Regular to Medium Size Data Sets)

    Custom Data Storage

    (Big Data, TB+, TR)

    Text

    D-Space Publications Repository

    Texas State Repositories Architecture

  • One Size Does Not Fit All

    Types of Data Projects (Sizes)

    1) Normal Range Projects (Files/Data fit on server and may be uploaded: Dataverse, Hubzero)

    2) Large Projects (Data may require specialized university IT support, i.e. terabyte/petabyte drives, pointers etc.)

    3) Huge Projects (Projects require consortial possibilities, national models: Texas Advanced Computing Center (TACC), DPN, Duracloud, AWS, custom solutions)
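
    The three tiers above can be sketched as a simple size-based rule of thumb. The slides give no numeric cut-offs beyond mentioning terabyte and petabyte scales, so the thresholds below (1 TB and 100 TB), the function name, and the tier labels are assumptions chosen only for illustration.

```python
TB = 1024 ** 4  # bytes in one tebibyte, used here as "a terabyte"


def suggest_storage_tier(size_bytes, large_threshold=1 * TB, huge_threshold=100 * TB):
    """Map a project's total data size to one of three illustrative tiers.

    The default thresholds are assumed values, not figures from the slides.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < large_threshold:
        return "normal"  # fits a standard repository upload (e.g. Dataverse, Hubzero)
    if size_bytes < huge_threshold:
        return "large"   # likely needs specialized university IT storage
    return "huge"        # likely needs consortial, national, or cloud infrastructure


print(suggest_storage_tier(5 * 1024 ** 3))  # a 5 GiB project
print(suggest_storage_tier(5 * TB))
print(suggest_storage_tier(500 * TB))
```

    In practice the boundaries would depend on local server capacity and upload limits, so an institution would tune the two thresholds rather than rely on these defaults.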

  • Faculty Data Management Plan Documentation/Policy Tool

    Overview Video

    Customizable Plan Outline Tool, Resource Links, Supports All Major Funders

    https://dmptool.org/ California Digital Library

    Connections with Office of Sponsored Research and Other Relevant University Offices, Library/Dataverse Templates

  • Part III: Human Resource Infrastructures (Working Teams)

    Full or Part Time

    Data Repository Liaison Publication Repository Liaison Metadata Liaison Subject Liaisons (Outreach) Committee for Workflows & Policies

    Current Hires

    Digital Collections Librarian (Texas State Data Repository Dataverse/Publications Repository: D-Space)

    Data Visualization and Analytics Librarian (Tableau, Bayesia)

    Future Hires

    Machine Learning/Neural Networks/AI Librarian (working with the data)

  • Marketing and Other Possibilities

    Future Possibilities: VIREO, DATA REPOSITORY CONNECTIONS

    Electronic Theses and Dissertations (ETD) Repository (D-Space)

    Working with the Data: Support Mechanisms
    Data Literacy (Workshops/Education)
    Data Visualization, Data Analytics
    Machine Learning/Neural Networks/AI

  • Research Data Repository Adoption Lifecycle

    (2018)

  • Further Links/References
    •  ARL NSF Data Sharing Policy and Resource Links. http://www.arl.org/focus-areas/e-research/data-access-management-and-sharing
    •  ARL (White House Directives and Funded Research Data). http://www.arl.org/focus-areas/public-access-policies#.VoaV0I-cFzo
    •  Borgman, C. 2015. Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press.
    •  Baker, Monya. 1,500 Scientists Lift the Lid on Reproducibility. www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
    •  Harris, Richard. (April 2017). Rigor Mortis: How Sloppy Science Creates Worthless Cures.
    •  California Digital Library DMPTool: https://dmptool.org/
    •  Chronopolis: http://www.digitalpreservation.gov/partners/chronopolis.html
    •  Data Reproducibility Crisis. Nature. http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
    •  Dataverse. http://thedata.org/
    •  Dataverse (Data Science Site). http://datascience.iq.harvard.edu/dataverse
    •  Data Information Literacy Guide. http://www.datainfolit.org/dilguide/
    •  Data Information Literacy Competencies (Purdue). http://blogs.lib.purdue.edu/dil/the-twelve-dil-competencies/
    •  DPN (Digital Preservation Network). http://www.dpn.org/
    •  Duracloud: http://www.duracloud.org/
    •  Force11. Data Citation Principles. https://www.force11.org/group/joint-declaration-data-citation-principles-final
    •  Purr (Purdue Institutional Data Repository). https://purr.purdue.edu/
    •  Hubzero. https://hubzero.org/

  • Further Links/References
    •  Figshare. http://figshare.com/
    •  ICPSR Data Management & Curation. http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/
    •  Research Data Management: Principles, Practices, and Prospects (November 2013). Council on Library and Information Resources. http://www.clir.org/pubs/reports/pub160
    •  Cox, A. and Pinfield, S. Research Data Management and Libraries. Journal of Librarianship and Information Science. June 2013.
    •  Fearon, D. & Sallans, A. C. (January 2014). Institutional Research Data Management: Policies, Planning, Services and Surveys. Coalition for Networked Information. https://www.youtube.com/watch?v=rvbrW7S2fes (video presentation)
    •  Data Management for Libraries (LITA Guide). http://www.alastore.ala.org/detail.aspx?ID=10737
    •  NMC Horizon Report: 2014 Library Edition. http://cdn.nmc.org/media/2014-nmc-horizon-report-library-EN.pdf “Research Data Management”, pp. 6-7 and pp. 24-45.
    •  Holdren, J. Memorandum for Heads of Executive Departments and Agencies: Increasing Access to the Results of Federally Funded Research (2013). http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
    •  Green, A., Macdonald, S. and Rice, R. Policy-making for Research Data in Repositories: A Guide. DISC-UK. http://www.disc-uk.org/docs/guide.pdf
    •  Research Data Management in the Arts and Humanities (2013). University of Oxford. http://www.dcc.ac.uk/events/research-data-management-forum-rdmf/rdmf10-research-data-management-arts-and-humanities (Conference Presentations)
    •  Texas Data Repository. TDR Final Report (October, 2016), Selection Process, Aug. 2015, Peace Williamson et al. UT Arlington, Data Competencies. TDL Texas Data Repository Presentation. Video. Kristy Park, Santi Thompson et al. (October, 2016)
    •  Uzwyshyn, R. 2016. Research Data Repositories: The What, When, Why and How of Data Research Repositories. Computers in Libraries.

  • Comments/Questions

    Contact Information:

    Ray Uzwyshyn, Ph.D. MBA MLIS, Director, Collections and Digital Services, Texas State University Libraries, ruzwyshyn@txstate.edu, (512) 245-5687

  • Academic Research Libraries Environmental Scan

    Online Data Research Repositories (CNI)

    Fearon, D. & Sallans, A. C. (January 2014). Institutional Research Data Management: Policies, Planning, Services and Surveys. Coalition for Networked Information. https://www.youtube.com/watch?v=rvbrW7S2fes (54 ARL Libraries currently offer data management services)

  • Dataverse Network Architecture

    Why the Dataverse Network? (silent video overview)

    Open Journal Systems Dataverse Integration

    Research Study Data: Data Set Files, Metadata (Data Describing the Data), Paratextual Research Material (Methodology, Field Notes etc.), Graph Data Files

  • PURR and Hubzero: Purdue’s Data Management System

    •  Purr: Purdue University Research Repository (video)

    •  Purr Site (Proprietary to University)

    •  Purr Background

    1) Create Data Management Plans
    2) Collaborate with other Researchers
    3) Publish Data Sets (Purdue can publish a DOI, Digital Object Identifier, for Data Sets), Useful for Citation
    4) Archive Data Sets

    Boilerplate text for data management proposals available

    Purr is part of the Hubzero platform for scientific collaboration (Originally Nanohub)

  • Hubzero: Open Source Platform for Scientific Collaboration

    •  https://hubzero.org/
    •  Getting Started, Downloadable and Hosted Options
    •  Hubzero Video, Hubzero 2

    Research Collaboration and Data Management Solution

    Research Data Types: Spreadsheets, Instrument or Sensor Readings, Software Source Code, Surveys, Interview Transcripts, Images and Audiovisual Files

  • Figshare / Cloud-based / Proprietary

    Repository where users make their research available in a citable, shareable and discoverable manner

    Figures, datasets, media, papers, posters, presentations and filesets can be disseminated in a way that the current scholarly publishing model does not allow

    Open Source Platform for Sharing Research

    Figshare (video)

    Figshare for Institutions (Video)

  • Figshare Features (Cloud Based/Proprietary)

  • https://www.force11.org/group/joint-declaration-data-citation-principles-final

    Data Citation Principles

  • Texas Data Repository: Texas Digital Library Initiative, 2014-2016

    TDL: Consortium of 22 universities across Texas leveraging technological cooperation among academic libraries

  • Institutional Repository (MIT, D-Space)

    https://digital.library.txstate.edu/

    Faculty publications, white papers, preprints, theses, dissertations, working projects, reports, grey literature

    Larger Idea: Grant Compliance, Enabling Faculty Research Online, Raising Research Visibility

  • Pilot Study Responses: Perceived Benefits of Data Repository

    •  Fulfill federal mandates for sharing publications and research data

    •  Make research data more widely available
    •  Statistics on downloads and citations of my data
    •  Make my data citeable through the assignment of a DOI (digital object identifier)

    •  Saving various versions of the data set (data lifecycle)
    •  Collecting all my data in one place

  • Collaboration Across Institutions

    Jones et al. (2008). Science 322: 1259-1262.

  • Data Sharing

    Currently, 80% of researchers do not share their data

    Andreoli-Versbach, P., Mueller-Langer, F. (November 2014). Open access to data: An ideal professed but not practiced. Research Policy. http://dx.doi.org/10.1016/j.respol.2014.04.008

  • http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970 Harris, Richard. (April 2017). Rigor Mortis: How Sloppy Science Creates Worthless Cures

    Research Data Reproducibility Crisis (Nature, 2016)

  • Hubzero/Purr Customization