Report on Scholarly Communication Initiatives @ Microsoft

Lee Dirks Director, Scholarly Communications Technical Computing MSR External Research Microsoft Corporation


Post on 01-Feb-2016



TRANSCRIPT

Page 1: Report on  Scholarly Communication Initiatives @ Microsoft

Lee Dirks
Director, Scholarly Communications
Technical Computing MSR External Research
Microsoft Corporation

Page 2: Report on  Scholarly Communication Initiatives @ Microsoft

• Context
• Our Mission & Mandate
• Engagement Model / Methodology
• Some Project Examples
• Future Directions
• Your Questions & Feedback

Page 3: Report on  Scholarly Communication Initiatives @ Microsoft
Page 4: Report on  Scholarly Communication Initiatives @ Microsoft

• Advancement of Science
• Global Collaboration
• Technology Excellence
• Interoperability

Putting computing into science… Applying Microsoft products and research technologies to advance the scientific research and engineering innovation process

Putting science into computing… Investing in potentially breakthrough computer science research to address the Multicore challenges facing the IT industry

Page 5: Report on  Scholarly Communication Initiatives @ Microsoft

The Scholarly Communication Lifecycle (diagram labels)

• Collaboration: SharePoint
• LiveMeeting
• Discoverability: Live Search Academic & Books
• Libra 2.0, SharePoint
• OpenXML, XPS, SQL Server, Rights Management, Data Protection Manager
• Tablet PC/UMPC, Office 2007: Word, PowerPoint, Excel
• Word 2007 + PowerPoint 2007, SharePoint
• WPF & Silverlight, “Sea Dragon” / “PhotoSynth”
• Excel 2007, Windows Compute Cluster Server
• “Astoria” / “Pop Fly”

Page 6: Report on  Scholarly Communication Initiatives @ Microsoft

• Science + computation are not the entire equation
• Authoring, Analysis, Publishing, Discoverability, and Data Storage/Preservation are key components to scientists’ everyday work…and Microsoft’s core businesses
• The scholarly community has made it clear to us:
• Microsoft must improve its offerings throughout the scholarly communication lifecycle
• MSR/TCI is uniquely positioned to drive this initiative within Microsoft
• Our approach: Conduct prototyping projects and proofs-of-concept to evolve Microsoft’s scholarly communication offerings

Page 7: Report on  Scholarly Communication Initiatives @ Microsoft

• Academics / Scholars (higher education setting)
• Researchers / Scientists
• Libraries / Archives
– Academic, Research and National institutions
• Scholarly Publishers & Societies
– Both Open Access and For-Profit enterprises
• Governments / Related Organizations
– EU, NIH/NLM, NSF, NASA, etc.
– JISC (UK), OCLC, CNI, DLF, NISO, etc.

Page 8: Report on  Scholarly Communication Initiatives @ Microsoft
Page 9: Report on  Scholarly Communication Initiatives @ Microsoft

• Optimize for data-driven research & science (open data/access)
– To both data (scientific) and to information (scholarly publications)
– Reproducible research + computational science
– Properly document / annotate scholarly output

• Interoperability is paramount
– Actively lobby and drive for consensus around technical standards and standardized protocols proactively adopted by the community; enable broad community engagement
– Customers have told Microsoft that interoperability (and intellectual property) are OUR responsibility

• Data preservation (and provenance) should be baseline
– Documentation of the data’s provenance
– Reliable and secure long-term storage – at a massive scale
– Preservation needs to be like “accessibility” features – i.e., assumed as required

• Social networking & semantic knowledge discovery
– Harnessing collective intelligence must be a consideration – since accessing research is a core step in the life-cycle. Enable knowledge discovery
– Optimize for Web 2.0 scenarios and allow end-users/experts to find things more easily

• Metadata conventions / taxonomies / ontologies
– This is a crucial strength for libraries – and a critical component in enabling Web 2.0

Page 10: Report on  Scholarly Communication Initiatives @ Microsoft

• Work with researchers around the world
– Facilitate/advise on the application of technology
– Link MSR researchers with (non-CS) researchers

• Work with product groups
– Provide feedback on the use of MS technologies
– Identify research-driven requirements for products

• Terms & Conditions
– Microsoft typically shares IP (via BSD-type license) or makes source code available on http://www.codeplex.com
– Microsoft will not develop on a Linux platform

• Project Execution Models
– Internal Development (FTE)
– External Development (Vendor)
– External Development (Institutional)
– Mixed Model

Page 11: Report on  Scholarly Communication Initiatives @ Microsoft

• Current or Completed Projects
o Cornell – arXiv.org + Word 2007 (and repository interoperability)
o MIT / Broad Institute – Authoring (Word 2007) + data for research reproducibility
o MSR – CMT++ interoperability with data + metadata transfer/exchange (conference management tool enhancements)
o UC San Diego / PLoS – Semantic mark-up of scholarly articles (+ submission)
o LiveLabs – eJournal publishing online service (community publishing tool)
o Johns Hopkins University – Digital Archive for Astronomy/Astrophysics data (storage, preservation and access)
o Planets Project / EU (with MSR – Cambridge) around OpenXML and file format preservation and interoperability
o eChemistry Project (Cornell, Penn State, Indiana, Cambridge, Southampton) – ORE exemplar: access to compound chemical info objects (cross-repository access to open chemistry data)
o Indiana University – Toolbox for Social Networking (SRT)
o British Library – Researcher Information Centre (RIC) online workflow tool for scientists and researchers
o University of Southampton (UK) – Port ePrints Repository Software for installation on the Windows platform
o University of Manchester / “MyExperiment” Project – social networking for scientists
o ORE Acceleration Project (OAI – Object Reuse & Exchange)
o UK National Archives – Virtual PC / Emulation of legacy systems to facilitate preservation
o National Library of Medicine / NCBI – “PubMed Int’l” UK version of PubMed + NLM DTD
o Creative Commons Add-in for Office 2007 – evolving the Word 2003 effort

• Pipeline
o Chem4Word with Office & Cambridge University – Create add-in to Word 2007 to facilitate drawing of chemical compounds and equations
o DRIVER 2 (EU) – Infrastructure integration across a network of European research repositories

Page 12: Report on  Scholarly Communication Initiatives @ Microsoft

• Integrate data and images from GenePattern workflows into research papers. Allow for research reproducibility by combining data with the text.

• Highlights OpenXML and Office 2007 technologies as well as breaking new research ground with the integration of data & workflows with research papers.

– MIT Broad Institute (http://www.broad.mit.edu/)
– Contracted Work
  • Infusion for development work via SOW
  • Broad for GenePattern Development for integration


Page 13: Report on  Scholarly Communication Initiatives @ Microsoft

• NLM’s PubMedCentral repository contains full-text of research papers resulting from work funded by NIH

• Working with NCBI using Word 2007 to author the NLM-DTD tag set

• TCI assisted in deployment of PMC International in the UK, Japan, Italy, China and South Africa

Page 14: Report on  Scholarly Communication Initiatives @ Microsoft

• eJournal Project
– Extending existing MSR ‘CMT’ Conference Management Tool to offer eJournal service
– Developing a toolset for ‘self-publishing’ of workshop and conference proceedings and small journals

• Research Repositories
– Adapting ‘arXiv’ repository to accommodate Word 2007 and interoperable web services interfaces
– Developed an open source (BSD) Windows version of ‘EPrints’ software at Southampton

Page 15: Report on  Scholarly Communication Initiatives @ Microsoft

• Identify information sources, tools and services to support research in STM

• Explore the application of new services
– Collaborative filtering of literature, continual queries and more…

• Intuitive to use and navigate, user configurable

Page 16: Report on  Scholarly Communication Initiatives @ Microsoft
Page 17: Report on  Scholarly Communication Initiatives @ Microsoft

• Working with the Astronomy community to build the IVO
– Goal is for all astronomy data and literature online and cross indexed
– Tools to analyze it

• OpenSkyQuery Federation of ~20 observatories
– Works and is used every day
– Spatial extensions in SQL 2005
– Good example of Data Grid
– Good example of Web Services

• TCI is facilitating a library project to link astronomy publications to the data

Page 18: Report on  Scholarly Communication Initiatives @ Microsoft

• “Global Research Library 2020” with University of Washington (Oct07)

• Planning to participate in application(s) to the NSF “DataNet” solicitation (as an unfunded partner)

• Sponsoring BioMed Central’s 2008 Research Awards (Mar08)

• Aug07 Issue of CT Watch Quarterly (v. 3, no. 3)
– “The Coming Revolution in Scholarly Communications & Cyberinfrastructure”
– http://www.ctwatch.org/quarterly/articles/2007/08/

• New Scholarly Publishing website at:
– http://www.microsoft.com/mscorp/tc/scholarly-publishing.mspx

Page 19: Report on  Scholarly Communication Initiatives @ Microsoft

Lee [email protected]

http://www.microsoft.com/science