Next Generation Research and the University of California:
Planning for the Future of UC’s
Cyberinfrastructure
A report on the UC VCR-CIO 2015 Summit
December 2015
UC VCR-CIO 2015 Summit 2
UC VCR-CIO Cyberinfrastructure Vision Steering Committee
Michael Pazzani, Vice Chancellor of Research and Economic Development, UC Riverside
Sandra Brown, Vice Chancellor of Research, UC San Diego
Tom Andriola, CIO, UC Office of the President
Larry Conrad, CIO, UC Berkeley
Jim Davis, Faculty Representative and Conference Panelist, UCLA
Willeke Wendrich, Faculty Representative and Conference Panelist, UCLA
Terry Gaasterland, Faculty Representative and Conference Panelist, UC San Diego
MacKenzie Smith, University Librarian, UC Davis
Support provided by Charles Rowley, Assoc. VC and CIO, UC Riverside
The UC VCR-CIO Cyberinfrastructure Writing Group
Jim Davis, UCLA (Chair)
David Greenbaum, UC Berkeley
David Minor, UC San Diego
Arash Naeim, UCLA
Valerie Polichar, UC San Diego
Charles Rowley, UC Riverside
Yvonne Tevis, UC Office of the President
Conference Website: http://cnc.ucr.edu/uccybersummit/
Table of Contents
Introduction
Compelling Case for a UC Next Generation Cyberinfrastructure
UC Researchers – Advancing Cyberinfrastructure as a Systems of Systems
Definitions
o Central/shared services
o Federated services
o Intercampus services
o Cyber facilities
o Cyber collaboration infrastructure
o Platforms
o Sociotechnical infrastructure
Positioning UC – Cyberinfrastructure Needs
Recommended Actions
o Create a UC Cyberinfrastructure Alliance tasked to define, build, stage and
orchestrate federated and centralized operations and policy
o Develop systemwide and campus “Cyberinfrastructure Mediator” support
o Develop an effective systemwide “marketplace” for research cyberinfrastructure
o Make research data an institutional asset
o Develop cyberinfrastructure “connective tissue” and associated tools to join
services, create cyber platforms, and enable federated services
o Develop approaches to scale discipline-similar requirements across campuses
o Position health, patient and clinical data for research access, patient care, and
other strategic uses
o Build on UC’s expertise via a development structure for UC researchers and
support staff
Immediate Next Steps and Moving Forward
Summary and Conclusion
Appendices
Introduction
The modern research landscape continues to evolve dramatically: science and digital
scholarship are becoming data driven (e.g., precision medicine, precision agriculture,
climate modeling, etc.), research now occurs in increasingly collaborative environments,
time to action and outcomes is a key driver, researchers must be both domain and data
experts, and data as a language enabling research and scholarship is the new normal.
Moreover, as researchers address more complex problems, an individual researcher in
isolation no longer develops a hypothesis, conducts experiments, collects data and analyzes
the data and the hypotheses. Rather, teams of researchers with different expertise and
perhaps in different locations are needed to address these grand, complex issues.
This new environment is driving change and presents new challenges and opportunities,
from ethics to data access to human analytical capacity. Clearly, this evolving environment
requires the University of California (UC) to consider and plan for its collective future, and
a thoughtful research cyberinfrastructure strategy is required to ensure UC addresses
these challenges and that every opportunity is leveraged.
The costs of not addressing this collective UC need are significant. The grand, complex
challenges facing humankind can only be resolved with robust, coordinated, and
collaboratively utilized cyberinfrastructures and related services and support. If UC does
not act, the University will have reduced capacity to address these challenges and realize
increasingly competitive funding opportunities that are trending toward resolving these
issues. Further, resting on UC’s collective laurels is not an option: inaction risks
compromising UC’s world-class reputation.
The 2015 VCR and CIO Cyberinfrastructure Summit was a call to action. Based on
conference themes and observations, this cyberinfrastructure vision document offers
prioritized recommendations and action plans to move the UC research enterprise to the
next level in its ability to innovate, collaborate, attract funding, and blaze new trails. This
plan has been reviewed and vetted by UC’s VCRs, CIOs, Librarians, and the over twenty UC
faculty members who served as conference panelists.
The roadmap described herein will enable UC to optimally support the future success of its
research enterprises. Data-driven science, digital scholarship, and the associated (and
enabling) cyberinfrastructures this vision document discusses are core to UC campuses, its
laboratories, and to the University of California’s collective ability to address the grand
challenges facing California, the nation, and the entire world.
Compelling Case for a University of California Next Generation Cyberinfrastructure
A decade ago at the 2005 UC VCR-CIO Summit, the emphasis was on the cyber facilities
needed to provide capacity and capability for high-performance computation-based
research. By the 2011 Summit, the tenor of the discussion had shifted from cyber facilities
to a direct focus on the researcher-defined, front-end research capabilities that comprise
cyber collaboration infrastructure. The 2015 Summit revealed a much more extensive
cross-disciplinary research interest, an increased diversity of targeted uses, and an
expectation of precision in findings, predictions and insights. All disciplinary areas now
depend on data and analytics in some way. The 2015 Summit featured widely cross-
disciplinary breakout sessions, and all disciplines noted the importance of infrastructure
and expertise to support research data management, preservation and analytics. Facilities
such as compute, storage and transit were presumed to be essential, but are not always
present at the necessary levels. The term “informatics” was employed frequently. The
expected precision of solutions and team-based informatics amplified the dependence on
agile and flexible research tools that facilitate shared, team-based research. This in turn
generated further need for more purpose-built integrated, end-to-end collaboration
infrastructure capabilities, which are referred to here as platforms. The institutional role
and the need for platforms that no single researcher or research group can individually
provide were underscored, along with the role of people and the importance of
sociotechnical infrastructure.
All in all, the 2015 Summit provided a compelling case for action:
o The grand challenge problems facing researchers today will require collaboration
across disciplines, campuses, and capabilities on an unprecedented scale. UC
has the world-class faculty needed to address these challenges, but it must develop the
“connective tissue” to bring researchers, data, tools and capabilities together across
traditional boundaries, and must treat research data as a strategic asset.
o To continue to attract top-notch faculty, students and staff, UC must continue to provide
top-notch facilities, including cyberinfrastructure. Technology facilities that were novel
and groundbreaking ten years ago are no longer sufficient to support modern research
methodologies or attract researchers in critical fields.
o The ability to attract substantial grant funding will increasingly depend on the
capabilities and facilities available to the researcher, and on the partnerships
researchers can forge across disciplines to solve complex problems.
o The increasingly data-driven research landscape means that data itself has become a
critical and valuable institutional asset. Effectively managed, curated and shared data
has both reputational and financial benefits that are increasing in importance over time.
o In order to prepare our students for success, whether in further studies or in the
increasingly high-tech commercial world, UC must provide them with instructional and
research opportunities that employ cutting-edge technology, and give them access to
tools, data, support and resources to maximize its use.
o The NSF’s long-term vision for cyberinfrastructure stresses that the complexity of
research analytics is increasing, and that solutions will demand approaches to big data,
strategies to deal with data from new technologies, interoperable capabilities and
resources, partnerships, and rational data access, analytics and archiving strategies.
These trends are reflected by other agencies and initiatives as well.
Facilities infrastructure is still important, but now it must be joined by development of
tools, middleware/connective tissue, platforms, and sociotechnical infrastructure (in
particular support, training, and facilitation) necessary to enable true collaborative
research. A new emphasis on data means partnering between appropriate technical and
Library entities to achieve goals important both to individual researchers and to the
institution as a whole. Creating a central entity to identify, develop and integrate tools and
best practices, to facilitate the sharing of our system's best solutions, and to prepare our IT
staff to serve a new generation of researchers is critical to realizing efficiencies, attracting
and nurturing novel research, top faculty, and research dollars, and maintaining/bettering
our reputation as an institution.
UC campuses are individually recognized as world-class research universities. Each campus supports a wide range of research and each campus claims particular areas of research leadership. When UC’s research areas, grants, patents, scholarship recognitions, etc., are considered as a whole, the University is unrivaled as an institution. Indeed, the University of California already has resources that are unrivaled by any other university system. These include:
The San Diego Supercomputer Center (SDSC) at UCSD was established as one of the nation’s first supercomputer centers by the National Science Foundation (NSF). The Center opened its doors on November 14, 1985 and has recently launched Comet, SDSC’s newest HPC resource, a petascale supercomputer.
The California Research and Education Network (CalREN) is a multitiered, advanced network managed by the Corporation for Education Network Initiatives in California (CENIC), providing connectivity to UC campuses at speeds up to 100 Gbps.
Next-generation Research and Supporting Cyberinfrastructure: a geosciences researcher at one UC campus sees a program solicitation that would benefit from collaboration across multiple fields. She heads to a faculty profiling system, where she determines a list of potential collaborators and contacts them to hone a proposal team. Her campus digital technology resource advisor works with colleagues at the partner campuses to develop a compelling facilities description to accompany the proposal. The multi-campus, cross-disciplinary approach nets a large, multi-year grant. Once the proposal is awarded, the team selects appropriate data management, collaboration, analysis, visualization and workflow tools from a central marketplace to make collaboration seamless for their particular domain. They use the library to obtain geospatial data that they use to hone their data collection. Team members use local and remote instruments to collect experimental data, and easily analyze or visualize each other’s data without regard for the data’s origination or compute location. In addition to the published papers resulting from the work, the resultant data sets are curated and made available to future researchers in digital libraries. There the data are cited frequently, and are reused to both replicate the experimental work of the project and to build on that foundational work to take science forward.
Lawrence Berkeley Labs is the home of the National Energy Research Scientific Computing Center (NERSC), one of the world’s leading supercomputing centers for open science, which serves nearly 6,000 researchers in the U.S. and abroad.
As part of UC Biomedical Research Acceleration, Integration, and Development (UC BRAID), the UC ReX Data Explorer is a secure online system that enables cross-institution queries of clinical aggregate data from 14+ million de-identified records.
Nevertheless, for the most part, UC research and cyberinfrastructure capabilities are
operationally separated by campus, with little inter-campus visibility, access or interaction.
Both in research and in cyberinfrastructure, UC is perceived as ten individual campuses,
not as a system. Indeed, UC has a history of competing as individual campuses rather than
aggregating strengths as a system or cluster of campuses when responding to state and
national initiatives and funding opportunities.
The NSF recently noted, “Although team science promises to address increasingly complex
scientific questions, conducting research collaboratively can introduce challenges that slow or
prevent projects from achieving their scientific goals.” In order for UC to be successful, we
must embrace this trend and build the experience with the tools to achieve frictionless access
to services and support.
UC Researchers – Advancing Cyberinfrastructure as a Systems of Systems
The cyberinfrastructure of the future will allow the researcher to leverage far more
integrated tools and services than are easily available today, with enormous potential impact
on the ability to win grants, achieve impact, and advance the reputation of the institution.
Importantly, cyberinfrastructure as a “systems of systems” makes on-demand, composable
cyberinfrastructure solutions a reality.
Researchers at the recent Pacific Research Platform Workshop at UCSD (October, 2015)
presented many specific and compelling science drivers for a more mature research
platform environment — and not just that represented by the PRP’s new high-speed
network to join campus science DMZs. Examples of science that could benefit from a more
comprehensive approach include the work of Sergio Baranzini, professor of neurology at
UCSF. He describes a set of research challenges that include democratization of data
collection equipment, distributed data generation, and individualized analysis. Frank
McKenna’s research at the Pacific Earthquake Engineering Research Center at UC Berkeley
involves the collection of relatively small data over long periods of time, and collaborations
with other university researchers, government, and industry. He notes that in an ideal
world, “it would appear to me like local files and applications were on my desktop. I just
define the workflow and the system figures out where to run it.” These descriptions sound
very much like those of the platforms described below, which should be built to accommodate
such research. Daniel Cayan, a researcher at the University of California, San Diego Scripps
Institution of Oceanography, describes needs for a “better catalogue describing available
data, better access tools for a range of users (small to large), remote processing and
analysis tools, and more accessible expert network knowledge” — approaches to address
most of these needs are described below.
Definitions
Because jargon is rampant and terminology is used differently in different contexts, the
following guide is provided to define the cyberinfrastructure terms contained in this
document:
Central/shared services are those managed centrally for the good of all campuses. Such
services are provisioned centrally and access is extended to campuses. An example of a
central/shared service is the California Digital Library shared subscriptions.
Federated services are distinguished from centrally shared services with respect to
approach, resources and operations. A federated service is a value-driven coordination
of services drawing upon the strengths and diversity of the distributed approaches, and
typically involves a coordinating front end plus middleware to provide access to
distributed back-end systems. An example of a federated service is the UC ReX secure
patient data search service, which allows secure searching for patient cohorts that are
assembled from multiple campuses’ independent patient data stores.
Intercampus services are services developed and managed locally by one or a set of
campuses, which are made available to other campuses within the UC system. An
example of an intercampus service is the SDSC Colocation facility.
Cyber facilities include the physical compute, storage, data center and network facilities
and the operational standards, software and code that comprise the computational,
storage and network system layers of cyberinfrastructure. Facilities also include
sophisticated routers, servers, fiber, cabling, data centers, power and cooling, etc.
Cyber collaboration infrastructure describes the tools, applications and processes that
are layered on the cyber facilities:
a. collaboration tools for multiple research groups to work together with analytics,
modeling, simulation and visualization capabilities
b. software-based processes for data management, data modeling, curation,
preservation, and aggregation for accessing, reusing and building broadly used
research data assets, as well as protecting and securing them
c. cyber environments for readily promoting, accessing, using and collaboratively
building software applications, i.e., research software stores
d. networked tools and search mechanisms for discovering and accessing expertise,
both formally and informally and in directed team-based projects, to spark
innovation, discovery and trial
e. network-based channels for conducting team-based R & D securely, tech transfer
that manages IP, processes that manage regulated data, etc. not only within higher
education, but also with commercial and industry partners, recognizing that data
are valuable intellectual property and technology transfer assets
Platforms combine cyber facilities (now considered basic needs) and cyber collaboration
infrastructure (new, enabling tools and processes) with “connective tissue” (e.g.
middleware, front ends, networks) to create integrated cyberinfrastructure capabilities
and services that, in aggregate, offer new functions, often taking into account the full
research data life cycle or the end-to-end process of collaboration. An institutional
research cyberinfrastructure platform might, for example, integrate network,
computation, data, workflow and security facilities and services to facilitate the ability
of researchers at different locations and institutions to progressively analyze data sets.
Mobility services might be added to facilitate distributed human-centered data input.
Different database structures might be integrated to facilitate different data analysis
and integration needs. A HIPAA-compliant platform might make it possible to do health
sciences research involving patient data. Discipline-specific platforms could be built
separately or over general-purpose platforms. Platforms are typically federated
environments (e.g. the upcoming Pacific Research Platform network) joining the
strengths of multiple campuses.
Sociotechnical infrastructure – this term, in increasing use in higher education, refers to
the technical expertise, guidance, workflow, procedures, interfaces and other human-
technology interventions (such as the Cyberinfrastructure Mediator service described
later in this document) that facilitate the use of cybertechnologies by humans in the
research environment. The importance of this type of service was stressed at the
conference and must be developed in concert with the facilities and infrastructure that
accompany it.
Positioning UC – Cyberinfrastructure Needs
The 2015 Summit generated a spectrum of topics worthy of consideration. However, seven
of these received particularly strong, cross-disciplinary attention, as measured by how
often they surfaced in the disciplinary sessions and summit panel sessions. These seven
resonant priorities form the basis for the recommended actions:
Cyberinfrastructure “concierge” service (here called Cyberinfrastructure Mediator)
Collaboration tools, portals, and services
Storage vision and ecosystem
Data management, curation, metadata / interoperability
Data access – UC and beyond
Skills development, training, “boot camps”
Policies and ethical considerations
These can be further organized into actionable themes: (1) make cyber collaboration tools
accessible; (2) expand and scale skills and access to expertise; (3) support data as research
assets to be managed, curated, and preserved; and (4) bring it all together into a platform
“ecosystem” of federated services, systems, and support. The following notes provide
detail on these themes and on UC’s emerging cyberinfrastructure needs:
Make cyber collaboration tools accessible
° Enabling a broader base of researchers. Easier-to-use, self-guided and more highly
abstracted transformative tools and services that embody informatics expertise will
enable a broader base of researchers to conduct novel research without having to
develop or invest in the same expertise. In addition, new models for research
informatics support will serve researchers who may be in silos or who lack resources
to establish independent infrastructure and support systems. Such models may also
realize cost savings. Emerging technologies and standardized approaches to
data management will be accessible to all faculty, including those in fields where such
capabilities have traditionally been underdeveloped. Finally, widely available training
for students, research staff and faculty in applying new technologies to research will
help develop cyberinfrastructure skills into standard research techniques.
Expand and scale skills and access to expertise
° Cross-disciplinary collaboration. Collaboration and partnerships across departments,
schools and fields of study will increase our ability to solve complex research problems.
Innovative approaches for generating, collecting, and analyzing data to bridge disciplinary
languages, dictionaries, and areas of interest will provide vast opportunities for cross-
disciplinary researchers to share ideas, data, tools, and algorithms and to approach
research and global problems with a shared context. However, such collaborative
approaches and data-driven inquiry require new skills and approaches to support success
within a shared, interdisciplinary context. UC therefore needs to invest in the development
and growth of both its researchers and information technology staff across the system.
Support data as research assets to be managed, curated, and preserved
° Data ownership and big data. Big data has three attributes: volume (scale), variety (its
many forms, e.g., structured/unstructured, text, multimedia), and velocity
(dynamism/real-time qualities). The ability to more readily collect, access and analyze
data beyond the walls of the institution, and to store and analyze large amounts of
disparate data (or big data) generated both locally and distally, will increase
opportunities for new kinds of research, analysis and decision-making. Real-time
dynamic data and analysis will transform traditional research approaches and
methodologies by accelerating the generate-analyze-apply-learn cycle. Systems will use
networked, information-based technologies to integrate intelligence in real time across
entire enterprises and will use data-driven modeling, simulations and Key Performance
Indicators to communicate optimal actions and results in real time. There are
significant policy, regulatory, security, privacy, and ethical issues to be managed.
° Multi-use data. The line between operational, business and research data is blurring.
Data is quickly becoming dual-purpose or multi-use as organizations integrate potential
research data collection more seamlessly into business workflow and operations. Policy
and governance will be critical to managing data efficiently and effectively in
organizations with potentially multi-purpose data, as will innovative approaches to human
subjects protection and compliance issues. Business operations will have to consider
how to support business and research simultaneously.
° Data visualization. Of increasing importance for managing large data sets, data
visualization involves the graphic display of data too complex for manual processing or
assessment; the resultant imagery is typically the end result of an algorithmic process
or generated from large-scale data sets. It encompasses a broad range of analytic tools
and techniques that include statistical visualization, GIS, and 3D modeling, all of which
share the common goal of organizing data into a coherent visual display that can be
readily interpreted and understood.
Platform “ecosystem” of Federated Services, Systems, and Support
o A federated but connected and interoperable infrastructure of platforms. UC campuses
and medical centers can and should build tools, services, and infrastructures to address
compelling local needs and support research innovation via agile and nimble service
provisioning. However, a federated approach will allow UC to discover ways to share,
leverage, and connect campus and medical center capacity (including data and data
services) to enhance UC’s collective research enterprise as a system of systems.
Such an approach is key to helping the campuses enhance capacity and capability
individually and across the system. Federated infrastructures will extend the tools and
capabilities that form an institutional “nervous system” (distributed resources,
capabilities, expertise, policy and ethics) through which data can be moved and
methodologies accessed. Organized for campus leverage, these federated platforms will
cultivate individual researcher capability. Mobile information and communication
technologies will play a major role. Policy will be an important driver, and initiatives
must reflect the ethical values that UC wishes to project.
An example of such an approach is UC ReX, which begins with the premise that all six
UC medical centers have independent and effective clinical operations. Through a
federated approach, it is now possible to share patient cohort data so that each medical
center can use all of UC’s data to perform research and optimize clinical strengths (e.g. a
clinic that specializes in Alzheimer's has additional data for therapy optimization and
precision).
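The federated pattern described above, in which a coordinating front end queries independent back-end stores and receives only aggregate results, can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not a description of how UC ReX is actually implemented: the `Record`, `SiteStore`, and `federated_count` names, the record fields, and the site names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    """Hypothetical de-identified record; real clinical data is far richer."""
    age: int
    diagnosis: str


class SiteStore:
    """Stand-in for one medical center's independently operated data store."""

    def __init__(self, name: str, records: Iterable[Record]) -> None:
        self.name = name
        self._records: List[Record] = list(records)

    def count_matching(self, predicate: Callable[[Record], bool]) -> int:
        # Only an aggregate count leaves the site, never row-level records.
        return sum(1 for record in self._records if predicate(record))


def federated_count(
    sites: Iterable[SiteStore], predicate: Callable[[Record], bool]
) -> Dict[str, object]:
    """Coordinating front end: send the query to each site, combine the counts."""
    per_site = {site.name: site.count_matching(predicate) for site in sites}
    return {"per_site": per_site, "total": sum(per_site.values())}


if __name__ == "__main__":
    # Hypothetical sites and records, for illustration only.
    sites = [
        SiteStore("site_a", [Record(72, "alzheimers"), Record(40, "asthma")]),
        SiteStore("site_b", [Record(81, "alzheimers")]),
    ]
    print(federated_count(sites, lambda r: r.diagnosis == "alzheimers"))
```

In this sketch each site keeps control of its own records and exposes only a counting operation, which mirrors the premise that the medical centers remain operationally independent while still contributing to a systemwide cohort search.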
Recommended Actions
UC should begin by focusing efforts on the first two actions below. It is not necessary, nor is it
advisable owing to human resource limitations, to initiate all of these action items at once;
however, UC should complete at least the first two within twelve months in order to realize
measurable benefits quickly and to provide momentum for completing the effort.
Action 1: Create a UC Cyberinfrastructure Alliance tasked to define, build, stage and
orchestrate federated and centralized operations and policy.
The Cyberinfrastructure Alliance should be established and staffed as the initial federation
operating entity. As a start-up itself, the Alliance will be responsible for prioritizing
federated capabilities, commissioning working groups and supporting and orchestrating
the activities of each working group. This Alliance would start small, with the
recommended Actions indicated in this document; if proven effective, it could grow into a
larger and more permanent body. (Please see the note comparing the Cyberinfrastructure
Alliance organizational structure to CENIC on page 19 of this document.)
The Cyberinfrastructure Alliance will include a Federated Governance Board (FGB) as well
as project management and other support positions, since it will need to coordinate and
manage resources from the beginning. As capabilities become operational and others enter
the development process, the Alliance will become an operating entity. It is recommended
that the Federated Governance Board (FGB) comprise two VCRs, two CFOs, two CIOs, two
librarians and several key faculty members from multiple campuses. The
Cyberinfrastructure Alliance will interact with campuses through existing senate and
administrative structures, as well as create events, such as workshops, to define, shape and
build operational direction and interest and to build the infrastructure needed to facilitate
capability.
In time, the Cyberinfrastructure Alliance will address all the projects, initiatives, and
themes that surfaced during the Cyberinfrastructure Conference. The following notes
describe how the Cyberinfrastructure Alliance will position UC for the successful delivery
of these projects and initiatives:
Policies, Practices, Procedures, Organizations – Enhance, Modify, and Create Policies,
Practices, Procedures, and Organizations that Enable Federated and Intercampus Services
Enabling the UC Research Enterprise.
At present, UC is not organized operationally or financially to facilitate federation or
intercampus services. Some services exist despite this gap: for example, SDSC’s
provision of intercampus colocation services, or the federated UC ReX system for
securely searching patient data. In general, however, policies, practices, and incentives
often encourage the creation rather than the dissolution of silos. Although a
“federation” is challenging to the currently fully decentralized business and financial
structures of the UC system, it is highly valuable and should include the ability to
interoperate with services, facilities and support from across the United States and
beyond, as well as within UC. While difficult, UC should tackle and promote the
development and use of federated or intercampus-accessible services. The following
actions and organizing principles are essential to developing and promoting these
services:
o Marketplace of Services and Support. Establish as an organizing principle a
systemwide “Research Cyberinfrastructure Marketplace” consisting of available
central/shared, federated and intracampus-offered services, platforms, technical
expertise, and accessible, reusable research data [see Action 3 for initial development
steps] with an emphasis on federating offerings for the benefit of all campuses.
Precisely because of the broad nature of individual campus research strengths, UC is
well positioned to build and demonstrate the power of federation. UC federated
services would allow individual campuses to retain their interests and strengths,
and to build on them and draw on crossover strengths where there are multi-
campus benefits. Federation should be used to create interoperability opportunities
that take advantage of the system, infrastructure and expertise at each campus for
the purposes of accelerating, enhancing and promoting the development of each
campus’s unique research strengths.
There are several national Marketplace models that could be considered or mixed
(e.g. Smart Manufacturing through the Smart Manufacturing Leadership Coalition,
Industrial Commons through the Digital Manufacturing Design Innovation Institute,
HUBzero at Purdue, and Community Apps Sharing Architecture through the
Instructional Management Systems Coalition). Such an environment would make
available software, data, and service resources and allow users and developers to
review, access or contribute to the store to address a wide variety of research needs.
Tools, standards and best practices might also be made available in the Marketplace,
as well as brokering of cloud services. Such a Marketplace would need to be
continuously updated, and a process of inventory and discovery put into place to
work alongside it.
o Infrastructures, Tools, Services to Enable Federation. To make this work, determine
the appropriate infrastructure (such as network connectivity and systems
interoperability), transparency, and incentives necessary to facilitate federated
resource sharing between campuses [see Actions 4, 5, 6, 7 and 8 for initial
development steps]. Federated resources must not be determined solely in a
top-down, system-level manner, but must be allowed to emerge from individual or
collaborative campus efforts and then be identified and selected for federation. Bottom-up
structures are often more agile, approach new technologies sooner and address a
broader range of disciplinary and cross-disciplinary research activities. Top-down
transparency, organization and facilitation can be combined with campus-level
development, expertise, and emerging skills to maximize impact.
o Cyberinfrastructure Alliance Guiding and Organizing Principles. The following
guiding / organizing principles will allow UC to break down policy barriers to
collaboration with specific timelines and the following deliverables:
o Inventory of services, systems, and support. Strategies are needed to communicate
the existence of shared services and to facilitate intercampus use of such
devices, systems, tools, and services.
o Institutional support for sharing services across the UC system. The barriers to
entry for intercampus sharing and for utilizing common tools across campuses
must be eliminated or greatly reduced. These barriers include financial, cultural,
incentive, policy, and organizational constraints.
o Federated services strategy. Services should be selected for federation where
such action would lead to significant improvement in the technical support and
trusted partnerships that UC researchers most need, in a reasonable period of
time and in a cost-effective manner. Importantly, not all campuses must utilize a
particular service, nor is it necessary for all shared services to be provided by
UCOP or a particular campus or center. Rather, UC’s strategy should recognize
that intercampus collaborations of two or several campuses or research centers
might generate significant efficiencies and benefits. (This does not preclude such
services being identified as shared service opportunities at a later time.)
o Common approach to data access, security, etc. UC does not have a common
(campus, discipline, health sciences) approach to data access, security,
availability, etc. UC should develop and support a suite of transparent policies,
procedures, and incentives that are easy to understand / utilize and that
promote the wide availability of data and resources within UC. Issues that must
be addressed include compliance (e.g., HIPAA), security, bio-ethical topics, and
clinician / researcher relationships.
o Ethical considerations. As access to data increases, UC must ensure appropriate
policies and standards for privacy, confidentiality, data ownership, public /
private partnerships, etc., are considered and adopted.
o External (non-UC) data. UC must investigate policies and practices relating to
data security, access, privacy, etc., that will facilitate the acquisition of data from
organizations, firms, and other groups outside UC.
Successful Delivery of Federated Services and Support – Service Delivery Approaches
Designed for Success.
Federation must be viewed as an operation in its own right that facilitates and sustains value-driven federation-oriented policy, infrastructure activities and interoperability collaborations, which together produce measurably increased campus and collective research capability and capacity. In sharp contrast to centralization, federation involves sustaining an evolutionary development lifecycle that will generally consist of the following steps:
1. Identification of a high-potential federated capability
2. Inventory and visible exposure of campus capabilities, e.g., websites and workshops
3. Detailed review of federated potential, consideration of approaches and funding, policy and capacity needs/barriers
4. Highly visible pilot orchestrated with a small subset of campuses to champion, demonstrate and shape an approach
5. Resolution of funding, policy, infrastructure or capability barriers
6. Scaling from the successful pilot, moving to operational requirements and scaling to critical mass interest
7. Adjusting and sunsetting a capability when requirements, technologies and value change.
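The seven steps above can be sketched as an ordered set of stages that a working group might use to track where a candidate capability stands. This is only an illustrative sketch; the stage names and the `advance` helper are hypothetical shorthand for the steps in the list, not something defined by the Summit.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical short names for the seven lifecycle steps."""
    IDENTIFY = 1          # identify a high-potential federated capability
    INVENTORY = 2         # inventory and expose campus capabilities
    REVIEW = 3            # review potential, approaches, funding, barriers
    PILOT = 4             # visible pilot with a small subset of campuses
    RESOLVE_BARRIERS = 5  # resolve funding, policy, infrastructure barriers
    SCALE = 6             # scale from the pilot to operations
    ADJUST_OR_SUNSET = 7  # adjust or retire as needs and value change


def advance(stage: Stage) -> Stage:
    """Return the next stage; the final stage has no successor."""
    if stage is Stage.ADJUST_OR_SUNSET:
        return stage
    return Stage(stage + 1)


# Example: a capability that has just completed its pilot.
current = advance(Stage.PILOT)
```

Keeping the stages ordered makes it easy to report which capabilities are stalled at a given step, which is the risk the working-group support described in this section is meant to address.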
To execute on this development pattern, a working group for each potential federated
capability needs to be identified. Each working group must be supported with
increasing involvement and project management. This will ensure the demonstration of
value and review on the merits of capability, and will avoid the loss of capabilities
because of lack of support, resources or commitment at any one step. Federated
capabilities that survive the pilot process need to be able to move into a managed
operational start-up and scale-up mode with identification of appropriate federated
value, investment in resources, and resolution of policy barriers. The VCR-CIO Summit
identified a first slate of candidate federation capabilities. The descriptions for each of
the following recommended actions provide proposed agendas for the associated
working groups.
Per the project delivery strategy noted above, for each of the following actions, these basic
steps are envisioned:
1. Create a cross-campus working group to guide development
2. Survey existing offerings, tools & services; contribute to the central inventory and
identify federation possibilities for the Marketplace
3. Produce an online “best practices” guidebook/manual for campuses
4. Initiate a process to actively monitor/maintain the landscape over time
Action 2: Develop systemwide and campus “Cyberinfrastructure Mediator” support.
During the UC Cyberinfrastructure Conference, participants uniformly recommended
creating a “concierge” service, a capacity for digital technology resource guidance that
brings federated expertise and capabilities together to deliver appropriate
cyberinfrastructure services to meet individual research needs. This important capacity
has been named UC’s Cyberinfrastructure Mediator partnering and support function, and it
aims to reduce the time faculty spend bringing cloud, national, UC-wide and local campus
cyberinfrastructure capabilities together to address research goals and objectives. UC’s
Cyberinfrastructure Mediators will sponsor and create “ask an expert” services and provide
“how to do things or get things done” guidance; it is also envisioned that these support staff
will partner very closely with UC faculty and provide synergistic input relating to how
technology and technical services can be leveraged to address research challenges and
promote collaboration and next-generation science.
Action 3: Develop an effective systemwide “marketplace” for research
cyberinfrastructure.
The University of California has a vast array of cyberinfrastructure tools, services, support
and related data that facilitate and enable its research enterprise. However, in general,
these various cyberinfrastructures and associated data are not readily available to
researchers who do not “own” or have not directly participated in the provisioning of a
particular tool or service. Indeed, such siloed environments exist at the campus and
medical center levels.
As a result, leveraging UC’s collective cyberinfrastructure is relatively difficult and can be
quite costly given that each new partnership requires discovery of a particular service,
developing an understanding of how it might be federated or shared, discussion of fiscal /
financial issues, and addressing data interoperability challenges. UC should therefore
create a “Research Cyberinfrastructure Marketplace” with very low barriers to entry and
commensurately low “transaction processing” costs for sharing, utilizing, and federating
services and support.
The “Research Cyberinfrastructure Marketplace” will be built on a clear understanding of
available central/shared, federated and intracampus-offered services, platforms, technical
expertise, and accessible and reusable research data. UC’s Cyberinfrastructure Mediators
will routinely refer researchers to UC’s Research Cyberinfrastructure Marketplace as an
option for obtaining services and support, and the marketplace itself will expand over time
as UC’s Cyberinfrastructure Alliance identifies and acts on opportunities for expanding
federated service offerings. Importantly, other suggested actions in this document will lower
“barriers to entry” and “transaction costs” associated with utilizing the Research
Cyberinfrastructure Marketplace (see, for example, Action 4 relating to research data as an
institutional asset).
Associated Action - Build a shared software store.
One tangible component of UC’s cyberinfrastructure marketplace will be a software
brokerage infrastructure and appropriate policy for sharing/promoting/buying cloud
software applications across the UC system. Similarly, the UC federation should be set up to
facilitate a technology channel for data and software with respect to internal and external
partnerships. Collectively, UC research is a major producer of software, and this asset can
be leveraged within the system to enhance research achievements for all.
Action 4: Support research data as an institutional asset.
UC must acknowledge the role of research data as valuable University intellectual property,
and develop and implement a set of guidelines for its management. Further, UC must
develop new — and integrate existing — tools and services based on these guidelines,
bringing together local campus data management initiatives and system-level tools where
appropriate. The libraries’ critical role in building research data into a University research
asset emerged strongly in the Summit — issues relating to data management (short and
long term), data quality, curation, retention practices, and metadata structures that enable
interoperability, etc., are foundational to optimizing UC’s effectiveness and cementing UC’s
reputation as a leader. UC must leverage expertise within its libraries and partner with
technology organizations to address this important need.
Action 5: Develop cyberinfrastructure systems “connective tissue” and associated
tools to join services, create cyber platforms, and enable federated services.
It is essential to develop platform tools that bring researchers and their work into a more
visible, discoverable state to facilitate shared expertise and to increase the potential of
collaborations. For example, how does one researcher find another researcher doing
something similar with cyberinfrastructure, especially across disciplines? Collaboration
tools, federated data and database interfaces, interconnected networks and more are
required to empower UC as a system. UC needs to agree on standards and build the
necessary “connective tissue”: campus network interconnects, middleware, scheduler
technologies, cloud service management technologies, etc., to make it possible for federated
services, databases, facilities and tools to interoperate. This will make it possible to take
advantage of cross-system and commercial cloud technologies to assemble services for
particular research needs, and may also realize efficiencies.
Action 6: Develop approaches to scale discipline-similar requirements across
campuses.
Not all research areas have large, concentrated discipline-specific data needs that are
accommodated by formal structures such as centers. There is a huge diversity of research
and scholarship programs working with smaller but equally valuable data assets that lack
the ability to scale, share, and leverage data resources across campuses, and when
appropriate, the entire UC system. UC should leverage institutional and cross-institutional
discipline-specific data resources to allow smaller data assets to take advantage of shared
resources. Importantly, this initiative will also bring together UC faculty and practitioners
from across campuses who are thought leaders in their fields; such collaborations will
enable faculty to exchange innovations and novel approaches as well as discuss and resolve
field-specific data challenges.
Action 7: Position health, patient and clinical data for research access, patient care,
and other strategic uses.
The five UC medical centers and many health science programs and their attendant health,
patient and clinical data are unparalleled data assets for research. The UC ReX (discussed
above) and Big Cogito pilot are examples. Key challenges will be standardization of
terminology across UC, and the development of appropriate policies and data governance
that allow the UC to simultaneously work as one collaborative system in certain situations
while promoting healthy competitive innovation and excellence as individual campuses. UC
must define a HIPAA-safe approach and infrastructure to advance research collaboration;
identify data workflows, interfaces, and data standards to allow for precision medicine
within the electronic medical records; and provide ready access to de-identified clinical data to
faculty outside of the school of medicine or outside of health sciences. In the process, UC
must examine challenges around specific types of data, highlight data visualization needs,
and engage patients and the community.
Action 8: Build on UC’s expertise via a development structure for UC researchers and support staff.
Collaboration and partnerships across departments, schools, fields of study, campuses, and
medical centers will increase UC’s capacity to solve complex research problems.
However, these collaborations and partnerships (that are often built around big data and
associated informatics / analysis) require new skills and approaches relating to
collaboration, data capture and analysis, systems and tools, and algorithms required to
approach research and global problems within a shared, interdisciplinary context.
UC therefore needs to invest in the development and growth of both its researchers and
information technology staff across the system.
The notion of cyberinfrastructure support staff includes the full range of domain experts
who choose non-faculty career paths supporting researchers, as well as technology experts
who are responsible for keeping research operations running. Professional development
will include the soft (interpersonal) and hard (technical) skills needed so that research
technology professionals can move comfortably from helping to address local problems to
participating in cross-campus and multi-campus collaborations. An ultimate goal of this
process should include the establishment of a UC community of well-connected research
technology consultants and cyberinfrastructure engineers who can serve as an adjunct
“community of experts” to the Cyberinfrastructure Mediators described above.
Importantly, UC must provide similar developmental opportunities for its faculty who now
require non-disciplinary expertise (data capture, data management, informatics / analysis
capabilities, etc.) and skills relating to collaborating with non-traditional colleagues and
partners (e.g. bridging disciplinary languages, dictionaries, areas of interest, etc.).
Immediate Next Steps and Moving Forward
As noted earlier in this document, UC is currently not organized to successfully provide or
facilitate federation or intercampus services. Indeed, current incentives and organizational
structures, in many cases, encourage the provisioning of tools, infrastructures, and services
that are inherently not shareable and/or interoperable.
As a result, UC should create an organizational approach and formal structure to facilitate
and enable the federated, collaborative vision outlined in this document. This approach
will not only provide immediate benefit to UC’s research enterprise, but will also provide a
structure to prioritize, implement, and manage initiatives over the next several years and
beyond. It is therefore proposed that UC act on and complete the following two action
items within the next twelve months:
UC should create the UC Cyberinfrastructure Alliance to provide oversight, guidance, and
structure for acting on the various recommendations resulting from the
Cyberinfrastructure Conference.
This body will create the framework necessary to support federated and intercampus
services. The UC Cyberinfrastructure Alliance will also prioritize initiatives and
coordinate overall project planning/management.
UC should develop a systemwide and campus Cyberinfrastructure Mediator-style service
and begin development of a systemwide “marketplace” for research cyberinfrastructure.
This effort will produce a formal group that provides immediate service and benefit to
the UC research community, including cybersecurity recommendations, while the UC
Cyberinfrastructure Alliance is formed.
Summary and Conclusion
The University of California’s most successful and important technical collaboration is
CENIC (Corporation for Education Network Initiatives in California). CENIC is a federated
service insofar as UC campuses are able to build, operate, and optimize local networks to
meet campus needs, but the “connective tissue” that integrates these networks is provided
by CENIC. California’s federated educational network is a best-of-breed solution that is
critical to UC’s teaching, research, and public service missions.
Similar to the CENIC model for providing federated services, the proposed UC
Cyberinfrastructure Alliance and the recommended service implementation plans will
provide (and/or facilitate) the tools, platforms, data management practices, and other
services that will provide UC researchers and scholars frictionless access to the marketplace
of UC cyberinfrastructures and support. Importantly, this approach will not compromise
individual campuses’ or researchers’ tools, systems, and initiatives, but will connect them in
novel and synergistic ways.
This effort to implement federated services will better position UC to support faculty like
Berkeley’s Frank McKenna (mentioned earlier in this report). Cyberinfrastructure-related
tools and services will ideally “appear to me like local files and applications were on my
desktop. I just define the workflow and the system figures out where to run it.”
UC’s Vice Chancellors of Research and CIOs are ready to produce a plan for creating and
operationalizing the Cyberinfrastructure Alliance for three years. This operating plan will
include the creation of the campus Cyberinfrastructure Mediator service as well as a suite of
metrics and reports defining and measuring impacts and success. If these next steps are
approved, this plan will be created by February 2016.