Presentation Afternoon Session: Leverage actively developed open source software and libraries...


Page 1: Presentation Afternoon Session: Leverage actively developed open source software and libraries. Contribute back to open source development. Build new communities of developers and users.

1/25/2018

1

Lucy Rose

Department of Forest Resources

Page 2:

Opportunities and Challenges for High Temporal Resolution Hydrologic Monitoring in Northern Minnesota

Lucy Rose and Diana Karwan

Department of Forest Resources, University of Minnesota

Legislative‐Citizen Commission on Minnesota Resources

Study Site: West Swan River

Watersheds topographically defined, don’t always line up in this area

Total Watershed Area: 85 mi²

Contributing Area between “Above” and “Below”: ~2 mi²

Marcell

West Swan

Hibbing, MN

Page 3:

West Swan River

Primary interests:

What are the changes in magnitude and timing of:

• Stream discharge

• Dissolved organic carbon (DOC)

• Total suspended sediment (TSS)

• Turbidity

Upstream site

Downstream site

Monitoring Equipment: Opportunities

Campbell OBS‐3 Turbidity Probe

Decagon CTD‐10 Conductivity, Temperature, Depth Probe

Stroud Water Research Center, envirodiy.org

Page 4:

Stroud Water Research Center, envirodiy.org

Monitoring Equipment: Opportunities

Programmable ISCO Automated Water Sampler

Sontek IQ‐Plus High‐Frequency Discharge Profile Sensor

Monitoring Equipment: Challenges…

Page 5:

Hydrologic variability during the spring snowmelt in West Swan River

March 15 – April 6, 2017

Downstream study site

Page 6:

Opportunities and Challenges

• Expanding capabilities for high temporal resolution measurement of many water quality characteristics

• Supportive community of open source, DIY sensor and datalogger enthusiasts, willing to share knowledge (envirodiy.org)

• Still working out the “bugs” in many of these DIY systems

• Even commercial sensors and samplers can require regular attention, depending on the monitoring site 

Thank you!

Page 7:

Amit Pradhananga

Center for Changing Landscapes

Comprehensive social science data collection and analysis

Amit K Pradhananga

Mae A Davenport

01/19/2017

Page 8:

WHAT drives conservation behavior?

Data collection

Page 9:

Sociodemographic and property characteristics

• Age, gender, education, income

• Farming experience

• Property size

• Tenure

Conservation behavior

• Practice adoption

• Civic engagement

• Support for conservation initiatives

Perceptions

• Awareness

• Attitude

• Values

• Norms

• Efficacy

Page 10:

Completed projects Ongoing projects

Wild Rice Watershed District

Capitol Region Watershed District

Mississippi Watershed Management Organization

Vermillion River Watershed

Ramsey‐Washington Metro Watershed District

Sand Creek Watershed

Cannon River Watershed

Middle Minnesota Watershed

Mississippi River‐La Crescent Watershed

Mississippi River‐Reno Watershed

Lower Minnesota Watershed

Middle Snake Tamarac Rivers Watershed District

Develop comprehensive, social science-based framework to track drivers of conservation behavior

Page 11:

Social data mapping

http://gis.joewheaton.org/topics/data

Opportunities/challenges

Page 12:

Expertise in geospatial analysis and social data mapping

Partnerships and collaborations

Page 13:

Leif Olmanson

Department of Forest Resources

Remote Sensing of Lake Water Quality: Opportunities and Challenges

UNIVERSITY OF MINNESOTA

Leif Olmanson

Page 14:

HIGHLIGHTS

7 statewide water clarity assessments since 1975 of >10,000 Minnesota lakes

Analysis of spatial and temporal trends and causative factors

Lake Browser: An on‐line resource for ~9,000 unique monthly visitors

Currently being updated to include 2010 and 2015 to maintain a 5‐year interval

Prior Accomplishments:


Remote sensing of lake water clarity in Minnesota

To improve water quality and fisheries management we need more comprehensive water quality data

water.rs.umn.edu

New satellite technology enables measurements of the three factors controlling water clarity – algae, suspended solids, and dissolved organic color – allowing us to assess their individual effects on water quality

More often

Better sensors

Finer resolution


2016 CDOM Map

Page 15:

Water Clarity Model Applied Globally

Cloud Based Image Processing

A planetary‐scale platform for Earth science data & analysis

Used to explore image processing methods validated with in situ data

Garbage in, garbage out. Your analysis is only as good as your data!

Page 16:

Path 29, 9/17/14

Path 28, 9/12/15 & 8/22/13

Challenge

Atmospheric correction for consistent results using automated methods

Path 28, 8/30/16

Path 26, 10/3/16

Secchi disk transparency data within 1 day

Landsat OLI test image field data

Landsat OLI Remote Sensing Reflectance (Rrs) Lake Spectra
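
The pairing of field Secchi readings with an image acquired within 1 day can be sketched as follows. This is a minimal illustration and not the presenters' workflow: the function name, the record layout, the lake names, and the Secchi values are hypothetical, and only the August 30, 2016 image date comes from the slides.

```python
from datetime import date

def match_within_days(samples, image_date, max_days=1):
    """Return the field samples collected within max_days of image_date.

    samples is a list of (lake_id, sample_date, secchi_m) tuples; this
    layout is an assumption made for the example.
    """
    return [s for s in samples if abs((s[1] - image_date).days) <= max_days]

# Hypothetical field records, for illustration only.
samples = [
    ("Lake A", date(2016, 8, 29), 2.1),
    ("Lake B", date(2016, 8, 30), 0.9),
    ("Lake C", date(2016, 9, 2), 3.4),
]

# Keep only readings taken within one day of the 8/30/16 image.
matched = match_within_days(samples, date(2016, 8, 30))
print([lake for lake, _, _ in matched])
```

A tighter window gives cleaner calibration pairs but fewer of them, so max_days is left as a parameter.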

Page 17:

August 30, 2016 Landsat 8 OLI image, RGB (Blue, Thermal, Thermal), masked using EROS CFMask

Cloud, Shadow and Haze Masking

Works well for clouds and most shadows but not for haze

Challenge

Lakes in areas with haze will be mischaracterized


Opportunity: Near Real‐Time Water Quality Monitoring 

Improved data

More often (~weekly)

Free

EROS Data Center

Normalize images

Remove land, clouds, shadows, haze…

Water clarity

CDOM

Suspended Solids

Chlorophyll

Automated satellite imagery pipeline

Recently launched satellites: Landsat 8, Sentinel‐2, Sentinel‐3

Maps, data, statistical summaries, time‐trend plots and animations

Minnesota Supercomputing Institute (MSI)

UMN high performance computing systems

Prepare images using new automated methods.

Apply water quality models.

Provide customized information to agencies, researchers, and citizens.

Enhanced Lake Browser

New Lake and Fisheries management and Research Opportunities

Page 18:

Xun Tang

Spatial Computing Research Group

www.spatial.cs.umn.edu

Courses


• CSCI 5715 Spatial Computing (Fall 17)

• CSCI 8715 Spatial Data Science Research (Spring 18)

Related Grants

• Active

• USDA: Increasing Low‐Input Turfgrass Adoption through Breeding

• NSF: Collaborative Research: Mining Climate and Ecosystem Data Driven Approach

• Finished

• NSF: IGERT: Non‐equilibrium Dynamics Across Space and Time: A Common Approach for Engineers, Earth Scientists, and Ecologists

• NSF: CRI:IAD Infrastructure for Research in Spatio‐Temporal and Context‐Aware Systems and Applications

• USDOD: Modeling and Mining Spatio‐Temporal Co‐occurrence Patterns

• USDOD: Cascade Models for Multi‐Scale Spatio‐Temporal Pattern Discovery

Page 19:

Research: Spatial data mining


• The process of discovering interesting, useful, non-trivial patterns from large spatial datasets

• Example patterns

• Hotspots, spatial clusters

• Spatial outliers, discontinuities

• Co-locations, co-occurrences

• Location prediction models

• Highly interdisciplinary

• Students always collaborate with scientists from Environmental Science, Public Health, Transportation.

GEO: Forensics: When and where do contaminants enter Shingle Creek?

CISE/IIS: Scalable detection of spatio‐temporal hot‐spots & co‐occurrences


Ex. Oil Spill

Flow anomaly

After consecutive heavy rain events

(HydroLab sensor)

Details: J. M. Kang, S. Shekhar, C. Wennen, and P. Novak, Discovering Flow Anomalies: A SWEET Approach, IEEE Intl. Conf. on Data Mining, 2008.

Ack: NSF IGERT, CISE/IIS/III, USDOD.

Dissolved Oxygen

Rainfall
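
A flow anomaly such as a sudden change in a dissolved oxygen series can be illustrated with a simple rolling-window check. The sketch below is not the SWEET approach from the cited paper; it is a generic baseline, and the function name, window size, threshold, and readings are all assumptions made for illustration.

```python
from statistics import mean, stdev

def flag_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` readings by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip flat histories, where a z-score is undefined.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged

# Hypothetical dissolved oxygen readings (mg/L) with one sudden drop.
readings = [8.1, 8.0, 8.2, 8.1, 8.0, 8.1, 4.5, 8.0, 8.1]
print(flag_anomalies(readings))
```

A check like this only looks at one sensor in isolation; relating the flagged period to rainfall events would need the second series as well.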

Page 20:

Goals:

• Design compelling visions

• Identify gaps

• Develop a research agenda

55 Participants (Data-driven FEW & Data Sciences)

Global Temperature

Global Population

StateNexus Dashboard

Locations

Potentially Transformative Research Agenda:

• National FEW Nexus Observatory & Dashboard for chokepoint monitoring, alerts, warnings (See Figure above)

• Novel Physics-aware Data Science for mining nexus patterns in multi-scale spatio-temporal-network data despite non-stationarity, auto-correlation, uncertainty, etc.

• Scalable tools for consensus Geo-design via participative planning with nexus observations and policy projections

• An INFEWS data science community to address crucial gaps, and shape next-generation Data Science

Next: (a) Workshop report in Jan. 2016. (b) Symposium at NCSE National Conf. on Science, Policy & Env. (2pm-330pm, Th. 1/21/16, Crystal City, Washington D.C.)

NSF INFEWS Data Science (DS) Workshop (@ USDA NIFA, Oct. 5th‐6th, 2015; Shekhar, Mulla, & Schmoldt; www.spatial.cs.umn.edu/few)

Finding 1: Data & Data Science are crucial!

• Understand problems, connections, impacts

• Monitor FEW resources, and trends to detect risks

• Support decision and policy making

• Communicate with public and stakeholders

Finding 2: However, there are show-stopper gaps.

1. Data Gaps: No global water & energy census, Heterogeneous data formats & collection protocols

2. Data Science (DS) Gaps: Current DS methods are inadequate for spatio-temporal-network FEW data. Strong assumptions in DS need examination for better coupling with mechanistic models (e.g., Physics)

Aral Sea Shrinkage (1978-2014) Due to Cotton Farms

Alerts

Global Population

Participants by domain: Food 14, Energy 10, Water 11, Data Sc. 20

Participants by sector: Gov. 26, Aca. 24, Industry 5

Sea-Surface Temperature Anomaly

Trends

Page 21:

Small Group Discussion

Regarding the interface of water and data at the University of Minnesota, identify our institutional strengths, opportunities, weaknesses, and threats.

Page 22:

DIGITAL WATER SURVEY SUMMARY

Jeffrey M. Peterson

January 19, 2018

THE SURVEY, BY THE NUMBERS

• Online survey with 13 questions

• Sent to 61 selected faculty and staff

• Complete data from N = 42 respondents

• Response rate = 69%
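
The response rate follows directly from the two counts above. A small sketch of the arithmetic (the function name is hypothetical):

```python
def response_rate(completed, invited):
    """Return the response rate as a whole-number percentage."""
    return round(100 * completed / invited)

# 42 complete responses out of 61 invitations, as reported on the slide.
print(response_rate(42, 61))
```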

Page 23:

Disciplines of respondents (chart; axis: number of respondents)

• Biological science

• Aquatic science

• Agricultural science

• Engineering

• Earth science

• Computational or data science

• Economics

• Social science

• Other

Positions of respondents (chart; axis: number of respondents)

• Tenured or tenure‐track faculty

• Non‐tenure track faculty

• Research associate or postdoctoral researcher

• Extension Educator

• Other

Page 24:

Data types used (chart; axis: number of respondents)

• Geospatial data

• Time series data

• Cross‐sectional data, not georeferenced

• Field measurements

• Laboratory measurements

• Socioeconomic survey data

• Qualitative data

How much of your research depends on multidisciplinary collaborations? (chart; axis: number of respondents)

• None

• None but would consider

• None but am planning

• All or most

Page 25:

Importance of constraints in multidisciplinary research 

Lack of common vocabulary to communicate ideas

Lack of a common framework to combine different types of data

Lack of tools/software to combine and analyze multidisciplinary data

Lack of ways to easily share data and analysis tools

Importance of constraints in data handling: Hardware and software 

Availability of high performance data storage

Availability of high capacity data storage

Availability of many processors for computationally intensive work

Availability of tools/software for visualization

Availability of tools/software for data management and interoperability

Page 26:

Importance of constraints in data handling: People

Access to people capable of implementing algorithms and data analysis workflows

Access to people who are skilled at gathering, organizing, and curating data

Access to people who are skilled at analyzing and visualizing data

Removing constraints would….

Improve my ability to compete for extramural funding

Increase my efficiency in completing research projects

Improve my ability to recruit students

Increase the likelihood that I will stay at the University of Minnesota

Page 27:

I or my group members would benefit from training on…

Machine learning

Database management

Data visualization

Trend analysis

Anomaly detection

Pattern recognition

Geospatial analysis

Importance to future research: Hardware and software

High performance computing resources

High capacity data storage

Mapping software or tools to analyze geospatial data

Custom software

Page 28:

Importance to future research: Capabilities

A means to combine diverse datasets

A means to protect the privacy of some or all data

A way to share data with collaborators outside of UMN

Data use agreements to protect Intellectual Property (IP) data

A means to store, organize and access data

SOME KEY RESULTS

• Data interoperability is a pathway to multidisciplinary research

• Hardware is important but not currently a constraint

• Current constraints revolve around human resources, software/tools, and training

• Removing constraints is expected to have large benefits

Page 29:

Jim Wilgenbusch 

(Phil Pardey & Kevin Silverstein)

University of Minnesota, Minnesota Supercomputing Institute

College of Food, Agricultural and Natural Resource Sciences

January 19, 2018

Water Resources Assembly and Research Symposium

University of Minnesota

G.E.M.S: Enabling Agricultural Innovation™

Credit: Marcel Ritter, Jian Tao, Haihong Zhao, Louisiana State University Center for Computation and Technology

Visualizing Big Data: oil flow through water 

Page 30:

Data Interoperability (and Scalability)

G.E.M.S: Genomics, Environment, Management, Socio‐Economics

(Diagram axes: Time, Space)

Siloed Data

Institutions, individuals, and, most importantly, by subject‐matter disciplines

Page 31:

“Broken” Data

A lot of “data” is incomplete, some is messy and even incoherent

Due Diligence

Page 32:

G.E.M.S™

GEMShare™: Data sharing, Metadata management

GEMSTools™: Data cleaning, Data analytics

Governance

Partners

HPC Technology

Human Capital

Relationship to Partnerships

G.E.M.S™ partner groups: IAA, Genomes To Fields (G2F), Other Groups (Digital Water Initiative)

Each group: Membership, Governance, Data use agreements/Data privacy, Federated resources

Page 33:

Basic Development Principles

• Leverage actively developed open source software and libraries

• Contribute back to open source development

• Build new communities of developers and users when none exist

• Prepare to throw stuff away

Open Source Tools Supporting G.E.M.S

• Postgres ‐ MIT

• Jupyter ‐ BSD 3.0

• Django ‐ BSD 3.0

• pyCSW ‐ MIT

• Globus ‐ Apache 2.0

• Apache Spark ‐ Apache 2.0

• Geotrellis ‐ Apache 2.0

• Docker ‐ Apache 2.0

• PostGIS extensions ‐ GPL 2.0

• Puppet ‐ Apache 2.0

• Conda ‐ BSD

• R ‐ GPL

• Scala ‐ BSD

• CentOS

G.E.M.S™ – Core Features

GEMShare™

A research‐enabling, federated data storage and sharing platform

• Security: Appropriate levels of security (data encryption at rest; authentication with home institution’s credentials; and secure infrastructure)

• Access Control: Data owners control access to their data, in recognition of the proprietary nature of much of the data 

• Access Levels: Different levels of access [single organization; set of organizations; and publicly available (open) data]

• Discovery: Discoverability of data through metadata alone

• Transfer: Secure data transfer over both high speed networks between reliable endpoints and high latency networks to less reliable endpoints
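
The three access levels and owner-controlled access described above can be sketched as a small permission check. This is an illustration only and not GEMShare's implementation: the class names, field names, organization names, and the rule that the owning organization always has access are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum

class AccessLevel(Enum):
    SINGLE_ORG = "single organization"
    ORG_SET = "set of organizations"
    PUBLIC = "publicly available"

@dataclass
class Dataset:
    name: str
    owner_org: str
    level: AccessLevel
    allowed_orgs: set = field(default_factory=set)  # used only for ORG_SET

def can_access(dataset, user_org):
    """Decide whether a user from user_org may read the dataset's contents."""
    if dataset.level is AccessLevel.PUBLIC:
        return True
    if user_org == dataset.owner_org:
        return True  # assumption: the owning organization always has access
    if dataset.level is AccessLevel.ORG_SET:
        return user_org in dataset.allowed_orgs
    return False

# Hypothetical dataset shared by "Org A" with "Org B" only.
trial = Dataset("maize_trials", "Org A", AccessLevel.ORG_SET, {"Org B"})
print(can_access(trial, "Org B"), can_access(trial, "Org C"))
```

Discovery through metadata alone would sit outside this check: a catalog could list a dataset's description to everyone while can_access still gates the data itself.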

Page 34:

G.E.M.S™ – Specialized Features

GEMSTools™

An ever‐expanding data documentation, cleaning, harmonizing and analysis toolkit

• provide access to best in class hardware and software libraries

• accommodate different programming languages

• offer a range of analysis styles (novice to sophisticated)

GEMSTools – Analysis Interface (Expert)

Page 35:

GEMSTools – Analysis Interface (Point & Click)

Mousing over a location displays selected aggregate stats for that location.

Filters:

Output:

Data set: CIMMYT maize, CIMMYT wheat, G2F maize

Location, Series, Trial, Mgmt Condition, Phenotype, Socioeconomic

Aggregate stats: Global, By Country, By Series, By Trial, By Investigator, By Seed Variety, By Seed Source, By Location ID

Germplasm, Genotype Matrix, Phenotype, Socioeconomic

Data interoperability issues and GEMSTools™ solutions

Page 36:

Typical Data Impurities

• Nomenclature inconsistencies

• Measurement unit differences

• Erroneous and missing entries

• Outlier / physically impossible data values

• Domain‐specific problems

• Pedigree syntax

• Genotype / Pedigree inconsistencies

• Spatial concordance of census and mapped data

• Spatio‐temporal boundary standardization

Nomenclature inconsistencies

Total Phosphorus (entries as recorded):

207 lb/A

46lb per acre

46 pound/A

22 lbs

68 kg/ha

54lbs/acre

55.5 lbs P per Acre

80lb/acre

None

40 pounds

none applied

192 lbs;17‐Apr‐14

...
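
Entries like these mix unit spellings and sometimes omit the area unit entirely. A rule-based sketch of harmonizing them to kg/ha is shown below. It is an illustration, not the GEMSTools DataCleaner: the function name and the regular expression are assumptions, and entries that cannot be converted without guessing (no area unit, no number, extra fields) are returned as None for manual review.

```python
import re

KG_PER_LB = 0.45359237
HA_PER_ACRE = 0.40468564

# A number, a mass unit, an optional "P", then an optional "/" or "per"
# followed by an area unit.
PATTERN = re.compile(
    r"^\s*(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<mass>lbs?|pounds?|kg)\b"
    r"(?:\s*P)?"
    r"(?:\s*(?:/|per)\s*(?P<area>acre|ha|a))?\s*$",
    re.IGNORECASE,
)

def to_kg_per_ha(text):
    """Convert an application-rate string to kg/ha, or return None when the
    entry is ambiguous and should be reviewed by a person."""
    m = PATTERN.match(text)
    if m is None or m.group("area") is None:
        return None
    value = float(m.group("value"))
    mass_kg = value if m.group("mass").lower() == "kg" else value * KG_PER_LB
    area_ha = 1.0 if m.group("area").lower() == "ha" else HA_PER_ACRE
    return mass_kg / area_ha

for raw in ["207 lb/A", "46lb per acre", "68 kg/ha", "22 lbs", "none applied"]:
    print(repr(raw), "->", to_kg_per_ha(raw))
```

Returning None instead of assuming "per acre" keeps an entry such as "22 lbs" from being silently converted under the wrong assumption.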

GEMS Tools—DataCleaner

Planter Type (entries as recorded): Air planter; Fluted cone; Almaco TP2; air; air planter; Fluted Cone; Fluted cone; Almaco fluted cone planter; jab planter; jab; jab (hand) planter; ...

Previous crop (entries as recorded): Soybean; soybeans; soybean; corn; soy beans; Corn; ...
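
Nomenclature inconsistencies like the "Previous crop" variants can often be collapsed with a rule-based lookup. The sketch below is a minimal illustration under stated assumptions: the canonical vocabulary and the function name are hypothetical, and it does not represent the DataCleaner's actual rules.

```python
import re

# Hypothetical canonical vocabulary; a real project would maintain its own.
CANONICAL_CROPS = {"soybean": "soybean", "soybeans": "soybean", "corn": "corn"}

def normalize_crop(raw):
    """Map a free-text crop entry to a canonical name, or None if unknown."""
    key = re.sub(r"[^a-z]", "", raw.lower())  # drop case, spaces, punctuation
    return CANONICAL_CROPS.get(key)

entries = ["Soybean", "soybeans", "soybean", "corn", "soy beans", "Corn"]
print(sorted({normalize_crop(e) for e in entries}))
```

Unknown spellings come back as None rather than being guessed, so they can be routed to a reviewer and added to the vocabulary.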

Page 37:

Dynamic Metadata Mashup Model—DM3 

EML (Ecological Metadata Language): experiment, investigator, institution, organism specimens and taxonomy. 

OBI: sequencing, library preparation, and sequence processing 

ENVO & XEO/XEML: environmental features and habitats

Planteome.org (TO & CO): plant phenotypic traits (TO) across many individual crop ontologies (CO)

PATO: plant phenotypic qualities

AGRO: agronomic practices and techniques

OGC standard ISO19115‐2, FGDC and Dublin Core: geospatial 

Broad Vocabularies

AGROVOC (FAO): including food, nutrition, agriculture, fisheries, forestry, environment etc. Translated in 27 languages.

ICASA (AgMIP)


GEMSTools — DataCleaner

Before After

Correcting errors

Imputing missing lat/long values

Geocoding Inference Engine

Page 38:

GEMSTools™ – Machine Aided Data Cleaning

Modular code to address each cleaning issue

• Work on specific problem (e.g., maize field trial data)

• Write code to automate much of the cleaning

• Apply to new crops or new datasets

o G2F vs CIMMYT vs PepsiCo (nomenclature cleaning)

o Maize, wheat, soybean, apples (pedigree cleaning)

o Surface water measurements

Rule‐based techniques, Natural Language Processing, and some Deep Learning methods

Converge toward real‐time feedback on cleaning

Thanks

G.E.M.S URL – Under Construction! 

Page 39:

Office of the Vice President for Research

Advanced Systems

Operations

- Common Services

- HPC Systems

- Storage Systems

- Hosted Services

Scientific Computing Solutions

-- Optimization

-- Benchmarking

-- HPC Research

Workflow & pipeline Development

Application Development

Solutions

- Custom App Dev

- System Programming

Research Informatics Solutions

-Informatics education

-Informatics research

-Informatics services

-Life Science Computing

User Gateway Group

-- User Support Lead

-- User Training

-- On Boarding

-Communications

-- Outreach

Minnesota Supercomputing Institute

UM Informatics Institute

U-Spatial

Research Computing

MSI Computing and Data Storage Assets

Batch High Performance Computing

• Two Supercomputers

• 25,000 CPU Cores

• 230,400 GPU CUDA Cores

• 100 TB Memory

• Infiniband Network

Big Data Storage & Analysis

• 6 PB Primary High Performance

• 3 PB Second Tier

• 30 PB Archive Tape Library

• 1.2 PB Hadoop/Spark Cluster

Interactive & Cloud Computing

• Citrix VDI for Windows

• DCS Nice for Linux Desktops

• OpenStack for Secure Cloud

• 100 Gbps Campus Research Network

• Regional & National Optical Networks

Web Portals & Databases

• Galaxy for Multi‐omics

• Jupyter Hub

• Custom Interfaces & Applications

Page 40:

G.E.M.S – Platform Architecture & Components

Data Storage

Core Services Node

Analysis Node

Database

REST API

Container Server

Web Apps

Globus Auth & Transfer

Globus Client

Apache

Container Server

JupyterHub

Jupyter/Spark (User1)

Jupyter/Spark (User2)

Jupyter/Spark (UserX)

G.E.M.S – Security Infrastructure

Admin Workstations

Users

Bastion Host

Globus Auth & Transfer

SSL Encrypted

Secure Web Browser

Path (notebook)

Two factor authentication for admin access

SSH Encrypted

Logging

Monitoring

Encrypted Data Storage

User Isolated container Environments

Page 41:

G.E.M.S – Scale out of Compute and Data

Admin Workstations

Users

Bastion Host

Globus Auth & Transfer

Logging

Monitoring

G.E.M.S – Use Cases and Business Models

Presentation

Application

Data

Middleware

APIs

Integration

Hardware

Facilities

Connectivity

Case 1

Use everything managed by MSI or by Federated Partner 

Business Model: SaaS

Infrastructure

Platform

Apps

Case 2

Use G.E.M.S platform, apps, & data, and another infrastructure

Business Model: Open Core

ACME Corp IT, AWS, etc.

Platform

Apps

Data Data

Case 3

Use only G.E.M.S data and other apps, platform, and infrastructure

Business Model: DaaS

Application

Presentation

Middleware

APIs

Integration

Hardware

Facilities

Connectivity

My Apps, My Laptop, ACME Corp IT Server, AWS, Supercomputing Institute at Univ., etc.

Data Data

Page 42:

Small Group Discussion

Identify resources and investments needed for a Digital Water Initiative at the University of Minnesota to best support collaboration.