Visual Programming Environments for Science and Business
MITCH MILLER
SCIENTIFIC THINKING
CODE CAMP 2015
SEPTEMBER 19, 2015
Disclaimer
This talk represents my opinion and personal experience using two fine software systems developed by third parties.
The software systems shown are very complex and have hundreds of components. I have only worked with a small number.
Every task shown today can be accomplished in multiple ways. I’m only showing one of those ways.
Overview
Introduction: first demo
What is a ‘visual programming environment’?
The two systems we’ll look at today
What are these systems capable of?
Second set of demos (in-depth)
Demo 1: set-up
Task: produce report of all compounds registered during January
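For comparison, the same task can be sketched in a few lines of conventional code. This is only an illustration of the logic, not what either tool generates: the record layout, the compound IDs, and the year 2015 are assumptions made up for the example, and a real workflow would read the records from a database rather than a list.

```python
from datetime import date

# Hypothetical registration records; in the demo these would come from a database.
compounds = [
    {"id": "CMPD-001", "registered": date(2015, 1, 5)},
    {"id": "CMPD-002", "registered": date(2015, 2, 11)},
    {"id": "CMPD-003", "registered": date(2015, 1, 28)},
]

def registered_in_month(records, year, month):
    """Keep only the records whose registration date falls in the given month."""
    return [
        r for r in records
        if r["registered"].year == year and r["registered"].month == month
    ]

def format_report(records):
    """Render the records as a plain-text report, one compound per line."""
    lines = ["Compound  Registered"]
    for r in sorted(records, key=lambda r: r["registered"]):
        lines.append(f"{r['id']}  {r['registered'].isoformat()}")
    return "\n".join(lines)

january = registered_in_month(compounds, 2015, 1)
report = format_report(january)
print(report)
```

In a visual environment each of these steps (read, filter by date, format) would typically be a separate component on the canvas.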
Visual Programming: informal definition
Drag functional components onto canvas to create program
Configure most components by setting parameters
Connect components to route data from one to another
Run and observe data traveling down the lines
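The four steps above map loosely onto ordinary code. The sketch below is a rough analogy, assuming each component is a function configured through parameters and each connection passes rows from one component to the next. The names filter_rows, select_columns and connect are hypothetical and do not come from KNIME or Pipeline Pilot.

```python
from typing import Callable, Iterable, Iterator

Row = dict
Component = Callable[[Iterable[Row]], Iterator[Row]]

def filter_rows(column: str, minimum: float) -> Component:
    """Configure a filter component: keep rows where column >= minimum."""
    def run(rows):
        for row in rows:
            if row[column] >= minimum:
                yield row
    return run

def select_columns(*columns: str) -> Component:
    """Configure a component that keeps only the named columns."""
    def run(rows):
        for row in rows:
            yield {name: row[name] for name in columns}
    return run

def connect(*components: Component) -> Component:
    """Wire components together so each one's output feeds the next."""
    def run(rows):
        for component in components:
            rows = component(rows)
        return rows
    return run

# Illustrative input data (molecular weights in g/mol).
source = [
    {"name": "aspirin", "weight": 180.16, "supplier": "vendor A"},
    {"name": "water", "weight": 18.02, "supplier": "vendor B"},
    {"name": "caffeine", "weight": 194.19, "supplier": "vendor A"},
]

pipeline = connect(filter_rows("weight", 100.0), select_columns("name", "weight"))
result = list(pipeline(source))
print(result)
```

Because the components are generators, rows flow through the pipeline one at a time, which is roughly what "observing data traveling down the lines" looks like in code.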
Component types
File I/O
Read/write text files
Read/write MS Office documents
XML
JSON
Database access
Connect
Query
Update
Component types (continued)
Web service consumption
Domain-specific processing
Chemical structure I/O
Chemical structure processing and analysis
Sequence processing
Extensibility
Add your own libraries for more sophisticated processing
Component types (continued)
Visualization
Graphing
Statistical calculations
Scripting
Tip: aim for brief scripts
Data transformation
If/else processing
Filtering
Column selection
And many more…
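As an illustration of if/else processing, a branch component can be pictured as one input with two outputs. The function below is a hypothetical sketch of that idea, not the implementation used by either tool, and the sample data and threshold are invented for the example.

```python
def if_else(rows, condition):
    """Send each row to the 'true' or 'false' output, like a two-port branch component."""
    true_port, false_port = [], []
    for row in rows:
        (true_port if condition(row) else false_port).append(row)
    return true_port, false_port

# Invented sample data; the 6.0 threshold is arbitrary.
results = [
    {"sample": "S1", "activity": 7.2},
    {"sample": "S2", "activity": 4.1},
    {"sample": "S3", "activity": 6.0},
]

active, inactive = if_else(results, lambda row: row["activity"] >= 6.0)
print(len(active), "active;", len(inactive), "inactive")
```

Each output can then be connected to its own downstream components.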
KNIME
Originally developed at the University of Konstanz, Germany, in 2004
Currently produced by KNIME.com AG, a company in Zurich, Switzerland
KNIME stands for KoNstanz Information MinEr
Pronounced “Nighm”
A general purpose data analytics platform
Free version available for download
For-sale version available with added extensions
KNIME (continued)
Java based
Written in Java
Scripted, extensible in Java
URL: https://www.knime.org/
Pipeline Pilot
Developed and sold by BIOVIA, San Diego, CA
Originally developed by SciTegic, San Diego, in 1999
Designed for scientists to “rapidly create, test and publish scientific services that automate the process of accessing, analyzing and reporting scientific data”
(http://accelrys.com/products/collaborative-science/biovia-pipeline-pilot/)
Client-server system
Commercial product
Extensible using .NET and Java
Scripted using an original language, ‘PilotScript’
KNIME Terminology
Components are called “Nodes”
Programs are “Workflows”
Reusable sets of Nodes are “Metanodes”
Groups of related Nodes are “Extensions”
Pipeline Pilot Terminology
Components are called “Components”
Programs are “Protocols”
Reusable sets of Components are “Subprotocols”
Groups of related Components are “Packages”
Different protocols can be combined
One protocol provides the initial UI, including a Web form
A second protocol handles form data processing (‘work protocol’)
Different systems shown today serve different populations
KNIME can be used ad hoc on the desktop of a power user. It is also used by companies in a variety of industries
Pipeline Pilot is geared towards scientists. It is part of an enterprise system and requires a server installation
Programs can be deployed outside the development client
Give users a URL to access your program
Users of BIOVIA Electronic Lab Notebook and other software can access Pipeline Pilot protocols outside the Pipeline Pilot UI
Users access a Web application that shows them the data they’re looking for in a purpose-built user interface
The application does not look like the system with which it was built
For-sale version of KNIME Server provides similar functionality
Server Features
User access configuration
Shared data sources
Automatic jobs
Etc.
Second demo
Exploration of data set using KNIME and Pipeline Pilot
Data set comes from the Developmental Therapeutics Program (DTP) of the National Cancer Institute (NCI)
Results of laboratory tests for activity against 60 human cancer cell lines
Data freely available:
https://dtp.cancer.gov/discovery_development/nci-60/default.htm
Additional demos
Pipeline Pilot Web Port sample
Suggestions for getting started
Download the KNIME software (knime.org)
Install on your computer
Look at the sample workflows
Start simple; build up
Types of applications
Reporting
Data set comparisons
ETL
Data Analysis
References
Scholarly article on KNIME and Pipeline Pilot
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3414708/
https://www.knime.org/
https://www.youtube.com/user/KNIMETV
http://accelrys.com/products/collaborative-science/biovia-pipeline-pilot/
https://dtp.cancer.gov/
Who is your speaker?
Mitch Miller, Ph.D. in Chemistry and 20+ years of IT experience
Independent consultant: Scientific Thinking, LLC
mitch.miller@thinkscience.us
Some recent projects
Ongoing custodian of one chemical database implementation for the ChemIDplus project within the National Library of Medicine
Upgraded a 10-year-old Java Servlet lab workflow application to the latest version of the JDK and Internet Explorer 11, and implemented enhancements
Windows service to handle communication between 2 legacy applications
Import wizard for chemical array designer
Merged a set of chemical databases and harmonized data