2005 04 05 sri eln architecture

ELN Architecture Simon Coles President & CTO, Amphora Research Systems


Post on 20-Aug-2015


ELN Architecture

Simon Coles, President & CTO, Amphora Research Systems

http://www.amphora-research.com/

So...

• You’re on holiday one day
• Doing your normal thing
• And then you get the call...
  • they want an ELN!


ELN architecture

• Hopefully
  • I am not going to self-destruct
  • Your project won’t be as exciting
• Your task is to
  • Deliver a state-of-the-art ELN system
  • In tight timescales
  • With limited budget
  • In the real world
  • That the users like
  • And will serve you for many years


Introduction

• About me
  • Started working with ELNs in ’96
  • President & Co-founder of Amphora
  • IT background
• First ELN was enterprise-scale ELN for Kodak
  • Worldwide, 1,000’s of users, diverse user base
  • Completely Electronic Records (no paper)
• After a long & windy road
  • New products, lots more deployments, many industries
  • Certain amount of realism about ELN implementation
  • Provide Patent Evidence Creation & Preservation Systems
  • Work with a wide variety of “ELN” systems etc.
  • Now based in the US & UK


This presentation

• You can download a copy of this presentation from our web site


Why does architecture matter?

• A good architecture can help
  • Integrate “Best of breed” tools with existing investments
  • Allow you to split the project into manageable pieces
  • Ensure you don’t get “captured” by the vendor
  • Help your system withstand the ravages of time
  • Keep your TCO down
• A bad architecture will hurt
  • Reliability, Scalability problems
  • Reduce your options going forward
  • Force you into “Big bang” project

• Some random thoughts on architecture


ELN architecture

• Major issues
  • Diversity & Flexibility
  • Project size/Justification/ROI
  • Creating & Preserving Evidence for Patents
  • Need for long term access to ELN contents
  • Scalability
  • Web-based systems
  • How your network can help you
• Trends
  • Integration methods
  • Open Source
  • In the lab
  • Ones to watch


Diversity & Flexibility

• “Science” covers a wide variety of activity
  • Each of these is served by its own industry
  • Improvements in each area need to happen at their own pace
• Things change
  • Different techniques
  • New data types
  • Another R&D centre
  • New devices for use in the lab
• The very essence of “Research” is to change the way you work
• How do we design an ELN which can accommodate these changes?


Dealing with change

• Build on other projects & integrate
  • If it can be done within another project, then do so
  • Keeps your life simpler and more focused, clear aims
  • Those other projects can proceed according to the rhythm and needs of the specific area
• Where possible employ loose coupling between systems
  • Message passing reduces implementation complexity
  • SOAP/OLE/XML etc.
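As a rough illustration of loose coupling through message passing, the sketch below has an instrument system announce a finished result as a small XML message, and an ELN-side handler that reads only the fields it needs. The element names (`resultReady`, `experimentId`, `resultPath`), the identifiers, and the path are hypothetical and chosen only for this example; a real deployment might carry the same payload over SOAP, a file drop, or email.

```python
import xml.etree.ElementTree as ET


def build_message(experiment_id: str, instrument: str, result_path: str) -> bytes:
    """Serialize a 'result ready' notification as a small XML document."""
    root = ET.Element("resultReady")  # hypothetical message type
    ET.SubElement(root, "experimentId").text = experiment_id
    ET.SubElement(root, "instrument").text = instrument
    ET.SubElement(root, "resultPath").text = result_path
    return ET.tostring(root, encoding="utf-8")


def handle_message(payload: bytes) -> dict:
    """ELN side: pick out the fields it cares about and ignore everything else."""
    root = ET.fromstring(payload)
    return {
        "experiment_id": root.findtext("experimentId"),
        "result_path": root.findtext("resultPath"),
    }


message = build_message("EXP-0042", "hplc-3", "results/EXP-0042/run1.xml")
received = handle_message(message)

# A newer sender adds a field; the older receiver keeps working unchanged.
extended = message.replace(b"</resultReady>", b"<operator>jd</operator></resultReady>")
assert handle_message(extended) == received
```

Because the receiver depends only on the message contents, either side can be upgraded or replaced at its own pace as long as the agreed fields stay stable.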


Loosely-Coupled Systems Keep You Sane


Project size/Justification/ROI

• Two approaches
  • Either attempt to justify the whole ELN in one go (“Big bang”)
  • Or Phased
• Divide the project into phases
  • Each involves a smaller investment (risk)
  • With a corresponding payoff
• Move forward at a pace that’s comfortable for the business


Phased ELNs

• Historically this was very difficult to do with ELNs
  • Record keeping
  • Integration with other systems
• Needs to be designed into the project (& product) from the start
  • Patent evidence creation/preservation system
  • Generic science-neutral platform (can often be your existing IT infrastructure)
  • Integrate/collaborate with discipline-specific software
• When you can do it, makes a huge difference
  • Can start at a departmental level if needed
  • Asking the business to take a small risk each time


Creating & Preserving Evidence for Patents

• Specialized area with very specific (and unique) considerations

• Best done separately from science-specific ELN tools
  • Hard to reconcile requirements of science and records in one system
  • You’ll often have a number of science-focused systems, yet want only one Patent evidence system
  • Run by a small group of people who know they’ll end up in court
  • Reduce risks & discovery costs

• You can have an “Electronic” notebook for the scientist and still create a paper record


Paper or Electronic?

• The choice often comes down to
  • Comfort
  • Practicality
  • Cost


[Chart: System Cost (vertical axis) for Paper and Electronic, with horizontal-axis ticks at 10, 100, 500 and 1000]


Long term access to ELN content

• Partly this is a records management issue
• But there’s a heavy technical component
  • What format you store your data in
  • How you store your data
  • Metadata

• You need to make Open Data formats part of your purchasing requirements


“Good” (open) file formats

• Publicly documented
• Legally unencumbered
  • No patents, copyright concerns etc.
  • Any patents or copyright must be in the public domain
• Ideally, self documenting (XML is a good start)
• Degrade gracefully
  • If you can’t read the data, at least you can see a picture
• Based on more open, primitive formats where possible
• At least two implementations of readers, one of which is Open Source
• Widely used (W3C or IETF standards are good signs)
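One way to picture "degrade gracefully" is a record stored as a self-documenting XML file with a PNG rendition alongside it. The sketch below assumes that layout; the file names `spectrum.xml` and `spectrum.png` and the `<point>` element are hypothetical. If the data can no longer be parsed, the reader falls back to pointing at the picture.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def read_record(folder: Path):
    """Return ("data", values) if the XML is usable, otherwise ("picture", path)."""
    try:
        root = ET.parse(str(folder / "spectrum.xml")).getroot()
        values = [float(point.text) for point in root.iter("point")]
        return ("data", values)
    except (ET.ParseError, OSError, TypeError, ValueError):
        # The data is missing or unreadable: fall back to the stored rendition.
        return ("picture", folder / "spectrum.png")


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    xml_file = folder / "spectrum.xml"

    xml_file.write_text("<spectrum><point>1.5</point><point>2.5</point></spectrum>")
    good = read_record(folder)

    xml_file.write_text("<spectrum><point>1.5</poi")  # simulate a damaged file
    degraded = read_record(folder)
```

Here `good` carries the parsed values, while `degraded` names the picture a person could still look at.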


Data formats for the long term

• Good
  • For text: Plain ASCII, Unicode, HTML, possibly RTF
  • For graphics: PNG, SVG
  • For structured data: XML
  • To preserve appearance: PDF
• Worry about
  • Storing files in databases
    • The database file format is probably undocumented
    • Store objects on the file system and use the database to point to them
  • Anything that is proprietary - there’s no excuse for it, and it dramatically increases your risk
  • Binary files generally
  • Mixing content in files (e.g. embedding XML in PDF)
  • Proprietary digital signatures
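The advice to keep objects on the file system and let the database point to them can be sketched as follows. The table layout, the checksum column, and the file naming scheme are assumptions made for this illustration, with SQLite standing in for whatever database is actually in use.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def store_object(db: sqlite3.Connection, store: Path, name: str, content: bytes) -> str:
    """Write the content as an ordinary file; the database records only where it is."""
    digest = hashlib.sha256(content).hexdigest()
    path = store / f"{digest}_{name}"
    path.write_bytes(content)
    db.execute(
        "INSERT INTO objects (name, path, sha256) VALUES (?, ?, ?)",
        (name, str(path), digest),
    )
    db.commit()
    return digest


def load_object(db: sqlite3.Connection, name: str) -> bytes:
    """Follow the pointer and check the file still matches its recorded checksum."""
    path, digest = db.execute(
        "SELECT path, sha256 FROM objects WHERE name = ?", (name,)
    ).fetchone()
    content = Path(path).read_bytes()
    if hashlib.sha256(content).hexdigest() != digest:
        raise ValueError(f"{name} no longer matches its recorded checksum")
    return content


with tempfile.TemporaryDirectory() as tmp:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE objects (name TEXT PRIMARY KEY, path TEXT, sha256 TEXT)")
    original = b"<experiment><title>Run 1</title></experiment>"
    store_object(db, Path(tmp), "run1.xml", original)
    loaded = load_object(db, "run1.xml")
    db.close()
```

The stored file stays readable with ordinary tools even if the database is lost or its format becomes unreadable; the database only adds lookup and integrity checking.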


IP concerns & data formats

• Companies have always used Proprietary Data Formats as a competitive weapon

• Companies are waking up to the use of IP tools (licenses, patents, copyrights) to reinforce their control over data formats

• Just because a format is published doesn’t mean it is open
  • The Microsoft Office XML formats are a particularly bad example
  • Right now it looks positively radioactive
  • They’re being very careful what they say which indicates to me they’re planning something
  • http://www.groklaw.net/article.php?story=20050330133833843
  • (see section: 4. Dissecting Microsoft’s “Patent License”)


Standards

• There are so many to choose from!
• Two key ways of generating “Standards”
  • De Facto - dominant supplier/format
  • De Jure - committee based
• Who gets to “bless” a standard?
• What makes a “good standard”?
• De Jure process has difficulty keeping up with the real world
• De Facto process has risk of lock-in
• Pragmatic approach
  • Expect your suppliers to use open file formats
  • If there is an acceptable standard, use it
  • Make sure you are using the right kind of format for each purpose


Records considerations

• Not all the “Stuff” that’s generated during the research process is the same
  • Some of it needs to be kept for a long time
  • Some is only useful for the moment
  • Some will benefit anyone
  • Some is only really useful for the person who created it (using specialized tools)
• Some material is suitable for long term preservation, some isn’t
• You can go crazy getting into this in too much detail
• But you also need to make sure your tools and processes do allow you to manage the data/records you’re creating


Scalability

• Geographical space
  • In wide area networks, latency becomes the most noticeable issue
  • Over multiple timezones, acceptable “Maintenance Windows” disappear
• More data
  • Number of data items
  • Size of individual data items
• Number of users
  • Larger populations generally mean more disparate requirements
  • How many people will get upset if the system goes down


Latency

• The science-specific “Deep” systems
  • Often highly interactive
    • Lots of round trips to the server for data etc.
    • This is what makes them cool
  • You can’t beat the speed of light (and network hardware adds significant latency)
  • Therefore need to have a server close to the end user
  • Federation will give you a single overview
• “Broad” systems have different usage characteristics
  • Very much like a normal web site, latency is much less of a problem
  • Very easy to have one system for worldwide use, even for large companies
  • Building large systems is quite easy
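A back-of-the-envelope calculation shows why chatty "Deep" systems want a nearby server while "Broad" ones do not. All the figures here are illustrative assumptions, not measurements: light in optical fibre is taken as roughly 200,000 km/s (about two thirds of its speed in vacuum), the distances are 1 km and 8,000 km, and an interaction is assumed to need 50 round trips for a "Deep" client and 2 for a web-style page load.

```python
SPEED_IN_FIBRE_KM_PER_S = 200_000  # assumed: roughly two thirds of c


def round_trip_ms(distance_km: float, equipment_overhead_ms: float = 0.0) -> float:
    """Propagation time there and back, plus any per-trip network hardware overhead."""
    return 2 * distance_km / SPEED_IN_FIBRE_KM_PER_S * 1000 + equipment_overhead_ms


def interaction_delay_ms(round_trips: int, distance_km: float,
                         equipment_overhead_ms: float = 0.0) -> float:
    """Total waiting time when each round trip must finish before the next starts."""
    return round_trips * round_trip_ms(distance_km, equipment_overhead_ms)


deep_local = interaction_delay_ms(round_trips=50, distance_km=1)      # well under 1 ms
deep_remote = interaction_delay_ms(round_trips=50, distance_km=8000)  # about 4 seconds
broad_remote = interaction_delay_ms(round_trips=2, distance_km=8000)  # about 160 ms

print(f"deep, local server:   {deep_local:.1f} ms")
print(f"deep, remote server:  {deep_remote:.0f} ms")
print(f"broad, remote server: {broad_remote:.0f} ms")
```

Under these assumptions the propagation delay alone makes the interactive client feel sluggish across continents, before any hardware overhead is added, whereas the web-style interaction stays comfortable from a single worldwide server.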


Web-based systems

• “Web based” has become a bit of a marketing tool
  • Generally thin clients offer a lower TCO
  • And hence IT like them
• In practice, most science-supporting ELN front ends will be delivered as a “thick” client
  • There’s a reason it’s called a browser
  • Wrapping an OLE object in IE is still “thick”
• However, “Ajax” systems like GMail and Google Maps show just what you can do with a web-based system
• Web based systems should expose a sensible URL interface


How your network can help you

• There’s a whole load of useful network services and interfaces that large companies have
• Useful ones
  • Single Sign On
  • LDAP
  • Printer/Fileserver etc.
  • Security/Status monitoring etc.
• Beware of Central Digital Signature Infrastructure
  • Mixing vulnerabilities - leaves you open to accidents
  • Often not designed for long term use


ELN architecture

• Major issues
  • Diversity & Flexibility
  • Project size/Justification/ROI
  • Creating & Preserving Evidence for Patents
  • Need for long term access to ELN contents
  • Scale
  • Web-based systems
• Trends
  • Integration methods
  • Open Source
  • In the lab
  • Ones to watch


Integration methods

• RPC-like mechanisms
  • Service Oriented Architecture
  • SOAP
  • REST
• Text file passing (files, email, etc.)
• URL launching
  • Often overlooked, but very powerful
• What’s important
  • Loose-coupling
  • Open, lightweight systems
  • Consistent, stable keys
  • Stable URL (& domain) space
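URL launching relies on stable keys and a stable URL space, which can be sketched as a pair of functions that map keys to URLs and back. The host `eln.example.com`, the path layout, and the key formats are hypothetical, picked only to make the idea concrete.

```python
from urllib.parse import quote, unquote, urlsplit

BASE_URL = "https://eln.example.com"  # hypothetical, and meant never to change


def experiment_url(notebook_id: str, experiment_id: str) -> str:
    """Build the one canonical URL for an experiment from its stable keys."""
    return (
        f"{BASE_URL}/notebooks/{quote(notebook_id, safe='')}"
        f"/experiments/{quote(experiment_id, safe='')}"
    )


def parse_experiment_url(url: str) -> tuple:
    """Recover the keys from a URL, so other systems can store and exchange links."""
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "notebooks" or parts[2] != "experiments":
        raise ValueError(f"not an experiment URL: {url}")
    return unquote(parts[1]), unquote(parts[3])


url = experiment_url("NB-2005-017", "EXP 42")
keys = parse_experiment_url(url)
```

Any other tool, such as an inventory system or a report, could then open the experiment simply by launching that URL in a browser, for example with Python's `webbrowser.open(url)`, without linking against the ELN at all.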


Open Source

• Definitely one to watch
• Not the “Free” lunch you might think, but a pragmatic business tool
• Examples
  • Linux
  • Postgres
  • JBoss, Tomcat etc.
  • Ghostscript
• Open Source is part of everyone’s infrastructure
• Make sure you can run your systems on a variety of platforms


Why?

• Good for records
  • Gives you top-to-bottom control
• Good for TCO
  • We’re finding the Open Source infrastructure easier to set up and more reliable than proprietary alternatives
• Enables a better solution
  • Transparent systems mean you can do things the original designers didn't think of
  • This is especially important for ELNs


Data point

• This is just our experience offering people alternatives for the server portion
  • 2000 - “What's Open Source? What’s Linux?”
  • 2001 - No way!
  • 2002 - some pilots underway, some acceptance
  • 2003 - majority of installations are Open Source infrastructure
  • 2005 - we’re wondering where Windows is
• We’re not abandoning proprietary infrastructure
  • But it is clear that Open Source is getting serious consideration
• Seeing a migration away from proprietary infrastructure to Open Source


In the lab

• ELN use in the lab is a hard problem
  • Tablets, Laptops, Palmtops etc. don’t seem to be working
• What does seem to work
  • Small form-factor PCs on the bench
  • Remote Desktop & Citrix


Ones to watch

• Technology
  • XML generally
  • Web Services
  • Bluetooth and WiFi
  • RSS
  • OpenOffice
  • Jabber (as computer messaging and IM framework)
• Trends
  • File format nasties
  • DMCA and other copyright legislation