KNIS Open Architecture v0.1

SUSA Architecture Team – Working Draft – Version 0.1 – 19 October 2010

Open Architecture Project:

A Key National Indicator System for the United States

Managed by

The State of the USA -- Working Draft -- Version 0.1

Please provide comments on this draft to [email protected]

KNIS Draft Architecture by the State of the USA is licensed under a

Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Supported by

The John D. and Catherine T. MacArthur Foundation

October 19, 2010

To Whom It May Concern,

This document is a working draft (version 0.1) of an enterprise architecture for a Key National Indicator System for the United States. It is being published solely by the State of the USA in concert with its technical advisors for open comment. It is specifically intended for technical audiences – in all sectors and at all levels of our society.

Its purpose is not to finalize a design but to start a specific dialogue over the next year that will underpin important technical decision-making. (Please see www.stateoftheusa.org for more information on the Key National Indicator System and the State of the USA.) Hence, it is not a consensus document. There is ongoing debate and discussion amongst our team and advisors on dozens of issues. However, it is time to open up the process and expand involvement in the project with the publication of this initial version.

The architecture outlines key principles but does not suggest product selection. It generates working hypotheses but does not define operational specifications. It has a five-year planning horizon but is only a first step toward designing an official Key National Indicator System implementation. It represents the hard work of a dozen individuals but anticipates engaging hundreds from around the country in a dialogue based on this document.

We are actively seeking critiques, ideas and suggestions about purpose, structure, content and process. Out of this dialogue will come the requirements, design and specifications for how best to start and then evolve an architecture for a Key National Indicator System.

The evolution of democratic society has always been about striving to achieve increasing specificity about progress and higher degrees of transparency. These mutually reinforce one another to accelerate learning and improve accountability for the use of scarce resources as our nation grows. The State of the USA is grateful to the John D. and Catherine T. MacArthur Foundation for their visionary support of this activity.

We have concluded that the design challenge for a Key National Indicator System can only be met by combining an open and inclusive approach with guidance from individuals who have track records in large-scale, complex enterprise and systems development. For this reason, the State of the USA Open Architecture Project has been especially fortunate to be guided in this early stage of our process by a distinguished group of technical advisors, who are listed at the conclusion of this letter.

Why is openness so important? Our intent is to make the asset that is built for the nation accessible to the widest range of possible users, from individuals to institutions and from the public to application developers. This is best achieved through a transparent and collaborative process that involves representatives from diverse user and stakeholder communities. Hence, this architectural document anticipates discussions about open standards, open communities and open source software that would all be a vital underpinning of a KNIS.

If you have comments, please contact us at [email protected]. During the fall of 2010, the State of the USA will host two national webinars, one in November and one in December. Final dates will be posted on the SUSA website at the same time this document is published. To register for either of these webinars, please send an email with the subject line "ARCHITECTURE WEBINAR" to [email protected]. Each of these discussions will be a chance to have more dynamic interactions with SUSA technical advisors on topics raised in this document. It is also our intention to expand this group of advisors over the coming year to increase its scope, depth and diversity.

Please join all of us on this journey to create a Key National Indicator System for the United States. It is one that cannot help but advance our capability to answer vital questions about how to define, measure and communicate about progress – or the lack thereof – in entirely new ways.

Most Sincerely,

Christopher Hoenig

President and CEO

State of the USA Technical Advisors

Bill Allman, Vice President, E-Media for Bonniercorp.com and noted enterprise information dissemination and visualization thought leader.

Peter Blair, Executive Director of the Division of Engineering and Physical Sciences at the National Research Council.

George Brucia, experienced enterprise technologist with a long history of fielding well-engineered, user-focused systems of high quality, at large scale at both public and private enterprises.

Hank Conrad, Managing Partner of CounterPoint Corporation and expert in IT business alignment, systems integration, program management, outsource management, process improvement, relationship management, change management, and new technology introduction.

David Epstein, Chief Operating Officer, MAK. A senior technology executive with leadership experience in IBM, U.S. military, global research, and national health-related information technology organizations on bio-surveillance, adverse drug and quality of care events, intelligent building and city infrastructure, advanced water management, and market analysis.

Larry Filetti, Managing Partner, 716 Group, Inc and proven enterprise technology and strategy executive known for enterprise architecture, IT transformation, technology introduction and delivering systems of high usability, including business intelligence and enterprise IT to organizations like Argonne National Labs, First National Bank of Chicago and McDonald's.

Jamie Gaughran-Perez, Partner at Threespot. Creator of user-focused web-delivered content and systems including delivery of highly scalable solutions to clients such as Brookings, the NFL, national TV programs and the U.S. Congress.

Scott Gilkeson, Chief Data Officer, State of the USA, and Website Development Team.

Bob Gourley, Chief Technology Officer, Crucial Point LLC and editor, CTOvision.com. Project Lead, SUSA KNIS Architecture.

Marvin (Marv) F. Langston, Principal, Langston Associates, and former Deputy Chief Information Officer for the Department of Defense.

Howard Parnell, Vice President, Content and Creative at the State of the USA.

Ron Ponder, EVP and CIO, Wellpoint. Globally known business and IT executive, former CIO at Federal Express, Sprint and AT&T, expert in large scale operational implementation and world class business performance.

Ben Shneiderman, Member of the National Academy of Engineering. Professor of Computer Science and founding Director of the Human-Computer Interaction Laboratory at the University of Maryland and globally known expert in creativity and cognition, information visualization, and information technology.

Bill Vass, Globally known IT executive, former CIO of the Office of Secretary of Defense, former CIO of Sun Microsystems, former President of Sun Federal. Known for designing and building highly scalable, fast, interoperable user-focused systems.

Table of Contents

State of the USA Technical Advisors

Introduction and Background
    About This Document
    Sections of the Draft KNIS Architecture

Foundations of the KNIS Architecture
    The KNIS Mission
    Audience for this Architecture Document
    Architecture Defined
    Terminology
    The KNIS Value Proposition
    The KNIS User Communities and Their Needs
    KNIS High Level Requirements
    Design Principles

Conceptual Architecture

Logical Architecture
    High-level Guidelines
        Reuse and Purchase Before Developing
        Open Systems and Open Standards
        Vendor Specific Extensions
        Separation of Concerns
        Decomposition
        Systemic Qualities
        Business Continuity
        Architecting for Security
        Architectural Patterns
        Architecting for Usability
    Enterprise Tier
        KNIS core processes
        KNIS Architectural Governance
        Architecture, Design Guidance, Implementation Directives
        Contributing Back to the Open Source Community
    Client Tier
        Thin Client Rule
        Client Mobility Rule
        Disconnected Client Rule
        Client Applet Rule
        Client Usability Rule
    Presentation Tier
        Localization (L10N) Rule
        Internationalization (I18N) Rule
        Accessibility Rule
        End-User Preference Configuration Rule
        End-User Role Identification Rule
        Field Validation Rule
        Presentation Tier Standards Rule
        Active Content Rule

    Business Processing Tier
        Services Overview
        Shared Services
    Data Resource Tier
        The KNIS website will require at least the following data resources:
        Estimates of Data Size and Scope
        Data Access
        Data Persistence
        Data Labels
        Data in the cloud
        Data Registry
        Content Management Systems
        Data Provenance
        Data Value Add
    Integration Tier
        Data Schema, Format and Semantics
        Batch Data Transfers
        Syndication of Value Added Content
        Syndication and Social Media
        Discovery of data sets and other content
        Interaction Models
        Direct and Indirect Integration
        Third-Party Application Integration

Technical Architecture
    High-level Guidelines
        Operating System Guidance
        Designing for Flexibility in use of new Cloud Capabilities
    Technology Architecture of Client Tier
        Browser
        Consumer device apps
    Technology Architecture of Presentation Tier
        HTML and CSS
        XML and XSLT
        Presenting data via widget
        Application Frontends
        User and Usability Testing
    Technology Architecture of the Business Processing Tier
        Web Services
        SOAP and REST
        Assertion of authorization in a Web Services environment
        Web Services Continued
        Application Business Web Services
        Web Service Registry
        Data Registry Choices
    Technology Architecture of the Data Resources Tier
        Relational Database Management Systems (RDBMS)
        Directory Servers
        Object-Oriented Databases (OODB)
        XML Database
        File Systems

        RDBMS or Directory Servers
        Java Database Connectivity (JDBC)
        Data Storage
        Data Source Types
        Content Management System
        Content Abstraction Layer
    Technology Architecture of the Integration Tier
        Open System Web Server
        Open System Application Server
        Open System Portal Server
        Relational Database Server
        Monitoring Products
        Network Attached Storage (NAS)
        Storage Area Network (SAN)
        Virtual Private Network (VPN)
        Monolithic Applications and Legacy Applications
        Syndication of Value Added Content
        Integration Testing
        Content Delivery Services

Technology Trends to Watch

The Current SUSA Beta Architecture

Glossary

Architecture Resources

Table of Standards

About This Architecture

Introduction and Background

This document is a working draft (version 0.1) of an enterprise architecture for a Key National Indicator System for the United States. It is being published solely by the State of the USA in concert with its technical advisors for open comment. It is specifically intended for technical audiences – in all sectors and at all levels of our society. Its purpose is not to finalize a design but to start a specific dialogue over the next year that will underpin important technical decision-making. Hence, it is also not a consensus document. There is ongoing debate and discussion amongst our team and advisors on dozens of issues. However, it is time to open up the process and expand involvement in the project with the publication of this version 0.1.

In preparation for continued support of a Key National Indicator System (KNIS) for the United States, the State of the USA has drafted the architectural vision, principles, concepts and plans relevant to the implementation of a KNIS. The purpose of this document is to leverage shared assets and accelerate learning among the many participants in a national indicator system to maximize its full potential for service to the American people. This version outlines key principles but does not suggest product selection. It generates working hypotheses but does not define operational specifications. It has a five-year planning horizon but is only a first step toward designing an official Key National Indicator System implementation. It represents the hard work of a dozen individuals but anticipates engaging hundreds from around the country in a dialogue based on this document.

We are actively seeking critiques, ideas and suggestions about purpose, structure, content and process. Out of this dialogue will come the requirements, design and specifications for how best to start and then evolve an architecture for a Key National Indicator System.

The evolution of democratic society has always been about striving to achieve increasing specificity about progress and higher degrees of transparency. These mutually reinforce one another to accelerate learning and improve accountability for the use of scarce resources as our nation grows. The State of the USA is grateful to the John D. and Catherine T. MacArthur Foundation for their visionary support of this activity.

The mission of the State of the USA is to help the American people assess the progress of the nation for themselves, using the nation’s best quality measures and data. Its vision is to make these available to the public on the web as a free service in such an easily usable form that they become a shared frame of reference for civic debate on whether we are, in fact, making progress on the major issues we face.

As a non-profit, non-partisan institution, it conducts work in a collaborative and transparent fashion, involving a diverse range of individuals and institutions. The State of the USA's founding in 2007 consciously built on 20 years of work by millions of Americans who had already created a patchwork of key indicator systems at the neighborhood, city, county, regional and state levels. (For more information, including a history of the effort, please see www.stateoftheusa.org.)

Starting in 2010, the growing movement by Americans to assess progress with key indicators officially reached the national level. After many years of development and bipartisan support, a Key National Indicator System for the United States has been mandated by law (P.L. 111-148, sec. 5605). A bipartisan Commission on Key National Indicators is being constituted by Congressional leadership of both parties. That Commission will then negotiate an agreement with the National Academy of Sciences to implement a web-based KNIS in partnership with a non-profit institute, the State of the USA.

Although these relationships are still being formalized, preparation has begun in earnest, which is the reason for SUSA’s publication of this document. As a public/private partnership, resources and talent from both the public and private sector must be involved early in the process of preparation. This document has not been reviewed or approved by the National Academy of Sciences, the National Academy of Engineering, the Institute of Medicine or the National Research Council.

A Key National Indicator System can help millions of Americans become better informed about the progress of the United States on a wide range of issues, from education to innovation, from the environment to the economy, and from families and children to health. The question this document begins to address is how best to design this system for the country.

For clarity, this is an architecture for a "national" system, not a "governmental" system. It must take account of and complement efforts in government. But a national system in our society must involve the government, business, media, non-profit and academic sectors. It must involve government at the federal, state and local levels as well as international organizations that collect and publish data for purposes of comparing the U.S. to other countries.

In addition to such a broad scope, the design task is made doubly challenging by a technology environment with a dizzying rate of evolution and innovation. The design must optimize performance and openness, innovation and continuity for the nation, as well as balancing hundreds of other potential tradeoffs. It must support continuing, high quality production while keeping pace with the external technical environment.

We have concluded that this design challenge can only be met by combining an open and inclusive approach with guidance from individuals who have track records in large-scale, complex enterprise and systems development. For this reason, the State of the USA Open Architecture Project has been especially fortunate to be guided in this early stage of our process by a distinguished group of technical advisors.

Why is openness so important? Our intent is to make the asset that is built for the nation accessible to the widest range of possible users, from individuals to institutions and from the public to application developers. This is best achieved through a transparent and collaborative process that involves representatives from diverse user and stakeholder communities. Hence, this architectural document anticipates discussions about open standards, open communities and open source software that would all be a vital underpinning of a KNIS.

If you have comments, please send them to [email protected]. The goal of this project is to continually expand participation in order to increase its scope, depth and diversity. At the conclusion of this version, the State of the USA technical advisors identified several areas that are high priority for inclusion in our work program for the next iteration of this document – version 0.2. Hence, these are of special interest for those providing feedback. Those areas are:

-- Refinements in audience priority and segmentation

-- Increased detail in user/stakeholder requirements definition

-- Increased detail of enterprise process design

-- Specificity of user experience and information architecture

-- Specificity of data services, curation and integration

-- More detailed roadmap for implementation over time

-- Expanded references to key documents

-- Expanded references to comparable sites/installations

-- Increased treatment of security and privacy considerations

About This Document

This version 0.1 of the KNIS architecture was prepared by an interdisciplinary team of issue experts, technologists, program managers and enterprise architects, supported with initial input from KNIS stakeholders. In its current state, it sets aspirational goals for openness and is intended to provoke thought and debate about how best to design and implement national and community indicator systems. The State of the USA would like to see this framework develop into a system that others could adopt and adapt, thus creating synergy and shared value. The aim during the coming year is to evolve a collaborative design which can be borrowed and built upon by the many organizations required to support a KNIS. The architecture should eventually also provide a frame of reference for third-party developers seeking to create independent end-user capabilities that may be enabled by a KNIS.

Caveats and Context:

- This is a living document in continuous development while input is being solicited from a broad community. Since it is an initial working draft, please consult with SUSA in the future to ensure you are working from the latest version.

- This document is based on an evolving set of user requirements, of which only a brief summary is provided here. SUSA is currently running both a public site www.stateoftheusa.org and a more advanced private beta implementation to gain input on requirements definition, evolve design principles, test alternative components and assess performance. If you are interested in becoming a beta user, please sign up on the public site.

- This is a high-level design document and intentionally does not provide the specificity required to implement a KNIS. Architectures with progressively more detailed specifications will flow from this one.

- This document is written with a five year forward planning horizon in mind. It anticipates multiple specific implementations within that five year horizon as well as continuing alteration and adaptation of the architecture based on evolving requirements, technologies and key external factors in the market space.

- This document provides best practices, lessons learned and architectural decisions we believe are right for enterprise capabilities of this nature, but customization is also required prior to establishing any implementation.

- Designs and implementations that flow from this architecture are intended to support the Key National Indicator System articulated in P.L. 111-148, sec. 5605. However, the KNIS is still in the process of formalizing governance and management processes. Hence, this draft has not been reviewed or approved by any institution established by or named in that law.

Sections of the Draft KNIS Architecture

Key sections of this draft are:

A Conceptual Architecture View: This section identifies the main components of the architecture and provides important context.

A Logical Architecture View: The logical architecture section explains component functions and their interrelationships in greater detail, which begins to provide directional guidance for implementation. This guidance is reflected and embodied in the KNIS technical architecture, but it is also repeatable in other technical architectures.

A Technical Architecture View: The purpose of a technical architecture is to map defined components from the logical architecture to specific implementation technologies. These technologies are generally layered and support standard interfaces that allow them to be used in a "plug and play" manner.

1 For more on P.L. 111-148 see PDF at: http://www.stateoftheusa.org/assets/Key%20National%20Indicators%20Act%20of%202008.pdf

In short, the conceptual architecture is initial context, the logical architecture provides guidance and rules regarding logical component segmentation, and the technical architecture provides more specific standards and technologies.

A graphical overview of the draft KNIS architecture is provided in Figure 1 below. This graphic is revisited in the conceptual architecture section, modified and extended in the logical architecture section, and expanded in detail in the technical architecture section.

Figure 1: Graphical Overview of the Draft KNIS Architecture

Foundations of the KNIS Architecture

This section of the draft KNIS architecture provides an overview of the background, requirements and guidance necessary to ensure the components of the KNIS are aligned with the mission and accomplish the objectives outlined in P.L. 111-148.

The KNIS Mission:

A Key National Indicator System can help Americans better assess the position and progress of the nation for themselves, freely and easily, with the best quality measures and data on the most important issues facing the country.

The United States faces many systemic issues, with few systemic ways to measure and manage the progress of our society. The architecture presented here is designed to address this situation by enabling a source, free and easily usable for millions, of high quality measures on the nation's major issues. The impact can be seen in better framed problems, increased understanding of what we know and of what works, more informed choices, and improved resource allocations.

Unique KNIS attributes:

Breadth – A KNIS addresses all topics relevant to a society (e.g., economy, innovation, families, youth and children, education, environment, health) with a dynamic topic structure that can be extended to meet consumer demand or evolving concepts.

Focus – A KNIS presents carefully selected issue frames, indicators and datasets from the highest quality sources for each topic—on the order of tens of measures, rather than hundreds or thousands. The selected indicator set will evolve, and individual indicators may appear under multiple topics. The issue frames, indicators and data sets will be selected in an editorial process designed by the National Academy of Sciences, the National Academy of Engineering, the Institute of Medicine, the National Research Council and the State of the USA.

Consistency – All of the indicators presented by the KNIS will have similar functionality and expression. Once site visitors understand a simple and intuitive way to interact with one measure, they will know how to interact with any of them and explore interrelationships. The underlying data will be available for downloading, either through the user interface or via a standard internet protocol.

Multi-dimensionality – Each indicator will be available with as much detail as possible within quality constraints along four major dimensions: time, geography, demographics (or other appropriate aspect) and conceptual decomposition. At times, this will mean using data from various different sources, which may not be strictly comparable, for the same measure. For example, obesity is measured clinically by a small survey which can only provide data at the national level. In order to show obesity at the state level, a less reliable but much larger survey must be used. Data quality is a component of this in all dimensions.
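As an illustration of the four dimensions and the role of data quality, the following is a minimal sketch, not a KNIS data model. All names (Observation, best_observation, the source labels and the quality score) are hypothetical, and the values are placeholders rather than real statistics. It shows how one measure might be served from different sources depending on the geography requested, as in the obesity example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """One data point for an indicator (all field names are illustrative)."""
    indicator: str      # e.g. "obesity_rate"
    year: int           # time dimension
    geography: str      # geography dimension, e.g. "US" or a state code
    demographic: str    # demographic dimension, e.g. "all"
    component: str      # conceptual decomposition, e.g. "total"
    value: float
    source: str         # which survey or dataset supplied the value
    quality: int        # hypothetical quality score, higher is better


def best_observation(observations: List[Observation], indicator: str,
                     year: int, geography: str, demographic: str = "all",
                     component: str = "total") -> Optional[Observation]:
    """Return the highest-quality observation for the requested cell, if any."""
    cell = (indicator, year, geography, demographic, component)
    matches = [o for o in observations
               if (o.indicator, o.year, o.geography,
                   o.demographic, o.component) == cell]
    return max(matches, key=lambda o: o.quality, default=None)


# Placeholder values only, not real statistics.
observations = [
    Observation("obesity_rate", 2009, "US", "all", "total", 30.0, "clinical_survey", 3),
    Observation("obesity_rate", 2009, "US", "all", "total", 25.0, "large_survey", 2),
    Observation("obesity_rate", 2009, "CA", "all", "total", 24.0, "large_survey", 2),
]
```

In this sketch the national figure comes from the higher-quality clinical source, while the state figure falls back to the larger survey because it is the only source available at that level. Keeping the source and quality on each observation lets the difference be shown to the user.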

Audience for this Architecture Document

The KNIS architecture was written for designers and developers of enterprise, information and technology systems. These are the people who will create and deliver a capability. It was also written for those who will oversee the production of the capability, including stakeholders and mission partners.

Architecture Defined

"Architecture serves as the blueprint for both the system and the project developing it, defining the work assignments that must be carried out by design and implementation teams. The architecture is the primary carrier of system qualities, such as performance, modifiability, and security, none of which can be achieved without a unifying architectural vision. Architecture is an artifact for early analysis to make sure that the design approach will yield an acceptable system. Architecture holds the key to post-deployment system understanding, maintenance, and mining efforts. In short, architecture is the conceptual glue that holds every phase of the project together for all its many stakeholders." – From the Carnegie Mellon Software Engineering Institute, http://www.sei.cmu.edu/architecture

While there is no single, widely adopted definition of architecture, the many definitions available have a great deal in common, and SUSA's approach to architecture is consistent with that of SEI, above. Figure 2 provides a visual guide to what is included and not included in this architecture.

Figure 2: Scope of the KNIS Architecture

Terminology

A few terms are used extensively throughout this document. To provide a working understanding of these terms, they are defined briefly below. For more detailed definitions, refer to the Glossary.

Principles, Strategies and Recommendations

These terms are often confused. As they apply to enterprise architectures, their definitions are as follows:

- A principle is a high-level rule that is well established and not likely to change over time.

- A vision is a desired future state.

- A strategy is a means for achieving a vision, often a medium-level rule or guideline that applies to a specific area of the architecture. Strategies can evolve over time as the technological landscape changes.

- A recommendation is a specific "best practice" that has proved to be effective and desirable based upon past experience.

Component, Service

A component or service is a unit of software with a single purpose, which has an interface, and interacts with other components and services. Services are distinguished from components in that they tend to run in their own process, whereas components may simply be software libraries. For a more detailed definition of service, refer to the W3C standard.

Interface

As defined by CMU/SEI-2002-TN-015, an interface is "a boundary across which two independent entities [components] meet and interact or communicate with each other."

The specification of interfaces at the architectural level is extremely important to ensuring that components can be built independently yet work correctly together.
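To make the point about independently built components concrete, here is a minimal sketch under stated assumptions. The interface IndicatorStore, its method get_value, the two implementations and the sample record are all hypothetical and are not part of any KNIS specification. The caller, describe, is written only against the interface, so either implementation can be supplied without changing it.

```python
import csv
import io
from abc import ABC, abstractmethod


class IndicatorStore(ABC):
    """Hypothetical interface: the boundary that other components program against."""

    @abstractmethod
    def get_value(self, indicator: str, year: int) -> float:
        """Return the value of an indicator for a year, or raise KeyError."""


class InMemoryStore(IndicatorStore):
    """One implementation: values held in a dictionary."""

    def __init__(self, data):
        self._data = dict(data)

    def get_value(self, indicator, year):
        return self._data[(indicator, year)]


class CsvStore(IndicatorStore):
    """A second, independently written implementation: values parsed from CSV text."""

    def __init__(self, csv_text):
        reader = csv.DictReader(io.StringIO(csv_text))
        self._rows = {(r["indicator"], int(r["year"])): float(r["value"])
                      for r in reader}

    def get_value(self, indicator, year):
        return self._rows[(indicator, year)]


def describe(store: IndicatorStore, indicator: str, year: int) -> str:
    """A caller that depends only on the interface, not on any implementation."""
    return f"{indicator} ({year}): {store.get_value(indicator, year)}"


# Placeholder record, not real data.
memory_store = InMemoryStore({("example_indicator", 2010): 1.5})
csv_store = CsvStore("indicator,year,value\nexample_indicator,2010,1.5\n")
```

Both stores produce the same result from describe, which is the property an architectural interface specification is meant to guarantee.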

Dependency

Component A has a dependency upon another component, B, if the correct functioning of A depends on the existence and correct functioning of B. Additionally, any change to B may have an effect on component A.
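The definition above implies that a change to one component can ripple to everything that depends on it, directly or indirectly. The sketch below is one way to compute that set. The component names and the function affected_by are hypothetical and chosen only for illustration.

```python
def affected_by(changed, depends_on):
    """Return every component that depends, directly or transitively, on `changed`.

    `depends_on` maps each component to the set of components it depends on.
    """
    # Invert the graph: for each component, record who depends on it.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    # Walk outward from the changed component, collecting dependents.
    affected, stack = set(), [changed]
    while stack:
        current = stack.pop()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                stack.append(parent)
    return affected


# Hypothetical component graph.
depends_on = {
    "web_site": {"data_service"},
    "widget": {"data_service"},
    "data_service": {"data_store"},
    "data_store": set(),
}
```

Here a change to data_store may affect data_service and, through it, both web_site and widget, while a change to web_site affects nothing else.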

Must, Should and May

In this document, wherever possible, the following terms are to be interpreted as described in RFC 2119 (the standard is paraphrased here for the reader's convenience):

- When the word must, required or shall is used, the statement is an absolute requirement of the specification.

- When the phrase must not or shall not is used, the statement is an absolute prohibition of the specification.

- When the word should or recommended is used, there may exist valid reasons to ignore the statement under certain circumstances, but the full implications of doing so must be understood and carefully weighed.

- When the phrase should not or not recommended is used, there may exist valid reasons the behavior is acceptable or even useful under certain circumstances, but the full implications should be understood and the case carefully weighed before implementing the action.

- When the word may or optional is used, the statement is not compulsory.
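The keyword conventions above lend themselves to simple tooling, for example when reviewing a requirements list. The following is a rough sketch only. The level names ("requirement", "prohibition" and so on) and the function requirement_level are assumptions made for this example and are not defined by RFC 2119 or by this document.

```python
import re

# Keyword phrases mapped to illustrative level names, following the paraphrase above.
# Negative phrases come first so that "must not" is not mistaken for "must".
LEVELS = [
    ("must not", "prohibition"),
    ("shall not", "prohibition"),
    ("should not", "discouraged"),
    ("not recommended", "discouraged"),
    ("must", "requirement"),
    ("required", "requirement"),
    ("shall", "requirement"),
    ("should", "recommendation"),
    ("recommended", "recommendation"),
    ("may", "optional"),
    ("optional", "optional"),
]


def requirement_level(statement):
    """Return the level of the first matching phrase in list order, or None."""
    text = statement.lower()
    for phrase, level in LEVELS:
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return level
    return None
```

Because the list is checked in order, a statement containing several keywords is classified by the earliest entry in LEVELS. A statement with no keyword returns None, which can flag requirements whose strength has not been stated.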

The KNIS Value Proposition

The KNIS is creating a single source, provided free and easily usable for millions, of the highest quality measures and data on the nation's major issues.

Current and future versions of the KNIS site and IT capability are designed to be easy to use and to provide tools that will enable KNIS staff and Americans to discover, understand and share information across the Web through distributed publishing and social networking.

In so doing, the KNIS seeks to unite nonprofits, the media, government decision makers, business leaders, scientists, educators and citizens around a single goal: to deepen and broaden our factual knowledge and understanding of the country's most pressing issues.

Relying on expertise and quality assurance from the National Academy of Sciences, the National Academy of Engineering, the Institute of Medicine, the National Research Council, the statistical community, the scientific community and individual, nationally recognized subject-matter experts from all sectors, a KNIS will assemble the highest quality quantitative measures and related data and develop Web presentations designed to make it easy for interested citizens to assess whether progress is being made, where it is being made, by whom and compared to what.

The KNIS value proposition emphasizes:

- Highest quality, most current data on the issues that matter most

- "Key" measures that are easy to understand and capture what matters most

- Reliable, free and accessible data and contextual content

- Engaging and educational data and context

- User interaction, commenting and discussion opportunities

- Publishing on Web time, in constant motion with frequent updates

The KNIS User Communities and Their Needs

Delivering a valuable service to users is at the heart of a KNIS. Users can be segmented into four broad categories: Individual Users, Institutions and Partners, Stakeholders, and Developers:

Individual Users: This segment includes all users with an interest in quality measures and data on the state of their nation. Additionally, a KNIS is a means to introduce new topics into consideration for discussion. This segment includes many sub-segments of the U.S. public: policy makers, business, students, researchers, professionals, legislators, educators, journalists and any other community requiring access to quality data. Members of this segment can be served as individuals, but individuals increasingly use social media and collaborative capabilities to work on large issues, and the KNIS intends to support and leverage social media in providing its service to the public.

Institutions and Partners: A KNIS will work with many other institutions in the key national indicator ecosystem, many of whom are data providers, many of whom are data users, and a large portion of which are both. They will include both public and private institutions. The KNIS will benefit institutions (including governments, business and non-profits) by enhancing their access to actionable information to enable better strategies and resource allocation choices on investments in complex issues. The KNIS will provide media partners with new information and tools that improve productivity, depth of coverage and accuracy. The KNIS will provide business partners better insight into broad societal patterns and trends for planning, investment and product/service creation. Education partners will be provided with information that enables improved quality of curricula, increased numeracy, better understanding of public issues, and increased levels of meaningful civic engagement.

Stakeholders: These include the American people, the U.S. Federal government, state and local governments, the business community, civil society, KNIS leadership, KNIS partners, statistical data providers, concerned foundations, academicians, and a wide variety of other members of the KNIS family of stakeholders. Some of these stakeholders will leverage the KNIS infrastructure to be a reliable source of timely and accurate data and repeatable models. Others will use the architecture to disseminate information.

Developers: The KNIS is building wherever possible on open platforms using open standards designed to empower developers. This community needs information on KNIS measures and data and how to find them, as well as guidance on information quality. They are also appreciative of ways to share lessons and code.

KNIS High Level Requirements

The KNIS strategy for building a broad user base and an active audience for its content is to employ state of the art social networking and syndication techniques to promote a dynamic, engaging web site. This means that KNIS content will be made available through multiple channels, in addition to a standard web site. Additional channels include widgets that can be embedded in other web pages, direct programmatic access to indicator data and metadata, online webinars and YouTube videos, and more.

Design of the KNIS capability will create a clear and engaging environment for the audience to explore, learn, and take action. The interface should be clean and intuitive, and should facilitate the easy location of indicators, data, and editorial content. The design should also support syndication and distribution, including widgets embedded in other web sites and services. A range of visual and interactive methods should be used to identify, clarify, and compare issues. Deeper exploration of issues, measures and data should be a consistent possibility.

The KNIS is working to engender trust and credibility, and the aesthetic must be intelligent, authoritative, and professional. However, the experience should not be overly complex or technical. The cumulative goal of design and content organization should be to offer the audience a clearer understanding of one or more national issues, and to provide a set of flexible and branded tools with which to take further action.

The KNIS will provide an experience that allows users to search and manipulate data in the context of a very flexible super-issue, issue and sub-issue construct to gain a better sense of systemic issues. The KNIS will also strive to achieve credible simplicity through carefully selected data and measures with easy links to more complex or sophisticated sites or data sources.

The following are key categories of high-level user requirements which are currently being defined in our ongoing processes, which include a combination of user testing and feedback, use case scenario building, and specification to mainstream technical standards:

- Performance and Reliability: The site must be able to meet mainstream market standards for uptime and responsiveness.

- Availability: All KNIS functionality will be highly available and reliable, and an appropriate security design will ensure this. Additionally, all data will be backed up so that, in the event of a catastrophe, the system is ready to respond and recover.

- Scalability: The site must be able to grow to consumer scale, withstanding not only high user demand but also diverse data types, and must do so with the resiliency of a mission-critical system.

- Multi-platform and multi-vendor: The KNIS must be capable of supporting multiple technology and vendor products.

- Portability: The KNIS must emphasize capabilities for sharing, syndication and social interaction.

- Selectivity: Users should be able to understand the rationale for selection of issue frames and limited numbers of key measures and data sets to enhance their capacity for assessing the nation as a whole.

- Credibility: Choices of issues, measures and data sets and the presentation of information must meet the highest standards of professionalism.

- Accessibility: The site should support a multilayered design balancing credible simplicity and complexity, for all types of audiences, along with freely available content and no advertising.

- Quality: Dimensions of user-centric measure and dataset quality should be incorporated in metadata and essential dimensions exposed so that users can assess relative information quality depending on their intended use.

- Utility: Content should be presented in a fashion that makes meaningful facts easy to discover and then presents measures and data in a way that is practical and useful.

- Multi-Dimensional: The KNIS should give the user the capacity to look at any measure over time, at different levels of geography, by demographic group or at various levels of information density/abstraction.

- Content in Context: The KNIS should offer not only the essential statistics but also contextual information that enhances understanding and engagement without interpretation. Issues, measures and data will always be presented in relationship to one another.

- Multi-faceted: The KNIS should offer users with different cognitive and learning styles a variety of ways to engage in the information, using different sensory modes.

- Multi-media: The KNIS should support the capacity to visualize and interact with measures and data through a full range of techniques.

- Navigation and Orientation: The site should support navigation in a variety of intuitive ways and allow users to maintain a consistent sense of orientation.

- Interactivity and Continuity: The KNIS should make it possible for users to interact with and explore data and information in a variety of ways and then to save their work and build understanding cumulatively over time.

- Balanced: The KNIS must present what is known and what is not known, what is available and what is not available, where information exists as well as where gaps in coverage need to be filled.

- Involvement and Diversity: The KNIS should give the American people many ways to be involved in issue framing and indicator selection, constantly balanced against expert input.

- Independence: People, processes, vendors, content and products must be selected on needs and demonstrated capabilities and not influenced by inappropriate bias.

- Persistence: All pages, once published, must have a persistent URI so that links established from other web sites remain viable.

- Openness, Transparency and Extensibility: Decisions must be based on open and transparent processes, sources, open standards, open principles, and, to the greatest extent possible, open source software.

- Synchronization: The KNIS should draw updates from a large ecosystem of hundreds of data providers and remain continually current, with acceptable and reliable latency times.

- Security: The KNIS must provide security in the experience, including confidentiality of user information, assured availability of all services, and assured integrity of all data.

- Flexibility and Adaptability: The KNIS must be able to respond to user and stakeholder input as well as market evolution with frequent integrated updates.

- Confidentiality: The core reason for existence of the KNIS is to get the right information to the right users, and transparency in doing that is always the default answer. However, at times the KNIS will be entrusted with information that must remain confidential. This includes any information associated with users, such as logins, search strings or personally identifiable information.

- Integrity: The data used by and held by the KNIS and the information mixed, mashed and modified by users must be secure from tampering, including secure from tampering by sophisticated adversaries.
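The Integrity requirement above does not name a mechanism. As one possible illustration, the sketch below uses a keyed hash (HMAC with SHA-256, from the Python standard library) to detect tampering with a published dataset. The function names, the sample bytes and the key are hypothetical, and key management, which a real design would have to address, is out of scope here.

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> str:
    """Compute a keyed SHA-256 tag for a dataset."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check a dataset against its tag using a constant-time comparison."""
    return hmac.compare_digest(sign(data, key), expected_tag)


# Placeholder dataset and key, for illustration only.
key = b"example-key-not-for-production"
dataset = b"indicator,year,value\nexample_indicator,2010,1.5\n"
tag = sign(dataset, key)
```

A consumer holding the key can call verify before using the data. Any change to the bytes, even a single character, causes verification to fail.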

Design Principles

The following are KNIS architecture design principles. These principles guide the KNIS architecture decision-making process, including actions of the governance team, the design team, and SUSA staff:

1. The entire effort is focused on the users and their experience.

2. Web services will be used to the greatest extent possible.

3. Open source approaches are preferred.

4. We standardize on open standards.

5. KNIS designs will be OS independent.

6. KNIS designs will be client and browser independent.

7. The architecture will be broadly understandable and broadly communicated.

8. The design will empower communities.

9. We design for scalability.

10. We design for interoperability.

11. We design for flexibility, extensibility and an ability to evolve.

12. We design for universal accessibility and usability.

1. The entire effort is focused on the user and their experience. The priority driving principle of the Key National Indicator System is that humans must be empowered for greater understanding and better decision-making. This activity is all about the people who will be using the system, and the design team will build architectures that place the American people's experience in the primary position it deserves. We also recognize that success here will require far more than design. It will also require a continuous process of usability testing and community engagement.

2. Web services will be used to the greatest extent possible. Web services enable system-to-system communication, creating a means for reliable exchange of information and autonomous synchronization. The standards and specifications associated with web services have been proven to provide scalable, reusable, interoperable capabilities and will be used in KNIS designs. Implications: The architecture will come with the many benefits of web services, but care must be taken to ensure reliability meets expectations. With web services, reliability must be engineered in.

3. We will design with the open source community in mind. KNIS framework designs are not being built to favor any single software package or suite of tools. But as a key architecture principle, KNIS will design with the open source community in mind. The KNIS framework should be implementable at low cost and deliver high availability with solid security. Commercially supported open source is supportive of these requirements. If, during design work, the team identifies requirements that are best met by proprietary approaches or software, these components will be well-articulated and steps will be taken to ensure that transition away from proprietary solutions is possible in the future. Implications: This early attention to open source solutions will help build community free from undue influence. However, care must be taken to ensure all decisions, open source or proprietary, are documented well.

4. We standardize on open standards. It is the intent of the design team to follow best practices as articulated by open standards groups. These may include The Open Group, the Organization for the Advancement of Structured Information Standards (OASIS) and the Object Management Group (OMG). Using the standards and implementation guidance of widely known and highly respected organizations will help ensure standards are implemented in repeatable ways. We intend to follow SoaML (Service oriented Architecture Modeling Language) as a way of clearly articulating implementations of standards. Implications: Use of best practices provides lessons learned from many other environments. The design team will be well-versed in these best practices and will only deviate when there is good reason. Here too, a key implication is that design choices must be well documented. We also expect to use open standards, where possible, for data security (especially data integrity).

5. KNIS designs will be OS independent. The KNIS framework is being built in a way that can be implemented by a wide range of organizations. Although we intend on engineering for secure scalability with reliable systems (and open source operating systems will be the first choice) we will take every step to be as implementable as possible in any operating system to ensure the greatest possible adoptability. Implication: Engineering for OS independence requires attention to detail and experience with a wide range of OSs. 6. KNIS designs will be client and browser independent. The KNIS designs will support users on any client, including traditional PCs, smartphones, cell phones and tablets. End users will access most information from the framework via browsers, and we intend on supporting all major browsers. Implication: This goal can be hard to achieve but we view it as very important to attempt, since there is no lock-in by any one OS or other software platform vendor for client devices. Citizens should be able to interact with the KNIS architecture from any device. 7. The architecture will be broadly understandable and broadly communicated. The open, collaborative vision of the KNIS requires an architecture that is available to all those who will participate in governing, building, interacting with or overseeing it. The design team will, to the greatest extent possible, avoid producing an architecture which can only be understood by the design team. Implication: This architecture must be written and re-written, with an ever-increasing circle of diverse technical input, until it has been demonstrated to be widely understandable and useful. 8. The design will empower communities. We expect and will engineer for a high degree of community involvement in the resulting system. The design being produced will be implemented by the KNIS to provide a web presence for a community (i.e., virtual, geographic, demographic). 
But we also expect this to be empowering in a different way. It is also to be designed for the use and empowerment of any other community which can benefit from it, including international communities. We intend for the design to support communities but recognize that success here will require far more than design. It will also require a continuous process of usability testing and community engagement.

9. We design for scalability. With a target audience of hundreds of millions, designs must support very large "consumer-scale" use. Virtualization will be the scalability path. Implication: If any concept or decision is introduced that has yet to be proven scalable, it will be well researched, and its scalability risks will be mitigated and stress-tested.

10. We design for interoperability. A KNIS will require data, results, graphics and spreadsheets to be exported from systems for further use, including embedding in sites and consumption by other automated tools. This interoperability will be designed in from day one. Implications: The design will need to be tested for interoperability characteristics.

11. We design for flexibility, extensibility and an ability to evolve. Designs must be established so that they will allow any single component to be removed and replaced. This is important for the design's ability to evolve over time and is also an important enabler of variation between sites in ways that do not impede interoperability. To the greatest extent possible, the architecture will not require specific software packages. Implications: Enhancements in functionality and the introduction of new technologies are expected. When they are made, the design will enable smooth evolution, in small increments that will minimally impact other systems.

12. We design for universal accessibility and usability. Designs will adhere to the highest standards of providing access to users with visual, auditory, and motor disabilities as specified in Section 508 of the Rehabilitation Act. In addition, our design will strive to serve the needs of users with cognitive limitations and low literacy or numeracy skills, while keeping in mind the needs of young/old and novice/expert users.

These principles are a key means of evaluating the architecture and will be a continual compliance check to ensure the architecture below sets the KNIS on the right path.

Conceptual Architecture

The Conceptual Architecture provides context that is important for understanding the entire architecture and useful in explaining the architectural intent.

The KNIS conceptual architecture can be expressed as illustrated in Figure 3.

The KNIS architecture provides data and services to end users via their client devices and also enables syndication of data to other systems. The conceptual architecture includes:

Enterprise Tier Components – This is the realm of overall governance (e.g. strategy, fiduciary, policy, requirements and leadership as well as technology and data governance).

Client tier components – Software that must reside on the client hardware (e.g., desktop PC, PDA, cell phone, etc.). In general, these are restricted to browser software.

Presentation tier components – Software responsible for rendering information that will be conveyed to the end-user (including system administrators). All of the screens specified by end-users during use-case modeling reside in this tier. Often, these screens will be encapsulated within the user interface.

Figure 3: KNIS Conceptual Architecture

Business processing tier components – Software responsible for performing business-specific functions. All of the business operations specified by end-users during use-case modeling will reside in this tier. These include components that implement business transactions, validate user requests, apply and enforce business rules, as well as components that "wrap" legacy systems and databases (although the legacy systems and databases themselves do not reside in this tier). Although there are security components throughout the architecture, some critically important business tier contributions to security include user authorization, account management, and mechanisms to ensure only authorized users can change data.

Data resource tier components – Data management components (e.g., databases, directory servers, etc.) responsible for managing the persistent state of business data. All of the business objects specified by end-users during domain-object modeling will be managed by the components residing in this tier. The data resource tier also contains the legacy systems that are "wrapped" by business processing tier components. Rules and capabilities for ontologies and taxonomies are articulated in this layer.

Integration tier components – General-purpose or business-specific components used to tie together business processing tier components with data resource tier components or resources that are external to the application. The components in this tier generally are not visible to end-users. These "messaging" components usually take the form of queued messaging servers, publish/subscribe event servers, or a combination of the two. The Integration tier is the interface to syndication services provided from the KNIS and the interface to data providers. The integration tier and its syndication services provide developers with connectors which can facilitate enhanced direct connection to social media outlets (for example, Facebook, LinkedIn, Twitter).
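As an illustration of the publish/subscribe style of messaging named in the integration tier description, the following is a minimal in-process sketch. It is not the KNIS design: the `EventBus` class, the `indicator.updated` topic and the message fields are all hypothetical, and a real deployment would more likely rely on a dedicated messaging server.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class EventBus:
    """Hypothetical in-process publish/subscribe broker (illustration only)."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers subscribed to it.
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Dict[str, Any]) -> int:
        """Deliver a message to each subscriber; return the delivery count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


# Usage: a syndication component subscribes to indicator updates.
received: List[Dict[str, Any]] = []
bus = EventBus()
bus.subscribe("indicator.updated", received.append)
delivered = bus.publish("indicator.updated", {"indicator": "example", "value": 1.0})
```

The publisher does not know who consumes its messages, which is what keeps business processing components loosely coupled to data providers and syndication consumers.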

Logical Architecture

The purpose of a logical architecture is to specify the work of logical components in more detail, based on the desired functionality of the enterprise. Each component should do one thing and do it well – accounting for interactions with other components. The logical view describes the problem from an abstract, platform- and technology-independent perspective. It describes the software elements that meet the system's functional requirements. It describes the design of individual services, their interfaces, and their operations.

This section provides architectural specifications and guidelines flowing from the conceptual architecture. It lists specific guidelines for each tier. The relevant systemic qualities and future directions related to logical architectures are also discussed. An overview of the Logical Architecture is presented in Figure 4.

High-level Guidelines

The following are KNIS guidelines for the logical architecture:

Figure 4: Logical Architecture

Reuse and Purchase Before Developing

The development team should seek to reuse existing infrastructure and, when none exists to meet business requirements, make informed buy-versus-build decisions before proceeding with new development projects.

Open Systems and Open Standards

In any system purchase or development project, open systems and open standards should be preferred above proprietary technologies, after considering comparative functionality, total cost of ownership and track records for adoption and innovation.

Vendor Specific Extensions

When vendor capabilities are used they are to be maintained in as open a configuration as possible. All vendors, even those which base their capabilities on open source, provide means to extend their capability. In most cases, extending capabilities like this introduces future interoperability issues and can end up having the same negative interoperability and vendor lock-in issues as purchase of closed source proprietary capabilities does. If vendor specific extensions are used, they should be encapsulated and well commented/documented so they can be removed or replaced with the least impact on the surrounding systems.
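One common way to encapsulate a vendor-specific extension is to hide it behind an open interface, so that only a single adapter class would need to change if the extension were removed. The sketch below assumes this adapter approach; `ChartRenderer`, `PlainTextRenderer`, `VendorRendererAdapter` and `fake_vendor_extension` are hypothetical names, and the vendor call is a stand-in rather than any real product API.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class ChartRenderer(ABC):
    """Open interface that the rest of the system depends on (hypothetical)."""

    @abstractmethod
    def render(self, series: List[float]) -> str:
        ...


class PlainTextRenderer(ChartRenderer):
    """Implementation that uses no vendor-specific features."""

    def render(self, series: List[float]) -> str:
        return ",".join(f"{value:g}" for value in series)


class VendorRendererAdapter(ChartRenderer):
    """The only place that touches the vendor-specific extension.

    Replacing the vendor means replacing this class and nothing else.
    """

    def __init__(self, vendor_call: Callable[[Dict[str, List[float]]], str]) -> None:
        self._vendor_call = vendor_call

    def render(self, series: List[float]) -> str:
        # Translate the open interface into the vendor's payload shape.
        return self._vendor_call({"points": series})


def fake_vendor_extension(payload: Dict[str, List[float]]) -> str:
    """Stand-in for a proprietary call; not a real vendor API."""
    return "vendor:" + str(len(payload["points"]))


plain_output = PlainTextRenderer().render([1.0, 2.5])
vendor_output = VendorRendererAdapter(fake_vendor_extension).render([1.0, 2.0, 3.0])
```

Callers hold a `ChartRenderer` and never import vendor code directly, which keeps the extension documented in one place and limits the impact of removing it.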

Separation of Concerns

The logical architecture provides focus on the separation of concerns within the application: each tier deals with a specific logical area of the application (presentation, processing, data management, etc.) and each component within a tier should focus on one and only one concern.

Decomposition

Decomposition isolates specific responsibilities to individual components so they may be addressed independently. It is therefore important to ensure that the required functionality can still be delivered by components working in collaboration. To provide the proper context for the development and use of each component, functional responsibilities for each component must be assigned and documented.

Systemic Qualities

Systemic requirements (such as performance and uptime) and functional requirements are of critical importance and are articulated where possible in this logical architecture. To properly address systemic qualities, sets of collaborating components must be considered together. Performance, for example, should be addressed in terms of the patterns of interaction the design calls for, not just from the perspective of the individual parts.

Business Continuity

Recoverability, redundancy, and maintainability should be addressed during application design, based on criticality and impact to the mission, in order to determine the required level of continuity.

Architecting for Security

There are many non-architectural aspects of security and the designers want to take this opportunity to alert all readers to that fact. Policy, Process, Training and Governance are all critically important to security. However, there are also important technology considerations for both the logical and technical architecture.

Considerations, using the previously articulated constructs of Confidentiality, Integrity and Availability, include:

Confidentiality: When users provide personal data (for configuration or saving of settings), that information must be protected.

Integrity: Data must be provided in its pure, unchanged form, with no threat of modification by adversary or accident.

Availability: KNIS services, including syndicated services, must be provided at a high availability and the system must be designed to withstand both natural disaster and computer attack.
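The integrity consideration can be made concrete with a small sketch of one possible check: a consumer of syndicated data compares a SHA-256 digest of what it received against a digest published by the provider. This is an assumption for illustration, not a mechanism the architecture specifies; the function names and sample data are hypothetical. A plain digest only detects change if the digest itself arrives over a trusted channel, so protection against a deliberate adversary would additionally need digital signatures or a keyed MAC.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of a data set as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, published_digest: str) -> bool:
    """Check received data against a digest published by the provider.

    compare_digest performs a constant-time comparison of the two strings.
    """
    return hmac.compare_digest(digest(data), published_digest)


# Hypothetical data set and its published digest.
original = b"year,value\n2009,1.0\n"
published = digest(original)

intact = is_unmodified(original, published)
tampered = is_unmodified(original + b"2010,9.9\n", published)
```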

Architectural Patterns

Existing architectural patterns, reference architectures, business services, etc., should be leveraged wherever possible. This includes emerging reference architectures and patterns for use of emerging cloud capabilities. Where possible, architecture patterns are used in the technical architecture below.

Architecting for Usability

When design trade-offs are considered, priority weighting for decisions must be on usability factors, since this is the overriding "meta requirement" of the KNIS architecture.

Enterprise Tier

The enterprise tier is the domain of business processes, governance and key standards and is intentionally articulated here in the logical architecture section since it is a driver of all other components of this tier (the KNIS is a mission-driven capability).

KNIS core processes

The KNIS institution will support four key processes:

- Content Management: Including selection of issues, sourcing of data sets, presentation, publication, design, evolution and adaptation of information over time, as well as data quality.

- Product Management: Focused on product/service design, development and maintenance to performance specifications.

- Strategic Development: Includes communications, fundraising, marketing, public relations, partnership and stakeholder management.

- Institutional Management: Includes all aspects of relationships with public and private stakeholders, the National Academy of Sciences and the State of the USA leadership (e.g., governance, planning, finance and accounting, legal, information systems, human resources, audit and oversight).

All four of these core KNIS processes have complex interactions between them. But they are bounded in the context of larger societal processes of debate, learning and change. It is vital to understand those boundaries. A KNIS will focus on the presentation of measures and data, in the context of issue frames and with a high enough utility that they can be used for analysis by users. However, the boundaries of the KNIS enterprise processes do not extend into education, choice, change or dialogue. A simplified model for societal processes is diagrammed below in Figure 5:

KNIS Architectural Governance

Architectural governance is the means to ensure that processes and technologies support excellence in the pursuit of the KNIS mission. This section articulates an initial approach to KNIS governance. KNIS architectural governance processes are subordinate to the overall KNIS governance processes. A summary follows:

Advisory Processes

The KNIS maintains an architectural advisory board consisting of experienced enterprise thought leaders from a wide range of backgrounds, selected based on their years of dedication to community and years of demonstrated success in a variety of fields. Membership of the KNIS technical advisory board is by invitation only, with the Presidents of the National Academy of Sciences, the National Academy of Engineering, the Institute of Medicine, the National Research Council and the President and CEO of the State of the USA responsible for inviting members.

Figure 5: KNIS Processes

Decision Mechanisms

KNIS leadership must find balance between broad coordination with a large base of stakeholders and agile action designed to support our mission. This balance will be found by use of two levels of collaborative groups: an executive architectural board and subordinate working groups. The KNIS Executive Architecture Council is used by the SUSA CEO and senior staff to ensure appropriate vetting of ideas and decisions with a broad range of internal and external stakeholders. This council is an enterprise systems and technology decision-making body. KNIS Architecture Executive Council membership consists of representatives from:

- The Executive Officer and General Counsel of the NAS, NAE, IOM and NRC or their designated representatives

- The SUSA Executive Staff (Chief Data Officer, Chief of Content, Chief Technology Officer, Chief Scientist, Chief Statistician, EVP, CFO and CEO)

- The KNIS Technology Advisory Board

The KNIS Architectural Council is chaired by the SUSA CEO or, in the CEO's absence, by the CTO. This council has approval authority over the KNIS architecture and its principles and will help adjudicate issues brought to the council by working groups (further described below). This council is about ensuring the right architecture decisions and will keep user issues at the forefront of design decisions.

KNIS Architecture Working Groups are an additional decision mechanism which will also be used to enable the best coordinated advice from technologists. Working groups are to be chartered as required and will be empowered with terms of reference approved by the SUSA Architecture Executive Council. Working groups may be chartered to work on specific issues; however, at least two are envisioned to be of extended duration: the KNIS Architecture Design Working Group and the KNIS Data Working Group.

Additionally, although this is the governance process for technology issues, there are key processes underway in the higher level KNIS governance structure for other critical topics including issue frames and measurements. SUSA leadership has noted that the complex interrelationships between issue frames, measurements and data issues can lead to ambiguity for those working the issues, and will work to ensure that the leaders of these working groups are in constant communication so that this ambiguity is reduced.

The KNIS Architecture Design Working Group will be assigned responsibility for maintaining each chapter of the KNIS enterprise architecture and will work to keep the architecture aligned with the KNIS vision, coordinated with the community and relevant for designers.

The KNIS Data Working Group will work technical data issues among public and private data providers, data consumers and designers. This group will also maintain and update the data section of the SUSA enterprise architecture.

Each working group charter will spell out the group's role in capability certification, which is a mechanism the CEO will be able to use to grant approval for capability roll-out. SUSA intends to use ITIL v3 as a framework for operations and maintenance, and artifacts required by ITIL will be provided by working groups prior to capabilities being certified as ready to run. ISO standards for business process certification will also be used where appropriate.

Oversight/Execution/Feedback

Decisions regarding architecture of the KNIS will be promulgated by the CEO, with compliance ensured through effective communication to all involved and monitoring of execution. It is the intent of SUSA to ensure dialog and evaluation of multiple courses of action in architectural decisions, and mechanisms will be put in place to surface competing ideas.

Issue Areas, Culture, Standards and Data Mechanisms

The KNIS architecture governance structure and process must cover a broad range of issue areas covering many stakeholders, information consumers and data providers. The scope of this effort means open collaboration and coordination and open processes are key. The governance team will ensure transparency at all levels to assist in broad input and involvement by all able to contribute.

Architecture, Design Guidance, Implementation Directives

The KNIS architecture governance process relies on broad understanding, continuous feedback and dialog. Therefore a key operating concept of KNIS architecture governance is to provide all architectural artifacts in openly sharable formats for all stakeholders, from users to data providers to developers to architects, to review and provide input on. Additionally, a KNIS should provide virtualized instances of the KNIS capability for developer use. These virtualized instances will be provisioned upon approval of the KNIS CTO, as resources allow.

Contributing Back to the Open Source Community

The KNIS architecture is designed with the meta-requirement of usability, and it is user focused. This has driven an approach that is open in many ways, including a strong bias towards open source software. Wherever possible, we intend to share back with the open source community. One early way to do that will be to provide use cases showing ways a KNIS is implementing open source. When possible we will also share back suggestions for code improvements and contribute in other ways. KNIS interactions with the open source community will be under the umbrella of the KNIS architectural governance structure and coordinated by the KNIS CTO.

Client Tier

Client tier components – Software that must reside on the client hardware (e.g., desktop PC, PDA, cell phone, etc.). In general, these are restricted to browser software and do not include any of the three concepts modeled by the end-user.

The client tier consists of any client device or system that manages display and local interaction processing. This includes the browser. KNIS IT planners default to the following guidelines for the architecture in client-focused decisions:

Thin Client Rule

Whenever possible, end-user interaction with business applications and services should be mediated by a browser. Mediation may include the browser's use of commonly-available plug-in modules. The KNIS recognizes that tracking release details of browsers is also important and that a wide range of users means a wide range of browsers and variants will be in the ecosystem. KNIS implementations will document the browser release levels they are written to. Decisions which concern censurability may only be approved by the KNIS architecture governance board.

Justification: By standardizing access, application and service offerings are very loosely coupled to the client tier. It will almost never be the case that deploying a new service or application will force deployment of new components to the client tier.

Impact: Client tier deployment costs are minimized. Access to business applications and services can be monitored more easily.

Client Mobility Rule

Whenever possible, an end-user's physical location should not affect access to KNIS applications and services (except in cases where location gives users additional control over their experience, and then that should be by user choice).

Justification: The KNIS exists to serve citizens, wherever they are.

Impact: Costs related to special client tier configuration will be minimized. End-user satisfaction, performance, and effectiveness will be maximized.

Disconnected Client Rule

KNIS visualizations and applications will sometimes run on clients that cannot be connected continuously to underlying business applications and services. In these circumstances, the client tier may include components which would otherwise be placed in other tiers in order to make the client useful while disconnected (for example, a database engine). The preferred way of providing for offline use is to provide data through syndication which can be downloaded for viewing in any common viewing engine already resident in mobile clients (browsers) or easily consumable by common platform applications (for example, iPhone or Android apps).

Justification: Many use-cases support end-user productivity while disconnected.

Impact: The value of enabling end-users to remain productive should not be underestimated; however, this value should be balanced against the cost of deploying, supporting, and securing additional client tier software components. When such a client is disconnected but remains in use, the end-user's session state as reflected on the client may diverge from the "stale" version reflected in the lower tiers; the application is responsible for ensuring synchronization when a connection is reestablished. Data associated with the security of the application that must be stored locally must be encrypted using a user-specified password that is not stored on the client machine.
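The requirement that locally stored security data be encrypted with a password that is never stored on the client is typically met by deriving the encryption key from the password each time it is entered. The sketch below shows only that key-derivation step, using PBKDF2 from the Python standard library. The function names and the iteration count are assumptions for illustration; the cipher that would consume the derived key is deliberately not shown.

```python
import hashlib
import os

# Assumed work factor for illustration; tune to the target client devices.
ITERATIONS = 200_000


def new_salt() -> bytes:
    """Random salt. It is not secret and may be stored beside the ciphertext."""
    return os.urandom(16)


def derive_key(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a 32-byte key from the user's password with PBKDF2-HMAC-SHA256.

    Only the salt is persisted. The password and the derived key stay in
    memory and are re-created whenever the user enters the password again.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = new_salt()
key = derive_key("correct horse battery staple", salt)
```

Because the key can be recomputed from the password and the stored salt, nothing that unlocks the data needs to remain on the client between sessions.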

Client Applet Rule

Whenever possible, business applications and services should avoid the use of client applets. Applets may be appropriate when the code cannot be persisted locally.

Justification: Applets have a number of drawbacks that should be considered:

- Every time an applet is started, the entire applet needs to be provisioned.

- Applets are often large and, therefore, time-consuming to provision.

- Applets are very sensitive to their runtime environment and therefore fail more easily.

Client Usability Rule

Because the end-user experience is the focus of KNIS requirements and critical to the perceived quality of applications and services, end-users must be provided with:

- Reasonable response time, under expected operating conditions, when interacting with underlying components; this includes a consideration of low bandwidth and high latency environments

- A comprehensible experience (sometimes called "walk up and use")

- A consistent experience across client platforms

- Familiar graphical aids

- Appropriate mechanisms for customization (e.g., allowing end-users to specify right- or left-hand use, preferred fonts, font sizes, etc.)

Justification: The meta-requirement of the KNIS architecture is a quality end-user experience.

Impact: Long wait times, excessive number of keystrokes to complete tasks, and excessive confusion would result in poor user experience and run counter to all KNIS is building towards.

Presentation Tier

Presentation tier components – Software responsible for rendering information that will be conveyed to the end-user (including system administrators). All of the screens specified by end-users during use-case modeling reside in this tier. These screens may be encapsulated within the user interface on a portal.

The presentation tier is responsible for formatting all information displayed to end-users, capturing end-user input, and performing simple field-level validations. The format of displayed information may take many forms, but in the KNIS context will generally be via a browser. The presentation tier is often architected using a model-view-controller pattern.

For security reasons, these components will, in general, also be responsible for checking request parameters initiated or specified by end-users. These "sanity checks" help reduce the possibility of buffer overflow attack. Other data validation (e.g. date, phone number, account number formats) may be performed by the presentation tier, but business-specific validation is usually performed by the business processing tier.

Localization (L10N) Rule

L10N is "the process of providing language- or culture-specific information for computer systems." When a business application or service will be used in multiple locales, its functional requirements must state explicitly the aspects of L10N that will be important, which aspects of L10N must be implemented and which aspects will not be localized.

Justification: We are designing for all our citizens as well as the international community and must ensure accessibility by the widest possible cultural base. Designers must ensure that all prospective end-users will be able to experience the application or service in a manner that will facilitate both learning the application and using it in a highly productive way.

Impact: The result is reduced training time, improved end-user productivity, and less likelihood of end-user data entry error.

Internationalization (I18N) Rule

I18N is "the process of generalizing computer systems so that they can handle a variety of linguistic and cultural conventions." All applications must facilitate I18N. When a business application or service will be used in multiple linguistic and cultural locations, its functional requirements must state explicitly the aspects of I18N that will be important, which aspects must be implemented and which aspects will not be internationalized.

Justification: We are designing for all our citizens as well as the international community and must ensure accessibility by the widest possible cultural base. Designers must ensure that all prospective end-users will be able to experience the application or service in a manner that will facilitate both learning the application and using it in a highly productive way.

Impact: The result is reduced training time, improved end-user productivity, and less likelihood of end-user data entry error.

Accessibility Rule

KNIS applications will be used by end-users with disabilities. Some accessibility aspects may be handled in a uniform manner through application integration with a portal that provides some accessibility features across all the application portlets or widgets it manages. Accessibility includes sensitivity to color issues not only for people who have color-deficient vision, but for display on various display devices, including projectors and printers.

Justification: Ethical and legal considerations clearly indicate the importance of ensuring that end-users with disabilities will be able to operate all aspects of the KNIS application suite.

Impact: The result is reduced training time, improved end-user productivity, and less likelihood of end-user data entry error.

End-User Preference Configuration Rule

When a business application or service might have its usability enhanced by allowing end-users to set their own preferences, its functional requirements must state explicitly which types of user preferences will be permitted and the manner in which they may be implemented. Whenever possible, an end-user's preferences should be in effect regardless of the user's location or client device.

Justification: End-users often expect to be able to make superficial modifications that make using the application more pleasant or a more productive experience.

Impact: The user's ability to "customize" an application or service results in improved user productivity. For example, if a user prefers to view his calendar by month instead of by week, allowing this customization permits the user to complete tasks more rapidly.

Standardizing where and how user preferences will be stored is a topic under study.

End-User Role Identification Rule

When a business application or service constrains the end-user roles that will be authorized, its functional requirements must state explicitly which roles will be authorized. Differentiation of functionality by end-user role must also be stated explicitly.

Justification: This serves to clarify how use-cases that vary by role are differentiated by the presentation layer (i.e., content, data capture, and field validation).

Impact: Implementation at the presentation layer will be clarified. This rule also impacts the business processing tier and the data resources tier.
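One way to state authorized roles and their differentiated functionality explicitly is a simple table that the presentation and business tiers can both consult. The sketch below is an assumption for illustration only: the role names, function names and the `is_authorized` helper are hypothetical and are not roles defined by the KNIS.

```python
from typing import Dict, FrozenSet

# Hypothetical roles and the functions each one is authorized to use.
ROLE_FUNCTIONS: Dict[str, FrozenSet[str]] = {
    "viewer": frozenset({"view_indicator"}),
    "contributor": frozenset({"view_indicator", "submit_data"}),
    "administrator": frozenset({"view_indicator", "submit_data", "manage_accounts"}),
}


def is_authorized(role: str, function: str) -> bool:
    """Return True only if the role is known and the function is listed for it."""
    return function in ROLE_FUNCTIONS.get(role, frozenset())


viewer_can_submit = is_authorized("viewer", "submit_data")
contributor_can_submit = is_authorized("contributor", "submit_data")
```

Unknown roles fall through to an empty set, so anything not stated explicitly is denied by default.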

Field Validation Rule

When a field in a form is of an easily validated type, the application's or service's functional requirements must state explicitly the validation rule to be applied. Field validation in the presentation tier does not affect the requirement that all fields must be validated in the business tier.

Justification: Generally, fewer computing resources are required to force the correction of (relatively trivial) input errors before they are passed to the business logic tier. Conversely, validation of fields based on business rules is not trivial, is subject to change, and should be performed in the business tier. It is possible that a nominally valid input from the presentation tier (e.g., a valid date) will fail a business tier validation (e.g., the date was a blackout date).

Impact: Field validation helps ensure productivity is optimized. For example, if a user enters a bad date as part of a form and posts the form all the way to the business logic tier before the error is detected, the user might have to wait considerably longer before being notified of the mistake. Performance considerations need to be balanced against security concerns. Note also that a nominally valid input might later be rejected by the business tier.
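
The two levels of validation described in this rule can be illustrated with a minimal Python sketch. The function names, the ISO date layout, and the blackout date are assumptions made for illustration; they are not part of the KNIS design.

```python
from datetime import date, datetime
from typing import Optional, Tuple

# Hypothetical blackout dates; a real system would load these from business rules.
BLACKOUT_DATES = {date(2010, 12, 25)}


def presentation_validate(raw: str) -> Optional[date]:
    """Presentation-tier check: is the text a well-formed ISO date?"""
    try:
        return datetime.strptime(raw, "%Y-%m-%d").date()
    except ValueError:
        return None


def business_validate(raw: str) -> Tuple[bool, str]:
    """Business-tier check: revalidate the format, then apply business rules."""
    parsed = presentation_validate(raw)  # the business tier validates every field again
    if parsed is None:
        return False, "malformed date"
    if parsed in BLACKOUT_DATES:
        return False, "blackout date"
    return True, "ok"
```

In this sketch a malformed date is caught cheaply before posting, while a well-formed date can still be rejected by the business tier.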

Presentation Tier Standards Rule

All information produced by the presentation tier must be formatted using widely accepted standards (such as those expressed in the technical architecture below).

Justification: Adherence to standards ensures that information will be useable on the widest possible range of client platforms. It also decreases the likelihood that custom client components will need to be provided as part of an application solution.

Impact: This is critical to implementing thin client applications. Failure to adhere to standards yields unpredictable results.

Active Content Rule

Examples of active content include active or looped GIF sequences, graphically active applets, Flash movies, and varying color. Browser-based active content should be avoided unless required for conveying KNIS information to users. If active content must be employed, the end-user must be able to turn it off.

Justification: Visually active content may become annoying to the user. Additionally, applications that employ active content might consume excessive computer resources.

Impact: On some client platforms the aggregation of active content significantly impacts computer resources.

Consideration: In cases where developers need help in determining which situations are acceptable for active content, focus on user experience and consult with the SUSA Chief Data Officer and/or Chief Content Officer for guidance.

Business Processing Tier

Business processing tier components – Software responsible for performing business-specific functions. All of the business operations specified by end-users during use-case modeling will reside in this tier. These include components that implement business transactions, validate user requests, apply and enforce business rules, as well as components that "wrap" legacy systems and databases (although the legacy systems and databases themselves do not reside in this tier).

This section provides guidelines for architecting the software components responsible for performing business-specific functions. This includes implementing business transactions, as well as applying and enforcing business rules.

Services Overview

Often, the business-specific functionality for an application is encapsulated within a service. This means that all the code and resources that are necessary to implement the business functionality are grouped together as a package and run in one or more stand-alone processes that are accessible via the network.

This model allows the business functionality to be implemented and deployed once, and then used by multiple application components. It also allows the service implementation to scale independently of the other application components.

Other parts of the application that need to access the encapsulated business functionality can do so by calling a published service application programming interface (API) that interacts with the remote service. The API is implemented by proxy code that runs in the client process and communicates with the service via standard networking protocols like HTTP. Such proxies should be provided by the Web service for the convenience of clients, but because some clients may not be capable of using such a proxy, all APIs should be well documented.

As illustrated in Figure 6, a service will be treated as a separately deployable package, almost like a mini-application. Each service must be able to be monitored, configured, and secured separately, independent of the other application components that depend on it. This model allows a service to be upgraded independently, provided the service APIs that it implements are not changed by the upgrade.

A well designed service does one thing and does it really well. Combining unrelated business functionality into a single service is confusing to users and makes it hard to evolve related services in a consistent manner. In contrast, having a service implement just one business operation is wasteful due to the added overhead associated with the deployment and management of a service.

A good measure of how well a service is designed is the number of interfaces it supports in its API. As illustrated in Figure 7, a well designed service should have only one or two business-related interfaces, a monitoring interface, and a configuration interface for remote administration.
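
As a rough illustration of this service model, the Python sketch below shows a service with one business interface, a monitoring interface, and a configuration interface, reached through a proxy in the calling process. All class, method, and indicator names are hypothetical, the data value is made up, and an in-process function call stands in for the HTTP network call.

```python
import json
from typing import Callable


class IndicatorService:
    """Service process side: one business interface plus monitoring and configuration."""

    OPERATIONS = ("get_indicator", "status", "configure")

    def __init__(self) -> None:
        self._values = {"sample_indicator": 42.0}  # made-up value for illustration
        self._calls = 0
        self._config = {"max_results": 10}

    def get_indicator(self, name: str) -> float:      # business interface
        self._calls += 1
        return self._values[name]

    def status(self) -> dict:                          # monitoring interface
        return {"calls": self._calls}

    def configure(self, key: str, value: int) -> dict:  # configuration interface
        self._config[key] = value
        return dict(self._config)

    def handle(self, request: str) -> str:
        """Service interface: decode a request, dispatch it, encode the reply."""
        msg = json.loads(request)
        if msg["op"] not in self.OPERATIONS:
            return json.dumps({"error": "unknown operation"})
        result = getattr(self, msg["op"])(**msg["args"])
        return json.dumps({"result": result})


class ServiceProxy:
    """Calling process side: exposes the service API and hides the network call."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        self._transport = transport

    def call(self, op: str, **args):
        reply = json.loads(self._transport(json.dumps({"op": op, "args": args})))
        if "error" in reply:
            raise RuntimeError(reply["error"])
        return reply["result"]


service = IndicatorService()
proxy = ServiceProxy(service.handle)  # a deployed proxy would send the request over HTTP
```

Application code would call `proxy.call("get_indicator", name="sample_indicator")`, while administration code would use the same proxy mechanism for `status` and `configure`.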

Figure 6: Services (diagram: in the calling process, application code uses the service API through a service proxy; the proxy makes a network call to the service interface of the service implementation in the service process)

The business interface is called via proxy code that is integrated with the part of the application used by the end-user. The monitoring and configuration interfaces would be called via proxy code that is integrated with the part of the application (or a general monitoring and administration program) that is used by the system administrators. If the application is running in a portal, both interfaces may be made available by the portal at different times to different users based on their current roles.

Shared Services

Some services provide business functionality that is general enough to be useful across multiple applications. These are called shared services. See Figure 8.

Applications that leverage shared services have the potential to realize several business benefits:

- Reduced time to market; integrating existing services, rather than developing redundant code, speeds application development and enables upgrades to better deliver consistent results

- Reduced total cost of ownership; costs for implementing and maintaining the service are assumed by the service provider rather than the service consumer; as new features are added to these services, consumers realize increased functionality at little cost

The benefits of shared services come at a cost. Shared services have more dependencies on them and require more care when upgrading implementations or changing interfaces.

Figure 7: Multi-Interface Support (diagram: one calling process runs application code that uses the business API through a service proxy; another calling process runs administration code that uses the monitoring and configuration APIs through a service proxy; network calls connect these to the business, monitoring, and configuration interfaces of the service implementation in the service process)

Such services must ensure that upgrades are backward compatible and produce a minimum effect on dependent applications.

Because a service-based architecture results in "finer grained" deployment packages (and more of them), special care must be taken with respect to version management. The versions of each deployed service, along with the versions of the interfaces it supports and the client components that depend on it, must be tracked carefully. Upgrades must deliver consistent results.

- An old (obsolete) interface cannot be "end of life'd" (EOL'd) until all the clients that depend on it have been moved to a later version of the interface.

- A policy for each component must be published stating how long each version of the service interface will be supported and when it is planned to be EOL'd. An example of a reasonable policy would be to support the current version plus one previous version of each interface.

One way to deal with this complexity is to have a service implementation support multiple versions of the same interface at the same time. This approach allows dependent applications to migrate to the newer interface versions when ready, while dependent applications that are not yet ready can continue to use old versions of the interface.

A policy should be made that each version of an interface will be supported for at least one year (but not much longer) after a newer version of the interface has been deployed. This provides a realistic upgrade path for components that continue to use the previous version.
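
One way to picture a service that supports several interface versions at once, together with the one-year support policy, is the Python sketch below. The class and method names are hypothetical, and the sketch assumes versions are numbered in ascending order and deployed in that order.

```python
from datetime import date, timedelta
from typing import Callable, Dict, Optional


class VersionedService:
    """Serves several versions of one interface and tracks when each may be retired."""

    SUPPORT_WINDOW = timedelta(days=365)  # "at least one year" after a newer version

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[str], dict]] = {}
        self._superseded_on: Dict[int, date] = {}

    def deploy(self, version: int, handler: Callable[[str], dict],
               deployed_on: date) -> None:
        # Deploying a newer version starts the clock on the previous latest one.
        if self._handlers:
            self._superseded_on[max(self._handlers)] = deployed_on
        self._handlers[version] = handler

    def eol_date(self, version: int) -> Optional[date]:
        """Earliest date the version may be retired, or None if it is current."""
        superseded = self._superseded_on.get(version)
        return superseded + self.SUPPORT_WINDOW if superseded else None

    def call(self, version: int, indicator: str) -> dict:
        if version not in self._handlers:
            raise LookupError(f"interface v{version} is not supported")
        return self._handlers[version](indicator)


svc = VersionedService()
svc.deploy(1, lambda name: {"name": name}, date(2010, 1, 1))
svc.deploy(2, lambda name: {"indicator": name, "schema": 2}, date(2011, 1, 1))
```

Clients that have not migrated keep calling version 1 while newer clients call version 2; `eol_date(1)` gives the earliest date on which version 1 could be retired under the policy, subject to the rule that all dependent clients have moved first.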

Figure 8: Shared Services (diagram: two calling processes, running the code of Application X and Application Y, each use the business API through a service proxy that makes a network call to the business interface of a single service implementation in the service process)

Data Resource Tier

Data resource tier components – Data management components (e.g., databases, directory servers, etc.) responsible for managing the persistent state of business data. All of the business objects specified by end-users during domain-object modeling will be managed by the components residing in this tier. The data resource tier also contains the legacy systems that are "wrapped" by business processing tier components.

The data resource tier provides management of data that an application acts upon. Any data that an application needs to fulfill its purpose or behavior are part of this tier.

Primarily, the data resource tier provides access to and persistence of data. This section focuses on selection of the most appropriate data resources.

The KNIS requirements for response time and for adding value to content will require designers to give appropriate consideration to the balance between internal data managed in the enterprise, internal data managed in the cloud, and data maintained at provider sites and called when required. These design trade-offs will be assessed at design time, with final design choices approved by the CTO.

The KNIS website will require at least the following data resources:

Indicator data (and metadata) – These data originate with other sources, mostly government statistical agencies, but will probably be stored in KNIS databases, at least until adequately responsive web services for them are available. These data are at the heart of the site content, and will be available to users through data visualizations, online tables, downloads in various formats, and other formats as determined by site designers. They will also be available remotely (off the KNIS site) in visual 'widgets' or via a web service for public data access. Any use of the data must be accompanied by metadata to at least identify the source and provide labels and important notes.

Indicator data will in general change slowly, with updates typically on a monthly or annual basis. Data will support disaggregation to the extent possible along four dimensions: time, geography, demographics or other characteristics, and conceptual components. These dimensions will vary from data set to data set, but the KNIS will strive for consistency. Timeliness is important; data should be available within the KNIS system within hours of when they become available from the data provider. Data will be provided in a variety of formats, from online web service delivery, to downloadable spreadsheets, to screen-scraping from web pages or PDF files.

The metadata associated with indicator data describe the provider, the survey or system from which the data originate, the description of the measure used, and any notes necessary to use the data. Notes may be at the data-set level, the data point level, or at some point in between, such as at the collection of disaggregated data (e.g. race/ethnicity or year). In some cases, statistical quality measures (standard error, confidence interval, etc.) will be available and must be stored and delivered with the data.
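
To make the relationship between indicator data, disaggregation dimensions, and metadata concrete, here is a minimal Python sketch of one possible data structure. The class and field names are assumptions chosen for illustration; the draft does not prescribe a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataPoint:
    value: float
    dimensions: Dict[str, str]               # e.g. time, geography, demographics, component
    standard_error: Optional[float] = None   # statistical quality measure, when available
    notes: List[str] = field(default_factory=list)   # data-point-level notes


@dataclass
class IndicatorDataSet:
    provider: str
    source_system: str                       # survey or system the data originate from
    measure: str                             # description of the measure used
    notes: List[str] = field(default_factory=list)   # data-set-level notes
    points: List[DataPoint] = field(default_factory=list)

    def select(self, **dims: str) -> List[DataPoint]:
        """Return the points whose dimensions match every given key/value pair."""
        return [p for p in self.points
                if all(p.dimensions.get(k) == v for k, v in dims.items())]

    def describe(self, point: DataPoint) -> dict:
        """Bundle a value with the metadata that must accompany any use of it."""
        return {"value": point.value, "source": self.provider,
                "measure": self.measure, "standard_error": point.standard_error,
                "notes": self.notes + point.notes}
```

Here `select(geography="VA")` would disaggregate along one dimension, and `describe` ensures the source, measure, quality measure, and notes at both levels travel with the value.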

Textual and media content – Textual and media content will be managed by a staff of writers, artists, web producers, and other (largely non-technical) staff. Although such content will almost always be mediated by a content management system (see CMS at various places in this architecture), the content generated will be stored in and served from a data store. The content management system will also typically store user comments, although implementation decisions and development phasing will drive that aspect of the architecture.

Site metrics, log and user access data – Data about user interaction with the site and the site elements are essential for KNIS business practices. They will inform site development and content priorities, provide a metric for site success, and contribute to capacity planning and hardware support decisions. Data must include interactions both on the KNIS site and with KNIS content that has been distributed to other sites, and KNIS activity on Twitter, Facebook and other social interaction platforms.

User-specific data – It is important that users be able to interact with KNIS content and indicator data and to be able to share their discoveries and creations with others. To provide a personalized experience, the KNIS will allow users to create an account and through that account be able to customize their settings, and have access to saved mashups, bookmarks, or other items enabled by the site functionality. There will be opportunities for users to register to receive RSS feeds or email alerts, to link to or from social networking sites like Facebook, or to send references to content to others. User authentication and customization may also apply to users accessing the site via other means than the web site, such as web services.

Estimates of Data Size and Scope

Although the KNIS is working with hundreds of data providers and is seeking an end state of a system able to scale to millions of users, the core data in our repository will be text and is likely to be modest in size. For initial planning purposes we believe KNIS-retained data stores will be under 400 gigabytes. We are basing this on very rough assumptions, and this planning figure will be revisited frequently as we continue to design and scale up from our current working system. (See footnote 2.)

Data Access

One of the most important decisions an application architect must make is to choose the mechanism by which application data will be accessed. This decision is based on application type and data type.

- File access is sufficient for static local data and application-specific data such as configuration and locale information.

- A distributed transaction environment is required for accessing data from multiple sources.

- An application should cache reference data that it needs, or access such data on an as-needed basis. Caching also depends on the overall application environment and behavior (for example, caching reference data makes more sense in a low bandwidth environment than it does in a high bandwidth environment).

2. Other large, scalable systems serving users with dynamic, issue-relevant data include Wikipedia. The core Wikipedia database consists of 163 gigabytes of text. For more information relevant for comparisons, see http://en.wikipedia.org/wiki/Wikipedia:FAQ/Technical#How_big_is_the_database.3F

Data Persistence

Applications require various types of data to be persistent. Examples include transactional data, reference data, logging data, and configuration data. Data persistence means different data life spans depending on the type of data, type of application, and type of usage. Solutions for data persistence must take these factors into consideration. Solutions such as caching, batching of data, and query optimization must be employed to address the specific application environment.

- Data persistence should be tracked in terms of "logical units of work," to ensure the state of all related data in a transaction set or query set is consistently maintained.

- Log data should employ persistence solutions that provide features for data management, such as purging, archiving, and searching. In particular, a record of user actions on the site must be available to provide customized recommendations and dynamic feedback about general trends and preferences.

- Transaction data should leverage persistence solutions that support transactions, including the ability to update multiple records or perform multiple queries within a single transaction.

- Reference data should use data registries (enterprise directory service, company registry, agreements registry, etc.), if available. Reference data should not be modified by an application; as such, persistence of reference data is not the development team's responsibility.
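
The idea of a "logical unit of work" can be sketched with Python's built-in sqlite3 module. The table names, columns, and the `save_query` helper are hypothetical, and SQLite is used here only because it needs no setup; the draft does not select a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE saved_query (id INTEGER PRIMARY KEY, owner TEXT NOT NULL)")
conn.execute("CREATE TABLE audit_log (query_id INTEGER NOT NULL, action TEXT NOT NULL)")
conn.commit()


def save_query(owner, query_id, action="created"):
    """Insert a saved query and its audit record as one logical unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("INSERT INTO saved_query (id, owner) VALUES (?, ?)",
                         (query_id, owner))
            conn.execute("INSERT INTO audit_log (query_id, action) VALUES (?, ?)",
                         (query_id, action))
        return True
    except sqlite3.IntegrityError:
        return False
```

If the second insert fails (for example, because `action` is missing), the first insert is rolled back as well, so the two tables never disagree.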

The selection of a data resource is much more complex than just choosing the appropriate technology (although selection of the appropriate technology should play a part). Many factors other than technology play a pivotal role in the selection process. An understanding of boundary systems is also helpful in choosing the most appropriate technology.

Application architects must:

- Identify all data sources and target systems. This enables the architect to account for all data exchanges.

- Identify all reference data, including data feeds used. All domain (business system) registries, where they are available, must be used for reference data.

- Identify systems of record (SORs) with which the application will interface. Whether internal or external, all SOR data must be accessed via standard and open interfaces.

- Account for application data volumes (transaction rates), replication requirements, metadata management, data archiving, and data purging policies that align with data access patterns. This will drive architectural decisions for the application. Note: Purge and archive retention policies will also be driven by business requirements and regulations and some might be mandated by content providers. Documentation of decisions and implementations here must be done with a discipline enabling broad review of decisions.

- Account for data stewardship, determining who owns and maintains the data. Appropriate access control must be provided for data owners and administrators.

Answers to these questions and concerns provide a good foundation to work with the most appropriate data resource options. Many of these systems have their own interfaces that enable other applications to integrate with them. The choice of data persistence engine depends primarily on the type of data to be managed.

Other factors play an important role in the selection process for data management, such as performance, transactional reliability, productivity in application development, mechanisms for integration with existing resources, compliance with standards and portability.

Data Labels

Data will be contained within unlabeled table or file architectures to allow for maximum flexibility in data storage, retrieval and display. Figure 9 provides a graphical representation.

The labels for the data will be stored separately and dynamically associated with data prior to user display based on unique factors such as:

- Data source

- User Role (who is querying the data)

- Query path (intent of the query or mash-up; e.g., the user did not define a role but stated interest in a particular topic such as health or immigration)

- Related Data Sets (e.g. being queried at the same time or as part of a mash-up)

- Meta Data (source or user defined)

- Language

- Other variables TBD
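
A minimal Python sketch of separately stored labels that are attached at display time follows. It covers only two of the factors listed above (language and user role); the field keys, label texts, fallback order, and data values are all made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Unlabeled storage: rows carry only stable field keys, never display text.
ROWS: List[Dict[str, float]] = [{"f1": 2009, "f2": 12.5}]  # made-up values

# Labels live in a separate store keyed by (field key, language, user role).
# A role of None marks the default label for that language.
LABELS: Dict[Tuple[str, str, Optional[str]], str] = {
    ("f1", "en", None): "Year",
    ("f2", "en", None): "Rate (%)",
    ("f2", "en", "researcher"): "Rate, percent of population",
    ("f1", "es", None): "Año",
}


def label_for(field_key: str, language: str, role: Optional[str] = None) -> str:
    """Pick the most specific label available, falling back step by step."""
    candidates = ((field_key, language, role),
                  (field_key, language, None),
                  (field_key, "en", None))
    for key in candidates:
        if key in LABELS:
            return LABELS[key]
    return field_key  # last resort: show the raw key


def display(rows: List[Dict[str, float]], language: str,
            role: Optional[str] = None) -> List[Dict[str, float]]:
    """Attach labels to unlabeled rows just before they are shown to the user."""
    return [{label_for(k, language, role): v for k, v in row.items()} for row in rows]
```

The same stored row can then be shown with different labels to different users without touching the data itself.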

Data in the cloud

The KNIS architecture will support the potential for some data to be stored in the cloud and/or locally. To allow for scale, the KNIS may utilize advanced cloud platforms for data storage using a blended approach.

At a minimum, some data types will be stored locally. The term local data will be used to describe data that will be stored locally on KNIS servers (even if those servers are cloud server instances).

The term cloud data will be used to describe data that could reside on cloud services (e.g. Amazon, RackSpace, etc).

LOCAL DATA ALLOWED CLOUD DATA

User profiles

Saved user queries

Saved user mash-ups

Original source elements

Data content type (labels)

Content labels

Figure 9: Dynamic Display of Data

Social content of queries (e.g. discussions, external links, incoming links)

Social content of mash-ups

Geospatial context

Related datasets table

User tags

Data Source Lists

User role labels

Query path labels

Context labels

Data Curation

Upon import of source data, the KNIS architecture will support the creation of multiple dynamic data labels for each data element to support finding data in multiple ways, but each data element will have consistent names on the key label. The system and associated processes must allow for both manual and automatic labeling. In addition, the curation/import process must support the mapping of source data labels to existing KNIS labels and normalization of benign data labels such as date formats.
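
The automatic part of this curation step, mapping source labels to KNIS labels and normalizing date formats, might look like the Python sketch below. The label mapping, the list of accepted date layouts, and the function names are assumptions for illustration only.

```python
from datetime import datetime
from typing import Dict

# Hypothetical mapping from provider-specific labels to KNIS key labels.
LABEL_MAP: Dict[str, str] = {"Yr": "year", "Period": "date", "Val": "value"}

# Date layouts the importer will try, in order; extend as new sources appear.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %B %Y")


def normalize_date(raw: str) -> str:
    """Return the date in ISO 8601 form, or raise ValueError if no layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def curate(record: Dict[str, str]) -> Dict[str, str]:
    """Map source labels to KNIS labels; unmapped labels pass through for manual review."""
    out: Dict[str, str] = {}
    for source_label, value in record.items():
        key = LABEL_MAP.get(source_label, source_label)
        out[key] = normalize_date(value) if key == "date" else value
    return out
```

Labels that the mapping does not recognize are left unchanged in this sketch, which is where a manual labeling step would pick them up.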

Data Source Lists

The KNIS architecture must support the storing and processing of source data identification tables to include:

- Data source feed location

- Data refresh rate

- Data rights (store locally, syndicate)

- Data label history (complete history of all labels applied to any given data element)

Data Registry

A data registry is "a system of record that provides unique identifiers and required key descriptors for discrete business objects." We extend this definition to require data registries also to provide a published service API that is application domain specific. As with other business services the API encapsulates business rules for data validation and insulates application components from the data management system that implements the API (such as a database). In this case, the components are those that use the business processing tier data registry.

Content Management Systems

A content management system (CMS) is a specialized application for creating, editing, storing, accessing, and distributing structured content (and user comments). Content is, in essence, any type of digital information – it can be text, images, graphics, video, sound, etc.

CMSs are distinct from document management systems (DMSs), where the managed entity is a complete document with specific name, size, and content – documents are composed of content. A DMS is concerned with a document in its entirety and is less (or not at all) interested in what the document contains.

A typical CMS has at least these modules: authoring, meta tagging, workflow (publishing), and rendering or presentation. Additionally, it may also provide design templates and access control mechanisms. Web content management systems aim to reduce the overhead and bottlenecks of Web production by enabling "anywhere, anytime" Web authoring and promoting content reuse by separating content from its presentation requirements.

To use a CMS service, the application must generate data in an appropriate format, suitable for consumption and publishing by a CMS.

Data Provenance

Data used by the KNIS and provided to end-users and/or syndicated must be trusted, indexed, and searchable, and must give users rich means to assess the validity of conclusions. Data provenance is key to these qualities. KNIS systems should track where data came from, how they were modified, what value was added to them, and who has accessed them. This same need for data provenance extends to metadata, which may include notes about data, confidence levels, confidence intervals, standard errors, etc., as well as text, dates, and notes regarding the meaning of conclusions.
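
A minimal Python sketch of provenance tracking is shown below: each data item carries an append-only history of where it came from, how it was changed, and who read it. The class names and event kinds are hypothetical, and a production system would persist the history rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ProvenanceEvent:
    kind: str      # "imported", "modified", or "accessed" in this sketch
    actor: str
    detail: str
    at: datetime


@dataclass
class TrackedData:
    source: str
    value: float
    history: List[ProvenanceEvent] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._record("imported", self.source, "initial load")

    def _record(self, kind: str, actor: str, detail: str) -> None:
        self.history.append(
            ProvenanceEvent(kind, actor, detail, datetime.now(timezone.utc)))

    def modify(self, new_value: float, actor: str, reason: str) -> None:
        """Change the value and record what changed, who changed it, and why."""
        self._record("modified", actor, f"{self.value} -> {new_value}: {reason}")
        self.value = new_value

    def read(self, actor: str) -> float:
        """Return the value and record who accessed it."""
        self._record("accessed", actor, "read value")
        return self.value
```

A value-adding transformation could be recorded in the same way with its own event kind.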

Data Value Add

Business intelligence (BI) systems typically have some intentional de-normalization to improve query performance, typically organized in "star" or "snowflake" designs. Normalized data structures are usually used by online transaction processing (OLTP) systems. A key factor regarding the types of data the KNIS is dealing with is that they are not all normalizable in the BI sense, because in most cases the KNIS is not dealing with the microdata, and in some mission domains (especially health) data are collected by different means depending on locality, which complicates normalization. In many, perhaps most, cases we are dealing with summary data that cannot be normalized across different populations (for example, when one survey covers the civilian non-institutionalized population over 18 and another covers all ages). This is an important consideration because it limits the ability to normalize, cleanse and add value. However, adding value is in the KNIS mission scope, and as value is added provenance will be kept. The KNIS value is in bringing data from disparate sources together in one place, with a common, well understood format, and as much consistency in disaggregation as possible given the constraints above.

Integration Tier

Integration tier components – General-purpose or business-specific components used to tie together business processing tier components with data resource tier components or resources that are external to the application. The components in this tier generally are not visible to end-users. These "messaging" components usually take the form of queued messaging servers, publish/subscribe event servers, or a combination of the two.

This section provides guidelines for linking business processing tier components with resource tier components, or with services and resources external to the application. Code that is primarily intended for enabling integration with other applications or services, and that does not perform business processing, should be isolated as integration tier components. This insulates the business tier from changes in integration technology.

Data Schema, Format and Semantics

The KNIS consumes data from a wide range of providers and at times stores, transforms, adds new context and re-provides that data back. The KNIS data schema enhances design quality by ensuring data flow, storage and retrieval are optimized.

When one application is integrated with another application or service, a common data format and semantics should be defined. Data format should not be confused with data encoding, as the latter is dealt with in the technical architecture and relates to technology (e.g. XML, Java Object Model, etc.). In logical architecture the focus is on identifying the data elements involved in the integration, defining their semantics, and grouping the data elements into business objects. This task is performed by business analysts or functional architects who are domain experts.

Rather than defining business objects from scratch, the development team must review KNIS and industry standards and use these as a foundation. Several such standards are available with varying levels of maturity and acceptance. Relevant standards include, but are not limited to:

- RosettaNet (www.RosettaNet.org), for larger-scale B2B system frameworks

- Universal Business Language (http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ubl)

- SDMX (www.sdmx.org), an evolving standard for international statistical data sponsored by the OECD, IMF, World Bank, UN and other organizations.

Batch Data Transfers

In some situations the data exchanged between applications are generated as part of scheduled jobs, in which the data are produced in batches. In such situations, applications must use batch data transfer mechanisms such as file transfers. File standards for exchange are captured in the technology architecture (and are XML based).

Data ingest and other batch transfer/update routines can become very long-running jobs and must be designed to execute in reasonable times through techniques such as partitioning the input for shorter runs.
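
Partitioning the input for shorter runs can be sketched in Python as follows. The function names and the `load_chunk` callback are hypothetical; the chunk size would be tuned to the actual data source.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def partition(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive chunks of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def ingest(records: Iterable[dict], size: int,
           load_chunk: Callable[[List[dict]], None]) -> int:
    """Load records chunk by chunk; return how many chunks were processed."""
    count = 0
    for chunk in partition(records, size):
        load_chunk(chunk)  # each chunk is a short run that can be retried on its own
        count += 1
    return count
```

Because the input is consumed lazily, a large feed never has to be held in memory all at once, and a failed chunk can be rerun without repeating the whole job.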

Syndication of Value Added Content

The term syndication is used here to mean the ability of external users and consumers to take automated feeds of value-added content from the KNIS system.

Syndication interactions are unregulated. This does not mean no rules apply, but that the interactions are isolated. The interactions are not included as part of a strict and complex procedure. Unregulated syndicated content can be used by authorized clients who apply mandatory rules and formats. The model is basically to act on demand.

The KNIS will study existing data exchange standards to determine whether any are appropriate for use. If no existing standards meet KNIS needs, then KNIS will publish an open standard for data exchange and work to help other organizations implement and adopt it.

Syndication and Social Media

Syndication will enable use of value-added KNIS content by developers fielding capabilities on a wide range of other platforms. One deserving of special note is social media. Developer guidelines will be provided that facilitate use of KNIS value-added data by developers focusing on key social media sites like Facebook, LinkedIn and Twitter. Representations of the data produced will also be socially sharable by embedding in other sites (for example, an individual's blog, Facebook page, or LinkedIn page).

Discovery of data sets and other content

KNIS architectures produce information meant to be discovered, shared and used. Information that is sharable will be exposed to search engines and optimized for search engine discovery via the open sitemap protocol.
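
As an illustration of exposing sharable pages through the sitemap protocol, the Python sketch below builds a small sitemap document with the standard library. The `build_sitemap` helper and the example URL are hypothetical; the namespace is the one defined by the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """pages: iterable of (url, last-modified ISO date). Returns sitemap XML text."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page address, used only for illustration.
sitemap_xml = build_sitemap([("https://example.org/indicators/sample", "2010-10-19")])
```

A site would regenerate this document whenever sharable content is added or updated, so that search engines can discover the new pages.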

Interaction Models

The type of interaction an application uses to interface with another application or service must be determined based on the business processing needs.

Synchronous Interaction

In synchronous interaction an application "makes a call" to another application or service and receives a response in that same call. The calling application blocks until it receives a response; therefore, time-outs should be used when the API supports time-outs.

This type of interaction requires the responding application or service to be available; otherwise the call will fail. Hence synchronous interaction increases the coupling between the calling application and the responding application or service. Synchronous interaction must only be used where a real-time response is required.
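The following sketch shows one way a caller might bound a blocking call with a time-out. It uses a caller-side thread pool, which is a substitute technique for cases where the client API has no time-out parameter; where the API offers a native time-out, that should be preferred. The names call_with_timeout and slow_service are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_timeout(service, request, timeout_seconds):
    """Call service(request) and return its response, or None on time-out."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(service, request)
    try:
        # The caller blocks here, but only for timeout_seconds at most.
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return None
    finally:
        # Do not wait for a still-running call; let it finish in the background.
        pool.shutdown(wait=False)


def slow_service(request):
    """Hypothetical stand-in for a remote service that responds slowly."""
    time.sleep(0.3)
    return request
```

Here call_with_timeout(slow_service, "ping", 0.05) gives up and returns None instead of blocking for the full duration of the call.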

Asynchronous Interaction

Applications should interact asynchronously with other applications and services, as well as with other components within the same application, under the following circumstances:

- The communication is only one-way (i.e., the application sends information but does not expect a response)

- The application makes a request but does not require a response in real time (the response is sent as a separate one-way message)

- There is a need to provide massive scaling capability and improve performance; the application design may use multiple instances of a service, which are used to balance the load (the round robin scenario depicted below is one such strategy for load balancing)

In asynchronous interaction, the calling application makes a request for information but does not block until a response is obtained; it proceeds with its processing and the information it requested is obtained later. This usually requires the calling application to register a listener or event handler that the responding application or service may use for sending the requested information. In most situations asynchronous interaction is accomplished via middleware that decouples the sender from the receiver and guarantees delivery of data.
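The listener-based flow described above can be sketched as follows, with an in-process queue standing in for middleware. This is an illustration under stated assumptions only: AsyncService, submit, on_reply and wait_until_idle are hypothetical names, and a real deployment would rely on a message broker for guaranteed delivery.

```python
import queue
import threading


class AsyncService:
    """Accepts requests without blocking the caller and replies via a listener."""

    def __init__(self):
        self._requests = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, request, on_reply):
        """Enqueue the request and return immediately; on_reply is called later."""
        self._requests.put((request, on_reply))

    def _run(self):
        # Worker loop: handle each request and hand the result to its listener.
        while True:
            request, on_reply = self._requests.get()
            on_reply(f"result for {request}")
            self._requests.task_done()

    def wait_until_idle(self):
        """Block until every submitted request has been answered."""
        self._requests.join()
```

The caller registers a listener, for example service.submit("gdp", replies.append), and continues with its own processing while the reply is produced on another thread.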

Asynchronous interaction may take any of the styles listed in Table 1. The appropriate interaction style must be determined based on the application's needs.

Interaction | Applicability
Point-to-Point | Single consumer for any given message. In the round-robin pattern, one message may go to one recipient; the next may go to another, and so on.
Publish-Subscribe | Multiple message producers and multiple message consumers. Any given message needs to be delivered to multiple consumers.
Request-Reply | Response is needed asynchronously. Reply needs to be associated with request.

Table 1: Asynchronous Interaction Styles
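The three styles in Table 1 can be sketched in a few lines of Python. This is an in-memory illustration only; the class names PointToPointChannel, Topic and ReplyCorrelator are hypothetical, and production systems would use messaging middleware rather than direct function calls.

```python
import uuid
from itertools import cycle


class PointToPointChannel:
    """Each message goes to exactly one consumer, chosen round-robin."""

    def __init__(self, consumers):
        self._next_consumer = cycle(consumers)

    def send(self, message):
        next(self._next_consumer)(message)


class Topic:
    """Publish-subscribe: every subscriber receives every message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, consumer):
        self._subscribers.append(consumer)

    def publish(self, message):
        for consumer in self._subscribers:
            consumer(message)


class ReplyCorrelator:
    """Request-reply: a correlation ID ties a later reply to its request."""

    def __init__(self):
        self._pending = {}

    def register(self, request):
        correlation_id = str(uuid.uuid4())
        self._pending[correlation_id] = request
        return correlation_id

    def match(self, correlation_id, reply):
        # Remove the pending request and pair it with the reply.
        return self._pending.pop(correlation_id), reply
```

In the round-robin case, sending three messages to a channel with two consumers delivers the first and third to one consumer and the second to the other.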

Direct and Indirect Integration

When integrating one application with another application or service the architect may choose direct access (see Figure 6) or indirect access that uses middleware to facilitate interaction (see Figure 7). Most synchronous access is direct due to the need for real-time response.

Middleware, such as queues, should also be used for synchronous communication when multiple instances of an application or service may respond to a request and load balancing or high availability is necessary. When supported by the API, time-outs must be employed to prevent calls from blocking forever when there is no response.

Indirect integration for synchronous interaction is also applicable when direct access is blocked by a firewall. In such situations the synchronous access is facilitated by an intermediary, such as a synchronous messaging server.

If the application requires guaranteed delivery of information or must broadcast or multicast information (i.e., send information to multiple recipients at the same time) it should use indirect integration facilitated by middleware such as a message broker. Most asynchronous interaction should take place indirectly via middleware to leverage the guaranteed delivery (sometimes referred to as "fire and forget") and decoupled nature of such interaction.

Figure 6: Direct Synchronous Interaction

Third-Party Application Integration

Applications often require integration with third-party applications or components. These components must be evaluated with the KNIS architecture in mind so integration will be straightforward.

Often, however, third-party applications or components may not be compliant with the guidelines set forth in this document. When this is the case, there are some important considerations to be made. These are described in the following sections.

APIs

If the third-party application supports a published API that can be called by the integrating application, and the third-party application will reside in the same subnetwork as the integrating application or uses an acceptable communication protocol (such as HTTP), then the integrating application should call the third-party application's API directly.

If the third-party application API is in a different programming language than the integrating application, then an adapter must be implemented if response time is critical, or an indirect integration mechanism may be used if response time is not critical.

Wrapping an existing API behind a new API does not solve the integration problem, as a new wrapper would be necessary when the third-party application changes. A justification is required before wrapping a third-party API.

Monolithic Applications

If the third-party application needs to be accessed by other applications but does not support a programmatic interface (API) then the entire third-party application should be wrapped behind a service interface with its own defined API. The wrapper service must then go directly to the application database, if possible, or interact with the monolithic third-party application via screen scraping.

Figure 7: Indirect Synchronous Integration

Screen scraping is when the wrapper service behaves like a display device (such as a browser) and converts the screen input and output into API requests and responses. This approach allows the monolithic third-party application to be treated as a separate business processing tier component or service.

Applications that access the monolithic application should not access the third-party application database directly, even when this option is available. This prevents these applications from breaking when the third-party vendor changes its database schema.

Application Consolidation

When third-party platform components need to be upgraded, reconfigured, or restarted for one application, all applications that share the third-party platform instance will be affected. Therefore, when consolidating instances of a third-party platform, special care must be taken to ensure that all applications sharing the consolidated platform also share similar service level agreements (SLAs).

Technical Architecture

The purpose of a technical architecture is to map defined enterprise components from the logical architecture to specific implementation technologies. These technologies are generally layered and support standard interfaces that allow them to be used in a "plug and play" manner.

This section provides guidance for technology choices. It complies with the rules established in the logical architecture. It is not yet an all-inclusive enterprise technical architecture. More comprehensive versions will flow from this one. Although many issues are surfaced and guidance is provided, specific application solutions will require more detailed implementation guidance. A graphical depiction of the KNIS technical architecture is provided in Figure 8.

This architecture emphasizes technology independence. The concept of technology independence means that we will design the logic of our system without introducing technology-specific details. Then we will make an explicit mapping step to translate the technology independent logic to a technology specific implementation.

Figure 8: Technical Architecture

The mapping process involves taking each component defined during the logical architecture phase and selecting, from the technologies listed in the technical architecture section, an implementation technology to either provide complete functionality itself or host the component once it has been developed. This section provides general guidelines for developing an effective technical architecture. Also included are explicit guidelines for using the specified technology within each layer.

The guidelines below are meant to address the number one "meta-requirement" of the KNIS architecture, which is usability. Guidance is also provided to optimize operational and support considerations. Both factors are key reasons for our focus on open source software, open architectures and open API access. Both are also why we place a strong emphasis on requiring the implementation team to document, with discipline, all aspects of the technology decisions made.

High-level Guidelines

The following high-level guidelines capture the "big rules" that should be kept in mind for the technical architecture for the KNIS.

- To minimize training and support costs, minimize the variety of technologies in a particular category.

- Implementations are expected to be operated and maintained by ITIL-compliant processes (ITIL: Information Technology Infrastructure Library), and documentation sufficient for use by operations and maintenance staff working under ITIL is required. This includes information for help desk support. Use cases for problem management, change management and capacity management will be documented and provided with every delivery of system capabilities. The SUSA CTO is the holder of these documents.

- Regarding REST versus SOAP, the KNIS architecture must have an ability to consume both since both exist in the ecosystem. More information on those two approaches is provided below.

- To achieve vendor independence and improve interoperability between components, use industry standards when available.

- To minimize the code that must be written and supported, leverage the capabilities of the application platform and IT infrastructure services when possible.

- Before resorting to custom development projects, first identify whether purchased applications and software can satisfy requirements, then consider reusing existing services.

- Avoid customizing purchased software (through code changes) that will require maintenance or invalidate support contracts. Customization using mechanisms provided by the application, such as Oracle flex-fields, is acceptable.

- Carefully consider whether there are single points of failure within designs and eliminate them when possible and highlight the rest.

- When adopting a new technology, always try to make sure that:

- Viable alternatives are not already in use in other applications or services

- The technology in question has not already been considered and discarded by other development teams

- The technology has no hidden costs or dependencies

- The technology does not conflict with other technologies in terms of its use of resources

- The technology can be monitored for access, performance, and failure

- The technology adheres to industry standards and does not commit the application to a sole vendor. This is frequently harder than it seems: most technologies support industry standards, but the way developers use the technology (for example, relying on extended features) can commit an application to a sole vendor, and that is what we want to avoid.

Operating System Guidance

Solaris or Linux is the preferred operating system for all application components running in the business processing tier or data resource tier. Either Solaris or Linux is recommended for application components running in the presentation tier. Any third-party application with components that require another operating system must have an exception granted during the review process.

From an application architecture perspective, the end-user desktop is out of KNIS control. Applications should, therefore, be accessible by end-users running Solaris, Linux, Windows, or Mac OS. This is most easily accomplished by making the application browser-based and having it support the most popular browsers.

Designing for Flexibility in use of new Cloud Capabilities

Extensive lessons from the developer community are all pointing to the need to design for the cloud. The good news is the same important constructs for enterprise architectures apply to cloud architectures. It remains important to split application functions and couple loosely, for example. There are some differences, however. The following are key:

Network Communication: Designs must use network-based communication interfaces and not interprocess communication or file-based communication paradigms. This allows scale in the cloud since each piece of the application can be separated into distinct systems.

Design for the Cluster: Rather than scale a single system up to serve all users, the system can be split into multiple smaller clusters, each serving a fraction of the application load. This is often called "sharding" and many web services can be split up along one dimension, often users or accounts. Requests can then be directed to the appropriate cluster based on some request attribute.
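As a rough sketch of directing requests to a cluster based on a request attribute, the following Python hashes an account identifier to pick a shard. The function names and the cluster names used with them are hypothetical, and simple modulo hashing is only one possible routing scheme.

```python
import hashlib


def shard_for(account_id: str, shard_count: int) -> int:
    """Map an account ID to a stable shard index in [0, shard_count)."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


def route(account_id: str, clusters):
    """Return the cluster that should serve requests for this account."""
    return clusters[shard_for(account_id, len(clusters))]
```

The same account always maps to the same cluster, so its data can live in one place. Note that with modulo hashing, changing the number of clusters remaps most accounts, which is a design trade-off to weigh before choosing this scheme.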

Ensure asynchronous interfaces: To tolerate failure, applications must operate as part of a group but not be too tightly coupled to their peers. Each app piece should be able to continue to execute despite the loss of other functions. Asynchronous interfaces are an ideal mechanism to help application components tolerate failures or momentary unavailability of other components.

Design for data persistence: This is always important, but care must be taken in cloud environments to ensure data are available, including data for recovery from outages.

Engineering for monitoring in the cloud: The methods and approaches for monitoring processes in the cloud are still evolving rapidly. Implementation decisions must enable appropriate monitoring of processes and alerting on required performance parameters.

Technology Architecture of Client Tier

Browser

The KNIS strategy is for the browser to be the standard thin-client platform for delivering Web content to and obtaining input from end-users on large-screen devices. We aim for compatibility with as many browsers as possible but, to provide focus, we will target HTML 5 compliant browsers first. We code to the W3C Web Standards.

Consumer device apps

Apps on iPhone, iPad, Android and Windows Mobile devices will be a good adjunct to access to the KNIS through mobile browsers. This architecture enables development of these apps by ensuring feeds and presentations are readily consumable.

Technology Architecture of Presentation Tier

This section provides guidelines for employing the set of standard APIs and protocols that are used to insulate the application components from dependencies on specific implementation technologies. The majority of these APIs and protocols are found in multiple environments including J2EE, LAMP, SAMP and others.

These APIs and protocols are implemented by the implementation technologies in the upper platform layer. Many design patterns are available which describe how to use and combine these APIs in proven ways.

HTML and CSS

The hypertext markup language (HTML) [3] is still the most popular way to create static content in Web-based applications. Browsers use HTML to render application content for the end-user. HTML should therefore only be used within presentation tier components.

Cascading Style Sheets (CSS) work with HTML to separate design elements and layout from content. Best practice involves using a combination of HTML and CSS.

HTML has a number of variants and extensions. It is outside the scope of this document to enumerate them, but the development team should remember that different browsers may display the same HTML and CSS code differently. This fact should be considered when deploying an application, but it is especially important for externally facing applications.

NOTE: The capabilities of HTML 5 and the ability of fielded browsers to support HTML 5 are important to track as this is a quickly changing component of the technological architecture.

[3] For more information on HTML, see http://www.w3.org/MarkUp/.

XML and XSLT

One of HTML's constraints is that it limits the choice of client tier devices. A better alternative is to capture application content in extensible markup language (XML), then apply extensible style sheet language transformations (XSLT) to produce device-specific – and perhaps even locale-specific – content. Thus the same XML data can be used for a wide variety of client tier devices (PDAs, cell phones, PCs, etc.).

Presenting data via widget:

Although concepts of portals are still in use and may be implemented as a presentation means in many situations, increasingly users expect data to be provided in units embeddable in any page, site or even desktop. The increasingly common term in computer design is "Widget." Designing with smart separation of HTML, CSS and use of XML provides a head start in designing for presentation via widgets. Other standards such as JSR168 and the standard WSRP interface are also key.

Application Frontends

Application frontend components are responsible for rendering the larger pieces of application content to the user. The frontend components should be implemented using widely available technologies and giving consideration to performance and accessibility.

Application frontends should call upon web services and application backend business logic components to implement business rules, perform database transactions and initiate interfaces with other back-end services. Application frontends should not display login pages to end users, collect credentials or authenticate the user. Authentication of users should be externalized to the security service via use of an agent installed on the application frontend (using established encryption/authentication standards such as WS or SAML). Applications should also externalize access control for resources which can be handled by the URL policy agent (URLs).

Each frontend component should accept HTTP(S) requests and return content in a format that is acceptable to the end user's device's browser (e.g. HTML).

User and Usability Testing

Although testing of functionality must occur at every tier, usability testing occurs at the client end of the capability. Its purpose is to ensure initial requirements are being met and to solicit new requirements. Usability testing focuses on measuring the capacity of the solution to meet its intended purpose. It gives users direct input into the system. Since the KNIS's user base will be broad, care must be taken to sample the entire spectrum of users from many issue areas, to ensure no single group is driving requirements for all issue areas.

Technology Architecture of the Business Processing Tier

Web Services

Within heterogeneous deployment environments, especially environments that span corporate boundaries, there is a need for services that can be accessed using standard Web communication protocols such as HTTP, independent of programming language. Shared services that provide an XML-based interface over HTTP are commonly referred to as Web services. Web services also tend to imply a runtime mechanism for dynamically looking up the URL of a Web service prior to using it. This is generally accomplished using a Web service registry, which maintains interface definitions and URLs for a set of Web services.

A more formal definition of a Web service, provided by the World Wide Web Consortium (W3C), is as follows:

A Web service is a software system identified by a URI [uniform resource identifier], whose public interfaces and bindings are defined and described using XML. Its definition can be discovered by other software systems. These systems may then interact with the Web service in a manner prescribed by its definition using XML based messages conveyed by Internet protocols.

SOAP and REST:

SOAP [4] is the recommended approach for implementing most services that provide a document-oriented interface. A wide experience base and developer familiarity mean SOAP will likely be in the ecosystem for a long time. SOAP is simple to generate. It is also regarded as more reliable for large data and as the better choice for higher-availability systems. However, REST [5] is easy to consume and work with, and for many web services it will be the easiest choice and the capability we desire. REST is now an approach of choice for simple interfaces. It has a low barrier to entry and enables a simple XML over HTTP approach. REST is strongly supported by a growing community, but it is still a new way of doing things and in many cases there are no standards for how it should be implemented.

As an example, take identity propagation in web services composition (Client->Service->Service->Service). For SOAP, there are accepted means of implementing WS-Security, giving SOAP a well-vetted messaging solution for propagating identity, whereas most REST solutions require developers either to create their own means for this or to use proprietary solutions. REST can do identity propagation, but only through complex, one-off methods that make it much harder to use than REST advocates tend to acknowledge.

Guidance for deciding when to use SOAP/WSDL or REST for services provided by the KNIS:

- If it is a service internal to the KNIS, SOAP, which is simple to generate and consume, is probably the answer. SOAP has a strong developer community and several features supportive of enhanced security, availability and reliability.

- If it is a service from the KNIS to the community, publishing SOAP and WSDL is perfectly acceptable. However, REST may also be provided. REST is lightweight, readable by humans, easy to build and fast to field.
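To make the contrast concrete, the sketch below builds the same hypothetical request in both styles. Only the SOAP 1.1 envelope namespace is standard; the GetIndicator operation, the http://example.org/knis namespace, the URL layout and the function names are assumptions invented for illustration and do not describe an actual KNIS interface.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
KNIS_NS = "http://example.org/knis"  # hypothetical service namespace


def rest_request_url(base_url, indicator, year):
    """REST style: the resource and its parameters are carried in the URL."""
    return f"{base_url}/indicators/{indicator}?{urlencode({'year': year})}"


def soap_request_body(indicator, year):
    """SOAP style: the request is an XML document wrapped in an envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{KNIS_NS}}}GetIndicator")
    ET.SubElement(request, f"{{{KNIS_NS}}}Name").text = indicator
    ET.SubElement(request, f"{{{KNIS_NS}}}Year").text = str(year)
    return ET.tostring(envelope, encoding="unicode")
```

The REST form is a single readable URL, while the SOAP form carries more structure, which is where headers for security and reliability would be attached.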

The most important lesson gained from years of interacting with web services developers: in either SOAP or REST, everything must be documented or use/reuse will be too hard.

Not every application or service should be Web service-based. Defining XML schemas for the message types, and generating the SOAP proxies and adapter code, adds complexity to the application. Unless an application or service requires interoperability with non-Java or external applications or services, the Web service approach to implementing a logical component may be overkill.

[4] For details on the SOAP protocol, see http://www.w3.org/TR/SOAP/.

[5] See: http://www.oreillynet.com/pub/wlg/3005

Assertion of authorization in a Web Services environment

OpenID is a leading candidate for web services identity assertion and delivers great promise in large scale web services environments as a key component of the authorization solution.

Web Services Continued

The KNIS is focused on developing a service-oriented architecture of loosely-coupled Web services in the basic request-response style, leveraging SOAP and WSDL or REST standards for technical interoperability and reduced time to market. The KNIS initiative has supported governance and infrastructure build-out needs.

The KNIS must interact with a wide range of data and service providers, as well as consumers. It should work in ways that make it easy for data providers and consumers to interact with us, so we will enable consumption of data from any source. However, we believe that developing shared guidelines with data providers, such as those below, can help data providers and consumers in the ecosystem provide their information in ways that make consumption of their data easier and its utility greater.

The KNIS provides and consumes web services. Web services are accessed through interfaces. Those interfaces describe how capabilities are presented and the rules and protocols for using them. Key points:

- KNIS web services conform to the WS-I Basic Profile v1.1 and Security Profile v1.0.

- All KNIS partners and data providers are requested to provide the KNIS with WSDL and XML samples. This will enable us to document and share with others the detailed definitions of the content of services and data provided by others. Consult with the KNIS architecture team for examples. This will also enable establishment of an initial Service Inventory: A service inventory is a "responsibility map" that captures who will be providing the service.

- WSDL is used by data and service providers to express the communication protocols, message formats, including serialization techniques, and service locations.

- Service invocation policies such as security requirements, required SOAP headers etc are also provided by formal definition in the WSDL.

- The KNIS offers data definitions and schemas that can be reused and encourages collaboration to capture the best means of ensuring these schemas support the mission and needs of all stakeholders. We provide these definitions and schemas to enhance interoperability and to help the community avoid the problems which arise from development of independently generated, not-well-understood WSDLs.

Service specifications hold all the information a user or consumer of the service would want to know before deciding whether they are interested in using the service, as well as exactly how to use it if they are. They also specify everything a service provider needs to know to implement the service. The service specification includes:

- Service name

- Provided and required interfaces

- Rules for how the functions are to be used and in what order

- Constraints that reflect what successful use of the service accomplishes

- Qualities that service consumers should expect such as cost, availability, performance etc.

- Policies for using the service.

The interface contract specifies the data to be exchanged within the context of a business interaction and a set of criteria that determine initial and ongoing success. The contract does not specify how either the service consumer or the service provider will be written. So, the service consumer and provider can be written in any language and they can be deployed on any platform. And, a consumer or provider can be a single monolithic program, or it can be a cluster of programs. Best practices for SOA have functional and non-functional aspects of a service implemented separately. This separation of concerns facilitates initial development, ongoing maintenance and reuse for both functional and non-functional code. A further separation can be applied in implementing each non-functional aspect – e.g., logging, security, and versioning.

Application Business Web Services

The application and web services implement the business processing functionality and enforce business rules. They respond to requests from the application specific portlets or application front ends to perform specific business functions. They should export a service-specific, public application programming interface (API) that can be called by portlets and application front ends.

Web Service Registry

The web service registry is responsible for storing information about web services, such as descriptions and interface information. The web service registry is used by clients of web services to dynamically discover, locate, determine the interface mechanism for and send requests to web services which are registered.

The KNIS Service Registry provides a means for services to operate as a collective, since consumers must have a means to discover services. Service registries provide a means to find other services and to use them in a loosely coupled way.

The KNIS registry will be a standard UDDI server (Universal Description, Discovery and Integration). This provides a dynamic choice of service based on the functionality required by the consumer. Its role is similar to that of the Yellow Pages. The key use of the KNIS UDDI is to store WSDL files which are used by a service consumer at design time.

However, we also make this server available for runtime lookup of service endpoints based on service name and policies. Typical examples of such policies can be quality of service requirements, security requirements, preferred communication protocols, service version and the like.
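The runtime lookup described above can be sketched with a small in-memory registry. This is a stand-in for illustration and does not use the UDDI API; the ServiceEntry fields, the ServiceRegistry class and the policy labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceEntry:
    name: str
    endpoint: str
    version: str
    protocol: str
    policies: frozenset = field(default_factory=frozenset)


class ServiceRegistry:
    """In-memory stand-in for a registry queried by service name and policies."""

    def __init__(self):
        self._entries = []

    def register(self, entry: ServiceEntry) -> None:
        self._entries.append(entry)

    def lookup(self, name, protocol=None, required_policies=()):
        """Return endpoints whose entry matches the name and all constraints."""
        required = set(required_policies)
        return [
            entry.endpoint
            for entry in self._entries
            if entry.name == name
            and (protocol is None or entry.protocol == protocol)
            and required <= entry.policies
        ]
```

A consumer that requires a particular security policy would pass it in required_policies and receive only the endpoints that declare it.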

Service descriptions and interface definitions for services are maintained in the Web Service Registry. Before developing new Web services, the development team must check this registry to see if an existing service will satisfy the application's business needs.

An exemplar UDDI is jUDDI, an open source Java implementation of UDDI.

Key features that make this the exemplar:

- Platform Independent

- Support for JDK 1.3.1 and later

- UDDI version 2.0 compliant implementation

- Use with any relational database that supports ANSI standard SQL (MySQL, DB2, Sybase, JDataStore, HSQLDB, etc.)

- Deployable on any application server that supports Servlet 2.3 specification (Jakarta Tomcat, JOnAS, WebSphere, WebLogic, Borland Enterprise Server, JRun, etc.)

- The jUDDI registry (ws.apache.org/juddi/) supports a clustered deployment configuration.

- Easy integration with existing authentication systems

Data Registry Choices

A data registry should be the system of record (SOR) for a given domain's data (e.g., product, customer, or order information) and should track those data using globally unique identifiers. To optimize performance, data registries may sometimes store data from other systems of record, but these duplicate data are then treated as read-only.

The published service API for the data registry should also be domain-specific. Each parameter that is passed and returned should be relevant to the business domain, for example customer information.

The data registry should also maintain additional information about each domain entity (e.g., the customer) including the user ID of the person who created the entity, the date the entity was created, the user ID of the person who last updated the entity, and the date the entity was last updated. This approach permits better data auditing and allows for the archiving of old data that have been kept a long time but not been updated.
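The audit fields listed above could be carried on each domain entity roughly as follows. This is a minimal Java sketch (Java 16 or later); `CustomerRecord`, its `name` field, and the method names are hypothetical and stand in for whatever domain model an implementation team defines.

```java
import java.time.Instant;
import java.util.UUID;

/** Hypothetical domain entity carrying the audit fields described above. */
record CustomerRecord(
        UUID id,            // globally unique identifier
        String name,
        String createdBy,   // user ID of the creator
        Instant createdAt,
        String updatedBy,   // user ID of the last updater
        Instant updatedAt) {

    /** Creates a new entity; creator and last updater start out identical. */
    static CustomerRecord create(String name, String userId, Instant now) {
        return new CustomerRecord(UUID.randomUUID(), name, userId, now, userId, now);
    }

    /** Returns an updated copy; identity and creation metadata are preserved. */
    CustomerRecord rename(String newName, String userId, Instant now) {
        return new CustomerRecord(id, newName, createdBy, createdAt, userId, now);
    }

    /** True when the entity has not been updated since the cutoff, making it an archiving candidate. */
    boolean isStaleSince(Instant cutoff) {
        return updatedAt.isBefore(cutoff);
    }
}
```

Keeping the timestamps as explicit parameters, rather than reading the clock inside the record, is a design choice of the sketch that makes auditing behavior easy to test.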

Technology Architecture of the Data Resources Tier

Several different technologies for data persistence are available, each with its own strengths. This section briefly describes these technologies and the uses for which they are best suited.

Data Background: physical versus domain data

Physical data: This is the data that is actually stored on disk. The details of how it is stored are described in a database schema. The schema is optimized for the performance characteristics and requirements of the particular data store.

Domain data: This is the data that is used in the service implementation. It is described in a standard data model and describes all of the information that is used in the implementation of a service. It represents the private knowledge of the data. A subset of the data is the view of the physical data and may come from one or more physical data stores.

The persistent storage solutions used by the application or service must be limited to a SQL-compliant RDBMS, an LDAP-compliant directory server, or an NFS-compliant file system.

Relational Database Management Systems (RDBMS)

RDBMS provides excellent support for OLTP when multiple changes must be applied as one transaction. Because RDBMS is a mature technology, it is widely used in enterprise application architectures to solve a majority of data management problems and provide stable, mature, standardized tools for functions such as data modeling, administration, querying, and reporting.

Vendors of RDBMS have provided a variety of "value added" features that are not always portable. If portability is important, the development team should avoid features that commit the application to a sole vendor. Such features include SQL extensions, data types, stored procedures, triggers, database links, etc. Triggers within one database instance are acceptable. Database links (or any database linking mechanism) should not be used within a transaction. RDBMS is a possible solution for storage of transactional data with frequent updates, and when there is no need for data replication to multiple locations.

Directory Servers

The directory server is another data store that has gained importance with the movement to Web-centric applications and services.

Although the details of how data are stored in a directory server are not relevant to architecture discussions, some vendors have chosen to implement directory servers on top of relational databases. This suggests that the directory server is neither a new (immature) technology nor one that conflicts with relational databases. In most enterprises, directory servers coexist with relational databases because they cater to different application needs. The directory server is the recommended solution for requirements with very fast reads (lookups), few writes, and the need to distribute or publish data to multiple locations simultaneously.

Directory services are considered loosely consistent, which means there is no guarantee that all replicas hold the same data at any one time. In other words, not all replicas are updated instantaneously. An advantage of loose consistency becomes apparent when communication problems cause a number of the servers on a network to become unavailable (or slow to respond): changes made to a directory server while network servers were out of operation are not lost. When the network problem is resolved, replicas on the affected servers receive updates.

Object-Oriented Databases (OODB)

OODB came into existence a few years back when object-oriented programming was becoming popular. Although OODB is suitable for managing complex, dynamic data such as 3D maps, engineering drawings, and scientific data, all popular RDBMS now have built-in support for objects, reducing the viability and need for specialized OODB.

Within the KNIS, the use of OODB is not allowed unless a very specific need arises. Hence an exception is required at the time of review. Explicit approval must be obtained for use of OODB in a vendor product.

XML Database

XML is another recently popularized data format. XML is suitable for mapping object representation to a "flat" format. XML processing has full support in the latest version of the J2EE platform. Although there are situations where a native XML database [6] may be suitable and desirable, most RDBMS have support for XML data. One of the advantages offered by XML DB is XQuery, which allows collections of XML files to be accessed like databases.

Because there currently are no standards broadly accepted by the majority of the industry, XML DB is still changing. Hence, XML databases are not allowed except when explicit approval is obtained for use in a vendor product.

File Systems

File systems are part of the foundation of operating systems. Although using a file system is a quick and convenient way of providing data persistence, file systems should not be used by enterprise applications for transactional or reference data because they lack, or do not readily provide, features such as transactions, replication, policy enforcement and provisioning.

File systems may be perfectly suitable for a quick application proof-of-concept, or for supplying mostly static application data such as configuration data. Generated application data, such as output and error/log files, can be stored in local files with appropriate access control mechanisms.

RDBMS or Directory Servers

While, with some exceptions, the KNIS does not allow the use of OODB and XML database, the following guidelines help the development team choose between RDBMS and directory servers.

Choose an RDBMS solution if the application has many of the following characteristics:

- Access is weighted toward writes rather than reads (high W/R)

- Data change rapidly

- Multiple concurrent clients access or update the data

- Changes to data must be instantly available to all clients

- Transactional integrity, reliability, and recovery

- Strong ad hoc reporting tools and environment

- Control of data for administration

- Stringent auditing requirements

Applications that employ RDBMS should have a dedicated instance of the RDBMS.

6 A native XML database is one that stores XML documents in their native form without decomposing them.

Choose a directory server solution if the application has many of the following characteristics:

- Relatively low write to read ratio (low W/R – lots of lookups and occasional changes to lots of data such as user profile data, application configuration data, etc.)

- Dynamic discovery of resources

- Data published to a large number of users in many different locations while maintained in a loosely consistent state

- Infrastructure for centralizing user, resource, and security information replication

- Little or no reporting requirements

- Runtime resource provisioning such as network bandwidth allocation

- Support for multi-valued attributes

- Flexibility for schema modifications

- Suitable for distributed infrastructure needs, due to features such as chaining and referral

Choose a file system solution for logging.

Applications that employ directory servers must use the KNIS's enterprise directory server setup. An application may not have its own dedicated directory server, unless required for a vendor product and/or an exception is granted.

Java Database Connectivity (JDBC)

After identifying the system and/or technology for the data resource tier, the development team must choose the data-access APIs.

Java applications must interact with RDBMS using the JDBC API. This requires:

1. Loading an appropriate JDBC driver

2. Creating a connection or a pool of connections

3. Creating and executing SQL statements that return result sets

4. Closing the statement

5. Closing the connection or releasing it to the pool for reuse (optional)

JDBC drivers manage connectivity to the RDBMS and provide support for caching result sets.

J2EE (Web and EJB) containers provide connectivity to data resources, including connection pooling and transactional semantics; the development team does not need to code this function.
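The JDBC steps listed above can be sketched in Java as follows. This is an illustration under stated assumptions, not a prescribed KNIS data-access layer: the class name `IndicatorDaoSketch`, the `indicator` table, and its `name` and `topic` columns are hypothetical. The sketch assumes a JDBC 4.0 or later driver on the classpath, in which case `DriverManager` locates the driver without an explicit loading step.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class IndicatorDaoSketch {

    // Hypothetical table and column names, used only for illustration.
    static final String SELECT_BY_TOPIC =
            "SELECT name FROM indicator WHERE topic = ? ORDER BY name";

    /** Steps 1-2: obtain a connection; JDBC 4.0+ drivers register themselves automatically. */
    static Connection open(String jdbcUrl, String user, String password) throws SQLException {
        return DriverManager.getConnection(jdbcUrl, user, password);
    }

    /** Steps 3-4: run a parameterized query; try-with-resources closes the result set and statement. */
    static List<String> findNamesByTopic(Connection connection, String topic) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement statement = connection.prepareStatement(SELECT_BY_TOPIC)) {
            statement.setString(1, topic);
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    names.add(rows.getString("name"));
                }
            }
        }
        return names;
    }
}
```

For step 5, a caller would typically wrap the connection in its own try-with-resources block, for example `try (Connection c = IndicatorDaoSketch.open(url, user, password)) { ... }`, so that it is closed or returned to the pool. Inside a J2EE container the connection would normally come from a container-managed `javax.sql.DataSource` rather than from `DriverManager`.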

Data Storage

Data can be stored in the cloud and locally. To allow for scale, the KNIS can utilize advanced cloud platforms for data storage using a blended approach of storing some core data in the cloud.

At a minimum, the following types of data will be stored locally to maximize contributions to privacy, access speeds and other key value adds:

- User profiles

- Saved or recent queries

- Comments associated with users or saved queries

- Geo data

- Tags

Data Source Types

Core Data

Data extracted from government and third party sources that will be stored locally to facilitate responsive user queries and mash-ups with other core data sources (including external).

External Core Data

Data residing on external servers that the KNIS cannot download locally but can access through some sort of data share or API.

The KNIS will maintain custom labels for external core data sets.

Where possible, the KNIS will store local copies of query results, at least to facilitate display to the user, and will attempt to store query results longer term (e.g., associated with a user query) so that the original query results are maintained. This storage will also enable analysis of what users are asking.

These local copies should be clearly marked with the date the query was conducted and should offer an option to refresh the data (e.g., re-query the external core data source).
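One way to picture a dated local copy with a refresh option is the following minimal Java sketch. The class `CachedQueryResult` and the `Supplier` standing in for the external data share or API are assumptions for illustration; the document does not prescribe this structure.

```java
import java.time.Instant;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical local copy of an external query result, stamped with when it was fetched. */
final class CachedQueryResult {
    private final String queryText;
    private final Supplier<List<String>> externalSource; // stands in for the external API call
    private List<String> rows;
    private Instant queriedAt;

    CachedQueryResult(String queryText, Supplier<List<String>> externalSource, Instant now) {
        this.queryText = queryText;
        this.externalSource = externalSource;
        refresh(now);
    }

    /** Re-queries the external source and updates the date shown to the user. */
    void refresh(Instant now) {
        this.rows = List.copyOf(externalSource.get());
        this.queriedAt = now;
    }

    String queryText() { return queryText; }

    List<String> rows() { return rows; }

    Instant queriedAt() { return queriedAt; }
}
```

A display layer could show `queriedAt()` next to the results and call `refresh(...)` when the user asks for current data.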

Core Data Labeling

Associated labels or taxonomies for core data will be stored in relational reference tables and are manually or dynamically applied dependent on the data source, user roles, or other qualifying variables. Labels are dynamically applied to core data upon display of the data or in response to a user query.

User data

User data includes profiles, roles, stored queries, and relationships to other users, comments/interaction history and related data. In addition, users can metatag core or external data or queries created for use in personal pages, to create analysis and share it with others.

Query data

Query data is stored so that searches or mash-ups can be saved, shared, or cloned by users.

It also stores comments associated with the query as well as any supplemental information.

GEO data

The KNIS will contain separate reference tables to enhance or augment geo-specific data. This should also allow for reverse look-up based on location (e.g., "show me all the data you have about Dallas, Texas").

Versioning of Data

The KNIS will implement a versioning mechanism for stored data and will give users the capability to query the archive to compare how results have changed over time. KNIS retention of historical data will meet previously articulated statements regarding compliance with business rules, and data provenance will be understood and retained.
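A versioning mechanism of this kind could be approximated by an append-only store in which every write adds a version instead of overwriting. The Java sketch below (Java 16 or later) is an assumption-laden illustration: `VersionedStoreSketch`, the `Version` record, and the `source` field used as a stand-in for provenance are all hypothetical, and it assumes versions are written in chronological order.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical append-only store: every write adds a version instead of overwriting. */
final class VersionedStoreSketch {

    /** One stored value plus when it was recorded and where it came from (provenance). */
    record Version(int number, Instant recordedAt, String source, double value) {}

    private final Map<String, List<Version>> versionsByKey = new HashMap<>();

    /** Appends a new version for the key and returns it. */
    Version put(String key, double value, String source, Instant now) {
        List<Version> versions = versionsByKey.computeIfAbsent(key, k -> new ArrayList<>());
        Version version = new Version(versions.size() + 1, now, source, value);
        versions.add(version);
        return version;
    }

    /** Full history for the key, oldest first. */
    List<Version> history(String key) {
        return List.copyOf(versionsByKey.getOrDefault(key, List.of()));
    }

    /** The version that was current at the given instant, if any. */
    Optional<Version> asOf(String key, Instant when) {
        Optional<Version> match = Optional.empty();
        for (Version version : versionsByKey.getOrDefault(key, List.of())) {
            if (!version.recordedAt().isAfter(when)) {
                match = Optional.of(version);
            }
        }
        return match;
    }
}
```

Comparing `asOf(key, earlier)` with `asOf(key, later)` gives a user the "how has this result changed" view, while `history(key)` retains the source of each revision.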

Automatic Labeling

KNIS data architecture should support automatic data labeling for future versions to reduce curation overhead.

User Generated and Dynamic Tagging

Data architecture should support user and dynamic tagging of data sets, data elements, queries or mash-ups.

Content Management System

The content management system is responsible for managing content. The content typically includes, but is not limited to, textual content that has been tagged with XML or other markup. The content management system supplies tools to author and administer content as well as an API by which content can be retrieved.

Content Abstraction Layer

The content abstraction layer should provide a single, vendor-independent interface by which portals can retrieve and search for content which resides in these various content management systems. The content management system or content abstraction layer must make all content and selected metadata available for search engine discovery.
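The idea of a single, vendor-independent interface over several content management systems can be sketched in Java as below (Java 16 or later). The interface `ContentSource` and the classes `ContentAbstractionLayer` and `InMemoryContentSource` are hypothetical names; a real adapter would wrap one vendor's CMS API instead of a map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical vendor-independent view of a content management system. */
interface ContentSource {
    Optional<String> fetch(String contentId);

    List<String> search(String term);
}

/** Fans portal requests out to every registered CMS adapter behind one interface. */
final class ContentAbstractionLayer implements ContentSource {
    private final List<ContentSource> adapters;

    ContentAbstractionLayer(List<? extends ContentSource> adapters) {
        this.adapters = List.copyOf(adapters);
    }

    /** Returns the content from the first adapter that knows the identifier. */
    @Override
    public Optional<String> fetch(String contentId) {
        for (ContentSource adapter : adapters) {
            Optional<String> hit = adapter.fetch(contentId);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    /** Concatenates matching content identifiers from all adapters, in adapter order. */
    @Override
    public List<String> search(String term) {
        List<String> ids = new ArrayList<>();
        for (ContentSource adapter : adapters) {
            ids.addAll(adapter.search(term));
        }
        return ids;
    }
}

/** Toy adapter backed by a map; a real adapter would call one vendor's CMS. */
final class InMemoryContentSource implements ContentSource {
    private final Map<String, String> documents;

    InMemoryContentSource(Map<String, String> documents) {
        this.documents = Map.copyOf(documents);
    }

    @Override
    public Optional<String> fetch(String contentId) {
        return Optional.ofNullable(documents.get(contentId));
    }

    /** Returns the identifiers of documents whose text contains the term, sorted by identifier. */
    @Override
    public List<String> search(String term) {
        return documents.entrySet().stream()
                .filter(e -> e.getValue().contains(term))
                .map(Map.Entry::getKey)
                .sorted()
                .toList();
    }
}
```

Because the layer itself implements `ContentSource`, a portal can be written against the interface alone and remain unaware of how many content management systems sit behind it.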

Technology Architecture of the Integration Tier

This section provides guidelines for using the middleware products that implement the standard APIs and protocols described in the client and presentation and data layers above. This includes most of the standard relational database products. It also includes some shared infrastructure services that are written and maintained by the KNIS.

Open System Web Server

The Web Server provides a container that supports servlets. It should be open Java compliant. It can be used for hosting business services that are implemented as servlets and do not require any of the special clustering and session management features of the Application Server. Due to its smaller footprint and easier administration, the Web Server is also recommended for servlet-based business tier components that do not require clustering and session management.

During implementation design, the web server must be chosen to be an open robust platform for HTTP(S) request processing. It should provide support for various open technologies such as JSP, servlets, and JDBC, as well as content technologies including CGI, SHTML, PHP, ASP, and JSP. The Web Server is optimized to serve static content via caching. It allows for a direct connection to databases and other resources and does not support a complex (distributed) transactional environment involving multiple data sources, global transaction recovery, etc. The Web server also provides support for Web services with servlet-only endpoints.

Open System Application Server

The Application Server provides containers that support APIs. It also allows the business services it hosts to be run in a clustered configuration for better load balancing and session management.

The Application server is an open application container providing a robust and optimized platform. It provides a high-performance platform for a complex (distributed) transactional environment. It is the recommended choice for high volume applications (greater than 100 transactions per second) whose components can reside in the same subnetwork.

Open System Portal Server

An Open System Portal Server will be used for portals.

Applications may integrate with existing portals either by providing a portal channel (portlet) or a link. If the application provides a portlet, its presentation tier must provide channel content by implementing the interfaces called for by an open system portal such as the Java System Portal Server. If the application simply provides a link, no special interfaces are required.

An application should not have its own dedicated production instances of the Open System Portal Server.

Relational Database Server

Applications with large volumes of data, large numbers of concurrent users, and stringent requirements for reliability, availability, and recovery should choose an RDBMS. In keeping with the KNIS open approach, design decisions should by default consider open source RDBMS alternatives before closed source or proprietary ones.

Monitoring Products

Many current solutions for Application Performance Monitoring (APM) require no additional work by developers. Many leading products are based on capture and analysis of TCP and UDP packet traffic. Therefore, no specific KNIS guidelines exist for making special architectural considerations for application monitoring at this point in time. The development team should work with the operations team to implement the appropriate monitoring solutions (for example, Mercury Topaz for monitoring information such as response times and user transaction correctness via synthetic transaction). Use tracking is also of importance in continuing feedback to the design and governance teams.

Network Attached Storage (NAS)

NAS is a storage architecture that allows multiple servers to share file systems over a network. It is similar to NFS-mounted file systems but allows more protocols, provides better performance and security, and is more efficient to run and manage. Although the benefits are the same, with NAS each server can share the same set of application executables and configuration files. The files need only be updated once to update all the servers. This storage architecture is recommended for read-only (or seldom-update) files such as binaries and configuration files.

NAS is intended for file systems (as opposed to raw data). The NAS server assumes responsibility for maintaining the integrity of the file systems, thus it offers some protection against clients that might otherwise corrupt the file system.

Storage Area Network (SAN)

SAN is an architecture that enables attaching of remote computer data storage devices to servers. While NAS uses file-based protocols to do this, SAN uses block-based storage. In general, KNIS's approach will be to architect for NAS since these protocols more easily translate to cloud based services for data storage.

Virtual Private Network (VPN)

A VPN is an easy and cost-efficient technology for connecting networks that share the same security characteristics and trust levels. After establishing a shared session key, the VPN encrypts all data that flow between the networks.

A VPN should not be used to connect networks with differing security characteristics or trust levels, as its security will drop to the level of the least secure connected network.

The KNIS will enable connectivity via VPN, focused on the open standards and community capabilities of OpenVPN (for secure data exchange with stakeholders and access for management functions).

Monolithic Applications and Legacy Applications

Monolithic applications are applications which are web enabled, but with no separation of presentation tier and business logic tier, and no business logic API which can be called to invoke the backend business transactions. Such web-enabled applications can simply be linked from a portal channel. If tighter integration is desired for business reasons, they should be screen-scraped or wrapped to provide portal content. Legacy applications are applications which are not web enabled. Legacy applications should be screen-scraped or wrapped to provide access to their functions from the portal.

Syndication of Value Added Content

- When exchanging data between applications or services, XML files must be used. XML insulates applications from internal details, such as database schema, of the integrated application.

- Legacy applications and data sources which do not have XML capabilities will also be part of our ecosystem and must be engineered for. However, objective solutions that import legacy data into XML servers should be considered.

- Syndication can be by data feeds, RSS, flexible XML or integrated client feeds or other appropriate widget technologies.
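The XML exchange described in the first point above can be illustrated with the JDK's built-in XML APIs. In this minimal Java sketch the class name `IndicatorXmlSketch`, the `observation` element, and its `indicator` and `year` attributes are hypothetical; no KNIS schema is implied.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class IndicatorXmlSketch {

    /** Serializes one observation; element and attribute names are hypothetical. */
    static String toXml(String indicator, int year, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("observation");
        root.setAttribute("indicator", indicator);
        root.setAttribute("year", Integer.toString(year));
        root.setTextContent(value);
        doc.appendChild(root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Reads the value back without knowing anything about the sender's database schema. */
    static String readValue(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }
}
```

The receiving side depends only on the agreed XML shape, which is the insulation the guideline is after. A production parser handling documents from outside parties should also be configured to reject external entities.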

Integration Testing

Although testing occurs at every tier, Integration Testing is especially important. The KNIS expects the implementation team to provide recommendations for standards to be met during integration testing (the SUSA CTO is the approving authority for Integration Testing metrics).

Content Delivery Services

The KNIS currently intends to use the global delivery fabric of a major cloud provider (such as Rackspace or Amazon) as the initial content delivery mechanism. However, we may quickly scale to the size where additional content delivery capabilities (such as Akamai) may be required to ensure usability for end users. Plans for and pacing of this capability will be driven by the CTO and may involve standing up a working group for collegial/community coordination on the best approach.

Technology Trends to Watch

As the KNIS architecture evolves and improves, this foundational document will as well. Another important environmental consideration, however, is the fast changing landscape of technology. Over the planning period for coming KNIS solutions we expect several changes and also know we must be prepared to move to take advantage of emerging opportunities when unexpected technological changes emerge. Some key technology considerations we know we must anticipate include:

Service Registry Changes: With its recent release of the ebXML Web Service Registry, the community is on a road to explore more collaborative Web services that are further decoupled and more readily adapted to inter-enterprise business integration.

Web service trends to watch include:

- Asynchronous message style, rather than a synchronous request-response paradigm

- Document oriented, rather than procedure oriented

- More sophisticated use of standards for data semantics, process orchestration, and workflow

Changing Web Standards: Currently, these standards are mature, but changes are expected.

Run Time Dynamics: Dynamic integration is currently limited by the need for business and service agreements and by technical interface integration at the API level. However, new technologies that advance run-time dynamics in deployment, integration, and management of Web services will soon be introduced. With their release, more ad hoc and dynamic integration can occur, leading to reduced time to market and increased opportunities for efficient, flexible business automation.

Security Standards: The W3C and OASIS are developing standards, such as WS-Security, for negotiating security constraints between service requesters and responders. As these standards evolve and gain further adoption, inherent support by developer tools, APIs, service containers, and identity management products will become available. In addition, standards-based, secure service interoperability should increase.

This will be essential for secure interoperation with external customers, partners, and suppliers. Today, without a common infrastructure to support identity and authorization management, defining standard solutions for securing Web services can be a challenge.

Speed in development: As the IT infrastructure leverages new standards-based APIs and products, the time to develop and integrate services should decrease. Additionally, both internal and externally facing Web services will use common, established infrastructure based on simple yet high-quality standards for security information exchange. This infrastructure will allow for more granular introspection and, therefore, increased security.

Additional items to watch: We must continue to watch Flash, Silverlight, HTML5, REST, SOAP and WSDL developments. The open source community itself is also changing fast and needs to be watched very closely.

The Current SUSA Beta Architecture

The architecture SUSA is currently using for its private beta implementation can be thought of in two ways. First, it is an "operational input" to this version 0.1 KNIS architecture because of the many lessons learned from its implementation. Second, it is an "operational commentary" on the KNIS version 0.1 architecture as it already incorporates many of the principles outlined in this document. This current SUSA architecture is based on a mix of open scalable systems (for example, Drupal is an open content management system and some open MySQL is used) and known/proven proprietary capabilities (for example, .NET and MS SQL as key components of the data management solution). In the current architecture, Drupal and MySQL power all user-facing functionality. .NET and MS SQL power only backend data processing, which is then published as XML to the cloud. There is no real-time access to the Data Management vertical in the current beta site architecture.

- A graphical depiction of this architecture is provided at Figure 9.

- Check with the KNIS architecture team for the most current version and technology details of the architecture.

Figure 9: KNIS's Current Architecture

Glossary

Architecture: Design and description, including functionality, components and their relationships. Representation of the coming system.

API: Application Programming Interface. A description of how to interact with an application or its data.

Channel: A portion (usually a little box) of a portal page that contains content. For example, the stock quote channel on a page.

Content Management System (CMS): Code specifically designed to store, manage, and disseminate documents.

Credentials: Information presented to authenticate a user. For example, a static password, a dynamic one-time password obtained from a token card, or information on a smartcard.

CSS: Cascading Style Sheets

DSXML: Design Extensible Markup Language

GUI: Graphical User Interface

HTML: Hyper Text Markup Language

HTTP: Hyper Text Transfer Protocol

ITIL: Information Technology Infrastructure Library

J2EE: Java 2 Enterprise Edition

JDBC: Java Database Connectivity

JSF: Java Server Faces

JSP: Java Server Pages

JSR168: A standard specification for portlets.

LAMP: Short for Linux, Apache, MySQL, PHP

NAS: Network Attached Storage

OODB: Object Oriented Database

PDA: Personal Digital Assistant

REST: Representational State Transfer. Better describes and defines HTTP-WWW client/server/application interactions.

RDBMS: Relational Database Management System

SAMP: Short for Solaris, Apache, MySQL, PHP

SAN: Storage Area Network

Servlets: A unit of Java code that runs in a web server.

SOA: Service Oriented Architecture. Designing for flexibility using loosely-integrated/coupled suites of services (largely web services). From OASIS: "A paradigm for organizing and utilizing distributed capabilities that may be under the control of different ownership domains. It provides a uniform means to offer, discover, interact with and use capabilities to produce desired effects."

SOAP: Simple Object Access Protocol. A specification for exchanging structured info.

SMS: Short Messaging Service

URI: Uniform Resource Identifier

URL: Uniform Resource Locator

VPN: Virtual Private Network

WOA: Web Oriented Architecture, considered an extension of SOA. Maximizes browser and server interaction by REST and POX (plain old XML).

WSDL: Web Services Description Language. Models and defines the web services available.

WSRP: Web Services for Remote Portlets – A protocol used between different instances of portal servers to enable one portal (a consumer portal) to obtain channel markup from a portlet which resides on another portal (a producer portal).

XSL: eXtensible Stylesheet Language – A software language which can be used to write code which converts data or documents from one markup format, such as XML, to another, such as HTML.

XSLT: eXtensible Stylesheet Language Transformations

XML: Extensible Markup Language. Rules for encoding documents and data in machine-readable formats.

Architecture Resources

Bredemeyer Consulting http://www.bredemeyer.com/ A variety of resources to help software architects deepen and expand their understanding of software architecture and the role of the architect. It has lots of material – papers, presentations, etc. – on software architecture, software architects, and architecting.

Institute of Electrical and Electronics Engineers (IEEE) http://www.ieee.org Through its members, the IEEE is a leading authority in technical areas ranging from computer engineering, biomedical technology, and telecommunications, to electric power, aerospace and consumer electronics, among others. Lots of publications and papers available at no charge; some require a fee.

IEEE Standard 1471: "Recommended Practice for Architectural Description of Software-Intensive Systems" http://ieeexplore.ieee.org/xpl/tocresult.jsp?isNumber=18957

Open Group Architecture Framework (TOGAF) http://www.opengroup.org/ Information flow without boundaries, achieved through global interoperability in a secure, reliable and timely manner.

SEI – Software Engineering Institute http://www.sei.cmu.edu/sei-home.html An office of Carnegie Mellon University, the SEI's core purpose is to help others make measured improvements in their software engineering capabilities. Most SEI material is timely and free.

SEI Architecture Documentation page http://www.sei.cmu.edu/ata/arch_doc.html

Worldwide Institute of Software Architects http://www.wwisa.org/ A nonprofit corporation founded to accelerate the establishment of the profession of software architecture and to provide information and services to software architects and their clients. The WWISA site has lots of white papers, books, etc., on software architecture as a profession.

Zachman Institute for Framework Advancement (ZIFA) http://www.zachmanframework.com/ The ZIFA is a network of information professionals. Its mission is to promote the exchange of knowledge and experience in the use, implementation, and advancement of the Zachman Framework for Enterprise Architecture.

Table of Standards

Standards relevant to the KNIS architecture are summarized below, organized by layers of the OSI model stack:

OSI model

7. Application Layer: NNTP · SIP · DNS · FTP · HTTP · NFS · NTP · SMPP · SMTP · DHCP · SNMP

6. Presentation Layer: MIME · XDR · TLS · SSL

5. Session Layer: Named Pipes · NetBIOS · SAP · SIP · L2TP · PPTP

4. Transport Layer: TCP · UDP

3. Network Layer: IP (IPv4, IPv6) · ICMP · IPsec

2. Data Link Layer: ARP · Ethernet

1. Physical Layer: 802.11 · USB

About This Architecture

This document (Version 0.1) was produced by the SUSA architecture team. Your input and ideas for improvement are strongly desired.

For questions/comments/suggestions please contact SUSA at:

The State of the USA, Inc. 1146 19th Street, Suite 300 Washington, D.C. 20036

or

General inquiries: (202) 540-5400 [email protected]