Chapter 12: Standards and Governance      Last revised: October 27, 2010 

Chapter 12. Standards and Governance in Organizing Systems

Chapter authors: Michael Manoochehri and Robert J. Glushko

Table of Contents

12.1 Introduction
    12.1.1 Setting the Standard: Driving In Samoa
    12.1.2 Implementing Organizing Systems with Software, Specifications and Standards
    12.1.3 When are Standards in Organizing Systems Necessary?
    12.1.4 Processes for Defining and Implementing Standards
    12.1.5 Governance
12.2 Standards in Organizing Systems
    12.2.1 Standards for Depth and Scope of Description
    12.2.2 Standards for Data Formats and Storage Environments
    12.2.3 Standards Wars
    12.2.4 Proprietary versus Open Standards
12.3 Standards-Making
    12.3.1 Standards-Making By Governments
    12.3.2 Standards-Making by Standards Bodies
    12.3.3 Standards-Making by Consortia and Quasi-Standards Bodies
    12.3.4 De Facto Standards Created Outside of the Standards Process
    12.3.5 From de facto to de jure standards
12.4 Governance and Maintenance of an Organizing System
    12.4.1 What is Governance?
    12.4.2 Why do We Need Systems of Data Governance?
    12.4.3 Designing Governance Systems
    12.4.4 The Interplay of Standards Bodies and Commercial Interests
References

12.1 Introduction

12.1.1 Setting the Standard: Driving In Samoa

Whether you travel by bus, car, or bicycle, you always keep to one side of the road. The convention of driving on either the right side or the left side is a legal standard that you, and others who share the road, take for granted. But you must follow it to ensure safe driving and avoid running into other vehicles and pedestrians. In large international cities like London, Tokyo and Hong Kong “look right” messages are painted on the roadway to remind pedestrians
from elsewhere that the side of the road standard might differ from the one they are used to in their home cities like New York or San Francisco. This standard of which side of the road you drive on, simple as it seems, was not decided arbitrarily, but rather, it was adopted as a result of history, convention, and the need for organization. If you were the only person in your country to use the road, you could choose to travel on any side you wanted, even travel right down the middle. But as soon as more than one person needs to use the same road, the risk of collisions compels the creation of a coordinating standard. In 2009, the government of Samoa took the rare step of changing the side of the road standard from driving on the right to driving on the left. The original standard reflected the influence of German colonization in the early 1900s. However, Samoa is both geographically close to and economically intertwined with Australia and New Zealand, former British colonies that follow the British convention of driving on the left side. This proximity gives Samoa access to a nearby source of used cars that would be attractive to Samoa’s relatively poor population. So, the Samoan government decided to use its authority to change the driving standard so that more of its people could afford to buy cars. As one could imagine, this decision was not implemented without controversy and opposition. While the decision benefited people currently without cars, it negatively affected those who already owned them. After a switch like this, what happens to the current market value of the thousands of cars designed to drive on the right? Opponents also claimed that the switch would cause unprecedented safety hazards. If even a small fraction of drivers were not able to immediately get the hang of driving on the other side, the accident rate could increase tremendously. 
Imagine the current fleet of buses, designed with doors that open on the right-hand side: would they now let passengers out in the middle of the street? Who would pay to have the buses modified to put doors on the left-hand side?

12.1.2 Implementing Organizing Systems with Software, Specifications and Standards

In the Samoan side-of-the-road example the organizing system is defined by government regulations and implemented with signs, signals, speed bumps, and other visible and tangible mechanisms and technologies to inform drivers and ensure their compliance. Similar visible physical mechanisms inform users and shape their behavior in the organizing systems of libraries, museums, and other contexts where the things being organized are tangible objects. But to the extent that organizing systems deal with “digital things,” the decisions about how they work and how they should be used are likely to be implemented in software.

A simple organizing system to satisfy personal recordkeeping or some short-lived information management requirements can be implemented using folders and files on a personal computer or by using “off the shelf” generic software such as web forms, spreadsheets, databases, and wikis. Other simple organizing systems run as applications on smart phones or PDAs. Some small amount of configuration, scripting, structuring or programming might be involved, but in many cases this work can be done in an ad hoc manner.

More capable organizing systems that enable the persistent storage and efficient retrieval of large amounts of structured information generally require additional design and implementation efforts. Flat word processing files and spreadsheets are not adequate. Instead, XML document models and database schemas often must be developed to ensure more control over, and validation of, the content and metadata. Software for version and configuration management, security and access control, query and transformation, and for other functions and services must also be developed to implement the organizing system.

It might be possible to add these capabilities and services to an organizing system in an incremental fashion with informal design and implementation methods. But if information models, processing logic, business rules and other constraints are encoded in the software without explicit traceability to requirements and design decisions, the organizing system will be difficult to maintain if the context, scope or requirements change. Implementing an organizing system of significant scope and complexity in a robust and maintainable fashion requires a precise description of the entities and information components it contains, their formats and descriptions, the classes, relations, structures and collections in which they participate, and the processes that ensure their efficient and effective use. Rigorous descriptions like these are often called “specifications” and there are well-established practices for developing good ones for both information models and the processes that use them.

There is a subtle but critical distinction between “specifications” and “standards” in organizing systems. Any person, firm, or ad hoc group of people or firms can create a specification for an information or process model and then use it themselves or attempt to get others to use it.
In contrast, a “standard” is a published specification that is developed and maintained by consensus of all the relevant stakeholders in some domain by following a defined and transparent process, usually under the auspices of a recognized “standards” organization. In addition, implementations of standards often are subject to conformance tests that establish the completeness and accuracy of the implementation. This means that users can decide either to implement the specification themselves or choose from other conforming implementations. The additional rigor and transparency when specifications are developed and maintained through a standards process often make them fairer and give them more legitimacy. Governments often require or recommend these “de jure” standards, especially those that are “open” or “royalty free” because they are typically supported by multiple vendors, minimizing the cost of adoption and maximizing their longevity.

Despite these important distinctions between “specifications” and “standards,” however, in conventional usage “standard” is often simply a synonym for “dominant or widely-adopted specification.” These “de facto” standards, in contrast with the “de jure” standards created by standards organizations, are typically created by the dominant firm or firms in an industry or by a new firm that is first to use a new technology or innovative method. De facto standards and de jure standards often co-exist and compete in “standards wars,” especially in information-intensive domains and industries with rapid innovation. As a result, even though it would be technically correct to argue that “while all standards are specifications,
not all specifications are standards,” this distinction is hard to maintain in practice. So we will treat standards and specifications as synonyms and adopt “standards-making” as shorthand for the process by which specifications or standards are created unless the distinctions are critical.
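
The role of conformance testing described above can be illustrated with a small sketch. It assumes a hypothetical specification for a contact record; the field names and types are invented for illustration and do not come from any published standard. The point is only that once a specification is written down explicitly, any implementation's output can be checked against it.

```python
# Hypothetical specification for a "contact record". The field names and
# types are illustrative assumptions, not part of any published standard.
SPEC = {
    "name": str,
    "email": str,
    "postal_code": str,
}

def conformance_errors(record: dict) -> list[str]:
    """Return the ways in which a record fails the specification."""
    errors = []
    # Every field the specification names must be present with the right type.
    for field, expected_type in SPEC.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"wrong type for {field}")
    # Fields the specification does not name are also reported.
    for field in record:
        if field not in SPEC:
            errors.append(f"unexpected field: {field}")
    return errors

conforming = {"name": "Ada", "email": "ada@example.org", "postal_code": "94720"}
nonconforming = {"name": "Ada", "postal_code": 94720, "nickname": "A"}

print(conformance_errors(conforming))
print(conformance_errors(nonconforming))
```

A real conformance suite would be far more extensive, but even this small check shows why an explicit specification lets users choose among competing implementations with some confidence.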

12.1.3 When are Standards in Organizing Systems Necessary?

Any individual, group, or enterprise can create an organizing system that meets its specific needs, but once this organizing system involves two or more parties with different needs, there is a potential for conflict. For example, if you are the sole user of an email account, you can organize your messages any way you want. You can use any number of folders that need only make sense to you, or you can leave everything unorganized in the inbox. But if you share an email account with another person, they are likely to have different organizational needs or preferences. Perhaps you tend to organize saved messages by sender, while they prefer to organize messages by topic or project. So if the two of you are to share the email account, some negotiation about the organizing system must take place.

Similarly, many businesses have realized that their business information and data are crucially valuable assets, and that revenue can be derived both directly and indirectly from managing them effectively.  But a challenge in many businesses because of the incremental and opportunistic ways in which they adopt information technology is that different departments within the enterprise employ different information models and software applications to manage and process information.  This means that organizing systems that follow different specifications often can’t share and combine information without excessive rework. Retrofitting or replacing these applications to enable efficient interoperability is often possible, and it is usually desirable for the firm to develop enterprise standards for information exchange models rather than pay the recurring costs to integrate or transform incompatible formats (see Chapter 10).   

Well designed systems of organization are important not only for efficiency within a firm, but for interactions with external parties. Automating transactions with suppliers and customers in a supply chain requires that all the parties use the same data format or formats that can be transformed to be interoperable.  A dominant firm might propose to use its internal standard for information exchanges with its supply chain, but its suppliers and customers might prefer a neutral industry standard as the interchange format.  Indeed, if the suppliers and customers have the market power to do so, they might propose that their own format standards be used instead (see Chapter 10).    
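
The value of a neutral interchange format can be sketched in a few lines. The example assumes two hypothetical internal formats for the same purchase order, one used by a buyer and one by a supplier; every field name here is invented for illustration. Each party writes a single mapping to the shared exchange model instead of a separate transformation for every partner.

```python
# Two hypothetical internal formats for the same purchase order.
# All field names are invented for illustration.
buyer_order = {"po_num": "A-100", "qty": 12, "unit_price_usd": 2.5}
supplier_order = {"orderId": "A-100", "quantity": "12", "price": "2.50"}

def from_buyer(rec: dict) -> dict:
    """Map the buyer's internal format to the neutral exchange model."""
    return {
        "order_id": rec["po_num"],
        "quantity": int(rec["qty"]),
        "unit_price": float(rec["unit_price_usd"]),
    }

def from_supplier(rec: dict) -> dict:
    """Map the supplier's internal format to the same exchange model."""
    return {
        "order_id": rec["orderId"],
        "quantity": int(rec["quantity"]),
        "unit_price": float(rec["price"]),
    }

# Once both records are in the exchange model they can be compared directly.
print(from_buyer(buyer_order) == from_supplier(supplier_order))
```

With a shared model, adding a new trading partner means writing one more mapping; without it, each pair of parties would need its own transformation.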

Standards are especially important in industries or markets that have significant network effects, where the value of a product depends on the number of interoperable or compatible products; these include much of the information and service economies. Standards are often imposed by governments to protect the interests of their citizens by coordinating or facilitating activities that might otherwise not be possible or safe. Some of them, like the “side of the road” standard with which we began this chapter, primarily concern public or product safety and are only tangentially relevant to systems for organizing information. But others are highly relevant, especially those that specify the formats and content of information exchange, like the mandates by many European governments for firms doing business with the government to adopt the Universal Business Language, an OASIS standard for transactional documents (see Chapter 10). Other government standards that are important in organizing systems are those that express requirements for retention of auditing information for financial activities, such as the Sarbanes-Oxley Act, or for protecting the privacy of personal information, such as HIPAA and FERPA. Complying with government regulations like these can be expensive and difficult, and many companies, especially smaller ones, complain about the cost. On the other hand, the argument can be made that investing in a rigorous system for organizing information can provide competitive advantages, turning the compliance burden into a competitive weapon (see Joy of SOX).

12.1.4 Processes for Defining and Implementing Standards

It might be possible for a small number of people to agree on an organizing system that meets the needs of each participant. But the potential for conflict increases when more people are involved, and “bottom-up” ad hoc negotiations to resolve every disagreement between every pair of participants just aren’t feasible. Instead, for a large-scale organizing system, standards are usually decided by entities that have the authority to coordinate actions and prevent conflicts by imposing a single solution on all the participants in a “top-down” manner. This authority can come from many different sources, but they can be roughly categorized as “authority from power” and “authority from consensus.”

Mao Zedong's often-cited quote that “political power grows out of the barrel of a gun” explains authority as the power of one entity to dominate other ones. This is the source of power for totalitarian leaders and governments, but is also a description of market power, when the economic dominance of a firm allows it to control how business gets done in its industry. One key part of that control is establishing specifications for data formats and classification schemes in organizing systems, which usually means requiring other firms to use the ones developed by the dominant firm for its own use. This ensures the continued efficiency of the dominant firm's own business processes while making it harder for other firms to challenge its market power. For example, Apple has set standards for iPhone application development that inhibit the creation of applications that also run on competing smart phone platforms (ref here to Apple’s Flash prohibition).

However, the authority that derives from market power can sometimes be challenged by a second source of authority called the first-mover advantage or disruptive innovation.
In technology-driven domains, an upstart innovative firm can introduce a new technology, define specifications for its use, and seize control of the new market enabled by this innovation. The large established firms that controlled the existing market might be hampered by their legacy technology and organizing systems and by their reluctance to change their current successful business model. In contrast to authority that arises from power, another form of authority to create and impose standards comes from consensus. Democratic governments derive “their just powers from the consent of the governed,” and while Jefferson wasn’t thinking about information organization when he included this phrase in the US Declaration of Independence, his perspective that governments are legitimate only when they serve their citizens induces them to create standards for the public good. Indeed, Article I, Section 8 of the US Constitution grants Congress the power to set standards “of Weights and Measures” to facilitate commerce and protect the public.

Consensus is also the authority mechanism embodied in the workings of the open source community, where the freedom to view and change data formats and the code that uses them encourages cooperation and adoption. Consensus also underlies the authority of voluntary standards activities, where firms work together under the auspices of a standards body and agree to follow its procedures for creating, ratifying, and implementing standards.

International and national standards bodies have authority both from consensual participation and from the authority of the governments that created them. The most “standard” of all standards organizations is the International Organization for Standardization (ISO), whose members are themselves national standards organizations, which as a result gives the nearly 20,000 ISO standards the broadest and most global coverage. In addition, there are scores of other national and industry-specific standards bodies whose work is potentially relevant to organizing systems of the sorts discussed in this book.

But standards organizations arguably derive most of their authority from the collective power of their members, because many influential standards organizations like OASIS, W3C, OMG, and IETF are not chartered or sponsored by governments. In addition, firms often create ad hoc “quasi-standards” organizations or “communities of interest” to facilitate relatively short-term cooperative standards-making activities; in the case of quasi-standards organizations, this cooperation might otherwise be prohibited by anti-trust considerations. Finally, at the extreme “lightweight” end of the standards-making continuum, the codification of simple and commonly used information models as “microformats” depends on authority that emerges from the collaboration of individuals rather than firms.

There are many different processes by which specifications become standards. Sometimes a standard is developed from the collective requirements of the participating individuals or firms, but more often a standard evolves from an existing specification submitted to a standards organization by the firm that created it. In other cases, the specifications used by a dominant firm become a de facto standard for other firms in its industry, and they are never submitted to a formal standards-making process.

Just as some people can be excluded from a conversation because they don't know the language being spoken, stakeholders are often excluded from standards-making and governance processes. Some processes embody economic or social biases that determine who can participate. For example, a standards organization can claim that any person or firm is eligible to become a member, but if it costs $50,000 a year to join and meetings alternate between Tokyo, Berlin, Vancouver, and Boston, only the biggest firms can realistically afford to participate. Sometimes the processes work for current stakeholders but fail to identify the needs of future users or possibilities enabled by new technologies.

12.1.5 Governance

“Governance” is “a quality control discipline for assessing, managing, using, improving, monitoring, maintaining, and protecting organizational information” (IBM). We can adapt this broad definition for our narrower purposes as “maintaining an organizing system effectively over time.” Governance is a concern in any organizing system, regardless of its size or scope, because
it is fundamentally about preserving the investments made in designing and implementing it by anticipating and responding to changes in operating context, requirements and opportunities. “Curation” is roughly equivalent to “governance” but is typically used to describe the governance of the objects and organizing systems in libraries, archives, and museums rather than those in enterprise or inter-enterprise contexts.

An obvious governance issue is ensuring that the organizing system is implemented and operated in a way that can persist beyond the people currently using it. If the principles, specifications, and practices embodied in the organizing system are tacit rather than explicit, they can deteriorate or be compromised without warning. This can result in the loss or corruption of information with significant negative consequences for the firm’s reputation as well as financial and legal liability. It is thus essential to implement as much as possible of the organizing system using technology like data schemas, programming language classes and APIs, decision trees, business rules, and other “computable specifications” so that the workings of the organizing system are transparent, verifiable, and easier to secure.

The deliberate nature of standards-making provides assurance that standards-based investments in an organizing system will be preserved, because even when standards change, they do so in a controlled manner with careful versioning and consideration for backward compatibility. No such guarantees exist with proprietary specifications, and indeed vendors often change specifications, especially data formats, without notice in order to offer new capabilities or cause problems for competing implementations. Some large firms participate in numerous standards organizations as a form of competitive intelligence gathering because they can infer from proposed changes to standards what other firms are planning to do.

Technology for organizing systems will always evolve to enable new capabilities. For example, “cloud computing” and especially cloud storage are radically changing the scale of organizing systems and the accessibility of the information they contain. This rapid technological innovation has caused some people to disparage formal and deliberate standards-making and governance processes as barriers to timely and effective adoption of those innovations. On the other hand, hasty or ad hoc specification and adoption of new technologies or specifications can result in brittleness and non-interoperability, especially when they are applied outside the narrow scope in which they were initially developed. Effective governance procedures can ensure that innovations are systematically evaluated and implemented in a way that balances the potential against the risk.

Good governance is also essential in enabling compliance with governmental mandates like the Sarbanes-Oxley Act, HIPAA and FERPA. But implementing the rigorous information management practices required for compliance can yield operational efficiency and strategic flexibility that outweigh the compliance costs.
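
As one possible illustration of a “computable specification,” the sketch below expresses a retention policy as explicit, versioned data rather than tacit practice. The record categories, retention periods, and the `may_delete` helper are all hypothetical assumptions made for this example; they are not drawn from any regulation and are not legal guidance.

```python
from datetime import date

# Hypothetical retention rules written down as data. The categories and
# periods are illustrative assumptions, not requirements of any regulation.
RETENTION_RULES = {
    "version": 2,
    "years_to_keep": {"audit_record": 7, "marketing_email": 1},
}

def may_delete(category: str, created: date, today: date) -> bool:
    """A record may be deleted once its retention period has fully elapsed."""
    years = RETENTION_RULES["years_to_keep"][category]
    # Compare (year, month, day) tuples so that a 29 February creation date
    # does not need special handling.
    expiry = (created.year + years, created.month, created.day)
    return (today.year, today.month, today.day) >= expiry

today = date(2010, 10, 27)
print(may_delete("audit_record", date(2005, 3, 1), today))
print(may_delete("marketing_email", date(2005, 3, 1), today))
```

Because the rule table is explicit and carries a version number, a change in policy becomes a visible, reviewable edit instead of a silent shift in how people happen to handle records.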

12.2 Standards in Organizing Systems

12.2.1 Standards for Depth and Scope of Description

The depth and scope of description is the most fundamental decision in organizing systems, one that is subject to a basic and unavoidable tradeoff principle that a few people can standardize on a lot, but a lot of people can only standardize on a little (Rosenthal et al, 2004). This is both a conceptual and pragmatic constraint on the amount of agreement about the meaning of information components that can be achieved in an organizing system. The conceptual limitation can be measured in terms of a “person-concept” cost element; if semantic agreement comes at a cost, this cost increases with the number of people trying to reach agreement and with the number of concepts they must agree about. The pragmatic constraint is more subtle, reflecting the reality that stakeholders vary in their willingness to compromise to reach semantic agreement. When people have different requirements, different relationships within the set of participants trying to reach agreement, and different extents to which they are subject to the authority behind the desired agreement, it is not surprising that standardization approaches “that require perfect coordination and altruism are of no practical interest” (Rosenthal et al, p 47) . Put another way, the cost to devise and implement standards in complex organizing systems with many stakeholders is sometimes excessive and not always worth it, because they might not all want the same degree of standardization and don’t want to pay for something they don’t want. A corollary principle is that the extent to which an organizing system can be implemented using standards is often determined by the domain of the system because some domains are inherently more describable using objective or deterministic dimensions. In particular, organizing systems that manage scientific or technical data and transactional information are inherently easier to standardize because the information components in these domains have more semantic precision to start with. 
The components in these organizing systems are mostly granular pieces of content that have objective and deterministic properties that can assume only one of a finite range of values.

Somewhat paradoxically, perhaps, when components are easier to describe, there are often a large number of competing standards for describing them. For example, there are many formats for describing time. Some countries use a 24-hour format, while others use the 12-hour am/pm format. While virtually everyone is able to understand these common formats, there are many other ways to describe time. Many computer systems use “Unix” or “POSIX” time, which expresses time as the number of seconds elapsed since midnight “Coordinated Universal Time” (UTC) on January 1, 1970, because this makes it extremely easy to compute with time information and synchronize applications. A scientist might count time using an atomic clock standard that defines time in nanoseconds. Most people don’t need to know or understand the programmer’s or scientist’s standards for expressing the time, and would therefore not use them in an organizing system. The simpler time format is ubiquitous but lacks the special properties of the more complex ones.

The situation is somewhat different when an organizing system manages more narrative, qualitative information. Here the information components are more heterogeneous, larger, and with less intrinsic semantic precision. This makes describing them less deterministic and objective, so there is more variation in those descriptions, making semantic standards less likely. This of course is the well-known “vocabulary problem,” in which people do not agree on the words used to describe things. On this end of the document type spectrum the choice of alternative descriptions is broader, more multidimensional, and shaped by potential uses and strategy rather than by objective criteria of completeness or precision.
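
The competing time formats mentioned above can be seen side by side in a short sketch. The instant used here is chosen arbitrarily for illustration; the code relies only on Python's standard `datetime` module.

```python
from datetime import datetime, timezone

# One instant expressed as Unix time: seconds elapsed since midnight UTC
# on January 1, 1970. The value is arbitrary and chosen for illustration.
unix_seconds = 1288186200

moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

print(moment.strftime("%H:%M"))      # 24-hour clock
print(moment.strftime("%I:%M %p"))   # 12-hour clock with an AM/PM marker
print(moment.isoformat())            # ISO 8601 text form
print(int(moment.timestamp()))       # back to Unix time
```

All four lines describe the same moment. The integer form is the easiest to compare, subtract, and synchronize, while the clock formats are the ones most people actually read.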


But whether or not there are multiple standards for describing the components in some domain, the simplicity of a standard contributes to its adoption, because there isn’t any value in using a complex system of description in an organizing system if a simpler one will suffice for the intended uses and users. A more precise and complex standard will require more knowledge to implement and impose more effort on users. A simpler standard gives up some semantic completeness and expressiveness in exchange for less work and less conceptual complexity.

We can look to library science for an example of the tradeoff between simplicity and expressiveness. Machine Readable Cataloging standards, or MARC, have been in use for decades as a highly granular descriptive model for use in organizing systems for bibliographic materials. However, the granular complexity of the MARC standards meant that their users were almost always trained library science professionals. With the explosion of content generated by the World Wide Web, it became apparent that a simpler, more accessible categorization standard would be needed to allow more non-professionals to participate in bibliographic organization. A new set of standards known as the Dublin Core was developed, with the design goal of being simple and flexible enough to allow for classification by non-experts. While this simpler approach addresses the problem of accessibility, Dublin Core's less granular approach also creates disadvantages for interoperability, as in a case where a less granular entry must be converted into a more granular format. For an in-depth look at the development of MARC and Dublin Core record formats, see Chapter 4, "Metadata: Storing Descriptions."

This tradeoff between simplicity and descriptive power is also illustrated in the design and evolution of HTML, the markup language used in web pages.
When Tim Berners-Lee designed the first version of HTML in 1990 it had a very limited vocabulary of about a dozen tags that described simple structures like titles, headings, and lists. As a result, HTML describes what a web page looks like, not what it is about. Because he envisioned the web as a mechanism for sharing technical publications, Berners-Lee considered taking a more content-oriented approach to description, but he concluded that a fixed set of tags would have to be very large to adequately describe the diverse content of these types of documents. As an alternative to a single HTML tag set, Berners-Lee also considered using SGML, the Standard Generalized Markup Language. SGML would have enabled users to define new sets of tags for describing the content of any type of document. However, Berners-Lee believed that the intellectual and technical overhead of using a metalanguage rather than a fixed vocabulary would make web pages and applications too difficult to implement and use. Many experts in electronic publishing argued that HTML was too limited and that an SGML-powered web was necessary (ref HTML- Poison or Panacea citation).

Berners-Lee turned out to be right, because the simplicity of HTML 1.0 made it possible for almost anyone to create web pages and led to its exponential growth. But the rapid global adoption of the web created two new categories of requirements that could not be satisfied by the simple tag set of HTML 1.0.

The first new category of requirements reflected the desire for more control over the appearance and behavior of web pages so that they could be “branded” like non-web content, enabling firms to differentiate themselves and their products. This led to the “browser war” between Microsoft and Netscape in the mid-1990s in which both added proprietary tags and scripting languages that worked only in their browsers. We’ll say more about this in the next section of this chapter.

The second category of requirements derived from the desire to conduct business on the web. This represented a more fundamental challenge to the simple descriptive model of HTML 1.0 because it requires the encoding of the content of product catalogs and transactional documents in a computer-processable way. The variety of business content and applications means that no single descriptive vocabulary can have the necessary semantic precision. The solution turned out to be XML, the Extensible Markup Language, first imagined as a profile of SGML but later specified as a standalone metalanguage. XML makes it possible to create arbitrary sets of content-oriented information components for describing things. Because each kind of thing can have its own customized set of tags, XML applications are sometimes called domain-specific languages. XML-tagged content isn’t displayed in browsers unless it is associated with presentation instructions in a style sheet, and this clean separation of content and presentation is fundamental to making information reusable in different applications, devices, and contexts. Those who argued against HTML 1.0 as too simplistic felt vindicated.

But the debate about the tradeoffs between simplicity and expressiveness hasn’t gone away. The same issues have arisen more recently with respect to the methods and descriptive technologies for the "Semantic Web." Like that of XML’s domain-specific languages, the goal of the semantic web is to make web pages and other online information sources understandable by people as well as by computers. But the semantic web vision goes further.
It assumes that the computer-processable descriptions of content aren’t fixed, and that people and computers should be able to add additional descriptions and metadata to existing content or create new content as "meta-pages" that facilitate semantic processing of all the other descriptions and metadata. Creating these aids to finding and understanding information sources is of course a traditional activity of librarianship and cataloguing, but in the semantic web this work is mostly done by computational processes that use metadata links to locate resources and make inferences in the directed, labeled graph of assertions that interrelate them to create new knowledge (see Chapter 7).

The complex but semantically expressive approach to making the web semantic proposes a set of standards for making semantic assertions and defining the ontologies needed to understand the terms and properties used in the assertions. The most commonly used standard for the former is the Resource Description Framework (RDF), and the most commonly used standard for the latter is the Web Ontology Language (OWL), both of which were developed by the World Wide Web Consortium, the standards-making organization started and headed by Tim Berners-Lee. RDF+OWL approaches are making headway in enterprise applications and organizing systems where making content and applications more semantically-aware has the potential to reduce the significant cost of doing systems integration, content management, and compliance with semantically-unaware systems.

In contrast, the argument for a simpler approach to making the web more semantic revolves around less rigorously defined specifications collectively known as microformats. Instead of following rigorous and deliberate methods to define domain-specific languages, the developers of microformats codify information models that already exist for small chunks of structured information like personal contact information, events, and content licenses. Furthermore, microformats are implemented using attributes on existing HTML tags. A similar approach is taken in RDFa, which uses attributes (hence the "a") to embed RDF statements into XHTML tags. Embedding semantic information into the presentation framework of HTML forgoes the benefits of separating content from presentation, but it also lowers the cost of moving to a more semantically-aware web.

This section has described the design challenges that arise when making choices about the depth and scope of description in an organizing system, especially those due to competing standards. These problems can also be dealt with at “run time” through the use of hub languages, crosswalks and transforms (see Chapter 10). Of course, it is often straightforward to automate the conversion of semantically rich and highly granular descriptions to less precise and coarser component models because this process is essentially throwing information away. This is often called “down-translation” because like water flowing downhill, it naturally occurs without any energy source. The reverse process of converting semantically thin or coarse descriptions into richer, more granular ones is called “up-translation” for the obvious reason that it takes work (and often human intervention) to increase the semantic value of the content.
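The asymmetry between down-translation and up-translation can be sketched in a few lines of code. The field names below are hypothetical stand-ins for a granular bibliographic record and a coarse one with Dublin Core-style elements (title, creator, publisher, date). This is an illustration under those assumptions, not an official MARC to Dublin Core crosswalk, and the sample record is invented.

```python
def down_translate(granular: dict) -> dict:
    """Collapse a granular record into a coarse one; distinctions are discarded."""
    title = granular["title_proper"]
    if granular.get("title_remainder"):
        title = f"{title}: {granular['title_remainder']}"
    return {
        "title": title,
        # Main and added authors become one undifferentiated list.
        "creator": [granular["main_author"], *granular.get("added_authors", [])],
        "publisher": granular["publisher_name"],
        "date": granular["publication_year"],
        # The place of publication has no target element here and is dropped.
    }


def up_translate(coarse: dict) -> dict:
    """Best-effort reverse mapping; None marks what only a person can supply."""
    return {
        "title_proper": None,       # where does the title end and the subtitle begin?
        "title_remainder": None,
        "main_author": None,        # which creator was the main entry?
        "added_authors": None,
        "publication_place": None,  # never carried over in the first place
        "publisher_name": coarse["publisher"],
        "publication_year": coarse["date"],
    }


def fields_needing_review(record: dict) -> list:
    """List the fields that automated up-translation could not fill in."""
    return sorted(name for name, value in record.items() if value is None)


# Invented sample record, for illustration only.
record = {
    "title_proper": "An Example Title",
    "title_remainder": "With a Subtitle",
    "main_author": "First Author",
    "added_authors": ["Second Author"],
    "publication_place": "Example City",
    "publisher_name": "Example Press",
    "publication_year": "2010",
}
coarse = down_translate(record)
restored = up_translate(coarse)
print(coarse)
print(fields_needing_review(restored))
```

Down-translation here is a pure function with no decisions to make, while up-translation leaves five of the seven fields for a cataloguer to resolve.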

12.2.2 Standards for Data Formats and Storage Environments

Designers of organizing systems must often face the challenge of the great variety of data formats in which they encounter information. Some common legacy formats are decades old, like MARC records in online library catalogs, COBOL flat files in business applications, and EDI messages for business-to-business document exchange. Each of these examples is record-oriented, has little or no content markup, and faces competition from technically superior formats based on XML. Despite these disadvantages, each format is still in widespread use, as it can be extremely expensive to convert data and the software that processes it from an existing legacy standard to a newer, more capable standard. When a firm or industry experiences this kind of “lock in” or constraints on technology choices because of an earlier choice it is described as "path dependence" (David, 1985).

Similarly, consider how information storage environments have evolved in the last few decades. Storage technology changed from tapes to floppy disks to hard drives, with capacities increasing from kilobytes to megabytes to gigabytes. When a new storage technology is introduced, information stored on prior formats sometimes is migrated to the new ones, but not always, and storage is now so cheap that every device has some. Sometimes formats directly compete and “winner take all” or “network” effects make one of them lose and disappear from the market (as VHS videotape formats caused Betamax to do), but even then people who bet on the wrong format don’t immediately rush off to buy the winner. The Blu-ray optical disc format offers many times the storage capacity of DVD formats, but the latter show no signs of going away anytime soon.

New data or storage formats are often designed to take advantage of or to enable new device or application capabilities.
But many organizing systems operate in complex environments with many different data formats and devices that are collectively incompatible. This makes it logically impossible to define an optimal solution, and tradeoffs are always necessary. Should you create a least common denominator interchange or “hub” format, with crosswalks that “down translate” as required to achieve interoperability? Should these transformations be carried out “in a batch” at design time or “as needed” at run time? (See Chapter 10).

Problems posed by multiple formats arise at all scales of organizing systems, from those for personal information management, to those for small and large businesses, to non-profit or public sector systems like those in libraries and government agencies. How these problems are dealt with depends in general on available resources and technical sophistication, and for businesses additional considerations include the competitive and regulatory environments they face.

Some of these dilemmas are avoided with “cloud-based” or “virtual” storage that treats storage as a service rather than as a physical device. This abstraction means that storage is for all practical purposes unlimited, but it creates privacy and availability concerns. If a cloud-based organizing system “goes down” it becomes completely inaccessible, a concern people have raised about the Google Books collection, which is often touted as the library of the future. Physical copies of books never “go down” and the only format incompatibility that owners of physical books face is when a book doesn’t fit properly in a backpack or on a shelf.

But recent trends in the digital book industry are presenting consumers with the challenge of having to choose between incompatible electronic book formats. One of the most widespread digital book formats, known as EPUB, is an open XML-based standard that is supported by a standards organization known as the International Digital Publishing Forum. While many digital books are distributed with no restrictions on copying or sharing, the EPUB specification allows publishers to produce digital books that are wrapped with a Digital Rights Management (DRM) system to restrict copying.
However, the EPUB standard does not strictly define what type of DRM systems may be used, leaving retailers free to choose a copy protection system that suits their business models. Therefore, despite being based on the same, nominally open specification, EPUB digital books that are sold by one retailer may not be compatible with readers that support files bought from a different retailer.

Nearly every digital bookseller and reader supports the EPUB standard, with one major exception. Market leader Amazon.com instead publishes books in a proprietary digital format known as AZW. By July 2010, Amazon not only claimed 75 percent of the digital book market (ref?), but the company reported that it had begun to sell more electronic copies of books than physical versions. The major business advantage in selling digital books using a single proprietary format is that Amazon is able to lock buyers into its own retail ecosystem. Amazon is also able to circumvent the need to invest any time or money in the standards-making process. Given the choice to support an open digital book standard, Amazon decided to take advantage of its strong market position and forge ahead with a proprietary digital book format.

Ultimately, consumers will determine the success of Amazon's strategy. The open nature of and wide support for the EPUB standard may allow for a fast pace of innovation in the format, perhaps resulting in new features from publishers that use the EPUB platform. On the other hand, consumers may decide that Amazon's large selection of books and dominant market position produce a more attractive product.


12.2.3 Standards Wars

The power to impose a standard of organization can reflect the market power of players within an industry. In some cases, a dominant company or technology will drive stakeholders into adopting a single standard, while in others, groups of firms within an industry may find economic incentive to favor one standard over another.

Standards "wars" tend to occur when different firms or groups of firms develop two or more standards that address the same needs. Not surprisingly, the competing standards are often incompatible on purpose. At first this lets each standard attract customers with features not enabled by the other, but it ends up locking them in by imposing switching costs.

The “browser war” in the early days of the web is a classic example. In order to differentiate themselves, the two dominant web browsers of the mid-1990s, Netscape Navigator and Microsoft Internet Explorer, each added features that did not conform to the HTML specification. Each firm added support for its own non-standard page elements. This was an era which introduced customers to the notorious "blink" (that made text turn on and off) and "marquee" (aka “moving ants”) elements. These incompatibilities motivated the creation of the World Wide Web Consortium (W3C) in 1994 to restore control over the HTML standard, but Microsoft ultimately won the war by giving away its browser, which drove Netscape out of business.

Entire industries may delay innovation while waiting for a standards victor to emerge, and thus a standards battle can be costly to all involved. If consumers are unsure whether a particular media or document description format will have long term viability, they may forgo purchasing until it is clear that one will be a better choice.

A major concern for many consumers of business productivity software is data interoperability.
If a company creates a document, and needs to share it with a customer or client, what guarantee is there that the recipient will be able to read the file? Will digital documents created today be readable and editable decades into the future, even if the company that provided the software that created them goes out of business?

For many years, the dominant provider of software for word processing, spreadsheets, and presentations has been Microsoft, creator of the Office suite. Constant feature requests from the large consumer base resulted in frequent updates to Microsoft's proprietary file formats. In order to maintain document compatibility with other users, customers had little choice but to upgrade with each new version of the product. Microsoft eventually faced pressure from government entities, some of whom had begun to demand that file formats used by government office workers be based on open standards.

Meanwhile, Sun Microsystems attempted to erode Microsoft's market dominance by releasing an office productivity suite as a freely available open source product. This product, later named OpenOffice, also featured an XML-based document format known as OpenDocument, which was ratified as a standard by OASIS in 2005. In 2006, Microsoft submitted the specification for Office Open XML (OOXML) to Ecma International, and later to ISO for ratification as an open, patent-free standard for business documents.


The existence of two similar, open, and competing standards meant that consumers had to invest in research to ensure that whichever product they chose would provide compatibility within their industry. Some governments chose one format over another outright. After years of review, Denmark notably decided not to endorse one format or another, but rather to set guidelines that open document standards must meet in order to be used by the Danish government. The pressure by governments to ensure interoperability has led Microsoft, as well as the makers of many competing business productivity packages, to support a variety of standards in some fashion.

12.2.4 Proprietary versus Open Standards

The contest between proprietary and open standards is a particular kind of standards war. If the conflict over a specific standard is a battle, the war is the sum of the battles, and this one is fought on multiple fronts.

De facto standards designed by a single dominant firm are often proprietary, meaning that using them carries patent, copyright, or trademark restrictions. This does not mean that they are completely unavailable to use or implement, but there may be a fee for doing so or they may be made available only to business partners of the firm that owns the standard.

On the other hand, just because a standard was developed by a standards organization does not mean that it will be freely available for any entity to use. Standards organizations usually require that participating firms declare any relevant intellectual property (IP) they own that is used by the standard, but they do not require that the firms give up their rights to the IP. Since all of the well established firms in a given industry will have some IP portfolio, it is common for them to agree to grant each other “reasonable and non-discriminatory” (RAND) terms in licenses or rely on generic cross-licenses, sometimes jokingly called “mutual non-aggression pacts.” If the standard addresses a particular, well-defined industry need, RAND terms are usually not an obstacle because the technology that implements the standard is often not the most substantial cost of designing and operating the organizing system.

While governments may sometimes require that the data formats they use be completely open, corporations have less need for a particular data standard to be completely open and publicly available as long as licensing fees are not excessive. Businesses will often gravitate toward standards that are cheaper to implement overall.

However, more recent practice in software and web services domains has been the use of royalty-free (RF) rather than RAND licenses.
This enables firms that did not participate in the standards-making to use the standards without compensating the firms that did. Since the set of affected stakeholders in a domain is often much larger than the memberships of standards organizations, this often substantially increases the adoption of the standard. Royalty-free standards are usually described as “open” by the standards bodies that publish them.

The open source community has long clamored for completely free standards, largely because many open source users are individuals for whom even modest license fees might be prohibitive. OASIS and the W3C have modified their intellectual property policies to accommodate some of these concerns, but many open source advocates remain dissatisfied, perhaps because they don’t appreciate how nuanced the issues are for large established firms like IBM, Oracle, and Microsoft that have a mixture of open source and proprietary activities and implementations.

12.3 Standards-Making

Tim Bray, one of the editors of the XML standard and a well-respected software technologist, argues that one should never make a new standard if it is possible to use an existing one. But as discussed in Chapter 3, it is essential that a descriptive vocabulary fits the requirements of a domain before it is applied in an organizing system. If existing standards don’t meet some requirements, standards making might be necessary. This section describes different ways that this happens.

Processes by which specifications and standards emerge differ widely. Much of the process is shaped by the source of the authority needed to make the standard, because this determines which stakeholders are involved and their sense of what an appropriate process might be. In this section we’ll consider three sources of standards-making authority: governmental, institutional (standards or quasi-standards bodies), and market power. Institutional and market-based authority aren’t mutually exclusive, since market power shapes participation in standards-making organizations, but it is easier to explain them as if they were separate.

12.3.1 Standards-Making By Governments

Governments have inherently long time horizons for their actions, they need to serve all citizens fairly and without discrimination, and they (should seek to) minimize cost to taxpayers. Each of these principles is an independent argument for standards and taken together they make a very strong one.

Indeed, one of the founding goals in the US Constitution is to protect the public interest, and this is enabled by granting Congress the power to set standards “of Weights and Measures” to facilitate commerce. Setting standards is a key role of the National Institute of Standards and Technology, part of the Department of Commerce, and other departments have similar standards-setting responsibilities and agencies, like the Food and Drug Administration (FDA) in the Department of Health and Human Services. In addition, independent government agencies like the Federal Communications Commission (FCC) and Federal Trade Commission (FTC) set numerous standards that are relevant to information organizing systems. And of course, the Library of Congress maintains procedures and standards needed “to sustain and preserve a universal collection of knowledge… for future generations” (LOC.gov/about).

Government organizations often have data asset needs that address the goal of providing the greatest possible access and interoperability at the lowest possible cost to citizens of the state. Therefore, government-mandated data needs tend to emphasize organizing systems that are unencumbered by legal restrictions or technical barriers. In this case, data formats that are free to use and well supported are ideal, because they lower the barrier to entry for citizens to access the data. Backwards compatibility is also important, as governments produce a great deal of information needed to create, evaluate, and refine policies and decisions.


In 2005, Eric Kriss, then the Massachusetts Secretary of Administration and Finance, told a meeting of the Massachusetts Software Council, "it is an overriding imperative of the American democratic system that we cannot have our public documents locked up in some kind of proprietary format, perhaps unreadable in the future, or subject to a proprietary system license that restricts access." [citation] Kriss was referring to the use of an open, public, and well-documented data standard for describing business documents. This statement is noteworthy in that it was an early example of a policy maker referring to a digital system of organization as important to the sovereignty of a government.

In 2009, the US Securities and Exchange Commission (SEC) issued a requirement that publicly held American companies publish financial information on their websites in a standard data format. This format, known as eXtensible Business Reporting Language (XBRL), was chosen for various reasons, including the fact that data provided in this standard can be used by various software tools and requires no licensing fee.

While the SEC claims the power to require that financial information not only be reported, but reported in a standard format, it does not claim the power to require companies to choose a particular set of software tools to create this data. If the SEC mandated that only a single software vendor’s product could be used to create XBRL documents, it would come into conflict not only with American anti-trust laws, but also with the power of the providers of software solutions. Even a powerful government organization like the SEC has limits.

12.3.2 Standards-Making by Standards Bodies

A standards organization can be thought of as an entity that is primarily concerned with the creation, adoption, and revision of standards. Some standards organizations are chartered by countries and some are created as global ones under the auspices of the UN. Most organizations that create standards are governmental or non-profit entities. Standards organizations often have codified rules that dictate membership requirements as well as the procedures for how standards are developed.

An association with a standards organization can also help legitimize a commitment to interoperability in the eyes of other stakeholders. As noted earlier, large corporations sometimes have a great deal of power in defining systems of organization. While smaller entities might not be able to set standards on their own, the process of joining a standards organization gives them a defined role at the bargaining table.

Another important issue for standards organizations to address is how they balance the needs of stakeholders. In standards organizations whose members are nations, “one country one vote” for most activities seems fair but there may be provisions for the large or economically dominant countries to have additional influence. Similar concerns arise in organizations whose members are companies. Some standards organizations also have additional member or other stakeholder types, including academic researchers and “civil society” non-governmental organizations (NGOs).

One of the most well known standards organizations is the International Organization for Standardization (ISO). ISO, headquartered in Geneva, Switzerland, is responsible for many widely-used standards across many industries. The standards that ISO maintains range from the data formats used in physical media to the standards used by major programming languages. ISO's members are not individual organizations; rather, they are primarily representatives of nations. The development of an ISO standard is generally a complex process initiated by committees of experts representing a particular industry. ISO's rigid bureaucratic structure reflects the position of the organization as a respected and powerful international authority.

12.3.3 Standards-Making by Consortia and Quasi-Standards Bodies

An organization doesn't have to be widely recognized as an "official" standards organization to promote and maintain widely used data specifications. Many useful specifications are created by what we might call quasi-standards bodies. These organizations share the open, transparent specification development process of larger standards organizations. Examples include technology industry consortia, as well as bodies that deal specifically with the development of standards for digital documents, such as the World Wide Web Consortium (W3C) and the Organization for the Advancement of Structured Information Standards (OASIS).

Even for powerful corporations, the financial investments necessary to define and maintain a standard for data organization may be cost prohibitive. Therefore, it is often financially advantageous for competing corporations to join consortia that are set up primarily to define and maintain standards. These standards organizations may ultimately help to lower costs and improve interoperability across an industry. In other cases, quasi-standards organizations may be created by groups of companies who want to establish ongoing organization and governance procedures to maintain and extend a standards "family" around an initial standard.

The Internet Engineering Task Force (IETF) is an interesting example of a quasi-standards organization that consists of an international collection of volunteers. IETF meetings are open to the public, and much discussion about specifications and standards is shared through open participation on email lists and Internet forums. Standards proposals are submitted to the IETF in the form of publicly available documents known as "Requests for Comments" (RFCs). It is no accident that the relatively inclusive structure of the IETF is reminiscent of the decentralized structure of the Internet. As a result, many common data standards used in Internet technologies, such as the TCP/IP and OAuth protocols, are developed or adopted by the IETF.

Despite the potential for quasi-standards organizations to essentially fulfill the organizational role that characterizes larger standards bodies, governments may require the use of standards that are backed by a widely recognized standards organization. Therefore, a quasi-standards body may spend a great deal of energy in attaining recognition as an official standards body.

It is important to recognize the role of traction in the development of standards.
While any organization has the ability to create a specification and nominate itself as a standards body, the utility of any standard is determined by the amount of acceptance and use that it receives. In some instances, a single firm might attempt to create an ad hoc organization that mimics the features of a legitimate standards body. However, these types of institutions lack the open and transparent features of organizations that exist to promote the widespread adoption of standards.

12.3.4 De Facto Standards Created Outside of the Standards Process

Not all firms participate in standards activities. For start-up firms trying to establish a product and marketplace presence, diverting scarce engineering resources to standards activities might be a bad investment. Such a firm just creates whatever specifications for information models and processes it needs to carry out its business model. But if the firm succeeds, these proprietary specifications might become widely adopted by its partners and customers, creating “de facto” standards rather than the “de jure” ones created by standards and quasi-standards organizations.

When a firm is large and powerful it can define data standards that maintain its market dominance and discourage competition. But unlike standards imposed by government authority or by standards organizations, those imposed by private firms often face ongoing challenges to their legitimacy because their adoption isn’t mandatory and because another firm or organization can propose a rival standard. For example, Apple can set standards for iPhone applications, but developers might reject the iPhone and write software for more open smart phones. Similarly, groups of customers or clients can often influence standards and organizing systems even if they have no direct means to create them.

12.3.5 From De Facto to De Jure Standards

Occasionally a firm that has a de facto standard decides that the benefits of making it an open standard outweigh the benefits of maintaining it as a proprietary one. For example, in the early days of the web Netscape created the language JavaScript to add dynamic features to otherwise static web pages. Microsoft reacted to Netscape's new feature by releasing a mostly compatible scripting language that it called JScript. While the specifications of JavaScript and JScript were close and solved similar problems, the differences were great enough to cause web developers to invest time in workarounds or, worse, to design web pages that conformed to a single browser platform. Netscape submitted its JavaScript specification to Ecma International in 1996, and the resulting standard, ECMAScript, is now well accepted for scripting languages in web browsers.

Another example of the widespread use of a specification prior to standardization is "XMLHttpRequest," developed by Microsoft to allow the Internet Explorer browser to retrieve online data without reloading web pages. This specification fueled an explosion of browser-based application development and became a de facto standard in so-called AJAX implementations. Rather than prohibiting its use by competing browsers, Microsoft contributed XMLHttpRequest to the W3C, which has since begun the process of developing a standard definition.


12.4 Governance and Maintenance of an Organizing System

12.4.1 What is Governance?

While automated data exchange clearly has a technical aspect, in practice there are many other issues involved in addressing interoperability using data standards. Employees must understand how their actions and software use affect the quality of an organization's data management. The members of an organization who make decisions about which software to purchase must understand whether the data produced by these applications are interoperable with other systems. The process and strategy for dealing with these issues is known as data governance.

Data governance is primarily concerned with the creation and enforcement of best practices for managing data. Governance is not merely a matter of solving technical problems. Khatri & Brown, authors of Designing Data Governance, write that governance should be concerned with "what decisions must be made to ensure effective management and use of IT" as well as "who makes the decisions." In summary, a governance strategy identifies a set of policies necessary to ensure data organization best practices, as well as the hierarchy of decision makers who are responsible for the implementation of those policies.

12.4.2 Why do We Need Systems of Data Governance?

Organizations that face growing data integration problems also face ever-growing costs. For example, if an organization stores a great deal of data in incompatible formats, there may be a significant cost if the data needs to be combined into a single database. Dan Woods, in an essay entitled Why Data Quality Matters, asserts that "it is far cheaper to be proactive in cleaning up data quality messes than to wait for a meltdown."

Governance is a process that must reflect coordination between multiple departments in an organization. While a considerable amount of investment may go into software solutions for data management, that investment is unlikely to pay off without proper coordination between employees and an investment in best practices and training.

Governance is not just a principle that aids an organization's ability to organize data efficiently. Increasingly, governance strategies are becoming a necessity for companies to address regulatory requirements mandated by governments. In 2001, the sudden collapse of Enron was the largest bankruptcy the United States had yet experienced. Enron’s management famously concealed financial failures behind poorly reported financial statements. The financial subterfuge and fall of Enron were followed by several other high-profile financial scandals. These events prompted Congress to pass the Sarbanes-Oxley Act of 2002 (sometimes known as “Sarbox” or “SOX”), which mandated that publicly traded companies implement internal financial controls. SOX also mandated that the CEO and CFO be held legally responsible for accurate financial reporting.

SOX has had a large impact on American corporations. With strong legal ramifications for corporate management, internal data practices could no longer be placed solely in the hands of technical staff. Enterprises suddenly faced the challenge of ensuring that internal systems supported the financial controls that the law required. The financial impacts of SOX are constantly debated, and a great deal of criticism of the law has centered on the possibility that increased financial overhead has reduced the competitiveness of American companies. On the other hand, ensuring that internal best practices are observed may produce long-term benefits for corporations, including increased efficiency and data interoperability.

Even where it is not legally required, following internal IT governance best practices may often be the most successful strategy for dealing with changes in regulatory environments. A major challenge for organizations of any size is to engage in data transactions with various outside parties. Multinational corporations must often provide financial reporting data in multiple formats and standards, as international and local regulations may differ. Smaller organizations face similar challenges, from reporting tax information to handling insurance claims, purchase orders, and countless other transactions that require data-intensive applications.

Often, following proper IT governance practices will also provide an organization with the most efficient methods of managing data. Despite the possibility of increased initial investment, reducing data redundancy and inconsistency throughout an organization may save money and time in the long run.

Consumer acceptance of mobile devices and web service providers has resulted in a great deal of personal information being stored in online databases. However, constantly evolving regulations around privacy and the use of this personal information by web services may mean that reporting requirements for web service providers will change in the future. If laws are passed that require organizations to demonstrate that they are following privacy and security guidelines, those who are already following best practices for data accountability will have a distinct advantage.
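
The integration cost described earlier in this section can be made concrete with a small sketch. The Python example below assumes two hypothetical departments that export overlapping customer data in incompatible formats (CSV with US-style dates, and JSON with ISO 8601 dates). All field names and sample values are invented for illustration. It shows the kind of conversion code that has to be written and maintained for every such format before the records can be combined into one data set.

```python
import csv
import io
import json
from datetime import datetime

# Hypothetical export from a sales department: CSV with US-style dates.
SALES_CSV = """customer_id,full_name,signup
17,Ada Lovelace,03/15/2009
42,Alan Turing,11/02/2008
"""

# Hypothetical export from a support department: JSON with ISO 8601 dates.
SUPPORT_JSON = """[
  {"id": "42", "name": "Alan Turing", "created": "2008-11-02"},
  {"id": "99", "name": "Grace Hopper", "created": "2010-01-20"}
]"""


def from_sales(text):
    """Convert sales CSV rows to a common schema (id, name, signup)."""
    for row in csv.DictReader(io.StringIO(text)):
        signup = datetime.strptime(row["signup"], "%m/%d/%Y").date()
        yield {
            "id": int(row["customer_id"]),
            "name": row["full_name"],
            "signup": signup.isoformat(),
        }


def from_support(text):
    """Convert support JSON records to the same common schema."""
    for rec in json.loads(text):
        yield {"id": int(rec["id"]), "name": rec["name"], "signup": rec["created"]}


def combine(*sources):
    """Merge records by id; the first source to mention an id wins."""
    merged = {}
    for source in sources:
        for rec in source:
            merged.setdefault(rec["id"], rec)
    return [merged[key] for key in sorted(merged)]


combined = combine(from_sales(SALES_CSV), from_support(SUPPORT_JSON))
for rec in combined:
    print(rec)
```

Each additional format requires another converter like `from_sales`, and each converter embeds assumptions (here, the date layout and the rule that the first source wins) that someone in the organization must decide and document. Agreeing on a common format in advance removes most of this work.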

12.4.3 Designing Governance Systems  

Defining regulations for how a data organization system should be implemented and governed is not the only purpose of data governance. A governance system should also define which personnel in an organization are responsible for making decisions and enforcing best practices. In order to ensure that an organization system is being used effectively, it is important to define how data will be stored and shared, as well as to clearly define who will be responsible for ensuring that best practices for data management will be followed. 

Who is Responsible for Data Governance? Five Decision Domains 

Earlier in this chapter, we introduced the concept that, given the choice, standards for organizing data should generally be followed when available. The same idea may be applied to an organization's approach to systems of governance.  

Some industry consortia publish governance guidelines that codify best practices for establishing an IT governance policy. These guidelines often contain scoring and monitoring strategies for evaluating internal practices. For example, the IT industry group ISACA publishes the Control Objectives for Information and related Technology (COBIT). The SEC has recommended that organizations bound by SarBox regulations follow a similar set of practices described by the Committee of Sponsoring Organizations of the Treadway Commission (COSO).

There are many challenges in the process of developing a system of data governance for an organization. Where should a person tasked with developing a system of governance start? In a paper entitled Designing Data Governance, Vijay Khatri and Carol Brown describe five major points of concern for anyone attempting to employ a data governance strategy.

In Khatri and Brown's model, Data Principles are the overarching governance principle which informs all other aspects. Focusing on data principles means asking questions about the importance of each particular type of data being described. What is the data being used for? Khatri and Brown suggest that this aspect of governance is important enough that it should fall under the responsibility of the company's executive management team.

Ensuring an acceptable level of Data Quality, on the other hand, is the responsibility of many stakeholders in an organization. Data producers, such as retail workers, generate a great deal of customer information. When an organization is large enough, it might consider hiring an individual whose primary task is to ensure a high level of data quality.

Having complete and useful Metadata allows for greater opportunities for interoperability, and again, this responsibility should be in the hands of an executive in charge of data management.

A concern about Data Access refers to the need to understand how to secure your data. How do employees learn about best practices for data security?

A sometimes overlooked aspect of institutional data governance is Data Lifecycle, or how data is archived and referred to over time. This issue also covers the need of an organization to meet legislative requirements for producing information for regulatory purposes. This responsibility should fall under the auspices of a CIO or high-level data manager, especially since some legislation will hold the executive staff of a company criminally accountable for reporting transgressions.
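
One way to make the five domains actionable is to record them as a simple responsibility table. The Python sketch below is only an illustration, not part of Khatri and Brown's paper: the class name, the field names, and the wording of each question are assumptions that paraphrase the paragraphs above. The owner is left empty wherever the text above does not name one, which makes unassigned responsibilities easy to spot.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DecisionDomain:
    name: str
    question: str          # paraphrase of the concern described above
    owner: Optional[str]   # None where the text above names no owner


DOMAINS = [
    DecisionDomain(
        "Data Principles",
        "What is each type of data used for, and how important is it?",
        "executive management team",
    ),
    DecisionDomain(
        "Data Quality",
        "Is the data produced across the organization of acceptable quality?",
        "many stakeholders; a dedicated data quality role in large organizations",
    ),
    DecisionDomain(
        "Metadata",
        "Is the data described completely enough to support interoperability?",
        "executive in charge of data management",
    ),
    DecisionDomain(
        "Data Access",
        "How is data secured, and how do employees learn security practices?",
        None,
    ),
    DecisionDomain(
        "Data Lifecycle",
        "How is data archived over time and produced for regulators?",
        "CIO or high-level data manager",
    ),
]


def unassigned(domains: List[DecisionDomain]) -> List[str]:
    """Return the names of domains that have no responsible owner yet."""
    return [d.name for d in domains if d.owner is None]


print(unassigned(DOMAINS))
```

A governance design would be incomplete until `unassigned` returns an empty list, since the purpose of the model is to pair every kind of decision with someone accountable for it.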

The events that prompted the passing of the Sarbanes-Oxley Act in the United States illustrate the reality that regulations and political sentiment can change incredibly quickly. SarBox essentially required that corporations report financial information in a complete, efficient and unambiguous way. Corporations that had already developed good practices for meeting data compliance needs were at an advantage, while others were caught unprepared and forced to quickly invest in resources to catch up.

A system of organization should not only meet the requirements described in previous sections of this chapter, but should also be able to evolve in the future. In an article entitled Governance in the Digital Age, Sharon S. Dawes sums up the scope of planning for the future by writing "infrastructure suited to the future of government must consider values and policies, and human, organizational, institutional, and societal factors in addition to foundational tools and technologies."

 

12.4.4 The Interplay of Standards Bodies and Commercial Interests

While best practices for data governance exist within organizations, the concept of governance also applies to the interplay between standards bodies, industry, and the public. In recent years, perhaps the most massive example of data governance can be found in the system of standards and organizations involved in maintaining the data interoperability of the Internet. Indeed, the sheer scale of the Internet has produced a daunting governance challenge.

Internet standards must balance the needs of users from every part of the world, some of whom may have limited access to the latest technology or are challenged by language barriers or other inequities. Internet standards must be rigid enough to ensure a high degree of interoperability, while providing enough leeway for technology companies to create new products and innovations. Apart from ensuring maximum data interoperability across a vast diversity of hardware, the governance of the Internet includes issues such as how domain names are allocated and even which media types are accepted as standards.

At its inception, the Internet was primarily an American invention. However, stakeholders from countries all around the world are now able to register domain names and host websites. The organizing system for domain names, known as the Domain Name System (DNS), comprises a collection of protocols and standards. Many of these standards are governed by an organization called the Internet Assigned Numbers Authority (IANA), which is in turn administered by an American non-profit organization known as the Internet Corporation for Assigned Names and Numbers (ICANN). As an American institution, ICANN has been criticized for practices that are biased toward the United States. Some organizations, including the United Nations, have expressed concern over a single country being in control of the entire mechanism by which domain names are assigned to users.

Internet governance also touches on issues around privacy and security. For example, what format or specifications should be followed in order to provide users with a widely used, strongly encrypted method for online transactions? Currently, the most widely used method for encrypting data sent over the Internet is a specification known as Transport Layer Security (TLS). TLS is developed within the IETF and published as a series of RFCs; as of 2010, these specifications were on the IETF standards track but had not yet advanced to full Internet Standard status.
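
Because TLS is an openly specified standard, independent software on either end of a connection can interoperate without private agreements. The sketch below uses the `ssl` and `socket` modules from Python's standard library to show how little a client needs to do to take part. The function name `describe_tls_connection` and its defaults are assumptions made for this illustration, and the function is defined but not called here, since calling it requires network access.

```python
import socket
import ssl

# A client-side context with the library's recommended defaults:
# the server's certificate is verified and its host name is checked.
context = ssl.create_default_context()


def describe_tls_connection(host, port=443, timeout=5.0):
    """Connect to host and report the negotiated TLS version and cipher.

    Hypothetical helper for illustration; it needs network access to run.
    """
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


# These settings can be inspected without opening any connection.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

The protocol version and cipher are negotiated between client and server at connection time, so neither party needs advance knowledge of the other's software. That negotiation is exactly the kind of behavior a shared standard has to pin down.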

The development of the Internet is rife with examples of conflicting specifications, standards battles, and tension created by the conflicting needs of an enormous variety of stakeholders. As the Internet continues to evolve and grow, new battles over standards and new governance challenges will emerge. However, the continued functioning of the Internet is a direct result of a governance model that reflects the guiding principles of industry freedom, consumer need, and data interoperability.

References

BISAC Subject Headings List, Major Subjects - 2009 Edition. (2009). BISAC Subject Headings. Retrieved May 28, 2010, from http://www.bisg.org/what-we-do-0-136-bisac-subject-headings-list-major-subjects---2009-edition.php

David, P. (1985). Clio and the economics of QWERTY. American Economic Review, 332.

Dawes, S. S. (2009). Governance in the digital age: A research and action framework for an uncertain future. Government Information Quarterly, 26(2), 257-264.

FIPS 161-2 - (EDI), Electronic Data Interchange. (n.d.). Retrieved May 29, 2010, from http://www.itl.nist.gov/fipspubs/fip161-2.htm

Glushko, R. J., & McGrath, T. (2008). Document engineering: Analyzing and designing documents for business informatics and Web services. MIT Press.

Google Rekindles Browser War - WSJ.com. (n.d.). Retrieved August 1, 2010, from http://online.wsj.com/article/SB10001424052748704178004575351290753354382.html

Grafton, A. (2009, September 18). The Book Bench: Google Books and the Judge: The New Yorker. Retrieved May 27, 2010, from http://www.newyorker.com/online/blogs/books/2009/09/google-books-and-the-judge.html

Insurance News - Outflanked! (n.d.). Retrieved May 22, 2010, from http://insurancenewsnet.com/article.aspx?id=165374&type=newswires

Khare, R., & Çelik, T. (2006). Microformats: A pragmatic path to the semantic web. In Proceedings of the 15th international conference on World Wide Web (p. 866). ACM.

Khatri, V., & Brown, C. V. (2010). Designing data governance. Communications of the ACM, 53(1), 148-152.

Kleinwachter, W. (2002). From self-governance to public-private partnership: The changing role of governments in the management of the Internet's core resources. Loy. L.A. L. Rev., 36, 1103.

Language Log » Google Books: A Metadata Train Wreck. (n.d.). Retrieved May 22, 2010, from http://languagelog.ldc.upenn.edu/nll/?p=1701

McCarthy, C. (n.d.). Amazon: Kindle titles outpacing hardcovers | Digital Media - CNET News. Retrieved August 1, 2010, from http://news.cnet.com/8301-1023_3-20010975-93.html

Microformats vs. RDF: How Microformats Relate to the Semantic Web - Blog - Semantic Focus - The Semantic Web, Semantic Web technology and computational semantics. (n.d.). Retrieved June 15, 2010, from http://www.semanticfocus.com/blog/entry/title/microformats-vs-rdf-how-microformats-relate-to-the-semantic-web/

Microsoft: Why the ODF vs. OOXML battle matters | ZDNet. (n.d.). Retrieved May 27, 2010, from http://www.zdnet.com/blog/microsoft/microsoft-why-the-odf-vs-ooxml-battle-matters/290

Nunberg, G. (2009). Google's Book Search: A Disaster for Scholars. Chronicle of Higher Education. Retrieved from http://chronicle.com/article/Googles-Book-Search-A/48245/

Nunberg, G. (n.d.). Language Log » Google Books: A Metadata Train Wreck. Retrieved May 27, 2010, from http://languagelog.ldc.upenn.edu/nll/?p=1701

Shah, R. C., & Kesan, J. P. (2007). Open standards and the role of politics. In Proceedings of the 8th annual international conference on Digital government research: Bridging disciplines & domains (p. 12). Digital Government Society of North America.

Shapiro, C., & Varian, H. R. (2003). The art of standards wars. Managing in the modular age: Architectures, networks, and organizations, 247.

Shifting the Right of Way to the Left Leaves Some Samoans Feeling Wronged - WSJ.com. (n.d.). Retrieved August 1, 2010, from http://online.wsj.com/article/SB125086852452149513.html

Weir, R. (n.d.). ODF at 5 Years. Retrieved May 28, 2010, from http://www.robweir.com/blog/2010/05/odf-5-years.html

Why Data Quality Matters - Forbes.com. (n.d.). Retrieved May 22, 2010, from http://www.forbes.com/2009/08/31/software-engineers-enterprise-technology-cio-network-data.html