A Field Study of Subject Gateways on “Zeitgeschichte”.

Applied Historical Information Science

Diploma thesis (Diplomarbeit)

submitted for the academic degree

of Magister der Philosophie at the

Faculty of Philosophy and History (Philosophisch-Historische Fakultät) of the

Leopold-Franzens-Universität Innsbruck

Submitted to the Institut für Zeitgeschichte

under

o. Univ.-Prof. Dr. Rolf Steininger

by Michael Kröll

Innsbruck, March 2006


Screenshot of the project homepage, March 5th 2006 http://www.pepl.info/papers/fieldstudy_sg_zeitgeschichte/


Table of contents

I Introduction
  1 What is a “subject gateway”?
  2 Brief history of the development of subject gateways for the field of Contemporary History
II Basic information about the three web-based subject gateways
III Generic technical web-page evaluation methods
  1 Syntactic standards
  2 Metadata standards
  3 Accessibility guidelines
  4 Usability evaluation
IV Setting up a framework for specific analyses
V Using the custom analysis framework
  1 Resource identifier validation
  2 Subject classification analysis
    2.1 Current subject indexing standards
    2.2 Semantic Web to the rescue?
    2.3 Subject classification of the three subject gateways
    2.4 Subject focuses as compared to the “offline world” of Contemporary History research and teaching
  3 Common resource identifiers analysis
  4 Duplicate pointers analysis
  5 Information network analysis
VI Interpreting the results of the analysis: Two theses on Contemporary History in the German language area
  1 Is Contemporary History the history of national socialism? Discussing the meaning and purpose of “Zeitgeschichte”
  2 Subject gateways are needed as hubs in the Contemporary History Network
VII Conclusions
VIII Bibliography
IX Appendix
  1 Web usability checklists
    1.1 Best practices for web interfaces of searchable databases
    1.2 Selected Nielsen web design mistakes
  2 Top 50 keywords in the aggregated ZIS-, VLZ-, ZOL-Link database
  3 Top 50 keywords related to Contemporary History - Innsbruck University Library OPAC database
  4 URLs common to all three subject gateways ZIS/ZOL/VLZ
  5 System setup and availability
  6 Database design
  7 Overview of crawler and import programs
  8 Overview of the analysis programs


I Introduction

In the German language area1, three major web-based subject gateways focusing on Contemporary History2 have been built up during the last ten years. These projects, essentially working in parallel, each claim to be a main reference of their kind. Nevertheless, they differ substantially with regard to certain project characteristics, such as their dates of establishment or available resources. This paper provides a comparative analysis of www.zeitgeschichte-online.de, www.vl-zeitgeschichte.de and zis.uibk.ac.at as three major examples of German-language web-based subject gateways on Contemporary History, embedded in the context of applied Historical Information Science.

The principal historical methods and quality standards established during the last one

and a half centuries by the science of history can be adapted to the online world of historical

content. However, new criteria for online historical content still have to be established for

“secondary aspects”3 such as the technical form and structure of content, metadata and linking. Only if such standards are well established within the online history community can the online historical information space be covered efficiently, which could in turn enable participation in the vision of the Semantic Web4 in the longer term.5

Tim Berners-Lee, original “inventor” of the current World Wide Web, envisioned the

Semantic Web as “an extension of the current web in which information is given well-defined

meaning, better enabling computers and people to work in cooperation.”6 This new generation

World Wide Web should overcome the limits of current search engines and provide a

completely new quality of knowledge development. Although Berners-Lee’s vision may

sound utopian in part, the technological foundation for the Semantic Web to function has

1 For most parts of this study, the reference to the “German language area” covers Germany and Austria.

2 If mentioned without a specific context, “Contemporary History” is used synonymously with “Zeitgeschichte” in this paper, bearing in mind that “Contemporary History” has a number of different meanings in diverse national and language-specific settings. An exemplary overview of the different “Contemporary Histories” in a number of European countries is provided by Gehler (2002) "Zeitgeschichte zwischen Europäisierung und Globalisierung" 25-32.

3 Enderle (2001a) "Der Historiker, die Spreu und der Weizen, zur Qualität und Evaluierung geschichtswissenschaftlicher Internet-Ressourcen" 62.

4 A general overview of the Semantic Web is provided by Miller and Swick (2003) "An Overview of W3C Semantic Web Activity".

5 Cf. Enderle (2001b) "Geschichtswissenschaft, Fachinformation und das Internet" 7.

6 Berners-Lee, et al. (2001) "The Semantic Web. A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities".


already been laid in recent years, and considerable effort is still being put into its further development.7

Before standards for the technical form and structure of content, metadata and linking of online historical content, which are among other things the basis for participation in the Semantic Web8, can be put to widespread use, it is necessary to establish the current status. For web-pages on Contemporary History in the German language area, systematic evaluations already exist9. As a complement to these, this paper introduces aspects of a methodological canon for evaluating subject gateways. Pursuing a higher-level empirical and technical approach, the author will discuss the methodological potential, prospects, and implications as an example of applied Historical Information Science. It shall be demonstrated that approaching historical content by analyzing its “secondary”, i.e. formal, aspects can prepare the ground for new insights about the content itself, and that such analysis can also instigate discussion of the content focus of a whole discipline.10

In the context of this paper, Historical Information Science shall be defined “as an

extension of older notions of scientific historicism combined with modern Information

Science, with the application of modern Social Science research methodology and state-of-

the-art information technology”11. This follows the definition of U.S. American historian and

7 Cf. the W3C Semantic Web homepage: http://www.w3.org/2001/sw/, January 18th 2006.

8 Cf. Berners-Lee, et al.: “For the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning”.

9 Cf. Wirtz (2005) "Marktanalyse. Deutschsprachige Online- und CD/DVD-Produktionen zum Thema Nationalsozialismus und Holocaust. Ein Projekt des Fritz Bauer Instituts im Auftrag der Bundeszentrale für politische Bildung" and Dornik (2003) "Zeitgeschichte und Internet".

10 Cf. the two theses on Contemporary History in the German language area in chapter VI.

11 McCrank (2002) "Historical Information Science. An emerging Unidiscipline" 593. He gives a more elaborate definition at 56f: “the scientific study of historical information and of information and communication technologies, and the techniques, methods, and intellectual frameworks by which we extract meaning from this sources. This includes the creation of sources and their use in original content, historical use, and current use in studying History. This broad, integrative and unifying super-discipline concerns records of all kinds but especially electronic sources and archives because of the application of modern information technology for their access and analysis; historical information access and retrieval and contemporary access to historical materials; meta-history and metadata in documentation; data-text-image analysis; forensics and computing applications; and information technologies applied in historical research, communication, and instruction”. Alternative definitions of Historical Information Science, “historische informatiekunde” and “Historische Fachinformatik und Dokumentation” respectively, are provided by Boonstra, et al. (2004) "Past, present and future of historical information science" 20 and Kropač (2004) "Was ist 'Historische Fachinformatik und Dokumentation'? Terminologisches, Inhalte, Aufgaben". In the German mainstream of historical research, “Historische Fachinformatik” is seen as a historical ancillary science; cf. Vogeler, et al. (2005) "Historische Hilfswissenschaften".


librarian Lawrence McCrank. As a study within his concept of applied Historical Information Science, this paper pursues focuses and objectives12 such as

• “exploration of methodologies”,

• “testing potential applications”, or

• “experimentation to develop processes and products for historical research which may be broadly applicable or customized for problems in specific domains”.

The database, including the analysis data, and the software programs developed in the course of this study are available on the study’s project homepage at http://www.pepl.info/papers/fieldstudy_sg_zeitgeschichte/.13

After defining the term “subject gateway”, a brief history of the development of

subject gateways for the field of Contemporary History will be given. Following basic

information about the three web-based subject gateways at issue, generic technical web-page

evaluation methods will be discussed. Subsequently, a framework for specific analyses of

web-pages and subject gateways will be introduced and applied to the three subject gateways

along several analysis vectors. Aspects of this specific analysis framework and the analysis

vectors have already been discussed in the author’s contribution to the XVI international

conference of the Association for History and Computing in Amsterdam 2005.14 Two theses on Contemporary History in the German language area will then be derived from the analysis. Finally, the findings of the study will be summed up in the “Conclusions” chapter.

1 What is a “subject gateway”?

A subject gateway is an Internet Service. Although sometimes used synonymously

with “Internet Portal” or “Virtual Library”15, the term “subject gateway” has a distinctive

meaning. In the course of this study, “subject gateway” and “quality-controlled subject

gateway” will be used as defined by “Digital Library Scientist” Traugott Koch:

Subject gateways are thereby defined as

12 Cf. McCrank "Historical Information Science. An emerging Unidiscipline" 594.

13 This meets another Historical Information Science focus: “The deposit of data, files, programming, shareware, etc. in appropriate archives and research centers so that one contributes to the cumulative resource base available to historians everywhere.” Ibid.

14 Kröll (2005) "Not ready for the Semantic Web: A field study of subject gateways on Contemporary History".

15 Complementary typologies of subject gateway-related Internet services are provided by Campbell, et al. (2003) "Definitions for Web-Based Services" and Nentwich (2003) "Cyberscience. Research in the Age of the Internet" 78-81.


Internet services which support systematic resource discovery. They provide links to resources (documents, objects, sites or services), predominantly accessible via the Internet. The service is based on resource description. Browsing access to the resources via a subject structure is an important feature.16

Quality-controlled subject gateways are thereby defined as

Internet services which apply a rich set of quality measures to support systematic resource discovery. Considerable manual effort is used to secure a selection of resources which meet quality criteria and to display a rich description of these resources with standards-based metadata. Regular checking and updating ensure good collection management. A main goal is to provide a high quality of subject access through indexing resources using controlled vocabularies and by offering a deep classification structure for advanced searching and browsing.

In the following, the three subject gateways at issue will still be referred to as “subject gateways” and not “quality-controlled subject gateways” because none of the three fulfills all seven criteria17 laid down for the latter by Koch.

2 Specifics of subject gateways for the field of Contemporary History in the German language area

Subject gateways for the field of History have been developed as a manifestation of the institutionalization of systematic resource discovery for online historical information. Before that, individual historians maintained more or less extensive annotated link lists, some of which still exist today and continue to be of value to fellow historians.

In the Anglo-American language area, the development of subject gateways on History did not include a single subject gateway specifically focused on Contemporary History. To date, not a single one is listed in the main catalogues for subject gateways18 and online resources on History. In those catalogues, only German-language subject gateways on “Zeitgeschichte” are present. An answer to the question of why subject gateways on Contemporary History are a domain of the German language area can be found in the organizational structure of the respective academic research domains: departments for Contemporary History can only rarely be found outside the German language area. It can be

16 Koch (2000) "Quality-controlled subject gateways: definitions, typologies, empirical overview" 24f.

17 Ibid. 25f.

18 Cf. http://vlib.org/History, January 20th 2006 and http://www.history.ac.uk/ihr/Resources/Type/gateway.html, January 20th 2006. Homepages of institutions or projects and web-pages lacking an explicit author will subsequently be referred to by URL and last access date.


assumed that, without such an institutional background, it is difficult to establish and maintain subject gateways. Rather than subject gateways covering the broad area of Contemporary History, the Anglo-American language area offers web-pages with a more specific thematic focus, such as the Holocaust, the Northern Ireland conflict, the Cold War, or the Vietnam War19.

Following the development of subject gateways in the German language area, a trend can be observed: besides fulfilling their original purpose of providing resource descriptions, they increasingly offer additional services, such as content in the form of articles or primary sources and specialized communication platforms in the form of discussion fora and mailing lists.

19 For example: “Holocaust Cybrary remembering the Survivors” (http://www.remember.org/, February 5th 2006), “CAIN: Northern Ireland Conflict, Politics, & Society. Information on 'the troubles'” (http://cain.ulst.ac.uk/, February 5th 2006), “Cold War” (http://www.cnn.com/SPECIALS/cold.war/, February 5th 2006), “Vietnam War Internet Project” (http://www.vwip.org/, February 5th 2006).


II Basic information about the three web-based subject gateways

The “Zeitgeschichte Information System” (ZIS), online since early 1995, is the

longest-running web-based subject gateway on Contemporary History among the three

projects examined. Maintained by the Institute for Contemporary History at the Leopold-

Franzens-University of Innsbruck, its main features include an annotated link database

comprising about 800 entries, primary sources of 20th century Austrian history, a

documentation of the history of South Tyrol and a documentation on “Austria & Israel since

1945”. The most recent review of ZIS was published by Martin Gasteiner and Christian Pape in 200520.

The “Virtual Library Zeitgeschichte” (VLZ), part of the WWW Virtual Library21, resulted from the merger of the Virtual Library sections “Third Reich/World War II” and “20th Century” in 2003. The VLZ is managed on a voluntary basis by a team of historians, Ralf Blank and Stephanie Marra. Its main feature is a link database of about 700 entries. In November 2005 the VLZ was the target of a hacker attack, from which it is not expected to recover before the planned relaunch in March 200622. The most recent review of VLZ was published by Ingrid Böhler and Michael Gehler in 200423. The raw data used for the comparative analysis was gathered in April 2005.

The “Zeitgeschichte-Online” (ZOL) project is a joint endeavor of the “Zentrum für Zeithistorische Forschung” (ZZF), Potsdam, and the “Staatsbibliothek zu Berlin – Preußischer Kulturbesitz” (SBB), Berlin, funded by the “Deutsche Forschungsgemeinschaft”. The subject gateway went online in early 2004 and is run in close co-operation with probably the two most important subject gateways on History in the German language area, “Clio-Online”24 and “H-Soz-u-Kult”25. “Zeitgeschichte-Online” features a database of institutions related to and persons working in the field of Contemporary History, a sub-branch of the H-Net list H-Soz-u-Kult called “H-Soz-u-Kult/Zeitgeschichte”, pertinent subject foci, subject-related online discussion fora, and a link database including

20 Gasteiner and Pape (2005) "Clio-online Guide Österreich".

21 http://vlib.org/, January 20th 2006.

22 According to an e-mail from Ralf Blank to the author of February 20th 2006.

23 Böhler and Gehler (2004) "Wendungen nach innen? Selektive Blicke auf die Zeitgeschichte".

24 http://www.clio-online.de/, January 20th 2006.

25 Cf. Hohls (2004) "H-Soz-u-Kult: Kommunikation und Fachinformation für die Geschichtswissenschaften" and http://hsozkult.geschichte.hu-berlin.de/, January 20th 2006.


about 2,100 entries. The most recent review of ZOL was published by Dirk van Laak26 in 2004.

Judging from the infrastructural background of the co-operation partners, the “Zeitgeschichte-Online” project’s subject gateway should show by far the highest degree of professionalism of the three subject gateways at issue.

III Generic technical web-page evaluation methods

The Internet could not exist without technical standards. However, in the light of the

majority of web-pages currently available, one would be inclined to think that quite the

opposite is true. Given the lack of conformity with regard to technical standards, a

considerable lack of interoperability, accessibility, and usability can be discerned.

1 Syntactic standards

The World Wide Web Consortium (W3C)27, which is mainly responsible for creating web standards, provides validation services for syntactic web-page standards. Using these validators28 to test the HTML and CSS validity of the start pages of the three subject gateways has shown that all pages are invalid, with error counts ranging from 6 to 410. Despite being syntactically invalid, the pages remain accessible in most browsers. It has to be concluded that the creators of the HTML and CSS pages are either unaware of or unconcerned about standards conformance.29

2 Metadata standards

Only if a web-page is syntactically formalized, i.e. marked up in valid (X)HTML, can value-added processing by software tools be undertaken. Adding a formal and explicit meaning to content by using metadata is one of the cornerstones of a future Semantic Web. Implementing Dublin Core30 as the de facto formal metadata standard for one’s web pages would be a first step towards that goal. Only one of the three subject gateways at issue, the “Zeitgeschichte Informations System”, partly31 uses Dublin Core in its HTML pages.
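
How the use of Dublin Core in a page could be detected automatically may be illustrated by a minimal sketch. It assumes the common convention of embedding Dublin Core elements in HTML as "meta" tags whose "name" attribute begins with "DC."; the class name, the function name and the sample page are hypothetical and are not part of the software developed for this study.

```python
from html.parser import HTMLParser


class DublinCoreFinder(HTMLParser):
    """Collect <meta name="DC.*" content="..."> elements from an HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        # Assumption: Dublin Core elements are marked by a "DC." name prefix.
        if name.lower().startswith("dc."):
            self.fields[name] = attr.get("content") or ""


def dublin_core_fields(html):
    """Return a dict of the Dublin Core meta fields found in the HTML text."""
    finder = DublinCoreFinder()
    finder.feed(html)
    return finder.fields


# Hypothetical sample page, not taken from any of the three gateways.
SAMPLE = """<html><head>
<title>Example</title>
<meta name="DC.title" content="Example link collection">
<meta name="DC.creator" content="Jane Doe">
<meta name="keywords" content="history">
</head><body></body></html>"""

print(dublin_core_fields(SAMPLE))
```

A page without any such fields yields an empty dictionary, which would correspond to the finding for the two gateways that do not use Dublin Core.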

26 Van Laak (2004) "Rez. WWW: Zeitgeschichte-online".

27 http://www.w3.org/, January 20th 2006.

28 The validator used for HTML has been http://validator.w3.org/, January 20th 2006, the one used for CSS validation has been http://jigsaw.w3.org/css-validator/, January 20th 2006.

29 A comprehensive discussion of the implications thereby created exceeds the scope of this paper.

30 http://www.dublincore.org/, January 20th 2006. A brief overview and introduction of its usage is provided by Hillman (2003) "Using Dublin Core".

31 Dublin Core is used on the entry pages only.


3 Accessibility guidelines

Another generic factor for the quality of web-pages is their conformance to

accessibility guidelines like the W3C’s WAI32 or the U.S. Government’s Section 50833. Again,

the use of validators34 to check conformance shows that none of the three subject-gateways

passes the tests. Unlike failed HTML and CSS validation, inaccessible pages have severe effects for people with disabilities; action to make web-pages accessible is therefore strongly called for.
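
One kind of check that accessibility validators automate can be sketched as follows. The example looks only for images without a text equivalent, which is a single criterion among many in the guidelines named above, so it is an illustration under that assumption and not a conformance test. All names and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Record the src of every <img> element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            # An empty alt="" is accepted here: it is the usual way to
            # mark purely decorative images.
            if "alt" not in attr:
                self.missing.append(attr.get("src") or "")


def images_without_alt(html):
    """Return the src values of all images lacking an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Hypothetical markup, not taken from any of the three gateways.
SAMPLE_PAGE = (
    '<img src="logo.gif">'
    '<img src="map.gif" alt="Map of Tyrol">'
    '<img src="spacer.gif" alt="">'
)

print(images_without_alt(SAMPLE_PAGE))
```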

4 Usability evaluation

Usability engineering for web-pages has grown out of the software development

discipline of Human Computer Interaction (HCI) and is faced with a number of web related

problems: The diversity of user configurations may cause a web page to be displayed or loaded completely differently for each individual user. Also, target audiences are difficult to define because of the global nature of the Internet. In addition, the rapidly changing nature of the Internet leads to short development cycles, making it difficult to incorporate the findings of usability studies.

This short list shows only some of the difficulties that any generalization of web-usability evaluation methods will typically face. Therefore, it is not surprising that none of the pertinent web standardization bodies has yet published generic web-page usability standards.

What has been published, however, is a number of web-usability guidelines35, checklists and criteria36. The criteria contained in these guidelines can be used when conducting the most prominent web-page evaluation method, called “heuristic evaluation”37, where “a small set of evaluators examines the interface and judges its

32 http://www.w3.org/WAI/, January 20th 2006.

33 Implementation of Section 508 is legally binding for U.S. federal agencies. More information can be found at http://www.section508.gov/, January 20th 2006.

34 Validator used for WAI/WCAG and Section 508: http://www.contentquality.com/, January 20th 2006.

35 Cf. Nielsen (1996) "Original Top Ten Mistakes in Web Design", Nielsen (1999b) "The Top Ten New Mistakes of Web Design", Nielsen (1999a) "'Top Ten Mistakes' Revisited Three Years Later", Nielsen (2002b) "Top Ten Web Design Mistakes of 2002", Nielsen (2002a) "Top Ten Guidelines for Homepage Usability", Nielsen (2003b) "Top Ten Web Design Mistakes of 2003", Nielsen (2003a) "The Ten Most Violated Homepage Design Guidelines", Nielsen (2004) "Top Ten Mistakes in Web Design", Nielsen (2005b) "Top Ten Web Design Mistakes of 2005", and Koyani, et al. (2003) "Research-Based Web Design & Usability Guidelines".

36 Cf. Hennig and Quirion (2004) "Best practices for web interfaces of searchable databases".

37 Nielsen (2005a) "Ten Usability Heuristics".


compliance with recognized usability principles”. If the resources for such test settings are not available, it is still possible to use the aforementioned guideline listings as checklists and either test conformance automatically with an experimental tool like WebSAT38 or uzReview39, or do the testing manually. The latter approach has been applied for this paper.

In this analysis, two different sets of web-page usability guidelines have been used to evaluate the web-usability of the three subject gateways. To take account of the specific context of the subject gateways, the first guideline set was taken from “Best practices for web interfaces of searchable databases”40 published by Nicole Hennig and Christine Quirion of MIT’s Web Advisory Group. For the second guideline set, generic web-usability guidelines have been assembled by selecting “Web Design Mistakes” from Jakob Nielsen’s “Alertbox” columns41. The full list of guidelines and detailed results can be found in the Appendix. A summary of results for the two sets of guidelines is shown in the following table:

Subject Gateway                      Passed Tests, Guideline Set 1         Weighted Mistakes42, Guideline Set 2
                                     (Hennig and Quirion 2004, IX 1.1)     (Nielsen, IX 1.2)

Zeitgeschichte Informationssystem    18 of 31                              21 of 119

Virtual Library Zeitgeschichte       20 of 31                              22 of 119

Zeitgeschichte-Online                22 of 31                              30 of 119
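
To compare the two columns on a common scale, the figures can be converted into percentages of the maximum attainable value. The following sketch does only that arithmetic; the figures are copied from the table above, while the abbreviations used as keys and the function name are introduced here for illustration.

```python
# Figures copied from the summary table above.
TOTAL_TESTS = 31    # number of checks in guideline set 1
WORST_SCORE = 119   # sum of all mistake weights in guideline set 2

RESULTS = {
    "ZIS": {"passed": 18, "mistakes": 21},
    "VLZ": {"passed": 20, "mistakes": 22},
    "ZOL": {"passed": 22, "mistakes": 30},
}


def percentages(results):
    """Return (share of passed tests, share of worst score) in percent."""
    return {
        name: (
            round(100 * r["passed"] / TOTAL_TESTS, 1),
            round(100 * r["mistakes"] / WORST_SCORE, 1),
        )
        for name, r in results.items()
    }


for name, (passed, mistakes) in percentages(RESULTS).items():
    print(f"{name}: {passed}% of tests passed, {mistakes}% of worst score")
```

On this scale the gateways pass between roughly 58% and 71% of the tests in the first set and reach between roughly 18% and 25% of the worst possible score in the second.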

These results merit further discussion: Because the web-usability guidelines are not standardized and therefore have no reference character, it is more important to examine how the three subject gateways fare relative to each other than to examine the total number of passed checks or total mistakes in general. That said, it can be noted that the statistical spread of the results of the three subject gateways is rather low: the three gateways have very similar scores. With regard to the searchable databases usability guideline set, the “Zeitgeschichte Online (ZOL)” project fares best. With regard to the generic web-usability guideline set, the “Zeitgeschichte Informationssystem (ZIS)” project’s web-page does. In both cases, the other two projects follow closely. Why does the ZIS web-page design score best in regard to the generic usability guidelines

38 Cf. NIST (2002) "WebSAT".

39 Cf. Edmonds, et al. (2003) "uzReview 0.7.1".

40 Hennig and Quirion "Best practices for web interfaces of searchable databases".

41 Cf. footnote 35.

42 The 36 mistakes have been weighted on a scale from 1 to 5 because a mistake like “Overly detailed ALT Text” has less impact than “No Contact Information or Other Company Info”. The worst possible score is therefore the sum of all 36 weights, which is 119.


even though it has not been updated for over two years? The answer lies in the question itself: the usability of web-pages does not have to depend on when a homepage was established or last updated. The ZIS web-pages have a very simple layout with few graphics and even fewer non-standard web-page elements. That alone leaves less room for mistakes than a more complex layout would. The ZIS project has a strong focus on its link database and does not provide the variety of content that ZOL does, for example. That makes it easier for ZIS to use a simple and focused layout for its web-pages. Consequently, the challenge of web-usability grows with the variety and complexity of content.

Overall, the three subject gateways score satisfactorily with regard to the searchable

databases usability guideline set and they score well with regard to the generic web-usability

guideline set. The web-pages provide a usable platform for the two prototypes of web-users, the “link-dominant” and the “search-dominant” ones43, and meet another important usability criterion in that they clearly provide context for the different web-pages: Where am I? What can I do here? Where can I go from here?44

43 The two terms have been coined by Nielsen (1997) "Search and You May Find". The same prototypes have been identified by Krug (2000) "Don't Make Me Think. A Common Sense Approach to Web Usability" 54 or Kyunghye (2002) "A Model-based Approach to Usability Evaluation for Digital Libraries" (“scanning” vs. “searching”) for example. Also interesting in that context are the results of a usability study by Mitchell, et al. (1999) "Testing the Design of a Library Information Gateway" where the majority of tested users turned out to be “search-dominant” and almost too spoiled by the “Google-Comfort”: “The finding that came out most forcefully was that students want a white box into which they can type their search terms. If students have to go beyond two screens to find such a box, they become frustrated and impatient”.

44 Cf. Krug "Don't Make Me Think. A Common Sense Approach to Web Usability" 87: “Trunk Test” and Theng, et al. (2000) "Purpose and usability of digital libraries" 239: “Feeling Lost”.


IV Setting up a framework for specific analyses

The generic technical web-page evaluation methods discussed so far can only provide rather

generic answers. More specific questions, e.g. in quantitative analysis, require more specifically

tailored software. In the course of the comparative analysis of the three aforementioned

subject gateways, a crawler program was developed for harvesting the content

of each subject gateway’s link database. The crawler has to use heuristics to map the crawled

data into a common database. There are two reasons for this: First, none of the subject

gateways offers a formalized public interface to its databases, such as

a custom Web Service. Therefore, the link databases have to be harvested by

parsing their HTML output. Second, it is necessary to map the harvested link metadata to a

common scheme. None of the three subject gateways states that it uses a common metadata

scheme, so a specific conceptual mapping to the Dublin Core-based aggregate database had

to be set up. Analyzing the data to be harvested showed that these metadata mappings could

not be static. In the case of “Zeitgeschichte-Online”, for example, the fields “Autor” (“author”),

“Herausgeber” (“editor”), and “Veröffentlicht durch” (“published by”) could not be mapped

1:1 to DC-Creator and/or DC-Publisher, the only two DC-fields available for matching in that

case. Depending on the presence of data in one of the three fields, a different semantic

meaning had to be applied: If data was present in the “published by” field, it was used for the

Dublin Core “publisher” field and Dublin Core “creator” was filled with the “author” field, or

– if the “author” field was empty – with the “editor” field. If no data was present in the

“published by” field but both the “author” and “editor” fields were filled, the content of the “author” field

was used for Dublin Core “creator” and “editor” for Dublin Core “publisher”.45 Using

standardized means for providing access to one’s metadata or archival information, e.g. by

implementing an interoperable OAI-PMH46 data provider interface, could avoid potential

errors due to such ambiguities.47
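The conditional mapping described above can be sketched as follows. This is a minimal illustration in Python, not the original Perl logic of the harvester. The function name `map_zol_to_dc` and the dictionary keys are hypothetical, and the handling of records that match neither described case (the final `else` branch) is an assumption, since the text does not cover it.

```python
from typing import Dict, Optional


def map_zol_to_dc(author: str = "", editor: str = "",
                  published_by: str = "") -> Dict[str, Optional[str]]:
    """Map the three ZOL fields to Dublin Core creator/publisher."""
    author, editor, published_by = (
        value.strip() for value in (author, editor, published_by)
    )
    creator: Optional[str] = None
    publisher: Optional[str] = None

    if published_by:
        # Case 1: "published by" is filled and becomes DC publisher;
        # DC creator is the author or, if that is empty, the editor.
        publisher = published_by
        creator = author or editor or None
    elif author and editor:
        # Case 2: no "published by", but author and editor are both filled:
        # author becomes DC creator, editor becomes DC publisher.
        creator = author
        publisher = editor
    else:
        # Assumption (not described in the text): use whichever single
        # field is present as DC creator and leave DC publisher empty.
        creator = author or editor or None

    return {"creator": creator, "publisher": publisher}
```

The same input field ("editor") can thus end up in two different Dublin Core elements depending on its neighbours, which is exactly the kind of ambiguity a standardized harvesting interface would avoid.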

45 The corresponding application logic can be found at the end of the process_content() function of harvester_zol.pl, http://pepl.info/viewcvs/trunk/harvester-zol.pl?view=log, February 22nd, 2006.

46 OAI-PMH stands for the Open Archives Initiative Protocol for Metadata Harvesting. See Caplan (2004) "OAI-PMH" for information about the protocol and Kelly (2004) "Interoperable Digital Library Programmes? We Must Have QA!" for general considerations on Digital Library interoperability.

47 Enderle "Der Historiker, die Spreu und der Weizen, zur Qualität und Evaluierung geschichtswissenschaftlicher Internet-Ressourcen" 60 also considers an OAI interface as one of the quality criteria for a web subject gateway.


The common database storing the harvester results has been implemented using the

PostgreSQL48 RDBMS. The crawlers have been implemented using Perl49 and Perl CPAN50

modules. From the link databases they retrieved the metadata assignable to the

Dublin Core Metadata Element Set, together with the HTTP status code and an MD5 checksum of each database

item’s content. In addition, each link database item’s homepage was crawled recursively to

a depth of three levels in order to store the out-links to other link database items.
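The depth-limited collection of out-links can be illustrated with a breadth-first traversal. The sketch below is in Python rather than the Perl of the original crawlers and makes several assumptions: `get_links` is a hypothetical callback that returns the hyperlinks found on a page (so the example runs without network access), every returned link is followed regardless of host, and "three levels" is read as "at most three clicks from the start page".

```python
from collections import deque
from typing import Callable, Iterable, Set


def collect_out_links(start_url: str,
                      known_items: Set[str],
                      get_links: Callable[[str], Iterable[str]],
                      max_depth: int = 3) -> Set[str]:
    """Return the known database items reachable within max_depth clicks."""
    seen = {start_url}
    found: Set[str] = set()
    queue = deque([(start_url, 0)])

    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            # Pages at the depth limit are recorded but not expanded further.
            continue
        for target in get_links(url):
            if target in known_items and target != start_url:
                found.add(target)
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return found
```

Passing the link extractor in as a parameter keeps the traversal logic separate from fetching and HTML parsing, which is where most of the site-specific heuristics would live.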

V Using the custom analysis framework

The 3,646 crawled, interlinked items, each holding a number of attributes, provide

ample room for analysis. In the following, a selection of options for analysis will be

discussed.

1 Resource identifier validation

HTTP status codes tell us about the availability of a resource. Status codes of 400 or

above denote an invalid resource, which could inter alia be the result of either “404 Not Found”

or “500 Internal Server Error”. The following table provides an overview of invalid items in the

aggregated link database, grouped by subject gateway, as of April 14th, 2005.

Subject Gateway                     Total Items   Invalid Items   Invalid Items %
Zeitgeschichte Informationssystem           822             178              21 %
Virtual Library Zeitgeschichte              693              66               9 %
Zeitgeschichte-Online                     2,131              82               3 %

The disproportionately high percentage of invalid items in the ZIS database could be explained by

the fact that the last update of that database was performed on August 13th, 2003.

Unfortunately, the last update timestamps of the other databases could not be ascertained from

their homepages.
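As a rough illustration of this validity check, the following Python sketch classifies stored status codes and computes the share of invalid items per gateway. The function names are hypothetical, and requests that failed without any HTTP status (for example DNS errors or timeouts) are not modelled here, because the text does not say how the crawler recorded them.

```python
from typing import Iterable, Tuple


def is_invalid(status_code: int) -> bool:
    """4xx (client error) and 5xx (server error) codes count as invalid."""
    return status_code >= 400


def invalid_share(status_codes: Iterable[int]) -> Tuple[int, int, float]:
    """Return (total items, invalid items, invalid percentage)."""
    codes = list(status_codes)
    invalid = sum(1 for code in codes if is_invalid(code))
    percent = 100.0 * invalid / len(codes) if codes else 0.0
    return len(codes), invalid, percent
```

Applied to the stored status codes of one gateway's items, `invalid_share` yields the three figures of one table row.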

48 http://www.postgresql.org/, January, 20th 2006.

49 http://www.perl.org/, January, 20th 2006.

50 http://cpan.perl.org/, January, 20th 2006.


2 Subject classification analysis

2.1 Current subject indexing standards

Subject indexing is one of the most challenging and time-consuming tasks of metadata

classification. It is also one of the tasks still most resistant to automation, as a recent report

on Automated Metadata Classification51 has shown.

The creation of all three subject gateways at issue coincided with a period of change

for the respective bibliographic standards for subject indexing in the German language area.

Even the mere existence of the subject area of “Bibliothekswissenschaft” (Library Science)

itself has been questioned in recent years.52 Since the mid-nineties of the last century, a

transition from the conventional standards for both formal and subject indexing to new and

international, or internationally oriented, ones has been under way. The advantages and drawbacks

of a transition from the German formal cataloging rules RFK (RAK)53 to their Anglo-American

counterpart AACR2[54] have been discussed intensively55. Regarding subject indexing, the

situation is similar: The main German language subject headings, SWD56, are difficult to map

to their counterparts in other languages because of their inherent rules57, and the recently

translated Dewey Decimal Classification System (DDC)58 lacks the required granularity for

the purposes of a web subject gateway. What classification system should a German language

web subject gateway use for subject indexing then? The de-facto standard for subject

51 Cf. Greenberg, et al. (2005) "Final Report for the AMeGA (Automatic Metadata Generation Applications) Project".

52 Cf. Hauke, et al. (2005) "Library Science - quo vadis? (Re)Discovering „Bibliothekswissenschaft“" and Gradmann (2005) "Hat Bibliothekswissenschaft eine Zukunft? Abweichlerische Gedanken zur Zukunft einer Disziplin mit erodierendem Gegenstand".

53 Eversberg (2005) "Wie katalogisiert man ein Buch? Ein Leitfaden nicht nur für Einsteiger" provides an introduction to cataloging according to the „Regeln für die Formalkatalogisierung (RFK)“, formerly „Regeln für die alphabetische Katalogisierung (RAK)“.

54 AACR2, the “Anglo-American Cataloguing Rules, Second Edition”, http://www.aacr2.org/, January, 20th 2006.

55 An overview of the discussions is provided by Arbeitsgemeinschaft der Parlamentsbibliotheken und Behördenbibliotheken (2003) "Stellungnahmen, Materialien und Informationen zu dem Beschluss des Standardisierungsausschusses bei der Deutschen Bibliothek, einen Umstieg von den deutschen auf internationale Regelwerke und Formate (AACR und MARC) anzustreben".

56 SWD stands for “Schlagwortnormdatei”; cf. Deutsche Bibliothek (2005) "Schlagwortnormdatei (SWD)".

57 Cf. Eversberg (2004) "Eine seltene Sache. Erwartung und Ernüchterung bei der thematischen Katalogsuche". An introduction to the subject indexing rules RSWK is provided by Umlauf (2005) "Einführung in die Regeln für den Schlagwortkatalog RSWK".

58 Heiner-Freiling and Svensson (2005) "Dewey-Dezimalklassifikation".


indexing in the Anglo-American language area, the Library of Congress Authorities59, would

very likely provide the required granularity. For use on the web, however, it has been

considered too complicated, and the attempts to simplify the Library of Congress Authorities

for web resources using the concept of “Faceted Application of Subject Terminology”60 have

only just started. Even if the results of those projects were mature, the problem of how to

ensure interoperability of the different language- and domain-specific subject headings would

remain unresolved. In the near future there will be no shared international authority file and

no standard for automatically mapping between different subject headings.61

2.2 Semantic Web to the rescue?

As we have seen, traditional Library Science cannot offer an out-of-the-box solution

for subject classification for web subject gateways. If a new subject gateway were to be set up

today, a custom subject authority file would still have to be compiled for subject indexing. To

ensure interoperability and long-term compatibility, such an authority file could be created using

Semantic Web technologies62. For subject indexing, that would require the creation and use of

a domain-specific Ontology63; roughly: the equivalent of a thesaurus.

Unfortunately, building such Ontologies is rather costly, difficult and technically

still experimental. To the author’s knowledge, only one History related Ontology64 was

publicly available at the end of 2005. To take stock of the research interests and current status of

Ontologies in the Humanities, a workshop will take place in April 2006 in Hamburg65. That

fact alone shows that the whole topic of Ontologies in the Humanities, let alone History, is in

a very early development phase and not ready to be widely used.

59 Library of Congress Authorities (LCA): http://authorities.loc.gov/, January, 20th 2006. Formerly the LCA were called Library of Congress Subject Headings (LCSH).

60 Cf. Dean (2003) "FAST: Development of Simplified Headings for Metadata". This project also investigates how to formalize the subject headings using the Dublin Core standard.

61 Cf. Tillet (2003) "Authority Control: State of the Art and New Perspectives".

62 Cf. Enderle "Geschichtswissenschaft, Fachinformation und das Internet" 7: “Es sei daher von der These ausgegangen, daß künftige fachwissenschaftliche Erschließungsformen sich die Idee des Semantic Web zu eigen machen sollten.”

63 Boonstra, et al. "Past, present and future of historical information science" 102.

64 This Ontology has been developed by the VICODI Project (http://www.vicodi.org/, January, 20th 2006): “The objective of VICODI is to enhance human comprehension of the digital content on the Internet. This is reached by introducing novel visualisation and contextualisation environment for digital content.”

65 “Ontology Based Modelling in the Humanities”, 7-9 April 2006, University of Hamburg (http://www.c-phil.uni-hamburg.de/view/Main/OntologyWorkshop, January, 20th 2006).


2.3 Subject classification of the three subject gateways

From the last two chapters we know that during the creation of the three web subject

gateways at issue, no evident choice of specific subject indexing standards was available.

Even today it would be necessary to create a specific subject authority file as none exists for

the field of Contemporary History yet. In addition, the formal representation of the authority

file, i.e. in which way the authority terms and the file itself would be technically expressed,

would not be obvious, as standards are still at a development stage. Consequently, an

application that relied on a particular formal representation, e.g. an XML file following a

specific schema, would have no guarantee that the standard finally adopted would not require

something completely different, which even in a best-case scenario would mean additional

conversion effort.

Regardless of those difficulties, applications should at least be designed to

avoid free-form data entry, as seems to be the case with “Zeitgeschichte-Online”, where the

concept of “Arbeiterbewegung” (labor movement) can be found in the four different

keywords “Arbeiterbewegeung”, “Arbeiterbewegung”, “Arbeiterbewegungen”,

“Arbeiterbewgung”. Using duplicate-detection algorithms66, similar spelling errors or redundant

classifications can also be found in the other two databases.
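One of the duplicate-detection techniques used in this study, the Levenshtein edit distance, can be sketched as follows. The Python code is an illustration rather than the study's own implementation; the function names and the threshold of two edits are assumptions, and word stemming and Soundex are left out.

```python
from typing import Iterable, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def near_duplicates(keywords: Iterable[str],
                    max_distance: int = 2) -> List[Tuple[str, str, int]]:
    """Return keyword pairs whose edit distance is at most max_distance."""
    unique = sorted(set(keywords))
    pairs = []
    for i, first in enumerate(unique):
        for second in unique[i + 1:]:
            distance = levenshtein(first.lower(), second.lower())
            if distance <= max_distance:
                pairs.append((first, second, distance))
    return pairs
```

For the four spellings quoted above, “Arbeiterbewegung” lies at distance 1 from “Arbeiterbewegeung” and “Arbeiterbewgung” and at distance 2 from “Arbeiterbewegungen”, so a threshold of two edits links all three variants to the correct form. Such candidate pairs would still need manual review, because short but unrelated keywords can also be only a few edits apart.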

The distribution of keyword usage in the aggregated database can be interpreted as an

indication of the most popular research topics in web-present Contemporary History in the

German language area:

Top 4 Keywords (Number of Occurrences)67
425   Nationalsozialismus
288   Holocaust
243   Sozialgeschichte
203   Widerstand

Looking at the keyword distribution broken down by subject gateway, it can be

seen that the most used keywords of all three gateways relate to the same subject area. In addition to

this well-defined subject focus, to be discussed in the following chapter, an outlier in

the keyword distribution can be noticed:

66 For this study, the word stem, Soundex, and Levenshtein Edit Distance of keywords have been used to identify duplicates. More information on that topic, addressing the Merge/Purge problem, is provided by Hernández and Stolfo (1998) "Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem".

67 A list of the 50 most used keywords is available at the corresponding table in chapter IX2.


Subject Gateway                     Total items   Distinct keywords   KWs with only one occurrence   % of distinct keywords   Most used keyword
Zeitgeschichte Informationssystem           822                 166                             26                   15.66%   Holocaust
Virtual Library Zeitgeschichte              693                 126                             19                   15.08%   „Drittes Reich“
Zeitgeschichte-Online                     2,131               1,146                            502                   43.80%   Nationalsozialismus

Approximately 44% of the keywords used by “Zeitgeschichte-Online” are used only

once. This can be interpreted either as an indication of thematically wide-ranging content in

the link database, or – for the worse – of a lack of stringent rules for subject classification. A

third possible interpretation of that outlier is a rather practical one: “Zeitgeschichte-Online”

aggregates link database entries from partner institutions, a factor that

potentially increases the diversity of content and classification. Whatever the

explanation for the fact that almost half of the keywords used by “Zeitgeschichte-Online”

are used only once, one has to ask what purpose keywords that are

used only once fulfill.
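The per-gateway figures in the table above follow from simple counting, as the Python sketch below illustrates. The function name `keyword_profile` and the input format (one list entry per keyword assignment) are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Union


def keyword_profile(assignments: Iterable[str]) -> Dict[str, Union[int, float, Optional[str]]]:
    """Summarize keyword usage for one gateway."""
    counts = Counter(assignments)
    distinct = len(counts)
    singletons = sum(1 for occurrences in counts.values() if occurrences == 1)
    share = 100.0 * singletons / distinct if distinct else 0.0
    most_used = counts.most_common(1)[0][0] if counts else None
    return {
        "distinct": distinct,
        "singletons": singletons,
        "singleton_share_percent": round(share, 2),
        "most_used": most_used,
    }
```

For “Zeitgeschichte-Online”, the same calculation on the reported counts gives 502 / 1,146 ≈ 43.80 %.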

2.4 Subject focuses as compared to the “offline world” of Contemporary History research and teaching.

The last chapter showed that “National Socialism” and “Holocaust” are the top

subjects in the aggregated database and with that, also in the web-present Contemporary

History in the German language area. To what extent can those subject focuses be compared

to the “offline world” of Contemporary History? To answer that question it will be necessary

to analyze the subject focuses of printed publications in the field of Contemporary History.

The most comprehensive overview of print publications on History in the German

language area is provided by the annually published Historische Bibliographie68.

Unfortunately, using the query interface of the online version, it is not possible to get an

overview of the most common subjects used for classification. E-Mail correspondence with

68 The “Historische Bibliographie” is edited by the Arbeitsgemeinschaft außeruniversitärer

historischer Forschungseinrichtungen in der Bundesrepublik Deutschland. Its homepage including a test-access is available at: http://www.ahf-muenchen.de/HistBib/, January, 20th 2006.


the publisher of the Historische Bibliographie69 did not yield the

desired keyword statistics either.

Alternative providers of bibliographic information, with an international focus and

only partial coverage of the publications on History in the German language area, are the Arts

& Humanities Citation Index70 and the Social Sciences Citation Index71. As with the

Historische Bibliographie, however, the query interfaces of both bibliometric tools could not

provide the answers sought72, and e-mail correspondence with the publisher73 remained

unanswered.

Since the trans-regional and trans-national sources of bibliographic information on

publications in the field of Contemporary History could not be used to answer the question

about subject focuses of printed publications, we have to resort to a bibliographic source with

a potentially regional as well as a specific language focus: the OPAC database of a

University Library74. In our case, it was possible to retrieve an excerpt of the history related

subject classifications of publications indexed in the OPAC database of the University Library

of the Leopold-Franzens-University Innsbruck75. This University Library is not known to

have any specific History related subject focus, so it can be assumed that the composition of

indexed publications is sufficiently representative for the German language area.

After importing the roughly 150,000 records into the aggregate database, it was

possible to sort the roughly 62,000 distinct keywords by occurrence. In a final step,

69 E-Mail to [email protected], June, 26th 2005 with subject “Anfrage Themenstatistik”, answer from Helmut Zedelmaier <[email protected]>, June, 28th 2005.

70 http://scientific.thomson.com/products/ahci/, December, 22nd 2005.

71 http://scientific.thomson.com/products/ssci/, December, 22nd 2005.

72 Thanks to Eveline Pipp and Heinz Hauffe of the University Library of the Leopold-Franzens-University Innsbruck for their help searching the A&HCI and SSCI using the DIALOG interface.

73 E-Mail to Philip Heller <[email protected]> and George Herzhoff <[email protected]>, June 23rd 2005 with subject “Request for Keyword Statistics”.

74 Potential alternatives to analyzing the contents of an OPAC database would have been the analysis of subject focuses of thesis papers or journal articles, both of which would have required a separate study of their own. Mattl (1983) "Bestandsaufnahme zeitgeschichtlicher Forschung in Österreich" 27-53, provides a brief analysis of subject focuses of roughly 280 thesis papers on Contemporary History published in Austria between 1975 and 1981. Because of its age and very strong focus on Austrian History it has not been used here. Mattl (2003) "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer 'dritten Generation'" 365, footnote 28, provides a very brief “provisional” analysis of the roughly 260 articles published in the journal zeitgeschichte until November 2003. Because of its tentative character and focus on a single journal it has only been put to “footnote use” for the comparison of the “offline” and “online world” at issue here.

75 Thanks to Georg Stern-Erlebach of the University Library of the Leopold-Franzens-University Innsbruck for providing this list.


manually filtering the 300[76] most used keywords by their relation to Contemporary History

yielded the desired list of top keywords, comparable to the list of most used keywords in the

link databases of the three subject gateways:

Top 4 Keywords related to Contemporary History, Innsbruck University Library OPAC Database (Number of Occurrences)77
1,335   Juden
1,157   Geschichte 1933-1945
1,075   Drittes Reich
  877   Nationalsozialismus

The most used keywords in the aggregated database, as presented in chapter 2.3, were

“Nationalsozialismus”, “Holocaust”, “Sozialgeschichte”, and “Widerstand”. The keyword

“Holocaust” does not exist in the OPAC database; equivalent mappings

could be “Juden” (1,335), “Judenverfolgung” (293), and “Judenvernichtung” (271). All

occurrences of “Sozialgeschichte” with a chronological restriction to the 20th century can be

seen as equivalents of the keyword “Sozialgeschichte” in the aggregated database. Those

entries, e.g. “Sozialgeschichte 1945-1950”, have been used 97 times in the OPAC database,

which puts them well outside the 50 most used Contemporary History related keywords. The

4th most used keyword in the aggregated database, “Widerstand”, takes 11th place in the

OPAC top-keywords list, having been used 367 times.

Although the keywords in the two databases are not literally identical, the focus on the

events in Europe between 1933 and 1945 remains the same. Thus, the most frequently treated

research topics of Contemporary History in the German language area do not substantially

differ in the “offline-” and the “online world”.78

3 Common resource identifiers analysis

When comparing three link databases with a common content focus, the question

about commonly shared URLs seems rather obvious. In the case of the present study, the

answers to that question turned out to be a surprise: From the total of 3,370 distinct

76 From the 62,000 distinct keywords related to History, the 300 most used keywords have been selected, assuming that the most used keywords related to Contemporary History would be included in that list of the top 300.

77 A list of the 50 most used keywords related to Contemporary History is available at the corresponding table in chapter IX2.

78 A similar, although “provisional”, finding is provided by Mattl "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer 'dritten Generation'" 365, footnote 28, where he shows that “Nationalsozialismus” and “History of Judaism” have been the most prominent subject focuses in the journal zeitgeschichte up to November 2003. Also see footnote 74 in this paper.


normalized URLs, only 195 (5.79 %) are common to at least two subject gateways and only

24[79] (sic!) (0.71 %) are common to all three. The last number in particular makes a statistician

doubt his methods and findings. However, the result stayed the same after double-checking.

There are some possible interpretations for that very low number of shared URLs.

However, each interpretation only provides a partial explanation. The fact that no entry of the

ZIS database is newer than August 2003 could be one factor, another that the “Virtual Library

Zeitgeschichte” has a relatively specific subject focus. The disproportionately small number

of shared URLs calls into question the comparability and the authority of the three subject gateways,

irrespective of the reasons for it and irrespective of the gateways’ – at least original –

claim to authority.

Shared URLs by subject gateway

                                    Virtual Library Zeitgeschichte   Zeitgeschichte-Online
Zeitgeschichte Informationssystem                               50                      77
Virtual Library Zeitgeschichte                                 n/a                     116
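The overlap figures can be cross-checked by inclusion and exclusion: the three pairwise counts add up to 50 + 77 + 116 = 243, and since each URL shared by all three gateways is counted in every pair, 243 − 2 × 24 = 195 URLs are common to at least two gateways, which matches the number given above. A minimal Python sketch of such an overlap analysis follows. The normalization rules in `normalize_url` (lower-casing scheme and host, dropping default ports and fragments, treating an empty path as “/”) are assumptions, as the study does not spell out its own rules.

```python
from itertools import combinations
from typing import Dict, Set, Tuple
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Hypothetical normalization; the study's actual rules are not given."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"
    return urlunsplit((scheme, host, parts.path or "/", parts.query, ""))


def overlap_report(gateways: Dict[str, Set[str]]):
    """Return pairwise overlap counts, URLs in >= 2 gateways, URLs in all."""
    normalized = {name: {normalize_url(u) for u in urls}
                  for name, urls in gateways.items()}
    pairwise: Dict[Tuple[str, str], int] = {
        (a, b): len(normalized[a] & normalized[b])
        for a, b in combinations(sorted(normalized), 2)
    }
    all_urls = set().union(*normalized.values())
    in_two_or_more = {u for u in all_urls
                      if sum(u in urls for urls in normalized.values()) >= 2}
    in_all = {u for u in all_urls
              if all(u in urls for urls in normalized.values())}
    return pairwise, in_two_or_more, in_all
```

How aggressively URLs are normalized directly affects the overlap counts, so the rules chosen should be documented alongside the results.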

Having only 24 shared link database entries facilitates a classification analysis.

Given the general classification analysis undertaken above, the results are not

unexpected: Not a single keyword is used by all three subject gateways for one of the 24

items. For 16 of the 24 items, at least one identical keyword has been used by two of the three

subject gateways. In the remaining 8 cases, all the keywords used per database item were

different. As an example, the classification of the resource http://chronik-der-mauer.de/ is

shown in the following table:

Zeitgeschichte Informationssystem   Berliner Mauer, Bibliographie, DDR, Kalter Krieg
Virtual Library Zeitgeschichte      Mauerbau und Grenzbefestigung, Nachkriegszeit
Zeitgeschichte-Online               Berlin, Berlinpolitik, Grenzen, Mauerbau, Zeitzeuge

It can be noticed that some of the keywords used, e.g. “Berlin” and “Grenzen”

(borders), are formulated very broadly and are therefore very unspecific.

4 Duplicate pointers analysis

The web offers several possibilities to address the same resources under different

URLs. For example, host aliases or directory index files allow http://www.h-net.org/~german/

and http://www2.h-net.msu.edu/~german/ or http://www.icbh.ac.uk/icbh/ and

http://www.ihrinfo.ac.uk/icbh/welcome.html to point to the same documents. Identical

content can be identified by using message digest algorithms. After harvesting the link

79 This list of URLs can be found at the table in chapter IX3.


databases of the three subject gateways at issue, the crawler stored MD5 checksums for each

document. By that means, it was possible to identify several documents stored under

different URLs. The ZIS database stores 9 distinct documents under 19 different URLs, the

ZOL database 11 under 26, and the VLZ database 4 distinct documents under 8 URLs. In

other words, the same document had been entered under two different URLs into the VLZ

database in four cases.
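The checksum-based identification of duplicate pointers can be sketched in a few lines of Python. The function names are hypothetical, and the example assumes that the raw page content has already been fetched and is available as bytes per URL.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def md5_of(content: bytes) -> str:
    """Hex-encoded MD5 digest of a document's raw content."""
    return hashlib.md5(content).hexdigest()


def find_duplicate_pointers(documents: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group URLs by content digest; keep digests shared by several URLs."""
    by_digest: Dict[str, List[str]] = defaultdict(list)
    for url, content in documents.items():
        by_digest[md5_of(content)].append(url)
    return {digest: sorted(urls)
            for digest, urls in by_digest.items() if len(urls) > 1}
```

A checksum only matches byte-identical content. Pages that embed a timestamp, a visit counter or a host-specific link would therefore not be recognized as duplicates, so the counts reported above are best read as a lower bound.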

5 Information network analysis

As mentioned earlier, the crawler programs stored the outgoing hyperlinks of the

database items’ web-pages to the other items’ web-pages over a depth of three levels (clicks).

The resulting graph can be analyzed by using a variety of methods related to the field of

network analysis80.

The PageRank algorithm81, popularized through its use by the Google82 search engine,

can be used to help determine a page’s relevance in relation to other pages of a network.

Assuming that a page is casting a vote on another page by linking to it, the importance of a

page is determined by the number of votes cast for it. Also, the importance of the page that is

casting the vote determines how important the vote itself is. For our ZIS/ZOL/VLZ network,

the top-five ranked URLs are shown in the following table:

URL                                         In-Degree83   PageRank (ZIS/ZOL/VLZ network, n=2278, scale: 1-10[84])
http://www.dhm.de/                                  123   8.87
http://www.wiesenthal.com/                          109   7.94
http://www.ubka.uni-karlsruhe.de/kvk.html           119   7.76
http://www.iwm.org.uk/                               24   7.60
http://www.iwmcollections.org.uk/                    19   7.41
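The voting principle behind PageRank can be illustrated with a short power-iteration sketch in Python. It is not the implementation used for the table above: the damping factor of 0.85, the uniform redistribution of rank from pages without out-links, and the omission of the logarithmic 1-10 rescaling are all assumptions of this example.

```python
from typing import Dict, Iterable


def pagerank(links: Dict[str, Iterable[str]],
             damping: float = 0.85,
             iterations: int = 100,
             tolerance: float = 1e-10) -> Dict[str, float]:
    """Plain PageRank on a directed link graph given as {page: [targets]}."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    if not nodes:
        return {}
    n = len(nodes)
    # Ignore self-links and repeated links between the same two pages.
    out = {u: sorted({t for t in links.get(u, ()) if t != u}) for u in nodes}
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(iterations):
        # Rank held by pages without out-links is spread over all pages.
        dangling = sum(rank[u] for u in nodes if not out[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in out[u]:
                # Each page splits its vote evenly among the pages it links to.
                new_rank[v] += damping * rank[u] / len(out[u])
        change = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if change < tolerance:
            break
    return rank
```

The table shows why rank and in-degree can diverge: http://www.iwm.org.uk/ has only 24 in-links but ranks close to pages with more than 100, which is what one would expect if its few votes come from highly ranked pages.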

After converting the directed link graph to a binary asymmetric adjacency matrix, the

wide-ranging capabilities of network analysis software tools like UCINET85 or Pajek86 can be put to

use.

80 An introduction to Social Network Analysis methods is provided by Hanneman (2001) "Introduction to Social Network Methods".

81 Cf. Page, et al. (1998) "The PageRank Citation Ranking: Bringing Order to the Web".

82 http://www.google.com/, January, 20th 2006.

83 The In-Degree indicates how often other pages of the network link to a page.

84 The logarithmic PageRank scale has been simplified in analogy to the Google toolbar here. The higher the value, the more important a page is considered to be. Details of the PageRank calculation can be found in the implementation of cagipch_pagerank.pl, http://pepl.info/viewcvs/trunk/cagipch_pagerank.pl?view=log, February, 22nd 2006.


[Diagram 1, node labels: www.archiv-buergerbewegung.de, people.freenet.de/DDR-Forschung.English_home.htm, www.tu-dresden.de/hait, www.zzf-pdm.de, www.chronik-der-mauer.de, www.calvin.edu/academic/cas/gpa, www.ddr-suche.de, www.thueraz.de, www.bstu.de, www.thueraz.de/links.htm, www.17juni53.de, www.bpb.de/publikationen/*Alltagskultur_Ostdeutschland.html, www.bpb.de/themen/*Weltfestspiele_1973.html, www.stasiopfer.de, www-sul.stanford.edu/depts/hasrg/german/cultural.html, www.umass.edu/defa]

Because the density of ties in the whole ZIS/ZOL/VLZ network is very low (0.3%),

we will partition the network matrix along keyword-based parameters in order to examine

degree, betweenness, and closeness centrality, as well as other network analysis concepts

like Bonacich Power Indices or cliques for the subject-related sub-networks.
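The conversion of the link graph into a binary asymmetric adjacency matrix, the restriction to a keyword-based sub-network, and the density measure can be sketched as follows. The Python function names are hypothetical, and density is assumed here to be the number of directed ties divided by n(n − 1) possible ties; the text does not state which definition the analysis software applied.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

Edge = Tuple[str, str]


def adjacency_matrix(edges: Iterable[Edge],
                     nodes: Optional[Sequence[str]] = None
                     ) -> Tuple[List[str], List[List[int]]]:
    """Binary, asymmetric adjacency matrix; rows link to columns.
    Passing `nodes` restricts the matrix to a sub-network, e.g. all
    items that carry a given keyword."""
    edge_list = list(edges)
    if nodes is None:
        nodes = sorted({node for edge in edge_list for node in edge})
    node_list = list(nodes)
    index = {node: i for i, node in enumerate(node_list)}
    matrix = [[0] * len(node_list) for _ in node_list]
    for source, target in edge_list:
        if source != target and source in index and target in index:
            matrix[index[source]][index[target]] = 1
    return node_list, matrix


def density(matrix: List[List[int]]) -> float:
    """Share of realized directed ties among all possible ones."""
    n = len(matrix)
    if n < 2:
        return 0.0
    return sum(sum(row) for row in matrix) / (n * (n - 1))


def in_degrees(nodes: Sequence[str], matrix: List[List[int]]) -> Dict[str, int]:
    """Number of incoming ties per node (column sums)."""
    return {node: sum(row[j] for row in matrix) for j, node in enumerate(nodes)}
```

Restricting `nodes` to the items tagged with a given keyword yields the kind of sub-network whose density can then be compared with that of the full network.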

Using network visualization software like NetDraw87 allows for a quick and

comprehensive overview of such sub-networks as shown in the following exemplary diagram:

Diagram 1: Network based on keywords 'DDR' and 'Deutsche Demokratische Republik 1949-1990'. Nodes with a degree of one have been removed for better visibility.

At 4.73%, the density of this “GDR-Network” is much higher than that of the overall

ZIS/ZOL/VLZ network. www.bstu.de, www.17juni53.de, and www.zzf-pdm.de can be

identified at a glance as the three central web-pages of the diagram. The “GDR-Network”

has only 31 nodes, whereas the total number of distinct items sharing one of the

two keywords “DDR” or “Deutsche Demokratische Republik 1949-1990” in the aggregate

database is 59. That means that almost half of those items are not cross-linked by the others.

Because of this and the overall very low density of the network, the case for web-subject

gateways filling those missing links can easily be made, assuming that Google does not

take over this function first in the pre-Semantic Web88 era.

85 Borgatti, et al. (1999) "UCINET 6.0 Version 1.00".

86 Batagelj (2005) "Pajek 1.04".

87 Borgatti (2002) "NetDraw: Graph Visualization Software".

88 In the context of this study, an overview of the Semantic Web from a Web-Mining perspective is given by Berendt, et al. (2004) "A Roadmap for Web-Mining: From Web to Semantic Web".


VI Interpreting the results of the analysis: Two theses on Contemporary History in the German language area

The results of the analysis obtained by using the custom framework have laid a

foundation for further discussion and interpretation. In the following, two theses based on the

results will be discussed.

1 Contemporary History is still the history of National Socialism –

Discussing the meaning and purpose of “Zeitgeschichte”

From the subject focus analysis in chapters V 2.3 and V 2.4, it has been concluded that

the events in Europe between 1933 and 1945 are the most frequently treated research topics of

Contemporary History in the German language area. Gaining that insight in the year 2005,

one may ask how contemporary that version of Contemporary History can call itself, and

how inherent this subject focus may be in the discipline itself. In reflecting on the status of

“Zeitgeschichte” as an academic discipline in the German language area, it will be helpful to

discuss its establishment on the one hand and the definitions of “Zeitgeschichte” found in the

historiography on the other.

Two milestones for establishing Contemporary History as an academic discipline in

Germany were (i) the foundation of the “Institut für Zeitgeschichte” (IfZ) in Munich in 1949[89]

and (ii) the publication of the journal Vierteljahreshefte für Zeitgeschichte by that institute

starting in 1953. In the first article of its first issue, the then director of the IfZ and

co-publisher of the journal, Hans Rothfels, introduced “Zeitgeschichte als Aufgabe”.

Subsequently, he narrowed “Zeitgeschichte” down to the “Epoche der Mitlebenden”90, then

starting “etwa mit den Jahren 1917/18”91. Rothfels’ article has had great impact and is still

referenced today by most articles when it comes to defining the status and purpose of

“Zeitgeschichte” in the German language area. The current successor of Rothfels in his role as

head of the IfZ and, consequently, head of one of the largest92 academic institutions dealing

with “Zeitgeschichte” in the German language area, Horst Möller, defined “Zeitgeschichte” as

89 Cf. http://www.ifz-muenchen.de/das_ifz/geschichte.html, January, 23rd 2006.

90 Rothfels (1953) "Zeitgeschichte als Aufgabe" 2; “Epoch of the co-living”.

91 Ibid. 6 (emphasis added); “about the years 1917/18”.

92 Based on the list of staff members available on the homepage, the IfZ is currently the largest with 52 members (http://www.ifz-muenchen.de/mitarbeiter/index.html, February 5th, 2006). The “Zentrum für Zeithistorische Forschung”, established 1996 in Potsdam, follows closely with 51 members (http://www.zzf-pdm.de/mitarb/mtarbfr.html, February 5th, 2006).


the history of the 20th century from 1917 until 1989/91[93] in his latest publication, “Einführung in die Zeitgeschichte”. Has the start of the "Epoch of the Co-Living" not changed in fifty years? Most other current German-speaking historians have taken a different view: “Zeitgeschichte” cannot[94] and should not[95] be exactly defined. Because “Zeitgeschichte” as a term has been assigned to the long period starting from 1917, it has also been proposed to introduce “neueste Zeitgeschichte”[96] as a new term for the most recent years. One way or another: no single concept of “Zeitgeschichte” exists in the German language area, but rather several “Zeitgeschichten”[97].

Besides the meaning of the term “Zeitgeschichte” itself, the strong national focus of historiography has also been a persistent point at issue. Although Hans Rothfels already saw the need for dealing with “Zeitgeschichte” in an international context in 1953[98], Contemporary History in the German language area still meant Contemporary History of the German language area in 2005. Differences or comparisons on an international scale still do not have a strong focus in mainstream historiography[99]. Moreover, initiatives against that national focus[100] have not had any decisive impact so far[101].

Not only the temporal “Epochengrenzen” and the national focus but also the thematic focus of the historiography on Contemporary History in the German language area have been and still are subject to debate. In hindsight, it looks as if the postulate by Hans Rothfels from

93 Möller and Wengst (2003) "Einführung in die Zeitgeschichte" 11: “Als Zeitgeschichte gilt heute in Deutschland die Geschichte des 20. Jahrhunderts von 1917 bis 1989/91”. Möller qualifies this rather absolute definition later on page 25: Zeitgeschichte is “ebenso fließend wie ihr Gegenstandsbereich”.

94 Cf. Gehler (2001) "Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und Globalisierung" 12: “Dilemma der Zeitgeschichtsschreibung, daß sie sich einer genauen Zuordnung entzieht, was in der Natur der Sache liegt”.

95 Cf. Hockerts (2001) "Zeitgeschichte in Deutschland. Begriff, Methoden, Themenfelder" 19: “Zeitgeschichte” should itself “niemals zwischen die Grenzpfähle exklusiver Definitionen sperren lassen”.

96 Cf. Schwarz (2003) "Die neueste Zeitgeschichte".

97 Cf. Mattl "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer 'dritten Generation'" for the specifics of the Austrian historiography on “Zeitgeschichte”.

98 Rothfels 7: “[…] daß Zeitgeschichte als Aufgabe im Prinzip einer Behandlung im internationalen Rahmen bedarf”.

99 Cf. Möller and Wengst "Einführung in die Zeitgeschichte" 11: In his introduction on “Zeitgeschichte” Möller writes: “wobei internationale Entwicklungen soweit möglich und erforderlich in die Betrachtung einbezogen werden” (emphasis added). Critical about this: Angerer (2004) "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung in die Zeitgeschichte' sein kann" 264.

100 For example: Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung" and Kleßmann (2002) "Zeitgeschichte als wissenschaftliche Aufklärung".

101 Cf. Jarausch (2004) "Zeitgeschichte zwischen Nation und Europa. Eine transnationale Herausforderung", 4.


over fifty years ago, namely to deal with the time of National Socialism as an obligation[102], has been dutifully fulfilled. In particular, the historiography on “Zeitgeschichte” of the last fifty years has had a determining focal point in dealing with the time of National Socialism[103]. Although that rather exclusive focus, mainly in combination with a focus on political history[104], has been strongly criticized[105] and was already seen as waning in 1991[106], the results of the analysis presented in chapters V 2.3 and V 2.4 of this study show that it is still very present.

The alternative or additional topics for Contemporary History in the German language area are innumerable. “Globalization”[107] and the “History of European Integration”[108] are just two recently discussed potential topics of “Zeitgeschichte”.

Especially in Germany and Austria, with their specific historical backgrounds, it will not be easy to have Contemporary History set its topical focus after 1945 more or less in passing, as is currently practiced by the Anglo-American Journal of Contemporary History[109]. The latter’s publisher announced the new chronological focus in an editorial. Not

102 Cf. Rothfels 8: It is an “unabweisbare Verpflichtung gerade der deutschen Wissenschaft, die nationalsozialistische Phase mit aller Energie anzugehen”.

103 Möller and Wengst "Einführung in die Zeitgeschichte" 25: “entscheidende Prägung der deutschen Zeitgeschichtsforschung durch die NS-Thematik”. Also see: Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung" 33: “beispiellos untersuchtes ‘Drittes Reich’”.

104 Cf. Angerer "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung in die Zeitgeschichte' sein kann" 265f.

105 For example: Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung" 33: “Eine ausschließlich oder überwiegend mit dem Nationalsozialismus befasste Zeitgeschichte bleibt nicht nur rückwärtsgewandt, sondern auch rückständig”. Similar in a review of the Virtual Library Zeitgeschichte: Böhler and Gehler "Wendungen nach innen? Selektive Blicke auf die Zeitgeschichte": “Ein zeithistoriografischer Mauerfall im Sinne einer gegenwartsorientierten Zeitgeschichtsbetrachtung hat hier ebenso wenig stattgefunden, wie eine Zeitgeschichte im Stile eines ‘dynamischen Mehrebenensystems’ erkennbar wird. […] Fixiert auf ‘Drittes Reich’ und Holocaust bewegt sich nahezu alles um die Fluchtpunkte 1933 und 1945 und ihre ‘Rezeption’“.

106 Cf. Botz (1991) "Zeitgeschichte in einer politisierten Geschichtskultur. Historiographie zum 20. Jahrhundert in Österreich" 328: “das nur allmähliche Abklingen der ‘politischen’, ‘klassischen’ Zeitgeschichte und die langsame Heraufkunft der Gegenwartsgeschichte mit ihren epochenübergreifenden, dennoch spezifisch gegenwartsbezogenen humanwissenschaftlichen Momenten […] erscheint als wahrscheinlich”.

107 Gehler "Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und Globalisierung" 192-197. In Möller and Wengst "Einführung in die Zeitgeschichte", Globalization is only indirectly referred to as “Interdependenz der modernen Welt” on half of page 51. See Thomas Angerer’s critical comments on that in Angerer "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung in die Zeitgeschichte' sein kann" 264.

108 Cf. Jarausch, Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung", and Hockerts "Zeitgeschichte in Deutschland. Begriff, Methoden, Themenfelder" 19.

109 Cf. Marwick (1997) "A New Look, A New Departure: A Personal Comment on Our Changed Appearance", Journal of Contemporary History 32, 6 cited in Angerer (1997)


to think about the discussions and protests such a modus operandi would provoke for a German or Austrian journal on Contemporary History.

The seemingly unchangeable central subject focus in the German language area may be additionally amplified by the mere structure of academic organization, where departments and chairs for “Zeitgeschichte” and “Neuere Geschichte” co-exist with blurry and partially overlapping contours that are easily prone to turf wars. For “Zeitgeschichte” researchers it cannot be easy to leave behind, metaphorically speaking, the field of specialty for which they have been known for years or even decades among the public, in academia and, not to forget, at the institutions responsible for funding: university heads and the federal ministry.

Still, if the current focus of Contemporary History in the German language area

remains the same, should it then still be called “Zeitgeschichte”?

Although far from being widespread, a discussion about the crisis of academic “Zeitgeschichte” has already started[110]. To prevent such a crisis from manifesting itself further, a new and clear answer to Hans Rothfels’ original question about the task of “Zeitgeschichte” is required. Such an answer would need to provide a clear profile of “Zeitgeschichte” and could thereby explicitly denote the diversity of topics and methods as inherent properties. Absolute definitions limiting “Zeitgeschichte” to the 20th century or a certain geographic area will be of very little help in that regard. Such definitions would become more and more unsatisfactory with the progress of time and the global challenges of the future.

2 Subject gateways are needed as hubs in the Contemporary History Network

As has been demonstrated in chapter V 5, the network density of the ZIS/ZOL/VLZ Contemporary History network is very low. Only a vanishingly small number of websites are linked among themselves. Reasons for this include, inter alia, the still reluctant attitude of German historians towards “Computing in the Humanities” in general[111] and the World Wide Web in particular. In the recent “Einführung in die Zeitgeschichte” by Horst Möller and Udo Wengst, the only article related to the field of Computing in the Humanities is titled

"'Gegenwartsgeschichte'? Für eine Zeitgeschichte ohne Ausflüchte" 50. Also see ibid. 53: “Demnach wird sich das JCH künftig auf die Zeit nach 1945 konzentrieren“.

110 Cf. Gehler "Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und Globalisierung", 197-204. Also see Falch, et al. (2003) "The neXt Generation - 7 Positionen" 383.

111 Cf. Enderle "Geschichtswissenschaft, Fachinformation und das Internet" 2.


“Internet”[112] and comprises six pages in a subsection called “practical tools”. The quality of that article with regard to its usefulness for students of Contemporary History matches its rather meager length and furthermore mirrors the attitude towards the relevance of the Internet for the field of Contemporary History in the German language area; a relevance that is even tersely questioned per se in the preface of that book[113], published by one of the largest academic institutions dealing with “Zeitgeschichte” in the German language area.

The character of most of the web pages related to Contemporary History as indexed by the three subject gateways ZIS/ZOL/VLZ can be seen as another reason for the very low network density: many of the pages have a representative character and include neither interactive elements like discussion fora, nor sources, nor other material potentially relevant for research or teaching. In addition, scientific publishing standards are not met by many of the web-pages[114], revealing a striking discrepancy between the “offline-” and the “online-world” of scientific Contemporary History. Most web-pages lack any link to subject-related counterparts. One may therefore conclude that the original idea of hypertext has not had its breakthrough in the Contemporary History network.
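The notion of network density referred to above can be made concrete with a small calculation. The following is a minimal sketch, not the analysis carried out in chapter V 5: it assumes a directed network without self-links, in which density is the number of observed links divided by the n(n-1) possible ones. The function name, the site names and the single link are invented for illustration.

```python
def directed_density(nodes, links):
    """Density of a directed network without self-links: L / (n * (n - 1))."""
    n = len(nodes)
    if n < 2:
        return 0.0
    # Count each distinct ordered pair once; ignore self-links and unknown nodes.
    distinct = {(s, t) for s, t in links if s != t and s in nodes and t in nodes}
    return len(distinct) / (n * (n - 1))


# Hypothetical example: four sites, only one of which links to another.
sites = {"site-a.example", "site-b.example", "site-c.example", "site-d.example"}
hyperlinks = [("site-a.example", "site-b.example")]
print(directed_density(sites, hyperlinks))
```

In this invented case, one link out of twelve possible ones is present; a network in which most pages link to no subject-related counterpart yields values close to zero in the same way.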

For Contemporary History research professionals in the German language area, the H-Net mailing list H-Soz-u-Kult[115] has established itself in a central role for subject-specific communication and information[116]. Web-pages, including subject gateways, only play a secondary role[117], although subject gateways could provide a unique service, especially with regard to editorial quality filters.

Interested non-professionals or undergraduate students, on the other hand, will primarily see the “online-world” of Contemporary History through the World Wide Web. Web-based subject gateways that fill the large gaps in the Contemporary History network are therefore a necessity given the current state of the World Wide Web.

112 Möller and Wengst "Einführung in die Zeitgeschichte" 255-260.

113 Ibid. 12: “Ein Kapitel unter der Überschrift ‘Praktische Hilfsmittel’ [..] enthält [..] einige Basisinformationen über das Internet, soweit es für Zeithistoriker von Bedeutung ist” (emphasis added).

114 Cf. Wirtz 4 and Dornik "Zeitgeschichte und Internet" 166.

115 Cf. Footnote 25.

116 Cf. Enderle "Geschichtswissenschaft, Fachinformation und das Internet" 5.

117 Cf. the conclusion of Wirtz about the Internet as a publishing medium for Contemporary Historians, 4: “Offensichtlich wird das Internet bei ZeitgeschichtlerInnen immer noch nicht als wissenschaftliches Publikationsmittel ernst genommen”.


VII Conclusions

In the course of this field study, three major web-based subject gateways on Contemporary History in the German language area have been analyzed as an example of applied Historical Information Science.

The application of generic technical web-page evaluation methods showed that the three subject gateways’ support for formal standards is poor. Neither syntactic nor metadata nor accessibility standards are applied to an adequate degree. A usability evaluation using two sets of custom evaluation criteria has revealed that the three subject gateways score satisfactorily with regard to a usability guideline set for searchable databases, and that they score well with regard to a generic web-usability guideline set.

For further analysis, specifically tailored crawlers had to be developed that use heuristics to harvest the data of the link databases, because the subject gateways use no standards-based interoperability framework. Analysis of the data in the harvested aggregate link database has demonstrated that the subject gateways’ management software neither prevents duplicate resources from being entered nor takes measures to avoid spelling mistakes during classification.
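Both defects named above can in principle be checked mechanically. The sketch below is an illustration under stated assumptions rather than the procedure used in this study: `normalize_url` applies a few simple, hypothetical normalization rules (lowercasing scheme and host, dropping a trailing slash and the fragment), and `near_duplicate_labels` flags classification labels whose similarity ratio, computed with Python's standard `difflib` module, reaches an arbitrarily chosen threshold. All function names and sample data are invented.

```python
from difflib import SequenceMatcher
from itertools import combinations
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Reduce trivially different spellings of a URL to one form (assumed rules)."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    # Lowercase scheme and host, drop the fragment, keep path case and query.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def find_duplicate_urls(urls):
    """Group URLs by normalized form; return only groups with more than one member."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


def near_duplicate_labels(labels, threshold=0.9):
    """Return pairs of distinct labels that are suspiciously similar (possible typos)."""
    pairs = []
    for a, b in combinations(sorted(set(labels)), 2):
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical records as a crawler might have harvested them.
urls = [
    "http://www.example.org/archiv/",
    "HTTP://WWW.EXAMPLE.ORG/archiv",
    "http://www.example.org/quellen",
]
labels = ["Nationalsozialismus", "Nationalsozialsmus", "Holocaust"]
print(find_duplicate_urls(urls))
print(near_duplicate_labels(labels))
```

A check of this kind could run when a resource is entered; the threshold of 0.9 is a guess and would need tuning against real classification data.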

Subject classification analysis has shown that a strong focus on “National Socialism” and “Holocaust” is present in the “online-world” of Contemporary History in the German language area. A comparison with the subject classification of the “offline-world” revealed a similarly strong subject focus.

The three subject gateways’ link databases share less than 1% of their total sum of URLs, which indicates that the overall network density of the ZIS/VLZ/ZOL network is very low. Applying information network analysis methods further supports this indication.
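How such a share can be computed is sketched below. This is a hedged illustration, not the study's own procedure: it assumes that "shared" means URLs present in all link databases, measured against the sum of the database sizes, and it uses invented URL sets. Other readings, for example the share relative to the number of distinct URLs, would give different figures.

```python
def shared_share(*link_sets):
    """Share of URLs common to all sets, relative to the summed set sizes (assumed definition)."""
    total = sum(len(s) for s in link_sets)
    if total == 0:
        return 0.0
    common = set.intersection(*map(set, link_sets))
    return len(common) / total


# Hypothetical link databases of three gateways.
gateway_a = {"http://a.example/", "http://b.example/", "http://c.example/"}
gateway_b = {"http://a.example/", "http://d.example/"}
gateway_c = {"http://a.example/", "http://e.example/", "http://f.example/"}
print(f"{shared_share(gateway_a, gateway_b, gateway_c):.1%}")
```

The comparison is only meaningful if URLs are brought into a common form first; otherwise trivially different spellings of the same address count as distinct resources.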

The field study has shown that the three subject gateways are not making use of the potential of current web technologies and metadata standards. Especially in light of the development towards a Semantic Web, it remains to be hoped that the subject gateways on Contemporary History at issue will improve their standards compliance and interoperability, so that they will not remain in the domain of a comparatively meaningless web.

The results of the application of the custom analysis framework stimulated further discussion on the meaning and purpose of “Zeitgeschichte” as a term per se on the one hand and as an academic profession on the other. In addition, they inspired an interpretation of the role of subject gateways in the Contemporary History Network of the World Wide Web.


This study has introduced aspects of a methodological canon for evaluating subject gateways as an example of applied Historical Information Science. The author has shown that approaching historical content by analyzing its “secondary”, i.e. formal, aspects can prepare the ground for new insights about the content itself, and that such analysis can enrich the discussion about the current status and prospects of a whole discipline.


VIII Bibliography

Angerer, T. (1997). "Gegenwartsgeschichte"? Für eine Zeitgeschichte ohne Ausflüchte. Zeitgeschichte im Wandel. 3. Österreichische Zeitgeschichtetage 1997, Wien, Innsbruck - Wien: Studienverlag 1998: 46-53.

Angerer, T. (2004). "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung

in die Zeitgeschichte' sein kann." zeitgeschichte 4 (31): 261-269.

Angerer, T. (2005). Zur Kritik an NS-Fixierungstendenzen der Österreichischen Zeitgeschichtsforschung. Mit einem Blick auf den französischen Vergleichsfall. Demokratie - Zivilgesellschaft - Menschenrechte. Österreichischer Zeitgeschichtetag 2001, Klagenfurt, 4. - 6. Oktober 2001, Klagenfurt, Innsbruck - Wien - München - Bozen: Studienverlag 2005 (in print). Longer PDF-Version available at http://www.univie.ac.at/igl.geschichte/angerer/IfG_homepage/Aufsaetze/Angerer_Zur_Kritik_an_NS-Fixierungstendenzen.pdf.

Apps, A. (2004). zetoc SOAP: A Web Service Interface for a Digital Library Resource.

Research and Advanced Technology for Digital Libraries, 8th European Conference,

ECDL 2004, Bath, UK.

Arbeitsgemeinschaft der Parlamentsbibliotheken und Behördenbibliotheken. (2003).

"Stellungnahmen, Materialien und Informationen zu dem Beschluss des

Standardisierungsausschusses bei der Deutschen Bibliothek, einen Umstieg von den

deutschen auf internationale Regelwerke und Formate (AACR und MARC) anzustreben."

Retrieved 2005-04-05, from http://www.apbb.de/aacr.html.

Baca, M., A. Gilliland-Swetland, T. Gill and M. Woodley. (2000). "Introduction to metadata:

Pathways to digital information." Retrieved 2005-04-20, from

http://www.getty.edu/research/conducting_research/standards/intrometadata/index.html.

Badre, A. N. (2002). Shaping Web Usability: Interaction Design in Context, Addison-Wesley

Professional.

Barton, M. R. and M. M. Waters. (2004). "Creating an Institutional Repository: LEADIRS

Workbook." Retrieved 2005-04-08, from http://dspace.org/implement/leadirs.pdf.


Batagelj, V. (2005). Pajek 1.04. Retrieved 2005-04-04, from http://vlado.fmf.uni-lj.si/pub/networks/pajek/.

Benta, M. (2003). Agna 2.1.1. Retrieved 2005-03-02, from

http://www.geocities.com/imbenta/agna/.

Berendt, B., A. Hotho, D. Mladenic, M. van Someren, M. Spiliopoulou and G. Stumme (2004). A Roadmap for Web-Mining: From Web to Semantic Web. Web Mining: From Web to Semantic Web. First European Web Mining Forum, EWMF 2003, Cavtat-Dubrovnik, Croatia.

Bergman, M. M. and A. P. M. Coxon (2005). "The Quality in Qualitative Methods." Forum: Qualitative Social Research 6 (2). Retrieved 2005-12-06, from http://www.qualitative-research.net/fqs-texte/2-05/05-2-34-e.htm.

Berners-Lee, T., J. Hendler and O. Lassila (2001). "The Semantic Web. A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities." Scientific American (May 2001). Retrieved 2006-01-27, from http://www.scientificamerican.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21&catID=2.

Biste, B. and R. Hohls (2000). "Fachinformation und EDV-Arbeitstechniken für Historiker.

Einführung und Arbeitsbuch. Anhang: Online-Referenz." HSR-TRANS 6. Retrieved

2005-06-20, from http://hsr-trans.zhsf.uni-koeln.de/volume6.htm.

Blandford, A. and G. Buchanan (2002). Workshop report: Usability of Digital Libraries @

JCDL’02. Joint Conference on Digital Libraries 2002. Portland, Oregon, USA. Retrieved

2005-11-27, from http://www.uclic.ucl.ac.uk/annb/docs/SIGIR.pdf.

Bodoff, D., P. C. K. Hung and M. Ben-Menachem (2005). "Web metadata standards: observations and prescriptions." IEEE Software 22 (1): 78-85. Retrieved 2005-03-01, from http://ieeexplore.ieee.org/iel5/52/30054/01377128.pdf?isnumber=30054&prod=JNL&arnumber=1377128&arSt=+78&ared=+85&arAuthor=David+Bodoff%3B+Hung%2C+P.C.K.%3B+Ben-Menachem%2C+M.


Böhler, I. (2001). "Zeitgeschichtsforschung und Internet. ZIS (Zeitgeschichte-Informations-System) als Beispiel." e-forum Zeitgeschichte 1. Retrieved 2005-03-18, from http://www.eforum-zeitgeschichte.at/1_01a5.html.

Böhler, I. and M. Gehler (2004). "Wendungen nach innen? Selektive Blicke auf die Zeitgeschichte." Zeithistorische Forschungen/Studies in Contemporary History 1. Retrieved 2005-03-01, from http://www.zeithistorische-forschungen.de/16126041-Boehler-Gehler-1-2004.

Böhler, I., M. Kröll and E. Pfanzelter (1999). "Surfen in der Zeitgeschichte. ZIS: Das

österreichische Zeitgeschichte-Informations-System im Internet." medien & zeit.

Kommunikation in Geschichte und Gegenwart 14 (4): 43-50.

Böhler, I. and E. Pfanzelter (1997). ZIS: Das österreichische Zeitgeschichte-Informations-System im Internet. Zeitgeschichte im Wandel. Österreichische Zeitgeschichtetage 1997, Wien, Innsbruck - Wien: Studienverlag 1998: 449-458.

Boonstra, O., L. Breure and P. Doorn (2004). Past, present and future of historical information science. Retrieved 2005-04-02, from http://www.niwi.nl/nl/geschiedenis/medewerkers/peter_doorn_home_page/new_0_copy1/past_present_future_of_historical_information_science/new/C%3A%5CDocuments+and+Settings%5CPeterD%5CMy+Documents%5CNIWI%5CPPF%5CPPF+voor+web.pdf.

Borgatti, S. P. (2002). NetDraw: Graph Visualization Software, Analytic Technologies.

Retrieved 2005-03-18, from http://www.analytictech.com/ucinet.htm.

Borgatti, S. P., M. G. Everett and L. C. Freeman (1999). UCINET 6.0 Version 1.00, Analytic

Technologies. Retrieved 2005-03-18, from http://www.analytictech.com/ucinet.htm.

Botz, G. (1991). Zeitgeschichte in einer politisierten Geschichtskultur. Historiographie zum

20. Jahrhundert in Österreich. Geschichtswissenschaft vor 2000. Perspektiven der

Historiographiegeschichte, Geschichtstheorie, Sozial- und Kulturgeschichte. Festschrift

für Georg G. Iggers zum 65. Geburtstag. K. H. Jarausch, J. Rüsen and H. Schleier. Hagen:

299-328.

Botz, G. (1993). Zwölf Thesen zur Zeitgeschichte in Österreich. Österreichischer

Zeitgeschichtetag 1993, Innsbruck, Wien: Studienverlag 1995: 19-33.


Brandes, U., D. Wagner, M. Baur, M. Benkert, S. Cornelsen, M. Gaertler, B. Köpf, J. Lerner

and J. Ritter (2002). visone 1.1. Retrieved 2005-03-05, from http://www.visone.de/.

Brodersen, M. D., Jürgen and J.-H. Kirsch (2003). "Zeitgeschichte-Online - Ein Fachportal

für die zeithistorische Forschung." Potsdamer Bulletin für Zeithistorische Studien (30/31):

12-16. Retrieved 2005-01-10, from http://www.zzf-pdm.de/bull/pdf/b3031/3031_zol.pdf.

Campbell, D., N. Van Kempen, L. Arkles and B. Rozmus. (2003). "Definitions for Web-Based Services." Retrieved 2006-01-23, from http://www.nla.gov.au/initiatives/sg/servicetypes.html.

Caplan, P. (2004). "OAI-PMH." Computers in Libraries 24 (2): 24.

Clio-online – Historisches Informationssystem. (2005). "Arbeits- und Ergebnisbericht des DFG-Projektes Clio-online - Historisches Informationssystem (Bericht zur Projektphase I)." Retrieved 2005-03-21, from http://www.clio-online.de/rainbow/_Rainbow/documents/Clio_online_Endbericht_Web_20050211.pdf.

Dean, R. J. (2003). FAST: Development of Simplified Headings for Metadata. Authority

Control: Definition and International Experiences. Florence, Italy. Retrieved 2005-12-13,

from http://www.sba.unifi.it/ac/relazioni/dean_eng.pdf.

Deutsche Bibliothek. (2005). "Schlagwortnormdatei (SWD)." Retrieved 2005-12-12, from

http://www.ddb.de/standardisierung/normdateien/swd.htm.

Dornik, W. (2003). Zeitgeschichte und Internet, Dissertation, University of Graz, 223 pages.

Edmonds, K. A., A. Stephenson and M. Ashmore (2003). uzReview 0.7.1. Retrieved 2005-12-14, from http://uzilla.mozdev.org/heuristicreview.html.

Enderle, W. (2001). Der Historiker, die Spreu und der Weizen, zur Qualität und Evaluierung geschichtswissenschaftlicher Internet-Ressourcen. Geschichte und Internet – Raumlose Orte, geschichtslose Zeit? Geschichte und Informatik – Histoire et Informatique 12: 49-64. Retrieved 2006-01-17, from http://www.hist.net/hs-kurs/qualitaet/doku/enderle_qualitaet.pdf.


Enderle, W. (2001). "Geschichtswissenschaft, Fachinformation und das Internet." eforum zeitGeschichte 3/4. Retrieved 2006-01-17, from http://www.eforum-zeitgeschichte.at/3_01a7.pdf.

Eversberg, B. (1999). "AACR: Die 50 wichtigsten Begriffe." Retrieved 2005-04-05, from

http://www.allegro-c.de/formate/aacr-it.htm.

Eversberg, B. (2004). "Eine seltene Sache. Erwartung und Ernüchterung bei der thematischen

Katalogsuche." Retrieved 2005-12-13, from http://www.allegro-c.de/regeln/cosarara.htm.

Eversberg, B. (2005a). "Sachliche Erschließung. Eine Aufgabe mit vielen Facetten."

Retrieved 2005-12-13, from http://www.allegro-c.de/formate/se.htm.

Eversberg, B. (2005b). "Wie katalogisiert man ein Buch? Ein Leitfaden nicht nur für Einsteiger." Retrieved 2005-12-10, from http://www.allegro-c.de/regeln/rak-einf.htm.

Eversberg, B. (2005c). "Zur Zukunft der Katalogisierung.jenseits RAK und AACR."

Retrieved 2005-12-12, from http://www.allegro-c.de/formate/zk.htm.

Falch, S., H.-C. Gruber, G. Lamprecht, L. Rettl, A. Schober, M. Sommer and R. Thumser

(2003). "The neXt Generation - 7 Positionen." Zeitgeschichte 30 (6): 376-386.

Gehler, M. (2001). Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen

Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und

Globalisierung. Bochum.

Gehler, M. (2002). "Zeitgeschichte zwischen Europäisierung und Globalisierung." Aus Politik und Zeitgeschichte 51-52: 23-35. Retrieved 2006-01-15, from http://www.bpb.de/publikationen/0RIY7E,0,0,Zeitgeschichte_als_wissenschaftliche_Aufkl%E4rung.html.

Gehringer, H. (2003). "Rez. WWW: Zeitgeschichte Informations System (ZIS)." Retrieved 2005-01-18, from http://hsozkult.geschichte.hu-berlin.de/rezensionen/id=18&type=rezwww.

Göttingen, N. S.-u. U. (1999). Das Sondersammelgebiets-Fachinformationsprojekt (SSG-FI) Göttingen. Dokumentation – Teil 1. dbi-materialien. Schriften der Deutschen Forschungsgemeinschaft, Deutsches Bibliotheksinstitut. 185. Retrieved 2005-02-17, from http://www.sub.uni-goettingen.de/ssgfi/projekt/ssgfi.pdf.

Gradmann, S. (2005). Hat Bibliothekswissenschaft eine Zukunft? Abweichlerische Gedanken zur Zukunft einer Disziplin mit erodierendem Gegenstand. Bibliothekswissenschaft – quo vadis? = Library Science – quo vadis? Eine Disziplin zwischen Traditionen und Visionen; Programme – Modelle – Forschungsaufgaben. P. Hauke: 97-102. Retrieved 2005-12-11, from http://www.rrz.uni-hamburg.de/RRZ/S.Gradmann/Bibliothekswissenschaft_gradmann.pdf.

Greenberg, J. (2003a). "The Semantic Web: More than a Vision." Bulletin of the American

Society for Information Science and Technology 29 (4): 6-7. Retrieved 2005-04-09, from

http://www.asis.org/Bulletin/Apr-03/greenberg.html.

Greenberg, J. (2004). "Metadata Extraction and Harvesting: A Comparison of Two Automatic

Metadata Generation Applications." Journal of Internet Cataloging 6 (4): 59-82 Retrieved,

from http://www.ils.unc.edu/mrc/automatic.pdf.

Greenberg, J., K. Spurgin and A. Crystal (2005). Final Report for the AMeGA (Automatic

Metadata Generation Applications) Project, UNC School of Information and Library

Science. Retrieved 2005-03-06, from

http://www.loc.gov/catdir/bibcontrol/lc_amega_final_report.pdf.

Greenberg, J., S. Sutton and D. G. Campbell (2003). "Metadata: A Fundamental Component of the Semantic Web." Bulletin of the American Society for Information Science and Technology 29 (4): 16-18. Retrieved 2005-04-09, from http://www.asis.org/Bulletin/Apr-03/greenbergetal.html.

Hackathorn, R. (2003a). "The Link is the Thing. Part I." DM Review (August). Retrieved

2005-04-12, from http://www.bolder.com/ALA/Link-DMR2003.pdf.

Hackathorn, R. (2003b). "The Link is the Thing. Part II." DM Review (September). Retrieved

2005-04-12, from http://www.bolder.com/ALA/Link-DMR2003.pdf.

Hackathorn, R. (2003c). "The Link is the Thing. Part III." DM Review (October). Retrieved

2005-04-12, from http://www.bolder.com/ALA/Link-DMR2003.pdf.


Hanneman, R. A. (2001). "Introduction to Social Network Methods." Retrieved 2005-04-10,

from http://faculty.ucr.edu/~hanneman/SOC157/NETTEXT.PDF.

Hauke, P., J. Grünewald, B. Kaden, A. Kaufmann and M. Kindling (2005). Library Science -

quo vadis? (Re)Discovering “Bibliothekswissenschaft”. World Library and Information

Congress: 71th IFLA General Conference and Council. "Libraries - A voyage of

discovery". Oslo, Norway. Retrieved 2005-12-11, from

http://www.ifla.org/IV/ifla71/papers/048e-Hauke.pdf.

Heiner-Freiling, M. and L. G. Svensson. (2005). "Dewey-Dezimalklassifikation." Retrieved

2005-12-13, from http://www.ddc-deutsch.de/.

Hennig, N. and C. Quirion. (2004). "Best practices for web interfaces of searchable

databases." Retrieved 2005-12-01, from

http://macfadden.mit.edu:9500/webgroup/heuristics/index.html.

Hernández, M. A. and S. J. Stolfo (1998). "Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem." Data Mining and Knowledge Discovery 2 (1): 9-37. Retrieved 2005-05-01, from http://springerlink.metapress.com/openurl.asp?genre=article&id=doi:10.1023/A:1009761603038.

Hillman, D. (2003). "Using Dublin Core." Retrieved 2005-03-03, from

http://www.dublincore.org/documents/usageguide/.

Hockerts, H. G. (2001b). "Zugänge zur Zeitgeschichte: Primärerfahrung, Erinnerungskultur, Geschichtswissenschaft." Aus Politik und Zeitgeschichte 28: 15-30. Retrieved 2006-01-16, from http://www.bpb.de/publikationen/JSE0YE,0,0,Zug%E4nge_zur_Zeitgeschichte:_Prim%E4rerfahrung_Erinnerungskultur_Geschichtswissenschaft.html.

Hockerts, H. G. (2001a). "Zeitgeschichte in Deutschland. Begriff, Methoden, Themenfelder."

Aus Politik und Zeitgeschichte 29-30: 3-19.

Hohls, R. (2004). "H-Soz-u-Kult: Kommunikation und Fachinformation für die

Geschichtswissenschaften." Historical Social Research (HSR) 29 (1): 212-232.


Hohls, R. and P. Helmberger (1999). "H-Soz-u-Kult: Eine Bilanz nach drei Jahren." Historical

Social Research (HSR) 24 (3): 7-35.

Holzbauer, R. (2001). "Ein Historiker im Netz." e-forum Zeitgeschichte (3). Retrieved 2005-03-18, from http://www.eforum-zeitgeschichte.at/set3_01a5.htm.

Jacob, E. K. (2003). "Ontologies and the Semantic Web." Bulletin of the American Society

for Information Science and Technology 29 (4): 19-22. Retrieved 2005-04-09, from

http://www.asis.org/Bulletin/Apr-03/jacob.html.

Jarausch, K. H. (2004). "Zeitgeschichte zwischen Nation und Europa. Eine transnationale Herausforderung." Aus Politik und Zeitgeschichte 39. Retrieved 2006-01-15, from http://www.bpb.de/publikationen/YZF6ZR,0,0,Zeitgeschichte_zwischen_Nation_und_Europa.html.

Jenks, S. and S. Marra, Eds. (2001). Internet-Handbuch Geschichte. Köln - Weimar - Wien.

Jenks, S. and P. Tiedemann (2000). Internet für Historiker. Eine praxisorientierte Einführung.

Darmstadt.

Kelly, B. (2004). Interoperable Digital Library Programmes? We Must Have QA! Research

and Advanced Technology for Digital Libraries, 8th European Conference, ECDL 2004,

Bath, UK.

Kelly, B. D., A. Guy, M. Phipps, L. (2003). "Ideology Or Pragmatism? Open Standards And

Cultural Heritage Web Sites". ichim03. Retrieved 2005-03-06, from

http://www.ukoln.ac.uk/qa-focus/documents/papers/ichim03/.

Kieslinger, C. (2004). Historischer Content Online – Fachinformation in Österreich. ODOK

'03 Ein Jahrzehnt World Wide Web: Rückblick - Standortbestimmung - Ausblick.

Tagungsbericht vom 10. Österreichischen Online-Informationstreffen und 11.

Österreichischen Dokumentartag, 23.-26. September 2003, Universität Salzburg,

Naturwissenschaftliche Fakultät. Ed. E. Pipp, Biblos-Schriften 179: 237-247.

Kleßmann, C. (2002). "Zeitgeschichte als wissenschaftliche Aufklärung." Aus Politik und

Zeitgeschichte 51-52: 3-12. Retrieved 2006-01-15, from

http://www.bpb.de/publikationen/0RIY7E,0,0,Zeitgeschichte_als_wissenschaftliche_Aufkl%E4rung.html.

Koch, T. (2000). "Quality-controlled subject gateways: definitions, typologies, empirical

overview." Online Information Review 24 (1): 24-34. Manuscript retrieved 2006-01-23,

from http://www.lub.lu.se/~traugott/OIR-SBIG.txt.

Koyani, S. J., R. W. Bailey, J. R. Nall, S. Allison, C. Mulligan, K. Bailey and M. Tolson

(2003). Research-Based Web Design & Usability Guidelines, U.S. Department of Health

and Human Services (HHS). Retrieved 2005-11-25, from

http://usability.gov/pdfs/guidelines_book.pdf.

Kröll, M. (2005). Not ready for the Semantic Web: A field study of subject gateways on

Contemporary History. XVI international conference of the Association for History and

Computing (AHC 2005). Amsterdam, Netherlands: 176-181. Retrieved 2006-01-10, from

http://www.knaw.nl/publicaties/pdf/20051064.pdf.

Kropač, I. (2004). "Was ist 'Historische Fachinformatik und Dokumentation'?

Terminologisches, Inhalte, Aufgaben." Retrieved 2006-01-23, from http://hfi.uni-graz.at/hfi/allg/hfi.htm.

Krug, S. (2000). Don't Make Me Think. A Common Sense Approach to Web Usability.

Indianapolis, Indiana, USA.

Kyunghye, K. (2002). A Model-based Approach to Usability Evaluation for Digital Libraries.

JCDL'02 Workshop on Usability of Digital Libraries, Portland, Oregon, USA. Retrieved

2005-11-27, from http://www.uclic.ucl.ac.uk/annb/docs/Kim33.pdf.

Lagoze, C., H. Van de Sompel, M. Nelson and S. Warner. (2002). "The Open Archives

Initiative Protocol for Metadata Harvesting." Retrieved 2005-04-10, from

http://www.openarchives.org/OAI/openarchivesprotocol.html.

Longzhuang, L., Y. Shang and W. Zhang (2002). Improvement of HITS-based Algorithms on

Web Documents. The Eleventh International World Wide Web Conference. Honolulu,

Hawaii, USA. Retrieved 2005-09-01, from http://www2002.org/CDROM/refereed/643/.

Lorenz, B. "The Regensburg Classification: An introduction." Retrieved 2005-12-20, from

http://www.bibliothek.uni-regensburg.de/Systematik/RVK-Intro_en.pdf.

Mattl, S. (1983). Bestandsaufnahme zeitgeschichtlicher Forschung in Österreich,

Bundesministerium für Wissenschaft und Forschung, Vienna.

Mattl, S. (2003). "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer ‘dritten

Generation’." Zeitgeschichte 30 (6): 357-366.

McCrank, L. (2002). Historical Information Science. An emerging Unidiscipline. Medford,

New Jersey, USA.

Miller, E. and R. Swick (2003). "An Overview of W3C Semantic Web Activity." Bulletin of

the American Society for Information Science and Technology 29 (4): 8-11. Retrieved

2005-04-09, from http://www.asis.org/Bulletin/Apr-03/millerswick.html.

Mirzaee, V., L. Iverson and B. Hamidzadeh (2004). Towards Ontological Modelling of

Historical Documents. 7th International Protégé Conference. Bethesda, Maryland, USA.

Retrieved 2005-10-01, from

http://protege.stanford.edu/conference/2004/abstracts/Mirzaee.pdf.

Mitchell, W. B., L. Davidson, R. Ziegler and A. Viles. (1999). "Testing the Design of a

Library Information Gateway." Retrieved 2005-04-02, from

http://library.georgiasouthern.edu/usability/acrlwebpapers5.pdf.

Möller, H. (1988). "Zeitgeschichte - Fragestellungen, Interpretationen, Kontroversen." Aus

Politik und Zeitgeschichte 2: 3-16.

Möller, H. and U. Wengst, Eds. (2003). Einführung in die Zeitgeschichte. Munich.

Mruck, K. (2005). "Providing (Online) Resources and Services for Qualitative Researchers:

Challenges and Potentials." Forum: Qualitative Social Research 6 (2). Retrieved 2005-12-06, from http://www.qualitative-research.net/fqs-texte/2-05/05-2-38-e.htm.

Murray, G. and T. Costanzo. (1999). "Usability and the Web: An Overview." Retrieved 2005-11-26, from http://www.collectionscanada.ca/9/1/p1-260-e.html.

Nagypal, G. (2005). History ontology building: The technical view. XVI international

conference of the Association for History and Computing (AHC 2005). Amsterdam,

Netherlands: 207-214. Retrieved 2006-01-10, from

http://www.knaw.nl/publicaties/pdf/20051064.pdf.

Neiling, M. (2004). Identifizierung von Realwelt-Objekten in multiplen Datenbanken,

Brandenburgische Technische Universität. Retrieved 2005-12-10, from

http://www.cis.cs.tu-berlin.de/~mneiling/NEILING_DISS_MIRROR_BTU-COTTBUS/neiling_m.htm.

Nentwich, M. (2003). Cyberscience. Research in the Age of the Internet. Vienna.

Newman, M. E. J. (2003). "The structure and function of complex networks." SIAM Review

(45): 167-256. Retrieved 2005-04-14, from http://arxiv.org/pdf/cond-mat/0303516.

Nielsen, J. (1996). "Original Top Ten Mistakes in Web Design." Retrieved 2005-11-26, from

http://www.useit.com/alertbox/9605a.html.

Nielsen, J. (1997). "Search and You May Find." Retrieved 2005-11-26, from

http://www.useit.com/alertbox/9707b.html.

Nielsen, J. (1999a). "‘Top Ten Mistakes’ Revisited Three Years Later." Retrieved 2005-11-26, from http://www.useit.com/alertbox/990502.html.

Nielsen, J. (1999b). "The Top Ten New Mistakes of Web Design." Retrieved 2005-11-26,

from http://www.useit.com/alertbox/990530.html.

Nielsen, J. (2002a). "Top Ten Guidelines for Homepage Usability." Retrieved 2005-11-26,

from http://www.useit.com/alertbox/20020512.html.

Nielsen, J. (2002b). "Top Ten Web Design Mistakes of 2002." Retrieved 2005-11-26, from

http://www.useit.com/alertbox/20021223.html.

Nielsen, J. (2003a). "The Ten Most Violated Homepage Design Guidelines." Retrieved 2005-11-26, from http://www.useit.com/alertbox/20031110.html.

Nielsen, J. (2003b). "Top Ten Web Design Mistakes of 2003." Retrieved 2005-11-26, from

http://www.useit.com/alertbox/20031222.html.

Nielsen, J. (2004). "Top Ten Mistakes in Web Design." Retrieved 2005-11-26, from

http://www.useit.com/alertbox/9605.html.

Nielsen, J. (2005a). "Ten Usability Heuristics." Retrieved 2005-11-26, from

http://useit.com/papers/heuristic/heuristic_list.html.

Nielsen, J. (2005b). "Top Ten Web Design Mistakes of 2005." Retrieved 2005-11-26, from

http://www.useit.com/alertbox/designmistakes.html.

NIST (2002). WebSAT, NIST. Retrieved 2005-12-01, from

http://zing.ncsl.nist.gov/WebTools/WebSAT/overview.html.

Online Computer Library Center (OCLC). (2005). "OCLC Bibliographic Formats and

Standards." 3rd. Retrieved 2005-04-04, from http://www.oclc.org/bibformats/.

Online Computer Library Center (OCLC). (2005). "FAST: Faceted Application of Subject

Terminology." Retrieved 2005-12-12, from

http://www.oclc.org/research/projects/fast/default.htm.

Online Computer Library Center (OCLC). (2005). "OCLC Research Software." Retrieved

2005-03-20, from http://www.oclc.org/research/software/.

Page, L., S. Brin, R. Motwani and T. Winograd (1998). The PageRank Citation Ranking:

Bringing Order to the Web, Stanford Digital Library Technologies Project. Retrieved

2005-04-04, from http://dbpubs.stanford.edu:8090/pub/1999-66.

Pierau, K. (2003). "Datenbank- und Informationsmanagement in der Historischen

Sozialforschung." HSR-TRANS 14. Retrieved 2005-06-20, from http://hsr-trans.zhsf.uni-koeln.de/volume14.htm.

Powell, A. (2003). "Expressing Dublin Core in HTML/XHTML meta and link elements."

Retrieved 2005-03-03, from http://www.dublincore.org/documents/dcq-html/.

Ravindranathan, U., R. Shen, M. A. Gonçalves, W. Fan, E. A. Fox and J. W. Flanagan (2004).

Prototyping Digital Libraries Handling Heterogeneous Data Sources - The ETANA-DL

Case Study. Research and Advanced Technology for Digital Libraries, 8th European

Conference, ECDL 2004, Bath, UK.

Reamy, T. (2004). "To Metadata or Not To Metadata." EContent 27 (10): 34-39. Retrieved

2005-04-20, from

http://www.econtentmag.com/Articles/ArticleReader.aspx?ArticleID=7118.

Ridings, C. and M. Shishigin (2002). PageRank Uncovered. Retrieved from

http://www.chriseo.com/pagerank/PageRank.pdf.

Rothfels, H. (1953). "Zeitgeschichte als Aufgabe." Vierteljahreshefte für Zeitgeschichte 1: 1-8.

Sabrow, M., R. Jessen, et al., Eds. (2003). Zeitgeschichte als Streitgeschichte. Große

Kontroversen nach 1945. Munich.

Sanderson, R. (2004). "A Gentle Introduction to SRW." Retrieved 2005-04-18, from

http://www.loc.gov/z3950/agency/zing/srw/introduction.html.

Schmidt, A. (1996). "Sacherschließung nach BIBOS." Mitteilungen der Vereinigung

Österreichischer Bibliothekarinnen & Bibliothekare 48 (3/4). Retrieved from

http://www.uibk.ac.at/sci-org/voeb/vm48-34.html#alfred.

Schwarz, H.-P. (2003). "Die neueste Zeitgeschichte." Vierteljahreshefte für Zeitgeschichte 4:

5-29.

Sensch, J. (2002). "Statistische Modelle in der Historischen Sozialforschung I. Allgemeine

Grundlagen - Deskriptivstatistik." HSR-TRANS 8. Retrieved 2005-06-07, from http://hsr-trans.zhsf.uni-koeln.de/hsr7/.

Shneiderman, B. (1997). "Designing information-abundant web sites: issues and

recommendations." International Journal of Human-Computer Studies 47 (1): 5-29.

Retrieved 2005-12-01, from http://ijhcs.open.ac.uk/shneiderman/shneiderman-nf.html.

Short, H. (2002). "The Role of Humanities Computing: Experiences and Challenges."

Historical Social Research (HSR) 24 (4): 282-301. Retrieved 2005-12-10, from http://hsr-trans.zhsf.uni-koeln.de/hsrretro/docs/artikel/hsr/hsr2002_560.pdf.

Smith, M., R. Rodgers, J. Walker and R. Tansley (2004). DSpace: A Year in the Life of an

Open Source Digital Repository System. Research and Advanced Technology for Digital

Libraries, 8th European Conference, ECDL 2004, Bath, UK.

Smith, P. A., I. A. Newman and L. M. Parks (1997). "Virtual hierarchies and virtual networks:

some lessons from hypermedia usability research applied to the World Wide Web."

International Journal of Human-Computer Studies 47 (1): 67-95. Retrieved 2005-12-01,

from http://ijhcs.open.ac.uk/smith/smith-nf.html.

Stearns, S. (2004). "Automate Classification and Improve Information Discovery." EContent

27 (7/8): 18.

Stumpf, G. (2004). "Internet-Informationen zur Sacherschließung." Retrieved 2005-12-20,

from http://www2.bibliothek.uni-augsburg.de/allg/swk/sacher_allg.html.

Tauscher, L. and S. Greenberg (1997). "How people revisit web pages: empirical findings and

implications for the design of history systems." International Journal of Human-Computer

Studies 47 (1): 97-137. Retrieved 2005-12-01, from

http://ijhcs.open.ac.uk/tauscher/tauscher-nf.html.

Tennant, R. (2004). "The Expanding World of OAI." Library Journal 129 (3): 32.

Thaller, M. (1997). Virtuelle (Zeit-)Geschichte. Eine Disziplin zwischen Popularität,

Postmoderne und dem Post-Post-Positivismus. Zeitgeschichte im Wandel. 3.

Österreichische Zeitgeschichtetage 1997, Wien, Innsbruck - Wien: Studienverlag 1998:

407-421.

Theng, Y. L., G. Buchanan, H. Thimbleby and N. Mohd-Nasir (2000). Purpose and usability

of digital libraries. Fifth ACM Conference on Digital Libraries, ACM DL'2000, San

Antonio, Texas, USA. Retrieved 2005-12-01, from

http://www.uclic.ucl.ac.uk/harold/srf/dl00-purpose.pdf.

Theng, Y. L., N. Mohd-Nasir and H. Thimbleby (2000). A Usability Tool for Web Evaluation

applied to Digital Library Design. WWW9 Poster Proceedings, Amsterdam, Netherlands.

Retrieved 2005-12-01, from http://www.uclic.ucl.ac.uk/harold/srf/www9-tool.pdf.

Thome, H. (2001). "Grundkurs Statistik für Historiker. Teil I: Deskriptive Statistik." HSR-TRANS 7. Retrieved 2005-10-12, from http://hsr-trans.zhsf.uni-koeln.de/hsr2/.

Tillett, B. B. (2003). Authority Control: State of the Art and New Perspectives. Authority

Control: Definition and International Experiences. Florence, Italy. Retrieved 2005-12-13,

from http://www.sba.unifi.it/ac/relazioni/tillett_eng.pdf.

Umlauf, K. (2005). "Einführung in die Regeln für den Schlagwortkatalog RSWK." Retrieved

2005-12-11, from http://www.ib.hu-berlin.de/~kumlau/handreichungen/h66/.

Urbaner, R. and G. Lamprecht (2003). "'eForum zeitGeschichte' – ein Erfahrungsbericht."

zeitenblicke 2 (3). Retrieved 2005-03-08, from

http://www.zeitenblicke.historicum.net/2003/02/pdf/urbaner.pdf.

Van Laak, D. (2004). "Rez. WWW: Zeitgeschichte-online." Retrieved 2005-01-18, from

http://hsozkult.geschichte.hu-berlin.de/rezensionen/id=48&type=rezwww.

Vogeler, G., P. Sahle, H. Enzensberger and T. Frenz. (2005). "Historische

Hilfswissenschaften." Retrieved 2006-01-20, from http://www.vl-ghw.uni-muenchen.de/hw.html#sect31.

Wirtz, S. (2005). "Marktanalyse. Deutschsprachige Online- und CD/DVD-Produktionen zum

Thema Nationalsozialismus und Holocaust. Ein Projekt des Fritz Bauer Instituts im

Auftrag der Bundeszentrale für politische Bildung." Retrieved 2006-01-20, from

http://www.fritz-bauer-institut.de/forschung/medienstudie.htm.

World Wide Web Consortium (W3C). (2004). "Architecture of the World Wide Web,

Volume One." Retrieved 2005-06-23, from http://www.w3.org/TR/webarch/.

IX Appendix

1 Web usability checklists

1.1 Best practices for web interfaces of searchable databases¹¹⁸

Nr.  ZIS  VLZ  ZOL  Best practice

Category “Homepages”
  1  1    1    1    description of scope
  2  0    1    1    table of contents
  3  1    1    0    prominent search box
  4  1    1    1    visible browse categories
  5  1    1    1    links back to parent organization
  6  1    1    1    consistent site id/logo on top
  7  1    1    1    links to contact info, staff, projects, and related systems
  8  0    1    1    link to "about" the project or system
  9  1    1    1    news/spotlight/featured items

Category “Search Screens”
 10  1    0    0    visible examples of search syntax
 11  0    0    1    link to advanced/detailed search
 12  0    0?   1    ability to search the whole vs. search particular collections
 13  1    0?   1    ability to AND/OR/NOT across fields
 14  1    0    1    ability to limit search to specific fields
 15  1    1    1    search should be prominent on home page and other pages (results)
 16  0    0    0    good "no hits" screen with ideas for how to modify search
 17  0    0?   0    the fact that you got no results should stand out on the screen

Category “Browse Screens”
 18  1    1    1    visible categories on top level
 19  1    1    1    make sure links look clickable
 20  0    1    0    show number of hits in each category (before you click)

Category “Results Screens”
 21  1    1    1    show the number of hits on the top of the page, and what you searched for
 22  1    1    1    ability to modify the search right on that page
 23  0    1    1    ability to sort by different criteria
 24  1    1    1    make the default sort be the most useful one
 25  0    0    0    ability to set how many results per page
 26  1    1    1    Forward and back should be clear
 27  0    0    0    rows of table display, every other row opposite color, helps scan
 28  0    1?   1    have a brief display that links to a full record
 29  1    1    1    avoid pop-up windows
 30  0    0    0    links to related items
 31  0    0    0    links to related searches

Sum  18   20   22

118 Cf. Hennig and Quirion "Best practices for web interfaces of searchable databases".

1.2 Selected¹¹⁹ Nielsen web design mistakes

Year: year of the Nielsen reference. Severity: 1-5¹²⁰.

Nr.  Year   Severity  ZIS  VLZ  ZOL  Mistake
  1  1996   5         0    0    0    Gratuitous Use of Bleeding-Edge Technology
  2  1996   5         0    0    0    Scrolling Text, Marquees, and Constantly Running Animations
  3  1996   5         1    1    0    Outdated Information
  4  1996   5         0    0    0    Overly Long Download Times
  5  1999   5         0    0    0    Breaking or Slowing Down the Back Button
  6  2005   5         0    0    0    No Contact Information or Other Company Info
  7  1996   4         1    1    1    Complex URLs
  8  1996   4         1    0    0    Lack of Navigation Support
  9  2004   4         0    0    1    PDF Files for Online Reading
 10  2005   4         0    0    0    Legibility Problems
 11  2005   4         0    0    1    Non-Standard Links
 12  2005   4         0    0    0    Flash
 13  2005   4         0    0    0    Bad Search
 14  2005   4         0    0    0    Cumbersome Forms
 15  2005   4         0    1    1    Frozen Layouts with Fixed Page Widths
 16  1996a  3         1    0    1    Using Frames
 17  1996   3         0    0    0    Orphan Pages
 18  1999   3         1    1    1    Opening New Browser Windows
 19  1999   3         0    0    0    Non-Standard Use of GUI Widgets
 20  1999   3         0    0    0    Headlines That Make No Sense Out of Context
 21  1999   3         0    0    0    Jumping at the Latest Internet Buzzword
 22  1999   3         0    0    0    Anything That Looks Like Advertising
 23  2002   3         0    1    1    Horizontal Scrolling
 24  2003   3         0    0    0    Unclear Statement of Purpose
 25  2003   3         0    0    0    Overly Restrictive Form Entry
 26  2004   3         0    0    0    Non-Scannable Text
 27  2004   3         0    0    1    Page Titles With Low Search Engine Visibility
 28  2004   3         0    0    0    Violating Design Convention
 29  2005   3         0    0    0    Browser Incompatibility
 30  1996   2         0    1    0    Long Scrolling Pages
 31  1999   2         1    0    0    Lack of Biographies
 32  2002   2         0    0    1    JavaScript in Links
 33  2003   2         0    0    0    Small Thumbnail Images of Big, Detailed Photos
 34  2002   1         0    0    0    Mailto Links in Unexpected Locations
 35  2003   1         0    0    0    Overly detailed ALT Text
 36  2003   1         0    1    0    Pages That Link to Themselves

Weighted Sum:         21   22   30

119 This selection is based on Nielsen’s “Web Design Mistakes” (cf. footnote 35). The main selection criterion was that a “mistake” had to be applicable to all three subject gateways. “Mistakes” like “No Prices” (Nielsen "Top Ten Web Design Mistakes of 2002") or “No ‘What-If’ support” (Nielsen "Top Ten Web Design Mistakes of 2003") failed that test and have been omitted.

120 “1” means least and “5” means most severe. Mistakes 5, 18-22, and 31 have been scored in analogy to Nielsen "'Top Ten Mistakes' Revisited Three Years Later".
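The weighted sums above can be re-derived from the table: each mistake found on a gateway (flag 1) contributes its severity to that gateway's total. The following Python sketch is only an illustration of that arithmetic and is not part of the thesis programs; the name `FLAGGED` and the function `weighted_sums` are hypothetical, and only rows with at least one flag are listed because all other rows contribute zero.

```python
# Rows of the table above that are flagged on at least one gateway:
# (mistake number, severity, ZIS flag, VLZ flag, ZOL flag).
FLAGGED = [
    (3, 5, 1, 1, 0),
    (7, 4, 1, 1, 1),
    (8, 4, 1, 0, 0),
    (9, 4, 0, 0, 1),
    (11, 4, 0, 0, 1),
    (15, 4, 0, 1, 1),
    (16, 3, 1, 0, 1),
    (18, 3, 1, 1, 1),
    (23, 3, 0, 1, 1),
    (27, 3, 0, 0, 1),
    (30, 2, 0, 1, 0),
    (31, 2, 1, 0, 0),
    (32, 2, 0, 0, 1),
    (36, 1, 0, 1, 0),
]


def weighted_sums(rows):
    """Sum severity * flag per gateway; a higher total means more severe problems."""
    totals = {"ZIS": 0, "VLZ": 0, "ZOL": 0}
    for _number, severity, zis, vlz, zol in rows:
        totals["ZIS"] += severity * zis
        totals["VLZ"] += severity * vlz
        totals["ZOL"] += severity * zol
    return totals


if __name__ == "__main__":
    print(weighted_sums(FLAGGED))
```

Because the flags are binary, the weighted sum is simply the total severity of all mistakes observed on a gateway.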

2 Top 50 keywords in the aggregated ZIS-, VLZ-, ZOL-link database

Nr.  Occurrences  Keyword
  1  425          Nationalsozialismus
  2  288          Holocaust
  3  243          Sozialgeschichte
  4  203          Widerstand
  5  171          Zweiter Weltkrieg
  6  167          Politik
  7  167          Kulturgeschichte
  8  166          Medien
  9  163          Politikgeschichte
 10  151          Gesellschaft
 11  130          Zeitgeschichte
 12  124          Dokumente
 13  111          Wirtschaftsgeschichte
 14  111          Didaktik
 15  110          Biographie
 16  97           Opposition
 17  96           Kultur
 18  95           "Drittes Reich"
 19  93           US-Amerikanische Geschichte
 20  90           Parteien
 21  90           Institutionen
 22  90           Europa
 23  89           Landesgeschichte
 24  88           Nachkriegszeit
 25  88           Wirtschaft
 26  88           Elektronische Publikationen
 27  84           Linksammlung
 28  82           Hilfsmittel
 29  81           Film
 30  80           Archive
 31  79           Erinnerungskultur
 32  79           Alltag
 33  77           Kalter Krieg
 34  77           Demokratie
 35  76           Antisemitismus
 36  74           Technikgeschichte
 37  73           Archiv
 38  71           Migration
 39  71           Bibliographie
 40  67           2. Weltkrieg
 41  67           Zeitschriften
 42  66           Außenpolitik
 43  65           Konzentrationslager
 44  65           Kommunismus
 45  62           Erster Weltkrieg
 46  60           Alltagsgeschichte
 47  60           Bibliothek
 48  58           Rechtsgeschichte
 49  57           DDR
 50  57           Militärgeschichte

3 Top 50 keywords related to Contemporary History - Innsbruck University Library OPAC database

Nr.  Occurrences  Keyword
  1  1335         Juden
  2  1157         Geschichte 1933-1945
  3  1075         Drittes Reich
  4  877          Nationalsozialismus
  5  869          Deutschland <DDR>
  6  620          Südtirol
  7  535          Judentum
  8  451          Geschichte 1900-2000
  9  449          Weltkrieg <1939-1945>
 10  382          Geschichte 1939-1945
 11  367          Widerstand
 12  325          Antisemitismus
 13  322          Weimarer Republik
 14  309          Konzentrationslager
 15  307          Weltkrieg <1914-1918>
 16  294          Geschichte 1938-1945
 17  293          Judenverfolgung
 18  276          Geschichte 1900-1990
 19  271          Judenvernichtung
 20  244          Arbeiterbewegung
 21  244          Nationalismus
 22  236          Geschichte 1918-1933
 23  232          Geschichte 1945-1990
 24  225          Geschichte 1945-1995
 25  222          Vergangenheitsbewältigung
 26  220          Palästina
 27  218          Europäische Integration
 28  203          Geschichte 1945
 29  178          Geschichte 1945-1955
 30  178          Geschichte 1918-1938
 31  166          Deutsche Frage
 32  152          Geschichte 1980-1990
 33  150          Wehrmacht
 34  150          Faschismus
 35  144          Geschichte 1918-1945
 36  144          Geschichte 1914-1918
 37  134          Gebirgskrieg
 38  128          Geschichte 1940-1945
 39  127          Geschichte 1943-1945
 40  125          Geschichte 1945-2000
 41  125          Geschichte 1945-1949
 42  123          Geschichte 1915-1918
 43  121          Geschichte 1945-1985
 44  119          Besatzungspolitik
 45  117          Geschichte 1990-2000
 46  116          Vertreibung
 47  115          Politische Verfolgung
 48  115          Geschichte 1941-1945
 49  110          Geschichte 1989
 50  109          Ost-West-Konflikt

4 URLs common to all three subject gateways ZIS/ZOL/VLZ

Columns: Google = Google PageRank; Network = PageRank within the ZIS/ZOL/VLZ network (n=2278, scale: 1-10).

Nr.  Google  Network  URL
  1  8       7.04     http://www.ushmm.org/
  2  7       8.87     http://www.dhm.de/
  3  7       0.69     http://www.history-journals.de/
  4  6       3.33     http://www.hdg.de/
  5  6       1.53     http://www.17juni53.de/
  6  6       1.35     http://www.chronik-der-mauer.de/
  7  6       1.28     http://www.fritz-bauer-institut.de/
  8  6       1.09     http://www.bstu.de/
  9  6       0.94     http://www.doew.at/
 10  6       0.84     http://www.his-online.de/
 11  6       0.30     http://www.sehepunkte.historicum.net/
 12  6       0.22     http://www.wienerlibrary.co.uk/
 13  6       0.15     http://www.beutekunst.de/
 14  5       0.65     http://www.querelles-net.de/
 15  5       0.54     http://www.gedenkstaettesteinhof.at/
 16  5       0.35     http://www.rrz.uni-hamburg.de/FZH/
 17  5       0.28     http://www.nachkriegsjustiz.at/
 18  5       0.24     http://www.eforum-zeitgeschichte.at/
 19  5       0.24     http://www.salvator.net/salmat/pw/luft/blockade.html
 20  5       0.19     http://www.hdg.de/zfl/
 21  4       0.51     http://www.h-ref.de/
 22  4       0.20     http://www.uni-kassel.de/fb1/infonsnh/
 23  4       0.19     http://www.nfhdata.de/premium/literaturdatenbank_index.html
 24  4       0.17     http://www.topographie.de/imt/

5 System setup and availability

The crawler, import, and analysis programs were developed and run on a Debian 3.1 (Sarge)¹²¹ Linux system running under coLinux¹²² on a Windows host, which made the setup more mobile and flexible. PostgreSQL 8.0.3¹²³ served as the RDBMS and Subversion 1.2.0¹²⁴ as the code versioning system. The custom programs were implemented in Perl¹²⁵ with Perl CPAN¹²⁶ modules, among them Class::DBI¹²⁷ as a simple database-to-object mapping layer.

The database, including the analysis data and the analysis SQL queries, and the source code of the programs are available at: http://www.pepl.info/papers/fieldstudy_sg_zeitgeschichte/

121 http://www.us.debian.org/releases/sarge/, January 20th, 2006.
122 http://www.colinux.org/, January 20th, 2006.
123 http://www.postgresql.org/, January 20th, 2006.
124 http://subversion.tigris.org/, January 20th, 2006.
125 http://www.perl.org/, January 20th, 2006.
126 http://cpan.perl.org/, January 20th, 2006.
127 http://search.cpan.org/~tmtm/Class-DBI-v3.0.13/, January 20th, 2006.

6 Database design

The data diagram:

cagipch_portals
    id: serial
    name: character varying(128)
    url: character varying(256)

cagipch_identifier_normalized
    id: serial
    identifier_normalized: character varying(512)
    google_pagerank: double precision
    cagipch_pagerank: double precision

cagipch_identifierlinks
    id: serial
    identifier: character varying(512)
    links_to: character varying(512)

cagipch_identifierlinks_idmatrix (view)
    id: integer
    identifier_id: integer
    links_to_id: integer

cagipch_items
    id: serial
    dces_description: character varying(4096)
    dces_date: timestamp(0)
    dces_type: character varying(200)
    dces_format: character varying(64)
    dces_identifier: character varying(512)
    dces_relation: character varying(200)
    dces_coverage: character varying(200)
    dces_rights: character varying(200)
    portal: integer
    last_time_checked: timestamp(0)
    url_host_part: character varying(128)
    dces_title: character varying(400)
    dces_source: character varying(400)
    dces_language: character varying(32)
    identifier_ltc: timestamp(0) without time zone
    identifier_status_code: integer
    identifier_redirected_to: character varying(512)
    identifier_content_md5sum: character varying(32)
    psource: character varying(128)
    dces_creator: character varying(1024)
    dces_subject: character varying(1024)
    dces_publisher: character varying(1024)
    dces_contributor: character varying(1024)
    identifier_normalized: character varying(512)

cagipch_keywords
    id: serial
    item: integer
    name: character varying(200)

cagipch_keywords_levenshtein
    id: serial
    kw_from: character varying(200)
    kw_to: character varying(200)
    distance: integer

cagipch_keywords_normalized
    id: serial
    name: character varying(200)
    usage_count: integer
    stem: character varying(200)
    soundex: character varying(8)

The table cagipch_portals has been filled manually with the three subject gateways, Clio-Online, and Aleph OPAC “Geschichte” – University Library of the University of Innsbruck.

All other tables have been filled by the crawler and analysis programs. The view cagipch_identifierlinks_idmatrix has been used by the information network analysis programs.
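To make the role of the view cagipch_identifierlinks_idmatrix more concrete, the following Python sketch rebuilds a reduced version of the link tables in an in-memory SQLite database and defines a view that resolves both ends of each link to the numeric ids of cagipch_identifier_normalized. The actual view definition lives in cagipch.sql and is not reproduced in this appendix, so the join shown here is an assumption inferred from the column names; SQLite stands in for PostgreSQL, and the function name `build_link_matrix` is hypothetical.

```python
import sqlite3

# Reduced schema: only the columns needed for the link matrix.
# The view definition is an assumption, not the statement from cagipch.sql.
SCHEMA = """
CREATE TABLE cagipch_identifier_normalized (
    id INTEGER PRIMARY KEY,
    identifier_normalized TEXT UNIQUE
);
CREATE TABLE cagipch_identifierlinks (
    id INTEGER PRIMARY KEY,
    identifier TEXT,
    links_to TEXT
);
CREATE VIEW cagipch_identifierlinks_idmatrix AS
SELECT l.id AS id, src.id AS identifier_id, dst.id AS links_to_id
FROM cagipch_identifierlinks AS l
JOIN cagipch_identifier_normalized AS src
  ON src.identifier_normalized = l.identifier
JOIN cagipch_identifier_normalized AS dst
  ON dst.identifier_normalized = l.links_to;
"""


def build_link_matrix(urls, links):
    """Return (id, identifier_id, links_to_id) rows for the given URL links."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany(
        "INSERT INTO cagipch_identifier_normalized (identifier_normalized) VALUES (?)",
        [(url,) for url in urls],
    )
    con.executemany(
        "INSERT INTO cagipch_identifierlinks (identifier, links_to) VALUES (?, ?)",
        links,
    )
    rows = con.execute(
        "SELECT id, identifier_id, links_to_id "
        "FROM cagipch_identifierlinks_idmatrix ORDER BY id"
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    known = ["http://www.ushmm.org/", "http://www.dhm.de/"]
    crawled = [
        ("http://www.ushmm.org/", "http://www.dhm.de/"),
        ("http://www.dhm.de/", "http://www.ushmm.org/"),
        ("http://www.dhm.de/", "http://example.org/"),  # target outside the network
    ]
    print(build_link_matrix(known, crawled))
```

With an inner join, links that point outside the set of normalized identifiers drop out of the matrix, which keeps the network closed over the URLs catalogued by the gateways. Whether the original view behaves the same way cannot be confirmed from this appendix.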

7 Overview of crawler and import programs

• aleph-keywords_import.pl
Imports items into cagipch_items for the Aleph OPAC “portal” using dummy titles and URLs. Assigns keywords accordingly from a plaintext import file.

• harvester-vlz.pl
Harvests items from www.vl-zeitgeschichte.de and imports them into cagipch_items for the Virtual Library Zeitgeschichte portal. Uses a caching version of WWW::Mechanize::Sleepy¹²⁸ for polite harvesting.

• harvester-zol.pl
Harvests items from www.zeitgeschichte-online.de and www.clio-online.de and imports them into cagipch_items for the Zeitgeschichte Online and Clio-Online portals. Uses a caching version of WWW::Mechanize::Sleepy for polite harvesting and handles mixed Latin1 and UTF-8 encoded metadata.

• zis_import.pl
Imports items into cagipch_items for the ZIS portal from an XML encoded import file. Uses XML::LibXML¹²⁹ for parsing and handling the XML file.

• check_status_codes.pl
Checks HTTP status codes of items in cagipch_items using LWP::UserAgent¹³⁰. Follows possible redirects and stores the MD5 sum of the returned content using Digest::MD5¹³¹.

128 http://search.cpan.org/~esummers/WWW-Mechanize-Sleepy-0.5/, January 20th, 2006.
129 http://search.cpan.org/~phish/XML-LibXML-1.58/, January 20th, 2006.
130 http://search.cpan.org/~gaas/libwww-perl-5.805/, January 20th, 2006.
131 http://search.cpan.org/~gaas/Digest-MD5-2.36/, January 20th, 2006.
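Two of the steps described for the programs above lend themselves to a short illustration: decoding metadata that arrives in either UTF-8 or Latin1 (harvester-zol.pl) and fingerprinting the returned content with an MD5 sum (check_status_codes.pl). The Python sketch below is not the Perl code of the thesis; the function names `decode_mixed` and `content_md5` are hypothetical, and the "try UTF-8 first, fall back to Latin1" heuristic is an assumption about how mixed encodings could be handled.

```python
import hashlib


def decode_mixed(raw: bytes) -> str:
    """Decode bytes that are either UTF-8 or Latin1.

    Heuristic (an assumption): valid UTF-8 is taken as UTF-8; anything else
    is read as Latin1, which can decode every byte sequence.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def content_md5(body: bytes) -> str:
    """Return the 32-character hexadecimal MD5 digest of a response body."""
    return hashlib.md5(body).hexdigest()


if __name__ == "__main__":
    title = "Zeitgeschichte als wissenschaftliche Aufklärung"
    print(decode_mixed(title.encode("utf-8")) == decode_mixed(title.encode("latin-1")))
    print(content_md5(title.encode("utf-8")))
```

The 32-character digest matches the width of the identifier_content_md5sum column in cagipch_items. Comparing digests between runs is one simple way to detect that the content behind a URL has changed, although MD5 is suitable only as a change fingerprint, not as a security measure.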

8 Overview of the analysis programs

• populate_identifier_normalized.pl
Populates cagipch_items.identifier_normalized with the canonical version of the URL stored in cagipch_items.dces_identifier so that cagipch_identifier_normalized can be filled with the distinct URLs using the SQL insert statement at line number 100 of cagipch.sql.

• populate_identifier_network.pl
Processes incoming and outgoing links with a recursion depth of 3 levels for the items stored in cagipch_identifier_normalized and stores the result in cagipch_identifierlinks. Stores the Google PageRank of each crawled URL in the cagipch_identifier_normalized table using WWW::Google::PageRank¹³². After the program has run, the view cagipch_identifierlinks_idmatrix can be created using the SQL statement starting at line number 128 of cagipch.sql.

• cagipch_pagerank.pl
Populates cagipch_identifier_normalized.cagipch_pagerank with the PageRank values of a network defined by cagipch_identifierlinks_idmatrix using Algorithm::PageRank¹³³. Optionally limits the network nodes to a subject-area-specific network if a comma-separated list of keywords is passed as command line arguments.

• linkid_matrix.pl
Generates import files for either Pajek or UCINET network analysis programs out of cagipch_identifierlinks_idmatrix. Accepts a comma-separated list of keywords as command line arguments to limit the resulting matrix to a specific subject area.

• populate_keywords_normalized.pl
Populates cagipch_keywords_normalized.stem with the German stem of the respective keyword using Lingua::Stem::Snowball¹³⁴ and cagipch_keywords_normalized.soundex with the Soundex value of the keyword using Text::Soundex¹³⁵. Both stem and Soundex value can be used to identify duplicates or spelling errors. populate_keywords_normalized.pl should be run after cagipch_items has been completely filled and after cagipch_keywords_normalized has been populated by the SQL insert statement at line number 114 of cagipch.sql.

• populate_keywords_levenshtein.pl
Populates cagipch_keywords_levenshtein with the Levenshtein edit distances between the keywords in cagipch_keywords_normalized using Text::LevenshteinXS¹³⁶. The Levenshtein edit distances can be used to identify duplicates or spelling errors.

• keywords_from_common_uris.pl
Dumps keyword information and statistics from a list of URLs in cagipch_items. Used for analyzing the classification of the 24 shared URLs of the three subject gateways.

132 http://search.cpan.org/~ykar/WWW-Google-PageRank-0.10/, January 20th, 2006.
133 http://search.cpan.org/~xern/Algorithm-PageRank-0.08/, January 20th, 2006.
134 http://search.cpan.org/~fabpot/Lingua-Stem-Snowball-0.93/, January 20th, 2006.
135 http://search.cpan.org/~markm/Text-Soundex-3.02/, January 20th, 2006.
136 http://search.cpan.org/~jgoldberg/Text-LevenshteinXS-0.03/, January 20th, 2006.
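The edit-distance approach to finding duplicate or misspelled keywords, as described for populate_keywords_levenshtein.pl, can be sketched in a few lines of Python. This is an illustration rather than the thesis code, which relies on the Perl module Text::LevenshteinXS; the function names and the distance threshold of 2 are assumptions. The sample keywords "Archive" and "Archiv" are taken from the top 50 keyword list in Appendix 2, where both forms occur.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def near_duplicates(keywords, max_distance=2):
    """Return (kw_from, kw_to, distance) for keyword pairs within max_distance."""
    pairs = []
    for kw_from, kw_to in combinations(keywords, 2):
        distance = levenshtein(kw_from, kw_to)
        if distance <= max_distance:
            pairs.append((kw_from, kw_to, distance))
    return pairs


if __name__ == "__main__":
    print(near_duplicates(["Archive", "Archiv", "Kultur"]))
```

The result tuples mirror the kw_from, kw_to, and distance columns of cagipch_keywords_levenshtein. A low threshold catches plural forms and typos, but candidate pairs still need manual review, since short unrelated keywords can also lie within a small edit distance of each other.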