-
8/14/2019 How to Identify Credible Sources on The
1/123
HOW TO IDENTIFY CREDIBLE SOURCES ON THE WEB
by
Dax R. Norman
National Security Agency
PGIP Class 0001
Unclassified thesis submitted to the Faculty
of the Joint Military Intelligence College in partial fulfillment of the requirements for the degree of Master of Science of Strategic Intelligence.
19 December 2001
The views expressed in this paper are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
ACKNOWLEDGEMENTS
Foremost, I am thankful for the endless patience of my wife and
daughter, who for two years worked and played one man short of a full team,
and often carried the ball when I should have.
I am grateful to Professor Jerry P. Miller, Director of the Competitive
Intelligence Center at Simmons College in Boston, for his patient and
persistent help in constructing the thesis survey.
I would also like to thank LTC (ret) Karl Prinslow, at the time a
contractor employed by the U.S. Army Foreign Military Studies Office, for his
practical assistance and encouragement.
Thanks must also go to my Thesis Chairman, Dr. Alex Cummins, and
Thesis Reader, Robyn Winder, for their conscientious support of the Joint
Military Intelligence College Master's program by volunteering to serve as
Thesis Chairman and Reader.
Appendices
A. Web Site Evaluation Worksheets
B. Survey to Industry and Academia
C. Survey to Intelligence Community
D. Criteria Analysts Currently Use to Judge Credibility
Bibliography
Annex 1. Survey Results (not included in original thesis)
LIST OF GRAPHICS
Tables
1. Question 8a to 8r, Recommended Criteria and Relative Values (Mean)
2. Questions 9a-f, Required Level of Source Credibility for Intelligence Products
3. Question 5, Part 1, Official Criteria for Unclassified Sources
4. Question 5, Part 2, Official Criteria for Classified Sources
5. Questions 7a, b, c, j, k, l, m, Credibility of Well-Known Titles
6. Questions 7d, e, f, g, h, i, Credibility of Obscure Titles and Foreign Web Sites
7. Questions 7n to 7s, Credibility of All Classified Sources
8. Credibility of Open Sources Compared to Classified Sources
9. Question 7q, Credibility of IMINT Without Annotations
10. Benchmark Web Site Evaluation Work Sheet, Spot
11. Benchmark Web Site Evaluation Work Sheet, ITU
12. Benchmark Web Site Evaluation Work Sheet, NY Times
13. Benchmark Web Site Evaluation Work Sheet, Korea
14. Blank Web Site Evaluation Work Sheet
15. Survey Question 6: Credibility Criteria Analysts Currently Use
Graph
1. Question 7q, Credibility of IMINT Without Annotations
ABSTRACT
TITLE OF THESIS: How to Identify Credible Sources on the Web.
STUDENT: Dax R. Norman
CLASS NO. PGIP 0001 DATE: 19 December 2001
THESIS COMMITTEE CHAIR: Dr. Alex Cummins
SECOND COMMITTEE MEMBER: Robyn Winder
There is little argument today that open sources and the World-Wide-
Web have a role to play in intelligence, but little has been written about
evaluating the credibility of Web sites and communicating that evaluation to
analysts. Such a capability is needed because of the increased opportunity to
collect open source intelligence from the Web; the ever-increasing cost of
classified collection; and the ever-present demand on analysts to analyze and
report at the edge of their knowledge. With so many intelligence sources
available, including the Web, analysts must be able to identify credible
sources. The alternative is to evaluate every piece of information collected
from every Web site of intelligence interest. Given the enormous size of the
Web, validating every such piece of data is not practical.
That is why the Intelligence Community (IC) needs a generally agreed-upon
set of criteria for evaluating Web sites of potential intelligence value.
Credible Web sites can be identified. However, without these criteria, and a
method to share the results, hundreds of analysts will repeatedly find the
same Web sites of dubious credibility as other analysts; they will attempt to
evaluate the sites' usefulness and credibility by many widely different
standards, and share their results with only a few close coworkers. The
quality of these Web site evaluations will vary widely based on the subject of
the Web site and the subject expertise of the evaluator.
This thesis collected criteria recommended by professional Web
searchers and surveyed industry, academia, and the Intelligence Community
for their opinions of those criteria. From this survey the author developed a
weighted list of credibility criteria and a methodology that both the subject-
matter expert and the subject-matter novice will find useful. With these
criteria and the relative credibility scale, subject-matter experts throughout
the IC can evaluate Web sites within their area of expertise and share that
source evaluation with the entire IC.
This thesis identifies valid criteria for evaluating the credibility of open
source Web sites; presents a relative credibility scale based on benchmarked
Web sites; identifies the target level of credibility for all intelligence sources;
offers a Web site evaluation worksheet; and compares the credibility of open
sources to classified sources. Credible information can be located on the Web,
and although subject-matter experts are the best evaluators, any analyst can
evaluate a Web site when he does not have a subject-matter expert to assist
him.
CHAPTER 1
INTRODUCTION TO OPEN SOURCE EVALUATION
Along with the information technology revolution has come an equally
important increase in information access and information sources via the
World-Wide-Web. However, such abundance is a double-edged sword because
the Web contains every type of print, audio, and visual data from every type
of source, including children, students, professors, conspiracy theorists,
researchers, advertisers, government data, and government misinformation.
Information analysts must sort the useful information from the junk.
However, what is useless for one person may be just right for someone else.
This thesis will establish Intelligence Community criteria for distinguishing
credible Web sites from untrustworthy, or non-credible, Web sites. This thesis
used a survey structured to address several key issues and answer the research
question: how to identify credible sources on the Web. The hypothesis was
that credible Web sites can be confidently identified by evaluating the Web
sites based on criteria recommended by professional Web searchers and
agreed to by intelligence analysts. Most analysts today apparently evaluate
the data rather than the source.
VALIDITY MATTERS
This thesis will also show that most analysts do not attempt to identify
credible sources, but evaluate the validity of the data in the sources. There
is a common misunderstanding about validity and credibility. Validity is an
attribute of information. Validity also describes information as
simultaneously relevant and meaningful. Validity can also refer to the proper
use of logic to reach a conclusion.1 In psychometrics, validity can have
several meanings, including the proper use, or function of a measurement
tool.2 This thesis uses validity as an attribute of data that is verifiably correct.
Validity is what the analyst means when he asks, is this data correct?
Although validity is important to intelligence, it always describes the
information rather than the source, and alone does not measure believability,
which this thesis calls credibility. Because discrete elements of information
can be examined and compared, and because analysts know how to check
validity, the validity of information is of most concern to analysts. They
examine the data for consistency, verify it with other sources, or verify that it
functions as expected. Although consistently valid data can lead to credible
sources, the goal should be to identify sources as credible so that every
document from the source does not have to be validated. Establishing
source credibility should be of greater interest to analysts because they
cannot become expert in every subject on which they may be expected to
1 G. & C. Merriam Co., Webster's New Collegiate Dictionary (Springfield, MA: G. & C. Merriam Co., 1975), under Valid. Cited hereafter as Webster's.
2 Jum C. Nunnally, Psychometric Theory (New York: McGraw-Hill Book Company, 1967), 75.
report: organizational focus changes, analysts change jobs, and there
is simply not enough time to learn it all and still report.
This thesis will provide a tool for the general analyst to evaluate Web
sites as potential intelligence sources. Although Web site evaluations are
best done by subject-matter experts, analysts are often expected to report on
unfamiliar topics, and must discern for themselves whether a source is credible.
Experts will also be able to use the recommended criteria and credibility
scale to evaluate Web sites in a consistent manner that other people will
understand and can repeat.3
CREDIBILITY COUNTS MORE
To judge validity, an analyst must understand very well the issue, technology,
strategy, or politics behind every data element included in his
reporting. Because no analyst can possibly be an expert on every
subject, analysts rely on sources that they trust to provide valid data. This trust
in a person or group is a measure of credibility. A credible source offers
reasonable grounds for being believed.4 This is the meaning intended in
this thesis for credibility.
These credible sources are an essential element of intelligence
analysis because analysts are often expected to report on topics in which
they are not expert, or that are too complex for any one person to
3 See Appendix A, Web Site Evaluation Worksheet, for the relative credibility scale, benchmark Web site evaluation worksheets, and a blank evaluation worksheet.
4 Webster's, under Credible.
understand. Because it is impractical for analysts to validate every data
element from every source, the focus should be on identifying credible
sources. In the area of Open Source Intelligence (OSINT), this is even more
important because of the widespread use of OSINT by the other intelligence
disciplines, and the multitude of unclassified open sources.5 The source must
be judged credible before the data can be judged valid. Of course this can
become a circular argument, but in the end it is more useful to have a
credible source than a valid data element. For example, it would be better to
know where to find a foreign leader's official travel schedule than to know
where the leader will travel next. This is true because the credible source
can reveal where the next trip will be, any changes to that trip, and the
details of subsequent trips. If a source provides valid data consistently, it
will soon be judged a credible source. However, once a source is judged credible, it is
less important that every data element the source provides be validated.
Note that open source information (OSINF) is public or proprietary
information available to anyone for a fee or for free. OSINF becomes open
source intelligence (OSINT) when it is used by the Intelligence Community to
answer an intelligence question.
THE CHALLENGE OF CREDIBLE SOURCES
Regardless of the credibility of a source, or the validity of the data,
analysts are more likely to use the sources most accessible to them. The
5 Joint Chiefs of Staff, Joint Pub 1-02, Department of Defense Dictionary of Military and Associated Terms, URL:, accessed 13 February 2000. Cited hereafter as Joint Pub 1-02. This thesis uses intelligence disciplines, such as OSINT, as defined in Joint Pub 1-02.
Web has the potential to put a worldwide library on the desk of every analyst.
With today's search engines and Web directories, an analyst can conduct a
single search of the Web in seconds that would take a librarian a career to
complete. Librarians, however, know which sources are credible
based on their own use of the sources or recommendations from other
librarians and subject-matter experts. Therefore, it stands to reason that
intelligence analysts, who do not have access to a subject-matter expert on
every reportable issue, should have access to credible information sources on
the Web. How to identify credible sources on the Web is the challenge of this
thesis.
In an ideal world, subject-matter experts in every field would identify
credible sources, and index them for everyone to use. However, even in such
a world there would be disagreement on what is credible. Therefore, the
research question that this thesis will answer is how to identify credible
sources on the Web. The focus is on Web sites because library science and
publishers have already established acceptable standards in the print media
for credibility. Such standards include peer review in scientific journals,
editorial review in newspapers, independent verification of facts, and the
proper labeling of commentary and advertisements in magazines. In the
absence of such standard practices on the Web, it is up to the reader to
judge. With the help of expert Web searchers from industry, defense, and
intelligence, this thesis establishes a set of common credibility evaluation
criteria, which can be used by subject-matter experts as well as analysts
reporting on an unfamiliar issue. Some subjectivity remains, but the criteria
are established and provide analysts with the tools and vocabulary to
measure and describe a source's relative
trustworthiness, known as credibility.
ASSUMPTIONS
This thesis does make some assumptions. The first two are that open
source intelligence is less costly than classified intelligence, and therefore is
the preferred source if it can be trusted. The third assumption is that
credibility is relative to its intended use and user. For example, a CNN
broadcast might be sufficiently credible for indications and warning (I&W),
but not sufficiently credible for basic intelligence for which the analyst has
some time to conduct research, or when the product will become the
background for future reporting. Likewise, a second-hand report of the
humanitarian conditions in a country may be credible enough for a person
planning an overseas visit; however, only a first-hand report from an
authoritative, unbiased source may be considered for the subject of an
intelligence report. Therefore, a relative credibility scale is necessary rather
than an absolute determination of credible or non-credible.
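The third assumption can be illustrated with a minimal sketch in Python. It assumes that each source has already been given a score on a 0-100 relative credibility scale and that each intended use has a minimum required level. The product names echo the examples above; the numeric levels, the score, and the function name are invented for illustration and are not taken from the survey results.

```python
# Hypothetical minimum credibility (0-100) required for each intended use.
# The product types echo the examples in the text; the numbers are invented.
REQUIRED_LEVEL = {
    "indications and warning": 50,
    "basic intelligence": 80,
}


def credible_enough(source_score, intended_use):
    """Return True when the source meets the level required for that use."""
    return source_score >= REQUIRED_LEVEL[intended_use]


broadcast_score = 65  # assumed score for a breaking-news broadcast
for use in REQUIRED_LEVEL:
    print(use, credible_enough(broadcast_score, use))
```

Under these assumed numbers the same source passes for one product and fails for the other, which is the practical reason for preferring a relative scale to a single credible or non-credible label.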
A UNIQUE STUDY
Although other studies establish criteria for evaluating Web sites, such
as Alison Cooke's Authoritative Guide to Evaluating Information on the
Internet, I have not found a study that focuses on establishing the credibility
of Web sites.6 Cooke's work is an excellent guide to evaluating the overall
quality of many types of Web sites. The closest Joint Military Intelligence
College study found is MAJ Robert M. Simmons's unclassified thesis, Open
Source Intelligence: An Examination of Its Exploitation, 1995.7 Simmons
focuses on the accessibility and use of open source, not the credibility of
sources. Although Reva Basch's Secrets of the Super Net Searchers includes
the question of credibility, it is less formal than this study and asks the
credibility question differently of each expert interviewed.8 Secrets of the
Super Net Searchers does not focus on any one issue, but asks many
questions of the industry experts. However, many criteria from Basch's book
were included in the thesis survey used for this study. This thesis surveyed
analysts from defense, intelligence, and academia, as well as industry, to
establish common criteria for evaluating the credibility of Web sites.9 The
broad survey population, which included industry, academia, and
intelligence, and the focus on credibility, make this study unique.
REVIEW OF THESIS
6 Alison Cooke, Authoritative Guide to Evaluating Information on the Internet (New York: Neal-Schuman Publishers, Inc., 1999).
7 Major Robert M. Simmons, USA, Open Source Intelligence: An Examination of Its Exploitation in the Defense Intelligence Community, MSSI Thesis (Washington, DC: Joint Military Intelligence College, August 1995).
8 Reva Basch, Secrets of the Super Net Searchers (Wilton, CT: Pemberton Press, 1996).
9 E-mail Survey, Joint Military Intelligence College Thesis Survey: Credibility Criteria for Web Sites, conducted by the author, July-August 2001. Hereafter cited as Survey.
The research for this thesis began with a literature review, found in
Chapter two. From the literature, several authors were selected who either
represent a significant point of view or are in a position to influence other
analysts. The objective of the literature review was to identify what is
already known, or thought, about identifying credible sources on the Web.
However, the literature also revealed tangential issues that influence how or
when unclassified open sources are used in intelligence products. Most
significantly, the literature review identified the criteria recommended by
expert Web searchers for judging the credibility of Web sites. Those criteria
were included in the thesis survey, which was the primary research tool used
by the author.
Chapter three describes the research methodology employed. That
methodology included gathering expert criteria from the literature review;
developing and administering the survey to industry, academic, and
intelligence analysts; coding the survey results and entering the data into the
SPSS statistical program; and performing the calculations which answered
the research questions and the key issues. The recommended credibility
criteria were determined by identifying the criteria that analysts most often
rated as contributing 50 percent or more to the credibility of a Web site; then
determining the relative weights for each criterion and a relative credibility
scale. Finally, four Web sites of known credibility were evaluated as
benchmark sites. Chapter three describes this process in detail as well as
how the target source-credibility level was determined for most intelligence
products.
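The selection and weighting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the criterion names and ratings are invented, "most often rated" is read as more than half of respondents, the weights are taken to be each selected criterion's mean rating divided by the sum of the means, and a site's score is the sum of the weights of the criteria it meets. The thesis itself used SPSS, and its exact calculations are described in Chapter three, not here.

```python
from statistics import mean

# Hypothetical survey data: each criterion maps to the percentage (0-100)
# that each respondent said it contributes to a Web site's credibility.
# Criterion names and ratings are invented for illustration only.
RATINGS = {
    "author is identified": [80, 70, 60, 90, 50],
    "content is corroborated elsewhere": [90, 80, 70, 60, 100],
    "site has attractive graphics": [10, 20, 60, 30, 0],
}


def select_criteria(ratings, cutoff=50):
    """Keep criteria that most respondents rated at or above the cutoff."""
    selected = {}
    for name, scores in ratings.items():
        share_at_or_above = sum(s >= cutoff for s in scores) / len(scores)
        if share_at_or_above > 0.5:
            selected[name] = mean(scores)
    return selected


def relative_weights(mean_ratings):
    """Scale the mean ratings so the weights sum to 1 (an assumed scheme)."""
    total = sum(mean_ratings.values())
    return {name: value / total for name, value in mean_ratings.items()}


def credibility_score(weights, criteria_met):
    """Sum the weights of the criteria a site satisfies, as a 0-100 score."""
    return 100 * sum(w for name, w in weights.items() if name in criteria_met)


weights = relative_weights(select_criteria(RATINGS))
score = credibility_score(weights, {"author is identified"})
```

In this invented data set the graphics criterion is dropped because only one of five respondents rated it at 50 percent or more, and the two remaining criteria share the full weight in proportion to their means. Scoring benchmark sites of known credibility with the same function would then give reference points for a relative scale.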
The results of the survey calculations are shown in the findings chapter,
Chapter four. The findings chapter, like the methodology chapter, is
organized to answer the research questions and each key issue. The key
issues are open source relevance to intelligence,
knowledge of existing official criteria, analysts' objectivity, credibility of
foreign Web sites in English, and credibility of classified versus unclassified
sources. The research questions concern evaluation criteria and the needed level
of credibility.
The conclusions are in Chapter five, and include analysis of the survey
results. The thesis concludes that credible Web sites can be identified,
evaluated, and shared with other analysts. Known weaknesses in the survey
are mentioned in the findings and conclusions chapters. Chapter six also
includes a recommendation for implementing this evaluation procedure in
the Intelligence Community. The appendices include a copy of the surveys
used; the completed evaluation worksheets for the benchmarked Web sites;
and a blank evaluation worksheet.
CHAPTER 2
LITERATURE REVIEW
RANGE OF THOUGHT
Open source information (OSINF) has been widely accepted as a
necessary element of all-source intelligence reporting, as demonstrated by
Director of Central Intelligence Directive 2/12, which established the
Community Open Source Program Office.10 Most experts agree that OSINF
should support classified intelligence collection. However, I think there has
not been significant attention paid to the issue of identifying credible Web
sites, a significant source of unclassified information. The Web makes foreign
newspapers and gray literature (documents with limited distribution, such
as company brochures or equipment manuals) more accessible, as well as
expert opinions and research projects from universities, to name just some
valuable sources.11 The issue of identifying credible Web sites affects
everyone who uses the Internet, including defense, intelligence, academia,
and industry. Therefore, the literature reviewed for this study included
documents from all of these communities of interest. The authors presented
in this study include: Robert David Steele of Open Source Solutions Inc.; Dr.
Wyn Bowen of King's College, London, writing for Jane's Intelligence Review;
A. Denis Clift, President of the Joint Military Intelligence College (JMIC),
Washington, D.C.; Reva Basch, author of Secrets of the Super Net Searchers;
10 Director of Central Intelligence, Director of Central Intelligence Directive 2/12 (Washington, D.C.: n.p., 1 March 1994). Hereafter cited as DCID 2/12.
11 Basch, 110.
and Alison Cooke, author of Authoritative Guide to Evaluating Information on
the Internet. These authors are all in a position to influence information
analysts, either inside or outside of government, and represent a range of
opinions on the proper use of open source information.
All these points of view agree that there is more data available now
than an analyst can manage unaided. Their approaches are what differ. Steele
and Bowen would expand the Intelligence Community, which is not going to
happen without a long and gradual culture change. Clift sees a need for
better automated tools for data retrieval, including an on-line index of open
sources.12 Cooke and Basch offer solutions for today: evaluate sources based
on criteria similar to those used for traditional print media. This thesis will
demonstrate that the ideas of each of these authors, combined with the
recommended evaluation criteria in this thesis, represent a practical solution to
the information fog of the Web.
Robert David Steele, Open Source Solutions, Inc.
Steele is the most vocal advocate for expanded use of OSINF to
support the other intelligence disciplines, and recommends expanding the
Intelligence Community to include business people and academics, who have
unique knowledge and access. Steele would have analysts consult open
sources first, including subject experts in industry and academia, and then
classified sources. He is President of Open Source Solutions Inc. His
company is in the private open source intelligence (OSINT) business, and he
12 A. Denis Clift, Clift Notes: Intelligence and the Nation's Security (Washington, D.C.: Joint Military Intelligence College, 1999), 51-57.
has proposed his own plan for intelligence in the 21st Century, called
Intelligence and Counterintelligence: Proposed Program for the 21st
Century.13 Steele sees a great need to expand the access that analysts have
to OSINF.14 His view of the future Intelligence Community (IC) includes
several new groups, including scholars and business people, which constitute
the Virtual IC.15 It is these sources that Steele sees as the gold mine of
information. However, he does acknowledge that the Internet will greatly
expand access to OSINF, primarily secondary sources, which are derived from
an original source. He also suggests that OSINF may be used as a source of
tip-offs to serious issues that warrant classified collection.16 However, his
stand that classified intelligence is only useful in the context of what is
already known from open sources borders on accepted practice.
Dr. Wyn Bowen, Open-source Intelligence.
Bowen is an academic concerned about information overload, and
would add non-government subject-matter experts to the intelligence
collection process, as Steele suggests. Bowen thinks that subject-matter
experts should be the people to evaluate Web sites, which is unique in this
literature review. However, he sees open sources as an adjunct to classified
sources, not the source of first resort as Steele suggests. Bowen, who is a
13 Robert D. Steele, Intelligence and Counterintelligence: Proposed Program for the 21st Century, URL: , accessed 5 January 2000. Cited hereafter as Steele, Intelligence.
14 Steele, Intelligence, under Introduction.
15 Steele, Intelligence, under Part III Figure 18.
16 Steele, Intelligence, under Part III.
professor at King's College, London, and writes for Jane's Intelligence Review,
demonstrates the invaluable resources available through open sources in his
article Open-source Intelligence: A Valuable National Security Resource.17 He
uses weapons proliferation as a demonstration case. This case is very
effective because it reduces the issue to tangible products of intelligence
value found in the public domain. Bowen thinks that the role of OSINF is to
provide the context of classified information.18 He also dwells on the issue of
information overload, which concerns Clift. However, he would add non-
government subject-matter experts to the collection process, as Steele also
suggests. Bowen thinks the experts' role should be to identify the useful
sources to keep and collect (not specific data) and the worthless sources to
ignore. In his view, experts would also serve to evaluate sources for
inaccuracy, bias, irrelevance and disinformation, which non-experts would
find difficult to do.19
17 Dr. Wyn Bowen, Open-source Intelligence: A Valuable National Security Resource, Jane's Intelligence Review, 1 November 1999, Dow Jones Interactive, Publications Library, All Publications, Search Terms Open Source Intelligence, URL: , accessed on 4 March 2000.
18 Bowen, under Technical Sources.
19 Bowen, under Conclusion.
A. Denis Clift, President of the Joint Military Intelligence
College
Clift is also concerned about information overload, and sees a need for
better automated selection tools to solve the analysts' selection problems.
Clift is President of the Joint Military Intelligence College (JMIC) in Washington,
D.C. His views are his own and do not represent those of the U.S.
Government; however, as President of the JMIC, Clift is in a position to
influence the opinions of analysts graduating and going on to work in
intelligence. He also served as Editor for the United States Naval Institute
Proceedings, early in his career, from 1963 to 1966.
In Chapter five of Clift Notes: Intelligence and the Nation's Security,
Clift gives a short explanation of the open source programs available today to
support the intelligence analyst.20 He defends the Intelligence Community's
record on making open source information (OSINF) available to intelligence
analysts. He gives an overview of the OSINF programs available to
analysts, but does not indicate how accessible the information is. I observed
lines of analysts waiting to use Internet terminals in the JMIC library in 1999
and 2000. This is an example of why it should be clear to the Intelligence
Community (IC) that OSINF will only be used to its highest potential when it is
on the analyst's desk. The time lost walking to a terminal down the hall or in
the next building is not worth the effort to analysts unfamiliar with the
sources, or inundated with other sources at their fingertips. Clift writes that
OSINF plays an important role in intelligence, and states that the IC already
has a good collection of OSINF in Central Information Reference and Control
20 Clift, 51-57.
(CIRC) of the National Air Intelligence Center and the Defense Scientific and
Technical Intelligence Centers.21 He notes the serious difficulties analysts
have with information overload and the need for better automated selection
tools.22 However, the technology Clift wants is not yet intelligent enough to
discern credible sources from non-credible sources. As will be demonstrated
in the findings chapter, determination of credibility requires research, and
corroboration, and has a measure of subjectivity.
Reva Basch's Secrets of the Super Net Searchers
Basch does not address the Intelligence Community, but does address
the issue of how to select trustworthy Web sites. Basch, as well as Cooke,
takes the most practical approach to finding credible information in the flood
of electronic data. Both recommend using evaluation criteria similar to those
used for print media, with some variations.
Basch published Secrets of the Super Net Searchers in 1996, after
interviewing 35 of the best Internet searchers. In 1996, she was the news
editor for ONLINE, DATABASE, and ONLINE USER magazines and had been an
online researcher for about 21 years. Since then, she has published a series
of Super Searchers books. For Secrets of the Super Net Searchers she
conducted informal interviews with expert researchers, each of which
forms a chapter of the book. Her questions covered many issues
21 Clift, 54.
22 Clift, 56.
affecting online researchers and included the following, which relate to Web
site credibility:23
What is the quality and reliability of information on the Web?
Are some types of sites more reliable than others?
How are biased sources treated?
How are the quality and reliability of unfamiliar Web sites judged?
Is there a relationship between credibility and longevity?
Many of the experts Basch interviewed had something useful to say
about source credibility; their comments were consolidated into several survey
questions for this thesis.
There is disagreement whether information from personal Web sites is
credible. Susan Feldman stated in Super Net Searchers that a Web site
written by Joe Schmo might be way ahead of McGraw-Hill. So you're left to
your own devices to analyze and evaluate.24 However, Mary Ellen Bates, also
interviewed by Basch for Super Net Searchers, stated at a WebSearch
conference in Virginia on 10 May 2001 that she does not rely on personal
Web sites unless they are well known.25
Alison Cooke, Authoritative Guide to Evaluating Information on the Internet
Cooke also does not address the Intelligence Community, but does
address the issue of how to select trustworthy Web sites. Cooke also
23 Basch, 3.
24 Basch, 31.
25 Mary Ellen Bates, Presentation to WebSearch University Conference in Reston, VA, 10 September 2001.
recommends using evaluation criteria similar to those used for print media,
with some variations.
Alison Cooke, who is a professional Internet searcher, wrote the
Authoritative Guide to Evaluating Information on the Internet in 1999. The
author's implicit thesis is that although there is much useless, outdated, and
difficult-to-authenticate information on the Internet, high-quality information
can be found and the quality can be assessed.26 Like Clift and Bowen, Cooke
sees information overload as a serious challenge facing researchers, but
believes accuracy is of most concern to researchers. Her solution is to
carefully evaluate Web sites using criteria similar to criteria used to evaluate
print media.
EVERY MAN'S PRINTING PRESS
There are widely accepted criteria for evaluating traditional print
media. These criteria include the reputation of the publisher and author,
peer-review of scientific articles, and editorial review of periodicals.27 Such
criteria work well when the number of publishers in a particular field is
quantifiable and their past work can be located and reviewed. However,
desktop publishing programs, personal computers, and the Web have
enabled hundreds of thousands of people to produce professional-looking
articles and distribute them to millions of potential readers without the
26 Alison Cooke, Authoritative.
27 Jan Alexander and Marsha Tate, The Web as a Research Tool: Evaluation Techniques, Wolfgram Memorial Library, Widener University, Chester, PA, URL: accessed 13 March 2001.
benefit of peer or editorial review, or regard for brand name reputation.
Among the millions of Web pages available to the public today are many of
potential intelligence value produced by proud inventors, boisterous
government agencies, self-promoting corporations, community-minded
colleges, naive public servants, happy vacationers, and zealous
revolutionaries. The issue at hand today is how to identify credible
information among the millions of personal, organizational, industry,
academic, and government sources. There are as many opinions on this
topic as there are open source researchers and intelligence analysts.
INFORMATION GAPS
Even after a Web site is evaluated based on the criteria presented in
Basch, Cooke or Alexander, the issue of credibility still remains. How does a
subject-matter novice know which sources he can believe? The other issue is
that of relativity. Is a Web site that is credible enough for a high school term
paper also credible enough for a basic intelligence report, or for an
intelligence warning report? This study answers both of these questions.
CHAPTER 3
METHODOLOGY
This chapter on methodology and the following chapter on findings are
organized by key issues and research questions. The key issues are
obstacles that must be overcome before the research question can be
answered. The key issues include: how is open source information relevant
to intelligence; do analysts know of existing official credibility criteria; are
analysts biased toward popular source titles; are foreign sites in English less
credible; and how does the credibility of classified sources compare to
unclassified sources? To answer the research question of how to identify
credible sources on the Web, it was necessary to separate the question into
two parts. The first part of the research question was what criteria can be
used to identify credible Web sites. The second part of the research question
was how credible any intelligence source should be. The methodology relies
on logic and statistics, and is somewhat complex due to the many steps
necessary to arrive at useful criteria that are accurately weighted. The
methodology begins with the development of the thesis survey.
KEY ISSUE: OSINF RELEVANCE TO INTELLIGENCE
Even before the survey could be developed, the basic question needed
to be answered: why is open source information relevant to intelligence? The
literature review provided several views on the role of open sources in
intelligence. The opinions of Steele and Clift offered convincing reasons that
intelligence must include open source information. The reasons for using
OSINF in intelligence products are included in the findings chapter.
SURVEY DEVELOPMENT
Although the primary research question was, how to identify credible
sources on the Web, this thesis needed to answer several key issues
regarding source credibility on the way to answering the primary research
question. Two research methods were used to answer the key issues and
research question. First, published literature was reviewed from Intelink,
online DIA course material, Lexis-Nexis, Dow Jones Interactive, the NSA
Library, and academic Web pages. This literature review uncovered some
answers to the key issues and provided the majority of the concepts tested
by the thesis survey.
Once the thesis survey was developed, it was given to a test
population of 15 intelligence analysts for a validity check. The 15 analysts
completed the survey, suggested adding questions and clarifying ambiguous
wording, and questioned the relevance of some questions. Those changes
were made and the second draft was given to Professor Jerry P. Miller,
Director of the Competitive Intelligence Center at Simmons College in Boston.
Miller offered numerous suggestions that improved the reliability of the
survey. He identified government lingo that would not likely be
understood in industry and academia, and recommended changes to the
survey questions to maintain Likert-type scales for the responses. Likert
scales are a recognized method in social sciences to format survey response
options that are understood by most populations and can be used to measure
a population's opinions evenly.
The second draft was also sent to LTC (ret) Karl Prinslow, project
manager and operations officer of a virtual organization that employs over
150 military reservists who work via telecommuting to collect and acquire
open source information in support of the Intelligence Community's
requirements. Prinslow suggested several format changes that ensured all
recipients were able to display the survey on their computers, and would be
comfortable replying with anonymity. Prinslow and Miller suggested adding
the personal information disclosure statement. Prinslow also simplified some
questions and recommended E-mailing the survey as an ASCII text message
rather than an MS-Word document. The text message enabled
anyone who was able to receive the E-mailed survey to respond to it without
special software.
After making the changes suggested by Miller and Prinslow, two
separate surveys were distributed by E-mail. In the coding and analysis, the
two surveys were treated as one survey, with some questions not applicable
to the whole population. The Intelligence Community (IC) Survey included
several questions at the end, which would not apply to industry or academia,
and it was distributed by internal communications. The Industry Survey
included the same questions as the IC Survey without the IC-unique
questions. The IC Survey was E-mailed to a group of about 100 IC analysts
who have an interest in open source intelligence (OSINT). The exact number
of IC analysts cannot be determined because it was sent to a mail-list, which
often changes. This method had the effect of randomizing the population
selection. One of these 100 analysts E-mailed the survey to 18 other IC
analysts. Four of these 18 E-mailed the survey to 238 others, for a total of
356 IC analysts. This chain of events was evident from the E-mail headings
and some respondents informed the author who forwarded the survey to
them. About 50 participants from a Society of Competitive Intelligence
Professionals (SCIP) conference were then contacted by telephone and agreed
to participate in the E-mail Industry Survey. The Industry Survey was then E-
mailed to those 50 and 9 Defense Department analysts. One of the 9
Defense analysts E-mailed the survey to about 120 other defense analysts. A
total of about 179 analysts are known to have received the Industry Survey.
Together, the two surveys reached about 535 analysts who have an interest
in Internet research. With 66 responses, this equates to a 12.3 percent
response rate from a randomly selected population.28
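The distribution figures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the counts reported in this section; the variable names are illustrative and do not come from the thesis.

```python
# Recipient counts as reported in the text; variable names are illustrative.
ic_recipients = 100 + 18 + 238        # mail-list, first forward, second forward
industry_recipients = 50 + 9 + 120    # SCIP participants, Defense analysts, forwards
total_recipients = ic_recipients + industry_recipients
responses = 66

# Response rate as a percentage of all known recipients.
response_rate = 100 * responses / total_recipients

print(ic_recipients, industry_recipients, total_recipients)  # 356 179 535
print(f"{response_rate:.1f} percent")                        # 12.3 percent
```

Because several of the recipient counts are approximate ("about 100", "about 50", "about 120"), the resulting rate should be read as an estimate.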
RESEARCH QUESTION AND SURVEY STRUCTURE
The survey was structured to answer several key issues and the
research question: how to identify credible sources on the Web. The
hypothesis was that credible Web sites can be confidently identified by
evaluating the Web sites based on criteria recommended by professional Web
searchers and agreed to by intelligence analysts. The thesis survey asked
this question directly in survey question 6, and indirectly in survey questions
28 Appendices B and C include a copy of the E-mailed surveys.
8a through 8r. Question 8 listed the criteria most often mentioned by
published experts. Here is how the survey asked these questions.29
6. List up to five criteria that you use to determine the credibility of any information source.
a.
b.
c.
d.
e.
8. How much credibility does each of the following factors add to the total credibility of a Web site? Use the following scale:
___6) 100 percent Credibility
___5) 75 percent Credibility
___4) 50 percent Credibility
___3) 25 percent Credibility
___2) 10 percent Credibility
___1) 0 percent Credibility
a. Recommended by a subject-matter expert.
b. Recommended by a generalist.
c. Listed by an Internet subject guide that evaluates Web sites.
d. Listed in a search engine such as Alta Vista.
e. Listed in a Web-directory organized by people, such as yahoo.
f. Content is perceived current.
g. Content is perceived accurate.
h. A peer or editor reviewed the content.
i. Content's bias is obvious.
j. Author is reputable.
k. Author is associated with a reputable organization.
l. Publisher, or Web-host is reputable.
m. Content can be corroborated with other sources.
n. Other Web sites link to or give credit to the evaluated site.
o. The server or Internet domain is a recognized copyrighted or trademark name such as IBM.com.
p. There is a statement of attribution.
q. Professional appearance of the Web site.
r. Professional writing style of the Web site.
To avoid influencing the responses to survey question 6, analysts were
first asked to list the criteria they currently use; they were later asked to
29 Survey, questions 6 and 8.
evaluate the list of criteria in questions 8a through 8r. If the survey
population had been asked about specific criteria (question 8) before being
asked what criteria they actually use (question 6), they may have been
influenced to include the listed criteria from question 8 as criteria that they
use. This arrangement was necessary because earlier discussions with
analysts revealed that there were criteria that analysts would use only after
they were told of them. Discussions with analysts prior to the survey
development had also revealed that many analysts do not know how they
determine what is a credible source, and that many analysts may only
evaluate the data, and not the source.
As is shown in the findings chapter, many analysts were confused
about the difference between data validity and source credibility. The
categorized results of question 6 were then compared to the specific criteria
analysts approved of in question 8.
RESEARCH QUESTION: CREDIBILITY CRITERIA
The results of questions 6, and 8a through 8r were used to develop the
recommended credibility criteria and credibility scale in the findings chapter.
The recommended criteria were determined by computing the mode (score
most-often chosen) for each criterion in survey questions 8a through 8r. An
unusual amount of variance in the responses would indicate little agreement
among the analysts. Only criteria
from question 8 that scored a mode of 50-percent credibility or greater were
included in the recommended criteria list. This means analysts most often
believe (mode) that the satisfaction of any one of these recommended
criteria made the source at least 50-percent credible.
Then the arithmetic mean (average) credibility was calculated for each
recommended criterion from question 8 and became that criterion's relative
value. The relative value is how much more important, on average, analysts
think one criterion is than another criterion. The assumption here is that
such attributes are cumulative, and the more recommended criteria a site
satisfies, the more credible the site is.
The results of question 6 were categorized into a list of criteria that
analysts think they use to evaluate source credibility. The frequencies of
these criteria were calculated, and those criteria that were suggested by 50
percent of the analysts were added to the recommended criteria list.
Because the recommended criteria from question 6 were not evaluated on a
scale in the survey, they were arbitrarily assigned the average relative value
of those recommended criteria from question 8. This allowed the inclusion of
any criteria not included in question 8, but also did not significantly affect the
relative values of those criteria.
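The handling of question 6 described above can be sketched as follows. This is an illustration only: the analyst answers and the question 8 relative values are hypothetical, and the categorization of free-text answers is assumed to have been done already.

```python
from collections import Counter
from statistics import mean

# Hypothetical categorized answers to question 6, one set per analyst.
question6 = [
    {"corroboration", "author reputation"},
    {"corroboration", "timeliness"},
    {"corroboration"},
    {"author reputation", "corroboration"},
]
# Hypothetical relative values of criteria already recommended from question 8.
question8_values = {"8a": 5.0, "8g": 4.5, "8m": 5.5}

# Count how many analysts suggested each criterion.
counts = Counter(c for analyst in question6 for c in analyst)
cutoff = 0.5 * len(question6)  # suggested by at least 50 percent of analysts
added = sorted(c for c, n in counts.items() if n >= cutoff)

# Criteria added from question 6 receive the average question 8 relative value.
average_value = mean(question8_values.values())
recommended = dict(question8_values)
recommended.update({c: average_value for c in added})

print(added, average_value, mean(recommended.values()))
```

Assigning the average value to the added criteria leaves the overall mean of the relative values unchanged, which is consistent with the statement that the additions did not significantly affect the values of the other criteria.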
The following is a summary of the selection process for the
recommended criteria, and relative value calculation:
Step 1. Calculated the mode (most-often chosen) credibility (0-100
percent) of each criterion from survey question 8.
Step 2. Listed as recommended the criteria from question 8 that had a mode credibility of 50 percent or greater.
Step 3. Calculated the mean credibility (average analyst chosen score) for each recommended criterion from question 8.
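Steps 1 through 3 can be sketched in Python as shown below. The responses are hypothetical and are not survey data; the rule that the higher value wins when two scores tie for the mode is an assumption, since the text does not say how ties were handled. Values are kept in percent because the conversion to points is not described in this part of the text.

```python
from statistics import mean, multimode

# Survey scale from question 8: answer option -> percent credibility.
SCALE = {1: 0, 2: 10, 3: 25, 4: 50, 5: 75, 6: 100}

# Hypothetical responses (answer options 1-6) for three criteria; not survey data.
responses = {
    "8a expert recommendation": [6, 5, 5, 4, 5],
    "8d listed in search engine": [2, 3, 2, 1, 2],
    "8m corroborated": [5, 6, 5, 5, 4],
}

def mode_percent(answers):
    """Step 1: most-often chosen credibility in percent (highest wins a tie)."""
    return max(SCALE[a] for a in multimode(answers))

def recommended(all_responses, threshold=50):
    """Steps 2 and 3: keep criteria whose mode >= threshold, with their mean."""
    result = {}
    for name, answers in all_responses.items():
        if mode_percent(answers) >= threshold:
            result[name] = mean(SCALE[a] for a in answers)
    return result

for name, value in recommended(responses).items():
    print(f"{name}: mean credibility {value:.1f} percent")
```

With these sample answers, the search-engine criterion has a mode of 10 percent and is dropped, while the other two criteria are kept with their mean values.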
Scale:
___7) No Opinion
___6) 100 percent Credible
___5) 75 percent Credible
___4) 50 percent Credible
___3) 25 percent Credible
___2) 10 percent Credible
___1) 0 percent Credible
Analysts were then asked to choose the required level of credibility for:
9a. Research, or topic summaries.
9b. Current, day-to-day developments.
9c. Estimative, identifies trends or forecasts opportunities or threats.
9d. Operational, tailored, focused to support an activity.
9e. Scientific, and technical, in-depth, focused assessments.
9f. Warning, an alert to take action.
The mode response for each of these types of intelligence products
was calculated; these modes are the product-credibility levels, which are
shown in Table 2 in the findings chapter. The product-credibility level
percentages were converted into a score so that analysts can simply add the
results of an evaluation and compare the sum to the table of
product-credibility levels.
The product-credibility level is also the credibility level that is needed
for sources that analysts use for a particular intelligence product. When a
potential Web site is evaluated, the analyst calculates the credibility score of
the evaluated site, and then compares it to the table of product-credibility
levels in Table 2. The sum of the evaluated Web site should be at least equal
to the product-credibility level of that type of intelligence product shown in
the table. The source-credibility level of each intelligence product type was
determined by calculating the percentage of a benchmarked, very credible
Web site's score that would equal the product-credibility level that was
recommended by the surveyed analysts. For example, here is a theoretical
Web site evaluation, which also demonstrates how the product-credibility
level was determined.
Example:
Benchmark site credibility score = 46.75 points (100 percent Credible)
Product-credibility level of intelligence product: 35.06 (75 percent of 46.75).
Theoretical results of a Web site evaluation:
Meets Criteria 1 = 5 points
Meets Criteria 3 = 6 points
Meets Criteria 4 = 3 points
Meets Criteria 5 = 3.5 points
Meets Criteria 6 = 5 points
Meets Criteria 7 = 4.5 points
Meets Criteria 10 = 2 points
Meets Criteria 11 = 3 points
Meets Criteria 12 = 3.5 points
Meets Criteria 13 = 1.5 points
Meets Criteria 14 = 4.5 points
Sum of Evaluated Site = 38 points
Result: Exceeds the product-credibility level of 35.06
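The comparison in this example can be reproduced with a short Python sketch. The point values are copied from the example; the constant and variable names are illustrative. Note that the point values as listed add up to 41.5 rather than the 38 reported in the example; the comparison against 35.06 comes out the same either way.

```python
BENCHMARK_SCORE = 46.75     # benchmark site score (100 percent credible)
REQUIRED_FRACTION = 0.75    # product-credibility level as a share of the benchmark

# Points for the criteria the theoretical site meets, as listed in the example.
points = {1: 5, 3: 6, 4: 3, 5: 3.5, 6: 5, 7: 4.5,
          10: 2, 11: 3, 12: 3.5, 13: 1.5, 14: 4.5}

threshold = BENCHMARK_SCORE * REQUIRED_FRACTION   # 35.0625, reported as 35.06
site_score = sum(points.values())                 # 41.5 for the listed values
meets_level = site_score >= threshold

print(f"threshold = {threshold:.2f}, site score = {site_score}, "
      f"meets level: {meets_level}")
```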
This summarizes the process recommended in this thesis to evaluate
the credibility of a Web site. This process is based on the theory that the
criteria recommended by expert Web searchers and approved by most
analysts are the best criteria for evaluating Web sites. The weight or relative
value of each criterion is based on the average score given the criterion by
analysts. The final evaluation is based on a comparison of the total values of
the evaluated site to the total values of the benchmark sites.
Discussions with analysts and the literature review indicated that well-
known publication titles are perceived as more credible than obscure titles,
even though the analysts may have never seen the well-known titles.
Therefore, to determine how objective analysts are, questions 7a through 7m
asked analysts to evaluate the credibility of 13 sources based only on their
titles. This key issue was answered by comparing the well-known titles in
survey questions 7a, b, c, j, k, l, and m, with obscure titles in survey
questions 7d, e, f, g, h, and i. Question 7 asked:34
7. How credible are the following information sources given only their titles? Choose one from the following scale:
___7) = Certainly True
___6) = Strongly Credible
___5) = Credible
___4) = Undecided
___3) = Non-credible
___2) = Strongly Non-credible
___1) = Certainly False
Well-known Titles:
a. NY Times
b. Washington Post
c. Harvard.edu Web site
j. NationalGeographic.com Web site
k. JanesDefenseWeekly.com Web site
l. InformationWeek.com Web site
m. DowJonesInteractive.com Web site
Obscure Titles:
d. RussianArmy.ru, Web site in Russian
e. RussianArmy.ru Web site in English
f. IsraelIndependentNews.is Web site in Hebrew
g. IsraelIndependentNews.is Web site in English
h. FrenchIndependentNews.fr Web site in French
i. FrenchIndependentNews.fr Web site in English
34 Survey, questions 7a-7l.
However, there was a problem with how this question was structured
and the findings may not be valid. Judging from the comments in the
surveys, it was evident that analysts were not able to make credibility
judgments for many sources based on titles alone either because they had
personal experience with the sources, which influenced their judgments, or
because they were unwilling to make an uninformed judgment based on titles
alone.35
KEY ISSUE: FOREIGN LANGUAGE SOURCES
An issue related to source titles was, do analysts perceive foreign
sources published in their native language to be more credible than the
English language version of the same publications? This question was
answered by comparing survey questions 7d to 7e, and comparing 7f to 7g,
and comparing 7h to 7i. The validity of these questions was preserved by not
including any real publications or Web site titles, which the analysts may be
familiar with.36
7. How credible are the following information sources given only their titles? Choose one from the following scale:
___7) = Certainly True
___6) = Strongly Credible
___5) = Credible
___4) = Undecided
___3) = Non-credible
___2) = Strongly Non-credible
___1) = Certainly False
d. RussianArmy.ru, Web site in Russian
35 Survey, questions 7a-7l.
36 Survey, questions 7d-7i.
e. RussianArmy.ru Web site in English
f. IsraelIndependentNews.is Web site in Hebrew
g. IsraelIndependentNews.is Web site in English
h. FrenchIndependentNews.fr Web site in French
i. FrenchIndependentNews.fr Web site in English
KEY ISSUE: CLASSIFIED VS. UNCLASSIFIED SOURCES
Discussions with IC managers and consultants often included
questions such as how classified sources compare in credibility to
unclassified sources and, less often, how classified sources compare to one
another. This is a comparison that is likely to change over time. One JMIC
professor explained that different intelligence sources seem to go in and out
of favor as access success improves for one source or another. These issues
were only included in the IC Survey and most analysts answered as though
they had an opinion. Therefore, questions 7n, o, p, q, r, and s asked:37
7. How credible are the following information sources given only their titles? Choose one from the following scale:
___7) = Certainly True
___6) = Strongly Credible
___5) = Credible
___4) = Undecided
___3) = Non-credible
___2) = Strongly Non-credible
___1) = Certainly False
The intelligence sources in question included:
7n. HUMINT sources with no reporting record
7o. HUMINT sources with a proven reporting record
7p. IMINT, with National analysts' annotations or comments
7q. IMINT, without National analysts' annotations or comments
37 Survey, questions 7n-7s.
7r. SIGINT reporting
7s. MASINT
Analysis of these questions included a calculation of the mode and
range for all sources included in question 7, and a comparison of the
sources to each other. This provides an interesting comparison of classified
and unclassified sources. 38
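A minimal sketch of the mode and range calculation follows. The answers are hypothetical values on the seven-point scale of question 7 and are not survey results; taking the higher score when two scores tie for the mode is an assumption.

```python
from statistics import multimode

# Hypothetical answers on the 1-7 scale of question 7; not survey data.
answers = {
    "7a NY Times": [5, 6, 5, 5, 4],
    "7o HUMINT, proven record": [6, 5, 6, 6, 5],
    "7n HUMINT, no record": [4, 3, 4, 2, 4],
}

def summarize(values):
    """Return (mode, range); the higher score wins a tied mode (an assumption)."""
    return max(multimode(values)), max(values) - min(values)

summary = {source: summarize(values) for source, values in answers.items()}

# List sources from the highest mode to the lowest for side-by-side comparison.
for source, (mode, spread) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{source}: mode {mode}, range {spread}")
```

The mode shows the rating analysts chose most often for a source, and the range gives a rough sense of how far their opinions spread.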
ETHICS
The thesis survey relied on truthful responses from analysts
currently working in areas included in this survey. Such responses could be
critical of an analyst's employer or profession; therefore, the survey included
the following statement intended to protect the respondents' anonymity.
PRIVACY:
You do not need to include your name; however, if you choose to
include your name, it will only be used by me to contact you if I need
more information regarding your comments. I will not quote you
directly unless you indicate in Questions 3 and 4 that I may do so.
Otherwise, only me and my Thesis Chairman, Professor Alex Cummins
will have access to respondent names. Any record of the names in
association with the responses will be destroyed after the research is
completed, except those names included in the thesis with
permission.39
38 See Table 7, and Table 8 in the findings chapter.
39 Survey, Privacy.
CHAPTER 4
FINDINGS
This chapter first describes what was discovered in the literature
review that could answer the research question and the key issues. Then the
results of the survey are described, followed by how these results answered
the research question and the key issues. The survey determined what
criteria analysts use today to judge the credibility of an intelligence source,
which can be found in Appendix D. Even after consolidation, 148 separate
criteria were suggested by analysts, indicating little consistency in criteria, or
little understanding of the differences between data validity and source
credibility. Many of the suggested criteria appear to be measures of valid
data, or lists of known credible sources.40
The most significant result of the survey is the list of recommended
credibility criteria determined by surveying analysts' opinions of criteria
suggested by experts in the literature review. Only two expert
recommendations were rejected by the surveyed analysts. The survey also
showed that analysts see only a small difference in the credibility of open
sources and classified sources.41 42
40 Survey, question 6.
41 Survey, questions 7a through 7s.
42 See Table 8 in findings chapter for comparison of classified and unclassified source credibility.
Just as useful as the credibility criteria is the credibility scale
developed by benchmarking known credible and known non-credible Web
sites. The benchmarked sites determined the expected score of a credible
Web site. The survey results also determined a target level of credibility for
intelligence sources, which was converted to a percent of the credible
benchmark score on the credibility scale. The benchmarking of known
credible and non-credible Web sites validated the criteria and demonstrated
that credible sources can be identified on the Web.43
KEY ISSUE: OSINF RELEVANCE TO INTELLIGENCE
Although all experts agree that open source information (OSINF)
contributes to intelligence, how OSINF should contribute is still an open
debate. Steele suggests that analysts should reference OSINF first, and then
classified sources, and presumably only then request further classified
collection to fill the intelligence gaps.44 This approach would acquire data
from the least expensive sources first. Steele calls for 5 percent of the
intelligence budget to be moved to support OSINF acquisition.45 He claims
this would increase timely intelligence by an order of magnitude. His comments
suggest an answer to the key issue of how relevant OSINF is to intelligence.
Open sources include what is already publicly known about a subject, and
therefore should represent the background and context of any intelligence
43 See Appendix A, Benchmarked Web Site Evaluation Worksheet.
44 Steele, under Part III.
45 Steele, under Part III.
report, and should be considered before any classified collection is
attempted. Not to do so would potentially waste funds and possibly put
people at risk for information that may have been found in a foreign Web
site, foreign newspaper, or company brochure. These open sources can also
be used to corroborate classified intelligence, thus contributing to the
credibility of a classified source. Because classified resources are so much
more expensive than open sources, open sources should always be the first
choice, followed by classified sources if the information is not available
through open sources, or if the open sources' credibility cannot be determined
or is determined to be
too low. Therefore, OSINF affects the cost of intelligence, the timely access
to information, the context of intelligence, the credibility of intelligence, as
well as the content.
Bowen's recommendations to include subject-matter experts in the
intelligence collection cycle may be a practical way to implement the
evaluation process proposed by this thesis.46 Implemented community wide,
Bowen's cadre of OS subject experts could produce a significant savings in
time and money spent by countless analysts attempting to sort the useful
credible information from the useless and non-credible information. I have
observed that every analyst who makes use of Web sites for open source
intelligence must rediscover which sites are useful and credible, even though
an expert at another agency or just down the hall may have already
evaluated the site. Also, when a Web site is recommended by one analyst to
another analyst, there is no consistent way to evaluate the Web site and
express that evaluation to other analysts. This research produced a
46 Bowen, under Collection Strategy.
Bias. The researcher must understand the source's bias.51
Objectivity. Are the author's statements supported with reasoning
or facts? 52 Even a biased author can compensate for his bias by including
competing reasoning and facts.
Accuracy. Online sources are generally quicker than print media at
correcting errors.53 Even print sources include inaccurate information or
disinformation.54 I believe that this is significant because accuracy affects
credibility; therefore, Web sources should be more accurate and timely
than print media because the technology enables quicker revisions.
Expert opinion.
Rely on second party expert evaluation whenever possible, e.g.,
recommendations from professional associations, academic organizations,
subject experts.55
Informal networks of colleagues with different areas of expertise
inform one another of credible sources.56
Use second opinions to evaluate the accuracy of an author, which
can be done by posting related questions to appropriate news groups. 57
51 Basch, 9, 15.
52 Basch, 31.
53 Basch, 48.
54 Basch, 9.
55 Basch, 31.
56 Basch, 31.
57 Basch, 31.
Subject area Web pages created by subject librarians are a good
source of links to evaluated Web sites.58 I recommend evaluation sites
that explain their evaluation process.
Gray literature (documents with limited distribution such as
company brochures, or equipment manuals), best located on the Web, is
often published by very credible sources, including governments, and
corporations, which can be good sources for factual data. Interpretation
of the data may require an expert.59 I suggest asking a subject-matter
expert to distinguish facts from advertising in corporate literature.
Origin.
How close is the source to the origin of the data? 60
Discover the original source to avoid circular and false
corroboration.61
Corroboration. Can the information be corroborated?62
Corroboration is only effective if it is from diverse sources. This is another
reason it is important to know the origin of the data.
Current. Is the information current? 63
58 Basch, 139.
59 Basch, 40, 110.
60 Basch, 9.
61 Basch, 16.
62 Basch, 9, 96.
63 Basch, 132.
Know which publishers, universities, or companies are well
respected in your topic area.69 These are likely to be credible sources, or
able to identify credible sources.
Reputable publishers, well-known authors, and (peer) reviewed
publications are more credible than other sources.70
Attribution.
Does the source clearly identify itself and its purpose? 71
Indications of the source include the text of the Web site, the name
of the Web server in the URL, and the directory name in the URL, which
may include the author's name.72
Attribution should include the institution and a person, with
information on how to contact the author.73
I would also recommend viewing the Web site's HTML source code
for revision dates, and statements of attribution not shown in the Web
site's body.
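As an illustration of the attribution checks above, the following Python sketch pulls the server and directory names out of a URL and collects the meta tags and comments from a page's HTML source. It uses only the Python standard library. The page text, the URL, and the names AttributionParser and url_clues are hypothetical; real pages vary widely, and many carry no such tags at all.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class AttributionParser(HTMLParser):
    """Collect <meta> name/content pairs and HTML comments from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.comments = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            fields = dict(attrs)
            # Skip meta tags that lack a usable name or content attribute.
            if fields.get("name") and fields.get("content") is not None:
                self.meta[fields["name"].lower()] = fields["content"]

    def handle_comment(self, data):
        self.comments.append(data.strip())

def url_clues(url):
    """Return the server name and the directory names found in a URL."""
    parts = urlparse(url)
    directories = [p for p in parts.path.split("/")[:-1] if p]
    return parts.hostname, directories

# Hypothetical page source and URL, for illustration only.
page = """<html><head>
<meta name="author" content="J. Smith">
<meta name="date" content="2001-03-13">
<!-- last revised 13 March 2001 -->
</head><body>Text</body></html>"""

parser = AttributionParser()
parser.feed(page)
host, directories = url_clues("http://www.example.edu/~jsmith/papers/index.html")
print(parser.meta, parser.comments, host, directories)
```

Anything found this way is only a clue supplied by the page's author, so it still needs to be weighed against the other criteria.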
Motivation.
Information has value; therefore, know why a source provides
information for free.74
69 Basch, 110, 137.
70 Basch, 32.
71 Basch, 16.
72 Basch, 140.
73 Basch, 140.
74 Basch, 77.
The presence of a counter on a Web site indicates the author cares
that people know that other people like his site enough to visit it.75
However, I am aware that counters have also been used to falsely
indicate that a site is popular when it is not. Therefore, counters are
probably not a reliable indicator of anything. A more relevant indicator of
popularity is how many and which other Web sites include links to the
evaluated site. I suggest using AltaVista's Link: command in the
Advanced Search area to determine this. A search of relevant news groups
will also indicate what other people think of a Web site.
Relativity. What is a good source for one purpose may be
insufficient for another purpose.76 This is another reason that I think
that Web sites are best evaluated by subject-matter experts. A
novice or generalist who evaluates a Web site for someone else
should indicate his own level of knowledge in the topic area. This
also relates to thesis survey question 9, which asked analysts to
evaluate how credible a source must be to use it for different
intelligence products.
All of the statements listed above from respected Internet searchers
contributed to the thesis survey question 8, which asked how much
specific criteria contribute to the credibility of Web sites.
Alison Cooke's Authoritative Guide to Evaluating Information on the
Internet included three areas: what is high quality information, how to find it,
and how to evaluate it. Each of these areas contributed to the development
75 Basch, 132.
76 Basch, 133.
of relevant questions in the thesis survey. On the topic of high-quality
information, Cooke explains that some of the most common problems with
the Internet include:77
information overload
too much useless information
potentially inaccurate material
outdated material
Publishing has become so easy that researchers must comb through
thousands of supposedly related Web pages returned by search tools, which
do not even include databases, news services, and FTP sites. The citation
search engines are of no help in determining quality or relevance. Most
search engines are only an index of Web pages found.
Cooke explains that without the filtering provided by commercial and
academic publishers, people publish because they can, not because they
have something useful to share.78 I have observed that this is a serious
problem because it camouflages the useful information and requires a great
amount of time to sort through. A useless site can have all the gloss, format,
and authoritative lingo of a useful site, yet have no useful content.
Cooke contends that accuracy is perhaps of most concern to
researchers and professionals. As an example of the accuracy issue, Cooke
explains that of forty WWW medical sites evaluated, only four included
advice close to the authoritative published recommendations.79 I believe that
this level of inaccuracy is possible because Web authors are their own editor
77 Cooke, 89.
78 Cooke, 12.
79 Cooke, 62.
and publisher, allowing no opportunity for the critical review that most scholars
and professionals welcome.
Methods for finding data on the Web are unique to the Web and online
sources. Cooke explains in great detail the advantages and disadvantages
of:
search engines
review and rating services
subject catalogs and directories
subject-based gateway services and virtual libraries
Cooke explains that search engines such as Excite and Lycos (or
AltaVista, which is still solvent) are comprehensive, unfocused, have poor
relevance ranking, and are not useful for finding or evaluating sources for
quality. They are also generally limited to Web sites and index every page on
every site, further multiplying the number of results per query.80 I have
observed that some search engines such as Google have resolved this
multiple indexing of a single site by displaying only the first indexed page,
unless one requests more.
Cooke also writes that subject catalogs and directories such as Yahoo
and Galaxy are more useful because site authors write the site descriptions;
catalog experts choose the hierarchy category to place the site; and only
sites are indexed, not every page. However, these sites are still very large,
and because the indexing is done by people rather than machines, as is the
80 Cooke, Chapter 2.
case with search engines, Web site directories are not revisited as often and
may become outdated.81
Cooke also wrote that rating and reviewing services use different,
usually unpublished criteria for rating the best sites. These include
Encyclopaedia Britannica's Internet Guide and Lycos Top 5 percent.82 These
are better still for finding high-quality sources because a person other
than the author has reviewed the site based on some criteria. However,
these criteria are targeted to a general audience, not the academic or
professional. Higher weight may be given to organization and graphics than
to content or accuracy, and the evaluators are not subject-matter experts.83
Cooke believes that the best place to find high-quality sources is from
subject-based gateway services and virtual libraries. These facilities are
designed by librarians or subject-matter experts, and use common indexing
methods used in libraries. They are often subject-matter specific, and sites
are evaluated and described by subject-matter experts.84
The last section of Cooke's book gives checklists of evaluation criteria
for several Internet source types. The criteria can be used for overall
evaluation of Web sites, not specifically for credibility as this thesis does.
Cooke's criteria are based on surveys of hundreds of Internet users, and were
81 Cooke, Chapter 2.
82 Cooke, Chapter 2.
83 Cooke, Chapter 2.
84 Cooke, 92.
validated by professional librarians. The unique evaluation criteria for each
type of Web site are fully described.
The source types described in this book, with general evaluation
criteria, included:
organizational WWW sites
personal home pages
subject-based WWW sites
electronic journals and magazines
image-based and multimedia sources
USENET newsgroups and discussion groups
databases
FTP archives
current awareness services
FAQs
Criteria for assessing an organizational Web site should include the
authority and reputation of the institution within its field, as well as the date
the page was last updated.85 Criteria for a subject-based Web site include
the purpose of the site, comprehensiveness, and whether the page includes
pointers to other sources for more information.86 Evaluation criteria for
electronic journals and magazines include the site's authority and reputation
as well as whether the site has been referenced by a known reputable journal
85 Cooke, 90.
86 Cooke, 97.
that filters its own articles for accuracy.87 These criteria were included in the
survey questions for this thesis.
SURVEY FINDINGS, CREDIBILITY CRITERIA
The primary purpose of the thesis survey was to identify criteria for
assessing the credibility of a Web site. The recommended credibility criteria
were determined by a multi-step process. First, all credibility criteria
recommended by experts in the literature review were listed, and then
consolidated. Then the consolidated list of expert criteria was included in
the thesis survey to industry and intelligence analysts as questions 8a
through 8r. Those criteria to which analysts most often gave a credibility
value of 50 percent or higher were then listed as recommendations. Note that
only three criteria were rejected as credible by 50 percent or more of
respondents. The first two were not recommended by experts, but were
added as control questions to assess the basic knowledge of respondents,
and were not expected to be accepted by respondents.
Rejected criteria included:
8d. Listed in a search engine such as AltaVista.
8e. Listed in a Web directory organized by people, such as Yahoo.
8r. Professional writing style of Web page.
Then the mean credibility (average analyst-chosen score) was
calculated for each recommended criterion from question 8. The mean then
became the relative value or weight for each criterion.
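The selection and weighting steps described above can be illustrated with a short sketch. It is not the tooling used for this thesis, which does not say how the statistics were computed. The response lists are invented for illustration and the function name is hypothetical. The sketch assumes the six-point answer scale given in the notes to Table 1 (1 = 0 percent through 6 = 100 percent credible), recommends a criterion when its mode is 4 (50 percent) or higher, and reports the smallest value when several modes tie.

```python
from statistics import mean, multimode

# Answer scale from the Table 1 notes: 1=0%, 2=10%, 3=25%, 4=50%, 5=75%, 6=100%.
RECOMMEND_AT_OR_ABOVE = 4  # a mode of 4 corresponds to "50 percent credible"


def summarize_criterion(scores):
    """Return (mean, mode, recommended) for one criterion's valid responses."""
    mode = min(multimode(scores))  # smallest value when several modes tie
    avg = round(mean(scores), 2)   # the mean later serves as the relative value
    return avg, mode, mode >= RECOMMEND_AT_OR_ABOVE


# Hypothetical responses, NOT the survey data.
responses = {
    "peer reviewed": [5, 5, 4, 6, 5, 3],
    "listed in a search engine": [1, 1, 2, 3, 1, 4],
}

for name, scores in responses.items():
    avg, mode, recommended = summarize_criterion(scores)
    print(f"{name}: mean={avg}, mode={mode}, recommended={recommended}")
```

Only criteria flagged as recommended would carry their mean forward as a weight; the others are dropped from the evaluation.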
87 Cooke, 98.
The criteria recommended in survey question 6 were then listed and
consolidated. The methodology planned to add to the list of recommended
criteria from question 8 those criteria from question 6 that were not already
on the recommended list and that had an occurrence rate of 50 percent or
greater (at least half the analysts listed the criterion). Surprisingly, there
were no criteria recommended by half or more of the respondents in the
open survey question number 6. The criteria that were mentioned most
often were: corroboration (28 occurrences), bias (14 occurrences), reputation
of the source (10 occurrences), source's authority or credentials (8
occurrences), and presentation (7 occurrences).88 However, each of these
most-often suggested criteria, except source authority, was also suggested
by published experts discussed in the literature review, and was recommended
by 50 percent or more of respondents when asked about those specific criteria
in survey questions 8a-8r. Therefore, no additional criteria were added from
question 6.
Table 1 below includes the results of the criteria surveyed,
the relative values of each criterion, and which criteria were chosen for
recommendation.89
88 See Table 15. Survey Question 6: Personal Criteria Analysts Currently Use
to Determine Credibility.
89 Survey, questions 8a-8r.

Table 1. Question 8a to 8r, Recommended Criteria and Relative Values (Mean).(a)

                                                 Number of Cases
Criteria                                         Valid  Missing  Mean  Mode  Recommended
8a. Recommended by subject-matter expert in the  66     0        4.94  5     Yes
    topic of the Web page.
8b. Recommended by a generalist.                 65     1        3.65  4     Yes
8c. Listed by an Internet subject guide that     63     3        3.56  4     Yes
    evaluates Web sites.
8d. Listed in a search engine such as AltaVista. 64     2        2.39  1     No
8e. Listed in a Web directory organized by       62     4        2.65  2     No
    people, such as Yahoo.
8f. Content is perceived current.                64     2        3.78  5     Yes
8g. Content is perceived accurate.               63     3        4.56  5     Yes
8h. A peer or editor reviewed the content.       65     1        4.52  5     Yes
8i. Content's bias is obvious.                   65     1        3.06  4     Yes
8j. Author is reputable.                         64     2        4.64  5     Yes
8k. Author is associated with a reputable        65     1        4.42  5     Yes
    organization.
8l. Publisher or Web host is reputable.          65     1        4.02  5     Yes
8m. Content can be corroborated with other       65     1        5.17  5     Yes
    sources.
8n. Other Web sites link to, or give credit to   65     1        3.68  5(b)  Yes
    the evaluated site.
8o. Server or domain is copyrighted or trademark 65     1        3.45  4     Yes
    name, like IBM.com.
8p. Statement of attribution.                    64     2        3.78  5     Yes
8q. Professional appearance of Web site.         65     1        2.86  4     Yes
8r. Professional writing style of Web page.      64     2        3.16  3     No

(a) Table Explanatory Notes. Mode Values: 1=0 percent, 2=10 percent, 3=25
percent, 4=50 percent, 5=75 percent, 6=100 percent credible. Mode is the
most-often chosen score respondents gave each criterion. Only modes of 50
percent credible and higher are recommended. The Mean is the average score
respondents gave each criterion. The Mean is assigned to each recommended
criterion as its relative value; these values are later summed when
evaluating a Web site.
(b) Multiple modes exist. The smallest value is shown.
The last step of the process to identify commonly agreed-upon
credibility criteria and to assign relative weights involved applying the
recommended criteria to known credible, and known non-credible Web sites,
to establish benchmarks and a relative credibility scale. Three credible sites
known to the author or recommended by a subject expert were evaluated to
establish the high-end of the relative credibility scale. The relative values of
each criterion that the site satisfied were then summed for the site's relative
credibility score. Then the average of the three credible Web sites was
calculated as the benchmark credible score. See Appendix A for the
evaluation worksheets, and detailed evaluation for these Web sites.
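As a minimal sketch of the scoring step, the relative values (means) of the recommended criteria in Table 1 can be stored as weights and summed for whichever criteria a site satisfies. The weights and the three benchmark scores are taken from this thesis; the dictionary keys, the function name, and the example set of satisfied criteria are hypothetical and do not reproduce the Appendix A worksheets.

```python
# Relative values (means) of the fifteen recommended criteria, from Table 1.
WEIGHTS = {
    "8a": 4.94, "8b": 3.65, "8c": 3.56, "8f": 3.78, "8g": 4.56,
    "8h": 4.52, "8i": 3.06, "8j": 4.64, "8k": 4.42, "8l": 4.02,
    "8m": 5.17, "8n": 3.68, "8o": 3.45, "8p": 3.78, "8q": 2.86,
}


def relative_credibility(satisfied):
    """Sum the weights of the criteria that a site satisfies."""
    return round(sum(WEIGHTS[c] for c in satisfied), 2)


# Hypothetical evaluation: a site that meets only three criteria.
example_score = relative_credibility({"8g", "8j", "8m"})

# Benchmark: average of the three credible sites' scores reported in the thesis.
credible_scores = [43.19, 48.24, 48.82]
benchmark = round(sum(credible_scores) / len(credible_scores), 2)

print(example_score)  # 14.37
print(benchmark)      # 46.75
```

Under these weights, a site satisfying all fifteen recommended criteria would score 60.09, so the benchmark of 46.75 sits below the arithmetic maximum of the scale.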
Surprisingly, it was easier to find known credible Web sites to evaluate
than it was to find known non-credible Web sites to evaluate. This was
because it did not seem useful to benchmark a Web site so obviously
non-credible that no analyst would consider using it, negating the need for an
evaluation at all. Due to this difficulty, only one non-credible Web site was
evaluated. Due to concerns about potential libel claims, this non-credible
Web site will be referenced here by the pseudonym KoreanNewsSite. The
KoreanNewsSite was selected because the author had evaluated this site for
a previous research paper and had found it non-credible, and yet a challenge
to evaluate. The challenge in evaluating it came from its mix of very credible
links, unknown contributing authors, and non-credible articles by the
publisher. The key points that made the publisher's articles non-credible
included a general lack of authoritative citations to source documents, lack of
dates on the articles, a distinct bias camouflaged by corroborative facts, and
inaccuracies. Relevant newsgroup discussions indicated that the publishing
author had a poor reputation for these same reasons.
The figures below represent the relative credibility scale and how these
benchmarks were determined. Based on these evaluations, a very credible
Web site should rate a relative credibility score of about 46.75, and a non-
credible site should rate a relative credibility score of about 7.46.
Benchmark Credible Web sites Evaluated                  Score
Spot Image Corporation, www.spot.com                    43.19
International Telecommunications Union, www.itu.int     48.24
NY Times On the Web, nytimes.com                        48.82
Average Score                                           46.75

Benchmark Non-credible Web site Evaluated               Score
KoreanNewsSite                                           7.46

Relative Credibility Scale:
46.75 = Very-Credible
 7.46 = Non-credible
SURVEY FINDINGS, CREDIBLE ENOUGH FOR INTELLIGENCE USE
As discussed in the methodology chapter, having a relative scale is
useful from an academic perspective; however, to be of practical use, the
analyst must also know what the target or required level of credibility is for
a source he would like to use in an intelligence product. The required level of
credibility for intelligence sources was determined by survey questions 9a-9f,
which asked:90
How credible must an intelligence source be to use its data in the
following intelligence products?
7) No Opinion
6) 100 percent Credible
5) 75 percent Credible
4) 50 percent Credible
3) 25 percent Credible
2) 10 percent Credible
1) 0 percent Credible
9a. Research, or topic summaries
9b. Current, day-to-day developments
9c. Estimative, identifies trends or forecasts opportunities or threats
9d. Operational, tailored, focused to support an activity
9e. Scientific, or technical, in-depth, focused assessments
9f. Warning, an alert to take action
90 Survey, questions 9a-9f.
The following calculations were used to determine the product-
credibility level for six types of intelligence products. The mode was
calculated for survey questions 9a-9f. The mode is the most-often chosen
required level of source credibility. The statistics indicate that most analysts
believe that all types of intelligence products require that sources be 75
percent credible.91 This was a surprise because the author expected to see a
greater variance in the required levels of source credibility, with warning
intelligence requiring the least credibility and in-depth focused assessments
requiring the greatest level of credibility. This presumption was based on the
belief that analysts require less information about an imminent threat than
they do about a future scientific or political condition, because the potential
impact of ignoring the least threat is so much greater than ignoring the most
significant emerging scientific or political condition. Apparently, most
analysts do not understand the relationship of intelligence products to
outcomes, or the survey question was flawed.
However, using the survey results, the sources of all intelligence
products should be 75 percent credible. If the most credible Web sites have a
relative-credibility score of 46.75 as demonstrated above, then sources for
intelligence products should score 75 percent of that, which is 35.06.
Therefore, the target-credibility level of any intelligence source is 35.06, as
evaluated by the recommended credibility criteria. The following table shows
the most-often chosen (mode) required credibility level for intelligence products.
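The arithmetic above can be checked with a short sketch. The benchmark score, the 75 percent requirement, and the two example site scores come from the survey results and benchmark evaluations reported in this thesis; the function name and the simple pass-or-fail comparison are assumptions made for illustration only.

```python
BENCHMARK_CREDIBLE = 46.75  # average score of the three credible benchmark sites
REQUIRED_FRACTION = 0.75    # most-often chosen answer to survey questions 9a-9f

# Target-credibility level for any intelligence source.
target = round(BENCHMARK_CREDIBLE * REQUIRED_FRACTION, 2)


def meets_target(score, target_score=target):
    """Return True when a site's relative credibility score reaches the target."""
    return score >= target_score


print(target)               # 35.06
print(meets_target(43.19))  # True  (lowest-scoring credible benchmark site)
print(meets_target(7.46))   # False (KoreanNewsSite)
```

A score near the target would still call for analyst judgment; the comparison only shows where a site falls on the relative scale.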
91 See Table 2.
Table 2. Questions 9a-f. Required Level of Source Credibility for
Intelligence Products.92

                                                 Number of Cases    Required Credibility
                                                 Valid  Missing(b)  Mode (percent)  Range (percent)
9a. Research, special topic summaries            35     31          50(a)           0-100
9b. Current, day-to-day developments             35     31          75              0-100
9c. Estimative, identifies trends or forecasts   35     31          75              0-100
    opportunities or threats
9d. Operational, tailored, focused, to support a 35     31          75              0-100
    military, intelligence, or diplomatic activity
9e. Scientific or technical, in-depth, focused   35     31          75              0-100
    assessments of trends or capabilities
9f. Warning, an alert to take action             35     31          75              0-100
Required-credibility level for all Intelligence                     75
    Product Sources

(a) Multiple modes exist. The smallest value is shown. Just as many
respondents chose 75 percent.
(b) Missing responses are primarily because non-Intelligence Community
personnel were not asked these questions in the survey. Mode is based on
valid responses.
SURVEY FINDINGS, OFFICIAL CREDIBILITY CRITERIA
Question 5 asked, "Does your organization have official criteria that
you are told to use for determining the credibility of any source? 'Any source'
means published, proprietary, and classified sources."93 The purpose of this
question was to determine if analysts are aware of credibility criteria that
they can use to ensure a consistent quality of reporting. The assumption
92 Survey, questions 9a-9f.
93 Survey, question 5.
here is that only criteria formally sanctioned by the organization are likely to
be consistently followed. As the table below indicates, 86.2 percent of
analysts are eith