Production and Distribution of Electronic News: Technology and Preservation Challenges
Megan Waters [email protected]
June 27, 2013
Global Resources Roundtable: Beyond the Fold: Access to News in the Digital Era
Newberry Library, Chicago, IL
Background
2005‐2009: Director of Information Services at The Miami Herald Media Company
– News research
– Intranet, editorial, & archiving systems
– Print & digital archives
– Content licensing & syndication
Preserving News in the Digital Environment: Mapping the Newspaper Industry in Transition. A report from the Center for Research Libraries, April 27, 2011.
Print Trends*
[Chart, 2004–2009: Total Daily Print Circulation, Total Daily Newspapers, Sunday Circulation, Classified Ad Revenue; vertical axis 0 to 70,000]
Source: Newspaper Association of America (NAA) Trends & Numbers, http://www.naa.org/en/Trends‐and‐Numbers.aspx
Composition of Newspaper Media Total Revenue Stream*
Category            2011   2012
Print Newspaper     49%    46%
Circulation         26%    27%
Digital Ads         10%    11%
Niche Mktg.          8%     8%
New Revenue          7%     8%
Source: NAA American Newspaper Media Industry Revenue Profile 2012
Recent NAA Data
“[Library access and] preservation activities are built around the print newspaper, with its tangible outputs and regular publishing cycles. Electronic distribution requires other, more complex preservation models.”
CRL Mapping the Newspaper Industry, April 27, 2011, p. 42.
CRL Mapping the News Industry Report
Purpose & Methodology
• Outcome of 2009 National Digital Information Infrastructure & Preservation Program (NDIIPP) workshop at LoC.
• Explore strategies for collecting & preserving digital news on a national basis, including:
– digital newspaper web sites
– TV and radio broadcasts distributed via the Internet
– blogs
– podcasts
– digital photos and videos
• Examine, analyze, and document workflows at 4 newspapers representing a broad segment of the U.S. newspaper industry (e.g. local, regional, national).
• Also includes analysis of The New York Times, Investor’s Business Daily, and the Associated Press.
Key Questions from NDIIPP
• What is the nature of the electronic facsimile?
– Not http://www.miamiherald.com/ but http://www.heraldsubscriptions.com/mhdigital/
• How profound is the relationship between Web and print news “coverage”?
• What is the technical formatting and delivery of electronic news outputs?
Findings
• There are many actors & systems used to produce news content; licensed content is brought into news pages or sites, and locally‐produced content is licensed “out” to a plethora of syndication services and vendors
• Standards are employed & rich metadata of interest to libraries exists for some types or levels of content in certain enterprise news systems
• Print editions function as fixed, static records of interest to local constituencies whereas news web sites function as regional portals to a much larger and more dynamic realm of changeable content
• The proliferation of platforms, devices, and distribution networks used to read or access the news further complicates our ability to fully archive the user experience
• Common methods for Web archiving like those employed by the Internet Archive are insufficient due to the rate of change and site architecture of common news web sites, among other factors
Actors & Content
[Diagram. Column headings: Decision‐making; Content sources; File output. Labels: Editors; Publisher; Parent Company; Wire services; Staff writers; Freelancers; Photo agency; Ads; Text; Art/Graphics; Photos; Video; Audio; Tabular data; Databases]
Typical News Organization
[Org chart. Top level: Publisher. Second level: Production; Business; Editorial; Advertising; Circulation. Third level: Desks; Newsroom technology; Library & Archives. Fourth level: Sports, Local, International, etc.; Photo; Graphics & Illustrations; Copy; Page Layout]
Actors within the Org Chart
[Diagram. Labels: Media company; Newspaper chain; Publisher; Interactive division; Independent/freelance content; Wire services & photo agencies; Ad & data services; Syndicates & other content providers; Advertising agencies & businesses]
Systems Overview
[Diagram. Labels: Input; Editorial System; Pagination System; Output to printers, efacsimile; Web Production System; Output to Web hosts, servers; Archive; Third‐Party System; Output to Web hosts, servers]
Print/Web Content Overlap
• Online content, including entire stories, is often absent from the print edition
• Print content is often absent online
• The timing of publication varies between print and Web
• Multiple versions of articles appear online, but rarely in print
Brief News Standards Timeline
1965
• IPTC founded
1970‐80s
• IPTC 7901/ANPA‐1312 is a 7‐bit news agency text markup specification
• Still in use!
1990s
• Binary formats
• Information Interchange Model (IIM), first multimedia news exchange format, aka “IPTC Fields”
• News Industry Text Format (NITF) XML specification
2000s
• NewsML
• IPTC Core Extension for use with Adobe XMP for photo metadata
• News Codes controlled vocabularies/taxonomies
2010s
• NewsML‐G2
• Newscodes.org for linking
• rNews emerging standard for embedding structured metadata into HTML documents
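One reason XML formats such as NITF matter for preservation is that the headline, byline, and body text remain separately addressable instead of being fused into a page image. The following is a minimal sketch of that idea, not a description of any newspaper's actual system. The sample document is a simplified, hypothetical fragment written for illustration: its element names follow NITF conventions, but it has not been validated against the NITF schema, and `first_text` and `extract_fields` are assumed helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified NITF-style article used only for illustration.
# It has NOT been validated against the NITF schema.
SAMPLE_NITF = """<nitf>
  <head>
    <title>City council approves budget</title>
  </head>
  <body>
    <body.head>
      <hedline><hl1>City council approves budget</hl1></hedline>
      <byline>By A. Reporter</byline>
    </body.head>
    <body.content>
      <p>First paragraph of the story.</p>
      <p>Second paragraph of the story.</p>
    </body.content>
  </body>
</nitf>"""


def first_text(root, tag):
    """Return the stripped text of the first element named `tag`, or ''."""
    for element in root.iter(tag):
        return (element.text or "").strip()
    return ""


def extract_fields(nitf_xml):
    """Pull headline, byline, and body paragraphs out of an NITF-style string."""
    root = ET.fromstring(nitf_xml)
    return {
        "headline": first_text(root, "hl1"),
        "byline": first_text(root, "byline"),
        "paragraphs": [(p.text or "").strip() for p in root.iter("p")],
    }


if __name__ == "__main__":
    fields = extract_fields(SAMPLE_NITF)
    print(fields["headline"])
    print(fields["byline"])
    print(len(fields["paragraphs"]), "paragraphs")
```

Because each field is extracted by element name, an archive could index or migrate them independently, which is not possible once the content has been flattened into a page facsimile.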
Submission & Ingest of Content
Photo & Multimedia Flow
User Web Submission
Pagination to eFacsimile
Text‐Specific Flow Out to Aggregators
News Web Site Architecture*
• Complex content management systems drive site architecture & content
• Site architecture is dense with 105 “gateway” pages
• Much subnavigation out of the gate: 5% of pages are linked to/from +1 gateway page
• 366,290 links monitored; average 735 per day added
• Most were advertising‐driven
• Highly dynamic, unpredictable over time: No master list of articles tied to original gateway hierarchy
• Roughly 39% are linked for 1 day, 5% for 30 minutes or less
[Chart, Site Content: Ads, 83%; Tribune, 11%; Other, 6%]
Leetaru, Kalev. Chicago Tribune Content Velocity Analysis, CRL Mapping the Newspaper Industry, April 27, 2011, pages 66‐74.
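The kind of measurement behind link-lifetime figures like these can be sketched as follows. This is an illustration only, not the method used in the Content Velocity Analysis: the crawl records, the function names `link_lifetimes` and `share_short_lived`, and the sample data are all hypothetical. The sketch assumes that each crawl of the gateway pages records a timestamp and the set of links seen, and it estimates how long each link stayed visible.

```python
from datetime import datetime, timedelta


def link_lifetimes(crawls):
    """Map each URL to the time between its first and last sighting.

    `crawls` is an iterable of (timestamp, set_of_urls) pairs.
    A link seen in only one crawl gets a lifetime of zero, which is a
    lower bound: the real lifetime is hidden by the crawl interval.
    """
    first_seen = {}
    last_seen = {}
    for timestamp, urls in sorted(crawls, key=lambda crawl: crawl[0]):
        for url in urls:
            first_seen.setdefault(url, timestamp)
            last_seen[url] = timestamp
    return {url: last_seen[url] - first_seen[url] for url in first_seen}


def share_short_lived(lifetimes, threshold):
    """Fraction of links whose observed lifetime is at most `threshold`."""
    if not lifetimes:
        return 0.0
    short = sum(1 for span in lifetimes.values() if span <= threshold)
    return short / len(lifetimes)


if __name__ == "__main__":
    # Hypothetical observations from three crawls of one gateway page.
    start = datetime(2011, 1, 1, 8, 0)
    crawls = [
        (start, {"/a", "/b", "/ad1"}),
        (start + timedelta(minutes=30), {"/a", "/b"}),
        (start + timedelta(hours=2), {"/a", "/c"}),
    ]
    lifetimes = link_lifetimes(crawls)
    print(share_short_lived(lifetimes, timedelta(minutes=30)))
```

The crawl interval sets the resolution of any such estimate: a crawler that visits once a day cannot observe links that appear and disappear within 30 minutes.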
Characteristics by System/Process
Editorial
• Tied to newsroom processes & practices (assignments, story folders)
• Highly standards‐based depending on system, staffing
• Often includes high resolution or raw original files
• Includes unpublished material
Archive
• Can be tied to newsroom processes or considered “outside”/unimportant to producing the news
• Highly standards‐based
• May include “enhancing” of data
• Important as the record of everything published, rights management, licensing, but not “how we got the story”
Characteristics by System/Process
Pagination
• Tied to production print and ereader processes
• Highly standardized but oriented towards print publication needs
• Component parts are fused together, treated as a whole (e.g. the page)
• Richer file types and metadata are stripped out
Web
• Content comes from many sources and systems
• Less standardized
• Uses a diversity of file types which may or may not be merged
• Platform & content licensing may occur at corporate level
• Richer file types and metadata are stripped out
• Metadata added is SEO‐oriented
Reading Experience
• Users may access content under an umbrella brand like The Chicago Tribune via an Internet browser, a third‐party provider/site such as LexisNexis or Twitter, or e‐reader devices
• The content they see may or may not be an efacsimile of the print edition; it may also be limited to certain licensed material (e.g. staff‐written stories only) or to certain content types like text or video
• Web‐based content is lower resolution and has less metadata attached
• With print advertising dropping and readership trends changing, more resources will flow to online operations and revenue streams
Where is the “high impact point of entry” for libraries in this flow?
[Diagram. Labels: Input; Editorial System; Pagination System; Output to printers, efacsimile; Web Production System; Output to Web hosts, servers; Archive; Third‐Party System; Output to Web hosts, servers]
Thank You!
Megan Waters [email protected]
http://www.linkedin.com/in/meaux/