![Page 1: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/1.jpg)
![Page 2: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/2.jpg)
The New Library of Alexandria Overview
Bibliotheca Alexandrina (BA)
![Page 3: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/3.jpg)
- Center of excellence in the production and dissemination of knowledge
- Place of dialogue, learning and understanding between cultures and peoples
![Page 4: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/4.jpg)
- The World’s Window on Egypt
- Egypt’s Window on the World
- Instrument for Rising to the Challenges of the Digital Age
- Center for Dialogue Between Peoples and Civilizations
![Page 5: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/5.jpg)
Not just a Library of Books but rather a vast cultural and scientific complex
![Page 6: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/6.jpg)
A library that can accommodate millions of books
![Page 7: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/7.jpg)
http://archive.bibalex.org
![Page 8: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/8.jpg)
![Page 9: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/9.jpg)
![Page 10: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/10.jpg)
![Page 11: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/11.jpg)
![Page 12: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/12.jpg)
![Page 13: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/13.jpg)
![Page 14: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/14.jpg)
![Page 15: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/15.jpg)
http://descegy.bibalex.org
![Page 16: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/16.jpg)
http://lartarab.bibalex.org
![Page 17: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/17.jpg)
More than 230,000 Arabic books are freely available online for Arabic
readers worldwide
![Page 18: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/18.jpg)
http://suezcanal.bibalex.org
![Page 19: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/19.jpg)
![Page 20: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/20.jpg)
http://naguib.bibalex.org/
![Page 21: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/21.jpg)
http://nasser.bibalex.org
![Page 22: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/22.jpg)
http://sadat.bibalex.org
![Page 23: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/23.jpg)
![Page 24: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/24.jpg)
- Project Overview
- Collection Overview
- Data Representation
- System Workflow
  - DAF (Digital Assets Factory)
  - Cataloguing
  - Website
    - Solr search engine
    - Article Viewer
![Page 25: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/25.jpg)
![Page 26: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/26.jpg)
- The Centre for Economic, Judicial, and Social Study and Documentation (CEDEJ) collaborated with Bibliotheca Alexandrina (BA) to digitize its large archive of press articles
- The project consists of multiple modules to:
  - Index the press archive collection
  - Control the data entry workflow
  - Digitize and process data
  - Catalogue and review articles
  - Publish the archive on the web
![Page 27: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/27.jpg)
![Page 28: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/28.jpg)
- Package of press archive
  - 800,000+ press clips varying between
    - Press
    - Reports
  - 500+ publishers
  - 60,000+ writers and reporters
  - 200 different subjects
    - Economics, politics, social life, etc.
  - Archive languages:
    - Arabic, English and French
  - Date range from 1966 to 2009
![Page 29: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/29.jpg)
- Finished so far
  - 115,000 press clips varying between
    - Press
    - Reports
  - 200 publishers
  - 14,000 writers and reporters
  - 100 different subjects
    - Economics, politics, social life, etc.
  - Archive languages:
    - Arabic, English and French
  - Date range from 1966 to 2009
![Page 30: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/30.jpg)
![Page 31: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/31.jpg)
- A list of the packaged press archive is submitted to Bibliotheca Alexandrina to be scanned and catalogued
- The source of data is a collection of boxes
- Each box is organized in the following hierarchy:
  - Folder
  - File
  - Sub-File
  - Document
- A document represents a single page of press material
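
The box hierarchy above can be modelled as nested containers. The following is only a sketch of one possible representation: the class names mirror the slide, but the attribute names (`name`, `page_id`) and the path format are assumptions made for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Document:
    """A single scanned page of press material."""
    page_id: str  # hypothetical identifier, not from the original slides


@dataclass
class SubFile:
    name: str
    documents: List[Document] = field(default_factory=list)


@dataclass
class File:
    name: str
    sub_files: List[SubFile] = field(default_factory=list)


@dataclass
class Folder:
    name: str
    files: List[File] = field(default_factory=list)


@dataclass
class Box:
    name: str
    folders: List[Folder] = field(default_factory=list)

    def iter_documents(self) -> Iterator[str]:
        """Yield a slash-separated path for every document in the box."""
        for folder in self.folders:
            for file in folder.files:
                for sub in file.sub_files:
                    for doc in sub.documents:
                        yield "/".join(
                            [self.name, folder.name, file.name, sub.name, doc.page_id]
                        )
```

Walking a box with `iter_documents()` gives each page a stable location in the hierarchy, which is the kind of key an indexing module could use to track scanning progress.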
![Page 32: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/32.jpg)
![Page 33: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/33.jpg)
![Page 34: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/34.jpg)
![Page 35: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/35.jpg)
![Page 36: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/36.jpg)
![Page 37: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/37.jpg)
![Page 38: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/38.jpg)
![Page 39: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/39.jpg)
Article Creation
![Page 40: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/40.jpg)
Article Metadata
![Page 41: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/41.jpg)
Lookups Management
![Page 42: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/42.jpg)
Reports
![Page 43: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/43.jpg)
![Page 44: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/44.jpg)
![Page 45: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/45.jpg)
![Page 46: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/46.jpg)
- Based on the Apache Lucene project, v4.1
- The SolrNet API is used to connect to the Solr server
- Features
  - Simple/advanced search
  - Results highlighting
  - Field autocomplete
  - Text search (Article Viewer)
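
To illustrate the search and highlighting features listed above, here is a minimal sketch of building a request to Solr's standard `/select` handler with highlighting enabled. The project itself connects through SolrNet; this Python version only shows the query parameters involved. The base URL, the core name `articles`, and the field name `content` are assumptions, not details from the slides.

```python
from urllib.parse import urlencode


def build_search_url(base_url, text, highlight_field="content", rows=10):
    """Build a Solr /select URL for a phrase search with result highlighting.

    `highlight_field` is a hypothetical field name; the real schema is not
    described in the slides.
    """
    # Escape backslashes and double quotes so the phrase stays well-formed.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": f'{highlight_field}:"{escaped}"',  # phrase query on one field
        "rows": rows,                           # number of results to return
        "wt": "json",                           # ask for a JSON response
        "hl": "true",                           # turn highlighting on
        "hl.fl": highlight_field,               # field to build snippets from
    }
    return f"{base_url}/select?{urlencode(params)}"
```

For example, `build_search_url("http://localhost:8983/solr/articles", "Suez Canal")` produces a URL whose response would carry a `highlighting` section alongside the matching documents.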
![Page 47: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/47.jpg)
![Page 48: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/48.jpg)
![Page 49: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/49.jpg)
![Page 50: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/50.jpg)
![Page 51: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/51.jpg)
![Page 52: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/52.jpg)
![Page 53: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/53.jpg)
![Page 54: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/54.jpg)
- The article viewer is used for previewing articles
  - It is one of multiple viewers developed at BA
- Architecture
  - Server side: RESTful services
  - Client side: JavaScript using JSONP
- Features
  - Image preview
  - Metadata preview
  - Text selection
  - Searching/highlighting
  - Zooming options: fit width/height
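
Since the client talks to the RESTful services through JSONP, each response has to be wrapped in a call to a client-supplied callback. The sketch below shows that wrapping step in general terms; the function name `to_jsonp` and the sample payload keys are hypothetical, and the actual BA services may work differently.

```python
import json
import re

# Accept only plain JavaScript identifiers, optionally dotted (e.g. "viewer.onLoad"),
# so the callback parameter cannot be used to inject arbitrary script.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*")


def to_jsonp(callback, payload):
    """Wrap a JSON-serializable payload in a JSONP callback invocation."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback name")
    return f"{callback}({json.dumps(payload)});"
```

The browser loads the service URL through a `<script>` tag, and the returned text runs as a call to the named function with the data as its argument.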
![Page 55: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/55.jpg)
- Viewer Web Services
  - Metadata Web Service:
    - Retrieve article catalogue metadata
    - Return technical information (width, height, page count, ...)
  - Content Web Service:
    - Retrieve the image of each page in the article, scaled to a custom width and height
    - Return the selected text based on the user-highlighted area
  - Search Web Service:
    - Search the content of the articles using the Solr engine APIs
    - Highlight the matching phrases in the article image
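
The scaling and text-selection behaviour of the Content Web Service can be sketched as simple geometry. The code below is an illustration under assumptions: it supposes that each page has word bounding boxes in original page coordinates (for example from OCR), which the slides do not state, and all function names are hypothetical.

```python
def fit_scale(page_w, page_h, box_w=None, box_h=None):
    """Scale factor for fit-width, fit-height, or both (the smaller factor wins)."""
    scales = []
    if box_w:
        scales.append(box_w / page_w)
    if box_h:
        scales.append(box_h / page_h)
    return min(scales) if scales else 1.0


def to_page_coords(rect, scale):
    """Map an (x, y, w, h) rectangle drawn on the scaled image back to page coordinates."""
    x, y, w, h = rect
    return (x / scale, y / scale, w / scale, h / scale)


def words_in_rect(words, rect):
    """Return the text of every word whose box intersects the rectangle.

    `words` is a list of (text, x, y, w, h) tuples in page coordinates.
    """
    rx, ry, rw, rh = rect
    selected = []
    for text, x, y, w, h in words:
        if x < rx + rw and x + w > rx and y < ry + rh and y + h > ry:
            selected.append(text)
    return selected
```

A selection drawn in the browser is first converted with `to_page_coords`, using the same factor that was applied when the image was scaled, and then matched against the word boxes.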
![Page 56: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/56.jpg)
![Page 57: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/57.jpg)
![Page 58: Managing the Digitization of Large Press Archives](https://reader038.vdocuments.us/reader038/viewer/2022103018/558e1b741a28abcf5b8b463a/html5/thumbnails/58.jpg)