ICT Project on Text Transcription of Technical Video Lectures and
Creation of Video Searchable Index,
Metadata and Online Quizzes
Status Report – up to September 30, 2010
Project duration: April 2009 to April 2011
Proposal
Under the National Mission on Education through ICT
Submitted to: The Additional Secretary (TEL), Department of Higher Education, Ministry of Human Resource Development, Shastri Bhavan, New Delhi
1. Original project proposal which was approved and financial sanction
accorded:
Objectives of the project:
1. The project proposes to produce transcript files for all video lectures produced in NPTEL Phase I in a
phased manner, about 1000 files every six months. So far about 2200 lectures have been transcribed
manually. These transcripts provide English speech data with sufficient variation in pronunciation among
Indian English speakers to train machines for automatic speech-to-text transcription with improved
accuracy (about 60 percent or so). When completed, the project is expected to have generated enough
data for a large number of NPTEL Phase II video lectures to be text transcribed using automatic means.
2. Every transcribed lecture is being time coded and indexed using technologies available on the net and
standard text books. This will enable the user to point to a particular video segment through a single
keyword or a rapid keyword-based search across all of the 4600 or so hours of recorded material. When
completed, this will also be the first time anywhere in technical education that such facilities and
standard video-based metadata are available on the internet for the entire science and engineering
curriculum.
3. The availability of complete text material for technology courses will enable the lectures to be subtitled
in English, Hindi and other languages where sufficient technical vocabulary exists. This will help
non-native English speakers and students in rural colleges throughout India to learn the concepts through
a partial or complete translation of the spoken content into their first language.
4. Every course with video lectures will be accompanied by online quizzes to help the learner
understand the material in a focused manner. Such quizzes can also provide the basis for future
University examinations adopting NPTEL contents.
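The keyword search over time-coded transcripts described in objective 2 can be sketched as a small inverted index. This is only an illustration under stated assumptions: the segment tuples, lecture identifiers and function names below are hypothetical, and the report does not describe the indexing software actually used by the project.

```python
import re
from collections import defaultdict

def build_index(segments):
    """Map each lowercase word to the (lecture_id, start_seconds) pairs where it occurs.

    `segments` is an iterable of (lecture_id, start_seconds, text) tuples,
    a hypothetical representation of a time-coded transcript.
    """
    index = defaultdict(list)
    for lecture_id, start_seconds, text in segments:
        # Use a set so a word repeated within one segment is recorded once.
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].append((lecture_id, start_seconds))
    return index

def search(index, keyword):
    """Return the sorted video positions at which the keyword is spoken."""
    return sorted(index.get(keyword.lower(), []))

# Made-up sample data, for illustration only.
SEGMENTS = [
    ("lecture-01", 0.0, "Today we introduce the Fourier transform"),
    ("lecture-01", 95.5, "The transform of a periodic signal"),
    ("lecture-02", 30.0, "Stress and strain in a beam"),
]
INDEX = build_index(SEGMENTS)

for lecture_id, start in search(INDEX, "transform"):
    print(f"{lecture_id} at {start:.1f} s")
```

Each hit pairs a lecture with a start time, which is what a player would need in order to seek directly to the matching video segment.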
Budget Proposed:
BUDGET ESTIMATES: SUMMARY
(Budget in Rupees)
ITEM | 1st Year | 2nd Year | Total
A. Recurring
1. Salaries/wages | Rs. 89,40,000 | Rs. 89,40,000 | Rs. 1,78,80,000
2. Consumables | Rs. 60,000 | Rs. 60,000 | Rs. 1,20,000
3. Travel | Rs. 2,00,000 | Rs. 2,00,000 | Rs. 4,00,000
4. Other costs | Rs. 2,50,000 | Rs. 2,50,000 | Rs. 5,00,000
B. Equipment | | | Rs. 1,26,00,000
Grand total (A+B) | | | Rs. 3,15,00,000
BUDGET FOR SALARIES/WAGES:
(Budget in Rupees)
Designation & number of persons | Monthly Emoluments | 1st Year (m.m.*) | 2nd Year (m.m.) | Total (m.m.)
A programmer/System Analyst with three to five years of experience for two years | Rs. 25,000 | Rs. 3,00,000 | Rs. 3,00,000 | Rs. 6,00,000
Six Project Associates for each branch of Engineering/Core Science video program two years | Rs. 15,000 | Rs. 10,80,000 | Rs. 10,80,000 | Rs. 21,60,000
One attendant | Rs. 5,000 | Rs. 60,000 | Rs. 60,000 | Rs. 1,20,000
Honoraria payments to faculty for authenticating course contents (120 courses) | Rs. 25,000 per course | Rs. 15,00,000 | Rs. 15,00,000 | Rs. 30,00,000
Outsourcing transcription for 5000 video hours to produce text and time-line indexed videos | Rs. 2,400 per lecture | Rs. 60,00,000 | Rs. 60,00,000 | Rs. 1,20,00,000
Total | | | | Rs. 1,78,80,000
BUDGET FOR CONSUMABLE MATERIALS:
(Budget in Rupees)
ITEM DESCRIPTION | 1st Year | 2nd Year | Total
Other consumables | Rs. 60,000 | Rs. 60,000 | Rs. 1,20,000
Total | | | Rs. 1,20,000
BUDGET FOR TRAVEL:
(Budget in Rupees)
ITEM DESCRIPTION | 1st Year | 2nd Year | Total
Travel (only inland travel): Coordinator travel | Rs. 2,00,000 | Rs. 2,00,000 | Rs. 4,00,000
Travel abroad (specify details) | | |
BUDGET FOR OTHER COSTS / CONTINGENCIES:
(Budget in Rupees)
Sl. No. | ITEM DESCRIPTION | 1st Year | 2nd Year | Total
1 | One Workshop with about fifty participants to understand the technology and provide feedback on the text transcription procedures and content costs | Rs. 2,50,000 | Rs. 2,50,000 | Rs. 5,00,000
BUDGET FOR EQUIPMENT (Computers, peripherals contingency, consumables):
Sl. No. | Generic name of the Equipment along with make & model | Imported/Indigenous | Estimated Costs (in Foreign Currency also)* | Spare time for other users (in %)
1 | Two high speed scanners | | Rs. 50,000 |
2 | A high quality laser printer | | Rs. 50,000 |
3 | Contingency expenses for stationeries, other computer peripherals, Eight Desktop servers for the Associates | | Rs. 5,00,000 |
4 | High end server with 32-64 processors, adequate memory, storage and warranty for three years, for providing access to all the contents and video lectures through the servers for national access | | Rs. 1,20,00,000 (Rs. 120 lakhs) |
Total | | | Rs. 1,26,00,000 |
Note on item 4: A formal quotation from one of the vendors is attached along with the proposal as a basis for arriving at this limit.
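The budget tables above can be cross-checked by rolling the line items up into the head totals and the grand total. The sketch below copies the figures from the tables; the dictionary keys and variable names are illustrative, and the outsourcing line assumes one lecture per video hour, which the table implies (5000 video hours at Rs. 2,400 per lecture) but does not state.

```python
# Two-year salary/wage items in rupees, derived from the monthly or unit rates.
SALARY_ITEMS = {
    "programmer_system_analyst": 25_000 * 12 * 2,
    "six_project_associates": 6 * 15_000 * 12 * 2,
    "attendant": 5_000 * 12 * 2,
    "faculty_honoraria": 120 * 25_000,          # 120 courses at Rs. 25,000 each
    "outsourced_transcription": 5_000 * 2_400,  # assumes one lecture per video hour
}

# Other recurring heads, two-year totals in rupees.
OTHER_RECURRING = {
    "consumables": 120_000,
    "travel": 400_000,
    "other_costs": 500_000,
}

# Equipment items in rupees.
EQUIPMENT_ITEMS = {
    "two_high_speed_scanners": 50_000,
    "laser_printer": 50_000,
    "contingency_and_desktops": 500_000,
    "high_end_server": 12_000_000,
}

salaries_total = sum(SALARY_ITEMS.values())
recurring_total = salaries_total + sum(OTHER_RECURRING.values())
equipment_total = sum(EQUIPMENT_ITEMS.values())
grand_total = recurring_total + equipment_total

print(f"Salaries/wages: Rs. {salaries_total:,}")
print(f"Equipment:      Rs. {equipment_total:,}")
print(f"Grand total:    Rs. {grand_total:,}")
```

The computed totals agree with the figures stated in the tables: Rs. 1,78,80,000 for salaries/wages, Rs. 1,26,00,000 for equipment and Rs. 3,15,00,000 overall.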
2. Funds received so far and utilized:
Financial Statement for the period ending: 30/09/2011
Name of the Co-ordinator : Mangala Sunder K
Title of Project : ICT project on Text transcription of technical video lectures and creation of searchable
video index, metadata and online quizzes.
Project No. : CCE0910005MHRDKMAN
START DATE : 27/04/2009
CLOSE DATE : 31/03/2012
Budget Head | Budget Allocation | Expenditure | Balance Commitments | Expenditure Inclusive of Commitments
Staff | 17880000.00 | 9463521.00 | 0.00 | 9463521.00
Equipment | 12600000.00 | 10751477.00 | 0.00 | 10751477.00
Consumables | 120000.00 | 54386.00 | 0.00 | 54386.00
Contingencies | 0.00 | 43069.00 | 0.00 | 43069.00
Travel | 400000.00 | 254878.00 | 0.00 | 254878.00
Components | 0.00 | 0.00 | 0.00 | 0.00
Inst. Overhead | 0.00 | 0.00 | 0.00 | 0.00
Others | 500000.00 | 275752.00 | 0.00 | 275752.00
Total | 31500000.00 | 20843083.00 | 0.00 | 20843083.00
A. Total grant received up to the end of this month : 36100545.00
B. Expenditure incurred up to the end of this month : 20843083.00
C. Balance commitment at the end of this month : 0.00
D. Total expenditure + Commitment : 20843083.00
E. Balance of Funds Available [A - (B + C)] : 15257462.00
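The balance in item E follows directly from the formula A - (B + C). A short check using the figures from the statement is sketched below; the function and variable names are illustrative and not part of the statement.

```python
def balance_available(grant_received, expenditure, commitments):
    """Balance of funds available, computed as A - (B + C)."""
    return grant_received - (expenditure + commitments)

# Head-wise expenditure in rupees, copied from the financial statement.
EXPENDITURE_BY_HEAD = {
    "Staff": 9_463_521,
    "Equipment": 10_751_477,
    "Consumables": 54_386,
    "Contingencies": 43_069,
    "Travel": 254_878,
    "Components": 0,
    "Inst. Overhead": 0,
    "Others": 275_752,
}

total_expenditure = sum(EXPENDITURE_BY_HEAD.values())   # item B
balance = balance_available(
    grant_received=36_100_545,                          # item A
    expenditure=total_expenditure,
    commitments=0,                                      # item C
)
print(f"B = {total_expenditure}, E = {balance}")
```

The head-wise figures sum to the stated expenditure of 20843083.00, and the formula reproduces the stated balance of 15257462.00.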
3. Milestones achieved and activities completed:
Summary of activities proposed and current status:
Activities proposed:
Accurately transcribe and certify text files with video images of all lectures from 4,000 hours of
video lectures. Approximately 92,000 print pages (A4) will be made available for online access.
The text files will be certified by the faculty who developed the video courses. This will enable
viewers to browse through authenticated text contents of 4,000 hours of engineering lessons in
video and search for specific topics with the help of powerful search engines.
Text transcription of video will be done semi-automatically by engaging private companies in and
around Chennai and using the expertise available in speech recognition technology at IIT Madras.
To enable this, the video lectures will be transcribed first using private agencies.
Current status:
Lectures Transcribed – 2214 hours (495 hours Edited)
List of total no. of transcribed lectures uploaded on the website to date:
S.No. | Department | Lectures (Hours)
1 | Basic Courses | 147
2 | Civil Engineering | 387
3 | Computer Science & Engineering | 689
4 | Electrical Engineering | 298
5 | Electronics & Communication Engineering | 348
6 | Mechanical Engineering | 289
7 | Biotechnology | 56
Total | | 2214
Summary
Sl. No | Activity | Timeline
1 | Text transcription of the first 100 video lectures on a trial basis | September 2009 (status: completed)
2 | Editing of 100 transcribed lectures to provide a master copy of the transcribed lecture for faculty review | November 2009 (status: completed)
3 | Training of the machine-based transcription process with data generated from the first 100 lectures by several different speakers | June 2010 (status: ongoing)
4 | Purchase of the video streaming server: 64 nodes (eight dual quad-core processors) and 200 TB memory for storing all 5000 hours, for various video encoding experiments and for hosting Sakshat contents as a mirror | October 2009 (status: completed)
5 | Faculty feedback on the first 100 lectures and indexing for searching across the videos | March 2010 (status: ongoing)
6 | Creation of edited transcripts for 1000 video lectures and online quizzes for video lecture courses | June 2010 (status: more than 2200 hours of lectures have been transcribed already and about 2200 of them have been uploaded on the trial website; the editing process is ongoing and quizzes are being added)
7 | Activity 6 to be completed for all of the 4000 or so hours from NPTEL Phase I, in increments of 1500 lectures every six months | March 2011 (status: ongoing)
8 | Editing and indexing of all video lectures with subtitles in English | To be completed by June 2011 (status: ongoing)
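Activity 8 depends on turning time-coded transcript segments into subtitles. The sketch below shows one way such segments might be written out in the widely used SubRip (SRT) layout. The report does not state which subtitle format the project uses, so the choice of SRT, the sample segments and the function names are all assumptions made for illustration.

```python
def format_timestamp(seconds):
    """Render a time in seconds as the HH:MM:SS,mmm form used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def to_srt(segments):
    """Convert (start_seconds, end_seconds, text) tuples into SRT text.

    Each cue is a running number, a time range line and the spoken text;
    cues are separated by a blank line.
    """
    cues = []
    for number, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        cues.append(f"{number}\n{time_range}\n{text}\n")
    return "\n".join(cues)

# Made-up sample segments, for illustration only.
SAMPLE = [
    (0.0, 4.2, "Welcome to this lecture."),
    (4.2, 9.0, "Today we discuss heat transfer."),
]
print(to_srt(SAMPLE))
```

Because the subtitle text is kept separate from the timing, a translated transcript could reuse the same time ranges for subtitles in another language.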
4. Expenditure details (more than Rs. 1 lakh) and Website URL
Details of expenditure for the period 01/04/2009 to 31/12/2010
Expenditure head: EQPT
C2356 | 24/06/2009 | NEOTERIC INFOMATIQUE LTD. | APPLE MACBOOK PRO-15 | CCE20091004SPLX | CCE005008 36201 | 172952.00
C4617 | 25/08/2009 | THE COMMISSIONER OF CUSTO | CLEARANCE CHARGES | CCE20091018SPIX | CCE005029 39361 | 1039577.00
B0514 | 15/09/2009 | M/S REDINGTON DISTRIBUTIO | M5000 SERVER WITH+-- | CCE20091018SPIX | CCE005029 46/09 | 6948433.00
B0972 | 11/02/2010 | M/S STRATUS TECHNOLOGIES | FT SERVER 2600+BANK | CCE20091040SPIX | | 878002.00
C0691 | 10/03/2010 | SRI BALAJI ENTERPRISES. | COMPAQ 3170L CIRE 2 | CCE20091060SPLX | | 510000.00
C1404 | 29/03/2010 | SBA INFO SOLUTIONS PVT | ACER systems | CCE20091059SPLX | | 151042.00
C0742 | 07/05/2010 | HCL INFOSYSTEMS LIMITED | HCL systems | CCE20091058SPLX | | 117585.00
Website URL
The expenditure details are also available on the website at:
URL : http://textofvideo.nptel.iitm.ac.in/text/expenditure.php
User ID : mech
Password : mmm
In the project proposal, it was suggested that a large part of the funds be allocated for outsourcing the
text transcription process. However, so far only one company in Chennai has been found capable of
transcribing technical video lectures, and even its rate of processing is quite slow owing to the
non-availability of suitable manpower. Therefore, in the interests of completing the project on time, the
coordinator, in consultation with the project office in IIT Madras, has set up a cell inside IITM and has
hired, on an ad hoc basis, a number of engineering and science graduates to do the transcription in-house.
The coordinator/s may kindly be permitted by the Mission to continue this practice, if necessary, for the
duration of the project within the hiring norms of IIT Madras and MHRD.
5. Details of software created or contents generated:
A website was designed for this text transcription project. It comprises static pages such as Home, About Us,
Mission, Technical Details, Objectives, FAQ and Contact Us, together with a module listing the courses.
The website URL is http://textofvideo.nptel.iitm.ac.in/
This site is currently password protected:
user id : mech
password : mmm
Home page :
Course page:
2262 hours of lectures in mp3 and 2214 hours in pdf formats have been uploaded.
Creation of online quiz and self-evaluation by students in these courses through the Internet.
Creation of a thesaurus of Indian pronunciation of technical terms as a suitable database for
future research in speech-to-text translation using AI and other search algorithms.
Extensive indexing of videos for enabling search tools to search through the video.
Sample model developed for indexing of videos:
The creation of text files to act as catalysts in the design and development of digital and online
text books in engineering by the faculty.
6. Infrastructure facilities :
Photos of some of the equipment purchased:
M 5000 Servers
Apple Mac Book PRO15
M 5000 Servers – Front view & NAS box
Suggested plan of action for utilization of outcome expected from the project
Indian Institute of Technology Madras has an excellent web and video studio where some part of the
activities related to metadata creation, thesaurus, RSS feeds, Wiki Development, online quizzes can be
carried out with the help of suitably trained engineering and science graduates. It is proposed to employ
about five to ten associates for this purpose for the entire duration of the project.
Text transcription of videos will be done semi-automatically by engaging a private company as well as an
in-house transcription team trained by the PI and using the expertise available in speech recognition
technology at IIT Madras. To enable this, the video lectures are being first transcribed in a raw format
capturing all spoken text and pauses (for accurate machine reading later). The transcribed data is then
passed on to the research team to train an Indian English Speech recognition system. The data from
future recordings will be transcribed using the bootstrapped speech recognition system. Transcription is
one of the most expensive activities and will utilize about 40 percent of the project costs for approximately
4600 video hours. To establish a working model at the end of this project, and to ensure that faculty who
delivered the video lectures are available for authenticating the transcript, a small token honorarium is
proposed to all the faculty who will authenticate the text created by the project team. An extensive index
of keywords and technical terms will also be created with their help. A thesaurus of these words will be
generated for use with a speech-to-text transcription system such as Dragon NaturallySpeaking (or any
other appropriate software). This will enable the development of a fully automatic transcription program
for the future, when video recordings are made as part of the second and third phases of the NPTEL
programme. A future application could also be the development of audio indexing tools and standalone
audio tracks of the video lectures. Together with the transcribed and edited text slides and quizzes, the
audio can provide a low-cost substitute for bandwidth-intensive video transmission.
A national video server is currently being set up at IIT Madras (see the pictures of the server and the
storage in the earlier pages) to permit concurrent, uninterrupted access to videos, text and search indices
by several thousand users at any time. Past experience shows that such activities need scalable processors
in a cluster containing at least 32 nodes, and preferably 64 nodes. At IIT Madras, we have set up two
M 5000 servers with 64 nodes each.