
  • संगणक आणि मराठी (Computer and Marathi)

    UNDP defines human development as expanding the choices for all people in society. This ensures the creation of an enabling environment in which all can enjoy long, healthy and creative lives. IT can be used to promote education, learning, communication and networking, but neither technology nor products can be taken off the shelf from other countries, because the conditions, needs and languages are radically different.

    Given the present financial, technical and linguistic constraints, there is an urgent need to improve the conditions for equitable and affordable access to computers so that the benefits of IT are distributed across all sections of society.

    It is said that "no man is an island". In the same way, in my view, the days of stand-alone computers are over. Computers all over the world are networked to form the Internet. This new, dynamic, flexible and powerful medium is dominated by one language. According to a survey, 85% of the content on the Internet is in English, while less than 5% of people speak English as their first language.

    Collaborative content development for Marathi: The need

    We Indians are proud of Eastern civilization, but we are in great danger of losing the advantage our rich civilization offers if we don't preserve our cultural heritage.

    Language and culture go hand in hand. Currently, Indian software professionals are busy making unbelievable progress on all fronts of computerisation all over the world, but as far as our own cultural repository is concerned, our contribution is naganya (insignificant).

    Marathi and Computer

    Alka Irani, Chief Investigator IndiXII, Senior Research Scientist, C-DAC, Mumbai

    with implementation support from Swapnil Hajare et al of janabhaaratii (funded by TDIL, MC&T, Govt of India)

    C-DAC, Mumbai (formerly NCST)

    Marathi, computer and Internet

    Marathi is a language with rich heritage and abundant literature. For most of the people in rural Maharashtra (educated or uneducated) this is the only language of communication. Therefore, when one thinks of using computers for public health, rural development, education, e-governance and media, it is not enough to enable computers to read, write and print Marathi text. The interactions with the computer, the help material and, most importantly, the contents shared among community members also need to be in Marathi and in the script used by Marathi, i.e. Devanagari.

    We can see quite a few websites in Marathi. Most of the Marathi newspapers, such as Loksatta and Maharashtra Times, have their e-copy on the Internet.

    A question to ponder is how many Marathi-speaking people are using computers for interaction (chat) in Marathi or for sending e-mail in Marathi.

    Education and Computer

    The conventional education system is unable to cope with the rapidly changing information needs of the education domain. The vast knowledge base available in foreign languages can reach the masses only if it is made available in local languages. Currently, educators are unequipped to handle the rapid pace at which the information world is moving. If the current model of teaching is to be sustained, continuous input to the educators' knowledge and a continuous collaborative learning mechanism must be available to them. Getting familiar with this new medium will make their job easy and enjoyable. Once they have their lessons neatly organised and stored, it will not be necessary to go through the process of preparing presentations all over again. They can make incremental changes. Many of the mundane day-to-day jobs, like keeping track of students, their activities and their progress, can be minimised. Email is a cheap and affordable medium for information exchange, while web pages can be used for publishing as well as for bulletin boards.

    Computer, local languages and Employment opportunities:

    Ample employment opportunities are available; the need is to bring local languages like Marathi successfully to computers. Many Government offices are confronted with the task of working in Indian languages. The nationwide project PURA (Providing Urban amenities for Rural Areas), an initiative of our President and great visionary Abdul Kalam, is already working in that direction. If competent people (teachers/students) are gathered and resources are provided, educational contents for rural settings can be developed using local skills. Contents will be understood much better when they are written by people with a similar cultural background (peer learning). Local newspapers, magazines and TV channels need a lot of content to be input, edited and produced in local languages.

    Computer and contents

    The computer is an ideal medium for content management. With the drastic reduction in the cost of disk drives and the invention of many affordable, flexible, compact storage devices like pen drives, contents can be stored efficiently in a compact format, in a way that they can be accessed anytime, anywhere.

    Some desirable characteristics of contents are:

    - Contents should be in standardized formats. For a collaborative framework to succeed, the methods and conventions used must be stable, universal and scalable.

    - Multiple ways of organising, viewing and browsing data must be possible.

    - Multiple layers (that can be merged or separated as needed), building various hierarchies and abstractions, must be possible.

    - Contents should be attractive to view (education should be fun). The use of multimedia makes this possible.

    - Content management systems should have provision for entering and retrieving contents for people with disabilities. This means visual or speech interfaces should be possible.

    - Content units should be distributed, yet fairly independent, so that content developers can work without the computer network when it is not accessible.

    The Scenario

    Many stand-alone packages for bi-lingual text input exist. The two important landmark achievements of the 80s from C-DAC are:

    - Rupantar (Software solution)

    The rupantar phonetic coding scheme was designed during the early 1980s, at a time when there was no Devanagari text processor available for inputting text; the PC revolution had not yet caught on and most of the work was done on main machines of Digital at NCST.

    It was designed during the development of the transliteration software Swaroop and then Rupantar, which were basically computer-based systems for converting names written in English script into their Devanagari equivalents.

    The scheme was used to transliterate the entire English telephone directory into Marathi, sort it and make it ready for printing; it was also used for printing cheques, certificates, etc.
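The idea of converting names written in English script into Devanagari can be illustrated with a minimal sketch. The mapping table and the function names below are hypothetical and purely illustrative; they are not the actual rupantar phonetic coding scheme, whose details are not given here. The sketch assumes a greedy longest-match over Roman letter groups, uses vowel signs after a consonant, and joins consecutive consonants with a virama.

```python
# Hypothetical Roman-to-Devanagari table (an assumption for illustration,
# NOT the actual rupantar coding scheme).
CONSONANTS = {
    "kh": "ख", "k": "क", "g": "ग", "j": "ज", "t": "त", "d": "द", "n": "न",
    "p": "प", "m": "म", "r": "र", "l": "ल", "v": "व", "s": "स", "h": "ह",
}
# Each vowel maps to (independent form, dependent sign used after a consonant).
VOWELS = {
    "aa": ("आ", "ा"), "ee": ("ई", "ी"), "oo": ("ऊ", "ू"),
    "a": ("अ", ""), "i": ("इ", "ि"), "u": ("उ", "ु"),
    "e": ("ए", "े"), "o": ("ओ", "ो"),
}
VIRAMA = "\u094d"  # suppresses the inherent vowel between two consonants


def longest_key(table, text, pos):
    """Return the longest key of `table` found at text[pos:], or None."""
    for length in (2, 1):
        key = text[pos:pos + length]
        if key in table:
            return key
    return None


def transliterate(word):
    """Convert a Roman-script word to Devanagari using the toy tables above."""
    word = word.lower()
    out = []
    pos = 0
    after_consonant = False  # True while the last consonant has no vowel yet
    while pos < len(word):
        cons = longest_key(CONSONANTS, word, pos)
        if cons is not None:
            if after_consonant:
                out.append(VIRAMA)  # consonant cluster: join with virama
            out.append(CONSONANTS[cons])
            after_consonant = True
            pos += len(cons)
            continue
        vowel = longest_key(VOWELS, word, pos)
        if vowel is not None:
            independent, sign = VOWELS[vowel]
            out.append(sign if after_consonant else independent)
            after_consonant = False
            pos += len(vowel)
            continue
        out.append(word[pos])  # pass unknown characters through unchanged
        after_consonant = False
        pos += 1
    return "".join(out)


for name in ("raam", "kamal", "indra"):
    print(name, "->", transliterate(name))
```

A word-final consonant keeps its inherent vowel in this sketch, which is a simplification; a production system would need a far richer table and rules for names whose English spelling is ambiguous.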

    - GIST (Hardware Solution)

    The Graphic and Intelligence based Script Technology (GIST) group of C-DAC, Pune introduced the GIST card for inputting Indian characters on the DOS platform.

    The GIST card has been the cornerstone of some of the most crucial computerization programs of India, such as the Land Records Program, the Election Commission identity cards, and citizen surveys.

    - Standardisation

    Fundamentally, computers just deal with numbers. They store letters and other characters by assigning a number to each one; this assignment is called the encoding of the character. Currently there are two standards for Devanagari encoding on computers.

    - ISCII

    ISCII is an Indian national standard for character encoding which has been revised many times. The ISCII-91 code retains the standard ASCII codes and uses the upper half of the 8-bit range (the codes above 127) for Indian scripts. This makes it feasible to use Indian scripts alongside English on computers and software in an 8-bit environment.

    The ISCII code table is a superset of all the characters required in the 10 Brahmi-based Indian scripts, including Devanagari. These scripts share a large number of structural features as a consequence of their common Brahmi origin.
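The coexistence of English and an Indian script in one 8-bit stream can be sketched as follows. The function name `split_runs` and the sample byte values are assumptions chosen for illustration; the upper-half bytes are arbitrary placeholders and are not claimed to be real ISCII assignments. The sketch only shows that a program can tell the two ranges apart by looking at each byte's value.

```python
def split_runs(data: bytes):
    """Split 8-bit data into runs of ASCII bytes (< 0x80) and upper-half bytes."""
    runs = []
    for value in data:
        kind = "ascii" if value < 0x80 else "upper"
        if runs and runs[-1][0] == kind:
            runs[-1][1].append(value)  # extend the current run
        else:
            runs.append((kind, bytearray([value])))  # start a new run
    return [(kind, bytes(chunk)) for kind, chunk in runs]


# Placeholder upper-half bytes stand in for Indian-script characters.
sample = b"Name: " + bytes([0xB3, 0xCC]) + b" (ok)"
for kind, chunk in split_runs(sample):
    print(kind, chunk)
```

Because the lower half is untouched, software that only understands ASCII can still read the English portions of such a stream.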

    - Unicode: character encoding

    Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers. No single encoding could contain enough characters. These encoding systems also conflicted with one another.

    Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. The Unicode Standard has been adopted by Apple, HP, IBM, Microsoft, Oracle, SAP, Sun, Sybase, Unisys etc. Unicode is required by modern standards such as XML and Java. It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends.
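As a small illustration of the "unique number for every character" idea, the sketch below lists the Unicode code point and official character name for each character of the word "Marathi" written in Devanagari. It uses only Python's standard `unicodedata` module; the helper name `describe` is an assumption made for this example.

```python
import unicodedata

# The word "Marathi" in Devanagari, written with escapes so the code points are explicit.
word = "\u092e\u0930\u093e\u0920\u0940"


def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(word):
    print(code_point, name)
```

The same numbers identify these characters on any platform or program that supports Unicode, which is what lets text move between systems unchanged.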

    Incorporating Unicode into client-server or multi-tiered applications and websites offers significant cost savings over the use of legacy character sets. Unicode enables a single software product or a single website to be targeted across multiple platforms, languages and countries without re-engineering. It allows data to be transported through many different systems without corruption.

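    (The Unicode and ISCII standards differ. ISCII is an 8-bit character encoding, whereas Unicode defines a much larger code space, with code points up to U+10FFFF, stored using encoding forms such as UTF-8, UTF-16 or UTF-32. The Devanagari block of Unicode was based on ISCII-88.) The latest ISCII standard is ISCII-91.

    However, even today many standalone packages and text processors for inputting and printing text use their own encodings.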
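To make the size comparison concrete, the following sketch counts the bytes needed to store the same Devanagari word in three standard Unicode encoding forms. The helper name `encoded_sizes` is an assumption; the codec names are Python's built-in ones, and the little-endian variants are used so that no byte-order mark is added to the count.

```python
# The word "Marathi" in Devanagari (five code points).
word = "\u092e\u0930\u093e\u0920\u0940"


def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each encoding form."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


print(encoded_sizes(word))       # Devanagari: 3, 2 and 4 bytes per character
print(encoded_sizes("Marathi"))  # ASCII letters: 1, 2 and 4 bytes per character
```

An 8-bit scheme would store each of these characters in a single byte, which is the trade-off against Unicode's ability to cover all scripts in one code space.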
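The practical cost of packages using their own encodings is that bytes written under one encoding become unreadable when interpreted under another. A minimal sketch, using two standard Python codecs as stand-ins (the variable names are illustrative assumptions):

```python
# The word "Marathi" in Devanagari.
original = "\u092e\u0930\u093e\u0920\u0940"

# Bytes as they would be stored or sent by a program that uses UTF-8.
wire = original.encode("utf-8")

# A receiver that wrongly assumes Latin-1 gets one junk character per byte.
garbled = wire.decode("latin-1")

# A receiver that agrees on the encoding recovers the text exactly.
recovered = wire.decode("utf-8")

print(repr(garbled))
print(recovered == original)
```

Text produced by legacy font-specific encodings suffers in the same way: without knowing the exact mapping the package used, the stored bytes cannot be turned back into Marathi.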

    What was done to enhance quality, inclusiveness and diversity at NCST (now C-DAC Mumbai)

    The two recent projects at NCST (now C-DAC, Mumbai) are a step in the right direction to make Indian language processing available to the masses on computers and the Internet.

    1. IndiX: enabling the medium for reading/writing Marathi

    The basic IndiX agenda is that text processing should be as easy as it is for an English user. The technological challenge addressed by IndiX is to identify the minimal, logical and required changes in Indic text process