03-60-214 computer languages, grammars, and...

38
03-60-214 Computer Languages, Grammars, and Translators Jianguo Lu School of Computer Science University of Windsor

Upload: vuongtram

Post on 13-Jun-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

03-60-214ComputerLanguages,Grammars,andTranslators

JianguoLuSchoolofComputerScience

UniversityofWindsor

17-01-10 2

Instructors

–  ProfessorJianguoLu•  Office:LambtonTower5111•  Phone:519-253-3000ext3786•  Email:jluatuwindsor•  Web:hUp://cs.uwindsor.ca/~jlu/214

•  GAs

17-01-10 3

CoursedescripZon•  Computerlanguages,grammars,andtranslators•  Prerequisite:60-100,03-60-212

–  AssignmentswillbeimplementedinJava.•  ObjecZve

–  Knowledgeofcomputerlanguagesandgrammars–  AbletoanalyzeprogramswriUeninvariouslanguages–  Abletotranslatelanguages

•  Contents–  Regularexpressions,finiteautomataandlanguagerecognizers;–  Contextfreegrammar;–  Languagesparsers.

•  Soawaretoolsused–  Programminglanguage:Java(includingtokenizer,regularexpressionpackage)–  Lexicalanalyzer:JLex,–  Parsergenerator:JavaCup

17-01-10 4

WhatisLanguage

•  Language:“anysystemofformalizedsymbols,signs,etc.,usedorconceivedasameansofcommunicaZon.”–  Communicate:totransmitorexchangethoughtorknowledge.

•  Programminglanguage:communicatebetweenapersonandamachine–  Programminglanguageisanintermediary

thought Languages machine

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 5

Hierarchyof(programming)languages

•  Machinelanguage;•  Assemblylanguage:mnemonicversionofmachinecode;•  Highlevellanguage:Java,C#,Pascal;•  Problemoriented;•  Naturallanguage.

thought

Languages machine

Natural Language

High Level Language

Assembly Language

Machine Language

Problem Oriented Language

Closer to humans

Higher level

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 6

Grammar

•  Grammar:thesetofstructuralrulesthatgovernthecomposiZonofsentences,phrases,andwordsinanygivennaturallanguage.--wikipedia

•  Formalgrammar:rulesforformingstringsinaformallanguages

•  Computerlanguagegrammar:rulesforformingtokens,statements,andprograms.

•  Differentlayersofgrammar:–  Regulargrammar(forwords,tokens)–  Contextfreegrammar(forsentences,programs)–  …

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 7

LanguageTranslators

•  Translator:Translateonelanguageintoanotherlanguage(e.g.,fromC++toJava)–  Agenericterm.

•  Forhighlevelprogramminglanguages(suchasjava,C):–  Compiler:translatehighlevelprogramminglanguagecodeintohost

machine’sassemblycodeandexecutethetranslatedprogramatrun-Zme.

–  Interpreter:processthesourceprogramanddataatthesameZme.Noequivalentassemblycodeisgenerated.

•  Assembler:translateanassemblylanguagetomachinecode.

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 8

CompilerandInterpreter

•  Compiler

•  Interpreter

Source Code

Compile Execute Results Object Code

data

Interpret

data

Results Source Code

Compile time Execute time

Compile and run time

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 9

Howdoesacompilerwork

•  Acompilerperformsitstaskinthesamewayhowahumanapproachesthesameproblem

•  Considerthefollowingsentence:“Writeatranslator”

•  Weallunderstandwhatitmeans.Buthowdowearriveattheconclusion?

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 10

Theprocessofunderstandingasentence

1.  Recognizecharacters(alphabet,mathemaZcalsymbols,punctuaZons).–  16explicit(alphabets),2implicit(blanks)

2.  GroupcharactersintologicalenZZes(words).–  3words.–  Lexicalanalysis

3.  Checkthewordsformastructurallycorrectsentence–  “translatorawrite”isnotacorrectsentence–  SyntacZcanalysis

4.  CheckthatthecombinaZonofwordsmakesense–  “digatranslator”isasyntacZcallycorrectsentence–  SemanZcanalysis

5.  Planwhatyouhavetodotoaccomplishthetask–  CodegeneraZon

6.  Executeit.

“Writeatranslator”

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 11

Thestructure(phases)ofacompiler

syntax analyzer

Source code

error handler

Lexical analyzer

improve code

symbol table

generate code

object code

Synthesis Synthesis Analysis

•  Frontend(analysis):dependonsourcelanguage,independentonmachine–  Thisiswhatwewillfocus(mainlytheblueparts).

•  Backend(synthesis):dependentonmachineandintermediatecode,independentofsourcecode.

03 60 214: Computer Languages, Grammars, and Translators

semantic analyzer

17-01-10 12

03 60 214: Computer Languages, Grammars, and Translators

17-01-10 13

Assignmentsoverview

•  Ourfocusisthefrontend–  AutomatedgeneraZonoflexicalanalyzer–  AutomatedgeneraZonofsyntaxanalyzer

syntax Analyzer

Assignment 3

Source code

Lexical Analyzer

Assignment 2

translation Assignment 4

17-01-10 14

Assignments(28%)

•  Assignment1(warmup):RegularexpressioninJava(5%)–  UseStringTokenizerinJDKtotokenizethestrings.–  Useregularexpressionstomatchstrings–  Youwillseethedifficultytoanalyseprogramswithoutadvancedtools

suchasJlexandJavaCup.•  Assignment2(6%)

–  UseJLextobuildalexicalanalyzerforZnyprogram•  Assignment3(6%)

–  Manuallywritearecursivedescendentparsing–  UseJavaCuptogenerateaparser

•  Assignment4(6%)–  TranslatetheZnyprogramtoJavaandactuallyrunit.

•  Assignment5(5%)–  Manuallywritearecursivedescendentparsing

17-01-10 15

Whythiscourse

•  Everyuniversityoffersthistypeofcourses.•  Skillslearnt

–  writeaparser–  processprograms–  re-engineerandmigrateprograms

•  MigratefromC++toC#•  …

–  processdata•  Xml,weblogs,socialnetworks,…

17-01-10 16

Whythiscourse(cont.)

•  TheoreZcalaspectsofprogramming•  Thescienceofdevelopingalargeprogram

–  Nothandcraatheprogram

•  Howto–  definewhetheraprogramisvalid–  Determinewhetheraprogramisvalid–  Generatetheprogram

17-01-10 17

Coursematerials•  Referencebooks(notrequired)

–  Compilers:Principles,Techniques,andTools(2ndEdiZon)byAlfredV.Aho,MonicaS.Lam,RaviSethi,andJeffreyD.Ullman(Aug31,2006)

–  OrA.V.Aho,R.Sethi,andJ.D.Ullman,Compilers:Principles,Techniques,andTools,Addison-Wesley,1988.(Chapter1-5)

–  JohnR.Levine,TonyMason,andDougBrown,Lex&Yacc,O'Reilly&Associates,1992.

•  Onlinemanual–  JavaCup,www.cs.princeton.edu/~appel/modern/

java/CUP/–  JLex,www.cs.princeton.edu/~appel/modern/java/

JLex/

17-01-10 18

Markingscheme

Exams 72% Midterm1 12%

Midterm2 20%

Final 40%

Assignments 28% assignment1 5%assignment2 6%assignment3 6%assignment4 6%

Assignment5

5%

Total 100% 100%

17-01-10 19

Assignments(28%)•  Assignmentsubmission•  Allassignmentsmustbecompletedindividually.

–  AlltheassignmentswillbecheckedbyacopyingdetecZonsystem.

•  Academicdishonesty–  Discussionwithotherstudentsmustbelimitedtogeneraldiscussionof

theproblem,andmustneverinvolveexamininganotherstudent'ssourcecodeorrevealingyoursourcecodetoanotherstudent.

17-01-10 20

Exams(72%)•  Twomidtermexams•  Finalexam•  Closebookexams•  Examscovertopicsinlectures

–  ClassaUendanceisimportant.

•  Examswillcovertopicsinassignments–  Finishingassignmentsisalsoimportant.

•  Whatifyoumissedexam(s)–  Amissedexamwillresultinamarkofzero.Theonlyvalidexcusefor

missinganexamisadocumentedmedicalemergency.

17-01-10 21

StudentMedicalCerAficate[1]

FacultyofSCIENCE

A. TOBECOMPLETEDBYTHESTUDENT:I,____________________________,herebyauthorizeDr.______________________________toprovidethefollowinginformaZontotheUniversityofWindsor

and,ifrequired,tosupplyaddiZonalinformaZontosupportmyrequestforspecialacademicconsideraZonformedicalreasons.MypersonalinformaZonisbeingcollectedundertheauthorityoftheUniversityofWindsorAct1962andwillbeusedforadministraZveandacademicrecord-keeping,academicintegritypurposes,andtheprovisionofservicestostudents.ForquesZonsinconnecZonwiththecollecZonofthisinformaZon,theAssociateDeanofmyFacultymaybecontactedat519-253-3000.

______________________________________________________________________Signature StudentNo. Date

B. TOBECOMPLETEDBYTHEPHYSICIAN:1. IherebycerZfythatIprovidedhealthcareservicestotheabove-namedstudenton

_________________________________________.(insertdate(s)studentseeninyouroffice/clinic)2. ThestudentcouldnotreasonablybeexpectedtocompleteacademicresponsibiliZesforthefollowingreason(inbroadterms):____________________________________________________________________________3. Thisisanacute/chronicproblemforthisstudent.4. Date(s)duringwhichstudentclaimstohavebeenaffectedbythisproblem:

___________________________________________________________________________________

5. UnabletocompleteacademicresponsibiliZesfor: 24hours 2days 3days 4days 5days Other(pleaseindicate)_________________________

6. IfthestudentispermiUedtoconZnuehis/hercourseofstudy,isthemedicalproblemlikelytorecurand

affecthis/herstudiesagain?Yes NoReason:___________________________________________________________________________PHYSICIANVERIFICATIONName:(pleaseprint)_____________________________ RegistraZonNo.________________________Signature:______________________________________ TelephoneNo._________________________Address:_________________________________________________________________________________(stamp,businesscard,orleUerhead

acceptable)PLEASERETAINCOPYFORTHEPATIENT’SCHART.Note:CostofcerAficatetobepaidbystudent.

[1]Thisformhasbeenadapted,withpermission,fromtheUniversityofWindsorFacultyofLawStudentMedicalCerZficateandtheUniversityofWesternOntarioStudentMedicalCerZficate.

17-01-10 22

Labs

•  YoucanaUendallthelabsecZons•  Labsaremainlyusedtohelpyouwiththeassignments

17-01-10 23

InteracZonwithprofessor

•  Duringlectures•  Duringlabs•  Duringofficehours:Wednesday1:00-3:00•  Emails:jluatuwindsor

–  Subjectlinemuststartwith“214”–  Example:Subject:214--Aboutassignment1–  Mailswithoutpropersubjectmaynotberead(andhencenot

answered)–  AUachdetailederrormessages–  Writeyournameintheemail

17-01-10 24

Webcontents

•  Courseplan;•  Slidesforlectures;•  AssignmentdescripZons;•  Linkstotools,manuals,tutorials;•  Listofmarks;•  Announcements;

17-01-10 25

Importantnote

•  PleasenotethatnostudentisallowedtotakeacoursemorethantwoZmeswithoutpermissionfromtheDean.

FrontdeskjobatLeddyLibrary

•  [email protected]•  StarZngnextweek

17-01-10 26

IntroducZontogrammar

JianguoLuSchoolofComputerScience

UniversityofWindsor

17-01-10 28

FormaldefiniZonoflanguage

•  Alanguageisasetofstrings–  Englishlanguage{“thebrowndoglikesagoodcar”,……}

{sentence|sentencewriUeninEnglish}–  Javalanguage{program|programwriUeninJava}–  HTMLlanguage{document|documentwriUeninHTML}

•  Howdoyoudefinealanguage?•  Itisunlikelythatyoucanenumerateallthesentences,

programs,ordocuments

17-01-10 29

Howtodefinealanguage•  HowtodefineEnglish

–  Asetofwords,suchasbrown,dog,like–  Asetofrules

•  Asentenceconsistsofasubject,averb,andanobject;•  ThesubjectconsistsofanopZonalarZcle,followedbyanopZonaladjecZve,and

followedbyanoun;•  ……

–  Moreformally:•  Words={a,the,brown,friendly,good,book,refrigerator,dog,car,sings,eats,likes}•  Rules:

1)  SENTENCEàSUBJECTVERBOBJECT2)  SUBJECTàARTICLEADJECTIVENOUN3)  OBEJCTàARTICLEADJECTIVENOUN4)  ARTICLEàa|the|EMPTY5)  ADJECTIVEàbrown|friendly|good|EMPTY6)  NOUNàbook|refrigerator|dog|car7)  VERBàsings|eats|likes

17-01-10 30

DerivaZonofasentence

•  Rules:1)  SENTENCEàSUBJECTVERBOBJECT2)  SUBJECTàARTICLEADJECTIVENOUN3)  OBEJCTàARTICLEADJECTIVENOUN4)  ARTICLEàa|the|EMPTY5)  ADJECTIVEàbrown|friendly|good|EMPTY6)  NOUNàbook|refrigerator|dog|car7)  VERBàsings|eats|likes

•  DerivaZonofasentence“thebrowndoglikesagoodcar”SENTENCEàSUBJECTVERBOBJECTàARTICLEADJECTIVENOUNVERBOBJECTàthebrowndogVERBOBJECTàthebrowndoglikesARTICLEADJECTIVENOUNàthebrowndoglikesagoodcar

17-01-10 31

Theparsetreeofthesentence

The

VERB SUBJECT OBJECT

SENTENCE

ARTICLE ADJ NOUN ARTICLE ADJ NOUN

brown dog likes a good car

Parse the sentence: “the brown dog likes a good car” The top-down approach

17-01-10 32

TopdownandboUomupparsing

The

VERB SUBJECT OBJECT

SENTENCE

ARTICLE ADJ NOUN ARTICLE ADJ NOUN

brown dog likes a good car

17-01-10 33

Typesofparsers

•  Topdown–  Repeatedlyrewritethestartsymbol–  Findthelea-mostderivaZonoftheinputstring–  Easytoimplement

•  BoUomup–  Startwiththetokensandcombinethemtoforminteriornodesofthe

parsetree–  Findaright-mostderivaZonoftheinputstring–  Acceptwhenthestartsymbolisreached

•  BoUomupismoreprevalent

17-01-10 34

FormaldefiniZonofgrammar

•  Agrammarisa4-tupleG=(Σ,N,P,S)–  Σisafinitesetofterminalsymbols;–  Nisafinitesetofnonterminalsymbols;–  PisasetofproducZons;–  S(fromN)isthestartsymbol.

•  TheEnglishsentenceexample–  Σ={a,the,brown,friendly,good,book,refrigerator,dog,car,sings,

eats,likes}–  N={SENTENCE,SUBJECT,VERB,NOUN,OBJECT,ADJECTIVE,ARTICLE}–  S={SENTENCE}–  P={rule1)torule7)}

17-01-10 35

RecursivedefiniZon

•  Numberofsentencecanbegenerated:ARTICLE ADJ NOUN VERB ARTICLE ADJ NOUN sentences

3* 4* 4* 3* 3* 4* 4* =6912

•  Howcanwedefineaninfinitelanguagewithafinitesetofwordsandfinitesetofrules?

•  Usingrecursiverules:–  SUBJECT/OBJECTcanhavemorethanoneadjecZves:

1)  SUBJECTàARTICLEADJECTIVESNOUN2)  OBEJCTàARTICLEADJECTIVESNOUN3)  ADJECTIVESàADJECTIVE|ADJECTIVESADJETIVE

–  Examplesentence:“thegoodbrowndoglikesagoodfriendlybook”

17-01-10 36

Chomskyhierarchy•  NoamChomskyhierarchyisbasedontheformofproducZonrules•  Generalform

α1 α2 α3 …αn à β1 β2 β3 … βm Whereαandβarefromterminalsandnonterminals,orempty.

•  Level3:Regulargrammar–  Oftheformα à β or α à β1 β2 –  n=1,andαisanonterminal.–  β iseitheraterminaloraterminalfollowedbyanonterminal–  RHScontainsatmostonenon-terminalattherightend.

•  Level2:Contextfreegrammar–  Oftheformα àβ1β2β3…βm

–  α isnonterminal. •  Level1:ContextsensiZvegrammar

–  n<m.Thenumberofsymbolsonthelhsmustnotexceedthenumberofsymbolsontherhs

•  Level0:unrestrictedgrammar

17-01-10 37

ContextsensiZvegrammar

•  CalledcontextsensiZvebecauseyoucanconstructthegrammaroftheform –  AαB à A β B –  AαC à A γ B

•  ThesubsZtuZonofα dependingonthesurroundingcontextAandBorAandC.

17-01-10 38

Review

•  Languages•  Languagetranslators

–  Compiler,interpreter–  Lexicalanalysis–  Parser

•  TopdownandboUomup

•  Grammars–  FormaldefiniZon,Chomskyhierarchy–  Regulargrammar–  Contextfreegrammar