Download - Compiler Khata
8/10/2019 Compiler Khata
Compiler Design
Problem No.   Name of the problem
01. Write a C program for developing a lexical analyzer (LA) that will eliminate white spaces from a source program in C and collect numbers.
02. Write a C program for developing a lexical analyzer (LA) that will eliminate white spaces from a source program in C and collect numbers as tokens and then also display the token value as attribute.
03. Write a C program for developing a lexical analyzer (LA) that will recognize all basic data types of C.
04. Write a C program for developing a lexical analyzer (LA) that will recognize all keywords of C.
05. Write a C program for developing a lexical analyzer (LA) that will eliminate white spaces and comments from a C program.
06. Write a C program for developing a lexical analyzer (LA) that will recognize variables of a C source program.
07. Write a C program for developing a lexical analyzer (LA) that will generate tokens for a given statement of a C source program.
08. Design a compiler front-end based on the syntax-directed translation technique that will function as an infix translator for a language consisting of a sequence of expressions terminated by semicolons.
Problem No.01
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will eliminate
white spaces from a source program in C and collect numbers.
Problem analysis:
Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the
characters in the assignment statement
position := initial + rate * 60
would be grouped into the following tokens:
1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60
The blanks separating the characters of these tokens would normally be eliminated during lexical analysis.
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig. (a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
Fig. (a): Interaction of lexical analyzer with parser (boxes: source program, lexical analyzer, parser, symbol table).
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may keep track of the number of newline characters seen, so that a line number
can be associated with an error message.
The purpose of the lexical analyzer is to allow white space and numbers to appear within
expressions.
Fig. (b): Implementing the interaction between the lexical analyzer and the source program. lexan() uses getchar() to read a character, pushes back c using ungetc(c, stdin), and returns the token to its caller.
Figure (b) suggests how the lexical analyzer, written as the function lexan in C, implements this
interaction. The routines getchar and ungetc from the standard include file <stdio.h> take care of
input buffering; lexan reads and pushes back input characters by calling the routines getchar and
ungetc, respectively. With c declared to be a character, the pair of statements
c = getchar(); ungetc(c, stdin);
leaves the input stream undisturbed. The call of getchar assigns the next input character to c;
the call of ungetc pushes back the value of c onto the standard input stdin.
If the implementation language does not allow data structures to be returned from functions,
then tokens and their attributes have to be passed separately. The function lexan returns an integer
encoding of a token. A token, such as num, can then be encoded by an integer larger than
any integer encoding a character, say 256. We define this with the statement:
#define NUM 256
The function lexan returns NUM when a sequence of digits is seen in the input. A global
variable tokenval is set to the value of the sequence of digits. Thus, if a 7 is followed
immediately by a 6 in the input, tokenval is assigned the integer value 76.
Code
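#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int t, f = 0;   /* t: current character (int, so EOF is representable); f: previous character */
    int n;
    FILE *f1, *f2;

    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL)
    {
        printf("cannot open input or output file\n");
        return 1;
    }
    while ((t = getc(f1)) != EOF)
    {
        if (t == ' ')
            ;                           /* eliminate blanks */
        else if (isdigit(t) && f != '_')
        {
            /* The test on f is cut off after "if(65" in the source. It is
               reconstructed here, as an assumption, as "the previous
               character is a letter or a digit of an identifier". */
            if (isalnum(f))
                putc(t, f2);            /* digit inside a name such as f1 */
            else
            {
                n = 0;
                while (isdigit(t))      /* collect the number */
                {
                    putc(t, f2);
                    n = n * 10 + (t - '0');
                    t = getc(f1);
                }
                printf("%d\n", n);
                if (t != EOF && t != ' ')
                    putc(t, f2);        /* character that ended the number */
            }
        }
        else
            putc(t, f2);
        f = t;
    }
    fclose(f1);
    fclose(f2);
    return 0;
}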
INPUT:
void main(){
FILE *f1,*f2;long int a;
char c[100];f1=fopen ("testinput.cpp","r");f2=fopen("testoutput.cpp","w");
while(fscanf(f1,"%s",c)!=EOF) /* reading value from file */{int line=1;
if(c[0]=='\n'){fprintf(f2,"\n",);line++;}
else if(!isdigit(c[0]))
fprintf(f2,"%s",c);else if(isdigit(c[0])
{a=c[0]-'10';int i=1;
j=120;
while(isdigit(c[i])){a=a*10+c[i]-'0';i++;}
printf("Number %ld in line no. %d\n",a,line);
}}}
OUTPUT:
voidmain()
{
FILE*f1,*f2;longinta;charc[100];
f1=fopen("testinput.cpp","r");f2=fopen("testoutput.cpp","w");while(fscanf(f1,"%s",c)!=EOF)/*readingvaluefromfile*/{
intline=Num(1);if(c[0]=='\n'){fprintf(f2,"\n",);line++;}
elseif(!isdigit(c[0]))fprintf(f2,"%s",c);
elseif(isdigit(c[0]){a=c[0]-'Num(10)';
inti=Num(1);j=Num(120);
while(isdigit(c[i])){a=a*Num(10)+c[i]-'Num(0)';
i++;}
printf("Number%ldinlineno.%d\n",a,line);}}}
Result and Discussion:
This program has been written in C/C++ language and successfully eliminate white space from
a source program and collect number as Num.
Problem No.02
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will eliminate white
spaces from a source program in C and collect numbers as token and then also
display the token and token value attribute.
Problem analysis:
Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the
characters in the assignment statement
position := initial + rate * 60
would be grouped into the following tokens:
1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60
The blanks separating the characters of these tokens would normally be eliminated during
lexical analysis.
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig. (a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
Fig. (a): Interaction of lexical analyzer with parser (boxes: source program, lexical analyzer, parser, symbol table).
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may keep track of the number of newline characters seen, so that a line number
can be associated with an error message.
Tokens: The smallest individual units in a source program are known as tokens.
CODE
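#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int t, f = 0;   /* t: current character (int, so EOF is representable); f: previous character */
    int n;
    FILE *f1, *f2;

    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL)
    {
        printf("cannot open input or output file\n");
        return 1;
    }
    printf(" Token Token value as attributes\n-----------------------------------------");
    while ((t = getc(f1)) != EOF)
    {
        if (t == ' ')
            ;                           /* eliminate blanks */
        else if (isdigit(t) && f != '_')
        {
            /* The test on f is cut off after "if(65" in the source. It is
               reconstructed here, as an assumption, as "the previous
               character is a letter or a digit of an identifier". */
            if (isalnum(f))
            {
                putc(t, f2);            /* digit inside a name such as f1 */
            }
            else
            {
                n = 0;
                while (isdigit(t))      /* collect the number */
                {
                    putc(t, f2);
                    n = n * 10 + (t - '0');
                    t = getc(f1);
                }
                printf("\n num %d", n); /* token and its value */
                if (t != EOF && t != ' ')
                    putc(t, f2);        /* character that ended the number */
            }
        }
        else
            putc(t, f2);
        f = t;
    }
    printf("\n");
    fclose(f1);
    fclose(f2);
    return 0;
}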
INPUT:
void main(){
FILE *f1,*f2;long int a;char c[100];f1=fopen ("testinput.cpp","r");
f2=fopen("testoutput.cpp","w");while(fscanf(f1,"%s",c)!=EOF)
{int line=1;if(c[0]=='\n')
{fprintf(f2,"\n",);line++;}else if(!isdigit(c[0]))fprintf(f2,"%s",c);
else if(isdigit(c[0]){a=c[0]-'10';int i=1;
j=120;
}}}
OUTPUT:
voidmain(){
FILE*f1,*f2;
longinta;charc[100];f1=fopen("testinput.cpp","r");f2=fopen("testoutput.cpp","w");while(fscanf(f1,"%s",c)!=EOF)
{intline=1;if(c[0]=='\n'){fprintf(f2,"\n",);line++;}elseif(!isdigit(c[0]))
fprintf(f2,"%s",c);elseif(isdigit(c[0])
{a=c[0]-'10';
inti=1;j=120;}
}}
NUM 1NUM 10
NUM 1NUM 120NUM 10NUM 0
Result and Discussion:
This program has been written in C/C++ language and successfully eliminate white space from
a source program and collect numbers as token and then also display the token and token value
as attributes.
Problem No.03
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will recognize all
basic data types of C.
Problem analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig. (a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
Fig. (a): Interaction of lexical analyzer with parser (boxes: source program, lexical analyzer, parser, symbol table).
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may keep track of the number of newline characters seen, so that a line number
can be associated with an error message. The basic data types in a C program are int, float, char,
double, and long int.
CODE
#include <stdio.h>
#include <string.h>

int main(void)
{
    char ch[100];   /* buffer for one whitespace-delimited word */
    FILE *f1;

    f1 = fopen("c:\\compile\\input.txt", "r");
    if (f1 == NULL)
    {
        printf("cannot open input file\n");
        return 1;
    }
    /* %99s reads one word and cannot overflow ch */
    while (fscanf(f1, "%99s", ch) == 1)
    {
        if (strcmp("int", ch) == 0 || strcmp("char", ch) == 0 || strcmp("float", ch) == 0 ||
            strcmp("double", ch) == 0)
            printf("%s\n", ch);
    }
    fclose(f1);
    return 0;
}
INPUT:
int main(){
int a,b,c;float s;chart s;
}
OUTPUT:
int
float
char
Result and Discussion:
This program has been written in C/C++ language and that will successfully recognize all basic
data types of C.
Problem No.04
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will recognize all
Keywords of C.
Problem analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig. (a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
Fig. (a): Interaction of lexical analyzer with parser (boxes: source program, lexical analyzer, parser, symbol table).
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may keep track of the number of newline characters seen, so that a line number
can be associated with an error message. The keywords of the C language are
for, auto, if, else, break, case, char, const, continue, default, do, double, enum, extern, float,
goto, int, long, register, return, short, signed, sizeof, static, struct, switch, typedef, union, unsigned, void,
volatile, while.
CODE
#include <stdio.h>
#include <string.h>
int main(void)
{
char t[100];   /* buffer for one whitespace-delimited word */
char *k[]={"auto","break","case","void","char","int","const","continue","default",
"do","double","else","enum","extern","float","if","while","for"};
int n,i;
FILE *f1, *f2;
f1=fopen("c:\\compile\\input.txt","r");
/* %99s reads one word and cannot overflow t */
while( (fscanf(f1,"%99s",t)) ==1)
{
for(i=0;i
INPUT:
#include #include #include
#include
int main(void){
int i,j;
for(j=0;j
Problem No.05
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will eliminate white
spaces and comments from a source program in C .
Problem analysis:
Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the
characters in the assignment statement
position := initial + rate * 60
would be grouped into the following tokens:
1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60
The blanks separating the characters of these tokens would normally be eliminated during lexical
analysis.
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig. (a), is commonly implemented by making the lexical
analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next token"
command from the parser, the lexical analyzer reads input characters until it can identify the next
token.
Fig. (a): Interaction of lexical analyzer with parser (boxes: source program, lexical analyzer, parser, symbol table).
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may keep track of the number of newline characters seen, so that a line number
can be associated with an error message.
CODE
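#include <stdio.h>

int main(void)
{
    int t, t1;      /* int, so that EOF can be told apart from characters */
    int s;          /* 1 while inside a comment */
    FILE *f1, *f2;

    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL)
    {
        printf("cannot open input or output file\n");
        return 1;
    }
    while ((t = getc(f1)) != EOF)
    {
        if (t == ' ')
            ;                               /* eliminate blanks */
        else if (t == '/')
        {
            t1 = getc(f1);
            if (t1 == '*')
            {
                /* skip everything up to and including the closing star-slash */
                s = 1;
                t = getc(f1);
                while (s && t != EOF)
                {
                    t1 = getc(f1);
                    if (t == '*' && t1 == '/')
                        s = 0;
                    else
                        t = t1;
                }
            }
            else
            {
                putc(t, f2);                /* a plain '/', e.g. division */
                if (t1 != EOF)
                    ungetc(t1, f1);         /* re-read the character after it */
            }
        }
        else
            putc(t, f2);
    }
    fclose(f1);
    fclose(f2);
    return 0;
}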
INPUT:
#include
#includevoid main(){
clrscr();int p,q,m,n;
printf("How many line ");scanf("%d",&n);/* n is the number of input*/
printf("\n\n");
for(p=1;p
printf("%2d",(m--%10));printf("\n");
}getch();}
OUTPUT:
#include
#includevoidmain(){
clrscr();intp,q,m,n;
printf("Howmanyline");scanf("%d",&n);
printf("\n\n");
for(p=1;p
Problem No.06
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will generate token
for a given statement of C source program.
Problem analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig. (a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
Fig. (a): Interaction of lexical analyzer with parser (boxes: source program, lexical analyzer, parser, symbol table).
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may keep track of the number of newline characters seen, so that a line number
can be associated with an error message. The smallest individual units in a source program are
known as tokens.
CODING:
#include <stdio.h>
#include <ctype.h>
#include <string.h>

int keyword(char buf[]);

/* Keyword table; the list ends with a NULL sentinel. */
char *key[] = {"auto","break","case","char","const","continue","default","do","double","else","enum",
"extern","float","for","goto","if","int","long","register","return","short","signed","sizeof",
"static","struct","switch","typedef","union","unsigned","void","volatile","while",NULL};

int main(void)
{
    int c;              /* int, so that EOF can be told apart from characters */
    char buf[100];
    FILE *f;

    f = fopen("c6input.cpp", "r");
    if (f == NULL)
    {
        printf("cannot open c6input.cpp\n");
        return 1;
    }
    c = getc(f);
    printf("Token Attribute value:\n");
    while (c != EOF)
    {
        int i = 0;
        if (isalpha(c))
        {
            /* identifier or keyword */
            buf[i] = c; i++;
            c = getc(f);
            while ((isalpha(c) || isdigit(c) || c == '_') && i < 99)
            {
                buf[i] = c;
                c = getc(f);
                i++;
            }
            buf[i] = '\0';
            if (keyword(buf) == 0)
                printf("ID %s\n", buf);
            else
                printf("%s %s\n", buf, buf);
        }
        else if (isdigit(c))
        {
            /* integer part of a number */
            int a = c - '0';
            c = getc(f);
            while (isdigit(c))
            {
                a = a * 10 + c - '0';
                c = getc(f);
            }
            if (c == '.')
            {
                /* fractional part, kept as text */
                char b[10];
                int j = 0;
                c = getc(f);
                while (isdigit(c))
                {
                    if (j < 9) { b[j] = c; j++; }
                    c = getc(f);
                }
                b[j] = '\0';
                printf("Num %d.%s\n", a, b);
            }
            else
                printf("Num %d\n", a);
        }
        /* Relational operators. The source shows only c=='' here; the '<'
           and '>' cases are reconstructed as an assumption. */
        else if (c == '<' || c == '>' || c == '=')
        {
            int k = c;
            c = getc(f);
            if (c == '=')
            {
                printf("RE %c%c\n", k, c);
                c = getc(f);
            }
            else
                printf("RE %c\n", k);
        }
        else
        {
            if (c != '\n' && c != ' ')
                printf("Punchuation %c\n", c);
            c = getc(f);
        }
    }
    fclose(f);
    return 0;
}

/* Return 1 if buf is a keyword, 0 otherwise. */
int keyword(char buf[])
{
    int i = 0;
    while (key[i] != NULL)
    {
        if (strcmp(key[i], buf) == 0)
            return 1;
        i++;
    }
    return 0;
}
INPUT:
(i
OUTPUT:
Token Attribute value:
Punchuation (
ID i
RE