1
Introduction
Programming Language Concepts
2
Objective of the Course
1. Understand concepts of programming languages.
2. Become familiar with different language paradigms and the kinds of problems they are suited for.
3. Learn to use useful tools.
4. Learn basics of some widely used languages:
C/C++
Scheme (functional language like Lisp)
Prolog
Python
other languages, depending on your interest
3
Evaluation Criteria
Class participation and attendance
Homework
Programming assignments
Quizzes
Exams
4
Your Responsibility
Perform the role of a student at a level suitable for a first-tier university.
Behavior and activities:
1. Read the assigned text before each lecture
2. Do homework assignments early
3. Participate in class
4. Self-directed learning
5. Analyze your own understanding
5
Instructor's Responsibility
Provide clear guidance on the course of study
Make the requirements clear and unambiguous
Facilitate discussion, answer your questions
Provide helpful evaluation of your understanding
Listen to and respond to feedback
6
Why Tools?
"If the only tool you have is a hammer, then everything looks like a nail."
7
What is a Programming Language?
A vocabulary, set of grammar rules, and associated meanings for communication between people and computers, or communication between computers.
Another definition: A programming language is a notational system for describing computation in machine-readable and human-readable form [K. Louden].
8
Instructions to the CPU
A CPU executes instructions, and uses data, in binary format. An executable computer program may look like this:
001011001010001000010111100000000010110000011000110110000101100100110010100010110000011101001111110110000110000100101101000100010001011100001110…etc…
The bit string is made up of instructions, addresses, and data:
First instruction to CPU
Second instruction
an address for 2nd instruction
3rd instruction
4th instruction
an argument (data) for 4th instruction
another argument for 4th instruction
5th instruction
6th instruction
… more instructions …
9
Running A Program
1. The program and data are loaded into main memory.
2. The address of the program's first instruction is placed in a special register in the CPU: the program counter.
3. The CPU is told to fetch and execute that instruction.
[Figure: PC Register and Memory]
10
What instructions does the CPU understand?
CPU instructions cause the CPU to implement logic which is designed (hardwired) into the CPU. Most CPU instructions are very simple, such as:
LOAD memory_address TO some register
SAVE some register TO memory_address
MOVE some register TO another register
ADD some register TO another register
MULT some register TO another register
COMPARE register TO another register
TEST some register’s value, jump to new instruction if true
JUMP TO new instruction (unconditional branch)
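The fetch-and-execute cycle and the simple instructions listed above can be sketched as a toy simulator. This is only an illustration: the instruction names, the tuple encoding, and the register names A and B are assumptions made up for this sketch, not the instruction set of any real CPU.

```python
# Toy CPU sketch. The instruction set here is hypothetical and exists
# only to illustrate the program counter and the fetch-execute cycle.

def run(program, memory):
    registers = {"A": 0, "B": 0}
    pc = 0                          # program counter: index of the next instruction
    while True:
        op, *args = program[pc]     # fetch the instruction the PC points to
        pc += 1                     # by default, continue with the following one
        if op == "LOAD":            # LOAD memory_address TO register
            addr, reg = args
            registers[reg] = memory[addr]
        elif op == "SAVE":          # SAVE register TO memory_address
            reg, addr = args
            memory[addr] = registers[reg]
        elif op == "ADD":           # ADD source register TO destination register
            src, dst = args
            registers[dst] += registers[src]
        elif op == "JUMP":          # unconditional branch: overwrite the PC
            (pc,) = args
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction: {op}")

# memory[2] = memory[0] + memory[1]
program = [
    ("LOAD", 0, "A"),
    ("LOAD", 1, "B"),
    ("ADD", "B", "A"),
    ("SAVE", "A", 2),
    ("HALT",),
]
result = run(program, [7, 5, 0])
print(result[2])
```

Even adding two numbers takes four instructions at this level, which is the motivation for the higher-level notations on the following slides.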
11
Generations of Computer Languages
Machine language: program as a series of binary instructions (0s and 1s)
Assembly Language
"High level" or procedural Languages
this is the category we will study
4GL - application specific languages
12
Machine Language
These simple, binary instructions are called machine language instructions.
Writing machine language is slow and difficult, so people invented a symbolic notation for machine language, called assembly language.
Machine Language:
00101100101000100001111110000000001011000001100011011000010110010011001000011000
Assembly Language:
MOV AX,1F80H
PUSH AX
INT 21H
POP AX
13
Assembly Language
Simple assembly language instructions have a 1-to-1 translation to machine language.
A program called an assembler converts assembly language into machine language:
unix: as -o output_program input_program.s
[Figure: the assembler translates assembly language into machine language]
Assembly Language:
MOV AX,1F80H
PUSH AX
INT 21H
POP AX
Machine Language:
00101100101000100001111110000000001011000001100011011000010110010011001000011000
14
Viewing Assembly Code on a PC
1. Find a small .exe or .com file, e.g. diskcopy.com
2. Open a “Command Prompt” (DOS) window.
3. Enter “debug filename”.
4. Enter “u” to unassemble and view the program.
5. Enter “q” to quit debug.
C:\WINDOWS\system32> debug diskcopy.com
- u
0B30:0000 0E       PUSH CS
0B30:0001 1F       POP DS
0B30:0002 BA0E00   MOV DX,000E
0B30:0005 B409     MOV AH,09
- q
Columns: address, machine code (in hex), assembly language.
15
Macro Assembly Language
Simple assembly language instructions are still very tedious and time consuming to input (“code”).
Macro assembly language adds the use of symbolic variable names and compound instructions. A compound instruction is an instruction that is converted into several machine level instructions.
16
Macro Assembly Example
Here is a simple “Hello, World” program in macro assembly language:
.stack
.data
message db "Hello world, I'm a program.", "$"
.code
main proc
    mov ax,seg message
    mov ds,ax
    mov ah,09
    lea dx,message
    int 21h
    mov ax,4c00h
    int 21h
main endp
end main
17
Problems with Assembly Language
Programming is still difficult in macro assembly language:
language does not match the way people think
programs are long and difficult to understand
program logic errors are common and difficult to correct
difficult to divide a large problem into small problems using assembly language
assembly language is machine dependent -- won't run on a different type of computer!
you have to rewrite all your software whenever you change to a different hardware platform
18
Low-level and high-level languages
Higher, more "abstract", languages aid programming and thinking about algorithms.
Separate program logic from hardware implementation.
Add 2 numbers in C:
int sum( int x, int y ) {
    int z = x + y;
    return z;
}
In Assembly Language:
    .globl sum
    .type sum,@function
sum:
    pushl %ebp
    movl %esp, %ebp
    subl $4, %esp
    movl 12(%ebp), %eax
    addl 8(%ebp), %eax
    movl %eax, -4(%ebp)
    movl -4(%ebp), %eax
    movl %eax, %eax
    leave
    ret
Which one is easier to understand?
19
High Level Languages
Higher level languages (also called third generation languages) provide these features:
English-like syntax
descriptive names to represent data elements
concise representation of complex logic
use of standard math symbols:
    total = quantity * price * (1 + taxrate);
conditional execution and looping expressions:
    if ( total > 0 ) then writeln("The total is ", total)
    else writeln("No sale.");
functional units, such as "functions" or "classes":
    radius = sqrt( x*x + y*y ); // call sqrt function
20
High Level Language Benefits
programming is faster and easier
abstraction helps programmer think at a higher level
standardized languages enable code to be machine independent, and reduce training effort
complex problems can be divided into smaller problems, each coded and tested separately.
code re-use
provides standard programming interface for hardware interactions, such as input and output.
21
Some higher level languages
There are several broad categories:
Imperative: FORTRAN, COBOL, BASIC, C, Pascal
Functional: Scheme, Lisp
Logic Languages: Prolog
Object-oriented:
Pure object-oriented: Java, Python
Object-oriented & imperative: C++, Perl, Visual Basic
22
Fourth Generation Languages
Fourth Generation Languages (4GL) are application specific. They are tailored for a particular application.
SQL (structured query language) for databases
Postscript page description language for printers
PDF (portable document format) for online documents
HTML and PHP for World Wide Web content
Mathematica for symbolic math on a computer
Mathematica is a proprietary language; the others are industry standards.
23
4GL Example: SQL
SQL is one of the most widely used 4GLs. Here are some examples:
Insert data into a database table:
INSERT INTO employees (id, lastName, firstName, Job)
VALUES (1445, 'Smith', 'John', 'manager');
Choose data from a table based on criteria:
SELECT id, lastName, salary FROM employees WHERE ( Job = 'manager' );
SQL is declarative: statements say what you want done but not how to do it.
Learn more about SQL at the Open Directory Project:http://dmoz.org/Computers/Programming/Languages/SQL
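To try the two statements, one option is Python's built-in sqlite3 module with an in-memory database. This is a minimal sketch: the table definition (column types, the salary column) and the second employee row are assumptions added so the statements have something to run against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")      # throwaway database held in memory

# Assumed schema: the slides do not show how the table was created.
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER, lastName TEXT, firstName TEXT, Job TEXT, salary REAL)"
)

# The INSERT from the slide, with values passed as parameters.
conn.execute(
    "INSERT INTO employees (id, lastName, firstName, Job) VALUES (?, ?, ?, ?)",
    (1445, "Smith", "John", "manager"),
)
# A second, made-up row so that the WHERE clause has something to filter out.
conn.execute(
    "INSERT INTO employees (id, lastName, firstName, Job, salary) "
    "VALUES (?, ?, ?, ?, ?)",
    (1501, "Jones", "Ann", "clerk", 30000.0),
)

# The SELECT from the slide: we state which rows we want, not how to find them.
rows = conn.execute(
    "SELECT id, lastName, salary FROM employees WHERE ( Job = 'manager' )"
).fetchall()
print(rows)
conn.close()
```

The salary of the manager comes back as None (SQL NULL) because the INSERT did not supply one. Note that the query never says how to scan the table; the database engine decides.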
24
Factors Influencing Language
Many factors influence the nature of a programming language. Some major factors are:
Environment: the capabilities of the machine that we are communicating with.
early computers had very few instructions, no operating system, no memory management.
Application Domain: language is influenced by the kind of information we want to communicate.
most languages are used to specify algorithms.
language may contain only statements for the kind of problem we are interested in solving, e.g., Fortran for scientific/numerical computing.
25
Factors Influencing Language (2)
Methodology: an evolving discipline.
new approaches to constructing software are developed based on experience and need.
O-O emerged as a response to limitations of imperative languages in handling complexity.
Preference, Economics, and Patronage:
languages have been influenced by the backing of large companies (Fortran by IBM, Java by Sun)
government patronage: Ada by U.S. DoD.
preferred by experts: "structured programming" and modularity made popular by C.S. professors
26
Language Paradigms
Imperative (procedural): traditional sequential programming: program statements operate on variables.
variable represents data in memory locations.
characterized by variables, assignment, and loops.
basic unit of imperative programs is the procedure or function
Examples: Algol, C, Pascal, Ada, FORTRAN
27
Language Paradigms
Functional: functions are first-class entities (can be used same as other forms of data); all execution is by function evaluation
characterized by recursion and functions as data
execution on values, not memory locations -- variables are not necessary!
a function can dynamically define and return a new function: self-evolving code.
Examples: Lisp, Scheme, ML, Haskell
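These ideas can be tried in most languages that treat functions as values. Here is a small sketch in Python (not one of the functional languages named above); the names make_adder, compose, and total are made up for this illustration.

```python
def make_adder(n):
    """Dynamically define and return a new function that adds n."""
    def add(x):
        return x + n
    return add

def compose(f, g):
    """Functions as data: take two functions, return their composition."""
    return lambda x: f(g(x))

def total(values):
    """Sum a sequence by recursion, with no loop and no reassignment."""
    if not values:
        return 0
    return values[0] + total(values[1:])

add5 = make_adder(5)
double = lambda x: 2 * x
add5_then_double = compose(double, add5)

print(add5_then_double(1))              # double(add5(1))
print(list(map(add5, [1, 2, 3])))       # pass a function as an argument
print(total([1, 2, 3, 4]))
```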
28
Language Paradigms
Logic: program is declarative, it specifies what must be true but not how to compute it.
logical inference is the basic control mechanism
no sequential operation
non-deterministic: may have many solutions or none
Example: Prolog
29
Language Paradigms
Object-oriented: an object contains its own state (data) and the functions that operate on that state.
program logic is instantiation of objects, messages between objects, encapsulation, and protection.
computing model mostly an extension of imperative.
Examples: C++, C#, Java, Smalltalk, Python
30
More Language Paradigms (1)
Declarative: state what needs computing, not how to compute it (algorithm).
Many 4GLs, like SQL and Mathematica, share this property.
Prolog is also declarative.
31
More Language Paradigms (2)
Concurrent or Parallel: programming to utilize multiple CPUs or multiple threads of execution.
Requires attention to task management, synchronization, and data conflicts.
sequence of execution may not be predictable.
parallel features are often added to existing programming languages.
Examples: threads in Java, C#, and other languages.
MPI (Message Passing Interface) library for cluster and grid computing.
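A minimal sketch of the synchronization issue, using Python's standard threading module. The function name deposit and the counts are invented for this example. Four threads update one shared counter; the lock makes each update happen as a unit, so the final value is predictable even though the order in which the threads run is not.

```python
import threading

counter = 0
lock = threading.Lock()

def deposit(times):
    """Add 1 to the shared counter `times` times."""
    global counter
    for _ in range(times):
        with lock:              # only one thread at a time may update counter
            counter += 1

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                    # wait until every thread has finished

print(counter)
```

Without the lock, the read-modify-write in counter += 1 from different threads could interleave and lose updates, which is the kind of data conflict mentioned above.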
32
[Figure: Programming Languages used in Open Source projects at SourceForge.net. Source: http://www.cs.berkeley.edu/~flab/languages.html]
33
Example: Euclid’s gcd algorithm
Compute the greatest common divisor of two integers. For example, the gcd of 90 and 24 is 6.
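Before comparing languages, it may help to watch the algorithm itself. The sketch below (in Python; the helper name gcd_steps is made up here) records each pair (u, v) that Euclid's algorithm visits: replace (u, v) by (v, u mod v) until v is 0, and the answer is the remaining u.

```python
def gcd_steps(u, v):
    """Return every (u, v) pair visited by Euclid's algorithm."""
    steps = [(u, v)]
    while v != 0:
        u, v = v, u % v         # the single rule of the algorithm
        steps.append((u, v))
    return steps

def gcd(u, v):
    """The gcd is the first element of the last pair, where v has reached 0."""
    return gcd_steps(u, v)[-1][0]

for pair in gcd_steps(90, 24):
    print(pair)
print("gcd:", gcd(90, 24))
```

For 90 and 24 the pairs are (90, 24), (24, 18), (18, 6), (6, 0), so the gcd is 6.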
34
C
/* "functional" implementation of gcd uses recursion */
#include <stdio.h>

int gcd(int u, int v) {
    if (v == 0) return u;
    else return gcd(v, u % v);   // "tail" recursion
}

int main() /* test of gcd */
{
    int x, y;
    printf("Input two integers: ");
    // note: pass the addresses of x and y so scanf can store the input in them
    scanf("%d%d", &x, &y);
    printf("The gcd of %d and %d is %d\n", x, y, gcd(x,y) );
    return 0;
}
35
Java: imperative style GCD
public class AnyClassAtAll {
    /** compute greatest common divisor.
     *  @return the g.c.d. of u and v.
     *          1 if u and v are both zero.
     */
    private static long gcd(long u, long v) {
        long remainder;
        if (v < 0) v = -v;
        while ( v != 0 ) {
            remainder = u % v;
            u = v;
            v = remainder;
        }
        if ( u == 0 ) return 1;       // both inputs were 0: return 1 by convention
        else return (u>0)? u : -u;    // absolute value
    }
    ... remainder of the class is irrelevant
36
Java: Object-oriented GCD
/** This class finds the GCD of one value (the state of
 *  the object) with any other value given as parameter.
 */
public class GCD {
    // attribute: state of the object (immutable)
    private final int value;

    /** constructor */
    public GCD( int value ) { this.value = value; }

    /** compute GCD of private state and param v */
    public int gcd ( int v ) {
        int u = value;   // don't modify object's state
        while ( v != 0 ) {
            int t = u % v;
            u = v;
            v = t;
        }
        return u;
    }
}
37
Scheme
; functional implementation of gcd
; uses recursion
(define (gcd u v)
(if (= v 0) u
(gcd v (modulo u v) )
)
)
Scheme syntax for defining a function:
( define ( function-name param1 param2 ... )
body of function definition
)
38
Scheme application
; using the gcd: perform I/O, invoke gcd
(define (euclid)
  (display "enter two integers:")
  ; use of variables not really necessary
  ; and not "functional" style
  (let ( (u (read)) (v (read)) )
    (display "the gcd of ")
    (display u)
    (display " and ")
    (display v)
    (display " is ")
    (display (gcd u v))
    (newline)
  ) )
39
Prolog
/* conditions for GCD */
gcd(U, V, U) :- V = 0.
gcd(U, V, X) :- not (V = 0),
Y is U mod V,
gcd(V, Y, X).
In Prolog, a clause is an assertion that can succeed (be true) or fail. A clause of the form:
consequence :- a, b, c.
means: consequence is true if a, b, and c are true.
/* Goal: compute the GCD of 288 and 60. */
gcd(288, 60, X).
40
FORTRAN 77
C     Greatest common divisor for real programmers
      INTEGER FUNCTION IGCD(U,V)
      INTEGER U, V, TMP
      DO WHILE ( V .NE. 0 )
         TMP = V
         V = MOD(U,V)
         U = TMP
      END DO
C     Assign returned value (V is 0 here; the gcd is in U)
      IGCD = U
      RETURN
      END

      PROGRAM MAIN
C     I, J implicitly integer
      WRITE(6,*) "Input two integers:"
      READ(5,*) I, J
      WRITE(6,100) I, J, IGCD(I,J)
  100 FORMAT("GCD of ",I4," and ",I4," is ",I4)
      STOP
      END
41
Paradigm use is rarely “pure”
The C gcd() example defines gcd in a functional style, even though C is mainly imperative.
Java can be used to write purely imperative style programs (all static, no objects)
also, in Java primitive data types aren't objects
Scheme uses I/O operations, which depend on sequence and external effects (imperative style)
in a "pure" functional language, the result of a function depends only on the parameters
this isn't true of I/O operations
42
Language Design
Some conflicting objectives, criteria, and goals
43
Goals for language design
Power
Flexibility
Expressiveness
Writability
Efficient implementation
Support for abstraction
Simplicity
Clarity
Consistency (orthogonality)
Readability
Applicability to problem domain
Portability
44
Readability or writability?
Should programming languages promote the writing of programs or the reading of programs?
Many people (including the writer!) may need to read a program after it is written.
45
Readability or writability?
Q: What does this Perl script do?
#!/usr/bin/perl
foreach $FILE ( @ARGV ) {
open(FILE) || die "Couldn't open $FILE";
while($_ = <FILE>) { print $_; }
close(FILE);
}
46
Language definition
Syntax: defines the grammar of a language.
what are valid statements, what is a valid program.
given in formal notation such as BNF or EBNF.
Semantics: the meaning of the elements of a language.
usually defined in human language
formal notations exist, but are not widely used
can have a static component: type checking, definition checking, other consistency checks prior to execution.
dynamic: run-time checking of array indices, run-time type determination.
47
Syntax
Defines symbols and grammar of a language. Usually given in Backus-Naur Form or its extensions.
if-statement ::= if ( expression ) statement-block
[ else statement-block ]
statement-block ::= statement ';'
| '{' statement ';' [...] '}'
statement ::= if-statement | assignment-statement |
while-statement | ...etc...
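Grammar rules like these translate almost directly into code: one function per rule, each calling the functions for the rules it mentions (a technique known as recursive descent). The sketch below, in Python, only checks whether a piece of text is a valid if-statement. It makes two simplifying assumptions that are not part of the grammar above: an expression is a single name or number, and the only other statement is an assignment of the form name = name-or-number.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[(){};=])")
KEYWORDS = ("if", "else")

def tokenize(text):
    """Lexical step: split text into tokens, rejecting illegal characters."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def operand(self, allow_number=True):
        # Simplified stand-in for "expression": one name or one number.
        tok = self.peek()
        is_name = (tok is not None and (tok[0].isalpha() or tok[0] == "_")
                   and tok not in KEYWORDS)
        is_number = tok is not None and tok.isdigit()
        if not (is_name or (allow_number and is_number)):
            raise SyntaxError(f"unexpected token {tok!r}")
        self.pos += 1

    def if_statement(self):     # if ( expression ) statement-block [ else statement-block ]
        self.expect("if")
        self.expect("(")
        self.operand()
        self.expect(")")
        self.statement_block()
        if self.peek() == "else":
            self.expect("else")
            self.statement_block()

    def statement_block(self):  # statement ';'  |  '{' statement ';' [...] '}'
        if self.peek() == "{":
            self.expect("{")
            self.statement()
            self.expect(";")
            while self.peek() != "}":
                self.statement()
                self.expect(";")
            self.expect("}")
        else:
            self.statement()
            self.expect(";")

    def statement(self):        # if-statement | assignment-statement (simplified)
        if self.peek() == "if":
            self.if_statement()
        else:
            self.operand(allow_number=False)
            self.expect("=")
            self.operand()

def is_valid_if_statement(text):
    try:
        parser = Parser(tokenize(text))
        parser.if_statement()
        return parser.peek() is None    # all tokens must be consumed
    except SyntaxError:
        return False

print(is_valid_if_statement("if ( x ) { y = 1; z = y; } else w = 0;"))
print(is_valid_if_statement("if ( x ) y = 1"))      # missing ';'
```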
48
Language implementation strategies
Compiler: multi-step process that translates source code into target code; then the user executes the target code.
Interpreter: one-step process in which the source code is executed directly.
Hybrids: "just in time" compilers - Perl
"virtual machine language" - Java, Microsoft .NET languages.
49
Compiler versus Interpreter
[Figure: Compiler: Source Program -> Compiler -> Target Program; then Input -> Target Program -> Output (execute on machine). Interpreter: Source Program and Input -> Interpreter -> Output (execute on machine).]
50
Language processing: Interpreted
Interpreted: BASIC, Postscript, Scheme, Matlab
The interpreter reads the source program and executes each command as it reads.
The interpreter "knows" how to perform each instruction in the language.
[Figure: Source Program -> Interpreter -> Execution]
51
Language processing: Compiled
Compiled: C/C++, Pascal, Fortran
The compiler converts source code into machine language to create an object code file.
A linker combines object code files and pre-compiled libraries to produce an executable program (machine language).
52
Compiling a Program
[Figure: Source Code -> Compiler -> Object Code -> Linker -> Executable Program; Libraries (of object codes) -> Linker]
Example files at each stage:
file.c: main() { printf("hello"); exit(0); }
file.obj: .sym printf FE048C7138029845AAAF...
printf.obj: <obj. code for printf function>
file.exe: <hardware instructions>
53
Typical Phases of a Compiler
Source Program
Lexical Analyzer
Syntax Analyzer
Semantic Analyzer
Intermediate Code Generator
Code Optimizer
Code Generator
Target Program
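Several of these phases can be observed from inside Python, whose standard library exposes its own tokenizer, parser, and byte code compiler. This is only a rough illustration: CPython's pipeline does not match the diagram phase for phase, and its target is byte code for a virtual machine, not hardware instructions.

```python
import ast
import io
import tokenize

source = "total = 2 * (3 + 4)\n"

# Lexical analysis: characters -> tokens
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
]
print(tokens)

# Syntax analysis: tokens -> syntax tree
tree = ast.parse(source)
print(type(tree.body[0]).__name__)      # the kind of statement that was recognized

# Code generation: syntax tree -> byte code object (the "target program")
code = compile(tree, "<demo>", "exec")

# Execution of the target program by the Python virtual machine
namespace = {}
exec(code, namespace)
print(namespace["total"])
```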
54
Interpreted versus Compiled
Interpreted:
Flexible
More interactive
More dynamic behavior
Rapid development
Can run program immediately after writing or changing it
Portable to any machine that has the interpreter
Compiled:
More efficient execution
Extensive data checking
More structured
Usually more scalable (can develop large applications)
Must (re-)compile program each time a change is made
Must recompile for new hardware or OS
55
Java: A Hybrid Strategy
Java Compiler: compiles program to create a machine independent byte code.
Java Virtual Machine (interpreter): executes the byte code.
[Figure: Hello.java (Java source program) -> javac (compiler) -> Hello.class (byte code) -> java (Java VM: byte checker, class loader, interpreter, using Libraries) -> program execution prints "Hello, World!". The Java VM and Libraries together form the Java Runtime Environment (JRE).]
56
Error classification
Lexical: token-level error, such as illegal character (hard to distinguish from syntax errors).
Syntax: error in grammar (missing semicolon or keyword).
Static semantic: non-syntax error detectable prior to execution (e.g., undefined variables, type errors).
Dynamic semantic: non-syntax error that may be detected during execution (e.g., division by 0, array bounds).
Logic: error in the algorithm or in its implementation; the program is still legal as far as the language is concerned.
57
Notes on error reporting
A compiler will report lexical, syntax, and static semantic errors. It cannot report dynamic semantic errors.
A compiler must recover after finding an error so it can continue to check for more errors. Not easy!
An interpreter will often only report lexical and syntax errors when loading the program. Static semantic errors may not be reported until just prior to execution. Indeed, most interpreted languages (e.g. Lisp, Smalltalk) do not define any static semantic errors.
No translator will report a logic error.
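The difference between errors reported before execution and errors reported during execution can be observed in Python, an interpreted language with a separate compile step. In this sketch the function name classify and the sample programs are made up; compile() stands in for "loading the program" and exec() for running it.

```python
def classify(source):
    """Report when (if ever) Python notices a problem in `source`."""
    try:
        code = compile(source, "<demo>", "exec")    # load: lexical + syntax checks
    except SyntaxError:
        return "syntax error, reported before execution"
    try:
        exec(code, {})                              # run the program
    except NameError:
        return "undefined name, reported only during execution"
    except ZeroDivisionError:
        return "division by zero, reported during execution"
    return "no error reported"

print(classify("x = (1 +"))                 # malformed expression
print(classify("y = z + 1"))                # z is never defined
print(classify("x = 1 / 0"))
# Logic error: computes a + (b / 2), not the average. Nothing is reported.
print(classify("def average(a, b):\n    return a + b / 2\n"))
```

The undefined name, which a Java or C compiler would reject as a static semantic error, is not noticed here until the statement actually runs, and the logic error is never reported at all.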
58
Sample Errors (Java):
public int gcd ( int v# ) // lexical error
{ int z = value // syntax error: missing ;
y = v; // static semantic: y undefined
while ( y >= 0 ) // dynamic semantic:
// division by zero
{ int t = y;
y = z % y;
z = t;
}
return y; // logic: should return z
}
59
Identify the errors (Java)
// Compute ratio of a / b
public ratio ( long a; long b )
{ int result;
if ( b => 0 )
Result == a / b;
else
result = 0.0;
return Result;
}
60
Semantic error detected by the linker
/* program contains 2 semantic errors */
int main( ) {
int now;
now = getcurrenttime( );
}
To compile a program without linking it, on Linux use:
gcc -c filename.c
GNU cc (an excellent compiler) doesn't report any errors!
To detect one error, compile and link using:
gcc filename.c
61
The Archetypical semantic/logic error
int n = 0;
if ( n = 1 ) printf("n equals 1");
1. In C an assignment statement resolves to a value equal to the value that was assigned:
x = 2;
results in a value of "2". This makes "x = y = z = 2;" possible.
2. In C, a numeric value can be used as an "if" condition.
0 means false
anything else is true
So this code always prints "n equals 1".
62
The Archetypical semantic/logic error
/* sum input data until a zero is read */
int main( ) {
int x, sum;
sum = 0;
while ( 1 ) {
scanf("%d", &x); // read an integer
if ( x = 0 ) break; // stop if 0 found
sum += x; // else add x to sum
}
}
This error is not detected at all!
It's a perfectly legal use of the C language.
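Language design can rule this mistake out. As a point of comparison (Python is not the language of the example above), Python rejects a plain = inside a condition as a syntax error, and since version 3.8 offers a separate := operator for the cases where assigning inside a condition is intended. The sketch below shows both; the := branch reproduces the two C rules described above.

```python
n = 0
# ":=" assigns 1 to n, and the whole expression has the value 1.
# A nonzero number counts as true, so the first branch is always taken.
if (n := 1):
    message = "n equals 1"
else:
    message = "n is not 1"
print(message, n)

# A plain "=" in the condition is not legal Python: the mistake is
# caught as a syntax error before the program ever runs.
try:
    compile("if n = 1:\n    pass\n", "<demo>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)
```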
63
Wake up!
Abstraction is a key to good software.
64
Abstraction
Abstraction: using one thing to represent another; usually to omit (hide) unimportant details or group similar cases together.
Why Abstraction? Control complexity.
In Daily Life:
words and language are abstractions for concepts.
money is an abstraction for value, enabling exchange.
walk: an abstraction for the process of using legs to travel.
lock: an abstraction of concept, technology, & process!
65
Abstraction in Programming Languages
In Programming Languages:
everything is an abstraction
x = 10:
store 10 in a memory location
y = 2*x:
load the value of x (memory location) into a register,
load 2 into another register
multiply the values together
save the result in a memory location (called "y")
66
Data Abstraction
Basic Abstraction: Data types.
integer, float, double (hides detail of how or where the data is stored)
Structured Abstraction:
Structures:
struct node {
    int id;
    char name[80];
    struct node *next_node; /* point to next */
};
Unit Abstraction:
Program divided into files for separate compilation
Tables in a database, classes in Java
URL (uniform resource locator) - file://c/temp/junk.txt, ftp://somewhere.com/downloads/junk.txt
67
Control or Process Abstraction
Basic Abstraction:
assignment (y = a*x + b), abstracts notion of storing values in memory
"goto" and "break" statements
Structured Abstraction:
if - else if - else
loops (for, while), switch-case statement
statement blocks (scope of variables or process)
functions and subroutines (procedures)
Unit Abstraction:
threads - semi-independent execution units
processes - C fork() to start "child" processes
68
Abstractions
                      Basic        Structured                               Unit
Data                  int, char    String, class, struct                    file, package, class (for data hiding)
Control or Process    goto, =      if - then - else, while { }, procedure   package, API, threads, Ada tasks
69
Abstraction is Key to Programming
Object-Oriented Programming - success is due to a useful abstraction (classes, objects as entities)
World Wide Web - information on the Internet as an interconnected web (plus a good interface :-) and extensible. Processes look like data, too.
Spreadsheet - the original "killer app" for PCs.Useful abstraction of data and its organization.
Q: what useful abstractions contribute to the simplicity (and success) of Microsoft Windows and Mac OS?
70
Abstraction and Progress
By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems . . . Civilization advances by extending the number of important operations which we can perform without thinking about them.
[A. N. Whitehead, 1911]
71
Abstraction and Proficiency
An abstraction is only useful if it reduces mental effort.
You must understand the abstraction and be proficient in using it to benefit from it.
Intellectual progress depends on this.
O-O Programming is a good example.
Therefore, please study everything (not just this course) with the goal of understanding and proficiency -- not getting a grade.
72
Can we Compare Languages?
With so many languages, how do you choose one?
Is one language able to solve problems that can't be solved in another language?
Theory of Computing seeks to answer these questions.
73
Questions (1)
What is the syntax of a language?
What is meant by the semantics of a language?
There are two strategies for how to process a computer program (source code) so it can be run on a computer. Describe the 2 strategies.
74
Questions (2)
Name the 4 major categories of computer languages.
75
Questions (3)
Name one language from each of these categories:
Imperative
Functional
Logic
Object-oriented