OOPS Notes (Unit-wise)
DESCRIPTION
These are unit-wise notes for OOPS as per the UTU syllabus
Concept of programming & OOPS
Bipin Tripathy Kumaun Institute Of Technology
CONCEPT OF PROGRAMMING
&
OOPS
Narayan Changder
Copyright © 2011 by Narayan Changder
All rights reserved.
ISBN . . .
. . . Publications
To
My dear students
Notes developed by Mr. Narayan Changder (Assistant Professor, CSE, BTKIT). For any query contact: narayan.changder@gmail.com
Contents
1 UNIX Fundamentals
  1.1 The Connection Between Unix and C
    1.1.1 Why Use Unix?
  1.2 Unix Shell
    1.2.1 Basic Unix primer
  1.3 Programming Environments
    1.3.1 Integrated development environment
    1.3.2 History of IDE
  1.4 Developer fundamentals such as editors
    1.4.1 Some fundamental editors
    1.4.2 Full screen editors
    1.4.3 Text editor
      1.4.3.1 Features of text editors
  1.5 Various text editors
    1.5.1 Window editor
    1.5.2 Multi window editor
      1.5.2.1 IDLE
    1.5.3 VI editor
    1.5.4 DOS editor
  1.6 Editor and word processor
  1.7 Full screen editor and multi window editor
  1.8 Text files and word processor files
    1.8.1 Case study of the MS-DOS (Microsoft Disk Operating System) Editor
2 UNIT II
  2.1 Coding
    2.1.1 Coding standards and guidelines
    2.1.2 Code review
      2.1.2.1 Code walk through
      2.1.2.2 Code inspection
  2.2 Software and its characteristics
  2.3 Software processes
  2.4 Software process models
    2.4.1 Waterfall model
      2.4.1.1 About the Phases
      2.4.1.2 Brief Description of the Phases of the Waterfall Model
      2.4.1.3 History of the Waterfall Model
      2.4.1.4 Advantages of the Waterfall Model
      2.4.1.5 Disadvantages of the Waterfall Model
    2.4.2 Prototype model
      2.4.2.1 Need for a prototype in software development
    2.4.3 Spiral model
      2.4.3.1 History
      2.4.3.2 Circumstances to use the spiral model
    2.4.4 Comparison of different life-cycle models
  2.5 Software cost estimation
    2.5.1 Estimation of Lines of Code (LOC)
    2.5.2 LOC-based cost estimation
    2.5.3 Constructive Cost Model (COCOMO)
    2.5.4 Basic COCOMO Model
  2.6 Testing
    2.6.1 Aim of testing
  2.7 Verification and validation
  2.8 Unit testing
      2.8.0.1 Drivers and Stubs
  2.9 Black box testing
  2.10 Software development models
    2.10.1 Software life cycle activities
  2.11 User Interface Design
    2.11.1 Characteristics of a user interface
  2.12 Types of User Interfaces
  2.13 Menu-based interfaces
3 OBJECT ORIENTED CONCEPTS
  3.1 Object-Oriented Programming Concepts
    3.1.1 Objects
    3.1.2 Classes
    3.1.3 Inheritance
    3.1.4 Data Abstraction
    3.1.5 Data Encapsulation
    3.1.6 Polymorphism
    3.1.7 Overloading
    3.1.8 Re-usability
  3.2 Model
    3.2.1 Need for a model
    3.2.2 Unified Modeling Language (UML)
    3.2.3 UML diagrams
  3.3 Use Case Model
    3.3.1 Purpose of use cases
    3.3.2 Representation of use cases
      3.3.2.1 Utility of use case diagrams
      3.3.2.2 Factoring of use cases
  3.4 Class diagrams
    3.4.1 Classes
    3.4.2 Attributes
    3.4.3 Operation
  3.5 Association
4 UNIT IV
  4.1 Overview of Data Base Management Systems
    4.1.1 Need to store data
    4.1.2 Limitations of manual methods
    4.1.3 Why computerized data processing?
  4.2 What the DBMS can do
  4.3 Data Base Management Systems
    4.3.1 Extended Database Capabilities
    4.3.2 Transaction Management
  4.4 Database Administrator
  4.5 Types of Databases and Database Applications
  4.6 Advantages of Database Systems
  4.7 Database Users
    4.7.1 3-Level Database System Architecture
      4.7.1.1 The Internal Level
  4.8 Levels of Abstraction & database schema
    4.8.1 Schema
    4.8.2 Data Independence
  4.9 History of Data Models
  4.10 Entity Relation Model
  4.11 Database design and ER Diagrams
    4.11.1 Entity
    4.11.2 Attribute
      4.11.2.1 Types of Attributes
  4.12 Relationships and Relationship sets
    4.12.1 Degree of a Relationship
    4.12.2 Constraints on relationship Types
  4.13 Types of Entity
    4.13.1 Weak Entity Sets
  4.14 E-R Diagram
  4.15 Enhanced-ER (EER) Model Concepts
    4.15.1 Subclasses and Super classes
    4.15.2 Specialization
    4.15.3 Generalization
    4.15.4 Constraints on Specialization and Generalization
  4.16 Relational Model
    4.16.1 Mathematical Definition of Relation
    4.16.2 Properties of Relations
    4.16.3 Relational Keys
    4.16.4 Relational Integrity
  4.17 Query Languages: Relational Algebra
    4.17.1 Relational Algebra
  4.18 SQL
    4.18.1 Objectives of SQL
    4.18.2 History of SQL
  4.19 Normalization
    4.19.1 Data Redundancy
      4.19.1.1 Update Anomalies
    4.19.2 Lossless-join and Dependency Preservation Properties
    4.19.3 Functional Dependency
      4.19.3.1 Armstrong's axioms
    4.19.4 The Process of Normalization
List of Figures
1.1 Anatomy of UNIX
1.2 Difference
2.1 The software development process represented in the waterfall model
2.2 Winston W. Royce (1929-1995)
2.3 The software development process represented in the spiral model
2.4 Barry W. Boehm
2.5 Font size selection using a scrolling menu
3.1 A remote control unit
3.2 Different types of diagrams and views supported in UML
3.3 Use case model for tic-tac-toe game
3.4 Use case model for Supermarket Prize Scheme
3.5 Representation of use case generalization
3.6 Association between two classes
4.1 A simplified database system environment
4.2 Database system & its application
4.3 Three-tier architecture of a database system
4.4 Process of database access
4.5 Entities
4.6 Two different entities
4.7 Entities with relationship
4.8 Ternary relationship
4.9 Quaternary relationship
4.10 E-R diagram symbols
4.11 E-R diagram for a Company Schema
4.12 E-R diagram for a Bank database Schema
4.13 Specialization of an Employee based on Job Type
4.14 Specialization of an Employee based on Job Type
4.15 Generalization of a vehicle
4.16 Constraints on Specialization & Generalization
4.17 Different keys and attributes shown in table format
4.18 Five basic operations
4.19 Relationship Between Normal Forms
Chapter 1
UNIX Fundamentals
The Unix operating system found its beginnings in MULTICS, which stands for Multiplexed Information and Computing Service. The MULTICS project began in the mid-1960s as a joint effort by General Electric, the Massachusetts Institute of Technology, and Bell Laboratories. In 1969 Bell Laboratories pulled out of the project.

One of the Bell Laboratories people involved in the project was Ken Thompson. He liked the potential MULTICS had, but felt it was too complex and that the same thing could be done in a simpler way. In 1969 he wrote the first version of Unix, called UNICS. UNICS stood for Uniplexed Information and Computing Service. Although the operating system has changed, the name stuck and was eventually shortened to Unix.
Ken Thompson teamed up with Dennis Ritchie, who wrote the first C compiler. In 1973 they
rewrote the Unix kernel in C. The following year a version of Unix known as the Fifth Edition was
first licensed to universities. The Seventh Edition, released in 1978, served as a dividing point for two
divergent lines of Unix development. These two branches are known as SVR4 (System V) and BSD.
Ken Thompson spent a year’s sabbatical with the University of California at Berkeley. While there
he and two graduate students, Bill Joy and Chuck Haley, wrote the first Berkeley version of Unix, which
was distributed to students. This resulted in the source code being worked on and developed by many
different people. The Berkeley version of Unix is known as BSD, Berkeley Software Distribution.
From BSD came the vi editor, C shell, virtual memory, Sendmail, and support for TCP/IP.
For several years SVR4 was the more conservative, commercial, and well-supported branch. Today SVR4 and BSD look very much alike. Probably the biggest cosmetic difference between them is the way the ps command functions.

The Linux operating system was developed as a Unix look-alike and has a user command interface that resembles SVR4.
1.1 The Connection Between Unix and C
At the time the first Unix was written, most operating-system developers believed that an operating system must be written in an assembly language so that it could function effectively and gain access to the hardware. Not only was Unix innovative as an operating system, it was ground-breaking in that it was written in a language (C) that was not an assembly language.

The C language itself operates at a level that is just high enough to be portable to a variety of computer hardware. A great deal of publicly available Unix software is distributed as C programs that must be compiled before use.
Many Unix programs follow C's syntax. Unix system calls are regarded as C functions. What
this means for Unix system administrators is that an understanding of C can make Unix easier to
understand.
1.1.1 Why Use Unix?
One of the biggest reasons for using Unix is networking capability. With other operating systems,
additional software must be purchased for networking. With Unix, networking capability is simply
part of the operating system. Unix is ideal for such things as world wide e-mail and connecting to the
Internet.
Unix was founded on what could be called a "small is good" philosophy. The idea is that each program is designed to do one job well. Because Unix was developed by different people with different needs, it has grown into an operating system that is both flexible and easy to adapt for specific needs.

Unix was written in a machine-independent language, so Unix and Unix-like operating systems can run on a variety of hardware. These systems are available from many different sources, some of them at no cost. Because of this diversity and the ability to utilize the same "user interface" on many different systems, Unix is said to be an open system.
Unix Anatomy
As is illustrated here, Unix is a multi-layered system. In a strictly physical sense the kernel, shell, and
utilities rest on the hardware. Logically the system is arranged as the diagram shows.
Although the words "operating system" are frequently used to refer to the kernel, shells, and the utilities or commands, technically the utilities are not part of the operating system. Utilities that come with the operating system are basic tools that have evolved into standard Unix commands. They make the operating system more immediately useful to the user, but only the kernel and the shell are truly the operating system.
Figure 1.1: Anatomy of UNIX
Each layer in the system can be thought of as a
country with its own indigenous language. Hardware
includes the physical parts of the system: monitor,
mouse, printer, cables, chips, etc. The hardware layer
and the user layer are monolingual. They can only
talk to one other layer of the system. The hardware
can converse with the kernel. The kernel is bilingual.
It can talk to the hardware or to the shell. The shell
is multilingual and can talk to any part of the system
with the exception of the hardware. The shell can be
thought of as the master interpreter. It retrieves the
file that holds instructions for every command that’s
typed in and it allows users to utilize the services of
the kernel.
Unix utilities can converse with the shell and indirectly with the kernel. Utilities use functions called
system calls to request services from the kernel. They
cannot interact with the kernel directly.
1.2 Unix Shell
What is a shell? A shell is a command interpreter. While this is certainly true it likely doesn’t
enlighten the reader any further. A shell is an entity that takes input from the user and deals with
the computer rather than have the user deal directly with the computer. If the user had to deal
directly with the computer he would not get much done as the computer only understands strings of
1’s and 0’s. While this is a bit of a misrepresentation of what the shell actually does (the idea of an
operating system is neglected) it provides a rough idea that should cause the reader to be grateful
that there is such a thing as a shell. A good way to view a shell is as follows. When a person drives a
car, that person doesn’t have to actually adjust every detail that goes along with making the engine
run, or the electronic system controlling all of the engine timing and so on. All the user (or driver in
this example) needs to know is that D means drive and that pressing the accelerator pedal will make the car go faster. The dashboard would also be considered part of the shell, since pertinent information relating to the user's involvement in operating the car is displayed there. In fact, any part of the car that the user has control of during operation would be considered part of the shell. The idea of what a shell is should be becoming clear now: it is a program that allows the user to use the computer without having to deal with it directly. It is, in a sense, a protective shell that prevents the user and the computer from coming into contact with one another.
1.2.1 Basic Unix primer
Unix comes in a variety of constantly changing flavors (SunOS, HP-UX, BSD, and Solaris, just to
name a few). Each of these Unix types will have small variations from all of the others. This may
seem a bit discouraging at first, but in reality each version of Unix has more in common with all of
the others than differences. The ls command, for example, will give a listing of the current directory in any Unix environment. The changes or semantics local to any particular brand of Unix should be explained in the man pages that come with that particular system. The purpose of this book is not to explore the differences between different Unix flavors but rather to assume that they are all equivalent and look at
how the different shells behave. Hence, the rest of the book assumes a kind of generic Unix operating
system (except where explicitly stated otherwise).
1.3 Programming Environments
Anyone who is learning to program has to choose a programming environment that makes
it possible to create and to run programs. Programming environments can be divided into two very
different types: integrated development environments and command-line environments. An integrated
development environment, or IDE, is a graphical user interface program that integrates all the aspects of programming, and probably other tools as well (such as a debugger, a visual interface builder, and project
management). A command-line environment is just a collection of commands that can be typed in to
edit files, compile source code, and run programs.
1.3.1 Integrated development environment
An integrated development environment (IDE) is a software application that provides comprehensive
facilities to computer programmers for software development. An IDE normally consists of a source
code editor, build automation tools and a debugger. Some IDEs contain a compiler, an interpreter, or both,
such as Microsoft Visual Studio and Eclipse; others do not, such as SharpDevelop and Lazarus. The
boundary between an integrated development environment and other parts of the broader software
development environment is not well-defined. Sometimes a version control system, or various tools to simplify the construction of a GUI, are integrated. Many modern IDEs also have a class browser, an
object inspector, and a class hierarchy diagram, for use with object-oriented software development.
1.3.2 History of IDE
IDEs initially became possible when developing via a console or terminal. Early systems could not
support one, since programs were prepared using flowcharts, entering programs with punched cards
(or paper tape, etc.) before submitting them to a compiler. Dartmouth BASIC was the first language
to be created with an IDE (and was also the first to be designed for use while sitting in front of a
console or terminal). Its IDE (part of the Dartmouth Time Sharing System) was command-based,
and therefore did not look much like the menu-driven, graphical IDEs prevalent today. However, it
integrated editing, file management, compilation, debugging and execution in a manner consistent
with a modern IDE.
Maestro I was a product from Softlab Munich and was the world's first integrated development environment. Maestro I was installed for 22,000 programmers worldwide. Until 1989, 6,000 installations existed in the Federal Republic of Germany. Maestro I was arguably the world leader in this field during the 1970s and 1980s. Today one of the last Maestro I systems can be found in the Museum of Information Technology at Arlington.
One of the first IDEs with a plug-in concept was SoftBench. In 1995 Computerwoche commented that the use of an IDE was not well received by developers since it would fence in their creativity.
1.4 Developer fundamentals such as editors
Before text editors existed, computer text was punched into Hollerith cards with keypunch machines. The text was carried as a physical box of these thin cardboard cards, and read into a card reader.

The first text editors were line editors oriented toward typewriter-style terminals; they did not provide a window or screen-oriented display. They usually had very short commands (to minimize typing) that reproduced the current line. Among them was a command to print a selected section(s) of the file on the typewriter (or printer) in case of necessity. An edit cursor, an imaginary insertion point, could be moved by special commands that operated with line numbers or specific text strings (context). Later, the context strings were extended to regular expressions. To see the changes, the file needed to be printed on the printer. These line-based text editors were considered revolutionary improvements over keypunch machines. In case typewriter-based terminals were not available, they were adapted to keypunch equipment. In this case the user needed to punch the commands into a separate deck of cards and feed them into the computer in order to edit the file.
1.4.1 Some fundamental editors
Some editors include special features and extra functions, as follows.
• Source code editors are text editors with additional functionality to facilitate the production of source code. These often feature user-programmable syntax highlighting, and coding tools or keyboard macros similar to an HTML editor.
• Folding editors. This subclass includes the so-called orthodox editors, which are derivatives of Xedit. The specialized version of folding is usually called outlining.
• Outliners. Also called tree-based editors, because they combine a hierarchical outline tree with a text editor. Folding can generally be considered a generalized form of outlining.
• Mathematicians, physicists, and computer scientists often produce articles and books using TeX
or LaTeX in plain text files. Such documents are often produced by a standard text editor, but
some people use specialized TeX editors.
• World Wide Web programmers are offered a variety of text editors dedicated to the task of web development. These create the plain text files that deliver web pages. HTML editors include: Dreamweaver, E (text editor), FrontPage, HotDog, HomeSite, Nvu, Tidy, GoLive, and BBEdit. Many offer the option of viewing a work in progress on a built-in web browser.
• IDEs (integrated development environments) are designed to manage and streamline larger programming projects. They are usually only used for programming as they contain many features
unnecessary for simple text editing.
1.4.2 Full screen editors
Qedit has two different full-screen modes, depending on the type of terminals or emulators we use.
For HP terminals it is called Visual mode, for VT terminals it is called Screen mode.
For HP terminals, Qedit is a screen editor with a line-command window as well as a line mode
interface. This makes it easy, to switch between making source changes and testing programs. In
Visual mode you move around the screen using the cursor keys, edit text using the Delete key, Insert
Char, Delete Line, Insert Line, and so on. You update your page by pressing the Enter key and move
around your file using the function keys (for example, F4=next string, F6next page). When you are
ready to compile, you type the compile command in the home line and press F7.
1.4.3 Text editor
A text editor is a utility program that enables the user to create, edit, save and format text files. It has special commands for manipulating text files efficiently. The commonly used text editors in practice are as under:

1. MS-Word (used with the MS Windows OS).
2. Vi editor (Unix- and Linux-based OS).
3. Gedit (Red Hat Linux OS).
4. Kwrite & Kedit (Red Hat Linux OS).
5. Nano (Red Hat Linux OS).

These can be elaborated as follows:
1. MS-Word: It is a Windows based word processor. It processes textual matter and creates
organized and flawless documents. It is fast to work in MS Word. It has the features like
graphics, OLE (Object Linking and Embedding), Mail Merge, Spell Check etc.
2. Vi editor: vi is a full-screen editor available with Unix and Linux systems and is acknowledged
as one of the most powerful editors in any environment. It works in two modes, i.e., insert mode
and command mode. There are a number of commands that can display the file and allow the
user to add, insert, delete or modify parts of the text.
Other Unix-based editors include ed, sed, etc.
3. Gedit: This lightweight text editor comes with Red Hat Linux and is used with GNOME. It has
simple edit functions (like cut, copy, paste, select) and settings (for indentation, word wrap &
spell check). It can be started by typing gedit in a terminal window.
4. Kwrite: It is an advanced editor that comes with the KDE desktop. It has a menu to create, open
and save files. It also includes edit functions like select all and find/replace.
5. Kedit: It is another KDE text editor. It allows one to open files from the file system or a URL. It
includes a convenient toolbar and a spell checker.
6. Nano: It is an advanced editor available in Red Hat Enterprise Linux. For example,
nano <file name> opens the file named file name in the nano editor.
1.4.3.1 features of text editors
Typical features of text editors include:
• Search and replace: The process of searching for a word or string in a text file and optionally
replacing the search string with a replacement string. Different methods are employed: Global
Search and Replace, Conditional Search and Replace, and Unconditional Search and Replace.
• Cut, copy, and paste: Most text editors provide methods to duplicate and move text within
the file, or between files.
• Text formatting: Text editors often provide basic formatting features like line wrap, auto
indentation, bullet list formatting, comment formatting, and so on.
• Undo and redo: As with word processors, text editors provide a way to undo and redo the
last edit. Often, especially with older text editors, there is only one level of edit history
remembered, and successively issuing the undo command will only toggle the last change.
Modern or more complex editors usually provide a multiple-level history, such that issuing
the undo command repeatedly will revert the document to successively older edits. A separate
redo command will cycle the edits forward toward the most recent changes. The number of
changes remembered depends upon the editor and is often configurable by the user.
• Importing: Reading or merging the contents of another text file into the file currently being
edited. Some text editors provide a way to insert the output of a command issued to the
operating system’s shell.
• Filtering: Some advanced text editors allow you to send all or sections of the file being edited to
another utility and read the result back into the file in place of the lines being filtered. This, for
example, is useful for sorting a series of lines alphabetically or numerically; doing mathematical
computations, and so on.
1.5 various text editors
A text editor is a type of program used for editing plain text files. Text editors are often provided
with operating systems or software development packages, and can be used to change configuration
files and programming language source code.
Qedit Full-Screen Editor: Qedit is Robelle's full-screen editor for programmers on MPE and HP-UX.
Programmers spend a lot of time editing text. Contrary to some management theories, most people
like to work and accomplish things, and programmers are no exception. Any tool that makes work
easier will attract their loyalty.
Qedit is one of the text editors available to programmers using HP systems. The first version was
released in 1977. The goals in developing Qedit were to provide the maximum power for programming,
with the least complex user interface possible, and with tiny system load. Since 1977, Robelle has
produced 34 Qedit updates, each one adding new features suggested by our users, features such as
COBOL change tags, full-screen editing, user commands, text justify, LaserJet support, porting to
HP-UX, Redo stack, unlimited Undo, and many more. Qedit is now more like a shell than like a
simple editor. It accepts system commands, including command files or shell scripts, as easily as
its own editing commands. The basic goals, however, remain the same. Qedit aims to increase
productivity and reduce system load, by providing the precise functions a programmer needs and
eliminating irritating, redundant, and time-consuming steps (such as saving your file to disc and
exiting the editor in order to compile it, then having to re-enter the editor and reload the file in order
to fix the compile errors).
1.5.1 Window editor
A Windows editor is application software that enables users to develop documents through the use
of special features for editing, saving and formatting. It is a multipurpose text editor for the Windows
system, which combines a standard and easy-to-use GUI (graphical user interface) with the functions
required for text processing, and provides extensive support in the form of a text processor and other
tools. MS-Office is an example of a Windows editor. Multiple document windows can be opened
within a single application window, with the added advantages of editing, formatting and developing;
in this form the Windows editor also acts as a multi-window editor, providing a multitasking feature
to the software system.
1.5.2 Multi window editor
A multi-window editor allows one to open multiple windows at the same time. In other words, a
multi-window editor is an editor with one application window and multiple document windows.
For example, Microsoft Word is a multi-window editor: it has one application window that provides
the interface between the user application and the operating system, while multiple document
windows can be opened within that application window, with the advantages of editing, formatting
and developing. Hence a multi-window editor provides a multitasking feature to the software system.
1.5.2.1 IDLE
IDLE is an Integrated Development Environment for Python, bundled with each release of the
language. It is written entirely in Python, using the Tkinter GUI toolkit (wrapper functions for
Tcl/Tk). Its main features are:
• Multi-window text editor with syntax highlighting, auto-completion, smart indent and more.
• Python shell with syntax highlighting.
• Integrated debugger with stepping, persistent breakpoints, and call stack visibility.
Python is named after the British comedy group Monty Python; the name IDLE can therefore be
seen as an allusion to Eric Idle, one of the group's founding members.
1.5.3 VI editor
The vi editor is a powerful utility for creating, inserting, deleting and updating files. It is the
default full-screen editor available on UNIX and Linux operating systems, and it helps in editing
and creating files very fast. Another similar editor is the Nano editor. The vi editor displays the
contents of files and, in addition, provides facilities for the user to add, delete and change other
parts of the text.
The vi editor is a full-screen editor available with Unix and Linux systems and is acknowledged as
one of the most powerful editors in any environment. The features of the vi editor are as follows:
• It is an editor available in UNIX and LINUX operating systems.
• It mainly works in two modes:
1. Insert mode.
2. Command mode.
• There exist a number of commands that can display the file.
• It also allows the user to add, remove and update files.
• It provides the feature to insert, delete and update the contents of files.
1.5.4 DOS editor
The DOS editor is an editor first introduced by Microsoft Corporation; its name comes from the
two parts MS + DOS, i.e., Microsoft Disk Operating System. The features of the DOS editor are:
• It is application software that enables the user to develop documents through the use of special
features for editing, saving, formatting, etc.
• It is a multipurpose text editor for the Windows system, which combines a standard and
easy-to-use GUI with the functions required for text editing.
• It provides extensive support in the form of a text editor.
1.6 editor and word processor
A text editor is a utility program that enables the user to create, edit, save and format text files. It
has special commands for manipulating text files efficiently. The commonly used text editors are
as follows:
1. Notepad.
2. Vi editor.
3. Gedit.
4. Kwrite and Kedit.
5. Nano.
A word processor is a Windows- or Linux-based program that processes textual matter and creates
organized and flawless documents. It is fast to work in, and it has features like graphics, OLE (Object
Linking and Embedding), mail merge, spell check, etc. Examples of word processors:
• Microsoft Office Word (a Microsoft Windows product).
• OpenOffice.org Writer (an open-source product bundled with Red Hat Linux).
1.7 full screen editor and multi window editor
An editor is application software that enables the user to develop documents through the use of
special features for editing, saving and formatting, etc.
Figure 1.2: Difference .
1.8 text files and word processor files
There are important differences between plain text files created by a text editor and document files
created by word processors such as Microsoft Word, WordPerfect, or OpenOffice.org. Briefly:
1. A plain text file is represented and edited by showing all the characters as they are present in
the file. The only characters usable for mark-up are the control characters of the character set
used; in practice this means newline, tab and form feed. The most commonly used character
set is ASCII, especially as plain text files are now used more for programming and configuration
and less frequently for documentation than in the past.
2. Documents created by a word processor generally contain file format-specific control characters
beyond what is defined in the character set. These enable functions like bold, italic, fonts,
columns, tables, etc. These and other common page formatting symbols were once associated
only with desktop publishing but are now commonplace in the simplest word processor.
3. Word processors can usually edit a plain text file and save in the plain text file format. However,
one must take care to tell the program that this is what is wanted.
This is especially important in cases such as source code, HTML, and configuration and control
files. Otherwise the file will contain those special characters unique to the word processor’s file
format and will not be handled correctly by the utility the files were intended for.
1.8.1 Case study of MS-DOS (Microsoft Disk Operating System) Editor.
MS-DOS Editor is a text editor that comes with MS-DOS (since version 5) and 32-bit versions of
Microsoft Windows. Originally (up to MS-DOS 6.22) it was actually QBasic running in editor mode.
With DOS 7 (Windows 95), QBasic was removed and MS-DOS Editor became a standalone program.
Editor is sometimes used as a substitute for Notepad on Windows 9x, where Notepad is limited
to small files only. Editor can edit files of up to 65,279 lines and up to approximately 5 MB in
size. MS-DOS versions are limited to approximately 300 KB, depending on how much conventional
memory is free. Editor can be launched by typing edit into the Run dialog on Windows or into the
command-line interface (usually cmd.exe).
1. Features: MS-DOS Editor uses a text user interface, and its color scheme can be adjusted. It
has a multiple-document interface in which Windows 9x versions can open up to nine files at a
time, while DOS versions are limited to only one file. The screen can be split vertically into
two panes, which can be used to view two files simultaneously or different parts of the same file.
It can also open files in binary mode, where a fixed number of characters is displayed per line
and newlines are treated as any other character. Editor converts Unix newlines to DOS newlines
and has mouse support. Some of these features were added only in 1995 (version 2.0), with the
release of Windows 95.
2. List of DOS commands: ACALC, APPEND, ASSIGN, ATTRIB, BACKUP, BASIC, BREAK,
CALL, CHCP, CHDIR (CD), CHKDSK, CHOICE, CLS, COMMAND, COMP, COPY, CTTY,
DATE, DEBUG, DEFRAG, DEL (ERASE), DELTREE, DIR, DISKCOMP, DISKCOPY,
DOSKEY, DRVLOCK, DYNALOAD, ECHO, EDIT, EDLIN, EJECT, EMM386, EXE2BIN,
EXIT, FASTOPEN, FC, FDISK, FIND, FOR, FORMAT, GOTO, GRAFTABL, GRAPHICS,
HELP, IF, INTERLNK, INTERSVR, JOIN, KEYB, LABEL, LOADFIX, LOADHIGH (LH),
MEM, MIRROR, MKDIR (MD), MODE, MORE, MOVE, MSCDEX, MSD, NLSFUNC, PATH,
PAUSE, POWER, PRINT, PROMPT, QBASIC, QCONFIG, RECOVER, REM, RENAME (REN),
REPLACE, RESTORE, REXX, REXXDUMP, RMDIR (RD), SCANDISK, SET, SETVER,
SHARE, SHIFT, SMARTDRV, SORT, SUBST, SYS, TIME, TREE, TRUENAME, TYPE,
UNDELETE, UNFORMAT, VER, VERIFY, VOL, XCOPY.
3. Internal Commands of DOS: Internal commands are memory-resident commands. They become
resident in memory when COMMAND.COM is loaded during the boot-up process. Examples
of internal commands from the list above include DIR, COPY, DEL, TYPE, CLS, CD, MD,
RD, REN, DATE, TIME, PATH, PROMPT, SET, VER and VOL.
Chapter 2
UNIT II
In this chapter we will talk about software engineering. There are many reasons why it is interesting
and useful to study software engineering. The discipline is commonly divided into the following
knowledge areas:
1. Software requirements: The elicitation, analysis, specification, and validation of requirements
for software.
2. Software design: The process of defining the architecture, components, interfaces, and other
characteristics of a system or component. It is also defined as the result of that process.
3. Software construction: The detailed creation of working, meaningful software through a
combination of coding, verification, unit testing, integration testing, and debugging.
4. Software testing: The dynamic verification of the behavior of a program on a finite set of
test cases, suitably selected from the usually infinite executions domain, against the expected
behavior.
5. Software maintenance: The totality of activities required to provide cost-effective support to
software.
6. Software configuration management: The identification of the configuration of a system at
distinct points in time for the purpose of systematically controlling changes to the configuration,
and maintaining the integrity and traceability of the configuration throughout the system life
cycle.
7. Software engineering management: The application of management activities (planning, co-
ordinating, measuring, monitoring, controlling, and reporting) to ensure that the development
and maintenance of software is systematic, disciplined, and quantified.
8. Software engineering process: The definition, implementation, assessment, measurement,
management, change, and improvement of the software life cycle process itself.
9. Software engineering tools and methods: The computer-based tools that are intended
to assist the software life cycle processes (see Computer-Aided Software Engineering), and the
methods which impose structure on the software engineering activity with the goal of making
the activity systematic and ultimately more likely to be successful.
10. Software quality: The degree to which a set of inherent characteristics fulfills requirements.
2.1 Coding
Good software development organizations normally require their programmers to adhere to some well-
defined and standard style of coding called coding standards. Most software development organizations
formulate their own coding standards that suit them most, and require their engineers to follow these
standards rigorously. A coding standard is desirable for several reasons:
i) Most software companies enforce some kind of coding standard, so this is good experience prepar-
ing for a real-world environment.
ii) Using a coding standard makes code more readable, making it easier for the graders to assist
students with coding problems and allowing them to evaluate the code and return grades more
quickly.
iii) All code will be graded against the same standard, allowing for more uniform grading.
iv) A coding standard gives a uniform appearance to the code written by different engineers.
v) It makes the code easy to understand.
vi) Following a standard keeps the code consistent even if multiple coders have worked
on it over time.
2.1.1 Coding standards and guidelines:
Good software development organizations usually develop their own coding standards and guidelines
depending on what best suits their organization and the type of products they develop. The following
are some representative coding standards.
i) Rules for limiting the use of globals: These rules list what types of data can be declared
global and what cannot.
ii) Naming conventions for global variables, local variables, and constant identifiers: A
possible naming convention can be that global variable names always start with a capital letter,
local variable names are made of small letters, and constant names are always capital letters.
iii) Do not use a coding style that is too clever or too difficult to understand: Code
should be easy to understand. Many inexperienced engineers actually take pride in writing
cryptic and incomprehensible code. Clever coding can obscure the meaning of the code and
hamper understanding. It also makes maintenance difficult.
iv) The code should be well-documented: As a rule of thumb, there must be at least one
comment line on the average for every three-source line.
v) Avoid goto statements: Use of goto statements makes a program unstructured and makes it
very difficult to understand.
vi) The length of any function should not exceed 10 source lines: A function that is very
lengthy is usually very difficult to understand, as it probably carries out many different functions.
For the same reason, lengthy functions are likely to have a disproportionately larger number of
bugs.
2.1.2 Code review:
Code review for a module is carried out after the module has been successfully compiled and all the
syntax errors have been eliminated. Code reviews are extremely cost-effective strategies for reducing
coding errors and producing high-quality code. Normally, two types of reviews are carried out on the
code of a module. These two types of code review techniques are code inspection and code walkthrough.
2.1.2.1 Code walk through:
The purpose of the code walkthrough is to ensure that the requirements are met, the coding is sound,
and all associated documents are completed. Code walkthrough is an informal code analysis technique.
In this technique, after a module has been coded, successfully compiled, and all syntax errors
eliminated, a few members of the development team are given the code a few days before the
walkthrough meeting to read and understand it. Each member selects some test cases and simulates
execution of the code by hand (i.e., traces execution through each statement and function call). The
main objectives of the walkthrough are to discover the algorithmic and logical errors in the code. The
members note down their findings to discuss in a walkthrough meeting where the coder of the module
is present.
Even though a code walkthrough is an informal analysis technique, several guidelines have evolved
over the years for making this naive but useful analysis technique more effective. Of course, these
guidelines are based on personal experience, common sense, and several subjective factors. Therefore,
these guidelines should be considered as examples rather than accepted as rules to be applied
dogmatically. Some of these guidelines are the following.
i) The team performing the code walkthrough should be neither too big nor too small. Ideally, it
should consist of three to seven members.
ii) In order to avoid the feeling among engineers that they are being evaluated in the code walk-
through meeting, managers should not attend the walkthrough meetings. At least two people are
required for the code walkthrough:
• Developer - who wrote the code
• Reviewer - who reviews the code
iii) Discussion should focus on discovery of errors and not on how to fix the discovered errors.
Design is presented at the code walkthrough in order for the Reviewer to understand the context of
the software change. Depending upon the scope of the software change, the design does not need to
be written, but may be presented orally.
Outcome of the Code Walkthrough:
There can be three outcomes to the code walkthrough:
i) Successful: all required checks and quality are present. Software may be released into the
approved build.
ii) Corrective Action Needed, No Further Review Needed: a list of items to be corrected
is presented. Once these corrections are performed, the code walkthrough is complete.
iii) Corrective Action Needed, Review Needed: a list of items to be corrected is presented.
Once these corrections are performed, another code walkthrough is warranted.
If the code walkthrough was successful, then the software change may be released into the approved
build.
2.1.2.2 Code Inspection:
In contrast to code walk through, the aim of code inspection is to discover some common types of
errors caused due to oversight and improper programming. In other words, during code inspection
the code is examined for the presence of certain kinds of errors, in contrast to the hand simulation
of code execution done in code walk throughs. For instance, consider the classical error of writing
a procedure that modifies a formal parameter while the calling routine calls that procedure with a
constant actual parameter. It is more likely that such an error will be discovered by looking for these
kinds of mistakes in the code, rather than by simply hand simulating execution of the procedure. In
addition to the commonly made errors, adherence to coding standards is also checked during code
inspection. Good software development companies collect statistics regarding different types of errors
commonly committed by their engineers and identify the type of errors most frequently committed.
Such a list of commonly committed errors can be used during code inspection to look out for possible
errors.
Following is a list of some classical programming errors which can be checked during code inspec-
tion:
a) Use of uninitialized variables.
b) Jumps into loops.
c) Nonterminating loops.
d) Incompatible assignments.
e) Array indices out of bounds.
f) Improper storage allocation and deallocation.
g) Mismatches between actual and formal parameter in procedure calls.
h) Use of incorrect logical operators or incorrect precedence among operators.
i) Improper modification of loop variables.
j) Comparison of equality of floating-point variables, etc.
2.2 Software and its characteristics:
Software Engineering is concerned with
a) Technical processes of software development.
b) Software project management.
c) Development of tools, methods and theories to support software production.
d) Getting results of the required quality within the schedule and budget.
e) Often involves making compromises.
f) Often adopt a systematic and organized approach.
g) Less formal development is particularly appropriate for the development of web-based systems.
Software Engineering is important because:
a) Individuals and society rely on advanced software systems.
b) Produce reliable and trustworthy systems economically and quickly.
c) Cheaper in the long run to use software engineering methods and techniques for software systems.
Fundamental activities being common to all software processes:
a) Software specification: customers and engineers define software that is to be produced and the
constraints on its operation.
b) Software development: software is designed and programmed.
c) Software validation: software is checked to ensure that it is what the customer requires.
d) Software evolution: software is modified to reflect changing customer and market requirements.
Software Engineering is related to computer science and systems engineering:
a) Computer science: Concerned with theories and methods.
b) Software Engineering: Practical problems of producing software.
c) Systems engineering: Aspects of development and evolution of complex systems and Specifying
the system, defining its overall architecture, integrating the different parts to create the finished
system.
Essential attributes of good software
a) Maintainability:
• Evolve to meet the changing needs of customers.
• Software change is inevitable (see changing business environment).
b) Dependability and security:
• Includes reliability, security and safety.
• Should not cause physical or economic damage in case of system failure.
• Take special care for malicious users.
c) Efficiency:
• Includes responsiveness, processing time, memory utilization.
• Care about memory and processor cycles
d) Acceptability:
• Acceptable to the type of users for which it is designed.
• Includes understandable, usable and compatible with other systems.
Application types:
a) Stand-alone applications.
Notes developed by Mr Narayan Changder(Assistant professor CSE,BTKIT).For any query Contact: narayan.changder@ gmail.com
17
CHAPTER 2. UNIT II
b) Interactive transaction-based applications.
c) Embedded control systems.
d) Batch processing systems.
e) Entertainment systems.
f) Systems for modeling and simulation.
g) Data collection systems.
h) Systems of systems.
Software engineering ethics:
a) Confidentiality: Respect the confidentiality of employers and clients (whether or not a formal
confidentiality agreement has been signed).
b) Competence: Do not misrepresent your level of competence (never accept work outside your
competence).
c) Intellectual property rights: be aware of local laws governing the use of intellectual property
such as patents and copyright.
d) Computer misuse: Do not use technical skills to misuse other people's computers.
2.3 Software processes:
Main software processes are
• Specification.
• Design and implementation.
• Validation.
• Evolution.
2.4 Software process models:
A software development process, also known as a software development life cycle (SDLC), is a structure
imposed on the development of a software product. There are several models for such processes, each
describing approaches to a variety of tasks or activities that take place during the process. Some
people consider a life-cycle model a more general term and a software development process a more
specific term. Here we will discuss different types of models.
2.4.1 Waterfall model:
The waterfall model is a sequential design process, often used in software development processes,
in which progress is seen as flowing steadily downwards (like a waterfall) through the phases of
Conception, Initiation, Analysis, Design, Construction, Testing, Production/Implementation, and
Maintenance.
a) Requirements analysis and definition
• Consult system users to establish the system's services, constraints and goals.
• These are then defined in detail and serve as a system specification.
b) System and software design:
• Establish an overall system architecture to allocate the requirements to either hardware or
software systems.
• Involves identifying and describing the fundamental software system abstractions and their
relationships.
c) Implementation and unit testing:
• Integrate and test individual program units or programs into complete systems.
• Ensure the software requirements have been met.
• The software system is delivered to the customer (after testing).
d) Operation and maintenance:
• Normally the longest life cycle phase.
• System is installed and put into practical use.
• Involves correcting errors, improving the implementation of system units and enhancing the
system's services as new requirements are discovered.
2.4.1.1 About the Phases:
The waterfall model has been structured in multiple phases, especially to help software construction
companies develop an organized system of construction. By following this method, the project is
divided into many stages, thus easing the whole process. For example, you start with Phase I and,
according to this model, only progress to the next phase once the previous one has been completed.
This way one moves progressively to the final stage, and once that point is reached, you cannot turn
back, similar to the water in a waterfall.
2.4.1.2 Brief Description of the Phases of Waterfall Model:
i) Definition Study / Analysis: During this phase research is conducted, which includes
brainstorming about the software: what it is going to be and what purpose it is going to fulfill.
ii) Basic Design: If the first phase gets successfully completed and a well thought out plan for the
software development has been laid then the next step involves formulating the basic design of
the software on paper.
iii) Technical Design / Detail Design: After the basic design is approved, a more elaborate technical design can be planned. Here the functions of each part are decided and the engineering units are defined, for example modules, programs, etc.
iv) Construction / Implementation: In this phase the source code of the programs is written.
v) Testing: At this phase, the whole design and its construction is put under a test to check its
functionality. If there are any errors then they will surface at this point of the process.
vi) Integration: In the integration phase, the company puts the system into use after it has been successfully tested.
CHAPTER 2. UNIT II
Figure 2.1: The software development process represented in the waterfall model (Analysis → Requirement specification → Design phase → Implementation → Testing and integration → Operation and maintenance).
vii) Management and Maintenance: Maintenance and management is needed to ensure that the
system will continue to perform as desired.
Through the above-mentioned steps it is clearly shown that the waterfall model is meant to function in a systematic way that takes the production of the software from the basic step downwards towards detailing, just like a waterfall, which begins at the top of the cliff and flows downwards but not backwards.
2.4.1.3 History of the Waterfall Model:
The history of the waterfall model is somewhat disputed. It is often said or believed that the model was first put forth by Winston Royce in 1970 in one of his articles, whereas he did not even use the word waterfall. In fact, Royce presented this model to depict a failure or a flaw in a non-working model. So later on, this term was mostly used in writing about something that is often wrongly done in the process of software development, like a common malpractice.
Figure 2.2: Winston W. Royce (1929-1995)
Royce was more of the opinion that a successful model should allow repetition, or going back and forth between phases, which the waterfall model does not do. He examined the first draft of this model and documented that a recurrent method should be developed from it. He felt the need of progressing only after feedback from the previous stage has been received. This is known as the iterative model. As opposed to the waterfall model, the iterative model is more practical and has room for manoeuvre. Followers of the iterative method perceive the waterfall model as inappropriate.
2.4.1.4 Advantages of the Waterfall Model:
Let's look at some of the advantages of this model.
i) The project requires the fulfillment of one phase before proceeding to the next. Therefore, if there is a fault in the software, it will be detected during one of the initial phases and singled out for correction.
ii) A lot of emphasis is laid on paperwork in this method as compared to the newer methods. When new workers enter the project, it is easier for them to carry on the work from where it had been left. The newer methods don't document their development process, which makes it difficult for a new member of the team to understand what step is going to follow next. The waterfall model is a straightforward method and lets one know easily what stage is in progress.
iii) The waterfall method is also well known amongst software developers and is therefore easy to use. It is easier to develop various software products through this method in a short span of time.
2.4.1.5 Disadvantages of the Waterfall Model:
There are many disadvantages to the model as well. Let's have a look at those.
i) Many software projects are dependent upon external factors, of which the client for whom the software is being designed is the biggest. It often happens that the client changes the requirements of the project, thereby forcing an alteration in the normal plan of construction and hence the functionality as well. The waterfall model doesn't work well in a situation like this, as it assumes no alteration will occur once the process has started according to plan.
ii) If, for instance, this happens in a waterfall model, then a number of steps would go to waste, and there would arise a need to start everything all over again. Of course this also wastes the time and money already spent, so this method will not prove to be cost effective. It is not even easy to work out a cost estimate for each step, as each of the phases is quite big.
iii) There are many other software developmental models which include many of the same aspects
of the Waterfall model. But unlike the Waterfall model, these methods are not largely affected
by the outside sources. In the waterfall model, there are many different people working in the
different phases of the project like the designers and builders and each carries his own opinion
regarding his area of expertise. The design, therefore, is bound to be influenced; however in the
Waterfall model, there is no room for that.
iv) The other negative aspect of this model is that a huge amount of time is wasted. For example, if we study any software development process, we know that Phase II cannot be executed until Phase I has been successfully completed; so while the designers are still designing the software, the builders' time is completely wasted.
v) Another disadvantage of this method is that the testing period comes quite late in the development process, whereas in various other development approaches the designs would be tested a lot sooner, finding flaws before a lot of time and money has been wasted.
vi) Elaborate documentation during the waterfall method has its advantages, but it is not without disadvantages as well. It takes a lot of effort and time, which is why it is not suitable for smaller projects.
2.4.2 Prototype model
Software prototyping refers to the activity of creating prototypes of software applications, i.e., incomplete versions of the software program being developed. It is an activity that can occur in software development and is comparable to prototyping as known from other fields, such as mechanical engineering or manufacturing.
A prototype is a toy implementation of the system. A prototype usually exhibits limited functional
capabilities, low reliability, and inefficient performance compared to the actual software. A prototype
is usually built using several shortcuts. The shortcuts might involve using inefficient, inaccurate, or
dummy functions. The shortcut implementation of a function, for example, may produce the desired
results by using a table look-up instead of performing the actual computations. A prototype usually
turns out to be a very crude version of the actual system.
2.4.2.1 Need for a prototype in software development:
There are many advantages to using prototyping in software development: some tangible, some abstract.
i) Reduced time and costs: Prototyping can improve the quality of requirements and specifica-
tions provided to developers. Because changes cost exponentially more to implement as they are
detected later in development, the early determination of what the user really wants can result
in faster and less expensive software.
ii) Improved and increased user involvement: Prototyping requires user involvement and allows users to see and interact with a prototype, allowing them to provide better and more complete feedback and specifications. The presence of the prototype being examined by the user prevents many misunderstandings and miscommunications that occur when each side believes the other understands what was said. Since users know the problem domain better than anyone on the development team does, increased interaction can result in a final product that has greater tangible and intangible quality. The final product is more likely to satisfy the users' desire for look, feel and performance.
Examples for prototype model:
A prototype of the actual product is preferred in situations such as:
i) user requirements are not complete.
ii) technical issues are not clear.
Let's see an example for each of the above categories.
Example 1. User requirements are not complete
In any application software, like billing in a retail shop or accounting in a firm, the users of the software are not clear about the different functionalities required. Once they are provided with the prototype implementation, they can try to use it and find out the missing functionalities.
Example 2. Technical issues are not clear
Suppose a project involves writing a compiler and the development team has never written a
compiler.
In such a case, the team can consider a simple language, try to build a compiler in order to check
the issues that arise in the process and resolve them. After successfully building a small compiler
(prototype), they would extend it to one that supports a complete language.
2.4.3 Spiral model:
The spiral model of software development is shown in figure 2.3. The diagrammatic representation of this model appears like a spiral with many loops. The exact number of loops in the spiral is not fixed. Each loop of the spiral represents a phase of the software process. For example, the innermost loop might be concerned with feasibility study, the next loop with requirements specification, the next one with design, and so on. Each phase in this model is split into four sectors (or quadrants) as shown in figure 2.3. The following activities are carried out during each phase of a spiral model.
Figure 2.3: The software development process represented in the spiral model.
First Quadrant (Objective Setting)
During the first quadrant, the objectives of the phase are identified and the risks associated with these objectives are examined.
Second Quadrant (Risk Assessment and Reduction)
A detailed analysis is carried out for each identified project risk. Steps are taken to reduce the risks.
For example, if there is a risk that the requirements are inappropriate, a prototype system may be
developed.
Third Quadrant (Development and Validation)
Develop and validate the next level of the product after resolving the identified risks.
Fourth Quadrant (Review and Planning)
Review the results achieved so far with the customer and plan the next iteration around the spiral.
A progressively more complete version of the software is built with each iteration around the spiral.
2.4.3.1 History
The spiral model was defined by Barry Boehm in his 1988 article "A Spiral Model of Software Development and Enhancement". This model was not the first model to discuss iterative development, but it was the first model to explain why the iteration matters. As originally envisioned, the iterations were typically 6 months to 2 years long. Each phase starts with a design goal and ends with the client (who may be internal) reviewing the progress thus far. Analysis and engineering efforts are applied at each phase of the project, with an eye toward the end goal of the project.
Figure 2.4: Barry W. Boehm
Barry W. Boehm is an American software engineer, TRW Emeritus Professor of Software Engineering at the Computer Science Department of the University of Southern California, known for his many contributions to software engineering.
2.4.3.2 Circumstances to use the spiral model:
The spiral model is called a meta model since it encompasses all other life cycle models. Risk handling is inherently built into this model. The spiral model is suitable for development of technically challenging software products that are prone to several kinds of risks. However, this model is much more complex than the other models; this is probably a factor deterring its use in ordinary projects.
2.4.4 Comparison of different life-cycle models:
The classical waterfall model can be considered as the basic model and all other life cycle models as
embellishments of this model. However, the classical waterfall model cannot be used in practical development projects, since this model supports no mechanism to handle the errors committed during any of the phases.
This problem is overcome in the iterative waterfall model. The iterative waterfall model is probably
the most widely used software development model evolved so far. This model is simple to understand
and use. However, this model is suitable only for well-understood problems; it is not suitable for very
large projects and for projects that are subject to many risks.
The prototyping model is suitable for projects for which either the user requirements or the un-
derlying technical aspects are not well understood. This model is especially popular for development
of the user-interface part of the projects.
The evolutionary approach is suitable for large problems which can be decomposed into a set of
modules for incremental development and delivery. This model is also widely used for object-oriented
development projects. Of course, this model can only be used if the incremental delivery of the system
is acceptable to the customer.
The spiral model is called a meta model since it encompasses all other life cycle models. Risk handling is inherently built into this model. The spiral model is suitable for development of technically challenging software products that are prone to several kinds of risks. However, this model is much more complex than the other models; this is probably a factor deterring its use in ordinary projects.
The different software life cycle models can be compared from the viewpoint of the customer.
Initially, customer confidence in the development team is usually high irrespective of the development
model followed. During the lengthy development process, customer confidence normally drops off,
as no working product is immediately visible. Developers answer customer queries using technical
slang, and delays are announced. This gives rise to customer resentment. On the other hand, an
evolutionary approach lets the customer experiment with a working product much earlier than the
monolithic approaches. Another important advantage of the incremental model is that it reduces
the customer's trauma of getting used to an entirely new system. The gradual introduction of the product via incremental phases gives the customer time to adjust to the new product. Also, from the customer's financial viewpoint, incremental development does not require a large upfront capital outlay. The customer can order the incremental versions as and when he can afford them.
2.5 Software cost estimation:
The task of software cost estimation is to determine how many resources are needed to complete the project. Usually this estimate is in programmer-months (PM). There are two very different approaches to cost estimation. The older approach is called LOC estimation, since it is based on initially estimating the number of lines of code that will need to be developed for the project. The newer approach is based on counting function points in the project description.
2.5.1 Estimation of LINES OF CODE (LOC)
The first step in LOC-based estimation is to estimate the number of lines of code in the finished project. This can be done based on experience, the size of previous projects, the size of a competitor's solution, or by breaking down the project into smaller pieces and then estimating the size of each of the smaller pieces. A standard approach is, for each piece Pi, to estimate the maximum possible size, MAXi, the minimum possible size, MINi, and the best guess size, BESTi. The estimate for the whole project is 1/6 of the sum of the maximums, the minimums, and 4 times the best guesses:

LOC = sum over all pieces Pi of (MINi + 4 × BESTi + MAXi) / 6
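The three-point estimate described above can be sketched as follows; the project breakdown and the size figures are hypothetical, for illustration only:

```python
def loc_estimate(pieces):
    """Three-point LOC estimate: for each piece, combine the minimum,
    best-guess and maximum sizes as (MIN + 4*BEST + MAX) / 6, then sum."""
    return sum((mn + 4 * best + mx) / 6 for (mn, best, mx) in pieces)

# Hypothetical breakdown: (minimum, best guess, maximum) LOC per piece.
pieces = [
    (1000, 1500, 2600),  # e.g. user interface
    (800, 1000, 1800),   # e.g. business logic
    (400, 500, 900),     # e.g. data access
]
print(loc_estimate(pieces))  # prints: 3250.0
```

Weighting the best guess four times as heavily as the extremes keeps the estimate close to the expected size while still letting the optimistic and pessimistic bounds pull it in their direction.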
2.5.2 LOC-based cost estimation:
2.5.3 CONSTRUCTIVE COST MODEL (COCOMO)
COCOMO is the classic LOC cost-estimation formula. It was created by Barry Boehm in the 1970s.
He used thousand delivered source instructions (KDSI) as his unit of size. KLOC is equivalent. His
unit of effort is the programmer-month (PM).
Boehm divided the historical project data into three types of projects:
1. Organic: A development project can be considered of organic type, if the project deals with
developing a well understood application program, the size of the development team is reasonably
small, and the team members are experienced in developing similar types of projects.
2. Semidetached: A development project can be considered of semidetached type, if the devel-
opment consists of a mixture of experienced and inexperienced staff. Team members may have
limited experience on related systems but may be unfamiliar with some aspects of the system
being developed.
3. Embedded: A development project is considered to be of embedded type if the software being developed is strongly coupled to complex hardware, or if stringent regulations on the operational procedures exist.
The above three product classes correspond to application, utility and system programs, respectively. Normally, data processing programs are considered to be application programs. Compilers, linkers, etc., are utility programs. Operating systems and real-time system programs are system programs.
System programs interact directly with the hardware and typically involve meeting timing constraints
and concurrent processing.
2.5.4 Basic COCOMO Model
The basic COCOMO model gives an approximate estimate of the project parameters. The basic
COCOMO estimation model is given by the following expressions:
Effort = a1 × (KLOC)^a2 PM
Tdev = b1 × (Effort)^b2 months
Where
• KLOC is the estimated size of the software product expressed in Kilo Lines of Code.
• a1,a2,b1,b2 are constants for each category of software products.
• Tdev is the estimated time to develop the software, expressed in months.
• Effort is the total effort required to develop the software product, expressed in person months
(PMs).
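The two expressions above can be evaluated directly using Boehm's published basic COCOMO constants (organic: a1 = 2.4, a2 = 1.05, b2 = 0.38; semidetached: a1 = 3.0, a2 = 1.12, b2 = 0.35; embedded: a1 = 3.6, a2 = 1.20, b2 = 0.32; b1 = 2.5 in all three cases). The 32 KLOC example project is hypothetical:

```python
# Basic COCOMO constants (Boehm, 1981) per project category:
# (a1, a2) for effort, (b1, b2) for development time.
COCOMO_CONSTANTS = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc, category):
    """Effort = a1 * (KLOC)^a2 person-months;
    Tdev = b1 * (Effort)^b2 months."""
    a1, a2, b1, b2 = COCOMO_CONSTANTS[category]
    effort = a1 * kloc ** a2
    tdev = b1 * effort ** b2
    return effort, tdev

# A 32 KLOC, well-understood project with an experienced team:
effort, tdev = basic_cocomo(32, "organic")
print(f"Effort ~ {effort:.1f} PM, Tdev ~ {tdev:.1f} months")
```

For this hypothetical 32 KLOC organic project the model gives roughly 91 person-months of effort and a nominal development time of about 14 months; the same size classed as embedded would require considerably more effort, reflecting the harder constraints.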
2.6 Testing:
Testing is the process of exercising a program with the specific intent of finding errors prior to delivery to the end user; that is, examining whether the program behaves as expected. If the program fails to behave as expected, then the conditions under which failure occurs are noted for later debugging and correction.
The different types of testing are:
1. Acceptance testing: Testing to verify a product meets customer specified requirements. A
customer usually does this type of testing on a product that is developed externally.
2. Black box testing Testing without knowledge of the internal workings of the item being tested.
Tests are usually functional.
3. Compatibility testing:Testing to ensure compatibility of an application or Web site with dif-
ferent browsers, OSs, and hardware platforms. Compatibility testing can be performed manually
or can be driven by an automated functional or regression test suite.
4. Conformance testing: Verifying implementation conformance to industry standards. Pro-
ducing tests for the behavior of an implementation to be sure it provides the portability, inter-
operability, and/or compatibility a standard defines.
5. Functional testing: Validating that an application or Web site conforms to its specifications and correctly performs all its required functions. This entails a series of tests which perform a feature-by-feature validation of behavior, using a wide range of normal and erroneous input data. This can involve testing of the product's user interface, APIs, database management, security, installation, networking, etc. Functional testing can be performed on an automated or manual basis using black box or white box methodologies.
6. Integration testing: Testing in which modules are combined and tested as a group. Modules
are typically code modules, individual applications, client and server applications on a network,
etc. Integration Testing follows unit testing and precedes system testing.
7. Load testing: Load testing is a generic term covering Performance Testing and Stress Testing.
8. Performance Testing: Performance testing can be applied to understand your application or
web site’s scalability, or to benchmark the performance in an environment of third party products
such as servers and middleware for potential purchase. This sort of testing is particularly useful
to identify performance bottlenecks in high use applications. Performance testing generally
involves an automated test suite as this allows easy simulation of a variety of normal, peak, and
exceptional load conditions.
9. Stress Testing: Testing conducted to evaluate a system or component at or beyond the limits
of its specified requirements to determine the load under which it fails and how. A graceful
degradation under load leading to non-catastrophic failure is the desired result. Often Stress
Testing is performed using the same process as Performance Testing but employing a very high
level of simulated load.
10. Regression Testing: Similar in scope to a functional test, a regression test allows a consistent, repeatable validation of each new release of a product or Web site. Such testing ensures reported product defects have been corrected for each new release and that no new quality problems were introduced in the maintenance process. Though regression testing can be performed manually, an automated test suite is often used to reduce the time and resources needed to perform the required testing.
11. Smoke Testing: A quick-and-dirty test that the major functions of a piece of software work
without bothering with finer details. Originated in the hardware testing practice of turning on
a new piece of hardware for the first time and considering it a success if it does not catch on
fire.
12. System Testing: Testing conducted on a complete, integrated system to evaluate the system’s
compliance with its specified requirements. System testing falls within the scope of black box
testing, and as such, should require no knowledge of the inner design of the code or logic.
13. Unit Testing: Functional and reliability testing in an Engineering environment. Producing
tests for the behavior of components of a product to ensure their correct behavior prior to
system integration.
14. White box Testing: Testing based on an analysis of internal workings and structure of a
piece of software. Includes techniques such as Branch Testing and Path Testing. Also known as
Structural Testing and Glass Box Testing.
2.6.1 Aim of testing:
The overall aim of software testing is to try to improve the quality of the software. Testing aims to find defects in the program code (these defects are known as bugs), so that the bugs can be fixed (this fixing is known as debugging).
Bugs are very common indeed in software. If a program is sufficiently long to do something
interesting, the chances are good that it has a bug in it somewhere! It is not just learner programmers
that make mistakes in their coding; even very experienced programmers make mistakes too. If this
seems surprising, think of it this way: whilst experienced programmers may make fewer elementary
coding mistakes, they can still make subtle hard-to-spot mistakes! Also, experienced programmers
typically write code forming part of large programs, and large programs have plenty of places for bugs
to hide, especially bugs of the subtle, sneaky kind. So a good rule of thumb is that "All software contains bugs, and software testing does its best to find them!"
Be careful, though: just because you test a program doesn’t mean that it is 100% bug-free. Software
testing does help you to find bugs, but it does not give you any guarantees about the quality of a
program. Instead, you should view testing in this way: testing code is definitely better than not
testing it at all, and testing can give you a limited measure of confidence in the code, depending on
how good the testing was.
Testing can only prove the presence of bugs, not their absence.
— Edsger W. Dijkstra
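Dijkstra's point can be seen in a tiny sketch (the function and its tests are hypothetical): a test suite can pass even though a defect remains.

```python
def is_leap_year(year):
    # Intended rule: divisible by 4, except century years that are
    # not divisible by 400. This version forgets the century rule.
    return year % 4 == 0

# These tests pass, giving some (but limited) confidence:
assert is_leap_year(2016) is True
assert is_leap_year(2019) is False

# Yet a bug is still present: 1900 was not a leap year,
# but this function reports that it was.
print(is_leap_year(1900))  # prints: True
```

The passing assertions prove only that no bug shows up for the inputs tried; they say nothing about the inputs that were never tested.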
Who does the testing?
Testing is a big part of software development. In the world of professional programming, software is
typically developed by teams of programmers, and testing is done on a large-scale. The testers may
not be the same people as the developers; indeed the testers may have been hired specially to do
software testing.
Whilst you are learning to program as an undergraduate at a university, however, testing is rather
different. As a beginning programmer, you are usually encouraged to practise testing on small programs that you yourself have coded. This is not ideal in some respects, because it is more difficult to find bugs in your own code than in someone else's. However, testing is an important part of your programming skills, helping you get better at finding and fixing problems in program code.
When do we test programs?
Programs are tested at all stages of development. Typically, as each programmer works on the code,
he or she will test in an incremental way, doing small tests for each small piece of code written, to
check that it seems to be working ok. In addition, there are also large-scale tests carried out when
substantial parts of the code have been assembled.
In general, it is better to find bugs as early in the development as possible. The longer bugs remain
hidden in the code, the greater the risk of writing more wrong code as a result.
2.7 verification and validation:
Verification is the process of determining whether the output of one phase of software development
conforms to that of its previous phase, whereas validation is the process of determining whether a
fully developed system conforms to its requirements specification. Thus while verification is concerned
with phase containment of errors, the aim of validation is that the final product be error free.
2.8 Unit testing:
Unit testing deals with testing a unit as a whole: the testing of individual software components or modules. It is typically done by the programmer and not by testers, as it requires detailed knowledge of the internal program design and code, and it may require developing test driver modules or test harnesses.
In order to test a single module, a complete environment is needed to provide all that is necessary for execution of the module. That is, besides the module under test itself, the following are needed in order to be able to test the module:
• The procedures belonging to other modules that the module under test calls.
• Nonlocal data structures that the module accesses.
• A procedure to call the functions of the module under test with appropriate parameters.
2.8.0.1 Drivers and Stubs:
It is always a good idea to develop and test software in "pieces". But it may seem impossible, because it is hard to imagine how you can test one "piece" if the other "pieces" that it uses have not yet been developed (and vice versa). To solve this kind of difficult problem we use stubs and drivers: a stub stands in for a module that the module under test calls, and a driver stands in for a module that calls the module under test.
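A minimal sketch of a driver and a stub (the module and function names are hypothetical, for illustration only): the stub fakes a not-yet-developed module that the module under test calls, and the driver plays the role of the module's future caller.

```python
def get_exchange_rate_stub(currency):
    """Stub: stands in for a rates module that has not yet been developed.
    Returns fixed dummy values instead of performing a real lookup."""
    return {"EUR": 0.9, "INR": 83.0}.get(currency, 1.0)

def convert(amount, currency, rate_lookup):
    """Module under test: converts a USD amount using a rate lookup."""
    return amount * rate_lookup(currency)

def driver():
    """Driver: calls the module under test with chosen parameters and
    checks the results, playing the role of the future calling module."""
    assert abs(convert(100, "EUR", get_exchange_rate_stub) - 90.0) < 1e-9
    assert convert(2, "INR", get_exchange_rate_stub) == 166.0
    print("all unit tests passed")

driver()
```

Once the real rates module is developed, the stub is replaced by it and the driver by the real caller, without changing the module under test.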
2.9 Black box testing:
2.10 Software development models:
The software life cycle is the sequence of different activities that take place during software development. Milestones are events that can be used for telling the status of the project. For example, the event of completing the user manual could be a milestone. For management purposes, milestones are essential, because the completion of milestones allows the manager to assess the progress of the software development. Two required characteristics of a milestone are:
i) It must be related to progress in the software development.
ii) It must be obvious when it has been accomplished.
2.10.1 Software life cycle activities:
a) Feasibility Study
• Feasibility: Determining if the proposed development is worthwhile.
• Market analysis: Determining if there is a potential market for this product.
b) Requirement analysis:
• Requirements: Determining what functionality the software should contain.
• Requirement elicitation: Obtaining the requirements from the user.
• Domain analysis: Determining what tasks and structures are common to this problem.
c) Planning:
• Project planning: Determining how to develop the software.
• Cost analysis: Determining cost estimates.
• Scheduling: Building a schedule for the development.
• Software quality assurance: Determining activities that will help ensure quality of the
product.
• Work-breakdown structure: Determining the subtasks necessary to develop the product.
d) Design phase
• Design: Determining how the software should provide the functionality.
• Architectural design: Designing the structure of the system.
• Interface design: Specifying the interfaces between the parts of the system.
• Detailed design: Designing the algorithms for the individual parts.
e) Implementation: Building the software.
f) Testing phase:
• Testing: Executing the software with data to help ensure that the software works correctly.
2.11 User Interface Design
User interface design or user interface engineering is the design of computers, appliances, machines,
mobile communication devices, software applications, and websites with the focus on the user’s expe-
rience and interaction. The goal of user interface design is to make the user’s interaction as simple
and efficient as possible, in terms of accomplishing user goals.
Objectives:
At the end of this section, you will be able to:
• Identify five desirable characteristics of a user interface.
• Differentiate between user guidance and an online help system.
• Differentiate between a mode-based interface and the modeless interface.
• Compare various characteristics of a GUI with those of a text-based user interface.
2.11.1 Characteristics of a user interface
It is very important to identify the characteristics desired of a good user interface, because unless we are aware of these, it is very difficult to design a good user interface. A few important characteristics of a good user interface are the following:
• Speed of learning: A good user interface should be easy to learn. Speed of learning is
hampered by complex syntax and semantics of the command issue procedures. A good user
interface should not require its users to memorize commands. Neither should the user be asked
to remember information from one screen to another while performing various tasks using the
interface. Besides, the following three issues are crucial to enhance the speed of learning:
i) Use of Metaphors and intuitive command names: Speed of learning an interface is
greatly facilitated if these are based on some day-to-day real-life examples or some physical
objects with which the users are familiar. The abstractions of real-life objects or concepts
used in user interface design are called metaphors. If the user interface of a text editor
uses concepts similar to the tools used by a writer for text editing such as cutting lines
and paragraphs and pasting them at other places, users can immediately relate to it. Another
popular metaphor is a shopping cart. Everyone knows how a shopping cart is used to make
choices while purchasing items in a supermarket. If a user interface uses the shopping cart
metaphor for designing the interaction style for a situation where similar types of choices
have to be made, then the users can easily understand and learn to use the interface. Yet
another example of a metaphor is the trashcan. To delete a file, the user may drag it to the
trashcan. Also, learning is facilitated by intuitive command names and symbolic command
issue procedures.
ii) Consistency: Once a user learns about a command, he should be able to use similar
commands in different circumstances for carrying out similar actions. This makes it easier to
learn the interface since the user can extend his knowledge about one part of the interface
to the other parts. For example, in a word processor, Control-b is the short-cut key to
embolden the selected text. The same short-cut should be used on the other parts of the
interface, for example, to embolden text in graphic objects also - circle, rectangle, polygon,
etc. Thus, the different commands supported by an interface should be consistent.
CHAPTER 2. UNIT II
iii) Component-based interface: Users can learn an interface faster if the interaction style
of the interface is very similar to the interface of other applications with which the user
is already familiar. This can be achieved if the interfaces of different applications are
developed using some standard user interface components. This, in fact, is the theme of
the component-based user interface. Examples of standard user interface components are:
radio button, check box, text field, slider, progress bar, etc.
The speed of learning characteristic of a user interface can be determined by measuring the
training time and practice that users require before they can effectively use the software.
• Speed of use: Speed of use of a user interface is determined by the time and user effort
necessary to initiate and execute different commands. This characteristic of the interface is
sometimes referred to as productivity support of the interface. It indicates how fast the users
can perform their intended tasks. The time and user effort necessary to initiate and execute
different commands should be minimal. This can be achieved through careful design of the
interface. For example, an interface that requires users to type in lengthy commands or involves
mouse movements to different areas of the screen that are wide apart for issuing commands
can slow down the operating speed of users. The most frequently used commands should have
the smallest length or be available at the top of the menu to minimize the mouse movements
necessary to issue commands.
• Speed of recall: Once users learn how to use an interface, the speed with which they can
recall the command issue procedure should be maximized. This characteristic is very important
for intermittent users. Speed of recall is improved if the interface is based on some metaphors,
symbolic command issue procedures, and intuitive command names.
• Error prevention: A good user interface should minimize the scope of committing errors
while initiating different commands. The error rate of an interface can be easily determined by
monitoring the errors committed by average users while using the interface. This monitoring can
be automated by instrumenting the user interface code with monitoring code which can record
the frequency and types of user error and later display the statistics of various kinds of errors
committed by different users.
Moreover, errors can be prevented by asking the users to confirm any potentially destructive
actions specified by them, for example, deleting a group of files.
Consistency of names, issue procedures, and behavior of similar commands and the simplicity
of the command issue procedures minimize error possibilities. Also, the interface should prevent
the user from entering wrong values.
• Attractiveness: A good user interface should be attractive to use. An attractive user interface
catches user attention and fancy. In this respect, graphics-based user interfaces have a definite
advantage over text-based interfaces.
• Consistency: The commands supported by a user interface should be consistent. The basic
purpose of consistency is to allow users to generalize the knowledge about aspects of the interface
from one part to another. Thus, consistency facilitates speed of learning, speed of recall, and
also helps in reduction of error rate.
• Feedback: A good user interface must provide feedback to various user actions. Especially, if
any user request takes more than a few seconds to process, the user should be informed about the
state of the processing of his request. In the absence of any response from the computer for a
long time, a novice user might even start recovery/shutdown procedures in panic. If required,
the user should be periodically informed about the progress made in processing his command.
For example, if the user specifies a file copy/file download operation, a progress bar can be
displayed to indicate the status. This will help the user to monitor the progress of the action
initiated.
• Support for multiple skill levels: A good user interface should support multiple levels of
sophistication of command issue procedure for different categories of users. This is necessary
because users with different levels of experience in using an application prefer different types of
user interfaces. Experienced users are more concerned about the efficiency of the command issue
procedure, whereas novice users give importance to usability aspects. Very cryptic and complex
commands discourage a novice, whereas elaborate command sequences make the command issue
procedure very slow and therefore put off experienced users. When someone uses an application
for the first time, his primary concern is speed of learning. After using an application for
extended periods of time, he becomes familiar with the operation of the software. As a user
becomes more and more familiar with an interface, his focus shifts from usability aspects to
speed of command issue aspects. Experienced users look for options such as hot-keys, macros,
etc. Thus, the skill level of users improves as they keep using a software product and they look
for commands to suit their skill levels.
• Error recovery (undo facility): While issuing commands, even expert users can commit
errors. Therefore, a good user interface should allow a user to undo a mistake committed by
him while using the interface. Users are put to inconvenience, if they cannot recover from the
errors they commit while using the software.
• User guidance and on-line help: Users seek guidance and on-line help when they either forget
a command or are unaware of some features of the software. Whenever users need guidance or
seek help from the system, they should be provided with the appropriate guidance and help.
2.12 Types of User Interfaces
User interfaces can be classified into the following three categories:
• Command language based interfaces.
• Menu-based interfaces.
• Direct manipulation interfaces.
1. Command Language-based Interface A command language-based interface, as the name
itself suggests, is based on designing a command language which the user can use to issue the
commands. The user is expected to frame the appropriate commands in the language and
type them in appropriately whenever required. A simple command language-based interface
might simply assign unique names to the different commands. However, a more sophisticated
command language-based interface may allow users to compose complex commands by using
a set of primitive commands. Such a facility to compose commands dramatically reduces the
number of command names one would have to remember. Thus, a command language-based
interface can be made concise requiring minimal typing by the user. Command language-based
interfaces allow fast interaction with the computer and simplify the input of complex commands.
2. Menu-based Interface An important advantage of a menu-based interface over a command
language-based interface is that a menu-based interface does not require the users to remember
the exact syntax of the commands. A menu-based interface is based on recognition of the
command names, rather than recollection. Further, in a menu-based interface the typing effort
is minimal as most interactions are carried out through menu selections using a pointing device.
This factor is an important consideration for the occasional user who cannot type fast.
However, experienced users find a menu-based user interface to be slower than a command
language-based interface because an experienced user can type fast and can get speed advan-
tage by composing different primitive commands to express complex commands. Composing
commands in a menu-based interface is not possible. This is because of the fact that actions
involving logical connectives (and, or, etc.) are awkward to specify in a menu-based system.
Also, if the number of choices is large, it is difficult to select from the menu. In fact, a major
challenge in the design of a menu-based interface is to structure large number of menu choices
into manageable forms.
3. Direct Manipulation Interfaces Direct manipulation interfaces present the interface to the
user in the form of visual models (i.e. icons or objects). For this reason, direct manipulation
interfaces are sometimes called iconic interfaces. In this type of interface, the user issues
commands by performing actions on the visual representations of the objects, e.g. pull an
icon representing a file into an icon representing a trash box, for deleting the file. Important
advantages of iconic interfaces include the fact that the icons can be recognized by the users
very easily, and that icons are language-independent. However, direct manipulation interfaces
can be considered slow for experienced users. Also, it is difficult to give complex commands
using a direct manipulation interface. For example, if one has to drag an icon representing the
file to a trash box icon for deleting a file, then in order to delete all the files in a directory
one has to perform this operation individually for every file, whereas the same task could be
accomplished very easily by issuing a single delete command.
2.13 Menu-based interfaces
When the number of menu choices is large, they can be structured in the following ways:
• Scrolling menu: When a full choice list can not be displayed within the menu area, scrolling
of the menu items is required. This would enable the user to view and select the menu items
that cannot be accommodated on the screen. However, in a scrolling menu all the commands
should be highly correlated, so that the user can easily locate a command that he needs. This
is important since the user cannot see all the commands at any one
time. An example situation where a scrolling menu is frequently used is font size selection in a
document processor (as shown in Figure 2.5). Here, the user knows that the command list contains
only the font sizes that are arranged in some order and he can scroll up and down to find the
size he is looking for. However, if the commands do not have any definite ordering relation,
then the user would have to, in the worst case, scroll through all the commands to find the exact
command he is looking for, making this organization inefficient.
• Walking menu: A walking menu is very commonly used to structure a large collection of menu
items. In this technique, when a menu item is selected, it causes further menu items to be
displayed adjacent to it in a sub-menu. A walking menu can successfully be used to structure
Figure 2.5: Font size selection using scrolling menu
commands only if there are tens rather than hundreds of choices since each adjacently displayed
menu does take up screen space and the total screen area is, after all, limited.
• Hierarchical menu: In this technique, the menu items are organized in a hierarchy or tree
structure. Selecting a menu item causes the current menu display to be replaced by an appropri-
ate sub-menu. Thus in this case, one can consider the menu and its various sub-menus to form
a hierarchical tree-like structure. A walking menu can be considered a form of hierarchical
menu which is practicable when the tree is shallow. A hierarchical menu can be used to manage
a large number of choices, but the users are likely to face navigational problems because they
might lose track of where they are in the menu tree. This probably is the main reason why this
type of interface is very rarely used.
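The tree organization behind walking and hierarchical menus can be sketched with a simple recursive data structure. This is only an illustration; the type and function names below are our own assumptions, not tied to any particular toolkit:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Each menu item is either a command (a leaf) or a sub-menu (an internal
// node). Selecting an internal item displays its children, which is
// exactly the tree structure described above.
struct MenuItem {
    std::string label;
    std::vector<MenuItem> children;  // empty for leaf (command) items
    bool isSubMenu() const { return !children.empty(); }
};

// Count all selectable command (leaf) items reachable from a menu.
int countCommands(const MenuItem& item) {
    if (!item.isSubMenu()) return 1;
    int n = 0;
    for (const MenuItem& child : item.children)
        n += countCommands(child);
    return n;
}
```

A shallow tree of this kind corresponds to a walking menu; a deeper one corresponds to a full hierarchical menu, where the navigational problems mentioned above appear as the depth grows.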
Chapter 3
OBJECT ORIENTED
CONCEPT
This chapter begins our study of object-oriented programming (OOP). What is OOP? One popular
definition describes OOP as a methodology that organizes a program into a collection of interacting
objects. A more technical definition asserts that OOP is a programming paradigm that incorporates
the principles of encapsulation, inheritance, and polymorphism. If these characterizations mean little
or nothing to you, don't be dismayed. OOP is not a concept that can be easily described or explained
in a single sentence or even a single paragraph. In fact, the foundations of OOP are encapsulation,
inheritance, and polymorphism. With a bit of time and practice, OOP will become quite natural to
you. For the present, however, let's just say that OOP is all about objects.
In this chapter you will learn about objects: what they are and how to use them. Once you understand
objects, then encapsulation, inheritance, polymorphism, and all the nuances and advantages of
OOP easily follow.
3.1 Object-Oriented Programming Concepts
The important concepts of OOP are:
• Objects.
• Classes.
• Inheritance.
• Data Abstraction.
• Data Encapsulation.
• Polymorphism.
• Overloading.
• Re-usability.
3.1.1 OBJECTS
An object is the basic unit of object-oriented programming. Each object is identified by a unique name.
An object represents a particular instance of a class, and there can be more than one
instance of a class. Each instance can hold its own relevant data. An object is a collection of data
members and associated member functions, also known as methods.
In the context of a computer program, an object is a representation or an abstraction of some
entity such as a car, a soda machine, an ATM machine, a slot machine, a dog, an elephant, a person,
a house, a string of twine, a string of characters, a bank account, a pair of dice, a deck of cards, a
point in the plane, a TV, a DVD player, an iPod, a rocket, an elevator, a square, a rectangle, a circle,
a camera, a movie star, a shooting star, a computer mouse, a live mouse, a phone, an airplane, a song,
a city, a state, a country, a planet, a glass window, or a computer window. Just about anything is
an object. An object may be physical, like a radio, or intangible, like a song. Just as a noun is a
person, place, or thing, so is an object. And, just as people, places, and things are defined through
their attributes and behaviors, so are objects.
What is an object?
An object has characteristics or attributes; an object has actions or behaviors. Specifically, an
object is an entity that consists of:
1. data (the attributes), and
2. Methods that use or manipulate the data (the behaviors).
The remote control unit of Figure 3.1 provides a good example. With this rather bare bones
remote, an armchair viewer can turn a TV on or off, raise or lower the volume, or change the channel.
Figure 3.1: A remote-control unit
Accordingly, a remote control object has three attributes :
1. The current channel, an integer,
2. The volume level, an integer, and
3. The current state of the TV, on or off, true or false
along with five behaviors or methods:
1. Raise the volume by one unit,
2. Lower the volume by one unit,
3. Increase the channel number by one,
4. Decrease the channel number by one, and
5. Switch the TV on or off.
The remote control unit exemplifies encapsulation, one of the three major tenets of OOP. (The other two are inheritance and polymorphism.)
Encapsulation
Encapsulation is defined as the language feature that packages
attributes and behaviors into a single unit. That is, data and
methods comprise a single entity.
Accordingly, each remote control object encapsulates data and methods, attributes and behaviors.
An individual remote unit, an object, stores its own attributes (channel number, volume level, power
state) and has the functionality to change those attributes. It's all in a single package.
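As a minimal sketch of how such a remote-control object could be written in C++ (the class name, member names, and initial values are our own assumptions), the three attributes and five behaviors bundle into one class:

```cpp
#include <cassert>

// A remote-control object: three attributes and five behaviors,
// packaged (encapsulated) together in a single unit.
class RemoteControl {
public:
    RemoteControl() : channel(1), volume(5), poweredOn(false) {}

    void togglePower() { poweredOn = !poweredOn; }      // switch the TV on or off
    void volumeUp()    { ++volume; }                    // raise the volume by one unit
    void volumeDown()  { if (volume > 0) --volume; }    // lower the volume by one unit
    void channelUp()   { ++channel; }                   // increase the channel number
    void channelDown() { if (channel > 1) --channel; }  // decrease the channel number

    int  getChannel() const { return channel; }
    int  getVolume()  const { return volume; }
    bool isOn()       const { return poweredOn; }

private:
    int  channel;    // the current channel
    int  volume;     // the volume level
    bool poweredOn;  // the current state of the TV
};
```

Each RemoteControl object stores its own channel, volume, and power state, and only its member functions can change them.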
A rectangle is also an object. The attributes of a rectangle might be length and width, two floating-
point numbers; the methods compute and return area and perimeter. Figure shows three different
rectangle objects. Again notice that data and methods come bundled together; data and methods are
encapsulated in one object. Each rectangle has its own set of attributes; all share the same behaviors.
3.1.2 Classes
Classes are data types based on which objects are created. Objects with similar properties and
methods are grouped together to form a class. Thus a class represents a set of individual objects.
Characteristics of an object are represented in a class as properties. The actions that can be performed
by objects become functions of the class and are referred to as methods.
3.1.3 Inheritance
Inheritance is the process of forming a new class from an existing class, or base class. The base class
is also known as the parent class or super class. The new class that is formed is called the derived
class; it is also known as a child class or sub class. Inheritance helps in reducing the overall code size of
the program, which is an important concept in object-oriented programming.
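A small C++ sketch of a base and derived class illustrates how inheritance reduces code size (the class names here are illustrative assumptions):

```cpp
#include <cassert>
#include <string>

// Base class (also called parent class or super class).
class Vehicle {
public:
    explicit Vehicle(int w) : wheels(w) {}
    int wheelCount() const { return wheels; }  // written once, in the base class
private:
    int wheels;
};

// Derived class (also called child class or sub class): it reuses
// Vehicle's code through inheritance instead of repeating it.
class Car : public Vehicle {
public:
    explicit Car(const std::string& m) : Vehicle(4), model(m) {}
    std::string getModel() const { return model; }
private:
    std::string model;
};
```

The wheelCount function is defined only in Vehicle, yet every Car object can use it, which is the code-size saving the text refers to.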
3.1.4 Data Abstraction
Data Abstraction increases the power of programming language by creating user defined data types.
Data Abstraction also represents the needed information in the program without presenting the details.
3.1.5 Data Encapsulation
With data encapsulation, data is not accessed directly; it is only accessible through the functions present
inside the class. Data encapsulation makes the important concept of data hiding possible.
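A brief sketch of data hiding in C++ (the Account class below is an illustrative assumption): the data member is private, so it can be read or changed only through the member functions of the class.

```cpp
#include <cassert>

// Data hiding: balance is private, so outside code cannot touch it
// directly; it is accessible only through the member functions.
class Account {
public:
    Account() : balance(0) {}
    void deposit(int amount) { if (amount > 0) balance += amount; }
    bool withdraw(int amount) {
        if (amount <= 0 || amount > balance) return false;  // invalid request refused
        balance -= amount;
        return true;
    }
    int getBalance() const { return balance; }
private:
    int balance;  // not accessible directly from outside the class
};
```

Because every change goes through deposit or withdraw, the class can enforce its own rules (for example, refusing a withdrawal that exceeds the balance).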
3.1.6 Polymorphism
Polymorphism allows routines to use variables of different types at different times. An operator or
function can be given different meanings or functions. Polymorphism refers to a single function or
multi-functioning operator performing in different ways.
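Run-time polymorphism through virtual functions can be sketched as follows (the shape classes are our own illustrative choice): the single call s.area() behaves differently depending on the actual object it is applied to.

```cpp
#include <cassert>

// One function name, many behaviors: each kind of shape supplies its
// own implementation of area().
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
private:
    double side;
};

class Circle : public Shape {
public:
    explicit Circle(double r) : radius(r) {}
    double area() const override { return 3.14159 * radius * radius; }
private:
    double radius;
};

// The same call works on any Shape; the derived class decides what
// actually happens. This is run-time polymorphism.
double reportArea(const Shape& s) { return s.area(); }
```

reportArea needs no knowledge of which shapes exist; adding a new shape class requires no change to it.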
3.1.7 Overloading
Overloading is one type of Polymorphism. It allows an object to have different meanings, depending
on its context. When an existing operator or function begins to operate on a new data type, or class, it
is understood to be overloaded.
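Both forms of overloading mentioned above can be sketched in a few lines of C++ (the names add and Complex are illustrative assumptions):

```cpp
#include <cassert>

// Function overloading: the same name, add, operates on different types.
int    add(int a, int b)       { return a + b; }
double add(double a, double b) { return a + b; }

// Operator overloading: the existing + operator is given a meaning
// for a new user-defined type.
struct Complex {
    double re, im;
};

Complex operator+(const Complex& x, const Complex& y) {
    Complex r;
    r.re = x.re + y.re;
    r.im = x.im + y.im;
    return r;
}
```

The compiler selects the right version of add from the argument types, and a + b on two Complex values invokes the overloaded operator rather than built-in addition.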
3.1.8 Re-usability
This term refers to the ability for multiple programmers to use the same written and debugged existing
class of data. This is a time saving device and adds code efficiency to the language. Additionally,
the programmer can incorporate new features to the existing class, further developing the application
and allowing users to achieve increased performance. This time saving feature optimizes code, helps
in building secure applications and facilitates easier maintenance of the application.
3.2 Model
A model captures aspects important for some application while omitting (or abstracting) the rest. A
model in the context of software development can be graphical, textual, mathematical, or program
code-based. Models are very useful in documenting the design and analysis results. Models also
facilitate the analysis and design procedures themselves. Graphical models are very popular because
they are easy to understand and construct. UML is primarily a graphical modeling tool. However, it
often requires text explanations to accompany the graphical models.
3.2.1 Need for a model
An important reason behind constructing a model is that it helps manage complexity. Once models
of a system have been constructed, these can be used for a variety of purposes during software
development, including the following:
• Analysis.
• Specification.
• Code generation.
• Design.
• Visualize and understand the problem and the working of a system.
• Testing, etc.
In all these applications, the UML models can not only be used to document the results but also to
arrive at the results themselves. Since a model can be used for a variety of purposes, it is reasonable
to expect that the model would vary depending on the purpose for which it is being constructed. For
example, a model developed for initial analysis and specification should be very different from the
one used for design. A model that is being used for analysis and specification would not show any
of the design decisions that would be made later on during the design stage. On the other hand, a
model used for design purposes should capture all the design decisions. Therefore, it is a good idea
to explicitly mention the purpose for which a model has been developed, along with the model.
3.2.2 Unified Modeling Language (UML)
UML, as the name implies, is a modeling language. It may be used to visualize, specify, construct,
and document the artifacts of a software system. It provides a set of notations (e.g. rectangles, lines,
ellipses, etc.) to create a visual model of the system. Like any other language, UML has its own
syntax (symbols and sentence formation rules) and semantics (meanings of symbols and sentences).
Also, we should clearly understand that UML is not a system design or development methodology,
but can be used to document object-oriented design and analysis results obtained using some methodology.
3.2.3 UML diagrams
UML can be used to construct nine different types of diagrams to capture five different views of a
system. Just as a building can be modeled from several views (or perspectives) such as ventilation
perspective, electrical perspective, lighting perspective, heating perspective, etc.; the different UML
diagrams provide different perspectives of the software system to be developed and facilitate a com-
prehensive understanding of the system. Such models can be refined to get the actual implementation
of the system.
The UML diagrams can capture the following five views of a system:
• Users view.
• Structural view.
• Behavioral view.
• Implementation view.
• Environmental view.
Figure 3.2: Different types of diagrams and views supported in UML (summarized below):
• Users view: use case diagram.
• Structural view: class diagram, object diagram.
• Behavioral view: sequence diagram, collaboration diagram, state-chart diagram, activity diagram.
• Implementation view: component diagram.
• Environmental view: deployment diagram.
1. Users view: This view defines the functionalities (facilities) made available by the system
to its users. The users view captures the external users view of the system in terms of the
functionalities offered by the system. The users view is a black-box view of the system where
the internal structure, the dynamic behavior of different system components, the implementation
etc. are not visible. The users view is very different from all other views in the sense that it
is a functional model compared to the object model of all other views. The users view can be
considered as the central view and all other views are expected to conform to this view. This
thinking is in fact the crux of any user centric development style.
2. Structural view: The structural view defines the kinds of objects (classes) important to the
understanding of the working of a system and to its implementation. It also captures the
relationships among the classes (objects). The structural model is also called the static model,
since the structure of a system does not change with time.
3. Behavioral view: The behavioral view captures how objects interact with each other to realize
the system behavior. The system behavior captures the time-dependent (dynamic) behavior of
the system.
4. Implementation view: This view captures the important components of the system and their
dependencies.
5. Environmental view: This view models how the different components are implemented on
different pieces of hardware.
3.3 Use Case Model
The use case model for any system consists of a set of use cases. Intuitively, use cases represent the
different ways in which a system can be used by the users. A simple way to find all the use cases
of a system is to ask the question: What can the users do using the system? Thus for the Library
Information System (LIS), the use cases could be:
• issue-book.
• query-book.
• return-book.
• create-member.
• add-book, etc
Use cases correspond to the high-level functional requirements. The use cases partition the system
behavior into transactions, such that each transaction performs some useful action from the users
point of view. Completing each transaction may involve either a single message or multiple message
exchanges between the user and the system.
3.3.1 Purpose of use cases
The purpose of a use case is to define a piece of coherent behavior without revealing the internal
structure of the system. The use cases do not mention any specific algorithm to be used or the
internal data representation, internal structure of the software, etc. A use case typically represents a
sequence of interactions between the user and the system. These interactions consist of one mainline
sequence. The mainline sequence represents the normal interaction between a user and the system;
it is the most frequently occurring sequence of interaction. For example, the mainline sequence of the
withdraw cash use case supported by a bank ATM would be: insert the card, enter the password,
specify the amount to be withdrawn, complete the transaction, and get the amount. Several variations to the mainline sequence may also exist. Typically, a variation
from the mainline sequence occurs when some specific conditions hold. For the bank ATM example,
variations or alternate scenarios may occur, if the password is invalid or the amount to be withdrawn
exceeds the account balance. The variations are also called alternative paths. A use case can be
viewed as a set of related scenarios tied together by a common goal. The mainline sequence and each
of the variations are called scenarios or instances of the use case. Each scenario is a single path of
user events and system activity through the use case.
3.3.2 Representation of use cases
Use cases can be represented by drawing a use case diagram and writing an accompanying text
elaborating the drawing. In the use case diagram, each use case is represented by an ellipse with the
name of the use case written inside the ellipse. All the ellipses (i.e. use cases) of a system are enclosed
within a rectangle which represents the system boundary. The name of the system being modeled
(such as Library Information System) appears inside the rectangle.
The different users of the system are represented by using the stick person icon. Each stick person
icon is normally referred to as an actor. An actor is a role played by a user with respect to the system
use. It is possible that the same user may play the role of multiple actors. Each actor can participate
in one or more use cases. The line connecting the actor and the use case is called the communication
relationship. It indicates that the actor makes use of the functionality provided by the use case. Both
the human users and the external systems can be represented by stick person icons. When a stick
person icon represents an external system, it is annotated by the stereotype «external system».
Example 3. The use case model for the Tic-tac-toe problem is shown in Figure 3.3. This software has
only one use case: play move. Note that the use case get-user-move is not used here. The name get-
user-move would be inappropriate because the use cases should be named from the users perspective.
Figure 3.3: Use case model for the tic-tac-toe game (actor: Player; use case: play move)
1. Text Description: Each ellipse on the use case diagram should be accompanied by a text
description. The text description should define the details of the interaction between the user
and the computer and other aspects of the use case. It should include all the behavior associated
with the use case in terms of the mainline sequence, different variations to the normal behavior,
the system responses associated with the use case, the exceptional conditions that may occur in
the behavior, etc. The behavior description is often written in a conversational style describing
the interactions between the actor and the system. The text description may be informal, but
some structuring is recommended. The following are some of the information which may be
included in a use case text description in addition to the mainline sequence, and the alternative
scenarios.
2. Contact persons: This section lists the personnel of the client organization with whom the
use case was discussed, date and time of the meeting, etc.
3. Actors: In addition to identifying the actors, some information about actors using this use case
which may help the implementation of the use case may be recorded.
4. Pre-condition: The preconditions would describe the state of the system before the use case
execution starts.
5. Post-condition: This captures the state of the system after the use case has successfully
completed.
6. Non-functional requirements: This could contain the important constraints for the design
and implementation, such as platform and environment conditions, qualitative statements, re-
sponse time requirements, etc.
7. Exceptions, error situations: This contains only the domain-related errors, such as lack of
user's access rights, invalid entries in the input fields, etc. Errors that are not domain related,
such as software errors, need not be discussed here.
8. Sample dialogs: These serve as examples illustrating the use case.
Notes developed by Mr Narayan Changder(Assistant professor CSE,BTKIT).For any query Contact: narayan.changder@ gmail.com
9. Specific user interface requirements: These contain specific requirements for the user
interface of the use case. For example, it may contain forms to be used, screen shots, interaction
style, etc.
10. Document references: This part contains references to specific domain-related documents
which may be useful to understand the system operation.
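The template above (items 1–10 plus the mainline sequence and alternative scenarios) can be captured as a simple data structure. A minimal sketch in Python; the class and field names are assumptions for illustration, not part of UML:

```python
from dataclasses import dataclass, field

@dataclass
class UseCaseDescription:
    """Mirrors the numbered sections of a use case text description."""
    name: str
    contact_persons: list = field(default_factory=list)             # item 2
    actors: list = field(default_factory=list)                      # item 3
    precondition: str = ""                                          # item 4
    postcondition: str = ""                                         # item 5
    nonfunctional_requirements: list = field(default_factory=list)  # item 6
    exceptions: list = field(default_factory=list)                  # item 7
    sample_dialogs: list = field(default_factory=list)              # item 8
    ui_requirements: list = field(default_factory=list)             # item 9
    document_references: list = field(default_factory=list)         # item 10
    mainline_sequence: list = field(default_factory=list)
    alternative_scenarios: list = field(default_factory=list)

register_customer = UseCaseDescription(
    name="register-customer",
    actors=["Customer"],
    precondition="Customer is not yet registered",
    postcondition="Customer is registered and has a generated id",
    mainline_sequence=[
        "Customer: select register customer option",
        "System: prompt for name, address and telephone number",
        "Customer: enter the necessary values",
        "System: display the generated id and a success message",
    ],
)
```

Structuring the description this way makes it easy to check that no section has been forgotten.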
Example 4. The use case model for the Supermarket Prize Scheme is shown in the figure. As discussed
earlier, the use cases correspond to the high-level functional requirements. From the problem description
and the context diagram, we can identify three use cases: register-customer, register-sales,
and select-winners. As a sample, the text description for the use case register-customer is shown.
Figure 3.4: Use case model for Supermarket Prize Scheme
Text description
U1: register-customer: Using this use case, the customer can register himself by providing the necessary details.
Scenario 1: Mainline sequence
1. Customer: select register customer option.
2. System: display prompt to enter name, address, and telephone number.
3. Customer: enter the necessary values.
4. System: display the generated id and the message that the customer has been successfully
registered.
Scenario 2: At step 4 of mainline sequence
1. System: displays the message that the customer has already registered.
The description for other use cases is written in a similar fashion.
3.3.2.1 Utility of use case diagrams
In the use case diagram, the use cases are represented by ellipses. They, along with the accompanying
text descriptions, serve as a type of requirements specification of the system and form the core model
to which all other models must conform. But what about the actors (stick person icons)? One possible
use of identifying the different types of users (actors) is in identifying and implementing a security
mechanism through a login system, so that each actor can invoke only those functionalities to which
he is entitled. Another possible use is in preparing the documentation (e.g. user's manual) targeted
at each category of user. Further, actors help in identifying the use cases and understanding the
exact functioning of the system.
3.3.2.2 Factoring of use cases
It is often desirable to factor use cases into component use cases. Factoring of use cases is
required under two situations. First, complex use cases need to be factored into simpler use cases.
This not only makes the behavior associated with the use case much more comprehensible, but also
makes the corresponding interaction diagrams more tractable. Without decomposition, the
interaction diagrams for complex use cases may become too large to be accommodated on a single
standard-sized (A4) sheet. Secondly, use cases need to be factored whenever there is common behavior
across different use cases. Factoring makes it possible to define such behavior only once and reuse it
whenever required. It is desirable to factor out common behavior such as error handling from a set of
use cases. This makes analysis of the class design much simpler and more elegant. However, a word of
caution here: factoring of use cases should not be done except for achieving the above two objectives.
From the design point of view, it is not advantageous to break up a use case into many smaller parts
just for the sake of it.
UML offers three mechanisms for factoring of use cases as follows:
1. Generalization: Use case generalization can be used when one use case is similar to another,
but does something slightly differently or something more. Generalization works the same way
with use cases as it does with classes. The child use case inherits the behavior and meaning
of the parent use case. The notation is the same too (as shown in the figure). It is important
to remember that the base and the derived use cases are separate use cases and should have
separate text descriptions.
Figure 3.5: Representation of use case generalization (base use case: Pay membership fee; derived use cases: Pay through credit card, Pay through library pay card)
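Use case generalization parallels class inheritance, so the relationship in the figure can be sketched as ordinary subclassing. The class and method names below are assumptions for illustration:

```python
class PayMembershipFee:
    """Base use case: behaviour common to every way of paying the fee."""
    def pay(self, amount):
        return f"paid {amount} via {self.method()}"

    def method(self):
        return "unspecified means"

class PayThroughCreditCard(PayMembershipFee):
    """Derived use case: inherits pay(), specializes the payment means."""
    def method(self):
        return "credit card"

class PayThroughLibraryPayCard(PayMembershipFee):
    def method(self):
        return "library pay card"

print(PayThroughCreditCard().pay(100))   # paid 100 via credit card
```

Just as the text warns, each subclass is still a separate use case and would keep its own text description.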
3.4 Class diagrams
A class diagram describes the static structure of a system. It shows how a system is structured rather
than how it behaves. The static structure of a system comprises a number of class diagrams and
their dependencies. The main constituents of a class diagram are classes and their relationships:
generalization, aggregation, association, and various kinds of dependencies.
3.4.1 Classes
Classes represent entities with common features, i.e. attributes and operations. Classes are
represented as solid-outline rectangles with compartments. A class has a mandatory name compartment,
where the name is written centered in boldface. The class name is usually written in mixed case,
beginning with an uppercase letter, and is usually chosen to be a singular noun.
Classes have optional attributes and operations compartments. A class may appear on several
diagrams; its attributes and operations are suppressed on all but one diagram.
3.4.2 Attributes
An attribute is a named property of a class. It represents the kind of data that an object might
contain. Attributes are listed with their names, and may optionally contain a specification of their
type, an initial value, and constraints. The type of the attribute is written by appending a colon and
the type name after the attribute name. Typically, the first letter of an attribute name is a lowercase
letter. An example of an attribute is given below.
bookName : String
3.4.3 Operation
An operation is the implementation of a service that can be requested from any object of the class to
effect behaviour. An object's data or state can be changed by invoking an operation of the object.
A class may have any number of operations, or no operation at all. Typically, the first letter of an
operation name is a lowercase letter. Abstract operations are written in italics. The parameters of an
operation (if any) may have a kind specified, which may be in, out or inout. An operation may have
a return type consisting of a single return type expression. An example of an operation is given below.
issueBook(in bookName):Boolean
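The attribute and operation notation above maps directly onto a class definition. A minimal sketch, assuming a hypothetical Book class:

```python
class Book:
    """Hypothetical class carrying the attribute and operation shown above."""
    def __init__(self, bookName: str):
        # attribute  bookName : String  (name begins with a lowercase letter)
        self.bookName = bookName
        self.issued = False

    def issueBook(self, bookName: str) -> bool:
        # operation  issueBook(in bookName) : Boolean
        if self.bookName == bookName and not self.issued:
            self.issued = True
            return True
        return False

book = Book("Graph Theory")
print(book.issueBook("Graph Theory"))   # True
```

The `in` parameter kind corresponds to an argument that is read but not modified, and the return type expression `Boolean` becomes the `bool` return annotation.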
3.5 Association
Associations are needed to enable objects to communicate with each other. An association describes a
connection between classes. The association relation between two objects is called object connection
or link. Links are instances of associations. A link is a physical or conceptual connection between
object instances. For example, suppose Amit has borrowed the book Graph Theory. Here, borrowed
is the connection between the objects Amit and Graph Theory book. Mathematically, a link can be
considered to be a tuple, i.e. an ordered list of object instances. An association describes a group
of links with a common structure and common semantics. For example, consider the statement that
Library Member borrows Books. Here, borrows is the association between the class LibraryMember
and the class Book. Usually, an association is a binary relation (between two classes). However, three
or more different classes can be involved in an association. A class can have an association relationship
with itself (called recursive association). In this case, it is usually assumed that two different objects
of the class are linked by the association relationship.
Association between two classes is represented by drawing a straight line between the concerned
classes. The figure illustrates the graphical representation of the association relation. The name of the
association is written alongside the association line. An arrowhead may be placed on the association
line to indicate the reading direction of the association. The arrowhead should not be misunderstood
as indicating the direction of a pointer implementing the association. On each side of the association
relation, the multiplicity is noted as an individual number or as a value range. The multiplicity
indicates how many instances of one class are associated with each other. Value ranges of multiplicity
are noted by specifying the minimum and maximum values, separated by two dots, e.g. 1..5. An
asterisk is a wild card and means many (zero or more). The association of the figure should be read as:
many Books may be borrowed by a LibraryMember. Observe that associations (and links) appear as
verbs in the problem statement.
Figure 3.6: Association between two classes
Associations are usually realized by assigning appropriate reference attributes to the classes
involved. Thus, associations can be implemented using pointers from one object class to another. Links
and associations can also be implemented by using a separate class that stores which objects of one class
are linked to which objects of another class. Some CASE tools use the role names of the association
relation for the corresponding automatically generated attribute.
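The borrows association can be realized with a reference attribute, exactly as described above. A minimal sketch (class and attribute names assumed):

```python
class Book:
    def __init__(self, title):
        self.title = title

class LibraryMember:
    def __init__(self, name):
        self.name = name
        # reference attribute realizing the 'borrows' association (multiplicity *)
        self.borrowed = []

    def borrow(self, book):
        self.borrowed.append(book)   # each (member, book) pair is one link

amit = LibraryMember("Amit")
amit.borrow(Book("Graph Theory"))
print(len(amit.borrowed))   # 1
```

Here the list plays the role of the pointer collection; each element of `amit.borrowed` is one link, i.e. one instance of the association.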
Chapter 4
UNIT IV
Databases today are essential to every business. They are used to maintain internal records, to
present data to customers and clients on the World Wide Web, and to support many other commercial
processes. Databases are likewise found at the core of many scientific investigations. They represent
the data gathered by astronomers, by investigators of the human genome, and by biochemists exploring
the medicinal properties of proteins, along with many other scientists.
The power of databases comes from a body of knowledge and technology that has developed
over several decades and is embodied in specialized software called a database management system,
or DBMS, or more colloquially a "database system." A DBMS is a powerful tool for creating and
managing large amounts of data efficiently and allowing it to persist over long periods of time, safely.
These systems are among the most complex types of software available. The capabilities that a DBMS
provides the user are:
1. Persistent storage. Like a file system, a DBMS supports the storage of very large amounts of
data that exists independently of any processes that are using the data. However, the DBMS
goes far beyond the file system in providing flexibility, such as data structures that support
efficient access to very large amounts of data.
2. Programming interface. A DBMS allows the user or an application program to access and modify
data through a powerful query language. Again, the advantage of a DBMS over a file system
is the flexibility to manipulate stored data in much more complex ways than the reading and
writing of files.
3. Transaction management. A DBMS supports concurrent access to data, i.e. simultaneous
access by many distinct processes (called "transactions") at once. To avoid some of the undesirable
consequences of simultaneous access, the DBMS supports isolation, the appearance that
transactions execute one at a time, and atomicity, the requirement that transactions execute either
completely or not at all. A DBMS also supports durability, the ability to recover from failures
or errors of many types.
4.1 Overview of Data Base Management Systems
Databases and database systems have become an essential component of everyday life in modern
society. Examples of database applications include:
• Purchases from the supermarket.
• Purchases using your credit card.
• Booking a holiday at the travel agents.
• Using the local library.
• Taking out insurance.
• Using the Internet.
• Studying at university.
4.1.1 Need to store data
Data originates at one time and is used later. For example: registrations are stored for grading
later; data is stored for future information needs; governmental regulations require access to past
data; data is used later for auditing and evaluation purposes; data used more than once is saved for
future use.
4.1.2 Limitations of manual methods
• Problems of speed.
• Problems of accuracy.
• Problems of consistency and reliability.
• Problems of poor response time.
• Problems of work-load handling capability.
• Problems of meeting ad hoc information needs.
• Problems of cost.
• Problems due to human frailties: (misplaced) loyalty, inconsistency, irregularity, difficulties in handling big tasks.
4.1.3 Why computerized data processing?
We use computers to process our data for several reasons, including:
• Advantage of speed.
• Advantage of accuracy.
• Advantage of reliability.
• Advantage of consistency.
• Advantage of storage and retrieval efficiency.
• Advantage of on-line access to meet ad hoc needs.
• Advantage of cost.
4.2 What the DBMS can do
The term database refers to a collection of data that is managed by a DBMS. The DBMS is expected
to:
1. Allow users to create new databases and specify their schema (logical structure of the data),
using a specialized language called a data-definition language.
2. Give users the ability to query the data (a ”query” is database lingo for a question about the
data) and modify the data, using an appropriate language, often called a query language or
data-manipulation language.
3. Support the storage of very large amounts of data - many gigabytes or more - over a long period
of time, keeping it secure from accident or unauthorized use and allowing efficient access to the
data for queries and database modifications.
4. Control access to data from many users at once, without allowing the actions of one user to
affect other users and without allowing simultaneous accesses to corrupt the data accidentally.
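The first two expectations, a data-definition language and a data-manipulation language, can be seen in miniature with SQLite, which ships with Python; the table and column names below are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")          # create a new database

# 1. Data-definition language: declare the schema (logical structure).
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# 2. Data-manipulation / query language: modify and query the data.
conn.execute("INSERT INTO customer (name) VALUES (?)", ("John Smith",))
rows = conn.execute("SELECT name FROM customer").fetchall()
print(rows)   # [('John Smith',)]
```

Expectations 3 and 4 (large-scale secure storage and controlled concurrent access) are what distinguish a full DBMS from this toy session.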
4.3 Data Base Management Systems
A database is a collection of related data. By data, we mean known facts that can be recorded and
that have implicit meaning.
Definition 4.3.1: Definition of DBMS
A database management system (DBMS) is a collection of programs that enables users to
create and maintain a database. The DBMS is hence a general-purpose software system that
facilitates the process of defining, constructing, manipulating, and sharing databases among
various users and applications.
Historical development of database technologies:
1. Early Database Applications: The hierarchical and network models were introduced in the
mid-1960s and dominated during the seventies. A bulk of the worldwide database processing still
occurs using these models.
2. Relational Model based Systems: The model that was originally introduced in 1970 was
heavily researched and experimented with at IBM and in the universities.
3. Object-oriented applications: OODBMSs were introduced in the late 1980s and early 1990s to
cater to the need for complex data processing in CAD and other applications. Their use has not
taken off much. Relational DBMS products emerged in the 1980s.
4. Data on the Web and E-commerce Applications: The Web contains data in HTML (Hypertext
Markup Language) with links among pages. This has given rise to a new set of applications, and
E-commerce is using new standards like XML (Extensible Markup Language).
4.3.1 Extended Database Capabilities
New functionality is being added to DBMSs in the following areas:
• Scientific Applications.
• Image Storage and Management.
• Audio and Video data management.
• Data Mining.
• Spatial data management.
• Time Series and Historical Data Management.
4.3.2 Transaction Management
A transaction is a collection of operations that performs a single logical function in a database
application. The transaction-management component ensures that the database remains in a consistent
(correct) state despite system failures (e.g., power failures and operating system crashes) and
transaction failures. The concurrency-control manager controls the interaction among concurrent
transactions, to ensure the consistency of the database.
A database transaction is a unit of interaction with a database management system (or similar
system) that is treated in a coherent and reliable way, independent of other transactions, and that
must be either entirely completed or aborted.
Notes developed by Mr Narayan Changder(Assistant professor CSE,BTKIT).For any query Contact: narayan.changder@ gmail.com
51
CHAPTER 4. UNIT IV
The ACID Properties of Transactions
Properly implemented transactions are commonly said to meet the "ACID test," where:
• ”A” stands for ”atomicity,” the all-or-nothing execution of transactions.
• ”I” stands for ”isolation,” the fact that each transaction must appear to be executed as
if no other transaction is executing at the same time.
• ”D” stands for ”durability,” the condition that the effect on the database of a transaction
must never be lost, once the transaction has completed.
• "C" stands for "consistency." That is, all databases have consistency constraints, or
expectations about relationships among data elements (e.g., account balances may not
be negative). Transactions are expected to preserve the consistency of the database.
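Atomicity and consistency can be observed directly in SQLite: if any statement of a transaction violates a constraint, the whole transaction is rolled back. A sketch using the negative-balance example from the text; the account data is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency constraint from the text: balances may not be negative.
conn.execute("CREATE TABLE account (name TEXT, balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 0)")
conn.commit()

try:
    with conn:   # one transaction: commits on success, rolls back on error
        conn.execute("UPDATE account SET balance = balance + 150 WHERE name = 'B'")
        # The next statement would drive A negative, violating the constraint.
        conn.execute("UPDATE account SET balance = balance - 150 WHERE name = 'A'")
except sqlite3.IntegrityError:
    pass   # the whole transfer, including the update to B, is undone

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)   # {'A': 100, 'B': 0} -- all-or-nothing
```

Note that the first update succeeded on its own, yet it is undone too: that is atomicity, the all-or-nothing execution of the transaction.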
4.4 Database Administrator
A database administrator (DBA) is a person who is responsible for the environmental aspects of a
database. In general, these include:
1. Recoverability: Creating and testing Backups.
2. Integrity: Verifying or helping to verify data integrity.
3. Security: Defining and/or implementing access controls to the data.
4. Availability: Ensuring maximum uptime.
5. Performance: Ensuring maximum performance given budgetary constraints.
6. Development and testing support: Helping programmers and engineers to efficiently utilize
the database.
The role of a database administrator has changed according to the technology of database management
systems (DBMSs) as well as the needs of the owners of the databases.
4.5 Types of Databases and Database Applications
• Numeric and Textual Databases (Traditional Database).
• Multimedia Databases (Video clips, pictures, sound message).
• Geographic Information Systems (GIS)(Weather data, map analysis, satellite images).
• Data Warehouses (Decision making).
• Real-time and Active Databases (Internet based (World wide web))
Database systems can be categorized according to the data structures and operators they present to
the user. The oldest systems fall into the inverted list, hierarchic, and network categories. These are
the pre-relational models. A simplified database system environment is shown here.
Figure 4.1: A simplified database system environment (users/programmers submit application programs and queries; the DBMS software processes the queries and accesses the stored database).
4.6 Advantages of Database Systems
As shown in the figure, the DBMS is a central system which provides a common interface between
the data and the various front-end programs in the application. It also provides a central location for
the whole data in the application to reside.
Due to its centralized nature, the database system can overcome the disadvantages of the file-based
system as discussed below.
• Minimal Data Redundancy: Since the whole data resides in one central database, the various
programs in the application can all access it there. Hence data present in one file need not
be duplicated in another. This reduces data redundancy. However, this does not mean
all redundancy can be eliminated. There could be business or technical reasons for having some
amount of redundancy. Any such redundancy should be carefully controlled and the DBMS
should be aware of it.
1. Duplication is wasteful. It costs time and money to enter the data more than once.
2. It takes up additional storage space, again with associated costs.
Figure 4.2: Database system and its applications (checking account, current account, loan, mortgage loan, and other applications all sharing one database through the database management system).
3. Perhaps more importantly, duplication can lead to loss of data integrity.
• Data Consistency: Reduced data redundancy leads to better data consistency.
• Data Integration: Since related data is stored in one single database, enforcing data integrity
is much easier. Moreover, the functions of the DBMS can be used to enforce the integrity rules
with minimum programming in the application programs.
• Data Sharing: Related data can be shared across programs since the data is stored in a
centralized manner. Even new applications can be developed to operate against the same data.
• Enforcement of Standards: Enforcing standards in the organization and structure of data
files is required, and is also easy in a database system, since it is one single set of programs which
is always interacting with the data files.
• Application Development Ease: The application programmer need not build the functions
for handling issues like concurrent access, security, data integrity, etc. The programmer only
needs to implement the application business rules. This brings in application development ease.
Adding additional functional modules is also easier than in file-based systems.
• Better Controls: Better controls can be achieved due to the centralized nature of the system.
• Data Independence: The architecture of the DBMS can be viewed as a 3-level system comprising the following:
1. Internal or the physical level where the data resides.
2. The conceptual level which is the level of the DBMS functions.
3. The external level which is the level of the application programs or the end user.
Data independence is the isolation of an upper level from changes in the organization or structure
of a lower level. For example, if changes in the file organization of a data file do not demand
changes in the functions of the DBMS or in the application programs, data independence
is achieved. Thus data independence can be defined as the immunity of applications to changes in
physical representation and access technique. The provision of data independence is a major
objective for database systems.
• Reduced Maintenance: Maintenance is less and easier, again due to the centralized nature of
the system.
• Restricting unauthorized access to data: When multiple users share a large database, it
is likely that most users will not be authorized to access all information in the database.
• Representing complex relationships among data.
• Providing multiple interfaces to different classes of users.
4.7 Database Users:
Typically there are three types of users for a DBMS. They are:
1. Database administrators: responsible for authorizing access to the database, for coordinating
and monitoring its use, acquiring software and hardware resources, controlling its use, and
monitoring the efficiency of operations.
2. Database Designers: Responsible for defining the content, the structure, the constraints, and
the functions or transactions against the database. They must communicate with the end-users
and understand their needs.
3. End users: End users are the people whose jobs require access to the database for querying,
updating, and generating reports; the database primarily exists for their use. There are several
categories of end users:
(a) Casual End Users: Access the database occasionally when needed, but may need different
information each time.
(b) Naive or Parametric End Users: They make up a large section of the end-user population.
They use previously well-defined functions in the form of canned transactions against the
database. Examples are bank tellers or reservation clerks who do this activity for an entire
shift of operations.
(c) Sophisticated End Users: These include business analysts, scientists, engineers, and others
thoroughly familiar with the system capabilities. Many use tools in the form of software
packages that work closely with the stored database.
(d) Stand-alone End Users: Mostly maintain personal databases using ready-to-use packaged
applications. An example is a tax program user who creates his or her own internal
database.
4.7.1 3-Level Database System Architecture:
• The External Level represents the collection of views available to different end-users.
• The Conceptual Level is the representation of the entire information content of the database.
• The Internal Level is the physical level, which shows how the data is stored, what the
representations of the fields are, etc.
Figure 4.3: Three-tier architecture of a database system (external views at the external level, mapped onto the conceptual schema at the conceptual level, mapped in turn onto the internal schema at the internal level, over the stored database).
4.7.1.1 The Internal Level
This level represents how the data is physically stored on the disk and some of the access
mechanisms commonly used for retrieving this data.
The Internal Level is the level which deals with the physical storage of data. While designing
this layer, the main objective is to optimize performance by minimizing the number of disk
accesses during the various database operations.
The figure shows the process of database access in general. The DBMS views the database as a
collection of records. The File Manager of the underlying operating system views it as a set of
pages, and the Disk Manager views it as a collection of physical locations on the disk.
When the DBMS makes a request for a specific record to the File Manager, the latter maps the
record to a page containing it and requests the Disk Manager for the specific page. The Disk
Manager determines the physical location on the disk and retrieves the required page.
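The record-to-page-to-disk mapping described above can be simulated in a few lines; the page size and the in-memory "disk" below are assumptions for illustration only:

```python
PAGE_SIZE = 4  # records per page (illustrative)

class DiskManager:
    """Views the database as physical locations and returns whole pages."""
    def __init__(self, records):
        self.pages = [records[i:i + PAGE_SIZE]
                      for i in range(0, len(records), PAGE_SIZE)]

    def read_page(self, page_no):
        return self.pages[page_no]

class FileManager:
    """Maps a record number to the page containing it, then asks the disk manager."""
    def __init__(self, disk):
        self.disk = disk

    def get_record(self, record_no):
        page = self.disk.read_page(record_no // PAGE_SIZE)
        return page[record_no % PAGE_SIZE]

# The DBMS requests record 5; the file manager fetches page 1 and extracts it.
disk = DiskManager([f"record-{i}" for i in range(10)])
fm = FileManager(disk)
print(fm.get_record(5))   # record-5
```

The division and remainder make the layering concrete: the file manager only computes which page holds the record; only the disk manager touches the "physical" storage.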
Figure 4.4: Process of database access

4.8 Levels of Abstraction & database schema
The schema is the description of a database. It includes descriptions of the database structure and
the constraints that should hold on the database. A diagrammatic display of (some aspects of) a
database schema is called a schema diagram.
The data in a DBMS is described at three levels of abstraction, as illustrated in the three-tier
architecture figure, which defines DBMS schemas at three levels:
• Internal schema: At the internal level, to describe physical storage structures and access
paths. Typically uses a physical data model. (How a record, e.g. a customer, is stored: the
physical level.)
• Conceptual schema: At the conceptual level, to describe the structure and constraints for the
whole database for a community of users. Uses a conceptual or an implementation data model.
(Describes the data stored in the database and the relationships among the data: the logical level.)
• External schema: At the external level, to describe the various user views. Usually uses the
same data model as the conceptual level. (Describes data as seen by a user/application: the view
level.)
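The three schema levels can be sketched in SQLite; the table, index, and view names below are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual schema: structure and constraints for the whole database.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
# Internal schema: a physical access path (an index) over the stored data.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
# External schema: a user view exposing only part of the conceptual schema.
conn.execute("CREATE VIEW customer_directory AS SELECT name, phone FROM customer")

conn.execute("INSERT INTO customer (name, phone) VALUES ('John Smith', '123-4567')")
directory = conn.execute("SELECT * FROM customer_directory").fetchall()
print(directory)   # [('John Smith', '123-4567')]
```

A user working through `customer_directory` never sees the `id` column or the index, which is precisely the point of the view level.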
4.8.1 Schema
A schema describes the contents of the database (e.g., what information is kept about a set of
customers and accounts, and the relationships between them).
• Physical schema: how data is stored at the physical level (how).
• Logical schema: what data is contained at the logical level (what).
Database Instance: The actual data stored in a database at a particular moment in time. Also
called the database state (or occurrence).
4.8.2 Data Independence:
When a schema at a lower level is changed, only the mappings between this schema and higher-level
schemas need to be changed in a DBMS that fully supports data independence. The higher-level
schemas themselves are unchanged. Hence, the application programs need not be changed, since they
refer to the external schemas.
• Physical Data Independence: The ability to modify the physical schema without changing
the logical schema.
1. Applications depend on the logical schema.
2. In general, the interfaces between the various levels and components should be well defined
so that changes in some parts do not seriously influence others.
• Logical Data Independence: The ability to modify the conceptual schema without changing
the external schema or application programs.
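Both kinds of data independence can be sketched with SQLite: a physical change (adding an index) and a conceptual change (adding a column) leave an external view, and the queries against it, untouched. All names here are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (no INTEGER, balance INTEGER)")
# External schema: applications query this view, not the table directly.
conn.execute("CREATE VIEW balances AS SELECT no, balance FROM account")
conn.execute("INSERT INTO account VALUES (1, 500)")

before = conn.execute("SELECT * FROM balances").fetchall()
# Physical change (a new access path): results are unaffected.
conn.execute("CREATE INDEX idx_no ON account (no)")
# Conceptual change (a new column): the view, and queries on it, still work.
conn.execute("ALTER TABLE account ADD COLUMN branch TEXT")
after = conn.execute("SELECT * FROM balances").fetchall()
print(before == after)   # True
```

Only the mapping (the view definition would be the place to absorb larger schema changes) would ever need to change; the application's queries do not.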
4.9 History of Data Models
• Relational Model: proposed in 1970 by E.F. Codd (IBM), first commercial system in 1981-82.
Now in several commercial products (DB2, ORACLE, SQL Server, SYBASE, INFORMIX).
• Network Model: the first one to be implemented by Honeywell in 1964-65 (IDS System).
Adopted heavily due to the support by CODASYL (CODASYL - DBTG report of 1971). Later
implemented in a large variety of systems - IDMS (Cullinet - now CA), DMS 1100 (Unisys),
IMAGE (H.P.), VAX -DBMS (Digital Equipment Corp.)
• Hierarchical Data Model: implemented in a joint effort by IBM and North American Rockwell
around 1965. Resulted in the IMS family of systems. The most popular model. Other
systems based on this model: System 2k (SAS Inc.).
• Object-oriented Data Model(s): several models have been proposed for implementation in
database systems. One set comprises models of persistent O-O programming languages such
as C++ (e.g., in OBJECTSTORE or VERSANT) and Smalltalk (e.g., in GEMSTONE).
Additionally, there are systems like O2, ORION (at MCC, then ITASCA), and IRIS (at H.P., used
in Open OODB).
• Object-Relational Models: Most Recent Trend. Started with Informix Universal Server.
Exemplified in the latest versions of Oracle-10i, DB2, and SQL Server etc. systems.
4.10 Entity Relation Model
The Entity Relationship (ER) data model allows us to describe the data involved in a real-world
enterprise in terms of objects and their relationships, and is widely used to develop an initial database
design. Within the larger context of the overall design process, the ER model is used in a phase called
conceptual database design.
4.11 Database design and ER Diagrams
The database design process can be divided into six steps. The ER model is most relevant to the first
three steps.
1. Requirements Analysis: The very first step in designing a database application is to understand
what data is to be stored in the database, what applications must be built on top of it, and what
operations are most frequent and subject to performance requirements. In other words, we must
find out what the users want from the database.
2. Conceptual database Design The information gathered in the requirements analysis step is
used to develop a high-level description of the data to be stored in the database, along with the
constraints known to hold over this data. The ER model is one of several high level or semantic,
data models used in database design.
3. Logical Database Design We must choose a DBMS and convert the conceptual database design into a database schema in the data model of the chosen DBMS. Normally we consider a relational DBMS, and therefore the task in the logical design step is to convert an ER schema into a relational database schema.
4. Schema Refinement This step is to analyze the collection of relations in our relational database
schema to identify potential problems, and refine it.
5. Physical Database Design This step may simply involve building indexes on some table
and clustering some tables or it may involve substantial redesign of parts of database schema
obtained from the earlier steps.
6. Application and Security Design Any software project that involves a DBMS must consider aspects of the application that go beyond the database itself. We must describe the role of each entity (users, user groups, departments) in every process that is reflected in some application task, as part of a complete workflow for the task. A DBMS provides several mechanisms to assist in this step.
4.11.1 Entity
Entities are specific objects or things in the mini-world that are represented in the database. For example, Employee or Staff, Department or Branch, and Project are entities.
Figure 4.5: Entities.
4.11.2 Attribute
Attributes are properties used to describe an entity. For example, an EMPLOYEE entity may have a Name, SSN, Address, Sex and Birth Date, and a Department may have a D name, D no and D Location.
A specific entity will have a value for each of its attributes. For example, a specific employee entity may have Name = John Smith, SSN = 123456789, Address = 731 Fondren, Houston, TX, Sex = M, Birth Date = 09-JAN-55.
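The specific employee entity above, with one value per attribute, can be sketched as a small Python record. The class and field names are illustrative only, not taken from any real schema.

```python
from dataclasses import dataclass

# A hypothetical sketch: one EMPLOYEE entity with a value for each attribute,
# mirroring the John Smith example in the text.
@dataclass
class Employee:
    name: str
    ssn: str
    address: str
    sex: str
    birth_date: str

e = Employee(name="John Smith", ssn="123456789",
             address="731 Fondren, Houston, TX", sex="M",
             birth_date="09-JAN-55")
print(e.name)  # the value of this entity's Name attribute
```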
Each attribute has a value set (or data type) associated with it e.g. integer, string, subrange,
enumerated type etc.
Figure 4.6: Two different entities.
4.11.2.1 Types of Attributes
1. Simple Vs. Composite Attribute
• Simple attributes: Attributes that are not divisible are called simple or atomic attributes. For example, First name and Last name, the components of the composite attribute Name.
• Composite attributes: An attribute may be composed of several components. For example, Name(First name, Middle name, Surname).
2. Single Valued Vs. Multi Valued
• Single-valued: An attribute having only one value. For example, Age, Date of birth, Sex.
• Multi-valued: An entity may have multiple values for that attribute. For example, the phone number of a student.
3. Stored Vs. Derived
In some cases two or more attribute values are related, for example the age and date of birth of a person. For a particular person entity, the value of Age can be determined from the current (today's) date and the value of that person's Birth date.
The Age attribute is hence called a derived attribute and is said to be derivable from the Birth date attribute, which is called a stored attribute.
4. Null Values
In some cases a particular entity may not have an applicable value for an attribute. For example, a College degree attribute applies only to persons with college degrees. For such situations, a special value called null is used for persons who have no degree.
5. Key attributes of an Entity An important constraint on the entities of an entity type is
the Key or uniqueness constraint on attributes. An entity type usually has an attribute whose
values are distinct for each individual entity in the entity set. Such an attribute is called a Key
attribute, and its values can be used to identify each entity uniquely.
For example, the Name attribute is a key of the COMPANY entity type, because no two companies are allowed to have the same name. For the PERSON entity type, a typical key attribute is SocialSecurityNumber (SSN). In ER diagrammatic notation, each key attribute has its name underlined inside the oval.
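Two of the ideas above lend themselves to a minimal Python sketch: deriving Age from the stored Birth date attribute, and checking that a key attribute takes a distinct value in every entity of the entity set. The sample data is made up for illustration.

```python
from datetime import date

# Sketch: Age is derived from the stored Birth date attribute rather than
# stored itself.
def age_on(birth_date: date, today: date) -> int:
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday not yet reached this year
    return years

# Sketch: an attribute is a key only if its values are distinct for each
# individual entity in the entity set.
def is_key(entities, attr):
    values = [e[attr] for e in entities]
    return len(values) == len(set(values))

employees = [
    {"ssn": "111", "name": "Anita", "dob": date(1955, 1, 9)},
    {"ssn": "222", "name": "Anita", "dob": date(1980, 7, 2)},
]
print(age_on(employees[0]["dob"], date(2011, 6, 1)))  # 56 (derived, not stored)
print(is_key(employees, "ssn"))   # True: distinct for every entity
print(is_key(employees, "name"))  # False: two entities share a name
```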
4.12 Relationships and Relationship sets
A relationship is an association among two or more entities. For example, we may have the relationship that Rajesh works in the Marketing department. A relationship type R among the n entity types E1, E2, . . . , En defines a set of associations, a relationship set, among entities from these entity types.
Informally, each relationship instance ri in R is an association of entities, where the association includes exactly one entity from each participating entity type. Each such relationship instance ri represents the fact that the entities participating in ri are related in some way in the corresponding mini-world situation. For example, consider a relationship type Works For between the two entity types Employee and Department, which associates each employee with the department for which the employee works. Each relationship instance in the relationship set Works For associates one employee entity and one department entity. Figure 4.7 illustrates this example.
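A relationship set like Works For can be sketched as a set of pairs, each pair drawing exactly one entity from each participating entity type. The entity names below are illustrative placeholders.

```python
# Hypothetical mini-world: three employee entities, two department entities.
employees = {"e1", "e2", "e3"}
departments = {"d1", "d2"}

# Each relationship instance associates one employee with one department.
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}

# Every instance draws one entity from each participating entity type.
assert all(e in employees and d in departments for e, d in works_for)

# Employees related to department d1:
print(sorted(e for e, d in works_for if d == "d1"))  # ['e1', 'e2']
```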
[Figure: employee entities e1 to e7 linked through relationship instances r1 to r7 of the WORKS FOR relationship set to department entities d1 to d3.]
Figure 4.7: Entities with relationship.
4.12.1 Degree of a Relationship
The degree of a relationship is the number of participating entity types. Hence the Works For relationship above is of degree two. A relationship type of degree two is called binary, and one of degree three is called ternary. An example of a ternary relationship is shown in Figure 4.8, and an example of a quaternary relationship in Figure 4.9.
[Figure: ternary relationship SUPPLY among SUPPLIER entities S1, S2, PARTS entities P1 to P3, and PROJECT entities d1 to d3, with relationship instances r1 to r7.]
Figure 4.8: Ternary relationship.
4.12.2 Constraints on relationship Types
Relationship types usually have certain constraints that limit the possible combinations of entities that may participate in the corresponding relationship set. These constraints are determined from the mini-world situation that the relationship represents. For example, in the Works For relationship, if the company has a rule that each employee must work for exactly one department, then we would like to describe this constraint in the schema. There are two types of constraints.
1. Cardinality Ratio Describes the maximum number of possible relationship occurrences for an entity participating in a given relationship type. The cardinality ratio of a binary relationship is 1:1, 1:N, N:1, or M:N.
2. Participation Determines whether all or only some entity occurrences participate in a relationship.
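The rule from the text, "each employee must work for exactly one department", combines an N:1 cardinality ratio with total participation of EMPLOYEE in Works For. A minimal sketch of a validity check, with made-up entity names:

```python
# Sketch: each employee must appear in the relationship set exactly once,
# i.e. total participation (every employee appears) plus N:1 cardinality
# (no employee appears with two departments).
def satisfies_n_to_1(employees, works_for):
    emps_in_rel = [e for e, _ in works_for]
    return (set(emps_in_rel) == set(employees)            # total participation
            and len(emps_in_rel) == len(set(emps_in_rel)))  # at most one dept each

employees = ["e1", "e2", "e3"]
print(satisfies_n_to_1(employees, [("e1", "d1"), ("e2", "d1"), ("e3", "d2")]))  # True
# e1 has two departments and e2 participates in none:
print(satisfies_n_to_1(employees, [("e1", "d1"), ("e1", "d2"), ("e3", "d2")]))  # False
```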
[Figure: quaternary relationship Arranges among Buyer, Solicitor, Financial Institution, and Bid: a solicitor arranges a bid on behalf of a buyer, supported by a financial institution.]
Figure 4.9: quaternary relationship.
4.13 Types of Entity
1. Strong Entity An entity type that has a key attribute in its attribute list.
2. Weak Entity An entity type that does not have a key attribute.
4.13.1 Weak Entity Sets
An entity set that does not possess sufficient attributes to form a primary key is called a weak entity set. One that does have a primary key is called a strong entity set. For example:
• The entity set transaction has attributes transaction-number, date and amount.
• Different transactions on different accounts could share the same number.
• These are not sufficient to form a primary key (uniquely identify a transaction).
• Thus transaction is a weak entity set.
For a weak entity set to be meaningful, it must be part of a one-to-many relationship set. This
relationship set should have no descriptive attributes.
• Member of a strong entity set is a dominant entity.
• Member of a weak entity set is a subordinate entity.
A weak entity set does not have a primary key, but we need a means of distinguishing among the
entities.
The discriminator of a weak entity set is a set of attributes that allows this distinction to be made.
The primary key of a weak entity set is formed by taking the primary key of the strong entity set on which its existence depends, plus its discriminator. To illustrate:
• Transaction is a weak entity. It is existence-dependent on account.
• The primary key of account is account-number.
• Transaction-number distinguishes transaction entities within the same account (and is thus the
discriminator).
• So the primary key for transaction would be (account-number, transaction-number).
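The account/transaction example can be sketched in Python: the discriminator alone repeats across accounts, but the owner's primary key plus the discriminator identifies every transaction. The sample rows are made up.

```python
# Sketch: a weak entity's primary key = owner's primary key + discriminator.
transactions = [
    {"account_number": "A-101", "transaction_number": 1, "amount": 500},
    {"account_number": "A-101", "transaction_number": 2, "amount": 75},
    {"account_number": "A-202", "transaction_number": 1, "amount": 900},
]

def composite_key(t):
    return (t["account_number"], t["transaction_number"])

keys = [composite_key(t) for t in transactions]
print(len(keys) == len(set(keys)))  # True: the pair identifies each transaction

tx_numbers = [t["transaction_number"] for t in transactions]
print(len(tx_numbers) == len(set(tx_numbers)))  # False: discriminator alone repeats
```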
4.14 E-R Diagram
Notations for ER Diagram
Figure 4.10: E-R diagram symbols.
Sample ER diagram for a Company Schema with structural Constraints is shown below.
Figure 4.11: E-R diagram for a Company Schema.
An E-R Diagram for a Bank database Schema
Figure 4.12: E-R diagram for a Bank database Schema.
4.15 Enhanced-ER (EER) Model Concepts
• It includes all the modeling concepts of basic ER.
• Additional concepts: subclasses/superclasses, specialization/generalization.
• The resulting model is called the enhanced-ER or Extended ER (E2R or EER) model.
• It is used to model applications more completely and accurately when needed.
• It includes some object-oriented concepts, such as inheritance.
4.15.1 Subclasses and Super classes
• An entity type may have additional meaningful sub-groupings of its entities.
• Example: EMPLOYEE may be further grouped into SECRETARY, ENGINEER, MANAGER, TECHNICIAN, SALARIED EMPLOYEE, HOURLY EMPLOYEE, and so on.
1. Each of these groupings is a subset of EMPLOYEE entities.
2. Each is called a subclass of EMPLOYEE.
3. EMPLOYEE is the super class for each of these subclasses.
• These are called super class/subclass relationships.
4.15.2 Specialization
Specialization is the process of defining a set of subclasses of a super class. The set of subclasses is based upon some distinguishing characteristic of the entities in the super class. For example,
{SECRETARY, ENGINEER, TECHNICIAN}
is a specialization of EMPLOYEE based upon job type. Another specialization of EMPLOYEE, based on method of pay, is
{SALARIED EMPLOYEE, HOURLY EMPLOYEE}
Super class/subclass relationships and specialization can be represented diagrammatically in EER diagrams. Attributes of a subclass are called specific attributes. For example, Typing Speed of SECRETARY.
The subclass can participate in specific relationship types. For example, BELONGS TO of HOURLY EMPLOYEE. Figure 4.13 shows the specialization of an Employee based on job type.
Figure 4.13: Specialization of an Employee based on Job Type.
We may have several specializations of the same super class.
Figure 4.14: Specialization of an Employee based on Job Type.
4.15.3 Generalization
• The reverse of the specialization process.
• Several classes with common features are generalized into a super class; original classes become
its subclasses.
• Example: CAR, TRUCK generalized into VEHICLE; both CAR, TRUCK become subclasses
of the super class VEHICLE.
1. We can view CAR and TRUCK as a specialization of VEHICLE.
2. Alternatively, we can view VEHICLE as a generalization of CAR and TRUCK.
In this example there are two entity types, CAR and TRUCK, generalized into the superclass VEHICLE.
Figure 4.15: Generalization of a vehicle.
4.15.4 Constraints on Specialization and generalization
1. Predicate Defined If we can determine exactly those entities that will become members of each subclass by a condition, the subclasses are called predicate-defined (or condition-defined) subclasses.
• The condition is a constraint that determines subclass membership.
• A predicate-defined subclass is displayed by writing the predicate condition next to the line attaching the subclass to its super class.
2. Attribute Defined If all subclasses in a specialization have their membership condition on the same attribute of the super class, the specialization is called an attribute-defined specialization, and that attribute is called the defining attribute of the specialization. Example: Job Type is the defining attribute of the specialization
{SECRETARY, TECHNICIAN, ENGINEER}
of EMPLOYEE.
3. User Defined When no condition determines membership, the subclass is called user-defined. Membership in a subclass is determined by the database users by applying an operation to add an entity to the subclass; membership is specified individually for each entity in the super class by the user. Figure 4.16 shows the constraints on Specialization & Generalization.
Figure 4.16: constraints on Specialization & Generalization.
4. Disjointness Constraint
• Specifies that the subclasses of the specialization must be disjoint (an entity can be a member of at most one of the subclasses of the specialization).
• Specified by 'd' in the EER diagram.
• If not disjoint, the subclasses overlap; that is, the same entity may be a member of more than one subclass of the specialization.
• Overlap is specified by 'o' in the EER diagram.
5. Completeness Constraint
• Total specifies that every entity in the superclass must be a member of some subclass in
the specialization/ generalization.
• Shown in EER diagrams by a double line.
• Partial allows an entity not to belong to any of the subclasses.
• Shown in EER diagrams by a single line
4.16 Relational Model
Relational Model Terminology
1. A relation is a table with columns and rows. The relational model applies only to the logical structure of the database, not the physical structure.
2. Attribute is a named column of a relation.
3. Domain is the set of allowable values for one or more attributes.
4. Tuple is a row of a relation.
5. Degree is the number of attributes in a relation.
6. Cardinality is the number of tuples in a relation.
7. Relational Database is a collection of normalized relations with distinct relation names.
Figure 4.17: Different key and attributes shown as table format.
4.16.1 Mathematical Definition of Relation
Consider two sets, D1 and D2, where D1 = {2, 4} and D2 = {1, 3, 5}. The Cartesian product of D1 and D2 is the set of all ordered pairs where the first element is a member of D1 and the second element is a member of D2.
D1 × D2 = {(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)}
An alternative way is to find all combinations of elements with the first from D1 and the second from D2. Any subset of the Cartesian product is a relation; e.g.
R = {(2, 1), (4, 1)}
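The worked example above can be reproduced directly with Python's standard library:

```python
from itertools import product

# The Cartesian product of D1 = {2, 4} and D2 = {1, 3, 5} from the text.
D1 = [2, 4]
D2 = [1, 3, 5]
cartesian = list(product(D1, D2))
print(cartesian)  # [(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]

# Any subset of the Cartesian product is a relation, e.g. R = {(2, 1), (4, 1)}.
R = [(2, 1), (4, 1)]
print(all(pair in cartesian for pair in R))  # True
```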
4.16.2 Properties of Relations
• Relation name is distinct from all other relation names in relational schema.
• Each cell of relation contains exactly one atomic (single) value.
• Each attribute has a distinct name.
• Values of an attribute are all from the same domain.
• Each tuple is distinct; there are no duplicate tuples.
• Order of attributes has no significance.
• Order of tuples has no significance, theoretically.
4.16.3 Relational Keys
1. Superkey: An attribute, or set of attributes, that uniquely identifies a tuple within a relation.
2. Candidate Key: A superkey (K) such that no proper subset of it is a superkey within the relation.
• In each tuple of R, the values of K uniquely identify that tuple (uniqueness).
• No proper subset of K has the uniqueness property (irreducibility).
3. Primary Key: The candidate key selected to identify tuples uniquely within the relation.
4. Alternate Keys: Candidate keys that are not selected to be the primary key.
5. Foreign Key: An attribute, or set of attributes, within one relation that matches the candidate key of some (possibly the same) relation.
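The uniqueness and irreducibility properties can be checked mechanically on a relation instance. A minimal sketch, with a made-up Staff relation:

```python
from itertools import combinations

# Sketch: attrs is a superkey if its value combinations are unique across rows;
# it is a candidate key if, additionally, no proper subset is also a superkey.
def is_superkey(rows, attrs):
    seen = [tuple(r[a] for a in attrs) for r in rows]
    return len(seen) == len(set(seen))

def is_candidate_key(rows, attrs):
    if not is_superkey(rows, attrs):
        return False
    return not any(is_superkey(rows, sub)
                   for n in range(1, len(attrs))
                   for sub in combinations(attrs, n))

staff = [
    {"staffNo": "S1", "lName": "White", "branchNo": "B5"},
    {"staffNo": "S2", "lName": "Beech", "branchNo": "B5"},
    {"staffNo": "S3", "lName": "White", "branchNo": "B3"},
]
print(is_superkey(staff, ("staffNo", "lName")))       # True, but reducible
print(is_candidate_key(staff, ("staffNo", "lName")))  # False: staffNo alone suffices
print(is_candidate_key(staff, ("staffNo",)))          # True
```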
4.16.4 Relational Integrity
1. Null: Represents a value for an attribute that is currently unknown or not applicable for a tuple. It deals with incomplete or exceptional data, and represents the absence of a value; it is not the same as zero or spaces, which are values.
2. Entity Integrity: In a base relation, no attribute of a primary key can be null.
3. Referential Integrity: If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation, or the foreign key value must be wholly null.
4. Enterprise Constraints: Additional rules specified by users or database administrators.
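The referential integrity rule above, "match a candidate key value in the home relation, or be wholly null", can be sketched as a check over two relation instances (None stands in for null; the data is illustrative):

```python
# Sketch: every foreign key value in the child relation must either be None
# (wholly null) or match a candidate key value in the parent (home) relation.
def referential_integrity_ok(child_rows, fk_attr, parent_rows, pk_attr):
    parent_keys = {p[pk_attr] for p in parent_rows}
    return all(c[fk_attr] is None or c[fk_attr] in parent_keys
               for c in child_rows)

branch = [{"branchNo": "B3"}, {"branchNo": "B5"}]
staff = [
    {"staffNo": "S1", "branchNo": "B5"},
    {"staffNo": "S2", "branchNo": None},  # wholly null FK is allowed
]
print(referential_integrity_ok(staff, "branchNo", branch, "branchNo"))  # True

staff.append({"staffNo": "S3", "branchNo": "B9"})  # dangling reference
print(referential_integrity_ok(staff, "branchNo", branch, "branchNo"))  # False
```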
4.17 Query Languages: Relational Algebra
Relational algebra and relational calculus are formal languages associated with the relational model. Informally, relational algebra is a (high-level) procedural language and relational calculus a non-procedural language. However, formally both are equivalent to one another. A language that can produce any relation that can be derived using relational calculus is said to be relationally complete.
4.17.1 Relational Algebra
• Relational algebra operations work on one or more relations to define another relation without
changing the original relations.
• Both operands and results are relations, so output from one operation can become input to
another operation.
• Allows expressions to be nested, just as in arithmetic. This property is called closure.
• Five basic operations in relational algebra: Selection, Projection, Cartesian product, Union, and
Set Difference.
• These perform most of the data retrieval operations needed.
• Also have Join, Intersection, and Division operations, which can be expressed in terms of 5 basic
operations.
Figure 4.18: Five basic operation.
1. Selection (or Restriction) Works on a single relation R and defines a relation that contains only those tuples (rows) of R that satisfy the specified condition (predicate). It is denoted by σpredicate(R).
Example 5. List all staff with a salary greater than RS 10,000
σsalary > 10000 (Staff)
2. Projection Works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates. It is denoted by Πcol1,col2,...,coln(R).
Example 6. Produce a list of salaries for all staff, showing only staffNo, fName, lName, and
salary details.
ΠstaffNo,fName,lName,salary(Staff)
3. Union (R ∪ S) The union of two relations R and S defines a relation that contains all the tuples of R, or S, or both R and S, duplicate tuples being eliminated. R and S must be union-compatible. If R and S have I and J tuples, respectively, their union is obtained by concatenating them into one relation with a maximum of (I + J) tuples.
Example 7. List all cities where there is either a branch office or a property for rent.
Πcity(Branch) ∪ Πcity(PropertyForRent)
4. Set Difference (R − S) Defines a relation consisting of the tuples that are in relation R but not in S. R and S must be union-compatible.
Example 8. List all cities where there is a branch office but no properties for rent.
Πcity(Branch)−Πcity(PropertyForRent)
5. Intersection (R ∩ S) Defines a relation consisting of the set of all tuples that are in both R and S. R and S must be union-compatible. It can be expressed using the basic operations:
R ∩ S = R− (R− S)
Example 9. List all cities where there is both a branch office and at least one property for
rent.
Πcity(Branch) ∩Πcity(PropertyForRent)
6. Cartesian product (R×S) Defines a relation that is the concatenation of every tuple of
relation R with every tuple of relation S.
Example 10. List the names and comments of all clients who have viewed a property for
rent.
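The operations above can be sketched in Python over relations modelled as lists of dicts. The table and attribute names echo the Staff/Branch examples in the text; the rows are made up.

```python
# Sketch of the basic relational algebra operations over lists of dicts.
def select(rel, pred):            # σ_pred(rel)
    return [t for t in rel if pred(t)]

def project(rel, attrs):          # Π_attrs(rel), duplicates eliminated
    seen, out = set(), []
    for t in rel:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out

def union(r, s):                  # r ∪ s (union-compatible relations)
    return r + [t for t in s if t not in r]

def difference(r, s):             # r − s
    return [t for t in r if t not in s]

def intersection(r, s):           # r ∩ s = r − (r − s)
    return difference(r, difference(r, s))

staff = [{"staffNo": "S1", "salary": 12000, "city": "London"},
         {"staffNo": "S2", "salary": 9000,  "city": "Bristol"}]
branch = [{"city": "London"}, {"city": "Glasgow"}]

print(select(staff, lambda t: t["salary"] > 10000))       # only S1
print(union(project(staff, ("city",)), branch))           # Example 7 style
print(intersection(project(staff, ("city",)), branch))    # Example 9 style
```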
4.18 SQL
4.18.1 Objectives of SQL
1. Ideally, database language should allow user to
• create the database and relation structures.
• perform insertion, modification, deletion of data from relations.
• perform simple and complex queries.
2. Must perform these tasks with minimal user effort and command structure/syntax must be easy
to learn.
3. It must be portable.
4. SQL is a transform-oriented language with 3 major components:
• A DDL for defining the database structure. It consists of SQL statements for defining the schema (creating, modifying, and dropping tables, indexes, views, etc.).
• A DML for retrieving and updating data. It consists of SQL statements for operating on the data (inserting, modifying, deleting, and retrieving data) in tables which already exist.
• A Data Control Language (DCL), consisting of SQL statements for granting and revoking access permissions to users.
5. SQL is relatively easy to learn. It is non-procedural: you specify what information you require, rather than how to get it. It is essentially free-format.
6. Can be used by range of users including DBAs, management, application developers, and other
types of end users.
7. An ISO standard now exists for SQL, making it both the formal and de facto standard language
for relational databases.
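The DDL and DML components can be demonstrated end to end with Python's built-in sqlite3 module. The table and data below are illustrative; DCL statements (GRANT/REVOKE) are omitted because SQLite has no user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the schema.
cur.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY,"
            " lName TEXT, salary INTEGER)")

# DML: insert data, then query it declaratively (what, not how).
cur.executemany("INSERT INTO Staff VALUES (?, ?, ?)",
                [("S1", "White", 12000), ("S2", "Beech", 9000)])
cur.execute("SELECT staffNo, lName FROM Staff WHERE salary > 10000")
rows = cur.fetchall()
print(rows)  # [('S1', 'White')]
conn.close()
```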
4.18.2 History of SQL
• In 1974, D. Chamberlin (IBM San Jose Laboratory) defined language called Structured English
Query Language (SEQUEL).
• A revised version, SEQUEL/2, was defined in 1976 but name was subsequently changed to SQL
for legal reasons.
• Still often pronounced "see-quel", though the official pronunciation is "S-Q-L".
• IBM subsequently produced a prototype DBMS called System R, based on SEQUEL/2.
• In the late 1970s, ORACLE appeared and was probably the first commercial RDBMS based on SQL.
• In 1987, ANSI and ISO published an initial standard for SQL.
• In 1989, ISO published an addendum that defined an Integrity Enhancement Feature.
• In 1992, first major revision to ISO standard occurred, referred to as SQL2 or SQL/92.
• In 1999, SQL3 was released with support for object-oriented data management.
4.19 Normalization
The main objective in developing a logical data model for relational database systems is to create an accurate representation of the data, its relationships, and constraints. To achieve this objective, we must identify a suitable set of relations. The four most commonly used normal forms are first (1NF), second (2NF), and third (3NF) normal forms, and Boyce-Codd normal form (BCNF).
Normal forms are based on functional dependencies among the attributes of a relation. A relation can be normalized to a specific form to prevent the possible occurrence of update anomalies.
4.19.1 Data Redundancy
A major aim of relational database design is to group attributes into relations so as to minimize data redundancy and reduce the file storage space required by the base relations. Problems associated with data redundancy are illustrated by comparing separate Staff and Branch relations with a combined StaffBranch relation.
The StaffBranch relation has redundant data: the details of a branch are repeated for every member of staff. In contrast, branch information appears only once for each branch in the Branch relation, and only branchNo is repeated in the Staff relation, to represent where each member of staff works.
4.19.1.1 Update Anomalies
Relations that contain redundant information may potentially suffer from update anomalies. Types of update anomalies include:
• Insertion.
• Deletion.
• Modification.
4.19.2 Lossless-join and Dependency Preservation Properties
Two important properties of decomposition of any relation are:
1. The lossless-join property enables us to find any instance of the original relation from the corresponding instances in the smaller relations.
2. The dependency preservation property enables us to enforce a constraint on the original relation by enforcing some constraint on each of the smaller relations.
4.19.3 Functional Dependency
Functional dependency is the main concept associated with normalization. A functional dependency describes a relationship between attributes in a relation: if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R. The determinant of a functional dependency is the attribute or group of attributes on the left-hand side of the arrow.
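Whether A → B holds in a given relation instance can be checked directly: tuples that agree on A must also agree on B. A minimal sketch with an illustrative Staff instance:

```python
# Sketch: A → B holds in an instance if no two tuples agree on A but differ on B.
def fd_holds(rows, A, B):
    mapping = {}
    for t in rows:
        a = tuple(t[x] for x in A)
        b = tuple(t[x] for x in B)
        if mapping.setdefault(a, b) != b:
            return False
    return True

staff = [
    {"staffNo": "S1", "branchNo": "B5", "lName": "White"},
    {"staffNo": "S2", "branchNo": "B5", "lName": "Beech"},
]
print(fd_holds(staff, ("staffNo",), ("branchNo",)))  # True: staffNo → branchNo
print(fd_holds(staff, ("branchNo",), ("lName",)))    # False: B5 maps to two names
```

Note that an instance can only refute a dependency; whether a dependency holds for all time is a statement about the mini-world, not about one instance.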
Main characteristics of functional dependencies used in normalization:
• have a 1:1 relationship between attribute(s) on left and right-hand side of a dependency.
• hold for all time.
• are nontrivial.
The complete set of functional dependencies for a given relation can be very large. We need an approach that reduces the set to a manageable size: we identify a set of functional dependencies X for a relation that is smaller than the complete set of functional dependencies Y for that relation, with the property that every functional dependency in Y is implied by the functional dependencies in X. The set of all functional dependencies implied by a given set of functional dependencies X is called the closure of X (written X+).
A set of inference rules, called Armstrong's axioms, specifies how new functional dependencies can be inferred from given ones.
4.19.3.1 Armstrong's axioms
Let A, B, and C be subsets of the attributes of relation R. Armstrong's axioms are as follows:
1. Reflexivity: If B ⊆ A, then A→ B.
2. Augmentation: If A→ B, then AC → BC.
3. Transitivity: If A→ B and B → C, then A→ C.
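A standard consequence of these axioms is the attribute closure algorithm: repeatedly apply the given FDs to grow the set of attributes determined by X. A minimal sketch, with made-up dependencies staffNo → branchNo and branchNo → bAddress:

```python
# Sketch: compute the closure X+ of an attribute set X under a set of FDs.
def closure(X, fds):
    """fds: iterable of (lhs, rhs) pairs, each a set of attribute names."""
    result = set(X)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If X+ already covers the left side, it determines the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [({"staffNo"}, {"branchNo"}), ({"branchNo"}, {"bAddress"})]
print(sorted(closure({"staffNo"}, fds)))
# ['bAddress', 'branchNo', 'staffNo'] — transitivity in action
```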
4.19.4 The Process of Normalization
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. The process is often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties. As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
Figure 4.19: Relationship Between Normal Forms
1. Unnormalized Form (UNF) A table that contains one or more repeating groups. To create
an unnormalized table:
• transform data from information source (e.g. form) into table format with columns and
rows.
2. First Normal Form (1NF) A relation in which intersection of each row and column contains
one and only one value.
(a) UNF to 1NF Nominate an attribute or group of attributes to act as the key for the unnormalized table. Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s). Remove each repeating group either by entering appropriate data into the empty columns of rows containing the repeating data (flattening the table), or by placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
3. Second Normal Form (2NF) 2nd normal form is based on concept of full functional depen-
dency:
• A and B are attributes of a relation R.
• B is fully dependent on A if B is functionally dependent on A but not on any proper subset
of A.
A relation is in 2NF if it is in 1NF and every non-primary-key attribute is fully functionally
dependent on the primary key.
(a) 1NF to 2NF
• Identify primary key for the 1NF relation.
• Identify functional dependencies in the relation.
• If partial dependencies exist on the primary key remove them by placing them in a
new relation along with copy of their determinant.
4. Third Normal Form (3NF) Based on concept of transitive dependency:
• A, B and C are attributes of a relation such that A → B and B → C.
• Then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
A relation is in 3NF if it is in 1NF and 2NF and no non-primary-key attribute is transitively dependent on the primary key.
(a) 2NF to 3NF
• Identify the primary key in the 2NF relation.
• Identify functional dependencies in the relation.
• If transitive dependencies exist on the primary key remove them by placing them in a
new relation along with copy of their determinant.
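The 2NF-to-3NF step can be sketched concretely. Assuming the illustrative dependencies staffNo → branchNo and branchNo → bAddress, bAddress is transitively dependent on the key staffNo, so it is moved to a new relation together with a copy of its determinant, branchNo:

```python
# Sketch: StaffBranch violates 3NF because of the transitive dependency
# staffNo → branchNo → bAddress (data is made up for illustration).
staff_branch = [
    {"staffNo": "S1", "branchNo": "B5", "bAddress": "22 Deer Rd"},
    {"staffNo": "S2", "branchNo": "B5", "bAddress": "22 Deer Rd"},
    {"staffNo": "S3", "branchNo": "B3", "bAddress": "163 Main St"},
]

# Decompose: keep staffNo → branchNo in Staff; place the transitive
# dependency in Branch along with a copy of its determinant, branchNo.
staff = [{"staffNo": t["staffNo"], "branchNo": t["branchNo"]}
         for t in staff_branch]
branch = []
for t in staff_branch:
    row = {"branchNo": t["branchNo"], "bAddress": t["bAddress"]}
    if row not in branch:
        branch.append(row)

print(len(branch))  # 2: each branch address is now stored only once
```

Joining Staff and Branch on branchNo recovers the original StaffBranch instance, which is the lossless-join property mentioned earlier.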
5. Boyce-Codd Normal Form (BCNF)