
UNIT 6 – The Structure of Hardware and Software
SUMMARY MATERIAL
2008–2009

www.ouw.co.uk, or contact Open University Worldwide, Michael Young Building; tel. +44 (0)1908 858785; fax +44 (0)1908 858787; e-mail [email protected]

The Open University, Walton Hall, Milton Keynes, MK7 6AA

Licensed for use by the Arab Open University

Mrs. Haifaa Elayyan

___________________________________________________________________________

The Structure of Hardware & Software

Software vs. hardware

· Hardware consists of the tangible parts of the computer system – ‘the parts that can be kicked’. Examples of hardware include the electronic circuits inside the casing, the keyboard and hard disk.

· Software, on the other hand, is more abstract – it consists of sets of instructions that tell a computer how to perform a particular task. Examples of software are word-processor applications such as Microsoft Word, and browsers such as Internet Explorer or Netscape.

Any instruction performed by software can also be built directly into hardware, and instructions executed by hardware can often be simulated in software. As an illustration let us consider the multiplication of two numbers. One way of thinking about multiplication is to consider it as a number of additions: starting with nothing (zero) you need to add a particular number (one of the numbers to be multiplied) a specified number of times (the second number to be multiplied). For example, to multiply 3 by 4, we need to add 3 to 0, then add 3 to the result, and so on, until we have added 3 four times, as follows:

0 + 3 gives a result of 3 (first addition)

3 + 3 gives a result of 6 (second addition)

6 + 3 gives a result of 9 (third addition)

9 + 3 gives a result of 12 (fourth addition)

Now suppose that a computer has hardware (i.e. an electronic circuit) to carry out a simple arithmetic instruction such as ADD. Since multiplication is nothing more than repeated addition, we could construct a new circuit (or piece of hardware) made up of a number of ADD circuits that, working together, can multiply two numbers.

Alternatively we could instruct the computer, using a computer program, to perform repeated addition using the existing ADD circuit. The hardware and software solutions are logically equivalent – they both use repeated addition – but the hardware solution uses several ‘tangible’ electronic circuits, whereas the software solution uses the more abstract idea of ‘instructing’ the machine to use the same circuit repeatedly. The decision to implement some functions in hardware and others in software is based on factors such as speed, cost and reliability.
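To make the software solution concrete, here is a minimal sketch in JavaScript (the language this course turns to in Unit 7) of multiplication programmed as repeated use of addition. The function name is our own choice, for illustration only.

function multiply(a, b) {
    var result = 0;               // start with nothing (zero)
    for (var i = 0; i < b; i++) {
        result = result + a;      // add a, for a total of b times
    }
    return result;
}

multiply(3, 4);   // 0+3=3, 3+3=6, 6+3=9, 9+3=12 – returns 12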

Firmware is the term used to refer to a sequence of instructions (software) etched into the read-only memory (ROM) of the computer, usually to perform some system function. Because these instructions are on a chip they form a permanent part of the computer and could be viewed as a combination of hardware and software.

The computer as a system

One simple view of an information system is that it is a machine that can solve problems by carrying out a sequence of instructions, usually on a given set of data. The data provided is usually referred to as input, and the solution (or results) produced is referred to as the output.

Some of the calculating devices that you read about in History snippet 1 conformed to this model of a simple information system. Our view is that a computer differs from a calculating device in that it includes a mechanism that stores not only the data but also the instructions for processing that data, and the results that are to be output. If we have a computer at the heart of our information system we can extend our model as shown in Figure 2.2

The word ‘process’ has a very particular meaning in some areas of computing. In this unit we are using it to refer to all the processing (or manipulation) of the input so as to produce the output

A computer is a device that accepts data and manipulates it by carrying out a sequence of instructions (a program), so as to produce some output. A computer also has the means of storing the input, the output and the program.

Input, output and instructions

In modern computers, all data, including the instructions that make up a computer program, are stored in the computer as binary digits (bits) – sequences of zeros and ones. The branch of mathematics that defines the rules governing the manipulation of entities with only two states, such as bits, is called binary logic (or Boolean algebra – after the Englishman, George Boole, who developed it in the middle of the nineteenth century). We will not explore the application of binary logic to computer circuits any further in this course but it is important to be aware that at a fundamental level computer processing is simply the application of logic to binary numbers.

Of course in order to make the results readable by humans, the binary representations of the output need to be ‘translated’ back into an appropriate format.

The heart of the computer – the central processing unit (CPU)

You have learned that computer processes manipulate binary representations of data, using instructions that are themselves stored in binary format. The component of the computer in which this manipulation of binary data takes place is referred to as an arithmetic/logic unit (ALU). Movement of data between the ALU and other parts of the computer is coordinated by the control unit.

The terms ‘processor’, ‘microprocessor’ and ‘CPU’ are often used interchangeably. The term ‘microprocessor’ usually refers to the chip on which the processor is implemented.

Within both the ALU and the control unit are small numbers of memory locations known as registers which are used to hold single pieces of data or single instructions immediately before and after processing. The ALU and the control unit along with their registers are together referred to as the central processing unit (CPU) or processor of a computer. A much larger area of memory, known as main memory, is where the data and instructions that are not immediately required are held, and where results are eventually stored.

Individual pieces of data and instructions are moved from main memory into the registers as required. The registers are built from more efficient (and expensive) memory than main memory, thus allowing fast access and transfer of data.

You will have come across the concept of main memory, loosely referred to as RAM, in the context of computer specifications. For example, the minimum computer specification for M150 in 2004 is 16MB (megabytes) of RAM, though most computers on sale at the time of writing have at least 256MB.

The processor of a PC is usually on a single microchip (or microprocessor), which can contain several tens of millions of circuits. Processors come in ‘families’, such as the Pentium 4, Celeron and AMD Athlon XP. Each family of processors has what is termed an instruction set. This is the collection of basic instructions, called machine language instructions, which a particular processor understands. The terms ‘machine language’ and ‘machine code’ are synonymous.

Figure 3.2 shows how data moves between the various components. The data flow is coordinated by the control unit, through the sending and receiving of control signals.

More about the CPU and main memory

The control unit is responsible for interpreting the instructions in a computer program and then sending appropriate control signals to the components that will carry them out. As soon as one instruction has been executed, the control unit moves on to deal with the next. Once processing of a program has started, there is no need for human intervention unless some interactive input is required. The control unit is in two-way communication with all the other devices in the system so that it may, for example, receive a mouse click, and then send a control signal to another device in order to execute an instruction.

Main memory

A computer has a number of different types of memory (or data storage). As a computer user you are probably most familiar with main memory, often referred to as RAM, and file storage memory, such as that provided by a hard disk. Main memory is used to store a program of instructions and the data needed by the program while it is running. Its contents are volatile, i.e. they are lost when the computer is switched off. File storage memory is memory that is used to store large amounts of data for use at some later date. Unlike main memory, it is non-volatile, i.e. the data is persistent – it is not lost when the computer is turned off.

Main memory can be envisaged as a large collection of sequentially ordered pigeon-holes, each of which can hold the same number of bits, equivalent to the word size of the computer.

The word size refers to the number of bits that the CPU can manipulate at one time. It is determined by the size of the registers in the CPU and the number of bits that can be moved round the computer as a single unit. When people refer to computers as being ‘32-bit’ or ‘64-bit’ machines they are referring to word size.

Older personal computers (including 386, 486 and Pentium PCs) had a word size of 32 bits. Newer processors such as the Itanium series have a 64-bit (or 8-byte) word size.

Locations in main memory are sequentially numbered, so that each one has a unique ‘address’ by which it can be directly accessed. This is why this type of memory is sometimes referred to as random-access memory (RAM).

The addressable nature of main memory makes the retrieval of data much faster than if there was a need to search sequentially through each ‘pigeonhole’ in turn until the right one was found. Furthermore, random access means that each memory location takes the same amount of time to access, regardless of its address.

Most other modern forms of memory, such as hard disk drive storage, are also random access. However, the acronym RAM is reserved for main memory.

Registers

Registers are very fast, efficient areas of memory that are located in the CPU and used as a holding area for all the data and other information needed during the processing of a program. In fact there are a number of different types of register, some located in the ALU, some in the control unit, and each designed to hold a particular type of information.

Arithmetic/logic unit (ALU)

The arithmetic/logic unit is the part of the CPU where the arithmetic and logical operations are carried out. It contains a number of registers where data is held directly prior to and following processing. Among other things, the ALU can:

· add

· subtract

· compare

· multiply and divide (which may be derived from adding and subtracting).

Peripheral devices

A peripheral, or peripheral device, is any computer device that is not part of the essential computer (the CPU and main memory). The most common peripherals are input and output (I/O) devices, and storage devices. Some peripherals, such as hard disks, are usually mounted in the same case as the processor, while others, such as printers, are physically outside the computer, and communicate with it via a wired or wireless connection.

The operating system

The hardware components of a computer are the processor, memory and peripheral devices. Managing these resources so as to coordinate and carry out all the computer’s tasks effectively is the job of the operating system. Common operating systems for personal computers include Linux, Mac OS (for the Apple Macintosh) and the various versions of Windows, e.g. Windows 2000 and Windows XP.

An operating system is a complex piece of software that acts as an interface between the user (or an application program) and the computer hardware.

The functions of the operating system include the following.

· Provision of a user interface:

The user interface is the software that enables us to communicate with our computers. It provides us with a means of inputting data and instructions, and presents output in a way that we can understand.

The user interfaces of early operating systems such as CP/M and DOS were text based, requiring the user to learn a set of commands which needed to be typed in following precise rules. Output to the screen also consisted entirely of characters. Today most PC operating systems provide graphical user interfaces (GUIs), the most common example of which is Microsoft Windows.

GUI-based operating systems make use of icons, menus and other graphical widgets, with which the user interacts via a pointing device, usually a mouse. Most people find graphical interfaces more intuitive, quicker to learn, and easier to use than sequences of commands. A further advantage of GUIs is their availability for use by programs other than the core software provided by the operating system.

· Management of memory:

During the processing of a program, data and instructions are stored in the computer’s main memory. It is the job of the operating system to allocate an appropriately sized area of memory to each process, and to ensure that program instructions and data do not interfere with each other or with the data and instructions of other programs.

· Coordination and control of peripheral devices:

In order to carry out its tasks a computer may need to communicate with one or more peripheral devices. For example, it may wish to receive input data from the keyboard or mouse; read from a file on a storage device; send output to the monitor or printer; connect to a network. The operating system coordinates all these operations, ensuring that data is moved safely and efficiently between the components of the system.

· Scheduling access to the processor:

The operating system manages access to the processor by prioritising the jobs to be run and ensuring that the processor is not idle. For example, if the currently running program finishes or is interrupted in order to wait for data from the hard disk, the operating system will ensure that another process is given access to the processor.

· Providing an interface between application programs and the computer hardware:

The operating system provides a ‘common’ interface between application software and the computer hardware. It is generally not possible to build application software that can run on the many different types of processor made by different companies. The operating system ensures that a ‘standard’ way of interacting with the hardware exists, and application programmers can then rely on this standard without worrying about the minutiae of the processor specifications.

· Providing basic utilities:

Most modern operating systems also provide basic utilities such as disk formatting facilities, file management systems and software installation wizards.

Read-only memory (ROM) and bootstrapping

When you switch on a computer, the first thing it needs to do is to load its own operating system (which is usually stored on the hard disk). You learned earlier in this unit that in order to carry out any processing, the computer needs to load instructions from its main memory – but also that main memory is volatile – when the power is turned off, its contents are lost. So how does the computer ‘know’ how to load its operating system from the hard disk? The answer is that there is another type of memory, called read-only memory (ROM) installed in the computer.

The data in ROM is built into the memory chips during manufacture and is permanent. It cannot be overwritten and will not disappear when power is lost to the computer.

An important function of ROM is to store a program, called a boot program, which is automatically executed when the computer is first switched on. This small program will typically run a test of main memory and see what peripherals are connected to the system, before loading larger programs, such as the operating system. The process of using a short program to load a larger program is called bootstrapping, which comes from the idea of someone pulling themselves up by their own bootstraps.

Running a program: the fetch/execute cycle

You have seen that data and instructions are stored in main memory but are moved to registers in the CPU as required, and that operations such as adding and logical comparisons are carried out by the ALU. We can summarize the steps carried out by the CPU in the execution of a computer program as follows:

· get an instruction from the program;

· find and transfer any data necessary to perform the instruction;

· carry out the instruction (which may involve saving a result).

The process of locating, transferring and carrying out a single instruction during the execution of a computer program is known as the fetch/execute cycle or fetch/execute sequence. During each cycle the CPU must:

· locate the next (or, if execution has just started, the first) instruction in the program, which is stored in main memory;

· transfer this instruction into the processor by placing it into an appropriate register;

· decode the instruction;

· locate (in main memory) any data the instruction refers to, and fetch it into the processor by placing it into an appropriate register;

· do the processing on the data that the instruction requires (e.g. add a number to another number);

· place the result in an appropriate register;

· if necessary move the result back to main memory so that it can be used later in the processing;

· update the program counter to hold the memory address of the next instruction.
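The cycle can be mimicked in a few lines of JavaScript. The following is a toy sketch only: the miniature instruction set (LDA, ADD, STA, HLT) is invented here, loosely echoing the unit’s ADD and HLT examples, and it does not model any real processor.

function run(memory) {
    var accumulator = 0;       // single register for intermediate results
    var programCounter = 0;    // address of the next instruction

    while (true) {
        var instruction = memory[programCounter];    // fetch the instruction
        programCounter = programCounter + 1;         // update the program counter
        var operator = instruction.op;               // decode: separate the operator...
        var operand = instruction.operand;           // ...from the operand
        if (operator === "LDA") {
            accumulator = memory[operand];                // fetch data into the register
        } else if (operator === "ADD") {
            accumulator = accumulator + memory[operand];  // process the data
        } else if (operator === "STA") {
            memory[operand] = accumulator;                // move the result back to memory
        } else if (operator === "HLT") {
            break;                                        // stop the cycle
        }
    }
}

// A three-instruction program: memory holds instructions and data alike.
var memory = [];
memory[0] = { op: "LDA", operand: 500 };
memory[1] = { op: "ADD", operand: 501 };
memory[2] = { op: "STA", operand: 502 };
memory[3] = { op: "HLT", operand: 0 };
memory[500] = 3;
memory[501] = 4;
run(memory);   // afterwards memory[502] holds 7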

When processing even the simplest set of instructions, the processor needs to keep track of several different pieces of information at the same time, and data and instructions need to be moved quickly around the system. You know that the control unit manages this task. However, there is also the need for some subsystem to connect its various parts, so that the data, instructions and control signals (in the form of bits) can flow from place to place as required. This is achieved through buses and a processor clock.

Buses

An electronic channel through which data and instructions travel between different components of the computer is known as a bus. Generally at least three categories of bus are needed to ensure that the fetch/execute cycle operates effectively.

· The address bus is used to carry information about the addresses in main memory to be accessed.

· The data bus transmits the data; the external data bus transmits data between the registers and main memory, and the internal data bus transmits data between the different registers within the processor.

· The control bus carries control signals sent by the control unit, as well as signals that report the status of various devices. For example, when the processor is reading data or instructions from main memory the control bus would carry the read signal that indicates that the control unit is currently coordinating a read from main memory.

Bus technology is also important for transferring data between the CPU and the input and output devices.

Processor clock

During the processing of instructions, the order in which events happen is clearly critical. To enable events to be synchronized by the control unit, all computers have a processor clock, which sends out pulses at regular intervals. The length of time between pulses is called the clock cycle time. Historically each instruction took at least one clock cycle to execute but advanced processors can now execute more than one instruction in a single cycle. The number of pulses per second, or frequency, of the clock is measured in megahertz (MHz), so a processor with a 900MHz clock sends 900,000,000 pulses per second. All other things being equal, a computer with a high clock frequency (sometimes called clock speed) can execute more instructions per second than one with a lower clock frequency.

SAQ 4.3

(a) For the instruction MUL 502, identify the operand and the operator. What do you think this instruction means? (Hint: Read the comment in the table for the ADD 502 instruction.)

(b) Does every instruction in the program on page 34 have an operand? Which instruction appears to have an operand that does not reference a memory location where a meaningful piece of data is stored?

Answer to SAQ 4.3

(a) The operand is 502 and the operator is MUL. You might have guessed that the instruction means multiply the contents of the accumulator by the contents of memory location 502.

(b) Every instruction appears to have an operand. However, in the case of the HLT instruction, the operand is meaningless as this instruction does not need to access any data in memory. (Not all assembly languages require the programmer to specify an operand where it is not meaningful.)

Accumulator

Historically the accumulator was the register used to store any intermediate values during a calculation. In the simplest model of a processor all arithmetic operations are carried out using a single accumulator. In most modern computers, the processor contains several general-purpose registers that allow operations to be carried out directly on the contents of two memory locations, but it is convenient to maintain the simple mental model of a single accumulator and you will still find references to this register in computing literature.

Program counter (or next instruction pointer)

The program counter (PC), sometimes called the next instruction pointer (NIP), is located in the control unit and holds the memory address of the next instruction to be executed. The control unit sends a signal to this register when it needs to know from where the next instruction must be fetched.

Instruction register

The instruction register (IR), also located in the control unit, holds the instruction that is currently being executed. It is responsible for decoding the instruction, i.e. separating the instruction into its operator and operand, and interpreting the meaning of the operator.

Memory address register

The memory address register (MAR) holds the address of a main memory location that needs to be accessed.

Memory data register

The memory data register (MDR) is a temporary holding area for either data or instructions that are being transferred between main memory and the other registers.

Status registers

The status registers hold information about the current status of an instruction. One use of a status register is to flag up or signal a problem. For example, the status register could contain a special code to indicate that the programmer has inadvertently tried to do something impossible, such as divide by zero.

Thanks for the memory: running a program

When discussing the fetch/execute cycle we confined our attention largely to what was happening within the CPU. However, during the cycle, the CPU will need to communicate with disk storage memory and possibly also with input/output devices. We now look a little more closely at the way in which data and instructions are stored in main memory and then consider how hard disk storage memory can be used to supplement main memory while a program is executing. In Subsection 4.4 we will take a brief look at the way in which large amounts of data are moved between input/output devices and the CPU.

Storing and locating data and instructions in main memory

Finding a particular location in main memory is an important part of the fetch/execute cycle. You have seen that main memory consists of a number of addressable (i.e. uniquely identifiable) locations, each capable of holding a certain number of bits. The memory addresses are numeric and start with location 0 (zero) so a computer with n cells of main memory will have memory addresses 0 to n – 1.

The processor’s registers must normally be large enough to hold at least one word (though some registers hold half a word but work in pairs) and the ALU must be able to manipulate one word at a time. Generally, a computer with a large word size can process more data in each instruction cycle than a computer with a small word size.

Virtual memory

Computer memory is expensive, and it is therefore important that efficient use is made of the different types available. We have already seen that different kinds of memory have different characteristics, one of which is the speed at which they can be accessed. The fastest access takes place in the registers but register memory is very expensive. Furthermore, since registers are built into the processor microchip, it is not possible to add to them. At the other end of the scale, the slowest access is from hard disk storage memory, roughly 10,000 times slower than for the registers, but hard disk storage memory is relatively cheap. Practically speaking, computers tend to have lots of cheap, but slow hard disk storage memory and very little fast, but expensive register memory.

The amount of main memory (usually referred to as RAM) supplied in computers has been steadily increasing. At the time of writing it is not uncommon to find 512MB or more of RAM in a standard multimedia PC. However, the size of programs and the amount of data they are required to manipulate has also increased, mainly as a result of applications involving sound and video. Main memory can be supplemented by a technique whereby an area of the disk storage memory (hard disk) can be used to simulate additional capacity during processing. This technique is known as virtual memory.

The use of virtual memory avoids the need for main memory to hold all the instructions and data needed to run a program. Instead, a relatively small working set of data and instructions are held in main memory at any one time, with the rest being kept in a special area on the hard disk. Data and instructions are swapped in and out of main memory as required, creating an illusion of far more main memory than actually exists. The transfer of blocks of data (called pages) between the main memory and the hard disk is managed by the operating system. Although the use of virtual memory enables a computer to run bigger programs, with more data than would otherwise be possible, the speed of execution is reduced due to the increased time needed to access the hard disk.
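The swapping idea can be sketched in a few lines of JavaScript. This is a toy model under stated assumptions: a main memory that holds only two pages, invented names, and an ‘evict the oldest page’ rule of our own – real operating systems implement paging very differently.

var RAM_CAPACITY = 2;   // main memory holds only two pages in this toy model
var ram = {};           // pages currently in main memory
var residents = [];     // names of resident pages, oldest first

function access(pageName, disk) {
    if (!(pageName in ram)) {                    // page fault: not in main memory
        if (residents.length === RAM_CAPACITY) {
            var evicted = residents.shift();     // swap the oldest page out...
            disk[evicted] = ram[evicted];        // ...saving it back to disk
            delete ram[evicted];
        }
        ram[pageName] = disk[pageName];          // swap the wanted page in
        residents.push(pageName);
    }
    return ram[pageName];                        // the program sees one big memory
}

var disk = { p0: "data0", p1: "data1", p2: "data2" };
access("p0", disk);
access("p1", disk);
access("p2", disk);   // p0 is silently swapped out to make room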

As with data held in main memory, data in virtual memory becomes inaccessible once the computer loses power. This may seem surprising, given that data stored on the hard disk is not normally volatile. However, because information on the location of the data on disk is held in main memory, it is lost once the computer loses power.

Cache memory

As already discussed, the speed of the processor’s registers exceeds that of main memory, potentially leading to a situation where processing is slowed down due to the wait for data to be delivered from main memory. To bridge this gap, there exists what is known as cache memory – faster than main memory, but slower than the registers. You will not be surprised to learn that it is also intermediate in cost. Some cache memory is now routinely included on most processor microchips.

Cache memory is ‘fed’ from main memory before the data is finally moved to the registers. When an instruction calls for data, the processor first checks to see if the required data is in the cache. If it is, it takes the data from the cache instead of fetching it from main memory. A typical PC will have several hundred kilobytes of cache memory.
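The check-the-cache-first behaviour can be sketched as follows. The names are illustrative only, and a real cache works in hardware on fixed-size blocks rather than on single values like this.

var cache = {};   // small, fast memory (toy model)

function read(address, mainMemory) {
    if (address in cache) {
        return cache[address];           // cache hit: no trip to main memory
    }
    var value = mainMemory[address];     // cache miss: slower fetch
    cache[address] = value;              // keep a copy for next time
    return value;
}

var mainMemory = { 500: 3, 501: 4 };
read(500, mainMemory);   // miss: fetched from main memory and cached
read(500, mainMemory);   // hit: served from the cache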

Hierarchy of speed and cost

We have seen that there is a hierarchy in terms of speed and cost – from faster, more expensive memory to slower, less costly memory. This is summarised in Figure 4.4.

Program-controlled input and output

Fast data transfer in and out of main memory, and a fast clock speed to ensure that instructions can be executed quickly, will not achieve fast processing overall if a great deal of time is needed for the processor to interact with much slower peripherals (in particular the input/output devices). Historically the lower speed of input and output relative to data flow within the processor has slowed down the speed at which a computer can process data. This has become more of an issue given the requirement to manipulate the very large amounts of data associated with sound and video, and has resulted in a lot of development work in this area.

One outcome is a local input/output bus (I/O bus) which is a high-speed bus that connects performance-critical devices, such as video cards, modems and network interface cards, to the main memory and the processor. The most common local I/O bus is the peripheral component interconnect bus (PCI). PCI technology allows the processor to be bypassed altogether when transferring data to and from main memory, resulting in very fast transfer speeds – typically over 30MB per second compared with less than half this rate for buses that have to channel data through the processor.

Types of programming language

Low-level languages

Machine language

Early computer programmers wrote instructions in a form that could be directly understood by their computer’s family of processors, i.e. in some dialect of machine language. As already discussed, a computer can only handle bits, so each instruction in machine language is a long stream of 1s and 0s. For example, in the machine language for one particular processor family, the command to move data from the processor’s accumulator into main memory address 53281 is:

10001101 00100001 11010000

At first sight, this would be incomprehensible to most people, but there is, in fact, some logic to this pattern of bits. The first byte, 10001101, is the instruction that the control unit interprets as ‘store the contents of the accumulator in memory’. The second and third bytes give the memory address of the location in which the contents should be stored. For reasons we will not explore here, the order of these two bytes needs to be reversed, and so they should be interpreted as 11010000 00100001, giving memory address 53281.

In a sense, machine language programs are the ‘purest’ kind of program because the programmer is literally writing in the language of the processor without any intervening interpreters or translators. A programmer working in machine language can therefore exploit the capabilities of the processor to the full.

However, there are many disadvantages to reading and writing programs in machine language, as listed below.

· Machine language programs are slow and difficult for humans to read and write. In particular, it is hard to memorise instruction codes which, as we have seen, consist of 8-digit binary sequences such as 10001101.

· Memory references are given as actual location numbers. The programmer therefore needs to assign data to particular memory locations and to maintain a ‘memory map’ in order to keep track of what has been stored where, and which locations are still available.

· It is difficult to understand machine language programs without a great deal of annotation. As a consequence they are often very difficult to modify.

· Each family of processors understands only its own machine language. A processor cannot correctly execute a program that has been written in the machine language of another family of processors.

Assembly language

As with machine language, an assembly language is a language in which each instruction exactly corresponds to one hardware operation of the processor. Any computer programming language with this one-to-one correspondence (including the machine language itself) is referred to as a low-level language. In programming languages such as JavaScript, Java, C++ and Smalltalk, one instruction may be equivalent to many machine language instructions. These programming languages are referred to as high-level languages. Because of the one-to-one correspondence between a low-level language instruction and a machine language instruction, a program written in assembly language can directly take advantage of all the features and instructions available on the processor. This is not the case with high-level languages, where a single instruction may be translated into a sequence of low-level instructions in a number of different ways – something over which the programmer has no control.

Whenever a program is written in a language other than machine language, the instructions in the original program (called the source code) need to be converted into equivalent machine language instructions. The task of converting the source code into machine language is done by special programs, called translators. When the source language is an assembly language, the program that does this translation is called an assembler. An assembler takes an assembly language program and generates and saves an equivalent program in machine language. This machine language program can then be executed whenever required.

Assembly language programs have a number of advantages over machine language.

· In assembly languages, mnemonic names (i.e. names that give a clue as to the function of the instruction, such as ADD and SUB) are given to the operations.

· Some assembly languages allow the use of symbolic names for memory locations. These names, together with their numeric equivalents, are held in a table maintained by the assembler, thus avoiding the need for the programmer to deal directly with memory management and addressing.

· An assembly language may allow the programmer to provide data in forms other than binary – ordinary decimal numbers or characters for example – and let the assembler worry about translating them.

· Most assemblers include a facility to print a listing of the source program (in assembly language) as well as a list of its machine language equivalent. These listings are very helpful when it comes to correcting errors or making modifications.

When the source language is a high-level language, the translator is called a compiler or an interpreter.

Assemblers usually notify the programmer of errors in the usage of the assembly language, such as the inclusion of an illegal operator code.

However, assembly language retains some of the disadvantages of machine language.

· An assembly language program can only run on one family of processors. (You will learn in the next unit that programs written in high-level languages can potentially be run on many machines.)

· Although slightly easier to read than machine language, assembly language programs are still difficult to understand by anyone other than the programmer, and so are difficult to modify or update.

· Programs written in assembly language tend to be very long. Because each statement in a high-level programming language is normally equivalent to several machine language instructions, a program written in a high-level language is shorter than the equivalent assembly language program, sometimes by a factor of as much as ten. The workload implications of this are very significant.

High-level languages

Most programmers write in high-level languages because these languages are easier for humans to use. As we have already noted, a single high-level language instruction is usually equivalent to several machine level instructions. In addition, high-level language instructions tend to be closer to English and other natural languages. Both these characteristics lead to increased programmer productivity. Furthermore, due to their increased readability, programs written in high-level languages are easier to debug and to modify.

We have seen that each family of processors understands only instructions written in its own machine language. This means that when we write programs in a high-level language, we need the same kind of translation process that the assembler provides for assembly programs. There are two different mechanisms by which a program written in a high-level language may be translated into machine language: compilation and interpretation. In compilation the program written in the high-level language, called the source code, is used as the input to a translator program called a compiler (we say the source code is compiled). The compiler translates the source program into a program written in the machine language understood by the processor. The machine language version is then saved, and it is this machine language program that is executed every time the program is run.

When you buy computer software, you are not usually able to see the program code in which that software was written. What you have is an executable program, which has already been compiled from the source code.

For a particular programming language several different compilers will be available to take into account that different families of processors have different versions of machine language, and that different operating systems interact with the processors in different ways. For example, you can buy a C++ compiler that is written for a PC running Windows, or a C++ compiler for a computer running Linux. This means that a program written in a high-level language (such as C++) on a PC running Windows can be moved and recompiled on a computer running Linux. The only source code that would need to be changed would be the parts related to I/O and the user interface.

Interpretation is a different mechanism for translating high-level languages to machine language. In effect, an interpreter translates each instruction in the source code into machine language as the program is running, and then immediately executes it. There is never a complete translation of the whole of the source code into machine language, and so no machine language program is generated or saved. The advantage of an interpreted language is that the potentially lengthy compile/execute process does not need to be gone through for each small change in the source code. Interpreted languages lend themselves to situations where small incremental changes to a program are being made, and need to be tested as quickly as possible. The disadvantage is that the translation process must take place every time the program is run, resulting in slower execution. Languages such as JavaScript, Perl and Basic are especially designed to be interpreted, whereas languages such as C, C++ and Java are designed to be compiled.
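The contrast can be illustrated with a toy interpreter written in JavaScript: each line of ‘source code’ is translated and immediately executed, one at a time, and no translated program is ever saved. The miniature language here (just add and print) is invented purely for the example.

function interpret(sourceLines) {
    var total = 0;
    for (var i = 0; i < sourceLines.length; i++) {
        var parts = sourceLines[i].split(" ");   // translate one instruction...
        if (parts[0] === "add") {
            total = total + Number(parts[1]);    // ...and execute it immediately
        } else if (parts[0] === "print") {
            console.log(total);
        }
    }
}

interpret(["add 3", "add 4", "print"]);   // writes 7; nothing is saved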

The structure of computer programs

Sequence, selection and repetition

For most programming languages the default execution order is sequential – instructions are executed in the order in which they appear in the program. The assembly language programs that you saw in Subsection 4.2 were simple sequential programs.

‘Begin at the beginning’, the King said, gravely, ‘and go on till you come to the end; then stop.’

(Lewis Carroll, Alice’s Adventures in Wonderland)

However, there may be occasions when the programmer might wish to specify a possible change in the order of execution, e.g. in response to some user input. All programming languages have mechanisms for changing the order in which their instructions are carried out. An instruction that changes the default sequential order is referred to as a control instruction. Examples of control instructions include instructions to jump immediately to an instruction later in the program, skipping over the instructions in between, or instructions that cause a jump to an earlier instruction.

Control instructions are used to support selection and repetition, which are two very powerful mechanisms when writing computer programs. Selection refers to the process of deciding between two or more courses of action, e.g. whether to execute the next instruction (or block of instructions) or skip over it to do the instruction(s) that follow it. In repetition the decision determines whether or not one or more instructions are repeated. For example, it might be necessary to return to a particular instruction (or block of instructions) to execute it again.

Selection

Selection (sometimes called branching) occurs in a computer program when the instruction or instructions to be executed depend on the situation. The circumstances that determine each possible course of action are formulated as conditions which are statements that can be either true or false, such as: ‘It is raining’ or ‘The traffic is very heavy’.

Selecting between two alternatives

As an analogy, consider the following instructions for a (non-computerised) task: ‘If it is snowing, put the cat in the kitchen; if it is not snowing, put the cat outside.’ The condition is ‘It is snowing’, and we take one of two courses of action depending on whether the condition is true or false. In mathematics and computer programming, true and false are sometimes referred to as Boolean values, and a condition that can be evaluated true or false is referred to as a Boolean expression.

To return to our example about the cat and the weather, we have the following:

a condition: it is snowing

two possible courses of action:

put the cat in the kitchen (if the condition is true)

put the cat outside (if the condition is false)

If the condition is true, we carry out the first instruction (and skip over the second to whatever follows next). If the condition is false, we skip over the first instruction, and carry out the second instruction. Only one of the two instructions is executed. Once we have executed one or the other, we move on to any other instructions which follow. Suppose our original task had been expanded so that it read: ‘If it is snowing, put the cat in the kitchen; if it is not snowing, put the cat outside. After you have dealt with the cat, turn out the lights.’

Informally, we could perhaps write our instructions as follows.

if (it is snowing)
 put the cat in the kitchen
else
 put the cat outside
turn out the lights

Here we have used parentheses to make it clear which part is the condition (a Boolean expression) and we have used indentation to provide a clue as to which parts of the structure are dependent on whether the condition is true or false. The fact that ‘turn out the lights’ is not indented indicates that it is not part of the selection structure. Although this is just an informal example, many high-level programming languages do use the words if and else to indicate a selection structure. In computing we usually use else instead of otherwise to indicate an alternative course of action.

For each particular language, there are strict rules about how conditions and their dependent code sequences are delimited. You will learn about these rules for JavaScript in Unit 7.

The first alternative ‘put the cat in the kitchen’ is referred to as the if branch (or clause), and the second branch ‘put the cat outside’ is referred to as the else branch (or clause). In a selection structure such as this one, the if branch is executed if the condition is true, and the else branch is executed if the condition is false. Figure 6.1 illustrates this example.
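For comparison, here is how the cat example might look in JavaScript (whose exact if...else rules you will meet in Unit 7); the variable name is our own invention.

var itIsSnowing = true;

if (itIsSnowing) {
    console.log("put the cat in the kitchen");   // the if branch
} else {
    console.log("put the cat outside");          // the else branch
}
console.log("turn out the lights");   // follows the selection: always executed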

Negative conditions and NOT

Sometimes it is more convenient to couch a condition in the negative, for example:

it is not snowing

But note that in this case we would need to reverse the two actions so that the if...else construction would be as follows.

if (it is not snowing)
 put the cat outside
else
 put the cat in the kitchen
turn out the lights

Discussion

(a)

if (the customer is not an adult)
 charge £5.00
else
 charge £10.00
admit the customer

More on conditions involving NOT

We wrote the condition ‘it is not snowing’ in the way we would write it in English. But when we write instructions for a computer, we need to be as precise as possible. In programming languages the ‘not’ appears either before or after the complete Boolean expression to make it quite clear what we are negating. The English word ‘not’ is usually represented in Boolean expressions by the word NOT (in upper-case letters).

For example, we could write:

NOT (it is snowing)

or

(it is snowing) NOT

In each case the Boolean expression shown in parentheses (round brackets) is evaluated first, to either true or false, and the NOT then has the effect of reversing it to the other Boolean value, i.e. making it false (if it were true) and true (if it were false).

NOT is called a Boolean operator because it operates on a Boolean expression. It is also an example of a unary operator, which means that it operates on a single value:

NOT (something that evaluates to true) will give false
NOT (something that evaluates to false) will give true

In the next section we will come across two further Boolean operators, both of which are binary operators, i.e. they each operate on two Boolean values.

Truth tables

The effects of applying Boolean operators to different Boolean values are often shown in tables, usually known as truth tables.

Here is the truth table for the NOT operator acting on the Boolean expression ‘it is snowing’.

it is snowing     NOT (it is snowing)
true              false
false             true

Other examples of selection

A single possibility

In the examples so far there have been two alternative actions but selection can also be applied to a situation where there is a single action that we either wish to carry out or not, based on a given condition.

Here’s an example:

if (changes have been made to the document)
 save it
print the document

The condition is ‘changes have been made to the document’, and there is a single if option containing the instruction ‘save it’.

This instruction is executed only if the condition is true. If the condition is false, it is skipped over. In either case the instruction after the selection structure (‘print the document’) is then executed. You can think of this as being similar to a two-alternative structure in which the else clause is empty, and so does not need to be included.

More than two alternatives

More complicated selection structures allow a programmer to test two or more conditions, leading to more than two possible branches. Here’s an example:

If a book is fiction put it on the first shelf, else if it is biography put it on the second shelf, else put it on the third shelf. Here we have two conditions:

the book is fiction
the book is biography

They can be written using two if...else structures as follows:

if (the book is fiction)
 put it on the first shelf
else
 if (the book is biography)
  put it on the second shelf
 else
  put it on the third shelf
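In JavaScript-style syntax (introduced properly in Unit 7) the two nested if...else structures might look like the sketch below; the variable and its values are invented for illustration.

var category = "biography";   // could also be "fiction" or anything else

if (category === "fiction") {
    console.log("put it on the first shelf");
} else if (category === "biography") {
    console.log("put it on the second shelf");   // selected in this example
} else {
    console.log("put it on the third shelf");    // everything else
}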

Repetition

Repetition, or looping, occurs when we have an instruction (or a sequence of instructions) that we might want to execute more than once.

For example, an algorithm for reading a book might contain the following instructions:

while (there are pages left to read in the book)
 read the current page
 turn the current page over
select a new book

Notice that, as with the selection structure, there is a condition that controls what happens – in this case whether or not we repeat the sequence (or block) of two instructions. Like the conditions we encountered earlier, it is a Boolean expression – it evaluates to true or false.

The instructions to be repeated are often referred to as the loop body. In our informal structure we have indented them. Here the loop body consists of the two instructions:

read the current page

turn the current page over

This kind of loop structure is commonly called a while loop. When the start of a while loop is encountered in a program the condition is evaluated; if it is true, the instructions in the loop body are executed. Then the condition is evaluated again and, if it is true, the loop body is executed again. This process continues until the condition evaluates to false, at which point the looping stops, and the instruction following the loop body (if there is one) is executed. Figure 6.4 illustrates the looping structure for reading a book.

As with selection structures the formal rules for writing a looping structure vary from programming language to programming language, and in fact many high-level programming languages have more than one looping structure.

You should be able to see that when a loop has its condition at the beginning, it is possible that the condition may evaluate to false the first time round, and that the sequence of statements in the body of the loop will not be executed at all. This is not necessarily a problem, but the programmer needs to think quite carefully about what the program is to do. Another potential pitfall in designing a loop structure is that if care is not taken to ensure that the condition eventually becomes false, the loop body will repeat forever. This is called an infinite loop.
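The loop traced below counts down from 3, writing out the value of count on each pass. As a preview of the JavaScript syntax covered in Unit 7, it might be written like this – note how the body reduces count, so the condition eventually becomes false and no infinite loop occurs.

var count = 3;

while (!(count === 0)) {      // NOT (count is 0)
    console.log(count);       // write the value of count, then move to a new line
    count = count - 1;        // reduce count by 1
}
// output: 3, then 2, then 1 – the fourth test of the condition ends the loop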

Now let us trace through this loop structure to see what the output might be.

First evaluation of the condition ‘NOT (count is 0)’

As count is 3, this becomes NOT false, which evaluates to true. Since the condition is true, we execute the instructions in the loop body:

write the value of count (i.e. output 3 to the screen)
move to the next line
reduce count by 1 (i.e. count now becomes 2).

So at the end of the first execution of the loop body we have the following.


Second evaluation of the condition

‘NOT (count is 0)’ is still true, so we execute the instructions in the loop body again:

write out count (which is 2)
move to the next line
reduce count by 1 (so it becomes 1).

At the end of the second execution of the loop body we have the following.

Third evaluation of the condition

‘NOT (count is 0)’ is still true, so we execute the instructions in the loop body again:

write out count (which is 1)
move to the next line
reduce count by 1 (so it is 0).

At the end of the third execution of the loop body we have the following.

Fourth evaluation of the condition

This time we get:

NOT (count is 0)
NOT true
false

So we break out of the loop structure, and the next instruction after the loop structure is executed. The final output looks like the following.

Trace tables

Another way of working out what a program does is by using a trace table, i.e. a table that shows what’s going on as each instruction is executed. Trace tables are not formal structures, and do not need to have any particular format. A programmer will usually decide on a format suitable for the sequence of instructions to be traced.

In the case of the looping structure just considered, we need to know the following information as the program progresses:

· the value of ‘count’ at various times – when we test the condition, and when we write it out;

· the value of the condition ‘count is 0’;

· the value of the condition ‘NOT (count is 0)’;

· which line we are at in the output window;

· what we are going to write out.

As long as the value in the NOT (count is 0) column is true, we continually repeat lines 3, 4, 5 and 2. Once it becomes false, we jump to line 6.

The last two columns are used to show what will be written to the output window. The last but one column tells us which line of the window it will be written to, and the last column tells us what will be written. As with our previous method, what we get is the following.

Compound conditions

You have seen that both selection and looping are controlled by conditions that evaluate to true or false. In Subsection 6.2 we looked at some simple conditions, but in real-life tasks conditions can be more complicated.

Here are some examples in English:

It is raining or snowing
Bill is over 18 and under 65
The number is less than 5 or more than 12
The shape is square and red

Conditions involving OR

Consider the condition:

it is raining or snowing

We can write this as two quite separate conditions, joined with the word ‘or’, as follows:

(it is raining) or (it is snowing)

Notice that we use parentheses here to emphasise the fact that we have two separate conditions joined with an or. A condition that is made up of two or more conditions is referred to as a compound condition.

If two conditions are joined with an or, the resulting expression evaluates to true if the first or the second (or both) is true, and false if both the first and the second are false.

The English word ‘or’ is represented in Boolean expressions by the Boolean operator OR. Because it operates on two values, it is called a binary operator.

We can summarise the behaviour of expressions containing OR in the following truth table (where A and B are Boolean values).

A         B         A OR B
true      true      true
true      false     true
false     true      true
false     false     false

Compound conditions involving AND

Consider the condition:

Bill is over 18 and under 65

This can be written as two separate conditions joined with the word ‘and’, as follows:

(Bill is over 18) and (Bill is under 65)

If two conditions are joined with an and, the resulting expression evaluates to true only if both the first and the second are true; if either (or both) is false, it evaluates to false. The English word ‘and’ is represented in Boolean expressions by the binary Boolean operator AND.

Compound conditions involving NOT

Compound conditions involving NOT can sometimes be a little tricky because they are not quite so easy to translate from English.

Consider the condition:

The shape is neither square nor filled in

We can translate this as meaning that the shape is not square and it is not filled in and write it in Boolean notation as:

NOT (The shape is square) AND NOT (The shape is filled in)

In this example we have a mixture of binary and unary Boolean operators, and we need a rule about the order in which we evaluate these. If there are any parentheses, we always evaluate their contents first. Then unary operators are applied, then binary operators. That means that once we have evaluated the two single conditions, we apply each NOT before applying the AND.

Consider the shape in Figure 6.8.

Figure 6.8

If we evaluate the expression above we get the following.

Compound conditions involving OR and AND

It is also possible to construct conditions involving both OR and AND. We will not be covering these conditions in this unit. However, a cautionary note is in order. The precedence of OR and AND varies from programming language to programming language so it is wise always to include parentheses in such conditions so as to make your meaning absolutely clear.
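As a closing illustration, here is how the compound conditions from this section might be written in JavaScript, where AND, OR and NOT appear as &&, || and !. The variables are invented for the example, and parentheses are used throughout to keep the grouping explicit, as recommended above.

var isRaining = false;
var isSnowing = true;
var rainingOrSnowing = (isRaining) || (isSnowing);     // OR: true if either is true

var billsAge = 42;
var workingAge = (billsAge > 18) && (billsAge < 65);   // AND: true only if both are true

var shapeIsSquare = false;
var shapeIsFilled = false;
var neitherSquareNorFilled =
    (!shapeIsSquare) && (!shapeIsFilled);              // each NOT applied before the AND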
