meljun cortes computer information processing chapter 9 with_notes

CCT101: Chapter 9

OBJECTIVES

• Describe the types of data processing files

• Describe the types of file organization

• Data validation

FILE, RECORD & FIELD

- Field
• A single data item
• e.g. student name

- Record
• A group of related data items or fields
• e.g. student record

- File
• A collection of related records
• e.g. student file

ENTITY SET, ENTITY & ATTRIBUTES

- Attributes
• Describe the properties of the entity (i.e. fields)

- Entity
• Something about which we store facts (i.e. a record)

- Entity set
• A collection of logically related entities (i.e. a file)

Logical File & Physical Files

1. Physical file:
– Refers to how the data is stored, i.e. the actual arrangement of data on the storage device

2. Logical file:
– Refers to what the file contains and how the data should be processed

Key Field

• A field within the record that is used for locating and processing the record, e.g. student number

FILE LENGTH

• Fixed-length records
– Each record has the same length
– Advantage: easier to design
– Disadvantage: wasted storage space

• Variable-length records
– Records do not all have the same length
– Advantage: saves storage space
– Disadvantage: more difficult to design
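The trade-off between the two record types can be sketched in a few lines of Python. This is only an illustration: the field widths (6-byte student number, 20-byte name, 2-byte age) and the helper names `pack_student` and `read_record` are assumptions made for the example, not part of the chapter.

```python
import struct

# Hypothetical fixed-length layout: 6-byte number, 20-byte name, 2-byte age.
RECORD = struct.Struct("<6s20sH")

def pack_student(number: str, name: str, age: int) -> bytes:
    """Pack one student into a fixed-length record; short fields are zero-padded."""
    return RECORD.pack(number.encode("ascii"), name.encode("ascii"), age)

def read_record(data: bytes, index: int):
    """Fetch record `index` directly: its offset is index * record length."""
    number, name, age = RECORD.unpack_from(data, index * RECORD.size)
    return number.decode("ascii"), name.rstrip(b"\x00").decode("ascii"), age

students = [("S00001", "Ann", 19), ("S00002", "Bartholomew Lee", 21)]

# Fixed-length file: every record occupies RECORD.size bytes, used or not.
file_bytes = b"".join(pack_student(*s) for s in students)

# Variable-length file: comma-separated text, one record per line.
variable_bytes = "".join(
    f"{number},{name},{age}\n" for number, name, age in students
).encode("ascii")
```

With this layout, `read_record(file_bytes, 1)` can jump straight to the second record because every record has the same size, while the variable-length version is smaller but has to be scanned for line breaks to find a record.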

INFORMATION RETRIEVAL

1. Writing:
– The act of transferring a record from main memory to secondary storage.

2. Insertion:
– Adding a new record to an existing file.

3. Deleting:
– Removing a record from a file.

4. Updating:
– Making changes to the contents of a record to show the new status of information.

5. Sorting:
– Rearranging the records in a file for the purpose of producing ordered reports.

6. Merging:
– Combining 2 or more files to produce a single output file.

7. Matching:
– Comparing 2 or more files record against record to ensure there is a complete set of records for each key. Mismatched records are highlighted for action.

8. Searching:
– Looking for a record with a certain key value.

9. Appending:
– Adding a record at the last available space of an existing file.
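Merging and matching are the two operations in this list that work on more than one file at a time, so a short sketch may help. It assumes each file is a list of `(key, data)` tuples already sorted by key; the function names `merge_files` and `match_files` are invented for the illustration.

```python
import heapq

def merge_files(*files):
    """Merge files that are each already sorted by key into one sorted output."""
    return list(heapq.merge(*files, key=lambda record: record[0]))

def match_files(file_a, file_b):
    """Return the keys that appear in only one of the two files (mismatches)."""
    keys_a = {record[0] for record in file_a}
    keys_b = {record[0] for record in file_b}
    return sorted(keys_a ^ keys_b)

# Hypothetical sample data.
branch_a = [(101, "Ann"), (104, "Dev")]
branch_b = [(102, "Bo"), (103, "Cy")]
merged = merge_files(branch_a, branch_b)

master = [(101, "Ann"), (102, "Bo"), (103, "Cy")]
payments = [(101, 50), (103, 20), (105, 75)]
mismatched = match_files(master, payments)
```

Here `merged` holds all four records in key order, and `mismatched` lists the keys 102 and 105, which have a record in one file but not the other.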

ACTIVITY RATIO (HIT RATE)

• Hit rate:
– The number of records changed by an update, compared with the total number of records in the file.
– Hit rate = number of records affected / total records on file

• Volatility:
– A measure of the number of additions and deletions in a file.

• File growth:
– Number of record additions minus number of record deletions.
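As a small worked example of the two measures that are given as formulas, the sketch below uses made-up figures; the function names are assumptions for the illustration.

```python
def hit_rate(records_affected: int, total_records: int) -> float:
    """Activity ratio: fraction of the file's records touched by an update run."""
    return records_affected / total_records

def file_growth(additions: int, deletions: int) -> int:
    """Net change in the number of records on file."""
    return additions - deletions

# Hypothetical run: 50 of 1000 records updated, 120 added, 45 deleted.
rate = hit_rate(50, 1000)      # 0.05, i.e. 5% of the file was affected
growth = file_growth(120, 45)  # 75 more records than before
```

A low hit rate such as 5% suggests that reading the whole file for each update is wasteful, which is one reason to consider an organisation that allows direct access.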

TYPES OF DP FILES

1. Master file

– Permanent or semi-permanent data

– Used for reference and updating

– Shows the current status of data

– Never empty except at its time of creation

– E.g. stock master file

2. Transaction file

– Contains source or transaction data

– Used for updating master file

– E.g. sales transaction file

3. Work file

– Temporary file

– Used for storing intermediate data for further processing

– E.g. file used by sort utility


4. Transition file

– Temporary file for specific use

– E.g. meter readings, customer details for printout

5. Security & backup file

– Extra copy of file against damage/loss

6. Audit file

– Enables auditor to check correct functioning of computer based procedures

– Keeps a copy of all transactions


FILE ORGANISATIONS

• 4 Types

1. Serial

2. Sequential

3. Indexed-sequential

4. Random

SERIAL ORGANISATION

• Simplest organisation; records are not in any order

• Each record is placed in the next available space

• Suitable for
– Unsorted transaction files
– Print files
– Dump files
– Temporary data files

• Records are accessed in the order in which they were placed

• Advantages:
– File design is simple
– Efficient for high-activity files
– Effective use of low-cost file media suitable for batch processing

• Disadvantage:
– Files have to be processed from beginning to end

SEQUENTIAL ORGANISATION

• Records are held in a predefined order

• A designated field within the record is selected as the basis for ordering the records

• This field is known as the record key, or simply the key

• Suitable for master files

• Not suitable for fast-response online enquiry systems

• E.g. payroll transaction file

• Advantages:
– File design is simple
– Efficient for high-activity files
– Effective use of low-cost file media suitable for batched transactions

• Disadvantages:
– Entire file must be processed even if activity is low
– Transactions require sorting
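Both disadvantages show up in a sequential update run, sketched below under some assumptions: the master file is a list of `(key, balance)` tuples in key order, transactions are `(key, amount)` tuples, and the name `sequential_update` is invented for the example.

```python
def sequential_update(master, transactions):
    """Apply transactions to a key-ordered master file in a single pass."""
    transactions = sorted(transactions)      # transactions must be in key order
    new_master, unmatched, t = [], [], 0
    for key, balance in master:              # every master record is read
        # Transactions with a lower key have no master record to update.
        while t < len(transactions) and transactions[t][0] < key:
            unmatched.append(transactions[t])
            t += 1
        # Apply every transaction that carries this record's key.
        while t < len(transactions) and transactions[t][0] == key:
            balance += transactions[t][1]
            t += 1
        new_master.append((key, balance))
    unmatched.extend(transactions[t:])       # keys beyond the end of the master
    return new_master, unmatched

master = [(1, 100), (2, 200), (3, 300)]
transactions = [(3, -50), (1, 25), (4, 10)]
new_master, unmatched = sequential_update(master, transactions)
```

Record 2 is read and rewritten even though nothing changes it, and the transactions have to be sorted before the pass can start.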

INDEXED SEQUENTIAL ORGANISATION

• Records are stored in physical sequence by primary key

• An index is built separately from the data records

• Can be accessed both randomly and sequentially

• 3 main parts
– Prime (home) area
– Overflow area
– Index area

INDEXED-SEQUENTIAL FILES

• When there is insufficient space in the prime (home) area, the overflow area is used

• Overflow areas are created at cylinder and track level

• Access is controlled by means of pointers

• File reorganisation has to be done from time to time

• During reorganisation, overflow records are recovered and indexes rebuilt

- Supports three types of processing:

1. Sequential processing

2. Selective sequential processing / random access

3. Direct (dynamic) access: the block is searched record by record until the record is found
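The way an index narrows the search to one block, which is then searched record by record, can be sketched as follows. This is a simplified model: the blocks, the sample data and the names `find` and `read_all` are assumptions, and the overflow area and its pointers are left out.

```python
from bisect import bisect_left

# Prime area: blocks of records held in key sequence (hypothetical data).
blocks = [
    [(101, "Ann"), (104, "Bo")],
    [(107, "Cy"), (110, "Dee")],
    [(115, "Eve"), (120, "Flo")],
]

# Index area: the highest key held in each block.
index = [block[-1][0] for block in blocks]

def find(key):
    """Use the index to pick a block, then search it record by record."""
    b = bisect_left(index, key)   # first block whose highest key >= key
    if b == len(blocks):
        return None               # key is beyond the last block
    for record_key, data in blocks[b]:
        if record_key == key:
            return data
    return None

def read_all():
    """Sequential processing: read every block in key order."""
    return [record for block in blocks for record in block]
```

`find(107)` consults the index, goes to the second block only and returns `"Cy"`, while `read_all()` still delivers the whole file in key sequence.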

RANDOM ORGANISATION

• Predictable relationship between the record key and the record’s location on disc

• Records are not in physical sequence; they are scattered at random

• Direct addressing

• The key is used as the physical address of the record

• Device dependent

INDEXED-SEQUENTIAL ORGANISATION

• Advantages:
– Transactions may be sorted or unsorted
– Only the affected master records are processed during updating
– Response time is reasonably fast
– Facilitates file enquiry
– Can be processed sequentially and randomly

• Disadvantages:
– Each master file access requires an index file access
– Requires direct access storage devices (still costly)
– Storage space is required for indexes

RANDOM ORGANIZATION


• Key transformation techniques used

1. Division remainder method
– Divide the key value by an appropriate number
– The remainder of the division is used as the address of the record
– The number used to divide is a prime number

2. Mid square hashing
– The key is squared and specified digits are extracted from the middle of the result to yield the address of the record

3. Hashing by folding
– The key is divided into 2 or more parts, which are then added together
– The result is truncated to bring it into the required range of numbers
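The three key transformation techniques can be sketched as small functions. The defaults are assumptions chosen for the illustration (a prime divisor of 97, two middle digits, two-digit parts, a 100-slot address range), and the function names are invented.

```python
def division_remainder(key: int, prime: int = 97) -> int:
    """Address = remainder after dividing the key by a prime number."""
    return key % prime

def mid_square(key: int, digits: int = 2) -> int:
    """Square the key and take `digits` digits from the middle of the result."""
    squared = str(key * key)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])

def folding(key: int, part_size: int = 2, table_size: int = 100) -> int:
    """Split the key into parts, add them, and truncate into the address range."""
    text = str(key)
    parts = [int(text[i:i + part_size]) for i in range(0, len(text), part_size)]
    # With table_size a power of ten, the modulo drops the high-order digits.
    return sum(parts) % table_size

address_a = division_remainder(12345)  # 12345 = 97 * 127 + 26
address_b = mid_square(1234)           # 1234 squared is 1522756
address_c = folding(123456)            # 12 + 34 + 56 = 102, truncated to 2
```

Different keys can land on the same address, for example `division_remainder(26)` gives the same result as `division_remainder(12345)`, so a real file needs some way to store such colliding records.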

RANDOM ORGANISATION

• Advantages:
– As indexes are not required, storage space and searching time are saved
– Insertion and deletion of records can take place

• Disadvantages:
– Variable-length records are difficult to handle
– Gaps in keys can cause wasted space
– Synonyms (different keys that yield the same address) can occur
– Allocation of efficient overflow areas is difficult

DATA VERIFICATION

• Double punching method

• Sight verification
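Double punching means keying the same data twice and comparing the two versions. A minimal sketch, assuming each keying pass is a list of strings and using the invented name `verify_double_entry`:

```python
def verify_double_entry(first_pass, second_pass):
    """Return the positions of records where the two keyings disagree."""
    if len(first_pass) != len(second_pass):
        raise ValueError("the two passes contain different numbers of records")
    return [
        position
        for position, (first, second) in enumerate(zip(first_pass, second_pass))
        if first != second
    ]

# Hypothetical input: the second operator typed the letter O instead of zero.
keyed_once = ["S001,Ann", "S002,Bo"]
keyed_twice = ["S001,Ann", "S0O2,Bo"]
differences = verify_double_entry(keyed_once, keyed_twice)
```

Any position reported in `differences` would be checked against the source document and re-keyed.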

DATA VALIDATION

• Presence

• Size

• Range

• Character check

• Format

• Reasonableness

• Check digits
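The checks listed for data validation can be illustrated on a single student record. Everything specific here is an assumption for the example: the field names, the limits, and the check digit scheme (modulus 11 with weights 2, 3, 4 and so on from the right, one common variant).

```python
import re

def check_digit_mod11(digits: str) -> str:
    """Modulus-11 check digit, weights 2, 3, 4... from the right (assumed scheme)."""
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=2))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)

def validate(record: dict) -> list:
    """Return a list of validation errors; an empty list means the record passed."""
    errors = []
    number = record.get("student_no", "")
    if not record.get("name"):
        errors.append("presence: name is missing")
    if len(number) != 7:
        errors.append("size: student_no must have 7 characters")
    elif not re.fullmatch(r"[0-9]{6}", number[:6]):
        errors.append("character: first 6 characters must be digits")
    elif check_digit_mod11(number[:6]) != number[6]:
        errors.append("check digit: student_no fails the modulus-11 test")
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", record.get("enrolled", "")):
        errors.append("format: enrolled must look like YYYY-MM-DD")
    if not 0 <= record.get("mark", -1) <= 100:
        errors.append("range: mark must be between 0 and 100")
    if not 15 <= record.get("age", 0) <= 80:
        errors.append("reasonableness: age is unlikely for a student")
    return errors

good = {"name": "Ann", "student_no": "1234560", "enrolled": "2024-09-01",
        "mark": 72, "age": 19}
bad = {"name": "", "student_no": "1234561", "enrolled": "01/09/2024",
       "mark": 140, "age": 19}
```

`validate(good)` returns an empty list, while `validate(bad)` reports a presence, check digit, format and range error, so the record would be rejected before it reaches the master file.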

ERROR RECOVERY

• Adequate program checkpoint/restart facilities

• File dumps

• Generations of backup files
