Manipulating data
Data management: validation and verification
Valid data
• Data that is valid is allowable
• Valid data has to obey certain rules
• Data can be incorrect yet still valid
Data can be valid and incorrect
Example:
• A person has a date of birth 19/12/87
• A user enters it incorrectly as 19/12/78
• Both are valid as dates
• Yet one is incorrect
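A minimal Python sketch of the example above, assuming dates are typed as DD/MM/YY. The helper name is_valid_date is made up for illustration. It suggests why a validity check alone cannot catch this mistake: both dates pass.

```python
from datetime import datetime

def is_valid_date(text):
    """Return True if text is a real calendar date in DD/MM/YY form (assumed format)."""
    try:
        datetime.strptime(text, "%d/%m/%y")
        return True
    except ValueError:
        return False

correct = "19/12/87"   # the person's real date of birth
typed = "19/12/78"     # what the user typed by mistake

print(is_valid_date(correct))     # passes the validity check
print(is_valid_date(typed))       # also passes, although it is the wrong date
print(is_valid_date("31/02/87"))  # rejected: there is no 31 February
print(typed == correct)           # the comparison a validity check cannot make
```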
Two techniques for reducing errors
• Verification
• Validation
Verification
• Checks that errors are not introduced during typing by the user
• Checks data entered is the same as on a source document (e.g., order form, application form, etc.)
Two methods of verification
• Visual checking/proof reading – checking what has been typed in against a source document
• Double entry of data – two people enter the same data – only if both sets of data are the same will it be accepted
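Double entry can be sketched in a few lines of Python. The field names and function names below are hypothetical, and a real system would prompt for re-entry rather than just report the mismatch.

```python
def double_entry_matches(first_entry, second_entry):
    """Accept the data only if both typed versions are identical."""
    return first_entry == second_entry

def differing_fields(first_entry, second_entry):
    """List the field names where the two typed versions disagree."""
    return [name for name in first_entry
            if first_entry[name] != second_entry.get(name)]

# The same form typed in by two different people (made-up data).
entry_a = {"surname": "Patel", "date_of_birth": "19/12/87"}
entry_b = {"surname": "Patel", "date_of_birth": "19/12/78"}

if double_entry_matches(entry_a, entry_b):
    print("Accepted")
else:
    print("Rejected, check:", differing_fields(entry_a, entry_b))
```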
Validation checks include
• Data type checks – is the data entered the right type for the field (e.g., letters are not entered into a numeric field)?
• Presence checks – has a field been left empty?
• Format checks – is the data of the right length and the right combination of characters for a field (e.g., code FF019J has a length of 6 characters: first two letters, then three numbers and finally a letter)?
• Range checks
• Look-up lists
• Check digits
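Several of these checks can be sketched in Python. The function names are invented for illustration, the format pattern assumes capital letters as in FF019J, and the range limits and look-up list in the usage lines are made-up values.

```python
import re

# Format for codes like FF019J: two letters, three digits, one letter
# (capital letters are an assumption based on the example code).
CODE_PATTERN = re.compile(r"[A-Z]{2}[0-9]{3}[A-Z]")

def presence_check(value):
    """Presence check: the field must not be left empty."""
    return value.strip() != ""

def type_check_numeric(value):
    """Data type check: only digits are allowed in a numeric field."""
    return value.isdigit()

def format_check(code):
    """Format check: right length and right combination of characters."""
    return CODE_PATTERN.fullmatch(code) is not None

def range_check(number, lowest, highest):
    """Range check: the number must lie between the two limits."""
    return lowest <= number <= highest

def lookup_check(value, allowed_values):
    """Look-up list: the value must be one of the allowed values."""
    return value in allowed_values

print(format_check("FF019J"))                        # matches the pattern
print(format_check("FF19J"))                         # too short
print(type_check_numeric("2O"))                      # letter O typed instead of zero
print(range_check(25, 11, 18))                       # outside the limits
print(lookup_check("11T", {"11G", "11T", "11M"}))    # in the list
```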
Parity checks
Ensure that data sent over a network has not become corrupted
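As a rough illustration, here is a sketch of even parity in Python, assuming one parity bit is added to each group of data bits. The function names are made up. Note that a single parity bit can only reveal an odd number of flipped bits.

```python
def add_even_parity(bits):
    """Append a parity bit so that the total number of 1s is even."""
    parity = sum(bits) % 2
    return bits + [parity]

def parity_ok(received):
    """The received bits pass if they still contain an even number of 1s."""
    return sum(received) % 2 == 0

data = [1, 0, 1, 1, 0, 0, 1]   # seven data bits (made-up example)
sent = add_even_parity(data)

corrupted = sent.copy()
corrupted[2] ^= 1              # one bit flipped during transmission

print(parity_ok(sent))         # the check passes for uncorrupted data
print(parity_ok(corrupted))    # the check fails, so the error is detected
```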
Hash and Batch Totals
Totals calculated from the data that have no meaning in themselves; they are used only to check that the data has been entered correctly.
See - Animation
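A hedged Python sketch of the idea, using invented order records. It assumes one common way of drawing the distinction: a hash total adds up a field whose sum has no meaning (here order numbers), while a batch total adds up a field such as quantities. The name control_total is hypothetical.

```python
def control_total(records, field):
    """Add up one field across every record in the batch."""
    return sum(record[field] for record in records)

# Hypothetical batch of orders as written on the source documents.
source_batch = [
    {"order_id": 11901, "quantity": 20},
    {"order_id": 11809, "quantity": 5},
    {"order_id": 11900, "quantity": 12},
]

# The same batch after typing, with one order number mistyped.
typed_batch = [
    {"order_id": 11901, "quantity": 20},
    {"order_id": 11890, "quantity": 5},
    {"order_id": 11900, "quantity": 12},
]

# Totals worked out before entry are compared with totals worked out afterwards.
hash_ok = control_total(source_batch, "order_id") == control_total(typed_batch, "order_id")
batch_ok = control_total(source_batch, "quantity") == control_total(typed_batch, "quantity")

print(hash_ok)    # the order-number totals differ, so the typing error is caught
print(batch_ok)   # the quantity totals agree, as no quantity was mistyped
```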
Spreadsheet software
Spreadsheets
Components of spreadsheets
• Labels - are used for titles, headings, names, and for identifying columns of data
• Data - are the values (text or numbers) that you enter into the spreadsheet
• Formulas - are used to perform calculations on the cell contents
Functions
• A function is a specialized calculation that the spreadsheet software has memorized
• Average
• Max, Min
• Mode, Median
• Sum
• Count, Counta, Countif
• Vlookup
• IF
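For readers who know a little programming, the sketch below shows rough Python equivalents of these functions. The values are made up, and the spreadsheet versions have options (such as approximate matching in Vlookup) that are not modelled here.

```python
import statistics

marks = [4, 8, 8, 15, 23]      # made-up cell values
cells = [4, "", "text", 8]     # a mixed range containing one empty cell

average = statistics.mean(marks)           # Average
highest, lowest = max(marks), min(marks)   # Max, Min
mode = statistics.mode(marks)              # Mode: the most common value
median = statistics.median(marks)          # Median: the middle value
total = sum(marks)                         # Sum

count = sum(isinstance(c, (int, float)) for c in cells)  # Count: numeric cells only
counta = sum(c != "" for c in cells)                     # Counta: non-empty cells
countif = sum(m > 10 for m in marks)                     # Countif with the condition >10

prices = {"Shirt": 12.5, "Coat": 40.0}     # a small look-up table
price = prices["Coat"]                     # Vlookup with an exact match

grade = "Pass" if average >= 10 else "Fail"   # IF(condition, value if true, value if false)

print(average, highest, lowest, mode, median, total)
print(count, counta, countif, price, grade)
```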
The two types of cell referencing
• Relative cell referencing
• Absolute cell referencing
Relative cell referencing
• A relative reference in cell B4 to cell A1 tells the spreadsheet that the cell it refers to is 3 cells up and 1 cell to the left of cell B4
• If cell B4 is copied to another position, say E5, then the reference will still be to the same number of cells up and to the left, so it will now be to cell D2
Absolute cell referencing
• With absolute cell referencing, if cell B4 contains a reference to cell A1, then if the contents of B4 are copied to a new position, the reference will not be adjusted and it will still refer to cell A1
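The difference between the two kinds of reference can be sketched in Python. This is only an illustration of the arithmetic, not how any spreadsheet is implemented: the function names are invented, and it assumes single-letter columns (A to Z) with no check for moving off the sheet.

```python
def parse_cell(cell):
    """Split a cell name such as 'B4' into (column index, row number)."""
    column = ord(cell[0]) - ord("A")
    row = int(cell[1:])
    return column, row

def cell_name(column, row):
    """Turn a column index and row number back into a name such as 'D2'."""
    return chr(ord("A") + column) + str(row)

def copy_reference(reference, source, destination, absolute=False):
    """Return the reference as it reads after the formula is copied."""
    if absolute:
        return reference   # an absolute reference is never adjusted
    ref_col, ref_row = parse_cell(reference)
    src_col, src_row = parse_cell(source)
    dst_col, dst_row = parse_cell(destination)
    # A relative reference moves by the same amount as the formula itself.
    return cell_name(ref_col + dst_col - src_col, ref_row + dst_row - src_row)

print(copy_reference("A1", "B4", "E5"))                  # relative: becomes D2
print(copy_reference("A1", "B4", "E5", absolute=True))   # absolute: stays A1
```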
The benefits of using spreadsheets
• You can perform ‘what if’ investigations – you can make changes to the spreadsheet values to see what happens
• Automatic recalculation – when an item of data changes, all those cells that are connected to the changed cell by a formula will also change
• Accurate calculation – provided the formulas are all correct, the calculations on the numbers will always be correct
• It is easy to produce graphs and charts – once the data has been entered, it is very easy for the spreadsheet to produce graphs and charts based on it
Graphs and Modelling
• Pie, bar, scatter and line graphs
• Modelling
– Inputs
– Variables
– Constants
– Constraints
– Rules
Have a go at activities 1–4 on page 86
Data capture
Data capture 1
• Is the method by which data from the outside world enters the computer for processing
• For example, keyboard entry is a method of data capture
• Other methods include optical mark recognition, the use of sensors, etc.
Data capture 2
The ideal method of data capture would be:
• Comparatively accurate
• Cheap
• Automatic
• Fast
Chip and pin
• Used to capture credit/debit card details
• Details are encrypted on a chip
• User has to enter a PIN that only they know
• The PIN is checked and proves the user is authentic
Optical mark recognition (OMR)
• Automatically reads marks made on a form
• Forms are read/scanned at high speed
• Readers are relatively cheap
• Reject rate owing to people not filling in the form correctly can be high
• If forms are folded they cannot be read
Bar code reading
• A code is stored as a series of light and dark bars
• Uses a scanner to read the bars
• Used on items in stores, parcels, luggage handling at airports, etc.
• Fast to read
• Accurate
• Can be read at a distance
Voice recognition
• Uses a microphone as the input device
• Uses special voice recognition software to turn the sounds into letters that can be understood
• Can input text into word-processing, email software, etc., using speech
• Can also issue commands using speech
Biometrics
• Uses unique features of the human body to recognize a person
• Examples include retinal scanning, fingerprint recognition, etc.
• Ideal for access control to buildings, rooms and computers
• Has the advantage that, unlike passwords, there is nothing to remember
RFID tags
• RFID means radio frequency identification
• Data is stored on a small computer chip
• Tags can be read at a distance
• Tags can be read through clothing or a bag
• They can store a lot of data
• They are quite expensive to manufacture
Databases
A Database
• Fields
– Key fields / Primary key
• Record
• File – Table
• Flat file and relational databases
– Data redundancy
– Keeping important data when deleting records
– More efficient when doing searches
Data Handling Applications
• Financial forecasting
• Weather forecasting
• Flight simulators
• Expert systems for decision making
Data handling software
Covers any software used to store and manipulate data and output information:
• Database software
• Spreadsheet software
Updating, deleting and searching records
• Update – change the data to bring it up to date
• Delete – remove data no longer needed
• Search – look for specific records that match certain criteria (e.g., list the names of all students in Year 11)
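The three operations can be sketched in Python using a list of records. The student data and function names are invented for illustration; database software does the same jobs through its own commands.

```python
# Hypothetical student records, invented for this sketch.
students = [
    {"id": 1, "name": "Duggan", "year": 11},
    {"id": 2, "name": "Hughes", "year": 10},
    {"id": 3, "name": "Lee", "year": 11},
]

def search(records, field, value):
    """Search: return the records whose field equals the value."""
    return [record for record in records if record[field] == value]

def update(records, record_id, field, new_value):
    """Update: change one field of the record with the given id."""
    for record in records:
        if record["id"] == record_id:
            record[field] = new_value

def delete(records, record_id):
    """Delete: return the records without the one that has the given id."""
    return [record for record in records if record["id"] != record_id]

# List the names of all students in Year 11.
year_11_names = [record["name"] for record in search(students, "year", 11)]
print(year_11_names)
```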
Search criteria using operators
Operators are used to construct search criteria and include:
= equals
> greater than
< less than
<> not equal to
>= greater than or equal to
<= less than or equal to
Examples of operators being used
= Patel (In a surname field finds data for people with surname Patel)
= 20 (In a quantity field finds data for all occurrences where the quantity is 20)
>01/02/10 (In a date field finds the data for all the dates after (but not including) 01/02/10)
<>0 (In a quantity field finds the data where the quantity does not equal zero)
Joining operators
Operators can be joined by AND or OR
Size = XL AND Type = Shirt
The above will find all the extra large shirt details
Pet = Dog OR Pet = Cat
The above will find details for Dog or Cat
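The search operators and the AND/OR joins can be tried out with a short Python sketch. The stock records and the helper name matches are hypothetical, and 01/02/10 is read as 1 February 2010 (DD/MM/YY), which is an assumption.

```python
import operator
from datetime import date

# Map each search symbol to the matching Python comparison.
OPERATORS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    "<>": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
}

def matches(record, field, symbol, value):
    """Test one criterion, e.g. matches(item, "Size", "=", "XL")."""
    return OPERATORS[symbol](record[field], value)

# Hypothetical stock records, invented for this sketch.
items = [
    {"Size": "XL", "Type": "Shirt", "Quantity": 20, "Ordered": date(2010, 2, 1)},
    {"Size": "XL", "Type": "Coat", "Quantity": 0, "Ordered": date(2010, 2, 2)},
    {"Size": "M", "Type": "Shirt", "Quantity": 5, "Ordered": date(2010, 1, 31)},
]

# Size = XL AND Type = Shirt: both criteria must be true.
xl_shirts = [i for i in items
             if matches(i, "Size", "=", "XL") and matches(i, "Type", "=", "Shirt")]

# Size = M OR Type = Coat: either criterion may be true.
medium_or_coat = [i for i in items
                  if matches(i, "Size", "=", "M") or matches(i, "Type", "=", "Coat")]

# <>0 in the Quantity field: quantity does not equal zero.
in_stock = [i for i in items if matches(i, "Quantity", "<>", 0)]

# >01/02/10 in a date field: dates after, but not including, 01/02/10.
after_cutoff = [i for i in items if matches(i, "Ordered", ">", date(2010, 2, 1))]

print(len(xl_shirts), len(medium_or_coat), len(in_stock), len(after_cutoff))
```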
What do these searches do?
Pupil ID   Surname   Age   Free meals   Form
11901      Duggan    15    N            11G
11809      Hughes    15    Y            11T
11900      Lee       15    N            11T
11907      Liu       15    Y            11G
11877      Khan      16    N            11M
1 Pupil ID = 11809
2 Age > 15
3 Free meals = Y AND Age > 16
4 Form = 11T OR Free meals = Y
5 Surname = Lee
Answers to check
1 Displays one record for pupil with pupil ID 11809 (i.e., pupil with surname Hughes)
2 Displays all the details of pupils whose age is over 15 (i.e., the details of all 16-year-old pupils in this set of data)
3 Displays data for pupils who have free meals and are over 16 (no pupil in this set of data meets both conditions, so no records are displayed)
4 Displays data for pupils who have free meals or who are in Form 11T
5 Displays the details for pupils with surname Lee
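The five searches can be checked with a short Python sketch that holds the pupil table as a list of records. The helper name surnames is made up, and each search is written exactly as it appears in the question, so search 3 uses Age > 16 and finds no pupils in this data.

```python
pupils = [
    {"Pupil ID": 11901, "Surname": "Duggan", "Age": 15, "Free meals": "N", "Form": "11G"},
    {"Pupil ID": 11809, "Surname": "Hughes", "Age": 15, "Free meals": "Y", "Form": "11T"},
    {"Pupil ID": 11900, "Surname": "Lee", "Age": 15, "Free meals": "N", "Form": "11T"},
    {"Pupil ID": 11907, "Surname": "Liu", "Age": 15, "Free meals": "Y", "Form": "11G"},
    {"Pupil ID": 11877, "Surname": "Khan", "Age": 16, "Free meals": "N", "Form": "11M"},
]

def surnames(condition):
    """Return the surnames of the pupils that satisfy the search condition."""
    return [pupil["Surname"] for pupil in pupils if condition(pupil)]

search_1 = surnames(lambda p: p["Pupil ID"] == 11809)
search_2 = surnames(lambda p: p["Age"] > 15)
search_3 = surnames(lambda p: p["Free meals"] == "Y" and p["Age"] > 16)
search_4 = surnames(lambda p: p["Form"] == "11T" or p["Free meals"] == "Y")
search_5 = surnames(lambda p: p["Surname"] == "Lee")

for number, result in enumerate([search_1, search_2, search_3, search_4, search_5], start=1):
    print(number, result)
```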