introduction to computing
DESCRIPTION
Introduction to Computing. Unit 3: Information Management. Review of the Lecture. Databases Data Modelling Relational Databases Data Integrity and Security Query Language Data Mining. Databases. What is a database? A structured store of data Logically coherent - PowerPoint PPT PresentationTRANSCRIPT
© NCC Education LimitedV1.0
Introduction to Computing Unit 3: Information Management
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.2
Review of the Lecture• Databases• Data Modelling• Relational Databases• Data Integrity and Security• Query Language• Data Mining
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.3
Databases What is a database?
• A structured store of data• Logically coherent• Designed for a specific purpose• Integrated across an organisation• On a computer system• Based on a model• Allows access to data by all who need it
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.4
beans peas tissues
jam tea coffee butter eggs apples
carrots
Data Files Data can be stored in simple files:
• A Serial File- Data is stored one item after another. - e.g. Sales – in the order they happen
• To find an item:- Start at the beginning- Visit each item in turn- Until found or end of file
- Finding an item can take a long time
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.5
Stock no. 12
Stock no. 15
Stock no. 17
Stock no. 19
Stock no. 20
Stock no. 25
Stock no. 56
Stock no. 67
Stock no. 78
Stock no. 87
Data Files• A Sequential File
- data is stored one item after another in some order- E.g. Sales – in stock number order
- To find an item:• start at the beginning• visit each item in turn• until found or stock number higher than the one
searched for
- Finding an item can be quicker:• can stop the search early if item not present• or…. can use an index and fast forward to the right place
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.6
Data Files Adding new data to simple files:
• Serial- go to end of file – add new item
• Sequential- start at the beginning- write old data to new file until correct position for
new item- write new item to new file- write the rest of old file to new file
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.7
Data Files
Direct Access• This method calculates the correct
position and places the data item there. It finds data in the same way.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.8
Data Files
Direct Access• A very quick method• It requires a random access storage
device such as a disk• Cannot use tape storage
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.9
Problems with Simple Data Files Simple data files can be awkward for an organisation.
What if……• Two or more people need to update the same
file?• The software is updated?• The data structure needs to be changed?
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.10
Data Redundancy• The same data may get stored more
than once• One copy gets updated• The other does not
ErrorsLack of data integrity
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.11
Data Dependence
What if the organisation’s data storage needs change?
There is a need for:• Updated data structure• Updated software
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.12
The Database Approach• Data is stored in an integrated
database• Data is stored centrally• The organisation can control it• Everyone has access to the same data• Data is not spread around lots of files• Data can have multiple views
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.13
The Database Approach• Reduced data redundancy• Data is consistent• Data is shared• Corporate standards can be enforced• Better security• Controlled access• Leads to data integrity
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.14
Keeping Data Safe
Security• Data is protected against loss or
unauthorised access
Integrity• Data is as it should be – complete,
consistent and correct
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.15
Data Independence Simple Filing Systems:
• The software must “know about” the way that the data is organised
• Changes to the data structure require changes to the software
Databases• Separate the data from the applications• The data is under the control of a database
management system (DBMS)
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.16
DBMS
A set of programs:• for defining a database• for creating a database• maintaining a database• manipulating a database
- (queries, reports, updating)
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.17
application
DBMS
data
DBMS• A Database Management System
manages the physical data.• Applications can be produced that
work on the data through the DBMS.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.18
DBMS The DBMS:
• Can enforce rules to ensure integrity.• Can separate the application from the
data.• Can allow changes to the applications.• Can perform validation checks.• Provides security features.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.19
Hierarchical Model
Data Models - Hierarchical•Data must be structured for ease
of processing.
•The structure is known as the data model or schema.
•The description of the model is stored in the data dictionary.
•There are various models, e.g. the hierarchical model. This works in some cases
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.20
Entities
Field name ValueFname BarackSname ObamaOccupation President
•An entity is a real-world object.
•We store data about entities.
•The entity has attributes.
•Each attribute makes up a data field.
•Data about one entity makes up a record.
1 record
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.21
Data Models - Relational• The most common model is relational.
- Based on interconnected tables- Each table is made of rows and columns.- A row corresponds to a record.- Each row of a table must have the same
structure.- A column corresponds to a field.- Tables are linked by keys.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.22
Entity Relations A relational database separates the data about different entities into different tables.• A customer can place many
orders. • Each order can contain many
items.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.23
Tables
Links
Relational Model
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.24
Field namestudent_numbersurnameforenamedate_of_birth
Field nameroom_numberdatecustomer_idextras
table: student table: hotel_booking
compound primary key
simple primary key
Key Fields• Each table has a primary key.• This uniquely identifies each record.• It may be one field such as an id number.• It may be a compound key.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.25
Links
A primary key links to a foreign key in a different table.
Au_ID is primary key here
Au_ID is foreign key here
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.26
Integrity and Security• The data is held centrally.• Centralised back up • Levels of access• Validation enforces rules on data entry.• Record locking prevents inconsistent
updates.• Referential integrity prevents
conflicting updates.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.27
Validation•A check that data is acceptable.•Carried out by the software at the time of data entry
•Does not ensure that data is accurate
Examples:length check data item is not too longrange check data is within a limited rangepresence check data must be enteredcheck digit an extra digit is calculated by an algorithm
and added to the datatype check data is the correct data type
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.28
Querying a Database• Databases can get big.• We need a way to extract just the data
we need.• Often done using a query language.• DBMS may have QBE (query by
example).• There are many examples of query
languages.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.29
Querying a Database Some DBMS programs allow Query by example (QBE).
This query selects records where the number of items in stock is more than 400.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.30
Querying a DatabaseThe same query can be constructed in a query language such as
SQL:
SELECT stock.stock_number, stock.stock_name, stock.number_in_stock, supplier.supplier_name, supplier.supplier_telephoneFROM stock INNER JOIN supplier ON stock.supplier_number = supplier.supplier_numberWHERE (((stock.number_in_stock)>400))ORDER BY stock.number_in_stock;
The result of running the query is:
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.31
Creating a Database The query language can also create databases:
CREATE SCHEMA SCHOOL CREATE TABLE STAFF(FName VARCHAR(15) NOT NULL, LName VARCHAR(15) NOT NULL, SocSecNo INT NOT NULL, StaffNum INT NOT NULL, Title VARCHAR(10), Code CHAR(5), PRIMARY KEY (SocSecNo), UNIQUE (StaffNum), FOREIGN KEY (Code) REFERENCES DEPARTMENT (DeptCode));
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.32
Nowadays there is a huge amount of data in databases. There may be useful information that was not part of the plan when the database was set up.
For example, a supermarket with a loyalty card scheme records every purchase made by each customer.
Data Mining
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.33
Data Mining• The supermarket knows the names and
addresses of its card holders.• It knows what they have bought.• It can look through the data for patterns.• Maybe people in certain areas tend to
buy certain things.• Advertising can now be targeted.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.34
Data Mining• Similar items are grouped together.• Attempts to find a model
• Data mining:- uncovers patterns- discovers new relationships- can help to make predictions
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.35
Data Mining
Uses:• Market analysis• Combating terrorism• Predicting genetic disorders• Predicting drug reactions• Traffic analysis
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.36
Government
Criminals Insurance companies
Security services
Tax authorities
Friends and family
Data and Confidentiality With so much data being stored and easy to copy, people worry about who has access to it.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.37
Data and Confidentiality Most countries have data protection laws.
These are designed to protect privacy.
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.38
Data and Confidentiality Most data protection laws are similar:
• Data may only be used for the specific purposes for which it was collected.
• Data must not be disclosed to other parties without consent. • Personal information may be kept for no longer than is
necessary.• Personal information may not be transmitted outside a particular
area. • Those holding personal information are required to have
adequate security measures in place. • Subjects have the right to have factually incorrect information
corrected.
…..but it is so easy to copy data, can these laws be enforced?
© NCC Education LimitedV1.0
Information Management Unit 3 - 3.39
Unit 3
Any questions?