Data for Business

Upload: shonda-burns

Post on 17-Jan-2016


Page 1

Data for Business

Page 2

Conventional Business Tools

Paper-Based
Letters
Telephone
Fax
Teleconferencing
Etc.

Page 3

Evolution of Data for Business

Paper-based (Basic Infrastructure)

E-Docs (Standalone)

Network
- LAN
- WAN

Page 4

Stand-alone Computer

Using computers in business without connectivity

[Diagram labels: Cashier, Inventory, Customer Profile, Employee Profile]

Page 5

Intranet Computer

Local Area Network (LAN)

Page 6

Internet Computer

[Diagram labels: Unsecured Network, Bad Guy, several LANs, Wide Area Network (WAN)]

Page 7

Wireless Network

WLAN (Wireless LAN)

Wi-Fi
• Wireless network in computer systems which enables connection to the internet or other machines

More convenient but more exposed to the public

Needs better protection
• Use data encryption

Page 8

Levels of Data Access

Executive
Manager
Employees

[Diagram labels: Within Organization, Outside Organization]

Page 9

Data Sharing

We need to:
• Guarantee each worker access to the right information, at the right time, from whatever source

We need to:
• Provide each worker with the appropriate interfaces to work with this information and make decisions

Page 10

Scope of Data Sharing

Private (internal use)
• LAN (Intranet)

Public
• WAN (Internet)

Page 11

Why Go Public?

Increase Productivity
• Online transaction

Open business opportunities
• Create partnership

Page 12

Data Management

Centralized System
• Easy to manage
• Can lead to bottleneck problems at peak times

Distributed System
• Hard to manage
• Provides better performance and scalability

Page 13

Centralized System

[Diagram: a server with a database (dB) connected to Client 1, Client 2, Client 3, Client 4, Client 5, and Client 6]

Page 14

Distributed DBMS

Data Partitioning
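
The slide names data partitioning without showing a scheme. The following is a minimal sketch of one common approach, hash partitioning on a record key. The node count, the helper names `partition_for` and `partition_records`, and the `id` field are all hypothetical, chosen only for illustration.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical number of database nodes


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition index using a stable hash."""
    # zlib.crc32 gives the same value on every run, unlike hash() on str.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_records(records, num_partitions: int = NUM_PARTITIONS):
    """Group records (dicts with an 'id' field) by their assigned partition."""
    partitions = defaultdict(list)
    for record in records:
        index = partition_for(record["id"], num_partitions)
        partitions[index].append(record)
    return dict(partitions)
```

Calling `partition_records` on a list of customer records returns a dictionary from partition index to the records stored there. Each node then holds only part of the data, which is what lets a distributed system avoid the single-server bottleneck described for the centralized system.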

Page 15

Questions of Concern

What can be shared and what cannot be?

Is Data Privacy guaranteed by using IT systems?

Is our current system sufficiently useful?

What do we really need?

Page 16

Symmetric Cryptography

http://msdn.microsoft.com/en-us/library/aa480570.aspx

Page 17

Asymmetric Cryptography

http://msdn.microsoft.com/en-us/library/aa480570.aspx

Page 18

Data Restriction

Public
• Information which may or must be open to the general public. It is defined as information with no existing local, national or international legal restrictions on access.
• Example: Course Catalog

Sensitive
• Information whose access must be guarded due to proprietary, ethical, or privacy considerations.
• Example: Date of Birth, Ethnicity

Restricted
• Information protected because of protective statutes, policies or regulations. This level also represents information that isn't by default protected by legal statute, but for which the Information Owner has exercised their right to restrict access.
• Example: Student Academic Record (FERPA)

Purdue University
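
As a rough illustration of how these three levels might be applied in software, the sketch below maps the slide's own example fields to a level and checks a requester's clearance against it. Treating the levels as an ordered scale, the `can_access` helper, the field names, and the fail-closed default for unknown fields are assumptions, not part of the Purdue classification.

```python
from enum import IntEnum


class Restriction(IntEnum):
    """The three levels from the slide, ordered from least to most guarded."""
    PUBLIC = 1
    SENSITIVE = 2
    RESTRICTED = 3


# Field-to-level mapping built from the slide's own examples.
FIELD_LEVELS = {
    "course_catalog": Restriction.PUBLIC,
    "date_of_birth": Restriction.SENSITIVE,
    "ethnicity": Restriction.SENSITIVE,
    "student_academic_record": Restriction.RESTRICTED,
}


def can_access(field: str, clearance: Restriction) -> bool:
    """Return True if the clearance covers the field's restriction level.

    Unknown fields are treated as RESTRICTED (assumption: fail closed).
    """
    return clearance >= FIELD_LEVELS.get(field, Restriction.RESTRICTED)
```

With this mapping, a requester holding only `Restriction.PUBLIC` can read the course catalog but not a date of birth.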

Page 19

Data Validation

Data validation is the process of ensuring that a program operates on clean, correct and useful data.

It uses routines, often called "validation rules" or "check routines", that check for correctness, meaningfulness, and security of data that are input to the system.

Data validation checks that data are valid, sensible, reasonable, and secure before they are processed.

Page 20

Data Validation Methods

Format check
• Checks that the data is in a specified format (template), e.g., dates have to be in the format DD/MM/YYYY.

Data type checks
• Checks that the input data matches the chosen data type, e.g., in an input box accepting numeric data, if the letter 'O' was typed instead of the number zero, an error message would appear.

Range check
• Checks that data lie within a specified range of values, e.g., the month of a person's date of birth should lie between 1 and 12.

Limit check
• Unlike range checks, data is checked for one limit only, upper OR lower, e.g., data should not be greater than 2 (≤ 2).
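
The four checks above can be sketched as small functions. This is an illustrative sketch only: the function names and default bounds are hypothetical, and the examples reuse the slide's own cases (DD/MM/YYYY dates, the letter 'O' typed for zero, months 1 to 12, an upper limit of 2).

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")


def format_check(value: str) -> bool:
    """Format check: the value must be a real date written as DD/MM/YYYY."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    try:
        # Rejects impossible dates such as 31/02/2016.
        datetime.strptime(value, "%d/%m/%Y")
    except ValueError:
        return False
    return True


def type_check(value: str) -> bool:
    """Data type check: the input must consist of ASCII digits only."""
    return value.isascii() and value.isdigit()


def range_check(month: int, low: int = 1, high: int = 12) -> bool:
    """Range check: both a lower and an upper bound are tested."""
    return low <= month <= high


def limit_check(value: float, upper: float = 2) -> bool:
    """Limit check: only one bound is tested (here, an upper limit)."""
    return value <= upper
```

For instance, `type_check("1O")` returns `False` because the second character is the letter 'O', not a zero, which is the case where the slide says an error message would appear.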

Page 21

Data Validation Methods (cont.)

Presence check
• Checks that important data are actually present and have not been missed out, e.g., customers may be required to have their telephone numbers listed.

Spelling and grammar check
• Looks for spelling and grammatical errors.

Consistency Checks
• Checks fields to ensure data in these fields corresponds, e.g., If Title = "Mr.", then Gender = "M".
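
A minimal sketch of the presence and consistency checks, assuming records are plain dictionaries. The field names `telephone`, `title`, and `gender` are hypothetical; the consistency rule is the one given on the slide. The spelling and grammar check is left out because it needs a dictionary or language tool.

```python
def presence_check(record: dict, required=("telephone",)) -> list:
    """Return the names of required fields that are missing or blank."""
    return [
        name for name in required
        if not str(record.get(name) or "").strip()
    ]


def consistency_check(record: dict) -> bool:
    """Apply the slide's rule: if Title is "Mr.", Gender must be "M"."""
    if record.get("title") == "Mr.":
        return record.get("gender") == "M"
    return True  # the rule says nothing about other titles
```

`presence_check` returns an empty list when nothing is missing, so the caller can report exactly which fields still need to be filled in.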

Page 22

Dirty Data

Dirty data refers to inaccurate information/data primarily collected by means of data capture forms.

Dirty data is data that is:
• Misleading
• Incorrect or without generalized formatting
• Containing spelling or punctuation errors
• Entered in a wrong field, or duplicated

Page 23

Causes of Dirty Data

Deliberate distortion of information
• A person could deliberately insert misleading or fictional data, such as personal information or biographical data. Because it appears real, it may not be picked up by an administrator and/or a validation routine.

Typographical errors

Formatting issues
• Personal preferences for formatting of the data (such as phone numbers) could lead to the introduction of dirty data

Duplication errors
• Duplicate data may be caused by accidental double submission of forms, incorrect data joining, or user error(s)

Page 24

Dirty Data Prevention

Dirty data is commonly prevented using input masks or validation rules.

Completely removing dirty data from a data source is impossible or impractical in some cases.
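
An input mask can be sketched as a template that every entered character must fit. The mask alphabet used here ('9' for a digit, 'A' for a letter, anything else literal) and the phone format are assumptions made for illustration, not a standard taken from the slides.

```python
def matches_mask(value: str, mask: str) -> bool:
    """Return True if value fits the mask.

    '9' accepts any digit, 'A' accepts any letter, and every other
    mask character must appear literally at the same position.
    """
    if len(value) != len(mask):
        return False
    for char, slot in zip(value, mask):
        if slot == "9":
            if not (char.isascii() and char.isdigit()):
                return False
        elif slot == "A":
            if not (char.isascii() and char.isalpha()):
                return False
        elif char != slot:
            return False
    return True


PHONE_MASK = "999-999-9999"  # hypothetical phone number format
```

Rejecting `"5550100199"` at entry time forces one agreed phone format, which addresses the formatting-preference cause of dirty data before it reaches the database.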

Page 25

Data Cleansing

Data cleansing or data scrubbing is the act of detecting and correcting (or removing) corrupted or inaccurate records from a record set, table, or database.

It refers to identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying or deleting this dirty data.

Data cleansing differs from data validation in that:
• validation means data is rejected from the system at entry and is performed at entry time, rather than on batches of data.
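
In contrast to entry-time validation, cleansing runs over a batch that is already stored. The sketch below is one possible pass, assuming records are dictionaries with hypothetical `name` and `phone` fields: it modifies (normalizes) values, deletes incomplete records, and removes duplicates.

```python
import re


def normalize_phone(raw: str) -> str:
    """Keep digits only so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", raw)


def cleanse(records):
    """Return a cleaned copy of a batch of records."""
    seen = set()
    cleaned = []
    for record in records:
        # Collapse repeated whitespace in the name.
        name = " ".join(record.get("name", "").split())
        phone = normalize_phone(record.get("phone", ""))
        if not phone:
            continue  # incomplete record: removed rather than guessed
        key = (name.lower(), phone)
        if key in seen:
            continue  # duplicate, e.g. from a double form submission
        seen.add(key)
        cleaned.append({"name": name, "phone": phone})
    return cleaned
```

Whether an incomplete record should be deleted or sent back for correction is a policy choice; deletion is used here only to keep the sketch short.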

Page 26

Steps in the Evolution of Data Mining

Data Collection (1960s)
• Business question: "What was my total revenue in the last five years?"
• Enabling technologies: Computers, tapes, disks
• Characteristics: Retrospective, static data delivery

Data Access (1980s)
• Business question: "What were unit sales in New England last March?"
• Enabling technologies: RDBMS, SQL, ODBC
• Characteristics: Retrospective, dynamic data delivery at record level

Data Warehousing & Decision Support (1990s)
• Business question: "What were unit sales in New England last March? Drill down to Boston."
• Enabling technologies: On-line analytic processing (OLAP), multidimensional databases, data warehouses
• Characteristics: Retrospective, dynamic data delivery at multiple levels

Data Mining (Emerging Today)
• Business question: "What’s likely to happen to Boston unit sales next month? Why?"
• Enabling technologies: Advanced algorithms, multiprocessor computers, massive databases
• Characteristics: Prospective, proactive information delivery

http://www.thearling.com/text/dmwhite/dmwhite.htm

Page 27

Data Storage Performance

Life Cycle of Data (dB)
• Active: Fast
• Less Active: Medium
• Historical: Slow
• Archive: Per Request
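
One way to picture this life cycle is a rule that assigns each record to a tier by how long ago it was last used. The slide gives no cut-off ages, so the day thresholds below are purely hypothetical, as is the `storage_tier` helper; only the tier names and access speeds come from the slide.

```python
from datetime import date

# (tier name, access speed from the slide, hypothetical maximum age in days)
TIERS = [
    ("Active", "Fast", 90),
    ("Less Active", "Medium", 365),
    ("Historical", "Slow", 5 * 365),
]
ARCHIVE = ("Archive", "Per Request")


def storage_tier(last_accessed: date, today: date):
    """Return (tier, access speed) based on days since last access."""
    age_days = (today - last_accessed).days
    for name, speed, max_age in TIERS:
        if age_days <= max_age:
            return name, speed
    return ARCHIVE  # older than every threshold
```

A real system would tune the thresholds to its own access patterns and storage costs.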

Page 28

Data for Business

RFID Technology

Page 29

Radio Frequency Identification (RFID)

An automatic method, relying on storing and remotely retrieving data using devices called “RFID tags”.

Page 30

Types of RFID

Passive
• Does not have an internal power supply
• Range (4 cm up to a few meters)

Active
• Has its own power supply to broadcast a signal to the reader
• Range of hundreds of meters, with a 10-year battery lifetime

Semi-passive
• Has its own power for the chip but not for broadcasting a signal
• Greater sensitivity than passive, typically 100 times more

[Figure: RFID backscatter]

Page 31

Example of RFID Tags

RFID in the form of a sticker

An RFID tag used for electronic toll collection

Page 32

Implantable RFID Chip

Page 33

Logo of the Anti-RFID Campaign