![Page 1: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/1.jpg)
Information Processing Architectures
Raji Gogulapati, Sep 2014
![Page 2: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/2.jpg)
Information Search
Information Acquisition
Information Processing
Information Maintenance
Information Retention
Information System Management
![Page 3: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/3.jpg)
Information Processing
Online Transaction Processing (OLTP)
Online Analytical Processing (OLAP)
Complex Event Processing (CEP)
Massively Parallel Processing (MPP)
Legacy
Random
![Page 4: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/4.jpg)
Infrastructure Essentials for Information Processing
![Page 5: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/5.jpg)
Infrastructure Models of Databases
Shared Nothing
• OLAP • BI, DW, Big Data
Shared Disk
• Traditional RDBMS • OLTP
Shared Everything
• Traditional RDBMS • OLTP
![Page 6: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/6.jpg)
Database Architectures
[Diagram: Shared Everything and Shared Disk layouts, drawn as Process nodes attached to Disk]
Relational database management systems for OLTP information
![Page 7: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/7.jpg)
Shared Nothing, Massively Parallel Architecture Layout
[Diagram: a Master node and four Process nodes, each Process with its own Disk]
For Data Warehousing, Business Intelligence, and Big Data loads of information
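The shared-nothing layout can be illustrated with a minimal sketch. It assumes hash partitioning on a key, which the slide does not specify; the names `partition_rows`, `parallel_sum`, and the sample `sales` rows are hypothetical. Each node holds only its own rows, and the master merely combines partial results.

```python
import zlib
from collections import defaultdict


def partition_rows(rows, key, node_count):
    """Place each row on exactly one node by hashing its partition key."""
    nodes = defaultdict(list)
    for row in rows:
        # crc32 is deterministic across runs, unlike the built-in hash() for str
        node_id = zlib.crc32(str(row[key]).encode("utf-8")) % node_count
        nodes[node_id].append(row)
    return nodes


def parallel_sum(nodes, column):
    """Each node sums its own rows; the master adds up the partial sums."""
    partials = {node_id: sum(r[column] for r in rows)
                for node_id, rows in nodes.items()}
    return sum(partials.values()), partials


# Hypothetical sample data, not taken from the slides
sales = [{"region": region, "amount": amount} for region, amount in
         [("north", 10), ("south", 20), ("east", 5), ("north", 15), ("west", 30)]]

nodes = partition_rows(sales, "region", node_count=4)
total, partials = parallel_sum(nodes, "amount")
print(total)  # 80
```

Because rows with the same key always hash to the same node, a per-key aggregate never needs data from another node's disk, which is the property the layout relies on.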
![Page 8: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/8.jpg)
Trade-offs
Assigning tasks at proper time in the determined order
Batch and online scheduling algorithms
Priority based, First come first served, Round Robin
Load balancing across nodes
Serializing data transfer
Data Transfer, computation delays
Data overflow, underflow
Reference: Chapter 3, Hu, Wen-Chen, and Naima Kaabouch (eds). Big Data Management, Technologies, and Applications. IGI Global. © 2014.
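The three scheduling policies named above can be sketched in a few lines. This is an illustration under stated assumptions, not the algorithms from the referenced chapter: tasks are hypothetical `(name, priority)` pairs, a lower number means higher priority, and round robin is used here to spread tasks across nodes rather than to slice CPU time.

```python
import heapq
from itertools import cycle


def first_come_first_served(tasks):
    """Run tasks in arrival order."""
    return [name for name, _priority in tasks]


def priority_order(tasks):
    """Run the lowest priority number first; arrival index breaks ties."""
    heap = [(priority, index, name)
            for index, (name, priority) in enumerate(tasks)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


def round_robin_assign(task_names, nodes):
    """Hand tasks to nodes in turn so each node gets a similar share."""
    assignment = {node: [] for node in nodes}
    for name, node in zip(task_names, cycle(nodes)):
        assignment[node].append(name)
    return assignment


# Hypothetical task list: (name, priority)
tasks = [("load", 2), ("clean", 1), ("join", 3), ("report", 1)]

fcfs = first_come_first_served(tasks)
by_priority = priority_order(tasks)
assignment = round_robin_assign(by_priority, ["node-a", "node-b"])

print(fcfs)         # ['load', 'clean', 'join', 'report']
print(by_priority)  # ['clean', 'report', 'load', 'join']
print(assignment)   # {'node-a': ['clean', 'load'], 'node-b': ['report', 'join']}
```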
![Page 9: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/9.jpg)
Map Reduce Approach For Big Data Processing
Step 1 - Split the big data among multiple parallel map tasks
Step 2 - Merge and reduce the data by grouping
Distributed memory system
Dynamic job scheduling
Scalable
Key value pairs
Fault tolerant
Reference: Chapter 2, Hu, Wen-Chen, and Naima Kaabouch (eds). Big Data Management, Technologies, and Applications. IGI Global. © 2014.
![Page 10: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/10.jpg)
Map Reduce Concept - Key Value Pairs
Input: A B C D D C A D B
Split: A B C | D D C | A D B
Map: A-1 B-1 C-1 | D-1 D-1 C-1 | A-1 D-1 B-1
Shuffle/Sort: A-1 A-1 | B-1 B-1 | C-1 C-1 | D-1 D-1 D-1
Reduce: A-2 | B-2 | C-2 | D-3
Output: A-2 B-2 C-2 D-3
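The same key-value flow can be traced in a short single-process sketch. It only mimics the phases in plain Python and is not Hadoop code; the function names `map_phase`, `shuffle_sort`, and `reduce_phase` are hypothetical.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (key, 1) pair for every token in one input split."""
    return [(token, 1) for token in split.split()]


def shuffle_sort(mapped_splits):
    """Group all values by key and order the keys."""
    grouped = defaultdict(list)
    for pairs in mapped_splits:
        for key, value in pairs:
            grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


splits = ["A B C", "D D C", "A D B"]       # the three splits from the slide
mapped = [map_phase(s) for s in splits]    # in a cluster, one map task per split
grouped = shuffle_sort(mapped)
counts = reduce_phase(grouped)
print(counts)  # {'A': 2, 'B': 2, 'C': 2, 'D': 3}
```

In a real cluster the map calls and the per-key reduce calls run on different nodes; here they run sequentially so the intermediate values can be inspected.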
![Page 11: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/11.jpg)
Information Processing – Focus and Changes
Map Reduce Framework and Hadoop Distributed File system
Parallelism
• To perform analytics in parallel
• Map & Reduce functions run in parallel
Fault Tolerance
• Share nothing
• Compute nodes
Scalability
• Scale CPU, memory. Robust data management techniques to optimize data retrieval and storage.
Data Locality
• Assign the data processing workload to the server where the data is stored, as per Map Reduce.
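Data locality can be pictured as a small assignment problem. The sketch below assumes each data block is stored on one or more nodes and sends the work for a block to the least-loaded node that already holds it; `assign_by_locality`, the block names, and the node names are hypothetical, and this is not the actual Hadoop scheduler.

```python
def assign_by_locality(block_locations, node_load):
    """Map each block to a node that stores it, preferring the least-loaded holder."""
    load = dict(node_load)  # copy so the caller's dict is not modified
    assignment = {}
    for block, holders in block_locations.items():
        chosen = min(holders, key=lambda node: load[node])
        assignment[block] = chosen
        load[chosen] += 1
    return assignment


# Hypothetical placement of blocks on nodes
block_locations = {
    "block-1": ["node-a", "node-b"],
    "block-2": ["node-b", "node-c"],
    "block-3": ["node-a", "node-c"],
}

assignment = assign_by_locality(
    block_locations, {"node-a": 0, "node-b": 0, "node-c": 0})
print(assignment)
# {'block-1': 'node-a', 'block-2': 'node-b', 'block-3': 'node-c'}
```

Every task lands on a node that already stores its input, so no block has to cross the network before processing starts.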
![Page 12: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/12.jpg)
A Few Basics
![Page 13: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/13.jpg)
ACID, BASE, CAP
Relational database management systems follow ACID rules: Atomicity, Consistency, Isolation, Durability
What to expect from Search: BASE
Yes, Search returns innumerable pages of data
Only one page is basically available (BA)
The rest of the data is in soft state (S)
The rest of the data becomes eventually consistent (E)
According to the CAP theorem, distributed NoSQL big-data databases can fully satisfy only two of the three CAP properties and have to relax expectations on the third.
CAP: Consistency, Availability, Partition Tolerance
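Atomicity, the "A" in ACID, can be seen in a minimal sketch using Python's built-in `sqlite3` module. The `account` table, the `transfer` function, and the balances are hypothetical; the point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
print(ok, failed, balances)  # True False {'alice': 70, 'bob': 80}
```

A BASE-style store would instead accept the write on one replica and let the others converge later, trading this all-or-nothing guarantee for availability.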
![Page 14: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/14.jpg)
Distributed Information management
C. J. Date's 12 Rules for Distributed Databases
Local autonomy
No reliance on a central site for any particular service
Continuous operation
Location independence
Fragmentation independence
Replication independence
Distributed query processing
Distributed transaction management
Hardware independence
Operating system independence
Network independence
DBMS independence
![Page 15: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/15.jpg)
Multiple Models For Data Architectures
Legacy, traditional RDBMS
Object oriented
Distributed
Client Server
Data Warehouses
Parallel and Massively Parallel
Partitioning
Active Databases - Intelligence
Spatial
Multimedia
Temporal
![Page 16: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/16.jpg)
Client Server Databases, Middleware - Drivers
Remote Database Access (RDA)
Distributed Relational Database Architecture (DRDA)
Integrated Database Application Programming Interface (IDAPI)
Data Access Language (DAL)
Open Database Connectivity (ODBC)
1990s
![Page 17: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/17.jpg)
Client Server basic model in the ‘80s
Adapted from Figure 3.2, mid ‘80s client/server environment, Chapter 3, Client Server Databases and Middleware
[Diagram: Client PC and Server applications, each behind an Interface; a Request goes to the server and Data comes back to the client]
![Page 18: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/18.jpg)
Data Warehouse – Applications
Non volatile
Time variant
Integrated
Subject oriented
![Page 19: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/19.jpg)
Data warehousing Models for analytical applications – pre-web
Star
Snowflake
Constellation
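A star model places one fact table at the center and joins it to surrounding dimension tables. The sketch below builds a tiny star in an in-memory SQLite database; all table names, columns, and rows are hypothetical. A snowflake would further normalize the dimensions, and a constellation would let several fact tables share them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
-- The fact table holds measures plus one foreign key per dimension
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount INTEGER
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'music');
INSERT INTO dim_date VALUES (1, 2013), (2, 2014);
INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 20), (2, 2, 5), (2, 2, 15);
""")

# A typical analytical query: aggregate the facts, sliced by two dimensions
rows = conn.execute("""
SELECT p.category, d.year, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY p.category, d.year
ORDER BY p.category, d.year
""").fetchall()
print(rows)  # [('books', 2013, 10), ('books', 2014, 20), ('music', 2014, 20)]
```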
![Page 20: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/20.jpg)
Data warehousing Models for analytical applications – complex web data
Use XML to model data warehouses
Combining OLAP tools with Data mining
Rule based multi dimensional model
![Page 21: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/21.jpg)
Next generation data warehouse
Analytics
Semantic interfaces/ Rules engines, Hadoop/ NoSQL, RDBMS
Data layer
OLTP, legacy data, web data
![Page 22: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/22.jpg)
Source: http://www.sybase.com/files/White_Papers/TDWI_BPR_NextGenDWPlatforms_Q409.pdf
![Page 23: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/23.jpg)
Business Intelligence – Models
Source: www.beyenetwork.com, http://www.b-eye-network.com/view/8385.
DSS 2.0 architecture
![Page 24: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/24.jpg)
Multi tier distributed enterprise applications – Y2k period
Information system tier
Client tier
Presentation (Web) Tier
Frameworks such as J2EE, .NET
Database
Business logic tier
Database server
Application server
Client server
![Page 25: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/25.jpg)
![Page 26: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/26.jpg)
Mobile data progress
Adapted from gsma.com, Mena, Jesus. "Chapter 3 - Mobile Data". Data Mining Mobile Devices. Auerbach Publications, © 2013
1G, 2G, 2.5G, 3G, 4G
Analog, Digital
GSM, GPRS, EDGE, WCDMA
![Page 27: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/27.jpg)
Legacy Migrations Cloud environment – Suitability
Ongoing discussions and debates
![Page 28: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/28.jpg)
Social, Mobile, Cloud environments for enterprise applications
![Page 29: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/29.jpg)
Cloud Infrastructures for processing information
In the context of Big Data, this topic is reserved for more comprehensive coverage separately.
Bandey, D. (2012), Doctor of Law, says: "When a Corporation mines the Big Data within its IT infrastructure a number of laws will automatically be in play. However, if that Corporation wants to analyze the same Big Data in the cloud - a new tier of legal obligations and restrictions arise. Some of them quite foreign to a management previously accustomed to dealing with its own data within its own infrastructure."
Raj, Pethuru, and Ganesh Chandra Deka (eds). "Chapter 2 - Big Data Computing and the Reference Architecture". Handbook of Research on Cloud Infrastructures for Big Data Analytics. IGI Global. © 2014.
![Page 30: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/30.jpg)
Topics for cloud and information processing
Raj, Pethuru, and Ganesh Chandra Deka (eds). "Chapter 9 - Cloud Database Systems: NoSQL, NewSQL, and Hybrid". Handbook of Research on Cloud Infrastructures for Big Data Analytics. IGI Global. © 2014.
Several terms and topics in this area.
Cloud database systems
Cloud storage
Data as a Service
Database as a Service
Data models
Cloud computing has five essential characteristics against which databases can be evaluated for fitness in a cloud environment: on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service.
![Page 31: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/31.jpg)
Big Data Case Studies
Conversions: traditional mainframe to Hadoop, NoSQL database
Recommendation engine
Video streaming analytics
Real-time traffic monitoring
Social behavior log processing
![Page 32: Information processing architectures](https://reader035.vdocuments.us/reader035/viewer/2022062419/5584c16fd8b42aee078b45c8/html5/thumbnails/32.jpg)
References:
Dow, K. E., Hackbarth, G., & Wong, J. (2013). Data architectures for an organizational memory information system. Journal of the American Society for Information Science & Technology, 64(7), 1345-1356. doi:10.1002/asi.22848
Chessell, Mandy, and Harald C. Smith. Patterns of Information Management. © 2013.
Hu, Wen-Chen, and Naima Kaabouch (eds). Big Data Management, Technologies, and Applications. IGI Global. © 2014.
Simon, Alan R. Strategic Database Technology: Management for the Year 2000.
http://www-01.ibm.com/software/data/infosphere/hadoop/hdfs/
Krishnan, Krish. Data Warehousing in the Age of Big Data. © 2013.