CSI Communications | April 2013 | www.csi-india.org
ISSN 0970-647X | Volume No. 37 | Issue No. 1 | April 2013 | ₹ 50/-
Cover Story: Big Data Systems: Past, Present & (possibly) Future (p. 7)
Technical Trend: Socio-Business Intelligence using Big Data (p. 11)
Research Front: Big Data Enabled Digital Oil Field (p. 17)
CIO Perspective: Deriving Operational Benefits from High Velocity Data (p. 28)
Security Corner: Information Security >> Advanced Persistent Threats (APT) and India (p. 33)
Practitioner Workbench: Programming.Learn("R") (p. 27)
Mumbai, India (1 April 2013)—Two major
IT organizations in India have signed a
memorandum of understanding to benefit
IT professionals throughout the country.
The Computer Society of India (CSI) and
ISACA yesterday signed an agreement
that allows for mutual collaboration and
knowledge sharing for the benefit of the
profession.
ISACA is a global association of 100,000
IT professionals who help enterprises
ensure trust in, and value from, their
information and systems. CSI has more
than 100,000 members and 70 chapters
in India.
The MoU, signed by ISACA Director,
John Ho Chi and CSI President Prof. S V
Raghavan, notes that the organizations
will “advance the global IT profession in
India, and the professional standing of
ISACA and CSI members” by:
• Strengthening the relationship
among ISACA and CSI chapters in
India
• Increasing awareness, use and
adoption of the COBIT framework by
CSI members
• Providing standard-setters,
regulators and legislators with access
to best practices, credentials and
educational opportunities offered by
CSI and ISACA
• Conducting joint educational events
and research projects related to
information systems governance,
security, audit, and assurance issues
in India
“ISACA is pleased to collaborate with CSI
on this important mission: to promote
information systems governance, security,
and assurance in India, and to advance
the IT profession,” said Avinash Kadam,
advisor to ISACA’s India Task Force.
The CSI President expressed
confidence that this collaboration will grow
beyond cooperation between CSI and
ISACA members, and will strengthen
interaction among academia, business, and
industry. CSI members' immediate gain
will be access to ISACA's continuing
professional development programme,
publications, and learning resources.
As a result of the MoU, ISACA will waive
its new-member fee for CSI members who
wish to join ISACA. CSI members will also
receive a discount on ISACA's CISA, CISM,
CGEIT, and CRISC certification exams.
For more information on ISACA, visit
www.isaca.org. To learn more about CSI,
visit www.csi-india.org.
About CSI
Established in 1965, the CSI is India's first
and largest non-profit organization
in the areas of information processing,
computers, and communications. The
mission of the CSI is to facilitate research,
knowledge sharing, learning, and career
enhancement for all categories of IT
professionals, while simultaneously
inspiring and nurturing new entrants
into the industry and helping them to
integrate into the IT community. The CSI
is also working closely with other industry
associations, government bodies, and
academia to ensure that the benefits of
IT advancement ultimately percolate
down to every single citizen of India. The
CSI currently represents over 1,00,000
members affiliated to 73 CSI professional
chapters and about 562 CSI member
institutions (including 499 CSI student
branches) in different states and regions
of India.
About ISACA
With more than 100,000 constituents in
180 countries, ISACA® (www.isaca.org) is
a leading global provider of knowledge,
certifications, community, advocacy,
and education on information systems
(IS) assurance and security, enterprise
governance and management of IT, and
IT-related risk and compliance. Founded
in 1969, the nonprofit, independent ISACA
hosts international conferences, publishes
the ISACA® Journal, and develops
international IS auditing and control
standards, which help its constituents
ensure trust in, and value from, information
systems. It also advances and attests
IT skills and knowledge through the
globally respected Certified Information
Systems Auditor® (CISA®), Certified
Information Security Manager® (CISM®),
Certified in the Governance of Enterprise
IT® (CGEIT®), and Certified in Risk and
Information Systems Control™ (CRISC™)
designations.
ISACA continually updates and expands
the practical guidance and product family
based on the COBIT® framework. COBIT
helps IT professionals and enterprise
leaders fulfill their IT governance and
management responsibilities, particularly
in the areas of assurance, security, risk and
control, and deliver value to the business.
Participate in the ISACA Knowledge Center: www.isaca.org/knowledge-center
Follow ISACA on Twitter: https://twitter.com/ISACANews
Join ISACA on LinkedIn: ISACA (Official), http://linkd.in/ISACAOfficial
Like ISACA on Facebook: www.facebook.com/ISACAHQ
Contact:
Faizan Aboli, Ketchum Sampark, +91-22-4042 5518, [email protected]
Kristen Kessinger, ISACA, +1.847.660.5512, [email protected]
ISACA and CSI Sign Memorandum of Understanding in Mumbai to Advance the IT Profession
Contents
Volume No. 37 • Issue No. 1 • April 2013
CSI Communications
Please note:
CSI Communications is published by Computer
Society of India, a non-profit organization.
Views and opinions expressed in the CSI
Communications are those of individual authors,
contributors and advertisers and they may
differ from policies and official statements of
CSI. These should not be construed as legal or
professional advice. The CSI, the publisher, the
editors and the contributors are not responsible
for any decisions taken by readers on the basis of
these views and opinions.
Although every care is being taken to ensure
genuineness of the writings in this publication,
CSI Communications does not attest to the
originality of the respective authors' content.
© 2012 CSI. All rights reserved.
Instructors are permitted to photocopy isolated
articles for non-commercial classroom use
without fee. For any other copying, reprint or
republication, permission must be obtained
in writing from the Society. Copying for other
than personal use or internal reference, or of
articles or columns not owned by the Society
without explicit permission of the Society or the
copyright owner is strictly prohibited.
Published by Suchit Gogwekar for Computer Society of India at Unit No. 3, 4th Floor, Samruddhi Venture Park, MIDC, Andheri (E), Mumbai-400 093.
Tel.: 022-2926 1700 • Fax: 022-2830 2133 • Email: [email protected] • Printed at GP Offset Pvt. Ltd., Mumbai 400 059.
Editorial Board
Chief Editor: Dr. R M Sonar
Editors: Dr. Debasish Jana, Dr. Achuthsankar Nair
Resident Editor: Mrs. Jayshree Dhere
Published by: Mr. Suchit Gogwekar, Executive Secretary, for Computer Society of India
Design, Print and Dispatch by: CyberMedia Services Limited
Cover Story
7 Big Data Systems: Past, Present &
(possibly) Future
Dr. Milind Bhandarkar
9 Big Data – A Big game changer
Shailesh Kumar Shivakumar
Technical Trends
11 Socio-Business Intelligence Using
Big Data
Gautam Shroff, Lipika Dey & Puneet Agarwal
Research Front
17 Big Data Enabled Digital Oil Field
Pramod Taneja and Prashant Wate
Articles
19 Big Data
A Kavitha, S Suseela and G Kapilya
20 Adoption of In-Memory Analytics
Jyotiranjan Hota
24 Five Key Knowledge Areas for Risk
Managers
Avinash Kadam
Practitioner Workbench
26 Programming.Tips() » Python: Programming Language
for Everyone
Dr. Nibaran Das
27 Programming.Learn("R") » R- StaR of Statisticians
Umesh P and Silpa Bhaskaran
CIO Perspective
28 Deriving Operational Insights from
High Velocity Data
Bipin Patwardhan and Sanghamitra Mitra
Security Corner
33 Information Security »
Advanced Persistent Threat
(APT) and India
Adv. Prashant Mali
34 IT Act 2000 »
Prof. IT Law Demystifies Technology
Law Issues, Issue No. 13
Mr. Subramaniam Vutha
PLUS
IT.Yesterday() Biji C L 35
Brain Teaser Dr. Debasish Jana 37
Ask an Expert Dr. Debasish Jana 38
Happenings@ICT: ICT News Briefs in March 2013 H R Mohan 39
CSI Reports:
M. Gnanasekaran 40
Prof. Prashant R Nair, Mr. Ranga Rajagopal and Dr. Rajveer S Shekhawat 41
Sanjay Mohapatra & Prof. Ratchita Mishra 42
Dr. Dilip Kumar Sharma, Mr. Sanjay Mohapatra and Mr. R K Vyas 43
Bipin V Mehta and S M F Pasha 44
Dr PVS Rao 45
CSI News 46
Important Contact Details »
For queries and correspondence regarding membership, contact [email protected]
Know Your CSI
Executive Committee (2013-14/15) »
President: Prof. S V Raghavan, [email protected]
Vice-President: Mr. H R Mohan, [email protected]
Hon. Secretary: Mr. S, [email protected]
Hon. Treasurer: Mr. Ranga Rajagopal, [email protected]
Immd. Past President: Mr. Satish Babu, [email protected]
Nomination Committee (2013-2014)
Prof. H R Vishwakarma, Dr. Ratan Datta, Dr. Anil Kumar Saini
Regional Vice-Presidents
Region I: Mr. R K Vyas, [email protected] (Delhi, Punjab, Haryana, Himachal Pradesh, Jammu & Kashmir, Uttar Pradesh, Uttaranchal and other areas in Northern India)
Region II: Prof. Dipti Prasad Mukherjee, [email protected] (Assam, Bihar, West Bengal, North Eastern States and other areas in East & North East India)
Region III: Prof. R P Soni, [email protected] (Gujarat, Madhya Pradesh, Rajasthan and other areas in Western India)
Region IV: Mr. Sanjeev Kumar, [email protected] (Jharkhand, Chattisgarh, Orissa and other areas in Central & South Eastern India)
Region V: Mr. Raju L Kanchibhotla, [email protected] (Karnataka and Andhra Pradesh)
Region VI: Mr. C G Sahasrabudhe, [email protected] (Maharashtra and Goa)
Region VII: Mr. S P Soman, [email protected] (Tamil Nadu, Pondicherry, Andaman and Nicobar, Kerala, Lakshadweep)
Region VIII: Mr. Pramit Makoday, [email protected] (International Members)
Division Chairpersons
Division I, Hardware (2013-15): Prof. M N Hoda, [email protected]
Division II, Software (2012-14): Dr. T V Gopal, [email protected]
Division III, Applications (2013-15): Dr. A K Nayak, [email protected]
Division IV, Communications (2012-14): Mr. Sanjay Mohapatra, [email protected]
Division V, Education and Research (2013-15): Dr. Anirban Basu, [email protected]
Important links on CSI website »
About CSI: http://www.csi-india.org/about-csi
Structure and Organisation: http://www.csi-india.org/web/guest/structureandorganisation
Executive Committee: http://www.csi-india.org/executive-committee
Nomination Committee: http://www.csi-india.org/web/guest/nominations-committee
Statutory Committees: http://www.csi-india.org/web/guest/statutory-committees
Who's Who: http://www.csi-india.org/web/guest/who-s-who
CSI Fellows: http://www.csi-india.org/web/guest/csi-fellows
National, Regional & State Student Coordinators: http://www.csi-india.org/web/guest/104
Collaborations: http://www.csi-india.org/web/guest/collaborations
Distinguished Speakers: http://www.csi-india.org/distinguished-speakers
Divisions: http://www.csi-india.org/web/guest/divisions
Regions: http://www.csi-india.org/web/guest/regions1
Chapters: http://www.csi-india.org/web/guest/chapters
Policy Guidelines: http://www.csi-india.org/web/guest/policy-guidelines
Student Branches: http://www.csi-india.org/web/guest/student-branches
Membership Services: http://www.csi-india.org/web/guest/membership-service
Upcoming Events: http://www.csi-india.org/web/guest/upcoming-events
Publications: http://www.csi-india.org/web/guest/publications
Student's Corner: http://www.csi-india.org/web/education-directorate/student-s-corner
CSI Awards: http://www.csi-india.org/web/guest/csi-awards
CSI Certification: http://www.csi-india.org/web/guest/csi-certification
Upcoming Webinars: http://www.csi-india.org/web/guest/upcoming-webinars
About Membership: http://www.csi-india.org/web/guest/about-membership
Why Join CSI: http://www.csi-india.org/why-join-csi
Membership Benefits: http://www.csi-india.org/membership-benefits
BABA Scheme: http://www.csi-india.org/membership-schemes-baba-scheme
Special Interest Groups: http://www.csi-india.org/special-interest-groups
Membership Subscription Fees: http://www.csi-india.org/fee-structure
Membership and Grades: http://www.csi-india.org/web/guest/174
Institutional Membership: http://www.csi-india.org/web/guest/institiutional-membership
Become a member: http://www.csi-india.org/web/guest/become-a-member
Upgrading and Renewing Membership: http://www.csi-india.org/web/guest/183
Download Forms: http://www.csi-india.org/web/guest/downloadforms
Membership Eligibility: http://www.csi-india.org/web/guest/membership-eligibility
Code of Ethics: http://www.csi-india.org/web/guest/code-of-ethics
From the President's Desk: http://www.csi-india.org/web/guest/president-s-desk
CSI Communications (PDF Version): http://www.csi-india.org/web/guest/csi-communications
CSI Communications (HTML Version): http://www.csi-india.org/web/guest/csi-communications-html-version
CSI Journal of Computing: http://www.csi-india.org/web/guest/journal
CSI eNewsletter: http://www.csi-india.org/web/guest/enewsletter
CSIC Chapters SBs News: http://www.csi-india.org/csic-chapters-sbs-news
Education Directorate: http://www.csi-india.org/web/education-directorate/home
National Students Coordinator: http://www.csi-india.org/web/national-students-coordinators/home
Awards and Honors: http://www.csi-india.org/web/guest/251
eGovernance Awards: http://www.csi-india.org/web/guest/e-governanceawards
IT Excellence Awards: http://www.csi-india.org/web/guest/csiitexcellenceawards
YITP Awards: http://www.csi-india.org/web/guest/csiyitp-awards
CSI Service Awards: http://www.csi-india.org/web/guest/csi-service-awards
Academic Excellence Awards: http://www.csi-india.org/web/guest/academic-excellence-awards
Contact us: http://www.csi-india.org/web/guest/contact-us
I deem it a great privilege to be at the helm of affairs of the
Computer Society of India, and it is a great opportunity to be
President of the society at a time when India is on a
high growth path in electronics and computers. The recent
policy declarations by the Government of India – the National Telecom
Policy, National Electronics Policy, and Electronics System
Design and Manufacturing Policy – open up tremendous
possibilities for every Indian. From sensors to supercomputers,
every area is open for innovation and rediscovery. Research
and development leading to intellectual property generation,
and the associated human resources development for capacity
building in related areas, are awaiting active participation
from the CSI.
Since 2010, India has integrated its knowledge-generating
institutions in the form of a National Knowledge Network
(popularly known as NKN). In the same year, the Government of
India launched a project to take fibre-optic cable up to village
panchayats through the National Optical Fiber Network (popularly
known as NOFN). The installation of broadband everywhere, at
speeds of 10 Mbps, 100 Mbps, and 1 Gbps, is making
India the "Best Connected Country". NKN has already
brought close to 1000 national laboratories and institutes
of higher learning into its fold, and is moving towards its target
of 1500 institutions. Virtual classrooms are slowly becoming
a way of life in many of these institutions. The spread of NOFN and
of CSI across the country suggests tremendous
opportunities to work together. Perhaps the Division Chairs,
SIG Chairs, Regional Vice-Presidents, Chapter Chairs, and
National Student Coordinator of CSI would like to brainstorm
and see what role CSI can play in this major change. CSI and
education have been synonymous, and education can perhaps be a single
focal theme for synergistic cooperation.
The new Execom had its first meeting on 31st March 2013.
New Execom members, and those continuing, are
excited about the days ahead. It is a real pleasure working
with this team. I welcome all of them to this wonderful
world of opportunity. As you all know, Shri H R Mohan and
Shri Ranga Rajagopal have joined us as Vice-President and
Treasurer respectively. Shri V L Mehta steps down as Treasurer
after making sure that the finances of CSI are stable and sound.
A wonderful job indeed!
I would like to place on record the excellent work done by
the outgoing team led by Shri Satish Babu. Many programs,
MoUs, and international relationships were handcrafted by them.
Congratulations to you and your team, Satish, for giving a wonderful
year to CSI.
CSI signed an MoU with ISACA on 31st March 2013 for mutual
cooperation. I am sure you will see the details elsewhere in
this issue.
Shri Satish Babu will continue to represent CSI in SEARCC,
BASIS, and ICANN. He has laid a solid foundation between CSI
and these entities, and will continue to strengthen it. I will
support him in all his endeavors. I will represent CSI in the IFIP
General Assembly from now on.
We have a new website and a new portal. Please use them and
give feedback to CSI HQ. The web and portal are the face of CSI,
and hence our critical information infrastructure.
It is wonderful being with you, and I humbly seek your blessings
and support.
With best wishes,
Prof. S V Raghavan
President, Computer Society of India
President's Message: Prof. S V Raghavan
From: [email protected] | Subject: President's Desk | Date: 1st April, 2013
Dear Members,
Editorial
Rajendra M Sonar, Achuthsankar S Nair, Debasish Jana and Jayshree Dhere
Editors
Welcome to the April 2013 issue of CSI Communications – Knowledge
Digest for the IT Community. On behalf of CSI, as editorial
panel members of CSI-C, we are happy to tell you that
we have completed two years of editorship and feel
privileged to bring out the first issue of the third year. In this
issue, we are covering articles on 'Big Data', a buzzword
everybody is talking about and trying to get hold of; the
overwhelming response from our esteemed fellow
contributors proves that! We are still receiving contributions;
we could not accommodate all articles and will carry
forward some to the next issue, and we request those
who have sent contributions to bear with us. We are proud
to tell you that we got a good number of contributions from
our fellow professionals in industry. This shows that Indian
software companies are really serious about big data, are
putting in serious R&D efforts, and must be looking at it as
a big opportunity for India. We welcome and encourage
our practitioners to contribute their valuable knowledge
through CSI-C. A big thank-you to all our contributors.
Hadoop - the name sounds familiar and immediately
catches attention when somebody hears about big data. We
start this issue with the first article under the Cover Story section:
Big Data Systems: Past, Present & (possibly) Future by Dr.
Milind Bhandarkar, Chief Scientist at Greenplum, a division
of EMC2. He writes about big data, big data infrastructure,
Apache Hadoop (its adoption and use cases), and the next
frontiers for big data systems. Big Data – A Big Game
Changer, the second article in this section, is by Shailesh Kumar
Shivakumar, Technology Architect at Infosys Technologies,
Bangalore. He writes about the drivers, opportunities, impact,
and applications of big data, and mentions how big
data had a big impact and redefined the way elections are
fought in the recently concluded US elections.
In the Technical Trends section we have an article by Dr. Gautam
Shroff, Dr. Lipika Dey and Puneet Agarwal of TCS Innovation
Lab titled Socio-Business Intelligence Using Big Data. They
describe how the fusion of social and business intelligence
is defining the next generation of business analytics
applications, using a new AI-driven information management
architecture that is based on big-data technologies and new
data sources available from social media.
In the Research Front section we have an article by Pramod
Taneja and Prashant Wate of iGATE. In their article, Big
Data Enabled Digital Oil Field, they introduce readers
to the oil and gas domain, and discuss the need for a digital oil
field enterprise platform and a big data solution for the digital
oil field, with a detailed functional overview.
We have three articles in the Articles section. The first,
on Big Data, is by A Kavitha, S Suseela and G Kapilya of Periyar
Maniammai University. The second article is on in-memory
analytics, by Prof. Jyotiranjan Hota. He introduces in-memory
analytics, application platforms, vendors, scope and benefits,
research challenges, and the future of in-memory analytics. Prof.
Hota has been one of our regular CSI-C contributors.
Opportunities always come with risks; as a manager,
however, one should know how to manage those
risks. The last article in this section, on this topic, is by our
regular contributor Avinash Kadam. The article covers in
detail what a risk management professional is expected to
be well versed in, and describes these as five key practice
areas of risk and information systems controls.
In the Practitioner Workbench section we have the first article,
under Programming.Tips(), on Python: Programming
Language for Everyone by Dr. Nibaran Das of Jadavpur
University. The second article is by Prof. Umesh P and Silpa
Bhaskaran of the University of Kerala under Programming.
Learn("R"): R - StaR of Statisticians. In this section, they
introduce the language "R" for the first time.
In CIO Perspective, we have an article titled 'Deriving
Operational Insights from High Velocity Data' by Bipin
Patwardhan and Sanghamitra Mitra of Research & Innovation,
iGATE, Mumbai, India, where they discuss the business
drivers of big data and data stream processing: its genesis,
an introduction, and implementations in various domains.
In IT.Yesterday(), we have an article on the founder of
information theory and beloved father of the information age,
Claude Elwood Shannon, titled 'Birthday Tribute to the Most
Influential Mind of the 20th Century' by research scholar Biji C
L of the University of Kerala.
There are other regular features such as Security Corner,
Brain Teaser, Ask an Expert, ICT News Briefs in March
2013 in Happenings@ICT, CSI reports, chapter and student
branch news, and various calls.
As always, we look forward to receiving your feedback,
contributions, and replies at [email protected].
With warm regards,
Rajendra M Sonar, Achuthsankar S Nair,
Debasish Jana and Jayshree Dhere
Editors
Dear Fellow CSI Members,
Big Data Systems: Past, Present & (possibly) Future
Dr. Milind Bhandarkar, Chief Scientist, Machine Learning Platforms, Greenplum, a Division of EMC2
Cover Story
The data management industry has
matured over the last three decades,
primarily based on Relational Database
Management System (RDBMS)
technology. Even today, RDBMSs
power a majority of backend systems for
online digital media, financial systems,
insurance, healthcare, transportation,
and telecommunications companies.
As the amount of data collected and
analyzed in enterprises has increased
severalfold in volume, variety, and
velocity of generation and consumption,
organizations have started struggling
with the limitations of traditional
RDBMS architectures. As a result, a new
class of systems had to be designed and
implemented, giving rise to the new
phenomenon of "Big Data".
In this article, we will trace the origin
and history of this new class of systems
built to handle "Big Data". We refer to currently
popular big data systems, exemplified by
Hadoop, and discuss some current and
future use cases of these systems.
What is Big Data?
While there is no universally accepted
definition of Big Data yet, and most of
the attention in the press is devoted to
the "bigness" of Big Data, the volume of data
is only one factor in the requirements
of modern data processing platforms.
Industry analyst firm Gartner[1] defines Big
Data as:
Big data is high-volume, high-velocity,
and high-variety information assets that
demand cost-effective, innovative forms
of information processing for enhanced
insight and decision-making.
A recent IDC study, sponsored
by EMC2[2], predicts that the "digital
universe", the data generated in
digital form by humankind, will double
every two years, and will reach 40,000
exabytes (40 × 10²¹ bytes) by 2020. A
major driving factor behind this data
growth is ubiquitous connectivity via the
rapidly growing reach of mobile devices,
constantly connected to networks.
What is even more remarkable is that only
a small portion of this digital universe
is "visible": the data (videos,
pictures, documents, status updates,
tweets) created and consumed by
consumers. A vast amount of data will be
created not "by" human users, but "about"
humans by the digital universe, and it
will be stored, managed, and analyzed by
enterprises such as Internet service
providers and cloud service providers of
all varieties (Infrastructure-as-a-Service,
Platform-as-a-Service, and Software-as-
a-Service).
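The doubling claim is easy to sanity-check. The sketch below is my own illustration, not part of the IDC study: it assumes a hypothetical 2012 baseline of 2,500 exabytes and doubles it every two years, which lands on the cited 40,000 exabytes by 2020.

```python
def project(size_eb: float, start_year: int, end_year: int) -> float:
    """Double size_eb once every two years from start_year to end_year."""
    doublings = (end_year - start_year) / 2
    return size_eb * 2 ** doublings

# An assumed 2,500 EB in 2012, doubling every two years: 2500 * 2^4 = 40000 EB.
print(project(2500, 2012, 2020))  # 40000.0
```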
Origins of Big Data Infrastructure
We already notice this rapid growth
of data generation in the online world
around us. Facebook has grown from one
million users in 2004 to more than one
billion in 2012, a thousand-fold increase in
less than eight years. More than 60% of
these users access Facebook from mobile
phones today. The value generated by
a social network is proportional to the
number of contacts between users of the
social network, rather than the number of
users. According to Metcalfe's Law[3] and
its variants, the number of contacts for
N users is proportional to N*logN. Thus,
the growth of contacts, and therefore
of the interactions within a social network
that result in data generation, is non-
linear with respect to the number of users. As
the world gets more connected, one can
expect the number of interactions to grow,
resulting in even more accelerated data
growth.
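To see how much faster contacts grow than users under this N*logN relation, here is a small sketch of my own (the base of the logarithm cancels in the ratio, so the natural log is used for convenience):

```python
import math

def contacts(n_users: int) -> float:
    """Contacts implied by the N*log(N) variant of Metcalfe's Law."""
    return n_users * math.log(n_users)

# A thousand-fold user increase (1 million -> 1 billion, roughly
# Facebook's 2004 -> 2012 growth cited above) implies contacts
# grow 1000 * log(1e9)/log(1e6) = 1500-fold, i.e. faster than users.
user_growth = 1_000_000_000 / 1_000_000
contact_growth = contacts(1_000_000_000) / contacts(1_000_000)
print(user_growth, round(contact_growth))  # 1000.0 1500
```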
Since the popularity of the Internet
was one of the main reasons for the growth
of communication and connectivity in
the world, we saw the emergence of Big
Data platforms in the Internet industry.
Google, founded in 1998 with the goal
of organizing all the information in the
world, became the dominant content
discovery platform on the World Wide
Web, trumping human-powered and
semi-automated approaches such as web
portals and directories. The challenges
Google faced in crawling the web, and in storing,
indexing, ranking, and serving billions
of web pages, could not be solved
economically with the existing data management systems.
The amount of publicly available
content on the web in Google's search
index exploded from 26 million pages in
1998 to more than 1 trillion in less than
a decade[4]. In addition, this content was
"multi-structured", consisting of natural-
language text, images, video, geo-spatial
data, and even renderings of structured data. In
order to rapidly answer search queries,
with information ranked by relevance as
well as timeliness, Google had to develop
its infrastructure from scratch. In 2003
and 2004, Google published details of
part of its infrastructure, in particular
the Google File System (GFS)[5] and the
MapReduce programming framework[6].
These two publications became the
blueprint for Apache Hadoop, an open
source framework that has become a
de facto standard for big data platforms
deployed today.
Apache Hadoop
The GFS and MapReduce papers motivated
Doug Cutting, creator of the open-source
search engine Apache Lucene, to re-
architect Lucene's content system,
called Nutch, to incorporate a distributed
file system and a MapReduce programming
framework for the tasks of crawling, storing,
ranking, and indexing web pages, so that
they could be served as search results
by Lucene. These developments were
noticed by engineers and executives at
Yahoo, which was then struggling to scale
its search backend infrastructure. Yahoo
adopted Apache Hadoop in January
2006, and made significant contributions
to make it a scalable and stable platform.
Today, Yahoo has the largest footprint
of Apache Hadoop, running more than
45,000 servers managing more than 370
petabytes of data with Hadoop[7]. Being an
open source system, licensed under the
liberal Apache Software License and governed
by the Apache Software Foundation, meant
that Hadoop could be freely downloaded,
deployed in any organization, and modified
and used without any hard requirement
to contribute the changes back to
open source. The scalability and flexibility
of Apache Hadoop prompted growing
Internet companies such as Facebook,
Twitter, and LinkedIn to adopt it for
their data infrastructure, and to contribute
modifications and usability enhancements
back to the Apache Hadoop community.
As a result, the Hadoop ecosystem grew
rapidly over the years.
Today, there are more than 20
components in the Hadoop ecosystem
that are developed as open source projects
under the Apache Software Foundation,
and several hundred proprietary and
other open source components. Some
of the popular components in the
Hadoop ecosystem, apart from the Hadoop
Distributed File System (HDFS) and
MapReduce, include Hive, a SQL-like
language that translates to MapReduce;
Pig, an imperative data flow language that
generates MapReduce jobs to execute the
data flow; and HBase, a NoSQL key-value
store that uses HDFS as its persistence
layer. HBase is based on a paper describing
another Google infrastructure component,
Bigtable, which was published in 2006[8].
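The MapReduce paradigm itself is simple, even though Hadoop's implementation is not. The following pure-Python sketch (an illustration of the programming model only, not Hadoop's actual Java API) shows the three phases with the canonical word-count example: a map function emits key/value pairs, a shuffle groups the values by key, and a reduce function aggregates each group.

```python
from collections import defaultdict

def map_fn(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce phase: aggregate all counts emitted for one key."""
    return word, sum(counts)

def map_reduce(lines):
    groups = defaultdict(list)
    for line in lines:                      # map over all inputs
        for key, value in map_fn(line):
            groups[key].append(value)       # shuffle: group values by key
    return dict(reduce_fn(k, v) for k, v in groups.items())  # reduce each group

print(map_reduce(["big data big", "data systems"]))
# {'big': 2, 'data': 2, 'systems': 1}
```

In Hadoop, the map and reduce functions run in parallel across a cluster and the shuffle moves data between nodes, but the data flow is exactly this.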
While Hadoop today has become
the de facto platform for analyzing Big Data,
challenges remain in making it accessible
and improving its ease of use, and thus making
it a first-class citizen of the data infrastructure
managed by IT professionals. The
MapReduce programming paradigm is not
particularly easy for data analysts to use,
and commonly used business intelligence
tools do not interoperate with the interfaces
provided by Hadoop today. To overcome
these challenges, a number of data
warehousing system vendors, such as
Teradata, Oracle, IBM, EMC2/Greenplum,
and others, offer connectivity with Hadoop
platforms. There are even efforts towards
unifying SQL-based OLAP platforms, such
as Greenplum, with Hadoop[9]. A number
of Hadoop distributions have emerged
over the years, improving the manageability
of Hadoop infrastructure. These include
Cloudera, Hortonworks, MapR, EMC2/
Greenplum, IBM BigInsights, Microsoft
HDInsight, etc. In addition, there is an
increasing number of Big Data appliances:
hardware platforms that are integrated
with Hadoop distributions, including those from
Oracle, Teradata, and EMC2/Greenplum.
Hadoop Adoption & Use Cases
Over the years, Hadoop and other big
data technologies have become popular
in non-Internet organizations as well,
which are also struggling to handle the data deluge.
Infrastructure in many organizations in
various industries, such as retail, insurance,
healthcare, finance, manufacturing, and
others, has been almost fully digitized.
Until recently, the data these organizations
collected was stored in archival
systems, mostly for regulatory compliance
purposes. However, there is a growing
realization across these organizations
that this data can be utilized for gaining
competitive advantage, increasing process
efficiencies, and improving customer
experience. In a recent study conducted
by Tata Consultancy Services (TCS)[10],
over 50% of organizations surveyed were
using Big Data technologies, and many of
them predicted more than 25% gains in
return on investment (ROI), mostly from
increased revenue. The flexibility of these
Big Data systems to combine structured
datasets (51%) with semi-structured
datasets (49%) has been cited as enabling
advanced analytics capabilities. In
addition, while most organizations
use data that is available internally (70%),
the availability of external data, such as
from Twitter and other social media,
allows them to perform
better customer behavior analysis.
The 3Vs of data, volume, velocity, and variety, along with the need to develop agile, data-driven applications, imply that the humans analyzing, detecting patterns in, and making sense of data need a rich toolset at hand. Traditional data exploration, visualization, business intelligence, and reporting tools are being adapted to co-exist with these new Big Data technologies. Advances in machine learning algorithms and methods, as well as abundant processing power, have democratized deep and predictive analytics for use in the average IT department. Open source languages for statistical analysis and modeling, such as the popular R language[11] and newcomers such as Julia, as well as emerging machine learning frameworks, such as scikit-learn in Python[12], Apache Mahout for Hadoop[13], and the in-database deep analytics library MADlib[14], have attracted the attention of developers and users for building machine-learning-powered applications on large and diverse datasets.
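As a toy illustration of how approachable such predictive modelling has become, the following plain-Python sketch trains a nearest-centroid classifier on made-up data; real projects would reach for scikit-learn, R, Mahout, or MADlib, but the fit-then-predict shape is the same.

```python
# Toy predictive model: a nearest-centroid classifier in plain Python.
# The data and labels ("loyal"/"churn") are invented for illustration.
from statistics import mean

def train_centroids(rows, labels):
    """Compute the per-class mean (centroid) of the feature vectors."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

# Two obvious clusters of customers, each described by two features.
rows = [[1.0, 1.2], [0.9, 1.1], [5.0, 5.2], [5.1, 4.9]]
labels = ["loyal", "loyal", "churn", "churn"]
model = train_centroids(rows, labels)
print(predict(model, [1.1, 1.0]))  # a point near the "loyal" cluster
```

The same two-step protocol (learn parameters from data, then score new rows) underlies the far richer models the frameworks above provide.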
These new platforms, languages, and frameworks have challenged several predominant practices in enterprises. Traditional data governance practices, including access control, provenance, retention, backup, mirroring, disaster recovery, security, and privacy, are struggling to cope with organizations' ability to store and process massive amounts of diverse data. Over the next few years, one should expect best practices for data governance, and the associated technologies, to emerge and become commonplace.
Industrial Internet: The Next Frontier
While most of the Big Data use-cases today analyze customer behavior, buying patterns, likes and dislikes expressed in social media, clickstreams, and location information from mobile devices, machine-generated data could be the next frontier for Big Data systems. In addition, cheap sensor technology and short-range wireless connectivity have created the possibility of real-time monitoring, and historical pattern analysis, of traditionally analog information sources. For example, a modern Ford automobile has thousands of signals captured by 70+ sensors, generating more than 25 gigabytes of data every hour, processed by 70 on-board computers[15]. While most of this data is transient and needs to be acted upon in real time, recognizing patterns within the data to improve the safety and usability of the automobile implies aggregating and analyzing it offline.
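A hypothetical sketch of this split between real-time action and offline aggregation: each reading triggers an immediate rule while also updating aggregates kept for later analysis. The channel name, threshold, and readings below are invented for illustration.

```python
# Sketch: act on sensor readings in real time (a rolling-mean alert)
# while accumulating aggregates for later offline pattern analysis.
from collections import deque

class SensorChannel:
    def __init__(self, name, alert_threshold, window=5):
        self.name = name
        self.alert_threshold = alert_threshold
        self.recent = deque(maxlen=window)  # transient, real-time view
        self.count = 0                      # aggregates for offline use
        self.total = 0.0
        self.maximum = float("-inf")

    def ingest(self, value):
        self.recent.append(value)
        self.count += 1
        self.total += value
        self.maximum = max(self.maximum, value)
        # Real-time rule: alert if the rolling mean crosses the threshold.
        rolling_mean = sum(self.recent) / len(self.recent)
        return rolling_mean > self.alert_threshold

    def summary(self):
        """What would be shipped off-board for offline analysis."""
        return {"mean": self.total / self.count,
                "max": self.maximum, "n": self.count}

coolant = SensorChannel("coolant_temp_c", alert_threshold=110.0)
readings = [90, 92, 95, 118, 121, 125]
alerts = [coolant.ingest(r) for r in readings]
print(alerts, coolant.summary())
```

Only the compact summary need be retained; the transient window is discarded, mirroring the treatment of in-vehicle data described above.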
Indeed, the massive amount of data captured by sensors in machinery, and the possibility of storing and analyzing this data to make intelligent design and operational decisions, has created a new opportunity, now known by a new moniker: the Industrial Internet[16]. If, as a result of analyzing this data to aid better decision making, we could reduce system inefficiencies in the healthcare industry by a mere 1%, it could result in savings of USD 63 billion over the next 15 years. If advanced analytics on the large amounts of oil and gas exploration data resulted in only a 1% reduction in capital expenditure, it could save more than USD 90 billion over the next 15 years. The key element proposed for the Industrial Internet is intelligent connected machines with advanced sensors for data capture, controls for automation, and software applications powered by deep physics-based analytics and predictive algorithms for analyzing large amounts of sensor and telemetry data.
Indeed, we are witnessing the third revolution, following the industrial revolution and the Internet revolution,
Continued on Page 16
CSI Communications | April 2013 | 9
Introduction
Big Data is, basically, a vast amount of data which cannot be effectively processed, captured, and analyzed by traditional database and search tools in a reasonable amount of time. Though the "big" in Big Data is subjective, McKinsey estimates that it would be anywhere between a few dozen terabytes and petabytes for most sectors.
The Big Data information explosion is mainly due to the vast amounts of data generated by social media platforms, data input from omni-channels, various mobile devices, user-generated data, multimedia data, and so on. Analysts term this the expanding "digital universe".
Big Data is usually defined by the 3Vs: volume, variety, and velocity. To put things in perspective, let's examine each of these dimensions:
• Volume: IBM research finds that every day we add about 2.5 quintillion bytes (2.5 × 10¹⁸) of data; Facebook alone adds 500 TB of data on a daily basis; 90% of the world's data was generated in the last 2 years. Google processes about 1 petabyte of data every hour.
• Velocity: The rate of data growth is also astonishing. Gartner research finds that data is growing at an 800% rate, of which 80% is unstructured. EMC research indicates that data growth is following Moore's law, doubling every 2 years.
• Variety: The data being added is also of various types, ranging from unstructured feeds to social media data, multimedia data, sensor data, etc.
The main value from Big Data is derived by aggregating vast amounts of data integrated from various sources.
The following diagram shows various technologies used in Big Data:
Drivers and Opportunities
There are a lot of drivers forcing businesses to consider Big Data as a key business strategy. Some of them are listed below:
• Real-time prediction
• Increased operational and supply chain efficiencies
• Deep insights into customer behavior based on pattern and purchase analysis
• Information aggregation
• Better and more scientific customer segmentation for targeted marketing and product offerings
Big Data also provides the following opportunities:
• Improved productivity and innovation
• McKinsey predicts an increase in job opportunities ranging from 140K to 190K
• Uncovering hidden patterns and rapidly responding to changing scenarios
• Multi-channel and multi-dimensional information aggregation
• Data convergence
Traditional search, sort, and processing algorithms do not scale to handle data in this range, most of it unstructured. Most Big Data processing technologies therefore include machine learning algorithms, natural language processing algorithms, predictive modeling, and other artificial-intelligence-based techniques.
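One widely used answer to the scaling problem mentioned above is the map/reduce pattern popularized by Hadoop: mappers process shards of data independently, and reducers merge the partial results. A minimal plain-Python sketch of the idea, with ordinary function calls standing in for a distributed runtime:

```python
# Map/reduce word count: each "mapper" handles one shard of text
# independently, and a "reducer" merges the partial counts. On a real
# cluster the map calls run in parallel across many machines.
from collections import Counter
from functools import reduce

def mapper(shard):
    """Count words in one shard of unstructured text."""
    return Counter(shard.lower().split())

def reducer(acc, partial):
    """Merge a partial count into the running total."""
    acc.update(partial)
    return acc

shards = [
    "big data needs new algorithms",
    "traditional algorithms would not scale",
    "big data is mostly unstructured data",
]
partials = map(mapper, shards)
totals = reduce(reducer, partials, Counter())
print(totals["data"], totals["algorithms"])
```

Because mappers share nothing, adding machines adds throughput, which is precisely why the pattern scales where traditional single-node algorithms do not.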
Big Data is of strategic importance for many organizations: any new service or product will eventually be copied by competitors, but an organization can differentiate itself by what it can do with the data it has.
The following diagram shows the convergence of data from various dimensions:
Impact and Applications
In this section we examine the impact and applications of Big Data-related technologies across various industry verticals and technology domains.
Application across industry domains
Financial industry:
• Better financial data management
• Investment banking using aggregated information from various sources, such as financial forecasting, asset pricing, and portfolio management
• More accurate pricing adjustments based on vast amounts of real-time data
Big Data – A Big Game Changer
Shailesh Kumar Shivakumar, Technology Architect, Consulting & Systems Integration, Infosys Technologies, Bangalore, [email protected]
Abstract: Big Data holds big promises for the information technology area. Properly taming and analyzing Big Data provides valuable insights, predicts consumer behavior, improves productivity, and reduces cost. It has the potential to be a game changer, providing big opportunities that catalyze business revenues. This article discusses the key concepts, applications, and challenges in implementing a Big Data strategy.
Keywords: Big Data, predictive analytics, 3Vs, industry and technology applications
Cover Story
• Stock advice based on analysis of huge amounts of stock data and unstructured data such as social media content
• Creditworthiness analysis by analyzing huge amounts of customer transaction data from various sources
• Proactive fraudulent-transaction analysis
• Regulatory conformance
• Risk analytics
• Trading analytics
Retail industry:
• Better analysis of supply chain data and touch points across omni-channel operations
• Customer segmentation based on previous transactions and profile information
• Analysis of purchase patterns and tailor-made product offerings
• Unstructured data analysis from social media and multimedia to understand customer tastes, preferences, and patterns, and to perform sentiment analysis
• Targeted marketing based on user segmentation
• Competitor analysis
Mobility:
• Mining of customer location data and call patterns
• Integration with social media to provide location-based services such as sale offers, friend alerts, points-of-interest suggestions, etc.
• Geo-location analysis
Health care:
• Effective drug prescription by analyzing all structured and unstructured medical history and records of the patient
• Avoiding unnecessary prescriptions
Insurance:
• Customer risk analysis
• Analysis of cross-sell and up-sell opportunities based on customer spending patterns
• Insurance portfolio optimization and pricing optimization
Application across technology domains:
• Search engine improvements: New algorithms to analyze large volumes of unstructured data will be used. The algorithms will be artificial-intelligence based, working in parallel across multiple grids to process huge amounts of data
• Business intelligence tools: Analytics tools will be able to provide new and creative visualizations to intuitively depict the meaning of the data
• Storage management tools: Private/cloud storage systems will undergo change to store huge amounts of data
• Cloud computing: Cloud and social media play a vital role in handling Big Data. The cloud will be the platform of choice to store massive amounts of data and to run software as a service to process it
• ERP systems such as CRM will undergo great improvements. A CRM system can help on-call analysts provide real-time customer offers, customer churn probability, etc.
• Predictive analytics will be more effective by analyzing data from multiple dimensions
A Curious Case of Big Data in the US 2012 Elections
Curiously, Big Data had a big impact on, and redefined the way elections are fought in, the recently concluded US elections. Here are some interesting facts on how Big Data was leveraged:
• In the recently concluded US elections, Obama's team effectively used Big Data to achieve victory
• The Democratic team tasked with data analysis aggregated data from various sources, including voter lists, social media posts, fundraisers, etc.
• Multivariate tests were conducted to understand voters' decision making and to design effective policies to persuade them
• The data analysis included mining voter data, profiling voters, and sending targeted campaign mails to influence their decisions. The analysis also provided crucial insights about the voters most likely to switch sides and the required trigger points for the switch
• The team built a persuasion model with predictive analytics to find the probability of persuasion among populations in various geographies
Analyzing the Big Data was the key differentiator in swinging a good percentage of voters and predicting the results with greater confidence.
Market Opportunity
Big Data offers bigger opportunities. Here is a snapshot of some predictions made by market research firms in this regard:
• IDC predicts the Big Data market to grow to $16.9 billion by 2015
• Digital Reasoning estimates that the Big Data market will be worth $48.3 billion in 2019
References
[1] http://blogs.wsj.com/digits/2009/05/18/the-exploding-digital-universe/
[2] http://www.forbes.com/sites/tomgroenfeldt/2012/01/06/big-data-big-money-says-it-is-a-paradigm-buster/
[3] http://www.emc.com/about/news/press/2011/20110628-01.htm
[4] EMC link
[5] http://online.wsj.com/article/SB10001424127887323353204578126671124151266.html
[6] http://www.eweek.com/c/a/Application-Development/Big-Data-Market-to-Grow-to-169-Billion-by-2015-IDC-118144/
[7] http://www.forbes.com/sites/netapp/2012/11/06/big-data-election-surprising-stats/
About the Author
Shailesh Shivakumar is a technology architect at Infosys with over 11 years of industry experience. His areas of expertise include Java Enterprise technologies, performance engineering, enterprise portal technologies, user interface components, and performance optimization. He has been involved in multiple large-scale and complex online transformation projects for marquee clients of Infosys, and has provided on-demand performance-engineering consultancy for highly critical projects across various units. He is a regular blogger on the Infosys Thought Floor, and many of his technical white papers have been published on the Infosys external site and in the Infosys Labs Briefings journal; his blog was recently listed in the "Most popular" category. He also heads a centre of excellence at Infosys, and holds numerous professional certifications, including Sun Certified Enterprise Architect (Part 1), Sun Certified Java Programmer, Sun Certified Business Component Developer, IBM Certified Solution Architect – Cloud Computing, IBM Certified Solution Developer – IBM WebSphere Portal 6.1, and many others.
Abstract: We describe how the fusion of social and business intelligence is defining the next generation of business-analytics applications, using a new AI-driven information management architecture that is based on big-data technologies and new data sources available from social media.
What is 'BigData'?
The term 'BigData' has become the latest buzzword in the IT industry, much as cloud computing began to elicit interest a few years ago. As in the case of the latter, we submit that BigData is a metaphor for a few significant technology and social-business convergences: popular interest in cloud computing was fuelled by the emergence and eventual confluence of web-based social applications, software as a service, infrastructure as a service, and finally platforms as a service.
In a similar fashion, 'BigData' is essentially the convergence of technology advances in artificial intelligence emanating from search and online advertising, along with the development of new architectures for managing extremely large web-scale data volumes, exemplified by the now popular Hadoop stack. Along with the means to process vast quantities of unstructured data, we also find that the data itself is now readily available: vast volumes of consumer conversations on social media, such as Twitter, are free for all to access, and the rest are rapidly becoming a valuable commodity available for purchase from Facebook, LinkedIn, etc.
In this article we describe a number of 'Socio-Business' applications that exploit these new data sources and are of potential interest to large enterprises. Moreover, we find that each of these applications involves the fusion of information from social media with internal business data, the extraction of knowledge from web sources, the application of artificial intelligence techniques in some fashion, and/or the exploitation of BigData-inspired data-management architectures.
The New Context for Business Intelligence
In the past decade, AI techniques operating at web-scale have demonstrated significant successes on the web, many of which were once thought impossible: statistical machine learning at web-scale is the reason Google's machine translation works. Web-based face recognition relies, among other things, on large-scale multi-way clustering to discover the image features that work best to disambiguate faces; this, coupled with some human tagging, even from profile photos, is then sufficient to recognise faces even without standard scale, pose, expression, or illumination. The Watson system uses 15 terabytes of in-memory data culled from the web and other sources, along with parallel processing across 90 servers. Finally, Siri's hope for success depends on the fact that it includes a cloud component, which opens up the possibility of continuous learning using the large volumes of data its adoption by millions of users will generate.
The time is therefore ripe for enterprises to incorporate AI techniques into their solutions. The potential for AI techniques in the enterprise was aptly articulated almost a decade ago by Dalal et al.[1]. Moreover, the availability of large volumes of data from social media makes it all the more viable, as well as essential, to exploit the techniques already being used so well in web-scale AI applications.
Further, in sharp contrast to the millions of servers powering the web, the largest of enterprise IT departments are used to handling 50,000 or so servers, and hundreds of terabytes of data at most. Enterprise data storage, databases, and data analysis tools are, in turn, tailored to handle terabytes, or at most a petabyte or so. Further, most of the 'big data' emanating from social-media sources is unstructured text data; again, something that the traditional business intelligence tool-stack is not designed to tackle, and for which the aforementioned AI techniques are needed to extract insight.
Moreover, input from social media comprises largely unstructured data; tapping, processing, and analysing it sometimes requires the use of big-data technologies such as those used by the web companies themselves, instead of the traditional databases that are better suited to structured data. Thus, big-data technologies such as Hadoop are often used even though most traditional enterprises do not actually need to process as large a volume of data as the web companies do.
Innovative business use-cases exploiting BigData from social-media and mobility sources span multiple industries, from retail to manufacturing and financial services. A common theme across all these applications, besides having to extract intelligence from large volumes of BigData, is the need to fuse information from multiple sources, both internal and external, structured and unstructured. Further, the rapid pace of developing events on social media means that the standard techniques for translating predictive insights into real-time decision support, such as building (off-line) a deep but computationally 'small' model, need to be enhanced: social-media events need to be filtered, processed, correlated, and analysed for their impact in real time. In the sections that follow, we describe some of these use-cases and explain the techniques they require.
Supply-Chain Disruptions
The natural disasters that struck Japan in 2011, i.e., the earthquake, tsunami, and subsequent release of nuclear radiation, clearly had a devastating effect on the Japanese population and economy. At the same time, the effects of these events were felt around the world; in particular, they led to major disruptions in the global supply chains of many industries, from semiconductors to automobiles and even baby products.
The Japanese earthquake was a major event of global significance, followed closely in the global media on a daily basis; hopefully a fairly rare 'black swan' event. However, many adverse events of far smaller significance occur daily across the world. Such events are mainly of local interest only. Further, public interest in such an event may last but a day or so, while its economic impact may last much longer. Take the example of a fire in a factory: there are, on average, around ten major factory fires in the world every day. Similarly, there are labour strikes that disrupt production. Most of these events affect a very small locality, and may not even reach the local news channels, certainly not global ones. Further, any public interest, however localised, in the event may last a few hours or at most a day. Nevertheless, if the affected factory is a significant supplier to a major manufacturer half-way around the world, this relatively minor event is possibly of great interest to the particular enterprise that consumes its product! It is observed that manufacturers notice such news about their suppliers when they encounter a shortage in supply, usually a few days or sometimes a week later. If, however, technology can help them notice this earlier, they will have more time to make alternate arrangements.
Socio-Business Intelligence Using Big Data
Gautam Shroff,* Lipika Dey,** & Puneet Agarwal*** (TCS Innovation Labs)
Technical Trends
Interestingly, it has been found that many of these events, even ones with extremely local impact, find their way fairly rapidly into social media, and in particular Twitter. Used for social networking in over 200 countries, with over 500 million tweets a day, Twitter turns out to also be a rich source of local news from around the world. Many events of local importance are first reported on Twitter, including many that never reach news channels. Fig. 1 describes the overall architecture for listening to events from social media that we have used both for detecting adverse events and for listening to the 'voice of the customer', as described in the next section.
In[5] we have proposed an architecture that enables a large enterprise to monitor potential disruptions in its global supply chain by detecting adverse events from Twitter streams. In[4] we have described how such events can be efficiently detected, using machine-learning techniques, from amongst streams of unstructured short text messages (tweets) arriving at a rate of tens of messages per second. In contrast with the larger volumes that follow events of wider significance, there are often only a few tweets reporting each such event; the few tweets that happen to report the same event are then correlated.
Next, as described in[5], the impact of the detected event on the enterprise in question can be assessed by fusing the detected external event with internal data on suppliers.
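A highly simplified sketch of such a pipeline: a keyword filter stands in for the machine-learned event classifier, and Jaccard similarity over tokens stands in for the correlation step that groups the few tweets reporting the same event. All tweets, terms, and the threshold are invented for illustration.

```python
# Sketch: filter a tweet stream for adverse-event reports, then group
# tweets that describe the same event via token overlap.
EVENT_TERMS = {"fire", "strike", "explosion", "flood"}

def is_adverse(tweet):
    """Crude stand-in for a trained short-text event classifier."""
    return bool(EVENT_TERMS & set(tweet.lower().split()))

def jaccard(a, b):
    """Token-set similarity between two short texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def correlate(tweets, threshold=0.3):
    """Group tweets whose overlap with a group's first tweet is high."""
    groups = []
    for t in tweets:
        for g in groups:
            if jaccard(t, g[0]) >= threshold:
                g.append(t)
                break
        else:
            groups.append([t])
    return groups

stream = [
    "major fire at the gearbox factory in pune",
    "lovely weather today",
    "fire reported at pune gearbox factory production halted",
    "labour strike shuts down chennai plant",
]
events = correlate([t for t in stream if is_adverse(t)])
print(len(events))
```

The grouped events would then be matched against internal supplier data to assess impact, as described above.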
Voice of the Consumer
Listening to the voice of the consumer through mechanisms such as surveys, feedback forms, emails, and support-center logs is a continual process through which organizations try to improve customer satisfaction and increase their consumer base. Increasingly, listening to consumer-generated content from social-media channels like Twitter, Facebook, and the blogosphere is augmenting the possibilities for analyzing the voice of the consumer, and becoming an important element of the business intelligence strategy of consumer-focused enterprises.
At the same time, the traditional channels of listening directly to customers, such as call-centers and email, and indirectly through eventual sales figures, remain as important as ever: social-media inputs are inherently noisy, so the insights acquired from social media are often validated by fusing them with additional inputs collected through more traditional channels. At the same time, social-media inputs may often lead other inputs in time, and therefore be of significance in spite of their relative inaccuracy.
Different types of insights can be gathered from consumer-generated content. Companies engage in analyzing the voice of the consumer primarily to address the following issues, which we may also distinguish based on the content, sources, and temporal variation that they focus on:
1. Brand Sentiment Analysis is concerned with measuring the sentiment expressed in the context of particular brands, products, and services, or even specific pre-defined features of a product or service. The emphasis is on volumes, and on tracking the overall aggregate positivity/negativity associated with the set of concepts one is interested in. Source selection is broad and channel-based; thus one might choose to focus on, say, Twitter, a Facebook page, and selected blogs, as well as analyze the variation across these. Since sentiment is noisy and varies rapidly, it is also aggregated temporally; thus the time-scales of aggregate sentiment analysis are in the range of days and weeks.
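The temporal aggregation described above can be sketched as follows; the per-mention scores in [-1, 1] are assumed to come from an upstream opinion-mining step, and the dates and scores are invented.

```python
# Sketch: roll noisy per-mention sentiment scores up into a daily mean,
# so the trend is visible above the per-message noise.
from collections import defaultdict
from datetime import date

def daily_sentiment(mentions):
    """mentions: iterable of (date, score) pairs -> {date: mean score}."""
    buckets = defaultdict(list)
    for day, score in mentions:
        buckets[day].append(score)
    return {day: sum(s) / len(s) for day, s in buckets.items()}

mentions = [
    (date(2013, 4, 1), 0.8), (date(2013, 4, 1), -0.2),
    (date(2013, 4, 2), -0.6), (date(2013, 4, 2), -0.9),
]
trend = daily_sentiment(mentions)
print(trend[date(2013, 4, 1)], trend[date(2013, 4, 2)])
```

In practice the bucket would be a day or a week, as noted above, and a sharp drop between buckets is what triggers closer inspection.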
Social-media-based brand sentiment analysis is cheaper and faster than traditional survey-based techniques such as Nielsen market surveys; it also reveals results sooner. Thus, sudden and significant changes in sentiment about a brand can be detected faster, such as the strong negative consumer sentiment that followed when Tropicana changed its packaging a few years ago. However, the jury remains out as to how often these aggregate sentiment figures bring novel insights compared to traditional measures. The fact is that they need to be time-averaged to make any sense; thus finer-grained approaches are needed to enable more real-time response, and to detect emerging problems that, by themselves, may not change the aggregate sentiment significantly, at least at first, and then only if not addressed in time.
Listening to consumer sentiment on social platforms has recently become almost a commodity, offered by a number of commercial services, such as Radian6¹ and others. Opinion-mining techniques for extracting sentiment from text are used in such tools. The initial insight most often sought through the adoption of a listening service is the ability to monitor brand perception, i.e., whether consumers at large are saying positive or negative things about one's brand, product, or service.
2. Complaint Analysis, in contrast with brand sentiment analysis, which casts its net wide, tries to focus on actual customers. Thus, the sources for such analysis are either direct customer feedback through call-centers or email or, when it comes to social media, input carefully filtered to ensure the presence of indicators such as "I bought", "my car", etc., making it highly likely that the writer is in fact a customer, either of one's own product or that of a competitor.
Fig. 1: Event detection
Next, such complaint analysis aims to analyze the text written by customers to detect which aspects of a product or service they are having difficulty with. This requires a deeper level of natural language processing than, say, aggregate sentiment analysis. Consider the statement: "I've been having trouble with my new [car brand]; not only did the transmission give way in the first month, but there was a significant delay in getting it changed". Clearly it is a negative statement about the car brand, and even its transmission, which basic sentiment analysis can easily discover. However, deeper text processing can further discern what exactly is wrong with the transmission, and aggregate such difficulties across a large volume of customer feedback along various dimensions. As a result, if the concept of, say, the transmission 'giving way', including its linguistic equivalents, is showing up in significant numbers, then this becomes an issue to flag to product engineering. On the other hand, the fact that the supply of spares of various types is delayed, including transmission parts, gets aggregated at a different level of, say, 'delayed parts', and is escalated to those responsible for after-sales services.
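A toy sketch of this two-level aggregation: each complaint phrase maps, via a small domain ontology, both to a concept and to the department that should act on it. The ontology entries, concepts, and department routing below are invented for illustration.

```python
# Sketch: ontology-driven aggregation of complaint phrases. Each phrase
# maps to a (concept, responsible department) pair, and counts roll up
# at both levels so issues reach the right team once they are frequent.
ONTOLOGY = {
    "transmission failure": ("powertrain", "engineering"),
    "transmission gave way": ("powertrain", "engineering"),
    "delayed spare parts": ("delayed parts", "after-sales"),
    "transmission parts delayed": ("delayed parts", "after-sales"),
}

def aggregate(complaints):
    by_concept, by_department = {}, {}
    for phrase in complaints:
        concept, dept = ONTOLOGY.get(phrase, ("uncategorised", "triage"))
        by_concept[concept] = by_concept.get(concept, 0) + 1
        by_department[dept] = by_department.get(dept, 0) + 1
    return by_concept, by_department

complaints = ["transmission gave way", "transmission parts delayed",
              "transmission failure", "delayed spare parts"]
concepts, departments = aggregate(complaints)
print(concepts, departments)
```

A real system would first normalise free text to ontology concepts via parsing and learned matching, rather than exact phrase lookup.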
The deeper degree of text processing required for complaint analysis calls for 'ontology-driven causal analysis', which involves some level of parsing as well as learning and exploiting a domain ontology. Additional techniques required include trend analysis, whereby sudden spikes in communications regarding particular new terms, such as 'iPad', are detected, so as to surface emerging problems even if they are not part of a known categorisation or ontology.
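Such a spike detector can be sketched minimally as follows; the smoothing constant, spike factor, and term counts are illustrative assumptions.

```python
# Sketch: flag terms whose frequency in the current window spikes well
# above their (smoothed) historical rate, catching new terms such as
# "ipad" that a domain ontology does not yet contain.
def spiking_terms(history_counts, current_counts, factor=3.0):
    """Return terms whose current count is at least `factor` times
    their historical count (+1 smoothing for unseen terms)."""
    spikes = []
    for term, now in current_counts.items():
        baseline = history_counts.get(term, 0) + 1
        if now >= factor * baseline:
            spikes.append(term)
    return spikes

history = {"transmission": 40, "delivery": 25}   # long-run term counts
current = {"transmission": 45, "delivery": 20, "ipad": 9}
print(spiking_terms(history, current))
```

Flagged terms would then be routed to a human for classification, consistent with the human-in-the-loop approach discussed later.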
Summarising, causal feedback analysis restricts the source to customer feedback, analyses the content in depth, and aggregates results over a period of time, for example on a weekly or monthly basis. Most importantly, in contrast with brand sentiment analysis, complaint analysis often results in directly actionable intelligence that can be passed on to the concerned division in the enterprise. Fig. 2 describes our architecture for ontology-driven opinion mining from unstructured customer feedback, which is described in more detail in[2].
3. Early Problem Detection again listens to consumers at large. However, unlike aggregate sentiment analysis, the aim here is to quickly detect new problems being faced by consumers. For example, a new website design might be flawed, leading to consumer frustration; a new policy on a banking service may be leading to angst and outrage; or a major competitor might be luring customers away. Increasingly, consumer conversations that might point to such events are taking place in the open, on Twitter. However, as in the case of detecting potential supply-chain disruptions, the stream of tweets needs to be filtered to first focus only on consumer complaints, and then processed to extract information on the actual problem being faced.
However, the situation here is technically more challenging than, say, factory events, since distilling 'consumer' versus non-consumer events is less accurate than discerning factory-fire or labour-strike events. Further, the nature of the information that one seeks to extract need not be known in advance. Thus, while a domain ontology can help classify events to a certain extent, consider the sudden arrival of a comment such as "why isn't there an iPad application to access my ... account like there is for [competitor]"; it may well be that 'iPad' does not yet figure in the domain ontology. Still, this problem needs to be detected and classified in some manner so that appropriate action can be taken. New problem detection is as yet difficult to completely automate: instead, as in the example above, it is better to bring a human in the loop when required; of course, automatically figuring out when to do so is equally important.
Competitive Intelligence
Competitive intelligence is aimed at assessing risks and opportunities in a competitive environment before they become obvious. It is used by organisations to compare themselves with their peers ("competitive benchmarking"), to identify risks and opportunities in their markets, and to pressure-test their plans against market response ("war gaming"), enabling them to make informed decisions. Competitive intelligence comprises the tasks of defining, gathering, and analysing intelligence about the industry in general, along with specific knowledge about competitors, such as their products, pricing, marketing strategies, and much more. The information gathered allows organizations to understand their strengths and weaknesses. The acquisition and analysis of events falling under the competitive-intelligence category is a highly specialised activity.
Competitive intelligence can be broadly classified into two categories depending on whether it is used for long-term or short-term planning. Strategic Intelligence (SI) focuses on long-term issues that concern a company's competitiveness over a specified period in the future. The main focus of analysts here is to forecast where the organization should be positioned a few years hence, and to identify strategies to convert this into reality. This analysis primarily involves identifying weaknesses and early warning signals within the organization. Tactical Intelligence, on the other hand, focuses on providing information that can influence short-term decisions. Most often, this is related to analysis of current market share and the competitive landscape. This kind of intelligence directly affects the sales process of an organisation.
Tactical intelligence can be further categorized as: (i) Brand-related: provides information about the popularity of competitors in terms of their products or brands as a whole, which products are moving in the market, and competitors' market shares; consumer sentiment related to the organization and its competitors also belongs to this category. (ii) Pricing-related: provides knowledge about the prices of competitor products. (iii) Promotions-related: provides information about the promotion strategies and kinds of promotional activities adopted by competitors. (iv) Organizational: provides information about competitors such as their workforce structure, internal shifts in focus or vision, successes or failures of their trials, new product launches, technology investments, etc., all of which contribute towards building a profile of competitors that can be useful to organizations. The table in Fig. 5 presents an overview of how different types of web content can contribute towards compiling tactical competitive-intelligence reports for an organization. A detailed treatment of how competitive intelligence can be extracted from social media is given in[3].
Fig. 2: Ontology-driven opinion mining
¹ Now http://www.salesforcemarketingcloud.com/
5.1 Detecting Competitor Events
The process of gathering competitive
intelligence has undergone a massive
transformation in recent years, fuelled by
an increasing availability of information
on the web. Competitors' home pages can be crawled to understand new developments, positioning changes, technology adoption etc. Social media, on the other hand, abounds in consumer-generated content, and can be utilized to
gauge the performance of competitors,
their products, brands, suppliers, and
distributors. Competitive intelligence
content also includes expert opinions,
technology advancements, economic
policies, social changes, and many other
related materials essential for excelling in
business. News from multiple sources is
still considered to be a major contributor
to competitive intelligence. Discussions
on different forums and blogs can provide
crucial insights when analyzed in proper
perspective. Using Google search trends
for competing products and services
can also be a good source of competitor
intelligence.
It is essential to define a set of processes to gather information, convert it into competitive intelligence, and then channel it for consumption in business decision making. Usability and actionability of the gathered information are two critical factors in determining its relevance. Information gathered from the Web is unstructured in nature, and therefore not immediately machine-interpretable. Handling inaccuracies, redundancies, and volume are other challenges. Appropriate knowledge management techniques are required to ensure that analysts have access to all relevant information without facing information overload.
Given the large volumes
of information received in
a digitized format, natural language
processing, text mining, and statistical
reasoning play significant roles in
automating the process of content
assimilation. A host of specialized tools
are also available to aid some of these
tasks. News analytics is a well-established
research area dedicated to analysis and
organization of news articles received from
different sources, to predict the political,
financial or social impacts of these
events. Extracting specific events that
can contribute to competitive intelligence
can be considered as a sub-task of news
analytics. Classification techniques are
employed to classify news articles into
broad categories like political, economic,
sports, market information, entertainment
etc. Article summarization techniques are
often used along with this to provide the
key content of articles. Clustering news
articles based on content is also an oft-
used technique to reduce information
overload. Intelligent cluster visualizations
help in easy assimilation of content.
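The clustering step described above can be sketched with a toy greedy, token-overlap scheme (the headlines, threshold, and similarity measure are illustrative assumptions; production systems would use TF-IDF vectors with algorithms such as k-means):

```python
import re

def tokens(text):
    # Lowercase word tokens; a trivial stand-in for real text preprocessing
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    # Set-overlap similarity between two token sets
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_headlines(headlines, threshold=0.3):
    """Greedy single-pass clustering: attach each headline to the first
    cluster whose seed is similar enough, else start a new cluster."""
    clusters = []  # list of (seed_tokens, [headlines])
    for h in headlines:
        t = tokens(h)
        for seed, members in clusters:
            if jaccard(t, seed) >= threshold:
                members.append(h)
                break
        else:
            clusters.append((t, [h]))
    return [members for _, members in clusters]

news = [
    "Acme launches new budget smartphone in India",
    "Acme's new budget smartphone launches today",
    "Central bank raises interest rates again",
]
groups = cluster_headlines(news)
```

The two smartphone headlines fall into one cluster, reducing the number of items an analyst must scan.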
One of the key challenges here is
to identify those events which can be
assessed for their impact on past, present
or future performance of an organization.
Not all impacts are measurable. For
example, it is difficult to measure the
impact that a new technology may have
on the future market or the effect of a new
chief appointed by a competitor or even
the news about an important acquisition
by a large company. News events typically
comprise a major chunk of information
used to gain strategic intelligence.
Information and relation extraction
techniques from text mining are also
gaining popularity in news analytics, since
they can further help in extracting specific
chunks of information in a structured
form that can be consumed even by
machines. Information and relation
mining techniques have been successfully
applied to extract significant entities, and
their roles and responsibilities in an event
along with event details like name, time, location, and description. The structured information extracted from news articles can be further consumed by a reasoner to draw inferences.
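As a minimal illustration of turning free text into such a structured event record (the pattern, event type, and field names are invented for this sketch; real systems use trained information-extraction models rather than a single regex):

```python
import re

# Toy pattern for acquisition events: "<Acquirer> acquires <Target> [for $X million|billion]"
ACQUISITION = re.compile(
    r"(?P<acquirer>[A-Z]\w+) (?:acquires|buys) (?P<target>[A-Z]\w+)"
    r"(?: for \$(?P<amount>[\d.]+) (?P<unit>million|billion))?"
)

def extract_event(sentence):
    # Return a machine-consumable record, or None if no event is found
    m = ACQUISITION.search(sentence)
    if not m:
        return None
    return {"type": "acquisition", **m.groupdict()}

e = extract_event("Globex acquires Initech for $1.2 billion")
```

The resulting dictionary is the kind of structured record that a downstream reasoner can consume.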
Social media content on the other
hand can contribute very effectively
towards gaining tactical intelligence.
Tracking Twitter and Facebook content
Type of Competitive Intelligence Event | Web Source
People events | News, company web-sites
Competitor strategies, e.g. technology investment | News, discussion forums, blogs, patent sites
Consumer sentiments | Review sites, social networking sites
Promotional events and pricing | Twitter, Facebook
Related real-world events | News, Twitter, Facebook

Fig. 3: Competitive intelligence events and their sources

Fig. 4: Analyzing competitor promotions from social media
generated by competitors can provide fairly accurate data about the promotions they run. Twitter and Facebook also abound in consumer sentiment about products and services or a brand.
Text classification techniques are widely used to classify social media messages into pre-defined categories like status updates, sentiment and opinion, consumer support systems,
news, promotions and campaign, and
others. Further categorization or labelling
of content is also possible based on
the named-entities present in these.
Classification of social-media content into pre-defined categories like those above helps in filtering the relevant from the irrelevant.
Traditional classification techniques using the bag-of-words model do not perform very well on short messages like these. Rather, a set of domain-specific features like author's profile, retweets, @user-mentions etc. helps in classifying the text into a predefined
set of generic classes such as News,
Events, Opinions, Deals and Promotions,
and Customer Support. A classified text
can be further tagged or associated with
product or service labels, brand names,
action categories etc. using domain
ontology. Natural Language Processing tools like Named Entity Recognition are also applied to identify dates, money-values,
store names or locations etc. The assigned
class and product labels along with the
complete set of information extracted
can be used to generate a promotion
map, which can depict category-wise
promotions for different products region-
wise and time-wise. Fig. 4 depicts the
process flow for the same.
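A toy sketch of such feature-based classification of short messages (the feature set, rules, and example messages are illustrative assumptions; a production system would train a classifier over features like these rather than use hand-written rules):

```python
import re

def features(msg):
    """Domain-specific features for short social-media messages,
    as opposed to a plain bag-of-words representation."""
    return {
        "is_retweet": msg.startswith("RT "),
        "mentions": re.findall(r"@\w+", msg),
        "hashtags": re.findall(r"#\w+", msg),
        "has_url": "http" in msg,
        "has_price": bool(re.search(r"(?:\$|Rs\.?\s?)\d", msg)),
    }

def classify(msg):
    # Illustrative rules mapping features to the generic classes
    # mentioned in the text (News, Deals and Promotions, Customer Support)
    f = features(msg)
    text = msg.lower()
    if f["has_price"] or any(w in text for w in ("off", "deal", "sale")):
        return "Deals and Promotions"
    if "?" in msg or f["mentions"]:
        return "Customer Support"
    return "News"

label = classify("Flat 40% off on all phones this weekend! #sale")
```

A message classified as a promotion can then be tagged with product labels and locations to feed the promotion map of Fig. 4.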
5.2 Competitive Intelligence Analysis
Competitive intelligence reports are
consumed by analysts, strategists, and
decision makers of different departments
across the organization. While most of
these reports are pushed into the work-
flow automatically, drawing inferences
from competitive intelligence reports is still
by and large a human activity. It requires
a lot of tacit world knowledge most of
which is not available in a structured or
semi-structured manner. Correlation with
diverse types of structured data generated
within the organization, can also yield
valuable insights.
Fusing reports and data originating
from different channels is not a
straightforward task. Harmonization of data from multiple sources requires intelligent master data management techniques. Fusion systems need to judge the feasibility and relevance of merging different types of data. Visualization of
the results to deliver the correct insights
is yet another complex task. While
much of this work is also human-driven
today, analytical systems that can fuse
competitive intelligence reports and
structured data at the right granularity
are being developed for different sectors.
Machine learning techniques are major
contributors to the design of fusion
systems. These systems can be made
to learn from human interactions with
reports and data.
The marketing division is one of
the most prolific users of social media.
Consequently, they can also maximally
benefit from competitor promotion
information. Most companies have a pre-
defined static promotion calendar. This
calendar is reviewed from time to time,
usually on a quarterly basis. The review is
most often entirely against the company’s
own performance, without information
about competitor actions used in a
structured way. Promotion event maps
created from social media can be used
by the marketing analysts to get a near
real-time view of competitor activities,
analyze the company’s performance
against the backdrop of these and thereby
take corrective actions, if necessary. Joint
analysis of sales data and competitor
promotion events, can provide valuable
insights about how competitor promotions
affect sales.
For example, a dip in sales data can
be linked to reports about aggressive
promotions by competitors, new product
launch in the same category, price-
rise announcements or sudden rise in
negative brand sentiments. Similarly, rise
in sales can be linked to rise in positive
brand sentiment or price rise announced
by competitor. Given that there may not
be a single well-defined factor that can
be marked as responsible for an event,
automated systems can do a good job
of correlating all that is relevant based
on attributes like time of the year,
product, brand or region. Pattern mining on large volumes, with human annotations as input, can be utilized to learn better correlations. Finally, machine
learning driven competitive intelligence
systems can also be used to design
predictive models that can predict future
performances based on series of present
and past events.
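The joint analysis of sales data and competitor promotion events described above can be sketched as follows (the dates, sales figures, drop threshold, and event window are all hypothetical):

```python
from datetime import date

sales = {  # daily sales for one product (hypothetical numbers)
    date(2013, 3, d): v
    for d, v in [(1, 100), (2, 98), (3, 60), (4, 62), (5, 97)]
}
competitor_events = [
    {"date": date(2013, 3, 3), "event": "aggressive discount campaign"},
]

def explain_dips(sales, events, drop=0.25, window=1):
    """Flag days where sales fall sharply versus the previous day and a
    competitor event occurred within `window` days."""
    findings = []
    days = sorted(sales)
    for prev, cur in zip(days, days[1:]):
        if sales[cur] < sales[prev] * (1 - drop):
            for e in events:
                if abs((e["date"] - cur).days) <= window:
                    findings.append((cur, e["event"]))
    return findings

hits = explain_dips(sales, competitor_events)
```

The dip on 3 March is correlated with the competitor's discount campaign; attributes like product, brand, and region would be added as further join keys in a real system.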
Conclusions and Challenges
Hopefully, via the three broad use cases
presented above, we have made a case
for why social intelligence is becoming
increasingly important for enterprise
business intelligence. Further, as we have explained, it is the fusion of external and internal intelligence that enables value to be extracted from external data, especially from social media.
Large enterprises across industries,
from retail to financial services to
manufacturing, are today actively
exploring this new and exciting arena.
At the same time, the veracity of inputs
received from social media remains
a matter of concern. There are also
challenges in measuring the return on
investment (i.e., ROI) from socio-business
intelligence exercises: statistically sound techniques for measuring ROI, even for simple matters such as advertising campaigns, are not yet in widespread use.
Both these questions, i.e., efficiently
establishing the veracity of social media
inputs, as well as properly measuring ROI
from socio-business intelligence, pose
challenges for future research.
References
[1] Kemal A Delic and Umeshwar Dayal.
The rise of the intelligent enterprise.
Ubiquity, 2002 (December): 6, 2002.
[2] Lipika Dey and Sk Mirajul Haque.
Opinion mining from noisy text data.
International journal on document
analysis and recognition, 12 (3): 205–
226, 2009.
[3] Lipika Dey, Sk Mirajul Haque, Arpit
Khurdiya, and Gautam Shroff. Acquiring
competitive intelligence from social
media. In Proceedings of the 2011 Joint
Workshop on Multilingual OCR and
Analytics for Noisy Unstructured Text
Data, page 3. ACM, 2011.
[4] Saurabh Sharma, Puneet Agarwal,
Rajgopal Vaithiyanathan, and Gautam
Shroff. Catching the long-tail: Extracting
local news events from twitter. In
International Conference on Weblogs
and Social Media, June 2012.
[5] Gautam Shroff, Puneet Agarwal, and
Lipika Dey. Enterprise information fusion
for real-time business intelligence. In
Proceedings of the 14th International
Conference, Fusion '11, 2011.
Dr. Gautam Shroff is Vice President & Chief Scientist, Tata Consultancy Services and heads TCS’ Innovation
Lab in Delhi, India. As a member of TCS’ Corporate Technology Council, he is involved with recommending
directions to existing R&D efforts, spawning new R&D efforts, sponsoring external research, and proliferating
the resulting technology and intellectual property across TCS’ businesses.
Prior to joining TCS in 1998, Dr. Shroff had been on the faculty of the California Institute of Technology,
Pasadena, USA and thereafter of the Department of Computer Science and Engineering at Indian Institute
of Technology, Delhi, India. He has also held visiting positions at NASA Ames Research Center in Mountain
View, CA, and at Argonne National Labs in Chicago. Dr. Shroff completed his B.Tech (Electrical Engineering)
from the Indian Institute of Technology, Kanpur, India, in 1985 and Ph.D. (Computer Science) from RPI, NY,
USA, in 1990. Dr. Shroff taught a course “Web Intelligence and Big Data” on Coursera as well as at IIT and IIIT
and the URL is https://www.coursera.org/course/bigdata .
Dr. Lipika Dey is a Senior Consultant and Principal Scientist at Tata Consultancy Services, India. She heads the
Web Intelligence and Text Mining research group at Innovation Labs, Delhi. Lipika's research interests are in the
areas of content analytics from social media, social network analytics, predictive modeling, sentiment analysis
and opinion mining, and semantic search of enterprise content. Her focus is on seamless integration of social
intelligence and business intelligence. She is keenly interested in developing analytical frameworks for integrated
analysis of unstructured and structured data. Lipika has a Ph.D. in Computer Science and Engineering from IIT
Kharagpur. Prior to joining the industry in 2007, she was a faculty member in the Department of Mathematics at
Indian Institute of Technology, Delhi, from 1995 to 2006. She has several publications in International journals and
refereed conference proceedings. She is a Program Committee member for various International Conferences.
Puneet Agarwal is a Scientist at Tata Consultancy Services Ltd. He heads Data Analytics and Information
Fusion research group at TCS Innovation Labs, Delhi. Puneet’s research interests include applied research in
data-mining on time-series and graph data with a focus on distributed parallel processing.
He has been working in TCS for about 15 years and before joining TCS Innovation Labs in 2004, he worked as
a technical architect in various mission critical projects in the Logistics and Shipping domain. He has published many research papers in various international conferences on Information Fusion, Software Agility, Collaboration, and
Model Driven Interpretation. Puneet has a B.E. Degree in Mechanical Engg from NIT Trichy.
About the Authors
of the Industrial Internet, powered by
Big Data.
About the Author
Dr. Milind Bhandarkar was a founding member of the team at Yahoo that took Apache Hadoop from a 20-node prototype to a datacenter-scale production system, and has been contributing to and working with Hadoop since version 0.1. He started the Yahoo Grid solutions team focused on training, consulting, and supporting hundreds of new migrants to Hadoop. Parallel programming languages and paradigms have been his area of focus for over 20 years,
and a topic of his PhD dissertation at University of Illinois at Urbana-Champaign. He worked at the Centre for
Development of Advanced Computing (C-DAC), National Center for Supercomputing Applications (NCSA), Center
for Simulation of Advanced Rockets, Siebel Systems, Pathscale Inc. (acquired by QLogic), Yahoo and LinkedIn.
Currently, he is the Chief Scientist at Greenplum, a division of EMC2.
Continued from Page 8
Research Front
Big Data Enabled Digital Oil Field
Pramod Taneja* and Prashant Wate**
*Principal Architect, iGATE
**Technical Specialist, iGATE
Introduction
Oil and Gas Industry Overview
Oil and Gas (O&G) companies – both operator companies as well as oil field service providers – now have more upstream data (structured, unstructured, as well as real-time) than ever before on which to base their operational decisions relating to exploration, drilling, or production. For this reason, effective, productive, and on-demand data insight is critical for decision making within the organization.
However, the vision of an integrated Exploration and Production (E&P) data management platform still remains a challenge, as extracting business-critical intelligence and insights from large volumes of data in a complex environment of diverse legacy systems and fragmented, decentralized solutions is a daunting task.
Some typical challenges for E&P data management are:
• Upstream-focused applications operate at a functional level, so substantial time is spent in data collection and running reports for a given asset level, i.e. for a single well or aggregated wells in a given location
• Many applications are still not based on PPDM (Professional Petroleum Data Management Association) standards, which makes reports and KPIs inaccurate much of the time
• It is difficult to derive insights from unstructured data lying in multiple applications
• It is difficult to run predictive analytics, as data is spread across multiple systems with limited integrity and reference to master-level data.
Need for a Digital Oil Field Enterprise Platform
An Integrated Digital Oil Field Enterprise Platform integrates E&P data from different project phases — Seismic, Drilling, Well, and Production — into a single consolidated platform. Data indexing, storage, cleansing, clustering, migration, standardization, and analysis can be performed on multiple data sources (structured, unstructured, or real-time) within an integrated platform, providing detailed insights at the well level at any instant. This solution should leverage cloud infrastructure, an integrated workflow, an accelerated digitized solution framework based on MURA (Microsoft Upstream Reference Architecture), hybrid data models, integration with multiple data sources, and a host of accelerators for data migration.
Big Data in the Digital Oil Field
In the Oil and Gas industry, traditional data warehousing solutions face challenges in capturing, storing, and churning through massive volumes of data. O&G companies can adopt Big Data solutions to maximize their business potential by deriving a holistic view of voluminous sensor device data and gathering valuable insights that complement existing traditional BI offerings.
This consolidated Big Data enabled E&P data management platform should be designed to fit within an O&G operator's or oil field service provider's technology infrastructure and provide an on-demand, single view of a well at any instant and from anywhere. The platform should provide ready-to-use accelerators as well as interfaces with third-party Geologist and Geophysicist (G&G) product suites and with customer data sources — be they structured, unstructured, or real-time.
Big Data Solutions for the Digital Oil Field
O&G companies can adopt Hadoop-enabled Big Data solutions for creating an Integrated Digital Oil Field strategy. Hadoop is a widely accepted, cost-effective open-source solution which provides map-reduce functionality for processing
Fig. 1: Functional overview – big data enabled digital oil field. Source: iGATE Research
extremely large data sets on commodity servers. Hadoop-based solutions allow storing, processing, and analyzing these humongous logs on a near real-time basis. The crux of the solution involves processing raw data in its native format to create aggregated views, along with an understanding of its relationships and patterns, and thereby deriving meaningful insights for quick decision-making related to the reservoir and optimized data exploitation using the map-reduce paradigm. Hive, a scalable data warehouse solution available on Hadoop, has seen wide adoption; its query mechanism, HiveQL, is similar in syntax to SQL. Hive internally generates map-reduce jobs that are executed on Hadoop clusters, allowing users to overcome the learning curve associated with writing map-reduce code.
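As a sketch of the map-reduce functionality mentioned above, the following pure-Python mapper/reducer pair computes the average sensor reading per well (the log format and well IDs are invented for illustration; on a cluster these functions would typically run as Hadoop Streaming jobs reading stdin and writing stdout):

```python
from collections import defaultdict

def mapper(lines):
    # Map: emit (well_id, reading) pairs from raw log lines of the
    # hypothetical form "well_id,timestamp,reading"
    for line in lines:
        well_id, _ts, reading = line.strip().split(",")
        yield well_id, float(reading)

def reducer(pairs):
    # Reduce: Hadoop delivers pairs grouped by key after the shuffle
    # phase; here a dict simulates that grouping locally
    totals = defaultdict(lambda: [0.0, 0])
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {k: s / n for k, (s, n) in totals.items()}

logs = ["w1,2013-04-01T00:00,10.0",
        "w1,2013-04-01T01:00,20.0",
        "w2,2013-04-01T00:00,5.0"]
averages = reducer(mapper(logs))
```

In practice the same aggregation would be expressed in one HiveQL statement, with Hive generating the equivalent map-reduce job.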
Big Data Enabled Digital Oil Field Solution
Functional Overview
As part of a faster transition strategy towards the Digital Oil Field for integration, processing, and analytics, Hadoop clusters can be leveraged along with data migration and business intelligence accelerators. An architecture view is depicted below, presenting a unified view of the different oil wells: managing semi-structured and unstructured data from the drilling and production phases, leveraging modeling and simulation techniques, and providing ready-to-deploy KPI configurations.
The high-level functional overview is stated below:
1. Fetch the customer's E&P data from various wells in different phases. Each oil well generates around 10 TB of data, and in a reservoir there are multiple wells to be drilled and explored.
2. This massive volume of multi-structured logs is stored on a Hadoop infrastructure.
3. Data processing is the most important step, preparing the data in a manner that minimizes time-consuming activity. Hadoop is an ideal solution for converting the unstructured data to a structured format, performing cleansing, and storing the result in unified Hive structures.
4. The Digital Oil Field provides PPDM-compliant models for ease of integration and portability, post standardization, into a Digitized Platform.
5. The quality of data on the Digitized Platform is verified by the stakeholders.
6. Complex analytics and event processing are performed to find drilling patterns and infer the lithology content based on various parameters of the oil well logs, in turn providing adaptors to third-party interfaces for data interpretation.
7. Integration with BI services and Enterprise Application Integration (EAI) services exposes data to third-party agents for advanced analysis and dashboard generation.
Technical Process Flow
There are five stages depicted in the diagram below, stating the lifecycle of the data process in a big data platform.
1. Data Capture Stage – Fetch the customer's E&P data from various wells in different phases. Apache Flume can be used for capturing oil well log data embedded in standard formats such as Log ASCII Standard (LAS) files, seismic data files, etc. Sqoop can be used for capturing structured production data from an RDBMS.
2. Data Storage & Preparation Stage – The massive volume of relevant data is then stored on the Hadoop distributed file system. Hadoop streams can be used for invoking the data preparation, massaging, and cleansing scripts. The data preparation jobs can convert the unstructured data to a structured format, perform cleansing, and store the result in unified Hive structures. Data governance can be carried out by tools such as Oozie and ZooKeeper.
Fig. 2: Data process flow in Big Data Enabled Digital Oil Field analytics. Source: iGATE Research
Continued on Page 36
Big Data[1] refers to large volumes of data from various sources such as social media, the web, genomics, cameras, medical records, aerial sensing technologies, and information-sensing mobile devices. Big Data includes structured, semi-structured, and unstructured data. This unstructured data contains useful information which can be mined. Since the 1980s, the world's per-capita capacity to store information has roughly doubled every 40 months. Statistics for 2012 say that 2.5 quintillion (2.5 × 10^18) bytes of data were created per day. Moreover, the digital streams that individuals create are growing rapidly; for example, most people now carry their own cameras. Big Data is characterized by high volume, high velocity, and high variety of information, which requires advanced methods of processing; conventional software tools are not capable of handling it, so Big Data requires a different architecture. The following types of data are referred to as big data.
• Social data – customer feedback forms for Customer Relationship Management (CRM) on social media sites such as Twitter, Facebook, LinkedIn, etc.
• Machine-generated data – sensor readings, satellite communication
• Traditional enterprise data – employee information, business product, purchase, sales, customer information, and ledger information.
Traits of Big Data
Big Data differs from other data in five dimensions[3]: volume, velocity, variety, value, and complexity.
• Volume: Machine-generated data is produced in very large volumes.
• Velocity: Any single social media website may not generate massive data, but the rate at which data is acquired from the social web is increasing rapidly.
• Variety: Different types of data are generated as new sensors and new services appear.
• Value: Even unstructured data holds valuable information, so extracting such information from large volumes of data is a significant concern.
• Complexity: Connections and correlations describe the relationships among the data.

Challenges
Storing and maintaining Big Data is a challenging task. The following challenges need to be faced by enterprises or media when handling Big Data:
• Capture
• Curation
• Storage
• Search
• Sharing
• Analysis
• Visualization
Why Big Data?
Big Data is absolutely essential for the following intents:
• To spot business trends
• To determine the quality of research
• To prevent diseases
• To link legal citations
• To combat crime
• To build real-time roadway information systems, where data is created of the order of exabytes (10^18 bytes).
Where is it used?
Areas or fields where big data is created:
• Medicine, meteorology, connectomics, genomics, complex physics simulation, biological and environmental research, and aerial sensing systems (remote sensing technologies).
• Big science, RFID, sensor networks.
• The Astrometry.net project watches the Astrometry group on Flickr for new photos of the night sky. It analyzes each image and identifies celestial bodies such as stars, galaxies, etc.
MapReduce
MapReduce[2] is a programming model, published by Google, for handling complex combinations of several tasks. It is a batch query processor: it can run an ad hoc query against a whole dataset and return the results in a reasonable time, which is transformative. It has two steps. 1. Map: queries are divided into sub-queries, allocated to several nodes in the distributed system, and processed in parallel. 2. Reduce: the results are assembled and delivered.
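The two steps can be simulated in a few lines of single-process code (word counting is the standard textbook example; on a real cluster the map and reduce functions run on many nodes in parallel):

```python
from itertools import groupby
from operator import itemgetter

def map_phase(records):
    # Map: each input record is processed independently,
    # emitting intermediate (key, value) pairs
    for record in records:
        for word in record.split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: pairs are sorted and grouped by key (the "shuffle"),
    # then each group is assembled into a final result
    results = {}
    sorted_pairs = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        results[key] = sum(count for _, count in group)
    return results

counts = reduce_phase(map_phase(["big data", "big deal"]))
```
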
Database
Oracle has introduced a total solution for enterprises that require Big Data. Oracle Big Data Appliance[3] integrates optimized hardware and extensive software with Oracle Database 11g to address Big Data challenges.

Example Application: Patient Health Information System on the Cloud
A real-time application of Big Data can be a Patient Health Information System on the cloud[4]. The Patient Health Record (PHR) is an emerging technique to store patient health information and exchange the data over the network; it is stored in the cloud so that the data log can be accessed anytime and anywhere. To assure security, individuals are given their own logins, and data stored in the cloud is encrypted. A PHR includes a variety of data: structured, unstructured, and semi-structured.
• In the PHR, we propose using machine-generated data — the fingerprint, iris pattern, or face of the patient — to index the patient's entire data log. A fingerprint sensor, iris scanner, or face recognizer captures the patient's identification, and the fingerprint, iris pattern, or facial features act as a key for retrieving the data saved in the database.
• Traditional enterprise data includes the entire PHR right from birth, with the details of the doctors, their prescriptions, and all records.
• The PHR also serves as social data: it can be put online for online consultation and medicine purchase, and even lab test reports can be uploaded online. This avoids the patient waiting in the lab for the result report; a copy of the report is also sent to the consulting doctor for further review. Individual logins are provided for the patient, doctor, pathologist, pharmacist, etc., which makes the system more secure.
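The biometric-keyed storage described in the bullets above can be sketched as follows (a toy model: exact hashing stands in for real biometric matching, which must tolerate noisy captures, and the record fields are invented for illustration; real systems would also encrypt the stored records):

```python
import hashlib

phr_store = {}  # record store keyed by a hash of the biometric template

def biometric_key(template_bytes):
    # A digest of the captured template (fingerprint/iris/face)
    # acts as the lookup key into the PHR store
    return hashlib.sha256(template_bytes).hexdigest()

def save_record(template, record):
    phr_store[biometric_key(template)] = record

def fetch_record(template):
    # Returns None when the captured template matches no stored key
    return phr_store.get(biometric_key(template))

save_record(b"iris-template-001",
            {"patient": "P001", "allergies": ["penicillin"]})
rec = fetch_record(b"iris-template-001")
```
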
Conclusion
Omar Tawakol, CEO of BlueKai, recently wrote an article in which he mentioned that "more data usually beats better algorithms". Such data, however, is very hard to store and analyze. Big Data is used for finding customer behavior, identifying market trends, increasing innovation, retaining customers, and performing operations efficiently. The flood of data coming from many sources must be handled using non-traditional database tools. Doing so provides more market value and a systematic approach for the coming generation.
References
[1] Wikipedia, the free encyclopedia.
[2] White, Tom, Hadoop: The Definitive Guide. O'Reilly Media, ISBN 978-1-4493-3877-0.
[3] "Oracle: Big Data for the Enterprise", An Oracle white paper, Jan 2012.
[4] M. Li, S. Yu, K. Ren, and W. Lou, "Scalable and Secure Sharing of Personal Health Records in Cloud Computing using Attribute-based Encryption", Sep 2010, pp. 89-106.
Big Data
A Kavitha*, S Suseela**, and G Kapilya***
*AP/CSE, Periyar Maniammai University, Vallam, Thanjavur
**AP/CSE, Periyar Maniammai University, Vallam, Thanjavur
***AP/CSE, Periyar Maniammai University, Vallam, Thanjavur
Article
Abstract: In-memory analytics has brought a paradigm shift in storage and data management, facilitating instant reporting for decision making. The revolution in advanced memory technology, the drastic decline in the price of memory, and the evolution of multi-core processors have changed the orientation of business intelligence querying and fetching of data, along with the way data is stored and transferred. This article discusses the adoption of in-memory technology, its architecture, and a few enabling software products for in-memory computing. It also discusses the scope and benefits of the in-memory approach.
Introduction
In-memory analytics queries data from random access memory (RAM) instead of physical disk. Detailed data can be loaded from multiple sources directly into system memory, which supports faster business decisions: performance improves because both storage and operations take place in memory.
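This difference can be sketched with a toy example. The table, data, and use of SQLite below are illustrative assumptions, not part of any vendor's product: an SQLite database opened with the special ":memory:" path lives entirely in RAM, and aggregates are computed on demand rather than read from precalculated cubes.

```python
import sqlite3

# Open a database that lives entirely in RAM -- no disk file is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 50.0)],
)

# Aggregations are computed on demand; no precalculated cube is needed.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('North', 170.0), ('South', 80.0)]
```

The same on-demand style of querying is what in-memory analytics platforms apply at much larger scale.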
The in-memory approach represents a shift in storage philosophy: summarized data is held in RAM. In a database, by contrast, data is stored in tables connected through relationships, interconnections, and other database objects, and traditional business intelligence platforms store data in multidimensional cubes. In-memory analytics avoids creating multidimensional cubes[1]. According to Gartner, the capabilities of in-memory analytics include faster query and calculation, largely removing the need to build aggregated and precalculated cubes. Some myths and facts about the in-memory approach are described below (Fig. 1).
Architecture of In-Memory Analytics
There are different architectural approaches to in-memory computing: the associative model, in-memory OLAP, the Excel in-memory add-in, the in-memory accelerator, and in-memory visual analytics.

In the associative model, associations are based on the relationships between data elements. When a user clicks on an item within a data set, the selected items turn green and all associated values turn white. This lets users quickly query all relevant data without depending on a predefined hierarchy or query path, rather than navigating the analytical data in a predetermined way.
The Excel in-memory add-in allows users to load large volumes of data into Microsoft Excel in memory. Once the data is within Excel, relationships between the data sets are inferred automatically, permitting on-the-fly sorting, filtering, slicing, and dicing of huge data sets, which overcomes some of Excel's technical data volume limits. This approach improves self-service capabilities, as it reduces dependency on IT and lessens the need for business users to become experts in multidimensional structures and techniques. The add-in depends on a particular back-end data management and portal platform, which supports data sharing and collaboration.

The in-memory OLAP approach works by loading data into memory, which allows complicated calculations and queries to be computed on demand with fast response times. If write-back is supported, users can change assumptions on the fly to explore what-if scenarios, a specific requirement in forecasting and financial planning.

In-memory visual analytics combines an in-memory database with a visual data exploration tool, letting users quickly query data and build reports within a visual, interactive analytics environment.

The in-memory accelerator approach improves query performance within an existing business environment. The accelerator works by loading data into memory and leveraging pre-built indexes to support very fast query response times[2].

There are many categories of software that enable in-memory computing, such as in-memory analytics and event processing, in-memory messaging, in-memory application platforms, and in-memory data management. These bring new business opportunities as well as IT challenges. A comparison of traditional data analytics technology and in-memory data analytics technology is given below (Fig. 2).
In-Memory Application Platforms
SAP HANA (High Performance Analytics Appliance)
According to Gartner's study on the information explosion, enterprise data will grow 650% over five years, with 80% of that data unstructured. The data explosion therefore spans traditional sources, such as point-of-sale and shipment-tracking records, along with non-traditional sources such as emails, web content, and documents[8]. In-memory technology allows huge quantities of data to be processed in real time to provide instant results for decision making. SAP HANA provides a foundation for building new-generation applications that process massive quantities of real-time data from virtually any source in the server's main memory, providing instant results from analysis and transactions. According to SAP, HANA will drastically improve query performance and speed up data loads, and its reduced data layers will simplify system administration and reduce operating costs. The platform is built specifically to support both operational and analytical workloads, and it also helps SAP partners and customers develop their own applications.
Oracle Exalytics
Organizations need analytics to gain insight and make correct decisions. However, due to budgetary pressure, time
Adoption of In-Memory Analytics
Article: Jyotiranjan Hota, Associate Professor, School of Management, Krishna Campus, KIIT University, Patia, Bhubaneswar
Fig. 1: Myths and facts of the in-memory approach

Myth: "In-memory is just hype spread by SAP." Fact: All major software vendors deliver in-memory technology.
Myth: "It's new and unproven technology." Fact: It has been around since the 1990s.
Myth: "It is solely about running analytics faster." Fact: It is widely used for transaction and event processing as well.
Myth: "It's incremental and non-disruptive." Fact: In-memory is predicted to have an industry impact comparable to the web and cloud.

Source: Gartner[6]
sensitivity, and extensive requirements, IT firms usually struggle to produce actionable analytics. The task becomes even more complex when multiple hardware, networking, software, and storage vendors are involved, and expensive resources are wasted integrating software and hardware components into a complete analytics solution.
Oracle Exalytics is an optimized system that addresses these business issues without compromising speed, simplicity, manageability, or intelligence. It is built with market-leading BI software, in-memory database technology, and industry-standard hardware. Oracle claims that Exalytics uses a new interface designed to produce quick results regardless of query, location, or device type[3].
Scope and Benefits of In-Memory Analytics
In-memory analytics should be used to improve query performance and report processing, so implementing it requires reorienting the existing report infrastructure. The demands that users and applications place on computing resources should first be understood through data profiling, and it is important to identify the users and applications that need ad hoc, non-routine reports. This effort is accomplished through data usage models, which reduce the cost and effort of introducing in-memory analytics in a firm. Typically, operational and standard reporting accounts for roughly 70-80% of an organization's needs and non-routine, ad hoc reporting for about 20-30%, though these shares should be confirmed by careful analysis. In some firms, consolidated reporting and forecasting is required frequently, within 10 to 12 weeks; in-memory analytics is well suited to these circumstances[1].

In the current context, memory and processor prices have dropped drastically, while multi-core processors have evolved. In-memory computing makes it possible to perform storage and operations in main memory, avoiding the hard disk. Two factors make in-memory computing compelling: first, the volume of information is growing at an alarming rate; second, forward-looking organizations now need immediate responses to make quick decisions. Traditionally, annual and quarterly review reports were the basis for decision making, but analysis of past data using data warehousing technology is slowly giving way to the event-driven systems, supported by in-memory computing, that enable decision making in real time. Here data is brought closer to the central processing unit: compared with disk-based access, in-memory querying can be orders of magnitude faster. The adoption of 64-bit architectures facilitates the in-memory approach by increasing the addressable memory space.

Midsized companies usually lack the technical expertise and resources to construct data warehouses and carry out performance-tuning tasks. The in-memory approach is less cumbersome for them: it is easy to set up and administer, IT infrastructure is no longer a barrier to optimizing business performance, and skill gaps in constructing and consuming analytical applications shrink because OLAP cubes stored in back-end databases are avoided. Total cost of ownership is reduced and business performance is enhanced.
In-Memory Analytics Vendors
The vendors who provide solutions include hardware vendors, server makers, and software application providers (Table 1).
Research Challenges
In-memory analytics faces a few research issues and challenges. It must contend with technology incumbency, particularly in companies that depend heavily on traditional OLAP technology. Many organizations have entire departments built around particular business intelligence platforms, and any disruptive technology that may significantly reduce, or even eliminate, these empires will be met with resistance and skepticism. Enterprise reporting has emerged as a mission-critical function, and once the user community depends on a large number of reports, one should hesitate before introducing too much change too fast[1]. According to an IDC report (2011), traditional methods of building and developing computing infrastructure for analytics applications are not suitable when migration to in-memory analytics applications is needed.
Conclusion and the Road Ahead
According to one study, around 30% of firms will have one or more critical applications running on an in-memory database within the next five years, and by 2014, 30% of analytics applications will use in-memory functions to add scale and computational speed[9]. Companies are seeking to be responsive, insight-driven, and more real-time, and in-memory computing is well placed to dominate this marketplace going forward[4]. An IDC (2011) report states that in-memory technology will help public- and private-sector firms reach the highest level of competitiveness through "freedom of access": in-memory technology platforms promote innovation, reduce IT compromises, and give the right people access to information at the right time[5]. A Market Research Media report states that the high-performance computing market is expected to reach $200 billion by 2020, and in-memory computing is one of its fastest-growing components. According to Gartner, the in-memory analytics approach is now
[Fig. 2 source: SAP HANA Overview and Roadmaps, SAP Community Network[7]]
being used in a variety of applications such as risk management, inventory forecasting, profitability analysis, fraud detection, algorithmic trading, and sales incentive promotion management. Refactoring existing applications to exploit in-memory approaches can yield better scalability and transactional application performance, lower-latency application messaging, drastically faster batch execution, and faster response times in analytical applications. In 2012 and 2013, the cost and availability of memory-intensive hardware platforms reached tipping points, so the in-memory approach is entering the mainstream.
References
[1] Baldwin, T. (2008). "Don't Fold Your Cubes Just Yet… But In-Memory Analytics Is Beginning to Mature". http://www.tagonline.org/articles.php?id=298, accessed 24 October 2012.
[2] Schwenk, H. (2010). "Accelerating Time-to-Insight for Midsize Companies Using In-Memory Analytics". http://www2.technologyevaluation.com/ppc/request/whitepapers/accelerating-timetoinsight-for-midsize-companies-using-inmemory-analytics.asp, accessed 1 February 2013.
[3] Gligor, G., Teodoru, S. (2011). "Oracle Exalytics: Engineered for Speed-of-Thought Analytics". Database Systems Journal, 2(4), 3-8.
[4] Kajeepeta, S. (2012). "The Ins and Outs of In-Memory Analytics". http://www.informationweek.com/software/business-intelligence/the-ins-and-outs-of-in-memory-analytics/240007541, accessed 29 September 2012.
[5] Morriss, H. D. (2011). "Faster, Higher, Stronger: In-Memory Computing Disruption and What SAP HANA Means for Your Organization". download.sap.com, accessed 15 March 2013.
[6] Pezzini, M. (2011). "The Next Generation Architecture: In-Memory Computing". http://www.slideshare.net/SAP_Nederland/the-next-generation-architecture-inmemory-computing-massimo-pezzini, accessed 25 March 2013.
[7] Groth, H. (2012). "SAP HANA: Strategy and Roadmap". http://www.saptour.ch/landingpagesfr/Manager/uploads/23/32.pdf, accessed 25 March 2013.
[8] Chumsantivut, B. (2011). "SAP HANA: Power of In-Memory Computing". http://www.cisco.com/web/TH/assets/docs/seminar/SAP_HANA_Power_of_In_Memory_Computing.pdf, accessed 25 March 2013.
[9] Dale, S. (2011). "Getting Real-Time Results with In-Memory Technology". http://enterpriseinnovation.net/article/getting-real-time-results-memory-technology, accessed 25 March 2013.
Table 1: In-memory analytics vendors

Vendor | Website | Hardware/Analytics Solution
Dell | http://www.dell.com | VIS Next Generation Datacenter Platform; PowerEdge R910
Fujitsu | http://www.fujitsu.com | PRIMEQUEST 1800 Series; FCRAM; FRAM
Fusion-io | http://www.fusionio.com | Fusion-io Flash Memory
HP | http://www.hp.com | HP Converged Infrastructure Platform; ProLiant DL900 Series
IBM | http://www.ibm.com | IBM solidDB
NEC | http://www.nec.com | Express5800/A1080a
Oracle | http://www.oracle.com | Exalytics In-Memory Machine
SAP | http://www.sap.com | SAP High Speed Analytical Appliance (HANA); SAP In-Memory Computing
Kognitio | http://www.kognitio.com | Kognitio WX2 Analytics Database; WX2 Data Warehouse Appliance; DaaS Cloud
Advizor Solutions | http://www.advizorsolutions.com | Advizor 5.8; Advizor Analyst
Microsoft | http://www.microsoft.com | PowerPivot
QlikTech | http://www.qlikview.com | QlikView
Quantrix | http://www.quantrix.com | DataNav
Quartet FS | http://www.quartetfs.com | ActivePivot
SAS | http://www.sas.com | In-Memory Analytics
Sybase | http://www.sybase.com | Adaptive Server Enterprise (ASE)
TIBCO | http://www.tibco.com | Spotfire

Source: Aberdeen Group, December 2011
About the Author
Prof. Hota is an Associate Professor and Area Chairperson of the Information Systems wing at KIIT School of Management, Bhubaneswar. He holds a BE in Computer Science from NIT Rourkela and a PGDBM from Xavier Institute of Management, Bhubaneswar. He teaches data mining, business intelligence, analytics, and core SAP ECC 6.0 modules (SD, MM, FI-CO, HCM, and PP) in view and configuration modes. His research interests lie in banking technologies, analytics, and ERP. He has published several papers in journals and conferences in India and abroad. The author can be reached at [email protected].
CRISC: Gold Winner for Best Professional Certification Program

Upcoming exam date: 8 June 2013
Final registration deadline: 12 April 2013

For more information and to register for an ISACA exam, visit isaca.org/mycrisc-CSI.
Risks and opportunities are two sides of
the same coin. For example, the Internet
has opened up many opportunities for us,
but at the same time exposed us to many
new risks. While we wish to avail the
opportunities, we also want to manage the
risks. It is not possible to avoid the risks
totally, so we should try and mitigate the
impact of risks. A cybersecurity professional
has to be an expert in risk management.
Whether the cybersecurity professional
is in the role of a planner, defender or
investigator, the balancing act of managing
the risks and selection, deployment, and
testing of information system controls will
remain the primary concern.
A risk management professional is expected to be well versed in the five practice areas of risk and information systems controls stated below:
1. Risk Identification, Assessment, and Evaluation
2. Risk Response
3. Risk Monitoring
4. Information Systems Control Design and Implementation
5. Information Systems Control Monitoring and Maintenance
Demonstrated experience and competency in these practice areas, along with successfully passing the examination, lead to ISACA's Certified in Risk and Information Systems Control (CRISC) certification, which was recently named the Best Professional Certification Program at the 2013 SC Awards from SC Magazine.
Becoming a CRISC helped me in securing my current job, as it is an independent confirmation to my employer that, beyond information systems audit and security management work, I also have extensive IT risk and control management experience. Security management frameworks within the Australian public sector have progressed from prescribed controls to a risk-based approach. This change demanded suitably experienced, skilled, and certified professionals to bring the new frameworks to life in order to effectively manage risks and pursue opportunities.

Having the CRISC certification was an important differentiator, particularly for an employer with a mature register of recognized certifications used for hiring and engaging professional consultants. ISACA's certifications are highly regarded on this list because of their well-balanced business and technical aspects, as well as defined minimum knowledge and experience requirements for certification holders. The CRISC designation certainly helped me get shortlisted for the position, whilst my knowledge of ISACA's frameworks helped me win my current position.

~ Bob Smart, CISA, CISM, CRISC, Manager of ICT Security, Government of South Australia
Risk Identification, Assessment, and Evaluation
Information systems are built with people, processes, and technology in mind, and involve designing architectures and applications to handle information. Each of these can introduce risks, apart from the risks due to natural factors and physical threats. Assessing the risk level associated with each threat involves anticipating risk probability and impact, threats and vulnerabilities, and the effectiveness of current and planned controls.
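The assessment step above is often quantified with a simple probability-times-impact score. The 1-5 scales, band thresholds, and threat entries in this sketch are illustrative assumptions, not taken from any particular standard:

```python
# Illustrative risk scoring: probability and impact each rated on a 1-5 scale.
# The band thresholds below are invented for the example, not from a standard.
def risk_score(probability: int, impact: int) -> int:
    """Combine likelihood and consequence into a single score."""
    return probability * impact

def risk_band(score: int) -> str:
    """Map a score onto a qualitative band for reporting."""
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"

# Hypothetical threats: (probability, impact)
threats = {
    "phishing": (4, 3),           # likely, moderate impact
    "data centre flood": (1, 5),  # rare, severe impact
}
for name, (p, i) in threats.items():
    print(name, "->", risk_band(risk_score(p, i)))
```

In practice the scales, bands, and control-effectiveness adjustments would come from the organization's chosen framework rather than be hard-coded like this.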
A risk professional needs good knowledge of the various standards, frameworks, and practices related to risk identification, assessment, and evaluation, along with familiarity with quantitative and qualitative methods for risk identification, classification, assessment, and evaluation. Since risks impact the business, knowledge of business goals, objectives, and organization structure is also essential; this leads to building the business information criteria. Various risk scenarios involving threats and vulnerabilities related to business processes will have to be built. Knowledge areas include information security architecture, platforms, networks, applications, databases, and operating systems. There should be a good understanding of the threats and vulnerabilities related to third-party management, data management, the system development life cycle, project and program management, business continuity, disaster recovery management, management of IT operations, and emerging technologies. In addition, knowledge of current and forthcoming laws, regulations, and standards is necessary. The risk professional should also be familiar with the principle of risk ownership, risk scenario development, risk awareness training tools and techniques, and the elements of a risk register.
Risk Response
The probability of a risk occurring may be difficult to predict, but one can never assume it to be zero: sooner or later a hypothetical risk scenario may actually materialize, so it is desirable to be adequately prepared with a risk response. The purpose of defining a risk response is to ensure that the residual risk is within the limits of the enterprise's risk appetite and tolerance.
A risk professional has to clearly define the risk response options. Every response must be evaluated with a cost/benefit analysis and weighed against a number of parameters, including the cost of the response needed to bring the risk within the tolerance level, the importance of the risk, the capability to implement the response, and the response's effectiveness and efficiency. The available risk response options are to (a) avoid the risk, (b) reduce/mitigate the risk, (c) share or transfer the risk, and lastly (d) accept the risk. Deciding on an appropriate option may not be easy: although there are major risks in electronic commerce transactions, avoiding e-commerce is not really an option today, so a thorough cost/benefit analysis has to be done among the remaining three options before taking a decision. This requires building a business case to justify the selected response. The risk professional will have to be very familiar with organizational risk management policies, portfolios, investment and value management, exception management, the parameters for risk response selection, risk appetite and tolerance, and the concept of residual risk.
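The cost/benefit weighing among the mitigate, transfer, and accept options can be sketched numerically. The expected-annual-loss model and all figures below are invented for illustration, not a prescribed method:

```python
# Illustrative comparison of risk response options by expected annual cost.
# All figures are made up for the example.
annual_loss_expectancy = 100_000  # expected yearly loss if the risk is accepted

options = {
    "accept":   {"cost": 0,      "residual_fraction": 1.0},
    "mitigate": {"cost": 30_000, "residual_fraction": 0.2},
    "transfer": {"cost": 45_000, "residual_fraction": 0.1},  # e.g. insurance
}

def total_cost(option: str) -> float:
    """Response cost plus the expected residual loss it leaves behind."""
    o = options[option]
    return o["cost"] + o["residual_fraction"] * annual_loss_expectancy

best = min(options, key=total_cost)
print(best, total_cost(best))  # mitigate 50000.0
```

A real business case would also weigh the qualitative parameters listed above (importance of the risk, capability to implement, effectiveness and efficiency of the response), not just the monetary total.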
Risk Monitoring
A pre-planned risk response is essential to deal with a risk effectively and efficiently. If there is a good risk monitoring
Five Key Knowledge Areas for Risk Managers
Article: Avinash Kadam [CISA, CISM, CGEIT, CRISC], Advisor to the ISACA India Task Force
process implemented to keep watch on various risks and sound an alarm as soon as some risk parameter crosses its threshold, it will definitely save much of the effort that would otherwise go into responding to the risk. Developing these risk indicators is a major challenge: there may be literally hundreds of risk indicators, such as logs, alarms, and reports. A risk professional has to work closely with senior management and business leaders to determine which risk indicators will be monitored on a regular basis and recognized as Key Risk Indicators (KRIs). KRIs should be selected based on the following factors:
• Reliability, i.e. they will sound an alarm every time, without fail
• Sensitivity, i.e. the alarm will be sounded only when a certain threshold is reached
• Impact, i.e. KRIs will be selected for areas with high business impact
• Effort, i.e. the preferred KRIs are those that are easier to measure
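Taken together, these criteria amount to alerting exactly when a measured indicator crosses an agreed threshold. A minimal sketch, assuming hypothetical indicator names and thresholds:

```python
# Hypothetical KRIs with agreed thresholds; alert when a reading crosses one.
kri_thresholds = {
    "failed_logins_per_hour": 50,
    "patch_backlog_days": 30,
}

def breached(readings: dict) -> list:
    """Return the KRIs whose current reading meets or exceeds its threshold."""
    return [k for k, v in readings.items()
            if v >= kri_thresholds.get(k, float("inf"))]

alerts = breached({"failed_logins_per_hour": 72, "patch_backlog_days": 12})
print(alerts)  # ['failed_logins_per_hour']
```

In a real deployment the readings would be fed from the monitoring sources described below (device logs, vendor updates, CERT alerts) rather than passed in by hand.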
The risk professional should be familiar with various risk monitoring sources. Information for risk monitoring can be obtained from suppliers or vendors of hardware, software, and applications in the form of updates, as well as from anti-malware vendors, device logs, CERT alerts, newspapers, blogs, and technical reports published by information security research organizations. This means the professional has to constantly update his or her knowledge.
Information Systems Control Design and Implementation
Controls are the policies, procedures, practices, and guidelines designed to provide reasonable assurance that business objectives are achieved and undesired events are prevented, or detected and corrected. Controls include technical controls, such as access control mechanisms, identification and authentication mechanisms, encryption methods, and intrusion detection software. Non-technical controls include security policies, operational procedures, and personnel, as well as physical and environmental security. A risk professional must know how to design and implement information system controls throughout the system development life cycle (SDLC) and project management. Typical SDLC phases include the feasibility study, requirements study, requirements definition, detailed design, programming, testing, installation, and post-implementation reviews. The business risk is the likelihood that the new system will not meet the users' business needs, requirements, and expectations. The project risk is that the activities to design and develop the system exceed the limits of the financial resources set aside for the project; as a result, the project may be completed late, if ever.
Information Systems Control Monitoring and Maintenance
Risk management relies on a monitoring process to ensure that IS controls remain effective and efficient over time. Monitoring requires the definition of meaningful performance indicators, systematic and timely reporting of performance, and prompt response to deviations. Monitoring makes sure that the right things are done, in line with the set business directions and corporate policies.

The risk professional should have good knowledge of enterprise security architecture, monitoring tools and techniques, and the various control objectives, activities, and metrics related to information security, data management, the SDLC, incident and problem management, IT operations, business continuity and disaster recovery, project and program management, and applicable laws and regulations.

Selecting appropriate tools requires good knowledge of tools for monitoring transaction data, conditions, changes, process integrity, and error management and reporting, and at times continuous monitoring.
It has been amazing to see the rapid rise in the number of IT professionals seeking the CRISC (Certified in Risk and Information Systems Control) certification. More than 16,000 professionals have earned the CRISC designation since the certification was introduced in 2010.

CRISC is highly desired because it is the only certification that positions IT professionals for future career growth by linking IT risk management to enterprise risk management. Professionals across a wide range of job functions, including IT, security, audit, and compliance, have earned the CRISC designation since it was established in April 2010. While CRISC is designed for risk professionals with at least three years of experience, more than 1,200 CIOs, CISOs, and chief compliance, risk, and privacy officers have also chosen to pursue the designation.

CRISC is the result of significant market demand for a credential that recognizes experienced risk and control professionals. This demand will only accelerate as stakeholders demand better corporate governance, better business performance, and more secure infrastructures.

If you have real-world IT controls and risk experience, I strongly encourage you to pursue the CRISC certification. Becoming CRISC certified provides an additional level of assurance that you have the necessary skills and experience to get the job done. It also enters you into a group of professionals with common interests and abilities. Networking with my fellow CRISCs and ISACA members has been an extremely rewarding experience. I encourage you to take advantage of the opportunities certification provides.

–Shawna Flanders, CISA, CISM, CRISC, Process Engineer at PSCU, USA
In India, we are making rapid progress in the adoption of information technology. Organizations are well aware that they should not take undue risks to achieve their ambitious goals, and should build appropriate IT controls to manage the risks. This has prompted rapid acceptance of the CRISC certification in India, and created new job and promotion opportunities for CRISC-certified professionals.

Avinash Kadam, CISA, CISM, CGEIT, CRISC, CISSP, CSSLP, GSEC, GCIH, CBCP, MBCI, PMP, CCSK, is an advisor to ISACA's India Task Force. ISACA is a global association for IT assurance, security, risk, and governance professionals with more than 100,000 members worldwide and more than 6,000 in India. The nonprofit, independent ISACA developed the COBIT framework for the governance and management of IT, and offers the CISA, CISM, CGEIT, and CRISC certifications. Opinions expressed here are Kadam's personal opinions and do not necessarily reflect the views of ISACA (www.isaca.org). He can be contacted via e-mail at [email protected].
Practitioner Workbench
Dr. Nibaran Das, Asst. Professor, Dept. of Computer Science & Engineering, Jadavpur University, Kolkata, and Editor, Computer Jagat, a Bengali monthly magazine
Programming.Tips () »
Python: A Programming Language for Everyone
Python is a programming language popular with the scientific research community, but thanks to its easy coding style, good documentation, and wide support from a large open source community, it has become a programming language for everyone. It supports not only functional and object-oriented programming styles but also other paradigms such as imperative programming, logic programming, and design by contract. It is popular for developing different kinds of software, is enriched with a large number of plug-ins and libraries, and is widely used as a scripting language. Given a Python interpreter, it runs on every popular operating system; even the Android platform supports Python through a scripting layer. In brief, Python is a language loved by novices as well as experts. Some important and popular Python packages are given below.
Package Name | Domain
NumPy / SciPy | Scientific calculation
NLTK | Natural language processing
BioPython | Biological computation
matplotlib | Plotting figures
pyqtgraph | GUI library using Qt and NumPy
Astropy | Astronomy
PyCV | Computer vision
Python Imaging Library (PIL) | Image processing
Cython | Translating Python code into equivalent C code
It is worth mentioning that the above chart does not cover all Python packages and libraries; it shows only a very small subset of the available toolkits. Some special features that make the language so robust are given below:
• Declaring multiple variables of different types simultaneously in a single line:

>>> x, y, z = 'A', 2, 5.6
>>> x
'A'
>>> y
2
>>> z
5.6
• It is possible to return multiple values from a function, and the function's documentation can be given with a docstring:

# Function definition
def remove_duplicated(arg_referents, arg_conditions):
    """This function removes duplicates from the two lists
    arg_referents and arg_conditions."""
    arg_referents = list(set(arg_referents))
    arg_conditions = list(set(arg_conditions))
    return arg_referents, arg_conditions

# Function call
sentence_referents, sentence_conditions = remove_duplicated(
    sentence_referents, sentence_conditions)
• The "range" function is also very useful: it creates a list of numbers in a specified range.

range([start,] stop [, step]) -> list of integers

When step is given, it specifies the increment (or decrement).

>>> range(7)
[0, 1, 2, 3, 4, 5, 6]
>>> range(7, 12)
[7, 8, 9, 10, 11]
>>> range(0, 12, 2)
[0, 2, 4, 6, 8, 10]

The "range" function is heavily used in for loops. For example, to print every third element of a list:

for i in range(0, len(array), 3):
    print array[i]
• The well-known constructor for Python classes:

def __init__(self):  # constructor
    self.items = []
• True division, so that 1/2 == 0.5 while 1//2 == 0, requires a from __future__ import division statement in Python 2.x.

• Python supports complex numbers, for example 3+4j, 3.0+4.0j, 2J (the literal must end in j or J).
• Strings are repeated with the * sign:
>>> 'xyz' * 3
'xyzxyzxyz'
• Python also supports negative indexes. For example, stringExample[-1] extracts the first element of stringExample counting from the end, i.e., its last element.
• Apart from strings, Python supports lists, denoted by [], which can hold numbers, strings, nested sublists, or nothing. List indexing works just like string indexing. For example:
List1 = [0, 1, 2, 3]
List2 = ['zero', 'one']
List3 = [0, 1, [2, 3], 'three', ['four', 'one']]
List4 = []
It is possible to append, extend, insert, and remove data using the following syntaxes:
a. list.append(x)  b. list.extend(L)  c. list.insert(i, x)  d. list.remove(x)
It is possible to count the number of elements, sort a list, and reverse a list using the following syntaxes:
a. list.count(x)  b. list.sort()  c. list.reverse()
About the Author

Nibaran Das received his B.Tech degree in Computer Science and Technology from Kalyani Govt. Engineering College
under Kalyani University, in 2003. He received his Masters in Computer Science and Engineering (M.C.S.E.) and
Ph. D. degree from Jadavpur University, in 2005, and 2012 respectively. He joined J.U. as a lecturer in 2006. His areas
of current research interest are OCR of handwritten text, Bengali fonts, and image processing. He has been an editor
of Bengali monthly magazine “Computer Jagat” since 2005.
CSI Communications | April 2013 | 27
Programming.Learn("R") »
R- StaR of Statisticians http://www.r-project.org/
If your requirement is to manipulate, model, or visualize a huge set of statistical data, an arguably best choice of programming environment is R!

R is a descendant of Scheme and S-Plus; the S language, a functional programming language developed at Bell Labs by John Chambers and his team, is the most widely used one in the area of statistical computing. R was initially developed in 1993 by Robert Gentleman and Ross Ihaka at the Statistics Department of the University of Auckland, New Zealand, and later progressed through collaborative effort, with contributions from all over the world. R got its name from the first letters of its initial developers' first names.
R is an interactive, object-oriented language, designed by statisticians for the purpose of statistical computing. It is free and open source, and is available under the GNU General Public License version 2. It runs on most UNIX platforms, Windows, and MacOS. Different versions of R are available at the Comprehensive R Archive Network (CRAN), which is a repository for R code and documentation. CRAN also provides source code, new features, and bug fixes. Currently there are 4415 packages for R.
R has now become a favorite language for data analysis and statistical computing in both corporates and academia. R is also being used for handling and analyzing large datasets obtained from supercomputing applications, and for creating high-quality visualizations via different types of plots, such as line plots, contour plots, and interactive 3D plots.
R has an intuitive and easy syntax, even for a beginner who has basic programming experience. Like other programming languages, R has the standard control structures, and it can be accessed from languages such as Python, Perl, and Ruby. Commercial software such as Mathematica, MATLAB, and Oracle also support R quite well.
R offers a command line interface (CLI), which is best suited for programmers. However, for a beginner to start with, GUI-based code editors and IDEs are useful. They provide functionality like syntax highlighting, code completion, and automatic code indentation, which eases the job. RStudio, Vim-R-Tmux, Notepad++, RKWard, and R Commander are some of them.
Let us have a look at the R interface and programming environment. When you launch R, the R console appears within the R GUI, showing some basic information about R. The console presents a prompt with a '>' symbol, which shows that the interpreter is ready and waiting for your R commands. We can input commands (referred to as expressions) in R through the R console.
The R programming language has become an important platform for statisticians to work with. R being an open source platform, there are any number of freely available packages with which you can not only do serious statistical analysis, but also use R as an analysis platform for problems in fields such as Bioinformatics, Financial Market Analysis, Pharmacokinetics, and Natural Language Processing. We will explore more about R programming in the next issue. Have a great time ahead.
Practitioner Workbench
Umesh P and Silpa Bhaskaran, Department of Computational Biology and Bioinformatics, University of Kerala
Photos: Robert Gentleman and Ross Ihaka
Interface of R programming language
Abstract: Enterprises are increasingly
facing a challenge in making sense from
the deluge of data they are receiving from
multiple data sources. Due to increasing
connectedness of people, applications,
and machines, the amount, diversity, and
speed of data is very large. Analyzing this
data with minimal delay is an increasingly
challenging task.
In this document, we would like to
present the suitability of Data Stream
processing technology, to build solutions
that can enable enterprises to address
the velocity dimension of Big Data,
and provide real time visibility into
their operations. Using this technology,
enterprises can convert high velocity data
into meaningful business insights, and
take advantage of favorable conditions
and/or take corrective actions in case of
adverse conditions. We also share our
experience of applying this technology to
a few business domains.
Keywords: Big Data; High Velocity; Real-time; Operational Insights; Data Stream Processing; Stream Computing; In-Memory Computing; Data Stream Management Systems
Introduction

One of the computing areas attracting a lot of attention is 'Big Data'. What exactly is Big Data? As per Wikipedia, 'Big Data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, analysis, and visualization'. As more and more activities are carried out on the Internet by people and enterprises, the amount of data generated for these activities rises each day. As per one statistic, as of 2012, 2.5 quintillion (2.5 x 10^18) bytes of data were created every day. While on a personal level users face challenges with large volumes of data, the challenge for enterprises is monumental. Enterprises are struggling to derive meaningful value from the humongous mountain of data they collect on a regular basis.
One of the areas in the enterprise that generates high-velocity data, and is very important but receives less focus, is the operational setup of the enterprise. Conventional business intelligence solutions primarily deal with data from the past, as these solutions cannot process data 'instantaneously' or 'on arrival', due to technical limitations.
While volume is the most commonly discussed dimension of Big Data, it is not the only one. Typical Big Data solutions try to address and analyze data along three dimensions, namely volume (amount of data), velocity (speed of data coming in and going out), and variety (range of data types and sources). While Big Data is an active field of research and exploration, with innovative tools and techniques being actively created, it may not be possible to create one effective solution that addresses all three dimensions. Most Big Data solutions have to find an acceptable tradeoff among the dimensions, with the most common pair being volume and velocity.
Business Drivers

For enterprises that deal with Big Data, it is important to extract meaningful insights, so that enterprise processes can be tuned accordingly. In addition to being able to process a large volume of data from multiple sources and in multiple formats, most organizations also need a timely, continuous, and instantaneous view of their operations.
While enterprises to date have tried to achieve this goal using data warehouse and Business Intelligence (BI) solutions, they are realizing that these solutions cannot scale to provide the real-time insights needed in an increasingly competitive business environment. For real-time insights, enterprises need to undergo a paradigm shift and move away from the 'store and process' methodology followed by data warehouse and BI solutions.

As per an Aberdeen Group survey, companies that have implemented systems providing real-time visibility into their operations have seen noticeably higher performance across several key operational metrics, as depicted in Fig. 1.
Genesis of Data Stream Processing

To manage increasing data volumes and the increased urgency around actionable information, enterprises are seeking the aid of processes and tools that can provide operational insights with minimum latency.
On the technology front, while disk capacities have grown rapidly, disk speeds have not kept pace. In comparison to disks, memory capacities have grown exponentially and have been adequately supported by a significant drop in price. With large amounts of memory available at relatively low cost, software architectures that store and process data in memory have evolved. Such 'In-Memory' architectures offer an order-of-magnitude performance improvement over traditional architectures. Newer infrastructure, terabyte memories, and multi-core parallel computing are opening up avenues for processing massive amounts of data within a short period of time and at much lower cost.
CIO Perspective

Deriving Operational Insights from High Velocity Data
Bipin Patwardhan and Sanghamitra Mitra, Research & Innovation, iGATE, Mumbai, India

Fig. 1: Benefits of real-time visibility of operational metrics

As a combined effect of business needs and technology trends, a variety of technologies are available for deriving business insight from raw operational data. Such technologies range from simple operational dashboards based on conventional Database Management Systems to advanced techniques like In-Memory Real-time Data Analytics.
Introduction to Data Stream Processing

To bridge the gap between operational and analytical systems, the concept of Data Stream Processing has been developed, where transient data is processed as soon as it arrives (even before it is persisted). The premise of the concept is to process and analyze all data on-the-fly. In the following sections, we provide details on how on-the-fly data analysis can be performed using suitable technologies.
Technology Overview

The concept of Data Stream Processing is built on the 'Single Instruction Multiple Data (SIMD)' parallel programming design pattern. In particular, this paradigm utilizes the concept of 'Pipeline Parallel Processing'.
To help understand the concept, refer to Fig. 2. In most cases, enterprises continuously receive data that needs to be processed. This data can be viewed as a 'Stream of Data over Time'. For a real-time response, this stream of data needs to be processed, refined, and acted upon in real-time. The concept of Data Stream Processing enables real-time processing of such continuous data streams.

The concept differs from conventional data processing frameworks and solutions in several ways:
• Data streams are usually unbounded.
• No assumption can be made on data arrival order.
• Size and time constraints make it difficult to store and process data stream elements after their arrival.
Key Characteristics

Data stream processing engines have the following characteristics:
• Data Stream Management – The engine needs the capability to process a continuous flow of data. A stream is a sequence of time-stamped data records called 'tuples'. A tuple is similar to a row in a database table. As illustrated in Fig. 3, the tuples in a stream have a schema, which defines each field's name, position, data type, and size. A few examples of data streams include financial trading data and sensor data.
• Window Processing – A 'Window' is one of the key concepts of data stream processing. It enables limiting the portion of an input stream from which elements can be selected. While processing a stream of data, it is necessary to define the portions of the input flows that have to be considered while executing the processing rules. Each window contains, at any given time, a subset of the tuples streaming by. Defining such windows enables a query to identify the finite set of tuples (from an otherwise infinite stream) over which the processing rules are applied. Fig. 4 describes how a window is applied to a data stream for, say, a 'Withdrawal' transaction. The size of the window is five. The engine stores all arriving withdrawal data into the window; when the window is full, the oldest data tuple is pushed out.
• Domain Specific Language – To help ease the task of describing how incoming data is to be processed, data stream engines typically provide an expressive language - a Domain Specific Language (DSL) - that allows enterprises to define complex relationships among the data items. As depicted in Fig. 5, data processing rules (queries) can be defined using the DSL. Once defined, the rules act as continuous queries that are deployed once and continuously process the data items streaming by, producing results. Most DSLs are defined to be similar to SQL so as to leverage developer familiarity, thereby increasing productivity and reducing maintenance efforts.
Implementing Data Stream Processing for Multiple Domains

Data Stream Processing enables enterprises working across various business domains to derive operational insights, in real-time, from continuously flowing data, and to make suitable decisions as soon as the data is received. Such real-time data analysis can take place in tandem with business processing, so that problems can be spotted and dealt with sooner than is possible with conventional approaches.
Fig. 2: Data stream processing
Fig. 3: Streaming data
Fig. 4: Window processing
Fig. 5: DSL for continuous query
To help enterprises process and analyze high velocity data, we have defined the 'iGATE Analysis and Intelligence in Real-time™' (AIR) approach. The approach is built on the concepts of Data Stream Processing and Complex Event Processing, and has evolved from our experience of implementations across domains.

In the following sections, we share our experiences in creating solutions for domains like the Manufacturing Industry, Smart Grid, and Oil and Natural Gas.
Real-time Manufacturing Intelligence

This solution was developed to enable manufacturing organizations to have continuous visibility into their production processes - spread globally - allowing business operations to optimize product performance, yields, and utilization. Aggregating and processing huge volumes and/or high-speed distributed data to provide continuous intelligence is a challenging task. To do so, one needs to go beyond the visualization and analysis capability provided by stand-alone Human Machine Interface (HMI) software.
As depicted in Fig. 6, the solution made use of on-the-fly processing of high velocity data by processing it in-memory, before the data was persisted using a suitable downstream application. In-memory processing allowed the application to process data in seconds, providing a real-time view into plant performance. Dynamically changing operational parameters were displayed using real-time dashboards. A dashboard for monitoring plant performance is depicted in Fig. 7.

Some of the features of the solution are:
• Continuous queries to analyze and transform streams of data in real-time.
• Integration of business intelligence across different applications in real-time.
• A scalable platform capable of processing vast volumes of real-time data.
Real-time Energy Monitor

Households and enterprises consume energy on a daily basis for various activities. Today, technology allows multiple energy readings to be captured and transmitted at set intervals, as per business need. This continuous stream of data can be used to provide consumers with an accurate and up-to-date picture of their energy consumption.

As depicted in Fig. 8, the high velocity data was processed using a data stream processing solution. The solution processes energy consumption data and presents it using real-time dashboards. The solution also provides real-time notifications, and allows aggregated data to be persisted to a data warehouse for further analysis.

As illustrated in Fig. 9, the consumption monitor displays real-time consumption data by itself or juxtaposed with historic data.
Some of the key features of the solution are:
• Meaningful insight into real-time data for improved customer experience.
• Improved performance through in-memory processing.
• A hybrid approach: in-memory processing of high-volume real-time data to provide immediate, useful feedback, with aggregated data persisted in a warehouse for future analysis.
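The hybrid approach in the last bullet can be sketched as: give immediate in-memory feedback for each reading, while persisting only per-hour aggregates for later warehouse analysis. This is an illustrative sketch; the function name, field names, and units are our own assumptions, not details of the solution described above:

```python
from collections import defaultdict

def process_readings(readings):
    """Consume (hour, watt_hours) meter readings. Returns the immediate
    per-reading feedback (running total for that hour, computed in
    memory) and the hourly aggregates that would be persisted to a
    data warehouse."""
    live_feedback = []                # immediate results, never persisted
    hourly_totals = defaultdict(int)  # aggregates destined for the warehouse
    for hour, watt_hours in readings:
        hourly_totals[hour] += watt_hours
        live_feedback.append((hour, hourly_totals[hour]))
    return live_feedback, dict(hourly_totals)

readings = [(9, 1200), (9, 800), (10, 2000)]
live, warehouse = process_readings(readings)
# warehouse == {9: 2000, 10: 2000}
```

Only the small `warehouse` dictionary would be written to storage; the raw stream is processed and discarded in memory, which is the source of the latency and storage-cost benefits claimed for the approach.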
Fig. 6: Real-time Manufacturing Intelligence
Fig. 7: Real-time plant-performance monitor

Real-time Drilling Operations Monitor

In the field of Petroleum and Natural Gas (PNG), huge amounts of data and an immense installed base of disparate systems make it difficult for upstream engineers and operators to collaborate effectively. Moreover, the upstream oil and gas industry is challenged to provide engineers and operators with interfaces that support optimal short-term and long-term decision making. Its highly trained professionals need integrated views, oftentimes related to a particular process or production event. Despite the complexity, engineers and operators must quickly identify significant process events, assess their relevant parameters, and take suitable actions.
Upstream professionals find operational insights useful in exploration, drilling and completion, production, and other upstream processing scenarios. Multiple scenarios, including improving well performance and generating alerts for disaster management, can leverage an in-memory architecture based on data stream processing.

As shown in Fig. 11, the real-time dashboard related to well drilling allows the upstream professional to benchmark and monitor crucial operational metrics with real-time data.
Some of the key features of the solution are:
• Multiple real-time data streams aggregated on-the-fly.
• Enrichment of real-time data with data from the static operational data store.
Benefits

The iGATE AIR approach helps generate results as soon as the input data becomes available, delivering business intelligence continuously and in real-time, which can be consumed by applications, services, and users throughout the organization.

The benefits of the approach can be broadly classified into two categories, namely business benefits and technology benefits, some of which are given below:
Business Benefits
• Smarter integration of real-time business intelligence across the organization.
• Improved business agility, business innovation, and business continuity.
• Reduction in development time and cost by using standards-based SQL.
• Reduced storage cost, as data is not required to be persisted before it can be analyzed.
Technology Benefits
• Allows real-time data collection, transformation, aggregation, and reporting.
• Lower latency, as data can be analyzed in-memory before it is persisted to the storage medium.
• Data independence, that is, logical/physical separation, leading to loosely coupled applications that need less tuning and are more flexible.
• Can be integrated with multiple stream processing solutions like StreamBase, SQLstream, and Esper, to name a few.
Fig. 8: Smart grid operational insights
Fig. 9: Continuous consumption monitor
Fig. 10: Real-time drilling monitor
Conclusion

The concepts of data stream processing leverage performance improvements on the hardware side, particularly developments in RAM technology, allowing for in-memory processing. In turn, in-memory processing allows enterprises to perform data analysis in real-time, enabling real-time visibility into day-to-day operations.

It is important to note that Big Data is not only about processing, cleaning, or churning high-velocity, high-volume data. It is about deriving relevant, meaningful insights from that data. Solutions built using Data Stream Processing concepts can be used effectively to analyze high velocity operational data and extract business insights in real time.
As described in this document, we have used the concepts of data stream processing to build solutions for operational insights across multiple domains. Operational insights can be leveraged to improve the speed of visibility into key operational metrics, thereby helping improve business agility. The iGATE Analysis and Intelligence in Real-time approach aims to augment existing data warehouse and business intelligence solutions to enable real-time data processing, rather than seeking to replace them. This allows enterprises to keep using their existing solutions effectively while adding the capability of real-time data processing, helping them respond to events much faster and in an effective manner.
Fig. 11: Real-time dashboard for drilling parameters
About the Authors
Bipin Patwardhan is a Technical Architect with more than 15 years of experience in the IT industry. At iGATE, he is leading the High Performance Computing CoE. The CoE builds capabilities around technologies that help deliver high performance for enterprise applications. Presently, the CoE covers areas like Parallel Programming, GPU Programming, Grid Computing, Real-time Analysis, and In-Memory Computing.

Sanghamitra Mitra is a Technical Architect from the R&I (Research & Innovation) group at iGATE. She has around 15 years of experience, and has worked on various projects related to Enterprise Applications as well as Enterprise Application Integration, with international clients across multiple domains. Currently, her primary focus is on hands-on evaluation of emerging technologies in the High Performance Computing area, including Parallel Computing and Real-time Intelligence. She is responsible for institutionalizing these technologies across the organization and leveraging them to build innovative solutions to solve business problems.
Primer on CSI History – An Appeal

As CSI will be turning 50 in 2015, a series of Golden Jubilee events/activities is proposed to be conducted well in advance, over the coming two years. There is a proposal to have a curtain raiser in Jun/Jul 2013 at Delhi.

In this context, a primer on "CSI History" is proposed to be brought out, highlighting the significant/major milestones of CSI from its inception. To facilitate the compilation/preparation of this primer, inputs are requested from all fellows and members who have been associated with CSI for long years, in various capacities, at the chapter, regional, national, and international levels. Kindly provide all information relevant to this primer as write-ups, documents, publications, photographs, and in all other forms, at your earliest convenience.

While soft copies of the inputs can be sent to me by email at [email protected], the hard copies (documents/publications/photos) may please be sent to:

Director - Education
Computer Society of India
Education Directorate
CIT Campus, IV Cross Road
Taramani, Chennai – 600 113
Ph: +91-44-2254 1102 / 1103 / 2874

After use, they will be returned to you if desired. We request your immediate support in this activity, as the lead time for the primer preparation is quite short.
With the cyber attacks on DRDO and the kind of Internet blackout that India faced in March 2013, I thought of penning this article to make my readers aware of the APT scenario in general, and of where India should be poised.
What is APT?

A common definition of APT is hard to come by, as many vendors, consortiums, and groups put their own twist on the terminology. A commonly accepted explanation refers to APT as "an advanced and normally clandestine means to gain continual, persistent intelligence on an individual, or group of individuals such as a foreign nation state government." APT is sometimes used to refer to sophisticated hacking attacks and the groups behind them. What does that mean to the Indian citizen, though?
Simply put, APT is reconnaissance and investigation of your network, in addition to your infrastructure and your information assets. It is a reference to a sophisticated and dedicated attacker, or attackers, willing to "lay low" and go very slow in exchange for gathering data about you, your organization, and how you operate. For the IT professional managing an environment, adjusting your current infrastructure and preparing for this threat will require a different mindset and some analytical assessment.
According to CERT-In (the Indian Computer Emergency Response Team), an estimated 14,392 websites in the country were hacked in 2012 up to October. It is generally accepted that social media usage boosts the likelihood of a successful APT attempt.
Attackers behind APTs are interested in a broad range of information, and are stealing everything from military defense plans, as in the recent DRDO attacks, to schematics for toys or automobile designs. Their motivation can be financial gain, a competitor's advantage in the marketplace, sabotage of a rival nation's essential infrastructure, or even just revenge.
APTs start by identifying vulnerabilities that are unique to your employees and infrastructure. And since they are precisely targeted, surreptitious, and leverage advanced malware and zero-day (unknown) exploits, they can bypass traditional network and host-based security defenses.
Cybercriminals are increasing their use of Web-based malware, and are employing malicious uniform resource locators (URLs) for only brief periods of time. They use "throw-away" domain names in just a handful of spear-phishing emails before moving on, enabling them to fly under the radar of URL blacklists and reputation analysis technology. Additionally, as industry reports point out, they are blending URLs and attachments in email-based attacks, and reproducing and morphing malware in an automated fashion.
These techniques render defenses that rely on known patterns of data almost entirely ineffective. We are only in April, and 2013 is already the 'year of the hack'. Even more disturbing is the fact that many attacks are being carried out by state-sponsored actors from countries like China, Korea, and Iran.
It is imperative to know when a targeted attack is underway, and how to gather evidence to understand its purpose and origin. Leveraging multiple security solutions that use different methods to detect malicious activity, for both internal and external threats, can enhance your capabilities. Security technology has been evolving, and manufacturers are developing ingenious ways of not only detecting, but stopping, zero-day attacks.
Many advanced security monitoring tools work well in conjunction with more traditional defenses, such as firewalls, IDPS, antivirus, gateways, and security information and event management (SIEM) systems. With the right tools in place, and staff and operational support behind them, you can gain the situational awareness and counter-intelligence needed to identify an attack, and potentially block or quarantine threats. Even if an attack is successful, the insight gained into how it occurred, what information may have been compromised, and the relative effect of your defenses can be invaluable to recovery efforts, and will help you continuously improve your security posture.
India's cyber law - Section 66F (Cyber Terrorism) of the IT Act, 2000 - has enough teeth to fight such criminals once they are found. India needs to implement a comprehensive knowledge management system that can be used by its defense forces along with DRDO, NTRO, and CERT-In. Such knowledge management on APT can help us weed out successful cyber attacks and increase our cyber attack preparedness. As a country, India needs a holistic approach and view to counter the APT threat; we have cyber security heroes in pockets, but for APT we need a team of heroes, guided by systems and processes, to channel their fight.
Reference
[1] http://focus.forsythe.com/articles/268/Combating-Advanced-Persistent-Threats
Information Security »
Advanced Persistent Threats (APT) and India

Security Corner
Adv. Prashant Mali [BSc (Physics), MSc (Comp Science), LLB]
Cyber Security & Cyber Law Expert
Email: [email protected]
IT Act 2000 »
Prof. I T Law Demystifies Technology Law Issues – Issue No. 13

Security Corner
Mr. Subramaniam Vutha, Advocate
Email: [email protected]
How Lawyers and [IT] Technologists should collaborate:
IT Person: Prof. I. T. Law, it is a pleasure
to meet you again. And I look forward
to an enlightening discussion with you
on Technology law issues that people
like me should know.
Prof. IT Law: I enjoy talking to you too.
What topic should we discuss today?
IT Person: I am intrigued by your
concept of collaboration between
technologists and lawyers. How
should we go about that?
Prof. IT Law: Yes, such technology +
law collaboration is a fundamental
need of the day. Especially in the
Internet era, things move too fast and
changes occur so rapidly that there
is greater need than ever for such
collaboration.
IT Person: Why is it so important in
our industry to do such “parallel” work
with a lawyer?
Prof. IT Law: In older and more mature
sectors, the business executive
or technologist could take an
appointment with a lawyer, and then
brief him or her before taking a legal
opinion that influences his business
plan. In the case of the IT industry
things move so fast that it is better to
“thread” the legal precautions into the
business plan itself. If not, the velocity
of business will be such that great
harm can be done before you know
it, and legal damages could be quite
daunting.
IT Person: Please give me an example.
Prof. IT Law: Well, take the case of a
company that wants to develop a new
website. The architects and designers
of the website will talk to the
business people, and understand the
functionality that is needed. At that
stage itself it is important to involve
the lawyer too.
IT Person: Why is that needed? What
will the lawyer do to help at that
stage?
Prof. IT Law: As the business
executives explain the functions of the
website, the design of the website will
be determined. The lawyer will also
understand those intended functions
of the website, and will advise on
the types of agreements and policies
needed.
IT Person: For example?
Prof. IT Law: If the website is only
for information, you will need terms
and conditions for access by visitors.
And if you are gathering any personal
information about visitors to the site,
then you will need to have privacy
policy terms also.
IT Person: And what if we have more
functions on the website?
Prof. IT Law: Do you mean functions
like buying and selling products and
services on the website? In that case
you will also need to have terms
and conditions of sale by you. Those
should be binding on the people
who use your site to buy goods and
services.
IT Person: This is interesting. But what
will the lawyer do at this early stage?
Prof. IT Law: Your website terms and
conditions are like a contract. They
bind both you and the visitors to your
website. So, it is important to know
the legal implications and to have
your lawyer understand the business
intentions and plans so he can help
you draft these properly.
IT Person: Usually the practice is to
see what others are doing and to adapt
their policies. Is that not sufficient?
Prof. IT Law: Each website presents a
different set of issues and challenges.
So, it is not sensible to simply follow what
others do without applying your mind
to the specific needs of your website.
Remember, these policies and terms
and conditions become crucial when
you face a legal challenge. At that
stage, it will be too late to undo
something that is not appropriate.
IT Person: For example?
Prof. IT Law: Just consider a situation
where you did not provide in your
privacy policy that the personal
information gathered could be
shared with prospective buyers of
your business. In that case, you will
not be able to share such data with
a prospective buyer. And that buyer
may be interested mainly because of
the personal information database
you have built up.
IT Person: Oh, I see! That is interesting.
I now understand how technologists
and lawyers should collaborate on
key events and plans. Thank you very
much. Talking to you is always so
stimulating.
Prof. IT Law: Your interest in the
subject makes it interesting for me
too. Our discussions are themselves
collaborations between a technologist
and a lawyer!
CSI Communications | April 2013 | 35
Claude Elwood Shannon, the father of the information age, was born on April 30, 1916. Shannon's influence and inspiration underpin everyday activities, ranging from cell phones to the popular social networking sites. The commonly used internet terms save, store, upload, and download arguably symbolize the revolutionary concepts laid out by this most influential mind of the 20th century. It is really interesting to explore the path Shannon travelled that finally led him to his landmark ideas.
Shannon was known to be very timid, and led a normal childhood. His mother was the principal of the local Gaylord High School and his father was a businessman. Shannon was very much inspired by his grandfather, who was a farmer and an inventor. Shannon's childhood hero was Thomas Edison, with whom he shared a common ancestor. As a young boy, Shannon was a big fan of Edgar Allan Poe's "The Gold Bug"[1], a detective fiction that centers on the search for buried treasure by deciphering a secret message. Shannon was drawn to solving cryptograms right from his school days.
Moreover, Shannon was very curious to learn how things worked. He wanted to know how various devices like model planes operated. He was even adventurous: as a young boy, he tried to contact a friend half a mile away by hooking a telegraph machine to a barbed wire fence. Apart from this, he had a passion for Dixieland music, and kept a good collection of musical instruments.
In 1936, Shannon graduated from the University of Michigan with degrees in both electrical engineering and mathematics. His interest in Boolean logic started at Michigan. Later, Shannon joined MIT, and his acquaintance with Vannevar Bush, dean of MIT's school of engineering, changed his whole life. His mentor was very influential in recognizing his milestone work on switching theory, widely regarded as one of the best master's theses ever produced. He was awarded the Alfred Noble Prize in 1940 for this novel contribution to switching theory, which later became the foundation of modern digital systems. On the advice of his mentor, Shannon did his PhD work in the area of genetics.
In 1942, Shannon joined Bell Telephone Laboratories for full-time research. During the Second World War, he worked on anti-aircraft fire-control devices and on cryptography. Significantly, his encryption work, built into a complex scrambling machine, was used by Franklin Roosevelt and Winston Churchill to protect their transatlantic communications during the war. It was at Bell Labs that Shannon met his wife Betty, a trained cryptographer. The ambience of Bell Labs and its relaxed atmosphere helped Shannon integrate all his views, culminating in his famous 1948 landmark paper "A Mathematical Theory of Communication". Shannon's revelation about the concept of information made it possible to envisage many developments in the field of communication. Information theory is the brainchild of this great American genius. Inspired by Hartley's paper, Shannon set out to quantify the mysterious concept of information. The fresh insight he proposed was the realization that the quantity of information has nothing to do with its meaning in common parlance. It was Shannon's absolutely incredible thinking that related surprise and information[2].
In the information era, the bit is the fundamental atom of information. It was Shannon who first used the word "bit", as per the suggestion of J.W. Tukey, as a contraction of the two words "binary digit"[3]. The supremacy of the bit lies in its versatility: the bit forms the language of any communication system, irrespective of whether the message is text, audio, or images. All messages are translated into two states, "OFF (0)" and "ON (1)". Shannon even established the limit at which a message can be transmitted from one end to another through a channel without loss of information. The abstract concept of information proposed by Shannon forms the foundation of all technological advancements in the field of data storage and transmission systems.
Shannon's interdisciplinary approach created a revolutionary change in the field of digital communication. He had astonishingly diverse interests in fields such as switching, cryptography, computing, artificial intelligence, and games, and his novel contributions helped shape the modern digital world. More than that, Shannon was an enthusiastic juggler and an amazing unicyclist; he loved designing devices out of curiosity, enjoyed playing chess, and was, most importantly, a delightful poet and musician. With his amazing mathematical foundation, Shannon laid down the golden rules of modern information theory. Let me close my tribute by quoting an extract of Shannon's masterpiece poem, published by John Horgan in Scientific American[4].
A Rubric on Rubik Cubics
Strange imports come from Hungary: Count Dracula, and ZsaZsa G.,
Now Erno Rubik’s Magic Cube; For PhD or country rube.
This fiendish clever engineer; Entrapped the music of the sphere.
It’s sphere on sphere in all 3D—A kinematic symphony!
Ta! Ra! Ra! Boom De Ay!
IT.Yesterday()
Biji C L, Department of Computational Biology & Bioinformatics, University of Kerala
Birthday Tribute to the Most Influential Mind of the 20th Century: Claude Elwood Shannon
With theorems wrought by Conway’s eight;
‘Gainst programs writ by Thistlethwait.
Can multibillion-neuron brains; Beat multimegabit machines?
The thrust of this theistic schism—To ferret out God’s algorism!
With great Enthusiasm; Ta! Ra! Ra! Boom De Ay!
Men’s schemes gang aft agley; Let’s cube our life away!
References:
[1] Robert Price, "A Conversation with Claude Shannon: One Man's Approach to Problem Solving", Cryptologia, 9:2, 1985, pp. 167-175.
[2] Arun K S and Achuthsankar S Nair, "60 years since 'kpbw wcy xz' became more informative than 'I love you'", IEEE Potentials (ISSN: 0278-6648), Vol. 29, Issue 6, Nov.-Dec. 2010, pp. 16-19.
[3] C.E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, Vol. 27, pp. 379-423, 623-656, July and October 1948.
[4] http://blogs.scientificamerican.com/cross-check/2011/03/28/poetic-masterpiece-of-claude-shannon-father-of-information-theory-published-for-the-first-time/
About the Author
Biji C L completed her Master of Engineering from Anna University. She is currently pursuing her PhD in the Department of Computational Biology & Bioinformatics, University of Kerala.
taking into consideration the security features of Hadoop.
3. Data Aggregation Stage – This is the most important step; it aggregates the data from Hive/HBase or any other NoSQL database so that analysis can be carried out on the aggregates.
4. Data Analytics Stage – In this step, further analytics are performed to find drilling patterns and to infer the lithology content based on various parameters from the oil well logs. This step can be performed on a separate analytical database or an in-memory database residing outside the Hadoop ecosystem. Alternatively, tools such as 'R' integrated with Hive can be used for distributed analytics on Hadoop.
5. Data Visualization Stage – The output from the analytics can be integrated with DW/BI systems to generate dashboards and scorecards so that decision makers can visualize and interpret the data.
Conclusion
A Hadoop-based Big Data framework with Hive as a central data warehouse layer is widely used to create dynamic and unified structures. We can easily execute pre-defined or ad-hoc queries on Hive. This acts as a unified, integrated layer that can easily be augmented with the current BI stack. The salient features of a Hadoop/Hive based solution for Oil and Gas E&P data management are:
• Scalable architecture to analyze terabytes to petabytes of multi-structured well log data
• Massive parallel processing providing a unified view of the data from multiple wells during their lifecycle, be it at the planning, operations, or post-completion stage
• Integrated KPI framework for commercials, operations, health and safety execution, production, etc.
• An extendable PPDM-compliant data model and Energistics standards to manage the data with a partner ecosystem
• Comparative analytics and correlations with wells in similar geologic conditions to help decision making for drilling oil wells
• Oil and Gas domain ontology for easy interpretation of scientific terminology
Pramod Taneja, Principal Architect, iGATE - Pramod has 20+ years of IT experience and is currently leading the Big Data CoE of the Research & Innovation group, iGATE. He has served in various capacities managing and supporting business process-led technology as well as strategic management initiatives. Email - pramod.
Prashant Wate, Technical Specialist, iGATE - Prashant has more than 13 years of experience in IT and is
currently part of the Big Data CoE of Research & Innovation group, iGATE. He has extensive experience in
architecting and implementing database solutions including Big Data, data modeling, data migration and
database optimization. Email - [email protected]
About the Authors
Continued from Page 18
Solution to March 2013 crossword
Brain Teaser Dr. Debasish Jana
Editor, CSI Communications
Crossword » Test your Knowledge on Big Data
The solution to the crossword, with the name(s) of the first all-correct solution provider(s), will appear in the next issue. Send your answers to CSI Communications at email address [email protected] with subject: Crossword Solution - CSIC April 2013.
CLUES

ACROSS
2. Document-oriented databases using a key/value interface rather than SQL (5)
5. A space-efficient probabilistic data structure (5, 6)
8. Unit of measurement for data volume (9)
9. Markup language (3)
11. A data flow language and execution framework for parallel computation (3)
12. Structure of data organization (6)
14. A distributed columnar database (5)
17. One quintillion bytes (7)
19. Type of database system that can make deductions (9)
25. Discovery of meaningful patterns in data (9)
26. A massive volume of both structured and unstructured data (7)
27. Type of database designed to handle workloads whose state is constantly changing (8)
28. Different types of data (7)
29. An ordered list of elements (5)
30. The digit one followed by one hundred zeroes (6)
31. An open-source system for processing real-time data streams (5)
32. A paradigm for development of distributed computing applications (6,5)

DOWN
1. One thousand terabytes (8)
3. Required for data persistence (7)
4. An open-source software framework supporting data-intensive distributed applications (6)
6. Rate at which data is acquired (8)
7. An open-source database (7)
10. A programming model to process large volumes of data (9)
13. Method for an integrated knowledge environment (4)
15. Type of database optimized to store and query data related to objects in space (7)
16. Type of database with built-in time aspects (8)
18. Technique to clean up noisy data to make it usable (11)
20. Size of data expressed as (6)
21. A very large number (10)
22. Extremely large databases (4)
23. An in-memory computing platform designed for high-volume transactions (4)
24. An engine for query processing and data warehousing (4)
28. Database, in very large form (4)
Congratulations to Ananthi Nachimuthu (Dept. of Computer Technology, Dr. N.G.P. Arts and Science College, Coimbatore) and Madhu S. Nair (Dept. of Computer Science, University of Kerala, Thiruvananthapuram) for ALMOST ALL correct answers to the March 2013 crossword.
Did you know about the MapReduce algorithm for handling huge data?
MapReduce offers a programming paradigm for massive scalability when handling large data volumes. Users specify a map function that takes an input data set and transforms it into a set of intermediate key/value pairs, and a reduce function that merges the transformed values associated with the same key.
(Source: MapReduce: Simplified Data Processing on Large Clusters by Jeffrey Dean and Sanjay Ghemawat, URL: http://research.google.com/archive/mapreduce.html)
[Crossword grid]
[Solution grid to the March 2013 crossword]
Ask an Expert Dr. Debasish Jana
Editor, CSI Communications
Your Question, Our Answer“Do the right thing. It will gratify some people and astonish the rest.”
~ Mark Twain
C/C++: Catching array index out of bounds
From: Anonymous
In C/C++, there is apparently no array-index-out-of-bounds exception when dealing with raw arrays. Even the index operator of an STL vector cannot detect an index crossing the specified boundary limits. Code snippet follows.
#include <iostream>
#include <vector>
using namespace std;

const int SIZE = 2;

int main()
{
    int rawarray[SIZE];
    vector<int> v(SIZE);
    int i;
    for (i = 0; i <= SIZE+1; i++) {
        rawarray[i] = i;
        v[i] = i;
    }
    for (i = 0; i <= SIZE+1; i++) {
        cout << "rawarray[" << i << "] = " << rawarray[i] << endl;
        cout << "v[" << i << "] = " << v[i] << endl;
    }
    return 0;
}
When I compile and run this program, there is no compilation or runtime error. However, it is clear from the code above that each of rawarray and the STL vector object v is supposed to contain two elements as per the specified size, i.e. 2, but when I try to put something in as the third or even fourth element, it is allowed without any warning or error. Here's the output:
rawarray[0] = 0
v[0] = 0
rawarray[1] = 1
v[1] = 1
rawarray[2] = 2
v[2] = 2
rawarray[3] = 3
v[3] = 3
Any suggestions or workarounds?
A: In C/C++, there is no boundary checking for arrays. Even accessing a vector with the index operator is unchecked, as rightly pointed out. In reality, accessing rawarray with an index of, say, 5, as rawarray[5], means you are accessing the element residing at memory location rawarray + sizeof(int) * 5; a negative index would land before the array in the same way. On a typical 32-bit machine, where sizeof(int) = 4, this is at an offset of 4 * 5, i.e. 20 bytes from the starting location of rawarray. If that memory location is within the permissible range of memory locations for user programs, no runtime error occurs; if it falls within a restricted memory area (reserved by the operating system), the access causes a protection violation, i.e. the program crashes. Either way, the behavior is unpredictable. A better alternative is to define Array as a C++ template with its own exception class to catch array-index-out-of-bounds errors. Code snippet follows:
#include <iostream>
#include <string>
#include <exception>
using namespace std;

class MyException : public exception {
    string ex;
public:
    MyException(const string str = "some exception") : ex(str) {}
    ~MyException() throw() {}
    const char* what() const throw() { return ex.c_str(); }
};

template <class T>
class Array {
    T * data;
    int size;
public:
    Array(int s) { data = new T[size = s]; }
    virtual ~Array() { if (data) delete [] data; }
    T& operator [] (int index) {
        if ((index < 0) || (index >= size))
            throw MyException("Array index out of bounds");
        return data[index];
    }
};

const int SIZE = 2;

int main()
{
    try {
        Array<int> safearray(SIZE);
        int i;
        for (i = 0; i <= SIZE+1; i++) {
            safearray[i] = i;
        }
        for (i = 0; i <= SIZE+1; i++) {
            cout << "safearray[" << i << "] = " << safearray[i] << endl;
        }
    } catch (MyException &e) {
        cerr << "exception: " << e.what() << endl;
    }
    return 0;
}
The output would be as below (when array index boundary is crossed):
exception: Array index out of bounds
For std::vector, the index operator [] does not check for boundary overflow or underflow. You could use the member function at, e.g. v.at(i), enclosed in a try block; vector::at throws an out_of_range exception if the requested index falls outside the specified range. Alternatively, you may check v.size() to see whether you are crossing the specified boundary.
Send your questions to CSI Communications with subject line ‘Ask an Expert’ at email address [email protected]
Happenings@ICT H R Mohan
Vice President, CSI, and AVP (Systems), The Hindu, Chennai. Email: [email protected]
ICT News Briefs in March 2013The following are the ICT news and headlines
of interest in March 2013. They have been
compiled from various news & Internet sources
including the dailies – The Hindu, Business Line,
and Economic Times.
Voices & Views • The public cloud services market to grow
18.5% in 2013 to $ 131 billion globally – Gartner. • Views on Budget 2013: Ganesh Natarajan:
Minor advantages for IT; Phaneesh Murthy: Nothing sparkling for corporate sector; Keshav R Murugesh: Right notes for BPO industry; B V R Mohan Reddy: Positive and balanced; Hike in tax on royalty payments to hurt tech firms; Telecom sector disappointed; Tax incentive for semiconductor fab unit too late, say chip makers; GTech: Budget 'interesting' for IT sector.
• Smartphone sales are expected to touch 918 million units worldwide in 2013, and by the end of 2017, 1.5 billion – IDC.
• The IT infrastructure budget for World Cup Soccer and Olympics in Brazil is pegged at $180 billion.
• The small and medium software industry in India is pegged at $110 billion, while export is worth $68 billion -- M. Nayak, Director, STPI.
• Indian mobile phone market up 16% at 218 mn in 2012 – IDC.
• IT-ITES exports up 23% at Rs 4.11 lakh cr in FY’13 – Deora.
• India (4.2%) ranked third on distributing spam across the world, after US (18.3%) and China (8.2%); Asia tops the list of continents with 36.6% of the world’s spam -- SophosLabs.
• 5.85 lakh telecom towers consume 5.12 bn liters of diesel a year and emit 10 mt of carbon dioxide: Deora.
• M-commerce to constitute over 25% of e-commerce traffic -- HomeShop18 CEO.
• E-commerce segment has doubled to about $ 14 billion in 2012 from $ 6.3 billion in 2011.
• India makes 13 requests a day for web user data (Internet snooping by the enforcement authorities), second to U.S. which asks 45 – Google.
• Three out of every 10 parents confirm that their children were victims of cyber-bullying - Norton.
• About 84% of all young men (2.4 crore) and 82% of college going (1.5 crore) and 68% of school going (1.5 crore) kids accessed the social media; Social media users in urban India crosses 6.2-crore mark in December 2012 and estimated to be 6.6 crore by June 2013 -- IAMAI.
• Nasscom expects export revenues of $84-87 billion in the 2013-14 fiscal, at a growth rate of 12-14%.
• India’s domestic IT market to touch Rs 1.75 lakh crore by 2016 -- Boston Consulting & CII.
• Videocon, Reliance ‘ready’ to invest Rs 25,000 cr in chip-making units.
• Put the country firmly on the Internet and 'get out of the way' -- Eric Schmidt, Chairman, Google.
• Technology has forced politicians to update themselves – Modi.
• Holidays are peak season for spammers. Holiday spam can account for up to 6% of all spam.
• Europe contributes about 25-30% of IT revenues as against 50% from the US markets.
• Computer users to spend 1.5 bn hours and $22 bn battling malware. Global enterprises will spend $114 billion to deal with the impact of a malware-induced cyber-attack – Microsoft.
• Mobile value added services (MVAS) to reach $9.5 billion in 2015, from $4.9 billion in 2012 – Wipro & IAMAI.
• Cyber security market may reach $870 mn by 2017—IDC.
Govt, Policy, Telecom, Compliance • Govt. expects lower revenue of Rs 19,440.67
crore from spectrum sale and other related charges in 2012-13, compared to Rs 58,217 crore estimated.
• Bharti Airtel leads in consumer complaints on billing, tariff.
• Govt. plans to take over possession of BlackBerry infrastructure in Mumbai for legal interception of Internet communication.
• 2G Scam: JPC unlikely to call Raja as witness. May be asked to submit stand in writing. CBI court summons Sunil Mittal, Ravi Ruia, and Asim Ghosh. Raja accuses Vahanvati of telling untruths against him.
• 2G players face fine from DoT for shutting services without notice.
• DoT decision to allow broadband players to offer voice is illegal – COAI.
• New messaging system for NGOs with FCRA (Foreign Contribution Regulation Act) registration.
• Buying Internet protocol addresses to get cheaper, faster with the launching of National Internet Registry (NIR) in India.
• Free roaming services likely before October – Sibal.
• 14 million requests to switch mobile operator rejected.
• CDMA spectrum sale to fetch Rs 3,639 cr while the auction of 2G spectrum for GSM players held in November last year fetched Rs 9,407 cr.
• DoT firm: telcos must own spectrum to offer 3G services.
• Over 2 cr mobile users loaded with value-added services they didn’t ask for.
• BSNL, MTNL still to recover Rs 6,215 cr from customers – Sibal.
• Time for electronics goods certification extended till July 3.
• Govt has received proposals for two semiconductor fabs – Sibal
• India plans U.S.-like information sharing to alert cyber-attacks.
• Telcos asked to install local server for security audit.
• Fate of Aakash II tablet still uncertain.
• Sibal unveils roadmap for IPv6. Plan for complete migration to IPv6 by December 2017.
• Cost of voice services will move up – Aircel.
• Unified license framework to take a month -- Telecom Secretary.
• Centre to set up 2,000 telecom towers in tribal areas at a cost of Rs. 3,000 cr.
IT Manpower, Staffing & Top Moves
• Freshers 'hired' by HCL Tech stage protests across the country demanding that the company convert the offers into actual jobs. HCL issued a letter of intent and not a job offer – HCL HR head.
• Infosys plans to hire 200 in US. • Helios and Matheson IT to hire 1,000. • Aptech ties up with NSDC. Aims to train over
two million people over 10-years. • Google to slash 1,200 Motorola Mobility jobs
in US, China and India. • Sigma Aldrich, to hire 100. • Mahindra Satyam to increase headcount in
Australia to 5,000 in two years from 1,600. • Chennai-born Sundar Pichai to head Google
Android division.
• Fake job offers swarm the Android platform.
• US to accept H-1B visa applications (with a cap at 65,000) from April 1. H-1B visas could double under Senate plan -- Report.
• Hiring activity in IT sector likely to be muted this year -- Kris Gopalakrishnan.
• Albion Infotel plans to hire 150 people. • Nasscom launches programme to incubate
10,000 start-ups. • Makuta VFX to double headcount this year
from 60. • Engg students prefer IT; Google most wanted
employer – Nielsen. • Tyco plans to double headcount from 850. • D Shivakumar, Senior Vice-President (India,
Middle East, Asia), Nokia decides to quit. • SAP training students to meet innovation
needs. • TCS ranked No. 1 employer in Europe for 2013.
Company News: Tie-ups, Joint Ventures, New Initiatives
• TCS enters $5 billion brand value club with its brand valued at $5.247 billion.
• Microsoft launches 'Office 365' in India.
• Seagate to launch wireless storage solutions in India.
• Cisco announces the Cisco Education Enabled Development (CEED 2700), a cloud-based video conferencing solution for educators.
• Free pepper sprays, special call rates, for women opting for new pre-paid connections on Women’s Day.
• HP unveils ElitePad for enterprise segment.
• HomeShop18 launches 'Scan N Shop', India's first virtual shopping wall, at T3 Terminal of Indira Gandhi International Airport in New Delhi.
• Reliance plans social services through optical fibre cable network.
• Google to replace passwords with ‘ID ring’. • AMD unveils Accelerated Processing Units
(APUs) with facial log-in, gesture recognition.
• Adobe unveils 'Creative Cloud', offering membership-based access to its products and services.
• Intel to roll out 4th generation core processor this year.
• EMC pips IBM to become largest storage player.
• IBM opens customer experience lab. • Sabeer Bhatia plans on to biotech, social
ventures; launches Jaxtr SIM, a global SIM card.
• YouTube clocks 100 crore average monthly visitors.
CSI Report
A Report on CSI Best PhD Thesis Award 2012
M. Gnanasekaran, Asst. Manager (Administration), CSI
CSI has instituted a new award to recognize the best doctoral
dissertation(s) in Computer Science/ Information Technology,
from recognized doctoral degree-awarding institutions in
India. The award consists of a certificate, a trophy, and a cash prize.
Ph.D. dissertations accepted by universities in India during
the period January 2011 to December 2012 were eligible for
consideration.
CSI received 65 proposals from institutions all over India.
A panel of established researchers reviewed the dissertations.
The criteria used in the evaluation process included originality/
novelty of the thesis work, pertinence of the subject, depth, and
breadth of the results, contributions to theory and practice of CS,
applications and/or potential applicability of the results, current
and likely future impact, and quality of related publications.
The winners are:

Award: Best Thesis (Joint)
Dr. Ramasuri Narayanam, Department of Computer Science and Automation, Indian Institute of Science, Bangalore. Thesis: "Game Theoretic Models for Social Network Analysis"
Dr. Ketan Kotwal, Department of Electrical Engineering, Indian Institute of Technology, Mumbai. Thesis: "Fusion of Hyperspectral Images for Visualization"

Award: Honourable Mention
Dr. Kishor Kumar Barman, School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai. Thesis: "Topics in Collaborative Estimation and MIMO Wireless Communication"

Hearty Congratulations to the Winners!
Prizes were presented by the Chief Guest, Mr. S D Shibulal,
CEO and MD of Infosys Ltd. during the 48th CSI Foundation
Day Celebrations, held at TIFR on 6th March 2013. The
committee consisting of R Jaikumar, S P Mudur, V Prabhakaran,
K Samudravijaya, R K Shyamasundar, and G Siva Kumar evaluated
the proposals, and selected the best dissertations for the award.
CSI appreciates the remarkable job done by the committee in a
very short span of time.
Prof. RK ShyamasundarConvener
Mr. MD AgrawalChairman - Awards Committee
Kind Attention: Prospective Contributors of CSI Communications -
Please note that cover themes of future issues of CSI Communications are as follows -
• May 2013 - Cryptography
• June 2013 - Social Networking
• July 2013 - e-Business/ e-Commerce
• August 2013 - Software Project Management
• September 2013 - High Performance Computing
The articles and contributions may be submitted in the following categories: Cover Story, Research Front, Technical Trends, and Article.
For detailed instructions regarding submission of articles, please refer to CSI Communications March 2013 issue, where Call for
Contributions is published on the backside of the front cover page.
[Issued on behalf of Editors of CSI Communications]
All eyes were glued to the huge screen. With each passing question, heads came together, hush-hushing the answer. While a few students sat scratching their heads, others clenched their fists in frustration. Fewer still sat under the comfort of utter ignorance. Soon, it was answer-time. And, at once, the quiet auditorium was engulfed by a cacophony of phenomenal enthusiasm, which set the tone for the event. The national finals of the 3rd National CSI Discover Thinking Quiz 2013, a fun quiz conducted by CSI on 2 March 2013 at Millennium National School, Pune, was for students of middle school, from 6th to 9th standard. The quiz master, J Ramanand from IBM India, had the whole audience in raptures. An accomplished quiz master, Ramanand is a BBC Micro Mind winner and also founder of the quiz club of Pune.
Initially, the first round of this national quiz was conducted in various CSI chapters during January 2013. The CSI chapter-level rounds were held at Trivandrum, Kochi, Sivakasi, Mysore, Koneru, Nashik, and Solapur. The top quizzing team from each chapter moved on to the regional rounds, which were held at Koneru (Region 5), Pune (Region 6), and Kochi (Region 7). The finals saw the top 2 teams from each region competing for the CSI Discover Thinking National Quiz Championship. In all, over 500 schools and almost 5,000 students participated in the quiz across its various rounds.
The total prize money was over Rs. 2.0 lakhs. At the end of a pitched battle, Naveen V and Naveen Unnikrishnan of Bhavan's Adarsha Vidyalaya, Kochi bagged the trophy and the first prize of Rs. 25,000. They were followed by Amal M. and Sarath Dinesh of St. Thomas Higher Secondary School, Trivandrum, who took home Rs. 10,000. Third place was secured by D Jeevithiesh and M Prabhat of Narayana IIT Olympiad, Vijayawada, and a team from Dnyan Prabodhini, Pune came fourth. The prizes were distributed by Mrs. Chitra Buzruk, Senior General Manager, Persistent Systems, in the presence of Mr. Shekhar Sahasrabudhe, RVP, Region 6, and Mr. Arun Tavildar, Past Chairman, CSI Pune chapter. At the Koneru regionals, CSI Past President Prof. P Thrimurthy distributed the prizes and encouraged the young students.
This event was coordinated by Mr. Ranga Rajagopal, NSC, CSI and supported by Prof. Prashant R Nair, National Quiz Coordinator. The final was anchored by Mr. Shekhar Sahasrabudhe, RVP 6. Mr. S P Soman, RSC 7, and Ms. Mini Ulanat, National Convenor, Skill Development, coordinated the regional round for Region 7, while Mr. Praveen Krishna, CSI SBC of KL University, Koneru coordinated the Region 5 regional round.
The CSI Discover Thinking Quiz had Adobe as the event sponsor, with Persistent Foundation and KL University sponsoring the regional rounds at Pune and Koneru respectively. The quiz aims to encourage young learners to discover science and ICT the fun way, and hopes to reverse the declining trend of children opting for pure science as a profession.
CSI Report
CSI Discover Thinking Quiz 2013: 3rd National CSI Science and ICT Fun Quiz
Prof. Prashant R Nair*, Mr. Ranga Rajagopal** and Dr. Rajveer S Shekhawat***
*National Quiz Coordinator  **National Student Coordinator, CSI  ***National Convenor, CSI Project Contest
CSI Discover Thinking 2nd National Student Project Contest 2013
CSI "Discover Thinking", the 2nd national-level student project competition, is an initiative for CSI student members to share innovative ideas with their peers and experts country-wide. The 2nd edition of this extremely popular event was exclusively sponsored and supported by M/s Adobe Inc. A total of 10 teams (2 per region) had been shortlisted after regional rounds to participate in the national round. The national final was held at the College of Technology and Engineering (CTAE), Udaipur on 16th March 2013. All the projects contained ideas and implementations of very high quality and innovation. The competition was very closely contested, and at the end of a daylong session of presentations the following teams were announced as winners. The teams participating in the event are shown in the photo with the judges and organizers.
1st Prize: Sai Chand Upputuri & Alakananda Vempala, "Behavioural Biometric Advanced Authentication", K L University, Guntur (A.P.)
2nd Prize (two teams were rated 2nd): (a) Surya Mani Sharma, "Multi-functional Robotic System", Dronacharya College of Engg, Gurgaon (Haryana); (b) Phagun Singh Baya, "Remote Wireless Sensors Analysis and Controlling", CTAE, Udaipur (Rajasthan)
3rd Prize: Kalyani Joshi & Madhuri Jadhav, "Data Transfer between Two USBs without a Computer", PES Modern College of Engg, Pune (Maharashtra).
In the brief inaugural ceremony, Prof. N S Rathore, Dean, CTAE welcomed the contestants and guests and encouraged students to contribute to various problem areas of agricultural engineering. The Chief Guest, Dr. Rajveer Shekhawat, RSC 3 and National Convenor of the 2013 contest, provided the background of the contest and briefed the audience on the activities held in various regions before the finals. Dr. Dharm Singh, Organising Secretary of the national finals at CTAE, introduced the function to the audience. During the ceremony, the students had the privilege of listening to an expert talk by Prof. S V Raghavan, President (Elect), CSI, who joined the audience through video conference from Delhi. Prof. R K Vyas, RVP Region I and Prof. Durgesh Kumar Misra from the Indore chapter were part of the jury.
The regional rounds had been conducted with equal enthusiasm. The Region 1 round was organized by Dronacharya College of Engg, Gurgaon, while the Region 3 event was organised online. The Region 5 round was held at Vignan Nirula Institute of Technology for Women, Parkala, Guntur, and the Region 6 round at Cummins College of Engineering for Women, Pune. The School of Computer Science and Technology of Karunya University, Coimbatore conducted the Region 7 round. In all, over 150 teams presented their projects at the various rounds. The contest was coordinated by Mr. Ranga Rajagopal, National Student Coordinator of CSI. It shall be our endeavour to have more teams participating from all regions of CSI in the coming years. All the winning projects will be made available on the CSI Digital Library, www.csidl.org.
On 25th March 2013, the CSI student chapter was inaugurated at GITA, Bhubaneswar, and a seminar on "Recent Trends on Computer Security" was conducted by Division IV, CSI.
Prof. (Dr.) Sudarshan Padhi, Director, Institute of Mathematics & Applications, Bhubaneswar and an eminent computer scientist, attended the seminar as keynote speaker. In his keynote address, Prof. Padhi observed that computer security is an ever-changing issue. Fifty years ago, computer security was mainly concerned with the physical devices that made up the computer; at that time, these were the high-value items that an organization could not afford to lose. Today, computer equipment is inexpensive compared to the value of the data it processes. The high-value item is no longer the machine or computer hardware but the information it stores and processes, and this has fundamentally changed the focus of computer security from what it was in the early years. One of the most effective measures security professionals can take to address attacks on their computer systems and networks is to ensure that all software is up to date in terms of vendor-released patches. Viruses and worms are just two types of threat that fall under the general heading of malware; the term comes from "malicious software", which describes the overall purpose of code in this category of threat. He concluded by noting that the day's discussion would cover recent trends in computer security and steps to minimize the possibility of attacks on a system.
Sanjay Mohapatra, Chairman, Division IV, CSI participated as a guest of honour at this seminar and spoke about CSI and its student chapter activities. He also discussed different issues of computer security.
The GITA College Principal, Vice-Principal, Dean Academics, and HOD of the CSE Department, GITA were present at the seminar and addressed the students. Prof. Manoj K Pradhan, Student Branch Coordinator, proposed the vote of thanks. Around 200 students and 20 faculty members attended the seminar.
CSI Report
Division IV, CSI - Seminar Report on “Recent Trends On Computer Security” on 25th March, 2013 @ GITA, Bhubaneswar
Sanjay Mohapatra* & Prof. Ratchita Mishra**
*Chairman, Division IV, CSI  **RSC, Region IV
Report on Eastern Regional Convention 2013 on “Computing Anywhere, Anyware" @ Bhubaneswar Organized by – Region IV, CSI & Division IV (Communications), CSI
The "Eastern Regional Convention 2013 on Computing Anywhere, Anyware" was conducted at CV Raman College of Engineering, Bhubaneswar from 25th to 27th Feb 2013. The conference was a joint effort of the CSI student branch at C.V. Raman College of Engineering, CSI Region IV, and CSI Division IV (Communications). The convention was inaugurated by Prof. Dr. Ganapati Panda, Deputy Director, IIT Bhubaneswar. In his inaugural address, Dr. Panda explained how the evolution of technology has made computing possible anywhere, using small to micro devices that can be placed anywhere, be it in our surroundings, in the air, in water, in buildings and homes, and even on and inside our bodies. These devices, which are sensors with processing power and memory, can collect real-time data and process large amounts of information for many kinds of applications meant to help society, such as predicting water levels, controlling irrigation or floods, and managing road traffic. The computer network of today is a hybrid of LAN/WAN, mobile networks, and wireless networks. Some of the challenges that come with the advancement of these technologies are standardization of communication protocols, managing power to widely dispersed wireless ad-hoc devices, and supporting software to process large amounts of data in parallel.
Mr. A Pal, Principal Scientist and Research Head, Innovation Lab, TCS Kolkata, speaking on "Grid Computing for Internet-of-Things", elaborated on the concept of connecting all computing devices on the internet so that unused personal computing power can be exploited by others. However, this raises issues of security and privacy, apart from the challenge of creating a seamless flow of information in a grid of dissimilar computing devices. Mr. S Kanungo, Head, Marketing and Alliance, Cloud Practice, Tech Mahindra spoke on "Consumerization with Mobile Apps & Advantages/Challenges of Cloud Computing". He explained the new concept of fog computing, which provides cloud-like services for mobile devices, and elaborated on several services and projects that Mahindra Satyam is providing in this area, so that clients do not have to invest in purchasing application software; even a workstation (a desktop with all its installed software) can be made available over the cloud at a distant place. Mr. S Panda, CEO, Syum Technology spoke on "Enterprise Mobile Application Development Strategies". He explained how one can develop and deploy applications natively on mobile platforms such as Android and iOS, or, for more general deployment across any mobile device, deploy them as enterprise internet applications. As an alternative strategy, a hybrid approach can be used: an internet application augmented with a few native device OS features.
More than 300 students and faculty members attended the convention. Prof. Dr. K C Patra, Director, CVRGI presided over the inauguration and closing functions. Mr. Sanjay Mohapatra, Chairman, CSI Division IV was the guest of honour. He appreciated the efforts of CV Raman in CSI activities and encouraged the CSI student members to take part in such academic activities. To encourage the students, a web-design competition was held among the participants. The convention continued with a detailed 2-day hands-on workshop on "Android Software Development" for mobile devices, conducted by C2S Technology. In the closing function, the Director, CVRGI, the Registrar, CVRGI, the C2S experts, and the Chairman, CSI Division IV gave away prizes and certificates to the participants. Prof. Dr. R Misra, the CSI Regional Student Counselor, and Mr. D Mohanty, the Student Branch Counselor, presented mementoes and the vote of thanks to the guests and participants.
CSI Report
International Conference on Information Systems and Computer Networks: ISCON 2013
Dr. Dilip Kumar Sharma*, Mr. Sanjay Mohapatra** and Mr. R K Vyas***
*Honorary Secretary, Computer Society of India, Mathura Chapter  **Chairperson, Div IV, CSI  ***Vice President, CSI Region-1
An International Conference on "Information Systems and Computer Networks: ISCON-2013" was organized at GLA University, Mathura, on 9-10 March 2013, in technical collaboration with the IEEE UP Section and CSI Mathura Chapter, Division IV & Region-1. It was co-sponsored by Indian Oil Corporation Ltd (IOCL). The Chief Guest of the conference was Prof. S K Koul, Deputy Director (Strategy and Planning), IIT Delhi. Mr. R K Vyas, Vice President, CSI Region-1; Prof. M N Hoda, Director, Bharati Vidyapeeth, Delhi and Regional Students' Coordinator, CSI Region-1; Prof. S K Gupta, Department of Computer Science and Engineering, IIT Delhi; and other dignitaries were present at the conference. The General Chair of the conference was Prof. Krishna Kant, Head, Department of Computer Engineering and Applications, GLA University, Mathura.
Prof. S K Koul addressed the conference and highlighted some points on being successful: "THINK BIG, WORK TOGETHER AND INNOVATIVELY, AND GIVE MORE AND TAKE MORE". He also guided the participants on writing technical papers.
Prof. S V Raghavan, Vice President, CSI and Scientific Secretary, Office of the Principal Scientific Adviser to the Government of India, New Delhi, also addressed the gathering through a recorded video, in which he highlighted advancements in the domains of electricals and electronics, networking, and semiconductors. He also talked about the National Knowledge Network (NKN), a state-of-the-art multi-gigabit pan-India network providing a unified high-speed network backbone for all knowledge-related institutions in the country. He said that the purpose of such a knowledge network goes to the very core of the country's quest for building quality institutions with requisite research facilities, and for creating a pool of highly trained professionals. In the coming years, the NKN will enable scientists, researchers, and students from different backgrounds and diverse geographies to work closely together to advance human development in critical and emerging areas.
Mr. R K Vyas highlighted that researchers should collaborate with industry and share their research in order to get good exposure. He also warned about the dangers of using email through servers owned by other agencies, and advised the use of mail servers owned by the user's own organization.
Prof. S K Gupta delivered a keynote on cybercrime, focusing mainly on crimes committed using plastic cards. He noted that there are always two types of identity, primary and secondary: the former remains the same throughout a user's life, while the latter can be changed. There must also be a provision to delete a person's identity when the person dies.
ISCON 2013 papers were classified in two tracks: Track 1, Information Systems, and Track 2, Computer Networks. The organizers received 233 research papers in all, from academicians and industry professionals from India and abroad. Of these, 219 were valid submissions: 132 research papers in Track 1 and 87 in Track 2.
All submitted papers underwent a rigorous two-level review process. First, the papers were checked for relevance to the conference and then for plagiarism. After this level of review, 108 papers were shortlisted, and each was sent to two esteemed external reviewers from institutions of repute. After the second level of review, 67 papers were selected, of which 47 were in Track 1 and 20 in Track 2.
There were 45 registrations against these 67 accepted papers, and 36 papers were presented in seven sessions. In each session, the best paper was selected for an award sponsored by McGraw Hill Education.
The conference valedictory session was organized at 2.30 p.m. on 10th March. Prof. S N Singh, Chairperson, IEEE UP Section graced the occasion as the Chief Guest, and Prof. Jai Prakash, Vice Chancellor, GLA University, Mathura graced the occasion as the Chairperson.
Announcement
48th Annual Convention CSI 2013 Brochure Released
Visakhapatnam Chapter
The CSI Annual National Convention CSI 2013 is being organised by the Visakhapatnam Chapter in association with Visakhapatnam Steel Plant during 13th-15th Dec 2013 at Hotel Novotel, Visakhapatnam. The theme of the Annual Convention is "IT FOR EXCELLENCE". It will be held in Visakhapatnam for the first time in the history of CSI, since its inception 48 years ago in India.
To mark the beginning of the arrangements for the annual convention, a colourfully designed brochure giving various details of the Convention was released by Sri Umeshchandra, Director (Operations) and Past Chairman, Visakhapatnam Chapter, in the august presence of Mr. S Ramanathan, Hon. Secretary, CSI and Sri H R Mohan, President Elect, at a programme organised at Visakhapatnam Steel Plant on 16th Mar 2013. The central committee visited Visakhapatnam to review the facilities available for CSI-2013.
Speaking on the occasion, Mr. S Ramanathan expressed confidence that the Visakhapatnam Chapter of CSI, with the all-round support of the PSU giant Visakhapatnam Steel Plant, will make the Annual Convention the most memorable one in the history of CSI.
Sri H R Mohan, President Elect, said that the Visakhapatnam Chapter has proved its worth by organising several IT-related mega events very successfully, and that is the reason why the Visakhapatnam Chapter was chosen to conduct this prestigious National Convention for 2013.
Chair Sri C K Chand; Sri P Ramudu, Executive Director (Auto & IT); Vice Chair Sri KVSS Rajeswara Rao, GM (IT); Addl. Vice Chair Sri Suman Das, DGM (IT); Sri Paramata Satyanarayana, Convener of the Organising Committee; Sri G N Murthy, ED (Finance) & Chair, Finance Committee; Sri D N Rao, ED (Services) & Chair of the Convention Committee; and Dr. S R Gollapudi, Convener, Advisory Committee were present on this occasion.
To encourage innovation and indigenous development in the field of Information Technology, CSI has instituted awards for young IT professionals, entrepreneurs, and researchers who are attempting extraordinary feats in the field of computer science and technology by implementing IT projects for better delivery of services.
The CSI Awards for Young IT Professionals started their journey in the year 1999. Today they have attained remarkable height and visibility, becoming icons of excellence in IT applications for young IT professionals.
For the year 2012, a well-planned approach was adopted for the CSI National Young IT Professional Award, involving publicity through the regions and chapters. Each Regional Vice President provided guidance and support to the respective Regional YITP Convener in hosting the regional round at a host chapter. An announcement of the YITP awards was sent to all chapters, corporate members, institutional members, and IT companies, with announcements in CSI Communications and on the CSI website. This resulted in many nominations. After shortlisting the nominations, 40 teams comprising professionals from IT companies, technical institutes, entrepreneurs, and researchers participated at the regional level.
The CSI YITP Awards maintain absolute transparency in an objective and merit-based selection process. This year, a 2-tier selection process was used to select the Winners, Runners-up, and Special Mention. The regional rounds were conducted at the Kolkata, Ahmedabad, Bhilai, Bangalore, Nashik, and Chennai Chapters. With the support of the Regional Vice Presidents and Regional YITP Conveners, the regional round competition was successfully conducted in these six regions. Details of the regional round competition can be viewed on the CSI website under the CSI News section.
From each region, the winner and runner-up teams were invited to the final round of competition on 6th March 2013 at PSG College of Technology, Coimbatore. In total, 11 teams presented their projects to the selection committee. The selection committee members were Dr. Subramaniam, Past Chairman, CSI; Mr. John Milton, Robert Bosch; Ms. Pandi Selvi, Robert Bosch; Mr. Sebastian Christopher, CTS; and Mr. Isai Amudan, CTS. Mr. Bipin Mehta and Mr. Ranga Rajagopal coordinated the final round, which was supported by Mr. N Valliappan, Secretary, CSI Coimbatore Chapter.
The projects judged were the most outstanding technology projects of any kind completed within an organisation during the year 2011-12, where the project duration could be 2-3 years from the start date. The selection committee considered many factors in judging each project, such as criticality of IT usage, improvement of customer service, innovation, quality of management, and impact on the organization and society. It was a challenge for the selection committee to decide on the winners. The committee unanimously declared the Winner, Runner-up, and Special Mention Award winner as under:
The results of the national round were declared and the awards presented on 6th March 2013, on the auspicious occasion of the 48th CSI Foundation Day at PSG College of Technology. The chief guest for the award function was Dr. R Rudramoorthy, Principal, PSG College of Technology, who inaugurated the contest.
In the national round, the winner received Rs. 50,000, a trophy, and a certificate. The runner-up received Rs. 25,000, a trophy, and a certificate, while the team that received special recognition got Rs. 15,000.
The contest aimed to involve young IT professionals in the quest for innovation in IT and to provide them an opportunity to demonstrate their knowledge, professional prowess, and excellence in their profession.
CSI Report
CSI National Young IT Professional Awards – 2012
Bipin V Mehta* and S M F Pasha**
*Fellow, CSI; National Convener, YITP Awards  **Manager, CSI Headquarters
Mr. Bipin Mehta, Dr. R Rudramoorthy, Mr. Ranga Rajagopal and the award winners
Region / Result | Participant(s) | Organization | Project
VII / Winner | J. Jeminaa Asnoth Sylvia | Jerusalem College of Engineering | Voice-Activated Solar-Powered Wheel Chair
VI / Runner-Up | Tamal Dey, Abhra Pal, Lahari Sengupta | Centre for Development of Advanced Computing (C-DAC), Kolkata | Resham Darshan – A Machine Vision Solution for Colour Characterization of Silk Yarns
II / Special Mention | Rohit Dilip Bhosale, Kartik Girish Vyas, Kumar Aditya | Persistent Systems Ltd. | Viewer Engagement Analytics
Report on 48th CSI Foundation Day at TIFR, Mumbai
CSI celebrated its 48th Foundation Day on 6th March 2013 at TIFR, Mumbai. The event started with a welcome note by Mr. V L Mehta, Honorary Treasurer, CSI. Mr. M D Agrawal (Academic Committee Chairman, CSI) spoke on Challenges in Education and the Role of CSI, asserting that CSI can explore research opportunities through various collaborations.
The overview and objectives of Foundation Day were described by Prof. R K Shyamasundar. He spoke about the significance of 'computing power' and how other streams depend on it more and more each day.
A major highlight of the event was the CSI Founder Prof. R. Narasimhan Lecture, delivered by Mr. S D Shibulal, CEO & MD, Infosys. He emphasised the importance of ICT and how it is changing our day-to-day life. IT plays a pivotal role in bringing about the changes of globalization, though it is difficult to say which is leading which: globalization or IT. He also talked about the contributions and achievements of Infosys in the growth of society. Rear Admiral S P Lal, VSM, CSO (Tech), HQ WNC, Chief Guest of the event, highlighted the significance of IT in defence. Wars are no longer fought only on land, in the air, and on water; in the last couple of decades another mode of war has emerged, popularly known as cyber war. Rear Admiral Lal appreciated CSI's contribution to strengthening the Naval Command through its IT-enabled training programmes.
A panel discussion on 'Education and Research' was another attraction of the event. The session was concluded by Mr. Ravi Eppaturi, Chairman, Mumbai Chapter, and Mr. Dilip Ganeriwal, Vice Chairman, Mumbai Chapter, anchored the show very effectively.
Participants: Padmabhushan Dr. F C Kohli, Padmashri Prof. D N Phatak, Dr. Nirmal Jain, Prof. S P Mudur, and Padmashri Prof. P V S Rao (Panel Moderator)
Opening Remarks
At the outset, Panel Moderator Prof. P V S Rao reminded the audience that CSI is actually a year older than commonly believed; it is the successor to the All India Computer Users' Group (AICUG), which was formally started in Faridabad near New Delhi in 1964 (a few days after the sad demise of Pandit Jawaharlal Nehru) and renamed itself the Computer Society of India one year later, in 1965. He paid homage to Major General A Balasubrahmanian, the late Prof. Bishwajit Nag, and Mr. S R Thakur, who, along with himself and a few others, started the AICUG in 1964.
After welcoming and introducing the panellists, Prof. Rao stated that, the topic being IT Education and Research, the focus would be on leveraging India's progress in IT to accelerate the pace of national development and to develop human resources. Coverage would include IT education itself as well as the use of IT in education. IT research would necessarily include applied aspects (such as software engineering, computer-aided engineering, and so on), which facilitate and catalyse the development of IT. It would also include research as an end in itself (e.g. theoretical computer science). A question to be addressed is how best our competence can be leveraged to help in economic development, increasing exports, and growing national wealth.
Speaking about education in general, Dr. F C Kohli said that, going by population, India should have three to four times as many bright students as there are in the USA. On the other hand, the annual output is only seven or eight hundred PhDs, a number that does not even meet the (teaching) faculty requirements of the academic institutions already in the country, let alone the numbers needed for research and innovation. This gap needs to be bridged; about 50 colleges have been identified in the country which, with proper inputs, bright students, and trained faculty, can be expected to produce up to 35,000 world-class graduates annually; of these, 6,000 will go on to become PhDs (as against the current output of only 800).
Prof. D N Phatak emphasised that the need of the day is not merely to ensure that IT training happens on a scale that matches the very large numbers of graduates needed; it is most important to provide high-quality education. IIT Mumbai is addressing this by training teachers in their thousands in a tiered structure, so that they can in turn provide quality training to freshers. The trainee teachers are grouped at 40 to 50 widely distributed centres. IIT beams courses covering the full range of subjects online to these centres in the mornings. Pre-trained course coordinators are available at each centre for interaction with the trainee teachers; they also run tutorials and practical sessions in the afternoons.
Prof. S P Mudur spoke about the qualitative changes that have occurred over the three decades he has been teaching. He cited the teacher evaluation system prevalent in Canadian universities, which lets students assess the competence of their teachers. In this process, younger teachers are often graded higher than experienced seniors, mainly because older faculty find it difficult to keep up with the changes that are happening. Earlier, student-teacher interaction was restricted to face-to-face interaction, but it has been greatly enriched through social networking (YouTube, Facebook, Twitter, blogging, and so on). Massive Open Online Courses (MOOCs), such as those offered by MIT, are easily available to students worldwide from many reputed universities. Soon, it might be possible to take such courses from multiple universities even for credit. Blended learning (a combination of face-to-face and online learning) will become pervasive and important. Scaling will happen as large numbers of students are attracted by the high reputation of institutions offering MOOCs. In closing, he mentioned that for the next few years at least, the job situation will continue to be very good for students specialising in Science, Technology, Engineering, and Mathematics (STEM).
Talking about industry-academia interaction, Dr. Nirmal Jain emphasised that there are multiple ways of learning and interaction between the two. Often, there is a disconnect between the material taught in the universities and what industry really needs. Only constant interaction between educational institutions and industry can bring about a better match between course curricula and industry requirements. Fortunately, the pressures of competition are strongly motivating industry to interact closely with academic institutions in the hope of gaining a competitive edge by leveraging the innovations that happen there. It is up to industry to make many more such collaborations happen.
Intra-panel interaction
During the subsequent interaction between panellists, Dr. Kohli pointed out that even in today's context of the ever-increasing prevalence of social networking, face-to-face, person-to-person interaction continues to be crucial. Agreeing, Prof. Phatak said the idea is to start (innovative modern methods) in a small way initially and scale up as the process succeeds and proves itself. Dr. Jain remarked that as we grow older, rather than just doing our jobs, all of us become increasingly interested in and involved with aspects relating to teaching and training; this highlights how important these issues are.
Audience interaction with the Panel
Question: Hands-on experience (as in internships and hospital assignments for medical students) is very important even during IT learning.
Prof. Rao: This is true; it happens in many areas, such as the legal profession and journalism. It can happen via student internships in industry, by having adjunct professors (with practical experience) from industry, by two-way movement of people at senior levels between industry and educational institutions, and so on.
Q: Several specific courses needed by students may not be available online.
Prof. Phatak: Today, MOOCs are wide-ranging, up-to-date, and meaningful.
Q: In many cases, teachers lack passion.
Prof. Phatak: Passion is contagious. It can and does spread downwards (from teacher to student) as well as upwards (from student to teacher). Not just the courses but also the examination pattern is important. It is essential that students are properly tested to assess how well they have assimilated what has been taught. Hence, teaching and evaluation have to be done by the same person.
Q: Given that students today can access good courses online, classroom attendance should not be compulsory as at present; it should be optional.
Prof. Mudur: Attendance is optional in Canadian and other universities. However, students must do their classroom assignments. They are assessed on these and on their performance in tests.
Q: There are three aspects to testing: learning, testing, and remedy (adaptations and corrections to existing methods to take care of deficiencies in the teaching and/or the learning).
Prof. Phatak: To facilitate this, it is best to have small classes. Testing should happen while teaching, so as to facilitate on the spot adaptation of teaching methods as needed.
Q: Established supervisory institutions such as AICTE resist change; there is also the problem of a lack of political will to bring about change.
Prof. Phatak: Things will change; they have to, as otherwise the system will collapse.
Q: What is the overall standard of on-line education? How are open universities such as IGNOU faring?
Prof. Phatak: These are means for taking education to large numbers. IGNOU is doing well.
CSI Report
Dr. P V S Rao, Fellow and Past President of CSI
CSI Foundation Day – Panel Discussion on IT Education and Research
CSI News
From CSI Chapters »
Please check detailed news at: http://www.csi-india.org/web/guest/csic-chapters-sbs-news
SPEAKER(S) TOPIC AND GIST
GURGAON (REGION I)
Mr. Vivek Varshney, Mr. R K Vyas, Prof. M N Hoda, Prof. D K Lobiyal, Prof. S K Muttoo and Prof. Jitender Kumar
2 March 2013: CSI Regional-Level Student Project Contest 2013
The contest aimed at involving students in IT innovation and providing them an opportunity to demonstrate projects with strong social relevance. The first prize went to Mr. Surya Mani Sharma from DCE, Gurgaon for the project "Multifunctional Robotic System". The second prize, for "Wear Your World", went to Ms. Monica Bansal and Mr. Deepak Kumar. The third-placed project, "Android Application", was demonstrated by Mr. Shashank Sharma and Mr. Paras from BVICAM, New Delhi.
Hon'ble Principal giving a trophy to the Chief Guest
KANPUR (REGION I)
Dr. H C Karnick, Dr. Brijendra Singh, Dr. Phalguni Gupta, Dr. Alok Tiwari and Dr. Raghuraj Singh
9 March 2013: National Seminar on “Issues and Challenges of Computer Science & Engineering as a Discipline”
The seminar was jointly organized with the Dept. of Computer Science & Engineering, Harcourt Butler Technological Institute. A souvenir containing abstracts of invited lectures, expert views of academicians and articles on the seminar theme was released on the occasion. The CSI Kanpur Chapter website, http://www.csi-kanpur.org, was launched during the seminar.
Guests while releasing the Souvenir
LUCKNOW (REGION I)
Mr. Amit Khanna and Prof. Bharat Bhaskar
15 March 2013: Technical Session on "nComputing"
The session was organized at the NIEIT, Lucknow centre in association with nComputing and M/s M Intergraph, and was attended by more than 50 participants. During the presentation, Mr. Amit Khanna explained the benefits, usage and other details of the product. Prof. Bharat Bhaskar, IIM Lucknow & Chairman of CSI Lucknow Chapter, introduced the session.
Mr. Amit Khanna, Director, nComputing, during his presentation
HYDERABAD (REGION V)
Dr. Pratap Reddy
2 March 2013: Event titled "CHALLENGE EXPO-13"
Participants demonstrated project exhibits and presented posters during this event. Around 70 projects were exhibited and 10 posters were presented. The event was organized under the guidance of Dr. Pratap Reddy, who presided as Chief Guest and Judge. Winners were given cash prizes and participation certificates. Details of the event, along with the process of conducting it and photographs, can be found at http://www.dprec.ac.in/challengeexpo13.html.
Organizers and participants of the event
SPEAKER(S) TOPIC AND GIST
VISAKHAPATNAM (REGION V)
Mr. Ganta Srinath Reddy
9 January 2013: Guest Lecture on "Android-based Application Development"
The objective of the lecture was to cover the basics of Android application development, testing and deployment. The program offered enough depth to enable attendees to set up an application development environment, then test and deploy applications for use. It motivated students to seek further information on the subject and develop their own projects.
Speaker delivering lecture.
Mr. G Santosh Kumar, Mr. Ganta Srinath Reddy, and
Mr. Krishna Vattipalli
1-2 February 2013: Southern Regional Conference on “Innovative Technologies (SRCIT)” 2012-13
Mr. Santosh Kumar covered various areas of hacking and the respective countermeasures and preparedness, including ethical hacking, major vulnerabilities, basic protection mechanisms, penetration testing techniques and the importance of their findings. Mr. Reddy gave an introduction to Android programming and deployment with a hands-on demonstration. Mr. Vattipalli explained deployment and related development on Android and Google's cloud platform, GAE (Google App Engine).
Inaugural Program for SRCIT VIZAG-2013
NASHIK (REGION VI)
Prof. Pradeep Pendse, Mr. Ajit Jagtap, Mr. Hussain Dahodwala, Mr. Sagar Javkhedkar, Mr. K Rajeev, Mr. Shashank Todwal, Mr. Satish Babu, Mr. Mahesh Bhat, Mr. Vinay Hinge, and Mr. Sunil Khandbahale
8-9 February 2013: Western Region Conference on “NextGen Computing”
The event was organized jointly with Sandip Polytechnic. Hon. Shivajirao Patil was felicitated with the "Yashokirtee" Puraskar. Technical sessions included "Bring Your Own Device" by Dr. Pendse, "Translation & MKCL Supercampus" by Mr. Ajit Jagtap and "Data Centers" by Mr. Dahodwala. Mr. Javkhedkar delivered a talk on mobile applications and Mr. Rajeev spoke on "Cloud Computing". Other sessions were "Google Apps" by Mr. Todwal, "Free and Open Source Software (FOSS)" by Mr. Babu, "Cloud Security" by Mr. Bhat, "Big Data" by Mr. Hinge and "Inclusive Innovations: A Case Study of a Language Dictionary" by Mr. Khandbahale.
(L to R:) Principal Gandhe, Principal Tate, Mr. Chandrashekhar Sahasrabuddhe, Mr. Avinash Shirode, Hon. Shivajirao Patil, Mr. Ashok Kataria, Ms. Mohini Patil, Principal Prashant Patil and Shri Shrikant Karode
TRIVANDRUM (REGION VII)
Mr. Suneeth Natarajan
16 February 2013: One-day Workshop on "Six Sigma Methodology"
The workshop provided an overview of adopting best practices from the Six Sigma methodology. The content covered topics such as evaluating organization performance, creating a culture of shared responsibility to drive performance, identifying and defining key areas for improvement, setting improvement goals and targets, finding sponsorship, creating a metrics-driven organization, analyzing root causes, and piloting and implementing change.
Resource person conducting the workshop
From Student Branches » http://www.csi-india.org/web/guest/csic-chapters-sbs-news
SPEAKER(S) TOPIC AND GIST
AES INSTITUTE OF COMPUTER STUDIES (AESICS), AHMEDABAD (REGION-I)
Dr. Vikram Parmar and Dr. Neeraj Sonalkar
23 January 2013: Seminar on "Venture Studio – Centre for Innovative Business Design"
Dr. Parmar explained how Venture Studio aims to nucleate an ecosystem of innovation that accelerates regional economic development. Dr. Sonalkar explained the venture design process and how to work in teams to identify critical market needs, generate and prototype novel solutions, and develop business models to launch scalable businesses that satisfy such needs.
Dr. Parmar encouraged students to think out of the box and cultivate new ideas and solutions for solving societal and industrial problems
Mr. Sunil Gulabani
25 January 2013: Seminar on "Cloud Computing - Industry Case Studies"
Mr. Gulabani started with cloud computing basics and the various services provided by different cloud providers. He then shared cloud computing case studies using the Amazon Web Services (AWS) cloud, Red Hat OpenShift, Google App Engine and Tumblr, gave a live demo, and explained the technical architecture of cloud application development using Eclipse and cloud APIs. He also suggested innovative project ideas, including a Google Wallet application, a geo-based social networking/taxi service, a fight-back application and mobile e-learning, among others.
Mr. Sunil Gulabani, IndiaNic Infotech Pvt. Ltd., shared his industry experience on cloud computing during the seminar
Shri Hemant Sonawala
16 February 2013: Lecture on "Current and Emerging Trends in ICT, Employment Opportunities and Benefits of Professional Society Membership"
Mr. Sonawala emphasized information sharing as a means of increasing knowledge. He advised students to use technology for betterment instead of misusing it, and noted that because technology changes rapidly, students should aim for expertise in applications rather than in tools. He also made students aware of their social responsibilities. Certificates and trophies were awarded to students for outstanding performance in academics and for the best System Development Projects.
Mrs. Hemal Desai, Shri Hemant Sonawala and Prof. Bipin Mehta
SARDAR VALLABHBHAI PATEL INSTITUTE OF TECHNOLOGY (SVIT), VASAD, GUJARAT (REGION-III)
Dr. Varang Acharya and Mrs. Bharti Trivedi
9 February 2013: Inter-college Annual Fest "SAKSHAM '13: Carve Your Niche"
The inter-college annual fest covered various technical and online events along with a seminar. Dr. Acharya was the Chief Guest and Mrs. Bharti Trivedi was the Guest of Honour for the inauguration ceremony. The SVIT Student Branch also launched its website, http://csi.svitvasad.ac.in, as well as a website for SAKSHAM, to allow members and students to interact. It also tied up with Bachpan, an NGO that visits slum kids and teaches them at home.
Prof. Hetal Bhavsar (SBC), Dr. V R Panchal (Principal), Dr. Varang Acharya (Chief Guest), Mrs. Bharti Trivedi (Guest of Honour), Mrs. Bijal Talait (HOD of CE Dept), Prof. Sameer Chauhan (HOD of IT Dept) and Prof. Sohail Pandya (HOD of MCA Dept)
SPEAKER(S) TOPIC AND GIST
KLE INSTITUTE OF TECHNOLOGY (KLEIT), HUBLI, KARNATAKA (REGION-V)
Mr. Arunkumar M Khannur
15 February 2013: One-day Workshop on "Software Testing and Career Avenues"
Mr. Khannur spoke on software structure and software testing basics, and then gave a brief introduction to basic black-box testing techniques. Guidance on career aspects in software testing was also provided. The workshop concluded by felicitating Mr. Khannur and distributing certificates to students.
Resource persons and organisers of workshop
VASAVI COLLEGE OF ENGINEERING (VCE), HYDERABAD (REGION-V)
10 January 2013: Mini Project Competition
The main objective of the competition was to motivate students to work on mini projects and develop quality, usable applications. The competition also helped enhance their presentation skills, as they had to present their projects on a poster. Six teams of two members each participated; one team was awarded a merit certificate and participation certificates were distributed to the others.
Students presenting their projects on posters
VASAVI COLLEGE OF ENGINEERING (VCE), HYDERABAD (REGION-V)
Mr. Vengal Reddy and Dr. P Radhakrishna
10-11 & 17 February 2013: Series of Guest Lectures on "DWDM - Data Warehousing and Data Mining"
Mr. Vengal Reddy spoke on "Data Mining" and "Latest Trends in DWDM", and Dr. P Radhakrishna spoke on developing a research attitude among students towards ever-expanding Big Data. Mr. Reddy also discussed various fields of application and suggested topics for research.
Mr. Vengal Reddy, Product Technical Architect, Infosys Technologies, conducting the lecture
VITS COLLEGE OF ENGINEERING, VISAKHAPATNAM (REGION-V)
Dr. B Muralikrishna (Principal), Mr. B Narendra, Prof. G Rajasekharam, Mr. B Ravichandra, Prof. K Shankar, and Mr. A Ramkumar
26 February 2013: Computer Awareness Camp
VITS College of Engineering, in collaboration with the VITS CSI student chapter, conducted an NSS program at Mamidilova village, Sontyam, Anandapuram Mandalam, Visakhapatnam. Students of Mandala Prajaparishat Primary School were taught the basics of computers and were provided with computer basics material.
Students of VITS and School students
Dr. Valli Kumari and Principal Dr. B Murali Krishna
27 February 2013: One-day Workshop on "Recent Trends in Embedded Systems & Soft Computing Techniques"
Dr. Valli Kumari explained various methodologies and modalities in the field of image processing. She provided detailed insight into narrowing the gap between low-level image features and human interpretation of an image, and explained basic concepts of soft computing techniques for image processing applications.
Honoring the guest, Dr. Valli Kumari
Please send your event news to csic@csi-india.org. Low-resolution photos and news without a gist will not be published. Please send only one photo per event, not more. Kindly note that only news received on or before the 20th of a month will be considered for publication in the CSIC of the following month.
SPEAKER(S) TOPIC AND GIST
SRINIVASA RAMANUJAN CENTER, SASTRA UNIVERSITY, KUMBHAKONAM, TAMILNADU (REGION-VII)
Mr. Amit Grover and Mr. Siddharth Goyal
16-17 February 2013: Two-day Workshop on "Web-Entrepreneur"
The resource persons spoke about how to generate ideas to become a web entrepreneur, how to work with CSS, WordPress and CMSs, and various other topics. To motivate participants, a competition was conducted and the top three teams were awarded certificates of achievement along with entry to the national-level Tech Hunt 2013.
Certificate Distribution
SREE BUDDHA COLLEGE OF ENGINEERING (SBCE), ALAPPUZHA, KERALA (REGION-VII)
Mr. Subhash E P
15 February 2013: One-day Workshop on "Android"
Mr. Subhash is an official trainer of Oracle University, delivering Live Virtual Classes to Oracle customers in North America, and a former official trainer of Borland Corporation for its ALM suite of products. He gave a comprehensive description of the Android operating system and its features, and demonstrated how to develop applications on the Android platform.
Mr. Subhash E P during the workshop
The following new Student Branches were opened, as detailed below:
REGION I JRE group of Institutions, Greater Noida
JRE-School of Engineering inaugurated a CSI Student Branch and the JRE SOE Project Center on 16th February, 2013. A technical talk on "RoCK-BEE: Robotics Competition Knowledge Based Education in Engineering" by Prof. Saha and a lecture on "Soft Computing and Its Applications" by Prof. M M Sufyan Beg were organized on the occasion.
REGION V Kakinada Institute of Engineering & Technology (KIET)-II, Kakinada
On 12th February, 2013, a CSI student branch was inaugurated and a seminar on "Cloud Computing" was organized. Dr. C V S Murty delivered a talk on research methodologies, and Mr. Sekhar Kammula spoke about a social activity named 'I CARE I REACT'. Mr. Naganand Rapaka spoke about cloud computing technologies.
REGION VI G. H. Raisoni Institute of Information Technology, Nagpur
On the occasion of the inauguration of the CSI student branch on 6th February, 2013, a motivational speech was given by Mr. Pratap Shukla on "Journey from Learning Software Programmer to Expert Solution Architect". Other topics covered in the inauguration seminar were programming introductory tools, beginner programmer tools, language developer tools, and solution architect tools.
REGION VII E.G.S. Pillay Engineering College, Nagore, Nagapattinam
A CSI Student Branch was launched at E.G.S. Pillay Engineering College on March 05, 2013. Participating in the event as the Chief Guest, Mr. Ramasamy, Regional Vice President, explained the activities to be taken up by the branch and techniques that would help students face examinations and interviews successfully.
REGION VII Sethu Institute of Technology, Sivakasi
The CSI student branch inauguration was organized on February 27, 2013. Mr. Ramasamy, Vice Chairman of the CSI Chennai Chapter, gave the inaugural address and highlighted opportunities for professional development.
Announcement
H R Mohan* and S. Ramanathan**
*Chair (Conference Committee) **Hon. Secretary
Call for Proposals for Events (2013-14)

Preamble
As a technical and professional association, the Computer Society of India has the mission of sharing knowledge, enhancing competency, promoting research, aiding education, and providing career enhancement opportunities for its individual and institutional stakeholders and partners.
CSI intends to conduct different types of events during the year through its Chapters, Divisions, SIGs, Member Institutions, Partnering Organizations, and international bodies such as IFIP and SEARCC. These events are intended to achieve one or more of the following outcomes:
• Publish peer-reviewed papers on computing, IT, ICTs, and related domains
• Provide enhanced awareness of new technologies
• Upgrade the skills of participants through direct, hands-on exposure to the state of the art
• Share the output of research programmes
• Enhance employability, especially for new professionals
• Introduce CSI to new geographies and domains
• Provide platforms for exposing new technologies, products, and concepts
• Provide enhanced career opportunities to members
• Provide forums where socially relevant technology pilots and programmes can be taken up
• Strengthen the Organizational Units of CSI (such as Chapters, Student Branches etc.)
• Enhance the reach, penetration, and membership of CSI
• Provide opportunities for individual and institutional members to convene programmes that broadly benefit a cross-section of society
• Provide positions and perspectives on ICT/IT issues of national importance and relevance
• Aid in bringing the benefits of Information and Communications Technologies (ICTs) to all citizens of the country
Programme Proposals
Pursuant to the above mission and objectives, CSI invites proposals from Chapters, Student Branches, and Institutional as well as individual members, for different kinds of international, national, regional, and state-level events for the year 2013-14, including, but not limited to:
• Technical Conferences (national/international)
• Seminars
• Workshops
• Research Symposia
• Faculty Development Programmes
• Job Fairs
• Exhibitions
• IT initiation programmes for schools
• Quiz programmes
• Student conventions for college and school students
• Pilot programmes
These events will be organized by Chapters, Student Branches or other organizational units of CSI, and supported by the entire CSI ecosystem, viz.,
• CSI Headquarters
• CSI Educational Directorate
• RVPs & Divisional Chairs
• National, Regional and State-level Student Co-ordinators and other regional staff
• SIG Chairs (where appropriate)
• Chapters and Student Branches
• Partnering organizations (e.g., IEEE Computer Society, CDAC, PMI)
• Associated international/national organizations (e.g., IFIP, SEARCC, IE, IETE, ISA)
Support from the state and central governments, trade associations, and business organizations can also be sought for these events.
All technical content generated through these events is expected to be hosted on CSI's Knowledge Management portal, in the form of a full-text-searchable online digital repository. CSI also proposes to recognize events through awards in different categories, based on parameters such as the quality of the event, participation, and surpluses generated.
Proposal Guidelines
Kindly apply to events@csi-india.org with the following particulars on or before 31 May 2013:
• Title of the event
• Type of event (e.g., Seminar/Workshop/Conference)
• Hosting Unit(s) (e.g., Chapter/Student Branch/Region/Divisions/SIGs)
• Duration
• Location
• Topics and outline of the event programmes
• Proposed benefits to CSI and members
• Potential Partners and Sponsors
• Target audience and size
• Preliminary Budget (Revenue and Surplus)
Kindly note that these events are expected to adhere to the provisions of the CSI Conferences Manual (please refer to the CSI website) as applicable.
CSI proposes to recognize the contribution of members and institutions in organizing events in the form of awards. The criteria and categories of awards will be announced soon.
Dear CSI Member -
Your hard copy of the CSI Communications magazine is sent to the address you have provided to CSI. Please ensure that this address is correct and up to date.
In case you need any help from CSI, please write an email to [email protected] for assistance.
You may send your feedback and comments on the contents of CSI Communications - Knowledge Digest for IT Community.
- On behalf of the editors of CSI Communications.
CSI 2013
48th Annual Convention of the Computer Society of India
Hosted by: CSI Visakhapatnam Chapter
In association with Visakhapatnam Steel Plant
Theme: ICT and Critical Infrastructure
Dates: 13-15 December 2013
Venue: Hotel Novotel, Visakhapatnam
Call For Papers / Participation
Introduction: CSI-2013, the 48th Annual Convention of the Computer Society of India (CSI), is being organized by the CSI Visakhapatnam Chapter, in association with Rashtriya Ispat Nigam Limited, Visakhapatnam Steel Plant. It will bring together researchers, engineers, developers and practitioners from academia and industry working in all interdisciplinary areas of information system engineering and computing, along with innovative IT professionals from government establishments, small, medium and big enterprises, non-government organizations and multinational companies, to share experience, exchange ideas and update their knowledge of the latest developments in emerging areas. Following the success of previous conventions, the CSI Visakhapatnam Chapter is set to conduct its first Annual Convention at Visakhapatnam; CSI-2013 will serve as a forum for discussions on state-of-the-art research, development and implementations of ICT applications.
The progress and growth of any country depend on its infrastructure, and ICT, having become pervasive, has a crucial role in managing that infrastructure. Keeping this in mind, the theme for CSI-2013 has been selected as ICT and Critical Infrastructure. The deliberations will focus on this aspect and cover innovative ways to deliver business value, optimize business processes and enable inclusive growth. They will also focus on proven IT governance, standards, practices, design and tools that lead to fast development and information flow to the user.
Invitation: We invite authors to submit papers reflecting original research work and practical experiences in the areas of interest to the convention. CEOs/CIOs, IT professionals, IT users, academicians, researchers, students, and members of CSI are invited to attend the convention as delegates. Software firms, industries and business houses are invited to participate in the convention and to present and exhibit their products and services. CSI-2013 invites papers of original research pertaining to 'ICT and Critical Infrastructure' on the following topics (but not limited to):
* ICT use in Critical Infrastructure (CI) * Security Challenges in using ICT in CI * Wireless and Mobility technologies in the Control Loop * ICT in Steel Industry * ICT in Heavy & Manufacturing Industry * ICT in Process Industry * ICT in BFSI * ICT in Transportation * ICT in Education * ICT in Telecom * ICT in Healthcare * ICT in E-Commerce * ICT in Maritime - Navy, Ship Building, Ocean Understanding * ICT in Rural Areas * ICT in eGovernance * Programming Paradigms for CI * Designing Applications for CI * ICT and Cyber-Physical Systems in Coastal Areas * Role of OTT in CI * Synergistic Policy Framework for ICT and CI * Coexistence of OTT, Cloud, and Social Networks * Big Data Analysis and CI * Machine Intelligence * Soft Computing Applications * AI and Nano Computing * Geoinformatics and Environment * Bio-informatics * Software Engineering * IT Security, Forensics and Cyber Crime
We also invite proposals for workshops, pre-conference tutorials and doctoral consortium.
Publication: Prospective authors are invited to submit paper(s) not exceeding 8 pages, written in A4 size as per the AISC, Springer format, on any one of the tracks listed above. The proceedings will be published in the AISC series of Springer.
Important Dates:
Address for Communication
Paramata Satyanarayana
Convener, CSI-2013
Sr. Manager, Central Computer Center
Visakhapatnam Steel Plant, Visakhapatnam – 530 031
Mobile: +91 9949556989
Email: [email protected]
Organizing Committee Chair: Sri T K Chand, D(C), RINL
Programme Committee Chair: Prof. P S Avadhani, AUCE (A)
Finance Committee Chair: Sri G N Murthy, ED (F&A), VSP
Registered with Registrar of News Papers for India - RNI 31668/78. Regd. No. MH/MR/N/222/MBI/12-14. If undelivered, return to: Samruddhi Venture Park, Unit No. 3, 4th floor, MIDC, Andheri (E), Mumbai-400 093. Posting Date: 10 & 11 every month. Posted at Patrika Channel Mumbai-I. Date of Publication: 10 & 11 every month.
Submission of Full Manuscript 15 – July – 2013
Notification of Acceptance 15 – Aug – 2013
Camera ready copy 31 – Aug – 2013
Submission of Tutorial/Workshop Proposals 30 – July – 2013
Registration Starts 31 – Aug – 2013
Paper Submission for CSI-2013: https://www.easychair.org/conferences/?conf=csi2013
For more details please visit http://www.csi-2013.org