CSI Communications | April 2013 | www.csi-india.org
ISSN 0970-647X | Volume No. 37 | Issue No. 1 | April 2013 | ₹ 50/-
Cover Story: Big Data Systems: Past, Present & (possibly) Future (p. 7)
Technical Trend: Socio-Business Intelligence using Big Data (p. 11)
Research Front: Big Data Enabled Digital Oil Field (p. 17)
CIO Perspective: Deriving Operational Benefits from High Velocity Data (p. 28)
Security Corner: Information Security >> Advanced Persistent Threats (APT) and India (p. 33)
Practitioner Workbench: Programming.Learn("R") (p. 27)
Mumbai, India (1 April 2013)—Two major
IT organizations in India have signed a
memorandum of understanding to benefit
IT professionals throughout the country.
The Computer Society of India (CSI) and
ISACA yesterday signed an agreement
that allows for mutual collaboration and
knowledge sharing for the benefit of the
profession.
ISACA is a global association of 100,000
IT professionals who help enterprises
ensure trust in, and value from, their
information and systems. CSI has more
than 100,000 members and 70 chapters
in India.
The MoU, signed by ISACA Director,
John Ho Chi and CSI President Prof. S V
Raghavan, notes that the organizations
will “advance the global IT profession in
India, and the professional standing of
ISACA and CSI members” by:
• Strengthening the relationship
among ISACA and CSI chapters in
India
• Increasing awareness, use and
adoption of the COBIT framework by
CSI members
• Providing standard-setters,
regulators and legislators with access
to best practices, credentials and
educational opportunities offered by
CSI and ISACA
• Conducting joint educational events
and research projects related to
information systems governance,
security, audit, and assurance issues
in India
“ISACA is pleased to collaborate with CSI
on this important mission: to promote
information systems governance, security,
and assurance in India, and to advance
the IT profession,” said Avinash Kadam,
advisor to ISACA’s India Task Force.
The CSI President expressed
confidence that this collaboration will grow
beyond cooperation between CSI and
ISACA members, and will strengthen
interaction among academia, business, and
industry. CSI members' immediate gain
will be access to ISACA's continuing
professional development programme,
publications, and learning resources.
As a result of the MoU, ISACA will waive
its new-member fee for CSI members who
wish to join ISACA. CSI members will also
receive a discount on ISACA's CISA, CISM,
CGEIT, and CRISC certification exams.
For more information on ISACA, visit
www.isaca.org. To learn more about CSI,
visit www.csi-india.org.
About CSI
Established in 1965, the CSI is India's first
and largest non-profit organization
in the areas of information processing,
computers, and communications. The
mission of the CSI is to facilitate research,
knowledge sharing, learning, and career
enhancement for all categories of IT
professionals, while simultaneously
inspiring and nurturing new entrants
into the industry and helping them to
integrate into the IT community. The CSI
is also working closely with other industry
associations, government bodies, and
academia to ensure that the benefits of
IT advancement ultimately percolate
down to every single citizen of India. The
CSI currently represents over 1,00,000
members affiliated to 73 CSI professional
chapters and about 562 CSI member
institutions (including 499 CSI student
branches) in different states and regions
of India.
About ISACA
With more than 100,000 constituents in
180 countries, ISACA® (www.isaca.org) is
a leading global provider of knowledge,
certifications, community, advocacy,
and education on information systems
(IS) assurance and security, enterprise
governance and management of IT, and
IT-related risk and compliance. Founded
in 1969, the nonprofit, independent ISACA
hosts international conferences, publishes
the ISACA® Journal, and develops
international IS auditing and control
standards, which help its constituents
ensure trust in, and value from, information
systems. It also advances and attests
IT skills and knowledge through the
globally respected Certified Information
Systems Auditor® (CISA®), Certified
Information Security Manager® (CISM®),
Certified in the Governance of Enterprise
IT® (CGEIT®), and Certified in Risk and
Information Systems Control™ (CRISC™)
designations.
ISACA continually updates and expands
the practical guidance and product family
based on the COBIT® framework. COBIT
helps IT professionals and enterprise
leaders fulfill their IT governance and
management responsibilities, particularly
in the areas of assurance, security, risk and
control, and deliver value to the business.
Participate in the ISACA Knowledge Center: www.isaca.org/knowledge-center
Follow ISACA on Twitter: https://twitter.com/ISACANews
Join ISACA on LinkedIn: ISACA (Official), http://linkd.in/ISACAOfficial
Like ISACA on Facebook: www.facebook.com/ISACAHQ
Contact:
Faizan Aboli, Ketchum Sampark, +91-22-4042 5518, [email protected]
Kristen Kessinger, ISACA, +1.847.660.5512, [email protected]
ISACA and CSI Sign Memorandum of Understanding in Mumbai to Advance the IT Profession
Contents
Volume No. 37 • Issue No. 1 • April 2013
CSI Communications
Please note:
CSI Communications is published by Computer
Society of India, a non-profit organization.
Views and opinions expressed in the CSI
Communications are those of individual authors,
contributors and advertisers and they may
differ from policies and official statements of
CSI. These should not be construed as legal or
professional advice. The CSI, the publisher, the
editors and the contributors are not responsible
for any decisions taken by readers on the basis of
these views and opinions.
Although every care is being taken to ensure
genuineness of the writings in this publication,
CSI Communications does not attest to the
originality of the respective authors' content.
© 2012 CSI. All rights reserved.
Instructors are permitted to photocopy isolated
articles for non-commercial classroom use
without fee. For any other copying, reprint or
republication, permission must be obtained
in writing from the Society. Copying for other
than personal use or internal reference, or of
articles or columns not owned by the Society
without explicit permission of the Society or the
copyright owner is strictly prohibited.
Published by Suchit Gogwekar for Computer Society of India at Unit No. 3, 4th Floor, Samruddhi Venture Park, MIDC, Andheri (E), Mumbai-400 093.
Tel.: 022-2926 1700 • Fax: 022-2830 2133 • Email: [email protected] • Printed at GP Offset Pvt. Ltd., Mumbai 400 059.
Editorial Board
Chief Editor: Dr. R M Sonar
Editors: Dr. Debasish Jana, Dr. Achuthsankar Nair
Resident Editor: Mrs. Jayshree Dhere
Published by: Mr. Suchit Gogwekar, Executive Secretary, for Computer Society of India
Design, Print and Dispatch by: CyberMedia Services Limited
Cover Story
7 Big Data Systems: Past, Present &
(possibly) Future
Dr. Milind Bhandarkar
9 Big Data – A Big game changer
Shailesh Kumar Shivakumar
Technical Trends
11 Socio-Business Intelligence Using
Big Data
Gautam Shroff, Lipika Dey & Puneet Agarwal
Research Front
17 Big Data Enabled Digital Oil Field
Pramod Taneja and Prashant Wate
Articles
19 Big Data
A Kavitha, S Suseela and G Kapilya
20 Adoption of In-Memory Analytics
Jyotiranjan Hota
24 Five Key Knowledge Areas for Risk
Managers
Avinash Kadam
Practitioner Workbench
26 Programming.Tips() » Python: Programming Language
for Everyone
Dr. Nibaran Das
27 Programming.Learn("R") » R- StaR of Statisticians
Umesh P and Silpa Bhaskaran
CIO Perspective
28 Deriving Operational Insights from
High Velocity Data
Bipin Patwardhan and Sanghamitra Mitra
Security Corner
33 Information Security »
Advanced Persistent Threat
(APT) and India
Adv. Prashant Mali
34 IT Act 2000 »
Prof. IT Law Demystifies Technology
Law Issues, Issue No. 13
Mr. Subramaniam Vutha
PLUS
IT.Yesterday() Biji C L 35
Brain Teaser Dr. Debasish Jana 37
Ask an Expert Dr. Debasish Jana 38
Happenings@ICT: ICT News Briefs in March 2013 H R Mohan 39
CSI Reports:
M. Gnanasekaran 40
Prof. Prashant R Nair, Mr. Ranga Rajagopal and Dr. Rajveer S Shekhawat 41
Sanjay Mohapatra & Prof. Ratchita Mishra 42
Dr. Dilip Kumar Sharma, Mr. Sanjay Mohapatra and Mr. R K Vyas 43
Bipin V Mehta and S M F Pasha 44
Dr PVS Rao 45
CSI News 46
Important Contact Details »
For queries and correspondence regarding membership, contact [email protected]
Know Your CSI
Executive Committee (2013-14/15) »
President: Prof. S V Raghavan, [email protected]
Vice-President: Mr. H R Mohan, [email protected]
Hon. Secretary: Mr. S, [email protected]
Hon. Treasurer: Mr. Ranga Rajagopal, [email protected]
Immd. Past President: Mr. Satish Babu, [email protected]
Nomination Committee (2013-2014)
Prof. H R Vishwakarma, Dr. Ratan Datta, Dr. Anil Kumar Saini
Regional Vice-Presidents
Region I: Mr. R K Vyas, [email protected] (Delhi, Punjab, Haryana, Himachal Pradesh, Jammu & Kashmir, Uttar Pradesh, Uttaranchal and other areas in Northern India)
Region II: Prof. Dipti Prasad Mukherjee, [email protected] (Assam, Bihar, West Bengal, North Eastern States and other areas in East & North East India)
Region III: Prof. R P Soni, [email protected] (Gujarat, Madhya Pradesh, Rajasthan and other areas in Western India)
Region IV: Mr. Sanjeev Kumar, [email protected] (Jharkhand, Chattisgarh, Orissa and other areas in Central & South Eastern India)
Region V: Mr. Raju L Kanchibhotla, [email protected] (Karnataka and Andhra Pradesh)
Region VI: Mr. C G Sahasrabudhe, [email protected] (Maharashtra and Goa)
Region VII: Mr. S P Soman, [email protected] (Tamil Nadu, Pondicherry, Andaman and Nicobar, Kerala, Lakshadweep)
Region VIII: Mr. Pramit Makoday, [email protected] (International Members)
Division Chairpersons
Division I, Hardware (2013-15): Prof. M N Hoda, [email protected]
Division II, Software (2012-14): Dr. T V Gopal, [email protected]
Division III, Applications (2013-15): Dr. A K Nayak, [email protected]
Division IV, Communications (2012-14): Mr. Sanjay Mohapatra, [email protected]
Division V, Education and Research (2013-15): Dr. Anirban Basu, [email protected]
Important links on CSI website »
About CSI: http://www.csi-india.org/about-csi
Structure and Organisation: http://www.csi-india.org/web/guest/structureandorganisation
Executive Committee: http://www.csi-india.org/executive-committee
Nomination Committee: http://www.csi-india.org/web/guest/nominations-committee
Statutory Committees: http://www.csi-india.org/web/guest/statutory-committees
Who's Who: http://www.csi-india.org/web/guest/who-s-who
CSI Fellows: http://www.csi-india.org/web/guest/csi-fellows
National, Regional & State Student Coordinators: http://www.csi-india.org/web/guest/104
Collaborations: http://www.csi-india.org/web/guest/collaborations
Distinguished Speakers: http://www.csi-india.org/distinguished-speakers
Divisions: http://www.csi-india.org/web/guest/divisions
Regions: http://www.csi-india.org/web/guest/regions1
Chapters: http://www.csi-india.org/web/guest/chapters
Policy Guidelines: http://www.csi-india.org/web/guest/policy-guidelines
Student Branches: http://www.csi-india.org/web/guest/student-branches
Membership Services: http://www.csi-india.org/web/guest/membership-service
Upcoming Events: http://www.csi-india.org/web/guest/upcoming-events
Publications: http://www.csi-india.org/web/guest/publications
Student's Corner: http://www.csi-india.org/web/education-directorate/student-s-corner
CSI Awards: http://www.csi-india.org/web/guest/csi-awards
CSI Certification: http://www.csi-india.org/web/guest/csi-certification
Upcoming Webinars: http://www.csi-india.org/web/guest/upcoming-webinars
About Membership: http://www.csi-india.org/web/guest/about-membership
Why Join CSI: http://www.csi-india.org/why-join-csi
Membership Benefits: http://www.csi-india.org/membership-benefits
BABA Scheme: http://www.csi-india.org/membership-schemes-baba-scheme
Special Interest Groups: http://www.csi-india.org/special-interest-groups
Membership Subscription Fees: http://www.csi-india.org/fee-structure
Membership and Grades: http://www.csi-india.org/web/guest/174
Institutional Membership: http://www.csi-india.org/web/guest/institiutional-membership
Become a member: http://www.csi-india.org/web/guest/become-a-member
Upgrading and Renewing Membership: http://www.csi-india.org/web/guest/183
Download Forms: http://www.csi-india.org/web/guest/downloadforms
Membership Eligibility: http://www.csi-india.org/web/guest/membership-eligibility
Code of Ethics: http://www.csi-india.org/web/guest/code-of-ethics
From the President's Desk: http://www.csi-india.org/web/guest/president-s-desk
CSI Communications (PDF Version): http://www.csi-india.org/web/guest/csi-communications
CSI Communications (HTML Version): http://www.csi-india.org/web/guest/csi-communications-html-version
CSI Journal of Computing: http://www.csi-india.org/web/guest/journal
CSI eNewsletter: http://www.csi-india.org/web/guest/enewsletter
CSIC Chapters SBs News: http://www.csi-india.org/csic-chapters-sbs-news
Education Directorate: http://www.csi-india.org/web/education-directorate/home
National Students Coordinator: http://www.csi-india.org/web/national-students-coordinators/home
Awards and Honors: http://www.csi-india.org/web/guest/251
eGovernance Awards: http://www.csi-india.org/web/guest/e-governanceawards
IT Excellence Awards: http://www.csi-india.org/web/guest/csiitexcellenceawards
YITP Awards: http://www.csi-india.org/web/guest/csiyitp-awards
CSI Service Awards: http://www.csi-india.org/web/guest/csi-service-awards
Academic Excellence Awards: http://www.csi-india.org/web/guest/academic-excellence-awards
Contact us: http://www.csi-india.org/web/guest/contact-us
I deem it a great privilege to be at the helm of affairs of the
Computer Society of India, and it is a great opportunity to be
President of the society at a time when India is on a
high growth path in electronics and computers. The recent
policy declarations by the Government of India – the National Telecom
Policy, National Electronics Policy, and Electronics System
Design and Manufacturing Policy – open up tremendous
possibilities for every Indian. From sensors to supercomputers,
every area is open for innovation and rediscovery. Research
and development leading to intellectual property generation,
and the associated human resources development for capacity
building in related areas, are awaiting active participation
from the CSI.
Since 2010, India has integrated its knowledge-generating
institutions in the form of a National Knowledge Network
(popularly known as NKN). In the same year, the Government of
India launched a project to take fibre-optic cable up to village
panchayats through the National Optical Fiber Network (popularly
known as NOFN). The installation of broadband everywhere, at
speeds of 10 Mbps, 100 Mbps, and 1 Gbps, is making
India the "Best Connected Country". NKN has already
brought close to 1000 national laboratories and institutes
of higher learning into its fold, and is moving towards its target
of 1500 institutions. Virtual classrooms are slowly becoming
a way of life in many of these institutions. The spread of NOFN and
of CSI across the country suggests tremendous
opportunities to work together. Perhaps the Division Chairs,
SIG Chairs, Regional Vice-Presidents, Chapter Chairs, and
National Student Coordinator of CSI would like to brainstorm
and see what role CSI can play in this major change. CSI and
education have been synonymous, and education can perhaps be a single
focal theme for synergistic cooperation.
The new Execom had its first meeting on 31st March 2013.
New Execom members, and those continuing, are
excited about the days ahead. It is a real pleasure working
with this team. I welcome all of them to this wonderful
world of opportunity. As you all know, Shri H R Mohan and
Shri Ranga Rajagopal have joined us as Vice-President and
Treasurer respectively. Shri V L Mehta steps down as Treasurer
after making sure that the finances of CSI are stable and sound.
A wonderful job indeed!
I would like to place on record the excellent work done by
the outgoing team led by Shri Satish Babu. Many programs,
MoUs, and international relationships were handcrafted by them.
Congratulations to you and your team, Satish, for giving a wonderful
year to CSI.
CSI signed an MoU with ISACA on 31st March 2013 for mutual
cooperation. I am sure you will see the details elsewhere in
this issue.
Shri Satish Babu will continue to represent CSI in SEARCC,
BASIS, and ICANN. He has laid a solid foundation between CSI
and these entities, and will continue to strengthen it. I will
support him in all his endeavors. I will represent CSI in the IFIP
General Assembly from now on.
We have a new website and a new portal. Please use them and
give feedback to CSI HQ. The web and portal are the face of CSI,
and hence our critical information infrastructure.
It is wonderful being with you, and I humbly seek your blessings
and support.
With best wishes,
Prof. S V Raghavan
President, Computer Society of India
President's Message: Prof. S V Raghavan
From: [email protected] | Subject: President's Desk | Date: 1st April, 2013
Dear Members,
Editorial
Rajendra M Sonar, Achuthsankar S Nair, Debasish Jana and Jayshree Dhere
Editors
Welcome to the April 2013 issue of CSI Communications – Knowledge
Digest for the IT Community. On behalf of CSI, as editorial
panel members of CSI-C, we are happy to tell you that
we have completed two years of editorship and feel
privileged to bring out the first issue of the third year. In this
issue, we are covering articles on 'Big Data', a buzzword
everybody is talking about and trying to get hold of; the
overwhelming response from our esteemed fellow
contributors proves that! We are still receiving contributions;
we could not accommodate all articles and will carry
forward some to the next issue, and we request those
who have sent contributions to bear with us. We are proud
to tell you that we got a good number of contributions from
our fellow professionals in industry. This shows that Indian
software companies are really serious about big data, are
putting in serious R&D efforts, and must be looking at it as
a big opportunity for India. We welcome and encourage
our practitioners to contribute their valuable knowledge
through CSI-C. A big thank-you to all our contributors.
Hadoop - the name sounds familiar and immediately
catches attention when somebody hears about big data. We
start this issue with the first article under the Cover Story section:
Big Data Systems: Past, Present & (possibly) Future by Dr.
Milind Bhandarkar, Chief Scientist at Greenplum, a division
of EMC2. He writes about big data, big data infrastructure,
Apache Hadoop (its adoption and use cases), and the next
frontiers for big data systems. Big Data – A Big Game
Changer, the second article in this section, is by Shailesh Kumar
Shivakumar, Technology Architect at Infosys Technologies,
Bangalore. He writes about the drivers, opportunities, impact,
and applications of big data, and mentions how big
data had a big impact and redefined the way elections are
fought in the recently concluded US elections.
In the Technical Trends section we have an article by Dr. Gautam
Shroff, Dr. Lipika Dey and Puneet Agarwal of TCS Innovation
Lab titled Socio-Business Intelligence Using Big Data. They
describe how the fusion of social and business intelligence
is defining the next generation of business analytics
applications, using a new AI-driven information management
architecture that is based on big-data technologies and new
data sources available from social media.
In the Research Front section we have an article by Pramod
Taneja and Prashant Wate of iGATE. In their article, Big
Data Enabled Digital Oil Field, they introduce readers
to the oil and gas domain, and discuss the need for a digital oil
field enterprise platform and a big data solution for the digital
oil field, with a detailed functional overview.
We have three articles in the Articles section. The first,
on Big Data, is by A Kavitha, S Suseela and G Kapilya of Periyar
Maniammai University. The second article is on in-memory
analytics, by Prof. Jyotiranjan Hota. He introduces in-memory
analytics, application platforms, vendors, scope and benefits,
research challenges, and the future of in-memory analytics. Prof.
Hota has been one of our regular CSI-C contributors.
Opportunities always come with risks; as a manager,
however, one should know how to manage those
risks. The last article in this section, on this topic, is by our
regular contributor Avinash Kadam. The article covers in
detail what a risk management professional is expected to
be well versed in, and describes these as five key practice
areas of risk and information systems controls.
In the Practitioner Workbench section we have the first article,
under Programming.Tips(), on Python: Programming
Language for Everyone by Dr. Nibaran Das of Jadavpur
University. The second article is by Prof. Umesh P and Silpa
Bhaskaran of the University of Kerala under Programming.
Learn("R"): R - StaR of Statisticians. In this section, they
introduce the language "R" for the first time.
In CIO Perspective, we have an article titled 'Deriving
Operational Insights from High Velocity Data' by Bipin
Patwardhan and Sanghamitra Mitra of Research & Innovation,
iGATE, Mumbai, India, where they discuss the business
drivers of big data and data stream processing: its genesis,
an introduction, and implementations in various domains.
In IT.Yesterday(), we have an article on the founder of
information theory and beloved father of the information age,
Claude Elwood Shannon, titled 'Birthday Tribute to the Most
Influential Mind of the 20th Century' by research scholar Biji C
L of the University of Kerala.
There are other regular features such as Security Corner,
Brain Teaser, Ask an Expert, ICT News Briefs in March
2013 in Happenings@ICT, CSI reports, chapter and student
branch news, and various calls.
As always, we look forward to receiving your feedback,
contributions, and replies at [email protected].
With warm regards,
Rajendra M Sonar, Achuthsankar S Nair,
Debasish Jana and Jayshree Dhere
Editors
Dear Fellow CSI Members,
Big Data Systems: Past, Present & (possibly) Future
Dr. Milind Bhandarkar, Chief Scientist, Machine Learning Platforms, Greenplum, a Division of EMC2
Cover Story
The data management industry has
matured over the last three decades,
primarily based on Relational Database
Management System (RDBMS)
technology. Even today, RDBMSs
power a majority of backend systems for
online digital media, financial systems,
insurance, healthcare, transportation,
and telecommunications companies.
As the amount of data collected and
analyzed in enterprises has increased
severalfold in volume, variety, and
velocity of generation and consumption,
organizations have started struggling
with the limitations of traditional
RDBMS architectures. As a result, a new
class of systems had to be designed and
implemented, giving rise to the new
phenomenon of "Big Data".
In this article, we will trace the origin
and history of this new class of systems
built to handle "Big Data". We refer to currently
popular big data systems, exemplified by
Hadoop, and discuss some current and
future use cases of these systems.
What is Big Data?
While there is no universally accepted
definition of Big Data yet, and most of
the attention in the press is devoted to
the "bigness" of Big Data, the volume of data
is only one factor in the requirements
of modern data processing platforms.
Industry analyst firm Gartner[1] defines Big
Data as:
Big data is high-volume, high-velocity,
and high-variety information assets that
demand cost-effective, innovative forms
of information processing for enhanced
insight and decision-making.
A recent IDC study, sponsored
by EMC2[2], predicts that the "digital
universe", the data generated in
digital form by humankind, will double
every two years, and will reach 40,000
exabytes (40 × 10²¹ bytes) by 2020. A
major driving factor behind this data
growth is ubiquitous connectivity via the
rapidly growing reach of mobile devices,
constantly connected to networks.
What is even more remarkable is that only
a small portion of this digital universe
is "visible": the data (videos,
pictures, documents, status updates,
tweets) created and consumed by
consumers. A vast amount of data will be
created not "by" human users, but "about"
humans by the digital universe, and it
will be stored, managed, and analyzed by
enterprises such as Internet service
providers and cloud service providers of
all varieties (Infrastructure-as-a-Service,
Platform-as-a-Service, and Software-as-
a-Service).
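The doubling claim is easy to sanity-check. The sketch below is my own illustration, not part of the IDC study: it assumes a hypothetical 2012 baseline of 2,500 exabytes and doubles it every two years, which lands on the cited 40,000 exabytes by 2020.

```python
def project(size_eb: float, start_year: int, end_year: int) -> float:
    """Double size_eb once every two years from start_year to end_year."""
    doublings = (end_year - start_year) / 2
    return size_eb * 2 ** doublings

# An assumed 2,500 EB in 2012, doubling every two years: 2500 * 2^4 = 40000 EB.
print(project(2500, 2012, 2020))  # 40000.0
```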
Origins of Big Data Infrastructure
We already notice this rapid growth
of data generation in the online world
around us. Facebook has grown from one
million users in 2004 to more than one
billion in 2012, a thousand-fold increase in
less than eight years. More than 60% of
these users access Facebook from mobile
phones today. The value generated by
a social network is proportional to the
number of contacts between users of the
social network, rather than the number of
users. According to Metcalfe's Law[3] and
its variants, the number of contacts for
N users is proportional to N*logN. Thus,
the growth of contacts, and therefore
of the interactions within a social network
that result in data generation, is non-
linear with respect to the number of users. As
the world gets more connected, one can
expect the number of interactions to grow,
resulting in even more accelerated data
growth.
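To see how much faster contacts grow than users under this N*logN relation, here is a small sketch of my own (the base of the logarithm cancels in the ratio, so the natural log is used for convenience):

```python
import math

def contacts(n_users: int) -> float:
    """Contacts implied by the N*log(N) variant of Metcalfe's Law."""
    return n_users * math.log(n_users)

# A thousand-fold user increase (1 million -> 1 billion, roughly
# Facebook's 2004 -> 2012 growth cited above) implies contacts
# grow 1000 * log(1e9)/log(1e6) = 1500-fold, i.e. faster than users.
user_growth = 1_000_000_000 / 1_000_000
contact_growth = contacts(1_000_000_000) / contacts(1_000_000)
print(user_growth, round(contact_growth))  # 1000.0 1500
```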
Since the popularity of the Internet
was one of the main reasons for the growth
of communication and connectivity in
the world, we saw the emergence of Big
Data platforms in the Internet industry.
Google, founded in 1998 with the goal
of organizing all the information in the
world, became the dominant content
discovery platform on the World Wide
Web, trumping human-powered and
semi-automated approaches such as web
portals and directories. The challenges
Google faced in crawling the web, and in storing,
indexing, ranking, and serving billions
of web pages, could not be solved
economically with the existing data management systems.
The amount of publicly available
content on the web in Google's search
index exploded from 26 million pages in
1998 to more than 1 trillion in less than
a decade[4]. In addition, this content was
"multi-structured", consisting of natural-
language text, images, video, geo-spatial
data, and even renderings of structured data. In
order to rapidly answer search queries,
with information ranked by relevance as
well as timeliness, Google had to develop
its infrastructure from scratch. In 2003
and 2004, Google published details of
part of its infrastructure, in particular
the Google File System (GFS)[5] and the
MapReduce programming framework[6].
These two publications became the
blueprint for Apache Hadoop, an open
source framework that has become a
de facto standard for big data platforms
deployed today.
Apache Hadoop
The GFS and MapReduce papers motivated
Doug Cutting, creator of the open-source
search engine Apache Lucene, to re-
architect Lucene's content system,
called Nutch, to incorporate a distributed
file system and a MapReduce programming
framework for the tasks of crawling, storing,
ranking, and indexing web pages, so that
they could be served as search results
by Lucene. These developments were
noticed by engineers and executives at
Yahoo, which was then struggling to scale
its search backend infrastructure. Yahoo
adopted Apache Hadoop in January
2006, and made significant contributions
to make it a scalable and stable platform.
Today, Yahoo has the largest footprint
of Apache Hadoop, running more than
45,000 servers managing more than 370
petabytes of data with Hadoop[7]. Being an
open source system, licensed under the
liberal Apache Software License and governed
by the Apache Software Foundation, meant
that Hadoop could be freely downloaded,
deployed in any organization, and modified
and used without any hard requirement
to contribute the changes back to
open source. The scalability and flexibility
of Apache Hadoop prompted growing
Internet companies such as Facebook,
Twitter, and LinkedIn to adopt it for
their data infrastructure, and to contribute
modifications and usability enhancements
back to the Apache Hadoop community.
As a result, the Hadoop ecosystem grew
rapidly over the years.
Today, there are more than 20
components in the Hadoop ecosystem
that are developed as open source projects
under the Apache Software Foundation,
and several hundred proprietary and
other open source components. Some
of the popular components in the
Hadoop ecosystem, apart from the Hadoop
Distributed File System (HDFS) and
MapReduce, include Hive, a SQL-like
language that translates to MapReduce;
Pig, an imperative data flow language that
generates MapReduce jobs to execute the
data flow; and HBase, a NoSQL key-value
store that uses HDFS as its persistence
layer. HBase is based on a paper describing
another Google infrastructure component,
Bigtable, which was published in 2006[8].
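The MapReduce paradigm itself is simple, even though Hadoop's implementation is not. The following pure-Python sketch (an illustration of the programming model only, not Hadoop's actual Java API) shows the three phases with the canonical word-count example: a map function emits key/value pairs, a shuffle groups the values by key, and a reduce function aggregates each group.

```python
from collections import defaultdict

def map_fn(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce phase: aggregate all counts emitted for one key."""
    return word, sum(counts)

def map_reduce(lines):
    groups = defaultdict(list)
    for line in lines:                      # map over all inputs
        for key, value in map_fn(line):
            groups[key].append(value)       # shuffle: group values by key
    return dict(reduce_fn(k, v) for k, v in groups.items())  # reduce each group

print(map_reduce(["big data big", "data systems"]))
# {'big': 2, 'data': 2, 'systems': 1}
```

In Hadoop, the map and reduce functions run in parallel across a cluster and the shuffle moves data between nodes, but the data flow is exactly this.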
While Hadoop today has become
the de facto platform for analyzing Big Data,
challenges remain in making it accessible
and improving its ease of use, and thus making
it a first-class citizen of the data infrastructure
managed by IT professionals. The
MapReduce programming paradigm is not
particularly easy for data analysts to use,
and commonly used business intelligence
tools do not interoperate with the interfaces
provided by Hadoop today. To overcome
these challenges, a number of data
warehousing system vendors, such as
Teradata, Oracle, IBM, EMC2/Greenplum,
and others, offer connectivity with Hadoop
platforms. There are even efforts towards
unifying SQL-based OLAP platforms, such
as Greenplum, with Hadoop[9]. A number
of Hadoop distributions have emerged
over the years, improving the manageability
of Hadoop infrastructure. These include
Cloudera, Hortonworks, MapR, EMC2/
Greenplum, IBM BigInsights, Microsoft
HDInsight, etc. In addition, there is an
increasing number of Big Data appliances:
hardware platforms that are integrated
with Hadoop distributions, including those from
Oracle, Teradata, and EMC2/Greenplum.
Hadoop Adoption & Use Cases
Over the years, Hadoop and other big
data technologies have become popular
in non-Internet organizations as well,
which are also struggling to handle the data deluge.
Infrastructure in many organizations in
various industries, such as retail, insurance,
healthcare, finance, manufacturing, and
others, has been almost fully digitized.
Until recently, the data these organizations
collected was stored in archival
systems, mostly for regulatory compliance
purposes. However, there is a growing
realization across these organizations
that this data can be utilized for gaining
competitive advantage, increasing process
efficiencies, and improving customer
experience. In a recent study conducted
by Tata Consultancy Services (TCS)[10],
over 50% of organizations surveyed were
using Big Data technologies, and many of
them predicted more than 25% gains in
return on investment (ROI), mostly from
increased revenue. The flexibility of these
Big Data systems to combine structured
datasets (51%) with semi-structured
datasets (49%) has been cited as enabling
advanced analytics capabilities. In
addition, while most organizations
use data that is available internally (70%),
the availability of external data, such as
from Twitter and other social media,
allows them to perform
better customer behavior analysis.
The 3Vs of data, volume, velocity, and variety, along with the need to develop agile, data-driven applications, imply that the humans analyzing, detecting patterns in, and making sense of data need a rich toolset at hand. Traditional data exploration, visualization, business intelligence, and reporting tools are being adapted to co-exist with these new Big Data technologies. Advances in machine learning algorithms and methods, as well as abundant processing power, have democratized deep and predictive analytics for use in the average IT department. Open source languages for statistical analysis and modeling, such as the popular R language[11] and newcomers such as Julia, as well as emerging machine learning frameworks, such as scikit-learn in Python[12], Apache Mahout for Hadoop[13], and the in-database deep analytics library MADlib[14], have attracted the attention of developers and users for building machine-learning-powered applications on large and diverse datasets.
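As a toy illustration of how approachable such predictive modelling has become, the following plain-Python sketch trains a nearest-centroid classifier on made-up data; real projects would reach for scikit-learn, R, Mahout, or MADlib, but the fit-then-predict shape is the same.

```python
# Toy predictive model: a nearest-centroid classifier in plain Python.
# The data and labels ("loyal"/"churn") are invented for illustration.
from statistics import mean

def train_centroids(rows, labels):
    """Compute the per-class mean (centroid) of the feature vectors."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

# Two obvious clusters of customers, each described by two features.
rows = [[1.0, 1.2], [0.9, 1.1], [5.0, 5.2], [5.1, 4.9]]
labels = ["loyal", "loyal", "churn", "churn"]
model = train_centroids(rows, labels)
print(predict(model, [1.1, 1.0]))  # a point near the "loyal" cluster
```

The same two-step protocol (learn parameters from data, then score new rows) underlies the far richer models the frameworks above provide.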
These new platforms, languages, and frameworks have challenged several predominant practices in enterprises. Traditional data governance practices, including access control, provenance, retention, backup, mirroring, disaster recovery, security, and privacy, are struggling to cope with organizations' ability to store and process massive amounts of diverse data. Over the next few years, one should expect best practices for data governance, and the associated technologies, to emerge and become commonplace.
Industrial Internet: The Next Frontier
While most of the Big Data use-cases today analyze customer behavior, buying patterns, likes and dislikes expressed in social media, clickstreams, and location information from mobile devices, machine-generated data could be the next frontier for Big Data systems. In addition, cheap sensor technology and short-range wireless connectivity have created the possibility of real-time monitoring, and historical pattern analysis, of traditionally analog information sources. For example, a modern Ford automobile has thousands of signals captured by 70+ sensors, generating more than 25 gigabytes of data every hour, processed by 70 on-board computers[15]. While most of this data is transient and needs to be acted upon in real time, recognizing patterns within the data to improve the safety and usability of the automobile implies aggregating and analyzing it offline.
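A hypothetical sketch of this split between real-time action and offline aggregation: each reading triggers an immediate rule while also updating aggregates kept for later analysis. The channel name, threshold, and readings below are invented for illustration.

```python
# Sketch: act on sensor readings in real time (a rolling-mean alert)
# while accumulating aggregates for later offline pattern analysis.
from collections import deque

class SensorChannel:
    def __init__(self, name, alert_threshold, window=5):
        self.name = name
        self.alert_threshold = alert_threshold
        self.recent = deque(maxlen=window)  # transient, real-time view
        self.count = 0                      # aggregates for offline use
        self.total = 0.0
        self.maximum = float("-inf")

    def ingest(self, value):
        self.recent.append(value)
        self.count += 1
        self.total += value
        self.maximum = max(self.maximum, value)
        # Real-time rule: alert if the rolling mean crosses the threshold.
        rolling_mean = sum(self.recent) / len(self.recent)
        return rolling_mean > self.alert_threshold

    def summary(self):
        """What would be shipped off-board for offline analysis."""
        return {"mean": self.total / self.count,
                "max": self.maximum, "n": self.count}

coolant = SensorChannel("coolant_temp_c", alert_threshold=110.0)
readings = [90, 92, 95, 118, 121, 125]
alerts = [coolant.ingest(r) for r in readings]
print(alerts, coolant.summary())
```

Only the compact summary need be retained; the transient window is discarded, mirroring the treatment of in-vehicle data described above.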
Indeed, the massive amount of data captured by sensors in machinery, and the possibility of storing and analyzing this data to make intelligent design and operational decisions, has created a new opportunity, now known by a new moniker: the Industrial Internet[16]. If, as a result of analyzing this data to aid better decision making, we could reduce system inefficiencies in the healthcare industry by a mere 1%, it could result in savings of USD 63 billion over the next 15 years. If advanced analytics on the large amounts of oil and gas exploration data resulted in only a 1% reduction in capital expenditure, it could save more than USD 90 billion over the next 15 years. The key element proposed for the Industrial Internet is intelligent connected machines with advanced sensors for data capture, controls for automation, and software applications powered by deep physics-based analytics and predictive algorithms for analyzing large amounts of sensor and telemetry data.
Indeed, we are witnessing the third revolution, following the industrial revolution and the Internet revolution,
Continued on Page 16
CSI Communications | April 2013 | 9
Introduction
Big Data is, basically, a vast amount of data which cannot be effectively processed, captured, and analyzed by traditional database and search tools in a reasonable amount of time. Though the "big" in Big Data is subjective, McKinsey estimates that it would be anywhere between a few dozen terabytes and petabytes for most sectors.
The Big Data information explosion is mainly due to the vast amounts of data generated by social media platforms, data input from omni-channels, various mobile devices, user-generated data, multimedia data, and so on. Analysts term this the expanding "digital universe".
Big Data is usually defined by the 3Vs: volume, variety, and velocity. To put things in perspective, let's examine each of these dimensions:
• Volume: IBM research finds that every day we add about 2.5 quintillion bytes (2.5 × 10¹⁸) of data; Facebook alone adds 500 TB of data on a daily basis; 90% of the world's data was generated in the last 2 years. Google processes about 1 petabyte of data every hour.
• Velocity: The rate of data growth is also astonishing. Gartner research finds that data is growing at an 800% rate, of which 80% is unstructured. EMC research indicates that data growth is following Moore's law, doubling every 2 years.
• Variety: The data being added is also of various types, ranging from unstructured feeds to social media data, multimedia data, sensor data, etc.
The main value from Big Data is derived by aggregating vast amounts of data integrated from various sources.
The following diagram shows various technologies used in Big Data:
Drivers and Opportunities
There are a lot of drivers forcing businesses to consider Big Data as a key business strategy. Some of them are listed below:
• Real-time prediction
• Increased operational and supply chain efficiencies
• Deep insights into customer behavior based on pattern and purchase analysis
• Information aggregation
• Better and more scientific customer segmentation for targeted marketing and product offerings
Big Data also provides the following opportunities:
• Improved productivity and innovation
• McKinsey predicts an increase in job opportunities ranging from 140K to 190K
• Uncovering hidden patterns and rapidly responding to changing scenarios
• Multi-channel and multi-dimensional information aggregation
• Data convergence
Traditional search, sort, and processing algorithms do not scale to handle data in this range, most of it unstructured. Most Big Data processing technologies therefore include machine learning algorithms, natural language processing algorithms, predictive modeling, and other artificial-intelligence-based techniques.
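One widely used answer to the scaling problem mentioned above is the map/reduce pattern popularized by Hadoop: mappers process shards of data independently, and reducers merge the partial results. A minimal plain-Python sketch of the idea, with ordinary function calls standing in for a distributed runtime:

```python
# Map/reduce word count: each "mapper" handles one shard of text
# independently, and a "reducer" merges the partial counts. On a real
# cluster the map calls run in parallel across many machines.
from collections import Counter
from functools import reduce

def mapper(shard):
    """Count words in one shard of unstructured text."""
    return Counter(shard.lower().split())

def reducer(acc, partial):
    """Merge a partial count into the running total."""
    acc.update(partial)
    return acc

shards = [
    "big data needs new algorithms",
    "traditional algorithms would not scale",
    "big data is mostly unstructured data",
]
partials = map(mapper, shards)
totals = reduce(reducer, partials, Counter())
print(totals["data"], totals["algorithms"])
```

Because mappers share nothing, adding machines adds throughput, which is precisely why the pattern scales where traditional single-node algorithms do not.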
Big Data is of strategic importance for many organizations: any new service or product will eventually be copied by competitors, but an organization can differentiate itself by what it can do with the data it has.
The following diagram shows the convergence of data from various dimensions:
Impact and Applications
In this section we examine the impact and applications of Big Data-related technologies across various industry verticals and technology domains.
Application across industry domains
Financial industry:
• Better financial data management
• Investment banking using aggregated information from various sources, such as financial forecasting, asset pricing, and portfolio management
• More accurate pricing adjustments based on vast amounts of real-time data
Big Data – A Big Game Changer
Shailesh Kumar Shivakumar, Technology Architect, Consulting & Systems Integration, Infosys Technologies, Bangalore, [email protected]
Abstract: Big Data holds big promises for the information technology area. Properly taming and analyzing Big Data provides valuable insights, predicts consumer behavior, improves productivity, and reduces cost. It has the potential to be a game changer, providing big opportunities that catalyze business revenues. This article discusses the key concepts, applications, and challenges in implementing a Big Data strategy.
Keywords: Big Data, predictive analytics, 3Vs, industry and technology applications
Cover Story
• Stock advice based on analysis of huge amounts of stock data and unstructured data such as social media content
• Creditworthiness analysis by analyzing huge amounts of customer transaction data from various sources
• Proactive fraudulent-transaction analysis
• Regulatory conformance
• Risk analytics
• Trading analytics
Retail industry:
• Better analysis of supply chain data and touch points across omni-channel operations
• Customer segmentation based on previous transactions and profile information
• Analysis of purchase patterns and tailor-made product offerings
• Unstructured data analysis from social media and multimedia to understand customer tastes, preferences, and patterns, and to perform sentiment analysis
• Targeted marketing based on user segmentation
• Competitor analysis
Mobility:
• Mining of customer location data and call patterns
• Integration with social media to provide location-based services such as sale offers, friend alerts, points-of-interest suggestions, etc.
• Geo-location analysis
Health care:
• Effective drug prescription by analyzing all structured and unstructured medical history and records of the patient
• Avoiding unnecessary prescriptions
Insurance:
• Customer risk analysis
• Analysis of cross-sell and up-sell opportunities based on customer spending patterns
• Insurance portfolio optimization and pricing optimization
Application across technology domains:
• Search engine improvements: New algorithms to analyze large volumes of unstructured data will be used. The algorithms will be artificial-intelligence based, working in parallel across multiple grids to process huge amounts of data
• Business intelligence tools: Analytics tools will be able to provide new and creative visualizations to intuitively depict the meaning of the data
• Storage management tools: Private/cloud storage systems will undergo change to store huge amounts of data
• Cloud computing: Cloud and social media play a vital role in handling Big Data. The cloud will be the platform of choice to store massive amounts of data and to run software as a service to process it
• ERP systems such as CRM will undergo great improvements. A CRM system can help on-call analysts provide real-time customer offers, customer churn probability, etc.
• Predictive analytics will be more effective by analyzing data from multiple dimensions
A Curious Case of Big Data in the US 2012 Elections
Curiously, Big Data had a big impact on, and redefined the way elections are fought in, the recently concluded US elections. Here are some interesting facts on how Big Data was leveraged:
• In the recently concluded US elections, Obama's team effectively used Big Data to achieve victory
• The Democratic team tasked with data analysis aggregated data from various sources, including voter lists, social media posts, fundraisers, etc.
• Multivariate tests were conducted to understand voters' decision making and to design effective policies to persuade them
• The data analysis included mining voter data, profiling voters, and sending targeted campaign mails to influence their decisions. The analysis also provided crucial insights about the voters most likely to switch sides and the required trigger points for the switch
• The team built a persuasion model with predictive analytics to find the probability of persuasion among populations in various geographies
Analyzing the Big Data was the key differentiator in swinging a good percentage of voters and predicting the results with greater confidence.
Market Opportunity
Big Data offers bigger opportunities. Here is a snapshot of some predictions made by market research firms in this regard:
• IDC predicts the Big Data market to grow to $16.9 billion by 2015
• Digital Reasoning estimates that the Big Data market will be worth $48.3 billion in 2019
References
[1] http://blogs.wsj.com/digits/2009/05/18/the-exploding-digital-universe/
[2] http://www.forbes.com/sites/tomgroenfeldt/2012/01/06/big-data-big-money-says-it-is-a-paradigm-buster/
[3] http://www.emc.com/about/news/press/2011/20110628-01.htm
[4] EMC link
[5] http://online.wsj.com/article/SB10001424127887323353204578126671124151266.html
[6] http://www.eweek.com/c/a/Application-Development/Big-Data-Market-to-Grow-to-169-Billion-by-2015-IDC-118144/
[7] http://www.forbes.com/sites/netapp/2012/11/06/big-data-election-surprising-stats/
About the Author
Shailesh Shivakumar is a technology architect at Infosys with over 11 years of industry experience. His areas of expertise include Java Enterprise technologies, performance engineering, enterprise portal technologies, user interface components, and performance optimization. He has been involved in multiple large-scale and complex online transformation projects for marquee clients of Infosys, and has provided on-demand performance-engineering consultancy for highly critical projects across various units. He is a regular blogger on the Infosys Thought Floor, and many of his technical white papers have been published on the Infosys external site and in the Infosys Labs Briefings journal; his blog was recently listed in the "Most popular" category. He also heads a centre of excellence at Infosys, and holds numerous professional certifications, including Sun Certified Enterprise Architect (Part 1), Sun Certified Java Programmer, Sun Certified Business Component Developer, IBM Certified Solution Architect – Cloud Computing, IBM Certified Solution Developer – IBM WebSphere Portal 6.1, and many others.
Abstract: We describe how the fusion of social and business intelligence is defining the next generation of business-analytics applications, using a new AI-driven information management architecture that is based on big-data technologies and new data sources available from social media.
What is 'BigData'?
The term 'BigData' has become the latest buzzword in the IT industry, much as cloud computing began to elicit interest a few years ago. As in the case of the latter, we submit that BigData is a metaphor for a few significant technology and social-business convergences: popular interest in cloud computing was fuelled by the emergence and eventual confluence of web-based social applications, software as a service, infrastructure as a service, and finally platforms as a service.
In a similar fashion, 'BigData' is essentially the convergence of technology advances in artificial intelligence emanating from search and online advertising, along with the development of new architectures for managing extremely large web-scale data volumes, exemplified by the now popular Hadoop stack. Along with the means to process vast quantities of unstructured data, we also find that the data itself is now readily available: vast volumes of consumer conversations on social media, such as Twitter, are free for all to access, and the rest are rapidly becoming a valuable commodity available for purchase from Facebook, LinkedIn, etc.
In this article we describe a number of 'Socio-Business' applications that exploit these new data sources and are of potential interest to large enterprises. Moreover, we find that each of these applications involves the fusion of information from social media with internal business data, the extraction of knowledge from web sources, the application of artificial intelligence techniques in some fashion, and/or the exploitation of BigData-inspired data-management architectures.
The New Context for Business Intelligence
In the past decade, AI techniques operating at web-scale have demonstrated significant successes on the web, many of which were once thought impossible: statistical machine learning at web-scale is the reason Google's machine translation works. Web-based face recognition relies, among other things, on large-scale multi-way clustering to discover the image features that work best to disambiguate faces; this, coupled with some human tagging, even from profile photos, is then sufficient to recognise faces even without standard scale, pose, expression, or illumination. The Watson system uses 15 terabytes of in-memory data culled from the web and other sources, along with parallel processing across 90 servers. Finally, Siri's hope for success depends on the fact that it includes a cloud component, which opens up the possibility of continuous learning using the large volumes of data its adoption by millions of users will generate.
The time is therefore ripe for enterprises to incorporate AI techniques into their solutions. The potential for AI techniques in the enterprise was aptly articulated almost a decade ago by Dalal et al.[1]. Moreover, the availability of large volumes of data from social media makes it all the more viable, as well as essential, to exploit the techniques already being used so well in web-scale AI applications.
Further, in sharp contrast to the millions of servers powering the web, the largest of enterprise IT departments are used to handling 50,000 or so servers, and hundreds of terabytes of data at most. Enterprise data storage, databases, and data analysis tools are, in turn, tailored to handle terabytes, or at most a petabyte or so. Further, most of the 'big data' emanating from social-media sources is unstructured text data; again, something that the traditional business intelligence tool-stack is not designed to tackle, and for which the aforementioned AI techniques are needed to extract insight.
Moreover, input from social media comprises largely unstructured data; tapping, processing, and analysing it sometimes requires the use of big-data technologies such as those used by the web companies themselves, instead of the traditional databases that are better suited to structured data. Thus, big-data technologies such as Hadoop are often used even though most traditional enterprises do not actually need to process as large a volume of data as the web companies do.
Innovative business use-cases exploiting BigData from social-media and mobility sources span multiple industries, from retail to manufacturing and financial services. A common theme across all these applications, besides having to extract intelligence from large volumes of BigData, is the need to fuse information from multiple sources, both internal and external, structured and unstructured. Further, the rapid pace of developing events on social media means that the standard techniques for translating predictive insights into real-time decision support, such as building (off-line) a deep but computationally 'small' model, need to be enhanced: social-media events need to be filtered, processed, correlated, and analysed for their impact in real time. In the sections that follow, we describe some of these use-cases and explain the techniques they require.
Supply-Chain Disruptions
The natural disasters that struck Japan in 2011, i.e., the earthquake, tsunami, and subsequent release of nuclear radiation, clearly had a devastating effect on the Japanese population and economy. At the same time, the effects of these events were felt around the world; in particular, they led to major disruptions in the global supply chains of many industries, from semiconductors to automobiles and even baby products.
The Japanese earthquake was a major event of global significance, followed closely in the global media on a daily basis; hopefully a fairly rare 'black swan' event. However, many adverse events of far smaller significance occur daily across the world. Such events are mainly of local interest only. Further, public interest in such an event may last but a day or so, while its economic impact may last much longer. Take the example of a fire in a factory: there are, on average, around ten major factory fires in the world every day. Similarly, there are labour strikes that disrupt production. Most of these events affect a very small locality, and may not even reach the local news channels, certainly not global ones. Further, any public interest, however localised, in the event may last a few hours or at most a day. Nevertheless, if the affected factory is a significant supplier to a major manufacturer half-way around the world, this relatively minor event is possibly of great interest to the particular enterprise that consumes its product! It is observed that manufacturers notice such news about their suppliers when they encounter a shortage in supply, usually a few days or sometimes a week later. If, however, technology can help them notice this earlier, they will have more time to make alternate arrangements.
Socio-Business Intelligence Using Big Data
Gautam Shroff,* Lipika Dey,** & Puneet Agarwal*** (TCS Innovation Labs)
Technical Trends
Interestingly, it has been found that many of these events, even ones with extremely local impact, find their way fairly rapidly into social media, and in particular Twitter. Used for social networking in over 200 countries, with over 500 million tweets a day, Twitter turns out to also be a rich source of local news from around the world. Many events of local importance are first reported on Twitter, including many that never reach news channels. Fig. 1 describes the overall architecture for listening to events from social media that we have used both for detecting adverse events and for listening to the 'voice of the customer', as described in the next section.
In[5] we have proposed an architecture that enables a large enterprise to monitor potential disruptions in its global supply chain by detecting adverse events from Twitter streams. In[4] we have described how such events can be efficiently detected, using machine-learning techniques, from amongst streams of unstructured short text messages (tweets) arriving at a rate of tens of messages per second. In contrast with the larger volumes that follow events of wider significance, there are often only a few tweets reporting each such event; the few tweets that happen to report the same event are then correlated.
Next, as described in[5], the impact of the detected event on the enterprise in question can be assessed by fusing the detected external event with internal data on suppliers.
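A highly simplified sketch of such a pipeline: a keyword filter stands in for the machine-learned event classifier, and Jaccard similarity over tokens stands in for the correlation step that groups the few tweets reporting the same event. All tweets, terms, and the threshold are invented for illustration.

```python
# Sketch: filter a tweet stream for adverse-event reports, then group
# tweets that describe the same event via token overlap.
EVENT_TERMS = {"fire", "strike", "explosion", "flood"}

def is_adverse(tweet):
    """Crude stand-in for a trained short-text event classifier."""
    return bool(EVENT_TERMS & set(tweet.lower().split()))

def jaccard(a, b):
    """Token-set similarity between two short texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def correlate(tweets, threshold=0.3):
    """Group tweets whose overlap with a group's first tweet is high."""
    groups = []
    for t in tweets:
        for g in groups:
            if jaccard(t, g[0]) >= threshold:
                g.append(t)
                break
        else:
            groups.append([t])
    return groups

stream = [
    "major fire at the gearbox factory in pune",
    "lovely weather today",
    "fire reported at pune gearbox factory production halted",
    "labour strike shuts down chennai plant",
]
events = correlate([t for t in stream if is_adverse(t)])
print(len(events))
```

The grouped events would then be matched against internal supplier data to assess impact, as described above.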
Voice of the Consumer
Listening to the voice of the consumer through mechanisms such as surveys, feedback forms, emails, and support-center logs is a continual process through which organizations try to improve customer satisfaction and increase their consumer base. Increasingly, listening to consumer-generated content from social-media channels like Twitter, Facebook, and the blogosphere is augmenting the possibilities for analyzing the voice of the consumer, and becoming an important element of the business intelligence strategy of consumer-focused enterprises.
At the same time, the traditional channels of listening directly to customers, such as call-centers and email, and indirectly through eventual sales figures, remain as important as ever: social-media inputs are inherently noisy, so the insights acquired from social media are often validated by fusing them with additional inputs collected through more traditional channels. At the same time, social-media inputs may often lead other inputs in time, and therefore be of significance in spite of their relative inaccuracy.
Different types of insights can be gathered from consumer-generated content. Companies engage in analyzing the voice of the consumer primarily to address the following issues, which we may also distinguish based on the content, sources, and temporal variation that they focus on:
1. Brand Sentiment Analysis is concerned with measuring the sentiment expressed in the context of particular brands, products, and services, or even specific pre-defined features of a product or service. The emphasis is on volumes, and on tracking the overall aggregate positivity/negativity associated with the set of concepts one is interested in. Source selection is broad and channel-based; thus one might choose to focus on, say, Twitter, a Facebook page, and selected blogs, as well as analyze the variation across these. Since sentiment is noisy and varies rapidly, it is also aggregated temporally; thus the time-scales of aggregate sentiment analysis are in the range of days and weeks.
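The temporal aggregation described above can be sketched as follows; the per-mention scores in [-1, 1] are assumed to come from an upstream opinion-mining step, and the dates and scores are invented.

```python
# Sketch: roll noisy per-mention sentiment scores up into a daily mean,
# so the trend is visible above the per-message noise.
from collections import defaultdict
from datetime import date

def daily_sentiment(mentions):
    """mentions: iterable of (date, score) pairs -> {date: mean score}."""
    buckets = defaultdict(list)
    for day, score in mentions:
        buckets[day].append(score)
    return {day: sum(s) / len(s) for day, s in buckets.items()}

mentions = [
    (date(2013, 4, 1), 0.8), (date(2013, 4, 1), -0.2),
    (date(2013, 4, 2), -0.6), (date(2013, 4, 2), -0.9),
]
trend = daily_sentiment(mentions)
print(trend[date(2013, 4, 1)], trend[date(2013, 4, 2)])
```

In practice the bucket would be a day or a week, as noted above, and a sharp drop between buckets is what triggers closer inspection.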
Social-media-based brand sentiment analysis is cheaper and faster than traditional survey-based techniques such as Nielsen market surveys; it also reveals results sooner. Thus, sudden and significant changes in sentiment about a brand can be detected faster, such as the strong negative consumer sentiment that followed when Tropicana changed its packaging a few years ago. However, the jury remains out as to how often these aggregate sentiment figures bring novel insights compared to traditional measures. The fact is that they need to be time-averaged to make any sense; thus finer-grained approaches are needed to enable more real-time response, and to detect emerging problems that, by themselves, may not change the aggregate sentiment significantly, at least at first, and then only if not addressed in time.
Listening to consumer sentiment on social platforms has recently become almost a commodity, offered by a number of commercial services, such as Radian6¹ and others. Opinion-mining techniques for extracting sentiment from text are used in such tools. The initial insight most often sought through the adoption of a listening service is the ability to monitor brand perception, i.e., whether consumers at large are saying positive or negative things about one's brand, product, or service.
2. Complaint Analysis, in contrast with brand sentiment analysis, which casts its net wide, tries to focus on actual customers. Thus, the sources for such analysis are either direct customer feedback through call-centers or email or, when it comes to social media, input carefully filtered to ensure the presence of indicators such as "I bought", "my car", etc., making it highly likely that the writer is in fact a customer, either of one's own product or that of a competitor.
Fig. 1: Event detection
Next, such complaint analysis aims to analyze the text written by customers to detect which aspects of a product or service they are having difficulty with. This requires a deeper level of natural language processing than, say, aggregate sentiment analysis. Consider the statement: "I've been having trouble with my new [car brand]; not only did the transmission give way in the first month, but there was a significant delay in getting it changed". Clearly it is a negative statement about the car brand, and even its transmission, which basic sentiment analysis can easily discover. However, deeper text processing can further discern what exactly is wrong with the transmission, and aggregate such difficulties across a large volume of customer feedback along various dimensions. As a result, if the concept of, say, the transmission 'giving way', including its linguistic equivalents, is showing up in significant numbers, then this becomes an issue to flag to product engineering. On the other hand, the fact that the supply of spares of various types is delayed, including transmission parts, gets aggregated at a different level of, say, 'delayed parts', and is escalated to those responsible for after-sales services.
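A toy sketch of this two-level aggregation: each complaint phrase maps, via a small domain ontology, both to a concept and to the department that should act on it. The ontology entries, concepts, and department routing below are invented for illustration.

```python
# Sketch: ontology-driven aggregation of complaint phrases. Each phrase
# maps to a (concept, responsible department) pair, and counts roll up
# at both levels so issues reach the right team once they are frequent.
ONTOLOGY = {
    "transmission failure": ("powertrain", "engineering"),
    "transmission gave way": ("powertrain", "engineering"),
    "delayed spare parts": ("delayed parts", "after-sales"),
    "transmission parts delayed": ("delayed parts", "after-sales"),
}

def aggregate(complaints):
    by_concept, by_department = {}, {}
    for phrase in complaints:
        concept, dept = ONTOLOGY.get(phrase, ("uncategorised", "triage"))
        by_concept[concept] = by_concept.get(concept, 0) + 1
        by_department[dept] = by_department.get(dept, 0) + 1
    return by_concept, by_department

complaints = ["transmission gave way", "transmission parts delayed",
              "transmission failure", "delayed spare parts"]
concepts, departments = aggregate(complaints)
print(concepts, departments)
```

A real system would first normalise free text to ontology concepts via parsing and learned matching, rather than exact phrase lookup.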
The deeper degree of text processing required for complaint analysis calls for 'ontology-driven causal analysis', which involves some level of parsing as well as learning and exploiting a domain ontology. Additional techniques required include trend analysis, whereby sudden spikes in communications regarding particular new terms, such as 'iPad', are detected, so as to surface emerging problems even if they are not part of a known categorisation or ontology.
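Such a spike detector can be sketched minimally as follows; the smoothing constant, spike factor, and term counts are illustrative assumptions.

```python
# Sketch: flag terms whose frequency in the current window spikes well
# above their (smoothed) historical rate, catching new terms such as
# "ipad" that a domain ontology does not yet contain.
def spiking_terms(history_counts, current_counts, factor=3.0):
    """Return terms whose current count is at least `factor` times
    their historical count (+1 smoothing for unseen terms)."""
    spikes = []
    for term, now in current_counts.items():
        baseline = history_counts.get(term, 0) + 1
        if now >= factor * baseline:
            spikes.append(term)
    return spikes

history = {"transmission": 40, "delivery": 25}   # long-run term counts
current = {"transmission": 45, "delivery": 20, "ipad": 9}
print(spiking_terms(history, current))
```

Flagged terms would then be routed to a human for classification, consistent with the human-in-the-loop approach discussed later.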
Summarising, causal feedback analysis restricts the source to customer feedback, analyses the content in depth, and aggregates results over a period of time, for example on a weekly or monthly basis. Most importantly, in contrast with brand sentiment analysis, complaint analysis often results in directly actionable intelligence that can be passed on to the concerned division in the enterprise. Fig. 2 describes our architecture for ontology-driven opinion mining from unstructured customer feedback, which is described in more detail in[2].
3. Early Problem Detection again listens to consumers at large. However, unlike aggregate sentiment analysis, the aim here is to quickly detect new problems being faced by consumers. For example, a new website design might be flawed, leading to consumer frustration; a new policy on a banking service may be leading to angst and outrage; or a major competitor might be luring customers away. Increasingly, consumer conversations that might point to such events are taking place in the open, on Twitter. However, as in the case of detecting potential supply-chain disruptions, the stream of tweets needs to be filtered to first focus only on consumer complaints, and then processed to extract information on the actual problem being faced.
However, the situation here is technically more challenging than, say, factory events, since distilling 'consumer' versus non-consumer events is less accurate than discerning factory-fire or labour-strike events. Further, the nature of the information that one seeks to extract need not be known in advance. Thus, while a domain ontology can help classify events to a certain extent, consider the sudden arrival of a comment such as "why isn't there an iPad application to access my ... account like there is for [competitor]"; it may well be that 'iPad' does not yet figure in the domain ontology. Still, this problem needs to be detected and classified in some manner so that appropriate action can be taken. New problem detection is as yet difficult to completely automate: instead, as in the example above, it is better to bring a human in the loop when required; of course, automatically figuring out when to do so is equally important.
Competitive Intelligence
Competitive intelligence is aimed at assessing risks and opportunities in a competitive environment before they become obvious. It is used by organisations to compare themselves with their peers ("competitive benchmarking"), to identify risks and opportunities in their markets, and to pressure-test their plans against market response ("war gaming"), enabling them to make informed decisions. Competitive intelligence comprises the tasks of defining, gathering, and analysing intelligence about the industry in general, along with specific knowledge about competitors, such as their products, pricing, marketing strategies, and much more. The information gathered allows organizations to understand their strengths and weaknesses. The acquisition and analysis of events falling under the competitive-intelligence category is a highly specialised activity.
Competitive intelligence can be broadly classified into two categories depending on whether it is used for long-term or short-term planning. Strategic Intelligence (SI) focuses on long-term issues that concern a company's competitiveness over a specified period in the future. The main focus of analysts here is to forecast where the organization should be positioned a few years hence, and to identify strategies to convert this into reality. This analysis primarily involves identifying weaknesses and early warning signals within the organization. Tactical Intelligence, on the other hand, focuses on providing information that can influence short-term decisions. Most often, this is related to analysis of current market share and the competitive landscape. This kind of intelligence directly affects the sales process of an organisation.
Tactical intelligence can be further categorized as: (i) Brand-related: provides information about the popularity of competitors in terms of their products or brands as a whole, which products are moving in the market, and competitors' market shares; consumer sentiment related to the organization and its competitors also belongs to this category. (ii) Pricing-related: provides knowledge about the prices of competitor products. (iii) Promotions-related: provides information about the promotion strategies and kinds of promotional activities adopted by competitors. (iv) Organizational: provides information about competitors such as their workforce structure, internal shifts in focus or vision, successes or failures of their trials, new product launches, technology investments, etc., all of which contribute towards building a profile of competitors that can be useful to organizations. The table in Fig. 5 presents an overview of how different types of web content can contribute towards compiling tactical competitive-intelligence reports for an organization. A detailed treatment of how competitive intelligence can be extracted from social media is given in[3].
Fig. 2: Ontology-driven opinion mining
¹ Now http://www.salesforcemarketingcloud.com/
5.1 Detecting Competitor Events
The process of gathering competitive
intelligence has undergone a massive
transformation in recent years, fuelled by
an increasing availability of information
on the web. Competitors' home pages can be crawled to understand new developments, positioning changes, technology adoption etc. Social media, on the other hand, abounds in consumer-generated content, and can be utilized to
gauge the performance of competitors,
their products, brands, suppliers, and
distributors. Competitive intelligence
content also includes expert opinions,
technology advancements, economic
policies, social changes, and many other
related materials essential for excelling in
business. News from multiple sources is
still considered to be a major contributor
to competitive intelligence. Discussions
on different forums and blogs can provide
crucial insights when analyzed in proper
perspective. Using Google search trends
for competing products and services
can also be a good source of competitor
intelligence.
It is essential to define a set of processes to gather information, convert it into competitive intelligence, and then channel it for consumption in business decision making. Usability and actionability of the gathered information are two critical factors in determining its relevance. Information gathered from the Web is unstructured in nature, and therefore not immediately machine-interpretable. Handling inaccuracies, redundancies, and volume are other challenges. Appropriate knowledge management techniques are required to ensure that analysts have access to all relevant information without facing information overload.
Given the large volumes
of information received in
a digitized format, natural language
processing, text mining, and statistical
reasoning play significant roles in
automating the process of content
assimilation. A host of specialized tools
are also available to aid some of these
tasks. News analytics is a well-established
research area dedicated to analysis and
organization of news articles received from
different sources, to predict the political,
financial or social impacts of these
events. Extracting specific events that
can contribute to competitive intelligence
can be considered as a sub-task of news
analytics. Classification techniques are
employed to classify news articles into
broad categories like political, economic,
sports, market information, entertainment
etc. Article summarization techniques are
often used along with this to provide the
key content of articles. Clustering news
articles based on content is also an oft-
used technique to reduce information
overload. Intelligent cluster visualizations
help in easy assimilation of content.
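The clustering step described above can be sketched with a toy greedy, token-overlap scheme (the headlines, threshold, and similarity measure are illustrative assumptions; production systems would use TF-IDF vectors with algorithms such as k-means):

```python
import re

def tokens(text):
    # Lowercase word tokens; a trivial stand-in for real text preprocessing
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    # Set-overlap similarity between two token sets
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_headlines(headlines, threshold=0.3):
    """Greedy single-pass clustering: attach each headline to the first
    cluster whose seed is similar enough, else start a new cluster."""
    clusters = []  # list of (seed_tokens, [headlines])
    for h in headlines:
        t = tokens(h)
        for seed, members in clusters:
            if jaccard(t, seed) >= threshold:
                members.append(h)
                break
        else:
            clusters.append((t, [h]))
    return [members for _, members in clusters]

news = [
    "Acme launches new budget smartphone in India",
    "Acme's new budget smartphone launches today",
    "Central bank raises interest rates again",
]
groups = cluster_headlines(news)
```

The two smartphone headlines fall into one cluster, reducing the number of items an analyst must scan.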
One of the key challenges here is
to identify those events which can be
assessed for their impact on past, present
or future performance of an organization.
Not all impacts are measurable. For
example, it is difficult to measure the
impact that a new technology may have
on the future market or the effect of a new
chief appointed by a competitor or even
the news about an important acquisition
by a large company. News events typically
comprise a major chunk of information
used to gain strategic intelligence.
Information and relation extraction
techniques from text mining are also
gaining popularity in news analytics, since
they can further help in extracting specific
chunks of information in a structured
form that can be consumed even by
machines. Information and relation
mining techniques have been successfully
applied to extract significant entities, and
their roles and responsibilities in an event
along with event details like name, time, location, and description. The structured information extracted from news articles can be further consumed by a reasoner to draw inferences.
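As a minimal illustration of turning free text into such a structured event record (the pattern, event type, and field names are invented for this sketch; real systems use trained information-extraction models rather than a single regex):

```python
import re

# Toy pattern for acquisition events: "<Acquirer> acquires <Target> [for $X million|billion]"
ACQUISITION = re.compile(
    r"(?P<acquirer>[A-Z]\w+) (?:acquires|buys) (?P<target>[A-Z]\w+)"
    r"(?: for \$(?P<amount>[\d.]+) (?P<unit>million|billion))?"
)

def extract_event(sentence):
    # Return a machine-consumable record, or None if no event is found
    m = ACQUISITION.search(sentence)
    if not m:
        return None
    return {"type": "acquisition", **m.groupdict()}

e = extract_event("Globex acquires Initech for $1.2 billion")
```

The resulting dictionary is the kind of structured record that a downstream reasoner can consume.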
Social media content on the other
hand can contribute very effectively
towards gaining tactical intelligence.
Tracking Twitter and Facebook content
Type of Competitive Intelligence Event | Web Source
People events | News, company web-sites
Competitor strategies, e.g. technology investment | News, discussion forums, blogs, patent sites
Consumer sentiments | Review sites, social networking sites
Promotional events and pricing | Twitter, Facebook
Related real-world events | News, Twitter, Facebook

Fig. 3: Competitive intelligence events and their sources

Fig. 4: Analyzing competitor promotions from social media
generated by competitors can provide fairly accurate data about the promotions they run. Twitter and Facebook also abound in consumer sentiment about products and services or a brand.
Text classification techniques are widely used to classify social media messages into pre-defined categories like status updates, sentiment and opinion, consumer support systems,
news, promotions and campaign, and
others. Further categorization or labelling
of content is also possible based on
the named-entities present in these.
Classification of social-media content into pre-defined categories like those above helps in filtering the relevant from the irrelevant.
Traditional classification techniques using the bag-of-words model do not perform very well on short messages like these. Rather, a set of domain-specific features like author's profile, retweets, @user-mentions etc. helps in classifying the text into a predefined
set of generic classes such as News,
Events, Opinions, Deals and Promotions,
and Customer Support. A classified text
can be further tagged or associated with
product or service labels, brand names,
action categories etc. using domain
ontology. Natural Language Processing tools like Named Entity Recognition are also applied to identify dates, money-values,
store names or locations etc. The assigned
class and product labels along with the
complete set of information extracted
can be used to generate a promotion
map, which can depict category-wise
promotions for different products region-
wise and time-wise. Fig. 4 depicts the
process flow for the same.
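A toy sketch of such feature-based classification of short messages (the feature set, rules, and example messages are illustrative assumptions; a production system would train a classifier over features like these rather than use hand-written rules):

```python
import re

def features(msg):
    """Domain-specific features for short social-media messages,
    as opposed to a plain bag-of-words representation."""
    return {
        "is_retweet": msg.startswith("RT "),
        "mentions": re.findall(r"@\w+", msg),
        "hashtags": re.findall(r"#\w+", msg),
        "has_url": "http" in msg,
        "has_price": bool(re.search(r"(?:\$|Rs\.?\s?)\d", msg)),
    }

def classify(msg):
    # Illustrative rules mapping features to the generic classes
    # mentioned in the text (News, Deals and Promotions, Customer Support)
    f = features(msg)
    text = msg.lower()
    if f["has_price"] or any(w in text for w in ("off", "deal", "sale")):
        return "Deals and Promotions"
    if "?" in msg or f["mentions"]:
        return "Customer Support"
    return "News"

label = classify("Flat 40% off on all phones this weekend! #sale")
```

A message classified as a promotion can then be tagged with product labels and locations to feed the promotion map of Fig. 4.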
5.2 Competitive Intelligence Analysis
Competitive intelligence reports are
consumed by analysts, strategists, and
decision makers of different departments
across the organization. While most of
these reports are pushed into the work-
flow automatically, drawing inferences
from competitive intelligence reports is still
by and large a human activity. It requires
a lot of tacit world knowledge most of
which is not available in a structured or
semi-structured manner. Correlation with
diverse types of structured data generated
within the organization, can also yield
valuable insights.
Fusing reports and data originating
from different channels is not a
straightforward task. Harmonization of data from multiple sources requires intelligent master data management techniques. Fusion systems need to judge the feasibility and relevance of merging different types of data. Visualization of
the results to deliver the correct insights
is yet another complex task. While
much of this work is also human-driven
today, analytical systems that can fuse
competitive intelligence reports and
structured data at the right granularity
are being developed for different sectors.
Machine learning techniques are major
contributors to the design of fusion
systems. These systems can be made
to learn from human interactions with
reports and data.
The marketing division is one of
the most prolific users of social media.
Consequently, they can also maximally
benefit from competitor promotion
information. Most companies have a pre-
defined static promotion calendar. This
calendar is reviewed from time to time,
usually on a quarterly basis. The review is
most often entirely against the company’s
own performance, without information
about competitor actions used in a
structured way. Promotion event maps
created from social media can be used
by the marketing analysts to get a near
real-time view of competitor activities,
analyze the company’s performance
against the backdrop of these and thereby
take corrective actions, if necessary. Joint
analysis of sales data and competitor
promotion events, can provide valuable
insights about how competitor promotions
affect sales.
For example, a dip in sales data can
be linked to reports about aggressive
promotions by competitors, new product
launch in the same category, price-
rise announcements or sudden rise in
negative brand sentiments. Similarly, rise
in sales can be linked to rise in positive
brand sentiment or price rise announced
by competitor. Given that there may not
be a single well-defined factor that can
be marked as responsible for an event,
automated systems can do a good job
of correlating all that is relevant based
on attributes like time of the year,
product, brand or region. Pattern mining on large volumes, with human annotations as input, can be utilized to learn better correlations. Finally, machine
learning driven competitive intelligence
systems can also be used to design
predictive models that can predict future
performances based on series of present
and past events.
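The joint analysis of sales data and competitor promotion events described above can be sketched as follows (the dates, sales figures, drop threshold, and event window are all hypothetical):

```python
from datetime import date

sales = {  # daily sales for one product (hypothetical numbers)
    date(2013, 3, d): v
    for d, v in [(1, 100), (2, 98), (3, 60), (4, 62), (5, 97)]
}
competitor_events = [
    {"date": date(2013, 3, 3), "event": "aggressive discount campaign"},
]

def explain_dips(sales, events, drop=0.25, window=1):
    """Flag days where sales fall sharply versus the previous day and a
    competitor event occurred within `window` days."""
    findings = []
    days = sorted(sales)
    for prev, cur in zip(days, days[1:]):
        if sales[cur] < sales[prev] * (1 - drop):
            for e in events:
                if abs((e["date"] - cur).days) <= window:
                    findings.append((cur, e["event"]))
    return findings

hits = explain_dips(sales, competitor_events)
```

The dip on 3 March is correlated with the competitor's discount campaign; attributes like product, brand, and region would be added as further join keys in a real system.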
Conclusions and Challenges
Hopefully, via the three broad use cases
presented above, we have made a case
for why social intelligence is becoming
increasingly important for enterprise
business intelligence. Further, as we have explained, it is the fusion of external and internal intelligence that enables value to be extracted from external data, especially from social media.
Large enterprises across industries,
from retail to financial services to
manufacturing, are today actively
exploring this new and exciting arena.
At the same time, the veracity of inputs
received from social media remains
a matter of concern. There are also
challenges in measuring the return on
investment (i.e., ROI) from socio-business
intelligence exercises: statistically sound techniques for measuring ROI, even for simple matters such as advertising campaigns, are not yet in widespread use.
Both these questions, i.e., efficiently
establishing the veracity of social media
inputs, as well as properly measuring ROI
from socio-business intelligence, pose
challenges for future research.
References
[1] Kemal A Delic and Umeshwar Dayal.
The rise of the intelligent enterprise.
Ubiquity, 2002 (December): 6, 2002.
[2] Lipika Dey and Sk Mirajul Haque.
Opinion mining from noisy text data.
International journal on document
analysis and recognition, 12 (3): 205–
226, 2009.
[3] Lipika Dey, Sk Mirajul Haque, Arpit
Khurdiya, and Gautam Shroff. Acquiring
competitive intelligence from social
media. In Proceedings of the 2011 Joint
Workshop on Multilingual OCR and
Analytics for Noisy Unstructured Text
Data, page 3. ACM, 2011.
[4] Saurabh Sharma, Puneet Agarwal,
Rajgopal Vaithiyanathan, and Gautam
Shroff. Catching the long-tail: Extracting
local news events from twitter. In
International Conference on Weblogs
and Social Media, June 2012.
[5] Gautam Shroff, Puneet Agarwal, and
Lipika Dey. Enterprise information fusion
for real-time business intelligence. In
Proceedings of the 14th International
Conference, Fusion '11, 2011.
Dr. Gautam Shroff is Vice President & Chief Scientist, Tata Consultancy Services and heads TCS’ Innovation
Lab in Delhi, India. As a member of TCS’ Corporate Technology Council, he is involved with recommending
directions to existing R&D efforts, spawning new R&D efforts, sponsoring external research, and proliferating
the resulting technology and intellectual property across TCS’ businesses.
Prior to joining TCS in 1998, Dr. Shroff had been on the faculty of the California Institute of Technology,
Pasadena, USA and thereafter of the Department of Computer Science and Engineering at Indian Institute
of Technology, Delhi, India. He has also held visiting positions at NASA Ames Research Center in Mountain
View, CA, and at Argonne National Labs in Chicago. Dr. Shroff completed his B.Tech (Electrical Engineering)
from the Indian Institute of Technology, Kanpur, India, in 1985 and Ph.D. (Computer Science) from RPI, NY,
USA, in 1990. Dr. Shroff taught a course “Web Intelligence and Big Data” on Coursera as well as at IIT and IIIT
and the URL is https://www.coursera.org/course/bigdata .
Dr. Lipika Dey is a Senior Consultant and Principal Scientist at Tata Consultancy Services, India. She heads the
Web Intelligence and Text Mining research group at Innovation Labs, Delhi. Lipika's research interests are in the
areas of content analytics from social media, social network analytics, predictive modeling, sentiment analysis
and opinion mining, and semantic search of enterprise content. Her focus is on seamless integration of social
intelligence and business intelligence. She is keenly interested in developing analytical frameworks for integrated
analysis of unstructured and structured data. Lipika has a Ph.D. in Computer Science and Engineering from IIT
Kharagpur. Prior to joining the industry in 2007, she was a faculty member in the Department of Mathematics at
Indian Institute of Technology, Delhi, from 1995 to 2006. She has several publications in International journals and
refereed conference proceedings. She is a Program Committee member for various International Conferences.
Puneet Agarwal is a Scientist at Tata Consultancy Services Ltd. He heads Data Analytics and Information
Fusion research group at TCS Innovation Labs, Delhi. Puneet’s research interests include applied research in
data-mining on time-series and graph data with a focus on distributed parallel processing.
He has been working in TCS for about 15 years and before joining TCS Innovation Labs in 2004, he worked as
a technical architect in various mission critical projects in the Logistics and Shipping domain. He has published many research papers in various international conferences on Information Fusion, Software Agility, Collaboration, and
Model Driven Interpretation. Puneet has a B.E. Degree in Mechanical Engg from NIT Trichy.
About the Authors
of the Industrial Internet, powered by
Big Data.
About the Author
Dr. Milind Bhandarkar was a founding member of the team at Yahoo that took Apache Hadoop from a 20-node prototype to a datacenter-scale production system, and has been contributing to and working with Hadoop since version 0.1. He started the Yahoo Grid solutions team focused on training, consulting, and supporting hundreds of new migrants to Hadoop. Parallel programming languages and paradigms have been his area of focus for over 20 years,
and a topic of his PhD dissertation at University of Illinois at Urbana-Champaign. He worked at the Centre for
Development of Advanced Computing (C-DAC), National Center for Supercomputing Applications (NCSA), Center
for Simulation of Advanced Rockets, Siebel Systems, Pathscale Inc. (acquired by QLogic), Yahoo and LinkedIn.
Currently, he is the Chief Scientist at Greenplum, a division of EMC2.
Continued from Page 8
Research Front
Big Data Enabled Digital Oil Field
Pramod Taneja* and Prashant Wate**
*Principal Architect, iGATE
**Technical Specialist, iGATE
Introduction
Oil and Gas Industry Overview
Oil and Gas (O&G) companies – both operator companies as well as oil field service providers – now have more upstream data (structured, unstructured, as well as real-time) than ever before on which to base their operational decisions relating to exploration, drilling, or production. For this reason, effective, productive, and on-demand data insight is critical for decision making within the organization.
However, the vision of an integrated Exploration and Production (E&P) data management platform still remains a challenge, as extracting business-critical intelligence and insights from large volumes of data in a complex environment of diverse legacy systems and fragmented, decentralized solutions is a daunting task.
Some typical challenges for E&P data management are:
• Upstream-focused applications operate at a functional level, so substantial time is spent in data collection and running reports for a given asset level, i.e. for a single well or aggregated wells in a given location
• Many applications are still not based on PPDM (Professional Petroleum Data Management Association) standards, which makes reports and KPIs inaccurate much of the time
• It is difficult to derive insights from unstructured data lying in multiple applications
• It is difficult to run predictive analytics, as data is spread across multiple systems with limited integrity and reference to master-level data.
Need for a Digital Oil Field Enterprise Platform
An Integrated Digital Oil Field Enterprise Platform integrates E&P data from different project phases — Seismic, Drilling, Well, and Production — into a single consolidated platform. Data indexing, storage, cleansing, clustering, migration, standardization, and analysis can be performed on multiple data sources (structured, unstructured, or real-time) within an integrated platform, providing detailed insights at the well level at any instant. This solution should leverage cloud infrastructure, an integrated workflow, an accelerated digitized solution framework based on MURA (Microsoft Upstream Reference Architecture), hybrid data models, integration with multiple data sources, and a host of accelerators for data migration.
Big Data in the Digital Oil Field
In the Oil and Gas industry, traditional data warehousing solutions face challenges in capturing, storing, and churning through massive volumes of data. O&G companies can adopt Big Data solutions to maximize their business potential by deriving a holistic view of voluminous sensor device data and gathering valuable insights that complement existing traditional BI offerings.
This consolidated Big Data enabled E&P data management platform should be designed to fit within an O&G operator's or oil field service provider's technology infrastructure and provide an on-demand, single view of a well at any instant and from anywhere. The platform should provide ready-to-use accelerators as well as interfaces with third-party Geologist and Geophysicist (G&G) product suites and with customer data sources — be they structured, unstructured, or real-time.
Big Data Solutions for the Digital Oil Field
O&G companies can adopt Hadoop-enabled Big Data solutions for creating an Integrated Digital Oil Field strategy. Hadoop is a widely accepted, cost-effective open-source solution which provides map-reduce functionality for processing
Fig. 1: Functional overview – big data enabled digital oil field. Source: iGATE Research
extremely large data sets on commodity servers. Hadoop-based solutions allow storing, processing, and analyzing these humongous logs on a near real-time basis. The crux of the solution involves processing raw data in its native format to create aggregated views, along with an understanding of its relationships and patterns, and thereby deriving meaningful insights for quick decision-making related to the reservoir and optimized data exploitation using the map-reduce paradigm. Hive, a scalable data warehouse solution available on Hadoop, has seen wide adoption; its query mechanism, HiveQL, is similar in syntax to SQL. Hive internally generates map-reduce jobs that are executed on Hadoop clusters, allowing users to overcome the learning curve associated with writing map-reduce code.
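As a sketch of the map-reduce functionality mentioned above, the following pure-Python mapper/reducer pair computes the average sensor reading per well (the log format and well IDs are invented for illustration; on a cluster these functions would typically run as Hadoop Streaming jobs reading stdin and writing stdout):

```python
from collections import defaultdict

def mapper(lines):
    # Map: emit (well_id, reading) pairs from raw log lines of the
    # hypothetical form "well_id,timestamp,reading"
    for line in lines:
        well_id, _ts, reading = line.strip().split(",")
        yield well_id, float(reading)

def reducer(pairs):
    # Reduce: Hadoop delivers pairs grouped by key after the shuffle
    # phase; here a dict simulates that grouping locally
    totals = defaultdict(lambda: [0.0, 0])
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {k: s / n for k, (s, n) in totals.items()}

logs = ["w1,2013-04-01T00:00,10.0",
        "w1,2013-04-01T01:00,20.0",
        "w2,2013-04-01T00:00,5.0"]
averages = reducer(mapper(logs))
```

In practice the same aggregation would be expressed in one HiveQL statement, with Hive generating the equivalent map-reduce job.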
Big Data Enabled Digital Oil Field Solution
Functional Overview
As part of a faster transition strategy towards the Digital Oil Field for integration, processing, and analytics, Hadoop clusters can be leveraged along with data migration and business intelligence accelerators. An architecture view is depicted below, presenting a unified view of the different oil wells: managing semi-structured and unstructured data from the drilling and production phases, leveraging modeling and simulation techniques, and providing ready-to-deploy KPI configurations.
The high-level functional overview is stated below:
1. Fetch the customer's E&P data from various wells in different phases. Each oil well generates around 10 TB of data, and in a reservoir there are multiple wells to be drilled and explored.
2. This massive volume of multi-structured logs is stored on a Hadoop infrastructure.
3. Data processing is the most important step, preparing the data in a manner that minimizes time-consuming activity. Hadoop is an ideal solution for converting the unstructured data to a structured format, performing cleansing, and storing the result in unified Hive structures.
4. The Digital Oil Field provides PPDM-compliant models for ease of integration and portability, post standardization, into a Digitized Platform.
5. The quality of data on the Digitized Platform is verified by the stakeholders.
6. Complex analytics and event processing are performed to find drilling patterns and infer the lithology content based on various parameters of the oil well logs, in turn providing adaptors to third-party interfaces for data interpretation.
7. Integration with BI services and Enterprise Application Integration (EAI) services exposes data to third-party agents for advanced analysis and dashboard generation.
Technical Process Flow
There are five stages depicted in the diagram below, stating the lifecycle of the data process in a big data platform.
1. Data Capture Stage – Fetch the customer's E&P data from various wells in different phases. Apache Flume can be used for capturing oil well log data embedded in standard formats such as Log ASCII Standard (LAS) files, seismic data files, etc. Sqoop can be used for capturing structured production data from an RDBMS.
2. Data Storage & Preparation Stage – The massive volume of relevant data is then stored on the Hadoop distributed file system. Hadoop streams can be used for invoking the data preparation, massaging, and cleansing scripts. The data preparation jobs can convert the unstructured data to a structured format, perform cleansing, and store the result in unified Hive structures. Data governance can be carried out by tools such as Oozie and ZooKeeper.
Fig. 2: Data process flow in Big Data Enabled Digital Oil Field analytics. Source: iGATE Research
Continued on Page 36
Big Data[1] refers to large volumes of data from various sources such as social media, the web, genomics, cameras, medical records, aerial sensing technologies, and information-sensing mobile devices. Big Data includes structured, semi-structured, and unstructured data. This unstructured data contains useful information which can be mined. Since the 1980s, the world's per-capita capacity to store information has roughly doubled every 40 months. Statistics for 2012 say that 2.5 quintillion (2.5 × 10^18) bytes of data were created per day. Moreover, the digital streams that individuals create are growing rapidly; for example, most people now carry their own cameras. Big Data is characterized by high volume, high velocity, and high variety of information, which requires advanced methods of processing; conventional software tools are not capable of handling it, so Big Data requires a different architecture. The following types of data are referred to as big data.
• Social data – customer feedback forms for Customer Relationship Management (CRM) on social media sites such as Twitter, Facebook, LinkedIn, etc.
• Machine-generated data – sensor readings, satellite communication
• Traditional enterprise data – employee information, business product, purchase, sales, customer information, and ledger information.
Traits of Big Data
Big Data differs from other data in five dimensions[3]: volume, velocity, variety, value, and complexity.
• Volume: Machine-generated data is produced in very large volumes.
• Velocity: Any single social media website may not generate massive data, but the rate at which data is acquired from the social web is increasing rapidly.
• Variety: Different types of data are generated as new sensors and new services appear.
• Value: Even unstructured data holds valuable information, so extracting such information from large volumes of data is a significant concern.
• Complexity: Connections and correlations describe the relationships among the data.

Challenges
Storing and maintaining Big Data is a challenging task. The following challenges need to be faced by enterprises or media when handling Big Data:
• Capture
• Curation
• Storage
• Search
• Sharing
• Analysis
• Visualization
Why Big Data?
Big Data is absolutely essential for the following intents:
• To spot business trends
• To determine the quality of research
• To prevent diseases
• To link legal citations
• To combat crime
• To build real-time roadway information systems, where data is created of the order of exabytes (10^18 bytes).
Where is it used?
Areas or fields where big data is created:
• Medicine, meteorology, connectomics, genomics, complex physics simulation, biological and environmental research, and aerial sensing systems (remote sensing technologies).
• Big science, RFID, sensor networks.
• The Astrometry.net project watches the Astrometry group on Flickr for new photos of the night sky. It analyzes each image and identifies celestial bodies such as stars, galaxies, etc.
MapReduce
MapReduce[2] is a programming model, published by Google, for handling complex combinations of several tasks. It is a batch query processor: it can run an ad hoc query against a whole dataset and return the results in a reasonable time, which is transformative. It has two steps. 1. Map: queries are divided into sub-queries, allocated to several nodes in the distributed system, and processed in parallel. 2. Reduce: the results are assembled and delivered.
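The two steps can be simulated in a few lines of single-process code (word counting is the standard textbook example; on a real cluster the map and reduce functions run on many nodes in parallel):

```python
from itertools import groupby
from operator import itemgetter

def map_phase(records):
    # Map: each input record is processed independently,
    # emitting intermediate (key, value) pairs
    for record in records:
        for word in record.split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: pairs are sorted and grouped by key (the "shuffle"),
    # then each group is assembled into a final result
    results = {}
    sorted_pairs = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        results[key] = sum(count for _, count in group)
    return results

counts = reduce_phase(map_phase(["big data", "big deal"]))
```
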
Database
Oracle has introduced a total solution for enterprises that require Big Data. Oracle Big Data Appliance[3] integrates optimized hardware and extensive software with Oracle Database 11g to address Big Data challenges.

Example Application: Patient Health Information System on the Cloud
A real-time application of Big Data can be a Patient Health Information System on the cloud[4]. The Patient Health Record (PHR) is an emerging technique to store patient health information and exchange the data over the network; it is stored in the cloud so that the data log can be accessed anytime and anywhere. To assure security, individuals are given their own logins, and data stored in the cloud is encrypted. A PHR includes a variety of data: structured, unstructured, and semi-structured.
• In the PHR, we propose using machine-generated data — the fingerprint, iris pattern, or face of the patient — to index the patient's entire data log. A fingerprint sensor, iris scanner, or face recognizer captures the patient's identification, and the fingerprint, iris pattern, or facial features act as a key for retrieving the data saved in the database.
• Traditional enterprise data includes the entire PHR right from birth, with the details of the doctors, their prescriptions, and all records.
• The PHR also serves as social data: it can be put online for online consultation and medicine purchase, and even lab test reports can be uploaded online. This avoids the patient waiting in the lab for the result report; a copy of the report is also sent to the consulting doctor for further review. Individual logins are provided for the patient, doctor, pathologist, pharmacist, etc., which makes the system more secure.
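The biometric-keyed storage described in the bullets above can be sketched as follows (a toy model: exact hashing stands in for real biometric matching, which must tolerate noisy captures, and the record fields are invented for illustration; real systems would also encrypt the stored records):

```python
import hashlib

phr_store = {}  # record store keyed by a hash of the biometric template

def biometric_key(template_bytes):
    # A digest of the captured template (fingerprint/iris/face)
    # acts as the lookup key into the PHR store
    return hashlib.sha256(template_bytes).hexdigest()

def save_record(template, record):
    phr_store[biometric_key(template)] = record

def fetch_record(template):
    # Returns None when the captured template matches no stored key
    return phr_store.get(biometric_key(template))

save_record(b"iris-template-001",
            {"patient": "P001", "allergies": ["penicillin"]})
rec = fetch_record(b"iris-template-001")
```
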
Conclusion
Omar Tawakol, CEO of BlueKai, recently wrote an article in which he mentioned that "more data usually beats better algorithms". Such data, however, is very hard to store and analyze. Big Data is used for finding customer behavior, identifying market trends, increasing innovation, retaining customers, and performing operations efficiently. The flood of data coming from many sources must be handled using non-traditional database tools. Doing so provides more market value and a systematic approach for the coming generation.
References
[1] Wikipedia, the free encyclopedia.
[2] White, Tom, Hadoop: The Definitive Guide. O'Reilly Media, ISBN 978-1-4493-3877-0.
[3] "Oracle: Big Data for the Enterprise", An Oracle white paper, Jan 2012.
[4] M. Li, S. Yu, K. Ren, and W. Lou, "Scalable and Secure Sharing of Personal Health Records in Cloud Computing using Attribute-based Encryption", Sep 2010, pp. 89-106.
Big Data
A Kavitha*, S Suseela**, and G Kapilya***
*AP/CSE, Periyar Maniammai University, Vallam, Thanjavur
**AP/CSE, Periyar Maniammai University, Vallam, Thanjavur
***AP/CSE, Periyar Maniammai University, Vallam, Thanjavur
Article
Abstract: In-memory analytics has brought a paradigm shift in storage and data management, facilitating instant reporting for decision making. The revolution in advanced memory technology, the drastic decline in the price of memory, and the evolution of multi-core processors have changed the orientation of business intelligence querying and fetching of data, along with the way data is stored and transferred. This article discusses the adoption of in-memory technology, its architecture, and a few enabling software products for in-memory computing. It also discusses the scope and benefits of the in-memory approach.
Introduction
In-memory analytics queries data from random access memory (RAM) instead of physical disk. Detailed data can be loaded from multiple sources directly into system memory, which supports faster business decisions: performance improves because both storage and operations take place in memory.
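This difference can be sketched with a toy example. The table, data, and use of SQLite below are illustrative assumptions, not part of any vendor's product: an SQLite database opened with the special ":memory:" path lives entirely in RAM, and aggregates are computed on demand rather than read from precalculated cubes.

```python
import sqlite3

# Open a database that lives entirely in RAM -- no disk file is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 50.0)],
)

# Aggregations are computed on demand; no precalculated cube is needed.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('North', 170.0), ('South', 80.0)]
```

The same on-demand style of querying is what in-memory analytics platforms apply at much larger scale.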
The in-memory approach represents a shift in storage philosophy: summarized data is held in RAM. In a database, by contrast, data is stored in tables connected through relationships, interconnections, and other database objects, and traditional business intelligence platforms store data in multidimensional cubes. In-memory analytics avoids creating multidimensional cubes[1]. According to Gartner, the capabilities of in-memory analytics include faster query and calculation, largely removing the need to build aggregated and precalculated cubes. Some myths and facts about the in-memory approach are described below (Fig. 1).
Architecture of In-Memory Analytics
There are different architectural approaches to in-memory computing: the associative model, in-memory OLAP, the Excel in-memory add-in, the in-memory accelerator, and in-memory visual analytics.

In the associative model, associations are based on the relationships between data elements. When a user clicks on an item within a data set, the selected items turn green and all associated values turn white. This lets users quickly query all relevant data without depending on a predefined hierarchy or query path, rather than navigating the analytical data in a predetermined way.
The Excel in-memory add-in allows users to load large volumes of data into Microsoft Excel in memory. Once the data is within Excel, relationships between the data sets are inferred automatically, permitting on-the-fly sorting, filtering, slicing, and dicing of huge data sets, which overcomes some of Excel's technical data volume limits. This approach improves self-service capabilities, as it reduces dependency on IT and lessens the need for business users to become experts in multidimensional structures and techniques. The add-in depends on a particular back-end data management and portal platform, which supports data sharing and collaboration.

The in-memory OLAP approach works by loading data into memory, which allows complicated calculations and queries to be computed on demand with fast response times. If write-back is supported, users can change assumptions on the fly to explore what-if scenarios, a specific requirement in forecasting and financial planning.

In-memory visual analytics combines an in-memory database with a visual data exploration tool, letting users quickly query data and build reports within a visual, interactive analytics environment.

The in-memory accelerator approach improves query performance within an existing business environment. The accelerator works by loading data into memory and leveraging pre-built indexes to support very fast query response times[2].

There are many categories of software that enable in-memory computing, such as in-memory analytics and event processing, in-memory messaging, in-memory application platforms, and in-memory data management. These bring new business opportunities as well as IT challenges. A comparison of traditional data analytics technology and in-memory data analytics technology is given below (Fig. 2).
In-Memory Application Platforms
SAP HANA (High Performance Analytics Appliance)
According to Gartner's study on the information explosion, enterprise data will grow 650% over five years, with 80% of that data unstructured. The data explosion therefore spans traditional sources, such as point-of-sale and shipment-tracking records, along with non-traditional sources such as emails, web content, and documents[8]. In-memory technology allows huge quantities of data to be processed in real time to provide instant results for decision making. SAP HANA provides a foundation for building new-generation applications that process massive quantities of real-time data from virtually any source in the server's main memory, providing instant results from analysis and transactions. According to SAP, HANA will drastically improve query performance and speed up data loads, and its reduced data layers will simplify system administration and reduce operating costs. The platform is built specifically to support both operational and analytical workloads, and it also helps SAP partners and customers develop their own applications.
Oracle Exalytics
Organizations need analytics to gain insight and make correct decisions. However, due to budgetary pressure, time
Adoption of In-Memory Analytics
Article: Jyotiranjan Hota, Associate Professor, School of Management, Krishna Campus, KIIT University, Patia, Bhubaneswar
Fig. 1: Myths and facts of the in-memory approach

Myth: "In-memory is just hype spread by SAP." Fact: All major software vendors deliver in-memory technology.
Myth: "It's new and unproven technology." Fact: It has been around since the 1990s.
Myth: "It is solely about running analytics faster." Fact: It is widely used for transaction and event processing as well.
Myth: "It's incremental and non-disruptive." Fact: In-memory is predicted to have an industry impact comparable to the web and cloud.

Source: Gartner[6]
sensitivity, and extensive requirements, IT firms usually struggle to produce actionable analytics. The task becomes even more complex when multiple hardware, networking, software, and storage vendors are involved, and expensive resources are wasted integrating software and hardware components into a complete analytics solution.
Oracle Exalytics is an optimized system that addresses these business issues without compromising speed, simplicity, manageability, or intelligence. It is built with market-leading BI software, in-memory database technology, and industry-standard hardware. Oracle claims that Exalytics uses a new interface designed to produce quick results regardless of query, location, or device type[3].
Scope and Benefits of In-Memory Analytics
In-memory analytics should be used to improve query performance and report processing, so implementing it requires reorienting the existing report infrastructure. The demands that users and applications place on computing resources should first be understood through data profiling, and it is important to identify the users and applications that need ad hoc, non-routine reports. This effort is accomplished through data usage models, which reduce the cost and effort of introducing in-memory analytics in a firm. Typically, operational and standard reporting accounts for roughly 70-80% of an organization's needs and non-routine, ad hoc reporting for about 20-30%, though these shares should be confirmed by careful analysis. In some firms, consolidated reporting and forecasting is required frequently, within 10 to 12 weeks; in-memory analytics is well suited to these circumstances[1].

In the current context, memory and processor prices have dropped drastically, while multi-core processors have evolved. In-memory computing makes it possible to perform storage and operations in main memory, avoiding the hard disk. Two factors make in-memory computing compelling: first, the volume of information is growing at an alarming rate; second, forward-looking organizations now need immediate responses to make quick decisions. Traditionally, annual and quarterly review reports were the basis for decision making, but analysis of past data using data warehousing technology is slowly giving way to the event-driven systems, supported by in-memory computing, that enable decision making in real time. Here data is brought closer to the central processing unit: compared with disk-based access, in-memory querying can be orders of magnitude faster. The adoption of 64-bit architectures facilitates the in-memory approach by increasing the addressable memory space.

Midsized companies usually lack the technical expertise and resources to construct data warehouses and carry out performance-tuning tasks. The in-memory approach is less cumbersome for them: it is easy to set up and administer, IT infrastructure is no longer a barrier to optimizing business performance, and skill gaps in constructing and consuming analytical applications shrink because OLAP cubes stored in back-end databases are avoided. Total cost of ownership is reduced and business performance is enhanced.
In-Memory Analytics Vendors
The vendors who provide solutions include hardware vendors, server makers, and software application providers (Table 1).
Research Challenges
In-memory analytics faces a few research issues and challenges. It must contend with technology incumbency, particularly in companies that depend heavily on traditional OLAP technology. Many organizations have entire departments built around particular business intelligence platforms, and any disruptive technology that may significantly reduce, or even eliminate, these empires will be met with resistance and skepticism. Enterprise reporting has emerged as a mission-critical function, and once the user community depends on a large number of reports, one should hesitate before introducing too much change too fast[1]. According to an IDC report (2011), traditional methods of building and developing computing infrastructure for analytics applications are not suitable when migration to in-memory analytics applications is needed.
Conclusion and the Road Ahead
According to one study, around 30% of firms will have one or more critical applications running on an in-memory database within the next five years, and by 2014, 30% of analytics applications will use in-memory functions to add scale and computational speed[9]. Companies are seeking to be responsive, insight-driven, and more real-time, and in-memory computing is well placed to dominate this marketplace going forward[4]. An IDC (2011) report states that in-memory technology will help public- and private-sector firms reach the highest level of competitiveness through "freedom of access": in-memory technology platforms promote innovation, reduce IT compromises, and give the right people access to information at the right time[5]. A Market Research Media report states that the high-performance computing market is expected to reach $200 billion by 2020, and in-memory computing is one of its fastest-growing components. According to Gartner, the in-memory analytics approach is now
[Fig. 2 source: SAP HANA Overview and Roadmaps, SAP Community Network[7]]
being used in a variety of applications such as risk management, inventory forecasting, profitability analysis, fraud detection, algorithmic trading, and sales incentive promotion management. Refactoring existing applications to exploit in-memory approaches can yield better scalability and transactional application performance, lower-latency application messaging, drastically faster batch execution, and faster response times in analytical applications. In 2012 and 2013, the cost and availability of memory-intensive hardware platforms reached tipping points, so the in-memory approach is entering the mainstream.
References
[1] Baldwin, T. (2008). "Don't Fold Your Cubes Just Yet… But In-Memory Analytics Is Beginning to Mature". http://www.tagonline.org/articles.php?id=298, accessed 24 October 2012.
[2] Schwenk, H. (2010). "Accelerating Time-to-Insight for Midsize Companies Using In-Memory Analytics". http://www2.technologyevaluation.com/ppc/request/whitepapers/accelerating-timetoinsight-for-midsize-companies-using-inmemory-analytics.asp, accessed 1 February 2013.
[3] Gligor, G., Teodoru, S. (2011). "Oracle Exalytics: Engineered for Speed-of-Thought Analytics". Database Systems Journal, 2(4), 3-8.
[4] Kajeepeta, S. (2012). "The Ins and Outs of In-Memory Analytics". http://www.informationweek.com/software/business-intelligence/the-ins-and-outs-of-in-memory-analytics/240007541, accessed 29 September 2012.
[5] Morriss, H. D. (2011). "Faster, Higher, Stronger: In-Memory Computing Disruption and What SAP HANA Means for Your Organization". download.sap.com, accessed 15 March 2013.
[6] Pezzini, M. (2011). "The Next Generation Architecture: In-Memory Computing". http://www.slideshare.net/SAP_Nederland/the-next-generation-architecture-inmemory-computing-massimo-pezzini, accessed 25 March 2013.
[7] Groth, H. (2012). "SAP HANA: Strategy and Roadmap". http://www.saptour.ch/landingpagesfr/Manager/uploads/23/32.pdf, accessed 25 March 2013.
[8] Chumsantivut, B. (2011). "SAP HANA: Power of In-Memory Computing". http://www.cisco.com/web/TH/assets/docs/seminar/SAP_HANA_Power_of_In_Memory_Computing.pdf, accessed 25 March 2013.
[9] Dale, S. (2011). "Getting Real-Time Results with In-Memory Technology". http://enterpriseinnovation.net/article/getting-real-time-results-memory-technology, accessed 25 March 2013.
Table 1: In-memory analytics vendors

Vendor | Website | Hardware/Analytics Solution
Dell | http://www.dell.com | VIS Next Generation Datacenter Platform; PowerEdge R910
Fujitsu | http://www.fujitsu.com | PRIMEQUEST 1800 Series; FCRAM; FRAM
Fusion-io | http://www.fusionio.com | Fusion-io Flash Memory
HP | http://www.hp.com | HP Converged Infrastructure Platform; ProLiant DL900 Series
IBM | http://www.ibm.com | IBM solidDB
NEC | http://www.nec.com | Express5800/A1080a
Oracle | http://www.oracle.com | Exalytics In-Memory Machine
SAP | http://www.sap.com | SAP High Speed Analytical Appliance (HANA); SAP In-Memory Computing
Kognitio | http://www.kognitio.com | Kognitio WX2 Analytics Database; WX2 Data Warehouse Appliance; DaaS Cloud
Advizor Solutions | http://www.advizorsolutions.com | Advizor 5.8; Advizor Analyst
Microsoft | http://www.microsoft.com | PowerPivot
QlikTech | http://www.qlikview.com | QlikView
Quantrix | http://www.quantrix.com | DataNav
Quartet FS | http://www.quartetfs.com | ActivePivot
SAS | http://www.sas.com | In-Memory Analytics
Sybase | http://www.sybase.com | Adaptive Server Enterprise (ASE)
TIBCO | http://www.tibco.com | Spotfire

Source: Aberdeen Group, December 2011
About the Author
Prof. Hota is an Associate Professor and Area Chairperson of the Information Systems wing at KIIT School of Management, Bhubaneswar. He holds a BE in Computer Science from NIT Rourkela and a PGDBM from Xavier Institute of Management, Bhubaneswar. He teaches data mining, business intelligence, analytics, and core SAP ECC 6.0 modules (SD, MM, FI-CO, HCM, and PP) in view and configuration modes. His research interests lie in banking technologies, analytics, and ERP. He has published several papers in journals and conferences in India and abroad. The author can be reached at [email protected].
CRISC: Gold Winner for Best Professional Certification Program

Upcoming exam date: 8 June 2013
Final registration deadline: 12 April 2013

For more information and to register for an ISACA exam, visit isaca.org/mycrisc-CSI.
Risks and opportunities are two sides of
the same coin. For example, the Internet
has opened up many opportunities for us,
but at the same time exposed us to many
new risks. While we wish to avail the
opportunities, we also want to manage the
risks. It is not possible to avoid the risks
totally, so we should try and mitigate the
impact of risks. A cybersecurity professional
has to be an expert in risk management.
Whether the cybersecurity professional
is in the role of a planner, defender or
investigator, the balancing act of managing
the risks and selection, deployment, and
testing of information system controls will
remain the primary concern.
A risk management professional is expected to be well versed in the five practice areas of risk and information systems controls stated below:
1. Risk Identification, Assessment, and Evaluation
2. Risk Response
3. Risk Monitoring
4. Information Systems Control Design and Implementation
5. Information Systems Control Monitoring and Maintenance
Demonstrated experience and competency in these practice areas, along with successfully passing the examination, lead to ISACA's Certified in Risk and Information Systems Control (CRISC) certification, which was recently named the Best Professional Certification Program at the 2013 SC Awards from SC Magazine.
Becoming a CRISC helped me in securing my current job, as it is an independent confirmation to my employer that, beyond information systems audit and security management work, I also have extensive IT risk and control management experience. Security management frameworks within the Australian public sector have progressed from prescribed controls to a risk-based approach. This change demanded suitably experienced, skilled, and certified professionals to bring the new frameworks to life in order to effectively manage risks and pursue opportunities.

Having the CRISC certification was an important differentiator, particularly for an employer with a mature register of recognized certifications used for hiring and engaging professional consultants. ISACA's certifications are highly regarded on this list because of their well-balanced business and technical aspects, as well as defined minimum knowledge and experience requirements for certification holders. The CRISC designation certainly helped me get shortlisted for the position, whilst my knowledge of ISACA's frameworks helped me win my current position.

~ Bob Smart, CISA, CISM, CRISC, Manager of ICT Security, Government of South Australia
Risk Identification, Assessment, and Evaluation
Information systems are built with people, processes, and technology in mind, and involve designing architectures and applications to handle information. Each of these can introduce risks, apart from the risks due to natural factors and physical threats. Assessing the risk level associated with each threat involves anticipating risk probability and impact, threats and vulnerabilities, and the effectiveness of current and planned controls.
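The assessment step above is often quantified with a simple probability-times-impact score. The 1-5 scales, band thresholds, and threat entries in this sketch are illustrative assumptions, not taken from any particular standard:

```python
# Illustrative risk scoring: probability and impact each rated on a 1-5 scale.
# The band thresholds below are invented for the example, not from a standard.
def risk_score(probability: int, impact: int) -> int:
    """Combine likelihood and consequence into a single score."""
    return probability * impact

def risk_band(score: int) -> str:
    """Map a score onto a qualitative band for reporting."""
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"

# Hypothetical threats: (probability, impact)
threats = {
    "phishing": (4, 3),           # likely, moderate impact
    "data centre flood": (1, 5),  # rare, severe impact
}
for name, (p, i) in threats.items():
    print(name, "->", risk_band(risk_score(p, i)))
```

In practice the scales, bands, and control-effectiveness adjustments would come from the organization's chosen framework rather than be hard-coded like this.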
A risk professional needs good knowledge of the various standards, frameworks, and practices related to risk identification, assessment, and evaluation, along with familiarity with quantitative and qualitative methods for risk identification, classification, assessment, and evaluation. Since risks impact the business, knowledge of business goals, objectives, and organization structure is also essential; this leads to building the business information criteria. Various risk scenarios involving threats and vulnerabilities related to business processes will have to be built. Knowledge areas include information security architecture, platforms, networks, applications, databases, and operating systems. There should be a good understanding of the threats and vulnerabilities related to third-party management, data management, the system development life cycle, project and program management, business continuity, disaster recovery management, management of IT operations, and emerging technologies. In addition, knowledge of current and forthcoming laws, regulations, and standards is necessary. The risk professional should also be familiar with the principle of risk ownership, risk scenario development, risk awareness training tools and techniques, and the elements of a risk register.
Risk Response
The probability of a risk occurring may be difficult to predict, but one can never assume it to be zero: sooner or later a hypothetical risk scenario may actually materialize, so it is desirable to be adequately prepared with a risk response. The purpose of defining a risk response is to ensure that the residual risk is within the limits of the enterprise's risk appetite and tolerance.
A risk professional has to clearly define the risk response options. Every response must be evaluated with a cost/benefit analysis and weighed against a number of parameters, including the cost of the response needed to bring the risk within the tolerance level, the importance of the risk, the capability to implement the response, and the response's effectiveness and efficiency. The available risk response options are to (a) avoid the risk, (b) reduce/mitigate the risk, (c) share or transfer the risk, and lastly (d) accept the risk. Deciding on an appropriate option may not be easy: although there are major risks in electronic commerce transactions, avoiding e-commerce is not really an option today, so a thorough cost/benefit analysis has to be done among the remaining three options before taking a decision. This requires building a business case to justify the selected response. The risk professional will have to be very familiar with organizational risk management policies, portfolios, investment and value management, exception management, the parameters for risk response selection, risk appetite and tolerance, and the concept of residual risk.
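The cost/benefit weighing among the mitigate, transfer, and accept options can be sketched numerically. The expected-annual-loss model and all figures below are invented for illustration, not a prescribed method:

```python
# Illustrative comparison of risk response options by expected annual cost.
# All figures are made up for the example.
annual_loss_expectancy = 100_000  # expected yearly loss if the risk is accepted

options = {
    "accept":   {"cost": 0,      "residual_fraction": 1.0},
    "mitigate": {"cost": 30_000, "residual_fraction": 0.2},
    "transfer": {"cost": 45_000, "residual_fraction": 0.1},  # e.g. insurance
}

def total_cost(option: str) -> float:
    """Response cost plus the expected residual loss it leaves behind."""
    o = options[option]
    return o["cost"] + o["residual_fraction"] * annual_loss_expectancy

best = min(options, key=total_cost)
print(best, total_cost(best))  # mitigate 50000.0
```

A real business case would also weigh the qualitative parameters listed above (importance of the risk, capability to implement, effectiveness and efficiency of the response), not just the monetary total.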
Risk Monitoring
A pre-planned risk response is essential to deal with a risk effectively and efficiently. If there is a good risk monitoring
Five Key Knowledge Areas for Risk Managers
Article: Avinash Kadam [CISA, CISM, CGEIT, CRISC], Advisor to the ISACA India Task Force
process implemented to keep watch on various risks and sound an alarm as soon as some risk parameter crosses its threshold, it will definitely save much of the effort that would otherwise go into responding to the risk. Developing these risk indicators is a major challenge: there may be literally hundreds of risk indicators, such as logs, alarms, and reports. A risk professional has to work closely with senior management and business leaders to determine which risk indicators will be monitored on a regular basis and recognized as Key Risk Indicators (KRIs). KRIs should be selected based on the following factors:
• Reliability, i.e. they will sound an alarm every time, without fail
• Sensitivity, i.e. the alarm will be sounded only when a certain threshold is reached
• Impact, i.e. KRIs will be selected for areas with high business impact
• Effort, i.e. the preferred KRIs are those that are easier to measure
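Taken together, these criteria amount to alerting exactly when a measured indicator crosses an agreed threshold. A minimal sketch, assuming hypothetical indicator names and thresholds:

```python
# Hypothetical KRIs with agreed thresholds; alert when a reading crosses one.
kri_thresholds = {
    "failed_logins_per_hour": 50,
    "patch_backlog_days": 30,
}

def breached(readings: dict) -> list:
    """Return the KRIs whose current reading meets or exceeds its threshold."""
    return [k for k, v in readings.items()
            if v >= kri_thresholds.get(k, float("inf"))]

alerts = breached({"failed_logins_per_hour": 72, "patch_backlog_days": 12})
print(alerts)  # ['failed_logins_per_hour']
```

In a real deployment the readings would be fed from the monitoring sources described below (device logs, vendor updates, CERT alerts) rather than passed in by hand.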
The risk professional should be familiar with various risk monitoring sources. Information for risk monitoring can be obtained from suppliers or vendors of hardware, software, and applications in the form of updates, as well as from anti-malware vendors, device logs, CERT alerts, newspapers, blogs, and technical reports published by information security research organizations. This means the professional has to constantly update his or her knowledge.
Information Systems Control Design and Implementation
Controls are the policies, procedures, practices, and guidelines designed to provide reasonable assurance that business objectives are achieved and undesired events are prevented, or detected and corrected. Controls include technical controls, such as access control mechanisms, identification and authentication mechanisms, encryption methods, and intrusion detection software. Non-technical controls include security policies, operational procedures, and personnel, as well as physical and environmental security. A risk professional must know how to design and implement information system controls throughout the system development life cycle (SDLC) and project management. Typical SDLC phases include the feasibility study, requirements study, requirements definition, detailed design, programming, testing, installation, and post-implementation reviews. The business risk is the likelihood that the new system will not meet the users' business needs, requirements, and expectations. The project risk is that the activities to design and develop the system exceed the limits of the financial resources set aside for the project; as a result, the project may be completed late, if ever.
Information Systems Control Monitoring and Maintenance
Risk management relies on a monitoring process to ensure that IS controls remain effective and efficient over time. Monitoring requires the definition of meaningful performance indicators, systematic and timely reporting of performance, and prompt response to deviations. Monitoring makes sure that the right things are done, in line with the set business directions and corporate policies.

The risk professional should have good knowledge of enterprise security architecture, monitoring tools and techniques, and the various control objectives, activities, and metrics related to information security, data management, the SDLC, incident and problem management, IT operations, business continuity and disaster recovery, project and program management, and applicable laws and regulations.

Selecting appropriate tools requires good knowledge of tools for monitoring transaction data, conditions, changes, process integrity, and error management and reporting, and at times continuous monitoring.
It has been amazing to see the rapid rise in the number of IT professionals seeking the CRISC (Certified in Risk and Information Systems Control) certification. More than 16,000 professionals have earned the CRISC designation since the certification was introduced in 2010.

CRISC is highly desired because it is the only certification that positions IT professionals for future career growth by linking IT risk management to enterprise risk management. Professionals across a wide range of job functions, including IT, security, audit, and compliance, have earned the CRISC designation since it was established in April 2010. While CRISC is designed for risk professionals with at least three years of experience, more than 1,200 CIOs, CISOs, and chief compliance, risk, and privacy officers have also chosen to pursue the designation.

CRISC is the result of significant market demand for a credential that recognizes experienced risk and control professionals. This demand will only accelerate as stakeholders demand better corporate governance, better business performance, and more secure infrastructures.

If you have real-world IT controls and risk experience, I strongly encourage you to pursue the CRISC certification. Becoming CRISC certified provides an additional level of assurance that you have the necessary skills and experience to get the job done. It also enters you into a group of professionals with common interests and abilities. Networking with my fellow CRISCs and ISACA members has been an extremely rewarding experience. I encourage you to take advantage of the opportunities certification provides.

–Shawna Flanders, CISA, CISM, CRISC, Process Engineer at PSCU, USA
In India, we are making rapid progress in the adoption of information technology. Organizations are well aware that they should not take undue risks to achieve their ambitious goals, and should build appropriate IT controls to manage the risks. This has prompted rapid acceptance of the CRISC certification in India, and created new job and promotion opportunities for CRISC-certified professionals.

Avinash Kadam, CISA, CISM, CGEIT, CRISC, CISSP, CSSLP, GSEC, GCIH, CBCP, MBCI, PMP, CCSK, is an advisor to ISACA's India Task Force. ISACA is a global association for IT assurance, security, risk, and governance professionals with more than 100,000 members worldwide and more than 6,000 in India. The nonprofit, independent ISACA developed the COBIT framework for the governance and management of IT, and offers the CISA, CISM, CGEIT, and CRISC certifications. Opinions expressed here are Kadam's personal opinions and do not necessarily reflect the views of ISACA (www.isaca.org). He can be contacted via e-mail at [email protected].
Practitioner Workbench
Dr. Nibaran Das, Asst. Professor, Dept. of Computer Science & Engineering, Jadavpur University, Kolkata, and Editor, Computer Jagat, a Bengali monthly magazine
Programming.Tips () »
Python: A Programming Language for Everyone
Python is a programming language popular with the scientific research community, but thanks to its easy coding style, good documentation, and wide support from a large open source community, it has become a programming language for everyone. It supports not only functional and object-oriented programming styles but also other paradigms such as imperative programming, logic programming, and design by contract. It is popular for developing different kinds of software, is enriched with a large number of plug-ins and libraries, and is widely used as a scripting language. Given a Python interpreter, it runs on every popular operating system; even the Android platform supports Python through a scripting layer. In brief, Python is a language loved by novices as well as experts. Some important and popular Python packages are given below.
Package Name | Domain
NumPy / SciPy | Scientific calculation
NLTK | Natural language processing
BioPython | Biological computation
matplotlib | Plotting figures
pyqtgraph | GUI library using Qt and NumPy
Astropy | Astronomy
PyCV | Computer vision
Python Imaging Library (PIL) | Image processing
Cython | Translating Python code into equivalent C code
It is worth mentioning that the above chart does not cover all Python packages and libraries; it shows only a very small subset of the available toolkits. Some special features that make the language so robust are given below:
• Declaring multiple variables of different types simultaneously in a single line:

>>> x, y, z = 'A', 2, 5.6
>>> x
'A'
>>> y
2
>>> z
5.6
• It is possible to return multiple values from a function, and the function's documentation can be given with a docstring:

# Function definition
def remove_duplicated(arg_referents, arg_conditions):
    """This function removes duplicates from the two lists
    arg_referents and arg_conditions."""
    arg_referents = list(set(arg_referents))
    arg_conditions = list(set(arg_conditions))
    return arg_referents, arg_conditions

# Function call
sentence_referents, sentence_conditions = remove_duplicated(
    sentence_referents, sentence_conditions)
• The "range" function is also very useful: it creates a list of numbers in a specified range.

range([start,] stop [, step]) -> list of integers

When step is given, it specifies the increment (or decrement).

>>> range(7)
[0, 1, 2, 3, 4, 5, 6]
>>> range(7, 12)
[7, 8, 9, 10, 11]
>>> range(0, 12, 2)
[0, 2, 4, 6, 8, 10]

The "range" function is heavily used in for loops. For example, to print every third element of a list:

for i in range(0, len(array), 3):
    print array[i]
• The well-known constructor for Python classes:

def __init__(self):  # constructor
    self.items = []
• True division, so that 1/2 == 0.5 while 1//2 == 0, requires a from __future__ import division statement in Python 2.x.

• Python supports complex numbers, for example 3+4j, 3.0+4.0j, 2J (the literal must end in j or J).
• Strings are repeated with the * sign:
>>> 'xyz' * 3
'xyzxyzxyz'
• Python also supports negative indexes. For example, stringExample[-1] extracts the first element of stringExample counting from the end, i.e., its last element.
• Apart from strings, Python supports lists, denoted by [], which can hold numbers, strings, nested sublists, or nothing. List indexing works just like string indexing. For example:
List1 = [0, 1, 2, 3]
List2 = ['zero', 'one']
List3 = [0, 1, [2, 3], 'three', ['four', 'one']]
List4 = []
It is possible to append, extend, insert, and remove data using the following syntaxes:
a. list.append(x)  b. list.extend(L)  c. list.insert(i, x)  d. list.remove(x)
It is possible to count the number of elements, sort a list, and reverse a list using the following syntaxes:
a. list.count(x)  b. list.sort()  c. list.reverse()
About the Author

Nibaran Das received his B.Tech degree in Computer Science and Technology from Kalyani Govt. Engineering College
under Kalyani University, in 2003. He received his Masters in Computer Science and Engineering (M.C.S.E.) and
Ph. D. degree from Jadavpur University, in 2005, and 2012 respectively. He joined J.U. as a lecturer in 2006. His areas
of current research interest are OCR of handwritten text, Bengali fonts, and image processing. He has been an editor
of Bengali monthly magazine “Computer Jagat” since 2005.
CSI Communications | April 2013 | 27
Programming.Learn("R") »
R- StaR of Statisticians http://www.r-project.org/
If your requirement is to manipulate, model, or visualize a huge set of statistical data, an arguably best choice of programming environment is R!

R is a descendant of Scheme and S-Plus; the S language, a functional programming language developed at Bell Labs by John Chambers and his team, is the most widely used one in the area of statistical computing. R was initially developed in 1993 by Robert Gentleman and Ross Ihaka at the Statistics Department of the University of Auckland, New Zealand, and later progressed through collaborative effort, with contributions from all over the world. R got its name from the first letters of its initial developers' first names.
R is an interactive, object-oriented language, designed by statisticians for the purpose of statistical computing. It is free and open source, and is available under the GNU General Public License version 2. It runs on most UNIX platforms, Windows, and MacOS. Different versions of R are available at the Comprehensive R Archive Network (CRAN), which is a repository for R code and documentation. CRAN also provides source code, new features, and bug fixes. Currently there are 4415 packages for R.
R has now become a favorite language for data analysis and statistical computing in both corporates and academia. R is also being used for handling and analyzing large datasets obtained from supercomputing applications, and for creating high-quality visualizations via different types of plots, such as line plots, contour plots, and interactive 3D plots.
R has an intuitive and easy syntax, even for a beginner who has basic programming experience. Like other programming languages, R has the standard control structures, and it can be accessed from languages such as Python, Perl, and Ruby. Commercial software such as Mathematica, MATLAB, and Oracle also support R quite well.
R offers a command line interface (CLI), which is best suited for programmers. However, for a beginner to start with, GUI-based code editors and IDEs are useful. They provide functionality like syntax highlighting, code completion, and automatic code indentation, which eases the job. RStudio, Vim-R-Tmux, Notepad++, RKWard, and R Commander are some of them.
Let us have a look at the R interface and programming environment. When you launch R, the R console appears within the R GUI, showing some basic information about R. The console presents a prompt with a '>' symbol, which shows that the interpreter is ready and waiting for your R commands. We can input commands (referred to as expressions) in R through the R console.
The R programming language has become an important platform for statisticians to work with. R being an open source platform, there are any number of freely available packages with which you can not only do serious statistical analysis, but also use R as an analysis platform for problems in fields such as Bioinformatics, Financial Market Analysis, Pharmacokinetics, and Natural Language Processing. We will explore more about R programming in the next issue. Have a great time ahead.
Practitioner Workbench
Umesh P and Silpa Bhaskaran, Department of Computational Biology and Bioinformatics, University of Kerala
Photos: Robert Gentleman and Ross Ihaka
Interface of R programming language
Abstract: Enterprises are increasingly
facing a challenge in making sense from
the deluge of data they are receiving from
multiple data sources. Due to increasing
connectedness of people, applications,
and machines, the amount, diversity, and
speed of data is very large. Analyzing this
data with minimal delay is an increasingly
challenging task.
In this document, we would like to
present the suitability of Data Stream
processing technology, to build solutions
that can enable enterprises to address
the velocity dimension of Big Data,
and provide real time visibility into
their operations. Using this technology,
enterprises can convert high velocity data
into meaningful business insights, and
take advantage of favorable conditions
and/or take corrective actions in case of
adverse conditions. We also share our
experience of applying this technology to
a few business domains.
Keywords: Big Data; High Velocity; Real-time; Operational Insights; Data Stream Processing; Stream Computing; In-Memory Computing; Data Stream Management Systems
Introduction

One of the computing areas attracting a lot of attention is 'Big Data'. What exactly is Big Data? As per Wikipedia, 'Big Data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, analysis, and visualization'. As more and more activities are carried out on the Internet by people and enterprises, the amount of data generated for these activities rises each day. As per one statistic, as of 2012, 2.5 quintillion (2.5 x 10^18) bytes of data were created every day. While on a personal level users face challenges with large volumes of data, the challenge for enterprises is monumental. Enterprises are struggling to derive meaningful value from the humongous mountain of data they collect on a regular basis.
One of the areas in the enterprise that generates high-velocity data, and is very important but receives less focus, is the operational setup of the enterprise. Conventional business intelligence solutions primarily deal with data from the past, as these solutions cannot process data 'instantaneously' or 'on arrival', due to technical limitations.
While volume is the most commonly discussed dimension of Big Data, it is not the only one. Typical Big Data solutions try to address and analyze data along three dimensions, namely volume (amount of data), velocity (speed of data coming in and going out), and variety (range of data types and sources). While Big Data is an active field of research and exploration, with innovative tools and techniques being actively created, it may not be possible to create one effective solution that addresses all three dimensions. Most Big Data solutions have to find an acceptable tradeoff among the dimensions, with the most common pair being volume and velocity.
Business Drivers

For enterprises that deal with Big Data, it is important to extract meaningful insights, so that enterprise processes can be tuned accordingly. In addition to being able to process a large volume of data from multiple sources and in multiple formats, most organizations also need a timely, continuous, and instantaneous view of their operations.
While enterprises to date have tried to achieve this goal using data warehouse and Business Intelligence (BI) solutions, they are realizing that these solutions cannot scale to provide the real-time insights needed in an increasingly competitive business environment. For real-time insights, enterprises need to undergo a paradigm shift and move away from the 'store and process' methodology followed by data warehouse and BI solutions.

As per an Aberdeen Group survey, companies that have implemented systems providing real-time visibility into their operations have seen noticeably higher performance across several key operational metrics, as depicted in Fig. 1.
Genesis of Data Stream Processing

To manage increasing data volumes and the increased urgency around actionable information, enterprises are seeking the aid of processes and tools that can provide operational insights with minimum latency.
On the technology front, while disk capacities have grown rapidly, disk speeds have not kept pace. In comparison to disks, memory capacities have grown exponentially and have been adequately supported by a significant drop in price. With large amounts of memory available at relatively low cost, software architectures that store and process data in memory have evolved. Such 'In-Memory' architectures offer an order-of-magnitude performance improvement over traditional architectures. Newer infrastructure, terabyte memories, and multi-core parallel computing are opening up avenues for processing massive amounts of data within a short period of time and at much lower cost.
CIO Perspective

Deriving Operational Insights from High Velocity Data
Bipin Patwardhan and Sanghamitra Mitra, Research & Innovation, iGATE, Mumbai, India

Fig. 1: Benefits of real-time visibility of operational metrics

As a combined effect of business needs and technology trends, a variety of technologies are available for deriving business insight from raw operational data. Such technologies range from simple operational dashboards based on conventional Database Management Systems to advanced techniques like In-Memory Real-time Data Analytics.
Introduction to Data Stream Processing

To bridge the gap between operational and analytical systems, the concept of Data Stream Processing has been developed, where transient data is processed as soon as it arrives (even before it is persisted). The premise of the concept is to process and analyze all data on-the-fly. In the following sections, we provide details on how on-the-fly data analysis can be performed using suitable technologies.
Technology Overview

The concept of Data Stream Processing is built on the 'Single Instruction Multiple Data (SIMD)' parallel programming design pattern. In particular, this paradigm utilizes the concept of 'Pipeline Parallel Processing'.
To help understand the concept, refer to Fig. 2. In most cases, enterprises continuously receive data that needs to be processed. This data can be viewed as a 'Stream of Data over Time'. For a real-time response, this stream of data needs to be processed, refined, and acted upon in real-time. The concept of Data Stream Processing enables real-time processing of such continuous data streams.

The concept differs from conventional data processing frameworks and solutions in several ways:
• Data streams are usually unbounded.
• No assumption can be made on data arrival order.
• Size and time constraints make it difficult to store and process data stream elements after their arrival.
Key Characteristics

Data stream processing engines have the following characteristics:
• Data Stream Management – The engine needs the capability to process a continuous flow of data. A stream is a sequence of time-stamped data records called 'tuples'. A tuple is similar to a row in a database table. As illustrated in Fig. 3, the tuples in a stream have a schema, which defines each field's name, position, data type, and size. A few examples of data streams include financial trading data and sensor data.
• Window Processing – A 'Window' is one of the key concepts of data stream processing. It enables limiting the portion of an input stream from which elements can be selected. While processing a stream of data, it is necessary to define the portions of the input flows that have to be considered while executing the processing rules. Each window contains, at any given time, a subset of the tuples streaming by. Defining such windows enables a query to identify the finite set of tuples (from an otherwise infinite stream) over which the processing rules are applied. Fig. 4 describes how a window is applied to a data stream for, say, a 'Withdrawal' transaction. The size of the window is five. The engine stores all arriving withdrawal data into the window; when the window is full, the oldest data tuple is pushed out.
• Domain Specific Language – To help ease the task of describing how incoming data is to be processed, data stream engines typically provide an expressive language - a Domain Specific Language (DSL) - that allows enterprises to define complex relationships among the data items. As depicted in Fig. 5, data processing rules (queries) can be defined using the DSL. Once defined, the rules act as continuous queries that are deployed once and continuously process the data items streaming by, producing results. Most DSLs are defined to be similar to SQL so as to leverage developer familiarity, thereby increasing productivity and reducing maintenance efforts.
Implementing Data Stream Processing for Multiple Domains

Data Stream Processing enables enterprises working across various business domains to derive operational insights, in real-time, from continuously flowing data, and to make suitable decisions as soon as the data is received. Such real-time data analysis can take place in tandem with business processing, so that problems can be spotted and dealt with sooner than is possible with conventional approaches.
Fig. 2: Data stream processing
Fig. 3: Streaming data
Fig. 4: Window processing
Fig. 5: DSL for continuous query
To help enterprises process and analyze high velocity data, we have defined the 'iGATE Analysis and Intelligence in Real-time™' (AIR) approach. The approach is built on the concepts of Data Stream Processing and Complex Event Processing, and has evolved from our experience of implementations across domains.

In the following sections, we share our experiences in creating solutions for domains like the Manufacturing Industry, Smart Grid, and Oil and Natural Gas.
Real-time Manufacturing Intelligence

This solution was developed to enable manufacturing organizations to have continuous visibility into their production processes - spread globally - allowing business operations to optimize product performance, yields, and utilization. Aggregating and processing huge volumes and/or high-speed distributed data to provide continuous intelligence is a challenging task. To do so, one needs to go beyond the visualization and analysis capability provided by stand-alone Human Machine Interface (HMI) software.
As depicted in Fig. 6, the solution made use of on-the-fly processing of high velocity data by processing it in-memory, before the data was persisted using a suitable downstream application. In-memory processing allowed the application to process data in seconds, providing a real-time view into plant performance. Dynamically changing operational parameters were displayed using real-time dashboards. A dashboard for monitoring plant performance is depicted in Fig. 7.

Some of the features of the solution are:
• Continuous queries to analyze and transform streams of data in real-time.
• Integration of business intelligence across different applications in real-time.
• A scalable platform capable of processing vast volumes of real-time data.
Real-time Energy Monitor

Households and enterprises consume energy on a daily basis for various activities. Today, technology allows multiple energy readings to be captured and transmitted at set intervals, as per business need. This continuous stream of data can be used to provide consumers with an accurate and up-to-date picture of their energy consumption.

As depicted in Fig. 8, the high velocity data was processed using a data stream processing solution. The solution processes energy consumption data and presents it using real-time dashboards. The solution also provides real-time notifications, and allows aggregated data to be persisted to a data warehouse for further analysis.

As illustrated in Fig. 9, the consumption monitor displays real-time consumption data by itself or juxtaposed with historic data.
Some of the key features of the solution are:
• Meaningful insight into real-time data for improved customer experience.
• Improved performance through in-memory processing.
• A hybrid approach: in-memory processing of high-volume real-time data to provide immediate, useful feedback, with aggregated data persisted in a warehouse for future analysis.
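The hybrid approach in the last bullet can be sketched as: give immediate in-memory feedback for each reading, while persisting only per-hour aggregates for later warehouse analysis. This is an illustrative sketch; the function name, field names, and units are our own assumptions, not details of the solution described above:

```python
from collections import defaultdict

def process_readings(readings):
    """Consume (hour, watt_hours) meter readings. Returns the immediate
    per-reading feedback (running total for that hour, computed in
    memory) and the hourly aggregates that would be persisted to a
    data warehouse."""
    live_feedback = []                # immediate results, never persisted
    hourly_totals = defaultdict(int)  # aggregates destined for the warehouse
    for hour, watt_hours in readings:
        hourly_totals[hour] += watt_hours
        live_feedback.append((hour, hourly_totals[hour]))
    return live_feedback, dict(hourly_totals)

readings = [(9, 1200), (9, 800), (10, 2000)]
live, warehouse = process_readings(readings)
# warehouse == {9: 2000, 10: 2000}
```

Only the small `warehouse` dictionary would be written to storage; the raw stream is processed and discarded in memory, which is the source of the latency and storage-cost benefits claimed for the approach.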
Fig. 6: Real-time Manufacturing Intelligence
Fig. 7: Real-time plant-performance monitor

Real-time Drilling Operations Monitor

In the field of Petroleum and Natural Gas (PNG), huge amounts of data and an immense installed base of disparate systems make it difficult for upstream engineers and operators to collaborate effectively. Moreover, the upstream oil and gas industry is challenged to provide engineers and operators with interfaces that support optimal short-term and long-term decision making. Its highly trained professionals need integrated views, oftentimes related to a particular process or production event. Despite the complexity, engineers and operators must quickly identify significant process events, assess their relevant parameters, and take suitable actions.
Upstream professionals find operational insights useful in exploration, drilling and completion, production, and other upstream processing scenarios. Multiple scenarios, including improving well performance and generating alerts for disaster management, can leverage an in-memory architecture based on data stream processing.

As shown in Fig. 11, the real-time dashboard related to well drilling allows the upstream professional to benchmark and monitor crucial operational metrics with real-time data.
Some of the key features of the solution are:
• Multiple real-time data streams aggregated on-the-fly.
• Enrichment of real-time data with data from the static operational data store.
Benefits

The iGATE AIR approach helps generate results as soon as the input data becomes available, delivering business intelligence continuously and in real-time, which can be consumed by applications, services, and users throughout the organization.

The benefits of the approach can be broadly classified into two categories, namely business benefits and technology benefits, some of which are given below:
Business Benefits
• Smarter integration of real-time business intelligence across the organization.
• Improved business agility, business innovation, and business continuity.
• Reduction in development time and cost by using standards-based SQL.
• Reduced storage cost, as data is not required to be persisted before it can be analyzed.
Technology Benefits
• Allows real-time data collection, transformation, aggregation, and reporting.
• Lower latency, as data can be analyzed in-memory before it is persisted to the storage medium.
• Data independence, that is, logical/physical separation, leading to loosely coupled applications that need less tuning and are more flexible.
• Can be integrated with multiple stream processing solutions like StreamBase, SQLstream, and Esper, to name a few.
Fig. 8: Smart grid operational insights
Fig. 9: Continuous consumption monitor
Fig. 10: Real-time drilling monitor
Conclusion

The concepts of data stream processing leverage performance improvements on the hardware side, particularly developments in RAM technology, allowing for in-memory processing. In turn, in-memory processing allows enterprises to perform data analysis in real-time, enabling real-time visibility into day-to-day operations.

It is important to note that Big Data is not only about processing, cleaning, or churning high-velocity, high-volume data. It is about deriving relevant, meaningful insights from that data. Solutions built using Data Stream Processing concepts can be used effectively to analyze high velocity operational data and extract business insights in real time.
As described in this document, we have used the concepts of data stream processing to build solutions for operational insights across multiple domains. Operational insights can be leveraged to improve the speed of visibility into key operational metrics, thereby helping improve business agility. The iGATE Analysis and Intelligence in Real-time approach aims to augment existing data warehouse and business intelligence solutions to enable real-time data processing, rather than seeking to replace them. This allows enterprises to keep using their existing solutions effectively while adding the capability of real-time data processing, helping them respond to events much faster and in an effective manner.
Fig. 11: Real-time dashboard for drilling parameters
About the Authors
Bipin Patwardhan is a Technical Architect with more than 15 years of experience in the IT industry. At iGATE, he is leading the High Performance Computing CoE. The CoE builds capabilities around technologies that help deliver high performance for enterprise applications. Presently, the CoE covers areas like Parallel Programming, GPU Programming, Grid Computing, Real-time Analysis, and In-Memory Computing.

Sanghamitra Mitra is a Technical Architect from the R&I (Research & Innovation) group at iGATE. She has around 15 years of experience, and has worked on various projects related to Enterprise Applications as well as Enterprise Application Integration, with international clients across multiple domains. Currently, her primary focus is on hands-on evaluation of emerging technologies in the High Performance Computing area, including Parallel Computing and Real-time Intelligence. She is responsible for institutionalizing these technologies across the organization and leveraging them to build innovative solutions to solve business problems.
Primer on CSI History – An Appeal

As CSI will be turning 50 in 2015, a series of Golden Jubilee events/activities is proposed to be conducted well in advance, over the coming two years. There is a proposal to have a curtain raiser in Jun/Jul 2013 at Delhi.

In this context, a primer on "CSI History" is proposed to be brought out, highlighting the significant/major milestones of CSI from its inception. To facilitate the compilation/preparation of this primer, inputs are requested from all fellows and members who have been associated with CSI for long years, in various capacities, at the chapter, regional, national, and international levels. Kindly provide all information relevant to this primer as write-ups, documents, publications, photographs, and in all other forms, at your earliest convenience.

While soft copies of the inputs can be sent to me by email at [email protected], the hard copies (documents/publications/photos) may please be sent to:

Director - Education
Computer Society of India
Education Directorate
CIT Campus, IV Cross Road
Taramani, Chennai – 600 113
Ph: +91-44-2254 1102 / 1103 / 2874

After use, they will be returned to you if desired. We request your immediate support in this activity, as the lead time for the primer preparation is quite short.
With the cyber attacks on DRDO and the kind of Internet blackout that India faced in March 2013, I thought of penning this article to make my readers aware of the APT scenario in general, and of where India should be poised.
What is APT?

A common definition of APT is hard to come by, as many vendors, consortiums, and groups put their own twist on the terminology. A commonly accepted explanation refers to APT as "an advanced and normally clandestine means to gain continual, persistent intelligence on an individual, or group of individuals such as a foreign nation state government." APT is sometimes used to refer to sophisticated hacking attacks and the groups behind them. What does that mean to the Indian citizen, though?
Simply put, APT is reconnaissance and investigation of your network, in addition to your infrastructure and your information assets. It is a reference to a sophisticated and dedicated attacker, or attackers, willing to "lay low" and go very slow in exchange for gathering data about you, your organization, and how you operate. For the IT professional managing an environment, adjusting your current infrastructure and preparing for this threat will require a different mindset and some analytical assessment.
According to CERT-In (the Indian Computer Emergency Response Team), an estimated 14,392 websites in the country were hacked in 2012 up to October. It is generally accepted that social media usage boosts the likelihood of a successful APT attempt.
Attackers behind APTs are interested in a broad range of information, and are stealing everything from military defense plans, as in the recent DRDO attacks, to schematics for toys or automobile designs. Their motivation can be financial gain, a competitor's advantage in the marketplace, sabotage of a rival nation's essential infrastructure, or even just revenge.
APTs start by identifying vulnerabilities that are unique to your employees and infrastructure. And since they are precisely targeted, surreptitious, and leverage advanced malware and zero-day (unknown) exploits, they can bypass traditional network and host-based security defenses.
Cybercriminals are increasing their use of Web-based malware, and are employing malicious uniform resource locators (URLs) for only brief periods of time. They use "throw-away" domain names in just a handful of spear-phishing emails before moving on, enabling them to fly under the radar of URL blacklists and reputation analysis technology. Additionally, as industry reports point out, they are blending URLs and attachments in email-based attacks, and reproducing and morphing malware in an automated fashion.
These techniques render defenses that rely on known patterns of data almost entirely ineffective. We are only in April, and 2013 is already the 'year of the hack'. Even more disturbing is the fact that many attacks are being carried out by state-sponsored actors from countries like China, Korea, and Iran.
It is imperative to know when a targeted attack is underway, and how to gather evidence to understand its purpose and origin. Leveraging multiple security solutions that use different methods to detect malicious activity, for both internal and external threats, can enhance your capabilities. Security technology has been evolving, and manufacturers are developing ingenious ways of not only detecting, but stopping, zero-day attacks.
Many advanced security monitoring tools work well in conjunction with more traditional defenses, such as firewalls, IDPS, antivirus, gateways, and security information and event management (SIEM) systems. With the right tools in place, and staff and operational support behind them, you can gain the situational awareness and counter-intelligence needed to identify an attack, and potentially block or quarantine threats. Even if an attack is successful, the insight gained into how it occurred, what information may have been compromised, and the relative effect of your defenses can be invaluable to recovery efforts, and will help you continuously improve your security posture.
India's cyber law - Section 66F (Cyber Terrorism) of the IT Act, 2000 - has enough teeth to fight such criminals once they are found. India needs to implement a comprehensive knowledge management system that can be used by its defense forces along with DRDO, NTRO, and CERT-In. Such knowledge management on APT can help us weed out successful cyber attacks and increase our cyber attack preparedness. As a country, India needs a holistic approach and view to counter the APT threat; we have cyber security heroes in pockets, but for APT we need a team of heroes, guided by systems and processes, to channel their fight.
Reference
[1] http://focus.forsythe.com/articles/268/Combating-Advanced-Persistent-Threats
Information Security »
Advanced Persistent Threats (APT) and India

Security Corner
Adv. Prashant Mali [BSc (Physics), MSc (Comp Science), LLB]
Cyber Security & Cyber Law Expert
Email: [email protected]
IT Act 2000 »
Prof. I T Law Demystifies Technology Law Issues – Issue No. 13

Security Corner
Mr. Subramaniam Vutha, Advocate
Email: [email protected]
How Lawyers and [IT] Technologists should collaborate:
IT Person: Prof. I. T. Law, it is a pleasure
to meet you again. And I look forward
to an enlightening discussion with you
on Technology law issues that people
like me should know.
Prof. IT Law: I enjoy talking to you too.
What topic should we discuss today?
IT Person: I am intrigued by your
concept of collaboration between
technologists and lawyers. How
should we go about that?
Prof. IT Law: Yes, such technology +
law collaboration is a fundamental
need of the day. Especially in the
Internet era, things move too fast and
changes occur so rapidly that there
is greater need than ever for such
collaboration.
IT Person: Why is it so important in
our industry to do such “parallel” work
with a lawyer?
Prof. IT Law: In older and more mature
sectors, the business executive
or technologist could take an
appointment with a lawyer, and then
brief him or her before taking a legal
opinion that influences his business
plan. In the case of the IT industry
things move so fast that it is better to
“thread” the legal precautions into the
business plan itself. If not, the velocity
of business will be such that great
harm can be done before you know
it, and legal damages could be quite
daunting.
IT Person: Please give me an example.
Prof. IT Law: Well, take the case of a
company that wants to develop a new
website. The architects and designers
of the website will talk to the
business people, and understand the
functionality that is needed. At that
stage itself it is important to involve
the lawyer too.
IT Person: Why is that needed? What
will the lawyer do to help at that
stage?
Prof. IT Law: As the business
executives explain the functions of the
website, the design of the website will
be determined. The lawyer will also
understand those intended functions
of the website, and will advise on
the types of agreements and policies
needed.
IT Person: For example?
Prof. IT Law: If the website is only
for information, you will need terms
and conditions for access by visitors.
And if you are gathering any personal
information about visitors to the site,
then you will need to have privacy
policy terms also.
IT Person: And what if we have more
functions on the website?
Prof. IT Law: Do you mean functions
like buying and selling products and
services on the website? In that case
you will also need to have terms
and conditions of sale by you. Those
should be binding on the people
who use your site to buy goods and
services.
IT Person: This is interesting. But what
will the lawyer do at this early stage?
Prof. IT Law: Your website terms and
conditions are like a contract. They
bind both you and the visitors to your
website. So, it is important to know
the legal implications and to have
your lawyer understand the business
intentions and plans so he can help
you draft these properly.
IT Person: Usually the practice is to
see what others are doing and to adapt
their policies. Is that not sufficient?
Prof. IT Law: Each website presents a
different set of issues and challenges.
So, it is not sensible to simply follow what
others do without applying your mind
to the specific needs of your website.
Remember, these policies and terms
and conditions become crucial when
you face a legal challenge. At that
stage, it will be too late to undo
something that is not appropriate.
IT Person: For example?
Prof. IT Law: Just consider a situation
where you did not provide in your
privacy policy that the personal
information gathered could be
shared with prospective buyers of
your business. In that case, you will
not be able to share such data with
a prospective buyer. And that buyer
may be interested mainly because of
the personal information database
you have built up.
IT Person: Oh, I see! That is interesting.
I now understand how technologists
and lawyers should collaborate on
key events and plans. Thank you very
much. Talking to you is always so
stimulating.
Prof. IT Law: Your interest in the
subject makes it interesting for me
too. Our discussions are themselves
collaborations between a technologist
and a lawyer!
CSI Communications | April 2013 | 35
Claude Elwood Shannon, the father of the information age, was born on April 30, 1916. Shannon's influence and inspiration underpin everyday activities, ranging from cell phones to the popular social networking sites. The commonly used internet terms save, store, upload, and download arguably symbolize the revolutionary concepts laid out by this most influential mind of the 20th century. It is really interesting to explore the path Shannon travelled that finally led him to his landmark ideas.
Shannon was known to be very timid, and led a normal childhood. His mother was the principal of the local Gaylord High School and his father was a businessman. Shannon was very much inspired by his grandfather, who was a farmer and an inventor. Shannon's childhood hero was Thomas Edison, with whom he shared a common ancestor. As a young boy, Shannon was a big fan of Edgar Allan Poe's "The Gold Bug"[1], a detective fiction that centers on the search for buried treasure by deciphering a secret message. Shannon was drawn to solving cryptograms right from his school days.
Moreover, Shannon was very curious to learn how things worked. He wanted to know how various devices like model planes operated. He was even adventurous: as a young boy, he tried to contact a friend half a mile away by hooking a telegraph machine to a barbed wire fence. Apart from this, he had a passion for Dixieland music, and kept a good collection of musical instruments.
In 1936, Shannon graduated from the University of Michigan with degrees in both electrical engineering and mathematics. His interest in Boolean logic started at Michigan. Later, Shannon joined MIT, and his acquaintance with Vannevar Bush, dean of MIT's school of engineering, changed his whole life. His mentor was very influential in recognizing his milestone work on switching theory, widely regarded as one of the best master's theses ever produced. He was awarded the Alfred Noble Prize in 1940 for this novel contribution to switching theory, which later became the foundation of modern digital systems. On the advice of his mentor, Shannon did his PhD work in the area of genetics.
In 1942, Shannon joined Bell Telephone Laboratories for full-time research. During the Second World War, he worked on anti-aircraft fire-control devices and on cryptography. Significantly, his encryption work, built into a complex scrambling machine, was used by Franklin Roosevelt and Winston Churchill to protect their transatlantic communications during the war. It was at Bell Labs that Shannon met his wife Betty, a trained cryptographer. The ambience of Bell Labs and its relaxed atmosphere helped Shannon integrate all his views, culminating in his famous 1948 landmark paper "A Mathematical Theory of Communication". Shannon's revelation about the concept of information made it possible to envisage many developments in the field of communication. Information theory is the brainchild of this great American genius. Inspired by Hartley's paper, Shannon set out to quantify the mysterious concept of information. The fresh insight he proposed was the realization that the quantity of information has nothing to do with its meaning in common parlance. It was Shannon's absolutely incredible thinking that related surprise and information[2].
In the information era, the bit is the fundamental atom of information. It was Shannon who first used the word "bit", as per the suggestion of J.W. Tukey, as a contraction of the two words "binary digit"[3]. The supremacy of the bit lies in its versatility: the bit forms the language of any communication system, irrespective of whether the message is text, audio, or images. All messages are translated into two states, "OFF (0)" and "ON (1)". Shannon even established the limit at which a message can be transmitted from one end to another through a channel without loss of information. The abstract concept of information proposed by Shannon forms the foundation of all technological advancements in the field of data storage and transmission systems.
Shannon's interdisciplinary approach created a revolutionary change in the field of digital communication. He had astonishingly diverse interests in fields such as switching, cryptography, computing, artificial intelligence, and games, and his novel contributions helped shape the modern digital world. More than that, Shannon was an enthusiastic juggler and an amazing unicyclist; he loved designing devices out of curiosity, enjoyed playing chess, and was, most importantly, a delightful poet and musician. With his amazing mathematical foundation, Shannon laid down the golden rules of modern information theory. Let me close my tribute by quoting an extract of Shannon's masterpiece poem, published by John Horgan in Scientific American[4].
A Rubric on Rubik Cubics
Strange imports come from Hungary: Count Dracula, and ZsaZsa G.,
Now Erno Rubik’s Magic Cube; For PhD or country rube.
This fiendish clever engineer; Entrapped the music of the sphere.
It’s sphere on sphere in all 3D—A kinematic symphony!
Ta! Ra! Ra! Boom De Ay!
IT.Yesterday()
Biji C L, Department of Computational Biology & Bioinformatics, University of Kerala
Birthday Tribute to the Most Influential Mind of the 20th Century: Claude Elwood Shannon
With theorems wrought by Conway’s eight;
‘Gainst programs writ by Thistlethwait.
Can multibillion-neuron brains; Beat multimegabit machines?
The thrust of this theistic schism—To ferret out God’s algorism!
With great Enthusiasm; Ta! Ra! Ra! Boom De Ay!
Men’s schemes gang aft agley; Let’s cube our life away!
References:
[1] Robert Price, "A Conversation with Claude Shannon: One Man's Approach to Problem Solving", Cryptologia, 9:2, 1985, pp. 167-175.
[2] Arun K S and Achuthsankar S Nair, "60 years since 'kpbw wcy xz' became more informative than 'I love you'", IEEE Potentials (ISSN: 0278-6648), Vol. 29, Issue 6, Nov.-Dec. 2010, pp. 16-19.
[3] C.E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, Vol. 27, pp. 379-423, 623-656, July and October 1948.
[4] http://blogs.scientificamerican.com/cross-check/2011/03/28/poetic-masterpiece-of-claude-shannon-father-of-information-theory-published-for-the-first-time/
About the Author
Biji C L completed her Master of Engineering from Anna University. She is currently pursuing her PhD in the Department of Computational Biology & Bioinformatics, University of Kerala.
taking into consideration the security features of Hadoop.
3. Data Aggregation Stage – This is the most important step; it aggregates the data from Hive/HBase or any other NoSQL database so that analysis can be carried out on the aggregates.
4. Data Analytics Stage – In this step, further analytics are performed to find drilling patterns and to infer the lithology content based on various parameters from the oil well logs. This step can be performed on a separate analytical database or an in-memory database residing outside the Hadoop ecosystem. Alternatively, tools such as 'R' integrated with Hive can be used for distributed analytics on Hadoop.
5. Data Visualization Stage – The output from the analytics can be integrated with DW/BI systems to generate dashboards and scorecards so that decision makers can visualize and interpret the data.
Conclusion
A Hadoop-based Big Data framework with Hive as a central data warehouse layer is widely used to create dynamic and unified structures. We can easily execute pre-defined or ad-hoc queries on Hive. This acts as a unified, integrated layer that can easily be augmented with the current BI stack. The salient features of a Hadoop/Hive based solution for Oil and Gas E&P data management are:
• Scalable architecture to analyze terabytes to petabytes of multi-structured well log data
• Massive parallel processing providing a unified view of the data from multiple wells during their lifecycle, be it at the planning, operations, or post-completion stage
• Integrated KPI framework for commercials, operations, health and safety execution, production, etc.
• An extendable PPDM-compliant data model and Energistics standards to manage the data with a partner ecosystem
• Comparative analytics and correlations with wells in similar geologic conditions to help decision making for drilling oil wells
• Oil and Gas domain ontology for easy interpretation of scientific terminology
Pramod Taneja, Principal Architect, iGATE - Pramod has 20+ years of IT experience and is currently leading the Big Data CoE of the Research & Innovation group, iGATE. He has served in various capacities managing and supporting business process-led technology as well as strategic management initiatives. Email - pramod.
Prashant Wate, Technical Specialist, iGATE - Prashant has more than 13 years of experience in IT and is
currently part of the Big Data CoE of Research & Innovation group, iGATE. He has extensive experience in
architecting and implementing database solutions including Big Data, data modeling, data migration and
database optimization. Email - [email protected]
About the Authors
Continued from Page 18
Solution to March 2013 crossword
Brain Teaser Dr. Debasish Jana
Editor, CSI Communications
Crossword » Test your Knowledge on Big Data
The solution to the crossword, with the name(s) of the first all-correct solution provider(s), will appear in the next issue. Send your answers to CSI Communications at email address [email protected] with subject: Crossword Solution - CSIC April 2013.
CLUES

ACROSS
2. Document-oriented databases using a key/value interface rather than SQL (5)
5. A space-efficient probabilistic data structure (5, 6)
8. Unit of measurement for data volume (9)
9. Markup language (3)
11. A data flow language and execution framework for parallel computation (3)
12. Structure of data organization (6)
14. A distributed columnar database (5)
17. One quintillion bytes (7)
19. Type of database system that can make deductions (9)
25. Discovery of meaningful patterns in data (9)
26. A massive volume of both structured and unstructured data (7)
27. Type of database designed to handle workloads whose state is constantly changing (8)
28. Different types of data (7)
29. An ordered list of elements (5)
30. The digit one followed by one hundred zeroes (6)
31. An open-source system for processing real-time data streams (5)
32. A paradigm for development of distributed computing applications (6,5)

DOWN
1. One thousand terabytes (8)
3. Required for data persistence (7)
4. An open-source software framework supporting data-intensive distributed applications (6)
6. Rate at which data is acquired (8)
7. An open-source database (7)
10. A programming model to process large volumes of data (9)
13. Method for an integrated knowledge environment (4)
15. Type of database optimized to store and query data related to objects in space (7)
16. Type of database with built-in time aspects (8)
18. Technique to clean up noisy data to make it usable (11)
20. Size of data expressed as (6)
21. A very large number (10)
22. Extremely large databases (4)
23. An in-memory computing platform designed for high-volume transactions (4)
24. An engine for query processing and data warehousing (4)
28. Database, in very large form (4)
Congratulations to Ananthi Nachimuthu (Dept. of Computer Technology, Dr. N.G.P. Arts and Science College, Coimbatore) and Madhu S. Nair (Dept. of Computer Science, University of Kerala, Thiruvananthapuram) for ALMOST ALL correct answers to the March 2013 crossword.
Did you know about the MapReduce algorithm for handling huge data?
MapReduce offers a programming paradigm for massive scalability when handling large data volumes. Users specify a map function that takes an input data set and transforms it into a set of intermediate key/value pairs, and a reduce function that merges the transformed values associated with the same key.
(Source: MapReduce: Simplified Data Processing on Large Clusters by Jeffrey Dean and Sanjay Ghemawat, URL: http://research.google.com/archive/mapreduce.html)
[Crossword grid]
[Solution grid to the March 2013 crossword]
Ask an Expert Dr. Debasish Jana
Editor, CSI Communications
Your Question, Our Answer“Do the right thing. It will gratify some people and astonish the rest.”
~ Mark Twain
C/C++: Catching array index out of bounds
From: Anonymous
In C/C++, there is apparently no array-index-out-of-bounds exception when dealing with raw arrays. Even the index operator of an STL vector cannot detect an index crossing the specified boundary limits. Code snippet follows.
#include <iostream>
#include <vector>
using namespace std;

const int SIZE = 2;

int main()
{
    int rawarray[SIZE];
    vector<int> v(SIZE);
    int i;
    for (i = 0; i <= SIZE+1; i++) {
        rawarray[i] = i;
        v[i] = i;
    }
    for (i = 0; i <= SIZE+1; i++) {
        cout << "rawarray[" << i << "] = " << rawarray[i] << endl;
        cout << "v[" << i << "] = " << v[i] << endl;
    }
    return 0;
}
When I compile and run this program, there is no compilation or runtime error. However, it is clear from the code above that each of rawarray and the STL vector object v is supposed to contain two elements as per the specified size, i.e. 2, but when I try to put something in as the third or even fourth element, it is allowed without any warning or error. Here's the output:
rawarray[0] = 0
v[0] = 0
rawarray[1] = 1
v[1] = 1
rawarray[2] = 2
v[2] = 2
rawarray[3] = 3
v[3] = 3
Any suggestions or workarounds?
A: In C/C++, there is no boundary checking for arrays. Even accessing a vector with the index operator is unchecked, as rightly pointed out. In reality, accessing rawarray with an index of, say, 5, as rawarray[5], means you are accessing the element residing at memory location rawarray + sizeof(int) * 5; a negative index would land before the array in the same way. On a typical 32-bit machine, where sizeof(int) = 4, this is at an offset of 4 * 5, i.e. 20 bytes from the starting location of rawarray. If that memory location is within the permissible range of memory locations for user programs, no runtime error occurs; if it falls within a restricted memory area (reserved by the operating system), the access causes a protection violation, i.e. the program crashes. Either way, the behavior is unpredictable. A better alternative is to define Array as a C++ template with its own exception class to catch array-index-out-of-bounds errors. Code snippet follows:
#include <iostream>
#include <string>
#include <exception>
using namespace std;

class MyException : public exception {
    string ex;
public:
    MyException(const string str = "some exception") : ex(str) {}
    ~MyException() throw() {}
    const char* what() const throw() { return ex.c_str(); }
};

template <class T>
class Array {
    T * data;
    int size;
public:
    Array(int s) { data = new T[size = s]; }
    virtual ~Array() { if (data) delete [] data; }
    T& operator [] (int index) {
        if ((index < 0) || (index >= size))
            throw MyException("Array index out of bounds");
        return data[index];
    }
};

const int SIZE = 2;

int main()
{
    try {
        Array<int> safearray(SIZE);
        int i;
        for (i = 0; i <= SIZE+1; i++) {
            safearray[i] = i;
        }
        for (i = 0; i <= SIZE+1; i++) {
            cout << "safearray[" << i << "] = " << safearray[i] << endl;
        }
    } catch (MyException &e) {
        cerr << "exception: " << e.what() << endl;
    }
    return 0;
}
The output would be as below (when array index boundary is crossed):
exception: Array index out of bounds
For std::vector, the index operator [] does not check for boundary overflow or underflow. You could use the member function at, e.g. v.at(i), enclosed in a try block; vector::at throws an out_of_range exception if the requested index falls outside the specified range. Alternatively, you may check v.size() to see whether you are crossing the specified boundary.
Send your questions to CSI Communications with subject line ‘Ask an Expert’ at email address [email protected]
Happenings@ICT H R Mohan
Vice President, CSI, and AVP (Systems), The Hindu, Chennai. Email: [email protected]
ICT News Briefs in March 2013The following are the ICT news and headlines
of interest in March 2013. They have been
compiled from various news & Internet sources
including the dailies – The Hindu, Business Line,
and Economic Times.
Voices & Views • The public cloud services market to grow
18.5% in 2013 to $ 131 billion globally – Gartner. • Views on Budget 2013: Ganesh Natarajan:
Minor advantages for IT; Phaneesh Murthy: Nothing sparkling for corporate sector; Keshav R Murugesh: Right notes for BPO industry; B V R Mohan Reddy: Positive and balanced; Hike in tax on royalty payments to hurt tech firms; Telecom sector disappointed; Tax incentive for semiconductor fab unit too late, say chip makers; GTech: Budget 'interesting' for IT sector.
• Smartphone sales are expected to touch 918 million units worldwide in 2013, and by the end of 2017, 1.5 billion – IDC.
• The IT infrastructure budget for World Cup Soccer and Olympics in Brazil is pegged at $180 billion.
• The small and medium software industry in India is pegged at $110 billion, while export is worth $68 billion -- M. Nayak, Director, STPI.
• Indian mobile phone market up 16% at 218 mn in 2012 – IDC.
• IT-ITES exports up 23% at Rs 4.11 lakh cr in FY’13 – Deora.
• India (4.2%) ranked third on distributing spam across the world, after US (18.3%) and China (8.2%); Asia tops the list of continents with 36.6% of the world’s spam -- SophosLabs.
• 5.85 lakh telecom towers consume 5.12 bn liters of diesel a year and emit 10 mt of carbon dioxide: Deora.
• M-commerce to constitute over 25% of e-commerce traffic -- HomeShop18 CEO.
• E-commerce segment has doubled to about $ 14 billion in 2012 from $ 6.3 billion in 2011.
• India makes 13 requests a day for web user data (Internet snooping by the enforcement authorities), second to U.S. which asks 45 – Google.
• Three out of every 10 parents confirm that their children were victims of cyber-bullying - Norton.
• About 84% of all young men (2.4 crore) and 82% of college going (1.5 crore) and 68% of school going (1.5 crore) kids accessed the social media; Social media users in urban India crosses 6.2-crore mark in December 2012 and estimated to be 6.6 crore by June 2013 -- IAMAI.
• Nasscom expects export revenues of $84-87 billion in the 2013-14 fiscal, at a growth rate of 12-14%.
• India’s domestic IT market to touch Rs 1.75 lakh crore by 2016 -- Boston Consulting & CII.
• Videocon, Reliance ‘ready’ to invest Rs 25,000 cr in chip-making units.
• Put the country firmly on the Internet and 'get out of the way' -- Eric Schmidt, Chairman, Google.
• Technology has forced politicians to update themselves – Modi.
• Holidays are peak season for spammers. Holiday spam can account for up to 6% of all spam.
• Europe contributes about 25-30% of IT revenues as against 50% from the US markets.
• Computer users to spend 1.5 bn hours and $22 bn battling malware. Global enterprises will spend $114 billion to deal with the impact of a malware-induced cyber-attack – Microsoft.
• Mobile value added services (MVAS) to reach $9.5 billion in 2015, from $4.9 billion in 2012 – Wipro & IAMAI.
• Cyber security market may reach $870 mn by 2017—IDC.
Govt, Policy, Telecom, Compliance • Govt. expects lower revenue of Rs 19,440.67
crore from spectrum sale and other related charges in 2012-13, compared to Rs 58,217 crore estimated.
• Bharti Airtel leads in consumer complaints on billing, tariff.
• Govt. plans to take over possession of BlackBerry infrastructure in Mumbai for legal interception of Internet communication.
• 2G Scam: JPC unlikely to call Raja as witness. May be asked to submit stand in writing. CBI court summons Sunil Mittal, Ravi Ruia, and Asim Ghosh. Raja accuses Vahanvati of telling untruths against him.
• 2G players face fine from DoT for shutting services without notice.
• DoT decision to allow broadband players to offer voice is illegal – COAI.
• New messaging system for NGOs with FCRA (Foreign Contribution Regulation Act) registration.
• Buying Internet protocol addresses to get cheaper, faster with the launching of National Internet Registry (NIR) in India.
• Free roaming services likely before October – Sibal.
• 14 million requests to switch mobile operator rejected.
• CDMA spectrum sale to fetch Rs 3,639 cr while the auction of 2G spectrum for GSM players held in November last year fetched Rs 9,407 cr.
• DoT firm: telcos must own spectrum to offer 3G services.
• Over 2 cr mobile users loaded with value-added services they didn’t ask for.
• BSNL, MTNL still to recover Rs 6,215 cr from customers – Sibal.
• Time for electronics goods certification extended till July 3.
• Govt has received proposals for two semiconductor fabs – Sibal
• India plans U.S.-like information sharing to alert cyber-attacks.
• Telcos asked to install local server for security audit.
• Fate of Aakash II tablet still uncertain.
• Sibal unveils roadmap for IPv6. Plan for complete migration to IPv6 by December 2017.
• Cost of voice services will move up – Aircel.
• Unified license framework to take a month -- Telecom Secretary.
• Centre to set up 2,000 telecom towers in tribal areas at a cost of Rs. 3,000 cr.
IT Manpower, Staffing & Top Moves
• Freshers 'hired' by HCL Tech stage protests across the country demanding that the company convert the offers into actual jobs. HCL issued a letter of intent and not a job offer – HCL HR head.
• Infosys plans to hire 200 in US. • Helios and Matheson IT to hire 1,000. • Aptech ties up with NSDC. Aims to train over
two million people over 10-years. • Google to slash 1,200 Motorola Mobility jobs
in US, China and India. • Sigma Aldrich, to hire 100. • Mahindra Satyam to increase headcount in
Australia to 5,000 in two years from 1,600. • Chennai-born Sundar Pichai to head Google
Android division.
• Fake job offers swarm the Android platform.
• US to accept H-1B visa applications (with a cap at 65,000) from April 1. H-1B visas could double under Senate plan -- Report.
• Hiring activity in IT sector likely to be muted this year -- Kris Gopalakrishnan.
• Albion Infotel plans to hire 150 people. • Nasscom launches programme to incubate
10,000 start-ups. • Makuta VFX to double headcount this year
from 60. • Engg students prefer IT; Google most wanted
employer – Nielsen. • Tyco plans to double headcount from 850. • D Shivakumar, Senior Vice-President (India,
Middle East, Asia), Nokia decides to quit. • SAP training students to meet innovation
needs. • TCS ranked No. 1 employer in Europe for 2013.
Company News: Tie-ups, Joint Ventures, New Initiatives
• TCS enters $5 billion brand value club with its brand valued at $5.247 billion.
• Microsoft launches 'Office 365' in India.
• Seagate to launch wireless storage solutions in India.
• Cisco announces the Cisco Education Enabled Development (CEED 2700), a cloud-based video conferencing solution for educators.
• Free pepper sprays, special call rates, for women opting for new pre-paid connections on Women’s Day.
• HP unveils ElitePad for enterprise segment.
• HomeShop18 launches 'Scan N Shop', India's first virtual shopping wall, at T3 Terminal of Indira Gandhi International Airport in New Delhi.
• Reliance plans social services through optical fibre cable network.
• Google to replace passwords with ‘ID ring’. • AMD unveils Accelerated Processing Units
(APUs) with facial log-in, gesture recognition.
• Adobe unveils 'Creative Cloud', offering membership-based access to its products and services.
• Intel to roll out 4th generation core processor this year.
• EMC pips IBM to become largest storage player.
• IBM opens customer experience lab. • Sabeer Bhatia plans on to biotech, social
ventures; launches Jaxtr SIM, a global SIM card.
• YouTube clocks 100 crore average monthly visitors.
CSI Report
A Report on CSI Best PhD Thesis Award 2012
M. Gnanasekaran, Asst. Manager (Administration), CSI
CSI has instituted a new award to recognize the best doctoral
dissertation(s) in Computer Science/ Information Technology,
from recognized doctoral degree-awarding institutions in
India. The award consists of a certificate, a trophy, and a cash prize.
Ph.D. dissertations accepted by universities in India during
the period January 2011 to December 2012 were eligible for
consideration.
CSI received 65 proposals from institutions all over India.
A panel of established researchers reviewed the dissertations.
The criteria used in the evaluation process included originality/
novelty of the thesis work, pertinence of the subject, depth, and
breadth of the results, contributions to theory and practice of CS,
applications and/or potential applicability of the results, current
and likely future impact, and quality of related publications.
The winners are:

Award: Best Thesis (Joint)
Dr. Ramasuri Narayanam, Department of Computer Science and Automation, Indian Institute of Science, Bangalore. Thesis: "Game Theoretic Models for Social Network Analysis"
Dr. Ketan Kotwal, Department of Electrical Engineering, Indian Institute of Technology, Mumbai. Thesis: "Fusion of Hyperspectral Images for Visualization"

Award: Honourable Mention
Dr. Kishor Kumar Barman, School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai. Thesis: "Topics in Collaborative Estimation and MIMO Wireless Communication"

Hearty Congratulations to the Winners!
Prizes were presented by the Chief Guest, Mr. S D Shibulal,
CEO and MD of Infosys Ltd. during the 48th CSI Foundation
Day Celebrations, held at TIFR on 6th March 2013. The
committee consisting of R Jaikumar, S P Mudur, V Prabhakaran,
K Samudravijaya, R K Shyamasundar, and G Siva Kumar evaluated
the proposals, and selected the best dissertations for the award.
CSI appreciates the remarkable job done by the committee in a
very short span of time.
Prof. RK ShyamasundarConvener
Mr. MD AgrawalChairman - Awards Committee
Kind Attention: Prospective Contributors of CSI Communications -
Please note that cover themes of future issues of CSI Communications are as follows -
• May 2013 - Cryptography
• June 2013 - Social Networking
• July 2013 - e-Business/ e-Commerce
• August 2013 - Software Project Management
• September 2013 - High Performance Computing
The articles and contributions may be submitted in the following categories: Cover Story, Research Front, Technical Trends, and Article.
For detailed instructions regarding submission of articles, please refer to CSI Communications March 2013 issue, where Call for
Contributions is published on the backside of the front cover page.
[Issued on behalf of Editors of CSI Communications]
All eyes were glued to the huge screen. With each passing question, heads came together, hush-hushing the answer. While a few students sat scratching their heads, others clenched their fists in frustration. Fewer still sat under the comfort of utter ignorance. Soon, it was answer-time. And, at once, the quiet auditorium was engulfed by a cacophony of phenomenal enthusiasm, which set the tone for the event. The national finals of the 3rd National CSI Discover Thinking Quiz 2013, a fun quiz conducted by CSI on 2 March 2013 at Millennium National School, Pune, was for students of middle school, from 6th to 9th standard. The quiz master, J Ramanand from IBM India, had the whole audience in raptures. An accomplished quiz master, Ramanand is a BBC Micro Mind winner and also founder of the quiz club of Pune.
Initially, the first round of this national quiz was conducted in various CSI chapters during January 2013. The CSI chapter-level rounds were held at Trivandrum, Kochi, Sivakasi, Mysore, Koneru, Nashik, and Solapur. The top quizzing team from each chapter moved on to the regional rounds, which were held at Koneru (Region 5), Pune (Region 6), and Kochi (Region 7). The finals saw the top 2 teams from each region competing for the CSI Discover Thinking National Quiz Championship. In all, over 500 schools and almost 5,000 students participated in the quiz across its various rounds.
The total prize money was over Rs. 2.0 lakhs. At the end of a pitched battle, Naveen V and Naveen Unnikrishnan of Bhavan's Adarsha Vidyalaya, Kochi bagged the trophy and the first prize of Rs. 25,000. They were followed by Amal M. and Sarath Dinesh of St. Thomas Higher Secondary School, Trivandrum, who took home Rs. 10,000. Third place was secured by D Jeevithiesh and M Prabhat of Narayana IIT Olympiad, Vijayawada, and a team from Dnyan Prabodhini, Pune came fourth. The prizes were distributed by Mrs. Chitra Buzruk, Senior General Manager, Persistent Systems, in the presence of Mr. Shekhar Sahasrabudhe, RVP, Region 6, and Mr. Arun Tavildar, Past Chairman, CSI Pune chapter. At the Koneru regionals, CSI Past President Prof. P Thrimurthy distributed the prizes and encouraged the young students.
This event was coordinated by Mr. Ranga Rajagopal, NSC, CSI and supported by Prof. Prashant R Nair, National Quiz Coordinator. The final was anchored by Mr. Shekhar Sahasrabudhe, RVP 6. Mr. S P Soman, RSC 7, and Ms. Mini Ulanat, National Convenor, Skill Development, coordinated the regional round for Region 7, while Mr. Praveen Krishna, CSI SBC of KL University, Koneru coordinated the Region 5 regional round.
The CSI Discover Thinking Quiz had Adobe as the event sponsor, with Persistent Foundation and KL University sponsoring the regional rounds at Pune and Koneru respectively. The quiz aims to encourage young learners to discover science and ICT the fun way, and hopes to reverse the declining trend of children opting for pure science as a profession.
CSI Report
CSI Discover Thinking Quiz 2013: 3rd National CSI Science and ICT Fun Quiz
Prof. Prashant R Nair*, Mr. Ranga Rajagopal** and Dr. Rajveer S Shekhawat***
*National Quiz Coordinator  **National Student Coordinator, CSI  ***National Convenor, CSI Project Contest
CSI Discover Thinking 2nd National Student Project Contest 2013
CSI "Discover Thinking", the 2nd national-level student project competition, is an initiative for CSI student members to share innovative ideas with their peers and experts country-wide. The 2nd edition of this extremely popular event was exclusively sponsored and supported by M/s Adobe Inc. A total of 10 teams (2 per region) had been shortlisted after regional rounds to participate in the national round. The national final was held at the College of Technology and Engineering (CTAE), Udaipur on 16th March 2013. All the projects contained ideas and implementations of very high quality and innovation. The competition was very closely contested, and at the end of a daylong session of presentations the following teams were announced as winners. The teams participating in the event are shown in the photo with the judges and organizers.
1st Prize: Sai Chand Upputuri & Alakananda Vempala, "Behavioural Biometric Advanced Authentication", K L University, Guntur (A.P.)
2nd Prize (two teams were rated 2nd): (a) Surya Mani Sharma, "Multi-functional Robotic System", Dronacharya College of Engg, Gurgaon (Haryana); (b) Phagun Singh Baya, "Remote Wireless Sensors Analysis and Controlling", CTAE, Udaipur (Rajasthan)
3rd Prize: Kalyani Joshi & Madhuri Jadhav, "Data Transfer between Two USBs without a Computer", PES Modern College of Engg, Pune (Maharashtra).
In the brief inaugural ceremony, Prof. N S Rathore, Dean, CTAE welcomed the contestants and guests and encouraged students to contribute to various problem areas of agricultural engineering. The Chief Guest, Dr. Rajveer Shekhawat, RSC 3 and National Convenor of the 2013 contest, provided the background of the contest and briefed the audience on the activities held in various regions before the finals. Dr. Dharm Singh, Organising Secretary of the national finals at CTAE, introduced the function to the audience. During the ceremony, the students had the privilege of listening to an expert talk by Prof. S V Raghavan, President (Elect), CSI, who joined the audience through video conference from Delhi. Prof. R K Vyas, RVP Region I and Prof. Durgesh Kumar Misra from the Indore chapter were part of the jury.
The regional rounds had been conducted with equal enthusiasm. The Region 1 round was organized by Dronacharya College of Engg, Gurgaon, while the Region 3 event was organised online. The Region 5 round was held at Vignan Nirula Institute of Technology for Women, Parkala, Guntur, and the Region 6 round at Cummins College of Engineering for Women, Pune. The School of Computer Science and Technology of Karunya University, Coimbatore conducted the Region 7 round. In all, over 150 teams presented their projects at the various rounds. The contest was coordinated by Mr. Ranga Rajagopal, National Student Coordinator of CSI. It shall be our endeavour to have more teams participating from all regions of CSI in the coming years. All the winning projects will be made available on the CSI Digital Library, www.csidl.org.
On 25th March 2013, the CSI student chapter was inaugurated at GITA, Bhubaneswar, and a seminar on "Recent Trends on Computer Security" was conducted by Division IV, CSI.
Prof. (Dr.) Sudarshan Padhi, Director, Institute of Mathematics & Applications, Bhubaneswar and an eminent computer scientist, attended the seminar as keynote speaker. In his keynote address, Prof. Padhi observed that computer security is an ever-changing issue. Fifty years ago, computer security was mainly concerned with the physical devices that made up the computer; at that time, these were the high-value items that an organization could not afford to lose. Today, computer equipment is inexpensive compared to the value of the data it processes. The high-value item is no longer the machine or computer hardware but the information it stores and processes, and this has fundamentally changed the focus of computer security from what it was in the early years. One of the most effective measures security professionals can take to address attacks on their computer systems and networks is to ensure that all software is up to date in terms of vendor-released patches. Viruses and worms are just two types of threat that fall under the general heading of malware; the term comes from "malicious software", which describes the overall purpose of code in this category of threat. He concluded by noting that the day's discussion would cover recent trends in computer security and steps to minimize the possibility of attacks on a system.
Sanjay Mohapatra, Chairman, Division IV, CSI participated as a guest of honour at this seminar and spoke about CSI and its student chapter activities. He also discussed different issues of computer security.
The GITA College Principal, Vice-Principal, Dean Academics, and HOD of the CSE Department, GITA were present at the seminar and addressed the students. Prof. Manoj K Pradhan, Student Branch Coordinator, proposed the vote of thanks. Around 200 students and 20 faculty members attended the seminar.
CSI Report
Division IV, CSI - Seminar Report on “Recent Trends On Computer Security” on 25th March, 2013 @ GITA, Bhubaneswar
Sanjay Mohapatra* & Prof. Ratchita Mishra**
*Chairman, Division IV, CSI  **RSC, Region IV
Report on Eastern Regional Convention 2013 on “Computing Anywhere, Anyware" @ Bhubaneswar Organized by – Region IV, CSI & Division IV (Communications), CSI
The "Eastern Regional Convention 2013 on Computing Anywhere, Anyware" was conducted at CV Raman College of Engineering, Bhubaneswar from 25th to 27th Feb 2013. The conference was a joint effort of the CSI student branch at C.V. Raman College of Engineering, CSI Region IV, and CSI Division IV (Communications). The convention was inaugurated by Prof. Dr. Ganapati Panda, Deputy Director, IIT Bhubaneswar. In his inaugural address, Dr. Panda explained how the evolution of technology has made computing possible anywhere, using small to micro devices that can be placed anywhere, be it in our surroundings, in the air, in water, in buildings and homes, and even on and inside our bodies. These devices, which are sensors with processing power and memory, can collect real-time data and process large amounts of information for many kinds of applications meant to help society, such as predicting water levels, controlling irrigation or floods, and managing road traffic. The computer network of today is a hybrid of LAN/WAN, mobile networks, and wireless networks. Some of the challenges that come with the advancement of these technologies are standardization of communication protocols, managing power to widely dispersed wireless ad-hoc devices, and supporting software to process large amounts of data in parallel.
Mr. A Pal, Principal Scientist and Research Head, Innovation Lab, TCS Kolkata, speaking on "Grid Computing for Internet-of-Things", elaborated on the concept of connecting all computing devices on the internet so that unused personal computing power can be exploited by others. However, this raises issues of security and privacy, apart from the challenge of creating a seamless flow of information in a grid of dissimilar computing devices. Mr. S Kanungo, Head, Marketing and Alliance, Cloud Practice, Tech Mahindra spoke on "Consumerization with Mobile Apps & Advantages/Challenges of Cloud Computing". He explained the new concept of fog computing, which provides cloud-like services for mobile devices, and elaborated on several services and projects that Mahindra Satyam is providing in this area, so that clients do not have to invest in purchasing application software; even a workstation (a desktop with all its installed software) can be made available over the cloud at a distant place. Mr. S Panda, CEO, Syum Technology spoke on "Enterprise Mobile Application Development Strategies". He explained how one can develop and deploy applications natively on mobile platforms such as Android and iOS, or, for more general deployment across any mobile device, deploy them as enterprise internet applications. As an alternative strategy, a hybrid approach can be used: an internet application augmented with a few native device OS features.
More than 300 students and faculty members attended the convention. Prof. Dr. K C Patra, Director, CVRGI presided over the inauguration and closing functions. Mr. Sanjay Mohapatra, Chairman, CSI Division IV was the guest of honour. He appreciated the efforts of CV Raman in CSI activities and encouraged the CSI student members to take part in such academic activities. To encourage the students, a web-design competition was held among the participants. The convention continued with a detailed 2-day hands-on workshop on "Android Software Development" for mobile devices, conducted by C2S Technology. In the closing function, the Director, CVRGI, the Registrar, CVRGI, the C2S experts, and the Chairman, CSI Division IV gave away prizes and certificates to the participants. Prof. Dr. R Misra, the CSI Regional Student Counselor, and Mr. D Mohanty, the Student Branch Counselor, presented mementoes and the vote of thanks to the guests and participants.
CSI Report
International Conference on Information Systems and Computer Networks: ISCON 2013
Dr. Dilip Kumar Sharma*, Mr. Sanjay Mohapatra** and Mr. R K Vyas***
*Honorary Secretary, Computer Society of India, Mathura Chapter  **Chairperson, Div IV, CSI  ***Vice President, CSI Region-1
An International Conference on "Information Systems and Computer Networks: ISCON-2013" was organized at GLA University, Mathura, on 9-10 March 2013, in technical collaboration with the IEEE UP Section and CSI Mathura Chapter, Division IV & Region-1. It was co-sponsored by Indian Oil Corporation Ltd (IOCL). The Chief Guest of the conference was Prof. S K Koul, Deputy Director (Strategy and Planning), IIT Delhi. Mr. R K Vyas, Vice President, CSI Region-1; Prof. M N Hoda, Director, Bharati Vidyapeeth, Delhi and Regional Students' Coordinator, CSI Region-1; Prof. S K Gupta, Department of Computer Science and Engineering, IIT Delhi; and other dignitaries were present at the conference. The General Chair of the conference was Prof. Krishna Kant, Head, Department of Computer Engineering and Applications, GLA University, Mathura.
Prof. S K Koul addressed the conference and highlighted some points on being successful: "THINK BIG, WORK TOGETHER AND INNOVATIVELY, AND GIVE MORE AND TAKE MORE". He also guided the participants on writing technical papers.
Prof. S V Raghavan, Vice President, CSI and Scientific Secretary, Office of the Principal Scientific Adviser to the Government of India, New Delhi, also addressed the gathering through a recorded video, in which he highlighted advancements in the domains of electricals and electronics, networking, and semiconductors. He also talked about the National Knowledge Network (NKN), a state-of-the-art multi-gigabit pan-India network providing a unified high-speed network backbone for all knowledge-related institutions in the country. He said that the purpose of such a knowledge network goes to the very core of the country's quest for building quality institutions with requisite research facilities, and for creating a pool of highly trained professionals. In the coming years, the NKN will enable scientists, researchers, and students from different backgrounds and diverse geographies to work closely together to advance human development in critical and emerging areas.
Mr. R K Vyas highlighted that researchers should collaborate with industry and share their research in order to get good exposure. He also warned about the dangers of using email through servers owned by other agencies, and advised the use of mail servers owned by the user's own organization.
Prof. S K Gupta delivered a keynote on cybercrime, focusing mainly on crimes committed using plastic cards. He noted that there are always two types of identity, primary and secondary: the former remains the same throughout a user's life, while the latter can be changed. There must also be a provision to delete a person's identity when the person dies.
ISCON 2013 papers were classified in two tracks: Track 1, Information Systems, and Track 2, Computer Networks. The organizers received 233 research papers in all, from academicians and industry professionals from India and abroad. Of these, 219 were valid submissions: 132 research papers in Track 1 and 87 in Track 2.
All submitted papers underwent a rigorous two-level review process. First, the papers were checked for relevance to the conference and then for plagiarism. After this level of review, 108 papers were shortlisted, and each was sent to two esteemed external reviewers from institutions of repute. After the second level of review, 67 papers were selected, of which 47 were in Track 1 and 20 in Track 2.
There were 45 registrations against these 67 accepted papers, and 36 papers were presented in seven sessions. In each session, the best paper was selected for an award sponsored by McGraw Hill Education.
The conference valedictory session was organized at 2.30 p.m. on 10th March. Prof. S N Singh, Chairperson, IEEE UP Section graced the occasion as the Chief Guest, and Prof. Jai Prakash, Vice Chancellor, GLA University, Mathura graced the occasion as the Chairperson.
Announcement
48th Annual Convention CSI 2013 Brochure Released
Visakhapatnam Chapter
The CSI Annual National Convention CSI 2013 is being organised by the Visakhapatnam Chapter in association with Visakhapatnam Steel Plant during 13th-15th Dec 2013 at Hotel Novotel, Visakhapatnam. The theme of the Annual Convention is "IT FOR EXCELLENCE". It will be held in Visakhapatnam for the first time in the history of CSI, since its inception 48 years ago in India.
To mark the beginning of the arrangements for the annual convention, a colourfully designed brochure giving various details of the Convention was released by Sri Umeshchandra, Director (Operations) and Past Chairman, Visakhapatnam Chapter, in the august presence of Mr. S Ramanathan, Hon. Secretary, CSI and Sri H R Mohan, President Elect, at a programme organised at Visakhapatnam Steel Plant on 16th Mar 2013. The central committee visited Visakhapatnam to review the facilities available for CSI-2013.
Speaking on the occasion, Mr. S Ramanathan expressed confidence that the Visakhapatnam Chapter of CSI, with the all-round support of the PSU giant Visakhapatnam Steel Plant, will make the Annual Convention the most memorable one in the history of CSI.
Sri H R Mohan, President Elect, said that the Visakhapatnam Chapter has proved its worth by organising several IT-related mega events very successfully, and that is the reason why the Visakhapatnam Chapter was chosen to conduct this prestigious National Convention for 2013.
Chair Sri C K Chand; Sri P Ramudu, Executive Director (Auto & IT); Vice Chair Sri KVSS Rajeswara Rao, GM (IT); Addl. Vice Chair Sri Suman Das, DGM (IT); Sri Paramata Satyanarayana, Convener of the Organising Committee; Sri G N Murthy, ED (Finance) & Chair, Finance Committee; Sri D N Rao, ED (Services) & Chair of the Convention Committee; and Dr. S R Gollapudi, Convener, Advisory Committee were present on this occasion.
To encourage innovation and indigenous development in the field of Information Technology, CSI has instituted awards for young IT professionals, entrepreneurs, and researchers who are attempting extraordinary feats in the field of computer science and technology by implementing IT projects for better delivery of services.
The CSI Awards for Young IT Professionals started their journey in the year 1999. Today they have attained remarkable height and visibility, becoming icons of excellence in IT applications for young IT professionals.
For the year 2012, a well-planned approach was adopted for the CSI National Young IT Professional Award, involving publicity through the regions and chapters. Each Regional Vice President provided guidance and support to the respective Regional YITP Convener in hosting the regional round at a host chapter. An announcement of the YITP awards was sent to all chapters, corporate members, institutional members, and IT companies, with announcements in CSI Communications and on the CSI website. This resulted in many nominations. After shortlisting the nominations, 40 teams comprising professionals from IT companies, technical institutes, entrepreneurs, and researchers participated at the regional level.
The CSI YITP Awards maintain absolute transparency in an objective and merit-based selection process. This year, a 2-tier selection process was used to select the Winners, Runners-up, and Special Mention. The regional rounds were conducted at the Kolkata, Ahmedabad, Bhilai, Bangalore, Nashik, and Chennai Chapters. With the support of the Regional Vice Presidents and Regional YITP Conveners, the regional round competition was successfully conducted in these six regions. Details of the regional round competition can be viewed on the CSI website under the CSI News section.
From each region, the winner and runner-up teams were invited to the final round of competition on 6th March 2013 at PSG College of Technology, Coimbatore. In total, 11 teams presented their projects to the selection committee. The selection committee members were Dr. Subramaniam, Past Chairman, CSI; Mr. John Milton, Robert Bosch; Ms. Pandi Selvi, Robert Bosch; Mr. Sebastian Christopher, CTS; and Mr. Isai Amudan, CTS. Mr. Bipin Mehta and Mr. Ranga Rajagopal coordinated the final round, which was supported by Mr. N Valliappan, Secretary, CSI Coimbatore Chapter.
The projects judged were the most outstanding technology projects of any kind completed within an organisation during the year 2011-12, where the project duration could be 2-3 years from the start date. The selection committee considered many factors in judging each project, such as criticality of IT usage, improvement of customer service, innovation, quality of management, and impact on the organization and society. It was a challenge for the selection committee to decide on the winners. The committee unanimously declared the Winner, Runner-up, and Special Mention Award winner as under:
The results of the national round were declared and the awards presented on 6th March 2013, on the auspicious occasion of the 48th CSI Foundation Day at PSG College of Technology. The chief guest for the award function was Dr. R Rudramoorthy, Principal, PSG College of Technology, who inaugurated the contest.
In the national round, the winner received Rs. 50,000, a trophy, and a certificate. The runner-up received Rs. 25,000, a trophy, and a certificate, while the team that received special recognition got Rs. 15,000.
The contest aimed to involve young IT professionals in the quest for innovation in IT and to provide them an opportunity to demonstrate their knowledge, professional prowess, and excellence in their profession.
CSI Report
CSI National Young IT Professional Awards – 2012
Bipin V Mehta* and S M F Pasha**
*Fellow, CSI; National Convener, YITP Awards  **Manager, CSI Headquarters
Mr. Bipin Mehta, Dr. R Rudramoorthy, Mr. Ranga Rajagopal and the award winners
Region / Result | Participant(s) | Organization | Project
VII / Winner | J. Jeminaa Asnoth Sylvia | Jerusalem College of Engineering | Voice-Activated Solar-Powered Wheel Chair
VI / Runner-Up | Tamal Dey, Abhra Pal, Lahari Sengupta | Centre for Development of Advanced Computing (C-DAC), Kolkata | Resham Darshan – A Machine Vision Solution for Colour Characterization of Silk Yarns
II / Special Mention | Rohit Dilip Bhosale, Kartik Girish Vyas, Kumar Aditya | Persistent Systems Ltd. | Viewer Engagement Analytics
Report on 48th CSI Foundation Day at TIFR, Mumbai
CSI celebrated its 48th Foundation Day on 6th March 2013 at TIFR, Mumbai. The event started with a welcome note by Mr. V L Mehta, Honorary Treasurer, CSI. Mr. M D Agrawal (Academic Committee Chairman, CSI) spoke on Challenges in Education and the Role of CSI, asserting that CSI can explore research opportunities through various collaborations.
The overview and objectives of Foundation Day were described by Prof. R K Shyamasundar. He spoke about the significance of 'computing power' and how other streams depend on it more and more each day.
A major highlight of the event was the CSI Founder Prof. R. Narasimhan Lecture, delivered by Mr. S D Shibulal, CEO & MD, Infosys. He emphasised the importance of ICT and how it is changing our day-to-day life. IT plays a pivotal role in bringing about the changes of globalization, though it is difficult to say which is leading which: globalization or IT. He also talked about the contributions and achievements of Infosys in the growth of society. Rear Admiral S P Lal, VSM, CSO (Tech), HQ WNC, Chief Guest of the event, highlighted the significance of IT in defence. Wars are no longer fought only on land, in the air, and on water; in the last couple of decades another mode of war has emerged, popularly known as cyber war. Rear Admiral Lal appreciated CSI's contribution to strengthening the Naval Command through its IT-enabled training programmes.
A panel discussion on 'Education and Research' was another attraction of the event. The session was concluded by Mr. Ravi Eppaturi, Chairman, Mumbai Chapter, and Mr. Dilip Ganeriwal, Vice Chairman, Mumbai Chapter, anchored the show very effectively.
Participants: Padmabhushan Dr. F C Kohli, Padmashri Prof. D N Phatak, Dr. Nirmal Jain, Prof. S P Mudur, and Padmashri Prof. P V S Rao (Panel Moderator)
Opening Remarks
At the outset, Panel Moderator Prof. P V S Rao reminded the audience that CSI is actually a year older than commonly believed; it is the successor to the All India Computer Users' Group (AICUG), which was formally started in Faridabad near New Delhi in 1964 (a few days after the sad demise of Pandit Jawaharlal Nehru) and renamed itself the Computer Society of India one year later, in 1965. He paid homage to Major General A Balasubrahmanian, the late Prof. Bishwajit Nag, and Mr. S R Thakur, who, along with himself and a few others, started the AICUG in 1964.
After welcoming and introducing the panellists, Prof. Rao stated that, the topic being IT Education and Research, the focus would be on leveraging India's progress in IT to accelerate the pace of national development and to develop human resources. Coverage would include IT education itself as well as the use of IT in education. IT research would necessarily include applied aspects (such as software engineering, computer-aided engineering, and so on), which facilitate and catalyse the development of IT. It would also include research as an end in itself (e.g. theoretical computer science). A question to be addressed is how best our competence can be leveraged to help in economic development, increasing exports, and growing national wealth.
Speaking about education in general, Dr. F C Kohli said that, going by population, India should have three to four times as many bright students as there are in the USA. On the other hand, the annual output is only seven or eight hundred PhDs, a number that does not even meet the (teaching) faculty requirements of the academic institutions already in the country, let alone the numbers needed for research and innovation. This gap needs to be bridged; about 50 colleges have been identified in the country which, with proper inputs, bright students, and trained faculty, can be expected to produce up to 35,000 world-class graduates annually; of these, 6,000 will go on to become PhDs (as against the current output of only 800).
Prof. D N Phatak emphasised that the need of the day is not merely to ensure that IT training happens on a scale that matches the very large numbers of graduates needed; it is most important to provide high-quality education. IIT Mumbai is addressing this by training teachers in their thousands in a tiered structure, so that they can in turn provide quality training to freshers. The trainee teachers are grouped at 40 to 50 widely distributed centres. IIT beams courses covering the full range of subjects online to these centres in the mornings. Pre-trained course coordinators are available at each centre for interaction with the trainee teachers; they also run tutorials and practical sessions in the afternoons.
Prof. S P Mudur spoke about the qualitative changes that have occurred over the three decades he has been teaching. He cited the teacher evaluation system prevalent in Canadian universities, which lets students assess the competence of their teachers. In this process, younger teachers are often graded higher than experienced seniors, mainly because older faculty find it difficult to keep up with the changes that are happening. Earlier, student-teacher interaction was restricted to face-to-face interaction, but it has been greatly enriched through social networking (YouTube, Facebook, Twitter, blogging, and so on). Massive Open Online Courses (MOOCs), such as those offered by MIT, are easily available to students worldwide from many reputed universities. Soon, it might be possible to take such courses from multiple universities even for credit. Blended learning (a combination of face-to-face and online learning) will become pervasive and important. Scaling will happen as large numbers of students are attracted by the high reputation of institutions offering MOOCs. In closing, he mentioned that for the next few years at least, the job situation will continue to be very good for students specialising in Science, Technology, Engineering, and Mathematics (STEM).
Talking about industry-academia interaction, Dr. Nirmal Jain emphasised that there are multiple ways of learning and interaction between the two. Often, there is a disconnect between the material taught in the universities and what industry really needs. Only constant interaction between educational institutions and industry can bring about a better match between course curricula and industry requirements. Fortunately, the pressures of competition are strongly motivating industry to interact closely with academic institutions in the hope of gaining a competitive edge by leveraging the innovations that happen there. It is up to industry to make many more such collaborations happen.
Intra-panel interaction
During the subsequent interaction between panellists, Dr. Kohli pointed out that even in today's context of the ever-increasing prevalence of social networking, face-to-face, person-to-person interaction continues to be crucial. Agreeing, Prof. Phatak said the idea is to start (innovative modern methods) in a small way initially and scale up as the process succeeds and proves itself. Dr. Jain remarked that as we grow older, rather than just doing our jobs, all of us become increasingly interested in and involved with aspects relating to teaching and training; this highlights how important these issues are.
Audience interaction with the Panel
Question: Hands-on experience (as in internships and hospital assignments for medical students) is very important even during IT learning.
Prof. Rao: This is true; it happens in many areas, such as the legal profession and journalism. It can happen via student internships in industry, by having adjunct professors (with practical experience) from industry, by two-way movement of people at senior levels between industry and educational institutions, and so on.
Q: Several specific courses needed by students may not be available online.
Prof. Phatak: Today, MOOCs are wide-ranging, up-to-date, and meaningful.
Q: In many cases, teachers lack passion.
Prof. Phatak: Passion is contagious. It can and does spread downwards (from teacher to student) as well as upwards (from student to teacher). Not just the courses but also the examination pattern is important. It is essential that students are properly tested to assess how well they have assimilated what has been taught. Hence, teaching and evaluation have to be done by the same person.
Q: Given that students today can access good courses online, classroom attendance should not be compulsory as at present; it should be optional.
Prof. Mudur: Attendance is optional in Canadian and other universities. However, students must do their classroom assignments. They are assessed on these and on their performance in tests.
Q: There are three aspects to testing: learning, testing, and remedy (adaptations and corrections to existing methods to take care of deficiencies in the teaching and/or the learning).
Prof. Phatak: To facilitate this, it is best to have small classes. Testing should happen while teaching, so as to facilitate on the spot adaptation of teaching methods as needed.
Q: Established supervisory institutions such as AICTE resist change; there is also the problem of a lack of political will to bring about change.
Prof. Phatak: Things will change; they have to, as otherwise the system will collapse.
Q: What is the overall standard of on-line education? How are open universities such as IGNOU faring?
Prof. Phatak: These are means for taking education to large numbers. IGNOU is doing well.
CSI Report
Dr. P V S Rao, Fellow and Past President of CSI
CSI Foundation Day – Panel Discussion on IT Education and Research
CSI News
From CSI Chapters »
Please check detailed news at: http://www.csi-india.org/web/guest/csic-chapters-sbs-news
SPEAKER(S) TOPIC AND GIST
GURGAON (REGION I)
Mr. Vivek Varshney, Mr. R K Vyas, Prof. M N Hoda, Prof. D K Lobiyal, Prof. S K Muttoo and Prof. Jitender Kumar
2 March 2013: CSI Regional-Level Student Project Contest 2013
The contest aimed at involving students in IT innovation and providing them an opportunity to demonstrate projects with strong social relevance. The first prize went to Mr. Surya Mani Sharma from DCE, Gurgaon for the project "Multifunctional Robotic System". The second prize, for "Wear Your World", went to Ms. Monica Bansal and Mr. Deepak Kumar. The third-placed project, "Android Application", was demonstrated by Mr. Shashank Sharma and Mr. Paras from BVICAM, New Delhi.
Hon'ble Principal giving a trophy to the Chief Guest
KANPUR (REGION I)
Dr. H C Karnick, Dr. Brijendra Singh, Dr. Phalguni Gupta, Dr. Alok Tiwari and Dr. Raghuraj Singh
9 March 2013: National Seminar on “Issues and Challenges of Computer Science & Engineering as a Discipline”
The seminar was jointly organized with the Dept. of Computer Science & Engineering, Harcourt Butler Technological Institute. A souvenir containing abstracts of invited lectures, expert views of academicians and articles on the seminar theme was released on the occasion. The CSI Kanpur Chapter website, http://www.csi-kanpur.org, was launched during the seminar.
Guests while releasing the Souvenir
LUCKNOW (REGION I)
Mr. Amit Khanna and Prof. Bharat Bhaskar
15 March 2013: Technical Session on "nComputing"
The session was organized at the NIEIT, Lucknow centre in association with nComputing and M/s M Intergraph, and was attended by more than 50 participants. During the presentation, Mr. Amit Khanna explained the benefits, usage and other details of the product. Prof. Bharat Bhaskar, IIM Lucknow & Chairman of CSI Lucknow Chapter, introduced the session.
Mr. Amit Khanna, Director, nComputing, during his presentation
HYDERABAD (REGION V)
Dr. Pratap Reddy
2 March 2013: Event titled "CHALLENGE EXPO-13"
Participants demonstrated project exhibits and presented posters during this event. Around 70 projects were exhibited and 10 posters were presented. The event was organized under the guidance of Dr. Pratap Reddy, who presided as Chief Guest and Judge. Winners were given cash prizes and participation certificates. Details of the event, along with the process of conducting it and photographs, can be found at http://www.dprec.ac.in/challengeexpo13.html.
Organizers and participants of the event
SPEAKER(S) TOPIC AND GIST
VISAKHAPATNAM (REGION V)
Mr. Ganta Srinath Reddy
9 January 2013: Guest Lecture on "Android-based Application Development"
The objective of the lecture was to cover the basics of Android application development, testing and deployment. The program offered enough depth to enable attendees to set up an application development environment, then test and deploy applications for use. It motivated students to seek further information on the subject and develop their own projects.
Speaker delivering lecture.
Mr. G Santosh Kumar, Mr. Ganta Srinath Reddy, and
Mr. Krishna Vattipalli
1-2 February 2013: Southern Regional Conference on “Innovative Technologies (SRCIT)” 2012-13
Mr. Santosh Kumar covered various areas of hacking and the respective countermeasures and preparedness, including ethical hacking, major vulnerabilities, basic protection mechanisms, penetration testing techniques and the importance of their findings. Mr. Reddy gave an introduction to Android programming and deployment with a hands-on demonstration. Mr. Vattipalli explained deployment and related development on Android and Google's cloud platform, GAE (Google App Engine).
Inaugural Program for SRCIT VIZAG-2013
NASHIK (REGION VI)
Prof. Pradeep Pendse, Mr. Ajit Jagtap, Mr. Hussain Dahodwala, Mr. Sagar Javkhedkar, Mr. K Rajeev, Mr. Shashank Todwal, Mr. Satish Babu, Mr. Mahesh Bhat, Mr. Vinay Hinge, and Mr. Sunil Khandbahale
8-9 February 2013: Western Region Conference on “NextGen Computing”
The event was organized jointly with Sandip Polytechnic. Hon. Shivajirao Patil was felicitated with the "Yashokirtee" Puraskar. Technical sessions included "Bring Your Own Device" by Dr. Pendse, "Translation & MKCL Supercampus" by Mr. Ajit Jagtap and "Data Centers" by Mr. Dahodwala. Mr. Javkhedkar delivered a talk on mobile applications and Mr. Rajeev spoke on "Cloud Computing". Other sessions were "Google Apps" by Mr. Todwal, "Free and Open Source Software (FOSS)" by Mr. Babu, "Cloud Security" by Mr. Bhat, "Big Data" by Mr. Hinge and "Inclusive Innovations: A Case Study of a Language Dictionary" by Mr. Khandbahale.
(L to R:) Principal Gandhe, Principal Tate, Mr. Chandrashekhar Sahasrabuddhe, Mr. Avinash Shirode, Hon. Shivajirao Patil, Mr. Ashok Kataria, Ms. Mohini Patil, Principal Prashant Patil and Shri Shrikant Karode
TRIVANDRUM (REGION VII)
Mr. Suneeth Natarajan
16 February 2013: One-day Workshop on "Six Sigma Methodology"
The workshop provided an overview of adopting best practices from the Six Sigma methodology. The content covered topics such as evaluating organization performance, creating a culture of shared responsibility to drive performance, identifying and defining key areas for improvement, setting improvement goals and targets, finding sponsorship, creating a metrics-driven organization, analyzing root causes, and piloting and implementing change.
Resource person conducting the workshop
From Student Branches » http://www.csi-india.org/web/guest/csic-chapters-sbs-news
SPEAKER(S) TOPIC AND GIST
AES INSTITUTE OF COMPUTER STUDIES (AESICS), AHMEDABAD (REGION-I)
Dr. Vikram Parmar and Dr. Neeraj Sonalkar
23 January 2013: Seminar on "Venture Studio – Centre for Innovative Business Design"
Dr. Parmar explained how Venture Studio aims to nucleate an ecosystem of innovation that accelerates regional economic development. Dr. Sonalkar explained the venture design process and how to work in teams to identify critical market needs, generate and prototype novel solutions, and develop business models to launch scalable businesses that satisfy such needs.
Dr. Parmar encouraged students to think out of the box and cultivate new ideas and solutions for solving societal and industrial problems
Mr. Sunil Gulabani
25 January 2013: Seminar on "Cloud Computing - Industry Case Studies"
Mr. Gulabani started with cloud computing basics and the various services provided by different cloud providers. He then shared cloud computing case studies using the Amazon Web Services (AWS) cloud, Red Hat OpenShift, Google App Engine and Tumblr, gave a live demo, and explained the technical architecture of cloud application development using Eclipse and cloud APIs. He also suggested innovative project ideas, including a Google Wallet application, a geo-based social networking/taxi service, a fight-back application and mobile e-learning, among others.
Mr. Sunil Gulabani, IndiaNic Infotech Pvt. Ltd., shared his industry experience on cloud computing during the seminar
Shri Hemant Sonawala
16 February 2013: Lecture on "Current and Emerging Trends in ICT, Employment Opportunities and Benefits of Professional Society Membership"
Mr. Sonawala emphasized information sharing as a means of increasing knowledge. He advised students to use technology for betterment instead of misusing it, and noted that because technology changes rapidly, students should aim for expertise in applications rather than in tools. He also made students aware of their social responsibilities. Certificates and trophies were awarded to students for outstanding performance in academics and for the best System Development Projects.
Mrs. Hemal Desai, Shri Hemant Sonawala and Prof. Bipin Mehta
SARDAR VALLABHBHAI PATEL INSTITUTE OF TECHNOLOGY (SVIT), VASAD, GUJARAT (REGION-III)
Dr. Varang Acharya and Mrs. Bharti Trivedi
9 February 2013: Inter-college Annual Fest "SAKSHAM '13: Carve Your Niche"
The inter-college annual fest covered various technical and online events along with a seminar. Dr. Acharya was the Chief Guest and Mrs. Bharti Trivedi was the Guest of Honour for the inauguration ceremony. The SVIT Student Branch also launched its website, http://csi.svitvasad.ac.in, as well as a website for SAKSHAM, to allow members and students to interact. It also tied up with Bachpan, an NGO that visits slum kids and teaches them at home.
Prof. Hetal Bhavsar (SBC), Dr. V R Panchal (Principal), Dr. Varang Acharya (Chief Guest), Mrs. Bharti Trivedi (Guest of Honour), Mrs. Bijal Talait (HOD of CE Dept), Prof. Sameer Chauhan (HOD of IT Dept) and Prof. Sohail Pandya (HOD of MCA Dept)
SPEAKER(S) TOPIC AND GIST
KLE INSTITUTE OF TECHNOLOGY (KLEIT), HUBLI, KARNATAKA (REGION-V)
Mr. Arunkumar M Khannur
15 February 2013: One-day Workshop on "Software Testing and Career Avenues"
Mr. Khannur spoke on software structure and software testing basics, and then gave a brief introduction to basic black-box testing techniques. Guidance on career aspects in software testing was also provided. The workshop concluded by felicitating Mr. Khannur and distributing certificates to students.
Resource persons and organisers of workshop
VASAVI COLLEGE OF ENGINEERING (VCE), HYDERABAD (REGION-V)
10 January 2013: Mini Project Competition
The main objective of the competition was to motivate students to work on mini projects and develop quality, usable applications. The competition also helped enhance their presentation skills, as they had to present their projects on a poster. Six teams of two members each participated; one team was awarded a merit certificate and participation certificates were distributed to the others.
Students presenting their projects on posters
VASAVI COLLEGE OF ENGINEERING (VCE), HYDERABAD (REGION-V)
Mr. Vengal Reddy and Dr. P Radhakrishna
10-11 & 17 February 2013: Series of Guest Lectures on "DWDM - Data Warehousing and Data Mining"
Mr. Vengal Reddy spoke on "Data Mining" and "Latest Trends in DWDM", and Dr. P Radhakrishna spoke on developing a research attitude among students towards ever-expanding Big Data. Mr. Reddy also discussed various fields of application and suggested topics for research.
Mr. Vengal Reddy, Product Technical Architect, Infosys Technologies, conducting the lecture
VITS COLLEGE OF ENGINEERING, VISAKHAPATNAM (REGION-V)
Dr. B Muralikrishna (Principal), Mr. B Narendra, Prof. G Rajasekharam, Mr. B Ravichandra, Prof. K Shankar, and Mr. A Ramkumar
26 February 2013: Computer Awareness Camp
VITS College of Engineering, in collaboration with the VITS CSI student chapter, conducted an NSS program at Mamidilova village, Sontyam, Anandapuram Mandalam, Visakhapatnam. Students of Mandala Prajaparishat Primary School were taught the basics of computers and were provided with computer basics material.
Students of VITS and School students
Dr. Valli Kumari and Principal Dr. B Murali Krishna
27 February 2013: One-day Workshop on "Recent Trends in Embedded Systems & Soft Computing Techniques"
Dr. Valli Kumari explained various methodologies and modalities in the field of image processing. She provided detailed insight into narrowing the gap between low-level image features and human interpretation of an image, and explained basic concepts of soft computing techniques for image processing applications.
Honoring the guest, Dr. Valli Kumari
Please send your event news to csic@csi-india.org. Low-resolution photos and news without a gist will not be published. Please send only one photo per event, not more. Kindly note that only news received on or before the 20th of a month will be considered for publication in the CSIC of the following month.
SPEAKER(S) TOPIC AND GIST
SRINIVASA RAMANUJAN CENTER, SASTRA UNIVERSITY, KUMBHAKONAM, TAMILNADU (REGION-VII)
Mr. Amit Grover and Mr. Siddharth Goyal
16-17 February 2013: Two-day Workshop on "Web-Entrepreneur"
The resource persons spoke about how to generate ideas to become a web entrepreneur, how to work with CSS, WordPress and CMSs, and various other topics. To motivate participants, a competition was conducted and the top three teams were awarded certificates of achievement along with entry to the national-level Tech Hunt 2013.
Certificate Distribution
SREE BUDDHA COLLEGE OF ENGINEERING (SBCE), ALAPPUZHA, KERALA (REGION-VII)
Mr. Subhash E P
15 February 2013: One-day Workshop on "Android"
Mr. Subhash is an official trainer of Oracle University, delivering Live Virtual Classes to Oracle customers in North America, and a former official trainer of Borland Corporation for its ALM suite of products. He gave a comprehensive description of the Android operating system and its features, and demonstrated how to develop applications on the Android platform.
Mr. Subhash E P during the workshop
The following new Student Branches were opened, as detailed below:
REGION I JRE group of Institutions, Greater Noida
JRE-School of Engineering inaugurated a CSI Student Branch and the JRE SOE Project Center on 16th February, 2013. A technical talk on "RoCK-BEE: Robotics Competition Knowledge Based Education in Engineering" by Prof. Saha and a lecture on "Soft Computing and Its Applications" by Prof. M M Sufyan Beg were organized on the occasion.
REGION V Kakinada Institute of Engineering & Technology (KIET)-II, Kakinada
On 12th February, 2013, a CSI student branch was inaugurated and a seminar on "Cloud Computing" was organized. Dr. C V S Murty delivered a talk on research methodologies, and Mr. Sekhar Kammula spoke about a social activity named 'I CARE I REACT'. Mr. Naganand Rapaka spoke about cloud computing technologies.
REGION VI G. H. Raisoni Institute of Information Technology, Nagpur
On the occasion of the inauguration of the CSI student branch on 6th February, 2013, a motivational speech was given by Mr. Pratap Shukla on "Journey from Learning Software Programmer to Expert Solution Architect". Other topics covered in the inauguration seminar were programming introductory tools, beginner programmer tools, language developer tools, and solution architect tools.
REGION VII E.G.S. Pillay Engineering College, Nagore, Nagapattinam
A CSI Student Branch was launched at E.G.S. Pillay Engineering College on March 05, 2013. Participating in the event as the Chief Guest, Mr. Ramasamy, Regional Vice President, explained the activities to be taken up by the branch and techniques that would help students face examinations and interviews successfully.
REGION VII Sethu Institute of Technology, Sivakasi
The CSI student branch inauguration was organized on February 27, 2013. Mr. Ramasamy, Vice Chairman of the CSI Chennai Chapter, gave the inaugural address and highlighted opportunities for professional development.
Announcement
H R Mohan* and S. Ramanathan**
*Chair (Conference Committee) **Hon. Secretary
Call for Proposals for Events (2013-14)

Preamble
As a technical and professional association, the Computer Society of India has the mission of sharing knowledge, enhancing competency, promoting research, aiding education, and providing career enhancement opportunities for its individual and institutional stakeholders and partners.
CSI intends to conduct different types of events during the year through its Chapters, Divisions, SIGs, Member Institutions, Partnering Organizations, and international bodies such as IFIP and SEARCC. These events are intended to achieve one or more of the following outcomes:
• Publish peer-reviewed papers on computing, IT, ICTs, and related domains
• Provide enhanced awareness of new technologies
• Upgrade the skills of participants through direct, hands-on exposure to the state of the art
• Share the output of research programmes
• Enhance employability, especially for new professionals
• Introduce CSI to new geographies and domains
• Provide platforms for exposing new technologies, products, and concepts
• Provide enhanced career opportunities to members
• Provide forums where socially relevant technology pilots and programmes can be taken up
• Strengthen the Organizational Units of CSI (such as Chapters, Student Branches etc.)
• Enhance the reach, penetration, and membership of CSI
• Provide opportunities for individual and institutional members to convene programmes that broadly benefit a cross-section of society
• Provide positions and perspectives on ICT/IT issues of national importance and relevance
• Aid in bringing the benefits of Information and Communications Technologies (ICTs) to all citizens of the country
Programme Proposals
Pursuant to the above mission and objectives, CSI invites proposals from Chapters, Student Branches, and Institutional as well as individual members, for different kinds of international, national, regional, and state-level events for the year 2013-14, including, but not limited to:
• Technical Conferences (national/international)
• Seminars
• Workshops
• Research Symposia
• Faculty Development Programmes
• Job Fairs
• Exhibitions
• IT initiation programmes for schools
• Quiz programmes
• Student conventions for college and school students
• Pilot programmes
These events will be organized by Chapters, Student Branches or other organizational units of CSI, and supported by the entire CSI ecosystem, viz.,
• CSI Headquarters
• CSI Educational Directorate
• RVPs & Divisional Chairs
• National, Regional and State-level Student Co-ordinators and other regional staff
• SIG Chairs (where appropriate)
• Chapters and Student Branches
• Partnering organizations (e.g., IEEE Computer Society, CDAC, PMI)
• Associated international/national organizations (e.g., IFIP, SEARCC, IE, IETE, ISA)
Support from the state and central governments, trade associations, and business organizations can also be sought for these events.
All technical content generated through these events is expected to be hosted on CSI's Knowledge Management portal, in the form of a full-text-searchable online digital repository. CSI also proposes to recognize events through awards in different categories, based on parameters such as the quality of the event, participation, and surpluses generated.
Proposal Guidelines
Kindly apply to events@csi-india.org with the following particulars on or before 31 May 2013:
• Title of the event
• Type of event (e.g., Seminar/Workshop/Conference)
• Hosting Unit(s) (e.g., Chapter/Student Branch/Region/Divisions/SIGs)
• Duration
• Location
• Topics and outline of the event programmes
• Proposed benefits to CSI and members
• Potential Partners and Sponsors
• Target audience and size
• Preliminary Budget (Revenue and Surplus)
Kindly note that these events are expected to adhere to the provisions of the CSI Conferences Manual (please refer to the CSI website) as applicable.
CSI proposes to recognize the contribution of members and institutions in organizing events in the form of awards. The criteria and categories of awards will be announced soon.
Dear CSI Member -
Your hard copy of the CSI Communications magazine is sent to the address you have provided to CSI. Please ensure that this address is correct and up to date.
In case you need any help from CSI, please write an email to [email protected] for assistance.
You may send your feedback and comments on the contents of CSI Communications - Knowledge Digest for IT Community.
- On behalf of the editors of CSI Communications.
CSI 2013
48th Annual Convention of the Computer Society of India
Hosted by: CSI Visakhapatnam Chapter
In association with Visakhapatnam Steel Plant
Theme: ICT and Critical Infrastructure
Dates: 13-15 December 2013
Venue: Hotel Novotel, Visakhapatnam
Call For Papers / Participation
Introduction: CSI-2013, the 48th Annual Convention of the Computer Society of India (CSI), is being organized by the CSI Visakhapatnam Chapter, in association with Rashtriya Ispat Nigam Limited, Visakhapatnam Steel Plant. It will bring together researchers, engineers, developers and practitioners from academia and industry working in all interdisciplinary areas of information system engineering and computing, along with innovative IT professionals from government establishments, small, medium and big enterprises, non-government organizations and multinational companies, to share experience, exchange ideas and update their knowledge of the latest developments in emerging areas. Following the success of previous conventions, the CSI Visakhapatnam Chapter is set to conduct its first Annual Convention at Visakhapatnam; CSI-2013 will serve as a forum for discussions on state-of-the-art research, development and implementations of ICT applications.
The progress and growth of any country depend on its infrastructure, and ICT, having become pervasive, has a crucial role in managing that infrastructure. Keeping this in mind, the theme for CSI-2013 has been selected as ICT and Critical Infrastructure. The deliberations will focus on this aspect and cover innovative ways to deliver business value, optimize business processes and enable inclusive growth. They will also focus on proven IT governance, standards, practices, design and tools that lead to fast development and information flow to the user.
Invitation: We invite authors to submit papers reflecting original research work and practical experiences in the areas of interest to the convention. CEOs/CIOs, IT professionals, IT users, academicians, researchers, students, and members of CSI are invited to attend the convention as delegates. Software firms, industries and business houses are invited to participate in the convention and to present and exhibit their products and services. CSI-2013 invites papers of original research pertaining to 'ICT and Critical Infrastructure' on the following topics (but not limited to):
* ICT use in Critical Infrastructure (CI) * Security Challenges in using ICT in CI * Wireless and Mobility technologies in the Control Loop * ICT in Steel Industry * ICT in Heavy & Manufacturing Industry * ICT in Process Industry * ICT in BFSI * ICT in Transportation * ICT in Education * ICT in Telecom * ICT in Healthcare * ICT in E-Commerce * ICT in Maritime - Navy, Ship Building, Ocean Understanding * ICT in Rural Areas * ICT in eGovernance * Programming Paradigms for CI * Designing Applications for CI * ICT and Cyber-Physical Systems in Coastal Areas * Role of OTT in CI * Synergistic Policy Framework for ICT and CI * Coexistence of OTT, Cloud, and Social Networks * Big Data Analysis and CI * Machine Intelligence * Soft Computing Applications * AI and Nano Computing * Geoinformatics and Environment * Bio-informatics * Software Engineering * IT Security, Forensics and Cyber Crime
We also invite proposals for workshops, pre-conference tutorials and doctoral consortium.
Publication: Prospective authors are invited to submit paper(s) not exceeding 8 pages, written in A4 size as per the AISC, Springer format, on any one of the tracks listed above. The proceedings will be published in the AISC series of Springer.
Important Dates:
Address for Communication
Paramata Satyanarayana
Convener, CSI-2013
Sr. Manager, Central Computer Center
Visakhapatnam Steel Plant, Visakhapatnam – 530 031
Mobile: +91 9949556989
Email: [email protected]
Organizing Committee Chair: Sri T K Chand, D(C), RINL
Programme Committee Chair: Prof. P S Avadhani, AUCE (A)
Finance Committee Chair: Sri G N Murthy, ED (F&A), VSP
Registered with Registrar of News Papers for India - RNI 31668/78. Regd. No. MH/MR/N/222/MBI/12-14. If undelivered, return to: Samruddhi Venture Park, Unit No. 3, 4th floor, MIDC, Andheri (E), Mumbai-400 093. Posting Date: 10 & 11 every month. Posted at Patrika Channel Mumbai-I. Date of Publication: 10 & 11 every month.
Submission of Full Manuscript 15 – July – 2013
Notification of Acceptance 15 – Aug – 2013
Camera ready copy 31 – Aug – 2013
Submission of Tutorial/Workshop Proposals 30 – July – 2013
Registration Starts 31 – Aug – 2013
Paper Submission for CSI-2013: https://www.easychair.org/conferences/?conf=csi2013
For more details please visit http://www.csi-2013.org