real-time analytics in big data
Real-time Analytics in Big Data Ecosystem
ACCESS DATA REAL-TIME
What is Big Data?
Big data is a term that describes the large volume of
data – both structured and unstructured – that
inundates a business on a day-to-day basis. But it’s
not the amount of data that’s important. It’s what
organizations do with the data that matters. Big
data can be analyzed for insights that lead to better
decisions and strategic business moves.
Data is Exploding
Today, every 2 minutes we generate the same amount of data that was created from the beginning of time until the year 2000.
Every minute we send over 200 million emails, generate almost 2 million Facebook likes, send over 250 thousand tweets, and upload over 20,000 photos on Facebook.
90%
Over 90% of all the data in the world was created in the past 18 months.
Google alone processes 40 thousand search queries per second, making it over 3.5 billion in a single day.
Over 100 hours of video are uploaded on YouTube every minute and it would take you around 15 years to watch every video uploaded by users in one day.
If you burned all the data created in just one day onto DVDs, you could stack them on each other and reach the moon – twice.
The number of bits of information stored in the digital universe is thought to have exceeded the number of stars in the physical universe in 2007.
The big data industry is expected to grow from US $10.2 billion in 2013 to about US $54.3 billion by 2017.
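The per-day figures quoted above follow from simple unit conversion. The short Python sketch below reproduces that arithmetic. The variable names are illustrative, and the inputs are the rounded numbers from the slides, not independent measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
MINUTES_PER_DAY = 24 * 60        # 1,440

# Google: 40 thousand search queries per second, as quoted on the slide.
queries_per_day = 40_000 * SECONDS_PER_DAY

# YouTube: 100 hours of video uploaded per minute, as quoted on the slide.
hours_uploaded_per_day = 100 * MINUTES_PER_DAY
years_to_watch = hours_uploaded_per_day / (24 * 365)

# Industry growth: US $10.2 billion (2013) to US $54.3 billion (2017),
# expressed as a compound annual growth rate over 4 years.
cagr = (54.3 / 10.2) ** (1 / 4) - 1

print(f"Queries per day: {queries_per_day / 1e9:.2f} billion")
print(f"Years to watch one day of uploads: {years_to_watch:.1f}")
print(f"Implied annual growth: {cagr:.1%}")
```

With these rounded inputs the query count comes to roughly 3.46 billion per day and the viewing time to roughly 16 years, so the slide figures should be read as approximations.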
TECHNOLOGIES FOR REAL-TIME ANALYTICS SOLUTION
Apache Kafka
Fast, scalable, and durable
Based on a modern cluster-centric design
Handles hundreds of megabytes of reads and writes per second
Designed to allow a single cluster to serve
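To make the idea of a durable log read by many consumers concrete, here is a minimal in-memory sketch of an append-only log in which each consumer tracks its own read offset. It is a toy model for illustration only, not Kafka's client API. The names `ToyLog`, `append`, and `poll` are hypothetical.

```python
from collections import defaultdict


class ToyLog:
    """Hypothetical in-memory append-only log (illustration only, not Kafka's API)."""

    def __init__(self):
        self._records = []                 # records are only ever appended
        self._offsets = defaultdict(int)   # next offset to read, per consumer

    def append(self, record):
        """Append a record and return the offset it was stored at."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return the next unread records for `consumer` and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = ToyLog()
for event in ("click", "view", "purchase"):
    log.append(event)

first_batch = log.poll("dashboard", max_records=2)   # reads offsets 0 and 1
second_batch = log.poll("dashboard")                 # reads offset 2
archive_batch = log.poll("archiver")                 # independent consumer starts at 0
```

Because records are never modified after they are appended, a second consumer can replay the full history without affecting the first one.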
Apache Storm
Free, open-source, distributed, and real-time computation system
Simple and can be used with any programming language
Fast, guaranteed data processing, easy to set up and operate
Integrates with queuing and database technologies
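Storm structures a computation as a stream source (a spout) feeding a chain of processing steps (bolts). The sketch below imitates that shape with plain Python generators, processing one item at a time as it arrives. It does not use Storm's API, and the function names are hypothetical.

```python
from collections import Counter


def sentence_source(sentences):
    """Emit one sentence at a time, playing the role of a stream source."""
    for sentence in sentences:
        yield sentence


def split_words(stream):
    """Processing step: split each incoming sentence into lowercase words."""
    for sentence in stream:
        for word in sentence.lower().split():
            yield word


def running_counts(stream):
    """Processing step: emit the updated count each time a word arrives."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


sentences = ["big data", "real time data"]
updates = list(running_counts(split_words(sentence_source(sentences))))
print(updates)
```

Each stage consumes items lazily, so a count is updated as soon as its word arrives, without waiting for the whole input to be collected.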
Spark
Open-source, distributed computing framework
Addresses critical challenges to advanced analytics in Hadoop
Supports in-memory processing and is faster than MapReduce
Offers integrated framework for advanced analytics
Druid
Open-source infrastructure for real-time exploratory analytics
Druid’s real-time nodes employ lock-free ingestion for append-only data sets
Leverages memory mapping capabilities and uses distributed architecture
Druid offers multi-dimensional filtering
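As a rough illustration of multi-dimensional filtering over append-only event data, the sketch below filters a small list of events on several dimensions at once and then sums a metric per dimension value. It is plain Python, not Druid's query language. The sample rows and the helper names `filter_events` and `sum_by` are made up for the example.

```python
from collections import defaultdict

# Made-up sample events: three dimensions (country, device, page) and one metric.
events = [
    {"country": "US", "device": "mobile", "page": "home", "clicks": 3},
    {"country": "US", "device": "desktop", "page": "home", "clicks": 5},
    {"country": "IN", "device": "mobile", "page": "home", "clicks": 2},
    {"country": "US", "device": "mobile", "page": "cart", "clicks": 4},
    {"country": "IN", "device": "mobile", "page": "cart", "clicks": 1},
]


def filter_events(rows, **dimensions):
    """Keep rows whose values match every requested dimension."""
    return [
        row for row in rows
        if all(row[name] == value for name, value in dimensions.items())
    ]


def sum_by(rows, dimension, metric):
    """Total `metric` for each distinct value of `dimension`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[metric]
    return dict(totals)


us_mobile = filter_events(events, country="US", device="mobile")
mobile_clicks_by_country = sum_by(filter_events(events, device="mobile"), "country", "clicks")
print(len(us_mobile), mobile_clicks_by_country)
```

A real analytics store answers such queries with indexes and columnar storage instead of scanning every row; the sketch only shows what the query means.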
Companies and their Big Data Solutions
WHAT THE COMPANIES OFFER
Enterprise big data initiatives face a massive challenge in processing large volumes of data and extracting value from them. The right big data services, however, can process huge volumes of data and extract the kind of actionable insights that can truly drive a business forward.
Big data analytics accelerators and aggregators
Partnerships and alliances with major big data solutions vendors
Big data maturity roadmaps and reference architecture
Starting point to endpoint implementation assessments
Industry-specific key performance indicator (KPI) toolkits
Innovative industry frameworks tailored for specific industry needs
Big data labs and Centers of Excellence (CoEs) across multiple locations that focus on product evaluation and performance benchmarking
Employee count: 15,000+
www.mindtree.com
Technology used:
In-house experts use technology, proven frameworks and tools and domain expertise to turn problems into successful business outcomes, delivering data visualization, enterprise data management, business intelligence and data analytic solutions under one umbrella.
Central to Cognizant's strategy around discovering and driving business value in big data is our innovative suite of solutions. Each leverages big data technologies to deliver enhanced insight and analytics to various industries.
Solution accelerators
Big data lab on demand
Idea to implementation
Data visualization and analytics
Technology evaluation and piloting
Big data strategy and roadmap definition
Employee count: 100,000+
www.cognizant.com
Technology used:
• Big Data Analytics Value Assessment (BAVA) Framework
• iSMART (integrated Social Media Analytics and Reporting Tool)
• SCOREL (stock correlation analytics)
• SmartNode
• Hadoop
Cybage’s expertise covers an array of relevant tooling, frameworks, and building blocks. Its pre-verified core Hadoop frameworks, with known gaps already addressed, take the guesswork out of implementation. Big data insights and cloud infrastructure have made it imperative for products and services to create and deliver experiences through digital channels and infrastructure.
Coordinated infrastructure and workflow frameworks
Quick Analytics
NoSQL databases: MongoDB, Cassandra, HBase, and Neo4j
Distributed log processing: Flume, Scribe, and Chukwa
Hadoop-focused QA: Comprehensive big data verification, cluster benchmarking, and performance tuning
Specialized test methodology: Purpose-engineered statistical test methodology for big data solution verification
Focused big data test team: Dedicated QA Architect and big data test team
Employee count: 5,000+
www.cybage.com
Technology used:
Hadoop, Sqoop, Hive, Pentaho, SSRS, Cognos, and QlikView
To help organizations make sense of their data, Persistent has developed ShareInsights, a unique platform that allows organizations to analyze enterprise data overlaid with public or cloud sources to derive meaningful insights. An open platform, ShareInsights enables users to mine insights from the data sources that matter to them and share them with a wide audience. Users can quickly and easily onboard new use cases and summarize large volumes of unstructured data.
Multi-faceted data that lets users gain interesting insights
Quick Analytics
Seamlessly share insights on Facebook or the ShareInsights Gallery
Library of algorithms and integration with third party datasets, including public datasets
Built-in visualizations
Drill-down capabilities to find particular behavior
Analyzes unstructured text
Employee count: 8,000+
www.persistent.com
Technology used:
Hadoop, Sqoop, SciDB
Benefits of Big Data
WHAT YOU CAN ACHIEVE WITH BIG DATA
Big Data
Dialogue with Consumers
New Products & Services
Risk Analysis
Faster and Better
Reduced Cost
Customize in Real Time