Data Warehouse Big Data Solutions
TRANSCRIPT
TMA Solutions
YOUR QUALITY PARTNER FOR SOFTWARE SOLUTIONS
www.tmasolutions.com
BI-Big Data-Analytics
Skill Set
Big data & data analysis:
Staging (ETL process)
Data warehousing (storage structures: ROLAP, MOLAP, HOLAP)
Data mining: classification/regression, clustering, association, sequence analysis
Reporting: Cognos, Jasper, and QlikView reports
Hadoop platform
Microsoft data analytics tools:
Microsoft Business Intelligence (Microsoft SQL Server Data Tools 2012): integration tool, analysis tool, reporting tool, SSRS, SSIS
Microsoft Office 2010: Excel PivotTable, Excel PivotChart
Microsoft SharePoint 2010 Enterprise: Team Site, Publishing Portal
Machine learning algorithms: decision tree, SVM, k-NN, Bayes' theorem, Bayesian network, neural network
Verification methods: ROC curve, t-test
R language
Sample Projects
Traffic Data Analysis
Speed Profile Clustering
Sincerity Sentiment Mining
Genome Alignment & Variant Calling
Imaging Mass Spectrometry
Traffic Data Analysis (1/2)
Real-time data reporting and analysis
Visualize data with charts and pivot tables
Build a traffic data warehouse
Users can analyze data by filtering, rolling up, and drilling down
[Figure: traffic data warehouse, dimensional data (facts and dimensions). Labels shown: Application, Geography, Device type, Report Type, Source Type, FactData, Time, Usage, User share, Hotspot/Auto, Hotspot, Device, GPS, LiveSpeed]
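The filtering, roll-up, and drill-down operations mentioned on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not the project's implementation: the fact rows, the column names, and the `aggregate` helper are all hypothetical and merely echo the Geography and Device type dimensions named above.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, city, device_type, hour, usage).
COLUMNS = ("country", "city", "device_type", "hour", "usage")
FACT_ROWS = [
    ("VN", "Ho Chi Minh City", "phone", 8, 120),
    ("VN", "Ho Chi Minh City", "tablet", 8, 30),
    ("VN", "Hanoi", "phone", 9, 80),
    ("US", "Dallas", "phone", 8, 50),
]


def aggregate(rows, group_by, filters=None):
    """Sum the usage measure per group after applying equality filters."""
    filters = filters or {}
    totals = defaultdict(int)
    for row in rows:
        record = dict(zip(COLUMNS, row))
        # Filtering: skip rows that do not match every requested value.
        if any(record[name] != value for name, value in filters.items()):
            continue
        key = tuple(record[name] for name in group_by)
        totals[key] += record["usage"]
    return dict(totals)


# Roll up: aggregate to the coarsest geography level.
rolled_up = aggregate(FACT_ROWS, ["country"])
# Drill down: keep one country and split it by city.
drilled_down = aggregate(FACT_ROWS, ["country", "city"], {"country": "VN"})
```

Rolling up drops dimension columns from the grouping key, while drilling down adds a finer-grained column and usually narrows the filter at the same time.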
Traffic Data Analysis (2/2)
Speed Profile Clustering
To support smart driving and navigation
Data collected from Navigation app
Billions of multidimensional records:
Road info
Vehicle speed with timestamp
Type of road
Methods used: SVM, Mean Shift (unsupervised clustering)
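To make the Mean Shift step concrete, here is a minimal flat-kernel sketch in plain Python. It is an illustration under stated assumptions, not the project's code: the speed profiles (average speed in three time slots), the bandwidth, and the function name are hypothetical.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: move each point to the mean of its
    neighbours within `bandwidth` until it stops moving, then merge
    nearby end points (modes) into cluster centers."""
    modes = []
    for point in points:
        current = list(point)
        for _ in range(max_iter):
            neighbours = [q for q in points if math.dist(current, q) <= bandwidth]
            if not neighbours:
                break
            new = [sum(column) / len(neighbours) for column in zip(*neighbours)]
            shift = math.dist(current, new)
            current = new
            if shift < tol:
                break
        modes.append(current)

    centers, labels = [], []
    for mode in modes:
        for index, center in enumerate(centers):
            if math.dist(mode, center) <= bandwidth:
                labels.append(index)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return centers, labels


# Hypothetical speed profiles: (morning, midday, evening) speed in km/h.
profiles = [
    (20, 50, 25), (22, 48, 24), (21, 52, 26),   # congested at rush hour
    (60, 62, 61), (58, 60, 59), (62, 64, 60),   # free-flowing all day
]
centers, labels = mean_shift(profiles, bandwidth=10)
```

Unlike k-means, mean shift does not need the number of clusters up front; the bandwidth decides how many modes emerge, so it has to be tuned to the scale of the speed data.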
[Figure: Speed Profile Clustering; axes: Time, Speed]
Sincerity Sentiment Mining
Analyzing and ranking the sincerity of sentiment expressed by reviewers/commenters in an online community
Handling a large historical dataset of reviews/comments
R language
Machine learning techniques
Natural language processing: POS tagger,…
Neural network
SVM
Bayesian network
Etc.
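As a rough illustration of how a probabilistic text classifier from this family works, the sketch below implements a multinomial naive Bayes classifier with add-one smoothing. Naive Bayes is swapped in here because it is the simplest model built on Bayes' theorem; the slides do not say it was the model used. The training sentences and the "positive"/"negative" labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (text, label). Returns per-label word counts,
    per-label document counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in docs:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)   # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled reviews.
TRAINING = [
    ("great fun and cool stunts", "positive"),
    ("cool film great cast", "positive"),
    ("bad screenwriting and dull plot", "negative"),
    ("dull and bad acting", "negative"),
]
model = train(TRAINING)
prediction = classify("cool stunts", model)
```

A real pipeline would add the POS tagging and other preprocessing listed above before counting words, and would train on far more data.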
Sincerity Sentiment Mining: A Typical Framework
Sample reviews:
- There's lots of cool stuff packed into espn's ultimate x!
- There's suspension of disbelief and then there's bad screenwriting..! this film packs a wallop of the latter!....
Genome Alignment & Variant Calling
Features
DNA, exome, and RNA alignment
Variant calling
Beta Result
Genome Alignment & Variant Calling
Next-generation sequencing machine (e.g. Illumina machine)
ACGTGTACAAGGTCCGGTTCTGAAAGTTGACCATGGATAACCGGTTAATTTAAGGAT
…..................AGTCCTTTTACATTGAGTAG
The human genome has about 3 gigabases (3 billion letters)
CEQEO System
Hundreds of millions of reads (30X coverage ~ 90 GB)
Aligned reads and variant calls
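The figures on this slide follow from simple coverage arithmetic, sketched below. The genome size and the 30X coverage come from the slide; the read length of 150 bases and the one-byte-per-base storage estimate are assumptions made only for illustration (real sequencing files also store quality scores and headers).

```python
GENOME_SIZE_BASES = 3_000_000_000   # about 3 gigabases, as stated on the slide
COVERAGE = 30                       # 30X coverage, as stated on the slide
READ_LENGTH = 150                   # hypothetical read length in bases


def sequenced_bases(coverage, genome_size=GENOME_SIZE_BASES):
    """Total bases sequenced: every genome position is read `coverage` times on average."""
    return coverage * genome_size


def reads_needed(coverage, read_length, genome_size=GENOME_SIZE_BASES):
    """Number of reads required, rounded up with integer arithmetic."""
    total = sequenced_bases(coverage, genome_size)
    return -(-total // read_length)


total_bases = sequenced_bases(COVERAGE)           # 90 billion bases
total_reads = reads_needed(COVERAGE, READ_LENGTH)  # 600 million reads
approx_gigabytes = total_bases / 1e9              # about 90 GB at one byte per base
```

With these assumptions, 30 x 3 gigabases gives 90 gigabases of sequence spread over 600 million reads, which matches the "hundreds of millions of reads" and "~ 90 GB" quoted above.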
Imaging Mass Spectrometry (1/2)
Features
Analyzing ToF-SIMS data (multidimensional)
Imaging / visualizing the data
Techniques
PCA algorithm
Bayes theorem
R Language
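To show what the PCA step does with multidimensional spectra, the sketch below extracts the first principal component in plain Python. The slides name only PCA; computing it by power iteration on the covariance matrix is one common approach chosen here for brevity, and the pixel intensities and function name are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first principal component,
    found by power iteration on the sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # starting vector; assumed not orthogonal to the top component
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the component to get one score per pixel.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical data: one row per pixel, one column per mass channel.
pixels = [
    (1.0, 2.0, 0.5),
    (2.0, 4.0, 0.4),
    (3.0, 6.0, 0.6),
    (4.0, 8.0, 0.5),
]
direction, scores = first_principal_component(pixels)
```

Mapping the per-pixel scores back onto pixel coordinates gives a single-channel image that summarizes the dominant variation across mass channels, which is the usual reason PCA is paired with imaging data.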
Imaging Mass Spectrometry (2/2)
Multidimensional raw data
Visualization
THANK YOU!
North America number: +1 802-735-1392
Australia number: +61 414-734-277
Japan number: +81 3-6432-4994
Website: www.tmasolutions.com
Tel: +84 8 3997-8000
Mobile: +84 908-676-212
Fax, Email: +84 8 [email protected]