Introduction to Data Science with R and Spotfire
Dr. Brand Niemann
Director and Senior Data Scientist/Data Journalist
Semantic Community
Data Science
Data Science for Random Forests
November 2, 2015
Overview
• Learning Path: Data Science with R
• Play Video: 15 Minute Introduction
• Kaggle Competition
• How Much Did It Rain Part II
• Kaggle Rain II in Spotfire
• TIBCO Spotfire TERR Tools
Learning Path: Data Science with R
• Publisher: O'Reilly Media
• Released: August 2015
• Run time: 24 hours 34 minutes
• The R programming language has arguably become the single most important tool for computational statistics, visualization, and data science. With this Learning Path, master all the features you'll need as a data scientist, from the basics to more advanced techniques including R Graph and machine learning. You'll work your data like never before.
• Learning to Program with R, by Stuart Greenlee, 04:18:15
• Introduction to Data Science with R, by Garrett Grolemund, 08:36:40
• Expert Data Wrangling with R, by Garrett Grolemund, 03:50:39
• Writing Great R Code, by Richard Cotton, 00:59:13
• Data Science with Microsoft Azure and R, by Stephen Elston, 06:48:46
https://player.oreilly.com/videos/9781491940303?toc_id=220077
Play Video: 15 Minute Introduction
https://www.kaggle.com/c/how-much-did-it-rain-ii
https://www.kaggle.com/c/how-much-did-it-rain-ii/data
train.CSV 1219 MB
test.CSV 633 MB
sample_solution.CSV 12 MB
sample_dask.py
Observations
• The preceding statistics characterize the three data sets.
• Treating a stochastic problem with a deterministic modeling approach.
• Marshall–Palmer relation: Z = aR^b, where a and b are adjustable parameters. Z (mm^6 m^-3) is the radar reflectivity and R (mm h^-1) is the rainfall rate.
• Data Dictionary:
  • radardist_km: Distance of the gauge from the radar whose observations are being reported.
  • Ref: Radar reflectivity in dBZ
  • RefComposite: Maximum reflectivity in the vertical column above the gauge, in dBZ
  • RhoHV: Correlation coefficient (unitless)
  • Zdr: Differential reflectivity in dB
  • Kdp: Specific differential phase (deg/km)
  • Expected: Actual gauge observation in mm at the end of the hour.
• Try Insert Calculated Column and/or Regression Modeling.
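One way to read the Marshall–Palmer bullet as a calculated column: convert Ref from dBZ to linear reflectivity Z, then invert Z = aR^b for R. The following is a minimal Python sketch, not the deck's own method. It assumes the classic Marshall–Palmer coefficients a = 200 and b = 1.6, which the slides do not state, and the function name rain_rate_from_dbz is hypothetical.

```python
import math

# Assumed coefficients (classic Marshall-Palmer values; not stated in the slides).
A = 200.0
B = 1.6


def rain_rate_from_dbz(ref_dbz, a=A, b=B):
    """Return rainfall rate R (mm/h) for a reflectivity given in dBZ.

    Hypothetical helper: dBZ -> linear Z (mm^6 m^-3), then R = (Z / a) ** (1 / b).
    """
    z_linear = 10.0 ** (ref_dbz / 10.0)  # undo the decibel scaling
    return (z_linear / a) ** (1.0 / b)   # invert Z = a * R**b


if __name__ == "__main__":
    for dbz in (10.0, 23.0, 40.0):
        print(f"{dbz:5.1f} dBZ -> {rain_rate_from_dbz(dbz):.2f} mm/h")
```

The same arithmetic could be entered as a calculated column on Ref. Because Expected is an hourly gauge total in mm, the per-observation rates would presumably still need to be aggregated over the hour before any comparison.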
Insert Calculated Column
Regression Modeling
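A regression can estimate a and b because taking logarithms of Z = aR^b gives a straight line: log10 Z = log10 a + b log10 R. The following is a minimal Python sketch of that fit using ordinary least squares. The function name fit_power_law and the synthetic (R, Z) pairs are illustrative assumptions, not competition data or the deck's own model.

```python
import math


def fit_power_law(rain_rates, reflectivities):
    """Fit Z = a * R**b by least squares on log10-transformed values.

    Hypothetical helper; all inputs must be positive. Returns (a, b).
    """
    xs = [math.log10(r) for r in rain_rates]
    ys = [math.log10(z) for z in reflectivities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the log-log line is the exponent b
    a = 10.0 ** (mean_y - b * mean_x)  # intercept of the log-log line is log10(a)
    return a, b


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed a = 200, b = 1.6.
    rates = [0.5, 1.0, 2.0, 5.0, 10.0, 25.0]
    zs = [200.0 * r ** 1.6 for r in rates]
    a_hat, b_hat = fit_power_law(rates, zs)
    print(f"a = {a_hat:.1f}, b = {b_hat:.2f}")
```

On real radar and gauge data the points would scatter around the line, which is one way to see the earlier remark about treating a stochastic problem with a deterministic model.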
TIBCO Spotfire TERR Tools
MORE TO FOLLOW