Kaggle Bikeshare Competition - Part 1

Kaggle <- BikeShare
Taposh Dutta Roy
Jan 26th 2015
Presented at YHAT, Oakland, CA



Contents

• About Bikeshare
• Data
• Tools
• R
• Factor Engineering
• Matrix
• Random Forest
• Neural Nets

About Bike Share

Competition: http://www.kaggle.com/c/bike-sharing-demand

Challenge: Forecast use of a city’s bike share system

Data: UCI Machine Learning Repository

Publication: Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge", Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg.


Data

The goal is to predict count, either directly or as the sum of separate predictions for casual and registered rentals.
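The two options can be illustrated with a minimal Python sketch. The rows are made-up toy values, not competition data, and the per-hour mean is only a stand-in for a real learner; the helper name `fit_hourly_mean` is hypothetical.

```python
from collections import defaultdict

# Toy rows (made up for illustration, not competition data).
rows = [
    {"hour": 8, "casual": 5, "registered": 95, "count": 100},
    {"hour": 8, "casual": 7, "registered": 113, "count": 120},
    {"hour": 14, "casual": 30, "registered": 40, "count": 70},
    {"hour": 14, "casual": 20, "registered": 30, "count": 50},
]

def fit_hourly_mean(rows, target):
    """Stand-in 'model': the mean of `target` for each hour of day."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        sums[row["hour"]] += row[target]
        counts[row["hour"]] += 1
    return {hour: sums[hour] / counts[hour] for hour in sums}

# Option 1: model count directly.
direct = fit_hourly_mean(rows, "count")

# Option 2: model casual and registered separately, then add the predictions.
casual = fit_hourly_mean(rows, "casual")
registered = fit_hourly_mean(rows, "registered")
summed = {hour: casual[hour] + registered[hour] for hour in casual}

print(direct)
print(summed)
```

With a plain mean the two options give the same numbers, because averaging is linear. With nonlinear learners trained separately on casual and registered, the summed prediction generally differs from the direct one.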

Data Fields
datetime - hourly date + timestamp
season - 1 = spring, 2 = summer, 3 = fall, 4 = winter
holiday - whether the day is considered a holiday
workingday - whether the day is neither a weekend nor a holiday
weather -
  1: Clear, Few clouds, Partly cloudy
  2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
  3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
  4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
temp - temperature in Celsius
atemp - "feels like" temperature in Celsius
humidity - relative humidity
windspeed - wind speed
casual - number of non-registered user rentals initiated
registered - number of registered user rentals initiated
count - number of total rentals


Data
Datetime - hourly date + timestamp

Predefined Factors:
season - 1 = spring, 2 = summer, 3 = fall, 4 = winter
holiday - whether the day is considered a holiday
workingday - whether the day is neither a weekend nor a holiday
weather - 1 to 4, from clear (1) to heavy rain or snow with fog (4)
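Because season and weather are codes rather than quantities, one common way to feed them to a model is as indicator (one-hot) columns. The sketch below is an illustration under assumptions, not the method used in the talk: the helper names are hypothetical, and holiday and workingday are assumed to be stored as 0/1.

```python
SEASON_LABELS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}

def one_hot(value, levels):
    """Return a 0/1 indicator list with a 1 at the position of `value`."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels]

def encode_row(row):
    """Encode the four predefined factors of one record as indicators."""
    features = []
    features += one_hot(row["season"], [1, 2, 3, 4])
    features += one_hot(row["weather"], [1, 2, 3, 4])
    features.append(int(row["holiday"]))     # assumed to be 0/1 already
    features.append(int(row["workingday"]))  # assumed to be 0/1 already
    return features

# Made-up example record: fall, misty, a normal working day.
row = {"season": 3, "holiday": 0, "workingday": 1, "weather": 2}
encoded = encode_row(row)
print(SEASON_LABELS[row["season"]], encoded)
```

Treating the codes as categories keeps a model from reading weather 4 as "twice" weather 2.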

Data - Continuous

Data

Workday busy hours
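The chart for this slide did not survive in the transcript. One way such a view could be computed is to average count by hour of day over working days only, as in the sketch below. The records are made-up toy values, the timestamp format is an assumption, and `mean_count_by_hour` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime

# Toy records (made up): (datetime, workingday, count).
records = [
    ("2011-01-03 08:00:00", 1, 200),
    ("2011-01-03 13:00:00", 1, 80),
    ("2011-01-03 17:00:00", 1, 260),
    ("2011-01-04 08:00:00", 1, 220),
    ("2011-01-04 17:00:00", 1, 240),
    ("2011-01-08 13:00:00", 0, 150),  # a Saturday, skipped below
]

def mean_count_by_hour(records, workingday=1):
    """Average count per hour of day for rows matching the workingday flag."""
    totals, n = defaultdict(int), defaultdict(int)
    for stamp, is_workingday, count in records:
        if is_workingday != workingday:
            continue
        hour = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").hour
        totals[hour] += count
        n[hour] += 1
    return {hour: totals[hour] / n[hour] for hour in totals}

workday_profile = mean_count_by_hour(records)

# Hours ranked by average demand, busiest first.
busiest = sorted(workday_profile, key=workday_profile.get, reverse=True)[:2]
print(busiest)
```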


Tools

Weka

R

Python

H2O + R

Vowpal Wabbit

Using R

Feature Engineering

Citations

Feature-Weighted Linear Stacking. Joseph Sill, Gabor Takacs, Lester Mackey, and David Lin

Combining Predictions for Accurate Recommender Systems. Michael Jahrer, Andreas Töscher, Robert Legenstein