User Modelling Challenge, Ideatory 2014

Submitted by: Parindsheel S Dhillon
Contest Organized by: Ideatory.co

Upload: parindsheel-dhillon

Post on 14-Jul-2015


Category:

Data & Analytics





User Modeling Data Analytics

Contest-Stage 1
• Survey ratings given by users to various events are to be predicted
• Geospatial data of users provided
• Event details & geospatial details of events provided
• 25 million observations of training data
• 0.3 million unique user-event combination ratings to be predicted
• Adopted analytics techniques
  • Data pre-processing
  • Ordinal linear regression, clustering, decision trees
• Analytics tools
  • R, WEKA, MS Office

User Modeling Data Analytics

Contest-Stage 2
• Recommend a predictive modeling idea based on daily routine life
• Problem scenario
• Predictive modeling idea
• Data collection & data dictionary
• Data sample (representative)
• Data analytics & result delivery

Scenario
• Obesity is one of the biggest problems
  • More than 1.4 billion adults were overweight in 2008 (WHO)
  • More than 40 million children under the age of 5 were overweight or obese in 2012 (WHO)
  • More than 2/3 of the current USA population is overweight
• Overweight is a leading factor for various diseases
  • Cardiovascular diseases, type 2 diabetes, osteoarthritis & some cancers such as endometrial, breast, and colon (WHO)
• Changing lifestyle & eating habits
  • Overuse of packaged food containing trans fat, sugar & salt
  • Sedentary lifestyle with increased use of television, computer, mobile & sitting jobs

Predictive Modeling Idea
• By predicting body weight change in 3 months based on some daily activities
  • Many people will foresee their overweight future & its associated problems
  • Create awareness against obesity to save lives
  • Gyms, health centers etc. can also be strategically involved to cash in on this opportunity by using weight change predictive modeling
• Idea adoption
  • Increasing awareness regarding health issues
  • Zero-figure culture among the female population

Data Collection
• Data collection from the activities below
  • Lifestyle & food intake
  • Work profile
  • Additional workout (if any)
  • Personal & demographic data
• Data collection process
  • Food intake data will be collected using a smartphone app.
  • Daily workouts, e.g. walking, cycling, running & swimming, could also be collected using the smartphone app.
  • Personal & demographic data will be collected when a user signs up for the app.

Representative Data
28 independent variables that affect body weight, along with demographic variables, have been suggested. Imaginary data for two observations is shown below; the weight change to be predicted is marked "?".

Variable          Observation 1   Observation 2
Date              1/8/2014        1/8/2014
CustID            1               2
Age               35              48
Sex               M               F
Weight            65              80
Height            165             162
Place             Chandigarh      Los Angeles
Origin            Indian          American
regBreakfast      1000            700
regLunch          1000            1500
regDinner         1200            1400
regOther          300             500
totalCal          3500            4100
sugarFreq         1-2 times       3-5 times
junkFreq          3-5 times       more than 5
alcoholFreq       3-5 times       Nil
alcoholQty        60ml            Nil
softDrinkFreq     very rarely     more than 500ml
skipBreakfast     Nil             Nil
parentOverwt      yes             yes
medicalProb       no              No
medication        no              No
sleepHours        7               6
quitSmoke         no              no
workProfile       sedentary       light
workHrs           8               10
fitnessActivity   No              gym
minsFitness       0               60
weightChange      ?               ?
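The two imaginary observations can be held in a typed record. The following is a minimal Python sketch, not part of the original submission: the class name `Observation`, the snake_case field names, the reduced set of fields, and the units (kilograms, centimetres, kcal) are all assumptions made for illustration, since the slides do not state units.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One row of the representative data (subset of fields; units are assumed)."""
    date: str
    cust_id: int
    age: int
    sex: str
    weight: float            # assumed kilograms
    height: float            # assumed centimetres
    reg_breakfast: int       # assumed kcal
    reg_lunch: int
    reg_dinner: int
    reg_other: int
    work_profile: str
    mins_fitness: int
    weight_change: Optional[float] = None   # target, unknown ("?") until measured

    def total_cal(self) -> int:
        """Recompute totalCal as the sum of the four meal columns."""
        return self.reg_breakfast + self.reg_lunch + self.reg_dinner + self.reg_other


obs1 = Observation("1/8/2014", 1, 35, "M", 65, 165, 1000, 1000, 1200, 300, "sedentary", 0)
obs2 = Observation("1/8/2014", 2, 48, "F", 80, 162, 700, 1500, 1400, 500, "light", 60)
print(obs1.total_cal(), obs2.total_cal())  # 3500 4100
```

Recomputing the total from the meal columns gives a cheap consistency check on app-collected data: both sample rows match their stated totalCal values.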

Analytics
• Data pre-processing
  • Variable transformation, e.g. net calories stored in the body
  • Calorie counts might need to be calculated for energy used and energy intake
  • Outlier detection & certain medical obesity issues
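The "net calories stored" transformation could be sketched as energy intake minus estimated energy used. The slides do not say how energy used would be calculated, so this sketch assumes the Mifflin-St Jeor equation for resting energy and commonly quoted activity multipliers (1.2 for sedentary, 1.375 for light). The function names are hypothetical, and weight in kilograms and height in centimetres are assumed.

```python
# Assumed activity multipliers; the slides do not specify any.
ACTIVITY_FACTOR = {"sedentary": 1.2, "light": 1.375}


def resting_energy(weight_kg: float, height_cm: float, age: int, sex: str) -> float:
    """Mifflin-St Jeor estimate of resting energy expenditure in kcal/day."""
    base = 10 * weight_kg + 6.25 * height_cm - 5 * age
    return base + 5 if sex == "M" else base - 161


def net_calories(total_cal, weight_kg, height_cm, age, sex, work_profile):
    """Energy intake minus estimated energy used, in kcal/day."""
    used = resting_energy(weight_kg, height_cm, age, sex) * ACTIVITY_FACTOR[work_profile]
    return total_cal - used


# Observation 1 from the representative data: 3500 kcal intake, sedentary work.
print(net_calories(3500, 65, 165, 35, "M", "sedentary"))
```

This version ignores the additional fitness minutes; a fuller transformation would subtract an estimate of the energy spent on the recorded workout as well.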

• Statistical techniques for data analytics
  • Linear regression (stepwise)
  • Akaike information criterion (AIC) will be used to compare relative model quality
  • Analysis of variance (ANOVA)
  • Time series can also be used for long-term prediction
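Stepwise linear regression guided by AIC can be illustrated with a small forward-selection routine. The sketch below is one possible reading of that bullet, not the author's implementation: it uses plain Python, the Gaussian form AIC = n ln(RSS/n) + 2k (up to an additive constant), and synthetic data whose columns and coefficients are invented purely for the demonstration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def aic(rows, y, cols):
    """AIC (up to a constant) of an OLS fit on an intercept plus the given columns."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * k


def forward_stepwise(rows, y):
    """Add the predictor that lowers AIC most until no addition improves it."""
    selected, remaining = [], list(range(len(rows[0])))
    best = aic(rows, y, selected)
    while remaining:
        score, col = min((aic(rows, y, selected + [c]), c) for c in remaining)
        if score >= best:
            break
        best = score
        selected.append(col)
        remaining.remove(col)
    return selected, best


# Synthetic data (invented): column 0 ~ net calories in thousands of kcal/day,
# column 1 ~ fitness hours/day, column 2 is pure noise unrelated to the target.
random.seed(0)
rows = [[random.uniform(-0.5, 2.0), random.uniform(0.0, 1.5), random.gauss(0, 1)]
        for _ in range(200)]
y = [1.5 * r[0] - 2.0 * r[1] + random.gauss(0, 0.5) for r in rows]

selected, best_aic = forward_stepwise(rows, y)
print("selected columns:", selected)
```

In R, which the Stage 1 slide lists as a tool, the built-in `step()` function performs this kind of AIC-based selection on a fitted `lm` model, so the hand-written solver here is only for transparency.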

Analytical Result & Delivery
• Analytical result
  • Body weight change in 3 months based on daily activities will be predicted for any individual
  • For longer-duration prediction, time series can be used along with Markov chain analysis
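One way Markov chain analysis might extend the 3-month prediction is to treat weight categories as states and each 3-month period as one transition. The sketch below is illustrative only: the three states and every transition probability are hypothetical placeholders, not estimates from any data.

```python
STATES = ["normal", "overweight", "obese"]

# Hypothetical 3-month transition probabilities (each row sums to 1); not estimated from data.
P = [
    [0.90, 0.10, 0.00],
    [0.05, 0.85, 0.10],
    [0.00, 0.10, 0.90],
]


def step(dist, matrix):
    """Advance a probability distribution over states by one period."""
    size = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def forecast(dist, matrix, periods):
    """Apply the transition matrix repeatedly, e.g. 4 periods for one year."""
    for _ in range(periods):
        dist = step(dist, matrix)
    return dist


start = [1.0, 0.0, 0.0]   # a person currently in the "normal" state
one_year = forecast(start, P, 4)
print(dict(zip(STATES, (round(p, 4) for p in one_year))))
```

In practice the transition probabilities would have to be estimated from repeated 3-month observations of app users, and could be made to depend on the regression's predicted weight change.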

• Result delivery
  • A feature like the FatBooth phone application needs to be integrated into the original phone app to show the prediction along with a weight bar; a photo needs to be taken for this additional activity.
  • The address & contact details of strategic-alliance health centres can be forwarded along with the results
  • General advice, such as reducing sugary content or soft drinks, can be given to the customer based on the data

Thanks

Parindsheel Singh Dhillon

in.linkedin.com/in/psdhillon