anurag agrawal_new_ml


Upload: anurag-agrawal

Post on 13-Apr-2017


Anurag Agrawal 1215 E Vista Del Cerro #2097 Tempe AZ 85281 | +1 480 434 2570 | [email protected] | LinkedIn Profile

SUMMARY
Computer science graduate student and research associate at the iMPACT Lab with 3.5 years of work experience; proficient in Java, PL/SQL, PostgreSQL, SQL, and big data technologies (Apache Hadoop, Apache Spark)

EDUCATION
Master of Science in Computer Science (May ’17)
Arizona State University, Tempe, AZ, USA, GPA 3.67
Bachelor of Engineering in Computer Science and Engineering (Jun ’13)
Rajiv Gandhi Technology University, Bhopal, India, GPA 3.8 (First with Distinction)

SKILL SET/CERTIFICATION
Languages: Java, PL/SQL, Python, C++, Java Servlets, Matlab, Android
Database: PostgreSQL, MySQL, Oracle
Web Development: RESTful API, JSON, HTML, XML, JavaScript, Node.js
Tools/Platform: Oracle R12, Apache Spark, Hadoop, MapReduce, Unix/Linux platform
Certification: Six Sigma Green Belt

PROFESSIONAL EXPERIENCE
Assistant System Engineer, Tata Consultancy Services Ltd., India (Jan ’14 – Dec ’14)
Worked as an Oracle technical consultant on a development team of 92 associates, integrating and extending Oracle R12 functionality for the power generation domain using Java and PL/SQL
Extended the Oracle Inventory module to map an item’s price from its first approved purchase order (PO) using PL/SQL and SQL
Integrated Oracle Supply Chain Management (SCM) with the CRM module to schedule large volumes of maintenance requests, and tuned complex join queries to reduce processing time

RESEARCH EXPERIENCE
Research Associate, iMPACT Lab, Arizona State University – Advisor Dr. Sandeep Gupta (May ’15 – Present)
Developing a regression model that forecasts a person’s blood glucose from a time-series dataset, using the intensity of physical activity, meal intake, and insulin response as features, with machine learning techniques in Matlab, Python, and scikit-learn
Implementing an automated testbed that verifies the security of pervasive monitoring systems, where the system model consists of sensors, mobile devices, servers, and distributed database servers, using Java, MySQL, Java Servlets, Android, and TinyOS
Implemented a least-squares optimization algorithm to estimate the differential parameters of the Bergman Minimal Mathematical Model
Developed a REST API for migrating data from IoT devices to servers and storing it in a database – Java Servlet, Apache Commons, JSON
Research Associate, DataSys Lab, Arizona State University – Advisor Dr. Mohamed Sarwat (Jan ’15 – Jul ’15)
Worked on a recommendation algorithm that recommends items based on a user’s previous activity by clustering similar products, using cosine similarity and Pearson correlation as metrics
Evaluated the system’s performance on large datasets such as 100M MovieLens and the Yelp academic dataset using Python and PostgreSQL

PUBLICATION
[Accepted: ATTD 2017, Feb ’17] Linear Models of Physical Activity and their Effect on Accuracy of Blood Glucose Level Prediction

ACADEMIC PROJECTS
Momentum in Logistic Regression – Python (Aug ’16 – Dec ’16)
Developed a logistic regression machine learning algorithm with separate implementations of hinge loss (max margin), L2 regularization, Polyak momentum, Nesterov momentum, and log-likelihood loss, using Python and the JSON data format
Evaluated the accuracy and iteration count of each implementation on the large Amazon-Google Product 1M dataset, which has 116 attributes
Classification of a biomedical dataset for predicting human eye disease – Python (Jan ’16 – May ’16)
Developed a classification model to predict human eye disease from a biomedical dataset of 930 observations with 19 external features, using logistic regression, SVM, and a neural network; attained 82.3% accuracy with 10-fold cross-validation
Optimized algorithm for join operations in the database – Java (Jan ’16 – May ’16)
Implemented an efficient and scalable algorithm for inequality joins in the Minibase database that improved join performance by 15x over sort-merge join, evaluated on a large 5M dataset
Geo-Spatial operations – Spark, Java, Microsoft Azure (Nov ’15 – Feb ’16)
Developed a scalable distributed system on a 5-node Apache Spark and Hadoop cluster to perform convex hull, spatial join query, and spatial aggregation operations on large geospatial datasets
Implemented grid file and Sort-Tile-Recursive (STR) indices on geospatial datasets to improve system performance
Identified the minimum number of nodes a cluster needs to optimize query time in a distributed environment relative to a centralized environment