Data Scientist Job Description

Upload: ajay-kumar

Post on 24-Sep-2015

As the demand for the Data Scientist job position continues to increase, we're seeing significant variation in the ads appearing on places like LinkedIn. From the inconsistency I've noticed in the ads, it is reasonable to question whether the internal/external recruiters and HR departments actually know what a data scientist does. In some cases, I've seen two or three positions described for just one data scientist position. I wonder whether the successful candidate will command a salary equal to that of two or three people. It is as if the HR manager just did a Google search on "data scientist" and copied and pasted every keyword they could find.

So I thought it would be interesting to evaluate a typical job ad that I found floating around on LinkedIn. This particular ad was posted by Facebook. Let's see how well it does (see my commentary in brackets):

Job description

Facebook is seeking a Data Scientist to join our Data Science team. Individuals in this role are expected to be comfortable working as a software engineer and a quantitative researcher [I don't agree with this trend. A data scientist is NOT necessarily a coder, and using one as a coder is, I believe, a waste of talent. A coder is most definitely NOT a data scientist, who should have a significant theoretical foundation in mathematical statistics]. The ideal candidate will have a keen interest in the study of an online social network, and a passion for identifying and answering questions that help us build the best products. ["answering questions" alludes to the storytelling ability that all data scientists must have in order to convey in lay terms what the data are saying]

Responsibilities

Work closely with a product engineering team to identify and answer important product questions
Answer product questions by using appropriate statistical techniques on available data [again alluding to the important storytelling ability data scientists should possess; more than just numbers]
Communicate findings to product managers and engineers [ditto storytelling above]
Drive the collection of new data and the refinement of existing data sources [data scientists should have deep data munging ability to prepare data for machine learning algorithms]
Analyze and interpret the results of product experiments [the "scientific method" should be a prime ingredient of a data scientist's tool set]
Develop best practices for instrumentation and experimentation and communicate those to product engineering teams [ditto storytelling above]

Requirements

M.S. or Ph.D. in a relevant technical field, or 4+ years experience in a relevant role [it is good to see that a traditional academic background "in a relevant technical field" is a top requirement, although the education requirement may have to evolve a bit to include prospects trained in the new MOOC ecosystem]
Extensive experience solving analytical problems using quantitative approaches [this is where a solid background in mathematics and statistics is valuable]
Comfort manipulating and analyzing complex, high-volume, high-dimensionality data from varying sources
A strong passion for empirical research and for answering hard questions with data [totally agree here, with a "passion for empirical research" mandatory]
A flexible analytic approach that allows for results at varying levels of precision
Ability to communicate complex quantitative analysis in a clear, precise, and actionable manner [ditto storytelling above]
Fluency with at least one scripting language such as Python or PHP [sorry, Coder ≠ Data Scientist]
Familiarity with relational databases and SQL [absolutely agree]
Expert knowledge of an analysis tool such as R, Matlab, or SAS [a true data scientist must be expert level with one of these for machine learning modeling]
Experience working with large data sets, experience working with distributed computing tools a plus (Map/Reduce, Hadoop, Hive, etc.) [many job ads now include this requirement and I agree with it, but not to the level of requiring the data scientist to be responsible for the production architecture, deployment, and maintenance and support of, for example, a Hadoop system; that is just not realistic]

I think the above Facebook job ad is better than many I've seen recently, and I'm sure the person the company ultimately hires will be a good fit.
That being said, I believe in separating the experimentalist data scientist, who codes production systems and is more of an IT person, from the theorist data scientist, who analyzes data, does exploratory data analysis, develops machine learning models, evaluates models, and writes cogent reports for management consumption.