bharati_singh_15


Upload: bharati-singh

Post on 14-Aug-2015



BHARATI SINGH
Tata Consultancy Services, Bangalore, Karnataka.
Mobile: 07406630988, Email: [email protected]

Career Objective
To work in an environment that offers the opportunity to work on emerging technologies and broaden my knowledge. I aspire to become an expert in High Performance Computing and to put that expertise to the best use for the company.

Total HPC Experience: 5 years

Current Employer
Tata Consultancy Services, Bangalore, Karnataka.
December 2012 – Present.
Designation: Systems Engineer.

Onsite HPC Consultant
HPC Consultant for GE Bangalore, supporting the Aviation, Energy, Oil & Gas, Transportation and Appliances business groups and the Global Research group.
Total Duration: Mar 2013 – Present (2 years 2 months). Bangalore, India.

Current Project Name: HPC Onsite at GE (with PBSpro)
Apr 2014 – Present (1 year 1 month)
Bangalore, India.

HPC Environment: Supporting a cloud HPC with a total of 21,545 cores, built on HP servers with Mellanox interconnect, running Red Hat Enterprise Linux 5.6 and managed by the PBSpro v12 workload scheduler, with 1 petabyte of aggregate Panasas storage.

PBSpro v12: Queue management, host group management and user/group management. Added and removed users from fair share. Implemented automated, scheduler-level cleanup of defunct processes on the production servers. Wrote a bash script that automates PBSpro job failure analysis to capture system issues and fix them proactively. Monitored HPC jobs daily with the PBSpro workload manager.

Nagios: Distributed Nagios v3.6 for health monitoring of 1500+ compute nodes. Added and removed compute nodes and service checks in the Nagios host groups. Added and removed Nagios client configuration when converting a batch node into an interactive node and vice versa. Monitored 1500+ compute nodes daily for image validation, hardware health, user home directories with Likewise, automount, disk space, scheduler services, memory utilization, user login alerts on batch nodes, defunct process checks, HP server health check services, etc., and fixed the issues found.

Ganglia: Monitoring and reporting of the load on interactive nodes.

Application Installation: Fine-tuned commercial and open source applications in a parallel environment. Installed ANSYS, Fluent, CFX and iCreep in a parallel environment. Installed and configured module files for applications. Added and updated license tokens for the client's in-house software. Installed open source libraries. Tested new versions of applications before releasing them into production. Planned and executed tests to understand issues and provide appropriate solutions to end users.

Service Desk: Interacted with users and resolved their issues within the defined SLA. Understood user requirements and provided solutions or followed up with vendors for them. Maintained 99.5% monthly SLA achievement in the ServiceNow support tool.

Client's In-House Batch Job Submission Portal (HAL): Added, removed and edited user profiles in the job submission portal. Added, removed and edited application profiles for job submission. Configured queues and enabled/disabled them for users as required.

Automation with bash scripts: Daily job failure analysis to fix system-related issues proactively. Wrote bash scripts to capture system and job logs and to collect historical job data on license utilization using MySQL.


Logged cases with vendors for hardware, software and storage related issues.

Previous Project Name: HPC Onsite at GE (with LSF)
Mar 2013 – Apr 2014 (1 year 1 month)
Bangalore, Karnataka, India.

HPC Environment: Supporting HPC clusters with a total of 18,500 cores, built on HP servers with Mellanox interconnect, running Red Hat Enterprise Linux 5.6 and managed by the LSF 7 and 8 job schedulers, with 1 petabyte of aggregate Panasas storage.

LSF versions 7 & 8: Configured host groups, queues and compute nodes. Added and removed users from the scheduler. Node management and node-level virtual resource configuration to control job launch.

LSF Job Failure Analysis: Daily LSF job failure analysis and monthly report preparation. Wrote and updated a script to categorize error types and automatically filter out common errors from the log files copied by the post-exec script.

Application Installation: Fine-tuned commercial and open source applications in a parallel environment. Installed ANSYS, Fluent, CFX, iCreep and pgfortran in parallel.

Nagios: Daily monitoring of 1300+ compute nodes for image validation, hardware health, user home directories with Likewise, automount, disk space, scheduler services, memory utilization, user logins on batch nodes, defunct process checks, etc.

Batch Job Submission Portal: Added, removed and edited user profiles in the job submission portal. Added, removed and edited application profiles for job submission. Configured queues and enabled/disabled them for users as required.

Bash Script Writing: Wrote bash scripts to filter a year's total jobs submitted by a specific group for a given application with specific flags. Wrote small scripts to check failed-job errors and node health.

User Interaction: Interacted with users and resolved their issues within the defined service desk SLA. Explained issues and the appropriate solutions to users. Discussed requirements with users and provided solutions or followed up with vendors for them.

MySQL Database Client: Fetched job and user account details. Added and removed host group details in the database to keep track of resource utilization.

Scratch Area for Running Jobs: Monitored Panasas storage space for HPC job submission. Installed and configured the client package, and mounted the appropriate Panasas storage on all compute nodes.

Previous Employer
Jawaharlal Nehru Center for Advanced Scientific Research, Bangalore, India.
Duration: May 2010 – December 2012 (2 years 7 months).
Designation: R & D Assistant (Role: System Administration).

HPC Environment: Worked as the primary System Administrator of High Performance Computing facilities based on the CentOS v5.3 operating system, with a Lustre parallel file system, managed by job schedulers.

HPC Specification:
HP Server: A 512-core (128 compute nodes) HP hardware based cluster with InfiniBand 4X DDR interconnect. It has dual-core processors, 1 TB of aggregate RAM and a 32 TB Lustre file system. The cluster resource manager is LSF integrated with SLURM.

Supermicro Server: A 384-core Tyrone/Supermicro cluster with Mellanox InfiniBand ConnectX DDR interconnect. It has 1 TB of aggregate RAM and 30 TB of aggregate storage on a Lustre file system. The resource manager software is Moab integrated with Torque.


Responsibilities:
Compiled and installed commercial and open source scientific software and ensured integration with the system-provided parallel libraries (e.g. Quantum ESPRESSO, ABINIT, GROMACS, LAMMPS, CP2K, Amber, Gaussian, VASP, SIESTA, etc.).
Created user accounts and added/removed users in the job scheduler.
Configured Lustre file system quotas and set up group and user level Lustre disk quotas.
Created and fine-tuned LSF queues for job submission.
Planned and performed application benchmarking for Gaussian09.
Tested benchmark inputs so that the same benchmarks could be provided when ordering new HPC systems for the organization.
Scheduled periodic backups using Bacula and HP Data Protector.
Wrote bash scripts to report usage and automate disk space cleanup.
Interacted with users and resolved their issues.
Updated the organization's High Performance Computing web pages.
Logged hardware cases with the vendor and maintained logs of all replaced parts.
Primary contact for the vendor who supplied the HPC machines.
Set up the Instructional Computing classroom for conferences and troubleshot hardware and software issues on the classroom machines.
Monitored PACs and UPSs daily and logged cases with the vendor for any issues.

Configured cluster: Rebuilt the old cluster Shabala, a 16-node machine with a dual-core Intel(R) Xeon(TM) 3.20 GHz CPU and 4 GB RAM on each node, interconnected through a Netgear 16-port switch. Upgraded the operating system from Fedora Core 4 to Ubuntu Server 12.04 and set up PXE boot with FAI (Fully Automatic Installation) for cloning. Installed the job scheduler (SLURM) and set up modules for loading environment variables. Shared user home directories and software via NFS. Installed the Intel compiler, MPI software (Open MPI 1.6 and Intel MPI 4.0) and scientific software (LAMMPS) for parallel calculations.

Zoom Technologies, Hyderabad, India.
Designation: Faculty.
Duration: August 2009 - April 2010 (7 months)
Responsibilities:

Configured and maintained a PXE boot and Kickstart server for the OS image of 30 desktops.
Configured DNS, NIS, LDAP, virtual machines, RAID and LVM.
Configured and maintained FTP, NFS and Samba servers for sharing course material.
Linux user and group management.
Helped students with Linux administration lab practice.

Shital Computer, Ambikapur (C.G.), India.
Designation: Faculty.
Duration: February 2007 - August 2007.
Responsibilities: Taught C and C++ programming, Mathematics, FoxPro and the basics of networking.

Aptech Education, Ambikapur (C.G.), India.
Designation: Lab Assistant.
Duration: March 2005 - April 2006 (1 year 1 month).
Responsibilities: Taught C and C++, computer fundamentals and databases. Helped students with daily lab practice.

Summary
Hard working and dedicated. Team player. Adaptable to any environment and determined to take on challenging work.

Technical Skills
Operating System: Linux, Windows 98/XP Professional/Vista/7, basics of FreeBSD.


Job Scheduler: PBSpro v12, LSF 7 & 8, Torque, Moab, SLURM.
MPI: Open MPI, Intel MPI, HP-MPI.
Monitoring Tool: Nagios.
Languages: C, C++, HTML, CSS, Bash shell scripting.
Database: MySQL client, Microsoft SQL Server.

Educational Qualification

Degree | University | Passing Year | Marks in %
Master in Computer Application | Sikkim Manipal University | 2012 | 61.90%
Bachelor in Science (Mathematics, Physics, Chemistry) | Guru Ghasidas University, Bilaspur (C.G.) | 2004 | 50.38%
Intermediate (Mathematics, Physics, Chemistry) | Madhyamik Shiksha Mandal, Bhopal (M.P.) | 2000 | 67.33%
Matriculation | Madhyamik Shiksha Mandal, Bhopal (M.P.) | 1998 | 62.40%

Technical Certificate
Red Hat Certified Engineer (RHCE, certificate number 805010725848855) with 100% marks, from Complete Open Source Solutions, Hyderabad, in 2010.
DOEACC 'A' level with 64% from Mahan Institute of Technologies, New Delhi, 2007-09.
DOEACC 'O' level with 64%, New Delhi, 2005-07.

Technical Education
MCITP-2008 (Microsoft Certified IT Professional) from Zoom Technologies, Hyderabad, in 2009.
RHCE (Red Hat Certified Engineer) from Zoom Technologies, Hyderabad, in 2009.
CCNA (Cisco Certified Network Associate) from Zoom Technologies, Hyderabad, in 2009.
Microsoft Exchange Server from Zoom Technologies, Hyderabad, in 2009.
PC Hardware from Zoom Technologies, Hyderabad, in 2009.
Dot NET from APTECH Computer Education, New Delhi, in 2008-2009.
IEC 'O' level, New Delhi, 2003-2004.

Current Interest
High Performance Computing. LAN and WAN technology. Scripting. Parallel file systems.

Personal Details
Father's name: Mr. Ram Vishwas Singh
Date of birth: 15th April 1983
Sex: Female
Languages known: English and Hindi.
Nationality: Indian
Address: Gayatri Nilaya 5/1, Shri shiradi sai nagar road, Kundalahalli gate, Bangalore, Karnataka, Pin-560037.

BHARATI SINGH
Date: 23-04-2015
Place: Bangalore
