Performance-focused Data Scientist with 6 years of in-depth expertise in cutting-edge AI, Machine Learning, Deep Learning, NLP, Search/Recommendations, Text Mining, Large-Scale (Big Data) Distributed Systems & Cloud Computing. Experienced in large-scale mathematical modelling and product management, with the ability to draw critical insights from data.
Skills
Language : Python (Flask, Django), Java, J2EE (Spring, Hibernate), Bash/Shell scripting, R.
Machine Learning : Supervised (Linear Regression, Logistic Regression, Decision Tree, Random Forest, Naive Bayes, Ensemble Methods), Unsupervised (K-means, PCA), Recommendations.
NLP : NLP, NLU, NLG, Embeddings (BERT, BioBERT, DistilBERT, GloVe), Pre-trained Models (RoBERTa, BART, Transformer QA), NER, POS
Statistical Tools : Python, R, PySpark, SQL, SAS, SPSS
Analytical Tools : R, Python, SAS (Base & Enterprise Miner), SPSS, EViews
Deep Learning Tools : PySpark, NumPy, Pandas (and others)
Soft Skills: Insight Generation, Project Management and Planning, Team Management
WORK EXPERIENCE
Senior Data Scientist Ecommerce Company
- Developed a machine learning approach to predict customer satisfaction score (CSAT) and Net Promoter Score (NPS). Helped increase NPS by ~20% and CSAT by ~10%. Machine Learning, Data Cleaning, Regression, Python, SQL.
- Applied machine learning techniques such as Random Forest, Gradient Boosting, and Logistic Regression to construct credit risk models using application form, customer transaction, and bureau data.
- Developed a graph-based probabilistic address parser that uses a learning-to-rank (LTR) algorithm to predict the hierarchy and pincode for a given address with more than 90% confidence.
- Handled data extraction, statistical data transformation, and data loading. Performed exploratory data analysis and reported the resulting insights.
- Predicted product attributes from data collected from multiple sources using items' universal product codes (UPC), and automated the generation of titles, descriptions, and tabular data for the products.
- Created a forecasting pipeline that generates 12-step forecasts at a weekly level. Category managers use the forecasts to plan promotions, and the forecasts are also used to calculate uplift for items on promotion.
- Performed keyphrase extraction and generation using an RNN encoder-decoder model with attention and CopyNet. Designed and developed solutions for demand forecasting and dynamic pricing.
Data Scientist Ecommerce Company
- Built an NLP-based model to find similar opportunities that associates have worked on in the past; working on an NLP and machine learning model to recommend the optimal win strategy for any contract.
- Built a recommendation engine to recommend office products to customers; usage grew 26% among customers exposed to the recommendations.
- Built a customer analytics solution comprising upsell and downsize models using machine learning algorithms, a customer scorecard using weight of evidence, and product recommendation using neural collaborative filtering.
- Built a model to extract mandatory attributes from ad images. Experimented with various CNN models for higher accuracy and lower latency.
- Created a pipeline to extract unstructured data from loss adjuster reports.
- Developed service business KPIs to increase process efficiency.
- Used Python automailers to send reports directly to stakeholders' mailboxes.
Data Science Engineer Technology Company
- Predicted product attributes from data collected from multiple sources using items' universal product codes (UPC), and automated the generation of titles, descriptions, and tabular data for the products.
- Designed and developed programs and statistical models for various retail operators (across North and Latin America, Europe, and the Asia-Pacific region): Subscriber Churn
- Specialized proficiency in Stakeholder Management, Predictive Modelling, Risk Modelling, Churn analytics, Text Analytics, Social Media Analytics, Customer Segmentation, Visualization and Automation
- Identified and tagged author affiliation information in scholarly documents using a conditional random field (CRF).
CERTIFICATES
CS224n: Deep Learning for Natural Language Processing (Stanford)
DeepLearning.AI TensorFlow Developer Professional Certificate
EDUCATION
Bachelor of Technology (Computer Science)
National Institute of Technology : Jaipur
CS231n: Convolutional Neural Networks for Visual Recognition (Stanford)
Natural Language Processing (Coursera) by Dan Jurafsky, Christopher Manning