Vinay Chanamallu

Available for opportunities

Data Scientist & ML Engineer

I transform data into intelligent solutions through end-to-end ML pipelines, advanced feature engineering, and production-ready models that deliver measurable business impact.

About Me

Turning Data Into Intelligent Solutions

Transforming complex data challenges into scalable, production-ready machine learning solutions.

I'm Venkata Chandrasekhar Vinay Chanamallu, a data scientist with 2.5+ years of experience turning messy business problems into measurable outcomes. I've worked across fraud, churn, and customer-support automation, building models, validating them carefully, and partnering with operations and engineering to ship changes that actually get used.

At EPAM Systems, I tuned a LightGBM classifier with Optuna and isotonic calibration on 2.3M transactions, reducing weekly false declines from 520 to 455 (a 12.5% reduction) at recall 0.80. I built an XGBoost fraud model that lifted legitimate approvals from 89% to 94% in a 6-week A/B test on 120k transactions. I also built a churn-risk model that achieved a 7% relative reduction in churn, and I auto-routed 120k Zendesk tickets using TF-IDF classification, cutting first-response time by 8%.
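
Comparing false declines "at recall 0.80" means fixing the operating point first and counting errors there. The following is a minimal illustrative sketch of that idea, not the EPAM implementation: the function names, the toy scores, and the convention that a score at or above the threshold is flagged as fraud are all assumptions made for the example.

```python
import math

def threshold_for_recall(scores, labels, target_recall=0.80):
    """Highest threshold whose recall on positives (label 1) is >= target_recall."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive label")
    # Number of positives that must score at or above the threshold.
    # The small tolerance guards against float noise in the product.
    needed = max(1, math.ceil(target_recall * len(positives) - 1e-9))
    return positives[needed - 1]

def false_positives_at(scores, labels, threshold):
    """Count negatives (label 0) flagged at this threshold, i.e. false declines."""
    return sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)

# Toy data (hypothetical): 5 fraud cases and 3 legitimate transactions.
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.65, 0.3, 0.1]
labels = [1, 1, 1, 1, 1, 0, 0, 0]
thr = threshold_for_recall(scores, labels, target_recall=0.80)
print(thr, false_positives_at(scores, labels, thr))
```

Holding recall fixed this way keeps a drop in false declines from being an artifact of simply catching less fraud.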

My expertise spans LightGBM, XGBoost, CatBoost, and logistic regression, along with isotonic calibration, threshold and cost analysis, and hyperparameter tuning with Optuna. I'm experienced in A/B testing, time-based splits, ROC/PR evaluation, SHAP explainability, and production monitoring with MLflow and drift checks (PSI, score distributions).
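
For readers unfamiliar with PSI (Population Stability Index), here is a minimal sketch of how the metric is commonly computed for score drift. It is an illustration under stated assumptions, not the monitoring code referenced above: the function name, the choice of baseline-quantile bins, the bin count, and the `eps` floor are all choices made for the example.

```python
import math
from bisect import bisect_right

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a baseline sample and a current sample."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    # Interior bin edges are quantiles of the baseline sample (an assumption;
    # fixed-width bins are another common choice).
    ordered = sorted(expected)
    edges = sorted({ordered[int(len(ordered) * i / n_bins)] for i in range(1, n_bins)})

    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    # Each term is non-negative, so PSI is 0 only when the binned shares match.
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical scores: a baseline and a copy shifted upward by 0.3.
baseline = [i / 100 for i in range(100)]
shifted = [min(1.0, x + 0.3) for x in baseline]
print(psi(baseline, baseline), psi(baseline, shifted))
```

A widely quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though the right alert level depends on the model and the cost of a false alarm.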

Recently at HealID, I owned the mobile ingestion layer for 40+ HealthKit/Fitbit metrics, co-designed a GCP pipeline (Cloud Run, Pub/Sub, GCS) handling 10,000 events/day with under 1% ingest errors, and collaborated on serving a LightGBM readiness model with a p50 inference latency of 250 ms. I'm known for being practical: clear metrics, clean handoffs, and solutions that hold up in the real world.

Experience

Professional Journey

My evolution from computer science student to ML engineer and business analyst.

💼
Oct 2025 - Dec 2025

Data Engineer Intern

HealID Inc.

🎓
Aug 2024 - Dec 2025

MS Business Analytics

University of Maryland

💼
Apr 2023 - Jul 2024

Data Scientist

EPAM Systems

💼
Feb 2022 - Mar 2023

Junior Data Scientist

EPAM Systems

🎓
May 2018 - Jun 2022

BTech Computer Science

JNTUK

Skills

My Tech Stack

Expertise across machine learning engineering and deployment technologies

💻

Programming

Python

SQL

Pandas

🧠

ML & Modeling

XGBoost

LightGBM

CatBoost

scikit-learn

Gradient Boosting

Random Forest

Elastic Net

Optuna

SHAP

⚙️

NLP & Time Series

TF-IDF

ARIMA

Time Series

Prophet

☁️

Cloud & ETL

AWS Lambda

AWS S3

AWS Redshift

AWS Glue

Athena

CloudWatch

📊

Visualization

Tableau

Plotly

Microsoft Excel

Projects

A few case studies that show how I design, build, and ship.

ML/AI

Medical Image Classification

Automated pneumonia detection from chest X-rays using CNN transfer learning. Achieved 95.21% accuracy with EfficientNetB0 on 5,840 real chest X-ray images.

TensorFlow, EfficientNet, Healthcare, CV

ML/AI

Airbnb High-Booking Prediction

Third-place competition model for identifying Airbnb listings with high booking rates. ROC AUC of 0.91176 using a stacked XGBoost and CatBoost ensemble with SHAP explainability.

XGBoost, CatBoost, SHAP, Ensemble

ML/AI

Flight Delay Prediction Pipeline

End-to-end ML pipeline predicting flight delays from 3.6M records with weather enrichment. 70.44% accuracy, with a deployment-ready CLI tool.

XGBoost, Pipeline, Weather API, CLI

ML/AI

CNN Image Classification

Custom CNN architecture for CIFAR-10 object classification (10 classes). Achieved 82.45% test accuracy with data augmentation and comprehensive evaluation.

PyTorch, CNN, Computer Vision, CIFAR-10

ML/AI

Customer Churn Prediction

Business-focused churn prediction model identifying $227K+ at-risk revenue. 0.705 ROC-AUC with Gradient Boosting and actionable business insights.

XGBoost, LightGBM, Optuna, Business

ML/AI

Breast Cancer ML Coach

Lightweight ML pipeline for breast cancer detection with 98% hold-out accuracy. Modular Python package with YAML configuration and automated cross-validation.

scikit-learn, Healthcare, Pipeline, Pytest

ML/AI

RAG Chatbot

Intelligent Retrieval-Augmented Generation chatbot leveraging advanced NLP techniques for contextually relevant responses.

Python, RAG, NLP, LLM

ML/AI

Financial Market ML Analysis

Comprehensive stock prediction system with real-time Yahoo Finance data. Lasso Regression achieved an R² of 0.9926 for price prediction across 10 major companies.

Python, Finance, ML, API

ML/AI

Time-Series Energy Forecasting

Advanced time series forecasting model for India's energy consumption using ARIMA, SARIMA, and LSTM. Predicts 7% demand increase over 5 years.

LSTM, ARIMA, Time Series, Forecasting

ML/AI

Movie Sentiment Analysis (LSTM)

Enhanced movie review sentiment analysis using LSTM neural networks. 90-95% training accuracy on IMDB 50k reviews with interactive demo and confidence scoring.

TensorFlow, LSTM, NLP, IMDB

ML/AI

LLM 101: Bytes to GPT

Educational project building a tiny GPT model from scratch. Covers tokenization, bigram models, and baby transformers with clear, commented code.

PyTorch, Transformers, LLM, Educational

ML/AI

LSTM Action Recognition

Deep learning model using LSTM networks for human action recognition from video sequences. Bidirectional LSTM improved accuracy by 12%.

LSTM, OpenCV, Computer Vision, Video

ML/AI

Hand Gesture Counter

Real-time hand gesture recognition system detecting and counting finger gestures (1-5) using MediaPipe with text-to-speech announcements.

MediaPipe, Computer Vision, Real-time, TTS

ML/AI

Congressional Speech Analyzer

NLP analysis of U.S. Congressional speech data from 2020. Compares linguistic patterns between Democratic and Republican senators using spaCy and bigram analysis.

spaCy, NLP, Text Analysis, Politics

ML/AI

Doodle Digit App

Interactive handwritten digit recognition using k-NN classifier on 8x8 pixel images with Gradio web interface. Educational ML project with browser-based drawing.

scikit-learn, Gradio, ML, Web

Data

E-Commerce Analytics Platform

Comprehensive SQL-based analytics system for multi-vendor e-commerce. Features RFM segmentation, cohort analysis, and churn prediction handling millions of transactions.

T-SQL, SQL Server, Analytics, BI

Data

IAD Flight Departures EDA

Multi-dimensional exploratory analysis of flight departures at Washington Dulles Airport. Analyzes carrier performance, COVID impact, economic correlations, and weather delays.

Python, EDA, Pandas, Visualization

ML/AI

Crash Duration Prediction

Predictive modeling system for estimating crash durations using machine learning techniques. Supports transportation and incident management applications.

Python, ML, Jupyter, Transportation

Data

Amtrak Performance Analysis

SQL-based analysis examining Amtrak's operational performance. Identified routes with highest delays and recommended buffer times reducing delays by 12%.

SQL, Tableau, Analytics, Transportation

GET IN TOUCH

Ready to collaborate on your next data science project? Let's build something impactful together.

Send a Message

I typically respond within 24 hours