Vinay
Chanamallu
Data Scientist & ML Engineer
I transform data into intelligent solutions through end-to-end ML pipelines, advanced feature engineering, and production-ready models that deliver measurable business impact.
Turning Data Into
Intelligent Solutions
Transforming complex data challenges into scalable, production-ready machine learning solutions.

I'm Venkata Chandrasekhar Vinay Chanamallu, a data scientist and analyst with 2+ years of experience building production ML systems and delivering data-driven business insights. Currently pursuing my MS in Business Analytics at the University of Maryland, I combine technical ML expertise with strong business analytics skills to drive measurable impact.
At EPAM Systems, I built end-to-end ML solutions that delivered tangible business results: a propensity model boosting conversions by 8%, a churn prediction system achieving 0.78 AUC with a 3% retention lift, and NLP classification models reaching 0.82 F1 through BERT fine-tuning. I standardized A/B testing using the CUPED methodology, cutting decision time by 50% and reducing variance across three product teams. I also architected Python ETL pipelines processing 200K daily transactions and 20GB of logs, and created automated Tableau dashboards and Weekly Business Reviews that saved analysts 6+ hours weekly.
My expertise spans the complete data science lifecycle: feature engineering and modeling with XGBoost, LightGBM, and CatBoost, hyperparameter optimization with Optuna, model interpretability using SHAP, and production deployment on AWS (Lambda, S3, Redshift, Glue, Athena, CloudWatch). Recent projects include electricity demand forecasting with ensemble methods (55% error reduction), flight delay prediction on 3.6M flights (0.76 AUC), customer segmentation with RFM analysis, and interactive Tableau/Plotly dashboards for business stakeholders.
What sets me apart is my ability to bridge technical ML development with business analytics, translating complex models into actionable insights through executive dashboards, KPI tracking, and data storytelling. Whether it's building gradient boosting models, conducting cohort retention analysis, forecasting with time-series methods (ARIMA, Prophet), or engineering SQL pipelines, I focus on solutions that are both technically rigorous and business-driven. I'm eager to leverage my skills in Python, SQL, AWS, and modern ML frameworks to solve challenging problems that create measurable value.
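For readers unfamiliar with CUPED, the adjustment mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used at EPAM: the function name `cuped_adjust` and the toy numbers are hypothetical, the covariate is assumed to be each user's pre-experiment value of the same metric, and the coefficient is estimated on whatever sample is passed in.

```python
from statistics import fmean, pvariance


def cuped_adjust(metric, covariate):
    """Return CUPED-adjusted metric values (hypothetical helper).

    metric:    in-experiment metric per user (Y)
    covariate: pre-experiment value for the same users (X)
    """
    if len(metric) != len(covariate):
        raise ValueError("metric and covariate must have the same length")

    mean_x = fmean(covariate)
    mean_y = fmean(metric)
    var_x = pvariance(covariate, mu=mean_x)
    if var_x == 0:
        # A constant covariate carries no information; leave the metric as-is.
        return list(metric)

    # theta = cov(X, Y) / var(X), the coefficient that minimizes the
    # variance of the adjusted metric.
    cov_xy = fmean(
        (x - mean_x) * (y - mean_y) for x, y in zip(covariate, metric)
    )
    theta = cov_xy / var_x

    # Subtract the part of Y explained by X. Centering X keeps the mean of
    # the adjusted metric equal to the mean of the original metric.
    return [y - theta * (x - mean_x) for x, y in zip(covariate, metric)]


# Toy data (made up for illustration only).
pre_period = [10, 12, 9, 15, 11, 13]
in_experiment = [11, 13, 10, 17, 12, 14]
adjusted = cuped_adjust(in_experiment, pre_period)
```

The adjusted values keep the same mean as the raw metric but have lower variance whenever the pre-period covariate is correlated with it, which is what lets an experiment reach a decision with less data.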
Professional
Journey
My evolution from computer science student to ML engineer and business analyst.
MS Business Analytics
University of Maryland
Junior Data Scientist
EPAM Systems
Data Science Program
IIT Madras
Junior Data Scientist Intern
EPAM Systems
BTech Computer Science
JNTUK
My Tech Stack
Expertise across machine learning engineering and deployment technologies
Programming
3 skills
Python
SQL
Pandas
ML & Modeling
9 skills
XGBoost
LightGBM
CatBoost
scikit-learn
Gradient Boosting
Random Forest
Elastic Net
Optuna
SHAP
NLP & Time Series
4 skills
TF-IDF
ARIMA
Time Series
Prophet
Cloud & ETL
6 skills
AWS Lambda
AWS S3
AWS Redshift
AWS Glue
Athena
CloudWatch
Visualization
3 skills
Tableau
Plotly
Microsoft Excel
Projects
A few case studies that show how I design, build, and ship.
Medical Image Classification
Automated pneumonia detection from chest X-rays using deep learning and CNN transfer learning. Achieved 95.21% accuracy with EfficientNetB0 on 5,840 real chest X-ray images.
Airbnb High-Booking Prediction
Third-place competition model for identifying Airbnb listings with high booking rates. ROC AUC of 0.91176 using a stacked XGBoost and CatBoost ensemble with SHAP explainability.
Flight Delay Prediction Pipeline
End-to-end ML pipeline predicting flight delays from 3.6M records with weather enrichment. 70.44% accuracy, with a deployment-ready CLI tool.
CNN Image Classification
Custom CNN architecture for CIFAR-10 object classification with 10 classes. Achieved 82.45% test accuracy with data augmentation and comprehensive evaluation.
Customer Churn Prediction
Business-focused churn prediction model identifying $227K+ at-risk revenue. 0.705 ROC-AUC with Gradient Boosting and actionable business insights.
Breast Cancer ML Coach
Lightweight ML pipeline for breast cancer detection with 98% hold-out accuracy. Modular Python package with YAML configuration and automated cross-validation.
RAG Chatbot
Intelligent Retrieval-Augmented Generation chatbot leveraging advanced NLP techniques for contextually relevant responses.
Financial Market ML Analysis
Comprehensive stock prediction system with real-time Yahoo Finance data. Lasso Regression achieved 99.26% R² for price prediction across 10 major companies.
Time-Series Energy Forecasting
Advanced time series forecasting model for India's energy consumption using ARIMA, SARIMA, and LSTM. Predicts a 7% demand increase over 5 years.
Movie Sentiment Analysis (LSTM)
Enhanced movie review sentiment analysis using LSTM neural networks. 90-95% training accuracy on IMDB 50k reviews with interactive demo and confidence scoring.
LLM 101: Bytes to GPT
Educational project building a tiny GPT model from scratch. Learn LLMs through tokenization, bigram models, and baby transformers with clear, commented code.
LSTM Action Recognition
Deep learning model using LSTM networks for human action recognition from video sequences. Bidirectional LSTM improved accuracy by 12%.
Hand Gesture Counter
Real-time hand gesture recognition system detecting and counting finger gestures (1-5) using MediaPipe with text-to-speech announcements.
Congressional Speech Analyzer
NLP analysis of U.S. Congressional speech data from 2020. Compares linguistic patterns between Democratic and Republican senators using spaCy and bigram analysis.
Doodle Digit App
Interactive handwritten digit recognition using k-NN classifier on 8x8 pixel images with Gradio web interface. Educational ML project with browser-based drawing.
E-Commerce Analytics Platform
Comprehensive SQL-based analytics system for multi-vendor e-commerce. Features RFM segmentation, cohort analysis, and churn prediction handling millions of transactions.
IAD Flight Departures EDA
Multi-dimensional exploratory analysis of flight departures at Washington Dulles Airport. Analyzes carrier performance, COVID impact, economic correlations, and weather delays.
Crash Duration Prediction
Predictive modeling system for estimating crash durations using machine learning techniques. Supports transportation and incident management applications.
Amtrak Performance Analysis
SQL-based analysis examining Amtrak's operational performance. Identified routes with highest delays and recommended buffer times reducing delays by 12%.
GET IN TOUCH
Ready to collaborate on your next data science project? Let's build something impactful together.
Send a Message
I typically respond within 24 hours