Data Engineer · Data Analyst · SQL · Python · PySpark · Airflow · AWS
Final-year B.E. student (CGPA: 8.20) based in Mumbai with hands-on experience building end-to-end ETL pipelines using Python, SQL, PySpark, and Airflow. Ranked in the top 10 of 120+ teams in an AI/ML hackathon and carried out deep learning research under guidance from IIT Bombay. Open to Data Engineer and Data Analyst opportunities.
kavishrathod2004@gmail.com
Building projects
An end-to-end ETL (Extract, Transform, Load) pipeline built to process 7,043 telecom customer records for churn analysis.
Built as modular stages with logging, validation, and fallback handling.
The pipeline ingests raw CSV data, applies data cleaning and feature engineering, and loads structured output into a MySQL database (with CSV fallback).
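The extract, transform, and load stages described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`customerID`, `TotalCharges`, `Churn`) are assumptions, the two loader callables are hypothetical stand-ins for the MySQL writer and the CSV writer, and dropping rows with a blank `TotalCharges` is a simplification of the real cleaning step.

```python
import csv
import io
import logging

logger = logging.getLogger("etl_sketch")


def extract(csv_text):
    """Parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Drop rows with a blank TotalCharges and add a numeric churn flag."""
    cleaned = []
    for row in rows:
        raw = (row.get("TotalCharges") or "").strip()
        if not raw:
            # Validation step: log and skip records that cannot be parsed.
            logger.warning("Skipping %s: blank TotalCharges", row.get("customerID"))
            continue
        cleaned.append({
            "customerID": row["customerID"],
            "TotalCharges": float(raw),
            "churned": 1 if row.get("Churn") == "Yes" else 0,
        })
    return cleaned


def load(rows, primary_loader, fallback_loader):
    """Try the primary sink first; on any error, log it and use the fallback."""
    try:
        primary_loader(rows)
        return "primary"
    except Exception as exc:
        logger.error("Primary load failed (%s); using fallback", exc)
        fallback_loader(rows)
        return "fallback"
```

Passing the sinks in as callables keeps the fallback logic independent of any particular database driver, so the same `load` function can be exercised without a running MySQL instance.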
📥 → 🧼 → 🧬 → 💾 → 🧠 → 🎯 → 🔥
Rebuilt the Telco Churn ETL pipeline in PySpark (Apache Spark) for big data scalability. It uses Spark DataFrames, applies approxQuantile for distributed median imputation, and writes outputs to Parquet (columnar storage) and CSV. Generated 4 churn aggregations revealing 42.7% churn for month-to-month customers vs. 2.8% for two-year contracts.
Analysed the Superstore Sales dataset with 18 SQL queries covering revenue trends, top products, customer segments, and regional performance. Designed a star schema (1 fact + 5 dimension tables) and orchestrated the full pipeline with an Apache Airflow DAG (DQ checks → MySQL load → reports).
Implemented 14 automated data quality checks, built a Power BI dashboard (5+ visuals: KPIs, regional map, monthly trends), and ran a full Python EDA across 11 sections in Jupyter.
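Automated data quality checks of the kind mentioned above might look something like the sketch below. The four checks and the column names `order_id` and `sales` are hypothetical examples chosen for illustration; they are not the project's actual 14 checks.

```python
def run_quality_checks(rows):
    """Run a few illustrative checks on a list of dict rows.

    Returns a dict mapping each check name to True (passed) or False (failed).
    """
    order_ids = [row.get("order_id") for row in rows]
    sales = [row.get("sales") for row in rows]
    return {
        "has_rows": len(rows) > 0,
        "order_id_not_null": all(v not in (None, "") for v in order_ids),
        "order_id_unique": len(set(order_ids)) == len(order_ids),
        "sales_non_negative": all(
            isinstance(v, (int, float)) and v >= 0 for v in sales
        ),
    }


def failed_checks(results):
    """List the names of the checks that did not pass, in definition order."""
    return [name for name, passed in results.items() if not passed]
```

In an orchestrated pipeline, a task could raise an error when `failed_checks` returns a non-empty list, which would stop the downstream load step from running on bad data.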
End-to-end exploratory data analysis on 7,105 real Zomato Bangalore restaurant records. Cleaned the data and engineered features, including primary cuisine and location extraction. Derived insights on ratings (avg 3.48), cuisine trends, pricing patterns (68% of restaurants under ₹600), and a weak-to-moderate positive correlation (0.32) between votes and ratings, presented across 7 visualisations.
Human perception–based deep learning model for weight-map guided multi-exposure image fusion. Research under Dr. Ashish Vanmali (IIT Bombay). Published and presented at academic level.
Customer Churn Prediction (EDA + ML): project currently in progress. Check back soon.
Developed a deep learning model that mimics human visual perception to fuse images captured at multiple exposures. The model generates perceptually optimised weight maps for blending, improving dynamic range representation in complex lighting conditions. References: Prabhakar (ICCV), GANFuse, TransMEF, Mertens Exposure Fusion.
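To illustrate what weight-map blending means, here is a small hand-crafted baseline in the spirit of Mertens-style exposure fusion. It is not the learned model described above: it uses only a well-exposedness weight (a Gaussian centred on mid-grey), skips the contrast and saturation terms and the multi-scale pyramid blending, and treats `sigma = 0.2` as an assumed setting. Images are plain nested lists of grayscale values in [0, 1].

```python
import math


def well_exposedness(pixel, sigma=0.2):
    """Weight a pixel by how close its intensity is to mid-grey (0.5)."""
    return math.exp(-((pixel - 0.5) ** 2) / (2 * sigma ** 2))


def fuse_exposures(images, eps=1e-12):
    """Blend same-sized grayscale images with per-pixel normalised weights."""
    height, width = len(images[0]), len(images[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            pixels = [img[y][x] for img in images]
            # eps keeps the normalisation safe if every weight underflows to 0.
            weights = [well_exposedness(p) + eps for p in pixels]
            total = sum(weights)
            fused[y][x] = sum(w * p for w, p in zip(weights, pixels)) / total
    return fused
```

Because the weights are normalised at each pixel, every fused value stays between the darkest and brightest input at that location, and well-exposed inputs dominate the blend. A learned approach replaces the fixed Gaussian with weight maps predicted by a network.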
Thanks for stopping by. If you're hiring for Data Engineer, Data Analyst,
or Data Scientist roles — or just want to talk data and ML — I'd love to hear from you.
Always up for interesting conversations, collaborations, and new opportunities.