Rudra Prasad Bhuyan

Aspiring Data Scientist | ML Engineer

Machine Learning Engineer with hands-on experience in data analysis, machine learning, and end-to-end project execution.

Focused on practical, systems-level solutions, not just models. I solve real problems using data.

Resume GitHub LinkedIn
Skills & Expertise
Python, SQL
Pandas, NumPy
Plotly, Matplotlib, Seaborn
Power BI, Looker Studio
Scikit-learn, XGBoost, CatBoost
GitHub, MLflow, CI/CD
Pydantic, FastAPI
PySpark, Polars
AWS
PostgreSQL, Snowflake, BigQuery, Databricks
PyTorch, TensorFlow, Keras
LLMs, ChatGPT, Groq, Llama, Claude, HuggingFace
LangChain, LangGraph, LangSmith
Probability, Statistical Modeling
Hypothesis Testing, A/B Testing
Git, Jupyter Notebook, VS Code
Time Series Forecasting, PCA
Hyperparameter Tuning
Storytelling, Stakeholder Communication
Problem Solving, Analytical Thinking
Work Experience

Jr. Data Scientist

SBC Labs

Nov 2025 – Feb 2026

Developed Python and SQL ETL pipelines for 15 multi-level HCES datasets with 400+ features to streamline large-scale batch processing.

Analyzed 500+ socioeconomic and healthcare variables to identify trends in finance, healthcare, consumption, and savings behavior.

Automated data validation and duplicate-handling workflows using SQL procedures, reducing data inconsistencies by 20%.

Built reusable data preprocessing scripts in Python, reducing manual model-preparation effort and saving ~1 hour daily.

Designed scalable system architecture, wrote PRDs, and produced 30+ analytical reports to standardize workflows and support business planning.

Built interactive KPI dashboards across 50+ parameters and collaborated with stakeholders to deliver actionable insights and summaries.

Quality Analyst

Tata Power WODL

June 2025 – July 2025

Observed and analyzed smart meter data workflows for 60,000+ consumers, along with centralized database operations and issue-tracking processes.

Assisted in field-to-database survey operations with a 20-member team, supporting digital meter updates and complaint-handling workflows.

Gained exposure to image-processing and grid monitoring systems used to identify human errors, detect irregularities, and support operational planning.

Open Source

show-file-tree

A small, fast CLI tool to display styled file/folder trees with rich options, colors, icons, and metadata.

find-my-joint

A utility to find potential join keys (matching columns) across multiple pandas DataFrames.

Featured Projects

Vehicle Insurance Risk Prediction

Insurance companies need to estimate vehicle risk to reduce loss and price policies correctly.

Python • Flask • AWS • Docker

SQL Modern Data Warehouse

ERP & CRM data was inconsistent and not ready for analytics and reporting.

PostgreSQL • SQL • ETL • Star Schema • Power BI

Yelp Big Data Analysis

Handling large Yelp JSON datasets efficiently without memory issues.

Python • Polars • JSON • Parquet

Breast Cancer Prediction App

Need for real-time tumor classification using medical diagnostic features.

Python • Scikit-learn • Streamlit

Transportation & Logistic Dashboard

Analyze logistics efficiency to reduce delays and operational costs.

Power BI • KPI Development

Smart Transaction Ledger

Smart Transaction Ledger is an AI-powered financial transaction cleaner and fraud detector.

Python • FastAPI • AI • SQL
