Rudra Prasad Bhuyan

Aspiring Data Scientist | ML Engineer

Machine Learning Engineer with hands-on experience in data analysis, machine learning, and end-to-end project execution.

Focused on practical, systems-thinking solutions, not just models. I solve real problems using data.

Resume GitHub LinkedIn
Skills & Expertise
Python, SQL
Pandas, NumPy, Polars
Plotly, Power BI, Matplotlib, Seaborn
Scikit-learn, XGBoost
PyTorch, TensorFlow
LLMs, LangChain, HuggingFace
GitHub, MLflow, CI/CD
Pydantic, FastAPI
PySpark, Parquet
AWS
PostgreSQL, Snowflake, BigQuery, Databricks
Probability, Statistical Modeling
Hypothesis Testing, A/B Testing
Git, Jupyter Notebook, VS Code
Time Series Forecasting, PCA
Hyperparameter Tuning
Storytelling, Stakeholder Communication
Problem Solving, Analytical Thinking
Work Experience

Jr. Data Scientist

SBC Labs

Nov 2025 – Feb 2026

Built a structured data analysis pipeline in Python on a large India-level dataset (400+ features), identifying key variables and reducing redundancy.

Cleaned and preprocessed raw data to improve data quality and make it model-ready for efficient analysis.

Developed a state-level interactive dashboard using Python visualization tools for clearer, focused insights.

Created a feature documentation sheet to standardize understanding and reduce dataset confusion.

Delivered weekly reports and translated complex data insights into actionable decisions for both technical and non-technical stakeholders.

Quality Analyst

Tata Power WODL

June 2025 – July 2025

Gained hands-on exposure to electricity distribution and smart meter operations.

Analyzed digital complaint and new connection workflows.

Learned how field-to-system integration and service reporting processes work.

Open Source

show-file-tree

A small, fast CLI tool to display styled file/folder trees with rich options, colors, icons, and metadata.

find-my-joint

A utility to find potential join keys (matching columns) across multiple pandas DataFrames.

Featured Projects

Vehicle Insurance Risk Prediction

Insurance companies need to estimate vehicle risk to reduce loss and price policies correctly.

Python • Flask • AWS • Docker

SQL Modern Data Warehouse

ERP & CRM data was inconsistent and not ready for analytics and reporting.

PostgreSQL • SQL • ETL • Star Schema • Power BI

Yelp Big Data Analysis

Large Yelp JSON datasets need to be processed efficiently without memory issues.

Python • Polars • JSON • Parquet

Breast Cancer Prediction App

Need for real-time tumor classification using medical diagnostic features.

Python • Scikit-learn • Streamlit

Transportation & Logistics Dashboard

Analyze logistics efficiency to reduce delays and operational costs.

Power BI • KPI Development

Smart Transaction Ledger

Smart Transaction Ledger is an AI-powered financial transaction cleaner and fraud detector.

Python • FastAPI • AI • SQL
