Rudra Prasad Bhuyan

Aspiring Data Scientist | ML Engineer

Electrical engineering background. Self-taught data science practitioner building end-to-end machine learning systems under real-world constraints (8 GB RAM environments).

Focused on practical, systems-level solutions, not just models. I solve real problems using data.

Resume GitHub LinkedIn
Skills & Expertise

Programming

  • Python
  • SQL

Data Processing

  • Pandas (EDA, transformation, cleaning)
  • Polars (memory-efficient processing within 8 GB of RAM)
  • Feature engineering (encoding, scaling, skew handling)
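
To illustrate the three feature-engineering steps named above (encoding, scaling, skew handling), here is a minimal sketch that uses only the Python standard library. The function names are hypothetical and are not taken from any project on this page; in practice Pandas or Polars would handle these steps on real datasets.

```python
import math
from statistics import mean, pstdev


def one_hot(values):
    """Encode categories as 0/1 indicator rows (columns in sorted category order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def standardize(values):
    """Scale a numeric feature to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def log1p_transform(values):
    """Compress a right-skewed, non-negative feature with log(1 + x)."""
    return [math.log1p(v) for v in values]
```

The log transform is one common choice for right-skewed data; other options (square root, Box-Cox) may suit a given feature better.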

Machine Learning

  • Regression & Classification models
  • Model evaluation (R², MAE, MSE, Accuracy)
  • Cross-validation & handling imbalanced data
  • Basic hyperparameter tuning
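
For reference, the evaluation metrics listed above can be written out directly. This is a minimal standard-library sketch with hypothetical function names; it is not code from the projects below, where a library such as Scikit-Learn would normally compute these values.

```python
def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and R² for paired lists of true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    return {"mae": mae, "mse": mse, "r2": 1 - ss_res / ss_tot}


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

R² is a proportion with a maximum of 1, so a value such as 0.92 means the model explains about 92% of the variance in the target.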

Tools

  • Git / Version Control
  • Jupyter Notebook / VS Code
Work Experience

Data Science Intern @ Tech Solutions Inc.

Jan 2024 - Present
  • Developed predictive models for customer churn using XGBoost and Random Forest.
  • Optimized data pipelines with Polars, reducing processing time by 40%.
  • Conducted EDA on datasets exceeding 5 GB in memory-constrained environments.

Junior ML Engineer @ DataFlow Systems

Jun 2023 - Dec 2023
  • Implemented automated feature engineering scripts for time-series forecasting.
  • Performed rigorous model validation and hyperparameter tuning.
  • Assisted in the deployment of models using Docker and basic CI/CD.
Open Source

pydatasys

A lightweight Python package for automated feature engineering in memory-constrained environments.

ml-ops-lite

Minimalist MLOps toolkit designed for tracking experiments and deploying models on low-resource hardware.

Featured Projects

Real Estate Price Predictor

Problem

Property valuations vary widely across urban areas, creating investment risk.

Tools

Python, Scikit-Learn, Pandas, Matplotlib

  • Built a regression model with an R² score of 0.92.
  • Handled missing data and outliers in a dataset of 50k+ entries.

Credit Risk Assessment

Problem

Traditional scoring methods fail to account for non-traditional financial behaviors.

Tools

Python, XGBoost, SQL, Seaborn

  • Reduced false negatives by 15% compared to baseline models.
  • Engineered 20+ new features from raw transaction logs.

Inventory Optimization System

Problem

Overstocking and stockouts cause significant revenue loss for small retailers.

Tools

Python, Polars, Statsmodels

  • Developed a demand forecasting system using ARIMA.
  • Designed for 8 GB RAM environments using memory-efficient data structures.

Customer Segmentation Engine

Problem

Generic marketing campaigns result in low conversion rates.

Tools

Python, K-Means Clustering, Scikit-Learn

  • Identified five customer personas using unsupervised learning.
  • Visualized clusters using PCA for dimensionality reduction.

Anomaly Detection in Sensors

Problem

Manual monitoring of industrial equipment leads to delayed maintenance.

Tools

Python, Isolation Forest, NumPy

  • Detected 98% of critical failures before they occurred.
  • Processed real-time sensor streams with minimal overhead.

NLP Sentiment Analyzer

Problem

Large volumes of customer feedback cannot be processed manually.

Tools

Python, NLTK, TF-IDF, Logistic Regression

  • Classified feedback by sentiment with 88% accuracy.
  • Summarized key themes from negative reviews.