Health Risk Prediction using Wearable Data
Machine learning system for predicting health risks using wearable time-series data with interpretable outputs.
Tech Stack
Python, Pandas, Scikit-Learn, XGBoost, LIME
Problem Statement
Wearable health data is noisy and high-dimensional, making it difficult to extract meaningful predictors and generate reliable risk predictions.
System Architecture
The system is an end-to-end ML pipeline covering data preprocessing, feature engineering, model training, evaluation, and an interpretability layer.
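As a rough illustration of how these stages might be chained, the sketch below wires preprocessing, training, and evaluation into a single Scikit-Learn Pipeline. The synthetic data, the median imputation, the scaling step, and all hyperparameters are assumptions made for the example, not the project's actual configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for engineered wearable features (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)  # imbalanced binary label
X[rng.random(X.shape) < 0.05] = np.nan           # simulate sensor dropouts

# Stratify so both splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Chaining the steps means imputation and scaling statistics are learned
# from the training split only, then reused unchanged at prediction time.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
])
pipeline.fit(X_train, y_train)

test_accuracy = accuracy_score(y_test, pipeline.predict(X_test))
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Keeping preprocessing inside the pipeline object is one way to avoid leaking test-set statistics into training, which matters when the reported metric is a held-out accuracy.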
Approach
Performed extensive EDA and preprocessing on time-series data. Applied SMOTE for class balancing and trained ensemble models including XGBoost and Random Forest. Integrated LIME for model explainability.
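To make the class-balancing step concrete, here is a minimal NumPy sketch of the core SMOTE idea: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, the toy data, and the parameter values are hypothetical; the source does not say which SMOTE implementation was used (imbalanced-learn's `SMOTE` class is a common choice).

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a chosen sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances within the minority class.
    diffs = X_min[:, None, :] - X_min[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dists, np.inf)            # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)      # randomly chosen minority samples
    picks = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))               # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picks] - X_min[base])

# Toy imbalanced dataset: 90 majority vs 10 minority samples.
rng = np.random.default_rng(1)
X_majority = rng.normal(0.0, 1.0, size=(90, 3))
X_minority = rng.normal(2.0, 1.0, size=(10, 3))

X_synth = smote_oversample(X_minority, n_new=80)
X_balanced = np.vstack([X_majority, X_minority, X_synth])
y_balanced = np.array([0] * 90 + [1] * (10 + 80))
print(np.bincount(y_balanced))
```

Oversampling is normally applied to the training split only, so that synthetic points never appear in the data used to measure performance.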
Implementation Details
Used Pandas for data cleaning and transformation, Scikit-Learn for model training and validation, and LIME to generate local explanations for predictions.
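The idea behind a LIME local explanation can be sketched without the library itself: perturb one instance, query the trained model on the perturbed points, weight them by proximity, and fit a simple linear surrogate whose coefficients act as the explanation. The version below is a simplified illustration of that idea, not the `lime` package's API; the feature names, synthetic data, kernel width, and the helper `explain_locally` are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
feature_names = ["resting_hr", "sleep_hours", "daily_steps"]  # hypothetical
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 1] > 0).astype(int)   # synthetic label ignores daily_steps
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def explain_locally(model, x, n_samples=2000, seed=0):
    """Fit a distance-weighted linear surrogate around a single instance x."""
    rng = np.random.default_rng(seed)
    kernel_width = 0.75 * np.sqrt(x.shape[0])
    # 1. Perturb the instance with Gaussian noise.
    Z = x + rng.normal(size=(n_samples, x.shape[0]))
    # 2. Query the black-box model on the perturbed points.
    p = model.predict_proba(Z)[:, 1]
    # 3. Weight perturbed points by proximity to x (exponential kernel).
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / (kernel_width ** 2))
    # 4. Fit a weighted linear model; its coefficients are the explanation.
    surrogate = Ridge(alpha=1.0).fit(Z, p, sample_weight=w)
    return dict(zip(feature_names, surrogate.coef_))

# Explain a hypothetical instance that sits close to the decision boundary.
instance = np.array([0.2, 0.0, 0.0])
explanation = explain_locally(model, instance)
for name, weight in sorted(explanation.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>12}: {weight:+.3f}")
```

A positive weight means that increasing the feature near this instance pushes the predicted risk up, and a negative weight pushes it down. The weights are only valid locally, which is why each prediction gets its own explanation.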
Challenges & Solutions
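The main challenges were handling missing values in the time-series data, balancing the classes (addressed with SMOTE), and keeping the models interpretable without sacrificing performance (addressed with LIME explanations on top of the ensemble models).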
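The source does not say how missing readings were filled, so the following is only one plausible approach for wearable streams: time-based interpolation with a cap on how much of each gap gets filled, plus a flag that records where data was originally missing. The series values, the one-minute sampling rate, and the limit of 2 are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical minute-level heart-rate stream with sensor dropouts.
index = pd.date_range("2024-01-01 08:00", periods=12, freq="min")
hr = pd.Series(
    [72, 74, np.nan, 75, np.nan, np.nan, np.nan, np.nan, 80, 79, np.nan, 78],
    index=index,
    name="heart_rate",
)

# Flag gaps before filling so a model can still see where data was missing.
was_missing = hr.isna()

# Interpolate in time, filling at most 2 consecutive missing samples per gap.
# Short gaps are filled completely; long dropouts are only partially filled,
# so most of a long outage is not replaced with invented values.
filled = hr.interpolate(method="time", limit=2, limit_area="inside")

frame = pd.DataFrame({"heart_rate": filled, "was_missing": was_missing})
print(frame)
```

Samples that remain missing after this step can then be dropped or handled by a downstream imputer, depending on how much of the recording they cover.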
Results & Impact
Achieved 87.4% accuracy with improved model reliability and interpretable outputs highlighting key health indicators.
Key Learnings
Developed strong understanding of feature engineering, model evaluation, and the importance of explainability in real-world ML systems.