This document outlines the detailed, step-by-step workflow for building the MindPulse mini-project over a 2-day sprint. It breaks down exactly what to build and how to build it, ensuring strict alignment with the 7 core objectives outlined for this project.
- Identify & Analyze Indicators: Track keyboard latency, typing speed, mouse erraticism, and context-switching.
- Collect & Structure Datasets: Extract these data points into 5-minute sliding windows (46 features) and bootstrap with synthetic training data.
- Develop ML/DL Model: Train an XGBoost baseline model calibrated to individual users.
- Recommendation Module: Suggest dynamic interventions (breathing exercises, Lo-Fi music) based on the stress score.
- Desktop Application: Build a real-time Streamlit dashboard for a user-friendly UI.
- Continuous Improvement (User Feedback): Implement Online Learning (Exponential Moving Average) to adapt to the user's unique baseline over time.
- Evaluation Metrics: Build a test script to evaluate Accuracy, Precision, Recall, and F1-score mathematically.
- Target Objectives: Objective 1
- What to build: The background listener that captures real-time behavioral interactions securely.
- How to build:
  - Create `requirements.txt` (`pynput`, `pandas`, `xgboost`, `scikit-learn`, `streamlit`, `psutil`).
  - Write `data_collector.py` using the `pynput` library.
  - Implement listeners for `on_press`, `on_release`, `on_move`, and `on_click`.
  - Privacy rule: Never record the actual characters typed, only the timestamps and key categories (alpha, digit, special), to measure how the user types, not what they type.
  - Use `psutil` (together with `ctypes` on Windows, since `psutil` alone does not report the foreground window) to track the active application window and measure context-switching frequency.
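The privacy rule above can be sketched as follows. This is a minimal illustration, not the project's actual `data_collector.py`: the names `KeyEvent`, `categorize_key`, and `make_event` are hypothetical, and the sketch deliberately leaves out the `pynput` listener itself so that it runs without a display.

```python
import time
from dataclasses import dataclass


@dataclass
class KeyEvent:
    """What gets stored per keystroke (hypothetical record layout)."""
    timestamp: float  # seconds since the epoch
    action: str       # "press" or "release"
    category: str     # "alpha", "digit", or "special"; never the character


def categorize_key(char):
    """Map a character to a coarse category.

    `char` is None for non-printable keys such as Shift or Enter.
    """
    if char is not None and len(char) == 1:
        if char.isalpha():
            return "alpha"
        if char.isdigit():
            return "digit"
    return "special"


def make_event(char, action, now=None):
    """Build the stored record; the character itself is discarded here."""
    ts = time.time() if now is None else now
    return KeyEvent(timestamp=ts, action=action, category=categorize_key(char))
```

Inside a `pynput` `on_press(key)` callback, `getattr(key, "char", None)` would supply the `char` argument, since special keys carry no `char` attribute.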
- Target Objectives: Objective 2
- What to build: The engine that translates raw clicks/keys into mathematical features (e.g., WPM, pauses, rage clicks).
- How to build:
  - Write `feature_extractor.py`.
  - Segment the raw timeline into 5-minute sliding windows moving at 1-minute steps.
  - Extract 46 specific features per window using `numpy` arrays:
    - Keyboard: Hold time variance, flight time (interval between keys), Typing Speed (WPM), Error Rate (backspaces), Rhythm Entropy.
    - Mouse: Velocity variance, direction changes, Rage Clicks (rapid clustered clicks), Scroll variation.
    - Context: Tab/App switch frequency (proxy for cognitive scatter/overload).
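As a rough sketch of the windowing step, the snippet below cuts a time-sorted event list into 5-minute windows advancing by 1 minute and computes three illustrative keyboard features. It uses plain Python and the standard `statistics` module instead of `numpy` to stay dependency-free. The event layout `(timestamp, kind)`, the function names, and the feature names are assumptions, and only a small subset of the 46 features is shown.

```python
import statistics

WINDOW_S = 300  # 5-minute window
STEP_S = 60     # 1-minute step


def sliding_windows(events, window_s=WINDOW_S, step_s=STEP_S):
    """Yield (start, chunk) pairs over a time-sorted list of events.

    Each event is a (timestamp, kind) tuple; kind is "key" or "backspace".
    The first window is always emitted; later ones only if they fit fully.
    """
    if not events:
        return
    start = events[0][0]
    last = events[-1][0]
    while True:
        chunk = [e for e in events if start <= e[0] < start + window_s]
        yield start, chunk
        start += step_s
        if start + window_s > last:
            break


def keyboard_features(chunk, window_s=WINDOW_S):
    """Compute a few illustrative features for one window."""
    times = [t for t, _ in chunk]
    flights = [b - a for a, b in zip(times, times[1:])]
    n_back = sum(1 for _, kind in chunk if kind == "backspace")
    return {
        "keys_per_min": len(times) / (window_s / 60),
        "flight_time_var": statistics.pvariance(flights) if flights else 0.0,
        "error_rate": n_back / len(times) if times else 0.0,
    }
```

Because consecutive windows overlap by four minutes, neighbouring feature rows are strongly correlated, which is worth keeping in mind when splitting data for training and testing.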
- Target Objectives: Objective 2
- What to build: A dataset to train our model before the app gathers weeks of real user data.
- How to build:
  - Since we are not asking users to take stress tests on Day 1, we will write a `generate_synthetic_data()` function.
  - Use mathematical modeling to simulate 3,000+ "sessions." E.g., force the `Typing Entropy` and `Rage Click` features to correlate strongly with a `y = STRESSED` label.
  - This acts as our "pre-trained" base so the ML model works immediately.
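One possible shape for `generate_synthetic_data()` is sketched below. The distribution parameters (means, standard deviations, and how much each stress level shifts them) are invented for illustration, and only two of the 46 features are generated; the real generator would need values tuned by the team.

```python
import random

LABELS = ["NEUTRAL", "MILD", "STRESSED"]


def generate_synthetic_data(n_sessions=3000, seed=0):
    """Return a list of synthetic feature rows with a stress label.

    Both features are drawn from Gaussians whose mean rises with the
    stress level, which forces a strong correlation with the label.
    All numeric constants here are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded for reproducible datasets
    rows = []
    for _ in range(n_sessions):
        label = rng.choice(LABELS)
        level = LABELS.index(label)  # 0 = NEUTRAL, 1 = MILD, 2 = STRESSED
        rows.append({
            "typing_entropy": rng.gauss(2.0 + 0.8 * level, 0.3),
            "rage_clicks": max(0.0, rng.gauss(1.0 + 3.0 * level, 1.0)),
            "label": label,
        })
    return rows
```

A model trained on such data only learns the correlations that were built into the generator, so its scores on synthetic test sets say little about performance on real users.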
- Target Objectives: Objective 3, Objective 6
- What to build: The core XGBoost classification model and the user-calibration system.
- How to build:
  - Write `model.py`. Train an `xgb.XGBClassifier` on the synthetic dataset to output probabilities for 3 classes: NEUTRAL, MILD, STRESSED.
  - Addressing Objective 6 (Continuous Improvement): Build a `PersonalBaseline` class hooked to a local SQLite database (`user_baselines.db`).
  - Use an Exponential Moving Average (EMA) so the system slowly learns what your specific typing speed is at different times of the day, adjusting the model's sensitivity to match your unique circadian rhythm.
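The EMA calibration could look roughly like the sketch below, which keeps one typing-speed baseline per hour of the day in SQLite. The table schema, the method names, and the smoothing factor `alpha=0.1` are assumptions made for this example; the update rule is the standard EMA, `new = alpha * observation + (1 - alpha) * old`.

```python
import sqlite3


class PersonalBaseline:
    """Per-hour EMA of typing speed, persisted in SQLite (illustrative)."""

    def __init__(self, db_path="user_baselines.db", alpha=0.1):
        self.alpha = alpha  # higher alpha = faster adaptation, more noise
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS baseline ("
            "hour INTEGER PRIMARY KEY, wpm REAL NOT NULL)"
        )

    def get(self, hour):
        """Return the stored baseline for this hour, or None if unseen."""
        row = self.conn.execute(
            "SELECT wpm FROM baseline WHERE hour = ?", (hour,)
        ).fetchone()
        return row[0] if row else None

    def update(self, hour, wpm):
        """Blend a new observation into the baseline and persist it."""
        old = self.get(hour)
        new = wpm if old is None else self.alpha * wpm + (1 - self.alpha) * old
        self.conn.execute(
            "INSERT OR REPLACE INTO baseline (hour, wpm) VALUES (?, ?)",
            (hour, new),
        )
        self.conn.commit()
        return new
```

For example, with `alpha=0.5`, a stored baseline of 60 WPM and a new reading of 40 WPM give an updated baseline of 50 WPM. Passing `":memory:"` as `db_path` is convenient for tests.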
- Target Objectives: Objective 4, Objective 5
- What to build: The desktop application interface and intervention logic.
- How to build:
  - Write `app.py` using Streamlit. Streamlit is chosen because it allows for lightning-fast UI prototyping in pure Python.
  - Wire the dashboard to call `collector.get_events()` every few seconds to feed the live inference model.
  - Display a Live Stress Meter (0-100 gauge) and a breakdown of active metrics (WPM, Mouse Speed).
  - Addressing Objective 4 (Interventions): Build an `if/elif` logic block based on the stress score.
    - If `score > 70`: Trigger a red alert UI banner suggesting a 5-minute Box Breathing exercise or providing a link to Lo-Fi focus beats.
    - If `score > 50`: Trigger a yellow warning suggesting a quick stretch.
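The intervention thresholds above translate into a small pure function, sketched here with a hypothetical name (`choose_intervention`) and placeholder message text. Keeping it free of UI calls makes it easy to unit-test separately from the dashboard.

```python
def choose_intervention(score):
    """Map a 0-100 stress score to a (severity, message) pair, or None.

    Thresholds follow the plan: above 70 is a red alert, above 50 is a
    yellow warning. The message wording is a placeholder.
    """
    if score > 70:
        return ("red", "Try 5 minutes of Box Breathing, or put on Lo-Fi focus beats.")
    elif score > 50:
        return ("yellow", "Take a quick stretch.")
    return None
```

In `app.py`, the result could then be rendered with Streamlit's `st.error(message)` for red and `st.warning(message)` for yellow.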
- Target Objectives: Objective 7
- What to build: The testing script to prove the technical capabilities of the model.
- How to build:
  - Write `evaluate_model.py`.
  - Generate a heavily imbalanced hold-out test set (e.g., 1,000 synthetic sessions unseen by the XGBoost model).
  - Use `scikit-learn` metrics to output:
    - Accuracy: Overall correctness.
    - Precision (Macro): Minimizing false stress alarms so users don't lose trust.
    - Recall (Macro): Ensuring the app doesn't miss actual stressed states.
    - F1-Score: The harmonic mean of precision and recall, which stays informative when classes are imbalanced.
  - (Note: User-centric measures like "usability" and "satisfaction" will be measured qualitatively when you and your team actually use the Streamlit interface.)
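To make the macro-averaged metrics concrete, the sketch below computes them by hand in plain Python. It is meant only to show the definitions; the function name `macro_metrics` is hypothetical, and in `evaluate_model.py` the same numbers would normally come from scikit-learn, for instance `precision_recall_fscore_support` with `average="macro"`.

```python
def macro_metrics(y_true, y_pred, labels):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging computes each metric per class and then takes the
    unweighted mean, so rare classes count as much as common ones.
    """
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision_macro": sum(precisions) / n,
        "recall_macro": sum(recalls) / n,
        "f1_macro": sum(f1s) / n,
    }
```

On an imbalanced test set, accuracy can look high even when the minority STRESSED class is mostly missed, which is why the macro scores are reported alongside it.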
- Data Collection: `pynput`, `psutil`
- Data Processing: `numpy`, `pandas`
- Machine Learning: `xgboost`, `scikit-learn`
- Application UI: `streamlit`
- Database (Calibration): `sqlite3`