A PyTorch-based deep learning project that predicts the popularity of Spotify songs from their audio features. It uses an LSTM model for regression. The data is not truly sequential, so each song is treated as a sequence of length one.
- Input file: `spotify_songs.csv`
- Features used: `danceability`, `energy`, `loudness`, `speechiness`, `acousticness`, `instrumentalness`, `liveness`, `valence`, `tempo`
- Target: `track_popularity`

- Drop rows with missing values
- Normalize features using `StandardScaler`
- Split data into train/test sets
- Convert data into PyTorch tensors
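
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function name `prepare_data`, the 80/20 split, and the random seed are assumptions, and the final `unsqueeze(1)` is one way to give each song the sequence length of one that the LSTM expects.

```python
import numpy as np
import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

FEATURES = [
    "danceability", "energy", "loudness", "speechiness", "acousticness",
    "instrumentalness", "liveness", "valence", "tempo",
]
TARGET = "track_popularity"


def prepare_data(df, test_size=0.2, seed=42):
    """Hypothetical helper: return (X_train, X_test, y_train, y_test) tensors.

    test_size and seed are assumed values, not taken from the project.
    """
    # Drop rows with missing values in the columns that are actually used.
    df = df.dropna(subset=FEATURES + [TARGET])
    X = df[FEATURES].to_numpy(dtype=np.float32)
    y = df[TARGET].to_numpy(dtype=np.float32).reshape(-1, 1)

    # Normalize features to zero mean and unit variance.
    X = StandardScaler().fit_transform(X)

    # Split into train and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )

    def to_tensor(array):
        return torch.tensor(array, dtype=torch.float32)

    # Add a sequence dimension of length 1: (N, 9) -> (N, 1, 9).
    return (
        to_tensor(X_train).unsqueeze(1),
        to_tensor(X_test).unsqueeze(1),
        to_tensor(y_train),
        to_tensor(y_test),
    )


if __name__ == "__main__":
    tensors = prepare_data(pd.read_csv("spotify_songs.csv"))
    print([tuple(t.shape) for t in tensors])
```

The sketch follows the order listed above (normalize, then split). Fitting the scaler on the training rows only is a common alternative that keeps test-set statistics out of the normalization.
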

A custom LSTM regression model with input size 9, hidden size 64, 1 layer, and output size 1 (popularity score).

- Loss function: `MSELoss`
- Optimizer: `Adam` (lr=0.001)
- Epochs: 15
- Batch size: 64
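
A model and training loop with these settings might look like the sketch below. The class name `LSTMRegressor`, the function name `train`, the linear output head, and the per-sample averaging of the epoch loss are assumptions made for illustration; the project's own code may differ in these details.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


class LSTMRegressor(nn.Module):
    """Hypothetical model: one LSTM layer followed by a linear output head."""

    def __init__(self, input_size=9, hidden_size=64, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,  # inputs are (batch, seq_len, features)
        )
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x has shape (batch, 1, input_size): each song is a sequence of one.
        out, _ = self.lstm(x)
        # Feed the output of the last (here: only) time step to the head.
        return self.head(out[:, -1, :])


def train(model, X_train, y_train, epochs=15, batch_size=64, lr=0.001):
    """Train with MSELoss and Adam; return the mean training loss per epoch."""
    loader = DataLoader(
        TensorDataset(X_train, y_train), batch_size=batch_size, shuffle=True
    )
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    history = []
    model.train()
    for epoch in range(epochs):
        running = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
            # Weight each batch loss by its size (assumed averaging scheme).
            running += loss.item() * xb.size(0)
        history.append(running / len(loader.dataset))
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {history[-1]:.4f}")
    return history


if __name__ == "__main__":
    # Synthetic stand-in data, only to show the expected tensor shapes.
    X = torch.randn(256, 1, 9)
    y = torch.randn(256, 1)
    train(LSTMRegressor(), X, y)
```

With a sequence length of one, the LSTM has no earlier time steps to remember, so it behaves much like a gated feed-forward layer. That matches the educational framing of the project.
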
- Mean Squared Error (MSE) on the test set
- Line plot of training loss over epochs
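
The test-set MSE could be computed along these lines. This is a sketch: `evaluate_mse` is a hypothetical helper name, and it assumes the targets have the same shape as the model's output, for example `(N, 1)`.

```python
import torch
from torch import nn


def evaluate_mse(model, X_test, y_test):
    """Return the mean squared error of `model` on held-out tensors."""
    model.eval()  # switch off training-only behaviour such as dropout
    with torch.no_grad():  # gradients are not needed for evaluation
        preds = model(X_test)
        return nn.functional.mse_loss(preds, y_test).item()
```

A call such as `print(f"Test MSE Loss: {evaluate_mse(model, X_test, y_test):.4f}")` would then report the value in the same style as the training log.
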
Epoch 1/15, Loss: 152.3496
...
Epoch 15/15, Loss: 43.7221
Test MSE Loss: 38.9315

A plot is generated showing loss vs. epochs.
You can use `plt.savefig("training_loss.png")` to save the figure locally.
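
As a rough sketch of the plotting step, the helper below draws one loss value per epoch and writes the figure to disk. The function name `plot_training_loss` and the axis labels are assumptions. It saves through the figure object, which is equivalent to calling `plt.savefig` on the current figure.

```python
from pathlib import Path

import matplotlib.pyplot as plt


def plot_training_loss(losses, path="training_loss.png"):
    """Plot one loss value per epoch and save the figure to `path`."""
    epochs = range(1, len(losses) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, losses, marker="o")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Training loss (MSE)")
    ax.set_title("Training loss over epochs")
    fig.savefig(path)
    plt.close(fig)  # release the figure once it has been written
    return Path(path)
```

If you also want to display the plot interactively, save it before calling `plt.show()`, since the figure may be cleared once the window is closed.
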
- ✅ Install dependencies: `pip install torch pandas numpy scikit-learn matplotlib`
- 📂 Make sure `spotify_songs.csv` is in the same directory.
- 🚀 Run the script: `python your_script_name.py`

- Designed for educational purposes to demonstrate regression with LSTMs.
- Extendable to deeper networks or sequence data if needed.
- Lightweight and easy to customize.