A multiple regression analysis model that predicts vehicle fuel efficiency (MPG) based on automobile technical specifications. This project implements a linear regression model to predict miles per gallon using features such as weight, model year, acceleration, displacement, cylinders, and horsepower.
The model achieves an R-squared score of 0.8244 on the test set, meaning it explains 82.44% of the variance in MPG values, with a root-mean-square error (RMSE) of approximately 3.07 MPG.

Install the required dependencies using pip:

pip install pandas numpy matplotlib seaborn scikit-learn

Run the prediction script:

python mpg_prediction.py

Requirements:
- Python 3.x
- pandas for data processing and manipulation
- numpy for numerical computations
- matplotlib for data visualization
- seaborn for statistical data visualization
- scikit-learn for machine learning algorithms

Preprocessing and modeling:
- Handles missing values in the horsepower column using median imputation
- Excludes non-numeric car name column from analysis
- Applies StandardScaler to standardize features (zero mean, unit variance)
- Implements multiple linear regression using scikit-learn
- Uses 80-20 train-test split for model validation
- Analyzes standardized coefficients to determine feature importance
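
The steps above can be sketched end to end as follows. This is a minimal illustration, not the contents of `mpg_prediction.py`: the data is synthetic, the column names (`weight`, `model_year`, `horsepower`, `car_name`, `mpg`) and `random_state=42` are assumptions, and only three features are used to keep it short. Fitting the scaler on the training split only is a choice made for this sketch; the original script may order these steps differently.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (hypothetical column names).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "weight": rng.uniform(1600, 5200, n),
    "model_year": rng.integers(70, 83, n).astype(float),
    "horsepower": rng.uniform(45, 230, n),
    "car_name": ["car"] * n,
})
df["mpg"] = (46 - 0.007 * df["weight"]
             + 0.7 * (df["model_year"] - 70)
             + rng.normal(0, 2, n))
df.loc[df.index[::25], "horsepower"] = np.nan  # inject some missing values

# Median imputation for the horsepower column.
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].median())

# Drop the target and the non-numeric car name column.
X = df.drop(columns=["mpg", "car_name"])
y = df["mpg"]

# 80-20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize features using statistics from the training split only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit multiple linear regression on the standardized features.
model = LinearRegression().fit(X_train_scaled, y_train)

# Rank standardized coefficients by absolute magnitude.
coefficients = pd.Series(model.coef_, index=X.columns)
order = coefficients.abs().sort_values(ascending=False).index
print(coefficients.reindex(order))
print("Test R-squared:", model.score(X_test_scaled, y_test))
```

Because every feature is on the same scale after standardization, the coefficient magnitudes can be compared directly, which is what makes the ranking meaningful.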

Model performance:
- R-squared (test set): 0.8244 (82.44% of variance explained)
- RMSE (test set): 3.0725 MPG
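
For readers unfamiliar with the two metrics, the sketch below computes them by hand with NumPy. The four actual and predicted values are made up for illustration and are not taken from the project's test set; the helper names `r_squared` and `rmse` are likewise hypothetical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Share of variance explained: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the target (MPG)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up example values, not project data.
y_true = [18.0, 25.0, 31.0, 22.0]
y_pred = [20.0, 24.0, 29.0, 23.0]

print(f"R-squared: {r_squared(y_true, y_pred):.4f}")  # 0.8889
print(f"RMSE: {rmse(y_true, y_pred):.4f}")            # 1.5811
```

RMSE penalizes large errors more heavily than a plain average of absolute errors, so it should be read as a typical error size rather than a literal mean.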

Most influential features based on standardized coefficients:
- Weight: -5.9117 (strongest impact overall, negative: heavier vehicles have lower MPG)
- Model Year: 2.8829 (second strongest impact overall and the strongest positive one: newer vehicles have better MPG)
- Acceleration: 0.2224 (weak positive impact)
- Displacement: 0.1649 (weak positive impact)
- Cylinders: 0.1162 (very weak positive impact)
- Horsepower: 0.1073 (very weak positive impact)
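
The ranking above follows from sorting the coefficients by absolute value. The sketch below reproduces it from the reported numbers; the dictionary keys are illustrative names, and the "share of total magnitude" column is a rough heuristic added here, not a statistic reported by the project.

```python
# Standardized coefficients as reported in this README.
coefficients = {
    "weight": -5.9117,
    "model_year": 2.8829,
    "acceleration": 0.2224,
    "displacement": 0.1649,
    "cylinders": 0.1162,
    "horsepower": 0.1073,
}

# Sort by absolute magnitude, largest first.
ranked = sorted(coefficients.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Sum of absolute coefficients, used only for a rough relative comparison.
total = sum(abs(value) for value in coefficients.values())

for name, value in ranked:
    direction = "negative" if value < 0 else "positive"
    share = abs(value) / total
    print(f"{name:<13} {value:>8.4f}  {direction:<8}  {share:.1%}")
```

Weight and model year together account for most of the total coefficient magnitude, which is why the remaining four features are described as weak.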

Dataset summary:
- Total Samples: 398
- Training Data: 318 samples (80%)
- Test Data: 80 samples (20%)
- Features Used: 6
- Target Variable: mpg
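
The 318/80 split is consistent with a fractional test size of 0.2 where the test count is rounded up, which is how scikit-learn's `train_test_split` handles a float `test_size`. The arithmetic below is a small check under that assumption; whether the script calls `train_test_split` exactly this way is not stated in this README.

```python
import math

n_samples = 398
test_size = 0.2

# 0.2 * 398 = 79.6, rounded up to a whole number of test samples.
n_test = math.ceil(test_size * n_samples)
# The remaining samples form the training set.
n_train = n_samples - n_test

print(n_train, n_test)
print(f"{n_train / n_samples:.1%} train, {n_test / n_samples:.1%} test")
```

The resulting proportions are 79.9% and 20.1%, which the summary rounds to 80% and 20%.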

Project files:
- mpg_prediction.py - Main analysis script
- mpg_prediction_results.png - Model evaluation visualizations including actual vs predicted MPG, residual plots, feature importance, and residual distribution
- correlation_heatmap.png - Correlation matrix heatmap

License: MIT License

Author: Seonwoo Kang