sunwoo99999/auto_mpg_prediction

Auto MPG Prediction Project

Description

This project fits a multiple linear regression model that predicts vehicle fuel efficiency in miles per gallon (MPG) from six automobile technical specifications: weight, model year, acceleration, displacement, cylinders, and horsepower.

On the test set, the model achieves an R-squared score of 0.8244, meaning it explains 82.44% of the variance in MPG values, with a root mean squared error (RMSE) of approximately 3.07 MPG.

Installation

Install the required dependencies using pip:

pip install pandas numpy matplotlib seaborn scikit-learn

Run the prediction script:

python mpg_prediction.py

Tech Stack

  • Python 3.x
  • pandas for data processing and manipulation
  • numpy for numerical computations
  • matplotlib for data visualization
  • seaborn for statistical data visualization
  • scikit-learn for machine learning algorithms

Features

Data Processing

  • Handles missing values in horsepower column using median imputation
  • Excludes non-numeric car name column from analysis
  • Applies StandardScaler to standardize features (zero mean, unit variance)
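
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the code in mpg_prediction.py: the tiny DataFrame and its column names ("car name", "horsepower", and so on) are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny hypothetical stand-in for the Auto MPG data; column names are assumed.
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 26.0, 32.0],
    "weight": [3504.0, 3693.0, 2234.0, 1985.0],
    "horsepower": [130.0, np.nan, 95.0, 65.0],
    "car name": ["a", "b", "c", "d"],
})

# Fill missing horsepower values with the column median.
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].median())

# Drop the non-numeric car name column and separate features from the target.
X = df.drop(columns=["car name", "mpg"])
y = df["mpg"]

# Standardize each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.round(3))
```

The median is less sensitive to outliers than the mean, which is one common reason to prefer it for imputing a skewed column such as horsepower.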

Model Training

  • Implements multiple linear regression using scikit-learn
  • Uses 80-20 train-test split for model validation
  • Analyzes standardized coefficients to determine feature importance
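
A minimal sketch of this training flow, using synthetic data in place of the real dataset. The feature names, the random seed, and the choice to fit the scaler on the training split only are assumptions of this example, not details confirmed by the project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the real features (hypothetical names).
rng = np.random.default_rng(0)
feature_names = ["weight", "model_year"]
X = rng.normal(size=(100, 2))
y = -5.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then fit the regression
# on the standardized training features.
scaler = StandardScaler().fit(X_train)
model = LinearRegression().fit(scaler.transform(X_train), y_train)

# Because the inputs are standardized, coefficient magnitudes are comparable.
for name, coef in zip(feature_names, model.coef_):
    print(f"{name}: {coef:+.4f}")
```

Fitting the scaler on the training split alone keeps test-set statistics out of the model, which is a common way to avoid data leakage.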

Model Performance

  • R-squared (test set): 0.8244 (82.44% of variance explained)
  • RMSE (test set): 3.0725 mpg
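
For reference, both metrics can be computed directly from predictions. The helper name `r2_and_rmse` and the sample values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the project's results.

```python
import numpy as np

def r2_and_rmse(y_true, y_pred):
    """Return (R-squared, RMSE) for paired true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                  # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    return r2, rmse

# Hypothetical MPG values, not taken from the dataset.
r2, rmse = r2_and_rmse([20.0, 30.0, 25.0, 15.0], [22.0, 28.0, 25.0, 17.0])
print(f"R-squared: {r2:.4f}, RMSE: {rmse:.4f}")
```

RMSE is expressed in the same unit as the target (mpg) and penalizes large errors more heavily than the mean absolute error does.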

Key Findings

Most influential features based on standardized coefficients:

  1. Weight: -5.9117. Strongest impact overall, and negative: heavier vehicles have lower MPG
  2. Model Year: 2.8829. Strongest positive impact (second strongest overall): newer vehicles have better MPG
  3. Acceleration: 0.2224. Weak positive impact
  4. Displacement: 0.1649. Weak positive impact
  5. Cylinders: 0.1162. Very weak positive impact
  6. Horsepower: 0.1073. Very weak positive impact
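
To put these magnitudes in proportion, the sketch below computes each feature's share of the total absolute coefficient size, using the values reported above. This share is a rough heuristic chosen for illustration, not a formal importance measure and not necessarily how the project ranks features; the dictionary keys are assumed names.

```python
# Standardized coefficients as reported above (keys are assumed names).
coefficients = {
    "weight": -5.9117,
    "model_year": 2.8829,
    "acceleration": 0.2224,
    "displacement": 0.1649,
    "cylinders": 0.1162,
    "horsepower": 0.1073,
}

# Share of the summed absolute coefficient magnitude per feature.
total = sum(abs(c) for c in coefficients.values())
shares = {name: abs(c) / total for name, c in coefficients.items()}

# Print features from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>12}: {share:.1%}")
```

By this measure, weight and model year together account for most of the total magnitude, which matches the ranking in the list.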

Dataset Information

  • Total Samples: 398
  • Training Data: 318 samples (80%)
  • Test Data: 80 samples (20%)
  • Features Used: 6
  • Target Variable: mpg
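
The 318/80 split follows from the sample count if the split is made with scikit-learn's train_test_split, which rounds a fractional test size up. That the script uses this function is an assumption here; the arithmetic itself is a quick check:

```python
import math

n_samples = 398
test_fraction = 0.2

# With a float test_size, train_test_split rounds the test count up
# and assigns the remaining rows to the training set.
n_test = math.ceil(test_fraction * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```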

Generated Outputs

  1. mpg_prediction.py - Main analysis script
  2. mpg_prediction_results.png - Model evaluation visualizations including actual vs predicted MPG, residual plots, feature importance, and residual distribution
  3. correlation_heatmap.png - Correlation matrix heatmap
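
The matrix behind a correlation heatmap can be computed with pandas. The sketch below uses a few hypothetical rows and assumed column names rather than the real dataset, and stops short of plotting.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; column names are assumed for illustration.
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 26.0, 32.0],
    "weight": [3504.0, 3693.0, 2234.0, 1985.0],
    "displacement": [307.0, 350.0, 113.0, 97.0],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(3))
```

A matrix like this can then be rendered with seaborn, for example `sns.heatmap(corr, annot=True)`, and saved with matplotlib's `savefig`.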

Reference

Predictive Modeling of Auto MPG and Analysis of Engine Design Factors: Comparing the Impact of Weight and Displacement

License

MIT License

Author

Seonwoo Kang
