This repository contains a PyTorch implementation of a multilayer perceptron (MLP) regression model designed for tabular data. It includes data loading, preprocessing, model definition, training, and evaluation functionality.
- Features
- Installation
- Usage
- Dataset Configuration
- Model Configuration
- Training Configuration
- Dependencies
- Logging
- Error Handling
- Metrics and Visualization
- Contributing
- License
## Features

- Data Loading and Preprocessing:
  - Loads data from CSV files.
  - Supports chunked loading for large datasets.
  - Handles missing columns and empty datasets.
  - Implements various data scaling techniques (StandardScaler, MinMaxScaler, MaxAbsScaler, RobustScaler, Normalizer).
  - Adds noise to the data (Normal, Uniform, Poisson).
  - Allows scaling and noise parameters to be configured.
- Model Definition:
  - Defines a customizable neural network model with configurable hidden layers, dropout, and L1 regularization.
  - Includes batch normalization and ELU activation.
- Training and Evaluation:
  - Implements training and validation loops with error handling.
  - Supports early stopping based on validation loss.
  - Uses the Adam optimizer with a configurable learning rate, weight decay, and learning rate schedulers (StepLR, ReduceLROnPlateau).
  - Calculates and logs metrics (MSE, MAE, R-squared).
  - Collects and visualizes training and validation losses, MSE, MAE, and R-squared.
  - Plots residuals against predicted values.
- Device Management:
  - Automatically detects and uses an available GPU (CUDA or MPS) or the CPU.
- Logging:
  - Uses Python's `logging` module to log training and evaluation information.
  - Logs to both a file (`learn_model.log`) and the console.
- Error Handling:
  - Includes custom exceptions for dataset-related errors.
  - Provides comprehensive error handling throughout the code.
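The model described in the features above (configurable hidden layers, batch normalization, ELU, dropout, and an L1 term) can be sketched roughly as follows. The class name `RegressionMLP`, the layer sizes, and the `l1_penalty` helper are illustrative assumptions, not the repository's actual definitions:

```python
import torch
import torch.nn as nn

class RegressionMLP(nn.Module):
    """Sketch of an MLP regressor with batch norm, ELU, and dropout.

    Names and default sizes are illustrative, not the repository's actual code.
    """
    def __init__(self, in_features: int, hidden_sizes=(64, 32), dropout: float = 0.1):
        super().__init__()
        layers = []
        prev = in_features
        for h in hidden_sizes:
            layers += [
                nn.Linear(prev, h),
                nn.BatchNorm1d(h),    # batch normalization
                nn.ELU(),             # ELU activation
                nn.Dropout(dropout),  # configurable dropout
            ]
            prev = h
        layers.append(nn.Linear(prev, 1))  # single regression output
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

    def l1_penalty(self, l1_lambda: float) -> torch.Tensor:
        # L1 regularization term, added to the loss during training
        return l1_lambda * sum(p.abs().sum() for p in self.parameters())
```

In a training loop the L1 term would be added to the base loss, e.g. `loss = criterion(pred, y) + model.l1_penalty(1e-5)`.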
## Installation

- Clone the repository:

  ```bash
  git clone <repository_url>
  cd <repository_directory>
  ```

- Install the required dependencies:

  ```bash
  pip install numpy pandas torch scikit-learn matplotlib
  ```
## Usage

- Prepare your dataset:
  - Place your CSV data file in the specified `root` directory.
  - Ensure the CSV file contains the columns specified in `xcol` and `ycol` within the `DatasetConfig`.
- Configure the dataset:
  - Modify the `DatasetConfig` in the `if __name__ == '__main__':` block to match your dataset.
  - Set `root`, `csv_file`, `xcol`, `ycol`, `scaler_type`, `noise_type`, `noise_std`, and `scaling_factor` as needed.
- Configure the model and training:
  - Adjust the `config` dictionary in the `if __name__ == '__main__':` block to configure the model, optimizer, learning rate schedulers, and early stopping.
- Run the script:

  ```bash
  python <your_script_name>.py
  ```

- View the results:
  - The script outputs training and evaluation metrics to the console and the log file.
  - Plots of the training and validation losses, MSE, MAE, R-squared, and residuals are displayed.
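The early stopping mentioned above can be captured by a small helper that watches the validation loss. This is a minimal sketch; the class name and the `patience`/`min_delta` parameters are illustrative assumptions, and the script's actual configuration keys may differ:

```python
class EarlyStopping:
    """Minimal early-stopping helper keyed on validation loss (illustrative sketch)."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience    # epochs to wait without improvement
        self.min_delta = min_delta  # minimum decrease that counts as improvement
        self.best_loss = float('inf')
        self.counter = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss  # improvement: reset the counter
            self.counter = 0
        else:
            self.counter += 1          # no improvement this epoch
        return self.counter >= self.patience
```

It would be called once per epoch after validation; when `step` returns `True`, the training loop breaks.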
## Dataset Configuration

The `DatasetConfig` dataclass allows you to configure the dataset:
```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DatasetConfig:
    root: str
    xcol: list[str]
    ycol: list[str]
    scaler_type: ScalerType = ScalerType.STANDARD
    noise_type: NoiseType = NoiseType.NORMAL
    csv_file: str = 'dummy_data.csv'
    scaler: Optional[object] = None
    noise_params: Optional[dict] = None
    chunksize: Optional[int] = None
    noise_std: float = 0.0
    scaling_factor: float = 1.0
```
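An instantiation might look like the following. The snippet repeats the dataclass so it runs standalone, and the stand-in `ScalerType`/`NoiseType` enums (only the `STANDARD` and `NORMAL` members appear in the defaults above), paths, and column names are placeholder assumptions:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Stand-in enums so the example is self-contained; the script defines its own.
class ScalerType(Enum):
    STANDARD = 'standard'

class NoiseType(Enum):
    NORMAL = 'normal'

@dataclass
class DatasetConfig:
    root: str
    xcol: list[str]
    ycol: list[str]
    scaler_type: ScalerType = ScalerType.STANDARD
    noise_type: NoiseType = NoiseType.NORMAL
    csv_file: str = 'dummy_data.csv'
    scaler: Optional[object] = None
    noise_params: Optional[dict] = None
    chunksize: Optional[int] = None
    noise_std: float = 0.0
    scaling_factor: float = 1.0

# Example configuration; paths and column names are placeholders.
cfg = DatasetConfig(
    root='data',
    xcol=['feature_1', 'feature_2'],
    ycol=['target'],
    csv_file='my_data.csv',
    noise_std=0.01,
)
```

Fields left unset keep their defaults, e.g. `cfg.scaler_type` is `ScalerType.STANDARD` and `cfg.chunksize` is `None`.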