Data Preprocessing and Initial Model Evaluation #1

@soykothosen

Overview

The goal of this task is to prepare the research dataset and conduct initial model training and evaluation to establish a baseline for our project.

This issue has two phases: Data Cleaning & Preprocessing, followed by Baseline Model Evaluation.


Phase 1: Data Cleaning & Preprocessing

Assigned to: Mahadi Hassan

Tasks:

  • Inspect the raw dataset for inconsistencies, duplicates, and missing records.
  • Handle missing values and outliers using standard preprocessing techniques.
  • Normalize/scale features and encode categorical variables where necessary.
  • Split the finalized dataset into training, validation, and testing sets.
  • Save and document the cleaned dataset in the designated directory.
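
The cleaning steps above could be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual script: the row layout (a list of lists with `None` for missing values), the median imputation, the min-max scaling, the 70/15/15 split ratios, and every function name are assumptions made for the example. In practice a library such as pandas or scikit-learn would likely be used instead.

```python
import random
import statistics


def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_median(rows, col):
    """Replace None in column `col` with the median of the observed values."""
    observed = [r[col] for r in rows if r[col] is not None]
    median = statistics.median(observed)
    return [
        r[:col] + [median if r[col] is None else r[col]] + r[col + 1:]
        for r in rows
    ]


def split_dataset(rows, train=0.7, val=0.15, seed=42):
    """Shuffle reproducibly, then split into train/validation/test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def fit_min_max(train_rows, col):
    """Compute scaling bounds from the training split only."""
    values = [r[col] for r in train_rows]
    return min(values), max(values)


def apply_min_max(rows, col, lo, hi):
    """Scale column `col` using bounds (lo, hi) fitted on the training split."""
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [r[:col] + [(r[col] - lo) / span] + r[col + 1:] for r in rows]
```

Fitting the scaling bounds on the training split and then applying them to the validation and test splits keeps information from the held-out data out of the preprocessing step.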

Deliverables:

  • Cleaned and preprocessed dataset.
  • Data cleaning script or notebook (well-commented).

Phase 2: Baseline Model Evaluation

Assigned to: Nazifa Fairuz Zuthi
Note: This phase is blocked until the cleaned dataset from Phase 1 is complete.

Tasks:

  • Set up the pipeline to load the preprocessed training, validation, and testing data.
  • Implement and run standard baseline models suitable for this research problem.
  • Evaluate the performance of each model using relevant metrics (e.g., Accuracy, Precision, Recall, F1-Score).
  • Document and compare the initial results.
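
As a rough illustration of the evaluation step, the sketch below computes the listed metrics for a binary classification task and includes a majority-class predictor as the simplest possible reference point. The issue does not state the problem type or the models to use, so the binary setting, the choice of baseline, and the names `majority_class_baseline` and `binary_metrics` are all assumptions.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda features: most_common


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Report 0.0 when a denominator is zero instead of raising an error.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

A majority-class predictor gives a floor that any trained model should beat. On imbalanced data it can score high accuracy while recall on the minority class is zero, which is one reason to report precision, recall, and F1 alongside accuracy.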

Deliverables:

  • Model training and evaluation scripts/notebooks.
  • A summary report or comparative table of the initial results.
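
One possible way to produce the comparative table is to collect each model's metrics in a dictionary and render them as a Markdown table that can be pasted into this issue. The helper name `results_table` and the expected input shape are hypothetical.

```python
def results_table(results, metrics=("accuracy", "precision", "recall", "f1")):
    """Render {model_name: {metric: value}} as a Markdown table string."""
    header = "| Model | " + " | ".join(m.capitalize() for m in metrics) + " |"
    divider = "|" + " --- |" * (len(metrics) + 1)
    rows = [
        "| " + name + " | "
        + " | ".join(f"{scores[m]:.3f}" for m in metrics) + " |"
        for name, scores in results.items()
    ]
    return "\n".join([header, divider] + rows)
```

Each key of `results` becomes one row, so models can be added to the comparison as they are evaluated.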
