Skip to content

harshadarayate/DataScience

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DataScience Repository

Python License: MIT

Overview

Welcome to the DataScience repository by Harshada Rayate! This repo serves as a collection of tools, notebooks, and source code for data science projects, focusing on exploratory data analysis (EDA), model development, and robust Python utilities like logging and exception handling.

Built with reproducibility in mind, it includes setup for easy installation and a modular structure for extending data workflows.

Key Features

  • Jupyter notebooks for hands-on EDA and machine learning model building.
  • Reusable source modules for logging and error management.
  • Standard Python packaging with setup.py for easy deployment.
  • Dependency management via requirements.txt.

Repository Structure

The repo follows a clean, modular layout:

DataScience/
├── .gitignore                # Git ignore rules for Python projects
├── requirements.txt          # List of Python dependencies
├── setup.py                  # Setuptools configuration for packaging
├── notebook/                 # Jupyter notebooks for data exploration and modeling
│   └── (Notebooks related to EDA and Model Development, committed May 2, 2025)
└── src/                      # Core source code modules
    └── (Utilities for logging and exception handling, committed Mar 16, 2025)

Note: Subfolder contents are based on commit history. Add specific file details as the repo evolves.

Getting Started

Prerequisites

  • Python 3.8+
  • Git
  • Jupyter (for notebooks)

Installation

  1. Clone the Repository

    git clone https://github.com/harshadarayate/DataScience.git
    cd DataScience
  2. Set Up Virtual Environment (Recommended)

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install Dependencies Use requirements.txt for a standard install:

    pip install -r requirements.txt

    Or install as a package:

    pip install -e .
  4. Launch Notebooks

    jupyter notebook

    Navigate to the notebook/ folder to open EDA and model development files.

Usage

  • Explore Notebooks: Dive into notebook/ for step-by-step EDA and model training examples. Run cells to visualize data and build prototypes.
  • Use Source Modules: Import utilities from src/ in your scripts, e.g.:
    from src.logger import setup_logger
    from src.exceptions import CustomDataException
    
    logger = setup_logger()
    try:
        # Your data code here
        pass
    except Exception as e:
        raise CustomDataException("Data processing failed") from e
  • Extend the Project: Add new notebooks or modules while respecting the structure—no major changes needed.

Contributing

Pull requests are welcome! For major additions:

  1. Fork the repo.
  2. Create a branch (git checkout -b feature/new-notebook).
  3. Commit changes (git commit -m 'Add new EDA notebook').
  4. Push and open a PR.

License

This project is licensed under the MIT License. See LICENSE (add if not present) for details.

Contact

Harshada Rayate
GitHub | LinkedIn

Project: DataScience


Star ⭐ the repo to stay updated!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors