Welcome to the DataScience repository by Harshada Rayate! This repo serves as a collection of tools, notebooks, and source code for data science projects, focusing on exploratory data analysis (EDA), model development, and robust Python utilities like logging and exception handling.
Built with reproducibility in mind, it includes setup for easy installation and a modular structure for extending data workflows.
- Jupyter notebooks for hands-on EDA and machine learning model building.
- Reusable source modules for logging and error management.
- Standard Python packaging with setup.py for easy deployment.
- Dependency management via requirements.txt.
The repo follows a clean, modular layout:
DataScience/
├── .gitignore # Git ignore rules for Python projects
├── requirements.txt # List of Python dependencies
├── setup.py # Setuptools configuration for packaging
├── notebook/ # Jupyter notebooks for data exploration and modeling
│ └── (Notebooks related to EDA and Model Development, committed May 2, 2025)
└── src/ # Core source code modules
    └── (Utilities for logging and exception handling, committed Mar 16, 2025)
Note: Subfolder contents are based on commit history. Add specific file details as the repo evolves.
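Because the layout lists both requirements.txt and setup.py, one common way to keep them in sync is to have setup.py read its dependency list from requirements.txt. The sketch below is an illustration only, not the repository's actual setup.py: the helper names `read_requirements` and `build_setup_kwargs`, the package name, and the version string are all assumptions.

```python
from pathlib import Path


def read_requirements(path="requirements.txt"):
    """Return requirement strings, skipping blanks, comments, and pip options."""
    requirements = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Lines such as "-e ." or "-r other.txt" are pip options, not packages.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


def build_setup_kwargs(requirements_path="requirements.txt"):
    """Collect keyword arguments for setuptools.setup (placeholder metadata)."""
    return {
        "name": "DataScience",  # hypothetical package name
        "version": "0.1.0",     # hypothetical version
        "install_requires": read_requirements(requirements_path),
    }


# In setup.py these helpers might be used as:
#   from setuptools import setup, find_packages
#   setup(packages=find_packages(), **build_setup_kwargs())
```

With this approach, `pip install -e .` and `pip install -r requirements.txt` resolve the same dependency list, at the cost of coupling the package metadata to a file that is often used for pinned development environments.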
- Python 3.8+
- Git
- Jupyter (for notebooks)
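The prerequisites above can be checked with a short script before installing anything. This is a minimal sketch using only the standard library; the function name `check_prerequisites` is hypothetical and not part of the repository.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def check_prerequisites(tools=("git", "jupyter")):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python 3.8+ is required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

Jupyter may only appear on PATH after the virtual environment is activated and dependencies are installed, so a missing `jupyter` at this stage is not necessarily an error.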
- Clone the Repository
      git clone https://github.com/harshadarayate/DataScience.git
      cd DataScience
- Set Up Virtual Environment (Recommended)
      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install Dependencies
  Use requirements.txt for a standard install:
      pip install -r requirements.txt
  Or install as a package:
      pip install -e .
- Launch Notebooks
      jupyter notebook
  Navigate to the notebook/ folder to open EDA and model development files.
- Explore Notebooks: Dive into notebook/ for step-by-step EDA and model training examples. Run cells to visualize data and build prototypes.
- Use Source Modules: Import utilities from src/ in your scripts, e.g.:
      from src.logger import setup_logger
      from src.exceptions import CustomDataException

      logger = setup_logger()
      try:
          # Your data code here
          pass
      except Exception as e:
          raise CustomDataException("Data processing failed") from e
- Extend the Project: Add new notebooks or modules that follow the existing structure; no major changes are needed.
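The usage notes above import `setup_logger` and `CustomDataException` from src/. For readers who want to see how such utilities could work, here is a self-contained sketch built on the standard `logging` module. It is an assumption about their shape, not the repository's actual code: the default logger name, the log format, the `context` attribute, and the `parse_measurements` helper are all invented for illustration.

```python
import logging
import sys


def setup_logger(name="datascience", level=logging.INFO):
    """Return a named logger that writes formatted records to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Guard against duplicate handlers when called repeatedly (common in notebooks).
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s | %(levelname)s | %(name)s | %(message)s")
        )
        logger.addHandler(handler)
    return logger


class CustomDataException(Exception):
    """Error raised when a data-processing step fails (hypothetical design)."""

    def __init__(self, message, context=None):
        super().__init__(message)
        self.context = context or {}  # optional details for debugging


def parse_measurements(raw_values):
    """Convert strings to floats, wrapping failures in CustomDataException."""
    logger = setup_logger()
    try:
        return [float(value) for value in raw_values]
    except ValueError as e:
        logger.error("Could not parse measurements: %s", e)
        raise CustomDataException(
            "Data processing failed", context={"row_count": len(raw_values)}
        ) from e
```

Raising with `from e` keeps the original `ValueError` available as `__cause__`, so the traceback shows both the domain-level failure and the low-level reason for it.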
Pull requests are welcome! For major additions:
- Fork the repo.
- Create a branch (git checkout -b feature/new-notebook).
- Commit changes (git commit -m 'Add new EDA notebook').
- Push and open a PR.
This project is licensed under the MIT License. See LICENSE (add if not present) for details.
Harshada Rayate
GitHub | LinkedIn
Project: DataScience
Star ⭐ the repo to stay updated!