PaleoNet Development Guide

This document provides information for developers interested in contributing to or extending the PaleoNet dinosaur classification project.

Development Environment Setup

Clone the Repository

git clone https://github.com/yourusername/PaleoNet.git
cd PaleoNet

Create a Virtual Environment

# Using venv
python -m venv venv

# Activate on Windows
.\venv\Scripts\activate

# Activate on macOS/Linux
source venv/bin/activate

Install Dependencies
```
pip install -r requirements.txt
```
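
After installing, a quick import check can confirm that the environment is usable. The sketch below assumes the app depends on streamlit and tensorflow. That is inferred from the project description, not read from requirements.txt, so adjust the list to match the actual file.

```python
from importlib.util import find_spec

# Assumed top-level packages; requirements.txt is the authoritative list.
ASSUMED_PACKAGES = ["streamlit", "tensorflow"]


def find_missing(packages):
    """Return the names in `packages` that cannot be found for import."""
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```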

Project Structure

PaleoNet/
├── PaleoNet.py             # Main Streamlit application
├── utils.py                # Utility functions
├── pages/                  # Additional app pages
│   ├── 01_Model_Info.py
│   ├── 02_Dinosaur_Encyclopedia.py
│   └── 03_Model_Performance.py
├── assets/                 # Images and static assets
│   ├── banner.png
│   └── logo.png
├── data/                   # Dataset directory
│   └── dinosaur_dataset_split/
│       ├── train/          # Training data (70%)
│       ├── val/            # Validation data (15%)
│       └── test/           # Test data (15%)
├── docs/                   # Documentation
│   ├── development.md
│   ├── model_info.md
│   └── user_guide.md
├── model/                  # Saved model files
│   ├── best_model_checkpoint.h5
│   ├── confusion_matrix.csv
│   ├── confusion_matrix.png
│   ├── dinosaur_classifier_transfer_learning.keras
│   ├── dinosaur_class_mapping.json
│   ├── dinosaur_model_performance.json
│   ├── training_history.csv
│   ├── training_history.png
│   └── training_history_detailed.json
├── opdracht_CNN_stijnen_simon.ipynb  # Model training notebook
├── requirements.txt        # Project dependencies
├── README.md               # Main documentation
└── LICENSE                 # MIT License
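
The 70/15/15 split under data/dinosaur_dataset_split/ can be reproduced in several ways. The following is a minimal sketch, not necessarily how the notebook does it; the function name, default ratios, and seed are illustrative. For a classification dataset it would normally be applied once per class folder so that every class keeps the same proportions.

```python
import random


def split_dataset(items, train=0.70, val=0.15, seed=42):
    """Shuffle items reproducibly and split them into train/val/test lists.

    The test share is whatever remains after the train and val shares.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )
```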

Key Components

1. Model Training (opdracht_CNN_stijnen_simon.ipynb)

The Jupyter notebook contains the complete workflow for:

- Loading and preprocessing the dataset
- Building the EfficientNetB0-based model
- Training with transfer learning
- Evaluating on test data
- Saving model artifacts

To retrain the model with different parameters or architectures, modify this notebook.

2. Streamlit Application (PaleoNet.py)

The main application file:

- Loads the trained model
- Provides the user interface with a tabbed navigation system
- Handles image upload and processing
- Displays classification results
- Contains three main tabs: Home, Upload Image, and Sample Gallery
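
As an illustration of the "displays classification results" step, the sketch below ranks a probability vector against a list of class names. The function name and the list-based class mapping are assumptions; the actual format of dinosaur_class_mapping.json and the code in PaleoNet.py may differ.

```python
def top_k_predictions(probabilities, class_names, k=3):
    """Return the k (class_name, probability) pairs with the highest probability."""
    if len(probabilities) != len(class_names):
        raise ValueError("probabilities and class_names must have the same length")
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]
```

In the app, the returned pairs could then be rendered with whatever Streamlit widgets the results view uses.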

3. Pages (pages/)

Additional application pages:

- 01_Model_Info.py: Displays model architecture and performance
- 02_Dinosaur_Encyclopedia.py: Information about dinosaur species
- 03_Model_Performance.py: Displays model performance metrics and visualizations

4. Utility Functions (utils.py)

Contains helper functions for:

- Image preprocessing
- Visualization
- Model interpretation

Development Workflow

Adding a New Feature

1. Create a feature branch

git checkout -b feature/your-feature-name

2. Implement your changes
- Update application code
- Add tests for your feature (if applicable)
- Update documentation in docs/
3. Run tests locally
```
# If the project uses pytest and has a tests/ directory
pytest tests/
```
4. Create a pull request
- Provide a clear description of your changes
- Reference any related issues
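
A test for a new feature might look like the sketch below. Both the file name and the is_supported_image helper are hypothetical. The helper is defined inline only to keep the example self-contained; in the project it would be imported from utils.py or wherever the feature lives.

```python
# tests/test_upload_validation.py (hypothetical file name)
from pathlib import Path

# Hypothetical helper; the extension set is an assumption for illustration.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported_image(filename):
    """Return True if the file extension is one the app accepts."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def test_accepts_common_image_extensions():
    assert is_supported_image("rex.JPG")
    assert is_supported_image("photos/stego.png")


def test_rejects_other_files():
    assert not is_supported_image("notes.txt")
    assert not is_supported_image("archive.tar.gz")
```

pytest collects any function whose name starts with test_ in files named test_*.py, so no extra registration is needed.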

Modifying the Model

To improve or change the classification model:

1. Open the training notebook opdracht_CNN_stijnen_simon.ipynb
2. Modify the model architecture, training parameters, or data augmentation
3. Retrain the model
4. Evaluate performance
5. Export the model artifacts:
- dinosaur_classifier_transfer_learning.keras
- dinosaur_class_mapping.json
- dinosaur_model_performance.json
6. Place the new model artifacts in the model/ directory
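
Before starting the app with a retrained model, it can help to confirm that all three exported files landed in model/. The helper below is a small sketch of that check; the function name is illustrative, and the file names are the ones listed above.

```python
from pathlib import Path

# Artifact names taken from the export step above.
REQUIRED_ARTIFACTS = (
    "dinosaur_classifier_transfer_learning.keras",
    "dinosaur_class_mapping.json",
    "dinosaur_model_performance.json",
)


def missing_artifacts(model_dir):
    """Return the required artifact names that are not files in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_ARTIFACTS if not (model_dir / name).is_file()]
```

Calling missing_artifacts("model") from the project root returns an empty list when everything is in place.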

Adding a New Page

To add a new page to the Streamlit application:

1. Create a new Python file in pages/ (the filename should start with a number to control ordering)
2. Import needed modules, especially streamlit as st
3. Set page configuration at the top
4. Implement the page content
5. Update documentation to reference your new page

Example:

# pages/04_Your_New_Page.py (03 is already taken by 03_Model_Performance.py)
import streamlit as st

st.set_page_config(
    page_title="PaleoNet - Your New Page",
    page_icon="🦖",
    layout="wide"
)

st.title("Your New Page Title")
st.markdown("## Your content here")

# Rest of your page implementation
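
Because the numeric prefix controls page ordering, a new page should use a number that is not already taken. The sketch below, with an illustrative function name, scans a pages directory and suggests the next two-digit prefix.

```python
import re
from pathlib import Path


def next_page_prefix(pages_dir):
    """Return the two-digit prefix following the highest one used in pages_dir."""
    numbers = []
    for path in Path(pages_dir).glob("*.py"):
        match = re.match(r"(\d+)_", path.name)
        if match:
            numbers.append(int(match.group(1)))
    # With no numbered pages yet, start at 01.
    return f"{max(numbers, default=0) + 1:02d}"
```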

Path Handling Guidelines

The application uses robust path handling to ensure portability across different operating systems and environments. When working with file paths:

Always use OS-independent path construction

import os
from pathlib import Path

# Get the current file's directory
current_dir = Path(__file__).parent

# Navigate to parent directory
root_dir = current_dir.parent

# Create path to a file
file_path = os.path.join(root_dir, "model", "model_file.keras")

Avoid hardcoded relative paths
- Don't use: "../model/file.json"
- Instead use: os.path.join(root_dir, "model", "file.json")

Add error handling for file operations

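import json
import streamlit as st

# file_path is built as in the path construction example above
try:
    with open(file_path, "r") as f:
        data = json.load(f)
except FileNotFoundError:
    st.error(f"Could not find file: {file_path}")
    data = {}  # Example fallback so later code can still run; adjust as needed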
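
The guidelines above can be combined into two small helpers, sketched here with pathlib only. The names project_root and load_json are illustrative and are not existing functions in utils.py. The sketch also treats invalid JSON like a missing file, which may or may not be the behavior you want.

```python
import json
from pathlib import Path


def project_root(current_file, levels_up=1):
    """Return the directory `levels_up` levels above the folder containing current_file.

    For a file in pages/, levels_up=1 gives the project root.
    """
    return Path(current_file).resolve().parents[levels_up]


def load_json(path, default=None):
    """Load JSON from path, returning `default` if the file is missing or invalid."""
    try:
        with Path(path).open("r", encoding="utf-8") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


# Example use inside a page module (commented out because it needs __file__):
# mapping_path = project_root(__file__) / "model" / "dinosaur_class_mapping.json"
# mapping = load_json(mapping_path, default={})
```

In a Streamlit page, the caller can check for the default value and show st.error, which keeps the loader itself free of UI code.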

Documentation

When adding features or making changes:

- Update relevant documentation in docs/
- Add inline comments for complex code sections
- Update README.md if needed
- Include example usage where appropriate

Deployment

To deploy the application:

Ensure all dependencies are in requirements.txt
For Streamlit Cloud:
- Push to GitHub
- Connect repository to Streamlit Cloud
- Configure settings as needed
For self-hosting:
- Install dependencies
- Run with streamlit run PaleoNet.py
- Consider using Docker for containerization
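
To check the first step before deploying, the sketch below compares the contents of requirements.txt with a list of packages the code is expected to need. The parsing is deliberately simple: it handles plain names with optional version specifiers and comments, but not URLs or other advanced requirement syntax. The function names are illustrative.

```python
import re


def listed_packages(requirements_text):
    """Return the lower-cased package names found in requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as -r or -e
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def missing_from_requirements(requirements_text, needed):
    """Return the sorted names in `needed` that requirements_text does not list."""
    listed = listed_packages(requirements_text)
    return sorted(name for name in needed if name.lower() not in listed)
```

A typical call would read the file with Path("requirements.txt").read_text() and pass the packages your new code imports as `needed`.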

Getting Help

If you need assistance with development:

- Check existing documentation
- Look for similar issues in the issue tracker
- Contact the maintainers
- Create a new issue with a clear description of your problem
