fahadsid1770/droneseg

DroneSeg

A full-stack semantic segmentation platform for drone and aerial imagery. Upload high-resolution drone photos, and the system runs a SegFormer-B2 deep learning model to produce per-pixel land-cover classification masks. Detected regions (buildings, roads, trees, cars, water, sky, and more) are outlined with bounding boxes and color-coded segmentation overlays, then visualized on an interactive map with real-time opacity and confidence controls.

Why DroneSeg?

Manual land-cover analysis from drone footage is time-consuming and subjective. DroneSeg automates it with a transformer-based segmentation model, so the same image always yields the same result. Beyond raw inference, the platform provides:

  • Geospatial context: GPS metadata embedded in drone photos is extracted to automatically geolocate the image on a map.
  • AI-powered analysis: An optional AI Vision mode (powered by GPT-4o-mini / GPT-4.1-mini) sends the image and its detections to an LLM, which reasons about the scene's spatial layout in natural language.
  • GIS integration: Results export as GeoJSON for direct use in QGIS, ArcGIS, or other GIS tools.
  • Full history: Every detection is persisted and can be recalled later.

Whether you're surveying construction sites, monitoring agricultural land, or analyzing urban environments, DroneSeg turns raw aerial imagery into structured, actionable data.
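
To make the geolocation step concrete: EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere reference, so they have to be converted to signed decimal degrees before an image can be placed on a map. The sketch below shows that conversion only. The function name `dms_to_decimal` and the sample values are illustrative assumptions, not code or data from this repository, and reading the tags from a file is left out.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) plus a hemisphere
    reference ('N', 'S', 'E', 'W') to signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(v) for v in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical values, shaped like the rationals found in GPS tags.
lat = dms_to_decimal((48, 51, Fraction(2964, 100)), "N")
lon = dms_to_decimal((2, 17, Fraction(4014, 100)), "E")
print(round(lat, 6), round(lon, 6))
```

Using `Fraction` keeps the arithmetic exact until the final conversion to `float`, which mirrors the fact that EXIF stores these values as rationals.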

Features

  • Semantic segmentation — SegFormer-B2 model (trained on ADE20K, 150 classes) for pixel-level land-cover classification
  • Interactive map viewer — MapLibre GL / OpenStreetMap with synchronized drone raster overlays, segmentation masks, and bounding boxes
  • Real-time filtering — Adjust mask opacity and confidence threshold to focus on relevant detections
  • AI Vision mode — Natural-language spatial analysis via OpenAI GPT-4o-mini / GPT-4.1-mini
  • GPS auto-detection — Reads EXIF/XMP metadata from DJI drone photos to automatically geolocate and orient images
  • Detection history — Persistent storage and retrieval of all past analyses
  • GeoJSON export — Export detections as GeoJSON FeatureCollection for GIS workflows
  • RESTful API — Well-documented endpoints for upload, detection, history, and export
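
As a rough illustration of how a confidence value and the confidence threshold fit together, the sketch below applies softmax to one pixel's class logits and then filters a detection list by a threshold. It is a simplified assumption, not the project's code: how DroneSeg aggregates per-pixel probabilities into a per-detection confidence is not described here, and the names `classify_pixel` and `filter_detections` and the dictionary fields are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_pixel(logits):
    """Return (class_index, confidence) for a single pixel."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


def filter_detections(detections, threshold):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


# Hypothetical detections; the field names are assumptions.
detections = [
    {"label": "building", "confidence": 0.91},
    {"label": "tree", "confidence": 0.42},
]
print(classify_pixel([0.5, 2.0, -1.0]))
print(filter_detections(detections, 0.5))
```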

Tech Stack

Backend

| Package | Version | Role |
|---|---|---|
| Python | 3.11 | Runtime |
| FastAPI | 0.111 | Async REST framework |
| Uvicorn | 0.29 | ASGI server |
| HuggingFace Transformers | 4.40 | SegFormer-B2 model |
| PyTorch | 2.3 | Inference engine |
| OpenCV (headless) | 4.9 | Connected-component bbox extraction |
| Pillow | 10.3 | Image I/O and mask colorization |
| aiosqlite | 0.20 | Async SQLite database |
| OpenAI | 1.30 | AI Vision mode client |

Frontend

| Package | Version | Role |
|---|---|---|
| Next.js | 16 (App Router) | React framework |
| TypeScript | 5 | Type safety |
| React | 19 | UI library |
| MapLibre GL JS | 5.24 | Interactive map engine |
| react-map-gl | 8 | React bindings for MapLibre |
| Tailwind CSS | 4 | Utility-first styling |
| Radix UI | 1.3 | Accessible slider component |
| Lucide React | | Icon library |
| Axios | 1.15 | HTTP client |

Infrastructure

  • Docker / Docker Compose for containerized backend deployment
  • Multi-stage Docker build for minimal image size

Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Frontend (Next.js)                    │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │ Upload   │  │ Map      │  │ Detection│  │ History  │   │
│  │ Zone     │  │ Viewer   │  │ Panel    │  │ Drawer   │   │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
│       └──────────────┴─────────────┴──────────────┘        │
│                          │ Axios HTTP                        │
└──────────────────────────┼──────────────────────────────────┘
                           │
┌──────────────────────────┼──────────────────────────────────┐
│                    Backend (FastAPI)                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │ Upload   │  │ Detect   │  │ History  │  │ Export   │   │
│  │ Router   │  │ Router   │  │ Router   │  │ Router   │   │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
│       │              │              │              │        │
│  ┌────┴────┐  ┌──────┴──────┐  ┌───┴────┐  ┌────┴─────┐  │
│  │Metadata │  │ SegFormer   │  │ History│  │ GeoJSON  │  │
│  │Service  │  │ Service     │  │ Repo   │  │ Service  │  │
│  └─────────┘  └──────┬──────┘  └───┬────┘  └──────────┘  │
│                      │              │                      │
│                 ┌────┴────┐   ┌────┴────┐                 │
│                 │ PyTorch │   │ SQLite  │                 │
│                 │ Model   │   │ (async) │                 │
│                 └─────────┘   └─────────┘                 │
└────────────────────────────────────────────────────────────┘

Flow

  1. Upload: A drone JPEG/PNG is uploaded via the frontend or API. GPS coordinates and the image footprint are extracted from EXIF/XMP metadata, and the image is registered in SQLite.
  2. Detect: An inference request triggers a SegFormer-B2 forward pass on GPU or CPU. Logits are upsampled to the original resolution, softmax is applied, and each pixel is labeled with its highest-probability class. Connected-component analysis then extracts bounding boxes per class.
  3. Visualize: The segmentation mask (colorized RGBA PNG) and detection list (class, confidence, bbox, pixel area) are returned. The frontend projects pixel coordinates to geographic coordinates using the image's GPS bounds and renders everything on a MapLibre map.
  4. Analyze: Optionally, the image and detections can be sent to an LLM (GPT-4o-mini or GPT-4.1-mini) for natural-language spatial reasoning.
  5. Export: Detections can be exported as GeoJSON for use in GIS applications.
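
The bounding-box extraction in step 2 can be pictured with a small pure-Python flood fill over a grid of class labels. This is only an illustrative stand-in: the backend lists OpenCV for this job (where `cv2.connectedComponentsWithStats` is the usual call), and the function name `component_bboxes`, the choice of 4-connectivity, and the toy mask are assumptions rather than code from the repository.

```python
from collections import deque


def component_bboxes(mask, target):
    """Return (x_min, y_min, x_max, y_max, area) for each 4-connected
    region of `target` in a 2D grid of class labels."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if mask[y0][x0] != target or seen[y0][x0]:
                continue
            # Flood-fill one region, tracking its extent and pixel count.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            x_min = x_max = x0
            y_min = y_max = y0
            area = 0
            while queue:
                x, y = queue.popleft()
                area += 1
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and mask[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max, area))
    return boxes


# Tiny hypothetical label mask: 1 = building, 0 = background.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
print(component_bboxes(mask, 1))
```

Each tuple carries the same kind of information the detection list reports per object: a pixel-space box and a pixel area. On real images a vectorized library routine would be far faster than this loop.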

Getting Started

Backend

cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Copy .env.example to .env and configure:

cp .env.example .env
# Edit .env if needed (defaults work for local development)

Run the server:

uvicorn main:app --reload --host 0.0.0.0 --port 8000

The first run will download SegFormer-B2 weights (~85 MB) from HuggingFace Hub. The SQLite database is created automatically.

Interactive API docs are available at http://localhost:8000/docs.
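
Once the server is running, the API can also be called from a script. The following is a minimal sketch using only the Python standard library, assuming the JSON body form of POST /api/detect listed under API Endpoints. The helper name `build_detect_request` and the image id value are placeholders (the id's actual type is not specified here), and the response is not parsed because its shape is not documented in this README.

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # address used by the uvicorn command above


def build_detect_request(image_id):
    """Build (but do not send) a POST /api/detect request."""
    body = json.dumps({"image_id": image_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}/api/detect",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_detect_request("example-image-id")  # placeholder id
print(request.get_method(), request.full_url)

# With the backend running, the request could then be sent like this:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```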

Frontend

cd frontend
npm install

Create .env.local:

NEXT_PUBLIC_API_URL=http://localhost:8000
OPENAI_API_KEY=sk-...           # only required for AI Vision mode

Run the development server:

npm run dev

Open http://localhost:3000 in your browser.

Docker (Backend Only)

cd backend
docker compose up --build

API Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /api/upload | Upload a drone image (JPEG/PNG, max 50 MB) |
| POST | /api/detect | Run segmentation (JSON {image_id} or multipart form with file) |
| GET | /api/images | List all registered images |
| GET | /api/images/{id} | Serve original image bytes |
| GET | /api/masks/{filename} | Serve segmentation mask PNG |
| GET | /api/history | Paginated detection history |
| DELETE | /api/history/{id} | Delete a detection record |
| GET | /api/export/geojson/{id} | GeoJSON FeatureCollection export |
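
Because detections are measured in pixels while GeoJSON expects longitude and latitude, an export has to project each bounding box through the image's GPS bounds. The sketch below shows one way that could look. It assumes a north-up image small enough for plain linear interpolation, and the function names, field names, and sample bounds are hypothetical; the actual export format produced by the endpoint may differ.

```python
import json


def pixel_to_lonlat(x, y, width, height, bounds):
    """Linearly map a pixel (x, y) to (lon, lat).

    `bounds` is (west, south, east, north); pixel row 0 is the north edge.
    """
    west, south, east, north = bounds
    lon = west + (x / width) * (east - west)
    lat = north - (y / height) * (north - south)
    return lon, lat


def bbox_to_feature(detection, width, height, bounds):
    """Wrap one pixel-space bbox as a GeoJSON Polygon Feature."""
    x0, y0, x1, y1 = detection["bbox"]
    # Closed ring, counterclockwise in lon/lat as RFC 7946 recommends.
    corners = [(x0, y0), (x0, y1), (x1, y1), (x1, y0), (x0, y0)]
    ring = [list(pixel_to_lonlat(x, y, width, height, bounds)) for x, y in corners]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {
            "class": detection["class"],
            "confidence": detection["confidence"],
        },
    }


# Hypothetical detection and image bounds for illustration.
detection = {"class": "building", "confidence": 0.9, "bbox": (100, 50, 300, 250)}
bounds = (10.0, 50.0, 10.002, 50.001)
collection = {
    "type": "FeatureCollection",
    "features": [bbox_to_feature(detection, 1000, 500, bounds)],
}
print(json.dumps(collection, indent=2))
```

A rotated or oblique image would need a proper affine or homography transform instead of this axis-aligned interpolation.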

Project Structure

droneseg/
├── backend/
│   ├── main.py                    # FastAPI app entrypoint
│   ├── config.py                  # Environment configuration
│   ├── requirements.txt           # Python dependencies
│   ├── Dockerfile                 # Multi-stage Docker build
│   ├── docker-compose.yml         # Docker Compose definition
│   ├── Makefile                   # Development shortcuts
│   ├── .env.example               # Environment variable template
│   ├── db/
│   │   ├── database.py            # SQLite connection pool
│   │   └── history_repo.py        # Database access layer
│   ├── models/
│   │   └── schemas.py             # Pydantic request/response models
│   ├── routers/
│   │   ├── upload.py              # Image upload endpoint
│   │   ├── detect.py              # Detection inference endpoint
│   │   ├── images.py              # Image listing and serving
│   │   ├── history.py             # Detection history CRUD
│   │   └── export.py              # GeoJSON export endpoint
│   └── services/
│       ├── segformer_service.py   # SegFormer-B2 inference pipeline
│       ├── metadata_service.py    # EXIF/XMP GPS extraction
│       └── geojson_service.py     # GeoJSON feature builder
├── frontend/
│   ├── package.json               # Node dependencies
│   ├── README.md                  # Frontend-specific docs
│   └── src/
│       ├── app/
│       │   └── page.tsx           # Main application page
│       ├── components/
│       │   ├── BBoxOverlay.tsx     # Bounding box SVG overlay
│       │   ├── ConfidenceSlider.tsx # Confidence threshold control
│       │   ├── DetectionPanel.tsx  # Right sidebar with detections
│       │   ├── LLMModelSelector.tsx # Model selection dropdown
│       │   ├── MapViewer.tsx       # MapLibre GL map wrapper
│       │   ├── SegMaskOverlay.tsx  # Segmentation mask raster layer
│       │   └── UploadZone.tsx      # Drag-and-drop upload component
│       ├── lib/
│       │   └── api.ts             # API client functions
│       └── types/
│           └── detection.ts       # TypeScript type definitions
└── docs/
    ├── DroneSeg_SRS_v1.0.docx     # Software Requirements Specification
    └── fixes.md                   # Bug fix documentation
