digvijayforreal/crm-lead-intelligence

CRM Lead Intelligence System — Setup Guide

Project Structure

crm-lead-intelligence/
├── leads_db.csv          # Dataset (simulated CRM data)
├── colab_training.py     # ML training pipeline (run in Google Colab)
├── model.pkl             # Trained XGBoost model (generated by Colab)
├── label_encoder.pkl     # Industry label encoder (generated by Colab)
├── main.py               # FastAPI backend
├── requirements.txt      # Python dependencies
└── README.md

Step 1 — Train the model in Google Colab

  1. Open Google Colab
  2. Upload leads_db.csv and colab_training.py
  3. Run each cell in order
  4. In the last cell, download model.pkl and label_encoder.pkl
  5. Place both files in the same folder as main.py
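Before starting the backend, it can help to confirm that both downloaded files actually landed next to main.py. The following is a minimal sketch using only the standard library; the helper name missing_artifacts is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the project structure above.
REQUIRED_ARTIFACTS = ("model.pkl", "label_encoder.pkl")


def missing_artifacts(folder="."):
    """Return the required artifact names that are absent from `folder`.

    Hypothetical helper: it only checks that the files exist, not that
    they contain a valid model or encoder.
    """
    base = Path(folder)
    return [name for name in REQUIRED_ARTIFACTS if not (base / name).is_file()]
```

Calling missing_artifacts() from the folder that holds main.py should return an empty list once step 5 is done.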

Step 2 — Run the FastAPI backend locally

# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate        # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start the server
uvicorn main:app --reload --port 8000

Server runs at: http://localhost:8000

Interactive API docs: http://localhost:8000/docs


Step 3 — Test the endpoints

GET /leads (auto pipeline)

curl http://localhost:8000/leads

Returns all 100 leads from leads_db.csv, each scored by the ML model.
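The same call can be made from Python. This sketch assumes the response body is a JSON array of lead objects that each carry a "category" field; the actual shape returned by main.py may differ, so adjust the key access if needed. The function names fetch_leads and count_by_category are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_leads(base_url="http://localhost:8000"):
    """GET /leads and decode the JSON body (requires the server to be running)."""
    with urlopen(f"{base_url}/leads", timeout=10) as resp:
        return json.load(resp)


def count_by_category(leads):
    """Count leads per category.

    Assumption: `leads` is a list of dicts, each with a "category" key.
    """
    return Counter(lead["category"] for lead in leads)
```

With the server running, count_by_category(fetch_leads()) gives a quick breakdown of how many leads fall into each category.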

POST /predict (manual input)

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Jane Doe",
    "company": "TechCorp",
    "industry": "Technology",
    "num_calls": 7,
    "email_opens": 15,
    "website_visits": 40
  }'

Returns:

{
  "name": "Jane Doe",
  "company": "TechCorp",
  "industry": "Technology",
  "score": 0.9981,
  "category": "High",
  "insight": "High-potential lead from Technology. Immediate follow-up recommended."
}
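For scripted use, the curl request above can be reproduced with Python's standard library. This is a sketch only; build_predict_request and predict are hypothetical names, and the payload fields are the ones shown in the curl example.

```python
import json
from urllib.request import Request, urlopen


def build_predict_request(lead, base_url="http://localhost:8000"):
    """Build the POST /predict request with a JSON body."""
    body = json.dumps(lead).encode("utf-8")
    return Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(lead, base_url="http://localhost:8000"):
    """Send the request and decode the JSON reply (requires a running server)."""
    with urlopen(build_predict_request(lead, base_url), timeout=10) as resp:
        return json.load(resp)
```

Passing the Jane Doe payload from the curl example to predict() should produce a reply in the format shown above, provided the server is running on port 8000.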

GET /industries

curl http://localhost:8000/industries

API Response Format

Every scored lead returns:

Field      Type     Description
score      float    Conversion probability (0.0–1.0)
category   string   "High" / "Medium" / "Low"
insight    string   Human-readable AI recommendation

Score thresholds:

  • High → score > 0.80 (immediate follow-up)
  • Medium → 0.50 < score ≤ 0.80 (nurture campaign)
  • Low → score ≤ 0.50 (monitor passively)
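The thresholds can be expressed as a small function. This is an illustrative sketch of the mapping described here, not the code in main.py; the name categorize and the range check are assumptions.

```python
def categorize(score):
    """Map a conversion probability to "High", "Medium", or "Low".

    Boundaries follow the thresholds listed above: exactly 0.80 is
    "Medium" and exactly 0.50 is "Low".
    """
    if not 0.0 <= score <= 1.0:
        # Assumption: scores are probabilities, so anything else is rejected.
        raise ValueError(f"score must be within [0.0, 1.0], got {score!r}")
    if score > 0.80:
        return "High"
    if score > 0.50:
        return "Medium"
    return "Low"
```

For the example response in Step 3, categorize(0.9981) yields "High", matching the category field shown there.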

Supported Industries

  • Technology
  • Finance
  • Healthcare
  • Education
  • Manufacturing
  • Retail

Model Details

  • Algorithm: XGBoost (XGBClassifier)
  • Features: industry, num_calls, email_opens, website_visits
  • Training: 80 leads | Test: 20 leads | CV: 5-fold
  • Test accuracy: ~95% | ROC AUC: ~0.99
  • Trained in: Google Colab
  • Inference: FastAPI backend (loads model.pkl at startup)
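To illustrate how a lead might be turned into model input, the sketch below builds a feature row in the order listed under Features. It assumes the label encoder behaves like scikit-learn's LabelEncoder, which assigns integer codes to classes in sorted order; the real column order and encoding are defined in colab_training.py and may differ. The names encode_industry and to_feature_row are hypothetical.

```python
# Supported industries from this README, sorted to mimic the integer codes
# a scikit-learn LabelEncoder would assign (assumption, see lead-in).
INDUSTRIES = sorted(
    ["Technology", "Finance", "Healthcare", "Education", "Manufacturing", "Retail"]
)


def encode_industry(name):
    """Return the integer code for a supported industry."""
    try:
        return INDUSTRIES.index(name)
    except ValueError:
        raise ValueError(f"Unsupported industry: {name!r}") from None


def to_feature_row(lead):
    """Build [industry, num_calls, email_opens, website_visits] for one lead."""
    return [
        encode_industry(lead["industry"]),
        lead["num_calls"],
        lead["email_opens"],
        lead["website_visits"],
    ]
```

A row built this way is the kind of input a classifier's predict_proba method would typically receive, wrapped in a list or array of rows.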
