crm-lead-intelligence/
├── leads_db.csv # Dataset (simulated CRM data)
├── colab_training.py # ML training pipeline (run in Google Colab)
├── model.pkl # Trained XGBoost model (generated by Colab)
├── label_encoder.pkl # Industry label encoder (generated by Colab)
├── main.py # FastAPI backend
├── requirements.txt # Python dependencies
└── README.md
- Open Google Colab
- Upload leads_db.csv and colab_training.py
- Run each cell in order
- In the last cell, download model.pkl and label_encoder.pkl
- Place both files in the same folder as main.py
# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Start the server
uvicorn main:app --reload --port 8000
Server runs at: http://localhost:8000
Interactive API docs: http://localhost:8000/docs
curl http://localhost:8000/leads
Returns all 100 leads from leads_db.csv, each scored by the ML model.
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{
"name": "Jane Doe",
"company": "TechCorp",
"industry": "Technology",
"num_calls": 7,
"email_opens": 15,
"website_visits": 40
}'
Returns:
{
"name": "Jane Doe",
"company": "TechCorp",
"industry": "Technology",
"score": 0.9981,
"category": "High",
"insight": "High-potential lead from Technology. Immediate follow-up recommended."
}
curl http://localhost:8000/industries
Every scored lead returns:
| Field | Type | Description |
|---|---|---|
| score | float | Conversion probability (0.0–1.0) |
| category | string | "High" / "Medium" / "Low" |
| insight | string | Human-readable AI recommendation |
Score thresholds:
- High → score > 0.80 (immediate follow-up)
- Medium → 0.50 < score ≤ 0.80 (nurture campaign)
- Low → score ≤ 0.50 (monitor passively)
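
The thresholds above can be sketched as a small function. This is an illustration only; the name `categorize` is hypothetical and may not match what main.py actually uses.

```python
def categorize(score: float) -> str:
    """Map a conversion probability to a lead category.

    Hypothetical helper: the thresholds come from the list above,
    the function name and the range check are assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    if score > 0.80:
        return "High"    # immediate follow-up
    if score > 0.50:
        return "Medium"  # nurture campaign
    return "Low"         # monitor passively


for s in (0.9981, 0.80, 0.50):
    print(s, categorize(s))
```

Note that the boundaries are exclusive on the upper category: a score of exactly 0.80 is Medium, and exactly 0.50 is Low.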
- Technology
- Finance
- Healthcare
- Education
- Manufacturing
- Retail
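
The POST /predict call can also be prepared from Python. The sketch below uses only the standard library and checks the industry against the six values listed above before building the request. The helper name `build_predict_request` is hypothetical, and whether the server itself rejects other industries is an assumption.

```python
import json
import urllib.request

# The six industries listed above.
SUPPORTED_INDUSTRIES = {
    "Technology", "Finance", "Healthcare",
    "Education", "Manufacturing", "Retail",
}


def build_predict_request(
    lead: dict, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build (but do not send) a POST /predict request for one lead."""
    if lead["industry"] not in SUPPORTED_INDUSTRIES:
        raise ValueError(f"unsupported industry: {lead['industry']!r}")
    body = json.dumps(lead).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


lead = {
    "name": "Jane Doe",
    "company": "TechCorp",
    "industry": "Technology",
    "num_calls": 7,
    "email_opens": 15,
    "website_visits": 40,
}
request = build_predict_request(lead)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Validating on the client side gives a clear error before any network call is made, which may be easier to debug than a server-side failure.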
- Algorithm: XGBoost (XGBClassifier)
- Features: industry, num_calls, email_opens, website_visits
- Training: 80 leads | Test: 20 leads | CV: 5-fold
- Test accuracy: ~95% | ROC AUC: ~0.99
- Trained in: Google Colab
- Inference: FastAPI backend (loads model.pkl at startup)
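
To illustrate how a lead might be turned into model input, the sketch below builds a feature row in plain Python. It rests on two assumptions that the project files would need to confirm: that label_encoder.pkl is a scikit-learn LabelEncoder (which assigns integer codes in sorted order of the class labels), and that the feature columns follow the order given in the Features line above. The names `INDUSTRY_CODES` and `to_feature_row` are hypothetical.

```python
INDUSTRIES = [
    "Technology", "Finance", "Healthcare",
    "Education", "Manufacturing", "Retail",
]

# Assumption: mimics a scikit-learn LabelEncoder fitted on these labels,
# which numbers classes in sorted order (Education=0 ... Technology=5).
INDUSTRY_CODES = {name: code for code, name in enumerate(sorted(INDUSTRIES))}


def to_feature_row(lead: dict) -> list:
    """Return [industry_code, num_calls, email_opens, website_visits].

    Assumption: column order matches the Features line in this README.
    """
    return [
        INDUSTRY_CODES[lead["industry"]],
        lead["num_calls"],
        lead["email_opens"],
        lead["website_visits"],
    ]


row = to_feature_row({
    "industry": "Technology",
    "num_calls": 7,
    "email_opens": 15,
    "website_visits": 40,
})
print(row)
```

In the real backend, the loaded encoder and model would replace the hand-built mapping; this sketch only shows the shape of the data.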