A BERT-based bot detection service built with FastAPI. The API analyzes a list of texts and predicts whether they were written by an AI bot or a human user.
- BERT-based inference: Uses a fine-tuned mBERT model from Hugging Face for classification
- Mean probability prediction: Aggregates individual text scores and applies a configurable threshold
- Resource management: Limits CPU core usage for consistent performance
- Input validation: Configurable minimum text count for predictions
- Interactive API docs: Automatically generated Swagger UI for easy testing
- Clone the repository:
git clone <repo-url>
cd AI-powered-bot-detection-API
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
All configuration parameters are defined in config/constants.py:
- MIN_TEXT_COUNT (default: 20): Minimum number of texts required for prediction
- BOT_PROBABILITY_THRESHOLD (default: 0.4): Threshold for binary classification (0-1)
- MODEL_NAME (default: trokhymovych/mbert-ai-bot-detector): Hugging Face repo id or local model path
- MAX_CPU_CORES (default: 4): Maximum CPU cores torch will use
- BATCH_SIZE (default: 32): Batch size for model processing
Edit these values to customize the API behavior.
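
As a rough illustration of how these parameters could be resolved, the sketch below applies the documented defaults and lets environment variables override them. The README only shows MODEL_NAME and BOT_PROBABILITY_THRESHOLD being set through the environment; extending that to the other parameters, and the helper name `load_config`, are assumptions for this example and not necessarily how config/constants.py is written.

```python
import os

# Documented defaults from the parameter list above.
DEFAULTS = {
    "MIN_TEXT_COUNT": 20,
    "BOT_PROBABILITY_THRESHOLD": 0.4,
    "MODEL_NAME": "trokhymovych/mbert-ai-bot-detector",
    "MAX_CPU_CORES": 4,
    "BATCH_SIZE": 32,
}


def load_config(env=None):
    """Return the settings, letting values in `env` override the defaults.

    Hypothetical helper: the real config/constants.py may work differently.
    """
    env = os.environ if env is None else env
    config = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        # Environment values are strings, so cast to the type of the default.
        config[name] = default if raw is None else type(default)(raw)
    if not 0.0 <= config["BOT_PROBABILITY_THRESHOLD"] <= 1.0:
        raise ValueError("BOT_PROBABILITY_THRESHOLD must be between 0 and 1")
    return config
```

Passing an explicit mapping, for example `load_config({"BATCH_SIZE": "8"})`, makes it easy to try different settings without touching the process environment.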
The model comes from Hugging Face:
trokhymovych/mbert-ai-bot-detector
For development, you can reference the Hugging Face repo id directly. Transformers will download and cache it automatically on first startup:
MODEL_NAME=trokhymovych/mbert-ai-bot-detector python main.py
For the server, download the model once and run from the local copy so restarts do not depend on Hugging Face availability:
huggingface-cli download trokhymovych/mbert-ai-bot-detector --local-dir models/mbert-ai-bot-detector
MODEL_NAME=models/mbert-ai-bot-detector python main.py
The deployed server .env should use:
MODEL_NAME=models/mbert-ai-bot-detector
BOT_PROBABILITY_THRESHOLD=0.4
If MODEL_NAME points to a local path (for example data/mbert_trained), that directory must exist.
Start the FastAPI server:
python main.py
The API will be available at http://localhost:8000
- Swagger UI: http://localhost:8000/docs
Predict whether a list of texts was written by an AI bot.
Request:
{
"texts": [
"This is the first text.",
"This is the second text.",
"..."
]
}
Response:
{
"is_bot": false,
"confidence": 0.2345,
"text_scores": [0.1234, 0.2456, ...],
"num_texts": 3
}
Parameters:
- texts (required): List of texts to analyze (minimum 20 texts required)
- is_bot (boolean): Binary prediction (True = bot, False = human)
- confidence (float): Mean probability score (0-1)
- text_scores (array): Individual scores for each text provided
- num_texts (integer): Number of texts processed
Error Responses:
- 400 Bad Request: Too few texts (< 20)
- 422 Unprocessable Entity: Invalid input format
- 500 Internal Server Error: Model inference failed
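
A client for this endpoint could look roughly like the sketch below, which uses only the standard library. The route and HTTP method are not stated in this section, so the URL is passed in by the caller and POST is an assumption (check the Swagger UI for the actual route); `build_predict_request` and `predict` are hypothetical helper names.

```python
import json
import urllib.request

MIN_TEXT_COUNT = 20  # documented default minimum


def build_predict_request(url, texts):
    """Build (but do not send) a JSON request with the documented body shape."""
    if len(texts) < MIN_TEXT_COUNT:
        # Fail early instead of waiting for the server's 400 response.
        raise ValueError(f"need at least {MIN_TEXT_COUNT} texts, got {len(texts)}")
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the prediction route accepts POST
    )


def predict(url, texts, timeout=30.0):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(url, texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running server, `predict(url, texts)` should return a dict containing the is_bot, confidence, text_scores, and num_texts fields described above; error statuses such as 400 or 500 surface as `urllib.error.HTTPError`.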
Health check endpoint.
Response:
{
"status": "healthy",
"model_loaded": true
}
Get current model information.
Response:
{
"model_name": "<>",
"threshold": 0.4
}
/
├── config/
│ ├── __init__.py
│ └── constants.py # Configuration parameters
├── models/
│ ├── __init__.py
│ └── bot_detector.py # Core BERT inference logic
├── api/
│ ├── __init__.py
│ └── data_models.py # Pydantic models for request/response validation
├── utils/
│ ├── __init__.py
│ └── exceptions.py # Custom exceptions
├── main.py # FastAPI app and endpoint definitions
├── requirements.txt # Python dependencies
├── README.md # This file
└── .gitignore # Git ignore rules
- Input Validation: Texts are validated (minimum 20 required)
- Inference: Model runs inference on batches of texts
- Probability Extraction: Positive class probabilities are extracted
- Aggregation: Mean probability is calculated across all texts
- Classification: The binary label is determined by comparing the mean probability with the configurable threshold
- Response: Results are returned with individual scores and aggregated prediction
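
The validation, aggregation, and classification steps above can be sketched in plain Python, starting from per-text probabilities that the model has already produced. The function name `aggregate_scores` is hypothetical, and treating a mean equal to the threshold as a bot (`>=` rather than `>`) is an assumption; the actual service may differ.

```python
from statistics import fmean


def aggregate_scores(text_scores, threshold=0.4, min_text_count=20):
    """Turn per-text bot probabilities into the documented response fields."""
    if len(text_scores) < min_text_count:
        raise ValueError(
            f"need at least {min_text_count} texts, got {len(text_scores)}"
        )
    # Aggregation: mean probability across all texts.
    confidence = fmean(text_scores)
    return {
        # Classification: assumed to be inclusive at the threshold.
        "is_bot": confidence >= threshold,
        "confidence": confidence,
        "text_scores": list(text_scores),
        "num_texts": len(text_scores),
    }
```

Because the decision is made on the mean, a few high-scoring texts do not flip the label on their own; the threshold controls how much of the input has to look bot-like overall.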
The API limits CPU core usage using PyTorch's torch.set_num_threads() to the value specified in MAX_CPU_CORES. This prevents resource exhaustion when running multiple inference requests.
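
A minimal sketch of that thread limit follows. `torch.set_num_threads()` is the call named above; clamping the value to the number of cores the machine reports is an extra safeguard assumed for this example, and both helper names are hypothetical.

```python
import os


def resolve_thread_count(max_cpu_cores=4, available=None):
    """Pick a thread count no larger than max_cpu_cores or the machine's cores."""
    if available is None:
        available = os.cpu_count()
    # os.cpu_count() can return None, so fall back to a single core.
    available = available or 1
    return max(1, min(max_cpu_cores, available))


def limit_torch_threads(max_cpu_cores=4):
    """Apply the limit to PyTorch's intra-op thread pool and return it."""
    import torch  # imported here so resolve_thread_count works without PyTorch

    threads = resolve_thread_count(max_cpu_cores)
    torch.set_num_threads(threads)
    return threads
```

Calling `limit_torch_threads(MAX_CPU_CORES)` once at startup, before the first inference, keeps the setting consistent for every request.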
- Batch Processing: Texts are processed in batches (default: 32) for efficiency
- CPU Limiting: Torch is restricted to at most MAX_CPU_CORES cores
- Model Loading: The model is loaded once at server startup, so individual requests do not pay the loading cost
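
Batch processing of this kind can be illustrated with a small generator that slices the input into chunks of at most BATCH_SIZE texts. The name `iter_batches` is hypothetical and the model call is left out; this only shows how the texts would be grouped.

```python
def iter_batches(texts, batch_size=32):
    """Yield consecutive slices of `texts` with at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]
```

For 70 texts and the default batch size, this yields two full batches of 32 and a final batch of 6. A smaller batch size lowers peak memory use, while a larger one usually improves throughput.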
If you encounter memory issues:
- Reduce BATCH_SIZE in constants.py
- Reduce MAX_CPU_CORES if running multiple instances
If throughput is too low:
- Increase BATCH_SIZE for better throughput
- Ensure MAX_CPU_CORES is set appropriately for your hardware