Three comprehensive datasets are included for testing and demonstration:
File: sample_datasets/customer_support.json
Size: 50 Q&A pairs
Use Case: Customer service chatbot training
Topics Covered:
- Greetings and basic interactions
- Product information and specifications
- Order tracking and shipping
- Returns and refunds
- Account management
- Payment and billing
- Technical support
- Policies and procedures
Example:
```json
{
  "input": "How do I reset my password?",
  "output": "To reset your password, click 'Forgot Password' on the login page..."
}
```

File: sample_datasets/tech_qa.json
Size: 30 Q&A pairs
Use Case: Programming assistant training
Topics Covered:
- Python programming
- JavaScript development
- Git version control
- Web development
- Database management
- API development
- Debugging techniques
- Best practices
Example:
```json
{
  "input": "How do I create a virtual environment in Python?",
  "output": "Use `python -m venv venv` to create a virtual environment..."
}
```

File: sample_datasets/conversation.json
Size: 25 Q&A pairs
Use Case: General conversational AI training
Topics Covered:
- Greetings and farewells
- Small talk and chitchat
- Emotional responses
- AI capabilities and limitations
- Humor and personality
- Helpful assistance
Example:
```json
{
  "input": "Tell me a joke",
  "output": "Why don't scientists trust atoms? Because they make up everything! 😄"
}
```

Run the quick test suite:

```bash
python test_suite.py
```

Tests:
- Health check
- LM Studio connectivity
- Model listing
- Chat completion
- Teacher/Critic evaluation
- Dataset management
Output: HTML report in `test_reports/`
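For orientation, a standalone version of the suite's health check might look like the sketch below; the `/health` route is an assumption, not confirmed by this document, so adjust it to the backend's actual endpoint.

```python
# health_check.py -- minimal standalone backend probe.
# Assumes the backend exposes a /health endpoint on port 8000;
# the route is an assumption, adjust to your backend's actual routes.
import sys

import requests

BACKEND_URL = "http://localhost:8000"

def check_health(timeout: float = 5.0) -> bool:
    """Return True if the backend answers the health probe."""
    try:
        resp = requests.get(f"{BACKEND_URL}/health", timeout=timeout)
        return resp.status_code == 200
    except requests.RequestException as exc:
        print(f"Backend unreachable: {exc}")
        return False

if __name__ == "__main__":
    ok = check_health()
    print("Backend healthy" if ok else "Backend check failed")
    sys.exit(0 if ok else 1)
```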
Run the comprehensive test:

```bash
python comprehensive_test.py
```

Tests:
- All API endpoints
- Multiple chat scenarios
- Teacher/Critic with various inputs
- Dataset upload for all sample files
- System statistics
- Performance benchmarking
Output:
- Detailed console output with color-coded results
- Comprehensive HTML performance report
- Performance metrics and recommendations

Functional coverage:

- ✅ Backend availability
- ✅ LM Studio connectivity
- ✅ Model availability
- ✅ Model listing from LM Studio
- ✅ Chat completion with various prompts
- ✅ Teacher/Critic evaluation
- ✅ Dataset upload and listing
- ✅ System resource monitoring

Performance metrics:

- ⚡ API response times (avg, min, max, median)
- ⚡ Chat response times
- ⚡ Model load times
- ⚡ Dataset upload times
- ⚡ Success rate percentage
- ⚡ Throughput (requests/second)
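As a sketch of how these figures can be gathered, the snippet below times repeated requests and aggregates them with the standard `statistics` module; the endpoint and request count are placeholders.

```python
# Sketch: collect response times and derive the metrics listed above.
import statistics
import time

import requests

URL = "http://localhost:8000/health"  # placeholder endpoint
N_REQUESTS = 20

timings = []
successes = 0
start = time.perf_counter()
for _ in range(N_REQUESTS):
    t0 = time.perf_counter()
    try:
        resp = requests.get(URL, timeout=10)
        successes += resp.status_code == 200
    except requests.RequestException:
        pass  # a failed request still counts toward timings and totals
    timings.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

print(f"avg:    {statistics.mean(timings):.3f}s")
print(f"min:    {min(timings):.3f}s")
print(f"max:    {max(timings):.3f}s")
print(f"median: {statistics.median(timings):.3f}s")
print(f"success rate: {successes / N_REQUESTS:.1%}")
print(f"throughput:   {N_REQUESTS / elapsed:.2f} req/s")
```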
The comprehensive test generates an HTML report with:
- Total requests
- Successful requests
- Failed requests
- Success rate percentage
- API response time statistics
- Chat completion performance
- Model loading performance
- Dataset upload performance
- Success rate progress bar
- Performance metric tables
- Feature badges
- Error reporting
- Performance optimization tips
- Troubleshooting guidance
- System upgrade suggestions
- Best practices
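The actual reports come from `comprehensive_test.py`; purely as an illustration of how the success-rate section of such a report can be assembled, a stripped-down generator might look like this:

```python
# Sketch: render a minimal HTML report with a success-rate bar.
from pathlib import Path

def render_report(total: int, failed: int) -> str:
    ok = total - failed
    rate = 100.0 * ok / total if total else 0.0
    return f"""<html><body>
<h1>Test Report</h1>
<p>Total: {total} | Successful: {ok} | Failed: {failed}</p>
<div style="background:#eee;width:300px">
  <div style="background:#4caf50;width:{rate:.0f}%">&nbsp;{rate:.1f}%</div>
</div>
</body></html>"""

Path("test_reports").mkdir(exist_ok=True)
Path("test_reports/mini_report.html").write_text(render_report(50, 3))
```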
To upload a sample dataset via the UI:

- Start FineTuneLite (`start.bat` or `start.sh`)
- Navigate to http://localhost:3000/datasets
- Click "Upload Dataset"
- Select a file from `sample_datasets/`
- View the uploaded dataset in the list
Or via the API:

```bash
# Upload a dataset
curl -X POST http://localhost:8000/datasets/upload \
  -F "file=@sample_datasets/customer_support.json"

# List uploaded datasets
curl http://localhost:8000/datasets/
```
Use one of the sample datasets or create your own. Then:

- Go to the Fine-tune page
- Select a base model (IBM Granite 4.0 H Tiny recommended)
- Choose an uploaded dataset
- Set hyperparameters:
  - Epochs: 1-3
  - Batch Size: 1 (for CPU)
  - Learning Rate: 0.0002
  - PEFT Type: LoRA

Click "Start Training" and monitor progress on the Training Jobs page.
Note: Full training implementation is in progress. The current version demonstrates the UI and workflow.
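For orientation only, here is a sketch of how those hyperparameters would typically map onto a Hugging Face `peft` setup. This is a generic LoRA configuration, not FineTuneLite's actual training code, and the model ID is a placeholder:

```python
# Sketch: a typical PEFT/LoRA setup mirroring the UI hyperparameters.
# NOT FineTuneLite's training code (which is still in progress).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Placeholder model ID -- substitute the base model you selected.
model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-4.0-h-tiny")

lora_config = LoraConfig(
    r=8,                  # LoRA rank (a common small-model default)
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="finetune_output",
    num_train_epochs=3,              # Epochs: 1-3
    per_device_train_batch_size=1,   # Batch Size: 1 (for CPU)
    learning_rate=2e-4,              # Learning Rate: 0.0002
)
```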
Datasets use a JSON array of input/output pairs:

```json
[
  {
    "input": "User question or prompt",
    "output": "Expected model response"
  },
  {
    "input": "Another question",
    "output": "Another response"
  }
]
```

Best practices:

- Size: 50-500 examples for fine-tuning
- Quality: Clear, accurate, consistent responses
- Diversity: Cover various scenarios and edge cases
- Format: Consistent structure across all examples
- Balance: Mix of simple and complex examples
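A short script can check a dataset against these guidelines before upload (a sketch; the thresholds mirror the list above):

```python
# Sketch: validate a JSON dataset against the guidelines above.
import json
import sys

def validate(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks fine."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        return ["top-level value must be a JSON array"]
    problems = []
    if not 50 <= len(data) <= 500:
        problems.append(f"{len(data)} examples (50-500 recommended)")
    for i, ex in enumerate(data):
        if not isinstance(ex, dict) or set(ex) != {"input", "output"}:
            problems.append(f"example {i}: keys must be exactly input/output")
        elif not all(isinstance(ex[k], str) and ex[k].strip() for k in ex):
            problems.append(f"example {i}: empty or non-string field")
    return problems

if __name__ == "__main__":
    issues = validate(sys.argv[1])
    print("OK" if not issues else "\n".join(issues))
```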
CSV is also accepted:

```csv
input,output
"Question 1","Answer 1"
"Question 2","Answer 2"
```

Expected performance:

| Operation | Expected Time | Notes |
|---|---|---|
| Health Check | <0.1s | Instant |
| Model List | 0.2-0.5s | Fast |
| Chat (short) | 10-15s | Model-dependent |
| Chat (long) | 15-25s | Model-dependent |
| Evaluation | 10-20s | Dual model calls |
| Dataset Upload | 0.1-1s | Size-dependent |
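To spot-check the chat rows on your own machine, you can time one short completion against LM Studio's OpenAI-compatible server on port 1234 (a sketch; the model name is a placeholder for whichever model you have loaded):

```python
# Sketch: time a single short chat completion via LM Studio.
import time

import requests

payload = {
    "model": "granite-4.0-h-tiny",  # placeholder: use your loaded model
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 50,
}
t0 = time.perf_counter()
resp = requests.post("http://localhost:1234/v1/chat/completions",
                     json=payload, timeout=120)
resp.raise_for_status()
print(f"Chat (short): {time.perf_counter() - t0:.1f}s (expected 10-15s)")
```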
To improve performance:

- Use smaller models for faster responses
- Reduce max_tokens in requests
- Close other CPU-intensive apps
- Use SSD for model storage
- Consider GPU for production
If tests cannot connect:

- Check LM Studio: Ensure it's running on port 1234
- Load Model: At least one model must be loaded
- Backend Running: Verify backend on port 8000
- Network: Check localhost connectivity
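These checks are easy to automate; the sketch below probes both ports with raw TCP connections, so it works regardless of the HTTP routes:

```python
# Sketch: verify the two local ports the stack depends on.
import socket

SERVICES = {"LM Studio": 1234, "Backend": 8000}

for name, port in SERVICES.items():
    try:
        with socket.create_connection(("localhost", port), timeout=2):
            print(f"{name}: listening on port {port}")
    except OSError:
        print(f"{name}: NOT reachable on port {port}")
```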
If responses are slow:

- CPU Usage: Close other applications
- Model Size: Try smaller models
- System Resources: Check RAM availability
- Temperature: Ensure CPU isn't thermal throttling
If dataset uploads fail:

- Format: Verify JSON/CSV format is correct
- Size: Check file isn't too large (>100MB)
- Permissions: Ensure write access to `data/uploads/`
- Backend: Check backend logs for errors
Further documentation:

- Setup Guide: `SETUP_GUIDE.md`
- Deployment: `DEPLOYMENT.md`
- Architecture: `docs/architecture.md`
- Demo Script: `docs/demo_script.md`
Before testing:
- LM Studio installed and running
- IBM Granite 4.0 H Tiny downloaded
- LM Studio Local Server started (port 1234)
- Backend running (`python -m uvicorn main:app --reload`)
- Frontend running (`npm run dev`)
- Sample datasets in `sample_datasets/` folder
Run tests:

- `python test_suite.py` for a quick test
- `python comprehensive_test.py` for full analysis
- Review HTML reports in `test_reports/`
- Check performance metrics
- Follow recommendations
Happy Testing! 🎉
For questions or issues, refer to the comprehensive documentation or check the test reports for detailed diagnostics.