Multilabel Datasets with LIFT Training System

A comprehensive system for training, evaluating, and deploying multilabel classifiers using the LIFT (Learning with Label-Specific Features) algorithm on various multilabel datasets.
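
For orientation, the core idea of LIFT as published by Zhang and Wu is that, for each label, the positive and negative training instances are clustered separately, and every instance is then re-represented by its distances to those cluster centers before a binary classifier is trained for that label. The sketch below covers only the cluster-count rule and the feature mapping in plain Python. The function names, the toy cluster centers, and the lower bound of one cluster are illustrative assumptions, not the API of the bundled LIFT submodule.

```python
import math

def num_clusters(n_pos, n_neg, ratio=0.1):
    """Clusters per side for one label: ceil(ratio * min(|P|, |N|)).

    The floor of 1 is an assumption of this sketch so tiny labels still get a cluster.
    """
    return max(1, math.ceil(ratio * min(n_pos, n_neg)))

def label_specific_features(x, pos_centers, neg_centers):
    """Map instance x to its Euclidean distances to positive and negative cluster centers."""
    return [math.dist(x, c) for c in pos_centers + neg_centers]

# Toy centers standing in for k-means output on one label's positive/negative instances.
pos_centers = [[0.0, 0.0]]
neg_centers = [[3.0, 4.0]]
features = label_specific_features([0.0, 0.0], pos_centers, neg_centers)
print(num_clusters(45, 200), features)
```

A full implementation would run a clustering step per label to obtain the centers and then fit one binary classifier per label on the mapped features.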

🚀 Quick Start

Get started quickly with the interactive interface:

```
python quickstart.py
```

This will guide you through setup and provide access to all features.

📦 Available Datasets

The repository includes 10 popular multilabel datasets:

birds - Bird species classification from audio features
bookmarks - Web bookmark categorization
Cal500 - Music annotation with semantic tags (emotion, genre, instrument)
corel5k - Image annotation with Corel 5K dataset
delicious - Social bookmarking tag prediction
Emotions - Music emotion classification
enron - Email classification
genbase - Gene functional classification
mediamill - Video semantic annotation
yeast - Yeast protein functional classification

🛠️ Installation

Clone and setup:

```
git clone <your-repo-url>
cd Multilabel-Datasets
git submodule update --init --recursive
python setup.py
```

Quick start:
```
python quickstart.py
```

📚 Available Scripts

🎯 Interactive Training

```
python multilabel_trainer.py --interactive
```

User-friendly dataset selection
Automatic feature/label detection
Hyperparameter optimization with Bayesian search
Comprehensive evaluation and reporting

🔮 Model Inference

```
python lift_inference.py --interactive
```

Load trained models
Make predictions on new data
Batch inference from CSV files
Model evaluation on test data
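
As a rough illustration of batch inference from a CSV file, the sketch below parses numeric feature rows and passes them to an object with a `predict` method. `read_feature_rows`, `ThresholdModel`, and the column names are hypothetical stand-ins; the actual interface of `lift_inference.py` and of the saved models is not shown in this README.

```python
import csv
import io

def read_feature_rows(fileobj, has_header=True):
    """Parse a CSV of numeric features into a list of float lists."""
    reader = csv.reader(fileobj)
    if has_header:
        next(reader, None)  # skip the column names
    return [[float(value) for value in row] for row in reader if row]

class ThresholdModel:
    """Stand-in for a trained model; a real one would be loaded from disk instead."""

    def predict(self, rows):
        # One binary label per feature column: 1 where the value exceeds 0.5.
        return [[int(value > 0.5) for value in row] for row in rows]

# In-memory CSV so the example runs without any files.
csv_text = "f1,f2\n0.9,0.1\n0.2,0.7\n"
rows = read_feature_rows(io.StringIO(csv_text))
predictions = ThresholdModel().predict(rows)
print(predictions)
```

With a real model, the stand-in would be replaced by an object loaded from a `.pkl` file such as `trained_models/yeast_model.pkl`. Unpickling can execute arbitrary code, so only load model files you trust.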

🔍 Dataset Explorer

```
python dataset_explorer.py --interactive
```

Analyze dataset characteristics
Generate statistical reports
Compare multiple datasets
Export detailed HTML/JSON reports
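
To make the dataset characteristics concrete, here is a small dependency-free sketch of the usual multilabel statistics computed from a binary label matrix. The function name `label_statistics` and the toy matrix are assumptions for illustration and may differ from what `dataset_explorer.py` reports.

```python
def label_statistics(Y):
    """Summarise a binary label matrix Y (one row per instance, one column per label)."""
    n, q = len(Y), len(Y[0])
    cardinality = sum(sum(row) for row in Y) / n  # mean number of labels per instance
    density = cardinality / q                     # cardinality normalised by label count
    frequency = [sum(row[j] for row in Y) / n for j in range(q)]
    distinct_label_sets = len({tuple(row) for row in Y})
    return {
        "cardinality": cardinality,
        "density": density,
        "frequency": frequency,
        "distinct_label_sets": distinct_label_sets,
    }

# Toy label matrix: 4 instances, 3 labels.
Y = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [1, 0, 1]]
stats = label_statistics(Y)
print(stats)
```

Comparing these numbers across datasets gives a quick sense of how sparse and how varied the label sets are.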

📊 Batch Runner

```
python batch_runner.py --comparison
```

Train models on multiple datasets
Compare optimization strategies
Automated experiment logging
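
One way to picture a batch run with experiment logging is a loop that trains on each dataset, records the outcome, and keeps going when a single run fails. The sketch below assumes a hypothetical `train_fn(name)` callback and a made-up record layout; the metric value is a placeholder, not a real result, and none of this is taken from `batch_runner.py`.

```python
import json
import time

def run_batch(dataset_names, train_fn):
    """Call train_fn(name) for each dataset and collect one log record per run."""
    records = []
    for name in dataset_names:
        start = time.perf_counter()
        try:
            metrics = train_fn(name)
            record = {"dataset": name, "status": "ok", "metrics": metrics}
        except Exception as exc:  # keep going so one failure does not stop the batch
            record = {"dataset": name, "status": "failed", "error": str(exc)}
        record["seconds"] = round(time.perf_counter() - start, 3)
        records.append(record)
    return records

def fake_train(name):
    """Hypothetical trainer used only for illustration."""
    if name == "missing":
        raise FileNotFoundError(f"{name}.zip not found")
    return {"hamming_loss": 0.2}  # placeholder value, not a measured result

log = run_batch(["yeast", "missing"], fake_train)
print(json.dumps(log, indent=2))
```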

🏃 Simple Example

```
python run_lift_experiment.py --dataset yeast
```

Basic LIFT training example
Good for understanding the workflow

🔧 Command Line Usage

Train a specific dataset:

```
python multilabel_trainer.py --dataset yeast --optimize
```

Make predictions:

```
python lift_inference.py --model trained_models/yeast_model.pkl --data new_data.csv
```

Analyze all datasets:

```
python dataset_explorer.py --all
```

Run batch experiments:

```
python batch_runner.py --datasets yeast emotions birds
```
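
For readers who want to script around these commands, the following is a minimal `argparse` sketch of a trainer-style command line. The flag names `--dataset`, `--optimize`, and `--interactive` come from the commands in this README; how the actual scripts define and handle them is an assumption.

```python
import argparse

def build_parser():
    """Build a parser that mirrors the trainer flags shown in this README."""
    parser = argparse.ArgumentParser(description="Sketch of a trainer-style CLI")
    parser.add_argument("--dataset", help="dataset name, e.g. yeast")
    parser.add_argument("--optimize", action="store_true",
                        help="enable hyperparameter search")
    parser.add_argument("--interactive", action="store_true",
                        help="start the menu-driven mode")
    return parser

# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--dataset", "yeast", "--optimize"])
print(args.dataset, args.optimize, args.interactive)
```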

📊 Features

Automatic Dataset Processing: Smart detection of features and labels
Hyperparameter Optimization: Bayesian optimization for best performance
Comprehensive Evaluation: Multiple multilabel metrics and per-label analysis
Interactive Interface: User-friendly menu-driven operations
Batch Processing: Train multiple datasets automatically
Model Persistence: Save and load trained models
Rich Reporting: HTML and JSON reports with visualizations
Dataset Analysis: Statistical analysis and comparison tools

📁 Project Structure

```
Multilabel-Datasets/
├── quickstart.py              # Interactive quick start interface
├── multilabel_trainer.py      # Main training system
├── lift_inference.py          # Model inference and evaluation
├── dataset_explorer.py        # Dataset analysis tools
├── batch_runner.py            # Batch experiment runner
├── run_lift_experiment.py     # Simple example script
├── setup.py                   # Installation script
├── requirements.txt           # Python dependencies
├── USAGE_GUIDE.md             # Detailed documentation
├── *.zip                      # Dataset files
├── LIFT-MultiLabel-Learning-with-Label-Specific-Features/  # LIFT submodule
└── [Generated directories]
    ├── extracted_datasets/    # Extracted dataset files
    ├── trained_models/        # Saved models
    ├── reports/               # Training reports
    ├── dataset_reports/       # Dataset analysis reports
    └── predictions/           # Inference outputs
```
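
The layout above implies that each `*.zip` archive is unpacked under `extracted_datasets/`. A minimal sketch of that step with the standard `zipfile` module follows; the helper name `extract_dataset`, the one-folder-per-dataset layout, and the file name inside the demo archive are assumptions, and the project's own scripts may handle extraction differently.

```python
import tempfile
import zipfile
from pathlib import Path

def extract_dataset(zip_path, target_root):
    """Extract <name>.zip into <target_root>/<name>/ and return that directory."""
    zip_path = Path(zip_path)
    target = Path(target_root) / zip_path.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target

# Self-contained demo: build a tiny archive in a temp dir, then extract it.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "yeast.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("yeast.csv", "f1,label1\n0.1,1\n")  # hypothetical contents
    out_dir = extract_dataset(demo_zip, Path(tmp) / "extracted_datasets")
    out_name = out_dir.name
    extracted = sorted(p.name for p in out_dir.iterdir())
print(out_name, extracted)
```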

📈 Example Workflow

Explore datasets:

```
python dataset_explorer.py --dataset yeast
```

Train a model:

```
python multilabel_trainer.py --dataset yeast --optimize
```

Make predictions:

```
python lift_inference.py --interactive
```

Compare multiple datasets:

```
python batch_runner.py --comparison
```

🎯 Evaluation Metrics

The system provides comprehensive multilabel evaluation:

Overall Metrics: Hamming Loss, Jaccard Score, F1 Scores (micro/macro/weighted)
Per-Label Metrics: Precision, Recall, F1 Score, Support
Advanced Analysis: Label cardinality, density, frequency distributions
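
To show what the overall metrics measure, the sketch below computes Hamming loss and micro/macro-averaged F1 from binary label matrices in plain Python. It is a standalone illustration with made-up toy labels, not the repository's evaluation code; the function names are assumptions, and labels with no positives and no predictions are scored as 0 here.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    slots = sum(len(row) for row in y_true)
    wrong = sum(t != p for yt, yp in zip(y_true, y_pred) for t, p in zip(yt, yp))
    return wrong / slots

def _f1(tp, fp, fn):
    """F1 from counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_micro_macro(y_true, y_pred):
    """Return (micro_f1, macro_f1) computed over label columns."""
    q = len(y_true[0])
    tp, fp, fn = [0] * q, [0] * q, [0] * q
    for yt, yp in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(yt, yp)):
            tp[j] += int(t == 1 and p == 1)
            fp[j] += int(t == 0 and p == 1)
            fn[j] += int(t == 1 and p == 0)
    micro = _f1(sum(tp), sum(fp), sum(fn))                       # pool counts, then score
    macro = sum(_f1(tp[j], fp[j], fn[j]) for j in range(q)) / q  # score per label, then average
    return micro, macro

# Toy data: 3 instances, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
loss = hamming_loss(y_true, y_pred)
micro, macro = f1_micro_macro(y_true, y_pred)
print(loss, micro, macro)
```

Micro-averaging pools counts across labels and so favors frequent labels, while macro-averaging weights every label equally, which is why the two can diverge sharply on datasets with rare labels.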

📖 Documentation

For detailed usage instructions, see USAGE_GUIDE.md.

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

📄 License

This project follows the same license as the LIFT package (MIT).

Get started now: python quickstart.py 🚀
