Production-grade MBTI personality classification system with neural network implementation in pure Rust.
| Model | Method | Accuracy | vs Random | Training Time |
|---|---|---|---|---|
| V7 | 🏆 TF-IDF + BERT + Multi-Task GPU | 52.05% | 8.3x | ~50s |
| V6 | BERT + MLP (single-task) | 31.99% | 5.1x | 322s |
| V1 | TF-IDF + Naive Bayes | 21.73% | 3.5x | 2s |
| V2 | 9 Psychological Features | 21.21% | 3.4x | 3s |
| V3 | 930 Psychological Features | 20.12% | 3.2x | 30s |
| V5 | BERT Only + Cosine | 18.39% | 2.9x | 583s |
Random Baseline: 6.25% (16 classes)
| Dimension | Accuracy | Notes |
|---|---|---|
| E/I | 82.77% | Extraversion vs Introversion |
| S/N | 88.18% | Sensing vs Intuition (best) |
| T/F | 81.67% | Thinking vs Feeling |
| J/P | 77.12% | Judging vs Perceiving |
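For context: if the four dimension classifiers erred independently, the expected full-type accuracy would be roughly the product 0.8277 × 0.8818 × 0.8167 × 0.7712 ≈ 46%; the observed 52.05% suggests the per-dimension errors are correlated rather than independent.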
🤗 Pre-trained Model: Available on Hugging Face
Add to your Cargo.toml:
```toml
[dependencies]
# With auto-download from Hugging Face
psycial = { version = "0.1", features = ["auto-download"] }
```

Use in your code:
```rust
use psycial::api::Predictor;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let predictor = Predictor::new()?;
    let result = predictor.predict("I love solving complex problems")?;
    println!("Type: {} (confidence: {:.1}%)",
        result.mbti_type,
        result.confidence * 100.0);
    Ok(())
}
```

See the full API documentation: `cargo doc --open --features auto-download`
Download the trained model and start predicting immediately:
```python
from huggingface_hub import hf_hub_download
import shutil

# Download model files
mlp_weights = hf_hub_download(repo_id="ElderRyan/psycial", filename="mlp_weights_multitask.pt")
vectorizer = hf_hub_download(repo_id="ElderRyan/psycial", filename="tfidf_vectorizer_multitask.json")

# Copy to models directory
shutil.copy(mlp_weights, "models/mlp_weights_multitask.pt")
shutil.copy(vectorizer, "models/tfidf_vectorizer_multitask.json")
```

Then use the Rust binary for prediction:
```bash
./target/release/psycial hybrid predict "Your text here"
```

```bash
# Show all available models
cargo run --release

# Run baseline model
cargo run --release -- baseline

# Run best model (multi-task hybrid)
cargo run --release -- hybrid train --multi-task

# Run BERT model
cargo run --release -- bert-mlp
```

See CLI_GUIDE.md for detailed CLI usage.
Add to your Cargo.toml:
```toml
[dependencies]
psycial = { git = "https://github.com/polyjuicelab/psycial", features = ["bert"] }
```

Use in your code:
```rust
use psycial::{load_data, MultiTaskGpuMLP, RustBertEncoder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load data and initialize BERT
    let records = load_data("data/mbti_1.csv")?;
    let bert = RustBertEncoder::new()?;

    // Extract features
    let texts: Vec<String> = records.iter().map(|r| r.posts.clone()).collect();
    let labels: Vec<String> = records.iter().map(|r| r.mbti_type.clone()).collect();
    let features = bert.extract_features_batch(&texts)?;

    // Train multi-task model (4 binary classifiers: E/I, S/N, T/F, J/P)
    let mut model = MultiTaskGpuMLP::new(384, vec![256, 128], 0.001, 0.5);
    model.train(&features, &labels, 20, 32);

    // Predict
    let test_features = bert.extract_features("I love planning everything.")?;
    let mbti_type = model.predict(&test_features);
    println!("Predicted: {}", mbti_type);
    Ok(())
}
```

See LIBRARY_USAGE.md for detailed library integration examples.
Statistical (TF-IDF + NB): 21.73%
Deep Learning (BERT + MLP): 31.99% (+47% relative improvement)
Takeaway: a neural classifier can exploit BERT features that simpler classifiers cannot.
- ✅ MLP with backpropagation
- ✅ BERT integration (rust-bert)
- ✅ 930 psychological features
- ✅ Automatic feature engineering
- ✅ No Python dependencies for inference
- Clean Rust API
- Comprehensive error handling
- Modular architecture
- Full test coverage
- Metal GPU support (M1/M2/M3)
```text
Text Input
    ↓
BERT Encoder (384-dim embeddings)
    ↓
MLP Neural Network (384 -> 256 -> 128 -> 16)
    ├─ Xavier initialization
    ├─ ReLU activation
    ├─ Softmax output
    └─ SGD optimizer
    ↓
MBTI Type (16 classes)
```
- BERT Model: all-MiniLM-L12-v2 (sentence-transformers)
- Embedding Dim: 384
- MLP Architecture: 3 layers (256, 128 hidden units)
- Activation: ReLU
- Optimizer: SGD with learning rate 0.001
- Training: 25 epochs, batch size 32
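As a rough illustration of the data flow above (a hedged sketch, not the project's actual `neural_net.rs`), the forward pass for the 384 → 256 → 128 → 16 MLP with ReLU and softmax can be written with the `ndarray` crate already listed in the dependencies; the weight and bias arguments here are placeholders supplied by the caller:

```rust
use ndarray::{Array1, Array2};

/// ReLU applied element-wise.
fn relu(x: Array1<f32>) -> Array1<f32> {
    x.mapv(|v| v.max(0.0))
}

/// Numerically stable softmax over the output logits.
fn softmax(x: &Array1<f32>) -> Array1<f32> {
    let max = x.fold(f32::NEG_INFINITY, |a, &b| a.max(b));
    let exp = x.mapv(|v| (v - max).exp());
    let sum = exp.sum();
    exp / sum
}

/// Forward pass: 384-dim BERT embedding -> 256 -> 128 -> 16 class probabilities.
fn forward(
    x: &Array1<f32>,                     // 384-dim input embedding
    w1: &Array2<f32>, b1: &Array1<f32>,  // 256 x 384, 256
    w2: &Array2<f32>, b2: &Array1<f32>,  // 128 x 256, 128
    w3: &Array2<f32>, b3: &Array1<f32>,  // 16 x 128, 16
) -> Array1<f32> {
    let h1 = relu(w1.dot(x) + b1);
    let h2 = relu(w2.dot(&h1) + b2);
    softmax(&(w3.dot(&h2) + b3))
}
```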
- Rust 1.70+
- 8GB RAM minimum
- 2GB disk space (for models)
```bash
# Clone
git clone <repo>
cd snapMBTI

# Build (downloads libtorch automatically on first build)
cargo build --release --features bert --bin bert-mlp
# This will take ~60 minutes the first time (downloading & compiling libtorch)
# Subsequent builds are fast

cargo run --release --features bert --bin bert-mlp
```

```bash
# V1: TF-IDF baseline
cargo run --release --bin baseline
# V2-V3: Psychological features
cargo run --release --bin psyattention
cargo run --release --bin psyattention-full
# V5: BERT experiments
cargo run --release --features bert --bin bert-only
# V6: BERT + MLP (best)
cargo run --release --features bert --bin bert-mlp
```

- Source: MBTI Kaggle Dataset
- Size: 8,675 samples
- Split: 80% train (6,940) / 20% test (1,735)
- Classes: 16 MBTI types
- Imbalance: INFP (21%), ENTP (~1%)
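A plain shuffled 80/20 split like the one described above can be sketched as follows (illustrative only; the project's actual split logic may differ). With 8,675 samples it yields the 6,940 / 1,735 partition listed above:

```rust
use rand::seq::SliceRandom;

/// Shuffle sample indices and split them 80% train / 20% test.
fn train_test_split(n_samples: usize) -> (Vec<usize>, Vec<usize>) {
    let mut indices: Vec<usize> = (0..n_samples).collect();
    indices.shuffle(&mut rand::thread_rng());
    let n_train = (n_samples as f64 * 0.8) as usize; // 8,675 -> 6,940
    let test = indices.split_off(n_train);           // remaining 1,735
    (indices, test)
}
```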
Pure Rust neural network with:
- Xavier weight initialization
- Backpropagation
- Mini-batch SGD
- ReLU activations
- Softmax output
- Cross-entropy loss
Code: src/neural_net.rs (259 lines)
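The listed components are small, self-contained routines. As a hedged illustration (not the code in `src/neural_net.rs`), Xavier initialization and per-sample cross-entropy might look like this with the `rand` and `ndarray` crates from the dependency list:

```rust
use ndarray::{Array1, Array2};
use rand::Rng;

/// Xavier/Glorot uniform initialization: U(-limit, limit)
/// with limit = sqrt(6 / (fan_in + fan_out)).
fn xavier_init(fan_in: usize, fan_out: usize) -> Array2<f32> {
    let limit = (6.0 / (fan_in + fan_out) as f32).sqrt();
    let mut rng = rand::thread_rng();
    Array2::from_shape_fn((fan_out, fan_in), |_| rng.gen_range(-limit..limit))
}

/// Cross-entropy loss for one sample: -log(p[target]),
/// clamped to avoid log(0) on a confident wrong prediction.
fn cross_entropy(probs: &Array1<f32>, target: usize) -> f32 {
    -(probs[target].max(1e-12)).ln()
}
```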
Library: rust-bert v0.22
- Automatic model downloads
- Sentence embeddings
- Metal GPU support
- Pure Rust API (libtorch backend)
Code: src/psyattention/bert_rustbert.rs
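The project wraps this behind `RustBertEncoder`; a minimal direct use of rust-bert's sentence-embeddings pipeline with the same all-MiniLM-L12-v2 model looks roughly like this (a sketch; details may vary slightly between rust-bert versions):

```rust
use rust_bert::pipelines::sentence_embeddings::{
    SentenceEmbeddingsBuilder, SentenceEmbeddingsModelType,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Downloads all-MiniLM-L12-v2 on first run, then uses the local cache.
    let model = SentenceEmbeddingsBuilder::remote(SentenceEmbeddingsModelType::AllMiniLmL12V2)
        .create_model()?;

    // Each sentence becomes a 384-dimensional embedding (Vec<f32>).
    let embeddings = model.encode(&["I love solving complex problems"])?;
    println!("dims: {}", embeddings[0].len());
    Ok(())
}
```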
```text
snapMBTI/
├── src/
│   ├── main.rs               # V1: Baseline
│   ├── bert_mlp.rs           # V6: BERT + MLP (BEST)
│   ├── bert_only.rs          # V5: BERT experiments
│   ├── neural_net.rs         # MLP implementation
│   └── psyattention/         # Psychological features
│       ├── bert_rustbert.rs  # BERT encoder
│       ├── seance.rs         # 271 emotion features
│       ├── taaco.rs          # 168 coherence features
│       ├── taales.rs         # 491 complexity features
│       └── ...
├── data/mbti_1.csv           # Dataset
└── docs/                     # Documentation
```
BERT + k-NN: 18.39%
- High-dimensional features
- Simple nearest-neighbor
- Cannot learn complex patterns
- Result: Underutilizes BERT
BERT + MLP: 31.99%
- High-dimensional features
- Neural network classifier
- Learns non-linear decision boundaries
- Result: Properly utilizes BERT
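To make the contrast concrete, the nearest-neighbor variant reduces to a cosine-similarity lookup over training embeddings. A hedged sketch (not the project's exact code) of that fixed decision rule:

```rust
/// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb + 1e-12)
}

/// 1-nearest-neighbor prediction: copy the label of the most similar training
/// embedding. The decision rule is fixed rather than learned, which is why it
/// underutilizes BERT features compared with the MLP's non-linear boundaries.
fn predict_nn<'a>(query: &[f32], train: &'a [(Vec<f32>, String)]) -> &'a str {
    train
        .iter()
        .max_by(|(a, _), (b, _)| {
            cosine(query, a)
                .partial_cmp(&cosine(query, b))
                .unwrap_or(std::cmp::Ordering::Equal)
        })
        .map(|(_, label)| label.as_str())
        .expect("training set must be non-empty")
}
```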
| Model | Compilation | Training | Inference | Accuracy |
|---|---|---|---|---|
| Baseline | 30s | 2s | <1s | 21.73% |
| BERT + MLP | 60min (first) | 322s | 61s | 31.99% |
*First compilation downloads libtorch (~500MB)
```toml
[dependencies]
csv = "1.3"
serde = "1.0"
rand = "0.8"
ndarray = "0.16"

# Optional: BERT support
rust-bert = { version = "0.22", optional = true, features = ["download-libtorch"] }
```

```bibtex
@software{snapMBTI2025,
title={MBTI Personality Classifier with Neural Networks},
author={Ryan Kung},
year={2025},
note={Pure Rust, 31.99\% accuracy, BERT + MLP implementation}
}
```

AGPL-3.0
See LICENSE file for details.
- rust-bert: Guillaume BE
- Dataset: MBTI Kaggle Dataset
Status: Production Ready
Best Model: BERT + MLP (31.99%)
Platform: macOS (Metal GPU), Linux, Windows
Language: 100% Rust