Research implementation of Structured Sparse Dictionary Learning for visual object category recognition. Introduces group structure into sparse coding by combining co-clustering with Sparse PCA and Group Lasso, yielding semantically coherent feature representations that outperform unstructured sparse coding on standard benchmarks.
"Sparse coding is augmented with structure learnt from co-clustering in terms of groups of semantically related basis elements." — Chapter: Structured Dictionary Learning and Feature Encoding
📄 Full technical report: structured-sparse-visual-classification.pdf
Standard sparse coding (Lasso / OMP) selects individual dictionary atoms independently:
x ≈ D α, min ||α||₁ (ℓ₁ norm — atom-wise sparsity)
This work adds group structure discovered by co-clustering: semantically related atoms are grouped, and selection happens at the group level via the ℓ₂,₁ mixed norm:
x ≈ D α, min Σ_g ||α_g||₂ (Group Lasso — group-wise sparsity)
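Written in penalised form, the two encoders differ only in the regulariser. The block below is the standard textbook formulation rather than a formula copied from the report; the regularisation weight $\lambda$ and the partition $\mathcal{G}$ of atom indices into groups are notation assumed here, and $D$ is taken to hold atoms as columns so that $x \approx D\alpha$.

```latex
\begin{aligned}
\text{Lasso:} \quad
  & \min_{\alpha}\ \tfrac{1}{2}\lVert x - D\alpha \rVert_2^2
    + \lambda \lVert \alpha \rVert_1 \\
\text{Group Lasso:} \quad
  & \min_{\alpha}\ \tfrac{1}{2}\lVert x - D\alpha \rVert_2^2
    + \lambda \sum_{g \in \mathcal{G}} \lVert \alpha_g \rVert_2
\end{aligned}
```

Because the inner $\ell_2$ norm is not squared, the group penalty drives whole sub-vectors $\alpha_g$ to zero at once, whereas the $\ell_1$ penalty zeroes coefficients one at a time.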
Two complementary contributions:
| Contribution | Where | What |
|---|---|---|
| Structured Dictionary (SSPCA) | python/dictionary_learning.py | Co-clustering groups sub-spaces before Sparse PCA; learns a multi-manifold dictionary |
| Structured Encoding (Group Lasso) | python/sparse_coding.py | Co-clustering groups dictionary atoms; enforces group-level sparsity via ℓ₂,₁ |
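As a rough illustration of how co-clustering can turn a matrix into atom groups, the sketch below runs scikit-learn's `SpectralCoclustering` on a synthetic non-negative matrix and reads the column labels as group assignments. Which matrix the original pipeline co-clusters is not stated here, so the samples-by-atoms activation matrix, the sizes, and the choice of `SpectralCoclustering` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters

# Synthetic stand-in for a (samples x atoms) activation matrix.
# ASSUMPTION: the real pipeline's input to co-clustering may differ.
n_samples, n_atoms, n_groups = 120, 32, 4
A, _, _ = make_biclusters(
    shape=(n_samples, n_atoms),
    n_clusters=n_groups,
    noise=2.0,
    shuffle=True,
    random_state=0,
)
A = np.abs(A)  # spectral co-clustering expects non-negative entries

model = SpectralCoclustering(n_clusters=n_groups, random_state=0)
model.fit(A)

# One group id per column (atom); rows (samples) are clustered jointly.
atom_group = model.column_labels_
groups = [np.flatnonzero(atom_group == g) for g in range(n_groups)]

for g, idx in enumerate(groups):
    print(f"group {g}: atoms {idx.tolist()}")
```

The resulting list of index arrays is one convenient way to represent a partition of atoms; the format expected by the repository's own functions may be different.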
```mermaid
flowchart TB
    subgraph Data["Feature Extraction (util/)"]
        IMG["Images"] --> DSIFT["Dense SIFT\n(128-d descriptors)"]
        DSIFT --> BOF["Bag-of-Features\n(K-means codebook, K=1000)"]
        BOF --> HIGH["High-dim BOF vectors\n(512 – 2048-d)"]
    end
    subgraph Dict["Dictionary Learning (python/)"]
        HIGH --> COCLUST["Co-clustering\n(atom groups)"]
        HIGH --> SPCA["Sparse PCA\n(SPCA baseline)"]
        COCLUST --> SSPCA["Structured Sparse PCA\n(SSPCA, ℓ₂,₁ groups)"]
    end
    subgraph Encode["Sparse Encoding (python/)"]
        SPCA --> L1["Lasso / OMP\n(ℓ₁, atom-wise)"]
        SSPCA --> GL["Group Lasso\n(ℓ₂,₁, group-wise)"]
    end
    subgraph Classify["Classification"]
        L1 --> SVM["RBF SVM\n10-fold CV"]
        GL --> SVM
        SVM --> F1["F1-score\nper category"]
    end
```
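The feature-extraction stage of the pipeline can be sketched in a few lines: cluster local descriptors into a codebook, then describe each image by a normalised histogram of its nearest codewords. This is a minimal illustration, not the code in util/; the random arrays stand in for real dense SIFT descriptors, and the codebook size is reduced from the K=1000 used in the pipeline so the example runs quickly.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

# Stand-in for dense SIFT output: 20 images, 200 descriptors each, 128-d.
descriptors = [rng.random((200, 128)) for _ in range(20)]

K = 50  # illustrative codebook size (the pipeline above uses K=1000)
codebook = MiniBatchKMeans(n_clusters=K, n_init=3, random_state=0)
codebook.fit(np.vstack(descriptors))


def bof_histogram(desc, codebook, n_words):
    """Assign each descriptor to its nearest codeword and count occurrences."""
    words = codebook.predict(desc)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / hist.sum()  # L1-normalise so images are comparable


bof = np.array([bof_histogram(d, codebook, K) for d in descriptors])
print(bof.shape)  # one K-dimensional histogram per image
```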
Evaluated on standard computer vision benchmarks. Structured methods (SSPCA + Group Lasso) consistently outperform unstructured baselines:
| Dataset | PCA | SPCA | SSPCA | Relative gain (SSPCA vs. SPCA) |
|---|---|---|---|---|
| VOC 2006 | 0.41 | 0.44 | 0.47 | +~7% F1 |
| VOC 2007 | 0.38 | 0.41 | 0.44 | +~7% F1 |
| Scene 15 | 0.52 | 0.56 | 0.59 | +~5% F1 |
F1-scores reported as macro-average across all categories, 10-fold stratified CV.
Performance plots are in chapter/figures/.
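The evaluation protocol described above (RBF SVM, 10-fold stratified cross-validation, macro-averaged F1) can be reproduced in outline with scikit-learn. The sketch below runs on synthetic data and uses default-like hyperparameters chosen only for illustration; it is not the logic of sspcaEval.py, which may, for instance, train per-category classifiers.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for encoded features and category labels.
X, y = make_classification(
    n_samples=300,
    n_features=64,
    n_informative=16,
    n_classes=3,
    random_state=0,
)

# Scaling inside the pipeline keeps each fold's statistics separate.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# "f1_macro" averages per-class F1 with equal weight per class.
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1_macro")
print(f"macro-F1: {scores.mean():.3f} +/- {scores.std():.3f}")
```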
```text
structured-sparsity-visual-classification/
├── python/                        # Evaluation and algorithm scripts
│   ├── sparse_coding.py           # ★ NEW: Lasso, OMP, Group Lasso (≡ calcL1Coeff.m, calcOMPCoeff.m, calcSSPCACoeffLasso.m)
│   ├── dictionary_learning.py     # ★ NEW: PCA, SparsePCA, SSPCA, NMF (≡ calcSPCADict.m, calcSSPCADict.m)
│   ├── sspcaEval.py               # 10-fold SVM evaluation (modernised from 2012)
│   ├── sspcaMethod.py             # Method comparison runner
│   ├── sspcaEval4096.py           # High-dim (4096-d) evaluation variant
│   ├── sspcaEvalBalData.py        # Balanced data evaluation
│   ├── plotsspcadatasets.py       # Dataset comparison plots
│   ├── plotsspcadictsizes.py      # Dictionary size ablation plots
│   ├── plotsspcacategories.py     # Per-category performance plots
│   ├── plotsspca1.py              # Basic results aggregator
│   ├── ncagrlvqPerfAnalysis1.py   # NCA + GRLVQ metric learning analysis
│   ├── sspcaDebug.py              # Single-category debugging helper
│   └── requirements.txt
├── matlab/                        # Original MATLAB implementation (42 files)
│   ├── calcSPCADict.m             # Sparse PCA dictionary learning
│   ├── calcSSPCADict.m            # Structured SPCA (calls Jenatton et al. library)
│   ├── calcL1Coeff.m              # Lasso encoding (calls SPAMS mexLasso)
│   ├── calcLassoCoeff.m           # Lasso variant
│   ├── calcOMPCoeff.m             # OMP encoding
│   ├── calcSSPCACoeffLasso.m      # Group Lasso encoding (calls SLEP mcLeastR)
│   ├── calcSSPCAClassPerf.m       # SVM classification evaluation
│   ├── calcOptLambda.m            # Regularisation hyperparameter search
│   └── ... (37 more files)
├── util/                          # Data preparation (63 Python scripts)
│   ├── computeBOF.py              # Bag-of-Features codebook + encoding
│   ├── binaryDictionary.py        # Per-category MiniBatchKMeans dictionary
│   ├── extractfeature*.py         # Dense SIFT extraction per dataset
│   ├── cbnbof*.py                 # BOF encoding per dataset
│   └── *housekeeping*.py          # Dataset preparation utilities
├── chapter/
│   ├── groupsparse.tex            # LaTeX dissertation chapter (51 KB)
│   └── figures/                   # ~60 performance plots and diagrams
└── structured-sparse-visual-classification.pdf   # Full research paper (1.3 MB)
```
```bash
cd python
pip install -r requirements.txt
```

```bash
# Sparse PCA dictionary from pre-computed BOF features
python dictionary_learning.py features.txt --method spca --n-components 128
# Structured SSPCA (group-level sparsity via co-clustering)
python dictionary_learning.py features.txt --method sspca --n-components 128 --n-groups 16
```

```python
from sparse_coding import lasso_encode, omp_encode, group_lasso_encode
import numpy as np

# D: dictionary atoms (K, p), X: input features (N, p)
codes_l1 = lasso_encode(X, D, alpha=0.1)                # ℓ₁ Lasso
codes_omp = omp_encode(X, D, n_nonzero_coefs=10)        # OMP
codes_gl = group_lasso_encode(X, D, groups, alpha=0.1)  # Group Lasso
```

```bash
# 10-fold SVM evaluation on pre-projected feature files
python sspcaEval.py --dataset VOC2006 --method spca --highDim 1024 --lowDim 128
# Compare all methods across dimension combinations
python sspcaMethod.py --dataset Scene15
```

The code supports the following standard benchmarks. Datasets must be downloaded separately:
| Dataset | Classes | Images | Download |
|---|---|---|---|
| Pascal VOC 2006 | 10 | ~5,000 | VOC Challenge |
| Pascal VOC 2007 | 20 | ~10,000 | VOC Challenge |
| Pascal VOC 2010 | 20 | ~20,000 | VOC Challenge |
| Caltech-101 | 101 | ~9,000 | Caltech Vision Lab |
| Caltech-256 | 256 | ~30,000 | Caltech Vision Lab |
| Scene-15 | 15 | ~4,500 | Scene Understanding |
| Oxford Flowers 17/102 | 17/102 | 1,360/8,189 | VGG Oxford |
Path configuration: The original scripts hardcode /vol/vssp/diplecs/ash/Data/. Set the rootDir variable at the top of each script to your local data directory.
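One way to avoid editing every script by hand is to read the data root from an environment variable and fall back to the original path. The helper below is a hypothetical sketch: neither `resolve_root_dir` nor the `SSPCA_DATA_ROOT` variable exists in the repository, and the return value is kept as a plain string on the assumption that the scripts build paths by string concatenation.

```python
import os

# Path hardcoded in the original scripts.
DEFAULT_ROOT = "/vol/vssp/diplecs/ash/Data/"


def resolve_root_dir(env_var="SSPCA_DATA_ROOT", default=DEFAULT_ROOT):
    """Return the data root, preferring the environment variable if it is set.

    `SSPCA_DATA_ROOT` is a hypothetical name chosen for this example.
    """
    return os.environ.get(env_var, default)


# Drop-in replacement for a hardcoded assignment at the top of a script.
rootDir = resolve_root_dir()
print(rootDir)
```

With this in place, running `SSPCA_DATA_ROOT=/data/voc/ python sspcaEval.py ...` would redirect a script that uses the helper without touching its source.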
| Component | MATLAB file(s) | Python equivalent | Notes |
|---|---|---|---|
| Sparse PCA dictionary | calcSPCADict.m | dictionary_learning.py → SparsePCADictionary | sklearn SparsePCA |
| Structured PCA dict. | calcSSPCADict.m | dictionary_learning.py → StructuredSparsePCADictionary | Approximation via MiniBatchDL + KMeans grouping |
| NMF dictionary | calcNMFDict.m | dictionary_learning.py → NMFDictionary | sklearn NMF |
| Lasso coding | calcL1Coeff.m | sparse_coding.py → lasso_encode() | sklearn LassoLars |
| OMP coding | calcOMPCoeff.m | sparse_coding.py → omp_encode() | sklearn OMP |
| Group Lasso | calcSSPCACoeffLasso.m | sparse_coding.py → group_lasso_encode() | Proximal-GD implementation |
| SVM evaluation | calcSSPCAClassPerf.m | sspcaEval.py | Modernised (sklearn ≥1.4) |
| Hyperparameter search | calcOptLambda.m | (use sklearn GridSearchCV) | Grid search via sklearn |
| Co-clustering | cocluster-linux binary | (use sklearn.cluster.SpectralBiclustering) | External binary → sklearn |
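To make the "Proximal-GD" entry for Group Lasso concrete, here is a minimal proximal gradient (ISTA-style) sketch in NumPy. It is not the repository's `group_lasso_encode()`: the function names, the fixed iteration count, and the representation of groups as a list of index arrays are assumptions for illustration. It follows the shape convention D: (K, p), X: (N, p) and minimises ½‖X − AD‖² + α Σ_g ‖A_g‖₂ over the code matrix A.

```python
import numpy as np


def group_soft_threshold(A, groups, thresh):
    """Proximal operator of the group penalty: shrink each group's l2 norm."""
    out = np.zeros_like(A)
    for idx in groups:
        norms = np.linalg.norm(A[:, idx], axis=1, keepdims=True)
        # Groups whose norm is below `thresh` are set exactly to zero.
        scale = np.maximum(0.0, 1.0 - thresh / np.maximum(norms, 1e-12))
        out[:, idx] = scale * A[:, idx]
    return out


def group_lasso_ista(X, D, groups, alpha=0.1, n_iter=200):
    """Encode rows of X (N, p) over dictionary D (K, p) with group sparsity."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T              # gradient of the squared error
        A = group_soft_threshold(A - step * grad, groups, step * alpha)
    return A


rng = np.random.default_rng(0)
K, p, N = 12, 20, 5
D = rng.standard_normal((K, p))
D /= np.linalg.norm(D, axis=1, keepdims=True)  # unit-norm atoms
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

# Signals generated from the first group only.
A_true = np.zeros((N, K))
A_true[:, :4] = rng.standard_normal((N, 4))
X = A_true @ D

codes = group_lasso_ista(X, D, groups, alpha=0.5)
group_norms = np.stack([np.linalg.norm(codes[:, g], axis=1) for g in groups], axis=1)
print(np.round(group_norms, 3))  # per-sample l2 norm of each group's codes
```

The step size 1/‖D‖₂² guarantees that the objective does not increase between iterations; an accelerated variant (FISTA) or a convergence check would be natural refinements.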
Note on SSPCA: The original MATLAB calls Jenatton et al.'s SSPCA library + the SLEP group-Lasso solver, neither of which is redistributable. The Python SSPCA approximation uses MiniBatchDictionaryLearning + KMeans atom grouping, which captures the group-structure spirit with standard open-source tools.
If you use this code, please cite:
```bibtex
@techreport{gupta2019structured,
  title  = {Structured Dictionary Learning and Feature Encoding for Visual Classification},
  author = {Gupta, Ashish},
  year   = {2019},
  note   = {Technical Report}
}
```

License: MIT (see LICENSE).