Structured Sparsity for Visual Classification

Python MATLAB scikit-learn License: MIT

Research implementation of Structured Sparse Dictionary Learning for visual object category recognition. Introduces group structure into sparse coding by combining co-clustering with Sparse PCA and Group Lasso, yielding semantically coherent feature representations that outperform unstructured sparse coding on standard benchmarks.

"Sparse coding is augmented with structure learnt from co-clustering in terms of groups of semantically related basis elements." — Chapter: Structured Dictionary Learning and Feature Encoding

📄 Full technical report: structured-sparse-visual-classification.pdf


Key Idea

Standard sparse coding (Lasso / OMP) selects individual dictionary atoms independently:

x ≈ D α,   min ||α||₁   (ℓ₁ norm — atom-wise sparsity)

This work adds group structure discovered by co-clustering: semantically related atoms are grouped, and selection happens at the group level via the ℓ₂,₁ mixed norm:

x ≈ D α,   min Σ_g ||α_g||₂   (Group Lasso — group-wise sparsity)
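To make the difference between the two penalties concrete, the following minimal sketch evaluates both norms on two coefficient vectors with the same ℓ₁ norm. The helper names and the convention of one integer group label per atom are assumptions made for illustration; they are not taken from `sparse_coding.py`.

```python
import numpy as np

def l1_norm(alpha):
    """Atom-wise penalty: sum of absolute coefficient values."""
    return float(np.sum(np.abs(alpha)))

def l21_norm(alpha, groups):
    """Group-wise penalty: sum over groups of the l2 norm of each group's block."""
    return float(sum(np.linalg.norm(alpha[groups == g]) for g in np.unique(groups)))

# Hypothetical grouping: six atoms in three groups of two.
groups = np.array([0, 0, 1, 1, 2, 2])

alpha_spread = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])   # one atom active in every group
alpha_grouped = np.array([1.5, 1.5, 0.0, 0.0, 0.0, 0.0])  # all activity inside group 0

print(l1_norm(alpha_spread), l1_norm(alpha_grouped))                      # same l1 penalty
print(l21_norm(alpha_spread, groups), l21_norm(alpha_grouped, groups))    # grouped code is cheaper
```

Both vectors cost the same under ℓ₁, but the ℓ₂,₁ penalty is lower for the vector whose non-zeros sit inside a single group, which is why minimising it tends to switch whole groups on or off.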

Two complementary contributions:

| Contribution | Where | What |
|---|---|---|
| Structured Dictionary (SSPCA) | python/dictionary_learning.py | Co-clustering groups sub-spaces before Sparse PCA; learns a multi-manifold dictionary |
| Structured Encoding (Group Lasso) | python/sparse_coding.py | Co-clustering groups dictionary atoms; enforces group-level sparsity via ℓ₂,₁ |

Pipeline

flowchart TB
    subgraph Data["Feature Extraction (util/)"]
        IMG["Images"] --> DSIFT["Dense SIFT\n(128-d descriptors)"]
        DSIFT --> BOF["Bag-of-Features\n(K-means codebook, K=1000)"]
        BOF --> HIGH["High-dim BOF vectors\n(512 – 2048-d)"]
    end

    subgraph Dict["Dictionary Learning (python/)"]
        HIGH --> COCLUST["Co-clustering\n(atom groups)"]
        HIGH --> SPCA["Sparse PCA\n(SPCA baseline)"]
        COCLUST --> SSPCA["Structured Sparse PCA\n(SSPCA — ℓ₂,₁ groups)"]
    end

    subgraph Encode["Sparse Encoding (python/)"]
        SPCA --> L1["Lasso / OMP\n(ℓ₁ — atom-wise)"]
        SSPCA --> GL["Group Lasso\n(ℓ₂,₁ — group-wise)"]
    end

    subgraph Classify["Classification"]
        L1 --> SVM["RBF SVM\n10-fold CV"]
        GL --> SVM
        SVM --> F1["F1-score\nper category"]
    end
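The feature-extraction stage of the flowchart (codebook plus histogram encoding) can be sketched roughly as below. This is not the code in `util/computeBOF.py`: the function names are hypothetical, random vectors stand in for dense SIFT descriptors, and the codebook is kept at 32 words instead of K=1000 so the example runs quickly.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def build_codebook(descriptors, n_words, seed=0):
    """Cluster local descriptors; the cluster centres act as visual words."""
    km = MiniBatchKMeans(n_clusters=n_words, n_init=3, random_state=seed)
    return km.fit(descriptors)

def bof_histogram(codebook, image_descriptors):
    """Assign each descriptor to its nearest word and return a normalised histogram."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

rng = np.random.default_rng(0)
descriptors = rng.random((2000, 128))           # stand-in for pooled 128-d dense SIFT
codebook = build_codebook(descriptors, n_words=32)
hist = bof_histogram(codebook, rng.random((150, 128)))  # one image's descriptors
print(hist.shape)
```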

Results

Evaluated on standard computer vision benchmarks. On the three datasets reported below, the structured method (SSPCA + Group Lasso) scores higher than both unstructured baselines (PCA and SPCA):

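| Dataset | PCA (F1) | SPCA (F1) | SSPCA (F1) | Improvement (SSPCA over SPCA, relative) |
|---|---|---|---|---|
| VOC 2006 | 0.41 | 0.44 | 0.47 | about +7% |
| VOC 2007 | 0.38 | 0.41 | 0.44 | about +7% |
| Scene 15 | 0.52 | 0.56 | 0.59 | about +5% |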

F1-scores reported as macro-average across all categories, 10-fold stratified CV.
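The evaluation protocol described above (RBF SVM, 10-fold stratified CV, macro-averaged F1) can be reproduced in outline with scikit-learn. This is only a sketch on synthetic data, not the contents of `sspcaEval.py`; the feature scaling step and the SVM hyperparameters are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for encoded features and category labels.
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

# Scaling plus an RBF-kernel SVM (C and gamma are placeholder values).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))

# 10 stratified folds; "f1_macro" averages the per-class F1 with equal class weight.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1_macro")
macro_f1 = scores.mean()
print(f"macro-F1 over {len(scores)} folds: {macro_f1:.3f}")
```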

Performance plots are in chapter/figures/.


Repository Layout

structured-sparsity-visual-classification/
├── python/                         # Evaluation and algorithm scripts
│   ├── sparse_coding.py            # ★ NEW: Lasso, OMP, Group Lasso (≡ calcL1Coeff.m, calcOMPCoeff.m, calcSSPCACoeffLasso.m)
│   ├── dictionary_learning.py      # ★ NEW: PCA, SparsePCA, SSPCA, NMF (≡ calcSPCADict.m, calcSSPCADict.m)
│   ├── sspcaEval.py                # 10-fold SVM evaluation (modernised from 2012)
│   ├── sspcaMethod.py              # Method comparison runner
│   ├── sspcaEval4096.py            # High-dim (4096-d) evaluation variant
│   ├── sspcaEvalBalData.py         # Balanced data evaluation
│   ├── plotsspcadatasets.py        # Dataset comparison plots
│   ├── plotsspcadictsizes.py       # Dictionary size ablation plots
│   ├── plotsspcacategories.py      # Per-category performance plots
│   ├── plotsspca1.py               # Basic results aggregator
│   ├── ncagrlvqPerfAnalysis1.py    # NCA + GRLVQ metric learning analysis
│   ├── sspcaDebug.py               # Single-category debugging helper
│   └── requirements.txt
├── matlab/                         # Original MATLAB implementation (42 files)
│   ├── calcSPCADict.m              # Sparse PCA dictionary learning
│   ├── calcSSPCADict.m             # Structured SPCA (calls Jenatton et al. library)
│   ├── calcL1Coeff.m               # Lasso encoding (calls SPAMS mexLasso)
│   ├── calcLassoCoeff.m            # Lasso variant
│   ├── calcOMPCoeff.m              # OMP encoding
│   ├── calcSSPCACoeffLasso.m       # Group Lasso encoding (calls SLEP mcLeastR)
│   ├── calcSSPCAClassPerf.m        # SVM classification evaluation
│   ├── calcOptLambda.m             # Regularisation hyperparameter search
│   └── ... (37 more files)
├── util/                           # Data preparation (63 Python scripts)
│   ├── computeBOF.py               # Bag-of-Features codebook + encoding
│   ├── binaryDictionary.py         # Per-category MiniBatchKMeans dictionary
│   ├── extractfeature*.py          # Dense SIFT extraction per dataset
│   ├── cbnbof*.py                  # BOF encoding per dataset
│   └── *housekeeping*.py           # Dataset preparation utilities
├── chapter/
│   ├── groupsparse.tex             # LaTeX dissertation chapter (51 KB)
│   └── figures/                    # ~60 performance plots and diagrams
└── structured-sparse-visual-classification.pdf   # Full research paper (1.3 MB)

Quick Start

Requirements

cd python
pip install -r requirements.txt

Learn a dictionary

# Sparse PCA dictionary from pre-computed BOF features
python dictionary_learning.py features.txt --method spca --n-components 128

# Structured SSPCA (group-level sparsity via co-clustering)
python dictionary_learning.py features.txt --method sspca --n-components 128 --n-groups 16

Encode features

from sparse_coding import lasso_encode, omp_encode, group_lasso_encode
import numpy as np

# D: dictionary atoms (K, p), X: input features (N, p)
codes_l1  = lasso_encode(X, D, alpha=0.1)           # ℓ₁ Lasso
codes_omp = omp_encode(X, D, n_nonzero_coefs=10)    # OMP
codes_gl  = group_lasso_encode(X, D, groups, alpha=0.1)  # Group Lasso
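For readers without the repository at hand, the two unstructured encoders can be approximated directly with scikit-learn's `sparse_encode`, which follows the same shape convention (D is (K, p), X is (N, p), codes are (N, K)). This is a sketch with random data; whether `lasso_encode` and `omp_encode` wrap exactly these calls is an assumption.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)
N, K, p = 20, 64, 32

# Random dictionary with unit-norm atoms (rows), and random input features.
D = rng.standard_normal((K, p))
D /= np.linalg.norm(D, axis=1, keepdims=True)
X = rng.standard_normal((N, p))

# l1-penalised codes (LARS-based Lasso solver).
codes_l1 = sparse_encode(X, D, algorithm="lasso_lars", alpha=0.1)

# Greedy OMP codes with at most 10 non-zero coefficients per sample.
codes_omp = sparse_encode(X, D, algorithm="omp", n_nonzero_coefs=10)

print(codes_l1.shape, codes_omp.shape)
```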

Evaluate classification

# 10-fold SVM evaluation on pre-projected feature files
python sspcaEval.py --dataset VOC2006 --method spca --highDim 1024 --lowDim 128

# Compare all methods across dimension combinations
python sspcaMethod.py --dataset Scene15

Datasets

The code supports the following standard benchmarks. Datasets must be downloaded separately:

| Dataset | Classes | Images | Download |
|---|---|---|---|
| Pascal VOC 2006 | 10 | ~5,000 | VOC Challenge |
| Pascal VOC 2007 | 20 | ~10,000 | VOC Challenge |
| Pascal VOC 2010 | 20 | ~20,000 | VOC Challenge |
| Caltech-101 | 101 | ~9,000 | Caltech Vision Lab |
| Caltech-256 | 256 | ~30,000 | Caltech Vision Lab |
| Scene-15 | 15 | ~4,500 | Scene Understanding |
| Oxford Flowers 17/102 | 17/102 | 1,360/8,189 | VGG Oxford |

Path configuration: The original scripts hardcode /vol/vssp/diplecs/ash/Data/. Set the rootDir variable at the top of each script to your local data directory.


MATLAB vs Python

| Component | MATLAB file(s) | Python equivalent | Notes |
|---|---|---|---|
| Sparse PCA dictionary | calcSPCADict.m | dictionary_learning.py: SparsePCADictionary | sklearn SparsePCA |
| Structured PCA dict. | calcSSPCADict.m | dictionary_learning.py: StructuredSparsePCADictionary | Approximation via MiniBatchDL + KMeans grouping |
| NMF dictionary | calcNMFDict.m | dictionary_learning.py: NMFDictionary | sklearn NMF |
| Lasso coding | calcL1Coeff.m | sparse_coding.py: lasso_encode() | sklearn LassoLars |
| OMP coding | calcOMPCoeff.m | sparse_coding.py: omp_encode() | sklearn OMP |
| Group Lasso | calcSSPCACoeffLasso.m | sparse_coding.py: group_lasso_encode() | Proximal-GD implementation |
| SVM evaluation | calcSSPCAClassPerf.m | sspcaEval.py | Modernised (sklearn ≥1.4) |
| Hyperparameter search | calcOptLambda.m | (use sklearn GridSearchCV) | Grid search via sklearn |
| Co-clustering | cocluster-linux binary | (use sklearn.cluster.SpectralBiclustering) | External binary → sklearn |
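The table lists the Group Lasso encoder as a proximal gradient-descent implementation. A minimal sketch of that technique for a single sample is given below, assuming the objective ½‖x − Dᵀα‖² + λ Σ_g ‖α_g‖₂ with D of shape (K, p) and one integer group label per atom. The function names, the fixed iteration count, and the group representation are illustrative assumptions and may differ from `group_lasso_encode()`.

```python
import numpy as np

def group_soft_threshold(a, groups, tau):
    """Proximal operator of tau * sum_g ||a_g||_2: shrink each group as a block."""
    out = np.zeros_like(a)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(a[idx])
        if norm > tau:
            out[idx] = (1.0 - tau / norm) * a[idx]
    return out

def objective(x, D, a, groups, lam):
    """0.5 * ||x - D^T a||^2 + lam * sum_g ||a_g||_2."""
    penalty = sum(np.linalg.norm(a[groups == g]) for g in np.unique(groups))
    return 0.5 * np.sum((x - a @ D) ** 2) + lam * penalty

def group_lasso_pgd(x, D, groups, lam, n_iter=200):
    """Proximal gradient descent with constant step 1/L, L = ||D||_2^2."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[0])
    for _ in range(n_iter):
        grad = D @ (a @ D - x)                      # gradient of the quadratic term
        a = group_soft_threshold(a - step * grad, groups, step * lam)
    return a

rng = np.random.default_rng(0)
K, p = 12, 8
D = rng.standard_normal((K, p))
D /= np.linalg.norm(D, axis=1, keepdims=True)       # unit-norm atoms
groups = np.repeat(np.arange(4), 3)                 # four groups of three atoms
x = 2.0 * D[0] - 1.0 * D[1] + 0.5 * D[2]            # signal built from group 0 only

a = group_lasso_pgd(x, D, groups, lam=0.3)
print("active groups:", sorted({int(g) for g in groups[a != 0]}))
```

Because the shrinkage acts on whole blocks, every group in the result is either entirely zero or entirely retained, which is the group-level sparsity the ℓ₂,₁ penalty is meant to induce.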

Note on SSPCA: The original MATLAB code calls Jenatton et al.'s SSPCA library and the SLEP group-Lasso solver, neither of which is redistributable. The Python SSPCA approximation instead uses MiniBatchDictionaryLearning with KMeans atom grouping, which approximates the group structure using standard open-source tools.


Citation

If you use this code, please cite:

@techreport{gupta2019structured,
  title  = {Structured Dictionary Learning and Feature Encoding for Visual Classification},
  author = {Gupta, Ashish},
  year   = {2019},
  note   = {Technical Report}
}

License

MIT — see LICENSE.
