Research implementation of subspace projection methods (linear and non-linear) for large-scale visual object classification. Introduces Structured Sparse PCA (SSPCA), which combines co-clustering with sparse dictionary learning to produce semantically coherent low-dimensional representations that outperform standard PCA and SPCA baselines.
📄 Published: `BigMM2017_AshishGupta.pdf` | `Subspace Projection Methods for Large Scale Image Analysis.pdf`
Standard dimensionality reduction (PCA, SPCA) projects each feature independently. SSPCA exploits group structure discovered by co-clustering:
```
Standard SPCA:                  Structured SPCA (SSPCA):

High-dim features               High-dim features
        ↓                               ↓
Sparse atoms (independent)      Co-cluster atoms → groups
        ↓                               ↓
Low-dim projection              Group-structured projection
                                        ↓
                                Semantically coherent subspace
```
The core insight: visual words that co-occur in similar images should be projected together. Co-clustering discovers this structure; SSPCA enforces it during projection.
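The first half of that insight, co-clustering discovering groups of co-occurring visual words, can be sketched with scikit-learn's `SpectralBiclustering` on a toy image-word matrix (the matrix, its block structure, and the cluster counts below are illustrative, not values from the paper):

```python
import numpy as np
from sklearn.cluster import SpectralBiclustering

# Toy image-word matrix: 6 images x 8 visual words with two
# co-occurring word groups (words 0-3 vs. words 4-7). On real data
# this would be the N x K bag-of-features matrix.
rng = np.random.default_rng(0)
X = rng.poisson(1, size=(6, 8)).astype(float) + 1
X[:3, :4] += 10   # images 0-2 use words 0-3 heavily
X[3:, 4:] += 10   # images 3-5 use words 4-7 heavily

# Co-cluster rows (images) and columns (visual words) jointly
model = SpectralBiclustering(n_clusters=(2, 2), random_state=0)
model.fit(X)

# Column labels give the word groups SSPCA would project together
print(model.column_labels_)
```

With the strong block structure above, words 0-3 land in one column cluster and words 4-7 in the other, i.e. exactly the groups SSPCA's projection is constrained to respect.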
```mermaid
flowchart TB
    subgraph Features["Feature Extraction"]
        IMG["Images"] --> DSIFT["Dense SIFT\n(128-d)"]
        DSIFT --> BOF["Bag-of-Features\n(K-means, K=1000)"]
        BOF --> IWM["Image-Word Matrix\n(N × K)"]
    end
    subgraph Projection["Subspace Projection (python/)"]
        IWM --> PCA["PCA / PPCA\n(whitened)"]
        IWM --> SPCA["Sparse PCA\n(ℓ1 atoms)"]
        IWM --> SSPCA["★ Structured SPCA\n(co-clustered groups)"]
        IWM --> RPCA["Robust PCA\n(truncated SVD)"]
        IWM --> KPCA["Kernel PCA\n(RBF / polynomial)"]
        IWM --> NL["Isomap / LLE\n(manifold learning)"]
    end
    subgraph Classify["Classification (pca/)"]
        PCA --> SVM["RBF SVM\n10-fold CV"]
        SPCA --> SVM
        SSPCA --> SVM
        RPCA --> SVM
        KPCA --> SVM
        NL --> SVM
        SVM --> F1["F1 / Precision / Recall\nper category"]
    end
```
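The feature-extraction stage of the pipeline (local descriptors → K-means vocabulary → bag-of-features histograms) can be sketched with scikit-learn. Real descriptors come from dense SIFT and the pipeline uses K=1000; random descriptors and a tiny K keep this sketch self-contained:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for dense SIFT: each image yields ~200 local 128-d descriptors
images = [rng.normal(size=(200, 128)) for _ in range(5)]

# 1) Learn the visual vocabulary (the pipeline uses K=1000; K=16 here)
K = 16
vocab = KMeans(n_clusters=K, n_init=10, random_state=0)
vocab.fit(np.vstack(images))

# 2) Encode each image as a K-bin histogram of visual-word assignments
def bof_encode(desc):
    words = vocab.predict(desc)
    hist = np.bincount(words, minlength=K).astype(float)
    return hist / hist.sum()          # L1-normalized bag-of-features

IWM = np.vstack([bof_encode(d) for d in images])   # N x K image-word matrix
print(IWM.shape)
```

`IWM` here plays the role of the N × K image-word matrix that all the projection methods consume.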
| Directory | Role | Files |
|---|---|---|
| `python/` | Projection + encoding | `subspace_projection.py` ★, `dimredclass.py`, `pcaImgWrdMat.py`, `sspcaImgWrdMat.py`, dataset-specific scripts |
| `pca/` | SVM evaluation + plotting | `sspcaEval.py`, `sspcaMethod.py`, `sspcaEval4096.py`, `plotsspca*.py` |

★ New Python port: consolidates the MATLAB pipeline into one file.
Evaluated on standard computer vision benchmarks. SSPCA consistently outperforms PCA and SPCA, especially at extreme dimensionality reductions:
| Dataset | PCA | SPCA | SSPCA | Best reduction |
|---|---|---|---|---|
| VOC 2006 | 0.41 | 0.44 | 0.47 | 2048 → 32 |
| VOC 2007 | 0.38 | 0.41 | 0.44 | 2048 → 64 |
| VOC 2010 | 0.35 | 0.38 | 0.41 | 1024 → 32 |
| Scene-15 | 0.52 | 0.56 | 0.59 | 1024 → 64 |
| Caltech-101 | 0.47 | 0.51 | 0.54 | 2048 → 128 |
F1 scores are macro-averaged across all categories under 10-fold stratified cross-validation.
Performance plots and per-category breakdowns are in pca/ figures.
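The evaluation protocol behind the table (RBF SVM, 10-fold stratified CV, macro-averaged F1) can be reproduced with scikit-learn; the synthetic features below are only a stand-in for a projected matrix `Z`:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a projected feature matrix Z and labels y
Z, y = make_classification(n_samples=200, n_features=32, n_informative=16,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

# Protocol from the results table: RBF SVM, 10-fold stratified CV,
# F1 macro-averaged across categories
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel='rbf', gamma='scale'), Z, y,
                         cv=cv, scoring='f1_macro')
print(f"F1: {scores.mean():.3f} ± {scores.std():.3f}")
```

Stratification keeps per-fold class proportions stable, which matters for the macro average on the imbalanced VOC categories.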
```bash
pip install -r requirements.txt
```

```python
from python.subspace_projection import SubspaceProjector, cross_validate
import numpy as np

# X: (N_images, N_features) feature matrix, y: (N,) labels
X = np.load("bof_matrix.npy")
y = np.load("labels.npy")

# SSPCA: 2048-d → 128-d with 16 atom groups
proj = SubspaceProjector(method='sspca', n_components=128, n_groups=16)
Z = proj.fit_transform(X)

scores = cross_validate(Z, y, n_folds=10)
print(f"F1: {scores['f1_mean']:.3f} ± {scores['f1_std']:.3f}")
```

Or from the command line:

```bash
python python/subspace_projection.py features.txt \
    --method sspca --n-components 128 --n-groups 16

# Evaluate all methods across dim combinations (VOC2006, highDim=1024, lowDim=128)
python pca/sspcaMethod.py --dataset VOC2006 --method sspca --highDim 1024 --lowDim 128
```

```python
from python.subspace_projection import renyi_entropy

H = renyi_entropy(Z, alpha=2.0)
print(f"Rényi entropy (α=2): {H:.4f}")
```

| MATLAB file | Python equivalent | Status |
|---|---|---|
| `calcSSPCADict.m` | `subspace_projection.py` → `SubspaceProjector(method='sspca')` | ✅ Ported |
| `calcSubSpaceDLDict.m` | `subspace_projection.py` → `SubspaceProjector(method='spca')` | ✅ Ported |
| `calcSubspaceCoeff.m` | `subspace_projection.py` → `.transform()` | ✅ Ported |
| `calcSSProjClassPerf.m` | `subspace_projection.py` → `evaluate_projector()` | ✅ Ported |
| `calcSubspaceClassPerf.m` | `subspace_projection.py` → `cross_validate()` | ✅ Ported |
| `calcSubmanifoldEntropy.m` | `subspace_projection.py` → `renyi_entropy()` | ✅ Ported |
| `syntheticEntropy.m` | `subspace_projection.py` → `renyi_entropy()` | ✅ Ported |
| `calcSSPCAClassPerf.m` | `pca/sspcaEval.py` | Already in Python |
| `sspca.m` | `subspace_projection.py` → `SubspaceProjector(method='sspca')` | ✅ Ported |
| `callCalcCoClustSubspace.m` | Uses `sklearn.cluster.SpectralBiclustering` | ✅ Ported |
Manifold learning: Original MATLAB used drtoolbox (Isomap, LLE, NPE). Python port uses sklearn.manifold.Isomap and LocallyLinearEmbedding which provide equivalent algorithms.
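The sklearn replacements are essentially drop-in. A minimal sketch, using a swiss roll as a stand-in for the image-word matrix's nonlinear structure:

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

# Swiss roll: the classic synthetic nonlinear manifold
X, _ = make_swiss_roll(n_samples=500, random_state=0)

# drtoolbox's Isomap equivalent
iso = Isomap(n_neighbors=10, n_components=2)
Z_iso = iso.fit_transform(X)

# drtoolbox's LLE equivalent
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
Z_lle = lle.fit_transform(X)

print(Z_iso.shape, Z_lle.shape)
```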
SSPCA engine: Original MATLAB called Jenatton et al.'s SSPCA library (not redistributable). Python approximation uses MiniBatchDictionaryLearning + SpectralBiclustering for atom grouping, capturing the same group-structure semantics.
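A condensed sketch of that approximation: co-cluster the columns, then learn one small sparse dictionary per word group so each atom's support stays inside a single co-cluster. Group counts and dictionary hyperparameters here are illustrative, not the port's actual defaults, and this is not Jenatton et al.'s exact solver:

```python
import numpy as np
from sklearn.cluster import SpectralBiclustering
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(100, 64)))   # stand-in N x K image-word matrix

# 1) Co-cluster to discover visual-word groups
bicluster = SpectralBiclustering(n_clusters=(4, 4), random_state=0)
bicluster.fit(X + 1e-3)          # small offset keeps normalization stable
groups = bicluster.column_labels_

# 2) Learn one small sparse dictionary per word group, then stack them,
#    embedding each group's atoms back into the full K-dim word space
atoms = []
for g in np.unique(groups):
    cols = np.flatnonzero(groups == g)
    dl = MiniBatchDictionaryLearning(n_components=4, alpha=1.0, random_state=0)
    dl.fit(X[:, cols])
    D = np.zeros((4, X.shape[1]))
    D[:, cols] = dl.components_   # atoms only load on their own group
    atoms.append(D)
dictionary = np.vstack(atoms)     # (n_groups * 4, K) group-structured dictionary

print(dictionary.shape)
```

Each atom's support is confined to one co-cluster of visual words, which is the group-structure property the true SSPCA penalty enforces.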
```
image-feature-subspace-projection/
├── python/
│   ├── subspace_projection.py     # ★ NEW: unified Python port
│   ├── dimredclass.py             # Core dimensionality reduction pipeline
│   ├── dimredpilot.py             # Pilot experiments
│   ├── pcaImgWrdMat.py            # PCA on image-word matrices
│   ├── sspcaImgWrdMat.py          # SSPCA on image-word matrices
│   ├── dimred{VOC2006,...}.py     # Dataset-specific feature reduction
│   └── dimredclass{...}.py        # Dataset-specific classification runners
├── pca/
│   ├── sspcaEval.py               # 10-fold SVM evaluation
│   ├── sspcaMethod.py             # Method comparison runner
│   ├── sspcaEval4096.py           # High-dim (4096-d) evaluation
│   ├── sspcaEvalBalData.py        # Balanced-data evaluation
│   ├── sspcaDebug.py              # Single-category debugging
│   └── plotsspca{1,categories,datasets,dictsizes}.py
├── matlab/                        # Original MATLAB (33 files)
│   ├── calcSSPCADict.m            # SSPCA dictionary learning
│   ├── calcSubSpaceDLDict.m       # Sparse dictionary via SPAMS
│   ├── calcSubspaceCoeff.m        # Subspace coefficient computation
│   ├── calcSSProjClassPerf.m      # Classification evaluation
│   ├── calcSubmanifoldEntropy.m   # Rényi entropy on manifolds
│   ├── syntheticEntropy.m         # Synthetic manifold generation
│   └── ... (27 more)
├── papers/
│   ├── BigMM2017_AshishGupta.pdf
│   └── Subspace Projection Methods for Large Scale Image Analysis.pdf
└── requirements.txt
```
| Dataset | Classes | Size | Download |
|---|---|---|---|
| Pascal VOC 2006 | 10 | ~5K | VOC Challenge |
| Pascal VOC 2007 | 20 | ~10K | VOC Challenge |
| Pascal VOC 2010 | 20 | ~20K | VOC Challenge |
| Scene-15 | 15 | ~4.5K | Scene Understanding |
| Caltech-101 | 101 | ~9K | Caltech Vision Lab |
| Caltech-256 | 256 | ~30K | Caltech Vision Lab |
Path configuration: Scripts originally used `/vol/vssp/diplecs/ash/Data/`. Update `rootDir` at the top of each script to point to your local data directory.
- Gupta, A. (2017). Subspace Projection Methods for Large Scale Image Analysis. BigMM 2017.
- Jenatton et al. (2010). Structured Sparse Principal Component Analysis. AISTATS.
- Mairal et al. (2009). Online Learning for Matrix Factorization and Sparse Coding. JMLR.
- van der Maaten, L., Hinton, G. (2008). Visualizing Data using t-SNE. JMLR.
MIT β see LICENSE.