This project demonstrates how to build a face search system using face embeddings, deep learning, and PostgreSQL vector search with the pgvector extension. Face search retrieves the face images in a database that are most similar to a query face, which is useful for surveillance, access control, e-commerce, and social platforms.
A face embedding is a fixed-length vector representation of a face. When generated by deep learning models such as FaceNet or ArcFace, embeddings have the following property:
- Similar faces → embeddings close together in vector space
- Different faces → embeddings far apart
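The "close" versus "far apart" idea can be made concrete with a small distance calculation. The following is a minimal sketch in plain Python; the 4-D vectors and the names `alice_photo_1`, `alice_photo_2`, and `bob_photo` are made-up stand-ins for real 512-D embeddings, not output from any model.

```python
import math


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy 4-D vectors standing in for real embeddings (illustrative values only).
alice_photo_1 = [0.9, 0.1, 0.3, 0.2]
alice_photo_2 = [0.8, 0.2, 0.3, 0.1]
bob_photo = [0.1, 0.9, 0.2, 0.7]

same = euclidean_distance(alice_photo_1, alice_photo_2)
different = euclidean_distance(alice_photo_1, bob_photo)
print(f"same person: {same:.3f}, different people: {different:.3f}")
```

A face search then amounts to ranking stored embeddings by one of these distances and returning the smallest ones.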
Using the pgvector extension, PostgreSQL can:
- Store high-dimensional vectors (like 512D face embeddings)
- Perform similarity search using:
  - Euclidean distance (<->)
  - Cosine distance (<=>)
  - Inner product (<#>, which returns the negative inner product)
- Support Approximate Nearest Neighbor (ANN) search (optional)
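The capabilities above can be sketched as SQL built from Python. This is an illustration under assumptions, not the project's actual schema: the table name `faces`, the columns `image_path` and `embedding`, and the index name are hypothetical, the `%s` placeholder assumes a psycopg-style driver, and the HNSW index assumes pgvector 0.5.0 or later.

```python
# pgvector distance operators, as documented in the pgvector README.
OPERATORS = {
    "l2": "<->",             # Euclidean (L2) distance
    "cosine": "<=>",         # cosine distance (1 - cosine similarity)
    "inner_product": "<#>",  # negative inner product
}

# Hypothetical schema: one row per face image with a 512-D embedding.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS faces (
    id bigserial PRIMARY KEY,
    image_path text NOT NULL,
    embedding vector(512)
);
"""

# Optional ANN index; without it, pgvector performs an exact scan.
CREATE_INDEX_SQL = (
    "CREATE INDEX IF NOT EXISTS faces_embedding_idx "
    "ON faces USING hnsw (embedding vector_cosine_ops);"
)


def to_vector_literal(values):
    """Format a sequence of numbers as pgvector's text form, e.g. '[1.0,0.5]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def build_search_query(metric="cosine", limit=5):
    """Return a nearest-neighbor query with one %s placeholder for the query vector."""
    operator = OPERATORS[metric]  # raises KeyError for unknown metric names
    return (
        f"SELECT id, image_path FROM faces "
        f"ORDER BY embedding {operator} %s::vector LIMIT {int(limit)}"
    )


print(build_search_query("l2", 3))
```

The index operator class should match the query operator: `vector_cosine_ops` pairs with `<=>`, while `vector_l2_ops` and `vector_ip_ops` pair with `<->` and `<#>`.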
The project covers the following steps:
- Collect Face Images via DuckDuckGo
- Extract Face Embeddings using facenet-pytorch
- Store Embeddings in PostgreSQL (pgvector)
- Search Similar Faces via SQL
- Deploy Web UI with Streamlit
- Deploy with Docker Compose
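As an illustration of the embedding step, here is a minimal sketch using the `MTCNN` detector and `InceptionResnetV1` model from facenet-pytorch. The helper names `embed_face` and `embed_image_file`, and the choice of the `vggface2` pretrained weights, are assumptions made for this example; the project's notebooks may do this differently.

```python
def embed_face(img, mtcnn, resnet):
    """Return the face embedding of a PIL image as a list of floats.

    Returns None when the detector finds no face in the image.
    """
    face = mtcnn(img)  # cropped face tensor, or None if no face was detected
    if face is None:
        return None
    embedding = resnet(face.unsqueeze(0))  # add a batch dimension -> shape (1, 512)
    return embedding[0].detach().cpu().tolist()


def embed_image_file(path):
    """Load the models (hypothetical defaults) and embed one image file on the CPU."""
    # Third-party imports are kept local so the helper above has no dependencies.
    import torch
    from PIL import Image
    from facenet_pytorch import MTCNN, InceptionResnetV1

    mtcnn = MTCNN(image_size=160)  # face detector and cropper
    resnet = InceptionResnetV1(pretrained="vggface2").eval()  # 512-D embedder
    with torch.no_grad():  # inference only, no gradients needed
        return embed_face(Image.open(path).convert("RGB"), mtcnn, resnet)
```

Calling `embed_image_file` on a real image downloads the pretrained weights on first use. In a batch job the two models would be created once and reused across images rather than reloaded for each call.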
Running pgvector with docker compose
cd docker
docker compose up -d
- Collecting Face Image Data: notebook/1. collecting_data.ipynb
- Embedding Face Images: notebook/2. embedding_faces.ipynb
- Storing Embeddings in PostgreSQL: notebook/3. storing_pgvector.ipynb
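The storing and searching steps might look roughly like the sketch below. It is not taken from the notebooks: it assumes a hypothetical `faces` table with `image_path` and `embedding vector(512)` columns, and a DB-API style connection such as one returned by `psycopg.connect(...)` with connection settings that match your Docker Compose file.

```python
def vector_literal(embedding):
    """Format an embedding as pgvector's text representation, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def insert_face(conn, image_path, embedding):
    """Insert one image path and its embedding, then commit."""
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO faces (image_path, embedding) VALUES (%s, %s::vector)",
            (image_path, vector_literal(embedding)),
        )
    conn.commit()


def find_similar(conn, query_embedding, limit=5):
    """Return (image_path, distance) rows ordered by cosine distance, closest first."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT image_path, embedding <=> %s::vector AS distance "
            "FROM faces ORDER BY distance LIMIT %s",
            (vector_literal(query_embedding), limit),
        )
        return cur.fetchall()
```

Passing the vector as a text literal with a `::vector` cast avoids any dependency beyond the database driver. The pgvector project also provides a Python package with driver adapters, which is an alternative to manual formatting.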
Run Streamlit UI
streamlit run src/app.py
facenet-pytorch: https://github.com/timesler/facenet-pytorch
pgvector: https://github.com/pgvector/pgvector
DuckDuckGo Search: https://pypi.org/project/duckduckgo-search/